Required: Computer Vision & Speech - MSc and PhD - Summer Internship 2025 - Research Lab
Introduction
At , work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
Your role and responsibilities:
If you're a student interested in machine learning, deep learning, GenAI, and the intersection of computer vision, speech and audio analysis, and natural language, and you're looking for a place where you will do research with academic and industrial impact, then this position is for you!
Our team develops technologies, models, algorithms, and software that make an impact on our products and on the world; we publish papers and issue patents based on the work we do.
The internship responsibilities involve solving real-world problems using cutting-edge deep learning and machine learning methods, with the aim of advancing the state of the art in document understanding, speech analysis and speech generation. Topics include novel self-supervised learning techniques, realistic data synthesis, multimodal research, and more. To achieve these goals, you will collaborate with fellow team members and have access to nearly limitless GPU compute power. The work will focus on at least one of the following subjects:
Document understanding is the ability to read documents, understand their structure and multimodal content, and extract and act upon that content. This is a crucial technology, as business documents are key to the day-to-day operation of organizations.
Document understanding remains a research challenge that requires a multi-disciplinary perspective, spanning textual analysis, visual comprehension, layout understanding, knowledge representation, data mining and more.
Speech and audio technologies provide the ability to understand as well as generate audio and speech. In particular, speech recognition and synthesis are key components of natural spoken interaction, which is crucial for organizations' customer care. This also requires a multi-disciplinary perspective, spanning conversational and generative AI and modeling for speech, language, and audio. The areas we are looking at also include multimodal and foundation models, image and audio understanding, data synthesis, expressive speech synthesis and tokenization.
The internship aims to result in a publication at a top AI conference and/or the development of a prototype demonstrating new AI functionality.
Succeeding in these tasks is expected to make an important impact on the research community in these exciting fields and lead to strong publications in leading CV and speech technology venues (e.g., CVPR, ICLR, ICCV, Interspeech, NeurIPS, ICML).
Our summer internship program offers you an opportunity to join our research team for a 3-month internship (working 5 days a week) at either our Haifa or Tel Aviv site (depending on the internship). During the internship, you will work with our talented researchers on top projects, helping create the next generation of AI, security, quantum, cloud and much more.
Requirements:
Required education:
Bachelor's Degree
Preferred education:
Master's Degree
Required technical and professional expertise
M.Sc. or Ph.D. student with knowledge in Machine Learning, Computer Vision, and Deep Learning
Strong CV background using modern methods and deep knowledge of the recent literature; prior CV/ML/DL publications are an advantage
Strong Python coding skills; experience with PyTorch or TensorFlow is an advantage
A team player with great social skills and a willingness to collaborate
Strong background in Deep Learning methods and knowledge of the recent literature
This position is open to all candidates.