At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
The focus of the internship is on cost-efficient serving of AI inference workloads, with a particular emphasis on optimizing routing strategies and managing KV (Key-Value) cache usage across distributed systems.
The intern will work on:
- Designing and evaluating routing algorithms to improve inference performance and cost.
- Investigating strategies for efficient KV cache management at scale.
- Prototyping and benchmarking ideas to optimize inference serving systems.
This internship offers a unique opportunity to work at the intersection of AI and systems, with real-world impact on scalable inference serving.
Our summer internship program offers you an opportunity to join our research team for a 3-month internship (working 5 days a week) at either our Haifa or Tel Aviv site (depending on the internship). During the internship, you will work with our talented researchers on top projects, helping create the next generation of AI, security, quantum, cloud and much more.
- MSc or PhD candidate in Computer Science, in the advanced stages of studies.
- Background in Computer Science, Machine Learning Systems, or related fields.
- Knowledge of distributed systems, networking, or inference infrastructure is a plus.
- Strong programming skills (Python, Go, or similar).
- Interest in AI infrastructure and large-scale system optimization.
- Ability to work independently while also being an excellent team player.
- Familiarity with Kubernetes (K8s) is an advantage.
Publication(s) at top-tier peer-reviewed conferences or journals.