RAG Engineering Intern (Python / Azure OpenAI)
Alphius | Paid Internship | Remote (US Required)
Alphius is building production-grade Retrieval-Augmented Generation (RAG) systems on Microsoft Azure. We are looking for a RAG Engineering Intern to work on document ingestion, vector search, and LLM-powered APIs within a cloud-native architecture.
This is a hands-on engineering internship. You will work on real retrieval pipelines, embedding systems, and inference services, not just notebooks or experiments.
What You’ll Work On
- Build and improve document ingestion pipelines (PDF, HTML, DOCX, CSV)
- Implement chunking strategies (recursive, semantic, sliding window)
- Work with vector databases (e.g., Azure AI Search, Qdrant, Pinecone)
- Assist in building hybrid retrieval (BM25 + embeddings)
- Support prompt engineering for retrieval-grounded QA
- Develop FastAPI-based inference endpoints
- Help monitor latency, cost, and retrieval quality
- Deploy services to Azure Container Apps or AKS
- Use Application Insights for tracing RAG pipelines
You will contribute to systems used in real production workflows.
Required Qualifications
- Currently pursuing a degree in Computer Science, Machine Learning, or a related field
- Strong Python skills (3.10+ preferred)
- Experience building projects involving LLMs or embeddings
- Familiarity with async programming (asyncio)
- Understanding of REST APIs
- Experience with Git
Preferred Experience
- Exposure to RAG pipelines or vector databases
- Experience with Azure OpenAI or OpenAI APIs
- Familiarity with FastAPI
- Basic knowledge of embeddings and retrieval concepts
- Exposure to Docker and cloud deployments
- Understanding of JSON schema validation or structured outputs
What You’ll Gain
- Real-world experience building production RAG systems
- Exposure to Azure OpenAI and cloud-native AI architecture
- Experience with vector search and hybrid retrieval
- Hands-on work with scalable LLM APIs
- Mentorship from senior engineers
- Opportunity for full-time conversion based on performance
Internship Details
- Paid internship ($1,200–$2,200 per month)
- Remote (US required)
- 15–25 hours per week during the semester; up to 40 during the summer
- Duration: Summer or ongoing
How to Apply
Please include:
- Resume
- GitHub or project portfolio
- Brief description of an LLM or RAG-related project you’ve built