
GenAI Software Engineer Intern – GenAI Model Experiment
- Shanghai
- Training
- Part-time
- Deploy and benchmark state-of-the-art AI models (e.g., Llama-3, Mistral, Gemma) end-to-end on cloud/server environments.
- Set up and maintain GPU-powered experimentation environments (CUDA, Docker, conda, etc.).
- Design and run model performance tests (latency, memory usage, output quality) and A/B experiments.
- Develop automation scripts for model deployment, scaling, and resource monitoring.
- Research and prototype emerging AI techniques (MoE, quantization, agentic workflows, etc.).
- Currently pursuing a Master’s/PhD in Computer Science, AI, or related fields (full-time internship commitment).
- Proven coding rigor: strong Python skills and AI project experience (please share GitHub links or project reports).
- Model deployment chops: hands-on experience with PyTorch/TensorFlow, Hugging Face, and inference optimization tools (e.g., vLLM, TGI).
- Linux/cloud fluency: experience debugging CUDA issues, managing Docker containers, and using AWS/GCP/Azure (EC2, SageMaker, etc.).
- Self-starter: able to set up experiments independently and troubleshoot hardware/software issues.
- English proficiency (reading and writing technical docs, reading arXiv papers).
- Experience with model quantization (GGUF, FP8), LoRA fine-tuning, or distillation.
- Familiarity with ML monitoring tools (Weights & Biases, TensorBoard).
- Contributions to open-source AI projects or technical blogs.
- Curiosity about AI Agents (AutoGPT, LangGraph) or multi-modal systems.
- Hands-on experience on real-world AI/ML software development projects, with GPU resources provided.
- Mentorship from experienced professionals in AI, ML, and software engineering.
- Opportunity to work with cutting-edge technologies and tools.
- Collaborative and innovative work environment.
- Potential for future full-time employment.