LLM Training & Agent Engineer
Advanced Micro Devices
- Beijing
- Permanent
- Full-time
Responsibilities
- Train, fine-tune, and optimize Large Language Models (LLMs), including but not limited to pretraining, SFT, and RLHF pipelines
- Design and develop LLM-based agent systems (e.g., tool use, planning and reasoning, multi-agent collaboration)
- Optimize LLM inference performance, including latency, throughput, and memory (VRAM) usage
- Participate in GPU computing optimization, including operator/kernel optimization and parallelization strategies
- Collaborate with research and product teams to drive the deployment of LLMs in real-world applications
Qualifications
- Bachelor’s degree or above in Computer Science, Artificial Intelligence, or a related field
- 4+ years of relevant development experience
- Proficient in Python or C++, with strong engineering skills
- Familiar with LLM training workflows, with hands-on experience in training or fine-tuning; experience deploying LLM-based products is a plus
- Experience in agent development (e.g., LangChain, in-house agents, tool use systems)
- Familiar with LLM inference optimization techniques, including but not limited to acceleration, quantization, and KV cache
- Understanding of GPU computing principles, with some experience in operator/kernel optimization
Preferred Qualifications
- Experience with large-scale LLM training (e.g., distributed training, Megatron, DeepSpeed)
- Familiarity with CUDA or Triton, with experience in GPU kernel development or optimization
- Experience in high-performance computing (HPC) or inference framework optimization
- Hands-on experience deploying agent systems in production (e.g., complex task planning, multi-tool orchestration)