
Senior Application Platform Engineer
- Beijing
- Permanent
- Full-time
- Drive reliability engineering initiatives, including infrastructure automation, service monitoring, incident response, and capacity planning.
- Lead and participate in technical design discussions across cross-functional teams.
- Collaborate with application teams to define and enforce architectural best practices, CI/CD standards, and cloud-native patterns.
- Diagnose complex production issues through in-depth troubleshooting and implement resilient solutions to prevent recurrence.
- Contribute to the development of internal tools that improve observability, system health, and operational transparency.
- Analyze and optimize existing systems, providing enhancements and ongoing support as needed.
- Stay current with new technologies and proactively recommend improvements to existing cloud architectures and processes.
- Develop and maintain server-side logic, data processing, and application workflows.
- Mentor junior engineers and promote a culture of knowledge sharing and continuous improvement.
- 5+ years of professional experience in cloud engineering.
- Deep understanding of cloud-native application design and operations on platforms like AWS or Alibaba Cloud.
- Proficient in infrastructure as code and CI/CD pipelines using tools like Terraform, GitHub Actions, Jenkins, or similar.
- Strong experience with containerization (Docker) and orchestration platforms (Kubernetes).
- A Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience.
- Proven knowledge of security principles, including authentication, authorization, and secrets management in cloud environments.
- Skilled in monitoring and observability using tools like Kibana, CloudWatch, Dynatrace, or Splunk.
- Proficiency in at least one programming language such as Java, JavaScript (Node.js), Python, or Go.
- Expertise in data storage, big data pipelines, dimensional modeling, and data warehousing.
- Demonstrated ability to work independently in fast-paced production environments and drive cross-functional initiatives.
- Prior experience with infrastructure for training and deploying ML models and LLMs is a bonus.