[Remote] AI Research Engineer (Multi-Modal Reinforcement Learning) - 100% Remote Worldwide
Note: This is a remote job and is open to candidates in the USA. The company is pioneering global digital finance through its innovative solutions. They are seeking an AI Research Engineer to drive innovation in multi-modal reinforcement learning, focusing on optimizing decision-making and adaptive behavior across various data modalities to enhance AI performance in real-world challenges.
Responsibilities
- Conduct research on reinforcement learning algorithms for multimodal models, including diffusion-based approaches for image autoregressive models for multimodal understanding, and frameworks that integrate multiple modalities
- Design and build reinforcement learning infrastructure that supports scalable, distributed training across multimodal systems while maintaining efficiency and reliability
- Develop and refine reward modeling strategies that improve training stability, align model behavior with desired outcomes, and mitigate reward hacking and related failure modes
- Create and curate multimodal simulation environments and datasets to support robust training, evaluation, and benchmarking of reinforcement learning systems
- Design and conduct rigorous benchmarking and evaluation protocols to measure model performance, track progress against baselines, and validate improvements across multimodal tasks
- Analyze and optimize policy performance across modalities by identifying bottlenecks in training, credit assignment, and cross-modal alignment
- Investigate and develop new reinforcement learning paradigms that learn more effectively from environment feedback, with the goal of achieving state-of-the-art (SOTA) performance
- Publish research findings in top-tier conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, and ECCV
Skills
- A Master's degree in Computer Science or a related field is required
- Proven experience running large-scale reinforcement learning experiments in multimodal and vision-centric systems, including online RL settings, with demonstrated impact on domain-specific decision-making and measurable improvements in policy performance
- Deep understanding of reinforcement learning algorithms and optimization methods applied to vision and multimodal learning problems, with a focus on improving policy stability, exploration, and sample efficiency in high-dimensional environments involving images, video, and other modalities
- Strong proficiency in PyTorch and deep learning frameworks for vision and multimodal AI, with hands-on experience building end-to-end RL pipelines covering simulation, training, evaluation, and deployment in production-grade systems
- Demonstrated ability to apply research to solve core RL challenges in multimodal and vision tasks, such as sample inefficiency, exploration-exploitation tradeoffs, and training instability, along with experience designing robust evaluation frameworks and iterating on algorithmic improvements to advance agent performance
- Proven track record of research publications in top-tier conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, and ECCV
- A PhD in Machine Learning, NLP, Computer Vision, or a closely related discipline is preferred, along with a strong track record of AI research and publications in top-tier conferences
Company Overview