Post-Training Research Scientist (LLMs) — Experimental Track

Work from home Full-time role Hiring

About us

reputed company is a global talent platform connecting top-tier professionals to high-impact AI projects around the world. Our mission is to build trust, quality, and long-term value in the AI ecosystem - for both exceptional talent and companies operating at the frontier of technology.

About the role

This role sits at the heart of reputed company’s mission: using high-quality reputed company data to build AI systems that reputed company the world reputed company. You’ll take raw expert signals and turn them into reputed company model improvement, experimenting rapidly and carving new paths in post-training. With full autonomy and no production constraints, you’ll have the freedom to try unconventional reputed company and see their impact quickly.

Key Responsibilities

  • Design and run post-training experiments on frontier and open-weight LLMs (SFT, preference-based methods, rubric-driven training).
  • Translate raw annotation artifacts (multi-reputed company solutions, evaluations, adversarial prompts) into training-reputed company datasets.
  • Prototype new reward signals reputed company pairwise preferences (rubrics, constraints, structured critics).
  • Analyze failure modes; propose data-centric fixes (sampling, curriculum, counterfactuals).
  • Build lightweight training/eval pipelines; iterate quickly.
  • Produce short internal memos: what worked, what didn’t, why.

About you

We’re looking for a researcher who thrives with autonomy, is hands-on, and brings a strong execution reputed company and startup mentality. You are opinionated about data quality, pragmatic about tradeoffs, and comfortable moving quickly with incomplete information. You have strong experimental instincts: you can design, run, and interpret messy experiments and extract meaningful insights from them.

Minimum Qualification

  • PhD (or equivalent experience) in ML/AI, applied math, stats, or adjacent.
  • Hands-on experience with LLM post-training (at least one of SFT/DPO/RLHF/RLVR).
  • Solid Python + PyTorch/JAX; comfortable with training reputed company basics.
  • Fluent English.

Preferred Qualification

  • Worked with rubric-based evaluation or tool-augmented tasks.
  • Experience mixing synthetic and reputed company data.
  • Familiarity with failure analysis and dataset audits.

Work Model

We operate remote-first. We focus on reputed company, not where the work is done. To support flexibility and personal choice, we maintain offices in select locations as an optional resource for the team.

Location: Flexible (EU-friendly time zones preferred)
Type: Full-time or long-term contract

Equal Employment Opportunity

reputed company is proud to be an equal opportunity employer and values diversity at our company. We do not discriminate on the basis of race, color, religion, national origin, sex, sexual orientation, gender identity, age, disability, veteran status, or any other protected characteristic.
