[Remote] Research Intern (LLM)

Work from home | Full-time role | Hiring

Note: This is a remote position open to candidates in the USA. Abaka AI is focused on advancing artificial intelligence research and is seeking a Research Intern to contribute to the development of challenging QA datasets and to evaluate large language models. The role involves collaboration with global researchers and requires strong analytical and execution skills.

Responsibilities

  • Design and construct high-quality, sufficiently challenging QA datasets (graduate/PhD level) inspired by GPQA, HLE, and AI4Sci families, collaborating with a global network of talented researchers
  • Evaluate large language models on reasoning, factuality, and problem-solving benchmarks
  • Develop review pipelines and quality-control criteria for expert-level question generation
  • Analyze model outputs, conduct error taxonomy studies, and summarize insights for internal reports and research papers
  • Collaborate with the 2077AI Foundation’s open-source benchmark teams on public dataset releases

Skills

  • Strong background in computer science, data engineering, artificial intelligence, or related fields, with hands-on experience in large-scale data systems
  • 1+ years of experience with LLMs, prompt engineering, and evaluation frameworks (e.g., LM Eval Harness, OpenCompass)
  • Excellent written and verbal English skills and analytical reasoning
  • Strong execution and team management skills—able to translate high-level objectives into actionable plans and drive team outcomes
  • Experience with formal methods, chain-of-thought evaluation, or curriculum generation
  • Relevant publications in top conferences

Company Overview

  • Abaka AI is a leading AI company committed to becoming the data partner for the artificial intelligence industry. It was founded in 2021 and is headquartered in Palo Alto, California, USA, with a workforce of 51-200 employees. Its website is https://www.abaka.ai/.

Company H1B Sponsorship

  • Abaka AI has a track record of offering H1B sponsorships, with 2 sponsorships in 2025. Please note that this does not guarantee sponsorship for this specific role.