Overview / Description

ml-intern is an open-source autonomous agent that takes over the repetitive engineering work of ML post-training. It reads arXiv papers to surface relevant techniques, generates and repairs datasets, submits and monitors training jobs, diagnoses failures, and re-runs experiments on its own — closing a loop that normally requires a dedicated team. In independent benchmarks, it delivered a +22 point GPQA improvement in under 10 hours and a +60% gain on HealthBench. Designed for ML researchers and engineers who want to compress iteration cycles, ml-intern runs headlessly, logs its reasoning at each step, and can be pointed at any post-training objective. Being fully open-source, it's inspectable and forkable for custom training pipelines.
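The submit, monitor, diagnose, and re-run cycle described above can be pictured with a minimal sketch. This is not ml-intern's code or API: every name here (`post_training_loop`, `RunResult`, `fake_train`, `halve_batch`) is hypothetical, and the "training job" is a stand-in that fails when the batch size is too large.

```python
# Illustrative only: all names are hypothetical and are not part of ml-intern.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RunResult:
    ok: bool          # did the training job finish?
    score: float      # evaluation score (meaningful only when ok is True)
    error: str = ""   # failure message when ok is False


@dataclass
class LoopLog:
    steps: List[str] = field(default_factory=list)

    def note(self, message: str) -> None:
        # Record the reasoning for each step so the run can be audited later.
        self.steps.append(message)


def post_training_loop(
    train: Callable[[dict], RunResult],
    diagnose: Callable[[dict, str], dict],
    config: dict,
    target: float,
    max_attempts: int = 5,
) -> Tuple[float, LoopLog]:
    """Submit a job, repair the config on failure, stop once the target is met."""
    log = LoopLog()
    best = float("-inf")
    for attempt in range(1, max_attempts + 1):
        log.note(f"attempt {attempt}: submitting job with {config}")
        result = train(config)
        if not result.ok:
            log.note(f"attempt {attempt}: failed ({result.error}); diagnosing")
            config = diagnose(config, result.error)
            continue
        best = max(best, result.score)
        log.note(f"attempt {attempt}: finished with score {result.score}")
        if best >= target:
            break
    return best, log


def fake_train(config: dict) -> RunResult:
    # Stand-in for a real training job: large batches "run out of memory".
    if config["batch_size"] > 8:
        return RunResult(ok=False, score=0.0, error="out of memory")
    return RunResult(ok=True, score=0.7)


def halve_batch(config: dict, error: str) -> dict:
    # Stand-in for failure diagnosis: shrink the batch after a memory error.
    if "memory" in error:
        return {**config, "batch_size": config["batch_size"] // 2}
    return config


if __name__ == "__main__":
    best_score, run_log = post_training_loop(
        fake_train, halve_batch, {"batch_size": 32}, target=0.6
    )
    for step in run_log.steps:
        print(step)
    print("best score:", best_score)
```

The log is kept as a plain list of messages to mirror the idea of recording reasoning at each step; a real agent would also need job monitoring, dataset repair, and paper retrieval, none of which are modeled here.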

Used For

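Automating ML post-training workflows: literature review, dataset generation and repair, training-job submission and monitoring, and failure diagnosis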

Pricing

Free (Open Source)

$0/month

Free and open source — run via the Hugging Face Space or self-host


Pros & Cons

Pros

• Automates the full ML post-training loop: paper reading, dataset generation, job submission, failure diagnosis, and iteration
• Proven results: +22 GPQA points in under 10 hours and +60% improvement on HealthBench
• Runs headlessly and logs reasoning at every step for full transparency
• Fully open source: inspectable and forkable for custom post-training pipelines
• Designed to compress ML iteration cycles that normally require a dedicated team

Cons

• Requires significant compute resources to run training jobs autonomously
• Self-hosting demands ML infrastructure expertise
• Benchmark results are from early runs; results on novel tasks will vary

Alternatives

Compare this tool against close alternatives in the same category, focusing on output quality, onboarding speed, and workflow fit.

Questions & Answers

What is ml-intern?

What benchmark results has ml-intern achieved?

Does ml-intern run autonomously?

Is ml-intern open source?

Who is ml-intern built for?
