Your AutoResearch agent is an LLM. Which one you pick matters.
[[u:/roadmaps/claude-code|Claude Opus]] is the recommended choice; it's what Karpathy himself used. GPT- also works. Codex, however, fails: it ignores the "NEVER STOP" directive in program.md and halts mid-experiment, breaking your loop.
For local inference, community contributors have added Ollama support, so you can run open-weight models on your own hardware. Check the repository's discussions for setup details.
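As a rough sketch of what talking to a local model looks like: Ollama serves an HTTP API on port 11434 by default, and a chat request can be built with the standard library alone. The model name and prompt below are illustrative, and how the agent itself is configured to use this endpoint is project-specific; see the repository's discussions for the actual setup.

```python
import json
import urllib.request

def ollama_chat_request(model, prompt, host="http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/chat endpoint."""
    body = json.dumps({
        "model": model,            # any open-weight model pulled via `ollama pull`
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,           # ask for one complete JSON response
    }).encode()
    return urllib.request.Request(
        f"{host}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = ollama_chat_request("llama3", "Summarize today's experiment log.")
# Sending it requires a running Ollama server:
# resp = json.load(urllib.request.urlopen(req))
```

Any model you have pulled locally can be substituted for `llama3`; the request shape stays the same.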