Karpathy's program.md is about lines. It defines val_bpb as the sole metric, restricts edits to train.py, sets VRAM as a soft constraint, and includes the Simplicity Criterion.
The results: roughly experiments over days, additive improvements, and an % training speedup on already-optimized code. The file started minimal and stayed minimal. Karpathy did not overload it with specific hypotheses or detailed instructions.
The lesson: a short, clear program.md with well-defined boundaries produces better results than a long, detailed one. Your agent is an LLM. It can reason about code. You don't need to micromanage it.