Your agent writes its change directly into train.py. It might adjust a single number (learning rate from to ), restructure a block (replacing the warmup schedule), or add new functionality (a regularization term).
The Simplicity Criterion from program.md guides what your agent does. A small improvement plus lines of hacky code means reject. A small improvement from deleting code means keep. Your agent should prefer the simplest change that works.