Common Mistakes

##### ###### ##### ### # # ### # # ###### ## ## ## ## ## ## ## # # # # # ## ##### #### ##### # # # # # # # #### ## # ## ## ## ## # # # # # ## ## # ###### ## ### # ### # ######

$1.$ Ambiguous success criteria. "Keep if results are better" gives no threshold. Write "keep if val_bpb decreases" instead.

$2.$ Missing crash handling. Your agent needs to know: fix simple crashes (typos, missing imports) and move on from broken ideas.

$3.$ Over-specifying implementation. "Insert a layer after attention" removes the agent's ability to find its own solution. Name the goal, not the steps.

$4.$ Leaving VRAM unconstrained. Without any memory guidance, your agent will scale the model until it OOMs on every experiment.

$5.$ Forgetting the NEVER STOP directive. Without it, your agent will politely ask if it should continue. At $3$ AM, nobody answers.