Parallel GPUs change what the agent can search. Instead of testing one variable at a time, the agent runs factorial grid searches. It tests to experiments per wave, varying multiple parameters at once.
Example: the agent tested model widths in a single wave. On a single GPU, that's sequential experiments over + minutes. On GPUs, all run simultaneously. The agent sees the full trend in one round and picks the best width immediately. This catches parameter interactions that sequential search misses.