The decision is binary. No nuance.
If the new val_bpb is lower than the current best, the commit stays. The branch advances. If val_bpb is equal or higher, the commit gets reverted with git reset HEAD~1. If training crashed, the commit also gets reverted.
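The rule above can be sketched as a small function. This is a minimal illustration rather than the actual implementation: the name decide, the "keep"/"revert" return values, and the use of None to signal a crashed run are all assumptions made for the example.

```python
from typing import Optional


def decide(new_bpb: Optional[float], best_bpb: float) -> str:
    """Return "keep" or "revert" for one experiment.

    new_bpb is the run's val_bpb, or None if training crashed
    (the None convention is an assumption of this sketch).
    """
    if new_bpb is None:
        # Crashed run: nothing to compare, so the commit is reverted.
        return "revert"
    if new_bpb < best_bpb:
        # Strictly lower val_bpb: the commit stays and the branch advances.
        return "keep"
    # Equal or higher (a NaN also fails the comparison): revert.
    return "revert"
```

A caller would act on "revert" by undoing the last commit and on "keep" by recording new_bpb as the new best. Note the strict less-than: a tie counts as a failure.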
This creates a ratcheting mechanism. Your git branch can only move forward. Every surviving commit measured a lower val_bpb than the one before it, so the branch history is a monotonically improving sequence.
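There's no tolerance band. Any improvement, however small, is technically kept. This is how metric gaming through random seed changes can creep in: a change that only alters the seed can land a slightly lower val_bpb by chance and survive.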
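To see why the missing tolerance band matters, here is a small simulation under stated assumptions. The model never changes, every run has the same true val_bpb, and only seed noise varies. TRUE_BPB, NOISE_STD, the Gaussian noise model, and the ratchet helper are all hypothetical values and names chosen for illustration, not measurements from a real run.

```python
import random


def ratchet(measurements):
    """Apply the strict keep-if-lower rule; return the best value after each trial."""
    best = measurements[0]
    history = [best]
    for m in measurements[1:]:
        if m < best:  # no tolerance band: any decrease is kept
            best = m
        history.append(best)
    return history


rng = random.Random(0)   # fixed seed so the sketch is reproducible
TRUE_BPB = 1.0           # hypothetical true metric, identical for every run
NOISE_STD = 0.002        # hypothetical run-to-run noise from seed changes

# Fifty "experiments" that change nothing except the seed.
measurements = [TRUE_BPB + rng.gauss(0.0, NOISE_STD) for _ in range(50)]
history = ratchet(measurements)

# Count how many no-op experiments would have been kept as "improvements".
kept = sum(1 for prev, cur in zip(history, history[1:]) if cur < prev)
print(f"kept {kept} no-op commits; recorded best = {history[-1]:.4f}")
```

The recorded best ends up at the minimum of the noisy measurements, below the true value, even though no run was actually better. Requiring an improvement larger than the noise level would be one way to guard against this, though that goes beyond the rule as described here.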