Original DP: dp[i][k] = best cost to partition positions 1 to i into exactly k segments. This is O(n⋅K) states. Penalized DP: dpλ[i] = best (cost + λ× count) to partition positions 1 to i into any number of segments.
This is O(n) states. The penalized DP is much faster. You compute it for different λ values until you find the one that produces exactly K segments.