Let me save you some money. Not every message needs your most expensive model. Heartbeat pings ("are you still there?") can run on Flash-Lite at a fraction of the cost. Simple acknowledgments and status checks can use DeepSeek or another cheap model.
Reserve your premium model (like Claude Sonnet) for complex reasoning tasks: code review, strategic planning, nuanced conversation. You configure this in your agent's model field with conditional logic. A well-tiered setup can cut your monthly API bill by 60-80% without noticeably degrading response quality for routine interactions. That's real money back in your pocket.