Testing autonomous agents (Or: how I learned to stop worrying and embrace chaos)
Look, we've spent the last 18 months building production AI systems, and we'll tell you what keeps us up at night. It's not whether the model can answer questions. That's table stakes now. What...
