I had a great conversation with Ravin Thambapillai for the AI Adoption Playbook podcast recently, covering:
- How we’re building an AI investigator at incident.io to analyze logs, traces and metrics to automatically root cause your incidents.
- Why we’ve adopted ‘scorecard-driven development’, backed by an evaluation framework that lets us make changes to the system with confidence, knowing they improve it (and how!)
- Combining automated LLM evaluations with human feedback to monitor performance in production.
Anyone building AI agents, especially those stuck in the “is this good? how do I safely change it?” phase, should find this really useful.
Link to the podcast is in thread 👇