<p>We’re getting to the last mile with our investigations system and digging into the message format: how we package the ‘RCA’, the next steps we propose to responders, and so on.</p>

<p>I wanted to share an example of how Sonnet 3.7 and GPT-4.1 differ when formatting the investigation message.</p>

<p>Things to note are:</p>
<ul>
  <li>Sonnet 3.7 is much more concise than GPT-4.1, and if you look carefully at the messages, almost no information is lost; it’s just speaking more plainly</li>
  <li>GPT-4.1 is more verbose and restates technical detail. We’ve found that useful in other parts of our investigation system (we’re using a lot of GPT-4.1 to build the data behind this message!), but it doesn’t translate well to a human-readable message</li>
  <li>GPT-4.1 is more likely to explain its reasoning and caveats, and it has downgraded the confidence slightly (high -&gt; medium), which is consistent with our experience of the model elsewhere</li>
</ul>

<p>In this case, I much prefer the Sonnet version. When you’ve just been paged, you want a concise, human-friendly message to complement your error reports and stack traces, so we’re going to stick with Claude for this prompt!</p>