What does switching to AI Engineering actually mean for your career? Drawing
  from my experience pivoting to SRE, I offer an honest assessment of the
  opportunities and challenges for software engineers considering this
  move—from high-impact, high-pressure work to the realities of slower
  progress and less visible product surface.
        
  Working in AI today, I'm seeing the innovator's dilemma play out in real
  time. While larger companies carefully plan deployments that work for their
  entire customer base, smaller teams like ours can ship, learn, and improve
  our AI products through actual usage. This isn't just about moving
  faster—it's about fundamental advantages in how AI products develop that
  favor startups, regardless of the resources incumbents can deploy. The
  dynamics surprised me, and they might surprise you too.
        
  I've met teams who switched to Python just to build AI features, abandoning
  their normal stack for the ecosystem. But it's really not worth it! At
  incident.io we stuck with Go and it's been great. It turns out static typing
  and proper concurrency are exactly what you want when building AI systems,
  provided you build some nice abstractions to go with it.
        
  The gap between demo-ready AI products and production-grade systems is much
  larger than most realise. This post explains the four stages of AI product
  maturity, what tooling you actually need to build reliable AI systems, and
  how to recognise if you're stuck in the MVP trap.
        
  From reliability engineering to wrestling with LLMs, my fourth year at incident.io pushed me harder than I'd expected. We launched On-call, weathered some tough times as a team, and I ended the year diving fully into AI.
        
  A story about how incident response training went wrong, with valuable
  lessons about pod priorities, isolation, and the importance of a healthy
  incident response culture.
        
  Of the mental models and rules I use in my life, by far the most useful is
  to learn only one thing at any given time.
        
  My reflections on 2023, my second full year at incident.io. We doubled the
  team this year (34 to 77), launched Status Pages and Catalog, and spent the
  last six months building a really exciting new product.
        
  Whenever a system has access to a consistent store, you can extend that
  consistency through compare-and-swap to the system's users. This post shows
  how you can add CAS to an HTTP API using example code and real-world
  examples.
        
  If you build a state machine on top of a relational database you can
  abstract concurrency problems away from your business logic and allow
  developers to write safe-by-default code without dealing with concurrency
  concerns.
  This post explains how to build a library that offers those protections, and
  how they work under the hood.
        