Any good investigation builds on assumptions, and when they're wrong you can
end up down a dark path. Read about an incident where this happened badly,
and learn some strategies to avoid falling into the same trap.
Starting next week, I'll be joining the incident.io team after logging off for the
last time at GoCardless. Mostly for myself, here are some reflections on the
last five years.
Working hard is a great way to accelerate learning, but it can come at a
cost. This post shares my experience and lessons taken from great role
models I've found along the way.
The upcoming Golang embed directive can help distribute applications that
website into your Go program, simplifying distribution to a single binary.
Building a modern infrastructure stack is difficult, with a bewildering
number of choices to be made. Some technologies complement each other, while
others have very different philosophies: it’s easy to get lost.
To help those facing similar challenges, we’re open-sourcing our “Getting
Started” tutorial, which is what we ask all GoCardless developers to follow
during their onboarding.
Compression is a trick that can be used to solve a load of problems. Outside
of well-known use cases, there are a variety of opportunities to improve
efficiency or save money by leveraging compression.
This post covers one such opportunity, where a tiny change allowed us to
save >$30k per year in infrastructure cost, along with a few other
big-savers from judicious application of compression.
Your company probably has a lot of data. When you expose all of these
different sources under a tool that makes complex analysis as fast as
thought, you'll create a load of opportunities to make data-driven decisions.
By sharing an example where 2 hours of analysis helped prioritise 2-4 weeks of
engineering work, I'm going to try to convince you that the value of a
connected dataset is far more than the sum of its parts.
Tips and tricks to better handle incidents, learned over years of dealing
with production issues. Included are opinions on strategy, process, tools
and how to handle the all-important human element.
Read this if you're new to incident response and want a starter-pack of
advice, or to contrast your own perspective with another.
As a team's infrastructure estate grows, it becomes increasingly beneficial
to create a global registry of all people, services, and components. Once
you do, you can integrate with tools like Terraform, Chef, and Kubernetes to
help provision your infrastructure according to a single authoritative source.
This post explains how we built our registry at GoCardless, and some of the uses
we’ve put it to.
Most Prometheus metrics recording durations are subject to a
time-of-measurement bias, causing misleading graphs that can derail
investigations. See how an open-source Tracer can help solve this problem.