Learning from Giants #16
Raft Consensus Algorithm, Understanding DNS, Creating Good OKRs, WebAssembly for extensibility at Shopify, and Choosing a North Star Metric for your Company.
👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies.
The Raft Consensus Algorithm
One of the foundational concepts of distributed database systems is consensus and consensus algorithms.
Consensus algorithms describe the process followed by multiple servers to agree on a value. Their complexity lies in their capacity to reach an agreement when one or multiple servers fail or are not reachable.
"Typical consensus algorithms make progress when any majority of their servers is available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail."
Many generations of algorithms led to Paxos, the most used consensus algorithm to date. However, Paxos is highly complex to understand and implement correctly and does not provide a practical solution to all consensus problems. So most systems use "Paxos-like" algorithms rather than exact implementations of Lamport's paper.
Raft was designed to solve that: an easier-to-grasp and more practical algorithm.
"Raft is a consensus algorithm that is designed to be easy to understand.
"It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent subproblems, and it cleanly addresses all major pieces needed for practical systems."
📗 Raft's landing page is a good place to start reading about the consensus algorithm. Still, if you're looking for a step-by-step introduction, I highly recommend The Secret Lives of Data's animated Raft introduction. It takes you through most of the quirks that make Raft one of the most used consensus algorithms in recent distributed systems, e.g., in etcd, MongoDB, and CockroachDB.
To go further: Animated introduction to Raft (The Secret Lives of Data)
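The majority rule quoted earlier in this section ("a cluster of 5 servers can continue to operate even if 2 servers fail") is simple arithmetic. Here is a minimal sketch of it; the function names are my own illustration and do not come from Raft or Paxos implementations.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers can fail while a majority is still available."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, available: int) -> bool:
    """True when the available servers still form a majority."""
    return available >= majority(cluster_size)


if __name__ == "__main__":
    for size in (3, 4, 5):
        print(size, majority(size), tolerated_failures(size))
```

Note that a cluster of 4 tolerates only 1 failure, the same as a cluster of 3, which is one reason odd cluster sizes are the usual choice.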
What happens when you update DNS entries?
"There are 2 kinds of DNS servers: authoritative and recursive."
Authoritative servers are just DNS record databases; they don't power the internet alone. Instead, they are the data source of recursive DNS servers.
8.8.8.8, Google's public DNS server, and 1.1.1.1, Cloudflare's privacy-focused alternative, are recursive DNS servers. Your device calls one of them when it makes a request to a remote server identified by a domain name.
Recursive DNS servers have one role: answering DNS queries, i.e., returning the list of DNS records for a domain. They do so by querying authoritative servers recursively, and they cache the results heavily for obvious bandwidth and performance reasons.
Where does the recursive query start?
"It has IP addresses for the root DNS servers hardcoded in its source code."
📗 Julia Evans's How Updating DNS Works introduces DNS queries and how they work from the perspective of a user willing to update a domain's DNS records. While just scratching the surface, it shows the fundamental principles of DNS and how theory differs from practice, particularly on cache TTL.
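To make the recursive lookup and the cache TTL concrete, here is a toy sketch of a recursive resolver. Everything in it is an assumption for illustration: the server names, the in-memory `AUTHORITATIVE` table, the 300-second TTL, and the address from a documentation-only IP range. It does no real networking and is not how any actual resolver is implemented.

```python
import time

# Hardcoded starting point, standing in for the root server IPs (hypothetical label).
ROOT_SERVERS = ["root-a"]

# Hypothetical authoritative servers: each one either refers the resolver
# to a more specific server or holds the final record with its TTL in seconds.
AUTHORITATIVE = {
    "root-a": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"example.com": "ns-example"}},
    "ns-example": {"records": {"www.example.com": ("192.0.2.10", 300)}},
}


class RecursiveResolver:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cache = {}  # name -> (address, expiry time)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        cached = self._cache.get(name)
        if cached and cached[1] > self._clock():
            return cached[0]  # still fresh: no upstream query needed

        server = ROOT_SERVERS[0]
        while True:
            self.upstream_queries += 1
            zone = AUTHORITATIVE[server]
            records = zone.get("records", {})
            if name in records:
                address, ttl = records[name]
                self._cache[name] = (address, self._clock() + ttl)
                return address
            server = self._follow_referral(zone, name)

    @staticmethod
    def _follow_referral(zone: dict, name: str) -> str:
        # Try the most specific suffix first: www.example.com, example.com, com.
        labels = name.split(".")
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in zone.get("referrals", {}):
                return zone["referrals"][suffix]
        raise LookupError(f"no referral for {name}")
```

The first lookup walks root, TLD, and authoritative servers; repeated lookups are served from the cache until the TTL expires. That is why an updated record can keep resolving to its old value for a while.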
How to Create Good OKRs
OKRs are the de facto goal-setting method for start-ups.
"OKRs are so dominant because they bridge a painful gap in every company's life: the one between goals and operations."
What bridges that gap is alignment. When operations grow into many different teams, companies must ensure all steps are taken in the right direction.
More specifically, the OKRs method aligns people in a non-prescriptive, measurable way: it focuses on the goals, not the outputs.
📗 Luca Rossi's How to Create Good OKRs introduces the OKRs method by explaining why it's so common and powerful and focusing on a few critical aspects of OKRs. The author skims over many topics that require hours of study and years of practice to master, but provides some pointers to resources that can help speed up this process.
"You can find plenty of articles on how to create good OKRs. No company is the same, though, and everyone needs to iterate to eventually find out what works for them."
How Shopify Uses WebAssembly for Embedded Extensibility
"At Shopify we aim to make what most merchants need easy, and the rest possible. We make the rest possible by exposing interfaces to query, extend and alter our Platform."
This led to Shopify building one of the world's largest B2B app stores, with thousands of apps. All of them used the same API-based integration. Shopify Apps can customize how e-commerce shops work through webhooks and direct API calls.
But this API model has limits:
It only allows async workflows, so that App failures cannot cascade to Shopify.
It has high latency: every interaction goes through webhook delivery, webhook processing by the App, and an API call.
When pushing extensibility even further, the Shopify engineers had no choice: to build reliable synchronous extensible workflows, they needed to run the App code on Shopify's infrastructure. That would solve reliability, scalability, and latency.
📗 Shopify's How Shopify Uses WebAssembly Outside the Browser details why the team chose WebAssembly as a technological basis for this embedded extensibility problem. Duncan Uszkay then details how the team runs thousands of these small WebAssembly modules in the Shopify infrastructure at key synchronous points of their Order workflow, like Checkout.
It's impressive how Shopify has pushed the limits of extensibility everywhere in its product!
Choosing Your North Star Metric
"Your North Star Metric is your strategy, and your strategy is your North Star Metric. Choose wisely."
That North Star Metric is revenue for more than 50% of companies surveyed in the article. Shopify tracks GMV, Notion revenue growth, and Webflow ARR. So simple. Yet it may not be the right match for your company. Using revenue as the sole driver of a company can be pretty dangerous.
"It's spiky, and thus hard to make operational."
"Focusing on revenue goals too early can lead to suboptimal decisions."
"A goal around revenue can be uninspiring to the team."
How can you find the right North Star Metric for your company?
📗 Lenny Rachitsky's Choosing Your North Star Metric surveys growth-stage companies' North Star Metrics before giving a framework for choosing one for your company. By looking at the metrics that would "most accelerate your business' flywheel" and at "jobs your users are hiring your product to do", you will find your path to your NSM!
Disclaimer for earlier-stage companies: your North Star should only be product-market fit.
"The examples listed above are growth-stage companies. In the earliest stages of a company, however, before you've found the fabled product-market-fit, your singular aim should be answering one question: "Am I building something people want?""
"Your North Star Metric is your strategy, and your strategy is your North Star Metric. Choose wisely."