Learning from Giants #12
Design Docs at Google, Timeless Learnings about using Databases, Working Backwards at Amazon, and Understanding Binary Alternatives to JSON.
👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies.
Design Docs at Google
"One of the key elements of Google's software engineering culture is the use of design docs for defining software designs."
Although each company has its own name for them, Design Docs are a process you'll find at most prominent tech companies. They all solve the same problem: how do you catch issues early in a project's lifecycle, while making changes is still cheap?
But there's much more goodness to Design Docs: ensuring architecture consistency, distributing knowledge, acting as documentation, and more.
"The design doc documents the high level implementation strategy and key design decisions with emphasis on the trade-offs that were considered during those decisions."
📗 Malte Ubl's Design Docs at Google is much more than a description of how Google does things. The author outlines best practices for writing Design Docs, the documents' ideal size (10-20 pages), and lifecycle. Here are a few curated highlights:
"The design doc is the place to write down the trade-offs you made in designing your software."
"Design docs should rarely contain code, or pseudo-code."
"Not all topics require Design Docs. A clear indicator that a doc might not be necessary are design docs that are really implementation manuals."
Things I Wished More Developers Knew About Databases
Datastores are often the systems we blindly trust to bring guarantees such as consistency, isolation, and fault tolerance to our applications for free. But your database may not be as consistent, isolated, or fault-tolerant as you think.
"In data-heavy systems, databases are at the core of system design goals and tradeoffs."
Yes, tradeoffs. And making informed tradeoffs requires either solid knowledge or having made the mistake firsthand.
📗 Jaana Dogan's Things I Wished More Developers Knew About Databases is advice learned through experience. The 17 points are timeless and actionable because they're not specific to any database technology or system. Having worked as an engineer at Google (notably on Spanner), AWS, and GitHub, Jaana has seen many mistakes cause data loss and outages on many different systems.
Here are my top 3:
Each database has different consistency and isolation capabilities.
Stale data can be useful and lock-free.
Evaluate performance requirements per transaction.
Working Backwards: a Classic Amazon Product Management Process
Start by writing the Press Release.
Sound familiar? It's part of Working Backwards, a product definition process used at Amazon.
"To ensure that a service meets the needs of the customer (and not more than that) we use a process called "Working Backwards" in which you start with your customer and work your way backwards until you get to the minimum set of technology requirements to satisfy what you try to achieve."
It can sound simple, but it's a powerful technique to avoid shipping a bloated product during a multi-month process where the teams involved ultimately lose sight of the problems they're solving.
And it's not just a Press Release; there's an entire four-step customer-centric process.
📗 Werner Vogels's Working Backwards describes the famous Amazon technique. The author explains these four steps and how they help clarify what the team will build.
"We know at that point that the whole team has a shared vision on what product we are going the build."
JSON Alternatives: an Overview of Binary Serialization Formats
"Serialization is the process of translating a data structure into a bit-string (a sequence of bits) for storage or transmission purposes. The original data structure can be reconstructed from the bit-string using a process called deserialization"
Yes, like JSON, or XML.
But there are hundreds of others, some of which are used extensively: Apache Avro, Protocol Buffers, FlatBuffers, Apache Thrift, MessagePack...
Why can't everybody use JSON? Software is a trade-off game, and some of JSON's shortcomings led people to find alternatives:
Runtime efficiency: serializing and parsing JSON is resource-intensive.
Space efficiency: JSON takes a lot of space, which is especially costly for network traffic.
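To make the space argument concrete, here is a minimal Python sketch that encodes the same record as JSON and as a fixed binary layout. The record, its field names, and the layout are hypothetical and chosen only for illustration. The standard library's struct module stands in for a real schema-based format such as Protocol Buffers or Avro, which work differently in detail.

```python
import json
import struct

# Hypothetical record, used only for illustration.
reading = {"sensor_id": 42, "temperature": 21.5, "ok": True}

# Text encoding: field names and punctuation travel with every message.
json_bytes = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Binary encoding with a fixed layout both sides agree on (a stand-in for a schema):
# "<" little-endian without padding, "I" uint32, "d" float64, "?" bool.
LAYOUT = struct.Struct("<Id?")
binary_bytes = LAYOUT.pack(
    reading["sensor_id"], reading["temperature"], reading["ok"]
)


def decode(payload: bytes) -> dict:
    """Rebuild the record; field names come from the shared layout, not the payload."""
    sensor_id, temperature, ok = LAYOUT.unpack(payload)
    return {"sensor_id": sensor_id, "temperature": temperature, "ok": ok}


if __name__ == "__main__":
    print(len(json_bytes), len(binary_bytes))
    print(decode(binary_bytes))
```

The binary payload is 13 bytes (4 + 8 + 1), several times smaller than the JSON text. The cost is that it is unreadable without the layout, which is the kind of trade-off schema-driven formats make.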
📗 Juan Cruz Viotti's A Survey of JSON-compatible Binary Serialization Specifications paper is the best read on binary serialization formats. After introducing JSON's shortcomings, the author goes through the history of binary serialization formats and describes the most used ones, their optimized use cases, and their evolution. I know sharing 100-page papers won't work for everybody, but there is a surprising lack of quality content on serialization formats. And this one is truly great!