Learning from Giants #53
HashiCorp's founder on building Large Technical Projects, why emojis under the hood are complex but fun Unicode combos, and the timeless discussion of how to Scale Stateful Objects.
👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies. Guaranteed 100% GPT-free content.
Did a friend send this to you? Subscribe to get these weekly drops directly in your inbox. Read the archive for even more great content. Also: I share these articles daily on LinkedIn.
M. Hashimoto’s approach to building Large Technical Projects
How Mitchell Hashimoto (Terraform, Vagrant, and many more) builds and completes large projects.
"Whether it's building a new project from scratch, implementing a big feature, or beginning a large refactor, it can be difficult to stay motivated and complete large technical projects."
Who better than Mitchell, founder of HashiCorp and founding engineer of many of its impressive tools, to take advice from on that topic?
He advises pacing your work around visible results to keep your excitement high, and using continuous demo-ability and that excitement to decide what to build next. How does he stay excited about the project?
"The goal is to always give yourself a good demo. [...] Initially, I try to think what a realistic project is where I can see results as soon as possible."
Break the large project down into small sub-projects with visible results, then assemble them into a first demo.
Once you reach a runnable demo, you can return to small sub-problems and fix the most critical issues. A usable demo helps prioritize these problems: you can now feel firsthand what needs improvement.
📗 Mitchell Hashimoto's My Approach to Building Large Technical Projects explains the method through the example of a terminal emulator he has been building as a personal project. While the advice may seem generic, it's essential to remember that motivation is deeply personal.
"Everyone I think needs to find some process to reinforce their motivation in a healthy way. I realized seeing results motivates me really strongly, I've built my work style around that, and it has worked well for me thus far."
Emoji under the hood
Did you know that complex emojis are just combinations of existing ones?
👨 + 💻 → 👨‍💻
🐻 + ❄️ → 🐻‍❄️
The Unicode implementation of emojis is quite fascinating.
"At their simplest, they are just that: another symbol in a Unicode table. That's why emoji behave like any other letter: they can be typed in a text field, copied, pasted, rendered in a plain text document,..."
The simple ones have a code of their own.
🐙 is just U+1F419.
The rest are the source of many engineer nightmares because one character maps not to one but to a sequence of codepoints.
"Meet Grapheme Clusters. Grapheme cluster is a sequence of codepoints that is considered a single human-perceived glyph."
"Grapheme clusters create many complications for programmers. You can't just do substring(0, 10) to take the first 10 characters—you might split an emoji in half (or an acute, so don't do it anyway)!"
⚠️ You can't cut strings, measure their length accurately, or reverse them without a grapheme-cluster-aware library. Complex emojis are sequences of multiple code points, some of which are valid standalone emojis (but not all).
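To make the warning concrete, here is a minimal Python sketch. It only uses built-in string operations, which work on code points rather than grapheme clusters. The helper name `code_points` is made up for this example.

```python
# Python strings are sequences of code points, not grapheme clusters.
octopus = "\U0001F419"                        # 🐙: one code point
technologist = "\U0001F468\u200D\U0001F4BB"   # 👨 + ZERO-WIDTH JOINER + 💻: one visible glyph


def code_points(text):
    """Return the code points of `text` in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(octopus))        # ['U+1F419']
print(code_points(technologist))   # ['U+1F468', 'U+200D', 'U+1F4BB']

# One glyph on screen, but a length of three.
print(len(technologist))           # 3

# Naive slicing cuts the cluster: only the man (U+1F468) is left.
truncated = technologist[:1]

# Naive reversal reorders the code points: laptop, joiner, man.
reversed_naive = technologist[::-1]
```

Counting, slicing, or reversing by user-perceived characters needs grapheme segmentation, which Python's built-in `str` does not provide; a third-party library is required for that.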
"Instead of adding a new codepoint for each emoji and skin tone combination, only five new codepoints were added. [...] Together they form a ligature: 👋 (U+1F44B WAVING HAND SIGN) directly followed by 🏽 (U+1F3FD MEDIUM SKIN TONE MODIFIER) becomes 👋🏽."
"U+200D is called ZERO-WIDTH JOINER, or ZWJ for short. It works similarly to what we saw with skin tone, but this time you can join two self-sufficient emoji into one. Not all combinations work, but many do, sometimes in surprising ways!"
📗 Nikita Prokopov's Emoji under the hood details all the hidden secrets of emojis' Unicode representation. It's a lot of fun but also quite actionable knowledge. For instance, you do not need a large mapping table to turn an ISO country code into its flag, just a little emoji magic!
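The composition rules quoted above, plus the flag trick, can be sketched in a few lines of Python. The helper names (`with_skin_tone`, `zwj_join`, `flag`) are made up for illustration. The sketch relies only on the code points quoted above and on the Unicode regional indicator symbols, which start at U+1F1E6 for the letter A. Whether a given sequence renders as a single glyph depends on the font and platform.

```python
WAVING_HAND = "\U0001F44B"        # 👋
MEDIUM_SKIN_TONE = "\U0001F3FD"   # 🏽 modifier, as quoted above
ZWJ = "\u200D"                    # ZERO-WIDTH JOINER
REGIONAL_INDICATOR_A = 0x1F1E6    # regional indicator symbol for the letter A


def with_skin_tone(base, modifier):
    """A skin tone modifier directly follows the base emoji."""
    return base + modifier


def zwj_join(*parts):
    """Join self-sufficient emojis with ZWJ; not every combination is supported."""
    return ZWJ.join(parts)


def flag(country_code):
    """Map a two-letter ISO country code to its pair of regional indicator symbols."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError(f"expected a two-letter country code, got {country_code!r}")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code)


wave = with_skin_tone(WAVING_HAND, MEDIUM_SKIN_TONE)   # 👋🏽
technologist = zwj_join("\U0001F468", "\U0001F4BB")    # 👨 + ZWJ + 💻
french_flag = flag("fr")                               # U+1F1EB U+1F1F7

print(wave, technologist, french_flag)
```

The flag function does no lookup at all: each ASCII letter is shifted by a fixed offset, and the renderer decides whether the resulting pair is a known flag.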
Scaling Stateful Objects
"It is widely (MIS)believed that stateless Server-Side Apps are The Only Way™ to scale Server-Side."
Every serious project starts with a stateless handler and a database. And for most, the best practice is to persist every action to the database. At scale, there are many solutions to scaling reads, from caches to replicas.
But writes are a lot more complex to scale without losing ACID guarantees.
The more your application writes, the more this becomes a problem. Reads scale almost infinitely, but writes don't.
"keeping our request handlers stateless, does NOT really solve the scalability problem; instead – it merely pushes it to the database."
"In a pretty much any serious real-world interactive system, it is database which is The Bottleneck™."
But write-heavy applications exist, the most famous being online games and all other interactive, real-time applications like Figma. These applications cannot guarantee scalability and low latency if they push every action to durable storage.
"Server-Side Apps with an In-Memory State can easily save us 10x-1000x of database load."
"Of course, these performance benefits of Stateful Server-Side Apps don't come for free (nothing does). The currency we'll be paying with for this drastically improved performance, is Lack of Durability. In other words – if our Stateful Server-Side App crashes, we'll lose all the state which haven't been saved yet to the DB."
📗 IT Hare's Scaling Stateful Objects is a timeless read (except for a few numbers) about the complexity of scaling write-heavy applications. It makes a valid point about the trade-off between durability and performance and offers many patterns to address it. We engineers have a love-hate relationship with stateful systems, but there is still innovation in the space, like Cloudflare's Durable Objects.
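The durability vs. performance trade-off can be illustrated with a toy write-behind counter in Python. This is a sketch under assumptions, not one of the article's patterns verbatim: `FakeDatabase`, `StatefulCounter`, the `flush_every` parameter, and the key name are all hypothetical, and a real system would also flush on a timer and on shutdown.

```python
class FakeDatabase:
    """Stand-in for durable storage that counts how many writes it receives."""

    def __init__(self):
        self.rows = {}
        self.write_count = 0

    def save(self, key, value):
        self.rows[key] = value
        self.write_count += 1


class StatefulCounter:
    """Keeps the counter in memory and persists it every `flush_every` updates."""

    def __init__(self, db, key, flush_every=100):
        self.db = db
        self.key = key
        self.flush_every = flush_every
        self.value = db.rows.get(key, 0)   # recover the last persisted state
        self._unsaved_updates = 0

    def increment(self):
        self.value += 1
        self._unsaved_updates += 1
        if self._unsaved_updates >= self.flush_every:
            self.flush()

    def flush(self):
        if self._unsaved_updates:
            self.db.save(self.key, self.value)
            self._unsaved_updates = 0


db = FakeDatabase()
counter = StatefulCounter(db, "player:42:score", flush_every=100)
for _ in range(1050):
    counter.increment()

print(db.write_count)                # 10 writes instead of 1050
print(counter.value)                 # 1050 in memory
print(db.rows["player:42:score"])    # 1000 persisted; a crash now loses 50 updates
```

The batching factor is the knob: a larger `flush_every` means fewer database writes, and more state lost if the process crashes before the next flush.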