Learning from Giants #9
Uber's foundational re-write, investing in production tooling rather than staging, picking the right CSS units, defining a good product manager, and change data capture.
👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies.
Uber’s Fulfillment Platform Re-Write
How Uber built a platform that powers every Uber product in every city.
Platform engineers have this constant question: how much of the complexity of building a product can we abstract for engineers?
The standard answer is cloud primitives, deployment models, and maybe data storage. But the right platform at the right time can enable businesses to move an order of magnitude faster. The Uber team knew that.
"We spent 6 months carefully auditing every product in the stack, gathering 200+ pages of requirements from stakeholder teams, extensively debating architectural options with tens of evaluation criteria, benchmarking database choices, and prototyping application frameworks options. After several critical decisions, we came up with the overall architecture to suit our needs for the new decade."
"The platform handles more than a million concurrent users and billions of trips per year across over ten thousand cities"
📗 Uber's Fulfillment Platform is a unique peek at a foundational internal system that powers Uber on a scale matched by few others. What's amazing about this article is the many levels of detail it gives, from the database choice to the data model overview and global deployment. I can imagine it took a world-class team of engineers.
Investing in Production Tooling at Early Stage Start-Ups
One of the hardest things to prioritize in early-stage start-ups is developer tooling. What value does that create for our users? What if we're dead in 6 months anyway?
The most common mistake is to invest in multi-step deployment processes and staging environments before production observability.
Because founding engineers have often worked at a relatively large company before, the first thing they set up after basic CI is a staging environment. Their mistake? They stay blind to what happens in production. And when success comes and the team grows, deploying to production turns into a feared, restricted process because it's hard to see what's going right or wrong.
"Because we’ve systematically underinvested in prod-related tooling, we’ve chosen to bar people from prod outright rather than build guardrails that by default help them do the right thing and make it hard to do the wrong thing."
If you talk to them and ask what they think about testing in production, they will laugh and explain that's why they put resources into a staging environment first.
Yet every prod deployment is some sort of test.
"If testing is about uncertainty, you “test” any time you deploy to production. Every deploy, after all, is a unique and never-to-be-replicated combination of artifact, environment, infra, and time of day. By the time you’ve tested, it has changed."
📗 Charity Majors's I test in prod is a longer article on that topic. To be clear, the author does not set pre-prod and prod testing against each other. Instead, she discusses the impact of under-considering and under-investing in production tooling like observability and deployment. If we accept living in an imperfect world where errors happen, those errors must be adequately monitored and quantified. That should help build a culture of ownership where engineers have all the tools and visibility to deploy to prod confidently.
"Yes, you should test before and in prod. But if I had to choose—and thankfully I do not—I would choose the ability to watch my code in production over all the pre-prod testing in the world. Only one represents reality. Only one gives you the power and flexibility to answer any question. That’s why I test in prod."
Everything you need to know about CSS Units for Typography and Layout
🤷‍♀️ px, rem, dp, %, em... 🤷‍♀️
CSS units are one of the reasons front-end engineering and design are tricky. You can achieve the same visual result on a fixed-dimension page with any of them.
Yet they are far from equal. Change the viewport dimensions, aspect ratio, zoom in or out, and you'll start noticing differences. And if you're designing for many users, you'll hit all of these.
"As per the WCAG (accessibility) guideline the content shall be readable at 200% zoom."
So when you start working on a new app or design system, the question will arise: which CSS units should we use?
"To render anything on the screen we need some space so in order to define that space we also need to define a unit of that measurement."
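To make the difference between units concrete, here is a minimal sketch of how a font size in each unit could resolve to pixels. It is not taken from the Razorpay article: the function name `resolve_px` is hypothetical, the 16px root size is only the common browser default, and the rules are simplified to the font-size case.

```python
def resolve_px(
    value: float,
    unit: str,
    root_font_size_px: float = 16.0,    # assumption: common browser default
    parent_font_size_px: float = 16.0,  # assumption: parent inherits the default
) -> float:
    """Resolve a CSS font-size to pixels (simplified, font-size only)."""
    if unit == "px":
        return value                                # absolute: ignores user settings
    if unit == "rem":
        return value * root_font_size_px            # relative to the root element
    if unit == "em":
        return value * parent_font_size_px          # relative to the parent element
    if unit == "%":
        return value / 100.0 * parent_font_size_px  # percentage of the parent size
    raise ValueError(f"unsupported unit: {unit!r}")


# With default settings, 24px, 1.5rem, and 150% all look identical.
same_by_default = [
    resolve_px(24, "px"),
    resolve_px(1.5, "rem"),
    resolve_px(150, "%"),
]

# A user who doubles their preferred font size changes the root size.
# The rem value follows that preference; the px value stays put.
rem_enlarged = resolve_px(1.5, "rem", root_font_size_px=32.0)
px_enlarged = resolve_px(24, "px", root_font_size_px=32.0)
```

On the "fixed-dimension page", the three values match. Once the user's preferred font size changes, only the relative units follow, which is the kind of difference the accessibility guideline cares about.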
📗 Razorpay's Units for typography and layout is the team's architecture decision record documenting the choice of units for their Blade design system. As expected, the choice isn't a silver bullet but is extremely well explained. And if you don't care about why, jump to the "What will work for us?" section to get to the conclusion.
What’s a Good PM and How to Hire One
"What is product management? What does a product manager do? What makes a great product manager, and how do you become one?"
Product is such a complex yet crucial role that if you ask two different people what product management is, you could get two almost opposite answers.
Of course, different contexts require different skillsets; Product is far from one-size-fits-all.
Yet there are fundamental traits that you can look for in all product leaders. And as a PM, they should drive your personal growth.
📗 Ken Norton's How to Hire a Product Manager is a re-edition of his 2005 essay that was instrumental to the success of many leaders and aspiring PMs. Rather than focusing on hiring, it focuses on the traits that make a good product manager, with interview questions as the cherry on top.
One of product management's all-time classics.
Change Data Capture
10x engineers meet 10x technology. Some pieces of tech unlock hundreds of use cases.
In recent years, the trend has been to move away from batching toward streaming. Change data capture (CDC) is that 10x technology.
It's the simplest form of streaming, and it enables real-time data processing without restricting the use case. Anything that can consume from a pub/sub-type system can become reactive and real-time thanks to CDC.
The magic: you don't have to craft custom events.
"Change data capture (CDC) is a data replication technique to identify and track changes in a database that publishes each database event as messages to a real-time stream. Downstream systems can consume this row-level change feed for various purposes such as analytics, synchronization, and data replication."
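As a rough illustration of what consuming a row-level change feed can look like, here is a minimal sketch that keeps an in-memory replica of a table in sync. The event shape (`op`, `before`, `after`) and the `apply_change` helper are hypothetical, chosen for the example; real CDC tools each define their own message format and delivery mechanism.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(replica: Dict[int, Row], event: Dict[str, Any]) -> None:
    """Apply one row-level change event to a replica keyed by primary key.

    Hypothetical event shape: {"op": ..., "before": row or None, "after": row or None}.
    """
    op = event["op"]
    if op in ("insert", "update"):
        row = event["after"]
        replica[row["id"]] = row  # upsert the latest version of the row
    elif op == "delete":
        replica.pop(event["before"]["id"], None)  # ignore rows we never saw
    else:
        raise ValueError(f"unknown operation: {op!r}")


# A made-up change feed, in the order the database committed the changes.
change_feed = [
    {"op": "insert", "before": None, "after": {"id": 1, "status": "requested"}},
    {"op": "insert", "before": None, "after": {"id": 2, "status": "requested"}},
    {
        "op": "update",
        "before": {"id": 1, "status": "requested"},
        "after": {"id": 1, "status": "completed"},
    },
    {"op": "delete", "before": {"id": 2, "status": "requested"}, "after": None},
]

replica: Dict[int, Row] = {}
for event in change_feed:
    apply_change(replica, event)
```

The consumer never needed a custom "trip completed" event: it only reacts to generic inserts, updates, and deletes. The same loop could feed a cache, a search index, or an analytics store.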
📗 FaunaDB's What is change data capture introduces this idea, its use cases, and its implementation. While log-based replication has become the standard, especially on Postgres and MySQL, there are multiple ways of doing CDC on other storage systems.