Learning from Giants #1

Why authorization is hard, PostgreSQL internals, code review interactions, delivering billions of events reliably, and a pair programming guide.

Aug 28, 2022

👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies.

Why Authorization in Software is Hard

Authorization is like ERPs: all companies have similar needs, but almost all end up building or customizing their system at a high cost.

Additionally, building authorization is rarely a two-way-door decision: you can’t easily revert or update the system once in production.

Why is authorization in software that hard? Because once you start putting the requirements on paper, you realize you have three complex problems to solve:

“Enforcement is what your application actually does with an authorization decision.”
“Authorization decisions answer the question: is the user allowed to perform this action on this resource?”
“Modeling is how we group individual pieces of authorization logic into higher-level concepts.”
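
To make the three pieces concrete, here is a minimal Python sketch. It is not Oso's API: the role names, ROLE_PERMISSIONS, is_allowed, and delete_document are all hypothetical and chosen only for illustration. Modeling is the role-to-permission table, the decision is a pure function, and enforcement is what the handler does with the answer.

```python
from dataclasses import dataclass

# Modeling: group individual permissions into higher-level roles.
# (Hypothetical role and action names, for illustration only.)
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "update"},
    "owner": {"read", "update", "delete"},
}


@dataclass
class User:
    name: str
    roles: dict  # resource id -> role name on that resource


def is_allowed(user: User, action: str, resource_id: str) -> bool:
    """Decision: may this user perform this action on this resource?"""
    role = user.roles.get(resource_id)
    return action in ROLE_PERMISSIONS.get(role, set())


def delete_document(user: User, resource_id: str, store: dict) -> None:
    """Enforcement: what the application actually does with the decision."""
    if not is_allowed(user, "delete", resource_id):
        raise PermissionError(f"{user.name} cannot delete {resource_id}")
    del store[resource_id]
```

Even in this toy version, the hard parts show up quickly: every new requirement (teams, sharing, inherited permissions) changes the model, and every handler needs the enforcement check.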

📗 Oso’s "Why Authorization is Hard" is an excellent deep dive into these three dimensions of authorization. Sam Scott covers them with clear examples of the different options and the cost/benefit trade-offs of each.

Beware: if you're a curious software engineer, make sure you have at least an hour. You will very likely not stop at this article and will end up reading about Airbnb's and Carta's solutions, and about the main inspiration for all these systems: Google's Zanzibar.

Authorization is a very deep (and fascinating) rabbit hole.

Read the full article on Oso's blog

How PostgreSQL aggregation works and how it inspired Timescale’s hyperfunctions design

As more and more abstraction layers are added to our stack, we software engineers are trained to understand them on a need-to-know basis. We must accept that we won’t understand them entirely and focus on the interfaces.

But interface-only understanding has limits. Even popular systems can behave very differently in very similar conditions. In particular, SQL databases have a lot of caveats.

Many engineers miss how important it is to build mental models of how seemingly black-box systems work. Anti-patterns look a lot more obvious when you can make sense of why they’re wrong. That’s even truer in companies that operate in 1-2 week sprints and where the “just write the code” attitude prevails.

Let’s start with a simple concept: aggregates in databases.

"Under the hood, aggregates in PostgreSQL work row-by-row. But, then how does an aggregate know anything about the previous rows?"

📗 Timescale's "How PostgreSQL aggregation works and how it inspired our hyperfunctions' design" is a well-illustrated article detailing what two-step aggregation is and how it powers PostgreSQL's aggregation functions like MAX and AVG. David Kohn then explains how this design inspired the implementation of TimescaleDB's aggregate functions.
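
The two-step idea can be sketched in a few lines of Python. This illustrates the concept only, not PostgreSQL's implementation, and the function names are made up: a transition function is called once per row and carries a small state forward, then a final function turns that state into the result. For AVG, the state is a running sum and a row count.

```python
from typing import Iterable, Optional, Tuple

State = Tuple[float, int]  # (running sum, number of non-NULL rows)


def avg_transition(state: State, value: Optional[float]) -> State:
    """Step 1, called once per row: fold the row into the running state."""
    if value is None:  # like SQL's AVG, ignore NULLs
        return state
    total, count = state
    return (total + value, count + 1)


def avg_final(state: State) -> Optional[float]:
    """Step 2, called once at the end: turn the state into the result."""
    total, count = state
    return total / count if count else None


def avg(rows: Iterable[Optional[float]]) -> Optional[float]:
    state: State = (0.0, 0)
    for value in rows:
        state = avg_transition(state, value)
    return avg_final(state)
```

This answers the question in the quote: the aggregate knows nothing about previous rows except what it stored in its state.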

Want more? I’ll come back to this idea of building mental models of systems a lot. There are plenty of resources on that topic.

Read the full article on Timescale

How to Make Your Code Reviewer Fall in Love with You

Code reviews are the most critical routine of high-functioning teams.

When team growth quickens or deadlines shorten, code reviews always feel too slow for the author. As an engineer, it can be frustrating to wait a few hours (days?) for a review that feels trivial. It’s only a few lines of code!

What can you do?

“When people talk about code reviews, they focus on the reviewer. But the developer who writes the code is just as important to the review as the person who reads it.”

📗 Michael Lynch's "How to Make Your Code Reviewer Fall in Love with You" has become a foundational internal reference at Bigblue. So much so that when I asked the team what they thought I should share, it only took Julien a few seconds to refer to this article.

“This article describes best practices for participating in a code review when you’re the author. In fact, by the end of this post, you’re going to be so good at sending out your code for review that your reviewer will literally fall in love with you.”

Here are a few quotes I've come to mention every day:

“Review your own code first”
“The best changelists just Do One Thing.”
“Minimize lag between rounds of review”

But Michael’s article has 10 more, and you should definitely repeat them every day too!

PS: I guess I do love my team now 🙂 Is that because of code reviews?

Read the full article on Michael's blog

Centrifuge: a system delivering billions of events per day at Segment

Scaling third-party API integrations is hard.

What starts with a trivial HTTP POST to an external endpoint becomes hundreds of errors in your monitoring system that you can't do much about.

Rate limiting, elevated error rates, downtime, increased latency. These always happen at the wrong time. And if your design wasn't careful enough, they can bring you down along with them. Even if they don't, they will affect your quality of service.

What to do next?

→ "Architecture 1: a single queue: to start, let’s first consider a naive approach. We can run a group of workers that read jobs from a single queue."

→ "Architecture 2: queues per destination"

→ "Ideal state: queues per <source, destination>: however, in a large, multi-tenant system, like Segment, this number of queues becomes difficult to manage."
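
The difference between the first two architectures can be sketched in Python. This is a deliberately simplified toy model, not Segment's code: the function names and the is_up callback are hypothetical, and a real worker would retry with backoff rather than stop at the first failing job. It only shows why a single queue lets one unhealthy destination hold up everyone else's events.

```python
from collections import defaultdict, deque


def deliver_single_queue(jobs, is_up):
    """Architecture 1: one FIFO queue shared by all destinations.

    A job for an unavailable destination blocks everything behind it.
    """
    queue = deque(jobs)
    delivered = []
    while queue:
        destination, _payload = queue[0]
        if not is_up(destination):
            break  # head-of-line blocking: the worker is stuck on this job
        delivered.append(queue.popleft())
    return delivered


def deliver_per_destination(jobs, is_up):
    """Architecture 2: one queue per destination.

    An unavailable destination only stalls its own queue.
    """
    queues = defaultdict(deque)
    for destination, payload in jobs:
        queues[destination].append((destination, payload))
    delivered = []
    for destination, queue in queues.items():
        while queue and is_up(destination):
            delivered.append(queue.popleft())
    return delivered
```

In this model, the "ideal state" would key the queues by (source, destination) instead of destination alone, which is exactly where the number of queues explodes in a multi-tenant system.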

📗 Segment's "Centrifuge: a reliable system for delivering billions of events per day" is one of the best architecture articles because it gives you an incredibly detailed peek into the problem-solving process the engineers went through. Calvin French-Owen lays out:

The problem: reliably sending billions of messages per day to hundreds of public APIs for thousands of customers.
The naive solutions and their more advanced evolutions, and why they didn't work for Segment.
The final production architecture detailed up to the database schema and systems orchestration.
Validation: how Centrifuge handled its first massive 3rd party API downtime.

A few interesting numbers from 2018:

400,000 outbound HTTP requests per second
340 billion jobs executed in one month
1.5% of all global data succeeds on a retry

Figure: Centrifuge’s system architecture. Source: Segment.

Read the full article on Segment's blog

The Complete Guide to Pair Programming by Tuple

"Teams tend to ship slower over time because they accumulate sub-par code that impedes their progress."

"A pair of programmers tends to produce better code than someone working alone."

"Teams that pair often will maintain a fast shipping speed longer."

Who doesn't agree with these statements? But who actually pair programs every week?

📗 Tuple's Pair Programming Guide is a really helpful resource to start pair programming or improve your pairing skills. Learn how to de-risk "Your First Pairing Session", or get a nice "Pairing Session Template". Learn pairing patterns and anti-patterns. The guide is actionable, to the point, and definitely worth your time!

Meta: Tuple and I share this love for great external resources, so go ahead and open the "Great External Resources" section. I highly recommend the first one, "On Pair Programming".

Read the full guide

Small tip: You can reply to this email to react, recommend articles, or just send feedback!

See you next week!
