Learning from Giants #22
Untangling database consistency, Product marketing and when you need it, De-duplicating billions of images at Canva, and WebAssembly from scratch.
👋 Hi, this is Mathias with your weekly drop of the 1% best, most actionable, and timeless resources to grow as an engineering or product leader. Handpicked from the best authors and companies.
Thoughts on database consistency
"I realized the notion of consistency is pretty darn confusing and contains a bunch of overlapping concepts."
Database consistency is indeed a pretty overloaded term. I have already shared posts about it, including the excellent visual chart of consistency by Kyle Kingsbury. The key to making sense of all this confusion is to understand this one thing:
"[...] the word “consistency” is used to describe multiple, different concepts in the database and distributed systems world."
🤯 CAP, ACID, and consistency models each talk about a different kind of "consistency".
📗 Alex DeBrie's Inconsistent thoughts on database consistency is the clearest article to bring some order to that consistency mess. It will clear your confusion and is the perfect starting point for building consistent knowledge on consistency.
"I’d like to say I’m standing on the shoulders of giants in writing this post, but that would overstate my contribution here. It’s more like some giants are carrying me in a BABYBJÖRN."
What does that make us? Standing on the shoulders of Alex, being carried by giants?
What is Product Marketing, and when will you need it?
People say Product Manager is one of the hardest jobs in the world because you're in charge of nothing less than a product's success. As companies grow, the number of product tasks grows with them, but the bandwidth of PMs is limited. You can split the PM's responsibilities vertically by adding a second PM, then a third. But soon you'll see that it is not enough.
The other split, which people consider less often even though it is the default solution at Big Tech, is the horizontal one. PMs stay in charge of the strategy, while product marketing becomes the full-time focus of a new role: the Product Marketing Manager (PMM).
"Product marketing is part of overall product management, but with the primary goal of understanding the market and buyer (their needs, alternatives, buying process, etc.)."
"Think of product marketing as a role that helps both product management and the company scale."
📗 OpenView Partners's The Role of Product Marketing introduces that role, what it does, why, and when you should add it to your team. Saeed Khan insists that adding a PMM does not take the research and go-to-market responsibilities away from PMs; it gives them help focused solely on market-facing tasks.
Reverse image search using Perceptual Hashes at Canva
Because they operate a media-heavy platform, the Canva team has one recurring problem: identifying duplicates among tens of billions of images. If images were just opaque files, a simple hash of their bytes would do.
"The ability to map any file into unique hash keys is useful for matching duplicate content but is not very viable in the case of images."
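That’s because one image can have many byte representations: re-encoding, resizing, or recompressing a picture changes its bytes without changing what it shows. So they used perceptual hashes that rely on the image pixels and features rather than raw bytes.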
"While cryptographic hashes like SHA512 create hashes of raw bytes, perceptual hashes are created based on the actual pixels of an image. Hence, a reverse image search using perceptual hashing can be done by calculating the Hamming distance between the two hashes."
Once they had these billions of perceptual hashes, they still needed to figure out how to do efficient Hamming distance searches on them.
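To make the difference between the two kinds of hash concrete, here is a minimal sketch in Python. It uses a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean), which is only one simple perceptual hash and not necessarily the algorithm Canva uses. The function names and the 4x4 grayscale "images" are made up for illustration.

```python
import hashlib


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def raw_digest(pixels):
    """Cryptographic hash of the raw bytes, for comparison."""
    return hashlib.sha512(bytes(p for row in pixels for p in row)).hexdigest()


# Hypothetical 4x4 grayscale image (values 0..255).
original = [
    [200, 210, 30, 20],
    [190, 220, 25, 35],
    [40, 30, 215, 205],
    [20, 45, 225, 195],
]
# The same picture, slightly brightened: different bytes, same content.
brighter = [[min(255, p + 5) for p in row] for row in original]
# A genuinely different picture: the negative of the original.
inverted = [[255 - p for p in row] for row in original]

print(raw_digest(original) == raw_digest(brighter))                         # False
print(hamming_distance(average_hash(original), average_hash(brighter)))    # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))    # 16
```

The byte-level digest changes as soon as a single pixel value changes, while the perceptual hashes of the two near-identical images stay at Hamming distance 0 and the unrelated image sits at the maximum distance of 16.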
📗 Canva's Simple, Fast, and Scalable Reverse Image Search Using Perceptual Hashes and DynamoDB describes the team's solution to this interesting problem. Christopher Bong defines perceptual image hashing, then explains how to match a hash with similar ones at scale and how the team implemented that on top of DynamoDB with simple key/value operations.
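The article has the details of the real design. As a rough illustration of how a Hamming-distance search can be reduced to exact key lookups at all, here is a sketch of one well-known trick: split each hash into segments and rely on the pigeonhole principle. This is an assumption chosen for illustration, not a description of Canva's implementation; the `SegmentIndex` class is hypothetical and a plain dict stands in for the key/value table.

```python
from collections import defaultdict

HASH_BITS = 64
SEGMENTS = 4
SEGMENT_BITS = HASH_BITS // SEGMENTS


def hamming_distance(a, b):
    return bin(a ^ b).count("1")


def segment_keys(h):
    """Split a hash into (position, value) pairs usable as exact-match keys."""
    mask = (1 << SEGMENT_BITS) - 1
    return [(i, (h >> (i * SEGMENT_BITS)) & mask) for i in range(SEGMENTS)]


class SegmentIndex:
    """In-memory stand-in for a key/value table keyed by hash segments."""

    def __init__(self):
        self.table = defaultdict(set)  # (position, value) -> full hashes

    def add(self, h):
        for key in segment_keys(h):
            self.table[key].add(h)

    def search(self, query, max_distance):
        # Pigeonhole: if at most SEGMENTS - 1 bits differ, at least one
        # of the SEGMENTS segments must be identical in both hashes.
        if max_distance >= SEGMENTS:
            raise ValueError("max_distance must be smaller than SEGMENTS")
        candidates = set()
        for key in segment_keys(query):
            candidates |= self.table.get(key, set())
        # Exact lookups only yield candidates; verify the real distance.
        return sorted(c for c in candidates
                      if hamming_distance(query, c) <= max_distance)


index = SegmentIndex()
base = 0xF0F0F0F0F0F0F0F0
near = base ^ 0b101                 # 2 bits away from base
far = base ^ 0xFFFFFFFFFFFFFFFF     # every bit flipped
for h in (base, near, far):
    index.add(h)

query = base ^ (1 << 63)            # 1 bit away from base, 3 from near
print([hex(h) for h in index.search(query, max_distance=3)])
# ['0xf0f0f0f0f0f0f0f0', '0xf0f0f0f0f0f0f0f5']
```

The trade-off in this sketch is that each hash is stored once per segment, and the supported search radius is bounded by the number of segments.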
WebAssembly from scratch
With Google's V8 engine becoming a popular runtime in and out of browsers, many engineers turn to WebAssembly to avoid JS's pitfalls.
Or, the other way around: with WebAssembly's increasing popularity, Google's V8 engine is getting new client and server-side use cases every day.
Both are true and point to the same technology: WebAssembly.
"Wasm is really quite simple, in its way. The specification defines only four numerical types and a handful of operations upon them, plus standards for importing and exporting interfaces and shared memory buffers from and to the surrounding context, whether that is a browser or node, and a few other things."
And it materializes as a binary module that can be compiled and instantiated in the middle of a JS program running on V8. The module exports globals, functions, and a few other things (such as memories and tables) to the JS runtime.
📗 Jeff Fowler's Wat is up with WebAssembly is an excellent introduction to that intriguing binary format that powers amazing products like Figma. The author starts from a tiny working binary, dissects it, and then extends it to explore the possibilities that WASM opens up.
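To give a feel for how small such a binary can be, here is a sketch that builds a 41-byte module by hand and lists its sections. It is not taken from the article: the module (a single exported `add` function over two i32 values) and the helper names are chosen for illustration, and Python is used only to inspect the bytes, not to run them.

```python
import struct

# A hand-assembled module exporting add(i32, i32) -> i32.
ADD_MODULE = (
    bytes.fromhex("0061736d")                  # magic: "\0asm"
    + bytes.fromhex("01000000")                # version 1, little-endian
    + bytes.fromhex("01070160027f7f017f")      # type section: (i32, i32) -> i32
    + bytes.fromhex("03020100")                # function section: 1 func of type 0
    + bytes.fromhex("070701036164640000")      # export section: "add" = func 0
    + bytes.fromhex("0a09010700200020016a0b")  # code: local.get 0, local.get 1, i32.add
)

SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_leb128_u32(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return (version, [(section name, payload size), ...])."""
    if module[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    (version,) = struct.unpack("<I", module[4:8])
    sections, pos = [], 8
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_leb128_u32(module, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip the payload
    return version, sections


print(len(ADD_MODULE))            # 41
print(list_sections(ADD_MODULE))
# (1, [('type', 7), ('function', 2), ('export', 7), ('code', 9)])
```

In a JS runtime, the same bytes could be handed to `WebAssembly.instantiate`, after which the exported `add` would be callable from JavaScript like any other function.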