When Imperfect Systems are Good

In 2023 I started sharing links on a weekly basis. It worked for a while, but the fixed cadence was hard to maintain. Some weeks, I had plenty of time to read; others, I didn’t.

So, I’m going to change the format. From now on, I’m following Simon Willison’s approach: one post per link, with ad-hoc quotes and brief commentary. It’s more flexible, it scales better, and it makes writing easier.

Let’s go!


Original post: When Imperfect Systems are Good, Actually: Bluesky’s Lossy Timelines

Bluesky’s Timeline update process is straightforward (at a high level, at least). When a user shares a new post, a reference to it is fanned out to the timeline of every user who follows the author. When the followers open the app, they get the latest content.
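To make the fanout step concrete, here is a minimal sketch of fanout-on-write. It is not Bluesky’s code: the in-memory `followers` and `timelines` dictionaries and the `fanout` and `read_timeline` functions are hypothetical stand-ins for the real follow graph and timeline table.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follow graph and the timeline table.
followers = {"alice": ["bob", "carol"]}   # author -> users who follow them
timelines = defaultdict(list)             # user -> post references, oldest first


def fanout(author, post_ref):
    """Write a reference to the new post into every follower's timeline."""
    for follower in followers.get(author, []):
        timelines[follower].append(post_ref)


def read_timeline(user, limit=50):
    """Return the user's most recent post references, newest first."""
    return timelines[user][-limit:][::-1]


fanout("alice", "post-1")
fanout("alice", "post-2")
print(read_timeline("bob"))
```

The cost of one post grows with the author’s follower count: every new post means one write per follower.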

But what happens when millions of people follow a user? It can create hot spots in the database (i.e., a shard handling much more traffic than others) and slow down the fanout process.

Enter the lossy Timeline!

Imagine a user who follows hundreds of thousands of others. […] For a given user, there’s a threshold beyond which it is unreasonable for them to be able to keep up with their Timeline.

Leveraging this human limit, Bluesky uses a threshold and some probabilistic logic to decide which posts get written to those users’ timelines. In the end, users who follow a huge number of accounts will never notice the difference as long as their Timeline always has something new.
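Here is a rough sketch of what “a threshold and some probabilistic logic” could look like. The formula is an assumption for illustration: each timeline write is kept with probability min(limit / follows, 1). The `REASONABLE_LIMIT` value and the `keep_probability` and `should_write` names are hypothetical, not Bluesky’s actual settings or API.

```python
import random

# Hypothetical threshold: how many follows a person can reasonably keep up with.
REASONABLE_LIMIT = 2_000


def keep_probability(num_follows, limit=REASONABLE_LIMIT):
    """Assumed rule: keep every write up to the limit, then thin them out."""
    if num_follows <= 0:
        return 1.0
    return min(limit / num_follows, 1.0)


def should_write(num_follows, rng=random):
    """Decide whether a post reference is written to this follower's timeline."""
    return rng.random() < keep_probability(num_follows)


rng = random.Random(0)  # seeded only to make the demo repeatable
kept = sum(should_write(200_000, rng) for _ in range(10_000))
print(f"kept {kept} of 10000 writes for a user following 200,000 accounts")
```

Under this assumption, someone who follows 100 accounts receives every post, while someone who follows 200,000 receives roughly 1% of them, which is still far more than they could read.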

Knowing where it’s okay to be imperfect lets you trade consistency for other desirable aspects of your systems and scale ever higher.

This last quote highlights something fundamental we tend to forget: the perfect system doesn’t exist. Tradeoffs are everywhere, and sometimes even data accuracy can be traded away.