On Fundamentals

Ultimately, when we use high-level tools to make an impact, we are limited not by the breadth of our knowledge (how many tools and languages we know) but by its depth (how well we know them). I briefly touched on this in my essay for aspiring engineers seven years ago. Back then, it felt more like intuition. Now I am convinced it is valid and valuable.

This is the primary reason why someone who has already taken their first steps in programming should look at computer science fundamentals to increase their depth of knowledge. My conviction is driven primarily by personal experience and observation. For example, I strive to deepen or refresh my knowledge of computer science fundamentals regularly. I do this by reading books or solving algorithmic exercises. Of course, I also get bored and move on to another subject, continuously wandering, and that is okay.

It is okay to look around at the tools you are using today and increase that depth of understanding by going one level deeper, metaphorically peeling an onion. However, at some point, it would be good to touch the fundamentals.

Others disagree. The primary objection is that such knowledge is obsolete. “Where am I going to implement virtual memory?” they ask, and add “I’m not writing an operating system.” I think that is not a great question. The more important question is “where is virtual memory used?” (Replace “virtual memory” with any algorithm or data structure.) Not understanding where these fundamental pieces of knowledge are used numbs us into thinking they are obsolete and irrelevant. They are anything but that.

It is easy to say “the right tool for the job,” and go to the next popular framework, library, or service to avoid technological limitations. But think about it: if all we know is how to use some tool, not how it works, we are essentially limited by what is publicly documented, tried, or generally accepted. We might just as well use any tool in that case. At the very least, an opportunity to learn a new way of solving a problem is lost. At worst, a creative, low-effort, high-impact solution is dismissed as irrelevant.

My favourite high-impact example of the above point that I have seen in practice is this.

Fifteen years ago, the primary caching proxy for websites was a project called Squid. Then, about eleven years ago, a project named Varnish came out. Its author, Poul-Henning Kamp, wrote about its impact (I recommend reading the full article):

“The first user of Varnish, the large Norwegian newspaper VG, replaced 12 machines running Squid with three machines running Varnish. The Squid machines were flat-out 100 percent busy, while the Varnish machines had 90 percent of their CPU available for twiddling their digital thumbs.”

I used Varnish on multiple projects back in the day, and servers twiddling their digital thumbs summarizes the experience and the impact well. In those projects we, too, replaced overutilized servers with just a few that barely did any work. So how did Poul achieve such an impact?

“<…> The really short version of the story is that Varnish knows it is not running on the bare metal but under an operating system that provides a virtual-memory-based abstract machine. For example, Varnish does not ignore the fact that memory is virtual; it actively exploits it.”

Poul had a deep understanding of how memory management in an operating system works. As a result, instead of re-inventing it in the application, he made Varnish utilize the memory management provided by the operating system.
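To make the idea concrete, here is a minimal Python sketch of the general technique. It is not Varnish's actual code (Varnish is written in C), and the `MappedCache` class, its method names, and the file name are all hypothetical. The cache stores objects in a memory-mapped file and never decides for itself what lives in RAM and what lives on disk; under memory pressure, the operating system's virtual memory subsystem makes that call.

```python
import mmap
import os
import tempfile


class MappedCache:
    """Toy object cache backed by a memory-mapped file.

    Hypothetical illustration only: there is no eviction, no locking,
    and no persistence of the index.
    """

    def __init__(self, path, size):
        self._file = open(path, "w+b")
        self._file.truncate(size)      # reserve backing space on disk
        self._map = mmap.mmap(self._file.fileno(), size)
        self._index = {}               # key -> (offset, length)
        self._next = 0                 # next free byte in the mapping

    def put(self, key, body):
        end = self._next + len(body)
        if end > len(self._map):
            raise MemoryError("cache file is full")
        # A plain memory write. The kernel decides when (and whether)
        # the touched pages are written back to the file.
        self._map[self._next:end] = body
        self._index[key] = (self._next, len(body))
        self._next = end

    def get(self, key):
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        # A plain memory read. If the page is not resident, the kernel
        # pages it in; the application has no disk-versus-RAM logic.
        return bytes(self._map[offset:offset + length])

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = MappedCache(os.path.join(tmp, "cache.bin"), 1 << 20)
        cache.put("/index.html", b"<h1>hello</h1>")
        print(cache.get("/index.html"))
        cache.close()
```

What matters in this sketch is what is missing: there is no code that tracks "hot" objects, no explicit read or write calls to move data between memory and disk, and no second copy of the data in an application-level buffer. All of that is left to the operating system.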

As I was working on the draft of this post, on June 8th, several top-rated websites went down and showed error screens for a few hours due to an outage at Fastly, a content delivery service. What is relevant here is that the error messages also exposed the software used to deliver the content (not that it was a massive secret). What started with 12 servers in Norway is now running a significant portion of the internet. It was Varnish.

The critical element here is not that Poul knew how the OS manages memory but that he knew the stack more deeply than anyone else who worked on the caching problem, and that this gave him an insight. By my recollection, before Varnish, caching was considered a solved problem, and if you had to cache something, Squid was the de facto choice. So no one was looking for insight.

It is worth saying that knowing the stack more deeply does not imply building more low-level things from scratch. I have not made a Varnish-level impact, but I have found simple solutions to complex problems on multiple occasions because I understood the stack better. Not by building new things, but by having a creative insight about existing tools.

There is nothing special about this, and anyone can do it. What is needed is some curiosity and time. If all this sounds reasonable, Julia Evans has an excellent blog post on learning how things work, which covers the practical aspects of doing it continuously.
