The end of Moore's Law marks the end of a 40-year performance free ride. Now a new free ride is beginning. By shoveling data into machine learning models, we get intelligence for free. But is it really free?
Here's how the world of computing worked for about 40 years: every couple of years, the chip fabricators created a new chip with more and more transistors, clocked higher and higher (this phenomenon became known as Moore's Law). As a result, if you had a program that was slow, you simply waited for a new chip to come out and, as if by magic, your program ran dramatically faster. This is what Emery Berger called the performance free ride.
It was not free in terms of money of course; you had to actually pay for the chip. However, it was free in the sense that you literally needed to do exactly zero engineering work to make your program run faster. You just waited. And you didn't even have to wait for long.
While everyone was having fun though, physics was lurking around the corner. As it turned out, we can't keep packing in more transistors and clocking them higher forever, because the temperatures get tremendously high. This put an end to Moore's Law (and the era of parallelism was born).
The free ride did not end without leaving something behind. While it lasted, people honestly thought that we would be getting performance for free forever, or at least for the foreseeable future. So they started building mindsets, tools, programming languages and so on without taking performance into consideration. The reasoning was: "Relax, computers are getting faster and faster, we don't have to worry". Python, Ruby, PHP, JS, HTML, ... - the list goes on and on - all underestimate performance needs.
Here are some examples of the resulting software. Microsoft Word still takes seconds to start up. Even once it's running, sometimes you type something and it just freezes for a few seconds. In a similar manner, Messenger freezes now and then. Browsing static webpages lags noticeably on my laptop (and no, it's not the connection).
There are countless examples like these, and it is amazing that they exist when you consider that, on the other side, there is software like Grand Theft Auto: San Andreas. For the handful of people who don't know, San Andreas was an engineering achievement. Apart from being one of the most badass games ever, it is also a 3D driving / shooting / fighting / swimming / ... simulator. All that with crazy graphics for the time, not to mention the rest of the technology like sound etc. San Andreas was released in 2004 for the PlayStation 2, whose hardware was at least ten times less powerful than an ordinary modern PC, and yet it ran just fine.
Don't get me wrong, this free ride helped us concentrate on other, more important properties, like security, usability, compatibility etc. But some people think that these properties are mutually exclusive with performance. That is far from the truth, and a topic for a different article altogether.
Now, if bad software was the only thing left behind, we would be ok; we would just write new software. However, that is not the case. What is also left behind is a heritage: "best practices", tools, programming languages, school curricula and a few other artifacts which make sure that, even though the ride is over, the decline will continue.
In this article, I would like to introduce a new free ride that is just getting started. I call it the Intelligence Free Ride.
The setting is not that different from the previous ride. Before, it was performance; now it is intelligence. Before, it was achieved by packing in more transistors; now it is achieved by stacking up machine learning models. The striking resemblance, though, is the "it's for free" attitude. Just as we once assumed performance was free and simply waited for a faster chip, we now wait for the model to finish training, assuming intelligence will be cheap.
The scary part is that, up to now, we actually do get it mostly for free. Take for example the GPT series of neural networks. There's no finesse, it's literally just building absurdly huge neural networks (there is a lot of finesse and engineering in managing to train them, but that's another story). My concern is that we will follow a similarly dangerous path, and this time I am not sure what the result will be. My guess is that if we are not alert, it will ultimately be some new form of bad software.
By the way, I'm not saying that Machine Learning is not cool. It surely gives us hope for something big. We aspire to achieve what the "first wave of artificial intelligence", the one in the 70s, failed to accomplish: general intelligence. And you know what, maybe general intelligence is just an enormous number of neural networks. Let's not forget that a neuron is not that sophisticated on its own. Put billions of them together, however, and you get consciousness.
On the other hand, we should be careful not to treat it like a magic wand. It is a tool, apparently a very powerful one, but still just a tool: good at some things, bad at others. There is a trend lately of "let's do everything with machine learning". People stop paying attention to theory, algorithms, low-level engineering, architecture, programming languages, compiler design etc. Instead of producing novel ideas, tailored to each area's needs, they fall back on machine learning as if it will solve everything. You need an algorithm? We'll let machine learning produce it. You want fast code? Machine learning is here for you. ALL ABOARD THE MACHINE LEARNING TRAIN.
The train is cool, it's useful, but please let's not treat it as the only vehicle.
I want to focus now on a particularly suspicious trend in my own area, compiler optimization.
Let's take another walk down memory lane, this time to the mid-70s. Back then, compiler optimization started to flourish side by side with Moore's Law. Along with it arrived what I call the Compiler Promise 1.0. It declared that we should not worry about performance when writing software and that, for god's sake, we should stop writing assembly, since the compiler will optimize everything for us. People are still firm believers in this idea, and a well-known saying still floats around: don't try to outsmart the compiler.
End of flashback. The year is 2020. Programmers are still writing incredible amounts of assembly, C and low-level C++ to fully leverage their machines. The compilers did not completely fail, but they did not fulfill their promise either. As a side note, one sub-area of compiler optimization that failed spectacularly was auto-parallelization. That is another topic, but it's important to know because, right now, parallelism is more necessary than ever and we can't get it automatically (see the sketch below).
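To make that last point concrete, here is a minimal C sketch of my own (the function names and the example are illustrative, not taken from any particular codebase). Mainstream compilers such as GCC and Clang will happily auto-vectorize the first loop, where the iterations are independent. The second loop, however, has a loop-carried dependency: each iteration needs the previous result, so the compiler has to leave it serial, and extracting parallelism from it (for example by rewriting it as a parallel scan) is still work the programmer does by hand.

```c
#include <stddef.h>

/* Independent iterations: compilers routinely auto-vectorize this at -O2/-O3. */
void scale(float *restrict out, const float *restrict in, float k, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = k * in[i];
}

/* Loop-carried dependency: a[i] depends on a[i-1], so the loop stays serial.
   Parallelizing it means manually rewriting it as a parallel prefix sum. */
void prefix_sum(float *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

The first loop is exactly the kind of code the Compiler Promise delivers on; the second is the kind that keeps people writing intrinsics, OpenMP pragmas and hand-tuned kernels.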
So now compiler writers are subtly coming up with a new promise, what I call the Compiler Promise 2.0. This is essentially the same promise as before, but disguised behind a different big idea. I hope that by now we all know what this idea is, right? "MACHINE LEARNING". Exactly! Machine learning has made it to the rigorous, often forgotten circles of programming languages. Will it succeed? That's a tricky one.
Again, the worrying part is that using ML in compiler optimization does have a lot of potential (Michael O'Boyle - Auto-parallelisation Reloaded, DeepCompiler). But I'm afraid that instead of using ML only as a tool, where it actually serves our goal, we'll just throw it at every possible problem. To put it another way, machine learning is a new, huge hammer, and to whoever holds it, which is virtually everyone at this point, everything looks like a nail.
This article was meant to present an opinion rather than assert any truths. I hope it gives us some useful food for thought. A lot of great research on machine learning and related fields is happening out there, and I don't mean to dismiss it. It's probably no coincidence that machine learning has become the trend. Nevertheless, we should always stay mindful of where it leads us.