Embrace theory when it's practical
There. That was simple enough to summarize in a 5-word title. To really understand what that means however, you have to agree with me that most programmers tend to run from theory at every opportunity. You agree? Well, if you don't, just glance over at the languages we use predominantly: Java and C/C++. LISPers will preach to you and have been preaching for the last 50-odd years the power of first-class functions. And yet, this theoretically and practically beautiful concept (which involves no sacrifice whatsoever on the part of the Java/C programmer) has taken years to arrive. Finally, in Java 8, we will have easy-to-write lambdas! Instead of first class functions and good generics, Java gives us boilerplate. I need not list here the arguments for boilerplate the masses of Java programmers have given since the inception of Java. To me, it's incredible enough that someone can argue that boilerplate isn't bad, let alone that it is good. Read Steve Yegge's rant on Java and its obsession with code quantity for more on this. In the world we live in today, programmers tend to prefer more code and less theory.
Almost got into rant mode there. Let's get back to the question at hand: okay, everyone is running from theory but what is theory good for anyways? Isn't the Internet and all other wonderful technology built in C/C++ and not some "beautiful" theorem proving language for good reason?
When theory becomes practical, we should be doubly delighted because we have not only discovered a good practical solution to our problem but we've also found a solution about which we know certain things/properties to be always true.
One sub-field in computer science has always been really theoretical in nature and that's algorithms. People don't just come up with an algorithm and be done with it because it runs "pretty fast" for them. They usually follow up with a detailed analysis of its complexity. As a result, we know optimal algorithms for many tasks and this is extremely powerful in practice. This means that we can stop optimizing that part of the system and focus on optimizing other parts of the system.
Over the long arc of time, our goal as programmers surely is to write programs that don't fail (or are at the least, fail-safe) and have them run as fast as they can. The latter property is ensured by working on algorithmic theory but what about the earlier property? It falls to the programmer to ensure that the program doesn't fail but it's also the responsibility of the programming language and its compiler to ensure that the programmer doesn't make mistakes. And that's the sub-field in CS where the mainstream has been mostly devoid of all theory (I'm referring to the end-user/programmer's view of things, not the compiler writer's). As a collective, we use imperative languages that mimic the specific machine architecture extremely well. Of course, this story is only true after C "won over" LISP but ever since, imperative has been the mainstream. We produce more C/C++/Java code than ever before. The problem with imperative languages is that they deal directly with state (a lot of which is out of the program's control and is managed by the operating system or other programs or through I/O devices) and as a result, in an imperative world, you can't ever have a complete system that can be theorized well. The alternative is to of course think of programs as proofs (see Curry-Howard isomorphism). And proofs are something that we know a lot about. There's great deal of human knowledge about logic, set theory, functions. We can bring hundreds of centuries of thinking by the smartest mind to fruition in a practical way if we can only somehow implement theoretically well-founded programming languages.
What does a theoretically well-founded language look like and what are some practical benefits that we can get out of it? It looks a lot like Haskell; Haskell has great theoretical foundations but is also increasingly practical. To be precise, Haskell's an approximation of this theoretical thing called System F. It's an approximation because there are things Haskell does not enforce. For example, you can have something be an instance of a Monad but not obey the laws of monads, due to the halting problem where you can have certain conditions that cannot be checked in a language like Haskell that allows for infinite recursion. Hence, there are other theorem-proving languages that are not quite as practical (yet) but do verify and enforce the monadic laws. But if we all used Haskell, that would already be two revolutions' worth of revolution.
And now, here comes the exciting part! The practical benefits of using a language like Haskell. There are many many benefits but I'll only talk about a small subset of them here.
(1) Whole classes of errors are eliminated when you express the properties of your programmer in a manner that the compiler can understand. This is essentially the whole notion of types and draws from a long history of research into varied areas. It's like telling the compiler the semantic meaning of this program, "I want this program to figure out the most common female baby names in a certain zip code" and it'll check that. But when you write a C program or a Python program, you can't do that too well because you don't have a very expressive type system. In this regard, Python is actually a step backward from C because you don't even have any static type-checking and you will not know about your bad variable name bugs until you actually execute that line of code. Proof-carrying code is one idea that's analogous except that it's far more explicit.
(2) Haskell provides you with precise abstractions where you get all that good object-oriented stuff related to encapsulation and information hiding but in a stateless manner and with strict guarantees. Haskell models many algebraic structures from category theory which is one mathematical way to look to all algebraic structures (In case you want to delve more deeply into the category-theoretical view of things, I recently discovered The Catsters, a series of 78 videos introducing category theory; if you're looking for something much more brief and directly related to application, this is a great start).
Algebraic structures are well understood and when you map these structures to the data that your program is manipulating, you get a ton of things for free. In Haskell for example, you have an entire module (Control.Monad) worth of functions that you can immediately use in your programs if you make something a monad. And if someone else writes a function that works with monads, because a monad is defined precisely, mathematically and without variation across software packages, it's guaranteed to work for your Monad too. Accordingly, in Haskell, things like lists, the Maybe type and famously IO are all monad instances. You get higher levels of abstraction and a lot less duplicated code (which is, of course, one of the biggest — if not the biggest — evils in software construction). Rather than teach design patterns, we can teach truths about various categories and how these truths mirror applications in different domains. Usually, we are trying to distill truths from domains (which is an arbitrarily imprecise process because domains change while mathematical truths don't) but when we try to fit truths to domains, those truths will always be true. Just as in algorithms, if we know something is undoubtedly true, that's extremely powerful knowledge, it implies that a 100 other things are not true and we can avoid many design decisions. Fewer decisions means fewer places to make mistakes. XMonad is a great study in mapping well-understood algebraic structures to code/data and as result, gaining fail-safety and correctness.
Besides increased abstraction capability, when you use algebraic structures, you also get precision. When you write software, you often create abstractions that work well most of the time but occasionally, there will be a couple exceptions whereby information leaks from the abstraction and then, you end up paying for these exceptions when bugs seep through these information leaks. But with precisely-defined algebraic structures, you can avoid creating leaky abstractions. A simple example in Haskell is that you can't ever perform I/O in a function unless you specify the return type of the function to be IO a. This means that these functions should always run the same way, not fail randomly and give the same answer given the same arguments. In a different language, you might have a function that is seemingly completely pure but for example, uses the last cursor position (which most likely, uses an API that does some I/O). But because you don't want to pass the cursor position in through a chain of arguments let's say, you just do the I/O right there but then, now you have a function that might occasionally fail and a system that is become much harder to debug and more likely to fail because there's tiny bits of I/O happening in different places. Haskell's type system will prevent you from making such mistakes. This is a double-edged sword, however. With precision, you lose some flexibility. Haskell is getting better at handling these things: we are discovering new abstractions (zippers, monads are all fairly recent discoveries) that better encapsulate what we want to do with our programs and by allowing explicit exceptions (such as unsafePerformIO) especially for debugging purposes.
(3) Run-time testing for properties of programs that you can't test statically even with a language like Haskell is greatly improved with tools like QuickCheck. QuickCheck is a simple tool that lets you express properties about a function and verify that your function is doing what you want it to do.
Let's say you write a sort function called sort. Rather than write traditional unit-tests (where you'd only be testing a small subset of cases) like:
(sort [4, 2, 3]) == [2, 3, 4]
You could write
(sort x) == (quicksort x)
And QuickCheck will run through a 100 hundred test cases that it auto-generated including edge cases like empty lists and lists with non-unique elements.
Or if you don't have a reference implementation to compare with, you can express the property directly. What it means to say that a list is sorted is that every element in that list is either equal to or smaller than the next element.
and $ map (\x -> fst x < snd x) $ zip a (tail a)
QuickCheck also does a few other things that are really cool. Check it out.
To conclude, Haskell is in no way the panacea (or the cure). I'm not even advocating a single language here. I only believe that as a community, we need to push towards making more theory practical in all aspects of computer science and software construction so we can write better code that is not a house of cards which might come crashing down at any time and potentially affect the lives of millions adversely. For those of us who are traditionally trained in universities, we maybe do one class on theoretical foundations of computation and maybe, another class on algorithms. Then, we never touch theory for the rest of our lives again. Let that not be case anymore because theory matters and theory will help us solve our problems once and for all better than trial-and-error ever will.