21 January 2013, by Ben 13 comments
My wife gave me a real geek book for Christmas: Masterminds of Programming by two guys named Federico Biancuzzi and Shane Warden. In it they interview the creators of 17 well-known or historically important programming languages.
The book was a very good read, partly because not all the questions were about the languages themselves. The interviewers seemed very knowledgeable, and were able to spring-board from discussing the details of a language to talking about other software concepts that were important to its creator. Like software engineering practices, computer science education, software bloat, debugging, etc. The languages that everyone’s heard of and used are of course in there: C++, Java, C#, Python, Objective-C, Perl, and BASIC. There are a few missing — for example, the Japanese creator of Ruby didn’t feel comfortable being interviewed in English, and the publishers considered translation too expensive.
But what I really liked were interviews about some of the domain-specific languages, such as SQL, AWK, and PostScript. As well as some of the languages that were further off the beaten track, like APL, Haskell, ML, Eiffel, Lua, and Forth. The one thing I didn’t go for was the 60 pages with the UML folks. That could have been cut, or at least condensed — half of it (somewhat ironically) was them talking about how UML had gotten too big.
If you’re a programmer, definitely go and buy the book (the authors paid me 0x0000 to say that). But in the meantime, below are a few more specific notes and quotes from the individual interviews.
This review got rather long. From here on, it’s less of a real review, and more my “quotes and notes” on the individual chapters. I hope you’ll find it interesting, but for best results, click to go to the languages you’re interested in: C++, Python, APL, Forth, BASIC, AWK, Lua, Haskell, ML, SQL, Objective-C, Java, C#, UML, Perl, PostScript, Eiffel.
C++, Bjarne Stroustrup
C++ might be one of the least exciting languages on the planet, but the interview wasn’t too bad.
I knew RAII was big in C++, and Stroustrup plugged it two or three times in this fairly short interview. Another thing I found interesting was his comment that “C++ is not and was never meant to be just an object-oriented programming language … the idea was and is to support multiple programming styles”. Stroustrup’s very big on generic programming with templates, and he badgered Java and C# for adding generics so late in their respective games.
He does note that “the successes at community building around C++ have been too few and too limited, given the size of the community … why hasn’t there been a central repository for C++ libraries since 1986 or so?” A very good thought for budding language designers today. A PyPI or a CPAN for C++ would have been a very good idea.
As usual, though, he sees C++ a little too much as the solution for everything (for example, “I have never seen a program that could be written better in C than in C++”). I think in the long run this works against him.
Python, Guido van Rossum
One thing Python’s creator talks about is how folks are always asking to add new features to the languages, but to avoid it becoming a huge hodge-podge, you’ve got to do an awful lot of pushing back. “Telling people you can already do that and here is how is a first line of defense,” he says, going on to describe stages two, three, and four before he considers a feature worth including into the core. In fact, this is something that came up many times in the book. To keep things sane and simple, you’ve got to stick to your vision, and say no a lot.
Relatedly, Guido notes, “If a user [rather than a Python developer] proposes a new feature, it is rarely a success, since without a thorough understanding of the implementation (and of language design and implementation in general) it is nearly impossible to properly propose a new feature. We like to ask users to explain their problems without having a specific solution in mind, and then the developers will propose solutions and discuss the merits of different alternatives with the users.”
After just reading Stroustrup’s fairly involved approach to testing, Guido’s approach seemed almost primitive — though much more in line with Python’s philosophy, I think: “When writing your basic pure algorithmic code, unit tests are usually great, but when writing code that is highly interactive or interfaces to legacy APIs, I often end up doing a lot of manual testing, assisted by command-line history in the shell or page-reload in the browser.” I know the feeling — when developing a web app, you usually don’t have the luxury of building full-fledged testing systems.
One piece of great advice is that early on when you only have a few users, fix things drastically as soon as you notice a problem. He relates an anecdote about Make: “Stuart Feldman, the original author of “Make” in Unix v7, was asked to change the dependence of the Makefile syntax on hard tab characters. His response was something along the lines that he agreed tab was a problem, but that it was too late to fix since there were already a dozen or so users.”
APL, Adin Falkoff
APL is almost certainly the strangest-looking real language you’ll come across. It uses lots of mathematical symbols instead of ASCII-based keywords, partly for conciseness, partly to make it more in line with maths usage. For example, here’s a one-liner implementation of the Game of Life in APL:
Yes, Falkoff admits, it takes a while to get the hang of the notation. The weird thing is, this is in 1964, years before Unicode, and originally you had to program APL using a special keyboard.
Anyway, despite that, it’s a very interesting language in that it’s array-oriented. So when parallel computing and Single Instruction, Multiple Data came along, APL folks updated their compilers, and all existing APL programs were magically faster without any tweaking. Try that with C’s semantics.
Forth, Chuck Moore
He’s an extremist, but also sometimes half right. For example, “Operating systems are dauntingly complex and totally unnecessary. It’s a brilliant thing Bill Gates has done in selling the world on the notion of operating systems. It’s probably the greatest con the world has ever seen.” And further on, “Compilers are probably the worst code ever written. They are written by someone who has never written a compiler before and will never do so again.”
Despite the extremism in the quotes above, there’s a lot folks could learn from Forth’s KISS approach, and a lot of good insight Moore has to share.
BASIC, Tom Kurtz
I felt the BASIC interview wasn’t the greatest. Sometimes it seemed Kurtz didn’t really know what he was talking about, for instance this paragraph, “I found Visual Basic relatively easy to use. I doubt that anyone outside of Microsoft would define VB as an object-oriented language. As a matter of fact, True BASIC is just as much object-oriented as VB, perhaps more so. True BASIC included modules, which are collections of subroutines and data; they provide the single most important feature of OOP, namely data encapsulation.” Modules are great, but OO? What about instantiation?
Some of his anecdotes about the constraints implementing the original Dartmouth BASIC were interesting, though: “The language was deliberately made simple for the first go-round so that a single-pass parsing was possible. It other words, variable names are very limited. A letter or a letter followed by a digit, and array names, one- and two-dimensional arrays were always single letters followed by a left parenthesis. The parsing was trivial. There was no table lookup and furthermore, what we did was to adopt a simple strategy that a single letter followed by a digit, gives you what, 26 times 11 variable names. We preallocated space, fixed space for the locations for the values of those variables, if and when they had values.”
AWK, Al Aho, Brian Kernighan, and Peter Weinberger
I guess AWK was popular a little before my (scripting) time, but it’s definitely a language with a neat little philosophy: make text processing simple and concise. The three creators are honest about some of the design trade-offs they made early on that might not have been the best. For example, there was tension between keeping AWK a text processing language, and adding more and more general-purpose programming features.
Apparently Aho didn’t write the most readable code, saying, “Brian Kernighan once took a look at the pattern-matching module that I had written and his only addition to that module was putting a comment in ancient Italian: ‘abandon all hope, ye who enter here’. As a consequence … I was the one that always had to make the bug fixes to that module.”
Another interesting Aho quote, this time about hardware: “Software does become more useful as hardware improves, but it also becomes more complex — I don’t know which side is winning.”
This Kernighan comment on bloated software designs echoes what Chuck Moore said about OSs: “Modern operating systems certainly have this problem; it seems to take longer and longer for my machines to boot, even though, thanks to Moore’s Law, they are noticeably faster than the previous ones. All that software is slowing me down.”
I agree with Weinberger that text files are underrated: “Text files are a big win. It requires no special tools to look at them, and all those Unix commands are there to help. If that’s not enough, it’s easy to transform them and load them into some other program. They are a universal type of input to all sorts of software. Further, they are independent of CPU byte order.”
There’s a lot more these three say about computer science education (being educators themselves), programming vs mathematics, and the like. But you’ll have to read the book.
Lua, Roberto Ierosalimschy and Luiz Henrique de Figueiredo
Lua fascinates me: a modern, garbage-collected and dynamically typed scripting languages that fits in about 200KB. Not to mention the minimalist design, with “tables” being Lua’s only container data type. Oh, and the interview was very good too. :-)
As an embedded programmer, I was fascinated by a comment of Roberto’s — he mentions Lua’s use of C doubles as the single numeric type in Lua, but “even using double is not a reasonable choice for embedded systems, so we can compiler the interpreter with an alternative numerical type, such as long.”
Speaking of concurrency, he notes that in the HOPL paper about the evolution of Lua they wrote, “We still think that no one can write correct programs in a language where
a=a+1 is not deterministic.” I’ve been bitten by multi-threading woes several times, and that’s a great way to put!
They note they “made many small [mistakes] along the way.” But in contrast to Make’s hard tab issue, “we had the chance to correct them as Lua evolved. Of course this annoyed some users, because of the incompatibilities between versions, but now Lua is quite stable.”
Roberto also had a fairly extreme, but very thought provoking quote on comments: “I usually consider that if something needs comments, it is not well written. For me, a comment is almost a note like ‘I should try to rewrite this code later.’ I think clear code is much more readable than commented code.”
I really like these guys’ “keep it as simple as possible, but no simpler” philosophy. Most languages (C++, Java, C#, Python) just keep on adding features and features. But to Lua they’ve now “added if not all, most of the features we wanted.” Reminds me of Knuth’s TeX only allowing bugfixes now, and its version number converging to pi — there’s a point at which the feature set just needs to be frozen.
Haskell, Simon Peyton Jones, John Hughes, and Paul Hudak
I’ve heard a lot about Haskell, of course, but mainly I’ve thought, “this is trendy, it must be a waste of time.” And maybe there’s truth to that, but this interview really made me want to learn it. John’s comments about file I/O made me wonder how one does I/O in a purely functional language…
It’s a fascinating language, and if Wikipedia is anything to go by, it’s influenced a boatload of other languages and language features. List comprehensions (and their lazy equivalent, generator expressions), which are one of my favourite features of Python, were not exactly invented by Haskell, but were certainly popularized by it.
There’s a lot more in this interview (on formalism in language specification, education, etc), but again, I’m afraid you’ll have to read the book.
ML, Robin Milner
Sorry ML, I know you came first in history, but Haskell came before you in this book, so you were much less interesting. Seriously though, although ML looks like an interesting language, this chapter didn’t grab me too much. There was a lot of discussion on formalism, models, and theoretical stuff (which aren’t really my cup of tea).
What was interesting (and maybe this is why all the formalism) is that ML was designed “for theorem proving. It turned out that theorem proving was such a demanding sort of task that [ML] became a general-purpose language.”
SQL, Don Chamberlin
One often forgets how old SQL is: almost 40 years now. But still incredibly useful, and — despite the NoSQL people — used by most large-scale websites as well as much desktop and enterprise software. So the interview’s discussion of the history of SQL was a good read.
One of the interesting things was they wanted SQL to be used by users, not just developers. “Computer, query this and that table for X, Y, and Z please.” That didn’t quite work out, of course, and SQL is really only used by developers (and with ORMs and suchlike, much of that is not hand-coded). But it was a laudable goal.
The other interesting point was the reasons they wanted SQL to be declarative, rather than procedural. One of the main reasons was optimizability: “If the user tells the system in detailed steps what algorithm to use to process a query, the the optimizer has no flexibility to make changes, like choosing an alternative access path or choosing a better join order. A declarative language is much more optimizer-friendly than a lower-level procedural language.” Many of the other database query languages of the day were more procedural, and nobody’s heard of them today.
Objective-C, Tom Love and Brad Cox
When I started writing Oyster.com’s iPad app, I of course had to learn Objective-C. At first (like most developers) I was put off by [all [the [square brackets]]] and the longNamesThatTryToDocumentThemselves. But after you get into it, you realize that’s just syntax and style, and the core of Objective-C is actually quite elegant — adding Smalltalk-style OO to C in a low-impact way.
It’s been popularized by Apple for Mac and iOS development, of course, and also been expanded heavily by them, but it really hasn’t strayed from its roots. As Tom Love said, it’s “still Objective-C through and through. It stays alive.”
Tom gives some reasoning behind the ugly syntax: “The square brackets are an indication of a message sent in Objective-C. The original idea was that once you built up a set of libraries of classes, then you’re going to spend most of your time actually operating inside the square brackets … It was a deliberate decision to design a language that essentially had two levels — once you had built up enough capability, you could operate at the higher level … Had we chosen a very C-like syntax, I’m not sure anybody would know the name of the language anymore and it wouldn’t likely still be in use anywhere.”
Tom Love has gone on to be involved with some huge systems and codebases (millions of lines of code), and shares some experience and war stories about those. One of the more off-the-wall ideas he mentions to help would-be project managers get experience is to have a “project simulator” (like a flight simulator): “There is a problem of being able to live long enough to do 100 projects, but if you could simulate some of the decisions and experiences so that you could build your resume based on simulated projects as contrasted to real projects, that would also be another way to solve the problem.”
When asked, “Why emulate Smalltalk?”, Brad Cox says that “it hit me as an epiphany over all of 15 minutes. Like a load of bricks. What had annoyed me so much about trying to build large projects in C was no encapsulation anywhere…”
Comparing Objective-C to C++, he pulls out an integrated circuit metaphor, “Bjarne [C++] was targeting an ambitious language: a complex software fabrication line with an emphasis on gate-level fabrication. I was targeting something much simpler: a software soldering iron capable of assembling software ICs fabricated in plain C.”
One extreme idea (to me) that Brad mentions is in his discussion of why Objective-C forbids multiple inheritance. “The historical reason is that Objective-C was a direct descendant of Smalltalk, which doesn’t support inheritance, either. If I revisited that decision today, I might even go so far as to remove single inheritance as well. Inheritance just isn’t all that important. Encapsulation is OOP’s lasting contribution.”
Unfortunately for me, the rest of the interview was fairly boring, as Brad is interested in all the things I’m not — putting together large business systems with SOA, JBI, SCA, and other TLAs. I’m sure there are real problems those things are trying to solve, but the higher and higher levels of abstraction just put me to sleep.
Java, James Gosling
Like the C++ interview, the Java interview was a lot more interesting than the language is.
I know that in theory JIT compilers can do a better job than more static compilers: “When HotSpot runs, it knows exactly what chipset you’re running on. It knows exactly how the cache works. It knows exactly how the memory hierarchy works. It knows exactly how all the pipeline interlocks work in the CPU … It optimizes for precisely what machine you’re running on. Then the other half of it is that it actually sees the application as it’s running. It’s able to have statistics that know which things are important. It’s able to inline things that a C compiler could never do.” Those are cool concepts, but I was left wondering: how well do they actually work in practice? For what cases does well-written Java actually run faster than well-written C? (One might choose Java for many other reasons than performance, of course.)
James obviously has a few hard feelings towards C#: “C# basically took everything, although they oddly decided to take away the security and reliability stuff by adding all these sort of unsafe pointers, which strikes me as grotesquely stupid.”
And, interestingly, he has almost opposite views on documentation to Roberto Ierosalimschy from Lua: “The more, the better.” That’s a bit of a stretch — small is beautiful, and there’s a reason people like the conciseness of K&R.
C#, Anders Hejlsberg
Gosling may be right that C# is very similar to (and something of a copy of) Java, but it’s also a much cleaner language in many ways. Different enough to be a separate language? I don’t know, but now C# and Java have diverged enough to consider them quite separately. Besides, all languages are influenced by existing languages to a lesser or greater extent, so why fuss?
In any case, Anders was the guy behind the Turbo Pascal compiler, which was a really fast IDE and compiler back in the 1980’s. That alone makes him worth listening to, in my opinion.
What he said about the design of LINQ with regard to C# language features was thought-provoking: “If you break down the work we did with LINQ, it’s actually about six or seven language features like extension methods and lambdas and type inference and so forth. You can then put them together and create a new kind of API. In particular, you can create these query engines implemented as APIs if you will, but the language features themselves are quite useful for all sorts of other things. People are using extension methods for all sorts of other interesting stuff. Local variable type inference is a very nice feature to have, and so forth.”
Surprisingly (with Visual Studio at his fingertips), Anders’ approach to debugging was surprisingly similar to Guido van Rossum’s: “My primary debugging tool is Console.Writeline. To be honest I think that’s true of a lot of programmers. For the more complicated cases, I’ll use a debugger … But quite often you can quickly get to the bottom of it just with some simple little probes.”
UML, Ivar Jacobson, Grady Booch, and James Rumbaugh
As I mentioned, the UML interview was too big, but a good portion of it was the creators talking about how UML itself had grown too big. Not just one or two of them — all three of them said this. :-)
I’m still not quite sure what exactly UML is: a visual programming language, a specified way of diagramming different aspects of a system, or something else? This book is about programming languages, after all — so how do you write a “Hello, World” program in UML? Ah, like this, that makes me very enthusiastic…
Seriously, though, I think their critique of UML as something that had been taken over by design-by-committee made a lot of sense. A couple of them referred to something they called “Essential UML”, which is the 20% of UML that’s actually useful for developers.
Ivar notes how trendy buzzwords can make old ideas seem revolutionary: “The ‘agile’ movement has reminded us that people matter first and foremost when developing software. This is not really new … in bringing these things back to focus, much is lost or obscured by new terms for old things, creating the illusion of something completely new.” In the same vein, he says that “the software industry is the most fashion-conscious industry I know of”. Too true.
Grady Booch gave some good advice about reading code: “A question I often ask academics is, ‘How many of you have reading courses in software?’ I’ve had two people that have said yes. If you’re an English Lit major, you read the works of the masters. If you want to be an architect in the civil space, then you look at Vitruvius and Frank Lloyd Wright … We don’t do this in software. We don’t look at the works of the masters.” This actually made me go looking at the Lua source code, which is very tidy and wonderfully cross-referenced — really a good project to learn from.
Perl, Larry Wall
I’ve never liked the look of Perl, but Wall’s approach to language design is as fascinating as he is. He originally studied linguistics in order to be a missionary with Wycliffe Bible Translators and translate the Bible into unwritten languages, but for health reasons had to pull out of that. Instead, he used his linguistics background to shape his programming language. Some of the “fundamental principles of human language” that have “had a profound influence on the design of Perl over the years” are:
- Expressiveness is more important than learnability.
- A language can be useful even before you have learned the whole language.
- There are often several good ways to say roughly the same thing.
- Shortcuts abound; common expressions should be shorter than uncommon expressions.
- Languages make use of pronouns when the topic of conversation is apparent.
- Healthy culture is more important than specific technology to a language’s success.
- It’s OK to speak with an accent as long as you can make yourself understood.
There are many others he lists, but those are some that piqued my interest. Larry’s a big fan of the human element in computer languages, noting that “many language designers tend to assume that computer programming is an activity more akin to an axiomatic mathematical proof than to a best-effort attempt at cross-cultural communication.”
He discusses at length some of the warts in previous versions of Perl, that they’re trying to remedy with Perl version 6. One of the interesting ones was the (not so) regular expression syntax: “When Unix culture first invented their regular-expression syntax, there were just a very few metacharacters, so they were easy to remember. As people added more and more features to their pattern matches, they either used up more ASCII symbols as metacharacters, or they used longer sequences that had previously been illegal, in order to preserve backward compatibility. Not surprisingly, the result was a mess … In Perl 6, as we were refactoring the syntax of pattern matching we realized that the majority of the ASCII symbols were already metacharacters anyway, so we reserved all of the nonalphanumerics as metacharacters to simplify the cognitive load on the programmer. There’s no longer a list of metacharacters, and the syntax is much, much cleaner.”
Larry’s not a pushy fellow. His subtle humour and humility are evident throughout the interview. For example, “It has been a rare privelege in the Perl space to actually have a successful experiment called Perl 5 that would allow us try a different experiment that is called Perl 6.” And on management, “I figured out in the early stage of Perl 5 that I needed to learn to delegate. The big problem with that, alas, is that I haven’t a management bone in my body. I don’t know how to delegate, so I even delegated the delegating, which seems to have worked out quite well.”
PostScript, Charles Geschke and John Warnock
My experience with Forth makes me very interested in PostScript, even though it’s a domain-specific printer control language, and wasn’t directly inspired by Forth. It’s stack-based and RPN, like Forth, but it’s also dynamically typed, has more powerful built-in data structures than Forth, and is garbage collected.
One thing I wasn’t fully aware of is how closely related PostScript is to PDF. PDF is basically a “static data structure” version of PostScript — all the Turing-complete stuff like control flow and logic is removed, but the fonts, layout and measurements are done exactly the same way as PostScript.
Some of the constraints they relate about implementing PostScript back in the early days are fascinating. The original LaserWriter had the “largest amount of software ever codified in a ROM” — half a megabyte. “Basically we put in the mechanism to allow us to patch around bugs, because if you had tens of thousands or hundreds of thousands of printers out there, you couldn’t afford to send out a new set of ROMs every month.” One of the methods they used for the patching was PostScript’s late binding, and its ability to redefine any operators, even things like the “add” instruction.
I didn’t realize till half way through that the interviewees (creators of PostScript) were the co-founders of Adobe and are still the co-chairmen of the company. The fact that their ability extends both to technical and business … well, I guess there are quite a few software company CEOs that started as programmers, but I admire that.
Weird trivia: In May 1992, Charles Geschke was approached by two men who kidnapped him at gunpoint. The long (and fascinating) story was told five years later when the Geschkes were ready to talk about it. Read it in four parts here: part one, part two, part three, and part four.
Eiffel, Bertrand Meyer
Eiffel, you’re last and not least. Eiffel is quite different from Java or C#, though it influenced features in those languages. It incorporates several features that most developers (read: I) hadn’t heard of.
Design by Contract™, which Microsoft calls Code Contracts, is a big part of Eiffel. I’m sure I’m oversimplifying, but to me they look like a cross between regular asserts and unit tests, but included at the language level (with all the benefits that brings). Bertrand Meyer can’t understand how people can live without it: “I just do not see how anyone can write two lines of code without this. Asking why one uses Design by Contract is like asking people to justify Arabic numerals. It’s those using Roman numerals for multiplication who should justify themselves.”
He does have a few other slightly extreme ideas, though. Here’s his thoughts on C: “C is a reasonably good language for compilers to generate, but the idea that human beings should program in it is completely absurd.” Hmmm … I suspect I’d have a hard time using Eiffel on an MSP430 micro with 128 bytes of RAM.
It appears that Eiffel has a whole ecosystem, a neat-looking IDE, company, and way of life built around it. One of the few languages I know of that’s also a successful company in its own right. It’d be like if ActiveState was called “PythonSoft” and run by Guido van Rossum.
That’s all folks. I know this falls in the category of “sorry about the long letter, I didn’t have time to write a short one”. If you’ve gotten this far, congratulations! Please send me an email and I’ll let you join my fan club as member 001.
Seriously though, if you have any comments, the box is right there below.