2016-04-19

Edited transcript of a talk given on March 4, 2016, at the Computer History Museum, Mountain View, California.



I normally spend my time trying to build the future. But I find history really interesting and informative, and I study it quite a lot. Usually it’s other people’s history. But the Computer History Museum asked me to talk today about my own history, and the history of technology I’ve built. So that’s what I’m going to do here.

This happens to be a really exciting time for me—because a bunch of things that I’ve been working on for more than 30 years are finally coming to fruition. And mostly that’s what I’ve been out in the Bay Area this week talking about.

The focus is the Wolfram Language, which is really a new kind of language—a knowledge-based language—in which as much knowledge as possible about computation and about the world is built in. And in which the language automates as much as possible so one can go as directly as possible from computational thinking to actual implementation.

And what I want to do here is to talk about how all this came to be, and how things like Mathematica and Wolfram|Alpha emerged along the way.

Inevitably a lot of what I’m going to talk about is really my story: basically the story of how I’ve spent most of my life so far building a big stack of technology and science. When I look back, some of what’s happened seems sort of inevitable and inexorable. And some I didn’t see coming.

But let me begin at the beginning. I was born in London, England, in 1959—so, yes, I’m outrageously old, at least by my current standards. My father ran a small company—doing international trading of textiles—for nearly 60 years, and also wrote a few “serious fiction” novels. My mother was a philosophy professor at Oxford. I actually happened to notice her textbook on philosophical logic in the Stanford bookstore last time I was there.

You know, I remember when I was maybe 5 or 6 being bored at some party with a bunch of adults, and somehow ending up talking at great length to some probably very distinguished Oxford philosopher—who I heard say at the end, “One day that child will be a philosopher—but it may take a while.” Well, they were right. It’s sort of funny how these things work out.

Here’s me back then:



I went to elementary school in Oxford—to a place called the Dragon School, that I guess happens to be probably the most famous elementary school in England. Wikipedia seems to think the most famous people now from my class there are myself and the actor Hugh Laurie.

Here’s one of my school reports, from when I was 7. Those are class ranks. So, yes, I did well in poetry and geography, but not in math. (And, yes, it’s England, so they taught “Bible study” in school, at least then.) But at least it said “He is full of spirit and determination; he should go far”…



But OK, that was 1967, and I was learning Latin and things—but what I really liked was the future. And the big future-oriented thing happening back then was the space program. And I was really interested in that, and started collecting all the information I could about every spacecraft launched—and putting together little books summarizing it. And I discovered that even from England one could write to NASA and get all this great stuff mailed to one for free.

Well, back then, there was supposed to be a Mars colony any day, and I started doing little designs for that, and for spacecraft and things.

And that got me interested in propulsion and ion drives and stuff like that—and by the time I was 11 what I was really interested in was physics.

And I discovered—having nothing to do with school—that if one just reads books one can learn stuff pretty quickly. I would pick areas of physics and try to organize knowledge about them. And when I was turning 12 I ended up spending the summer putting together all the facts I could accumulate about physics. And, yes, I suppose you could call some of these “visualizations”. And, yes, like so much else, it’s on the web now:

I found this again a few years ago—around the time Wolfram|Alpha came out—and I thought, “Oh my gosh, I’ve been doing the same thing all my life!” And then of course I started typing in numbers from when I was 11 or 12 to see if Wolfram|Alpha got them right. It did, of course:

Well, when I was 12, following British tradition I went to a so-called public school that’s actually a private school. I went to the most famous such school—Eton—which was founded about 50 years before Columbus came to America. And, oh so impressively , I even got the top scholarship among new kids in 1972.

Yes, everyone wore tailcoats all the time, and King’s Scholars, like me, wore gowns too—which provided excellent rain protection etc. I think I avoided these annual Harry-Potter-like pictures all but one time:

And back in those Latin-and-Greek-and-tailcoat days I had a sort of double life, because my real passion was doing physics.

The summer when I turned 13 I put together a summary of particle physics:

And I made the important meta-discovery that even if one was a kid, one could discover stuff. And I started just trying to answer questions about physics, either by finding answers in books, or by figuring them out myself. And by the time I was 15 I started publishing papers about physics. Yes, nobody asks how old you are when you mail a paper in to a physics journal.

But, OK, something important for me had happened back when I was 12 and first at Eton: I got to know my first computer. It’s an Elliott 903C. This is not the actual one I used, but it’s similar:

It had come to Eton through a teacher of mine named Norman Routledge, who had been a friend of Alan Turing’s. It had 8 kilowords of 18-bit ferrite core memory, and you usually programmed it with paper—or Mylar—tape, most often in a little 16-instruction assembler called SIR.

It often seemed like one of the most important skills was rewinding the tape as quickly as possible after it got dumped in a bin after going through the optical reader.

Anyway, I wanted to use the computer to do physics. When I was 12 I had gotten this book:

What’s on the cover is supposed to be a simulation of gas molecules showing increasing randomness and entropy. As it happens, years later I discovered this picture was actually kind of a fake. But back when I was 12, I really wanted to reproduce it—with the computer.

It wasn’t so easy. The molecule positions were supposed to be real numbers; one had to have an algorithm for collisions; and so on. And to make this fit on the Elliott 903 I ended up simplifying a lot—to what was actually a 2D cellular automaton.

Well, a decade after that, I made some big discoveries about cellular automata. But back then I was unlucky with my cellular automaton rule, and I ended up not discovering anything with it. And in the end my biggest achievement with the Elliott 903 was writing a punched tape loader for it.

You see, the big problem with the Mylar tape that one used for serious programs is that it would get statically electrically charged and pick up little confetti holes, so the bits would be read wrong. Well, for my loader, I came up with what I later found out were error-correcting codes—and I set it up so that if the checks failed, the tape would stop in the reader, and you could pull it back a couple of feet, and then re-read it, after shaking out the confetti.

OK, so by the time I was 16 I had published some physics papers and was starting to be known in physics circles—and I left school, and went to work at a British government lab called the Rutherford Lab that did particle physics research.

Now you might remember from my age-7 school report that I didn’t do very well in math. Things got a bit better when I started to use a slide rule, and then in 1972 a calculator—of which I was a very early adopter. But I never liked doing school math, or math calculations in general. Well, in particle physics there’s a lot of math to be done—and so my dislike of it was a problem.

At the Rutherford Lab, two things helped. First, a lovely HP desktop computer with a plotter, on which I could do very nice interactive computation. And second, a mainframe for crunchier things, that I programmed in Fortran.

Well, after my time at the Rutherford Lab I went to college at Oxford. Within a very short time I’d decided this was a mistake—but in those days one didn’t actually have to go to lectures for classes—so I was able to just hide out and do physics research. And mostly I spent my time in a nice underground air-conditioned room in the Nuclear Physics building—that had terminals connected to a mainframe, and to the ARPANET.

And that was when—in 1976—I first started using computers to do symbolic math, and algebra and things. Feynman diagrams in particle physics involve lots and lots of algebra. And back in 1962, I think, three physicists had met at CERN and decided to try to use computers to do this. They had three different approaches. One wrote a system called ASHMEDAI in Fortran. One—influenced by John McCarthy at Stanford—wrote a system called Reduce in Lisp. And one wrote a system called SCHOONSCHIP in CDC 6000 series assembly language, with mnemonics in Dutch. Curiously, years later, one of these physicists won a Nobel Prize. It was Tini Veltman—the one who wrote SCHOONSCHIP in assembly language.

Anyway, back in 1976 very few people other than the creators of these systems used them. But I started using all of them. But my favorite was a quite different system, written in Lisp at MIT since the mid-1960s. It was a system called Macsyma. It ran on the Project MAC PDP-10 computer. And what was really important to me as a 17-year-old kid in England was that I could get to it on the ARPANET.

It was host 236. So I would type @O 236, and there I was in an interactive operating system. Someone had taken the login SW. So I became Swolf, and started to use Macsyma.

I spent the summer of 1977 at Argonne National Lab—where they actually trusted physicists to be right in the room with the mainframe.

Then in 1978 I went to Caltech as a graduate student. By that point, I think I was the world’s largest user of computer algebra. And it was so neat, because I could just compute all this stuff so easily. I used to have fun putting incredibly ornate formulas in my physics papers. Then I could see if anyone was reading the papers, because I’d get letters saying, “How did you derive line such-and-such from the one before?”

I got a reputation for being a great calculator. Which was of course 100% undeserved—because it wasn’t me, it was just the computer. Well, actually, to be fair, there was part that was me. You see, by being able to compute so many different examples, I had gotten a new kind of intuition. I was no good at computing integrals myself, but I could go back and forth with the computer, knowing from intuition what to try, and then doing experiments to see what worked.

I was writing lots of code for Macsyma, and building this whole tower. And sometime in 1979 I hit the edge. Something new was needed. (Notice, for example, the ominous “MACSYMA RELOAD” line in the diagram.)

Well, in November 1979, just after I turned 20, I put together some papers, called it a thesis, and got my PhD. And a couple of days later I was visiting CERN in Geneva—and thinking about my future in, I thought, physics. And the one thing I was sure about was that I needed something beyond Macsyma that would let me compute things. And that was when I decided I had to build a system for myself. And right then and there, I started designing the system, handwriting its specification.

At first it was going to be ALGY–The Algebraic Manipulator. But I quickly realized that I actually had to make it do much more than algebraic manipulation. I knew most of the general-purpose computer languages of the time—both the ALGOL-like ones, and ones like Lisp and APL. But somehow they didn’t seem to capture what I wanted the system to do.

So I guess I did what I’d learned in physics: I tried to drill down to find the atoms—the primitives—of what’s going on. I knew a certain amount about mathematical logic, and the history of attempts to formulate things using logic and so on—even if my mother’s textbook about philosophical logic didn’t exist yet.

The whole history of this effort at formalization—through Aristotle, Leibniz, Frege, Peano, Hilbert, Whitehead, Russell, and so on—is really interesting. But it’s a different talk. But back in 1979 it was thinking about this kind of thing that led me to the design I came up with, that was based on the idea of symbolic expressions, and doing transformations on them.

I named what I wanted to build SMP: a Symbolic Manipulation Program, and started recruiting people from around Caltech to help me with it. Richard Feynman came to a bunch of the meetings I had to discuss the design of SMP, offering various ideas—which I have to admit I considered hacky—about shortcuts for interacting with the system. Meanwhile, the physics department had just gotten a VAX 11/780, and after some wrangling, it was made to run Unix. Meanwhile, a young physics grad student named Rob Pike—more recently creator of the Go programming language—persuaded me that I should write the code for my system in the “language of the future”: C.

I got pretty good at writing C, for a while averaging about a thousand lines a day. And with the help of a somewhat colorful collection of characters, by June 1981, the first version of SMP existed—with a big book of documentation I’d written.

OK, you might ask: so can we see SMP? Well, back when we were working on SMP I had the bright idea that we should protect the source code by encrypting it. And—you guessed it—over a span of three decades nobody remembers the password. And until a little while ago, that was the situation.

In another bright idea, I had used a modified version of the Unix crypt program to do the encryption—thinking that would be more secure. Well, as part of the 25th anniversary of Mathematica a couple of years ago, we did a crowdsourced project to break the encryption—and we did it. Unfortunately it wasn’t easy to compile the code though—but thanks to a 15-year-old volunteer, we’ve actually now got something running.

So here it is: running inside a VAX virtual machine emulator, I can show you for the first time in public in 30 years—a running version of SMP.

SMP had a mixture of good ideas, and very bad ideas. One example of a bad idea—actually suggested to me by Tini Veltman, author of SCHOONSHIP—was representing rationals using floating point, so one could make use of the faster floating-point instructions on many processors. But there were plenty of other bad ideas too, like having a garbage collector that had to crawl the stack and realign pointers when it ran.

There were some interesting ideas. Like what I called “projections”—which were essentially a unification of functions and lists. They were almost wonderful, but there were confusions about currying—or what I called tiering. And there were weird edge cases about things that were almost vectors with sequential integer indices.

But all in all, SMP worked pretty well, and I certainly found it very useful. So now the next problem was what to do with it. I realized it needed a real team to work on it, and I thought the best way to get that was somehow to make it commercial. But at the time I was a 21-year-old physics-professor type, who didn’t know anything about business.

So I thought, let me go to the tech transfer office at the university, and ask them what to do. But it turned out they didn’t know, because, as they explained, “Mostly professors don’t come to us; they just start their own companies.” “Well,” I said, “can I do that?” And right then and there the lawyer who pretty much was the tech transfer office pulled out the faculty handbook, and looked through it, and said, “Well, yes, it says copyrightable materials are owned by their authors, and software is copyrightable, so, yes, you can do whatever you want.”

And so off I went to try to start a company. Though it turned out not to be so simple—because suddenly the university decided that actually I couldn’t just do what I wanted.

A couple of years ago I was visiting Caltech and I ran into the 95-year-old chap who had been the provost at the time—and he finally filled in for me the remaining details of what he called the “Wolfram Affair”. It was more bizarre than one could possibly imagine. I won’t tell it all here. But suffice it to say that the story starts with Arnold Beckman, Caltech postdoc in 1929, claiming rights to the pH meter, and starting Beckman Instruments—and then in 1980 being chairman of the Caltech board of trustees and being upset when he realized that gene-sequencing technology had been invented at Caltech and had “walked off campus” to turn into Applied Biosystems.

But the company I started weathered this storm—even if I ended up quitting Caltech, and Caltech ended up with a weird software-ownership policy that affected their computer-science recruiting efforts for a long time.

I didn’t do a great job starting what I called Computer Mathematics Corporation. I brought in a person—who happened to be twice my age—to be CEO. And rather quickly things started to diverge from what I thought made sense.

One of my favorite moments of insanity was the idea to get into the hardware business and build a workstation to run SMP on. Well, at the time no workstation had enough memory, and the 68000 didn’t handle virtual memory. So a scheme was concocted whereby two 68000s would run an instruction out of step, and if the first one saw a page fault, it would stop the other one and fetch the data. I thought it was nuts. And I also happened to have visited Stanford, and run into a grad student named Andy Bechtolsheim who was showing off a Stanford University Network—SUN—workstation with a cardboard box as a case.

But worse than all that, this was 1981, and there was the idea that AI—in the form of expert systems—was hot. So the company merged with another company that did expert systems, to form what was called Inference Corporation (which eventually became Nasdaq:INFR). SMP was the cash cow—selling for about $40,000 a copy to industrial and government research labs. But the venture capitalists who’d come in were convinced that the future was expert systems, and after not very long, I left.

Meanwhile I’d become a big expert on the intellectual property policies of universities—and eventually went to work at the Institute for Advanced Study in Princeton, where the director very charmingly said that since they’d “given away the computer” after von Neumann died, it didn’t make much sense for them to claim IP rights to anything now.

I dived into basic science, working a lot on cellular automata, and discovering some things I thought were very interesting. Here’s me with my SUN workstation with cellular automata running on it (and, yes, the mollusc looks like the cellular automaton):

I did some consulting work, mostly on technology strategy, which was very educational, particularly in seeing things not to do. I did quite a lot of work for Thinking Machines Corporation. I think my most important contribution was going to see the movie WarGames with Danny Hillis—and as we were walking out of the movie theater, saying to Danny, “Maybe your computer should have flashing lights too.” (The flashing lights ended up being a big feature of the Connection Machine computer—certainly important in its afterlife in museums.)

I was mostly working on basic science—but “because it would be easy” I decided to do a software project of building a C interpreter that we called IXIS. I hired some young people—one of whom was Tsutomu Shimomura, whom I’d already fished out of several hacking disasters. I made the horrible mistake of writing the boring code nobody else wanted to write myself—so I wrote a (quite lovely) text editor, but the whole project flopped.

I had all kinds of interactions with the computer industry back then. I remember Nathan Myhrvold, then a physics grad student at Princeton, coming to see me to ask what to do with a window system he’d developed. My basic suggestion was “sell it to Microsoft”. As it happens, Nathan later became CTO of Microsoft.

Well, by about 1985 I’d done a bunch of basic science I was pretty pleased with, and I was trying to use it to start the field of what I called complex systems research. I ended up getting a little involved in an outfit called the Rio Grande Institute—that later became the Santa Fe Institute—and encouraging them to pursue this kind of research. But I wasn’t convinced about their chances, and I resolved to start my own research institute.

So I went around to lots of different universities, in effect to get bids. The University of Illinois won, ironically in part because they thought it would help their chances getting funding from the Beckman Foundation—which in fact it did. So in August 1986, off I went to the University of Illinois, and the cornfields of Champaign-Urbana, 100 miles south of Chicago.

I think I did pretty well at recruiting faculty and setting things up for the new Center for Complex Systems Research—and the university lived up to its end of the bargain too. But within a few weeks I started to think it was all a big mistake. I was spending all my time managing things and trying to raise money—and not actually doing science.

So I quickly came up with Plan B. Rather than getting other people to help with the science I wanted to do, I would set things up so I could just do the science myself, as efficiently as possible. And this meant two things: first, I had to have the best possible tools; and second, I needed the best possible environment for myself.

When I was doing my basic science I kept on using different tools. There was some SMP. Quite a lot of C. Some PostScript, and graphics libraries, and things. And a lot of my time was spent gluing all this stuff together. And what I decided was that I should try to build a single system that would just do all the stuff I wanted to do—and that I could expect to keep growing forever.

Well, meanwhile, personal computers were just getting to the point where it was plausible to build a system like this that would run on them. And I knew a lot about what to do—and not do—from my experience with SMP. So I started designing and building what became Mathematica.

My scheme was to write documentation to define what to build. I wrote a bunch of core code—for example for the pattern matcher—a surprising amount of which is still in the system all these years later. The design of Mathematica was in many respects less radical and less extreme than SMP. SMP had insisted on using the idea of transforming symbolic expressions for everything—but in Mathematica I saw my goal as being to design a language that would effectively capture all the possible different paradigms for thinking about programming in a nice seamless way.

At first, of course, Mathematica wasn’t called Mathematica. In a strange piece of later fate, it was actually called Omega. It went through other names. There was Polymath. And Technique. Here’s a list of names. It’s kind of shocking to me how many of these—even the really horrible ones—have actually been used for products in the years since.

Well, meanwhile, I was starting to investigate how to build a company around the system. My original model was something like what Adobe was doing at the time with PostScript: we build core IP, then license it to hardware companies to bundle. And as it happened, the first person to show interest in that was Steve Jobs, who was then in the middle of doing NeXT.

Well, one of the consequences of interacting with Steve was that we talked about the name of the product. With all that Latin I’d learned in school, I’d thought about the name “Mathematica” but I thought it was too long and ponderous. Steve insisted that “that’s the name”—and had a whole theory about taking generic words and romanticizing them. And eventually he convinced me.

It took about 18 months to build Version 1 of Mathematica. I was still officially a professor of physics, math and computer science at the University of Illinois. But apart from that I was spending every waking hour building software and later making deals.

We closed a deal with Steve Jobs at NeXT to bundle Mathematica on the NeXT computer:

We also made a bunch of other deals. With Sun, through Andy Bechtolsheim and Bill Joy. With Silicon Graphics, through Forest Baskett. With Ardent, through Gordon Bell and Cleve Moler. With the AIX/RT part of IBM, basically through Andy Heller and Vicky Markstein.

And eventually we set a release date: June 23, 1988.

Meanwhile, as documentation for the system, I wrote a book called Mathematica: A System for Doing Mathematics by Computer. It was going to be published by Addison-Wesley, and it was the longest lead-time element of the release. And it ended up being very tight, because the book was full of fancy PostScript graphics—which nobody could apparently figure out how to render at high-enough resolution. So eventually I just took a hard disk to a friend of mine in Canada who had a phototypesetting company, and he and I babysat his phototypesetting machine over a holiday weekend, after which I flew to Logan Airport in Boston and handed the finished film for the book to a production person from Addison-Wesley.

We decided to do the announcement of Mathematica in Silicon Valley, and specifically at the TechMart place in Santa Clara. In those days, Mathematica couldn’t run under MS-DOS because of the 640K memory limit. So the only consumer version was for the Mac. And the day before the announcement there we were stuffing disks into boxes, and delivering them to the ComputerWare software store in Palo Alto.

The announcement was a nice affair. Steve Jobs came—even though he was not really “out in public” at the time. Larry Tesler came from Apple—courageously doing a demo himself. John Gage from Sun had the sense to get all the speakers to sign a book:

And so that was how Mathematica was launched. The Mathematica Book became a bestseller in bookstores, and from that people started understanding how to use Mathematica. It was really neat seeing all these science types and so on—of all ages—who’d basically never used computers themselves before, starting to just compute things themselves.

It was fun looking through registration cards. Lots of interesting and famous names. Sometimes some nice juxtapositions. Like when I’d just seen an article about Roger Penrose and his new book in Time magazine with the headline “Those Computers Are Dummies”… but then there was Roger’s registration card for Mathematica.

As part of the growth of Mathematica, we ended up interacting with pretty much all possible computer companies, and collected all kinds of exotic machines. Sometimes that came in handy, like when the Morris worm came through the internet, and our gateway machine was a weird Sony workstation with a Japanese OS that the worm hadn’t been built for.

There were all kind of porting adventures. Probably my favorite was on the Cray-2. With great effort we’d gotten Mathematica compiled. And there we were, ready for the first calculation. And someone typed 2+2. And—I kid you not—it came out “5”. I think it was an issue with integer vs. floating point representation.

You know, here’s a price list from 1990 that’s a bit of a stroll down computer memory lane:

We got a boost when the NeXT computer came out, with Mathematica bundled on it. I think Steve Jobs made a good deal there, because all kinds of people got NeXT machines to run Mathematica. Like the Theory group at CERN—where the systems administrator was Tim Berners-Lee, who decided to do a little networking experiment on those machines.

Well, a couple of years in, the company was growing nicely—we had maybe 150 employees. And I thought to myself: I built this because I wanted to have a way to do my science, so isn’t it time I started doing that? Also, to be fair, I was injecting new ideas at too high a rate; I was worried the company might just fly apart. But anyway, I decided I would take a partial sabbatical—for maybe six months or a year—to do basic science and write a book about it.

So I moved from Illinois to the Oakland Hills—right before the big fire there, which narrowly missed our house. And I started being a remote CEO—using Mathematica to do science. Well, the good news was that I started discovering lots and lots of science. It was kind of a “turn a telescope to the sky for the first time” moment—except now it was the computational universe of possible programs.

It was really great. But I just couldn’t stop—because there kept on being more and more things to discover. And all in all I kept on doing it for ten and a half years. I was really a hermit, mostly living in Chicago, and mostly interacting only virtually… although my oldest three children were born during that period, so there were humans around!

I had thought maybe there’d be a coup at the company. But there wasn’t. And the company continued to steadily grow. We kept on doing new things.

Here’s our first website, from October 7, 1994:

And it wasn’t too long after that we started doing computation on the web:

I actually took a break from my science in 1996 to finish a big new version of Mathematica. Back in 1988 lots of people used Mathematica through a command line interface. In fact, it’s still there today. 1989^1989 is the basic computation I’ve been using since, yes, 1989, to test speed on a new machine. And actually a basic Raspberry Pi today gives a pretty good sense of what it was like back at the beginning.

But, OK, on the Mac and on NeXT back in 1988 we’d invented these things we called notebooks that were documents that mixed text and graphics and structure and computation—and that was the UI. It was all very modern, with a clean front-end/kernel architecture where it was easy to run the kernel on a remote machine—and by 1996 a complete symbolic XML-like representation of the structure of the notebooks.

Maybe I should say something about the software enginee

Show more