2016-09-20

Thirty-five years I've written code, a necromancer weaving spells to bring the dead to life. Hardware and electronics never held any charm for me. I've no love for chips and cables and solder. Give me a keyboard, a screen, and a language, and you have my attention. Thirty-five years produced a lot of work. So I thought, maybe time to talk about some of those projects.

Warning: Indulgences Ahead

If you wanted a brief, snappy piece on the meaning of life, skip this one. I'm writing for my own pleasure this time. This is less of a blog post, more of a novella. And it's all about me, the smartest guy in the room, always with an answer, almost impossible to beat in an argument.

This isn't an autobiography, I'm not going to talk about my family life, or our idyllic childhood years spent in Dar es Salaam. I'm not going to explain why my first languages were Lingala and Swahili. I'll spare you my school years, spent in the cold, dark embrace of the Scottish highlands.

Instead, this is a professional road trip, for my fellow programmers. OK, not entirely.

My first software products ran on a computer with 5,120 bytes of RAM, 1,536 of which were dedicated to storing the contents of the screen. If someone tells you that you can do a lot in 3.5KB of RAM, they're lying. You can't do shit with that. However my next computer came with 64KB of RAM, at the same price. You try filling that up, one assembler instruction at a time. And that's when I realized, these things aren't toys any more. They can do real work. They can make real money.

I've worked on a lot of different systems. More, I've worked on the weirdest possible projects, with crews of all sorts. There has always been a thread of insanity in our business. We're hooked on doing the impossible. Clients lie to themselves that they've hired a team that know what they're doing. The team lie to their clients that they're in control. The sales people lie. The marketing department lies.

Most large software projects fail, and this is still true in 2016. Yet for most of my career, my special talent was to make projects work, no matter how impossible the technical challenges. I am a really good technical architect, able to understand systems at the lowest and the highest levels. I can write all the code myself, or I can explain to people exactly what to make, and it all fits together like laser-cut blocks.

As I wrote much later in "Social Architecture,"

I've come to think that the very notion of individual intelligence is a dangerously simplified myth.

It took me decades to realize that technology is a slave to personality. It doesn't matter how good the design, when there are unresolved problems in the organization.

And so gradually I shifted from technical architect to social architect. From caring about technical designs to caring about people and the psychology that drives them. Because in the end, this is what seems to make the difference between a working project and a failure.

The Microcomputer Era

I didn't plan on becoming a programmer. Age six or seven I wanted to be a writer. I had Roget's Thesaurus, and read it like a novel. Words, each a story, intertwined in endless trails. Eight years of boarding school beat the dreams out of me, and at 17 I dutifully collected good exam results and started thinking, what should I study at university?

My career advisor suggested computers as a lucrative option. Shrug. Seems fine. Cambridge offered a computer science course, yet my exam scores weren't good enough for that. Second best rated was York, which had a computer science + mathematics mix. I applied, went for an interview, and was accepted.

English universities offer a 3-year BSc course. I was a lousy student. The maths numbed my brain. Endless theory about encoding, Turing machines, database normalization. And then finally, a chance to write some code. Work out the PDP-11 assembler code on paper. Enter it by hand using console switches. Run it, and get boxes to move around on the screen. Yay!

About this time the Sinclair ZX80 came out. It was a ridiculous thing, tiny and light, with a horrid rubber keyboard. Yet it was real. I'd go with my friends to the store, and write small BASIC programs on the spot. Here, a game that let you chase a star around the screen with the arrow keys. People would watch me write these, and then ask if they could play.

For two years I struggled through subjects that grabbed my attention like a slug grabbing a basketball. My tutor was patient and miserable with me. My failing grades just piled up, academic debt. There was no repeating a year. I could change to a different degree if I wanted to. Instead, life gave me a better option.

In the spring of 1981, the VIC 20 hit the shelves in the UK. I asked my mother if we could have one. We were not wealthy, and the computer was more expensive than the Sinclair. Yet it was a different beast, with color and sound and potential. She believed in me, as she has always done, and somehow found the money for the computer.

At first I copied games from magazines, typing them in — as we did, in those days — and then playing them. With a VIC cassette recorder, it became possible to save and reload them. Then I started improving the games, adding graphics and sounds. The VIC chip was powerful, if you were able to figure it out. Then I wrote my own games, mixes of assembler and BASIC. And then I started selling these.

There is a kind of shock, the first time you make something with your own hands, and exchange it for money. I bought empty cassettes, drew and copied inlays, and cranked out my games. A quarter page ad in one of the main magazines cost 125 pounds for three months. One ad impression would produce maybe fifty sales, at five pounds each. When I had a new cassette I'd send a mail flyer to existing clients.

By the summer of '81 I had sold a few hundred games, and attracted the attention of Commodore, who sent me a shiny new C64 free of charge. Computer, disk drive, monitor, and printer! I called my tutor and said, look, Bill, I'd really like to take a year off university to write and sell video games. That OK with you? Sure, he said, and did the necessary.

And so it was that I started my first business. The C64 had vast capabilities, poorly documented. I worked through the reference manual until I knew it by heart. There was an assembler cartridge yet it was clumsy, so I wrote my own assembler tool. And then I started writing a new game. Just a standard shoot 'em up, yet crazy fast and fun, and I was coding day and night to get it finished.

At the time, there was nothing in the market for the C64 except the old nasty BASIC games hastily ported to the larger screen. So I was working like crazy to get a large-scale production line working. I'd designed and printed full color inlays, fifty thousand pieces.

Putting the cart firmly before the horse, I'd even written a natty copy protection scheme that I was so proud of. Your standard cassette held a BASIC program that you would LOAD, and then RUN. You could just as well SAVE it again, to a blank cassette, and people did this a lot. It meant that a popular game would get widely shared among computer clubs and high schools. God forbid!

The standard LOAD/RUN thing also seemed clumsy, and I wanted to make it smoother for the player. I'd found that you could format the saved data so that it loaded into a specific address in memory. I crafted a block of data that loaded itself into low memory, and overwrote the stack. When the LOAD routine finished, it popped the address off the stack and hence jumped into my code, which was a special loader. That pulled the rest of the game off cassette, loading it into memory, and then running it.

Many years later my friend Jonathan told me other C64 games used the same scheme, and he copied the games using a special cartridge that simply took a snapshot of memory when the game was loaded and ready to run. Pirates 1, Hintjens 0.

I was working with an audio cassette duplicator in Port Seton, and we'd cracked how to produce software cassettes at high speed. We could produce several thousand a week per machine. It was all ready to go, and then I made a fatal mistake that taught me my first big lesson about the corporate world.

Instead of running the ads and shipping out my software, I showed my new game to Commodore.

They loved it and told me they wanted to distribute it. At least UK wide, and possibly all of Europe. I was spellbound. My own sales would be in the tens of thousands. Commodore would be able to sell ten, a hundred times more.

So I told them "OK" and paused my own sales. Boxes of cassettes sat in my room. I did not run my ads. And Commodore told me they were working on it. And so I waited.

As I waited, other games started to appear. The large video game companies had finally gotten their act together, and were attacking the C64 market like a pack of killer whales attacking a young baleen whale. Full color ads promised everything. It became pretty obvious that my window had closed.

Commodore, I asked, where art thou? What the heck is going on? Finally they came back to me. "We're not going to distribute your game after all, thanks," they told me. Incompetent or malicious, who knows. I cursed them, and shut down my business.

As it turned out, it was also summer, and so I went back to university to finish my degree. At least I had a personal computer system to work on, and lots of experience in 6502 assembler. The 6502 chip was simple and fast, with a minimal instruction set. In some ways, a precursor to the RISC CPUs that dominate today's world.

My tutor Bill gave me Loeliger's "Threaded Interpretive Languages: Their Design and Implementation," and I decided to make this my thesis topic. The result was a lovely Forth-like language sat atop that brutalist 6502 assembly language. Fast enough for video games, and such fun to program! Every programmer should make a Forth at some point.

Sitting in my room coding day and night felt right. Whereas I'd spent my first two years avoiding hard work, now my brain was addicted to it. Vaguely, I realized there were courses I should be attending. Mostly, I ignored them, and they ignored me right back. My code got too large for the 170KB floppy drive so I stripped off the comments and carried on coding.

When it came to final exams, I sat with my friend Nigel and we skimmed through the course material, making notes. A few hours was enough to read a year's material, summarize it, and digest the summary. By luck our exams were always in the afternoon. It was a blur. One exam a day for two weeks. I came first or second in every one. The thesis committee asked me why I'd not commented my code and I shrugged. Did it matter? Look, let me demo you a couple of games, and walk you through the core threading logic.

The two years of fails brought down my overall degree yet in my entire career not a single person ever, once, asked me what I'd achieved at university. It was just never relevant. There's a big lesson here. Many of the top students in my course went off to Silicon Valley. I slid under the radar and moved, despite my best intentions, to Belgium.

Working for the Man

The Man got me straight out of university, for in those days we still had mandatory military service in Belgium. I'd lived so long in the UK, and was naturalized (I still have a UK passport), yet Belgium demanded its cup of blood, and it got me.

Military service was actually fun. In boot camp we were split into the intellectuals, and the lifers. People like Henri, a nuclear physicist, and myself wore glasses and obviously had soft nerd hands. The lifers were young professional soldiers, 16 or 17, who cleaned and scrubbed and marched and drilled. We nerds sat around the barracks, happily useless. They were not going to give us guns.

Ironically, I was already an excellent rifle and pistol shot, with prizes from Bisley, where my school shooting team had competed every year. I didn't tell them that, mainly because I spoke no French and no Dutch.

I worked at the national map making institute. They had a large project to make a digital map of all of Belgium, for the US air force. They carefully scanned in paper maps, tracing the railways, roads, city outlines, canals. The goal was to make maps for cruise missiles.

My boss showed me the team, their GCOS minicomputer, the IBM terminal running on a time-sharing system somewhere, and told me, "your job is to help us make the tapes."

Turns out the USAF wanted the map data on magnetic tapes. We could send data from the GCOS machine to the IBM, though I completely forget how that worked. Working in PL/I on the IBM, I could load in the map data and write a tape.

Ah, but there was a catch. I was not the first person to try to make tapes. The USAF systematically returned the tapes as "unreadable." After several years of mapping, the project was jammed on this one point. So being the smart young thing I was, I investigated. It turned out the USAF were using a UNIVAC system, which unlike the IBMs, the GCOS, and every normal computer in existence, used seven bits per byte instead of eight.

Magnetic tapes were coded with 8 bits in a stripe, perpendicular to the tape direction. These stripes were written like this "||||||||||||" along the tape. It struck me that the UNIVAC was probably just reading seven bits at a time, instead of eight. So the first stripe held one full seven-bit byte, plus one bit of the next byte. So I staggered the bits like that, and we sent off a tape to the USAF.
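
In modern terms, the trick was a bit-packing exercise. Something like this C sketch (my reconstruction; the real code was PL/I, and the function name and framing here are made up):

    #include <stdint.h>
    #include <stddef.h>

    //  Pack a stream of 7-bit "UNIVAC bytes" into 8-bit tape frames,
    //  so that a 7-bit byte can straddle two frames, as described above.
    size_t pack_7to8 (const uint8_t *in, size_t n_in, uint8_t *out)
    {
        uint32_t bits = 0;          //  Bit accumulator
        int n_bits = 0;             //  Bits currently in accumulator
        size_t n_out = 0;

        for (size_t i = 0; i < n_in; i++) {
            bits = (bits << 7) | (in [i] & 0x7F);   //  Append 7 data bits
            n_bits += 7;
            while (n_bits >= 8) {                   //  Emit full 8-bit frames
                n_bits -= 8;
                out [n_out++] = (bits >> n_bits) & 0xFF;
            }
        }
        if (n_bits > 0)             //  Flush the remainder, zero-padded
            out [n_out++] = (bits << (8 - n_bits)) & 0xFF;
        return n_out;
    }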

It came back after the usual three or four weeks, with a new error message: "Bad data format." Bingo. Now I went back to the documentation and figured out what the real mapping data should look like. What we were sending from the GCOS wasn't even close.

Some weeks later (I got sidetracked into writing a small compiler on the GCOS that could access more exotic system functions and give me unlimited priority to run my heavy conversion programs) we had a new tape. We sent it to the USAF, and a few weeks later came the reply, "Valid."

At which point, all hell broke loose.

What I didn't realize was that the project had some years of budget left. The mapping was mostly done, and the team mostly idle, and happy with it. Lacking any way to send valid tapes, the Belgians simply kept pushing back the date, and collecting their sweet, sweet USAF cash. (I am speculating that it was sweet, because I was being paid a conscript's wage of 45 BEF or about 1.10 EUR per day.)

My boss told me I could take the rest of the week off. Actually, don't come in every day, just when you feel like it. I shrugged and started my new regime of two half days per week.

There is a lesson here too. You need to follow the money, and understand how it flows. I could have walked away with my own personal GCOS system, if I'd played it smart.

The second lesson, which many a foreigner has learned in Belgium, is that it's an easy country to come to, and a hard one to leave. For reasons, I decided to stay in the place and find a job.

Still Working for the Man

I found a job at Sobemap, which was in 1984 the largest software consultancy in Belgium. I was hired by a large, good-natured man called Leif, who was looking for a fellow spirit to help him write software development tools. This was in the days of COBOL and weird systems. Sobemap had a commercially successful accounting system, EASY, which they maintained on seven or eight different platforms, with a team for each platform. Leif's job was to build the basis for a single portable EASY that could run on everything.

Turned out, though Leif only told me this a few months ago, that the main objection to hiring me was my age. Too young. I guess it was a high risk project and they hoped for someone with relevant experience. Needless to say, the number of people building cross-platform COBOL frameworks, worldwide, was approximately zero. We were the first to even think of such a mad thing.

It took only a few months to build the basic layers, and then my job was to make those work on new, exotic systems. I'm talking Siemens BS/2000, IBM S/36, DG AOS, MS-DOS (once we discovered the wonderful Realia COBOL compiler for the PC), and IBM MVS/CICS. The core routines had to be in assembler, to be fast enough. That was when I discovered C, on the DG and the VAX and then on the PC.

All these systems died, one after the other, over time. Good riddance. The inconsistency was incredible. Every machine had its own concept of file structures, organization models, compilation procedures, and so on. Imagine learning ten different varieties of Linux, each from a different planet.

No matter, we learned to dive under the consumer level OS and language, and into system internals. That was inevitably the only way to get the performance and behavior we needed. On a BULL TDS (short for tedious) system I wrote a memory manager, in COBOL no less, to fit large programs into the pathetic excuse for "main memory" that system provided. On IBM S/36, the same, to swap megabytes of code in and out of the 64KB main memory.

Leif and I, helped by various people who came and left, used our portability layer to build a powerful set of tools: editors, code generators, reporting tools, and so on. Developers loved these tools. You could develop and test on a PC, at home, and the same code would run unchanged on an IBM mainframe. We had cracked the problem of "write once, run anywhere," some years before it became a fashionable problem.

The IT managers of our clients didn't share our passion for portable code. As one IT manager told us, "I've just spent so much on a new VMS system. Why do I want portability?" Logic doesn't apply, when people work on pseudo-mystical belief. A very few IT managers saw the potential, used it, and built their empires on it. Yet mostly we had few external clients and direct sales.

All this time, the company kept changing and shifting. The smart, nice guy who hired Leif and then me left for a startup. We were sat on by a succession of incompetent, arrogant illiterates. Despite delivering success after success, there were no pay rises, no bonuses. Eventually after five years, our team called it quits and we left the firm, each to go separate ways.

Sticking it to the Man

One taste of employeeness was enough for me. I did some research, found an accountant, and became an independent consultant. My first gig was for a trade union organization. They did not pay much yet they treated us with respect. Jonathan Schultz and myself built a project management system, using that awesome COBOL framework.

Somewhat adrift, I worked on various projects with old contacts. We built monitoring software for PCs for a UN transport project in Africa. Set-top clients and servers for a cable TV company. Software to produce and decode shipping manifests, using a prehistoric format called EDIFACT. Mostly I worked in C, using Turbo C on MS-DOS. And a lot of x86 assembler, using MASM, the Microsoft assembler.

Then I got a call from my friend Hugo at Sobemap, now renamed to Sema. "We need your help to make this project work, what's your daily rate?" I quoted a comfortable figure. They agreed. And so I went back to work for the same projects, earning five times more.

I learned one important lesson: the more your clients pay you, the more they appreciate you and listen to what you say. The same CEO who'd walked past me in the corridor as if I was a tattered old poster now stopped and shook my hand. "Glad to have you back with us, Hintjens!" he said.

And so I found myself back in the world of enterprise software projects, providing that special skill we never really had a name for in our industry. Plumbing. Infrastructure. Wet work. Black magic. Making it possible for mediocre developers to build really large applications that worked well even under stress.

It was 1991, and my first major project was to port our tools to VAX ACMS. The project was signed and sold before I had a word to say. The IT manager was one of those who knew Leif and me, understood our blood-and-guts approach, and loved it. He had bet his career, and arguably his firm's future, on us succeeding with an impossible project.

So here was the challenge. The client wanted to build a travel reservation system to serve tens of thousands of agents in offices across Europe. These agents would log into the system, query availability, make bookings, and so on. The backend for this system would be a cluster of modern, cheap computers called a VAX. These cost a fraction of the IBM mainframes that the industry still relied on. The VAX had cheaper memory and disks than IBM's dinosaurs (though not as cheap as PC hardware). It ran a modern operating system (VMS) that was fast and flexible (though not as nice as UNIX). DEC's wide-area networking was decades ahead of IBM's (though not as nice as TCP/IP).

The only problem was that IBM's mainframes had long ago cracked the problem of sharing one mainframe among tens of thousands of users. Whereas DEC, the firm that made the VAX, had not.

If you want to get technical, IBM used smart 3270 terminals that dealt with all keystrokes locally, and interrupted the computer only when there was a screen of data ready. Somewhat like a clunky web browser. The mainframes then ran a "transaction processing system" called CICS that pumped these screens of data through applications, with all state kept out of the process. One application could thus serve masses of users.

The VAX on the other hand used dumb VT terminals. These sent every single keystroke back to the computer to deal with. Each terminal, and its user, had their own process. It's the same model that UNIX and Linux use. Five terminals, five interactive "sessions", five shell processes. So the VAX gave a much nicer flow, and you could do things like scroll smoothly (impossible on a 3270). On the down side it could not deal with anything like the number of users at once. Each of those processes ate up virtual memory.

DEC's proposed solution was to use smaller front-end MicroVAXes, each capable of handling around 50 users. These would then talk via some magic to the back-end cluster. The client did some maths. Two hundred front-ends and ten thousand interactive licenses. That cost several times more than the rest of the project combined. They might as well buy an IBM mainframe.

Which was when someone decided to bluff the project, sign it for the budget the client was willing to pay, force DEC to accept whatever solution we came up with, and then hire me to make it all work.

Since no-one had told me it was impossible, I took the documentation and a test system, and began to play with DEC's transaction processing framework, called ACMS. This was the closest DEC had to an IBM CICS-style transaction processor. Think of a container for server applications. You could start and stop apps, and then send them events from other processes, even across a network.

So far so good. Then I looked at VMS's asynchronous I/O system. To get consistent and fast response in a real time system, you have to work asynchronously. You can never wait on something happening.

I'm going to explain this in a little detail, because there are lessons here that still apply today. If you want to make truly massive systems, it's worth understanding the transaction processing (TP) model. Today I'd explain it as follows:

Your application consists of a collection (hundreds, or thousands) of services. These are connected either directly or indirectly to your front-end UI.

A service runs with N instances, to provide scaling for many users. The TP starts and stops instances as needed.

The services typically work with further back-ends, databases, and so on. This is not the concern of the TP system.

A single service instance works with a single "transaction" at once. Be careful: the term "transaction" is overloaded. Here, it means a request to do some work, where the work comes from the UI either directly or via some other layers. The same term is often used to describe a package of work done by a database. The meaning is similar, yet not the same thing.

Services hold no state for UI sessions. If the service transaction needs state, it is held by the TP, and passed to and from the service, with each call.

This is actually close to the "actor model" that we aspire to with asynchronous messaging.
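
In code terms, a TP service has roughly this shape (a sketch of the model, not of any particular TP system's API; all the names are invented):

    //  One transaction: the TP monitor owns the session state and
    //  passes it in and out of the service with every call.
    typedef struct {
        char session_state [512];   //  Held by the TP, not the service
        char request [1024];        //  Work to do, from the UI or upstream
        char reply [1024];          //  Result, sent back the same way
    } transaction_t;

    //  The TP starts N instances of this service and feeds each one
    //  transaction at a time; any instance can serve any user.
    void service_handle (transaction_t *txn)
    {
        //  Read txn->request, talk to the database if needed, fill in
        //  txn->reply and txn->session_state, then return with nothing
        //  kept in process memory.
    }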

OK, back to asynchronous I/O. Let's say I'm waiting for the user to press a key on their VT220 terminal. In synchronous code I'd make a system call that waits for input. At that point my process is swapped out, and effectively dead until the user presses a key and input arrives.

An asynchronous call (called an "AST" on VAX/VMS) is a bit different. It does a similar "wait for input" call, and adds extra arguments. It says, "when you are ready, call this function, and pass this state." The critical difference is that after the call, the process isn't suspended. It continues right along with the next instruction.

So you could for instance wait for input on 100 different terminals by making 100 AST calls in a row. You could chain to a single function, and pass the terminal ID as state. Then your process could explicitly suspend itself. Each time a user pressed a key, the process would wake up, the function would get the event, and process it.
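
Sketched in C, the pattern looks like this (queue_read and suspend_until_ast are hypothetical stand-ins for the real VMS system services, which took rather more arguments):

    //  Hypothetical stand-ins for the VMS asynchronous services
    typedef void (ast_handler_t) (void *state);
    extern void queue_read (int terminal, char *buffer, int size,
                            ast_handler_t *handler, void *state);
    extern void suspend_until_ast (void);   //  Like $HIBER on VMS

    typedef struct {
        int  terminal_id;           //  Which terminal this session owns
        char buffer [128];          //  Where its input arrives
    } session_t;

    static session_t sessions [100];

    //  Called by the OS when input is ready; 'state' is whatever
    //  we passed when we queued the read.
    static void on_input (void *state)
    {
        session_t *session = (session_t *) state;
        //  ... process session->buffer, then re-arm the read:
        queue_read (session->terminal_id, session->buffer,
                    sizeof (session->buffer), on_input, session);
    }

    int main (void)
    {
        for (int id = 0; id < 100; id++) {
            sessions [id].terminal_id = id;
            //  Returns at once; the process is not suspended
            queue_read (id, sessions [id].buffer,
                        sizeof (sessions [id].buffer),
                        on_input, &sessions [id]);
        }
        while (1)
            suspend_until_ast ();   //  Wake only when an event fires
    }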

This event-driven I/O model is still how we build really fast multi-threaded servers today. However a server typically only deals with one kind of event, namely network traffic. Perhaps three: input, output, and errors. To write an ACMS front-end server, you need to deal with rather more I/O events. At least:

System calls to resolve logical names, terminal IDs, network names, and so on.

System calls to read and write to disk.

Events coming from terminals.

Events coming from ACMS, typically "finished doing this work."

So you don't have a single event handler, you have dozens or hundreds. This makes the server design horrendous. Imagine writing a large program consisting of tiny functions, where each specifies the name of the next function to call, as an argument.

Being a clever wunderkind, I figured out a solution. One of our COBOL tools was a state machine designer. A neat thing: you describe your program flow as a state machine, and it turns that into code. I'd started rewriting that in C, a project that ended up as Libero.

Using Libero, I could write the flow of execution as a state machine, and generate C code to make it run. At the core of this C code is a generic AST handler that can deal with all possible events. It knows the state machine and can just call the next function itself.

So here is a piece of the state machine, which initializes a new terminal session. What this means isn't important. The point is you can read it, the words kind of make sense, and you're not writing clunky chained AST code.
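
It looked something like this (a reconstruction; the state, event, and module names are illustrative):

    Initialise-Session:
        (--) Ok                                 -> Have-Terminal
              + Allocate-Session-Memory
              + Enable-Terminal-Input
        (--) Error                              ->
              + Signal-Io-Error
              + Terminate-The-Session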

So I presented this approach, with some examples and test results, to the client. They asked a respected consultancy firm (Andersen) to check it. The two consultants who did that work had previously worked for DEC and were experts in ACMS server design. I think they'd helped write the standard ACMS front-ends. After some days of explanations, they went off to write their report, which came down to "the proposed approach is insane and will not work."

Somewhat later we discovered that Andersen had been hoping to get the contract themselves. I learned an important lesson about consultants: they are professional liars.

The IT manager of the client took the report, threw it in the trash, and told his management, either we go with this option or I quit and you can forget this project. Since this man had built the original system on IBM CICS, and brought most of his team into play, that was a serious threat. Management caved, Andersen were kicked out, and we got the green light.

The system we built worked as designed and went live on time in 1992. It grew to handle multiple tour operators, with front-end servers happily dealing with two thousand terminals each. We maintained and extended the system every year for a decade, little by little. When I stopped working with Sema, the client paid maintenance to me directly, until the system was finally decommissioned in 2010.

The lesson here is, if you have the trust of your client, and s/he has real power, you have done half the work already.

This was a perfect software project. Working with good matériel: decent hardware and a non-insane operating system. Working with a smart team that know their stuff, and a client who likes and trusts you. With total freedom to build things the right way. Where the hardware and software bends to your will. Where you can take the risks, analyze them, and design them away.

Little did I know how rare such projects are. Shortly after the tour operating system went live I found myself in an insurance company in Brussels, looking at an ancient terminal the size and shape of a large microwave oven. Carved in the bezel was the ominous word: "UNIVAC."

Shit Goes Downhill

I imagine the conversation went thus: "You're saying you can make the EASY accounting system work on our existing UNIVAC mainframe?" followed by a confident reply, "Of course! It's portable! Now if you'll just sign our framework agreement on licenses and rates," followed by a dryly muffled laugh and the scratchings of quill on parchment.

Whereas the VAX project was a smart bet on new technology, the insurance company was in the midst of a fight between old and new, and we found ourselves on the wrong side of history. The old guard was defending their UNIVAC mainframe (and its non-trivial budgets) at all costs. The new guard were trying to push the IT infrastructure towards UNIX.

The first time I saw our client — the incumbent IT manager — he was slumped in his chair, melted from fatigue and age, a cigarette in his hand, ash and stubs on his desk. Here is a man in the terminal stages of a truly horrific disease, I felt.

I'm sure that there was a time when UNIVAC made relatively excellent computers. I'm sure someone, in the early 1960s, thought, "seven bits is going to be enough for anyone!" I'm sure at some point the slogan "America's First Commercial Computer System!" was spellbinding for customers. But not Brussels in 1992, please no.

By 1992 we'd invented the Internet and were starting to see the first Linux distributions for PC. I had an Yggdrasil distro on CD-ROM that included, wonder of wonders, a free C compiler. Linux was not just a real OS, as compared to the kindergarten MS-DOS, with proper virtual memory and tools. It had the same shells (ksh, bsh) as the Big Iron UNIX boxes we were starting to see in companies.

And here we were, with a system pulled through time like an Automatic Horseless Carriage from 1895 trying to compete in the 1992 Grand Prix.

Leif and I briefly looked through the system docs, and played with the terminals. I looked at the microwave-with-keys thingy and said that it seemed rather ancient. "Oh, it gets worse," said Leif, looking at a page in the manual. This was going to be a familiar phrase. "Oh, it gets worse!"

I'm not going to bore you with detail. Just one small piece to show how bad it was. The UNIVAC terminals were like the classic IBM 3270s, block mode devices that sent the user input as a single block of data, back to the mainframe transaction processor. Something like an HTML web form, except that instead of field/value pairs, it sent row/column/value tuples. Fair and good.

The first problem we hit was that the large Enter key did not send anything. Oh no, that would have been too obvious. Instead there was a separate SEND key. OK, we guessed UNIVAC users were used to that. More fun though, the terminal only sent up to the cursor, no further. So if you typed some data, and then backed up to fix an error, and then pressed SEND, you lost half your input.

The terminal had function keys that we could program by sending commands together with screen data. So our hack was to add an invisible input field on row 24, column 80. Then, F1 would move the cursor to that field, do a SEND, add a code for "F1" (our apps used sixteen function keys for navigation), and then move the cursor back to where it had been.

The worst part, which is hard to wash off even after years, is that we were actually proud of this. We'd made it work. There were dozens of other "WTF?" moments, yet eventually EASY could run on this prehistoric system. Leif and I made our excuses and found other projects to work on.

Taking Money off the Table

Around 1992 Sema was building a reservation system for a high-tech fun park in France called Futuroscope. Part of the challenge was to send bookings to the various hotels that clustered around the park itself, or sat further afield.

Before we automated it, bookings were faxed by hand. This was before the days of office laser printers. Each night (or perhaps twice a day), a batch job ran that printed out all hotel bookings on a "line printer." This produced a large "listing," perforated so it could be torn into individual sheets.

Some poor fax jockey took the listing from the Computer People, ripped off the header and footer pages, separated the rest into individual pages, and chopped these using a guillotine into A4 width, so they would fit into a fax machine. Or perhaps they faxed them sideways, breaking all the rules of fax decorum of the time.

I can't imagine the joy of entering each fax number manually, waiting for the modems to connect, feeding in the sheets, and sorting the bookings into "done," "failed, try again in 30 minutes," and "hotel fax seems borked, call them to see what's wrong" piles. Especially in peak season, when the booking system would spew out hundreds of bookings a day, and even a single misplaced booking would result in drama. If you've never seen a Parisian family of five arrive at a hotel where their booking didn't get through, they make the citizens of San Jose look polite and charming by comparison.

Free software fixed this and destroyed the "fax jockey" position, at least in the Futuroscope.

Futuroscope were looking at commercial bulk fax systems. These were extraordinarily expensive, thousands of dollars just to get started, and more depending on capacity. We found this distasteful, especially since by this time a fax modem (a small external box that plugged into a serial port) was maybe a hundred dollars. On Windows we were used to software like WinFax that could send faxes using a special printer driver.

Also we did not want to have to interface with these beasts, which used bizarre proprietary software that I knew was going to cause us needless pain. And like I said, if the faxes didn't work, the whole system was suspect.

So I looked around and found a free software package for UNIX called Hylafax. Together with another free software project called GhostScript, we could create nice (i.e. using proper fonts, and with the Futuroscope logo) bookings, and send them to the hotels entirely automatically.

My idea of using free software was received with solid, head-shaking skepticism. This just wasn't how things were done. What sold my proposal were two arguments. First, that every franc the client spent on expensive fax machines was a franc we couldn't invoice. Second, the fax subsystem was so important that whoever supplied it would become an important vendor. Surely that should be us (Sema) and not some random box seller.

OK, Sema said, if you can make it work, we'll sell it to Futuroscope. So I downloaded the source zips for the packages, built them, and tried them out. It was all surprisingly simple. The hardest part was finding a fax modem that would work with the UNIX server, so I started work on my PC using its fax modem card. This was in the days of dial-up and I paid around $2,000 a year (in today's money) to one of Belgium's first ISPs, in Leuven, for Internet access.

It turned out that a single fax modem was able to handle peak traffic, especially since there was no time lost messing about entering fax numbers and feeding paper. Hotels never replied by fax, so there was no need to handle incoming faxes. All we needed was queuing of outgoing faxes and a way to know if they failed.

Which it turned out, Hylafax mostly did for us. It used a neat client-server design so the Hylafax server ran in the background, and our fax script called the Hylafax client each time it wanted to send a fax. The client passed the fax onto the server, which queued it, and sent it when it could.

Hylafax automatically retried sending, a few times, so we only had to deal with hard errors (broken fax, run out of paper, hotel on fire, that kind of stuff). We recovered the status codes later, from the Hylafax log file, and pumped them back into the application.

It came down to a rather simple Bash script that took the next fax from an inbox (a directory with one file per booking), called GhostScript to create a printable page, called Hylafax to send it, and then moved the fax to the outbox. It also created a log file that the application could read to know what happened. This script ran as a daemon, sleeping for a second if there was no activity.
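
Stripped to its essentials, it looked something like this (the paths, file naming convention, and exact options here are illustrative, not the original script):

    #!/bin/sh
    #   Fax pump: one booking per PostScript file in the inbox
    INBOX=/var/spool/fax/inbox
    OUTBOX=/var/spool/fax/outbox
    LOG=/var/spool/fax/fax.log

    while true; do
        FILE=`ls "$INBOX" 2>/dev/null | head -n 1`
        if [ -z "$FILE" ]; then
            sleep 1             #   Nothing to do; a real sleep, not a busy-loop
            continue
        fi
        #   Illustrative convention: fax number encoded in the file name
        NUMBER=`echo "$FILE" | cut -d. -f1`
        #   GhostScript renders the page into Group-3 fax format
        gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg3 \
            -sOutputFile="/tmp/$FILE.tif" "$INBOX/$FILE"
        #   The Hylafax client queues the fax; the server dials and retries
        if sendfax -n -d "$NUMBER" "/tmp/$FILE.tif"; then
            echo "`date` QUEUED $FILE" >>"$LOG"
        else
            echo "`date` FAILED $FILE" >>"$LOG"
        fi
        mv "$INBOX/$FILE" "$OUTBOX/$FILE"
    done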

It worked nicely. The good folk at Futuroscope were a little shocked that a simple fax modem and some magic software could replace the large industrial-sized fax machines their vendors were trying to sell them. Nonetheless, they saw it worked, shrugged, and the system ran nicely.

Only once did things stop working, when the application was moved to a faster, cheaper server. Turned out someone had implemented "sleep for one second" by busy-looping. On a multi-user system, no less. People had wondered why everything froze for a heartbeat when someone confirmed a booking. Yes, the code said something like,
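
    MOVE 0 TO WS-COUNTER.
    PERFORM UNTIL WS-COUNTER = 1000000
        ADD 1 TO WS-COUNTER
    END-PERFORM.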

Their proposed fix was to increase the counter to 4 million. I sighed and explained how to call the shell's "sleep" command from COBOL.

Lesson learned: identify the riskiest parts of your architecture and bring them under your full control. The point of using free software was that we could control everything using Bash scripts. We could test everything in isolation. We could in effect build a full solution, removing all risk, faster than it took us to learn and interface with existing solutions.

Successful use of free software in a commercial project was a shock to Sema in 1992. There was some talk of turning the fax gateway into a product yet I couldn't in honesty package a hundred-line Bash script for sale.

Wiring the Factory

Shortly after, and with UNIX, HylaFax, and Bash working as mind-bleach for my UNIVAC PTSD, I worked with a Sema team on an industrial automation project. In short, this meant getting a large cement factory to talk, in a manner of speaking, with the SAP system that took orders and invoiced customers.

When we finished the project (and this is a flash-forward), the head of our department came storming into our offices, quite angry. He showed us the figures for the work we'd done. We'd completely missed the budget and he was afraid it was going to cause real problems for the client.

I was a little confused at first because we'd worked hard and well, and pulled what was rather a mess into a coherent, well-working system. The managers on-site adored us, and so did the managers at customer HQ. What was the problem?

Turns out, Sema had made a fixed price bid, based on some random estimates of how much work we'd have to do, multiplied by the usual "engineers always underestimate the work" factor, plus random "project management" amounts, and so on. It was a fairly solid budget, for a major upgrade to the factory.

And we'd under-spent by such a large amount that Sema's finance department had red-flagged the figures as impossible. I laughed and explained that we'd simply done our work well. And indeed the customer never complained. We did not get bonuses, or raises. It was simply our job, and my reputation for pulling off miracles rose a few notches.

I'll describe the project itself, and how we got it to work, because it seems interesting. The factory produced dry cement, which was sent by truck all over Belgium. Clients would place orders, which would be scheduled, and then sent to the factory. Truck drivers (independent contractors) would come to the factory, get a shipment, and drive off to deliver it.

This was done largely by hand, which was slow and inflexible. Orders would be faxed through to the factory, where someone would schedule them, then enter them on a local application. Truck drivers would arrive, and at 7am the booths would open. In each booth, when a truck pulled up, an operator would choose an order, print it out, and give that to the driver.

The driver would then find the right loading bridge for the kind of cement, and give the paper to the bridge operator. He would control the loading pipes, confirm the order on his computer screen, and the trucker would go off to deliver it. Some trucks were dry, some got additional loads of sand and water and did what concrete mixers do.

Customers liked the cement, and hated the ordering process. It was slow and clumsy. If the weather suddenly got better or worse, there was no way to change an order to take advantage. You can't pour concrete when it's below freezing point. What do you do if there's a cold snap, and a truck full of the stuff turns up?

Industrial automation is a specialized business, and it was not ours. There were companies who knew this area, and one firm had built new automated loading bridges for the factory. Sema's job was to build a scheduling application. Another firm had built an application to replace the booth operators, allowing more booths to be added over time. Yet another firm was providing various pieces of hardware. And then we had to connect the scheduling application to an SAP system (in Brussels, far from the actual factory) used for sales and invoicing.

None of this had been developed or tested beforehand. So when I arrived on the project, we had five different teams building stuff that was supposed to work together, and of course, did not. Apparently this was just standard practice: bring a bunch of stuff to a site and then hammer at it all until it worked. What made it extra fun was that each vendor had the attitude of "our stuff works, so if there's a problem it's yours, not ours."

It was made worse by the practice of fixed budgets. Each firm had made offers, which the client accepted. Spending an extra day fixing someone else's problems meant a day of lost income. All this is standard in construction projects. Yet building a complex software & hardware system is a different thing.

Let me give an example. The design used smart cards, a great idea. When a driver arrived, the kiosk would spit out a smart card holding all the order details. The kiosk would then say "Go to bridge 4" or whatever. At the bridge, the driver put the card back into a kiosk, the cement loading started, and that kiosk printed a receipt, for the driver.

So here we are with a UNIX system running the scheduling application, a PC running the kiosk app, and smart card readers. The team writing the kiosk app has no idea how to talk to the UNIX system, nor to the badge readers. No-one has experience with the smart cards, which the client has bought from Siemens. Siemens is not providing much support, and this is a decade before the web made it all easy.

And in every meeting, the attitude from each firm was, "not our problem, our stuff works." It was after several months of this toxic stalemate that I'd joined the project, in my usual dual role as plumber and fire fighter.

My colleagues and I adopted a simple solution, which was, "every problem in this project is ours." I figured out how the badges worked and we designed the badge data format. I wrote serial I/O routines to read and write badges, and handed them over to the kiosk teams. We built TCP/IP clients and servers to connect the kiosk application to the scheduling app, and gave the kiosk team libraries they could just call.

This all sounds like a lot of extra work yet we loved it, and were good at it, and it was faster and easier than arguing. At first the other teams were confused by our approach, and then they realized things were actually moving ahead. Little by little the project became happy again.

The kiosks were the cornerstones of the project, and were abysmally designed. The client provided the physical infrastructure, and the large metal housings. Various parts, such as cement-resistant keyboards, were ordered from afar. There were PCs to run the application, and little thermal printers to print tickets. Nothing had been tested beforehand.

And predictably, as the kiosk team tried to put this all together, nothing worked. It was almost comedic. The printers had to be propped up at an angle for the tickets to come out, and they jammed constantly. But the worst offenders were the screens. We'd started working in late winter, and by spring we were testing the kiosks for real.

As the sun rose, it dawned on us that the kiosk entrances faced south. The kiosk design was nothing as sophisticated as today's parking kiosks or highway toll kiosks. Flat panel screens were still science fiction. So here we have a PC in a metal box with its CRT monitor behind a sheet of glass. The driver had to get out of their truck, insert their badge (which was both for ID and to hold the current delivery), choose a delivery, confirm it, and get their ticket.

With the sun shining from behind, the screen was unreadable. CRTs did not have great daylight visibility. In direct sunlight, behind a sheet of dirty glass, it was tragic.

One day, watching the frustrated drivers squinting at the glass as they tried to read the screen, I took a flat piece of cardboard, cut out a credit-card sized hole, and taped it over the glass. This fixed the problem. When I came back to the site a week later, both kiosks had metal plates, to my design, welded over the screens.

This was the tragicomic part. The project was also just painful in other ways. The client had specified that the kiosk application should be able to run even if the scheduling app did not work. This was sane, since the exchange of orders to and from the SAP system had never worked well. So if anything failed upstream, they could continue to ship cement, and then re-enter their orders afterwards.

The kiosk application thus had a "stand-alone" mode where, if it decided the scheduling app was not working, it would take control of things. Except, this app would detect a "failure" randomly, and the entire system would swing into, and out of, stand-alone mode. Each time that meant fixing things up by hand afterwards.

It was stupid things like, the kiosk app would open connections and not close them, and exhaust file handles on the UNIX system. Or, it would run a 100% CPU background task that slowed everything down so that the I/O loop could not run, and timed out.

There was a point where the factory would call us at 4am, saying the system was down, and we had to come on-site to get it working again. That meant a 2-hour drive. When we arrived, a long queue of furious truck drivers waited on us to fix things. Which meant restarting things, and manually consolidating a mess of local and remote orders. The real damage was done by switching in and then out of stand-alone mode.

The lesson here is that making systems more complex, to try to make them "reliable", will usually make them less reliable. What we were experiencing was "split brain" syndrome, the bane of any clustering design. Allowing the PCs to randomly decide they were in charge was a Bad Idea. Letting them decide to switch back to normal mode made it much worse.

Since the crises were so random, we started sleeping in the factory offices so we could be there at 3am to try to catch it happening. To be frank, I didn't enjoy sleeping on a mat in between office furniture, in the heart of a massive industrial zone. Once we decided we'd bring beer, it was slightly better.

What we eventually came to was a panic button on the PC screen that forced stand-alone mode. If orders stopped arriving from the UNIX system (for any reason, one does not care at 5am), the operator clicked the panic button, and the PC switched to stand-alone mode. The operator could then handle the morning peak, with a little more manual input. Somewhat later when things were calm, we'd consolidate orders with the UNIX, investigate what had caused the fault, and switch back to normal mode.

Additional lesson: fail-over automatically and recover manually.

Despite all this, we finished our work so far under budget that it raised alarm bells. The client loved us, having seen us in action. Years later, this would pay off.

The main lessons I learned were obvious and yet worth stating, perhaps because the industrial world is so distant from the normal pure digital development world:

Don't expect random work from random vendors to magically work together.

Do not develop and test in your production environment.

When things go wrong that even marginally involve you, assume responsibility until you know where the real problem lies.

When you use hardware and software that bends to your will, you can work significantly faster.

Fear and Loathing in Brussels

In an industry that is driven by forward change, it is shocking how many projects are based on betting against the tide. By this I mean forcing an application to run on some antiquated, expensive, inflexible, and painful platform, when new, cheap, flexible, and enjoyable alternatives exist.

It almost always ends in disaster for the organization. And yet I've seen this happen so often that it is almost a caricature. Brain-dead management forces obsolete technology on team who build mediocre application that fails to fix organization's real problems, which then gets taken over by smarter competitor.

Here are some of the reasons this happens so often, in my opinion:

Psychology of sunk costs. "We've spent twenty million on our goddamn mainframe and goddamn you if you think we're going to junk it just because you have this new fancy-wancy UNIX thingy."

Self-interest of vendors. "Sure we can expand the main memory from 64MB to 128MB. It may be a little expensive, but mainframe memory is special! It's better than that cheap, unreliable UNIX memory!"

Politicization of IT. "My departmental budget is $10M per year, and you're talking about slashing that by half and yet adding 10x more users? Are you insane? You're fired. I'll find a consultant who agrees with me."

Fear of the unknown. "Why would we use that 'UNIX' thingy? It's just a large PC, totally unreliable. And anyhow, UNIX will never be widespread. No, we prefer our tried-and-tested mainframe architecture that runs real networks like SNA LU6.2!"

Shortly after the UNIVAC trip, I was thrown into one more such project. The team was building a new financial system. The team was using our COBOL toolkit. They were using AIX UNIX machines for development. Production was to be on a BULL TDS system. One. More. Obsolete. Mainframe.

The entire project was designed as an internal hostile takeover of the firm. The firm was a recent merger of two businesses. The profitable one consisted of many smaller offices, each working their own way. Clients got customized service and paid a premium for it. The unprofitable one, however, had the power, for political and historical reasons.

So the new planned system would enforce order and consistency on those cowboy offices. It would in effect break their independence and bring them into a neat, centralized organization. It was a classic battle between good and evil, between sanity and insanity. And once again, I found myself on the wrong side of the battle lines.

The architects of the new system had no idea, really, what they were going to build. They could not ask their users, because they were at war with them. So they made it up as they went along. Designs and plans fell out of meeting rooms onto the developers' desks.

The developers were decent people. This was the era when even with modest training, you could learn to develop business software. We had as many women as men in the crew. I cannot fault the developers, nor the COBOL language, for the crap that emerged. This was a direct result of getting insatiable, arbitrary demands from the analysts.

My job was to support the developers with source control and continuous builds, and to build the technical framework to make the applications run on TDS. Bull's computers were feeble imitations of IBM's mainframes, and likewise, TDS was a feeble imitation of IBM CICS, the dominant mainframe transaction processing system.

Lurking like a bridge troll underneath the application was an Oracle database. I won't comment further except to say that it more or less worked, yet was both a major source of problems, and apparently, a major slice of the project cost. We could have done much better IMO.

In those days the only plausible source control system was CVS. We did not use that. Instead we used shell scripts to check-in and check-out sources from a repository. The application consisted of hundreds of programs, each was either interactive (basically, one or more screen pages with logic), or batch.

When you checked out a program you got all its related files. You could edit, compile, and test locally, quite quickly. When you checked your program back in, we saved the previous version (saving the differences, to save space), and stored the new one. We compiled everything, once per hour or so, to create a fresh testable version of the application for test users.
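
A check-in, reduced to its essence, was little more than this (illustrative; the real scripts handled a program's whole family of files):

    #!/bin/sh
    #   checkin PROG: keep the old version as a diff, store the new one
    PROG=$1
    REPO=/project/repo
    STAMP=`date +%y%m%d%H%M`
    diff "$REPO/$PROG.cob" "$PROG.cob" \
        >"$REPO/history/$PROG.$STAMP.diff"      #   The differences only
    cp "$PROG.cob" "$REPO/$PROG.cob"            #   Latest full version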

As new programs got checked in, they got sent to the Bull system for test compilation. That produced error listings (the code trying things that worked on AIX yet not on the Bull). Our shell and awk scripts pulled these error reports back to the AIX, extracted the few lines of error message from 20-page listings, and passed them to the developers so they could fix things.

Some of the developers (especially the older ones who had learned the Bull environment) just printed out the reports, then stacked them in heaps around their desks. These heaps grew higher and higher, eventually, forming walls behind which the developers hid.

The test users marked changed programs "approved" when they were happy with them. To make this work we had developed our own issue tracking system, in COBOL. That was not hard, as these tools worked well and let us build applications rather quickly.

When a program was approved, it was sent up to the Bull system for real this time. It was compiled, and from then on available to external test users.

People noticed quite early that the AIX test environment was rather faster than the Bull. Out came the arguments about interactive sessions vs. transaction processing. "The Bull will scale better for hundreds of users," went the official line. No matter that even with a handful of users, it was already slower. They added more of that special mainframe memory, and upgraded to the largest CPU possible. Using the Bull was not negotiable.

Our development environment was quite neat, I think. For developers, it all just worked. Of course I far prefer our git-based environments. Yet what we built showed the power of straight-forward UNIX shell scripting.

We also cracked the TDS environment and got the application running on it. This was no simple job, mainly because the programs had gotten so large that they did not fit into memory. As on a lot of machines bypassed by history, real memory was larger than virtual memory. That is, TDS limited programs to ridiculously small sizes, while the actual system memory was much larger.

You could even access the system memory from COBOL, just not portably. And we depended on making TDS and AIX look exactly the same from the point of view of developers. Which ended with me writing a memory manager in COBOL that managed system memory, swapping blocks of memory in and out of application space, behind the scenes. It was very fast because the heavy work (copying memory) compiled down to fast machine code. If in COBOL you say "move these 30,000 bytes from A to B" that can be a single machine instruction. If you write a loop, moving 30,000 bytes one by one, it adds up to maybe a million instructions.
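
The difference is easy to see in COBOL itself (a sketch):

    01  SYSTEM-BUFFER       PIC X(30000).
    01  PROGRAM-SPACE       PIC X(30000).
    01  I                   PIC 9(8) COMP.

*   One group move: the compiler emits a single block-copy instruction
    MOVE SYSTEM-BUFFER TO PROGRAM-SPACE.

*   The same bytes moved one at a time: around a million instructions
    PERFORM VARYING I FROM 1 BY 1 UNTIL I > 30000
        MOVE SYSTEM-BUFFER (I:1) TO PROGRAM-SPACE (I:1)
    END-PERFORM.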

No matter, we could not save the project. The developers went through death marches, the regional offices went on strike, managers left and were replaced. By the end of 1995 the client had stopped trying to blame Sema and our team for the failure. Everything we did worked. The project was postponed indefinitely, and I went off to find more constructive things to work on.

Welcome to the Web

In 1995, the entire World Wide Web fit into a single book, with one paragraph per website. In November 1995 I registered imatix.com and started to think about building an Internet business.

I'd been working in my free time and weekends with my friend Pascal on web servers. The design was inspired by the server I'd built for that tour operator. You can handle a lot (a lot) of sessions in a single process, if you use what's called a "cooperative multithreading" model. This means each session does a little work and then gives control back to a central dispatcher.

Threads must never block, and all I/O must be asynchronous, just as on the VAX with its AST calls. You don't need actual system threads. That's nice because real threads bring a lot of nasty, often fatal problems with them. We've learned by 2016 that sharing so-called "mutable" data between threads is a Really Bad Idea. In 1996 I had less data for this, yet already knew that it was nice when a thread could work with global data without risk of stepping on other threads' toes.

By December 1996 we released a working web server, which we called "Xitami." Xitami was one of the first free web servers to run on Windows. Microsoft's "personal web server" was crippleware, and people hated it. Xitami was easy to install, it was fast, and had no limits. You could, and people did, handle enough traffic to swamp the highest-speed Internet connections.

I once saw a Slashdot article about a guy who'd hooked 26 hard drives to his Windows 95 box. He had a web page showing the PC and explaining how he'd done it. Slashdot was one of the most popular geek news sites, and the name had come to mean, "kill a website through sheer volume of requests." He was running Xitami on this PC, and it didn't crash or slow down.

Yet I'm not going to talk about Xitami. It was my first popular open source product. It won many awards and users absolutely loved it. Yet it made us no money and had no developer community, and got us no business. As I define "open source" today, it was a failure.

Our real product, on which I wanted to build a business, was something quite different. I'd designed a "web transaction protocol" and implemented that in Xitami, together with some tools for building web forms. This gave us a crude yet working transaction processing system for the Internet.

That is, by 1997 or so we were able to build usable web applications that could handle thousands of concurrent users on cheap hardware. HTML was really poor. Basic functionality such as "put the cursor on this input field so the user can fix an error" was missing. Yet it worked.

In that company with the Bull TDS there was a separate, independent division building their own applications. They'd been searching for some way to provide access to remote users. They were well aware of the potential of the Internet, even if that mostly ran over dial-up modems. They took one look at our web framework (which I called "iMatix Studio") and asked when they could start using it.

Studio wasn't trivial to use. We'd not spent any time looking at developer languages, and we had totally missed Python as a candidate. Java 1.0 was just released, unstable, slow, and unusable. We did not look at more esoteric languages, and I was firmly against C++ because it produced such fragile code. So, we developed in C, which is less than ideal for the grunt work of business applications.

Still, it worked, and we slowly built up our framework. We lacked good code generation, so we built a generic code generator called GSL, which we still use today in the ZeroMQ community. We lacked a language for modeling the state machines and UIs we wanted to generate, so we first wrote our own structured data language, and then switched to XML by mid-1997.

It was in 1998 that I decided to part ways with Sema. They paid me well, and they were decent people, yet they were consistently betting on the wrong side of history, and it became deeply troubling. The last straw for me was when my boss asked me to make a technical design for a nationwide insurance network.

Our brief was to connect several thousand insurance agents in a network. They would log into a system, query insurance files, get quotes, log events and claims, etc. It was almost exactly the same problem as our old tour operator system, seven years down the line.

So I made a design betting on the future. We'd build it as a transactional web application, and use a number of XML formats for document standardization. The framework would be Studio, using arbitrary developer languages. We'd run on a single large UNIX box, which we'd tested and shown capable of dealing with ten times more traffic than planned. We'd use an Oracle database (again, customer decision, not negotiable).

Sema presented this design to the client, while simultaneously presenting a second design based on a client-server framework called PowerBuilder. At the time, at least in Belgium, PowerBuilder was fairly popular for such wide-area applications. Microsoft supported it and promoted it. It was expensive and produced a lot of money for vendors like Sema simply by giving them a slice of license income.

The PowerBuilder user interface was also much richer than HTML, at the time. It was however a miserably nasty system to deploy. When you started your app every day, the first thing it would do is download updates to the client half. This could take five minutes, or it could take an hour. You could not do updates during the day. Users had to log out at night.

Yet we were talking about a system that was to be developed gradually over a period of years, and was meant to last for decades. The goal was to define new industry standards, and to bring the insurance industry into the modern world.

And my design was rejected on the grounds that it could never work, that dial-up was no basis for a serious application, and that the Internet was just a toy that would never come to anything.

I shrugged, and went and started iMatix Corporation sprl. I started to plan, with my friend Julie, how to build a real business. It was all new to me, so we kept it small and modest. Sema kept asking me to help save catastrophic projects.
