(Thanks to Adamantios for the picture! CC BY-SA 3.0)
Twenty-five years ago this month, I hung up the phone in my kitchen. I paused, my hand still on the receiver, and watched my wife Betsy make dinner as our five-year-old daughter played with chess pieces on the floor. After a moment, I said, “I think there was a house in that conversation!”
I’d just finished a conversation with John Mayes. We knew each other from operating system work at Adaptive Corporation, a spin-off of Network Equipment Technologies, and he’d called to see how my recent move back to Athens, Georgia had gone. As we caught up on how life was going post-Adaptive, he mentioned a problem he’d run into as a consultant cleaning up sysadmin messes (there was no shortage of messes in those days).
He’d spent a long weekend in a research lab trying to track down and change every single IP address scattered across the facility. The company had installed Ethernet gear a few years ago, but it had never occurred to them that they would need to connect to the internet. So, in the typical fashion of the day, everyone made up their own IP addresses.
IP addresses are the phone numbers of the Internet. When you type www.coraid.com into your browser, something in the domain name server turns that particular combination of letters into a set of four decimal numbers. For Coraid, it’s 208.71.232.233.
Organizations used all sorts of inspiration for their made-up IP addresses: dates, room numbers, zip codes, street addresses, phone numbers. Then, management demanded that everyone connect to the internet. Suddenly, a whole lot of people had a huge problem.
And so did John Mayes. After 72 hours of almost non-stop poking through the seemingly unlimited lines of /etc/hosts files, he still hadn’t tracked down where all the bogus IP addresses were hiding.
“Do you think we can make a product to fix this?” he asked. After a rough patch as a software consultant, building a product sounded like an appealing way to build a business. The problem was apparent, and so was the solution. These companies could easily obtain new, usable addresses, so all we’d have to do was dynamically change the made-up IPs on the way out of the building. Businesses could have their cake (private IP addresses) and eat it too (use the internet without renumbering hundreds of computers).
John had been selling refurbished Sun pizza-box computers. He’d install a new disk and Solaris, configuring them to be firewalls. “Maybe we could add a module to those?” he suggested.
“No,” I replied. “It needs to be an appliance, a stand-alone box that does one thing. You know, like programs on Unix.”
It wasn’t long before I was sleeping in his study as we worked out how to do this. A new company was born: Network Translation, Inc., since our product would translate non-functioning IP addresses into working ones and back again.
I could envision the product even on that first call. I could see the packets leaving the building through a router-like box with two ports. The inside port would face the interior of the company and IP packets arriving on that interface would be allowed to exit the second interface to the outside. Before they were allowed to leave, however, the packets’ addresses would be swapped. A pool of valid addresses in the box would be dynamically allocated to each internal address that arrived on the inside interface.
A data structure called the xlate table would keep score. Packets arriving on the outside interface would be checked against the xlate table. If there was an entry for a packet, it would be rewritten with the corresponding internal address. If there was no entry, the packet would be dropped. It was like a connection diode: starting a connection from the inside would create a new entry in the xlate table, but packets from the outside would not.
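The behavior described above can be sketched in a few lines. This is only an illustration in Python, not the PIX's code: the class name `XlateTable`, its method names, and the example addresses are all assumptions, and a real table would also track connection state and expire idle entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class XlateTable:
    """Hypothetical xlate table: inside address <-> borrowed outside address."""
    pool: List[str]                                   # valid addresses not yet handed out
    out_by_in: Dict[str, str] = field(default_factory=dict)
    in_by_out: Dict[str, str] = field(default_factory=dict)

    def outbound(self, inside_src: str) -> Optional[str]:
        """Packet leaving the building: return the valid source address to use."""
        if inside_src not in self.out_by_in:
            if not self.pool:
                return None                           # pool exhausted, cannot translate
            outside = self.pool.pop(0)                # allocate dynamically on first use
            self.out_by_in[inside_src] = outside
            self.in_by_out[outside] = inside_src
        return self.out_by_in[inside_src]

    def inbound(self, outside_dst: str) -> Optional[str]:
        """Packet arriving from outside: inside address, or None meaning drop."""
        return self.in_by_out.get(outside_dst)        # never creates an entry


table = XlateTable(pool=["192.0.2.10", "192.0.2.11"])
print(table.inbound("192.0.2.10"))    # None: no entry yet, so the packet drops
print(table.outbound("10.1.2.3"))     # 192.0.2.10: inside traffic creates the entry
print(table.inbound("192.0.2.10"))    # 10.1.2.3: replies now find their way back
```

The diode effect comes from the asymmetry: only `outbound` is allowed to add entries, while `inbound` can only look them up.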
I’d implemented TCP/IP before and knew how everything worked, including the one’s complement checksum. I could see the packets. I could see how they would be changed.
John and I found motherboards with Intel 486 processors and found four-rack-unit chassis that mounted in a 19" rack. After looking at several Ethernet cards, we decided on the 3Com. It was a sixteen-bit ISA card, one of the more elegant ones on the market.
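The checksum matters here because swapping an address changes the bytes the header checksum covers. As a rough sketch (in Python, with function names that are my own assumptions, and not a claim about how the PIX did it), the checksum can either be recomputed over the whole header or patched incrementally using the identity from RFC 1624, HC' = ~(~HC + ~m + m'):

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit big-endian words, folding carries back in (end-around carry)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data: bytes) -> int:
    """Internet checksum: the complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def update_checksum(old_sum: int, old_field: bytes, new_field: bytes) -> int:
    """Patch a checksum after a field changes, without re-reading the packet.

    Fields must be an even number of bytes, as an IPv4 address is.
    """
    total = ~old_sum & 0xFFFF
    for (m,) in struct.iter_unpack("!H", old_field):
        total += ~m & 0xFFFF                 # remove the old words
    for (m,) in struct.iter_unpack("!H", new_field):
        total += m                           # add the new words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# A 20-byte IPv4 header with the checksum field zeroed, source 192.168.0.1.
old_src = bytes([192, 168, 0, 1])
new_src = bytes([208, 71, 232, 233])
head = bytes.fromhex("45000073000040004011" "0000")
dst = bytes([192, 168, 0, 199])

old_sum = checksum(head + old_src + dst)
patched = update_checksum(old_sum, old_src, new_src)
print(patched == checksum(head + new_src + dst))   # True
```

The incremental form touches only the words that changed, which is attractive when every packet through the box needs its addresses rewritten.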
After a while, I moved out of John’s study and into the nearby Holiday Inn. It was a good bit better than the futon. John rented office space at the Palo Alto airport where his Mooney airplane was tied down. The airport is a great place to work.
I built the system all through 1994. At first, I thought about using Unix, but we didn’t have the money for a license. There was 386BSD from Lynne and Bill Jolitz, but it seemed pretty large for an appliance. I needed to think.
While in Palo Alto for these extended work trips, I stayed on east coast time for the most part. I’d wake up around 3:00 a.m., guzzle hotel room coffee, made palatable thanks to overdoing sugar and creamer, and then head down Embarcadero to the airport. I’d examine the vast expanse of Cessnas and Cherokees as I grabbed my early morning breakfast of vending-machine pastry and soda. Then I’d head to my machine.
When it began to get light outside, I’d walk to the duck pond to the east to think. I had learned my development pattern from Ken Thompson, although I wouldn’t realize it until much later. I would meditate about the task at hand, sketching some mental pictures of how the software would work. Then I would pick a beginning place and start coding, compiling, and testing right away.
While walking one morning, I grappled with more of my mental software system. A bunch of threads were gone. Then the Unix STREAMS-like network of frames and the blocks to hold them disappeared because, as I dug deeper, I realized there wasn’t a need for them. I discarded everything that didn’t need to be there.
What was left was a loop, polling the two interfaces and calling slightly different pieces of code. The performance consequence of this exercise later paid huge dividends.
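The shape of such a loop is easy to sketch. The following Python is an illustration only: the `Interface` class, the two handler functions, and the idle-pass cutoff are assumptions made so the example can run and terminate. The real loop ran on bare hardware and never exited.

```python
from collections import deque


class Interface:
    """Stand-in for an Ethernet card: a receive queue and a transmit log."""

    def __init__(self, name):
        self.name = name
        self.rx = deque()   # frames waiting to be picked up
        self.tx = []        # frames we have sent out this port

    def poll(self):
        """Return the next received frame, or None if nothing is waiting."""
        return self.rx.popleft() if self.rx else None


def from_inside(pkt, outside):
    # Placeholder: translate the source address, then forward.
    outside.tx.append(pkt)


def from_outside(pkt, inside):
    # Placeholder: look up the destination, then forward or drop.
    inside.tx.append(pkt)


def run(inside, outside, max_idle_passes=1):
    """One thread, no interrupts: check each port, do the work, repeat."""
    idle = 0
    while idle < max_idle_passes:   # a real appliance would loop forever
        busy = False
        pkt = inside.poll()
        if pkt is not None:
            from_inside(pkt, outside)
            busy = True
        pkt = outside.poll()
        if pkt is not None:
            from_outside(pkt, inside)
            busy = True
        idle = 0 if busy else idle + 1


inside, outside = Interface("inside"), Interface("outside")
inside.rx.extend(["a", "b"])
outside.rx.append("c")
run(inside, outside)
print(outside.tx, inside.tx)   # ['a', 'b'] ['c']
```

With no scheduler, no locks, and no context switches, every cycle the processor spends goes to moving packets.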
With the system simplified in my mind, I started on the code from scratch, right on the hardware. No OS. Just a BSD system to write and compile and a bare PIX to boot it. I think I was using BSDI for development at the time with some code from Bell Labs applied. I used the text editor Sam, just as I do today, and Byron Rakitzis’s rc shell. I couldn’t get real Plan 9 in 1994.
All day, I would sit at a folding table, the sharp edges cutting into my forearms, editing, compiling, copying to the floppy. I’d jab the eject button, and the floppy would pop out with a satisfying sound. I’d insert the floppy into the PIX and press the reset button. The floppy would turn, and characters would emerge from the serial port via the cu program on the BSD system. I debugged using prints, just like today. The endless cycle of edit, compile, copy, eject went on through almost all of October.
Late one night I returned to Athens on the last CC Air flight into the small Georgia airport. I’d been gone for six weeks. As I made my way from the airport to our house, it felt so strange to be driving my tiny Civic. The sleepy east Athens streets were cold and quiet as I turned down College Station Road. Most of the work on the internals of the PIX was done. The next month would be spent beta testing it at KLA/Instruments, a maker of capital equipment for the semiconductor industry. It would be a bugless beta.
The simple design using a single loop just checking for things to do is a powerful model. A single thread seems so lacking in power now. The Mac I’m running our Plan 9 terminal software on is running 1,568 threads in 392 processes. But a single thread is a powerful thing.
A thread is a stack of function calls. For those who are not programmers, a function is a group of code lines packaged under a name. It lets coders design the software in a hierarchy, with details hidden in the implementation of functions.
A function has its own local data, saved in a stack of data, freely created and destroyed as functions are called and returned.
Each thread has its respective stack of local data. Switching between threads means switching those stacks, not usually an expensive operation, but as systems have gotten more and more complex, the time devoted to switching has grown as well.
But for appliances like the PIX, there was no reason for more than a single thread. Mostly this was and still is because devices such as the disk controller do their work without much processor involvement. They use a data structure called a ring, a circle of buffers used to accomplish the task at hand. Hardware uses these rings and alerts the processor when something has been sent or received. The processor can access the ring at its leisure.
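A ring of this kind can be modeled as a fixed circle of slots with one index for the producer and one for the consumer. The sketch below is a simplified Python model with assumed names (`Ring`, `hw_receive`, `cpu_poll`). Real controllers typically mark ownership of each slot with a bit in a descriptor rather than keeping a shared count.

```python
class Ring:
    """Fixed circle of buffers shared by a device and the processor."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0    # next slot the "hardware" will fill
        self.tail = 0    # next slot the processor will drain
        self.count = 0   # slots currently holding a frame

    def hw_receive(self, frame):
        """Device side: store a frame, or report that the ring is full."""
        if self.count == len(self.slots):
            return False                       # no free buffer: frame is lost
        self.slots[self.head] = frame
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1
        return True

    def cpu_poll(self):
        """Processor side: take the oldest frame, whenever it gets around to it."""
        if self.count == 0:
            return None
        frame = self.slots[self.tail]
        self.slots[self.tail] = None           # hand the buffer back to the device
        self.tail = (self.tail + 1) % len(self.slots)
        self.count -= 1
        return frame


ring = Ring(2)
ring.hw_receive("frame-1")
ring.hw_receive("frame-2")
print(ring.hw_receive("frame-3"))   # False: both buffers are in use
print(ring.cpu_poll())              # frame-1
print(ring.hw_receive("frame-3"))   # True: a buffer was freed
```

Because the device keeps filling slots on its own, the processor only has to visit the ring often enough that it never fills up, which is exactly what a tight polling loop provides.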
After the success of the PIX, I discovered that the idea of a single thread and not being interrupt-driven was not something new in computer history. Niklaus Wirth built an entire general purpose workstation using the concept of the single thread. His Oberon System had been used to teach computer science in Zurich for years. It was similar to my system but more general, allowing for the dynamic addition of code, something I didn’t need to do.
Even before that, back in the 1960s, the great designer Seymour Cray built an entire large mainframe without any interrupts in his CDC 6600. He implemented ten 12-bit processors to do input-output for his mighty 60-bit number crunching main processor.
Cray’s tiny processors used single threads to watch for completion of I/O. When I/O was finished, a peripheral processor could halt the running program, save all of its registers in main memory, and load the registers of another program, all in a single PPU instruction.
After the PIX shipped in early 1995, I rented space from a friend and did most of my programming from a basement on Research Drive in Athens. I still have the Sam’s Club office chair that I used back in those days. In that year, I added new features: a TCP stack for logging into the PIX, the first virtual private network product, support for ever faster processors, and more.
Late in 1995, John told me that Cisco was sniffing around. In November, the Network Translation PIX Firewall became the Cisco PIX Firewall.
We frequently used the word “adaptive” to talk about the PIX that year. We were referring to the stateful nature of the xlate table used by the PIX to adapt to the flow of packets. When I left Cisco in 1997, I was figuring out how to detect bad behavior through our connections. Cisco combined this idea and the PIX with a couple of other products. They called it the Adaptive Security Appliance, keeping adaptive alive in its name.
Our use of the word “adaptive” was also a homage to the company that allowed John and me to meet. It was a tribute to Audrey MacLean and Charlie Giancarlo, the founders of Adaptive. It’s nice that the name lives on.