Usenix, the classic conference of Unix research and practice, pulled off a satisfying event in Boston this year. Turn-out was modest (about 700 people registered, and the show never seemed crowded except when food was served), but the number of famous and significant leaders in the Unix and free software world I saw walking the hallways was impressive, and the line-up on the podiums was even more so.

In this blog I’ll talk about movements toward–and barriers to–openness, a few technical discussions, and one project that’s really hot.

Tribute to John Lions, and other pioneers of open knowledge

Openness came up as a theme repeatedly. In Thursday’s opening remarks, Jon “maddog” Hall promoted the work of Professor John Lions, who used the source code of Unix in a classic book explaining its workings, Lions’ Commentary on Unix. This is a great tradition in spreading knowledge of computing, continued by such books as (most recently from O’Reilly) Understanding Linux Network Internals.

Lions’s book also became a famous casualty in the battle between open knowledge sharing and corporate secrecy. Publication was quashed by the owners of Unix, who decided at that moment to change their license and stop distributing Unix source code freely. The book spread by samizdat for two decades before it was finally allowed to be published openly.

Hall is promoting a chair in Lions’s name at the university he taught at in Australia, and is auctioning off on eBay a copy of the book signed by many of the creators and luminaries of the Unix and Linux world. Everyone is invited to submit a bid.

At the same keynote where Lions’s contribution to knowledge was honored, BitTorrent creator Bram Cohen won the annual Usenix Software Tool Users Group award. As I pointed out in an earlier blog, a BitTorrent client also recently won a SourceForge award, a stark contrast to the opprobrium expressed toward file-sharing tools by the established media companies.

Another award, the Lifetime Achievement Award also known as “The Flame,” went to Radia Perlman for her contributions to TCP/IP networking, notably the Spanning Tree Protocol. For information on this protocol and its importance, I’ll give another plug for O’Reilly’s Understanding Linux Network Internals.

So freedom made a strong appearance from the start. On the other hand, Mono project leader Miguel de Icaza said, “I wouldn’t advise my friends starting new businesses to give away their product for free. My next company will not be an open source company.” But read on for the context for this statement.

Open source business leaders speak

Open source consultant Stephen Walli assembled a star panel to discuss open source business models, bringing in Brian Aker from MySQL AB, Miguel de Icaza from Ximian (now part of Novell), and Mike Olson from SleepyCat Software (a non-relational database company recently acquired by Oracle).

(Addition on June 7, 2006: Stephen has put up recordings of his panel, along with a summary.)

Stephen asked questions about business models, licensing, and the role of community in helping product development. I think his goal in bringing up these varied (but clearly related) topics was to lead the panelists and the audience to explore the more general question: What is an open-source business? I don’t think we quite got there, but the meandering discussion touched on many interesting points.

The tone of the panel was caution: caution regarding starting a business at all, and caution regarding the licensing and revenue model. Open source was certainly not offered as a sure win.

The caution extends to the promise of today’s global reach for corporations and information. The panelists agreed that you need a physical presence in each region of the world–someone who understands the cultural and legal issues of doing business in that region–to sell to people and service them effectively there.

Similarly, the blogosphere came across as more of a hindrance than a help to businesses. (My apologies for contributing to it.) Rather than viewing Slashdot and all the welter of personal sites as a marketing opportunity, the panelists expressed anxiety at how quickly intemperate posters can ruin a company’s reputation. “They’re like a Board of Directors that has no financial stake in your enterprise,” quipped Mike.

A number of minor disagreements came up, and the early part of the panel was colored by Miguel’s provocative warning about open source businesses. For instance, compare the price Red Hat paid for JBoss (about 350 million dollars) with the price eBay paid for Skype (2.6 billion dollars). Does this demonstrate that proprietary software makes a company more valuable, as Miguel claimed? Or just that the potential market for a VoIP service is much larger than it is for a Java development platform, as Brian responded?

While Brian and Mike pushed back against Miguel’s pessimism, Brian affirmed that “what gets funded is a mixture of proprietary and open,” which in MySQL AB’s case means dual-licensing.

SleepyCat also uses dual-licensing. The model works particularly well (in my opinion) for databases, because they are often incorporated into larger products, and this is particularly true for SleepyCat’s Berkeley DB because it is a library. Even so, Mike said wistfully, “If you start a business, be aware that virtually none of your users will pay you.”

What does the community offer in terms of information and input to open source projects? The best thing they can give, according to Brian, is accurate bug reports. For Mike, it’s not even that–it’s just using the product and telling others about it. None of the panelists rated highly the value of code submitted by outsiders. Miguel did point out that new projects attract more submissions because they are less complex and there are fewer conventions to learn; in his own company, for instance, Mono gets a lot of code from the field while Evolution does not.

There was some fulmination about software patents, but the panelists kept a positive attitude. Walli said he advises his clients, on solid business grounds, not to take out patents. Brian pointed out that the anti-patent community has been mostly successful at holding software patents back in Europe. And Miguel claimed that venture capitalists show no interest in patents when they evaluate software companies.

Sensors and Sensibility (or Under the Volcano)

I met Matt Welsh in the early 1990s and edited his classic Running Linux. As early as 1992 he founded the Linux Documentation Project and started teaching people how to use the new operating system, which was just then erupting upon the computer scene. But the live Ecuadoran volcano of Reventador became the scene of Matt’s most recent efforts, which were to support seismic research by creating a mesh network of wireless sensors.

There was an excellent panel on virtualization going on at the same time as Matt’s talk. I told myself I should really be at the virtualization panel for career purposes, and promised myself I’d leave Matt’s talk after half an hour, but the moment came and passed. I said, what the hell–I’m having more fun than I ever had at a technical conference, and I can hear about virtualization anytime.

Matt showed us photos of the sensor devices his seismologist colleagues used to put on mountains, and the new ones he and his computing team designed.

The old ones ran on two car batteries (imagine carrying the materials for half a dozen sensors up a mountain) and had to be visited physically to retrieve data.

The new ones are based on a tiny wireless device known as a mote. The key constraint is power, because the team wants to put in a couple of standard D batteries and leave the device alone for a week. These devices draw just 20 milliamperes when the radio is active. To stay this slim, they subsist on an 8-bit processor, 10 K of RAM, 60 K of ROM, and 1 Meg of Flash (where the seismic data is read in and stored in a circular buffer).
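To get a feel for that power budget, here is a rough back-of-the-envelope calculation in Python. The 20 mA figure comes from the talk; the battery capacity is my own assumption (a nominal figure for an alkaline D cell, not something Matt cited), and treating the radio as always on makes this a worst case rather than a measurement.

```python
# Rough power budget for a mote left alone for a week.
HOURS_PER_WEEK = 7 * 24

ACTIVE_CURRENT_MA = 20              # from the talk: draw with the radio active
ASSUMED_CELL_CAPACITY_MAH = 12_000  # ASSUMPTION: nominal alkaline D cell


def charge_needed_mah(current_ma, hours):
    """Charge consumed (in mAh) at a constant current draw."""
    return current_ma * hours


def runtime_hours(capacity_mah, current_ma):
    """Hours a given capacity lasts at a constant current draw."""
    return capacity_mah / current_ma


week_mah = charge_needed_mah(ACTIVE_CURRENT_MA, HOURS_PER_WEEK)
assumed_runtime = runtime_hours(ASSUMED_CELL_CAPACITY_MAH, ACTIVE_CURRENT_MA)

print(f"One week at {ACTIVE_CURRENT_MA} mA uses {week_mah} mAh")
print(f"The assumed cell would last about {assumed_runtime:.0f} hours")
```

If the two cells are wired in series, they raise the voltage rather than the capacity, so a single cell's capacity is the number to compare against the weekly draw.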

The radio runs on the Zigbee protocol and permits real-time monitoring. Each device searches out neighbor devices in the mesh, and together they create a simple hierarchy for transmitting data back to a directional antenna and thence to the base station. (The hierarchy is simple because the devices are arranged in a row going up the mountain.) Another neat feature of the radios is that they allow someone to remotely reprogram the devices (although this function failed in the field).
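The idea of a hierarchy built from neighbor relationships can be sketched in a few lines of Python. This is only an illustration: the real motes build their tree in a distributed fashion over the radio, whereas this sketch runs a centralized breadth-first search, and the node names and neighbor table are hypothetical.

```python
from collections import deque


def build_routing_tree(neighbors, root):
    """Breadth-first search from the root; returns a {node: parent} map."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in neighbors.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def path_to_root(parent, node):
    """Follow parent links from a node back to the root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


# Hypothetical deployment: three motes in a row, each hearing only
# its immediate neighbors, with the base station at the bottom.
neighbors = {
    "base": ["mote1"],
    "mote1": ["base", "mote2"],
    "mote2": ["mote1", "mote3"],
    "mote3": ["mote2"],
}

tree = build_routing_tree(neighbors, "base")
print(path_to_root(tree, "mote3"))
```

With the devices in a row, each mote ends up with exactly one parent and the tree degenerates into a chain, which is why the hierarchy stays simple.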

The data collected–which consists both of seismic movements and sound waves below the pitch of human hearing–is too big for continuous data transmission. Instead, when a sufficiently large change in the data suggests to each device that an eruption is occurring (a guess that is correct most of the time), the device transmits its current buffer with the most recent 60 seconds’ worth of data.
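A minimal sketch of this "buffer everything, transmit on a trigger" pattern follows. The class name, the sample rate, and the trigger rule (a jump between consecutive samples larger than a threshold) are my own assumptions for illustration; the talk as I have reported it here did not specify the team's actual detection rule.

```python
from collections import deque


class TriggeredBuffer:
    """Keep the most recent window of samples; hand it over on a big jump."""

    def __init__(self, sample_rate_hz, window_seconds, threshold):
        # A deque with maxlen silently drops the oldest sample when full,
        # which gives circular-buffer behavior.
        self.samples = deque(maxlen=sample_rate_hz * window_seconds)
        self.threshold = threshold

    def add(self, value):
        """Store a sample. Return a copy of the buffer if the change from
        the previous sample exceeds the threshold, otherwise None."""
        previous = self.samples[-1] if self.samples else None
        self.samples.append(value)
        if previous is not None and abs(value - previous) > self.threshold:
            return list(self.samples)
        return None


# Tiny demonstration with made-up numbers: 2 samples per second and a
# 3-second window, so the buffer holds 6 samples.
demo = TriggeredBuffer(sample_rate_hz=2, window_seconds=3, threshold=10)
for reading in range(8):        # quiet period: small changes, no trigger
    assert demo.add(reading) is None
snapshot = demo.add(100)        # sudden jump: buffer is returned
print(snapshot)
```

On the real devices the window would be 60 seconds long, and the returned snapshot is what would go out over the radio.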

Matt regaled us with many other fascinating details, such as how a synchronization protocol (FTSP) that achieved accuracy within 10 microseconds in an outdoor test failed to achieve the necessary accuracy within 10 milliseconds on site. And how the one piece of software they didn’t write–and didn’t test in advance–was the one piece that failed catastrophically on site. Lessons Matt drew include:

  • Test, and then test some more. Anything that’s not tested under conditions as realistic as possible will fail.

  • Even if something is tested under conditions as realistic as possible, it can still fail.

Nevertheless, enough useful results were culled from the experiment to call it a success–and to do it again next year.

Other encounters

Thursday’s keynote was delivered by Larry Peterson and concerned the world-wide grid computing service PlanetLab, widely used by scientific researchers. Peterson went over the history of the PlanetLab consortium’s choices and the lessons they drew from them, confirming some familiar principles:

  • Leverage existing, standard software; don’t reinvent the wheel. (Otherwise, you’ll just create a maintenance headache.)

  • Layers are good; employ virtualization to hide such differences as the operating systems in use.

  • Distribute both processing and authority wherever you can, but be willing to centralize control when necessary to avoid chaos. This issue was addressed in an article of mine (Part 1 and Part 2), where I discuss the difficulties of distributing control.

I went to a talk by one of the true grandmasters of the golden age of Unix, Stephen C. Johnson, who created the original yacc, lint, and Portable C Compiler. Here he was discussing some interesting features of modern caches that seem to have unexpected and hard-to-account-for effects on performance. For instance, the order of members inside a C structure can have a reproducible impact on how fast they are processed.
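One well-understood way that member order changes a structure's memory footprint is alignment padding, and it can be seen without a C compiler by using Python's ctypes module, which follows the platform's native C layout rules by default. This is not necessarily the effect Johnson was describing (his examples were harder to account for); the structures, field names, and 64-byte cache line below are assumptions chosen purely for illustration.

```python
import ctypes


class Interleaved(ctypes.Structure):
    # small, large, small: padding is inserted around the double
    _fields_ = [
        ("flag", ctypes.c_char),
        ("value", ctypes.c_double),
        ("tag", ctypes.c_char),
    ]


class Grouped(ctypes.Structure):
    # same members, largest first: the two chars share the tail
    _fields_ = [
        ("value", ctypes.c_double),
        ("flag", ctypes.c_char),
        ("tag", ctypes.c_char),
    ]


ASSUMED_CACHE_LINE_BYTES = 64  # ASSUMPTION: common, but platform-dependent

for layout in (Interleaved, Grouped):
    size = ctypes.sizeof(layout)
    per_line = ASSUMED_CACHE_LINE_BYTES // size
    print(f"{layout.__name__}: {size} bytes, {per_line} per cache line")
```

The exact sizes depend on the platform, but the grouped layout is the smaller one, so more structures fit into each cache line when walking an array of them.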

Among the interesting poster sessions in the evening was one suggesting that paging could be sped up by storing pages in the swap area instead of simply expelling them and retrieving them from the filesystem again. The filesystem is more complex than the swap area and therefore requires more overhead. Furthermore, a single page can be replicated multiple times in swap to reduce disk rotation time.

I also investigated a poster about the famous WiFi network in downtown Philadelphia. Use of this network is fairly limited, and skewed (as one would expect) toward people with more education and money. But it’s extremely popular. The question is why. Will it really reap the promised economic and social benefits, or has it just been marketed really well?