The final component of Free Software is coordination. For many participants and observers, this is the central innovation and essential significance of Open Source: the possibility of enticing potentially huge numbers of volunteers to work freely on a software project, leveraging the law of large numbers, “peer production,” “gift economies,” and “self-organizing social economies.”
Coordination is important because it collapses and resolves the distinction between technical and social forms into a meaningful [pg 211] whole for participants. On the one hand, there is the coordination and management of people; on the other, there is the coordination of source code, patches, fixes, bug reports, versions, and distributions—but together there is a meaningful technosocial practice of managing, decision-making, and accounting that leads to the collaborative production of complex software and networks. Such coordination would be unexceptional, essentially mimicking long-familiar corporate practices of engineering, except for one key fact: it has no goals. Coordination in Free Software privileges adaptability over planning. This involves more than simply allowing any kind of modification; the structure of Free Software coordination actually gives precedence to a generalized openness to change, rather than to the following of shared plans, goals, or ideals dictated or controlled by a hierarchy of individuals.
Adaptability does not mean randomness or anarchy, however; it is a very specific way of resolving the tension between the individual curiosity and virtuosity of hackers and the collective coordination necessary to create and use complex software and networks. No man is an island, but no archipelago is a nation, so to speak. Adaptability preserves the “joy” and “fun” of programming without sacrificing the careful engineering of a stable product. Linux and Apache should be understood as the results of this kind of coordination: experiments with adaptability that have worked, to the surprise of many who have insisted that complexity requires planning and hierarchy. Goals and planning are the province of governance—the practice of goal-setting, orientation, and definition of control—but adaptability is the province of critique, and this is why Free Software is a recursive public: it stands outside power and offers powerful criticism in the form of working alternatives. It is not the domain of the new—after all, Linux is just a rewrite of UNIX—but the domain of critical and responsive public direction of a collective undertaking.
Linux and Apache are more than pieces of software; they are organizations of an unfamiliar kind. My claim that they are “recursive publics” is useful insofar as it gives a name to a practice that is neither corporate nor academic, neither profit nor nonprofit, neither governmental nor nongovernmental. The concept of recursive public includes, within the spectrum of political activity, the creation, modification, and maintenance of software, networks, and legal documents. While a “public” in most theories is a body of [pg 212] people and a discourse that give expressive form to some concern, “recursive public” is meant to suggest that geeks not only give expressive form to some set of concerns (e.g., that software should be free or that intellectual property rights are too expansive) but also give concrete infrastructural form to the means of expression itself. Linux and Apache are tools for creating networks by which expression of new kinds can be guaranteed and by which further infrastructural experimentation can be pursued. For geeks, hacking and programming are variants of free speech and freedom of assembly.
Linux and Apache are the two paradigmatic cases of Free Software in the 1990s, both for hackers and for scholars of Free Software. Linux is a UNIX-like operating-system kernel, bootstrapped out of the Minix operating system created by Andrew Tanenbaum. Apache is a Web server, bootstrapped out of the original NCSA httpd created by Rob McCool.
Linux and Apache are both experiments in coordination. Both projects evolved decision-making systems through experiment: a voting system in Apache’s case and a structured hierarchy of decision-makers, with Linus Torvalds as benevolent dictator, in Linux’s case. Both projects also explored novel technical tools for coordination, especially Source Code Management (SCM) tools such as Concurrent Versioning System (cvs). Both are also cited as exemplars of how “fun,” “joy,” or interest determine individual participation and of how it is possible to maintain and encourage that participation and mutual aid instead of narrowing the focus or eliminating possible routes for participation.
Beyond these specific experiments, the stories of Linux and Apache are detailed here because both projects were central to the construction and expansion of the Internet of the 1990s, allowing a massive number of corporate and noncorporate sites alike to install and run servers on the Internet cheaply. Were Linux and Apache nothing more than hobbyist projects with a few thousand [pg 213] interested tinkerers, rather than the core technical components of an emerging planetary network, they would probably not represent the same kind of revolutionary transformation ultimately branded a “movement” in 1998-99.
Linus Torvalds’s creation of the Linux kernel is often cited as the first instance of the real “Open Source” development model, and it quickly became the most studied of the Free Software projects.
When Torvalds started, he was blessed with an eager audience of hackers keen on seeing a UNIX system run on desktop computers and a personal style of encouragement that produced enormous positive feedback. Torvalds is often given credit for creating, through his “management style,” a “new generation” of Free Software—a younger generation than that of Stallman and Raymond. Linus and Linux are not in fact the causes of this change, but the results of being at the right place at the right time and joining together a number of existing components. Indeed, the title of Torvalds’s semi-autobiographical reflection on Linux—Just for Fun: The Story of an Accidental Revolutionary—captures some of the character of its genesis.
The “fun” referred to in the title reflects the privileging of adaptability over planning. Projects, tools, people, and code that were fun were those that were not dictated by existing rules and ideas. Fun, for geeks, was associated with the sudden availability, especially for university students and amateur hackers, of a rapidly expanding underground world of networks and software—Usenet and the Internet especially, but also university-specific networks, online environments and games, and tools for navigating information of all kinds. Much of this activity occurred without the benefit of any explicit theorization, with the possible exception of the discourse of “community” (given print expression by Howard Rheingold in 1993 and present in nascent form in the pages of Wired and Mondo 2000) that took place through much of the 1990s.
Fun included the creation of mailing lists, enabled by the spread of software such as LISTSERV and Majordomo; the collaborative maintenance and policing of Usenet; and the creation of Multi-User Dungeons (MUDs) and their object-oriented variants (MOOs), both of which gave game players and Internet geeks a way to co-create software environments and discover many of the problems of management and policing that thereby emerged.
During this period (roughly 1987 to 1993), the Free Software Foundation attained a mythic cult status—primarily among UNIX and EMACS users. Part of this status was due to the superiority of the tools Stallman and his collaborators had already created: the GNU C Compiler (gcc), GNU EMACS, the GNU Debugger (gdb), GNU Bison, and loads of smaller utilities that replaced the original AT&T UNIX versions. The GNU GPL had also acquired a life of its own by this time, having reached maturity as a license and become the de facto choice for those committed to Free Software and the Free Software Foundation. By 1991, however, the rumors of the imminent appearance of Stallman’s replacement UNIX operating system had started to sound empty—it had been six years since his public announcement of his intention. Most hackers were skeptical of Stallman’s operating-system project, even if they acknowledged the success of all the other tools necessary to create a full-fledged operating system, and Stallman himself was stymied by the development [pg 215] of one particular component: the kernel itself, called GNU Hurd.
Linus Torvalds’s project was not initially imagined as a contribution to the Free Software Foundation: it was a University of Helsinki student’s late-night project in learning the ins and outs of the relatively new Intel 386/486 microprocessor. Torvalds, along with tens of thousands of other computer-science students, was being schooled in UNIX through the pedagogy of Andrew Tanenbaum’s Minix, Douglas Comer’s Xinu-PC, and a handful of other such teaching versions designed to run on IBM PCs. Along with the classroom pedagogy in the 1980s came the inevitable connection to, lurking on, and posting to the Usenet and Arpanet mailing lists devoted to technical (and nontechnical) topics of all sorts.
The fact of Linus Torvalds’s pedagogical embedding in the world of UNIX, Minix, the Free Software Foundation, and the Usenet should not be underestimated, as it often is in hagiographical accounts of the Linux operating system. Without this relatively robust moral-technical order or infrastructure within which it was possible to be at the right place at the right time, Torvalds’s late-night dorm-room project would have amounted to little more than that—but the pieces were all in place for his modest goals to be transformed into something much more significant.
Consider his announcement on 25 August 1991:
Hello everybody out there using minix—I’m doing a (free) operating system (just a hobby, won’t be big and professional like gnu) for 386(486) AT clones. This has been brewing since april, and is starting to get ready. I’d like any feedback on things people like/dislike in minix, as my OS resembles it somewhat (same physical layout of the file-system (due to practical reasons) among other things). I’ve currently ported bash(1.08) and gcc(1.40), and things seem to work. This implies that I’ll get something practical within a few months, and I’d like to know what features most people would want. Any suggestions are welcome, but I won’t promise I’ll implement them :-)
Linus . . .
PS. Yes—it’s free of any minix code, and it has a multi-threaded fs. It is NOT portable (uses 386 task switching etc), and it probably never will support anything other than AT-harddisks, as that’s all I have :-(.
Torvalds’s announcement is telling as to where his project fit into the existing context: “just a hobby,” not “big and professional like gnu” (a comment that suggests the stature that Stallman and the Free Software Foundation had achieved, especially since they were in reality anything but “big and professional”). The announcement was posted to the Minix list and thus was essentially directed at Minix users; but Torvalds also makes a point of insisting that the system would be free of cost, and his postscript furthermore indicates that it would be free of Minix code, just as Minix had been free of AT&T code.
Torvalds also mentions that he has ported “bash” and “gcc,” software created and distributed by the Free Software Foundation and tools essential for interacting with the computer and compiling new versions of the kernel. Torvalds’s decision to use these utilities, rather than write his own, reflects both the boundaries of his project (an operating-system kernel) and his satisfaction with the availability and reusability of software licensed under the GPL.
So the system is based on Minix, just as Minix had been based on UNIX—piggy-backed or bootstrapped, rather than rewritten in an entirely different fashion, that is, rather than becoming a different kind of operating system. And yet there are clearly concerns about the need to create something that is not Minix, rather than simply extending or “debugging” Minix. This concern is key to understanding what happened to Linux in 1991.
Tanenbaum’s Minix, since its inception in 1984, was always intended to allow students to see and change the source code of Minix in order to learn how an operating system worked, but it was not Free Software. It was copyrighted and owned by Prentice Hall, which distributed the textbooks. Tanenbaum made the case—similar to Gosling’s case for Unipress—that Prentice Hall was distributing the system far more widely than if it were available only on the Internet: “A point which I don’t think everyone appreciates is that making something available by FTP is not necessarily the way to provide the widest distribution. The Internet is still a highly elite group. Most computer users are NOT on it. . . . MINIX is also widely used in Eastern Europe, Japan, Israel, South America, etc. Most of these people would never have gotten it if there hadn’t been a company selling it.”
By all accounts, Prentice Hall was not restrictive in its sublicensing of the operating system, if people wanted to create an “enhanced” [pg 217] version of Minix. Similarly, Tanenbaum’s frequent presence on comp.os.minix testified to his commitment to sharing his knowledge about the system with anyone who wanted it—not just paying customers. Nonetheless, Torvalds’s pointed use of the word free and his decision not to reuse any of the code is a clear indication of his desire to build a system completely unencumbered by restrictions, based perhaps on a kind of intuitive folkloric sense of the dangers associated with cases like that of EMACS.
The most significant aspect of Torvalds’s initial message, however, is his request: “I’d like to know what features most people would want. Any suggestions are welcome, but I won’t promise I’ll implement them.” Torvalds’s announcement and the subsequent interest it generated clearly reveal the issues of coordination and organization that would come to be a feature of Linux. The reason Torvalds had so many eager contributors to Linux, from the very start, was that he enthusiastically took them off Tanenbaum’s hands.
Tanenbaum’s role in the story of Linux is usually that of the straw man—a crotchety old computer-science professor who opposes the revolutionary young Torvalds. Tanenbaum did have a certain revolutionary reputation himself, since Minix was used in classrooms around the world and could be installed on IBM PCs (something no other commercial UNIX vendors had achieved), but he was also a natural target for people like Torvalds: the tenured professor espousing the textbook version of an operating system. So, despite the fact that a very large number of people were using or knew of Minix as a UNIX operating system (estimates of comp.os.minix subscribers were at 40,000), Tanenbaum was emphatically not interested in collaboration or collaborative debugging, especially if debugging also meant creating extensions and adding features that would make the system bigger and harder to use as a stripped-down tool for teaching. For Tanenbaum, this point was central: “I’ve been repeatedly offered virtual memory, paging, symbolic links, window systems, and all manner of features. I have usually declined because I am still trying to keep the system simple enough for students to understand. You can put all this stuff in your version, but I won’t [pg 218] put it in mine. I think it is this point which irks the people who say ‘MINIX is not free,’ not the $60.”
So while Tanenbaum was in sympathy with the Free Software Foundation’s goals (insofar as he clearly wanted people to be able to use, update, enhance, and learn from software), he was not in sympathy with the idea of having 40,000 strangers make his software “better.” Or, to put it differently, the goals of Minix remained those of a researcher and a textbook author: to be useful in classrooms and cheap enough to be widely available and usable on the largest number of cheap computers.
By contrast, Torvalds’s “fun” project had no goals. Being a cocky nineteen-year-old student with little better to do (no textbooks to write, no students, grants, research projects, or committee meetings), Torvalds was keen to accept all the ready-made help he could find to make his project better. And with 40,000 Minix users, he had a more or less instant set of contributors. Stallman’s audience for EMACS in the early 1980s, by contrast, was limited to about a hundred distinct computers, which may have translated into thousands, but certainly not tens of thousands of users. Tanenbaum’s work in creating a generation of students who not only understood the internals of an operating system but, more specifically, understood the internals of the UNIX operating system created a huge pool of competent and eager UNIX hackers. It was the work of porting UNIX not only to various machines but to a generation of minds as well that set the stage for this event—and this is an essential, though often overlooked component of the success of Linux.
Many accounts of the Linux story focus on the fight between Torvalds and Tanenbaum, a fight carried out on comp.os.minix with the subject line “Linux is obsolete.”
Both Tanenbaum and Torvalds operated under a model of coordination in which one person was ultimately responsible for the entire project: Tanenbaum oversaw Minix and ensured that it remained true to its goals of serving a pedagogical audience; Torvalds would oversee Linux, but he would incorporate as many different features as users wanted or could contribute. Very quickly—with a pool of 40,000 potential contributors—Torvalds would be in the same position Tanenbaum was in, that is, forced to make decisions about the goals of Linux and about which enhancements would go into it and which would not. What makes the story of Linux so interesting to observers is that it appears that Torvalds made no decision: he accepted almost everything.
Tanenbaum’s goals and plans for Minix were clear and autocratically formed. Control, hierarchy, and restriction are after all appropriate in the classroom. But Torvalds wanted to do more. He wanted to go on learning and to try out alternatives, and with Minix as the only widely available way to do so, his decision to part ways starts to make sense; clearly he was not alone in his desire to explore and extend what he had learned. Nonetheless, Torvalds faced the problem of coordinating a new project and making similar decisions about its direction. On this point, Linux has been the subject of much reflection by both insiders and outsiders. Despite images of Linux as either an anarchic bazaar or an autocratic dictatorship, the reality is more subtle: it includes a hierarchy of contributors, maintainers, and “trusted lieutenants” and a sophisticated, informal, and intuitive sense of “good taste” gained through reading and incorporating the work of co-developers.
While it was possible for Torvalds to remain in charge as an individual for the first few years of Linux (1991-95, roughly), he eventually began to delegate some of that control to people who would make decisions about different subcomponents of the kernel. [pg 220] It was thus possible to incorporate more of the “patches” (pieces of code) contributed by volunteers, by distributing some of the work of evaluating them to people other than Torvalds. This informal hierarchy slowly developed into a formal one, as Steven Weber points out: “The final de facto ‘grant’ of authority came when Torvalds began publicly to reroute relevant submissions to the lieutenants. In 1996 the decision structure became more formal with an explicit differentiation between ‘credited developers’ and ‘maintainers.’ . . . If this sounds very much like a hierarchical decision structure, that is because it is one—albeit one in which participation is strictly voluntary.”
Almost all of the decisions made by Torvalds and lieutenants were of a single kind: whether or not to incorporate a piece of code submitted by a volunteer. Each such decision was technically complex: insert the code, recompile the kernel, test to see if it works or if it produces any bugs, decide whether it is worth keeping, issue a new version with a log of the changes that were made. Although the various official leaders were given the authority to make such changes, coordination was still technically informal. Since they were all working on the same complex technical object, one person (Torvalds) ultimately needed to verify a final version, containing all the subparts, in order to make sure that it worked without breaking.
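The single kind of decision described above can be sketched as a small routine. This is an illustrative reconstruction, not Torvalds’s actual workflow; the `run` callable and the function names are hypothetical, though the `patch` and `make` command strings are typical of kernel work of the period.

```python
def consider(patch_file, run, changelog):
    """Decide one submitted patch. `run(cmd)` returns True on success."""
    if not run(f"patch -p1 < {patch_file}"):       # insert the code
        return "rejected: does not apply"
    if not (run("make") and run("make test")):     # recompile and test
        run(f"patch -R -p1 < {patch_file}")        # back the change out
        return "rejected: build or tests failed"
    changelog.append(patch_file)                   # log it for the next release
    return "accepted"

# Example with a fake runner in which only the test step fails
log = []
result = consider("fix-vfs.diff", lambda cmd: cmd != "make test", log)
print(result)  # rejected: build or tests failed
```

The point of the sketch is the shape of the decision, not the tooling: every submission passes through the same apply, rebuild, test, keep-or-revert cycle, with the changelog as the only record.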
Such decisions had very little to do with any kind of design goals or plans, only with whether the submitted patch “worked,” a term that reflects at once technical, aesthetic, legal, and design criteria that are not explicitly recorded anywhere in the project—hence, the privileging of adaptability over planning. At no point were the patches assigned or solicited, although Torvalds is justly famous for encouraging people to work on particular problems, but only if they wanted to. As a result, the system morphed in subtle, unexpected ways, diverging from its original, supposedly backwards “monolithic” design and into a novel configuration that reflected the interests of the volunteers and the implicit criteria of the leaders.
By 1995-96, Torvalds and lieutenants faced considerable challenges with regard to hierarchy and decision-making, as the project had grown in size and complexity. The first widely remembered response to the ongoing crisis of benevolent dictatorship in Linux was the creation of “loadable kernel modules,” conceived as a way to release some of the constant pressure to decide which patches would be incorporated into the kernel. The decision to modularize [pg 221] Linux was simultaneously technical and social: the software-code base would be rewritten to allow for external loadable modules to be inserted “on the fly,” rather than all being compiled into one large binary chunk; at the same time, it meant that the responsibility to ensure that the modules worked devolved from Torvalds to the creator of the module. The decision repudiated Torvalds’s early opposition to Tanenbaum in the “monolithic vs. microkernel” debate by inviting contributors to separate core from peripheral functions of an operating system (though the Linux kernel remains monolithic compared to classic microkernels). It also allowed for a significant proliferation of new ideas and related projects. It both contracted and distributed the hierarchy; now Linus was in charge of a tighter project, but more people could work with him according to structured technical and social rules of responsibility.
Creating loadable modules changed the look of Linux, but not because of any planning or design decisions set out in advance. The choice is an example of Linux’s privileging of adaptability, resolving the tension between the curiosity and virtuosity of individual contributors and the need for hierarchical control to manage complexity. The commitment to adaptability dissolves the distinction between the technical means of coordination and the social means of management. It is about producing a meaningful whole by which both people and code can be coordinated—an achievement vigorously defended by kernel hackers.
The adaptable organization and structure of Linux is often described in evolutionary terms, as something without teleological purpose, but responding to an environment. Indeed, Torvalds himself has a weakness for this kind of explanation.
Let’s just be honest, and admit that it [Linux] wasn’t designed.
Sure, there’s design too—the design of UNIX made a scaffolding for the system, and more importantly it made it easier for people to communicate because people had a mental model for what the system was like, which means that it’s much easier to discuss changes.
But that’s like saying that you know that you’re going to build a car with four wheels and headlights—it’s true, but the real bitch is in the details.
And I know better than most that what I envisioned 10 years ago has nothing in common with what Linux is today. There was certainly no premeditated design there.
Adaptability does not answer the questions of intelligent design. Why, for example, does a car have four wheels and two headlights? Often these discussions are polarized: either technical objects are designed, or they are the result of random mutations. What this opposition overlooks is the fact that design and the coordination of collaboration go hand in hand; one reveals the limits and possibilities of the other. Linux represents a particular example of such a problematic—one that has become the paradigmatic case of Free Software—but there have been many others, including UNIX, for which the engineers created a system that reflected the distributed collaboration of users around the world even as the lawyers tried to make it conform to legal rules about licensing and practical concerns about bookkeeping and support.
Because it privileges adaptability over planning, Linux is a recursive public, at once an operating system and a social system. It privileges openness to new directions, at every level. It privileges the right to propose changes by actually creating them and trying to convince others to use and incorporate them. It privileges the right to fork the software into new and different kinds of systems. Given what it privileges, Linux ends up evolving differently than do systems whose life and design are constrained by corporate organization, or by strict engineering design principles, or by legal or marketing definitions of products—in short, by clear goals. What makes this distinction between the goal-oriented design principle and the principle of adaptability important is its relationship to politics. Goals and planning are the subject of negotiation and consensus, or of autocratic decision-making; adaptability is the province of critique. It should be remembered that Linux is by no means an attempt to create something radically new; it is a rewrite of a UNIX operating system, as Torvalds points out, but one that through adaptation can end up becoming something new.
The Apache Web server and the Apache Group (now called the Apache Software Foundation) provide a second illuminating example of the how and why of coordination in Free Software of the 1990s. As with the case of Linux, the development of the Apache project illustrates how adaptability is privileged over planning [pg 223] and, in particular, how this privileging is intended to resolve the tensions between individual curiosity and virtuosity and collective control and decision-making. It is also the story of the progressive evolution of coordination, the simultaneously technical and social mechanisms of coordinating people and code, patches and votes.
The Apache project emerged out of a group of users of the original httpd (HyperText Transfer Protocol daemon) Web server created by Rob McCool at NCSA, based on the work of Tim Berners-Lee’s World Wide Web project at CERN. Berners-Lee had written a specification for the World Wide Web that included the mark-up language HTML and the transfer protocol HTTP, along with a library of code, known as libwww, that implemented them, which he had dedicated to the public domain.
The NCSA, at the University of Illinois, Urbana-Champaign, picked up both www projects, subsequently creating both the first widely used browser, Mosaic, directed by Marc Andreessen, and httpd. Httpd was public domain up until version 1.3. Development slowed when McCool was lured to Netscape, along with the team that created Mosaic. By early 1994, when the World Wide Web had started to spread, many individuals and groups ran Web servers that used httpd; some of them had created extensions and fixed bugs. They ranged from university researchers to corporations like Wired Ventures, which launched the online version of its magazine (HotWired.com) in 1994. Most users communicated primarily through Usenet, on the comp.infosystems.www.* newsgroups, sharing experiences, instructions, and updates in the same manner as other software projects stretching back to the beginning of the Usenet and Arpanet newsgroups.
When NCSA failed to respond to most of the fixes and extensions being proposed, a group of several of the most active users of httpd began to communicate via a mailing list called new-httpd in 1995. The list was maintained by Brian Behlendorf, the webmaster for HotWired, on a server he maintained called hyperreal; its participants were those who had debugged httpd, created extensions, or added functionality. The list was the primary means of association and communication for a diverse group of people from various locations around the world. During the next year, participants hashed out issues of coordination: the identity of the “new” httpd, version 1.3, and the processes involved in patching it.
Patching a piece of software is a peculiar activity, akin to debugging, but more like a form of ex post facto design. Patching covers the spectrum of changes that can be made: from fixing security holes and bugs that prevent the software from compiling to feature and performance enhancements. A great number of the patches that initially drew this group together grew out of needs that each individual member had in making a Web server function. These patches were not due to any design or planning decisions by NCSA, McCool, or the assembled group, but most were useful enough that everyone gained from using them, because they fixed problems that everyone would or could encounter. As a result, the need for a coordinated new-httpd release was key to the group’s work. This new version of NCSA httpd had no name initially, but apache was a persistent candidate; the somewhat apocryphal origin of the name is that it was “a patchy webserver.”
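A patch, in the sense used here, is simply a recorded set of line-level changes to a file. The idea can be illustrated with Python’s standard `difflib` (an illustrative analogy; the group itself worked with the Unix `diff` and `patch` tools, and the configuration lines below are invented for the example):

```python
import difflib

# Two versions of a hypothetical server configuration file
old = ["Listen 80\n", "ServerName localhost\n"]
new = ["Listen 80\n", "ServerName hyperreal\n", "KeepAlive On\n"]

# A unified diff records exactly which lines were removed (-) and added (+)
patch = list(difflib.unified_diff(old, new,
                                  fromfile="httpd.conf", tofile="httpd.conf"))
print("".join(patch))
```

Anyone holding the old file plus this recorded diff can reconstruct the new one, which is why a mailing list exchanging small text patches could coordinate work on a shared code base.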
At the outset, in February and March 1995, the pace of work of the various members of new-httpd differed a great deal, but was in general extremely rapid. Even before there was an official release of a new httpd, process issues started to confront the group, as Roy Fielding later explained: “Apache began with a conscious attempt to solve the process issues first, before development even started, because it was clear from the very beginning that a geographically distributed set of volunteers, without any traditional organizational ties, would require a unique development process in order to make decisions.”
The need for process arose more or less organically, as the group developed mechanisms for managing the various patches: assigning them IDs, testing them, and incorporating them “by hand” into the main source-code base. As this happened, members of the list would occasionally find themselves lost, confused by the process or the efficiency of other members, as in this message from Andrew Wilson concerning Cliff Skolnick’s management of the list of bugs:
Cliff, can you concentrate on getting an uptodate copy of the bug/improvement list please. I’ve already lost track of just what the heck is meant to be going on. Also what’s the status of this pre-pre-pre release Apache stuff. It’s either a pre or it isn’t surely? AND is the pre-pre-etc thing the same as the thing Cliff is meant to be working on?
Just what the fsck is going on anyway? Ay, ay ay! Andrew Wilson.
To which Rob Harthill replied, “It is getting messy. I still think we should all implement one patch at a time together. At the rate (and hours) some are working we can probably manage a couple of patches a day. . . . If this is acceptable to the rest of the group, I think we should order the patches, and start a systematic processes of discussion, implementations and testing.”
Some members found the pace of work exciting, while others appealed for slowing or stopping in order to take stock. Cliff Skolnick created a system for managing the patches and proposed that list members vote to determine which patches would be included.
Here are my votes for the current patch list shown at http://www.hyperreal.com/httpd/patchgen/list.cgi
I’ll use a vote of
-1 have a problem with it
0 haven’t tested it yet (failed to understand it or whatever)
+1 tried it, liked it, have no problem with it.
[Here Harthill provides a list of votes on each patch.]
If this voting scheme makes sense, lets use it to filter out the stuff we’re happy with. A “-1” vote should veto any patch. There seems to be about 6 or 7 of us actively commenting on patches, so I’d suggest that once a patch gets a vote of +4 (with no vetos), we can add it to an alpha.
Harthill’s votes immediately instigated discussion about various patches, further voting, and discussion about the process (i.e., how many votes or vetoes were needed), all mixed together in a flurry of e-mail messages. The voting process was far from perfect, but it did allow some consensus on what “apache” would be, that is, which patches would be incorporated into an “official” (though not very public) release: Apache 0.2 on 18 March.
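The rule proposed in the e-mail is simple enough to state precisely. The following sketch is my own (the function name and the representation of votes are illustrative, not part of any actual Apache tooling); it captures the veto-and-threshold logic of the patch-and-vote system:

```python
def patch_accepted(votes):
    """Apply the proposed patch-and-vote rule to a list of votes.

    Each vote is -1 (have a problem with it), 0 (haven't tested it yet),
    or +1 (tried it, liked it). A single -1 vetoes the patch outright;
    otherwise it needs at least four +1 votes to be added to an alpha.
    """
    if -1 in votes:
        return False
    return votes.count(+1) >= 4
```

The interesting property of the rule is its asymmetry: approval is cumulative and collective, but rejection is individual and absolute, which is one concrete way the system balanced group process against individual expertise.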
The significance of the patch-and-vote system was that it clearly represented the tension between the virtuosity of individual developers and a group process aimed at creating and maintaining a common piece of software. It was a way of balancing each individual’s expertise against a common desire to ship and promote a stable, bug-free, public-domain Web server. As Roy Fielding and others would describe it in hindsight, this tension was part of Apache’s advantage.
Although the Apache Group makes decisions as a whole, all of the actual work of the project is done by individuals. The group does not write code, design solutions, document products, or provide support to our customers; individual people do that. The group provides an environment for collaboration and an excellent trial-by-fire for ideas and code, but the creative energy needed to solve a particular problem, redesign a piece of the system, or fix a given bug is almost always contributed by individual volunteers working on their own, for their own purposes, and not at the behest of the group. Competitors mistakenly assume Apache will be unable to take on new or unusual tasks because of the perception that we act as a group rather than follow a single leader. What they fail to see is that, by remaining open to new contributors, the group has an unlimited supply of innovative ideas, and it is the individuals who chose to pursue their own ideas who are the real driving force for innovation.
Although openness is widely touted as the key to the innovations of Apache, the claim is somewhat disingenuous: patches are just that, patches. Any large-scale changes to the code could not be accomplished by applying patches, especially if each patch must be subjected to a relatively harsh vote to be included. The only way to make sweeping changes—especially changes that require iteration and testing to get right—is to engage in separate “branches” of a project or to differentiate between internal and external releases—in short, to fork the project temporarily in hopes that it would soon rejoin its stable parent. Apache encountered this problem very early on with the “Shambhala” rewrite of httpd by Robert Thau. [pg 227]
Shambhala was never quite official: Thau called it his “noodling” server, or a “garage” project. It started as his attempt to rewrite httpd as a server which could handle and process multiple requests at the same time. As an experiment, it was entirely his own project, which he occasionally referred to on the new-httpd list: “Still hacking Shambhala, and laying low until it works well enough to talk about.”
Harthill had assumed that the NCSA code-base was “tried and tested” and that Shambhala represented a split, a fork: “The question is, should we all go in one direction, continue as things stand or Shambahla [sic] goes off on its own?”
Maybe it was rst’s [Robert Thau’s] choice of phrases, such as “garage project” and it having a different name, maybe I didn’t read his mailings thoroughly enough, maybe they weren’t explicit enough, whatever. . . . It’s a shame that nobody using Shambhala (who must have realized what was going on) didn’t raise these issues weeks ago. I can only presume that rst was too modest to push Shambhala, or at least discussion of it, onto us more vigourously. I remember saying words to the effect of “this is what I plan to do, stop me if you think this isn’t a good idea.” Why the hell didn’t anyone say something? . . . [D]id others get the same impression about rst’s work as I did? Come on people, if you want to be part of this group, collaborate!
Harthill’s injunction to collaborate seems surprising in the context of a mailing list and project created to facilitate collaboration, but the injunction is specific: collaborate by making plans and sharing goals. Implicit in his words is the tension between a project with clear plans and goals, an overarching design to which everyone contributes, as opposed to a group platform without clear goals that provides individuals with a setting to try out alternatives. Implicit in his words is the spectrum between debugging an existing piece of software with a stable identity and rewriting the fundamental aspects of it to make it something new. The meaning of collaboration bifurcates here: on the one hand, the privileging of the autonomous work of individuals which is submitted to a group peer review and then incorporated; on the other, the privileging of a set of shared goals to which the actions and labor of individuals is subordinated.
Indeed, the very design of Shambhala reflects the former approach of privileging individual work: like UNIX and EMACS before it, Shambhala was designed as a modular system, one that could “make some of that process [the patch-and-vote process] obsolete, by allowing stuff which is not universally applicable (e.g., database back-ends), controversial, or just half-baked, to be shipped anyway as optional modules.”
In the case of Apache one can see how coordination in Free Software is not just an afterthought or a necessary feature of distributed work, but is in fact at the core of software production itself, governing the norms and forms of life that determine what will count as good software, how it will progress with respect to a context and [pg 229] background, and how people will be expected to interact around the topic of design decisions. The privileging of adaptability brings with it a choice in the mode of collaboration: it resolves the tension between the agonistic competitive creation of software, such as Robert Thau’s creation of Shambhala, and the need for collective coordination of complexity, such as Harthill’s plea for collaboration to reduce duplicated or unnecessary work.
The technical and social forms that Linux and Apache take are enabled by the tools they build and use, from bug-tracking tools and mailing lists to the Web servers and kernels themselves. One such tool plays a very special role in the emergence of these organizations: Source Code Management systems (SCMs). SCMs are tools for coordinating people and code; they allow multiple people in dispersed locales to work simultaneously on the same object, the same source code, without the need for a central coordinating overseer and without the risk of stepping on each other’s toes. The history of SCMs—especially in the case of Linux—also illustrates the recursive-depth problem: namely, is Free Software still free if it is created with non-free tools?
SCM tools, like the Concurrent Versions System (cvs) and Subversion, have become extremely common tools for Free Software programmers; indeed, it is rare to find a project, even a project conducted by only one individual, which does not make use of these tools. Their basic function is to allow two or more programmers to work on the same files at the same time and to provide feedback on where their edits conflict. When the number of programmers grows large, an SCM can become a tool for managing complexity. It keeps track of who has “checked out” files; it enables users to lock files if they want to ensure that no one else makes changes at the same time; it can keep track of and display the conflicting changes made by two users to the same file; it can be used to create “internal” forks or “branches” that may be incompatible with each other, but still allows programmers to try out new things and, if all goes well, merge the branches into the trunk later on. In sophisticated forms it can be used to “animate” successive changes to a piece of code, in order to visualize its evolution. [pg 230]
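The conflict detection at the heart of these tools can be illustrated in miniature. The sketch below is a simplification of my own (real systems such as cvs operate on diffs and merge non-overlapping changes automatically); it merely flags a conflict when two concurrent edits touch the same lines of a shared base file:

```python
import difflib

def changed_lines(base, edited):
    """Return the indices of lines in `base` touched by an edit."""
    touched = set()
    matcher = difflib.SequenceMatcher(a=base, b=edited)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            # For pure insertions (i1 == i2), mark the insertion point.
            touched.update(range(i1, i2) if i2 > i1 else {i1})
    return touched

def conflicts(base, edit_a, edit_b):
    """Two concurrent edits conflict if they touch the same base lines."""
    return bool(changed_lines(base, edit_a) & changed_lines(base, edit_b))
```

Two developers who edit different lines can both commit without coordinating; only when their edits overlap must the system stop and demand a human judgment, which is precisely where the “good taste” of trusted committers re-enters the picture.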
Beyond mere coordination functions, SCMs are also used as a form of distribution; generally SCMs allow anyone to check out the code, but restrict those who can check in or “commit” the code. The result is that users can get instant access to the most up-to-date version of a piece of software, and programmers can differentiate between stable releases, which have few bugs, and “unstable” or experimental versions that are under construction and will need the help of users willing to test and debug the latest versions. SCM tools automate certain aspects of coordination, not only reducing the labor involved but opening up new possibilities for coordination.
The genealogy of SCMs can be seen in the example of Ken Thompson’s creation of a diff tape, which he used to distribute changes that had been contributed to UNIX. Where Thompson saw UNIX as a spectrum of changes and the legal department at Bell Labs saw a series of versions, SCM tools combine these two approaches by minutely managing the revisions, assigning each change (each diff) a new version number, and storing the history of all of those changes so that software changes might be precisely undone in order to discover which changes cause problems. Written by Douglas McIlroy, “diff” is itself a piece of software, one of the famed small UNIX tools that do one thing well. The program diff compares two files, line by line, and prints out the differences between them in a structured format (showing a series of lines with codes that indicate changes, additions, or removals). Given two versions of a text, one could run diff to find the differences and make the appropriate changes to synchronize them, a task that is otherwise tedious and, given the exactitude of source code, prone to human error. A useful side-effect of diff (when combined with an editor like ed or EMACS) is that when someone makes a set of changes to a file and runs diff on both the original and the changed file, the output (i.e., the changes only) can be used to reconstruct the original file from the changed file. Diff thus allows for a clever, space-saving way to store the history of a file: rather than retaining full copies of every new version, one saves only the changes. Ergo, version control. diff—and programs like it—became the basis for managing the complexity of large numbers of programmers working on the same text at the same time.
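The round trip just described—a record of changes sufficing to move between versions—can be demonstrated with Python’s standard difflib module, with one caveat: difflib’s ndiff format keeps context lines alongside the changes, whereas the classic diff emits only the changed hunks. The principle is the same:

```python
import difflib

original = ["int main() {\n", "    return 0;\n", "}\n"]
changed = ["int main() {\n", '    puts("hello");\n', "    return 0;\n", "}\n"]

# A structured, line-by-line record of what changed between the two
# versions, analogous to diff's output.
delta = list(difflib.ndiff(original, changed))

# The delta alone suffices to recover either version: keep one copy
# plus the accumulated deltas, and every version is reconstructible.
assert list(difflib.restore(delta, 1)) == original
assert list(difflib.restore(delta, 2)) == changed
```

Storing one copy of a file plus its chain of deltas, rather than a full copy per version, is the space-saving insight on which RCS, cvs, and their successors were built.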
One of the first attempts to formalize version control was Walter Tichy’s Revision Control System (RCS), from 1985.
In order to add sophistication to RCS, Dick Grune, at the Vrije Universiteit, Amsterdam, began writing scripts that used RCS as a multi-user, Internet-accessible version-control system, a system that eventually became the Concurrent Versions System. cvs allowed multiple users to check out a copy, make changes, and then commit those changes, and it would check for and either prevent or flag conflicting changes. Ultimately, cvs became most useful when programmers could use it remotely to check out source code from anywhere on the Internet. It allowed people to work at different speeds, different times, and in different places, without needing a central person in charge of checking and comparing the changes. cvs decentralized the labor of version control for very-large-scale collaboration: developers could work independently on their own copies of the software, synchronize with the most updated version, and yet still be working on the same object.
Both the Apache project and the Linux kernel project use SCMs. In the case of Apache the original patch-and-vote system quickly began to strain the patience, time, and energy of participants as the number of contributors and patches began to grow. From the very beginning of the project, the contributor Paul Richards had urged the group to make use of cvs. He had extensive experience with the system in the FreeBSD project and was convinced that it provided a superior alternative to the patch-and-vote system. Few other contributors had much experience with it, however, so it wasn’t until over a year after Richards began his admonitions that cvs was eventually adopted. However, cvs is not a simple replacement for a patch-and-vote system; it necessitates a different kind of organization. Richards recognized the trade-off. The patch-and-vote system created a very high level of quality assurance and peer review of the patches that people submitted, while the cvs system allowed individuals to make more changes that might not meet the same level of quality assurance. The cvs system allowed branches—stable, testing, experimental—with different levels of quality assurance, while the patch-and-vote system was inherently directed at one final and stable version. As the case of Shambhala [pg 232] exhibited, under the patch-and-vote system experimental versions would remain unofficial garage projects, rather than serve as official branches with people responsible for committing changes.
While SCMs are in general good for managing conflicting changes, they can do so only up to a point. Allowing anyone to commit a change could result in a chaotic mess, just as difficult to disentangle as it would be without an SCM. In practice, therefore, most projects designate a handful of people as having the right to “commit” changes. The Apache project retained its voting scheme, for instance, but it became a way of voting for “committers” instead of for patches themselves. Trusted committers—those with the mysterious “good taste,” or technical intuition—became the core members of the group.
The Linux kernel has also struggled with various issues surrounding SCMs and the management of responsibility they imply. The story of the so-called VGER tree and the creation of a new SCM called Bitkeeper is exemplary in this respect.
A great deal of yelling ensued, as nicely captured in Moody’s Rebel Code, culminating in the famous phrase, uttered by Larry McVoy: “Linus does not scale.” The meaning of this phrase is that the ability of Linux to grow into an ever larger project with increasing complexity, one which can handle myriad uses and functions (to “scale” up), is constrained by the fact that there is only one Linus Torvalds. By all accounts, Linus was and is excellent at what he does—but there is only one Linus. The danger of this situation is the danger of a fork. A fork would mean one or more new versions would proliferate under new leadership, a situation much like [pg 233] the spread of UNIX. Both the licenses and the SCMs are designed to facilitate this, but only as a last resort. Forking also implies dilution and confusion—competing versions of the same thing and potentially unmanageable incompatibilities.
The fork never happened, but only because Linus went on vacation, returning renewed and ready to continue and to be more responsive. But the crisis had been real, and it drove developers to consider new modes of coordination. Larry McVoy offered to create a new form of SCM, one that would allow a much more flexible response to the problem that the VGER tree represented. However, his proposed solution, called Bitkeeper, would create far more controversy than the problem that precipitated it.
McVoy was well-known in geek circles before Linux. In the late stages of the open-systems era, as an employee of Sun, he had penned an important document called “The Sourceware Operating System Proposal.” It was an internal Sun Microsystems document that argued for the company to make its version of UNIX freely available. It was a last-ditch effort to save the dream of open systems. It was also the first such proposition within a company to “go open source,” much like the documents that would urge Netscape to Open Source its software in 1998. Despite this early commitment, McVoy chose not to create Bitkeeper as a Free Software project, but to make it quasi-proprietary, a decision that raised a central ideological question: can one, or should one, create Free Software using non-free tools?
On one side of this controversy, naturally, was Richard Stallman and those sharing his vision of Free Software. On the other were pragmatists like Torvalds claiming no goals and no commitment to “ideology”—only a commitment to “fun.” The tension laid bare the way in which recursive publics negotiate and modulate the core components of Free Software from within. Torvalds made a very strong and vocal statement concerning this issue, responding to Stallman’s criticisms about the use of non-free software to create Free Software: “Quite frankly, I don’t _want_ people using Linux for ideological reasons. I think ideology sucks. This world would be a much better place if people had less ideology, and a whole lot more ‘I do this because it’s FUN and because others might find it useful, not because I got religion.’”
Torvalds emphasizes pragmatism in terms of coordination: the right tool for the job is the right tool for the job. In terms of licenses, [pg 234] however, such pragmatism does not apply, and Torvalds has always been strongly committed to the GPL, refusing to let non-GPL software into the kernel. This strategic pragmatism is in fact a recognition of where experimental changes might be proposed, and where practices are settled. The GPL was a stable document, sharing source code widely was a stable practice, but coordinating a project using SCMs was, during this period, still in flux, and thus Bitkeeper was a tool well worth using so long as it remained suitable to Linux development. Torvalds was experimenting with the meaning of coordination: could a non-free tool be used to create Free Software?
McVoy, on the other hand, was on thin ice. He was experimenting with the meaning of Free Software licenses. He created three separate licenses for Bitkeeper in an attempt to play both sides: a commercial license for paying customers, a license for people who sell Bitkeeper, and a license for “free users.” The free-user license allowed Linux developers to use the software for free—though it required them to use the latest version—and prohibited them from working on a competing project at the same time. McVoy’s attempt to have his cake and eat it, too, created enormous tension in the developer community, a tension that built from 2002, when Torvalds began using Bitkeeper in earnest, to 2005, when he announced he would stop.
The tension came from two sources: the first was debate among developers over the moral question of using non-free software to create Free Software. The moral question, as ever, was also a technical one, as the second source of tension, the license restrictions, would reveal.
The developer Andrew Tridgell, well known for his work on a project called Samba and his reverse engineering of a Microsoft networking protocol, began a project to reverse engineer Bitkeeper by looking at the metadata it produced in the course of being used for the Linux project. By doing so, he crossed a line set up by McVoy’s experimental licensing arrangement: the “free as long as you don’t copy me” license. Lawyers advised Tridgell to stay silent on the topic while Torvalds publicly berated him for “willful destruction” and a moral lapse of character in trying to reverse engineer Bitkeeper. Bruce Perens defended Tridgell and censured Torvalds for his seemingly contradictory ethics.
The story of the VGER tree and Bitkeeper illustrates common tensions within recursive publics, specifically, the depth of the meaning of free. On the one hand, there is Linux itself, an exemplary Free Software project made freely available; on the other hand, however, there is the ability to contribute to this process, a process that is potentially constrained by the use of Bitkeeper. So long as the function of Bitkeeper is completely circumscribed—that is, completely planned—there can be no problem. However, the moment one user sees a way to change or improve the process, and not just the kernel itself, then the restrictions and constraints of Bitkeeper can come into play. While it is not clear that Bitkeeper actually prevented anything, it is clear that developers recognized it as a potential drag on a generalized commitment to adaptability. Or to put it in terms of recursive publics, only one layer is properly open, that of the kernel itself; the layer beneath it, the process of its construction, is not free in the same sense. It is ironic that Torvalds—otherwise the spokesperson for antiplanning and adaptability—willingly adopted this form of constraint, but not at all surprising that it was collectively rejected.
The Bitkeeper controversy can be understood as a kind of experiment, a modulation on the one hand of the kinds of acceptable licenses (by McVoy) and on the other of acceptable forms of coordination (Torvalds’s decision to use Bitkeeper). The experiment was a failure, but a productive one, as it identified one kind of non-free software that is not safe to use in Free Software development: the SCM that coordinates the people and the code they contribute. In terms of recursive publics the experiment identified the proper depth of recursion. Although it might be possible to create Free Software using some kinds of non-free tools, SCMs are not among them; both the software created and the software used to create it need to be free.
The Bitkeeper controversy illustrates again that adaptability is not about radical invention, but about critique and response. Whereas controlled design and hierarchical planning represent the domain of governance—control through goal-setting and orientation of a collective or a project—adaptability privileges politics, properly speaking, the ability to critique existing design and to [pg 236] propose alternatives without restriction. The tension between goal-setting and adaptability is also part of the dominant ideology of intellectual property. According to this ideology, IP laws promote invention of new products and ideas, but restrict the re-use or transformation of existing ones; defining where novelty begins is a core test of the law. McVoy made this tension explicit in his justifications for Bitkeeper: “Richard [Stallman] might want to consider the fact that developing new software is extremely expensive. He’s very proud of the collection of free software, but that’s a collection of re-implementations, but no profoundly new ideas or products. . . . What if the free software model simply can’t support the costs of developing new ideas?”
Novelty, both in the case of Linux and in intellectual property law more generally, is directly related to the interplay of social and technical coordination: goal direction vs. adaptability. The ideal of adaptability promoted by Torvalds suggests a radical alternative to the dominant ideology of creation embedded in contemporary intellectual-property systems. If Linux is “new,” it is new through adaptation and the coordination of large numbers of creative contributors who challenge the “design” of an operating system from the bottom up, not from the top down. By contrast, McVoy represents a moral imagination of design in which it is impossible to achieve novelty without extremely expensive investment in top-down, goal-directed, unpolitical design—and it is this activity that the intellectual-property system is designed to reward. Both are engaged, however, in an experiment; both are engaged in “figuring out” what the limits of Free Software are.
Many popular accounts of Free Software skip quickly over the details of its mechanism to suggest that it is somehow inevitable or obvious that Free Software should work—a self-organizing, emergent system that manages complexity through distributed contributions by hundreds of thousands of people. In The Success of Open Source Steven Weber points out that when people refer to Open Source as a self-organizing system, they usually mean something more like “I don’t understand how it works.”
Eric Raymond, for instance, suggests that Free Software is essentially the emergent, self-organizing result of “collaborative debugging”: “Given enough eyeballs, all bugs are shallow.”
However, the actual practice and meaning of collective or collaborative debugging is incredibly elastic. Sometimes debugging means fixing an error; sometimes it means making the software do something different or new. (A common joke, often made at Microsoft’s expense, captures some of this elasticity: whenever something doesn’t seem to work right, one says, “That’s a feature, not a bug.”) Some programmers see a design decision as a stupid mistake and take action to correct it, whereas others simply learn to use the software as designed. Debugging can mean something as simple as reading someone else’s code and helping them understand why it does not work; it can mean finding bugs in someone else’s software; it can mean reliably reproducing bugs; it can mean pinpointing the cause of the bug in the source code; it can mean changing the source to eliminate the bug; or it can, at the limit, mean changing or even re-creating the software to make it do something different or better.
Coordination in Free Software is about adaptability over planning. It is a way of resolving the tension between individual virtuosity in creation and the social benefit in shared labor. If all software were created, maintained, and distributed only by individuals, coordination would be superfluous, and software would indeed be part of the domain of poetry. But even the paradigmatic cases of virtuosic creation—EMACS by Richard Stallman, UNIX by Ken Thompson and Dennis Ritchie—clearly represent the need for creative forms [pg 238] of coordination and the fundamental practice of reusing, reworking, rewriting, and imitation. UNIX was not created de novo, but was an attempt to streamline and rewrite Multics, itself a system that evolved out of Project MAC and the early mists of time-sharing and computer hacking.
UNIX was initially ported and shared through mixed academic and commercial means, through the active participation of computer scientists who both received updates and contributed fixes back to Thompson and Ritchie. No formal system existed to manage this process. When Thompson speaks of his understanding of UNIX as a “spectrum” and not as a series of releases (V1, V2, etc.), the implication is that work on UNIX was continuous, both within Bell Labs and among its widespread users. Thompson’s use of the diff tape encapsulates the core problem of coordination: how to collect and redistribute the changes made to the system by its users.
Similarly, Bill Joy’s distribution of BSD and James Gosling’s distribution of GOSMACS were both ad hoc, noncorporate experiments in “releasing early and often.” These distribution schemes had a purpose (beyond satisfying demand for the software). The frequent distribution of patches, fixes, and extensions eased the pain of debugging software and satisfied users’ demands for new features and extensions (by allowing them to do both themselves). Had Thompson and Ritchie followed the conventional corporate model of software production, they would have been held responsible for thoroughly debugging and testing the software they distributed, and AT&T or Bell Labs would have been responsible for coming up with all innovations and extensions as well, based on marketing and product research. Such an approach would have sacrificed adaptability in favor of planning. But Thompson’s and Ritchie’s model was different: both the extension and the debugging of software became shared responsibilities of the users and the developers. Stallman’s creation of EMACS followed a similar pattern; since EMACS was by design extensible and intended to satisfy myriad unforeseen needs, the responsibility rested on the users to address those needs, and sharing their extensions and fixes had obvious social benefit.
The ability to see development of software as a spectrum implies more than just continuous work on a product; it means seeing the [pg 239] product itself as something fluid, built out of previous ideas and products and transforming, differentiating into new ones. Debugging, from this perspective, is not separate from design. Both are part of a spectrum of changes and improvements whose goals and direction are governed by the users and developers themselves, and the patterns of coordination they adopt. It is in the space between debugging and design that Free Software finds its niche.
Coordination is a key component of Free Software, and is frequently identified as the central component. Free Software is the result of a complicated story of experimentation and construction, and the forms that coordination takes in Free Software are specific outcomes of this longer story. Apache and Linux are both experiments—not scientific experiments per se but collective social experiments in which there are complex technologies and legal tools, systems of coordination and governance, and moral and technical orders already present.
Free Software is an experimental system, a practice that changes with the results of new experiments. The privileging of adaptability makes it a peculiar kind of experiment, however, one not directed by goals, plans, or hierarchical control, but more like what John Dewey suggested throughout his work: the experimental praxis of science extended to the social organization of governance in the service of improving the conditions of freedom. What gives this experimentation significance is the centrality of Free Software—and specifically of Linux and Apache—to the experimental expansion of the Internet. As an infrastructure or a milieu, the Internet is changing the conditions of social organization, changing the relationship of knowledge to power, and changing the orientation of collective life toward governance. Free Software is, arguably, the best example of an attempt to make this transformation public, to ensure that it uses the advantages of adaptability as critique to counter the power of planning as control. Free Software, as a recursive public, proceeds by proposing and providing alternatives. It is a bit like Kant’s version of enlightenment: insofar as geeks speak (or hack) as scholars, in a public realm, they have a right to propose criticisms and changes of any sort; as soon as they relinquish [pg 240] that commitment, they become private employees or servants of the sovereign, bound by conscience and power to carry out the duties of their given office. The constitution of a public realm is not a universal activity, however, but a historically specific one: Free Software confronts the specific contemporary technical and legal infrastructure by which it is possible to propose criticisms and offer alternatives. 
What results is a recursive public filled not only with individuals who govern their own actions but also with code and concepts and licenses and forms of coordination that turn these actions into viable, concrete technical forms of life useful to inhabitants of the present.
Copyright: © 2008 Duke University Press
License: Licensed under the Creative Commons Attribution-NonCommercial-Share Alike License, available at http://creativecommons.org/licenses/by-nc-sa/3.0/ or by mail from Creative Commons, 559 Nathan Abbott Way, Stanford, Calif. 94305, U.S.A. "NonCommercial" as defined in this license specifically excludes any sale of this work or any portion thereof for money, even if sale does not result in a profit by the seller or if the sale is by a 501(c)(3) nonprofit or NGO.
Two Bits is accessible on the Web at twobits.net.