John Holland is a gnomic figure of indeterminate age who once
worked on the world's earliest computers, and who now teaches at the
University of Michigan. He was the first to invent a mathematical method
of describing evolution's optimizing ability in a form that could be
easily programmed on a computer. Because of the way his math mimicked
the effects of genetic information, Holland called his methods genetic
algorithms, or GAs for short.
Holland, unlike Tom Ray, started
with sex. Holland's genetic algorithms took two strings of DNA-like
computer code that did a job fairly well and recombined the two at
random in a sexual swap to see if the new offspring code might do a
little better. In designing his system, Holland had to overcome the same
looming obstacle that Ray faced: any random generation of a computer
program would most likely produce not a program that was either slightly
better or slightly worse, but one that was not sensible at all.
Statistically, successive random mutations to a working code were bound
to produce successive crashes.
Theoretical biologists discovered in the early 1960s that mating, rather
than mutating, made for a more robust computer evolution -- one that
birthed a higher proportion of sensible entities. But sexual mating alone was
too restrictive in what it could come up with. In the mid-1960s Holland
devised his GAs; these relied chiefly on mating and secondarily on
mutation as a background instigator. With sex and mutation combined, the
system was both flexible and wide-ranging.
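To make those two operators concrete, here is a minimal sketch in Python -- an illustration, not Holland's original code. The bit-string encoding, the single random cut point, and the 1 percent mutation rate are assumptions chosen for the example; crossover does the chief work of mating, and mutate supplies the background instigation.

    import random

    def crossover(parent_a, parent_b):
        # Sexual swap: splice two parent strings at a random cut point
        point = random.randrange(1, len(parent_a))
        return parent_a[:point] + parent_b[point:]

    def mutate(genome, rate=0.01):
        # Background mutation: flip each bit with a small probability
        return [bit ^ 1 if random.random() < rate else bit for bit in genome]

    # Two parent "code strings" that already do the job fairly well
    parent_a = [random.randint(0, 1) for _ in range(32)]
    parent_b = [random.randint(0, 1) for _ in range(32)]

    # Offspring: chiefly mating, secondarily mutation
    offspring = mutate(crossover(parent_a, parent_b))

Testing the offspring against whatever task the parents were bred for then tells you whether the new code does the job a little better.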
Like many other systems thinkers, Holland sees the tasks of nature and
the job of computers as similar. "Living organisms are consummate
problem solvers," Holland wrote in a summary of his work. "They exhibit
a versatility that puts the best computer programs to shame. This
observation is especially galling for computer scientists, who may spend
months or years of intellectual effort on an algorithm, whereas
organisms come by their abilities through the apparently undirected
mechanism of evolution and natural selection."
The evolutionary approach, Holland wrote, "eliminates one of the
greatest hurdles in software design: specifying in advance all the
features of a problem." Anywhere you have many conflicting, interlinked
variables and a broadly defined goal where the solutions may be myriad,
evolution is the answer.
Just as evolution deals in populations of individuals, genetic
algorithms mimic nature by evolving huge churning populations of code,
all processing and mutating at once. GAs are swarms of slightly
different strategies trying to simultaneously hill-climb over a rugged
landscape. Because a multitude of code strings "climb" in parallel, the
population visits many regions of the landscape concurrently. This
makes it far less likely to miss the Big Peak.
Implicit parallelism is the magic by which evolutionary processes
improve the odds of climbing not just any peak but the tallest peak. How do
you locate the global optimum? By testing bits of the entire landscape at
once. How do you optimally balance a thousand counteracting variables
in a complex problem? By sampling a thousand combinations at once. How
do you develop an organism that can survive harsh conditions? By running
a thousand slightly varied individuals at once.
In Holland's scheme, the highest-performing bits of code anywhere on the
landscape mate with each other. Since high performance increases the
assigned rate of mating in that area, this focuses the attention of the
genetic algorithm system on the most promising areas in the overall
landscape. It also diverts computational cycles away from unpromising
areas. Thus parallelism sweeps a large net over the problem landscape
while reducing the number of code strings that need manipulating to
locate the peaks.
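The focusing Holland describes is what GA programmers call fitness-proportionate (or roulette-wheel) selection: strings that score well are picked as parents more often, so each new generation spends its computation on the promising regions of the landscape. The sketch below -- again an illustrative assumption rather than Holland's own implementation, reusing the crossover and mutate operators from the earlier sketch -- shows the generation loop; a fitness function that simply counts 1-bits stands in for whatever real scoring a problem would demand.

    import random

    POP_SIZE, GENOME_LEN, MUTATION_RATE = 100, 32, 0.01

    def fitness(genome):
        # Stand-in scoring: count the 1-bits (a toy "ones-max" problem)
        return sum(genome)

    def select(population, scores):
        # Fitness-proportionate (roulette-wheel) choice of a parent
        return random.choices(population, weights=scores, k=1)[0]

    def crossover(a, b):
        point = random.randrange(1, len(a))
        return a[:point] + b[point:]

    def mutate(genome):
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit
                for bit in genome]

    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(POP_SIZE)]

    for generation in range(50):
        # The whole population is scored at once -- the parallel "climb"
        scores = [fitness(g) for g in population]
        # High scorers mate more often, steering effort toward promising regions
        population = [mutate(crossover(select(population, scores),
                                       select(population, scores)))
                      for _ in range(POP_SIZE)]

Printing max(scores) each generation shows the population's best string ratcheting toward the peak while poor regions of the landscape quietly lose their share of the computing.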
Parallelism is one of the ways around the inherent stupidity and
blindness of random mutations. It is the great irony of life that a
mindless act repeated in sequence can only lead to greater depths of
absurdity, while a mindless act performed in parallel by a swarm of
individuals can, under the proper conditions, lead to all that we find
interesting.
John Holland invented genetic algorithms while studying the mechanics of
adaptation in the 1960s. His work was ignored until the late 1980s by
all but a dozen wild-eyed computer grad students. A couple of other
researchers, such as the engineers Lawrence Fogel and Hans Bremermann,
independently played around with mechanical evolution of populations in
the 1960s; they enjoyed equal indifference from the science community.
Michael Conrad, a computer scientist now at Wayne State University,
Michigan, also drifted from the study of adaptation to modeling evolving
populations in computers in the 1970s, and met the same silence that
Holland did a decade earlier. The totality of this work was obscure to
computer science and completely unknown in biology.
No more than a couple of students wrote theses on GAs until Holland's
book Adaptation in Natural and Artificial Systems, about GAs and
evolution, appeared in 1975. The book sold only 2,500 copies until it was
reissued in 1992. Between 1972 and 1982, no more than two dozen articles
on GAs were published in all of science. You could not even say
computational evolution had a cult following.
The lack of interest from biology was understandable (but not
commendable); biologists reasoned that nature was far too complex to be
meaningfully represented by computers of that time. The lack of interest
from computer science is more baffling. In my research for this book I
was often perplexed as to why such a fundamental process as computational
evolution could be so wholly ignored. I now believe the disregard stems
from the messy parallelism inherent in evolution and the fundamental
conflict it presented to the reigning dogma of computers: the von
Neumann serial program.
The first functioning electronic computer was the ENIAC, which was
booted up in 1945 to perform ballistic calculations for the U.S. Army. The
ENIAC was an immense jumble of 18,000 hot vacuum tubes, 70,000
resistors, and 10,000 capacitors. The instructions for the machine were
communicated to it by setting 6,000 switches by hand and then turning
the program on. In essence the machine calculated all its values
simultaneously in a parallel fashion. It was a bear to program.
The genius von Neumann radically altered this awkward programming system
for the EDVAC, the ENIAC's successor and the first general-purpose
computer with a stored program. Von Neumann had been thinking about
systemic logic since the age of 24 when he published his first papers
(in 1927) on mathematical logic systems and game theory. Working with
the EDVAC computer group, he invented a way to control the slippery
calculations needed to program a machine that could solve more than one
problem. Von Neumann proposed that a problem be broken into discrete
logical steps, much like the steps in a long division problem, and that
intermediate values in the task be stored temporarily in the computer in
such a way that those values could be considered input for the next
portion of the problem. By feeding the calculation back through a
loop (or what is now called a subroutine), and storing
the logic of the program in the machine so that it could interact with
the answer, von Neumann was able to take any problem and turn it into a
series of steps that could be comprehended by a human mind. He also
invented a notation for describing this step-wise circuit: the now
familiar flow chart. Von Neumann's serial architecture for
computation -- where one instruction at a time was executed -- was amazingly
versatile and extremely well suited to human programming. He published the
general outlines for the architecture in 1946, and it immediately became
the standard for every commercial computer thereafter, without
exception.
In 1949, John Holland worked on Project Whirlwind, a follow-up to the
EDVAC. In 1950 he joined the logical design team on what was then called
IBM's Defense Calculator, later to become the IBM 701, IBM's first
commercial computer. Computers at that point were room-size calculators
consuming a lot of electricity. But in the mid-fifties Holland
participated in the legendary circle of thinkers who began to map out
the possibility of artificial intelligence.
While luminaries such as Herbert Simon and Allen Newell thought of
learning as a noble, high-order achievement, Holland thought of it as a
polished type of lowly adaptation. If we could understand adaptation,
especially evolutionary adaptation, Holland believed, we might be able
to understand and maybe imitate conscious learning. But although the
others could appreciate the parallels between evolution and learning,
evolution was the low road in a fast-moving field.
Browsing for nothing in particular in the University of Michigan math
library in 1953, Holland had an epiphany. He stumbled upon a volume, The
Genetical Theory of Natural Selection, written by R. A. Fisher in 1929.
It was Darwin who led the consequential shift from thinking about
creatures as individuals to thinking about populations of individuals,
but it was Fisher who transformed this population-thinking into a
quantitative science. Fisher took what appeared to be a community of
flittering butterflies evolving over time and saw them as a whole
system transmitting differentiated information in parallel through a
population. And he worked out the equations that governed that diffusion
of information. Fisher single-handedly opened a new world of human
knowledge by subjugating nature's most potent force -- evolution -- with
humankind's most potent tool -- mathematics. "That was the first time I
realized that you could do significant mathematics on evolution,"
Holland recalled of the encounter. "The idea appealed to me
tremendously." Holland was so enamored of treating evolution as a type
of math that in a desperate attempt to get a copy of the out-of-print
text (in the days before copiers) he begged the library (unsuccessfully)
to sell it to him. Holland absorbed Fisher's vision and then leaped to a
vision of his own: butterflies as coprocessors in a field of computer
RAM.
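The mathematics Holland glimpsed in that volume can be compressed into a single line. In one common modern paraphrase (not Fisher's own notation), his fundamental theorem of natural selection states that a population's mean fitness rises at a rate set by the heritable variation in fitness the population carries:

    \frac{d\bar{w}}{dt} = V_A(w)

where $\bar{w}$ is the population's mean fitness and $V_A(w)$ is the additive genetic variance in fitness. The more differentiated information a population transmits, the faster selection can raise its average.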
Holland felt artificial learning at its core was a special case of
adaptation. He was pretty sure he could implement adaptation on
computers. Taking Fisher's insight -- that evolution could be treated as a
problem in probability -- Holland began the job of trying to code evolution
into a machine.
Very early in his efforts, he confronted the dilemma that evolution is a
parallel processor while all available electronic computers were von
Neumann serial processors.
In his eagerness to wire up a computer as a platform for evolution,
Holland did the only reasonable thing: he designed a massively parallel
computer to run his experiments. In parallel computing, many
instructions are executed concurrently rather than one at a time. In
1959 he presented a paper which, as its title says, describes "A
Universal Computer Capable of Executing an Arbitrary Number of
Sub-programs Simultaneously," a contraption that became known as a
"Holland Machine." It was almost thirty years before one was built.
In the interim, Holland and the other computational evolutionists had to
rely on serial computers to grow evolution. By various tricks they
programmed their fast serial CPUs to simulate a slow parallelism. The
simulations worked well enough to hint at the power of true
parallelism.