Life (and intelligence) from the perspective of Information Theory

I have an idea (not entirely original) kicking around in my head with regard to the remarkably contentious notions of life (as in, what does it mean for something to be alive) and of intelligence (what does it mean for something to be intelligent).

One definition that is used by NASA is “a self-sustaining chemical system capable of Darwinian evolution”.

Let’s rephrase that a bit.

What we call evolution, from the perspective of thermodynamics, is the operation of entropy over time. We could reframe NASA’s definition of life as “a self-sustaining chemical system that is operated on by entropy”.

As a computer person, I’m interested in Information Theory, as proposed by Claude Shannon and others in the first half of the 20th century. Information Theory originally dealt with telecommunications, but has expanded over time to many fields (computing not the least of them). The notion is that a “bit” of information can be defined in terms of Shannon entropy, which is analogous to (and, at the level of physics, has deeper connections to) the thermodynamic entropy mentioned above.
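For readers who want something concrete, here is a minimal sketch of Shannon entropy in Python. The function name `shannon_entropy` and the coin-flip strings are my own illustrations, and the calculation assumes each symbol is drawn independently, which real-world data rarely honors.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the Shannon entropy of a sequence, in bits per symbol.

    Uses H = sum(p * log2(1 / p)) over the observed symbol frequencies.
    """
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Heads and tails equally frequent: one full bit per flip.
print(shannon_entropy("HTHTTHHT"))  # 1.0

# Always heads: no surprise, so no information.
print(shannon_entropy("HHHHHHHH"))  # 0.0
```

The more evenly the symbols are spread, the more bits each one carries. A sequence that never varies carries none.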

The idea I had is that maybe entropy, from Shannon’s point of view, has a bearing on what it means to be alive. If we defined information density as the quantity of information contained within a particular volume of matter, then perhaps life is simply:

Life: “a chemical system with information density greater than some constant K”

Let’s explore this a bit:

Isn’t this just a rephrasing of the NASA theory?

Some of you are definitely thinking “congratulations, you just rephrased that theory a bit”.

There’s one key difference though: we can measure Shannon entropy. Using my version, we could measure the informational density of a bunch of things, living or inert, and come up with a number for K.
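Here is a rough sketch of what "coming up with a number for K" could look like. Everything in it is hypothetical: the helper names `information_density` and `estimate_k`, the midpoint rule, and especially the sample numbers, which are made-up placeholders and not measurements of anything.

```python
def information_density(bits, volume):
    """Information per unit volume (bits divided by volume)."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return bits / volume


def estimate_k(living, inert):
    """Pick K halfway between the densest inert sample and the
    least dense living sample. Fails if the two groups overlap."""
    lowest_living = min(living)
    highest_inert = max(inert)
    if highest_inert >= lowest_living:
        raise ValueError("samples overlap; no single K separates them")
    return (lowest_living + highest_inert) / 2


# Placeholder numbers only, in arbitrary units.
living = [information_density(8000, 2.0), information_density(9000, 1.5)]
inert = [information_density(100, 2.0), information_density(300, 1.0)]

k = estimate_k(living, inert)
print(k)  # 2150.0
```

If real measurements overlapped (and the virus question below suggests they might), `estimate_k` would raise an error instead of returning a number, which is arguably the honest outcome.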

What does density have to do with all this?

I haven’t fully thought this out, but informational complexity that is very spread out over space doesn’t “feel” like life to me, per se. For instance, if you are looking at a large nebulous thing that is clearly living, you’re probably looking at a colony of separate living things, rather than one single living thing. Really, the same argument applies to any multi-cellular organism. They’re effectively colonies (even us humans!), even if they’re very well coordinated.

Here’s another example: we all agree that a turtle is alive. We generally do not consider the shell of a turtle to be living. Even though the shell is dense (ever try lifting a turtle?), it contains less information than the body of the animal itself. A similar thing applies to human hair (and nails, and outer layers of our skin). A person clearly is a living organism, but our hair doesn’t seem to be. In fact, we consider the cells that make up hair to be “dead”. What it means for something to be dead, we’ll discuss below.

On the flip side, we generally would not consider a chunk of iron to be alive. It’s massive and dense, but it doesn’t contain a whole bunch of information. One iron atom is much like another. Compared to the complexity of a DNA molecule (which, by itself, is still not really sufficiently informationally dense to be alive – see viruses, below), the chunk of iron is just too simple. The vast complexity of life is what allows entropy to modify it in a Darwinian manner. A chunk of iron, left sitting outside, just gets rusty.
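One crude way to see the "one iron atom is much like another" point is to use compressed size as a stand-in for information content. This is only a sketch: the repeated `Fe` string is a toy model of an iron lattice, the `ACGT` string is randomly generated rather than a real genome, and zlib's output size is at best a loose upper bound on the true information content.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Toy "chunk of iron": the same symbol over and over.
iron = b"Fe" * 5000

# Toy "DNA": 10,000 random bases. A real genome is not random;
# this is just a stand-in for a varied sequence of the same length.
dna = "".join(rng.choice("ACGT") for _ in range(10_000)).encode("ascii")

iron_size = len(zlib.compress(iron, 9))
dna_size = len(zlib.compress(dna, 9))

print(len(iron), "bytes of iron compress to", iron_size)
print(len(dna), "bytes of DNA compress to", dna_size)
```

Both inputs are the same length, but the repetitive one collapses to almost nothing while the varied one does not. One caveat: by this measure pure noise scores highest of all, so compressibility alone can't be the whole story.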

So, is a virus alive?

Undefined! Viruses don’t seem to do many of the things that we expect life to do. They don’t eat, many of them don’t have a protective membrane, and they can’t reproduce on their own. They’re obviously complex things though. When we discuss them in casual conversation, we use the same sort of language that we would when referring to something that is alive. Perhaps we should set our constant K somewhere around the level of complexity of a virus. That does not answer whether or not ordinary viruses are alive (it’s too fuzzy), but it helps explain why it is so difficult for us to answer that question. On the basis of that definition, we can then say that prions are probably not alive (too simple, not enough information in that tiny package), but that giant viruses like Mimivirus maybe are (more complex than K).

What does it mean for something to be dead?

A dead organism is one where entropy has increased to the extent that its information density has dropped below K. Cancer is entropy, basically. So are the claws and teeth of a lion.

Intelligence!

There’s another aspect to this application of Information Theory: intelligence.

The conversation around what it means for something to be intelligent is possibly one of the more contentious ones in all of science, and it becomes especially heated when considering the possibility of true artificial general intelligence (“AGI”). I think this is an area where the notion of information density may apply:

What does it mean for something to be intelligent?

Let’s define intelligence as:

Intelligence: “a chemical system with informational density greater than some constant K(2)”

What’s K(2)?

Let’s say, for the sake of argument, roughly what we see in a human being. Give or take an order of magnitude (joke!).

By chemical system, we’re just talking about a chunk of matter. Today, that means a living being. More on AI below.

We seem to instinctively understand that a smart animal, an orangutan, say, has some of the reasoning capabilities of a human. On the other hand, that smart animal doesn’t seem likely to suddenly start discussing philosophy. It’s not just a matter of different intelligences. We also clearly understand that the smart animal doesn’t think the same way that we do (an octopus is a good example of this), but even accounting for that difference we seem to rank other creatures relative to some level of computational ability. The whale and the octopus are pretty smart. The dog is smart, maybe a little less so. The worm is definitely not smart, but it’s doing something computationally in order to find food.

Given the vast size of a whale’s brain, we’re not simply talking about the number of neurons here. The density of those neurons, on the other hand, definitely plays a part, as does the specific organizational structure of the brain of the creature in question. Is that not precisely the same thing as the concept of information density mentioned above?

What does that mean for artificial general intelligence?

I don’t know!

Seriously though, I think that the level of informational density in a human brain is being underestimated by some of the researchers trying to create AGI.

Density is especially important when considering intelligence, because computation is physically constrained by the speed of light (‘c’).

Picture, if you will, a mad scientist who grows brains (in bottles!) in a laboratory. One day, the scientist decides to grow a very large brain, in fact the largest one that has ever existed, one that fills entire rooms within the lab. The scientist then hooks up an EEG device to this monstrous brain. Obviously I can’t prove this (and I don’t want to!), but I suspect that what they would see is not one single pattern of neural activity, but rather several clusters of activity that look more like those of several vastly smaller brains than of one huge brain operating in unison.

The reason is that it takes physical time for an electrical signal (how the brain works, in effect) to travel across a neural network. At some point of scale, the limitation of ‘c’ means that it will take too long for the signal to travel from point A to point B across the entirety of the brain (and real nerve impulses travel far slower than light, so the limit would bite much sooner). The activity will start to focus on local regions within the overall network: dense sections with large amounts of activity, joined by much smaller amounts of activity linking them together (basically a network).

This is much like how today we have discrete computers (at a 1 GHz clock, a signal moving at ‘c’ covers only about 30cm per cycle, which limits how long a bus can usefully be and is one reason level 1 cache memory is integrated into the CPU on modern computers), and those discrete computers are linked together in a network. Data centers work on this principle, as do supercomputers.
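The 30cm figure is easy to check with back-of-the-envelope arithmetic: divide the speed of light by the clock frequency. The helper name `distance_per_cycle_cm` and the sample clock speeds below are my own choices, and since signals in real wiring travel slower than ‘c’, treat the results as upper bounds.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def distance_per_cycle_cm(clock_hz):
    """Distance light travels in a vacuum during one clock cycle, in cm."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100


for ghz in (1, 3, 5):
    cm = distance_per_cycle_cm(ghz * 1e9)
    print(f"{ghz} GHz: {cm:.1f} cm per cycle")
# 1 GHz: 30.0 cm per cycle
# 3 GHz: 10.0 cm per cycle
# 5 GHz: 6.0 cm per cycle
```

The faster the clock, the smaller the region a signal can reach within one tick, which is the same squeeze the giant brain runs into.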

What this means is that “true” AGI probably will be constrained in similar ways. Simply adding computational power to a large supercomputer isn’t going to result in either a) more Shannon entropy (because computer A is the same as computer B), or b) enough informational density (information divided by physical volume) to exceed constant K(2). Different approaches will be required!
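The “computer A is the same as computer B” point can be sketched with a compressor, again using compressed size as a rough proxy for information content. The variable names and the random bytes standing in for a machine’s contents are purely illustrative.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for the contents of "computer A": 10,000 random bytes.
computer_a = bytes(rng.randrange(256) for _ in range(10_000))

# "Computer B" is an exact copy of A.
computer_b = computer_a

one_machine = len(zlib.compress(computer_a, 9))
two_machines = len(zlib.compress(computer_a + computer_b, 9))

print("one machine: ", one_machine, "bytes")
print("two machines:", two_machines, "bytes")
```

Doubling the hardware doubles the raw bytes, but the compressed size barely moves, because the second copy can be described as “same again”. The copies are kept small here so the duplicate stays inside zlib’s 32 KB look-back window.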

Can you prove any of this?

Nope.

It feels roughly, approximately, right to me, but I haven’t fully thought it all through.

I’m interested to hear what people have to say though.