Artificial Intelligence and Information Theory

By Claude Berrou

The best way to confound a researcher in artificial intelligence (AI) is to ask him or her to define the natural intelligence he/she is supposed to reproduce in electronic machines. The criteria for intelligence are indeed numerous and confusing: aptitudes for analogy, creativity, written and oral communication, etc.

There is one, however, whose objectivity cannot be questioned: the proportions of the innate and the acquired in the development of the brain. The lizard, without a frontal cortex, has no long-term memory and therefore cannot learn from its experience and correct its behavior. The chimpanzee does better with about six billion neurons.

But it is not only a question of quantity. For example, the elephant has about three times as many neurons as the human being and apparently less intellectual capacity. Beyond the number of elementary components, it is also and above all a question of organization, architecture and communication between the different areas of the brain.

Another challenging question, if the researcher is working, like many others today, on so-called deep learning, is this: in the successive layers of your networks, in the mesh of their synaptic connections, where is the information to be found? Is it localized or distributed? Figurative or abstract? Statistical or symbolic?

Let us look at this by using a famous model – the one proposed by Claude Shannon in 1948 to form the basis of information theory. This time it is applied to the brain (figure below).


We live in a frenetic world, rich in details and constantly fluctuating. Our sensory system is continuously barraged with countless familiar and new stimuli. It is out of the question for the cortex, limited in resources, to memorize in fine detail all the components of original stimuli.

For example, the visual system has to reduce, with very high compression rates, the amount of information representative of an image in order to eventually store its significant attributes. There is an infinite number of visual representations of cats, but the “cat class” is unique. Here we are at the heart of the first of the two steps suggested by Shannon’s model: “source coding.” The purpose of this step is to extract from sensory and somatosensory signals the least voluminous, yet most discriminating, components that can be useful for mental functions. It is essentially a matter of statistics, and those interested in models of this sensory source coding work in “computational neuroscience.”

The information resulting from the compression step is valuable. If it is to be kept for future use, it must be stored with care. Since the cortex is a highly noisy substrate (erasures due to synapses that are deficient at certain times for lack of neurotransmitters, parasitic insertions due to spontaneous stimuli, chemical fluctuations), protection through redundancy seems essential.

This is the second step in Shannon’s model, called “channel coding.” The most common form of redundancy is repetition, but repetition multiplies the material required, often excessively, and is probably not the choice that evolution has made.
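The cost of repetition mentioned above can be illustrated with a minimal sketch of a repetition code. The function names, the repetition factor of 3 and the sample message are assumptions made for illustration; they do not come from Shannon's paper or from any model of the cortex.

```python
from typing import List


def repetition_encode(bits: List[int], n: int = 3) -> List[int]:
    """Repeat every bit n times (n = 3 is an illustrative choice)."""
    return [b for b in bits for _ in range(n)]


def repetition_decode(coded: List[int], n: int = 3) -> List[int]:
    """Recover each bit by majority vote over its n copies."""
    groups = [coded[i:i + n] for i in range(0, len(coded), n)]
    return [1 if sum(g) * 2 > n else 0 for g in groups]


message = [1, 0, 1, 1]                  # hypothetical 4-bit message
coded = repetition_encode(message)      # 12 symbols stored for 4 bits

corrupted = coded.copy()
corrupted[0] ^= 1                       # one error in the first group
corrupted[5] ^= 1                       # one error in the second group

recovered = repetition_decode(corrupted)
overhead = len(coded) / len(message)

print(overhead)                         # 3.0
print(recovered == message)             # True
```

Under these assumptions, one error per group of three is corrected, but only at the price of tripling the stored material, which is the drawback the text points to.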

It is now increasingly speculated that mental information is materialized by groups of neurons that synchronize (alignment of spikes over time) to form what graph theory calls “cliques.” These fully connected subgraphs are highly redundant structures. Thus, if 20 vertices of a neural clique (i.e. 20 neurons in a small neighborhood) are fully interconnected, they are linked by 190 connections (20 × 19 / 2). If one of them disappears, by accident or aging, it will not be fatal, as the clique can still be synchronized easily through the remaining 189 connections. And Hebb’s famous law, the one that sets the conditions for creating or restoring a synaptic connection, will allow the missing connection to reappear.
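The arithmetic of the 20-neuron clique can be checked with a short sketch. It builds the fully connected graph, removes one connection, and then applies a deliberately simplified Hebbian-style rule in which every pair of co-active neurons is (re)connected. The helper names and this toy rule are assumptions for illustration, not a biological model and not the author's method.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Edge = FrozenSet[int]


def clique_edges(n: int) -> Set[Edge]:
    """All undirected connections of a fully connected group of n neurons."""
    return {frozenset(pair) for pair in combinations(range(n), 2)}


def degree(node: int, edges: Set[Edge]) -> int:
    """Number of connections that still reach a given neuron."""
    return sum(1 for e in edges if node in e)


def hebbian_restore(active: Iterable[int], edges: Set[Edge]) -> Set[Edge]:
    """Toy rule (an assumption): neurons that fire together get connected."""
    return edges | {frozenset(pair) for pair in combinations(active, 2)}


edges = clique_edges(20)
print(len(edges))                       # 190

lost = frozenset((3, 7))                # hypothetical failed synapse
damaged = edges - {lost}
print(len(damaged))                     # 189
print(degree(3, damaged))               # 18: neuron 3 is still well connected

# The whole clique fires together once more and the lost link reappears.
restored = hebbian_restore(range(20), damaged)
print(restored == edges)                # True
```

In this sketch the neuron that lost a connection keeps 18 of its 19 links, which gives a sense of why a single failure leaves the assembly intact.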

This is the mental world, the world of long-term memory on which our intelligence continuously relies, even in our dreams. The elements of knowledge are fixed once and for all in neural assemblies whose highly redundant structure provides resilience as well as correction of erasures, errors or approximations.

What distinguishes the intelligence of the human being from that of the lizard is, of course, the mental world, which is therefore essentially based on this second stage of Shannon’s model. After all, the reptile’s visual system is no less elaborate than ours, but on the reasoning side, it certainly does not measure up.

It is surprising to note that most of the research in AI today focuses on learning, i.e. the first step of the model. Little work is devoted to “artificial long-term memory”, to the way the elements of knowledge recorded in this associative memory could be exploited. It would, of course, be a question of combining and cross-referencing ideas, analogies and intentionality – all kinds of faculties that our reason masters so well. This is a matter of “informational neurosciences,” whose scientific questions call upon the notions of architecture, multimodality and communications, which are rather different from those considered in machine learning.

This strong distinction between nervous information and mental information, it seems, has not yet been seriously taken into account. The world of nervous information is very fast-moving by necessity and mainly uses direct, specialized circuits (sight, hearing, etc.) that are common to most living species. The world of mental information, slower and with certain remanence properties, is based on a heterogeneous network whose organization and variety of spatial and temporal integration functions explain the ability to reason. This ability, of course, varies widely depending on the species considered. The scientific skills needed to approach and understand each of these worlds are probably not the same.

AI is still, no matter what the media say, in its infancy. Today’s AI researchers are mainly interested in what lizards and humans have in common: the five senses (with one notable exception: automatic natural language processing). The only way to progress towards machines that think, or at least reason, is to adopt an interdisciplinary approach in which brain specialists, psychologists and information science experts collaborate in the same laboratory. This need has not yet been understood in many countries, particularly in Europe. AI will be neuro-inspired or it will not be. And the link with information theory is absolutely essential.