Saturday, April 5, 2014

126. Kurzweil's Pattern-Recognition Theory of Mind – 2

Let us continue from where we left off in Part 125.

The nature of data flowing into a pattern recognizer

What does the data for a pattern look like? Suppose the pattern is a face, which is essentially a 2-dimensional set of data. The structure of the neocortex shows, however, that the inputs to a pattern recognizer are only 1-dimensional lists. Experience with building and running artificial pattern-recognition systems likewise confirms that 2- or higher-dimensional data streams can be represented as 1-dimensional lists. Our memories are patterns organized as lists (that is why we have trouble reciting the alphabet backwards). What is more, each item in a list is itself another pattern, and so on, hierarchically. We have learnt these lists, and we recognize them when an appropriate stimulus is present. Memories exist in the neocortex in order to be recognized.
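The claim that 2-dimensional data can be carried as a 1-dimensional list is easy to illustrate for artificial systems. The sketch below is only an illustration, not a model of how the neocortex does it: the tiny 'face' grid and the function names flatten and unflatten are made up for the example, and row-by-row ordering is just one possible choice.

```python
def flatten(grid):
    """Row-major flattening: concatenate the rows into one list."""
    return [cell for row in grid for cell in row]


def unflatten(flat, width):
    """Rebuild the rows, given the row width (which must be kept separately)."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# A hypothetical 4 x 5 'face' made of on/off pixels.
face = [
    [0, 1, 0, 1, 0],  # eyes
    [0, 0, 1, 0, 0],  # nose
    [1, 0, 0, 0, 1],  # corners of the mouth
    [0, 1, 1, 1, 0],  # mouth
]

as_list = flatten(face)           # one list of 20 items
restored = unflatten(as_list, 5)  # the same grid again
```

Nothing is lost in the conversion as long as the order of the list is fixed, which fits the emphasis the theory places on sequential order.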

Autoassociation and invariance

As explained in Part 125, we can recognize a pattern even if it is incomplete. This ability to associate a pattern with a part of itself is called autoassociation.
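A toy version of autoassociation can be sketched as follows. It is an assumption-laden illustration, not Kurzweil's mechanism: stored patterns are plain strings, a '?' marks a missing item, and the hypothetical function recall simply returns the stored pattern that agrees with the largest number of known items.

```python
def recall(partial, memory, unknown="?"):
    """Return the stored pattern that best matches the known items of `partial`.

    Items equal to `unknown` are treated as missing. Only stored patterns of
    the same length are considered; None is returned if there are none.
    """
    def agreement(stored):
        return sum(1 for p, s in zip(partial, stored) if p != unknown and p == s)

    candidates = [m for m in memory if len(m) == len(partial)]
    return max(candidates, key=agreement) if candidates else None


memory = ["APPLE", "ANGLE", "AMPLE"]
best = recall("AP??E", memory)  # 'APPLE' agrees on A, P and E
```

Here the incomplete input 'AP??E' is enough to retrieve the whole stored pattern 'APPLE', because the other candidates agree with it on fewer of the known items.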

Often we can recognize patterns even when they are distorted, or when some aspects of them have been transformed. This ability is called invariance, and the brain achieves it in four ways.

The first way is through global transformations that are effected before the cortex receives the sensory data.

The second takes advantage of the redundancy in the storage of memory: many perspectives or variations of each pattern are stored away.

The third is the ability to combine two or more memory lists. That is how we understand metaphors and similes.

The fourth method derives from the 'size parameters' that allow a single module to encode multiple instances of a pattern.


As Kurzweil (2012) writes: 'Our neocortex is virgin territory when our brain is created. It has the capability of learning and therefore of creating connections between its pattern recognizers, but it gains those connections from experience. . . . Learning and recognition take place simultaneously. We start learning immediately, and as soon as we've learned a pattern, we immediately start recognizing it. . . . patterns that are not recognized are stored as new patterns and are appropriately connected to the lower-level patterns that form them'.

The language of thought

At the heart of the pattern-recognition theory of mind (PRTM) is the neocortical pattern-recognition module, the inputs to and the outputs from which are shown below (diagram taken from Kurzweil 2012).

The brain starts out with a very large number of ‘connections-in-waiting’ to which the pattern-recognition modules can hook up. As we learn and have experiences, the pattern-recognition modules of the neocortex connect to these preestablished connections, which were created when we were embryos. Kurzweil (2012) has summarized his PRTM as follows:

'a) Dendrites enter the module that represents the pattern. Even though patterns may seem to have two- or three-dimensional qualities, they are represented by a one-dimensional sequence of signals. The pattern must be present in this (sequential) order for the pattern recognizer to be able to recognize it. Each of the dendrites is connected ultimately to one or more axons of pattern recognizers at a lower conceptual level that have recognized a lower-level pattern that constitutes part of this pattern. For each of these input patterns, there may be many lower-level pattern recognizers that can generate the signal that the lower-level pattern has been recognized. The necessary threshold to recognize the pattern may be achieved even if not all of the inputs have signalled. The module computes the probability that the pattern it is responsible for is present. This computation considers the "importance" and "size" parameters (see [f] below).

'Note that some of the dendrites transmit signals into the module and some out of the module. If all of the input dendrites to this pattern recognizer are signalling that their lower-level patterns have been recognized except for one or two, then this pattern recognizer will send a signal down to the pattern recognizer(s) recognizing the lower-level patterns that have not yet been recognized, indicating that there is a high likelihood that that pattern will soon be recognized and that lower-level recognizer(s) should be on the lookout for it.

'b) When this pattern recognizer recognizes its pattern (based on all or most of the input dendrite signals being activated), the axon (output) of this pattern recognizer will activate. In turn, this axon can connect to an entire network of dendrites connecting to many higher-level pattern recognizers that this pattern is input to. This signal will transmit magnitude information so that the pattern recognizers at the next higher conceptual level can consider it.

'c) If a higher-level pattern recognizer is receiving a positive signal from all or most of its constituent patterns except for the one represented by this pattern recognizer, then that higher-level recognizer might send a signal down to this recognizer indicating that its pattern is expected. Such a signal would cause this pattern recognizer to lower its threshold, meaning that it would be more likely to send a signal on its axon (indicating that its pattern is considered to have been recognized) even if some of its inputs are missing or unclear.

'd) Inhibitory signals from below would make it less likely that this pattern recognizer will recognize its pattern. This can result from recognition of lower-level patterns that are inconsistent with the pattern associated with this pattern recognizer. . . .

'e) Inhibitory signals from above would also make it less likely that this pattern recognizer will recognize its pattern. This can result from a higher-level context that is inconsistent with the pattern associated with this recognizer.

'f) For each input, there are stored parameters for importance, expected size, and expected variability of size. The module computes an overall probability that the pattern is present based on all of these parameters and the current signals indicating which of the inputs are present and their magnitudes. A mathematically optimal way to accomplish this is with a technique called hidden Markov models. When such models are organized in a hierarchy (as they are in the neocortex or in attempts to simulate a neocortex), we call them hierarchical hidden Markov models.'
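Points (a) to (f) can be made concrete with a small sketch of a single module. This is a deliberately simplified stand-in and not a hidden Markov model: the class names InputSpec and PatternRecognizer, the Gaussian-style closeness measure, the default threshold of 0.7 and all the numbers are assumptions made for illustration. Only the roles of the importance, expected size and size variability parameters, the threshold, the expectation signal from above and the inhibitory signals come from the summary quoted above.

```python
import math
from dataclasses import dataclass


@dataclass
class InputSpec:
    name: str
    importance: float        # how much this input counts towards recognition
    expected_size: float     # expected magnitude of the input signal
    size_variability: float  # tolerated spread around the expected size


class PatternRecognizer:
    def __init__(self, name, inputs, threshold=0.7):
        self.name = name
        self.inputs = inputs
        self.threshold = threshold

    def expect(self, amount=0.2):
        """Signal from a higher level that the pattern is expected: lower the threshold."""
        self.threshold = max(0.0, self.threshold - amount)

    def score(self, signals, inhibition=0.0):
        """Importance-weighted score in [0, 1] for the inputs that have signalled.

        `signals` maps input names to magnitudes; absent inputs contribute nothing.
        `inhibition` stands for inhibitory signals from below or above.
        """
        total = sum(spec.importance for spec in self.inputs)
        got = 0.0
        for spec in self.inputs:
            if spec.name in signals:
                z = (signals[spec.name] - spec.expected_size) / spec.size_variability
                got += spec.importance * math.exp(-0.5 * z * z)
        return max(0.0, got / total - inhibition)

    def recognizes(self, signals, inhibition=0.0):
        return self.score(signals, inhibition) >= self.threshold

    def missing(self, signals):
        """Inputs that have not signalled; these lower-level recognizers would be alerted."""
        return [spec.name for spec in self.inputs if spec.name not in signals]


# A hypothetical recognizer for the word 'CAT', fed by three letter recognizers.
cat = PatternRecognizer("CAT", [InputSpec(ch, 1.0, 1.0, 0.5) for ch in "CAT"])
seen = {"C": 1.0, "A": 1.0}           # the 'T' has not been recognized yet

before = cat.recognizes(seen)         # score 2/3 is below the threshold 0.7
alert = cat.missing(seen)             # the recognizer for 'T' should be on the lookout
cat.expect(0.2)                       # a higher level signals that 'CAT' is expected
after = cat.recognizes(seen)          # the same evidence now clears the lowered threshold
```

With two of the three inputs present the module does not fire on its own, but once a higher level signals that the pattern is expected, the lowered threshold lets the same evidence count as a recognition, as described in point (c).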

Triggered patterns trigger other patterns. Incomplete patterns send signals down the conceptual hierarchy. Complete patterns send signals up the hierarchy. These patterns are the language of thought. Like language, they are hierarchical, but they are not always language per se, although language-based thoughts are also possible.

There can be two modes of thinking, nondirected and directed. In the former, thoughts trigger one another in a nonlogical way. Dreams are examples of nondirected thoughts. Directed thinking is what we use when we are trying to solve a problem, or when we formulate an organized response.

Thus, according to the PRTM, our intelligence is the result of 'self-organizing, hierarchical recognizers of invariant self-associative patterns with redundancy and up-and-down predictions' (Kurzweil 2012).

It is rightly claimed in Kurzweil's (2012) book that it '. . . is an incredible synthesis of neuroscience and technology and provides a road map for the future of human progress'. The operating principle of the neocortex (explained by the PRTM) 'is arguably the most important idea in the world, as it is capable of representing all knowledge and skills as well as creating new knowledge'.

Note added on 11th August 2016

Here is a good two-part update on what has been happening in the field of artificial intelligence: