The Problem of Meaning in Artificial Intelligence

Since the 1960s, in the early decades of computing, a machine that can think just like humans has been claimed to be just a few years away. This idea is called Artificial Intelligence (AI), and it reappears every few years in a new form, the latest being the brouhaha around “Machine Learning”, “Deep Learning”, etc. The algorithms and techniques underlying these trends have existed for a few decades, and their limitations are well known. Growing computational power only brings us closer to the boundary of what is possible; it does not let us cross into what is impossible. This post discusses the problems that AI cannot solve in its current form and the reasons why. It also describes the changes needed in physical and mathematical theories to make AI a reality. Aside from these shifts in material thinking, a separation between matter and choice is also needed.

The Problem of Sentence Interpretation

AI faces a serious problem in describing the meanings of sentences. The canonical example is the sentence “I saw a man on a hill with a telescope”, which admits several interpretations, as the possibilities below show.

  • I saw a man using a telescope. The man was on a hill.
  • I saw a man. I was on the hill, looking through a telescope.
  • I saw a man. The man was on a hill and had a telescope.
  • I was on the hill. I saw a man. The man had a telescope.

The source of the problem is that the sentence has a hierarchical structure which is flattened when the sentence is written. In each of the readings above, the hierarchical structure is two-dimensional, while the sentence “I saw a man on a hill with a telescope” is one-dimensional. When a two-dimensional reality is expressed in a single dimension, meanings are lost, because the meaning resides in the other dimension (the vertical direction), which is flattened.

To describe the meaning we need a hierarchical tree structure in which phrases of the sentence—e.g. “on a hill” and “with a telescope”—can be attached to “I”, “a man”, or each other. If the phrase “on a hill” is attached to “I”, then I’m on a hill, but if the phrase “on a hill” is attached to “a man” then the man is on a hill. Likewise, if the phrase “with a telescope” is attached to “I”, then I’m seeing with a telescope, but if the phrase is attached to “a man” then the man is standing next to a telescope. These kinds of relationships cannot be described in a flat space because in a flat space there is no notion of “attachment” between two physical entities.
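The attachment idea can be sketched in a few lines of Python. This is only a toy illustration: the (phrase, head) pair representation and the names reading, READINGS, and flatten are assumptions made for this example, not a standard parser format. Each of the four readings listed earlier is stored as a list of phrases together with the phrase each one attaches to, and flattening simply drops the attachment column.

```python
SENTENCE = "I saw a man on a hill with a telescope"

def reading(hill_head, telescope_head):
    """Build one reading as (phrase, head) pairs in written order (hypothetical format)."""
    return (
        ("I", "saw"),
        ("saw", None),  # root of the tree
        ("a man", "saw"),
        ("on a hill", hill_head),
        ("with a telescope", telescope_head),
    )

# The four interpretations listed in the post, in the same order.
READINGS = [
    reading("a man", "I"),      # man on the hill, I use the telescope
    reading("I", "I"),          # I am on the hill, I use the telescope
    reading("a man", "a man"),  # man on the hill, man has the telescope
    reading("I", "a man"),      # I am on the hill, man has the telescope
]

def flatten(tree):
    """Keep the phrases in order and discard the attachment information."""
    return " ".join(phrase for phrase, _head in tree)

flat_forms = {flatten(tree) for tree in READINGS}
print(len(set(READINGS)), "distinct trees ->", len(flat_forms), "flat sentence")
# prints: 4 distinct trees -> 1 flat sentence
```

Four different trees collapse into one written string, which is the loss of the "vertical" dimension described above.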

If the phrases, therefore, are encoded as symbols using material objects in a flat space, then the meanings of these phrases can never be understood automatically. If, however, the same phrases were encoded by material objects in a hierarchical space, then the meanings would be easy to recover.

The problem of sentence meaning thus reduces to the problem of the underlying space that is employed for enumerating objects—i.e. for locating them at specific points in space. In a flat space, the location of a phrase—e.g. “on a hill”—is fixed in the sentence “I saw a man on a hill with a telescope”. But the location of the same phrase varies in the hierarchical space.

If, therefore, we flatten the hierarchy into a linear structure (i.e. a sentence), then we lose the meaning. We could recover the meaning if the sentence were viewed hierarchically.

The Problem of Word Interpretation

The hierarchical structure pertains to the relationship between the different words in a sentence. But by itself, this is inadequate to address the problem of meaning because the words themselves can have different meanings. The word “with”, for example, has two different meanings: in one case it means “together”, “nearby”, “in proximity”, etc., while in another case it means “using as an instrument”. In the first case, “with a telescope” means near a telescope. In the second case, “with a telescope” means the telescope is used as an instrument to see.

If, therefore, we treat “with” as a separate word, then the meaning of “with a telescope” depends on the context of its use. Note that in every reading above, “with a telescope” appears as a single sub-hierarchy of words. But given that this phrase has two different meanings, there is no easy way to decode the meaning from the words themselves.

Hence, there are two distinct problems: sentence meaning and word meaning. The problem of sentence meaning is describing the hierarchical relationship between phrases, but the problem of word meaning is that this hierarchical arrangement itself changes a word’s meaning!

We cannot, therefore, describe a word’s meaning without knowing the sentence structure. And we cannot understand the sentence structure without describing the meanings of its words. There is a circular interdependence between word meanings and sentence meanings, and that circularity entails that if a computing machine were tasked with decoding meanings, it could never complete the task because it would enter an infinite recursion.

Word Interpretations in Sanskrit

The problem of word-meaning is addressed in some languages such as Sanskrit. The solution is a distinction between a dhātu and a rūpa. The term dhātu means a “root”, or what we would typically call a morpheme in English. The term rūpa, on the other hand, denotes a form. Every noun in Sanskrit—for example, bālak or boy—has 24 different forms.

विभक्ति (case) | एकवचन (singular) | द्विवचन (dual) | बहुवचन (plural)
प्रथमा | बालकः | बालकौ | बालकाः
द्वितीया | बालकम् | बालकौ | बालकान्
तृतीया | बालकेन | बालकाभ्याम् | बालकैः
चतुर्थी | बालकाय | बालकाभ्याम् | बालकेभ्यः
पञ्चमी | बालकात् | बालकाभ्याम् | बालकेभ्यः
षष्ठी | बालकस्य | बालकयोः | बालकानाम्
सप्तमी | बालके | बालकयोः | बालकेषु
सम्बोधन | हे बालक! | हे बालकौ! | हे बालकाः!

The noun is first divided into 8 cases, and each case is further divided into 3 numbers: singular, dual, and plural. The combination of 8 cases and 3 numbers produces 24 forms for each noun. The cases in Sanskrit are shown below.

Sanskrit Case | Case Meaning | English Case
Karta | The Doer | Nominative Case
Karma | The Object of the Activity | Accusative (Objective) Case
Karan | The Instrument | Instrumental Case
Sampradan | The Purpose or Goal | Dative Case
Apādān | Removal From | Ablative Case
Sambandh | Relation of Possession | Possessive Case
Adhikaran | Relation of Location | Locative Case
Sambodhan | Addressing an Individual | Vocative Case
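The 8 × 3 grid described above can be written down as a small lookup table. The sketch below is only illustrative: it assumes that the rows of the declension table and the rows of the case table correspond in order (first row to Karta, second to Karma, and so on), and the names PARADIGM, NUMBERS, and form are invented for this example. The vocative forms are stored without the particle हे.

```python
NUMBERS = ("singular", "dual", "plural")

# Forms of "bālak" keyed by case name (row order of the two tables is assumed to match).
PARADIGM = {
    "Karta":     ("बालकः", "बालकौ", "बालकाः"),
    "Karma":     ("बालकम्", "बालकौ", "बालकान्"),
    "Karan":     ("बालकेन", "बालकाभ्याम्", "बालकैः"),
    "Sampradan": ("बालकाय", "बालकाभ्याम्", "बालकेभ्यः"),
    "Apādān":    ("बालकात्", "बालकाभ्याम्", "बालकेभ्यः"),
    "Sambandh":  ("बालकस्य", "बालकयोः", "बालकानाम्"),
    "Adhikaran": ("बालके", "बालकयोः", "बालकेषु"),
    "Sambodhan": ("बालक", "बालकौ", "बालकाः"),
}

def form(case, number):
    """Look up the written form for a given case and number."""
    return PARADIGM[case][NUMBERS.index(number)]

cells = [(case, number) for case in PARADIGM for number in NUMBERS]
print(len(PARADIGM), "cases x", len(NUMBERS), "numbers =", len(cells), "forms")
# prints: 8 cases x 3 numbers = 24 forms
print(form("Karan", "singular"))
```

A call such as form("Adhikaran", "plural") picks out one cell of the grid, so the case and number act as two coordinates that select the written form.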

The key point is that “with a telescope” will not be given meaning separately in Sanskrit. If the meaning of “with” is “in proximity” to a telescope, then the word “I” or “man” will have a form corresponding to the Locative Case depending on the proximity to the telescope. Similarly, if the meaning is “using it as an instrument” then the word “I” or “man” will have a form corresponding to the Instrumental Case depending on who is using it. The use of the case in creating the morphology of the word from a “root” solves two distinct problems:

  • It describes the relation to a telescope—i.e. near it or using it.
  • It describes the specific entity—“I” or “man”—who is having this relationship.

In other words, when we flatten a hierarchical structure from two dimensions to a single dimension, we modify the words in a way so as to preserve the hierarchy. While we cannot see the hierarchy in between the words (as we did by drawing lines above), we can still see it within the words because the word forms are modified in accordance with the hierarchy.
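A rough way to see how a hierarchy can survive flattening when it is carried inside the words is the following toy encoding. Everything here is hypothetical: the marker syntax (for example "+LOC:hill"), the labels INS and LOC, and the function names are invented for illustration and are not real Sanskrit morphology. Each word is written together with markers for its relations, so the flat string can be turned back into the structure it came from.

```python
def inflect(root, relations):
    """Fold a word's relations into its written form, e.g. man+LOC:hill."""
    return root + "".join(f"+{case}:{target}" for case, target in relations)

def parse(word):
    """Split a written form back into its root and its relation markers."""
    root, *marks = word.split("+")
    return root, tuple(tuple(mark.split(":")) for mark in marks)

def flatten(tree):
    """Write the structure as a one-dimensional string of inflected words."""
    return " ".join(inflect(root, relations) for root, relations in tree)

def unflatten(text):
    """Recover the structure from the inflected words alone."""
    return tuple(parse(word) for word in text.split())

# Two of the readings: who uses (INS) or is near (LOC) the telescope, who is on the hill.
READING_A = (("I", (("INS", "telescope"),)), ("saw", ()), ("man", (("LOC", "hill"),)))
READING_B = (("I", (("LOC", "hill"),)), ("saw", ()), ("man", (("LOC", "telescope"),)))

print(flatten(READING_A))  # I+INS:telescope saw man+LOC:hill
print(unflatten(flatten(READING_A)) == READING_A)  # True
```

Unlike the plain English sentence, the two readings now produce different flat strings, and each string can be decoded back to its own structure.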

A Mathematical Analogy

This is an ingenious scheme, but not very different from the schemes used in modern geometry to denote points numerically. Points in a three-dimensional space can be called P, Q, R, etc. and these names can suffice to distinguish them—provided we use a three-dimensional space. When we reduce these points to a single dimension—e.g. while writing mathematical proofs—then the points have to be represented by three numbers such as X, Y, Z. Thus, for example, we would now replace P by {X1, Y1, Z1}, Q by {X2, Y2, Z2}, R by {X3, Y3, Z3}, etc.

This transformation from physical points to symbolic entities is called the conversion of geometry into algebra, and it is attained by adding a coordinate system which helps us map points to numbers, following which numbers are labeled using numerals, and a physical entity becomes a symbol—which can be written in one-dimensional text in books.

The reality that exists in three dimensions in space is thus encoded in a one-dimensional sentence in a book. This is achieved by expanding the names of the entities from P, Q, R, to {X1, Y1, Z1}, etc. More tokens are required to perform this expansion, which means more letters are needed to encode a three-dimensional reality into a single dimension.
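The expansion from names to coordinates can be illustrated with a short sketch. The coordinate values and the text format (commas within a point, semicolons between points) are arbitrary assumptions chosen for this example; the only purpose is to show that three named points become nine numbers on a single line, and that the line can be read back.

```python
# Three points named P, Q, R with made-up integer coordinates.
POINTS = {"P": (1, 2, 3), "Q": (4, 5, 6), "R": (7, 8, 9)}

def encode(points):
    """Write the points as one line of text, three numbers per point."""
    return ";".join(",".join(str(v) for v in xyz) for xyz in points.values())

def decode(text, names):
    """Read the line back into named three-dimensional points."""
    triples = [tuple(int(v) for v in part.split(",")) for part in text.split(";")]
    return dict(zip(names, triples))

line = encode(POINTS)
print(line)  # 1,2,3;4,5,6;7,8,9
print(len(POINTS), "names ->", sum(len(xyz) for xyz in POINTS.values()), "numbers")
# prints: 3 names -> 9 numbers
```

Decoding the line with the same names returns the original points, so nothing is lost as long as the one-dimensional text is allowed to grow.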

The problem is similar to that of flattening a three-dimensional object—e.g. a cylinder—into two dimensions. As you press the cylinder along one dimension, you will get a much larger circle, or a much larger rectangle (depending on the direction in which you compress the object). The total amount of matter is not lost, although it is redistributed from one dimension to another.

The Problem of Sentence Meaning

If we simply compress an object but don’t allow the space needed to reorganize all the matter, then some of the matter will be lost. This is precisely what happens in many languages when the world they describe is far more complex than what their sentences can express.

One key factor contributing to this simplification is that the relationships between real-world physical entities are themselves converted into objects. For example, instead of modifying the objects which are related to each other (in order to express the relation), many modern languages introduce connectives such as prepositions and conjunctions, which are themselves words, and therefore have to be encoded quite like the real-world objects they denote. Thus, for instance, “table” is a word and “with”, “above”, “below”, “towards”, “along” are also words. In trying to express the relationships between objects, we end up creating new objects.

Now we have to solve the problem of describing the relationships between the real-world objects (e.g. telescope, I, man) and the newly created objects (with, on). If we were to try to solve this problem by the same mechanism employed above (i.e. creating new words to establish relationships between the real-world objects and the words that encode relationships between them), then we would end up in a recursion problem whereby we have words that describe first-order relationships (between real-world objects), second-order relationships (between real-world objects and the relationships between them), and so forth.

The result of such a scheme would be an inordinately complex grammar and vocabulary because we must now know which conjunctions connect objects, which conjunctions connect objects and relations, which conjunctions connect relations to relations, and so forth. As new connectors are created, the problem of describing their interrelation explodes exponentially.

The Solution to the Problem of Sentence Meaning

The solution to this problem needs a different approach in which a connector is not another object in between two objects. Rather, that connection modifies the objects themselves. In order to understand the connection, we have to see the form of the objects (which changes with each connection), rather than look for new objects that should symbolize that connection.

One hallmark of languages such as Sanskrit is that they drastically reduce the number of prepositions and conjunctions. These connectors instead become embedded in the form of the nouns themselves, as different kinds of cases: Locative, Possessive, Objective, Ablative, etc. The connectors we are left with include logical conditions such as if, but, then, or, and, etc. These are not expressions of real-world relationships between objects. They are rather used to logically derive a particular type of answer and therefore create hypotheticals. Of course, these hypotheticals are not unimportant; in science, for example, they are involved in making predictions: “if you take a medicine, you may be cured”, “you must take the medicine and exercise if you want to be cured”, “you can get cured but you have to stop this habit”, etc.

In each such case, there is a choice involved because these hypotheticals are not necessarily reality, although reality can be created by applying a restricting condition. A much smaller set of connectors is needed to denote the choice itself, while a much larger set of connectors is required to describe the real-world relationships between material objects. The connectors between material objects have to be encoded in the form of the object itself, while the connectors that express choice have to be distinguished from the connectors of material relationships.

Implications for Physical Sciences

There is, hence, a solution to the problem of meaning, but it involves changes to some current foundational ideas. The first is the notion that space is flat rather than hierarchical. The second is the notion that relationships have to be expressed as objects rather than as forms of objects. The third is the idea that choice is just like a material relation and, since relations are treated as material objects, that choice too must be a material object or its property.

A complete theory of meaning and intelligence cannot be devised until these changes are made. Since these changes necessitate shifts in our physical thinking, the nature of space, and the mathematics that is used to describe it, the problem of Artificial Intelligence is much more fundamental than currently understood. The problem isn’t about making a machine using the laws of physics and mathematics as we know them today. It is rather about changing physics and mathematics themselves in a way that lets objects in space encode meanings.