Brains of Sand 8: Automated Reasoning (this section under construction)

Sections §1 through §7 describe the 'bare bones' of the TDE model. Sections §8 through §10 cover special topics in more detail. In this section, §8, the importance of distinguishing synthetic AI from analytic AI is clearly demonstrated. Without this distinction, there seems to be no reason to expect a direct equivalence between language and programming.

Recapping, synthetic AI involves building intelligent systems from existing (hardware and software) resources. For example, to build a Natural Language (NL) Engine for use perhaps in automated speech translation or mechanised proofreading of legal drafts, we might start with a word-use corpus for the language involved, ie a list of every word in the language ranked by its frequency of occurrence. The most frequently used word in English is 'the'. This is not surprising, since in languages that have them, the (definite and indefinite) articles are among the most frequently used words. Then we would compute word groups and K-nearest neighbours by means of Latent Semantic Analysis. That is, we would apply statistical and mathematical knowledge and techniques to solve the known problems of Cognitive Linguistics.
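A minimal sketch of this synthetic pipeline, assuming scikit-learn is available; the toy corpus and the number of semantic components are invented for illustration, not taken from any real NL Engine:

```python
# Synthetic approach in miniature: build a term-document matrix, reduce it
# with truncated SVD (Latent Semantic Analysis), then find each word's
# nearest neighbour in the reduced 'semantic' space.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the dog chased the cat",
    "the cat chased the mouse",
    "the dog is a loyal pet",
    "the cat is a quiet pet",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)          # documents x terms
words = vectorizer.get_feature_names_out()

# Word frequencies: the article 'the' dominates, as the text above predicts.
freqs = X.sum(axis=0).A1
for word, count in sorted(zip(words, freqs), key=lambda p: -p[1]):
    print(word, count)

# LSA: project terms into a low-dimensional semantic space.
svd = TruncatedSVD(n_components=2)
svd.fit(X)
term_vectors = svd.components_.T              # terms x components

# Nearest neighbour of each term by cosine similarity.
sims = cosine_similarity(term_vectors)
for i, word in enumerate(words):
    nearest = max((j for j in range(len(words)) if j != i),
                  key=lambda j: sims[i, j])
    print(word, "->", words[nearest])
```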

By contrast, analytic AI involves careful observation and reverse engineering of systems which are known to exhibit the intelligent behaviours we are interested in. These systems are, for the most part, human and animal brains. To build an NLE using the analytic approach, we would examine how real brains construct and utilise actual languages. We would look for natural brain structures and neuroarchitectural regions which are known to perform the same or similar computations to the ones we require. For example, are there aspects of human language which resemble those observed in programming languages?

It is relatively straightforward to demonstrate that both brains and computers share the same basic operating principles- both rely on linguistic mechanisms to manage internal memory states. That this is true for computers is almost trivial to prove, since we humans designed and built them that way! It is nevertheless worth taking a few minutes to remind ourselves of their key operating principles.


Computers are a form of labour-saving device. We must ask ourselves 'how deep' when we make statements of this kind about systems which are both intelligent and evolved. For example, the automobile on the highway replaced the horse on the track, but the horse on the dirt track replaced the person walking across the field. That is, automobiles represent two levels' worth of machine 'assistance' (a nicer word than slavery). So how many levels 'deep' are computers?


Originally, computers were people who sat at rows of tables in large rooms. These people used machines which looked like typewriters but computed equations. 'Adding' machines, they used to be called, even though they were mostly used for multiplications and divisions, in combination with tables of logarithms. It is a useful exercise to consider how you would organise the labour force needed to compute, say, artillery tables or actuarial estimates, like those used by the British army on the Western Front or by establishment ship insurers like Lloyd's of London.

For example, by computing for each type of gun a table of increments, giving shelling range versus barrel elevation for a range of wind strengths, the ability of a sudden artillery barrage to find the enemy trenches exactly is greatly enhanced. That is, artillery tables give the attacking forces a tremendous advantage: without tables, each volley of shells gives the attacker and the attacked exactly the same range data. However, in World War I the opposing armies were equally well served; both sides possessed equally good sets of range tables, and artillery was no longer the game changer it had been in past wars. In fact it became either a costly exercise in wasted ordnance, when all went according to plan, or an exercise in the mass slaughter of both sides, caused by one side making a seemingly minor error of judgement. A secondary factor is that range tables permit less experienced troops to get up to speed rapidly with field artillery pieces. In the Great War, so many people died that virtually all troops ended up doing jobs they weren't originally trained for.

The same principles can be seen in the shipping industry. Without 'loss' tables, one would not know how much to charge each ship owner who wants to insure each journey for the minimum possible cost, so that, over the whole client list, the annual aggregate of all premiums will comfortably exceed the total value of shipping losses that year.
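As a toy illustration of what computing one such table involves, here is a sketch using the drag-free projectile formula with a crude, invented wind correction; real artillery tables corrected for drag, air density, shell weight and much else, and the muzzle velocity below is likewise invented:

```python
# Toy artillery range table: drag-free ballistics plus a linear wind offset.
import math

G = 9.81      # gravity, m/s^2
V0 = 450.0    # muzzle velocity, m/s (invented for illustration)

def range_m(elevation_deg, wind_ms):
    """Range of a drag-free projectile, plus a crude additive wind drift."""
    theta = math.radians(elevation_deg)
    flight_time = 2 * V0 * math.sin(theta) / G
    return (V0 ** 2) * math.sin(2 * theta) / G + wind_ms * flight_time

winds = (-10, 0, 10)   # headwind, calm, tailwind (m/s)
print("elev/wind" + "".join(f"{w:9}" for w in winds))
for elev in range(15, 50, 5):
    print(f"{elev:9}" + "".join(f"{range_m(elev, w):9.0f}" for w in winds))
```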
Both artillery tables and actuarial instruments represent a kind of machine assistance. Another way of thinking about machine assistance is as equivalent human effort, greatly condensed in time. We are being asked to view bits of paper covered in writing as a form of machine assistance, no different for the purposes of this discussion from the horseless carriage or the beast of burden. While a mechanism or organism can replace human graft, it is documents and books which replace human craft.
Turing was much closer to this level of reasoning than we are. He saw immediately the sideways shift in thinking that was needed: that to replace roomfuls of human computers ('keypunch operators', in the modern parlance), we need only spoonfuls of specification writers- programmers- to write detailed 'programs' in the form of long lists of statements, which the machine is then made to 'execute'.
Programs are written by programmers expert in one of the many proprietary 'source code' dialects. I call them dialects rather than languages because they are all written in a functionally enhanced ('structured') subtype of English. The range of allowable program words is tiny compared to the full lexicon available when we write books for other humans to read. Likewise, the amount of grammatical freedom (the range of allowed expressions) is much smaller than in human-readable documents. Both these factors are easily overlooked, since it is not a human but a machine that does the reading, translating the programmer's problem-solving strategies from a descriptive style (the declarative paradigm) into an imperative and prescriptive style (the procedural paradigm); a small sketch contrasting the two styles follows the list below. Nevertheless, the input has exactly the same meaning, and so executing it mechanically is guaranteed to produce the desired results, once certain steps are taken:

A) the compile-time 'bugs' are removed by T.O.T.E. (Test-Operate-Test-Exit) micro-edits,
B) all run-time glitches are either totally eliminated or, failing that, fixed heuristically with one or another 'kludge' or typical workaround.
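To make the declarative/procedural contrast concrete, here is the same computation written both ways; the data is invented for illustration:

```python
# The same problem twice: declarative style states WHAT is wanted;
# imperative style prescribes HOW the machine steps through memory to get it.
prices = [12.50, 8.00, 31.25, 4.75]

# Declarative: describe the result.
total_declarative = sum(p for p in prices if p > 10)

# Imperative: prescribe the steps.
total_imperative = 0.0
for p in prices:
    if p > 10:
        total_imperative += p

assert total_declarative == total_imperative
print(total_declarative)
```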

One could be forgiven for failing to realise that the document used by the programmer to communicate with the machine is actually a detailed specification for solving a specific problem. The only way a computer can solve a real problem is for it to build within its memory banks a complex array of interconnected state-transition machines: multiple automata with a huge but finite number of working states, governed by phalanxes of supervisory processes which act like data-flow traffic lights.
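As a minimal illustration of a single such state-transition machine, here is a sketch; the states, inputs, and the parity-checking task are invented for the example:

```python
# A minimal finite state machine: a transition table plus a current state.
# This toy automaton accepts binary strings containing an even number of 1s.
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

def accepts(bits, start="even", accepting=("even",)):
    state = start
    for symbol in bits:
        state = TRANSITIONS[(state, symbol)]   # one transition per input
    return state in accepting

print(accepts("1101"))   # False: three 1s
print(accepts("1100"))   # True: two 1s
```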
When Alan Turing first started to investigate state machines as a possible way to automate the sorts of computations previously performed laboriously by hand, each machine had to be 'patched', ie individually wired up with plugs and cables so that the computations it performed corresponded to the client's problem specification. The real breakthrough came when Turing (UK) and von Neumann (US) independently arrived at the idea of the 'stored program'. Now, instead of getting a technician to turn each new problem into its own 'rat's nest' of patching cables, the engineer could write a program describing what he wanted, knowing that his program was going directly to the machine. Each computer was now designed as a general-purpose automaton which, as delivered from the factory, could do every computation, since it was the customers themselves who programmed this general-purpose machine into a specialist information processor. The economies of scale of mass production drastically lowered unit cost, so everyone could afford one, as long as they could write themselves a suitable program, or pay a little extra to get someone else to do it. Thus an entirely new industry- software production- appeared almost overnight. This was how the legend of Silicon Valley was created: an upstream miracle (very cheap hardware) hiding a downstream monster (very expensive software).


At the same time that big mainframe computers were being bought by all the big companies and universities, in the late 1950s and early 1960s, the up-and-coming linguist Noam Chomsky wanted to know why human infants seemed able to learn their parents' language even when they never knew their parents. Consider the strange case of the Romanian orphanages, whose funding stopped completely when the communist dictatorship collapsed. Infants who found themselves there were effectively abandoned, fed once a day by a few volunteer nurses who hardly ever spoke to them. Yet almost every one grew up to speak Romanian, with only a small and understandable increase in their age at first speech. Chomsky argued that observations like this make sense in only one way: language ability must be present at birth, not learned later. The Romanian infants needed only that adults produce a few clear fragments of speech in context. This situation was dubbed the 'poverty of the stimulus'.
Chomsky's solution was a radical one: language is not just an external means of communication, but an internal system of cognition! Chomsky argued that all humans possess a similar capacity for language understanding (semantics) at birth. However, when they grow up in dissimilar social environments, they learn completely different rules for language production (syntax/grammar). This can only be true if language consists of two parts, the first one (semantics) necessary and inborn, the second one (syntax, or grammar) contingent and learned.
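A toy way to see the split: hold one semantic structure fixed and render it under two different production rules. The mini-grammars below are invented for illustration:

```python
# One semantic structure, two surface grammars: the 'meaning' (a predicate
# with two arguments) is the fixed part; the word-order rule is the part
# learned locally. English is SVO; Japanese-style order is SOV.
meaning = {"agent": "child", "action": "sees", "patient": "dog"}

def render_svo(m):                 # subject-verb-object, as in English
    return f"{m['agent']} {m['action']} {m['patient']}"

def render_sov(m):                 # subject-object-verb, as in Japanese
    return f"{m['agent']} {m['patient']} {m['action']}"

print(render_svo(meaning))   # child sees dog
print(render_sov(meaning))   # child dog sees
```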


There is a common misconception in the current view of linguistics, one that is fully recognised by those who study languages in a dedicated manner, but which gets overlooked by the vast majority of people who must use the results of formal linguistics while researching other topics. This misconception concerns the number of levels involved in the definition of a formal language.
Consider the alphabet, the set of symbols from which human language is built. What is that alphabet? If you answered {a, b, c, ...} you gave the most common answer, but that answer is wrong. The alphabet is the set of symbols that are used to construct words, which are character strings. It is those words which constitute the symbols for the actual sentences we use to construct meaning. So there are two formal language levels involved in any human language. Each has a set of symbols, and a set of rules for making well-formed formulae in that language. Hence, languages are often defined recursively, a technique which allows us to use syntax (production) rules at several formal language levels; a toy sketch of the two levels follows.
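A minimal sketch of the two levels; the lexicon and the single sentence pattern are invented for illustration:

```python
# Two formal language levels, each with its own symbols and well-formedness
# rules. Level 1: the symbols are characters, the formulae are words.
# Level 2: the symbols are words, the formulae are sentences.
LEVEL1_ALPHABET = set("abcdefghijklmnopqrstuvwxyz")
LEXICON = {"the", "dog", "cat", "chased"}      # well-formed level-1 strings

def well_formed_word(s):
    return set(s) <= LEVEL1_ALPHABET and s in LEXICON

# Level-2 production rule (toy): SENTENCE -> DET NOUN VERB DET NOUN
DET, NOUN, VERB = {"the"}, {"dog", "cat"}, {"chased"}
PATTERN = [DET, NOUN, VERB, DET, NOUN]

def well_formed_sentence(words):
    return (len(words) == len(PATTERN)
            and all(well_formed_word(w) and w in category
                    for w, category in zip(words, PATTERN)))

print(well_formed_sentence("the dog chased the cat".split()))   # True
print(well_formed_sentence("dog the chased cat the".split()))   # False
```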
Consider words. Each word has a meaning. Then the words are combined into sentences which each also have a meaning. Are these meanings the same kind of entity? The answer is yes. We all have an intuitive idea of what meaning is, but to delve deeper, this sloppy intuition must be teased out into something much more concrete.
We now introduce a new definition. We define a formal language level as a function which produces compound meanings from simple ones. Think about it. Each word has a dictionary meaning, which is really a formula that tests a variable x to see if it is a member of a semantic set. The word's dictionary meaning defines that word as a unique symbol. When we use the word in a sentence, we use a copy of the word-as-a-symbol, ie the word we use is a token. For example, the dictionary meaning for the noun 'dog' might consist of several predicate clauses and property phrases, which are true statements about all of the things we call a 'dog'. One meaning, that of a household pet, also includes other synonymous statements, like 'man's best friend', 'individuals of the species Canis familiaris', etc. But there are also non-canine dogs, such as a yearling steer, and non-living dogs, such as a type of clutch (a mechanical drive element) with square-cut teeth, which can only be engaged while the input shaft is not rotating. We must also consider the problem of the implicit use of nouns in an adjectival role, like calling a hired thug a 'dog' (a person who has no mind of their own, but comes running when their owner whistles). The person is not actually a dog, of course, but it serves a rhetorical purpose to point out that they exhibit certain dog-like traits. Another example is calling a minion who delivers a threat on their boss's behalf an 'errand boy'.
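A sketch of the meaning-as-membership-test idea, with each sense of 'dog' written as a predicate over x; the feature names are invented for illustration:

```python
# A dictionary meaning as a membership test: each sense of 'dog' is a
# conjunction of predicates over x, and the word denotes their disjunction.
def is_pet_dog(x):
    return x.get("species") == "canis familiaris" and x.get("domesticated")

def is_dog_clutch(x):
    return x.get("kind") == "clutch" and x.get("teeth") == "square-cut"

def means_dog(x):
    """True if x falls under any sense of the symbol 'dog'."""
    return is_pet_dog(x) or is_dog_clutch(x)

print(means_dog({"species": "canis familiaris", "domesticated": True}))  # True
print(means_dog({"kind": "clutch", "teeth": "square-cut"}))              # True
print(means_dog({"species": "felis catus", "domesticated": True}))       # False
```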
Now herein lies a conundrum. Words are simpler language units than sentences, right? Consider this: the lexical language, ie the formal language level which combines alphanumeric characters into character strings (words), relies upon a higher-level language- the one that combines words into sentences- to define each word. Even when dictionary definitions are not used, as when infants learn word-usage contexts from their parents and peers, sentences are still needed. So it seems the sentence is the basic unit, the primary building brick of meaning, and words are the more complex, secondary, derived units.
There is another kind of meaning besides the constituent one described above, and that is inherited or dependent meaning. For example, the noun 'dog' denotes a type of 'animal', so 'dog' inherits part of its meaning from our understanding of what an animal is. One might expect children to learn what an animal is by first experiencing what a dog is- that is, the instance is learned first, and then the child's mind broadens the semantic field into a general class called 'animal'. There is only one problem with this idea: it is wrong. Children first attach 'dog' to a general class, then accumulate more details over time. Initially, the dog is the exemplar, but of what class? In the beginning, there are only individual names and the things they label. Once the child has stored the name, it stays; however, the set of exemplars it refers to changes as the infant gains experience.
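Class inheritance in programming mirrors this dependent meaning rather directly; a minimal sketch, with the particular properties invented for illustration:

```python
# Inherited meaning as class inheritance: whatever is true of every Animal
# is true of every Dog without restating it; Dog adds its own predicates.
class Animal:
    alive = True
    def moves(self):
        return True

class Dog(Animal):          # Dog inherits part of its meaning from Animal
    def barks(self):
        return True

rex = Dog()
print(rex.alive, rex.moves(), rex.barks())   # True True True
```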
Summarising so far, we can create meaning either by making new sentences or by making new words.
Examples of making new sentences are-
(a) combining sets of existing words into new sentences
(b) using subject-predicate pairs to add new elements to existing sets- eg 'Trevor has a blue shirt'.
Examples of making new words are-
(c) EXPLICITLY- adding new words to existing dictionaries, by using sets of sentences to create the dictionary meanings of new words
(d) IMPLICITLY- uttering new statements in groups, all containing the new word in context. Essentially we are collecting sets of different semantic voids (unknown properties) and giving the set of these voids (the intersection set of all the predicates) a single new name. Both routes are sketched below.
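A sketch of routes (c) and (d), with meanings modelled as sets of properties; the words 'zorp' and 'glim' and their properties are invented for illustration:

```python
# (c) EXPLICIT: a new word is added to the dictionary as a set of defining
# sentences. (d) IMPLICIT: the new word's meaning is inferred as whatever is
# common to all the contexts in which it appears- an intersection of
# property sets.
dictionary = {}

# Explicit definition: defining sentences in, entry stored.
dictionary["zorp"] = {"is an animal", "has four legs", "is kept as a pet"}

# Implicit definition: each utterance of 'glim' comes with observed
# properties; the meaning is the intersection of what held every time.
contexts = [
    {"red", "round", "edible"},
    {"red", "round", "small"},
    {"red", "round", "shiny"},
]
dictionary["glim"] = set.intersection(*contexts)

print(dictionary["glim"])   # the set of 'red' and 'round' (order may vary)
```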
If we take a step back, we see that all these activities are exactly the ones we perform when we program computers. This evidence therefore suggests that language and programs are essentially the same. Both are means of constructing new patterns out of memory types, where those memory types are themselves patterns (recursion). Furthermore, language represents a categorical (content-addressable) memory system, whereas numbers represent a sequential or metric (context-addressable) memory system. Because language is recursive, the memory patterns being managed and manipulated may themselves be other memory addresses. This kind of operation resembles the use of pointers in programs, where call by reference goes back to a single memory pattern (a symbol), compared to the use of arrays, which hold multiple copies of memory patterns (tokens).
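A small sketch of the symbol/token contrast in program terms, using Python's reference semantics; the word lists are invented for illustration:

```python
# Reference versus copy: two names bound to ONE memory pattern behave like
# two uses of a symbol; independent copies behave like tokens.
symbol = ["dog"]            # one memory pattern
alias = symbol              # call by reference: a second name, same pattern
alias.append("canine")
print(symbol)               # ['dog', 'canine'] - the one pattern changed

tokens = [list(symbol) for _ in range(3)]   # three independent copies
tokens[0].append("pet")
print(tokens[1])            # ['dog', 'canine'] - other tokens unaffected
```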

© 2019 mirodyer@icloud.com