Brains of Sand: 1. Linguistic Computation

What is language? How does it form mental representations?

This website describes a MOM (model of mind) called GOLEM (Goal-Oriented Linguistic Emulation of Mind), which has as its physical substrate a cybermatic* architecture called the TDE (a recursive acronym for 'TDE Differential Engine').

The core idea behind GOLEM is a much wider conception of what language is. Far from being just a communication system, in the GOLEM view language is a UNIVERSAL COMPUTATION TEMPLATE, similar to a Turing Machine. This position is an extension of the well-known Language of Thought (LOT) hypothesis popularised by Jerry Fodor, and is essentially the same idea as that proposed earlier by Noam Chomsky. Fodor's theory posits the existence of 'mentalese', a symbolic system physically realised in the brains of the relevant organisms (cf. Chomsky's i-Language and ULM).

Formally speaking, language is a hybrid (inherited/learned) behaviour, consisting of a syntactic operational structure acting on a semantic organisational framework. That is, we use the abstract concept of semantics to describe linguistic states, and the abstract concept of syntax (grammar) to describe transitions between these (meaningful, content-bearing) states.

As a representational system, language obtains its descriptive power from the following principle- 'Infinite productivity from finite means'. This phrase reminds us that the overwhelming majority of sentences ever uttered are unique.

First, we examine words (morphological or categorical semantics, cf. the more familiar generative semantics) as a coding system in their own right.

Words have semantic (meaning-bearing) properties, which explains the existence and usefulness of dictionaries. Rather than each individual language user making up their own complete set of words, infants are exposed to a kernel or common core of objective (existential) and subjective (experiential) words, roughly split into types of things (nouns and adjectives used to label structures in terms of property combinations) and classes of processes (verbs and adverbs used to label operations according to methodological predicates). 

Each word's semantic exemplar set acts upon every other's, so that a unique solution can be obtained when one is needed, for example for the purposes of mental imagery. However, this is rarely required, because language is itself a categorical computation system: the semantic values it computes are not exact renderings of real-world situations and occasions, but stereotypical, non-visual descriptions.

In a very simplistic but (hopefully) easily appreciated way, each sentence forms a semantic 'neighbourhood': a compressed zone which exerts a sort of 'peer-group pressure' upon all its resident words, reducing each word's lexico-semantic properties so that its in situ meaning becomes restricted to a subset of the full range described in the dictionary. For example, the first dictionary entry for the word 'table' refers to the entire class of flat-topped, four-legged items of furniture. Used in a specific sentence, this extremely broad field of semantic denotation is drastically reduced to (for example) the singleton item of furniture referred to in this individual utterance. In terms of simple graphical visualisations, we can imagine a Venn diagram which has as many regions as constituent words. Each word does not have a precise 'meaning' per se, but is in reality a category label, referring to a set of valid exemplars**. While intersecting (ANDing) two or three such word-sets is not enough to define a situation/occasion exactly, most sentences contain between five and ten words, representing ample opportunity for Boolean targeting of this kind.
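
As a minimal sketch of this Boolean targeting (the word list and exemplar sets below are invented for the example, not drawn from GOLEM itself), in situ meaning can be modelled as plain set intersection:

    # A toy model of Boolean semantic targeting: each word is a category
    # label denoting a set of valid exemplars; a sentence ANDs them together.
    # All exemplar sets here are hypothetical placeholders.
    lexicon = {
        "table":   {"kitchen_table", "desk", "card_table", "league_table"},
        "wooden":  {"kitchen_table", "desk", "chair", "spoon"},
        "kitchen": {"kitchen_table", "spoon", "kettle"},
    }

    def in_situ_meaning(sentence_words):
        """Intersect (AND) the exemplar sets of every known content word."""
        sets = [lexicon[w] for w in sentence_words if w in lexicon]
        return set.intersection(*sets) if sets else set()

    # Three broad categories narrow to a single exemplar.
    print(in_situ_meaning(["wooden", "kitchen", "table"]))  # {'kitchen_table'}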

This reinvention of morphology in turn permits us to adopt a more purist, semantics-centred view of Chomskian linguistics, as depicted in figure 1.1 (c). Spoken and written formulations of language are called syntax. Syntactic structures (sentences) and substructures (phrases, clauses) derive from hierarchical (recursive) ortholinguistics (a.k.a. tree-structured grammars). Once regarded purely as data structures, syntagmatic symbol sequences (eg sentences) are more properly viewed as procedural computations, functionally equivalent to preferred transitions between the semantic states of the underlying linguistic state automaton (LSA). Syntactic operations can be universally and unambiguously defined by a type of finite-state mechanism called a Mealy Machine, a ROM (read-only memory) based automaton in which the next system state depends on BOTH the previous system state AND the current input symbol.
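
As a minimal sketch (the states, symbols and outputs below are invented for illustration, not taken from the TDE), such a Mealy machine reduces to a read-only transition table mapping (previous state, input symbol) pairs to (next state, output) pairs:

    # Toy Mealy machine: the next state and the output depend on BOTH the
    # previous system state AND the current input symbol. All states and
    # symbols here are hypothetical placeholders.
    TRANSITIONS = {  # (state, input) -> (next_state, output)
        ("S0", "a"): ("S1", "x"),
        ("S0", "b"): ("S0", "y"),
        ("S1", "a"): ("S1", "y"),
        ("S1", "b"): ("S0", "x"),
    }

    def run_mealy(inputs, state="S0"):
        outputs = []
        for symbol in inputs:
            state, out = TRANSITIONS[(state, symbol)]
            outputs.append(out)
        return outputs

    print(run_mealy("abba"))  # ['x', 'x', 'y', 'x']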

*Cybermatic (cybermaton/cybermata) is a neologism, but a necessary one. It is a variant of cybernetic (feedback) systems, describing those which are functionally enhanced by the inclusion of a command (feedforward) element at the top level (Marr's 'computational' layer).

** 'The search for the type of meaning that is associated with a linguistic expression in advance of an actual occasion of use, i.e. at the input stage, should not be defined as a search for something precise and immutable, but as something that constrains variation.' - Harder, P. (2009) 'Meaning as Input: The Instructional Perspective', in Evans, V. & Pourcel, S. (eds.), New Directions in Cognitive Linguistics.

^from Harder, P. (2009) 'Meaning as Input: The Instructional Perspective' (op. cit.).

The GOLEM language model is unusual, but otherwise conforms to commonsense notions of everyday language use. Firstly, in the GOLEM worldview, sentences obtain their nominal semantic properties from the words they contain, or occasionally from words that appear in adjacent sentences (eg in the case of anaphora). Sentences are procedural computations which bridge the semantic gap between words (semantic atoms) and declarations (semantic compounds: meaning states which are indexed by statements, defined in turn as paragraphs or narratives made from one or more sentences on the same topic, and/or spoken by the same subject).

This redefined form of morphology allows a more direct definition of semantics, one which avoids Chomsky's problematic, ill-defined concepts of shallow and deep language structures. While syntax defines permutations of words (regarded as morphosemantic atoms, or symbols), semantics is defined by morphosemantic combinatorics, ie by combinations of the word-symbols present in the sentence.

Combinatorics tells us that each combination (choice, satisfaction, chaos) of n things taken r at a time corresponds to r! permutations (placement, ranking, order) of its r members; equivalently, the total number of combinations is 1/r! times the number of permutations P(n, r) (see formula 1.a).

In the GOLEM/TDE framework, semantics is defined in two independent ways- 
(a) as a Moore^ machine, a synchronous 'animaton' (a chimera of 'animation' and 'automaton'), in which outputs depend only on the CURRENT system state, rather than on state plus input as in the Mealy machine (see the sketch below).
(b) as a combination of word-meanings, broken down into n! sub-semantic permutations.
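
For contrast with the Mealy sketch above, here is an equally minimal Moore machine (states again invented for illustration): the output is attached to the state itself, not to the transition.

    # Toy Moore machine: the output depends only on the CURRENT state; the
    # input drives transitions but does not directly select the output.
    STATE_OUTPUT = {"S0": "quiet", "S1": "loud"}            # hypothetical
    NEXT_STATE = {("S0", "a"): "S1", ("S0", "b"): "S0",
                  ("S1", "a"): "S1", ("S1", "b"): "S0"}

    def run_moore(inputs, state="S0"):
        outputs = [STATE_OUTPUT[state]]      # the initial state's output
        for symbol in inputs:
            state = NEXT_STATE[(state, symbol)]
            outputs.append(STATE_OUTPUT[state])
        return outputs

    print(run_moore("ab"))  # ['quiet', 'loud', 'quiet']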

Formula 1.a:   C(n, r) = P(n, r) / r! = n! / (r! (n - r)!)
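
As a worked instance (the five-word list is an invented example), Python's standard itertools confirms the ratio: with n = 5 words taken r = 3 at a time there are P(5, 3) = 60 ordered permutations but only 60/3! = 10 unordered combinations.

    from itertools import combinations, permutations
    from math import factorial

    words = ["the", "old", "dog", "bit", "man"]   # hypothetical: n = 5
    r = 3
    perms = list(permutations(words, r))          # ordered selections
    combs = list(combinations(words, r))          # unordered selections
    assert len(perms) == 60
    assert len(combs) == len(perms) // factorial(r) == 10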

Is thought inner speech? Is speech outer thought?

There are two canonical features of the GOLEM model -
(1) Goal-orientation: target acquisition via drive-state reduction (comparable to the feedback of an error signal obtained by gradient-detection methods)
(2) Duplex (bidirectional) tri-level architecture. 

Goal-orientation is a feature often found in 'agent' architectures. Goal-seeking (drive-state reduction) mechanisms are called cybermata within the scope of this research. GOLEM theory constructs more accurate, but simultaneously more widely applicable, definitions for otherwise familiar linguistic terms like semantics, syntax, &c. In this sense, GOLEM theory completes the work of Fodor and Chomsky. For example, the definitions of syntax and semantics presented in subsequent sections represent far more complete and useful concepts, with respect to both denotative and connotative aspects, than previously available.

These improved conceptualisations of syntax and semantics are much more robust than the previous definitions, and therefore can be freely applied to both internal (memory and thought) and external (speech and text) representations and referential contexts. 

The 'subjective stance' is a core component of the model

While the use of the word 'subjective' is a given in the humanities, it is not normally considered necessary, or indeed even acceptable, in 'hard science' (eg robotics and cognitive science) circles. Therefore an important part of the GOLEM discussion is presenting a precise definition of subjectivity that is firmly grounded in physical science, yet captures every aspect of experiential (a.k.a. virtual, phenomenal or psychophysical) states.

In the new thematic space^^ carved out by GOLEM theory, axiomatically 'objective', or 'global', information types are considered highly suspect. A strict, almost hygienic form of subjectivity becomes necessary. Its axiomatic application is called 'the subjective stance'. Many pre-computed (feedforward) mathematical models represent a violation of this informatics quarantine, also known as the 'no gods or globals' rule.

^^a non-cartesian (morphological or categorical) space with discrete-valued (Hamming), not real-valued (Euclidean) distance metrics.

Figure 1.1

Summarising the chain of reasoning thus far-
Every brain is linguistic, ie it possesses capacities for the syntactic reproduction, semantic representation and accumulation of knowledge. It is the biological drive to minimise moment-to-moment differentials in semantic drive-states at each of the major architectural levels which acts as the continuous generator of new and relevant behaviours, via the intermediate creation of cognitive strategy and tactics, giving rise to the subjective experiences that are familiar to all of us as our thoughts and imagery*. This insight allows us to-

(a) use a cognitive framework (eg that of Marr) to analyse language. In particular, we use Marr's top-down concept of the overall purpose (teleology, or 'computational' function) of a whole system, or of a sub-stage of a multipart system, to realise that language is, overall, a vehicle for the transfer and implementation of both procedural and declarative sub-types of knowledge, between subjects (INTERsubjective) or within a subject (INTRAsubjective).

(b) use a linguistic framework (eg that of Chomsky/Montague) to analyse cognition, using the concepts of semiology (symbols, syntax and semantics) to analyse cognitive and sub-cognitive systems: (i) use of semantics (meaning, facts, truth-states, true statements) to analyse memory; (ii) use of narrative to analyse consciousness.

We now have enough accumulated evidence to support a remarkable proposal - that the building blocks of thought are word-like in their nature because thought itself is made from words, or word-like units of meaning (semantic atoms).

In his paper*, Eran Asoulin investigates this proposal, remarking that "...the underlying mechanisms of language are in fact structured in a way to maximise computational efficiency, even if it means causing communicative problems." His assertion is that the communicative function of language is secondary to its primary, cognitive role.

*The accessibility of thoughts and imagery to introspective analysis at any one moment varies with the level of consciousness and pre-conscious volition, which in turn vary with the overall size and character of the mission-aggregate task demands. Subconscious processing is parallel, and therefore more efficient. Its performance is not a monotonic function of load (number of independent attentional foci), as is the case with conscious, serial processing; instead it represents a constant, rather than variable, computational performance bottleneck. This was discovered in 1977 by Schneider and Shiffrin (SS77), who demonstrated that brains and computers behave identically w.r.t. serial and parallel loading. By applying the Occam's Razor/Peirce's Hook Hybrid Heuristic Universe Simplifier (ORPH3US) to this observation, we conclude that the simplest, and therefore best, explanation of the observations from SS77 is that the brain is a biocomputer whose SSDPU (subjective semantics differential processing utility) switches in a command/demand-sensitive manner between ranked and unranked items in short-term visuospatial memory. This is identical to the behaviour of the software (ie the process hierarchy created by the OSX runtime executive) of the computer into whose display these very words are being typewritten.

*Asoulin, E. (2016) Language as an instrument of thought

Fodor, Pinker and Chomsky all believe that cognition is linguistic. The easiest way of arranging this state of affairs is to combine two existing rules - Occam's Razor and Peirce's Hook^ - into the Hybrid Heuristic Universe Simplifier, or ORPH3US, resulting in the following hypothesis- that there exists a 1:1 correspondence between (a) cognitive constituents at various conceptual levels (eg brain, mind, self) and (b) linguistically similar elements at the same or equivalent levels (ie symbols, syntax, semantics).

At the lowest level of this proposed correspondence, cognitive building blocks are hypothesised as being equivalent (in the architectonic** sense) to linguistic building blocks, ie words.

But Marr has shown us how a three-layer hierarchical scheme can be used to organise (ie build, modify) and operate (ie manage, control) any information processing task and/or subsystem.

It remains only to realise that these two threads are but two facets of the very same idea. Language is made from a trilayer precisely because it is the software form of the layered hardware artefact. Language is the software (thoughts, speech) which runs upon the hardware (brain lobes, neural circuits). The main adjustment to this 'obvious' hypothesis is understanding the fractal (recursive, multiple size scales) nature of the entities described, resulting in architectonics which are mathematically dependent upon recursion and recursive (self-embedded) relationships and identities.

^Peirce's Hook is a neodigmatic name (= the name of the philosopher who invented the concept + a sharp tool to cut to the bone) for the logical process originally called abduction by its inventor, the 19th-century pioneer of the philosophy of language Charles Sanders Peirce. For rather obvious reasons, this evolved into the much less felonious-sounding adduction (short for adducive logic/inference). The latest alias for this kind of reasoning is retroduction, so named because the inference acts in reverse, that is, upstream, against the arrow of time: given downstream effects (a set of observations), choose those upstream causes which, acting in context (situation, mechanism and interventions), would make these data unremarkable, obvious, expected.

**When two things are declared similar, the similarity is usually either structural (shape, form) or functional (purpose, process). We conventionally define architecture as a hybrid of these two viewpoints- eg medieval cathedrals must have celestial aesthetics, but they must also embody the principles of terrestrial mechanics. Hence we can use this word (architectonics) to denote such unification.

Marr Trilayer

The term 'computation' has a special meaning in AI circles, a meaning given to it by MIT neuroscientist David Marr. Marr intended it to have a decidedly teleological connotation, in addition to its conventional (denotative) meaning. He defined the computation performed by a neural circuit or module as the information processing task that the module performs, independent of the method used. This definition is quite close to what is meant by 'semantics'. It also shares common ground with declarative programming, in the following way: the relationship between the Marrian computation (the data transformation between start and end) and the Marrian method (the algorithm) is similar to the relationship between the declarative and procedural programming paradigms.
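
A standard illustration of that paradigm contrast (the example is ours, not Marr's): the same computation, first stated declaratively as WHAT is wanted, then spelled out procedurally as HOW to obtain it.

    # Declarative: specify the result (squares of the even digits).
    evens_declarative = [n * n for n in range(10) if n % 2 == 0]

    # Procedural: specify the step-by-step method for building it.
    evens_procedural = []
    for n in range(10):
        if n % 2 == 0:
            evens_procedural.append(n * n)

    assert evens_declarative == evens_procedural == [0, 4, 16, 36, 64]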

We now refer to this dictum as the Marr Trilayer (a.k.a. Marr Trilevel hypothesis).  

The three layers (or levels) are-

(i) The computational level - what the device, or subsystem does, its overall transformative purpose or function. At this level, the question 'what' (ie for what purpose) is addressed- what does it do***.

(ii) The algorithmic level (sometimes referred to as the representational level, although this term can be confusing in some contexts). This level describes 'how' the computation is achieved.

(iii) The implementational level (sometimes referred to as the physical level, although, again, this may be confusing in some cases). This level describes 'which' components/ substrate/ platform (building blocks, organisational framework or unit resources) are chosen to perform the function.
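
As a hedged illustration (the task is ours, not Marr's), the independence of the three levels can be seen in something as mundane as sorting: one computational-level specification, two different algorithmic-level realisations, with the Python runtime on a CPU serving as the implementational level.

    # One computational-level task (produce the sorted version of a list),
    # realised by two different algorithmic-level choices.
    def insertion_sort(xs):                      # algorithm 1
        out = []
        for x in xs:
            i = 0
            while i < len(out) and out[i] < x:
                i += 1
            out.insert(i, x)
        return out

    def merge_sort(xs):                          # algorithm 2
        if len(xs) <= 1:
            return list(xs)
        mid = len(xs) // 2
        left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
        merged = []
        while left and right:
            merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
        return merged + left + right

    data = [3, 1, 4, 1, 5]
    assert insertion_sort(data) == merge_sort(data) == [1, 1, 3, 4, 5]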

Paleofunctional roots (was trilayer invented by Trilobytes??)

Marr intended the term 'computation' to mean systematic changes in abstract (higher level) states. With a little imagination we can see that such a view is consistent with having evolved from basic navigation. Instead of ascribing abstract intentional states to extant ganglionic circuit semantics, we suggest that a simple arthropod or tetrapod would have used such neurosemantic states to keep track of more concrete states, such as changes in location (navigation) and landscape feature recognition (mapping). 

With changes in location being tracked at the topmost (computational) Marr level, the actual paths chosen (lists of subgoal points) would then be tracked at the middle (algorithmic) level, leaving the lowest (implementational) level with the task of storing the recognisable features of each of the known locations. The species could then evolve multiple simultaneous maps. Later, additional maps might evolve which connect non-contemporaneous (past and future) locations, and, as cortical capacity grows, brains might evolve new types of topologies capable of spanning other kinds of (eg social) spaces.

John Hughlings Jackson had a very similar idea to Marr, as is shown by his neurological fault-finding checklist (allowing for its inverted order, of course), which is reprinted below-

  1. Disease of Tissue. (Changes in tissue)
  2. Damage of Organs.
  3. Disorder of Function.

Jackson developed a corresponding three-level theory of brain and mind. He claimed that the nervous system is an evolutionary hierarchy of three levels connected by the process of weighted ordinal representation (WOR). This idea preceded the TDE concept of fractal brain structure by over a century.

Jackson was the first person to realise that there is no 'magic' division between lower order sensorimotor control circuits and higher order consciousness ones. Furthermore, he stated clearly why there was no need for one. He said that the brain was a sensorimotor machine in which higher functions evolved by means of combinations of lower ones. His concept of cerebral evolution followed from, and was influenced by, that of Herbert Spencer, the man who (infamously) coined the term 'survival of the fittest'. Jackson's view remains a profoundly modern one, an approach we now routinely label as 'systems-level', without realising what an astounding leap forward it must have been at the time. 

In figure 1.1, Marr's trilayer is compared to Hughlings Jackson's WOR.

*The intellectually precocious might ponder a possible connection between Marr's 2-1/2D concept and Mandelbrot's fractional dimensionality.

**Later, Marr's collaborator, Tomaso Poggio, added a fourth layer, to which he attributed the learning function.

***In some contexts, the question 'why' is appropriate- 'why' (ie for what reason) is this system chosen, what question does it provide the answer to.

Like Marr, the linguist Noam Chomsky rose to prominence at the Massachusetts Institute of Technology (MIT). Chomsky's research was concerned with language learning in newborn children and infants. Even when infants are raised in very information-poor environments (so-called poverty-of-stimulus conditions) they still acquire language normally and reach specific milestones at the expected ages. Chomsky concluded that language was not something that infants learned from an outside source; rather, it was an external expression of innate patterns that they were born with.

This means that language forms a kind of window into the very structure of thought itself. Thought (ie cognition) and speech are therefore both linguistic phenomena.  In other words, language is something that emerges from, and is a reflection of, the brain's functional architecture itself. Language is an intrinsic property of the brain, not an extrinsic one.

As stated in section 1 ('anatomy'), reverse engineering is essentially a top-down (ie 'outside-in') process. Therefore, we treat the linguistic-cognitive system initially as a 'black box' and ask- what part of language do we learn in infancy, and what part do we know instinctively? The 'natural' place to put a boundary is between syntax, which is learned, and semantics, which is not. This prediction is a special case of the general principle that learned skills will vary between human cultures, but inherited instincts will be universal.

While the syntax of each human language group consists of a set of vocalisations usually* learned in infancy, the ability to use language itself is not a skill that can be learned. It is not something that is imposed from outside; it is an instinct** which imposes itself upon each human's thinking. Language capability is similar to an instinct in 'lower' animals.

This is such a crucial fact that it bears repeating. While the basic phonetics, vocabulary (lexicon) and grammar (syntax) of any particular language are obviously acquired by children in infancy, their semantic capacity (ie knowing what to do with the many hundreds of separate vocal sounds they learn) is not learned (or even learnable***) in early childhood. Rather, phonetic cues from the local linguistic culture act very early in an infant's life in such a way as to decide first the morphology, and later the syntax, of the language spoken by its embedding culture.

So if the lexicon and syntax are learned, what part is inherited? The answer is, in short, the language's (combinational) semantics****. The semantics of all human languages display a tripartite, recursive format which is highly conserved across similar language groups.

*obviously not so when older children and adults learn a second language

**see 'The Language Instinct' by Steven Pinker

***Chomsky (and others) demonstrated that such learning could not have occurred using a 'poverty of stimulus' ('paucity of input') argument.

****This tripartite structure for English is SVO, which stands for Subject-Verb-Object. Other languages have other tripartite orderings - German is usually described as SOV. However, the number of these symbolic groups is always three, and there is one each of Subject, Verb and Object, whatever the order.

In spite of Chomsky's reputation, he made what turned out to be a huge blunder, one which effectively ended his career as the leading language theorist. He came to believe that a universal set of syntax rules could be applied to all human languages, rules which not only govern the surface structure, but also extend to the so-called 'deep' (ie more invariant) structure. Chomsky insisted that semantics depended on syntax, and never altered his ideas in spite of competing alternative viewpoints such as the theory of Richard Montague.

The 'classic' way to compare the approaches of the two competing theorists is via the abstraction of grammar types. Put simply, if Chomsky's view prevails, language possesses a 'constituency' grammar, ie we can decompose it using phrase structures (hence the alternative name, phrase structure grammar or PSG). Montague's proponents, on the other hand, believe that language is ordered via a 'dependency' grammar, in which the contextual inter-relationships hold between the actual words that are said, and not between imaginary entities like phrases, whose existence is indirectly inferred, not observed directly.
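
The contrast can be made concrete with a toy example (both analyses drastically simplified, and ours rather than either theorist's): the same sentence encoded as a phrase-structure tree and as a list of head-dependent relations.

    # 'the cat sat' under the two rival analyses.
    # Constituency (Chomskian): nested phrases built over the words.
    constituency = ("S",
                    ("NP", ("Det", "the"), ("N", "cat")),
                    ("VP", ("V", "sat")))

    # Dependency: relations between the actual words only.
    dependency = [("sat", "cat", "subject"),     # head, dependent, relation
                  ("cat", "the", "determiner")]

    print(constituency)
    print(dependency)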

TDE theory has resolved this conflict by siding with Richard Montague (see www.ai-fu.yolasite.com). However, it points out that the use of the term 'dependency grammar' (see figure 3) is somewhat oxymoronic, since the word 'grammar' is equivalent to 'syntax'.

Richard Montague believed that English is no more complicated in principle than a computer program, being composed of a sequence of recursively constructed sentences^, which are meaningful within a hierarchical framework of predicate* logic. He then formulated this belief, and proved it in its basic form, using mathematics and the lambda calculus, building on the work of Rudolf Carnap (of the Vienna Circle), which in turn relied upon the controversial theories of Carnap's teacher, Gottlob Frege**.
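
In the Montague spirit, here is a drastically simplified sketch (the denotations are invented): word meanings are functions, and sentence meaning is obtained by function application, as in the lambda calculus.

    # Toy Montague-style composition: predicates are functions from
    # entities to truth values; quantifiers take predicates as arguments.
    SLEEPERS = {"john", "tabby"}                  # hypothetical denotation

    def sleeps(x):                                # predicate: e -> t
        return x in SLEEPERS

    def every(noun_set):                          # generalised quantifier
        return lambda pred: all(pred(x) for x in noun_set)

    print(sleeps("john"))                         # True:  'John sleeps'
    cats = {"tabby", "kitten"}
    print(every(cats)(sleeps))                    # False: 'every cat sleeps'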

Montague found that the semantics of English could be reduced to a recursion of subject-predicate dyads. It will be assumed here, but not proven, that such a recursion is equivalent to a Peircean implication hierarchy, ie a hierarchy of logical subset relations***.

The importance of this chain of reasoning cannot be overemphasised, for it implies that not only knowledge but also language is built from a hierarchy of logical subsets. This conforms with the commonsense notion that all nouns can be naturally nested within subsets, and subsets of subsets, etc - a pattern which closely matches the GOLEM's internal 'hierarchitecture'.
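
A minimal sketch of such a subset hierarchy (the taxonomy itself is an invented toy example): each noun category is nested inside its superset, and membership is recovered by walking the chain of implications.

    # Toy implication hierarchy: nested subset relations among nouns.
    taxonomy = {
        "entity": {
            "furniture": {"table": {}, "chair": {}},
            "animal": {"dog": {}, "cat": {}},
        }
    }

    def ancestors(tree, target, path=()):
        """Return the chain of supersets leading down to target."""
        for node, children in tree.items():
            if node == target:
                return path + (node,)
            found = ancestors(children, target, path + (node,))
            if found:
                return found
        return ()

    print(ancestors(taxonomy, "table"))   # ('entity', 'furniture', 'table')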

*first-order logic (FOL/predicate logic) is more useful than propositional logic (zeroth-order logic/ZOL) because it can be used to analyse the internal structure of propositions. A proposition can be treated as a mechanism, ie separated into logical parts. These are usually called subject and predicate.

^(Imperative) statements and computer programs are directly comparable in many cases - consider the following example of 'structured programming':
Serve (sprinkle with lemon (grill (add salt and pepper (slice (salmon))))). The bottom-up paraphrase is 'take a salmon, slice it, add salt and pepper, put the slices on the grill and sprinkle them with lemon before serving'. - Harder, P. (2009) 'Meaning as Input: The Instructional Perspective', in New Directions in Cognitive Linguistics.
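
Harder's nested sentence can even be executed directly; in the following sketch (function names invented to mirror the paraphrase) evaluation proceeds bottom-up, innermost call first:

    # The imperative sentence as composed function calls.
    def salmon():      return "salmon"
    def sliced(x):     return f"sliced {x}"
    def seasoned(x):   return f"{x} with salt and pepper"
    def grilled(x):    return f"grilled {x}"
    def lemoned(x):    return f"{x}, sprinkled with lemon"
    def serve(x):      return f"serve {x}"

    print(serve(lemoned(grilled(seasoned(sliced(salmon()))))))
    # -> serve grilled sliced salmon with salt and pepper, sprinkled with lemon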

**It is surprising that Carnap followed Frege, since Frege's magnum opus, the 'Grundgesetze der Arithmetik', had been devalued by Bertrand Russell, who in 1902 famously found within it a contradiction, thereby discrediting its author.

***such as a computer's file system, consisting of nested categories (directories or folders)

Figure 1.2

The GOLEM linguistic computation model divides (as a first-order approximation) cognitive processing into efferent (syntactic reproduction) and afferent (semantic representation) information channels. The efferent channel structure is derived from a syntactic/constituency (phrase) based viewpoint, as per Chomsky, while the afferent channel structure is derived from a dependency (predicate) based viewpoint, as per Montague. Inter-channel interactions are ignored at this (first-order) level of modelling.

In terms of the 'what' of computation, this type of analysis demonstrates the usefulness of the GOLEM biomachine, in which (minimally) two sets of trilayers exist, as sketched below-
(a) input-side trilayers: thresholded events ('stimuli') are combined into perceptual objects, often but not always corresponding to real physical objects. The combination process occurs via a bottom-up hierarchy with an ascending information flow direction.
(b) output-side trilayers: system-cybernetic differentials ('goals') are resolved into conceptual actions (scripts, corresponding to response sequences or skills), often but not always corresponding to memory 'chunks'. The resolution process occurs via a top-down hierarchy with a descending information flow direction.
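
A deliberately schematic sketch of this duplex arrangement (every name below is ours, purely illustrative): the afferent channel folds events upward into a percept, while the efferent channel unfolds a goal downward into a script.

    from functools import reduce

    # Afferent (bottom-up): thresholded events combine into one percept.
    def combine(a, b):                       # placeholder combination rule
        return f"({a}+{b})"

    events = ["edge", "colour", "motion"]
    percept = reduce(combine, events)        # '((edge+colour)+motion)'

    # Efferent (top-down): a goal differential resolves into a script.
    def resolve(goal):                       # placeholder decomposition rule
        return [f"{goal}.step{i}" for i in (1, 2, 3)]

    script = resolve("reach_target")
    print(percept, script)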

The best terminology, ie the least confusing and ambiguous, ties the term 'declarative' to the topmost, or 'what', level of the Marr trilayer, while the term 'procedural' is tied to the middle, or 'how', level - the layer usually called 'algorithmic'. To complete the analogy, a term is needed for the lowest layer (sometimes called the 'which', or selection, layer), containing the system's set of hardware devices and software assets, lumped together as its resources. The result is as follows-

top    ... computational       ... declarative   ... 'what'
middle ... algorithmic         ... procedural    ... 'how'
bottom ... resources/substrate ... combinational ... 'which'

We also understand from Chomsky's work that there is a close relationship between this arrangement and language, as follows-

top    ... semantics ... meaning    ... culture-independent
middle ... syntax    ... grammar    ... culture-specific
bottom ... lexicon   ... vocabulary ... individual/cohort-specific

Chomsky demonstrated that language is not external to, and merely a corollary of, the brain's cognitive processes, but internal to, and intimately implicated in, them. Following Occam*, the simplest way for this to be so is for the Marrian and Chomskyan trilayers to be identical. The three resultant equivalences each conform to our common-sense notions of both mechanism and semiosis**.

semantics ... computational       ... 'what'  ... teleology
syntax    ... procedural          ... 'how'   ... mechanism
lexicon   ... resources/substrate ... 'which' ... selection

*While technologies and ideologies (eg computers) tend to be poor masters but good servants, the opposite is true for basic principles (eg Newton's Laws, Occam's Razor).

**Semiology is the study of meaning in all its forms.  

Figure 1.3

The developmental stages of the GOLEM (hierarchical/linguistic/mnemonic/duplex) biocomputer, from the original 4* fractal allomorph (TOP) to the GOLEM/GOES cognition engine (BOTTOM).

(a) - Hughlings Jackson's idea of 'weighted ordinal representation', or WOR, is symbolised by the font-size differences within each type of lobe. For example, the limbic type-L lobe has all four elements (L, T, F, P) but the L element is larger than the other three - this is why it functions as an L-type architectonic building block. Jackson's concept deserves full credit for its prescience, but unfortunately it has theoretical flaws that make it unsuitable as a basis for the depiction of real biofractal functional interrelationships.

(b) - While the upper diagram depicts DATA PROCESS (ie the inverted-U shaped data path), the lower diagram depicts DATA STRUCTURE (the duplex pair of hierarchical memories). The left hand channel's hierarchy contains REPRODUCTIONS (repeated patterns, tokens), while the right hand channel's hierarchy contains REPRESENTATIONS (unique patterns, symbols).

Summary: The GOLEM memory model and the GOES (Goals-Objects-Events-Scripts) declarative programming paradigm form a unified duplex mechanism, consisting of a complementary pair of computational constituents, as shown in figure 1.3.

© 2019 mirodyer@icloud.com