Notes on Clark Chapter 3
("Mind and World: The Plastic Frontier")
Econ 308: Agent-Based Computational Economics

Last Updated: 16 June 2006
Latest Course Offering: Spring 2006

Course Instructor:
Professor Leigh Tesfatsion
tesfatsi AT iastate.edu

Syllabus for Econ 308

Basic Reference:
Andy Clark, Being There: Putting Brain, Body, and World Together Again, MIT Press, Cambridge, MA, 1998 (paper), ISBN: 0-262-53156-9

Basic Concepts

Key Issues

1. Major lessons of ANN research?

(Clark, p. 58): "The major lesson of neural network research, I believe, has been to thus expand our vision of the ways a physical system like the brain might encode and exploit information and knowledge."

(Clark, p. 59): "(C)ognitive science can no longer afford simplifications that take the real world and the acting organism out of the loop -- such simplifications may obscure the solutions to ecologically realistic problems that characterize active embodied agents such as human beings. ... (A)bstracting away from the real-world poles of sensing and acting deprives our artificial systems of the opportunity to simplify or otherwise transform their information-processing tasks by the direct exploitation of real-world structure."

(Clark, pp. 59-60): "Artificial neural networks...present an interesting combination of strengths and weaknesses. (B)enefits accrue because the systems are, in effect, massively parallel pattern completers. ... (T)hey are not intrinsically well suited to highly sequential, stepwise problem solving of the kind involved in logic and planning... A summary characterization might be 'good at Frisbee, bad at logic' -- a familiar profile indeed. ... (ANNs) are fast but limited systems that, in effect, substitute pattern recognition for classical reasoning."

(Clark, p. 60): "(W)e are generally better at Frisbee than at logic. Nonethless, we are also able ... to engage in long-term planning and to carry out sequential reasoning. If we are at root associative pattern-recognition devices (like ANNs), how do we do it? ... One (factor) merits immediate attention. It is the use of our old friend, external scaffolding."

2. Mistaking Mind for the Brain Alone?

(Clark, p. 61): "The combination of basic pattern-completing abilities and complex, well-structured environments may thus enable us to haul ourselves up by our own computational bootstraps."

(Clark, p. 61): "(C)lassical rule-and-symbol based AI may have made a fundamental error, mistaking the cognitive profile of the agent plus the environment for the cognitive profile of the naked brain..."

(Clark, p. 62): "Not all animals are capable of originating (external scaffoldings), and not all animals are capable of benefiting from them once they are in place. The stress on external scaffolding thus cannot circumvent the clear fact that human brains are special. But the computational difference may be smaller and less radical than we sometimes believe."

(Clark, pp. 63-64): Jigsaw puzzle example. Humans solve jigsaw puzzles by "(p)icking up pieces, rotating them to check for potential spatial matches, and then trying them out... Imagine, in contrast, a system that first solved the whole puzzle by pure thought and then used the world merely as the arena in which the already-achieved solution was to be played out. This crucial difference is nicely captured by David Kirsh and Paul Maglio (1994) as the distinction between pragmatic and epistemic action."

(Clark, p. 64): "(T)he classic (cognitive science/AI) image bundles into the machine a set of operational capacities which in real life emerge only from the interactions between machine (brain) and world."

(Clark, pp. 65-66): "(E)xternal structures (including external symbols like words and letters) are special insofar as they allow types of operations not readily (if at all) performed in the inner realm. A more complex example (than Scrabble) is performance on the computer game Tetris. ... (In) the case of Tetris the internal and external operations must be temporally coordinated so closely that the inner and outer systems (the brain/CNS and the on-screen operations) seem to function together as a single integrated computational unit." (NOTE: CNS = Central Nervous System)

(Clark, pp. 68-69): "It is (the) methodological separation of the tasks of explaining mind and reason (on the one hand) and explaining real-world, real-time action taking (on the other) that a cognitive science of the embodied mind aims to question. ... (H)uman reasoners are truly distributed cognitive engines: we call on external resources to perform specific computational tasks, much as a networked computer may call on other networked computers to perform specific jobs. ... The true engine of reason, we shall see, is bounded neither by skin nor skull."

Questions Posed by Moderators During In-Class Discussion

1. Why is mean-squared error (MSE) used in supervised training of ANNs?

As Patrick Jordan correctly noted, the use of squared errors (error = discrepancy between desired and actual output at some output node) protects against positive errors cancelling out negative errors. Note, however, that a similar protection against cancellation would also be provided by the use of mean *absolute* error (MAE), in which the magnitudes of the errors are summed without regard for their signs. Each of these "penalty cost" functions implicitly imposes a different penalty structure on errors. The use of squared errors in MSE penalizes large errors (discrepancies) much more heavily than small errors, in a nonlinear manner. In contrast, the use of MAE penalizes errors linearly, in proportion to their magnitude.

A wide variety of other penalty cost functions might also be considered. A "Bayesian" statistician would say that the exact choice of penalty cost function should be tailored to the problem at hand, reflecting the user's relative "disutility" for different error patterns in a particular problem context. A "classical" statistician might prefer to work with MSE because of an underlying presumption that the errors are random variables with a known "normal" (bell-shaped) distribution, in which case the MSE (a sum of squared normally distributed random variables) would itself be a random variable with a known distribution.
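
As a rough numerical illustration (the error values below are made up for the example), the following Python sketch shows both the sign-cancellation problem with raw mean error and the different penalties that MSE and MAE impose on the very same errors:

    import numpy as np

    # Hypothetical discrepancies between desired and actual node outputs.
    errors = np.array([0.1, -0.1, 0.5, -2.0])

    print("raw mean error:", np.mean(errors))    # signs partially cancel
    print("MSE:", np.mean(errors ** 2))          # the large error dominates
    print("MAE:", np.mean(np.abs(errors)))       # linear in the magnitudes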

2. Is SUPERVISED training necessary for ANNs?

ANNs can range from highly supervised to highly unsupervised. More precisely, ANNs can range all the way from being completely hard-wired (connections and connection weights all pre-specified by a human designer) to being totally self-organizing (connections and connection weights can all freely adapt in response to successive inputs). Self-organizing ANNs exhibit what is referred to as unsupervised training because there is no top-down controller guiding the self-organization process. A useful discussion of ANN training covering this full range is given by Stan Franklin in his well-known monograph Artificial Minds, MIT Press, Cambridge, MA, 1997.
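
As a minimal sketch of unsupervised adaptation (illustrative only; this is not an example from Franklin's book), the following Python fragment implements Oja's Hebbian learning rule for a single linear node. The weights adapt purely in response to the statistics of successive inputs, with no supervisor supplying desired outputs:

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic 2-D inputs whose variance is concentrated along one direction.
    inputs = rng.normal(size=(1000, 2)) @ np.array([[2.0, 1.5],
                                                    [0.0, 0.5]])

    w = rng.normal(size=2)              # initial connection weights
    eta = 0.01                          # learning rate

    for _ in range(5):                  # several passes over the input stream
        for x in inputs:
            y = w @ x                   # node output
            w += eta * y * (x - y * w)  # Oja's rule: Hebbian growth + decay

    print("learned direction:", w / np.linalg.norm(w))
    # The weights align with the dominant direction of the input data --
    # learned without any supervisor or desired-output signal.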

3. What happens if the supervisor training an ANN is wrong about something?

This is another interesting question. While it may be impossible for a particular ANN ever to learn the "correct" solution under a poor supervisor (e.g., a poorly designed penalty cost function), one could imagine that a population of ANNs with different types of supervisors might evolve to a point where only "good" supervisors remain active (survival of the fittest).
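
The following Python sketch caricatures this evolutionary idea (all numbers and names are assumptions for illustration, not a worked example from the readings): each "supervisor" holds a possibly wrong belief about the training target, fitness is scored against the true target the environment actually rewards, and selection gradually weeds out poor supervisors:

    import numpy as np

    rng = np.random.default_rng(1)

    TRUE_TARGET = 1.0               # what the environment actually rewards
    beliefs = rng.normal(TRUE_TARGET, 0.8, size=20)  # supervisors' (possibly
                                                     # wrong) training targets

    for generation in range(30):
        # Each supervisor trains its ANN toward its own belief; here the
        # trained network simply reproduces that belief.
        outputs = beliefs

        # Fitness: negative squared error against the TRUE target.
        fitness = -(outputs - TRUE_TARGET) ** 2

        # Keep the better half; refill with mutated copies of survivors.
        survivors = beliefs[np.argsort(fitness)[-10:]]
        children = survivors + rng.normal(0.0, 0.1, size=10)
        beliefs = np.concatenate([survivors, children])

    print("mean surviving belief:", beliefs.mean())  # close to TRUE_TARGET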

4. How many epistemic actions should be taken in any given situation?

This interesting question is closely related to several more traditionally posed questions arising in statistics and control theory generally. In statistical sequential decision making, a key question is when to stop sampling data and decide on an action (the "optimal stopping rule" problem). In control theory, the issue is when to stop collecting information about an incompletely understood system and to start exploiting the information you already have by choosing an action (control) conditional on this information in an attempt to achieve some optimization objective (the "dual control" problem).
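
As a simple illustration of a stopping rule (the threshold and data-generating process below are assumptions, not a canonical solution to either problem), the following Python sketch keeps sampling -- an "epistemic action" -- until the estimated mean is judged precise enough, then commits to acting:

    import numpy as np

    rng = np.random.default_rng(2)

    threshold = 0.05      # tolerated standard error of the estimated mean
    samples = []

    while True:
        # One more "epistemic action": draw another observation.
        samples.append(rng.normal(loc=3.0, scale=1.0))
        n = len(samples)
        if n >= 2:
            sem = np.std(samples, ddof=1) / np.sqrt(n)
            if sem < threshold:   # estimate now precise enough: stop and act
                break

    print(f"stopped after {n} samples; estimated mean = {np.mean(samples):.3f}")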

5. How many "processors" (nodes) do we need to model the human mind?

As various class members noted, this will presumably depend on the problem or situation we are considering. A number of complex adaptive systems (CAS) researchers are explicitly focusing on this type of issue in various guises.

Here's an interesting point. Under relatively weak assumptions, ANNs are known to be "universal approximators." For example, any continuous function that maps inputs into outputs (where the inputs are restricted to some closed bounded domain) can be approximated arbitrarily closely by a suitably constructed ANN. The difficulty is that this theorem is non-constructive: it says nothing specific about how to carry out the construction. In particular, it says nothing about HOW MANY hidden node layers or HOW MANY nodes per hidden layer would be needed to achieve a good approximation.
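
As a rough demonstration of this point (the target function and network sizes are illustrative assumptions), the following Python sketch fits one-hidden-layer networks with random tanh units, solving only for the output weights by least squares. The achievable approximation error typically shrinks as hidden nodes are added, but the theorem itself gives no recipe for choosing these sizes:

    import numpy as np

    rng = np.random.default_rng(3)

    x = np.linspace(-np.pi, np.pi, 200)    # closed bounded input domain
    y = np.sin(x) + 0.5 * np.cos(3 * x)    # continuous target function

    for n_hidden in (5, 25, 100):
        # Random input weights/biases for a single hidden tanh layer.
        a = rng.normal(scale=2.0, size=n_hidden)
        b = rng.uniform(-np.pi, np.pi, size=n_hidden)
        H = np.tanh(np.outer(x, a) + b)    # hidden-node activations

        # Fit only the output weights, by least squares.
        w, *_ = np.linalg.lstsq(H, y, rcond=None)

        max_err = np.max(np.abs(H @ w - y))
        print(f"{n_hidden:4d} hidden nodes: max |error| = {max_err:.4f}")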

6. Does the brain have a CPU?

Well, if it does, where might it be located? Brain scans suggest that widely dispersed areas of the brain tend to be activated when a person undertakes a task. Note that this question is distinct from the question of internal representations -- the latter could be supported by dispersed sensory inputs, the issue being whether the dispersed sensory inputs are ever "fused" into a single coherent mental representation of some aspect of the world (e.g., a spatial map, Rodney Brooks's Coke can, etc.).

7. And how do YOU play Tetris (if you do)?

Copyright © 2006 Leigh Tesfatsion. All Rights Reserved.