Connectionism is an approach to the study of human mental processes and cognition that utilizes mathematical models known as connectionist networks or artificial neural networks.
Connectionism has had many "waves" since its beginnings. The first wave appeared in 1943 with Warren Sturgis McCulloch and Walter Pitts, who focused on understanding neural circuitry through a formal, mathematical approach, and continued with Frank Rosenblatt, who published the 1958 paper "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain" in Psychological Review while working at the Cornell Aeronautical Laboratory. The first wave ended with the 1969 book by Marvin Minsky and Seymour Papert on the limitations of the original perceptron idea, which contributed to discouraging major funding agencies in the US from investing in connectionist research. With a few noteworthy exceptions, most connectionist research entered a period of inactivity until the mid-1980s. The term connectionist model was reintroduced in a 1982 paper in the journal Cognitive Science by Jerome Feldman and Dana Ballard.
The second wave blossomed in the late 1980s, following a 1987 book on Parallel Distributed Processing by James L. McClelland, David E. Rumelhart et al., which introduced several improvements to the simple perceptron idea, such as intermediate processors (now known as "hidden layers") alongside the input and output units, and the use of a sigmoid activation function instead of the old "all-or-nothing" function. Their work built upon that of John Hopfield, who was a key figure investigating the mathematical characteristics of sigmoid activation functions. From the late 1980s to the mid-1990s, connectionism took on an almost revolutionary tone when Schneider, Terence Horgan and Tienson posed the question of whether connectionism represented a paradigm shift in psychology and so-called "good old-fashioned AI", or GOFAI. Some advantages of the second-wave connectionist approach included its applicability to a broad array of functions, its structural approximation to biological neurons, its low requirements for innate structure, and its capacity for graceful degradation. Its disadvantages included the difficulty of deciphering how ANNs process information or account for the compositionality of mental representations, and a resulting difficulty in explaining phenomena at a higher level.
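For illustration, the following is a minimal Python sketch of the contrast between the old "all-or-nothing" activation and a second-wave unit that feeds a sigmoid hidden layer; the weights, biases, and inputs are arbitrary assumptions rather than values from any published model.

    import math

    def step(x):                      # "all-or-nothing" activation of the original perceptron
        return 1.0 if x >= 0 else 0.0

    def sigmoid(x):                   # smooth, differentiable second-wave activation
        return 1.0 / (1.0 + math.exp(-x))

    def unit(inputs, weights, bias, activation):
        # A single connectionist unit: weighted sum of inputs passed through an activation.
        return activation(sum(w * i for w, i in zip(weights, inputs)) + bias)

    x = [0.5, 1.0]                                   # arbitrary input pattern
    perceptron_output = unit(x, [0.8, -0.4], 0.1, step)
    hidden = [unit(x, [0.8, -0.4], 0.1, sigmoid),    # intermediate ("hidden") units
              unit(x, [-0.3, 0.9], -0.2, sigmoid)]
    output = unit(hidden, [1.2, -0.7], 0.05, sigmoid)
    print(perceptron_output, round(output, 3))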
The current (third) wave has been marked by advances in deep learning, which have made possible the creation of large language models. The success of deep-learning networks in the past decade has greatly increased the popularity of this approach, but the complexity and scale of such networks have brought with them increased problems of interpretability.
Most of the variety among the models comes from the interpretation of the units (for example, as single neurons or as groups of neurons), the definition of activation, and the learning algorithm used to modify the connection weights.
Hopfield networks had precursors in the Ising model due to Wilhelm Lenz (1920) and Ernst Ising (1925), though the Ising model as they conceived it did not involve time. Monte Carlo simulations of the Ising model had to await the advent of computers in the 1950s.
Donald Hebb contributed greatly to speculations about neural functioning and proposed a learning principle, now known as Hebbian learning. Karl Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments. Friedrich Hayek independently conceived the model, first in a brief unpublished manuscript in 1920 (Hayek, Friedrich A. [1920] 1991. Beiträge zur Theorie der Entwicklung des Bewusstseins [Contributions to a Theory of the Development of Consciousness]. Manuscript, translated by Grete Heinz), then expanded into a book in 1952.
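Hebb's principle can be written as the weight update Δw = η · (presynaptic activity) · (postsynaptic activity). The following is a minimal Python sketch; the learning rate and the activity patterns are arbitrary assumptions chosen only to show that repeatedly paired activity strengthens the connections between co-active units.

    import numpy as np

    eta = 0.5                                  # learning rate (assumed value)
    pre  = np.array([1.0, 0.0, 1.0, 0.0])      # presynaptic activity pattern
    post = np.array([0.0, 1.0])                # postsynaptic activity pattern
    w = np.zeros((4, 2))                       # connection weights, 4 inputs -> 2 outputs

    for _ in range(3):                         # repeated pairing of the two patterns
        w += eta * np.outer(pre, post)         # Hebbian update: delta_w = eta * pre * post

    print(w)                                   # only links between co-active units have grown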
The perceptron machines were proposed and built by Frank Rosenblatt, who described them in the 1958 Psychological Review paper cited above, written while he was working at the Cornell Aeronautical Laboratory. He cited Hebb, Hayek, Uttley, and Ashby as his main influences.
Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s.
The research group led by Widrow empirically searched for methods to train two-layered ADALINE networks (MADALINE), with limited success (Olazaran Rodriguez, Jose Miguel. A Historical Sociology of Neural Network Research. PhD dissertation, University of Edinburgh, 1991, pp. 124-129; Widrow, B. (1962). "Generalization and information storage in networks of ADALINE 'neurons'". In M. C. Yovits, G. T. Jacobi, & G. D. Goldstein (Eds.), Self-Organizing Systems 1962, pp. 435-461. Washington, DC: Spartan Books).
A method to train multilayered perceptrons with arbitrary numbers of trainable layers was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, called the Group Method of Data Handling. This method employs incremental layer-by-layer training based on regression analysis, in which useless units in the hidden layers are pruned with the help of a validation set.
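The following is a schematic Python sketch of this layer-by-layer, regression-based idea; the quadratic pairwise regressors, the least-squares fits, the toy data, and the specific pruning rule are assumptions about one common GMDH variant, not a reconstruction of Ivakhnenko and Lapa's exact procedure.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: the target depends nonlinearly on two of the four inputs.
    X = rng.normal(size=(200, 4))
    y = X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200)
    X_tr, X_va, y_tr, y_va = X[:150], X[150:], y[:150], y[150:]

    def features(a, b):
        # Quadratic polynomial in a pair of inputs (a common GMDH building block).
        return np.column_stack([np.ones_like(a), a, b, a * b, a * a, b * b])

    def fit(a, b, t):
        coef, *_ = np.linalg.lstsq(features(a, b), t, rcond=None)
        return coef

    train, valid, best_err = X_tr, X_va, np.inf
    for layer in range(3):                                    # grow the network layer by layer
        candidates = []
        for i, j in itertools.combinations(range(train.shape[1]), 2):
            coef = fit(train[:, i], train[:, j], y_tr)        # regression fit on the training split
            err = np.mean((features(valid[:, i], valid[:, j]) @ coef - y_va) ** 2)
            candidates.append((err, i, j, coef))
        candidates.sort(key=lambda c: c[0])                   # the validation set ranks the units
        keep = candidates[:4]                                 # prune all but the best units
        train = np.column_stack([features(train[:, i], train[:, j]) @ c for _, i, j, c in keep])
        valid = np.column_stack([features(valid[:, i], valid[:, j]) @ c for _, i, j, c in keep])
        if keep[0][0] >= best_err:                            # stop when validation error stops improving
            break
        best_err = keep[0][0]

    print("validation MSE of the best unit:", round(best_err, 4))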
The first multilayer perceptron trained by stochastic gradient descent was published in 1967 by Shun'ichi Amari. In computer experiments conducted by Amari's student Saito, a five-layer MLP with two modifiable layers learned useful internal representations to classify non-linearly separable pattern classes.
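The following is a minimal Python sketch of training a small MLP by stochastic gradient descent on a non-linearly separable problem (XOR); the architecture, learning rate, iteration count, and initialization are illustrative assumptions, not Amari and Saito's original setup.

    import numpy as np

    rng = np.random.default_rng(1)

    # XOR: a classic non-linearly separable pattern-classification task.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([0.0, 1.0, 1.0, 0.0])

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Two modifiable layers of weights: input -> hidden and hidden -> output.
    W1 = rng.normal(0, 1.0, (2, 4)); b1 = np.zeros(4)
    W2 = rng.normal(0, 1.0, 4);      b2 = 0.0
    lr = 1.0

    for step in range(30000):
        i = rng.integers(4)                      # stochastic: one randomly chosen pattern per step
        h = sigmoid(X[i] @ W1 + b1)              # hidden-layer ("internal") representation
        y = sigmoid(h @ W2 + b2)                 # network output
        dy = (y - t[i]) * y * (1 - y)            # gradient of the squared error at the output
        dh = dy * W2 * h * (1 - h)               # gradient propagated back to the hidden layer
        W2 -= lr * dy * h;              b2 -= lr * dy
        W1 -= lr * np.outer(X[i], dh);  b1 -= lr * dh

    # Usually settles near [0, 1, 1, 0]; convergence can depend on the random initialization.
    print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))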
In 1972, Shun'ichi Amari produced an early example of a self-organizing network.
Hopfield approached the field from the perspective of statistical mechanics, providing some early forms of mathematical rigor that increased the perceived respectability of the field. Another important series of publications proved that neural networks are universal function approximators, which also provided some mathematical respectability.
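A minimal Python sketch of a Hopfield network illustrates the statistical-mechanics connection: patterns are stored with a Hebbian outer-product rule, states are updated asynchronously, and each update can only lower an Ising-like energy E = -1/2 Σ w_ij s_i s_j, so the dynamics settle into a stored pattern. The particular patterns and corruption used here are arbitrary illustrations.

    import numpy as np

    rng = np.random.default_rng(0)

    patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                         [1, 1, 1, 1, -1, -1, -1, -1]], dtype=float)
    n = patterns.shape[1]

    # Hebbian storage: sum of outer products of the stored patterns, no self-connections.
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)

    def energy(s):
        return -0.5 * s @ W @ s                  # Ising-like energy; asynchronous updates never raise it

    s = patterns[0].copy()
    s[:2] *= -1                                  # corrupt two elements of the first pattern
    for _ in range(5):                           # a few sweeps of asynchronous updates
        for i in rng.permutation(n):
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0

    print(np.array_equal(s, patterns[0]), energy(s))   # the stored pattern is recovered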
Some early popular demonstration projects appeared during this time. NETtalk (1987) learned to pronounce written English. It achieved popular success, appearing on the Today show. TD-Gammon (1992) reached top human level in backgammon.
Connectionism and computationalism need not be at odds, but the debate in the late 1980s and early 1990s led to opposition between the two approaches. Throughout the debate, some researchers have argued that connectionism and computationalism are fully compatible, though full consensus on this issue has not been reached. The two approaches differ, among other things, in how closely their models are meant to resemble neural structure, in the role they assign to explicit symbols and syntactic rules, and in whether they appeal to domain-specific sub-systems or to a small set of very general learning mechanisms.
Despite these differences, some theorists have proposed that the connectionist architecture is simply the manner in which organic brains happen to implement the symbol-manipulation system. This is logically possible, as it is well known that connectionist models can implement symbol-manipulation systems of the kind used in computationalist models, as indeed they must be able to do if they are to explain the human ability to perform symbol-manipulation tasks. Several cognitive models combining both symbol-manipulative and connectionist architectures have been proposed, among them Paul Smolensky's Integrated Connectionist/Symbolic Cognitive Architecture (ICS) and Ron Sun's CLARION cognitive architecture. But the debate rests on whether this symbol manipulation forms the foundation of cognition in general, so this is not a potential vindication of computationalism. Nonetheless, computational descriptions may be helpful high-level descriptions of the cognition of logic, for example.
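As an illustration of how a connectionist substrate can carry out symbol manipulation, the following is a minimal Python sketch of tensor-product variable binding, the mechanism underlying Smolensky's ICS: a structure such as "A loves B" is encoded as a sum of outer products of filler and role vectors, and a filler is recovered by projecting onto its (here orthonormal) role vector. The specific vectors are arbitrary assumptions.

    import numpy as np

    # Distributed filler vectors for the symbols (arbitrary choices).
    fillers = {"A": np.array([1.0, 0.0, 1.0]) / np.sqrt(2),
               "B": np.array([0.0, 1.0, -1.0]) / np.sqrt(2)}
    # Orthonormal role vectors for "agent" and "patient".
    roles = {"agent": np.array([1.0, 0.0]), "patient": np.array([0.0, 1.0])}

    # Bind fillers to roles and superimpose: a distributed encoding of "A loves B".
    structure = (np.outer(fillers["A"], roles["agent"]) +
                 np.outer(fillers["B"], roles["patient"]))

    # Unbinding: project the summed tensor back onto a role vector to recover its filler.
    recovered_agent = structure @ roles["agent"]
    print(np.allclose(recovered_agent, fillers["A"]))   # True: symbolic structure carried by vectors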
The debate was largely centred on logical arguments about whether connectionist networks could produce the syntactic structure observed in this sort of reasoning. This was later achieved, although by using fast variable-binding abilities beyond those standardly assumed in connectionist models.
Part of the appeal of computational descriptions is that they are relatively easy to interpret, and thus may be seen as contributing to our understanding of particular mental processes, whereas connectionist models are in general more opaque, to the extent that they may be describable only in very general terms (such as specifying the learning algorithm, the number of units, etc.) or in unhelpfully low-level terms. In this sense, connectionist models may instantiate, and thereby provide evidence for, a broad theory of cognition (i.e., connectionism) without representing a helpful theory of the particular process being modelled. In this sense, the debate might be seen as reflecting, to some extent, a mere difference in the level of analysis at which particular theories are framed. Some researchers suggest that the analysis gap is a consequence of connectionist mechanisms giving rise to emergent phenomena that may be describable in computational terms.
In the 2000s, the growing popularity of dynamical systems in philosophy of mind added a new perspective to the debate; some authors now argue that any split between connectionism and computationalism is more conclusively characterized as a split between computationalism and dynamical systems.
In 2014, Alex Graves and others from DeepMind published a series of papers describing a novel deep neural network architecture called the Neural Turing Machine, which is able to read symbols on a tape and store symbols in memory. Relational Networks, another deep-network module published by DeepMind, can create object-like representations and manipulate them to answer complex questions. Relational Networks and Neural Turing Machines are further evidence that connectionism and computationalism need not be at odds.
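A minimal Python sketch of the content-based addressing used by Neural Turing Machine read heads: a key emitted by the controller is compared with every memory row by cosine similarity, a softmax turns the similarities into an attention weighting, and the read vector is the weighted blend of the memory rows. The memory contents, key, and sharpness parameter below are illustrative assumptions, not values from the paper.

    import numpy as np

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    def content_read(memory, key, beta):
        # Content-based addressing: compare the key with every memory row ...
        scores = beta * np.array([cosine(row, key) for row in memory])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # ... softmax into an attention distribution ...
        return weights @ memory, weights         # ... and read a weighted blend of the rows.

    memory = np.array([[1.0, 0.0, 0.0, 2.0],     # external memory matrix (rows = slots)
                       [0.0, 1.0, 0.0, -1.0],
                       [0.0, 0.0, 1.0, 0.5]])
    key = np.array([0.0, 0.9, 0.1, -1.0])        # query emitted by the controller
    read_vector, weights = content_read(memory, key, beta=5.0)   # beta sharpens the focus
    print(weights.round(3), read_vector.round(2))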
The challenge of accounting for the compositionality of mental representations has been met in modern connectionism, for example, not only by Smolensky's Integrated Connectionist/Symbolic (ICS) Cognitive Architecture (Smolensky, P. "Reply: Constituent structure and explanation in an integrated connectionist/symbolic cognitive architecture." In C. MacDonald & G. MacDonald (eds.), Connectionism: Debates on Psychological Explanation, Vol. 2, Blackwell Publishers, Oxford/Cambridge, MA, 1995, pp. 224, 236-239, 242-244, 250-252, 282; Smolensky, P. & Legendre, G. The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar. Vol. 1: Cognitive Architecture. A Bradford Book, The MIT Press, Cambridge/London, 2006, ISBN 0-262-19526-7, pp. 65-67, 69-71, 74-75, 154-155, 159-202, 209-210, 235-267, 271-342, 513), but also by Werning and Maye's Oscillatory Networks (Werning, M. "Neuronal synchronization, covariation, and compositional representation." In M. Werning, E. Machery & G. Schurz (eds.), The Compositionality of Meaning and Content. Vol. II: Applications to Linguistics, Psychology and Neuroscience, Ontos Verlag, 2005, pp. 283-312; Werning, M. "Non-symbolic compositional representation and its neuronal foundation: towards an emulative semantics." In M. Werning, W. Hinzen & E. Machery (eds.), The Oxford Handbook of Compositionality, Oxford University Press, 2012, pp. 633-654; Maye, A. & Werning, M. "Neuronal synchronization: from dynamic feature binding to compositional representations." Chaos and Complexity Letters, Vol. 2, pp. 315-325). An overview is given, for example, by Bechtel and Abrahamsen, Connectionism and the Mind: Parallel Processing, Dynamics, and Evolution in Networks.
Recently, Heng Zhang and his colleagues have demonstrated that mainstream knowledge representation formalisms are, in fact, recursively isomorphic, provided they possess equivalent expressive power. This finding implies that there is no fundamental distinction between using symbolic or connectionist knowledge representation formalisms for the realization of artificial general intelligence (AGI). Moreover, the existence of recursive isomorphisms suggests that different technical approaches can draw insights from one another.