Abstract
LLMs perform many tasks in natural language comprehension and generation fluidly and with mastery. They have skirted the problem of hand-coding knowledge by instead acquiring it from sources on the Internet in multiple languages. While they are not intended to model the cognition of a single agent, they are nevertheless suggestive about representations and processes in cognitive systems in general. Because their representations are not in the familiar symbolic framework, their performance has revived debates from the 1980s about the role of and need for symbolic representations in cognition: about the nature of linguistic knowledge and of multi- and inter-linguistic competence, the representation and nature of concepts, common and commonsense knowledge, the relation to perception, and the role of embodiment in cognitive behavior, among other questions. Their frequently impressive performance notwithstanding, LLMs are plagued by a persistent tendency to generate falsehoods and by erratic performance in inference and problem solving. LLM developers propose that scaling, i.e., more data and computing, will solve these problems. Also, while much is made of the various learning techniques used to build LLMs, they do not learn at all during the sessions in which millions of people use them. This is quite unlike humans.
In this talk, I will use Kahneman's System 1 and System 2 framework to propose that LLMs are suggestive of how the System 1 part of cognition might work: rapid production of intuitive, goal-relevant information, but without guarantees of correctness. The distributed analog representations in LLMs show how functionally symbolic representations can emerge in connectionist systems. On the other hand, in my view, the deliberative character of System 2 requires symbolic representations and problem solving, i.e., an architecture such as Soar. The two systems must work intimately together to account for the generality and flexibility of human cognition, including learning from problem-solving activity. This, however, requires figuring out how representations in the two frameworks can be fluidly translated from one to the other, and how the successes of System 2 can be fed back into System 1 representations. Further, these two cognitive systems must work with perceptual modules to ground cognitive representations in perception. A research program for the integration of these very different architectures will move the field closer to AGI than scaling LLMs alone will.