One crucial step in the construction of the human representation of the world is found at the boundary between two basic kinds of stimuli: visual experience and the sounds of language. In the developmental stage when the ability to recognize objects consolidates and the ability to segment streams of sound into familiar chunks emerges, the mind gradually grasps the idea that utterances are related to the visible entities of the world. The model presented here attempts to reproduce this process in its basic form, simulating the visual and auditory pathways and a portion of the prefrontal cortex putatively responsible for more abstract representations of object classes. Simulations were performed with the model using a set of images of 100 real-world objects seen from many different viewpoints, together with waveforms of spoken labels for various classes of objects. Categorization processes with and without language are then compared.
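To make the cross-modal binding idea concrete, the toy sketch below (not the cortical model described above) uses a simple Hebbian association matrix linking placeholder visual and auditory population codes, so that a heard label can evoke the visual pattern it co-occurred with. All dimensions, codes, and learning parameters are illustrative assumptions.

```python
# Toy Hebbian cross-modal association between placeholder visual and auditory codes.
# An illustration of label-object binding only, not the model described above.
import numpy as np

rng = np.random.default_rng(1)
N_VISUAL, N_AUDITORY, N_OBJECTS = 64, 32, 100   # assumed code sizes

# Placeholder codes standing in for outputs of the visual and auditory pathways.
visual_codes = rng.standard_normal((N_OBJECTS, N_VISUAL))
label_codes = rng.standard_normal((N_OBJECTS, N_AUDITORY))

# Hebbian association matrix between the two modalities.
M = np.zeros((N_VISUAL, N_AUDITORY))
lr = 0.01
for epoch in range(50):
    for obj in rng.permutation(N_OBJECTS):
        v, a = visual_codes[obj], label_codes[obj]
        M += lr * np.outer(v, a)            # strengthen co-active visual/auditory pairs

def label_to_object(label_activity):
    """Return the object whose visual code best matches the pattern evoked by a label."""
    evoked = M @ label_activity             # auditory activity evokes a visual pattern
    return int(np.argmax(visual_codes @ evoked))

# Usage: label_to_object(label_codes[17]) tends to retrieve object 17, although
# cross-talk between these random codes makes occasional confusions possible.
```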
Free-energy-based reinforcement learning is a new approach to handling high-dimensional states and actions. We investigate its properties using a new experimental platform called the digit floor task, in which the high-dimensional pixel data of handwritten digits are used directly as sensory inputs to the reinforcement learning agent. The simulation results showed the robustness of the free-energy-based reinforcement learning method against noise applied in both the training and testing phases. In addition, reward-dependent sensory representations were found in the distributed activation patterns of hidden units. These distributed representations persisted even when the number of hidden nodes was varied.
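As a rough sketch of the general free-energy-based formulation, in which the negative free energy of a restricted Boltzmann machine over joint state-action vectors approximates the action value and is trained with a temporal-difference rule, the example below is illustrative only; the input dimensions, action set, and learning parameters are assumptions, not the configuration of the digit floor task.

```python
# Minimal sketch of free-energy-based reinforcement learning: the negative free
# energy of a restricted Boltzmann machine over a joint state-action vector is
# used as the Q-value and trained with a SARSA-style TD rule. Dimensions,
# actions, and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_STATE, N_ACTION, N_HIDDEN = 784, 4, 30        # e.g. 28x28 pixel input, 4 moves
W = 0.01 * rng.standard_normal((N_STATE + N_ACTION, N_HIDDEN))
b_hid = np.zeros(N_HIDDEN)                      # hidden biases (visible biases omitted)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def q_value(state, action_onehot):
    """Negative free energy of the RBM for the joint state-action input."""
    x = np.concatenate([state, action_onehot])
    pre = x @ W + b_hid
    return np.sum(np.logaddexp(0.0, pre))       # sum_j softplus(pre_j) = -F(s, a)

def select_action(state, temperature=1.0):
    """Boltzmann exploration over the estimated Q-values of all discrete actions."""
    qs = np.array([q_value(state, np.eye(N_ACTION)[a]) for a in range(N_ACTION)])
    p = np.exp((qs - qs.max()) / temperature)
    return int(rng.choice(N_ACTION, p=p / p.sum()))

def td_update(state, action, reward, next_state, next_action, gamma=0.95, lr=1e-3):
    """SARSA-style temporal-difference update of the RBM parameters."""
    global W, b_hid
    x = np.concatenate([state, np.eye(N_ACTION)[action]])
    td_err = (reward
              + gamma * q_value(next_state, np.eye(N_ACTION)[next_action])
              - q_value(state, np.eye(N_ACTION)[action]))
    h_prob = sigmoid(x @ W + b_hid)             # dQ/dW[i, j] = x[i] * h_prob[j]
    W += lr * td_err * np.outer(x, h_prob)
    b_hid += lr * td_err * h_prob
```

In this formulation the hidden-unit activation probabilities (h_prob) play the role of the distributed sensory representations referred to above.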