Most of us have experienced desperately searching for a personal item, or for a location in an unfamiliar environment. At present there is no technical solution for such an assistive system. The newly granted Joint Research Project “Cognitive Vision” attempts to find first solutions in this direction. A human shall be supported by a system that can not only find things, but that can also understand the relationship between human activities and the objects involved. This ability to acquire new information and new knowledge is the key aspect of the cognitive approach to computer vision.
The proposed solution is based on a trans-disciplinary approach. It integrates partners from theoretical computer science (TU Graz), neuroscience (Max-Planck-Institut Tübingen), machine learning (MU Leoben), and the main computer vision groups in Austria (ACIN & PRIP at TU Wien, EMT & ICG at TU Graz, and Joanneum Research Graz).
One aspect of the project is to investigate the relations between the different brain regions in the visual cortex. While the individual functions of these regions are relatively well studied, new methods of imaging brain function enable deeper insights that contradict present hypotheses. It has been shown that human vision profits enormously from expectations about a given situation. For example, objects in an atypical environment are spotted much more quickly than objects in the expected environment.
Using this analysis of the only “working” vision system, we will develop computer models that describe objects under different conditions, for example, varying illumination, shape, scale, clutter and occlusion, and that describe the relationships between objects and their environment. A particular emphasis is on learning these models and relationships. Just as one shows a new object to a child, we want to relieve the user of the exhaustive learning phases required today.
Another aspect of the research work is the analysis of the interrelations between the different seeing functions, namely, mechanisms to guide attention, the detection and identification of objects, the prediction of the motions and intentions of the user, the integration of knowledge about the present situation, and the creation of an appropriate system reaction. The coordination of these functions is resolved using an agent-based optimisation of each function's utility to the overall system.
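The idea of agent-based coordination can be sketched as follows: each seeing function is modelled as an agent that reports the expected utility of being run next, and a coordinator repeatedly activates the agent with the highest utility. This is a minimal, purely illustrative sketch; the agent names, the utility values, and the greedy selection rule are assumptions for the example and are not taken from the project itself.

```python
# Illustrative sketch of agent-based coordination: each vision function is an
# agent that reports its expected utility given the shared situation state;
# a simple coordinator runs the most useful agent each cycle.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Agent:
    name: str
    # Maps the shared situation state to an expected utility for the system.
    utility: Callable[[Dict], float]
    # Action that updates the shared state (e.g. adds candidate regions).
    act: Callable[[Dict], None]

def run_cycle(agents, state):
    """One coordination step: pick and run the agent with highest utility."""
    best = max(agents, key=lambda a: a.utility(state))
    best.act(state)
    return best.name

# Toy agents: attention proposes regions of interest; detection only pays
# off once attention has produced candidate regions.
state = {"regions": 0, "objects": 0}
attention = Agent(
    "attention",
    utility=lambda s: 1.0 if s["regions"] == 0 else 0.2,
    act=lambda s: s.update(regions=s["regions"] + 3),
)
detection = Agent(
    "detection",
    utility=lambda s: 0.9 if s["regions"] > 0 else 0.0,
    act=lambda s: s.update(objects=s["objects"] + 1),
)

order = [run_cycle([attention, detection], state) for _ in range(2)]
print(order)   # ['attention', 'detection']
print(state)   # {'regions': 3, 'objects': 1}
```

The point of the sketch is that the control flow (attention before detection) is not hard-wired; it emerges from the utilities, so the same coordinator could reorder the functions when the situation changes.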
The techniques devised will be implemented in prototype systems. The objective for the next three years is to track and predict where objects are moved, where they are hidden, and where they can be found again. A user could then ask the system where her mug is, or where a specific shop is when entering an unfamiliar part of a city. In both cases the user would be assisted and guided to the location.