Computer vision
Computer vision is the study of methods which allow computers to "understand" images, or multidimensional data in general. "Understand" here means that specific information is extracted from the image data for a specific purpose: either for presenting it to a human operator (e.g., whether cancerous cells have been detected in a microscopy image), or for controlling some process (e.g., an industrial robot or an autonomous vehicle). The image data fed into a computer vision system is often a digital gray-scale or colour image, but can also take the form of two or more such images (e.g., from a stereo camera pair), a video sequence, or a 3D volume (e.g., from a tomography device). In most practical computer vision applications, the computers are pre-programmed to solve a particular task, but methods based on learning are becoming increasingly common.
The field of computer vision can be characterized as immature and diverse. Even though earlier work exists, it was not until the late 1970s, when computers could manage the processing of large data sets such as images, that a more focused study of the field started. However, these studies usually originated from various other fields, and consequently there is no standard formulation of the "computer vision problem". To an even larger extent, there is no standard formulation of how computer vision problems should be solved. Instead, there exists an abundance of methods for solving various well-defined computer vision tasks, where the methods are often very task-specific and can seldom be generalized over a wide range of applications. Many of the methods and applications are still at the stage of basic research, but more and more methods have found their way into commercial products, where they often constitute a part of a larger system which can solve complex tasks (e.g., in the area of medical images, or quality control and measurements in industrial processes).
Computer vision is seen by some as a subfield of artificial intelligence in which image data is fed into a system as an alternative to text-based input for controlling the behaviour of the system. Some of the learning methods used in computer vision are based on learning techniques developed within artificial intelligence.
Since a camera can be seen as a light sensor, there are various methods in computer vision based on correspondences between a physical phenomenon related to light and images of that phenomenon. For example, it is possible to extract information about motion in fluids and about waves by analyzing images of these phenomena. Also, a subfield within computer vision deals with the physical process which, given a scene of objects, light sources, and camera lenses, forms the image in a camera. Consequently, computer vision can also be seen as an extension of physics.
A third field which plays an important role is biology, specifically the study of the biological vision system. Over the last century, there has been an extensive study of eyes, neurons, and the brain structures devoted to processing of visual stimuli in both humans and various animals. This has led to a coarse, yet complicated, description of how "real" vision systems operate in order to solve certain vision related tasks. These results have led to a subfield within computer vision where artificial systems are designed to mimic the processing and behaviour of biological systems, at different levels of complexity. Also, some of the learning-based methods developed within computer vision have their background in biology.
Yet another field related to computer vision is signal processing. Many existing methods for processing of one-variable signals, typically temporal signals, can be extended in a natural way to processing of two-variable signals or multi-variable signals in computer vision. However, because of the specific nature of images there are many methods developed within computer vision which have no counterpart in the processing of one-variable signals. A distinct character of these methods is the fact that they are non-linear which, together with the multi-dimensionality of the signal, defines a subfield in signal processing as a part of computer vision.
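As an illustration of how a one-variable method carries over to images, the sketch below applies a 1-D smoothing filter first along each row and then along each column of an image, which amounts to a separable two-variable filter. The function names, the border handling (edge samples are repeated), and the example kernel are assumptions made for this illustration and are not taken from any particular library.

```python
def convolve_1d(signal, kernel):
    """Convolve a 1-D signal with an odd-length kernel.

    Border handling is an assumption of this sketch: indices that fall
    outside the signal are clamped, so the edge samples are repeated.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + half - k, 0), n - 1)
            acc += weight * signal[j]
        out.append(acc)
    return out


def convolve_2d_separable(image, kernel):
    """Apply the same 1-D kernel along the rows, then along the columns."""
    filtered_rows = [convolve_1d(row, kernel) for row in image]
    # zip(*...) transposes the image so that columns can be filtered as rows.
    filtered_cols = [convolve_1d(list(col), kernel) for col in zip(*filtered_rows)]
    return [list(row) for row in zip(*filtered_cols)]


# A small binomial smoothing kernel (hypothetical example values).
SMOOTHING_KERNEL = [0.25, 0.5, 0.25]

if __name__ == "__main__":
    impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    for row in convolve_2d_separable(impulse, SMOOTHING_KERNEL):
        print(row)
```

Filtering a single bright pixel this way spreads it into the outer product of the 1-D kernel with itself, which is the 2-D counterpart of the original temporal filter.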
Besides the above-mentioned views on computer vision, many of the related research topics can also be studied from a purely mathematical point of view. For example, many methods in computer vision are based on statistics or optimization. Finally, a significant part of the field is devoted to the implementation aspect of computer vision: how existing methods can be realized in various combinations of software and hardware, or how these methods can be modified in order to gain processing speed without losing too much performance.
Computer vision and (digital) image processing are related fields. The distinction between the two is not very clear; for example, computer vision uses many methods which traditionally belong to image processing. One formal distinction would be to say that image processing deals with transforming images, producing one image from another, or with producing low-level information about an image, such as edges or lines. Neither of these tasks provides, or requires, an interpretation of what the image contains in terms of objects or events. Computer vision, on the other hand, uses models and assumptions about the real world depicted in the images to extract information which, e.g., can be used to control actions on objects in a scene. In more advanced systems, these models can be learned rather than programmed.
Examples of applications for computer vision
Another way to describe computer vision is in terms of application areas. One of the most prominent application fields is medical computer vision or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Typical image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images. Examples of information which can be extracted from such image data are the detection of tumours, arteriosclerosis, or other malignant changes. It can also be measurements of, e.g., organ dimensions or blood flow. This application area also supports medical research by providing new information, e.g., about the structure of the brain, or about the quality of medical treatments.
A second application area in computer vision is in industry. Here, information is extracted for the purpose of supporting a manufacturing process. One example is quality control, where parts or final products are automatically inspected in order to find defects. Another example is measurement of the position and orientation of parts to be picked up by a robot arm. See the article on machine vision for more details on this area.
Military applications are probably one of the largest areas for computer vision, even though only a small part of this work is open to the public. The obvious examples are detection of enemy soldiers or vehicles and guidance of missiles to a designated target. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area, based on image data acquired there. Modern military concepts, such as "battlefield awareness", imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions. In this case, automatic processing of the data is used to reduce complexity and to fuse information from multiple sensors to increase reliability.
One of the newer application areas is autonomous vehicles, which range from submersibles and land-based vehicles (small robots with wheels, cars, or trucks) to aerial vehicles. An unmanned aerial vehicle is often denoted UAV. The level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles where computer-vision-based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, i.e., for knowing where they are, for producing a map of their environment, and for detecting obstacles. It can also be used for detecting certain task-specific events, e.g., a UAV looking for forest fires. Examples of supporting systems are obstacle warning systems in cars and systems for autonomous landing of aircraft. Several car manufacturers have demonstrated systems for autonomous driving of cars, but this technology has still not reached a level where it can be put on the market. There are ample examples of military autonomous vehicles, ranging from advanced missiles to UAVs for reconnaissance missions or missile guidance. Space exploration is already being carried out with autonomous vehicles using computer vision, e.g., NASA's Mars Exploration Rover.
Typical tasks of computer vision
Object Recognition
Detecting the presence and/or pose of known objects in an image
Examples:
- Searching for digital images by their content (content-based image retrieval)
- Recognizing human faces and their location in photographs.
- Estimation of the three-dimensional pose of humans and their limbs
Tracking
Tracking known objects through an image sequence
Examples:
- Tracking a single person walking through a shopping center.
Scene interpretation
Creating a model from an image/video.
Examples:
- Creating a model of the surrounding terrain from images taken by a robot-mounted camera.
Ego positioning
Determining position and motion of the camera itself.
Examples:
- Navigating a robot through a museum.
Computer Vision Systems
A typical computer vision system can be divided into the following subsystems:
Image acquisition
The image or image sequence is acquired with an imaging system (camera, radar, lidar, tomography system). Often the imaging system has to be calibrated before being used.
Preprocessing
In the preprocessing step, the image is treated with "low-level" operations. The aim of this step is to reduce noise in the image (i.e., to separate the signal from the noise) and to reduce the overall amount of data. This is typically done by employing different (digital) image processing methods such as:
- Sub-sampling the image.
- Applying digital filters
- Convolutions, computing a scale space representation
- Correlations or linear shift invariant filters
- Sobel operator
- Computing the x- and y-gradient (possibly also the time-gradient).
- Segmenting the image.
- Pixelwise thresholding.
- Performing an eigentransform on the image
- Doing motion estimation for local regions of the image (also known as optic flow estimation).
- Estimating disparity in stereo images.
- Multiresolution analysis
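Three of the operations listed above (sub-sampling, computing the x- and y-gradient with the Sobel operator, and pixelwise thresholding) can be sketched in a few lines. This is a minimal illustration under stated assumptions: images are plain lists of rows, border pixels of the gradient images are simply left at zero, and all function names are hypothetical.

```python
# Standard 3x3 Sobel kernels for the horizontal (x) and vertical (y) gradient.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def subsample(image, step=2):
    """Reduce the amount of data by keeping every `step`-th pixel per axis."""
    return [row[::step] for row in image[::step]]


def correlate_3x3(image, kernel):
    """Correlate an image with a 3x3 kernel.

    Assumption of this sketch: the one-pixel border, where the kernel
    does not fit inside the image, is left at zero.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = sum(
                kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
    return out


def threshold(image, level):
    """Pixelwise thresholding: 1 where the pixel exceeds `level`, else 0."""
    return [[1 if pixel > level else 0 for pixel in row] for row in image]


if __name__ == "__main__":
    # A vertical step edge: dark on the left, bright on the right.
    step_edge = [[0, 0, 10, 10, 10] for _ in range(5)]
    grad_x = correlate_3x3(step_edge, SOBEL_X)
    grad_y = correlate_3x3(step_edge, SOBEL_Y)
    edge_mask = threshold([[abs(v) for v in row] for row in grad_x], 20)
    for row in edge_mask:
        print(row)
```

In practice these steps are usually chained, for example smoothing and sub-sampling first and computing gradients on the reduced image, but the order depends on the task.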
Feature extraction
The aim of feature extraction is to further reduce the data to a set of features, which ought to be invariant to disturbances such as lighting conditions, camera position, noise and distortion. Examples of feature extraction are:
- Performing edge detection or estimation of local orientation.
- Extracting corner features.
- Detecting blob features.
- Extracting spin images from depth maps.
- Acquiring contour lines and maybe curvature zero crossings.
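As one concrete way to extract corner features, the sketch below computes a Harris-style corner measure: image gradients are combined into a local structure tensor, and the measure is large and positive where the gradient varies in two directions, negative along straight edges, and zero in flat regions. The text above does not prescribe this particular detector; the central-difference gradients, the 3x3 summation window, the conventional constant k = 0.04, and all function names are assumptions of this illustration.

```python
def central_gradients(image):
    """Estimate x- and y-gradients by central differences (borders stay 0)."""
    height, width = len(image), len(image[0])
    grad_x = [[0.0] * width for _ in range(height)]
    grad_y = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            grad_x[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            grad_y[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return grad_x, grad_y


def corner_response(image, k=0.04):
    """Harris-style measure det(M) - k * trace(M)^2 for each interior pixel.

    M is the structure tensor summed over a 3x3 window (an assumption of
    this sketch; a Gaussian-weighted window is also common).
    """
    height, width = len(image), len(image[0])
    grad_x, grad_y = central_gradients(image)
    response = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            a = b = c = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = grad_x[y + dy][x + dx]
                    gy = grad_y[y + dy][x + dx]
                    a += gx * gx
                    b += gy * gy
                    c += gx * gy
            response[y][x] = a * b - c * c - k * (a + b) ** 2
    return response


def strongest_corner(response):
    """Return the (row, column) position with the largest response."""
    positions = (
        (y, x) for y in range(len(response)) for x in range(len(response[0]))
    )
    return max(positions, key=lambda p: response[p[0]][p[1]])


if __name__ == "__main__":
    # A bright quadrant whose corner lies at row 3, column 3.
    image = [[1 if y >= 3 and x >= 3 else 0 for x in range(8)] for y in range(8)]
    print(strongest_corner(corner_response(image)))
```

A measure of this kind is unchanged by a constant offset in brightness, because it depends only on image gradients, but it is not invariant to changes of scale.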
Registration
The aim of the registration step is to establish correspondence between the features in the acquired set and the features of known objects in a model database and/or the features of the preceding image. The registration step has to produce a final hypothesis. To name a few methods:
- Least squares estimation
- Hough transform in many variations
- Geometric hashing
- Particle filtering
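To illustrate least squares estimation in a registration setting, the sketch below fits a 2-D similarity transform (rotation, uniform scale, and translation) that maps model feature points onto their corresponding image feature points. It assumes the correspondences are already known and free of gross outliers; representing points as complex numbers, so that the transform becomes q = a·p + b, is a convenience chosen for this example, and the function names are hypothetical.

```python
import cmath


def fit_similarity(model_points, image_points):
    """Least-squares fit of q = a * p + b with complex a and b.

    abs(a) is the scale, cmath.phase(a) the rotation angle, and b the
    translation. Points are given as (x, y) pairs in corresponding order.
    """
    if len(model_points) != len(image_points) or len(model_points) < 2:
        raise ValueError("need at least two corresponding point pairs")
    p = [complex(x, y) for x, y in model_points]
    q = [complex(x, y) for x, y in image_points]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    spread = sum(abs(pi - p_mean) ** 2 for pi in p)
    if spread == 0:
        raise ValueError("model points must not all coincide")
    # Closed-form solution of the normal equations for complex linear regression.
    a = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q)) / spread
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a, b, point):
    """Map a single (x, y) point through the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


if __name__ == "__main__":
    model = [(0, 0), (1, 0), (0, 1), (1, 1)]
    # The same square, rotated by 90 degrees, scaled by 2, shifted by (1, 1).
    observed = [(1, 1), (1, 3), (-1, 1), (-1, 3)]
    a, b = fit_similarity(model, observed)
    print("scale:", abs(a), "angle:", cmath.phase(a), "translation:", b)
```

With noisy feature positions the fit minimizes the sum of squared distances between the transformed model points and the observed points; mismatched correspondences are not handled here and would typically call for a more robust method.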
Related Fields
Advanced systems often borrow from many different fields, such as pattern recognition, statistical learning, projective geometry, image processing, graph theory, and others.
Cognitive computer vision is strongly related to cognitive psychology and biological computation.
A University Video Communication on Model-Based Computer Vision
Joseph Mundy in a University Video Communication on Model-Based Computer Vision (1987):
"What do students need to learn to be prepared to meet the challenges?" -
"I would like to comment on the necessary courses a student should take to really be prepared to carry out research in model-based vision. As we can see the geometry of image projection and the mathematics of transformation is a very key element in studying this field, but there are many other issues the student has to be prepared for. If we are going to talk about segmenting images and getting good geometric clues, we have to understand the relationship between the intensity of image data and its underlying geometry. And this would lead the student into such areas as optics, illumination theory, theory of shadows and the like. And also the mathematics underlying this kind of computations would of course require signal processing theory, fourier transform theory and the like. And in dealing with algebraic surfaces such as this curved surfaces as we talked about here, courses in algebraic geometry and higher pure forms of algebra will prove to be necessary in order to make any kind of progress in research to handle curved surfaces. So, I guess the bottom line of what I'm saying is: math courses, particularly those associated with geometric aspects will be key in all of this."
Applications
In the related fields machine vision and medical imaging, systems using computer vision techniques are sold in markets worth billions of US dollars per year.
One interesting application of computer vision, commonly used in the creation of visual effects for cinema and broadcast, is camera tracking or matchmoving. Computer vision also finds its applications in medicine, military industry, security and surveillance, quality inspection, robotics, automotive industry and many other fields.
See also
- artificial intelligence
- Machine learning
- image processing, digital image processing
- machine vision
- medical imaging
- morphological image processing
- VXL
- Herbert Freeman
- David Marr
- Jerome H. Lemelson
- Ron Kimmel
- Affective computing
- Computer graphics
- Important publications in computer vision
External links
- Wikicities has a wiki about Computer vision: Computer Vision
- The Computer Vision Homepage
- ETH Zürich Computer Vision Laboratory
- On-Line Compendium of Computer Vision
- Keith Price's Annotated Computer Vision Bibliography
- The Mimas Computer Vision Library and the MMVL MediaWiki
- RoboRealm: Free Robotic Vision Software.
- Xiris: Developer of custom machine vision automation systems and software.
- Tutorial to Image Processing
- introduction to computer vision
- VXL: A collection of C++ libraries designed for computer vision research and implementation.
- OpenCV: A free, open C/C++ based computer vision software library.
- Machine Perception of Three-Dimensional Solids - the paper mentioned by Joseph Mundy in the video



