
Computer Vision Research and Applications

[Computer Vision - Carnegie Mellon University]

- Overview

Computer vision (CV) is a field of artificial intelligence (AI) that uses digital systems to process, analyze, and interpret visual data. The goal of CV is to enable computing devices to correctly identify people or objects in digital images and take appropriate action.

A CV system typically works in three steps:

  • Acquiring an image
  • Processing an image
  • Understanding an image
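The three steps above can be sketched in a few lines of plain Python. This is only an illustrative toy, not how production systems are built: the synthetic 6x6 image, the threshold of 128, and the function names `acquire_image`, `process_image`, and `understand_image` are all assumptions made for this example.

```python
def acquire_image():
    """Step 1 (acquire): return a synthetic 6x6 grayscale image, values 0-255."""
    image = [[10] * 6 for _ in range(6)]
    for row in range(2, 5):
        for col in range(1, 4):
            image[row][col] = 200  # bright 3x3 square standing in for an object
    return image


def process_image(image, threshold=128):
    """Step 2 (process): turn the image into a binary foreground mask."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def understand_image(mask):
    """Step 3 (understand): return the bounding box of the foreground, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


bounding_box = understand_image(process_image(acquire_image()))
print(bounding_box)  # (top, left, bottom, right) -> (2, 1, 4, 3)
```

In a real system, the first step would read from a camera or file, and the last step would usually be a trained model rather than a hand-written rule.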

CV has many applications, including: 

  • Facial recognition
  • Self-driving cars
  • Robotic automation
  • Medical anomaly detection
  • Sports performance analysis
  • Manufacturing fault detection
  • Agricultural monitoring
  • Plant species classification

CV combines digital images from cameras and videos with deep learning models to identify and classify objects accurately. For example, autonomous vehicles use CV to recognize objects in real-time images and to build 3D maps from the multiple cameras fitted to the vehicle.


- Computer Vision and Machine Learning

Computer vision (CV) aims to teach computers to understand visual data in the same way that humans do. Machine learning (ML) trains computers to learn from data and make decisions based on that information. 

CV focuses on acquiring and working with images, and tries to imitate human vision so that computers can interpret the visual world. ML is guided by statistical principles and algorithms to create models that can infer solutions from input data. It aims to create applications that learn from data and experience without being explicitly programmed.

In simple terms, computer vision is a technology that attempts to train computers to recognize patterns in visual data in a way similar to how humans do. Machine learning, on the other hand, is a process that enables computers to learn how to process and react to data inputs based on precedents set by previous examples.
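As a small illustration of learning from data rather than from hand-written rules, the sketch below trains a nearest-centroid classifier in plain Python. The two-number feature vectors (imagined here as "redness" and "greenness" scores extracted from fruit images), the labels, and the function names are hypothetical and chosen only for this example.

```python
def train_centroids(examples):
    """Average the feature vectors of each label. examples: [(features, label)]."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Hypothetical training data: (redness, greenness) scores with known labels.
training = [
    ([0.9, 0.1], "ripe"), ([0.8, 0.2], "ripe"),
    ([0.2, 0.8], "unripe"), ([0.1, 0.9], "unripe"),
]
centroids = train_centroids(training)
prediction = predict(centroids, [0.7, 0.3])
print(prediction)  # ripe
```

The decision rule is never written out by hand; it falls out of the labeled examples, which is the essential difference between an ML approach and a fixed program.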


- The Technologies of Computer Vision

The initial goal of computer vision (CV) was to enable machines to see the visual world and interpret it the way a human would. AI has since advanced CV beyond human vision: with suitable sensors, machines can now detect things humans cannot see, such as air quality and temperature.

Big data is essential to furthering what CV can recognize and the conclusions it draws from what it sees, which is why companies leading the way in the field are tech giants that already have a foot in the data gathering and machine learning door.

Computer vision commonly uses convolutional neural networks (CNNs) to process visual data at the pixel level, capturing the relationships between neighboring pixels. Recurrent neural networks (RNNs) are sometimes added to model sequential relationships, such as those between successive video frames.
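To give a sense of what pixel-level processing means in a convolutional layer, here is a minimal sketch of a single 2D filter applied to a tiny grayscale image in plain Python. The image values and the vertical-edge kernel are assumptions made for illustration; a real CNN learns its kernel values during training and stacks many such filters.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Dark left half, bright right half: a vertical edge in the middle.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Hand-picked kernel that responds to left-to-right brightness increases.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 27, 27, 0], [0, 27, 27, 0]]
```

The output is large only where the window straddles the edge, which is how a filter turns raw pixel neighborhoods into a feature that later layers can build on.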


- The Rise of Computer Vision

To a computer, an image is simply an array of pixels, each holding numerical values for shades of red, green, and blue.
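The pixel-array view can be made concrete with a small sketch. The 2x2 image below is invented for illustration, and the grayscale conversion uses the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which is one conventional choice among several.

```python
# A 2x2 image: each pixel is an (R, G, B) tuple with values from 0 to 255.
image = [
    [(255, 0, 0), (0, 255, 0)],        # red, green
    [(0, 0, 255), (255, 255, 255)],    # blue, white
]


def to_grayscale(rgb_image):
    """Collapse each RGB pixel to one brightness value (BT.601 luma weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


height, width = len(image), len(image[0])
gray = to_grayscale(image)
print(height, width)  # 2 2
print(gray)           # [[76, 150], [29, 255]]
```

Everything a CV algorithm does, from edge detection to object recognition, ultimately operates on grids of numbers like these.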

One of the challenges that computer scientists have been grappling with since the 1950s is creating machines that can understand photos and videos like humans do. The field of computer vision has become one of the hottest areas of computer science and artificial intelligence (AI) research. 

Decades later, we have come a long way in creating software that can understand and describe the content of visual data. But we have also discovered how far we still have to go to understand and replicate one of the fundamental functions of the human brain.

Computer vision (CV) has exploded over the past few years, and it is now able to identify objects with astonishing accuracy, driving advances in everything from surveillance cameras to autonomous vehicles. 

There are two main reasons for the rapid development of computer vision, which uses AI to interpret and process the scene seen by cameras and other devices.

  • First, millions of labeled images are now available thanks to the web, allowing robotic vision systems to learn to recognize what is in a scene using a form of AI called deep learning.
  • Second, a new generation of graphics processing units (GPUs), originally developed for the video game industry, lets deep networks train on and recognize images much faster. Furthermore, the processing architectures used by deep networks loosely mimic the human visual system, even to the point of arranging network layers so that they reflect the arrangement of functional brain regions involved in human vision.


- The Goal of Computer Vision

At an abstract level, the goal of computer vision (CV) is to infer properties of the world from observed image data. It is a multidisciplinary field that can broadly be described as a subfield of AI and machine learning (ML), and it may involve both specialized methods and general learning algorithms.

Using digital images from cameras and videos together with deep learning (DL) models, machines can accurately identify and classify objects, and then react to what they "see". From recognizing faces to analyzing live soccer matches, computer vision can match or even surpass human visual abilities in many areas.

Since CV requires an understanding of the visual environment and its context, many scientists believe that the field paves the way for general AI because of its cross-domain reach.

CV is currently one of the hottest research areas in deep learning. It sits at the intersection of many disciplines, such as computer science (graphics, algorithms, theory, systems, architecture), mathematics (information retrieval, machine learning), engineering (robotics, speech, NLP, image processing), physics (optics), biology (neuroscience), and psychology (cognitive science).


[Mount Fuji, Japan]

- Application Domains of Computer Vision

Computer vision (CV) is the AI technology that lets robots and other machines see. It plays a vital role in safety, security, health, accessibility, and entertainment. CV automatically extracts, analyzes, and interprets useful information from a single image or a sequence of images. The process involves developing algorithms that enable automatic visual understanding.

CV has numerous applications including: agriculture, augmented reality, autonomous vehicles, biometrics, character recognition, forensics, industrial quality inspection, face recognition, gesture analysis, geosciences, image inpainting, medical image analysis, contamination monitoring, process control, remote sensing, robotics, security and surveillance, transportation, and more.

Here are some examples of CV in different fields: 

  • Healthcare: CV algorithms can help automate tasks such as detecting cancerous moles in skin images or finding signs of disease in X-ray and MRI scans.
  • Security: CV can detect people for intelligent perimeter monitoring.
  • Autonomous vehicles: CV can recognize objects in real-time images and build 3D maps from the multiple cameras fitted to the vehicle.
  • Agriculture: CV can help farmers identify product defects and sort produce by weight, color, size, ripeness, and many other factors.
  • Manufacturing: CV systems can identify cracks and dents, missing components, poorly painted surfaces, and much more.


[More to come ...] 
