Do you get the picture? – Computer Vision

Why is machine vision a hard problem? As humans, we look at a dog and we see the dog. That’s all there is to it. Surely making machines that do the same isn’t that hard, right?

It actually turns out to be one of the toughest problems around, because that ‘simple’ seeing we do is in fact managed by our brains more than our eyes – and the brain is, well, the most complex object in the universe.

When we see, we are doing much more than perceiving an image on our retinas. Our brains produce a high-level understanding of what is going on in a scene, and this involves our memories, cognitive abilities and imaginations as well as our visual cortex, which is remarkable in itself. This is what scientists are up against when they try to give similar abilities – even basic ones – to machines.

Computer vision is about enabling machines to extract information from images. One example would be the systems used in self-driving cars. The car needs to ‘know’ whether a moving or stationary object is a child, an adult, a dog, a lamp-post, a tree or another vehicle. It needs to make ‘assumptions’ based on the raw visual data it is gathering and feed them into its algorithms to decide what to do next. Getting this right every time is a fiendishly complex problem, as the handful of deaths caused by self-driving cars so far sadly demonstrates.

Nevertheless, substantial progress is being made in the field. For example, we can train an AI to identify photos of guitars by showing it lots of labelled photos, some of guitars and some of other things. On the whole this technology works pretty well and is being constantly refined. But we are a long way from mimicking anything as sophisticated as a living creature.
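To make the idea of learning from examples a little more concrete, here is a toy sketch in Python. It is not how real image recognisers are built (those typically use deep neural networks trained on real photos); it only illustrates the principle. Everything in it is an assumption made for illustration: the 8×8 ‘images’ are synthetic lists of numbers, the ‘guitar’ and ‘other’ patterns are invented, and the helper names `make_image`, `train` and `predict` are hypothetical.

```python
import random


def make_image(bright_rows, rng, size=8, noise=0.1):
    """Return a flattened size x size toy 'image' whose given rows are bright."""
    pixels = []
    for row in range(size):
        base = 1.0 if row in bright_rows else 0.0
        for _ in range(size):
            pixels.append(base + rng.uniform(-noise, noise))
    return pixels


def train(examples):
    """Average the labelled images of each class into one 'typical' image."""
    sums, counts = {}, {}
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(model, pixels):
    """Return the label whose typical image is closest to the new image."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], pixels))
    return min(model, key=distance)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Invented patterns: 'guitar' images are bright in the middle rows,
# 'other' images are bright in the top rows.
training = [(make_image({3, 4}, rng), "guitar") for _ in range(20)]
training += [(make_image({0, 1}, rng), "other") for _ in range(20)]

model = train(training)
print(predict(model, make_image({3, 4}, rng)))
```

The ‘training’ here is nothing more than averaging the examples for each label, and the ‘recognition’ is picking the closest average. Real photos vary far too much for this to work, which is part of why the problem is so hard.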

A new series of courses exploring this field has popped up on Coursera: Computer Vision Basics; Image Processing, Features & Segmentation; Stereo Vision, Dense Motion & Tracking; and Visual Recognition & Understanding. They spring from the Computer Vision Specialization of the Department of Engineering of the University at Buffalo and are an interdisciplinary collaboration of the UB Center for Industrial Effectiveness, the UB Department of Computer Science and Engineering, MathWorks and a consortium of industry partners.

The requirements are basic programming skills, some knowledge of MATLAB, and familiarity with basic linear algebra, 3D co-ordinates, basic calculus and probability. The courses look like a good way to start out in this increasingly important field.