Part 2: Computer Image Processing
In part 2 of the Computer Vision Tutorial Series we will talk about how images are stored in a computer, as well as basic image manipulation algorithms. Mona Lisa (original image above) will be our guiding example throughout this tutorial.
Image Collection
The very first step is to capture an image. A camera captures data as a stream of information, reading from one light receptor at a time and storing each complete scan as a single file. Different cameras work differently, so check the manual to see how yours sends out image data.
There are two main types of cameras, CCD and CMOS.
A CCD transports the charge across the chip and reads it at one corner of the array. An analog-to-digital converter (ADC) then turns each pixel's value into a digital value by measuring the amount of charge at each photosite and converting that measurement to binary form. CMOS devices instead use several transistors at each pixel to amplify and move the charge over more traditional wires, and the conversion to a digital signal happens on the chip itself, so no external ADC is needed.
CCD sensors create high-quality, low-noise images. CMOS sensors are generally more susceptible to noise.
Because each pixel on a CMOS sensor has several transistors located next to it, the light sensitivity of a CMOS chip is lower. Many of the photons hit the transistors instead of the photodiode.
CMOS sensors traditionally consume little power. CCDs, on the other hand, use a process that consumes lots of power - as much as 100 times more than an equivalent CMOS sensor.
CCD sensors have been mass produced for a longer period of time, so they are more mature. They tend to have higher quality pixels, and more of them. Below is how colored pixels are arranged on a CCD chip:
When storing or processing an image, make sure no image data has been thrown away by lossy compression - meaning don't use JPGs. BMPs are typically uncompressed, while GIFs and PNGs use lossless compression, so every pixel value is preserved. If you decide to transmit an image as compressed data (for faster transmission speed), you will have to decompress the image before processing. This matters because your processing code needs direct access to the raw pixel values.
Pixels and Resolution
In every image you have pixels. These are the tiny little dots of color you see on your screen, and the smallest element any image can be divided into. When an image is stored, the image file contains information on every single pixel in that image.
This information includes two things: color, and pixel location.
Images also have a set number of pixels per unit of size, known as resolution. You might see terms such as dpi (dots per inch), meaning the number of pixels you will find along one inch of the image. A higher resolution means there are more pixels in a set area, resulting in a higher quality image. The disadvantage of higher resolution is that it requires more processing power to analyze an image. When programming computer vision into a robot, use the lowest resolution you can get away with.
The Matrix (the math kind)
Images are stored in 2D matrices, which represent the locations of all pixels. Every image has an X component and a Y component, and at each (X, Y) point a color value is stored. If the image is black and white (binary), either a 1 or a 0 is stored at each location. If the image is greyscale, each location stores a value from a range (such as 0 to 255). If it is a color image (RGB), each location stores a set of three values - one each for red, green, and blue. Obviously, the less color involved, the faster the image can be processed. For many applications, binary images can achieve most of what you want.
Here is a matrix example of a binary image of a triangle:
0 0 0 1 0 0 0
0 0 1 0 1 0 0
0 1 0 0 0 1 0
1 1 1 1 1 1 1
0 0 0 0 0 0 0

It has a resolution of 7 x 5, with a single bit stored in each location. Memory required is therefore 7 x 5 x 1 = 35 bits.
Here is a matrix example of a greyscale (8 bit) image of a triangle:
  0   0  55 255  55   0   0
  0  55 255  55 255  55   0
 55 255  55  55  55 255  55
255 255 255 255 255 255 255
 55  55  55  55  55  55  55
  0   0   0   0   0   0   0

It has a resolution of 7 x 6, with 8 bits stored in each location. Memory required is therefore 7 x 6 x 8 = 336 bits.
As you can see, increasing resolution and information per pixel can significantly slow down your image processing speed.
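To make the storage concrete, here is a minimal sketch of how the greyscale triangle above could be held in memory in C (the array name and layout are my own illustration, not from the original tutorial):

// the 7 x 6 greyscale triangle as a 2D matrix of 8-bit values
unsigned char triangle[6][7] = {
    {  0,   0,  55, 255,  55,   0,   0},
    {  0,  55, 255,  55, 255,  55,   0},
    { 55, 255,  55,  55,  55, 255,  55},
    {255, 255, 255, 255, 255, 255, 255},
    { 55,  55,  55,  55,  55,  55,  55},
    {  0,   0,   0,   0,   0,   0,   0}
};
// triangle[y][x] gives the greyscale value at column x, row y;
// total storage is 6 * 7 * 1 byte = 42 bytes = 336 bits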
After converting color data to generate greyscale, Mona Lisa looks like this:
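The tutorial doesn't show the conversion itself, but a common approach is a weighted average of the red, green, and blue values. A minimal sketch, assuming 8-bit channels (the function name is mine; the weights are the widely used luminance approximation, and a plain average of the three channels also works):

// convert one RGB pixel to an 8-bit greyscale value
unsigned char rgb_to_grey(unsigned char r, unsigned char g, unsigned char b)
{
    return (unsigned char)(0.299 * r + 0.587 * g + 0.114 * b);
}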
Decreasing Resolution
The very first operation I will show you is how to decrease the resolution of an image. The basic concept in decreasing resolution is that you are selectively deleting data from the image. There are several ways you can do this:
The first method is to simply delete one pixel out of every group of two pixels, in both the X and Y directions of the matrix.
For example, using our greyscale image of a triangle above, and deleting one out of every two pixels in the X direction, we would get:
  0  55  55   0
  0 255 255   0
 55  55  55  55
255 255 255 255
 55  55  55  55
  0   0   0   0

and continuing with the Y direction:
  0  55  55   0
 55  55  55  55
 55  55  55  55

This results in a 4 x 3 matrix, for memory usage of 4 x 3 x 8 = 96 bits.
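A minimal sketch of this delete-every-other-pixel method in C (function and parameter names are my own; assumes the image is stored row by row in a flat array):

// keep only every second pixel in both X and Y
void decimate(const unsigned char *src, int w, int h, unsigned char *dst)
{
    int dw = (w + 1) / 2;               /* new width after deletion */
    for (int y = 0; y < h; y += 2)      /* keep every other row     */
        for (int x = 0; x < w; x += 2)  /* keep every other column  */
            dst[(y / 2) * dw + (x / 2)] = src[y * w + x];
}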
Another way of decreasing resolution is to choose a pixel, average the values of all surrounding pixels, store that average in the chosen pixel location, and then delete all the surrounding pixels.
For example,
 13 112 112  13
145 166 166 145
103 103 103 103

Using the latter method for resolution reduction, this is what Mona Lisa would look like (below). You can see how pixels are averaged along the edges of her hair.
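The tutorial doesn't pin down the exact neighborhood used for averaging, but here is a minimal sketch that averages each 2 x 2 block into a single pixel (my own simplification of the method; assumes w and h are even):

// shrink an image by averaging each 2x2 block into one pixel
void average_downsample(const unsigned char *src, int w, int h,
                        unsigned char *dst)
{
    for (int y = 0; y < h; y += 2)
        for (int x = 0; x < w; x += 2) {
            int sum = src[y * w + x] + src[y * w + x + 1]
                    + src[(y + 1) * w + x] + src[(y + 1) * w + x + 1];
            dst[(y / 2) * (w / 2) + (x / 2)] = (unsigned char)(sum / 4);
        }
}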
Thresholding and Heuristics
While the above method reduces image file size by resolution reduction, thresholding reduces file size by reducing color data in each pixel.
To do this, you first need to analyze your image using a method called heuristics. Heuristics is when you statistically look at an image as a whole, such as determining the overall brightness of an image, or counting the total number of pixels that contain a certain color. For an example histogram, here is my sample greyscale pixel histogram of Mona Lisa, and sample histogram generation code.
An example image heuristic plotting pixel count (Y-axis) versus pixel color intensity (0 to 255, X-axis):
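The tutorial links to its own histogram code; as a stand-in, here is a minimal sketch of how such a histogram can be generated (names are my own):

// count how many pixels have each greyscale intensity (0 to 255)
void histogram(const unsigned char *image, int num_pixels, long counts[256])
{
    for (int i = 0; i < 256; i++)
        counts[i] = 0;                  /* clear all bins         */
    for (int i = 0; i < num_pixels; i++)
        counts[image[i]]++;             /* tally each pixel value */
}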
Heuristics is often used for improving image contrast. The image is analyzed, and then bright pixels are made brighter and dark pixels are made darker. I'm not going to go into contrast details here, as it is a little complicated, but this is what an improved-contrast Mona Lisa would look like (before and after):
In this particular thresholding example, we will convert all colors to binary. How do you decide which pixel becomes a 1 and which becomes a 0? The first thing you do is determine a threshold - all pixel values above the threshold become a 1, and all values below become a 0. Your threshold can be chosen arbitrarily, or it can be based on your heuristic analysis.
For example, converting our greyscale triangle to binary, using 40 as our threshold, we will get:
0 0 1 1 1 0 0
0 1 1 1 1 1 0
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
0 0 0 0 0 0 0

If the threshold was 100, we would get this better image:
0 0 0 1 0 0 0
0 0 1 0 1 0 0
0 1 0 0 0 1 0
1 1 1 1 1 1 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0

As you can see, setting a good threshold is very important. In the first example, you cannot see the triangle, yet in the second you can. Poor thresholds result in poor images.
In the following example, I used heuristics to determine the average pixel value (add all pixels together, and then divide by the total number of pixels in the image). I then set this average as the threshold. Setting this threshold for Mona Lisa, we get this binary image:
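A minimal sketch of this average-based thresholding in C (my own function name; operates in place on a flat greyscale array):

// compute the average pixel value, then binarize the image against it
void threshold_by_average(unsigned char *image, int num_pixels)
{
    long sum = 0;
    for (int i = 0; i < num_pixels; i++)
        sum += image[i];
    unsigned char threshold = (unsigned char)(sum / num_pixels);

    for (int i = 0; i < num_pixels; i++)
        image[i] = (image[i] > threshold) ? 1 : 0;
}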
Note that if the threshold was 1, the entire image would be black. If the threshold was 255, the entire image would be white. Thresholding really excels when the background colors are very different from the target colors, as this automatically removes the distracting background from your image. If your target is the color red, and there is little to no red in the background, your robot can easily locate any object that is red by simply thresholding the red value of the image.
Image Color Inversion
Color image inversion is a simple equation that inverts the colors of the image. I haven't found any use for this on a robot, but it does make a good example . . .
The greyscale equation is simply:

new_pixel_value = 255 - pixel_value
The greyscale triangle then becomes:
255 255 200   0 200 255 255
255 200   0 200   0 200 255
200   0 200 200 200   0 200
  0   0   0   0   0   0   0
200 200 200 200 200 200 200
255 255 255 255 255 255 255

An RGB inversion of Mona Lisa becomes:
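As a minimal sketch (my own function name), the same equation applied to every pixel of a greyscale image:

// invert a greyscale image in place: new_pixel = 255 - old_pixel
void invert(unsigned char *image, int num_pixels)
{
    for (int i = 0; i < num_pixels; i++)
        image[i] = 255 - image[i];
}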
Brightness (and Darkness)
Increasing brightness is another simple algorithm. All you do is add (or subtract) some arbitrary value to each pixel:

new_pixel_value = pixel_value + brightness_value
You must also make sure that no pixel value exceeds the maximum. With 8-bit greyscale, no value can exceed 255. A simple check can be added like this:
if (pixel_value + 10 > 255)
{ new_pixel_value = 255; }  // clamp at the 8-bit maximum
else
{ new_pixel_value = pixel_value + 10; }
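Extended to a whole image, with both the upper and lower bounds checked (a sketch with my own names; a negative brightness value darkens the image):

// add a signed brightness offset to every pixel, clamping to 0..255
void adjust_brightness(unsigned char *image, int num_pixels, int offset)
{
    for (int i = 0; i < num_pixels; i++) {
        int v = image[i] + offset;
        if (v > 255) v = 255;   /* prevent whiteout overflow       */
        if (v < 0)   v = 0;     /* prevent underflow when darkening */
        image[i] = (unsigned char)v;
    }
}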
The problem with increasing brightness too much is that it will result in whiteout. For example, if your arbitrarily added value was 255, every pixel would be white. It also does not improve a robot's ability to understand an image, so you probably will not find a use for this algorithm directly.
Addendum: 1D, 2D, 3D, 4D
A 1D image can be obtained from a 1-pixel sensor, such as a photoresistor. As mentioned in part 1 of this vision tutorial, if you put several photoresistors together, you can generate an image matrix.
You can also generate a 2D image matrix by scanning with a 1-pixel sensor, such as a scanning Sharp IR. If you use a ranging sensor, you can store 3D information in a much more easily processed 2D matrix, with each pixel holding a range value.
4D images include time data. They are stored as a set of 2D matrix images, with each pixel containing range data, and a new 2D matrix being stored after every X seconds of time passing. This keeps processing simple, as you can analyze each 2D matrix separately, and then compare images to process change over time. This is just like film of a movie, which is actually just a set of 2D images changing so fast it appears to be moving. It is also quite similar to how a human processes temporal information, as we see about 25 images per second, each processed individually.
Actually, biologically, it's a bit more complicated than this. Feel free to read an email I received from Mr. Bill concerning biological fps. But for all intents and purposes, 25 fps is an appropriate benchmark.
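As a sketch of the compare-two-frames idea (my own names; assumes two greyscale frames of the same size):

// mark the pixels that changed between two frames by more than min_change
void frame_difference(const unsigned char *prev, const unsigned char *curr,
                      int num_pixels, unsigned char *changed, int min_change)
{
    for (int i = 0; i < num_pixels; i++) {
        int d = curr[i] - prev[i];
        if (d < 0) d = -d;                       /* absolute change  */
        changed[i] = (d > min_change) ? 1 : 0;   /* binary motion map */
    }
}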
Now that you understand the basics of computer image processing, you are ready to continue on to the next part of this tutorial series.