Tri thức là sức mạnh

Luật An ninh mạng; Công cụ, phương pháp, kỹ thuật An ninh mạng; Đấu tranh chống luận điệu xuyên tạc đường lối chính sách của Đảng, Nhà nước; bảo vệ lý luận, chính trị, tư tưởng đạo đức Hồ Chí Minh trên không gian mạng; Tuyên truyền gương người tốt việc tốt,...

Thứ Ba, 22 tháng 4, 2014

Part 3: Computer Vision Algorithms

Computer Vision Algorithms

Shape Detection

Middle Mass and Blobs

Pixel Classification

Image Correlation

Facial Recognition

biological vision

computer image processing

Computer Vision vs Machine Vision

Edge Detection
Edge detection is a technique to locate the edges of objects in the scene. This can be useful for locating the horizon, the corner of an object, white line following, or for determing the shape of an object. The algorithm is quite simple:

sort through the image matrix pixel by pixelfor each pixel, analyze each of the 8 pixels surrounding it
record the value of the darkest pixel, and the lightest pixel
if (darkest_pixel_value - lightest_pixel_value) > threshold)
then rewrite that pixel as 1;
else rewrite that pixel as 0;

What the algorithm does is detect sudden changes in color or lighting, representing the edge of an object.
Check out the edges on Mona Lisa:

Mona Lisa

Mona Lisa Edge Detection

A challenge you may have is choosing a good threshold. This left image has a threshold thats too low, and the right image has a threshold thats too high. You will need to run an

image heuristics program

High Edge Detection Threshold

Low Edge Detection Threshold

Shape Detection and Pattern Recognition
Shape detection requires preprogramming in a mathematical representation database of the shapes you wish to detect. For example, suppose you are writing a program that can distinguish between a triangle, a square, and a circle. This is how you would do it:

run edge detection to find the border line of each shapecount the number of continuous edges

if three lines detected, then its a triangle
if four lines, then a square
if one line, then its a circle
by measure angles between lines you can determine more info (rhomboid, equilateral triangle, etc.)

Shape Detection Original Image

Shape Detection after Edge Detection

The basic shapes are very easy, but as you get into more complex shapes (pattern recognition) you will have to use probability analysis. For example, suppose your algorithm needed to recognize between 10 different fruits (only by shape) such as an apple, an orange, a pear, a cherry, etc. How would you do it? Well all are circular, but none perfectly circular. And not all apples look the same, either.
By using probability, you can run an analysis that says 'oh, this fruit fits 90% of the characteristics of an apple, but only 60% the characteristics of an orange, so its more likely an apple.' Its the computational version of an 'educated guess.' You could also say 'if this particular feature is present, then it has a 20% higher probability of being an apple.' The feature could be a stem such as on an apple, fuzziness like on a coconut, or spikes like on a pinneapple, etc. This method is known as feature detection.
Bowl of Fruit

Bowl of Fruit

Middle Mass and Blob Detection
Blob detection is an algorithm used to determine if a group of connecting pixels are related to each other. This is useful for identifying seperate objects in a scene, or counting the number of objects in a scene. Blob detection would be useful for counting people in an airport lobby, or fish passing by a camera. Middle mass would be useful for a baseball catching robot, or a line following robot.

To find a blob, you threshold the image by a specific color as shown below. The blue dot represents the middle mass, or the average location of all pixels of the selected color.

Blob Detection

Blob Detection

If there is only one blob in a scene, the middle mass is always located in the center of an object. But what if there were two or more blobs? This is where it fails, as the middle mass is no longer located on any object:

Blob Detection Failure

To solve for this problem, your algorithm needs to label each blob as seperate entities. To do this, run this algorithm:

go through each pixel in the array:if the pixel is a blob color, label it '1'
    otherwise label it 0
go to the next pixel
    if it is also a blob color
        and if it is adjacent to blob 1
            label it '1'
        else label it '2' (or more)
repeat until all pixels are done

What the algorithm does is labels each blob by a number, counting up for every new blob it encounters. Then to find middle mass, you can just find it for each individual blob.
In this below video, I ran a few algorithms in tandem. First, I removed all non-red objects. Next, I blurred the video a bit to make blobs more connected. Then, using blob detection, I only kept the blob that had the most pixels (the largest red object). This removed background objects such as the fire extinguisher. Lastly, I did center of mass to track the actual location of the object. I also ran apopulation threshold algorithm that made the object edges really sharp. It doesnt improve the algorithm in this case, but it does make it look nicer as a video.

Feel free to download my

custom blob detection RoboRealm file

Pixel Classification
Pixel Classification is when you assign each pixel in an image to an object class. For example, all greenish pixels would be grass, all blueish pixels would be sky or water, all greyish pixels would be road, and all yellow would be a road lane divider. There are other ways to classify each pixel, but color is typically the easiest.

This method is clearly useful for picking out the road for road following and obstacles for obstacle avoidance. Its also used in satellite image processing, such as this image of a city (yellow/red for buildings), forest (green), and river (blue):

Pixel Classification Satellite Image

If Greenpeace wanted to know how much forest has been cut down, a simple pixel density count can be done. To do this, simply count and compare the forest pixels from before and after the logging.

A major benefit to this bottom-up method to image processing is its immunity to heavy image noise. Blobs do not need to be identified first. By finding the middle mass of these pixels, the center location of each object can be found.

Need an algorithm to identify roads for your driving robot? This below video (from my house front door) is an example of me simply maximizing RBG (red blue green) colors. Pixels that are more blue than any other color become all blue, pixels more green than any other color become all green, and the same for red. What you get is the road being all blue, the grass being all green, and houses being red. Its not perfect, yet still works amazingly well for a simple pixel classification algorithm. This algorithm would well compliment another algorithm(s).

Feel free to download my

custom pixel classification RoboRealm file

Image Correlation (Template Matching)
Image correlation is one of the many forms of template matching for simple object recognition. This method works by keeping a large database of various imaged features, and computing 'intensity similiarity' of an entire image or window with another.

In this example, various features of an adorably cute squirrel (its the species name) are obtained for comparison with other objects.

Image Correlation of a Squirrel

This method is also used for feature detection (mentioned earlier) and facial recognition . . .

Facial Recognition
Facial recognition is a more advanced type of pattern recognition. With shape recognition you only need a small database of mathematical representations of shapes. But while basic shapes like a triangle can be easily described, how do you mathematically represent a face?

George Bush looks like a monkey

Here is an excercise for you. Suppose you have a friend coming to your family's house and she/he wants to recognize every face by name before arriving. If you could only give a written list of facial features of each family member, what would you say about each face? You might describe hair color, length, or style. Maybe your sister has a beard. One person might have a more rounded face, while another person might have a very thin face. For a family of 4 people this excercise is really easy.

But what if you had to do it for everyone in your class? You might also analyze skin tone, eye color, wrinkles, mouth size . . . the list goes on. As the number of people that will be analyzed grows, so would the number of required descriptions for each face.

One popular way of digitizing faces is to measure the distance between each eye, size of the head, distance between eyes and mouth, and length of mouth. By keeping a database of these values, surprisingly you can accurately identify thousands of different faces. Hint: notice how the features on Mona Lisa's face above is much easier to identify and locate after edge detection.

Face Measuring

Unfortunately for law enforcement this method does not work outside of the lab. This is because it requires facial images that are really close and clear for the measurements to be done accurately. It is also difficult to control which way a person is looking, too. For example, can you make out the facial measurements of the man in this security cam image?

Blurry Security Cam Image

Have a look at this below image. Despite these pictures also being tiny and blurry, you can somehow recognize many of them! The human brain obviously has other yet undiscovered methods of facial recognition . . .

Tiny and blurry, yet recognizable faces

Stereo Vision
Stereo vision is a method of determing the 3D location of objects in a scene by comparing images of two seperate cameras. Now suppose you have some robot on Mars and he sees an alien (at point P(X,Y)) with two video cameras. Where does the robot need to drive to run over this alien (for 20 kill points)?

Mars Rover with an Alien

First lets analyze the robot camera itself. Although a simplification resulting in minor error, the pinhole camera model will be used in the following examples:

Camera Diagram

The image plane is where the photo-receptors are located in the camera, and thelens is the lens of the camera. The focal distance is the distance between the lens and the photo-receptors (can be found in the camera datasheet). Point P is the location of the alien, and point p is where the alien appears on the photo-receptors. The optical axis is the direction the camera is pointing. Redrawing the diagram to make it mathematically simpler to understand, we get this new diagram

Simplified Camera Diagram

with the following equations for a single camera:

x_camL = focal_length * X_actual / Z_actual
y_camL = focal_length * Y_actual / Z_actual

CASE 1: Parallel Cameras
Now moving on to two parallel facing cameras (L for left camera and R for right camera), we have this diagram:
Parallel Camera Stereo Vision

Parallel Camera Stereo Vision

The Z-axis is the optical axis (the direction the cameras are pointing). b is the distance between cameras, while f is still the focal length. The equations of stereo triangulation (because it looks like a triangle) are:

Z_actual = (b * focal_length) / (x_camL - x_camR)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length

CASE 2a: Non-Parallel Cameras, Rotation About Y-axis
And lastly, what if the cameras are pointing in different non-parallel directions? In this below diagram, the Z-axis is the optical axis for the left camera, while the Zo-axis is the optical axis of the right camera. Both cameras lie on the XZ plane, but the right camera is rotated by some angle phi. The point where both optical axes (plural for axis, pronounced ACKS - I) intersect at the point (0,0,Zo) is called thefixation point. Note that the fixation point could also be behind the cameras when Zo < 0.
Non-Parallel Camera Stereo Vision

Non-Parallel Camera Stereo Vision

calculating for the alien location . . .

Zo = b / tan(phi)
Z_actual = (b * focal_length) / (x_camL - x_camR + focal_length * b / Zo)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length

CASE 2b: Non-Parallel Cameras, Rotation About X-axis
calculating for the alien location . . .

Z_actual = (b * focal_length) / (x1 - x2)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length + tan(phi) * Z

CASE 2c: Non-Parallel Cameras, Rotation About Z-axis
For simplicity, rotation around the optical axis is usually dealt with by rotating the image before applying matching and triangulation. Given the translation vector T and rotation matrix R describing the transormation from left camera to right camera coordinates, the equation to solve for stereo triangulation is:

p' = RT ( p - T )

where p and p' are the coordinates of P in the left and right camera coordinates respectively, and RT is the transpose (or the inverse) matrix of R.

Please continue on in the

Computer Vision Tutorial Series

Người đăng: bitnetvn vào lúc 00:05

Không có nhận xét nào:

Đăng nhận xét

Bài đăng Mới hơn Bài đăng Cũ hơn Trang chủ

Đăng ký: Đăng Nhận xét (Atom)

Tổng số lượt xem trang

Giới thiệu về tôi

bitnetvn

Xem hồ sơ hoàn chỉnh của tôi

Danh mục

Câu chuyện an ninh mạng (9)
English Library (5)
Image and Video Processing (44)
Kỹ năng mềm (72)
Kỹ thuật và Công nghệ (39)
Phương pháp nghiên cứu khoa học (67)
Tuyên truyền (6)
WhiteHat và Hacking (6)

Lưu trữ Blog

► 2022 (3)
- ► tháng 9 2022 (1)
- ► tháng 8 2022 (2)

► 2021 (5)
- ► tháng 9 2021 (1)
- ► tháng 6 2021 (2)
- ► tháng 5 2021 (2)

► 2020 (29)
- ► tháng 10 2020 (1)
- ► tháng 8 2020 (1)
- ► tháng 6 2020 (6)
- ► tháng 5 2020 (21)

► 2018 (1)
- ► tháng 6 2018 (1)

► 2017 (2)
- ► tháng 12 2017 (1)
- ► tháng 5 2017 (1)

► 2016 (24)
- ► tháng 10 2016 (5)
- ► tháng 8 2016 (2)
- ► tháng 6 2016 (4)
- ► tháng 5 2016 (4)
- ► tháng 3 2016 (1)
- ► tháng 2 2016 (1)
- ► tháng 1 2016 (7)

► 2015 (44)
- ► tháng 12 2015 (3)
- ► tháng 11 2015 (1)
- ► tháng 10 2015 (1)
- ► tháng 9 2015 (2)
- ► tháng 8 2015 (5)
- ► tháng 7 2015 (6)
- ► tháng 4 2015 (7)
- ► tháng 3 2015 (10)
- ► tháng 2 2015 (8)
- ► tháng 1 2015 (1)

▼ 2014 (147)
- ► tháng 12 2014 (5)
- ► tháng 11 2014 (12)
- ► tháng 10 2014 (4)
- ► tháng 9 2014 (7)
- ► tháng 8 2014 (7)
- ► tháng 7 2014 (4)
- ► tháng 6 2014 (42)
- ► tháng 5 2014 (15)
- ▼ tháng 4 2014 (30)
- ► tháng 3 2014 (8)
- ► tháng 2 2014 (13)

► 2013 (2)
- ► tháng 9 2013 (2)

► 2009 (1)
- ► tháng 6 2009 (1)

Tìm kiếm Blog này

Người theo dõi

Chủ đề Đơn giản. Hình ảnh chủ đề của imagedepotpro. Được tạo bởi Blogger.