Last week, Jingnan showed the difficulties that arise in computers due to their inability to perceive as humans do. This can cause a lot of frustration because as a computer user, sometimes what I want from a computer seems so obvious to me, yet it is, in fact, next to impossible for the computer to perceive that. But while Jingnan was referring to recognizing ants as blobs to track in a video, my frustration is with computers recognizing different species of plants in a photo.
I can’t get too upset by this though, because the thing that fuels my frustration is also what drives my passion for biology, and that is the amazing diversity of life we have on this planet. It is incredible that in this exhausting and difficult climate of southern California (coming from a Seattle-ite), the plants that have survived and thrived can be so different in color, shape, size, flowering patterns, etc., even when faced with the same obstacles! I love how nature and evolution somehow came up with a bunch of different solutions to one problem.
But no matter my love for the biodiversity on this earth, I still find myself frustrated, because somehow I have to teach a computer how to tell all of these amazing plants apart. For example, when looking at Figure 1, I can clearly see many types of plants based on the varying shades of green of their foliage and the presence of flowers. Yet when we run our machine learning algorithms on this photo, the results are not very satisfying, as the computer clearly doesn’t recognize these plants as different. As Cassie said in her October 20th post, these algorithms are good at distinguishing flower from non-flower, but not distinct species within what is classified as “flower.”
Figure 1. A sample picture of the field station taken by a drone
The way I approach this problem is from a human standpoint: how would I recognize these different plants, and how can I translate that into computer speak? Well first of all, I recognize the color of the leaves, the general shape of the plant, and the dispersal of the leaves or flowers. For example, the color of the bush outlined in the bottom right corner is not the same shade of green as the bush outlined in the upper left corner, nor are its branches and leaves as dense. In computer speak, the average color values over the pixels of the upper left corner bush are:
Red: 129.77
Blue: 123.88
Green: 116.65
Whereas those values for the lower right bush are:
Red: 109.44
Blue: 90.16
Green: 103.95
You can see this reflected in our machine learning algorithm: one descriptor used to train the model is the average color over an area (which captures the shade of green), and another descriptor is the variance of the color over the area (which tells us how dispersed the plant's leaves and/or flowers are).
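As a concrete sketch, here is how those two descriptors might be computed with NumPy. The function name and the toy region are illustrative only, not our actual pipeline code:

```python
import numpy as np

def color_descriptors(region):
    """Compute simple color descriptors for an image region.

    `region` is an (H, W, 3) array of RGB pixel values. The per-channel
    mean captures the overall shade (e.g. the shade of green), and the
    per-channel variance roughly captures how mixed the colors are --
    a proxy for how dispersed the leaves and flowers appear.
    """
    pixels = region.reshape(-1, 3).astype(float)
    mean = pixels.mean(axis=0)  # average R, G, B over the area
    var = pixels.var(axis=0)    # variance of R, G, B over the area
    return mean, var

# A tiny synthetic 2x2 "region" with two shades of green.
region = np.array([[[100, 150, 100], [110, 160, 110]],
                   [[100, 150, 100], [110, 160, 110]]], dtype=np.uint8)
mean, var = color_descriptors(region)
```

A uniform patch would have near-zero variance, while a bush dotted with flowers would have a much higher one, which is exactly what makes variance useful as a dispersal cue.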
So what could explain the problem distinguishing species? As Cassie mentioned in her October post, it may be due to the contents of our training data. In the research photos (Figure 2), we have cropped the images down to the flowers, so that nothing in the frame is not part of the flowering plant. Therefore all of the metrics our algorithm comes up with are based solely on the plant and nothing else. However, in the transect images (Figure 3), there are non-plant objects in the frame, which could throw off the metrics, since they no longer accurately describe the plant.
Figure 2. A typical research area training image, containing only the flowering plant
Figure 3. A typical transect image containing both the flowering plant and its surroundings
My approach to tackling this problem is to get rid of those unwanted ground areas by distinguishing their color from that of the plants. This has been my project for the past couple of weeks. With the help of a computer vision professor here at Harvey Mudd, Professor Yekaterina Kharitonova, the technique I have found and will be using is called segmentation and color clustering. It clusters similar colors in an image to create distinct segments that can then be told apart (Figure 4).
Figure 4. An image before and after the segmentation and clustering algorithm
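In spirit, the clustering step works like a k-means over pixel colors. The sketch below is a bare-bones stand-in with a naive deterministic initialization, not the joint optimization from the paper cited below:

```python
import numpy as np

def cluster_colors(image, k=3, iters=10):
    """Group an (H, W, 3) image's pixels into k color clusters with a
    bare-bones k-means; returns a per-pixel label map plus the cluster
    center colors. (Evenly spaced initial centers for simplicity; real
    implementations use smarter seeding such as k-means++.)"""
    pixels = image.reshape(-1, 3).astype(float)
    centers = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)]
    for _ in range(iters):
        # Assign each pixel to its nearest center color.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean color of its assigned pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels.reshape(image.shape[:2]), centers

# Toy image: left half green "foliage", right half brown "ground".
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2] = [60, 160, 60]    # green
img[:, 2:] = [150, 110, 80]   # brown
labels, centers = cluster_colors(img, k=2)
```

With a small k, each cluster becomes a candidate segment (ground, foliage, flowers) that can then be kept or discarded when building training data.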
Figure 5. The same image as Figure 1 with initial segmentation
As of now, I can make the segments, but they aren't clustered specifically enough, and they are segmented by hue (the actual color) and saturation (the amount of gray in the color) rather than just color, which introduces a lot of noise in the images (Figure 5). My hope is to reduce the number of color clusters so that we can distinguish the ground from the plants from the flowers, without superfluous specificity. This will make our training data specific and accurate to the plants in the pictures!
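One illustrative way to cut down the number of clusters: quantize each pixel's hue into a few coarse bins and ignore saturation entirely, so that brown ground, green foliage, and brightly colored flowers land in different bins. The function and bin count here are hypothetical, not our final approach:

```python
import colorsys
import numpy as np

def coarse_hue_labels(image, n_bins=4):
    """Label each pixel of an (H, W, 3) RGB image by a coarse hue bin,
    discarding saturation and brightness. A handful of hue-only bins
    yields far fewer clusters than segmenting on hue and saturation
    together, at the cost of some color detail."""
    h, w, _ = image.shape
    labels = np.empty((h, w), dtype=int)
    for i in range(h):
        for j in range(w):
            r, g, b = image[i, j] / 255.0
            hue, _, _ = colorsys.rgb_to_hsv(r, g, b)  # hue in [0, 1)
            labels[i, j] = int(hue * n_bins) % n_bins
    return labels

# Toy pixels: brown ground, a green leaf, and a blue flower.
patch = np.array([[[150, 110, 80], [60, 160, 60], [100, 100, 200]]],
                 dtype=np.uint8)
labels = coarse_hue_labels(patch)
```

Here the three materials fall into three different bins, which is the kind of coarse ground/plant/flower separation we want before computing descriptors.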
Media Credits:
[Figures 1, 2, and 3] Field Station Images by Cassandra Burgess
[Figure 4] Lobacheva, E., Veksler, O., & Boykov, Y. (2015). Joint Optimization of Segmentation and Color Clustering. 2015 IEEE International Conference on Computer Vision (ICCV).