Tuesday, December 7, 2021

That looks rough: Extracting texture from images

Being able to analyze images for valuable information is the foundation of the image processing field and a key part of modern-day technology like facial detection. Many different models have been developed to classify or quantify the information obtained in these images. One such model is the random forest algorithm, a topic covered by Bee Lab member Tom Fu in Random Forest Model: What It Is and Why It Works. As Tom explains, random forests take in features of a dataset and build decision trees that analyze these features to produce a classification or numerical output.

[1] Description of a decision tree for playing tennis, the backbone for the random forest algorithm.

Random forests can be used to effectively predict information given a set of input data that we call 'features'. For example, a random forest model that predicts house prices would take in features corresponding to the age of the house, the average house price in the area, etc. Thinking back to our image problem, applying a random forest can be a little tricky because it expects features in the form of single numbers that represent key characteristics of the input. We could flatten an image out into a single-row array and treat each RGB pixel value as a feature, but then our 'features' would not represent very meaningful information about the image.
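To make the house-price example concrete, here is a minimal sketch using scikit-learn's RandomForestRegressor; the houses, feature values, and prices are all made up for illustration:

```python
# A toy random forest on single-number features, using scikit-learn.
# The data here is invented purely to illustrate the feature format.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row is one house: [age in years, average nearby house price].
X = np.array([[10, 300_000], [50, 250_000], [5, 400_000], [30, 280_000]])
y = np.array([320_000, 210_000, 450_000, 260_000])  # sale prices

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
prediction = model.predict([[20, 350_000]])  # predict a new house's price
```

Note that each feature is a single meaningful number, which is exactly the format we need to produce from an image.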

One way to approach this problem is to extract textural information from our image. Texture is a defining characteristic that we can easily recognize as humans, but it depends heavily on spatial relationships that aren't well captured by individual pixel values. This is where gray level co-occurrence matrices, or GLCMs, come in! A GLCM takes in a direction defined by a pixel distance and an angle and then represents the pixel changes of an image in that direction. More specifically, a GLCM counts the number of times a specific pixel value occurs near another specific pixel value in the defined direction, holding this information in a matrix. In this way, GLCMs help capture the spatial relationships between pixels, which we can use to get information about the texture of an image.



[2] Example input image and the corresponding GLCM matrix.

The figure above represents a GLCM where the direction is one pixel away horizontally. The ith row and jth column of the matrix records how many times the ith pixel value (e.g. a gray level of 5) occurs one horizontal pixel away from the jth pixel value. For example, the GLCM value at [5,6] is 1, so there is only 1 occurrence of a 5 directly next to a 6 in the image. This repeats for all possible combinations of pixel values, giving an eight-by-eight matrix for an image with gray levels one to eight. A GLCM is dependent on its direction; if the direction for the matrix above were 1 pixel away diagonally upwards, we would get an entirely different output. This means that we want to create many GLCMs, each with a unique direction and distance, in order to get a good characterization of our image. Finally, we normalize the GLCM by dividing each component by the sum of the entire matrix, scaling each value down to between 0 and 1.
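The counting step can be sketched in a few lines of NumPy. The image below is invented for illustration (it is not the figure's image), but it is built so that, as in the example above, a 5 sits directly to the left of a 6 exactly once:

```python
# Building a GLCM by hand for the "one pixel to the right" direction,
# assuming gray levels 1-8 as in the figure. The image is made up.
import numpy as np

image = np.array([
    [1, 1, 5, 6, 8],
    [2, 3, 5, 7, 1],
    [4, 5, 7, 1, 2],
    [8, 5, 1, 2, 5],
])

levels = 8
glcm = np.zeros((levels, levels), dtype=int)
for row in image:
    for left, right in zip(row[:-1], row[1:]):
        glcm[left - 1, right - 1] += 1  # count each horizontal pair (i, j)

# Normalize so all entries sum to 1.
glcm_norm = glcm / glcm.sum()
```

Here `glcm[4, 5]` (gray level 5 followed by gray level 6) comes out to 1, matching the worked example in the text.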

Sadly, a sequence of 2D GLCMs is still not a set of single-number features that can be passed into our random forest model. What we can do next is turn each of our GLCMs into a feature by taking a weighted sum of its entries. This effectively collapses our sequence of matrices into a sequence of pertinent features. There are various ways to collapse a GLCM into a distinct feature, and a set of standard features has been established: contrast, dissimilarity, homogeneity, and angular second moment/energy.


[3] P_i,j represents the normalized GLCM value at [i,j], i.e. how often value i occurs next to value j in the chosen direction

Contrast measures the amount of local variation present in an image, weighting large differences in pixel values more heavily. Dissimilarity is similar but weights differences in pixel values linearly, capturing smaller shifts better than contrast. Homogeneity represents the amount of similarity in an image and has an inverse relationship to contrast and dissimilarity. Angular Second Moment (ASM) is the squared sum over all the GLCM values, measuring how uniform or orderly the pixel transitions are, while energy is the square root of ASM so that it fits nicely between 0 and 1.
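These weighted sums translate directly into NumPy. A small sketch, using a made-up 2x2 normalized GLCM just to keep the arithmetic visible:

```python
# The standard GLCM features, computed from a normalized GLCM P.
# P here is a toy 2x2 matrix (entries sum to 1) chosen for illustration.
import numpy as np

P = np.array([[0.2, 0.1],
              [0.3, 0.4]])
i, j = np.indices(P.shape)  # row and column index of every entry

contrast      = np.sum(P * (i - j) ** 2)          # big jumps weighted heavily
dissimilarity = np.sum(P * np.abs(i - j))         # linear weighting
homogeneity   = np.sum(P / (1.0 + (i - j) ** 2))  # high when values are similar
asm           = np.sum(P ** 2)                    # angular second moment
energy        = np.sqrt(asm)                      # rescaled to lie in [0, 1]
```

Each line collapses the whole matrix into one number, which is exactly the single-number feature format a random forest expects.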

These features provide standard methods of processing GLCMs and determining key spatial characteristics of a given image. Now, when we want to apply a random forest model to an image, we can find a sequence of GLCMs with different directions, boil down each GLCM into a set of characteristic numbers, and combine all this information into a list of features that the random forest model can accept! This sounds like a lot of work, but luckily many libraries handle the entire process for you, such as the Python library scikit-image.



[4] Local contrast in an image, in this case the edges of objects

But what about local features of our image? We can find these by applying the entire process outlined above to a specific window of pixels in our input image. We can then slide the window of pixels around and repeat the process to get local features across the image [4]. This sliding-window approach is closely related to 2D convolution with a kernel, a foundational part of image processing.
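The sliding-window idea can be sketched by reusing the hand-rolled GLCM from earlier; the window size and the choice of contrast as the local feature are arbitrary here:

```python
# Local texture sketch: slide a window over the image and compute one
# GLCM feature (contrast) per window position. All choices are toy ones.
import numpy as np

def glcm_contrast(patch, levels=8):
    """Contrast of the horizontal distance-1 GLCM of a small patch."""
    glcm = np.zeros((levels, levels))
    for row in patch:
        for a, b in zip(row[:-1], row[1:]):
            glcm[a, b] += 1  # patch values are assumed to be 0..levels-1
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    return np.sum(glcm * (i - j) ** 2)

rng = np.random.default_rng(1)
image = rng.integers(0, 8, size=(16, 16))

win = 5  # window size: one local value per window position
contrast_map = np.array([
    [glcm_contrast(image[r:r + win, c:c + win])
     for c in range(image.shape[1] - win + 1)]
    for r in range(image.shape[0] - win + 1)
])
```

The result is a map of local contrast values, one per window position, much like the edge map in figure [4].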

Regardless of how you calculate them, GLCMs are a good way of extracting key information from an image to feed into a machine learning model. Looking back to our initial example of face detection, a model could use GLCMs to determine textural characteristics of a face and tell it apart from other, similarly shaped objects! Of course, texture is not the only feature we can extract, but it does provide valuable information about an object and can be paired with the outputs of other feature detectors to better characterize our images.


Further reading:
V. S. Thakare and N. N. Patil, "Classification of Texture Using Gray Level Co-occurrence Matrix and Self-Organizing Map," 2014 International Conference on Electronic Systems, Signal Processing and Computing Technologies, 2014, pp. 350-355, doi: 10.1109/ICESC.2014.66. https://ieeexplore.ieee.org/document/6745402

Hall-Beyer, Mryka. (2017). GLCM Texture: A Tutorial v. 3.0 March 2017. doi: 10.13140/RG.2.2.12424.21767. https://www.researchgate.net/publication/315776784_GLCM_Texture_A_Tutorial_v_30_March_2017

Media Credits:
[1] Figure 1 from Vishal S. Thakare; Nitin N. Patil, 2014. “Classification of Texture Using Gray Level Co-occurrence Matrix and Self-Organizing Map”. https://ieeexplore.ieee.org/document/6745402

[2] Figure 4 from Richard Thomas, Lei Qin, Francesco Alessandrino, Sonia P. Sahu. 2019. "A review of the principles of texture analysis and its role in imaging of genitourinary neoplasms"

[3] Figure in https://scikit-image.org/docs/0.7.0/api/skimage.feature.texture.html

[4] Figure by Dan Ringwald, https://medium.com/sicara/opencv-edge-detection-tutorial-7c3303f10788
