Creations & Distortions
Digital Surgery Accounts of a Soon-to-be Physicist
Friday, December 9, 2016
Activity 8: Morphological Operations
For the past few activities, we dwelt on properties observable in the Fourier space, the corresponding color distribution and chromaticity of an image, and similar characteristics. In this activity, we dwell more on the structural aspect of objects in an image, and we apply image processing by modifying or morphing certain structural elements. Doing these morphological operations requires some basic knowledge of Set Theory. I won't dwell much on the theory itself; I'll just give a short summary, which can be seen in the table below [1].
Table 1. Set Theory symbols and definitions [1].
A. Erosion and Dilation
Erosion and Dilation are the two morphological operations that we discuss first in this activity. Erosion, by concept and experience, is well known as what occurs when soil, rock, or any piece of the Earth's landmass is displaced from one point to another. When it comes to image processing, an image $A$ can be eroded by another image $B$, and the result is the set of all points $x$ such that $B$, when translated by $x$, is contained in $A$. This morphological process reduces the size of $A$, with $B$ being the Structuring Element. Dilation, on the other hand, is commonly attributed to enlargement or expansion. As with Erosion, Dilation of an image $A$ is done through a Structuring Element $B$: it is the set of points $x$ for which the intersection of $A$ and the reflection of $B$ translated by $x$ is not an empty set. For the first part of the activity, these two morphological operations were done manually through the use of this site: http://polarski.cz/graph_paper/. These manual outputs were then compared with simulated operations. Four test images were both dilated and eroded with the 5 structuring elements found in Figure 1.
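For reference, here are the standard set-theoretic definitions matching the description above, with $B_x$ denoting $B$ translated by $x$ and $\hat{B}$ the reflection of $B$:

$$A \ominus B = \lbrace x \mid B_x \subseteq A \rbrace, \qquad A \oplus B = \lbrace x \mid (\hat{B})_x \cap A \neq \varnothing \rbrace$$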
Comparing with the simulated morphological operations, I can say that I've done quite well in this part. This can be seen in the similarity of the dilated and eroded images for both the manual process (Fig. 2) and the simulation (Fig. 3).
For the simulation, each structuring element was created using the CreateStructureElement() function of Scilab. Eroding and dilating with these structuring elements can then be done digitally through the ErodeImage() and DilateImage() functions, respectively. Though doing the manual part was quite rigorous and took me about 3 hours to finish, it is essential for understanding the process behind these operations. Also, it felt satisfying to produce quite good results. The similarity and notable precision of the manual and simulated morphological processes tell us about the robustness and straightforwardness of morphological operations. Now that we have finished the basic morphological processes, it's time to dig a bit deeper.
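Here is a minimal sketch of that workflow, assuming the IPD/SIVP toolboxes are loaded; the file name and element shape are placeholders:

```
// Minimal sketch: erosion and dilation with the IPD toolbox (names assumed)
im = imread('test_pattern.png');            // binary test image, placeholder file
se = CreateStructureElement('square', 3);   // 3x3 square structuring element
er = ErodeImage(im, se);                    // shrinks objects in the image
di = DilateImage(im, se);                   // grows objects in the image
```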
B. Closing, Opening, and Top-Hat Operators
In this part of the activity, we introduce operations that combine dilation and erosion. The Closing operator dilates first and then erodes an image with the same structuring element, while the Opening operator erodes the image first and then dilates it afterward. These two processes can be used to morph and segment images. The closing operator tends to "close" certain gaps in objects based on the structuring element. Similarly, the opening operator "opens" these gaps for object isolation based on the structuring element. This can be seen in Figure 4 below, wherein a binarized test image of cells is opened (closed) using OpenImage() (CloseImage()). We can see here the said "opening" and "closing" of gaps; in this case, gaps of certain hypothetical cells are either opened or closed. Note that the whole image could be processed because of the initial command stacksize('max') in Scilab, which sets the stack size to the maximum possible memory value.
Figure 4. Morphological operations done on the test cell image seen in the leftmost binarized panel.
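A minimal sketch of this step, under the same toolbox assumptions (bin stands for the binarized cell image):

```
stacksize('max');                           // allot maximum memory for large images
se = CreateStructureElement('circle', 8);   // 8-px circle, per the text
op = OpenImage(bin, se);                    // erode then dilate: "opens" gaps
cl = CloseImage(bin, se);                   // dilate then erode: "closes" gaps
```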
The Top-Hat Operator, on the other hand, is a modification of the two operators above. This operator has two types, one of which is executed in Scilab: the white and black top-hat functions. The white top-hat (the one executed in Scilab) opens an image first and then subtracts the result from the original image, while the black top-hat does the same but with the closing function instead of the opening function. In Figure 4, we see the white top-hat outlining the edges and parts omitted by opening the image with a circle diameter of 8 pixels. Note that the image was binarized with a threshold grayscale value of 218, as seen in the histogram plot (Figure 5) below.
Figure 5. Threshold determination of the hypothetical test cells image.
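A sketch of the white top-hat as described, using the threshold from Figure 5 (im is the grayscale image; names and signatures assumed):

```
bin = double(im > 218);                     // binarize with the Figure 5 threshold
se  = CreateStructureElement('circle', 8);
wth = bin - double(OpenImage(bin, se));     // white top-hat: original minus its opening
```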
C. Cancer Cell Detection
In this activity, we also apply the said operations to complex and vital processes like cancer cell detection. We know that cancer cells are generally larger than normal cells. Here, we use morphological operations to automatically detect cancer cells in an image.
First, we want a basis for whether a cell is a cancer cell or not. Knowing about the variation in sizes, we can use the cell diameter or area in an image as the basis. As seen in Fig. 4 above, the opening operation works quite well for cleaning the image for processing, but I noticed that it deletes some important information. Due to this, I made my own specialized function. This specialized function opens the difference of two sub-images $X$ and $Y$: $X$ is the opening of the closed image, with a structuring-element diameter of 5 px, and $Y$ is the closing of the top-hat image, with the same 5-px diameter. The difference was then opened with a diameter of 10 px. It can be seen that fewer details are lost with the specialized function. Blobs were then labeled using the SearchBlobs() function, and their corresponding areas were noted and plotted, as seen in the image below.
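Here is a sketch of that specialized function, with assumed IPD signatures and variable names; the 0/1 images are kept as doubles so they can be subtracted:

```
se5  = CreateStructureElement('circle', 5);
se10 = CreateStructureElement('circle', 10);
X  = double(OpenImage(CloseImage(bin, se5), se5));   // opening of the closed image
th = bin - double(OpenImage(bin, se5));              // white top-hat of the image
Y  = double(CloseImage(th, se5));                    // closing of the top-hat
d  = (X - Y) > 0;                                    // positive part of the difference
clean = OpenImage(d, se10);                          // open the difference, 10-px circle
blobs = SearchBlobs(clean);                          // label the remaining cells
```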
Figure 6. Histogram plots of the areas of the cell blobs before (top) and after (bottom) the application of the interquartile range (IQR) criterion.
It can be seen that there are outliers, so we use the interquartile range (IQR) method: with $\mathrm{IQR} = Q_3 - Q_1$, where $Q_1$ and $Q_3$ are the first and third quartiles, we only take into account the values satisfying $x > Q_1 - 1.5\,\mathrm{IQR}$ and $x < Q_3 + 1.5\,\mathrm{IQR}$. The result can be seen in the lower histogram in Fig. 6, wherein the range narrowed significantly from 365-5986 to 365-697 pixels. The mean = 469.185 and standard deviation = 66.606 were obtained using the mean() and stdev() functions, respectively. The range of normal-cell areas is therefore between about 400 and 540 pixels.
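A sketch of the area measurement and IQR rejection, using simple quartile estimates (variable names assumed; blobs is the labeled normal-cell image):

```
nb = max(blobs);                            // number of labeled blobs
areas = zeros(1, nb);
for k = 1:nb
    areas(k) = length(find(blobs == k));    // blob area in pixels
end
a  = gsort(areas, 'g', 'i');                // sort areas in increasing order
q1 = a(round(0.25*nb));  q3 = a(round(0.75*nb));
iqr = q3 - q1;
keep = areas(areas > q1 - 1.5*iqr & areas < q3 + 1.5*iqr);
mprintf("mean = %.3f, stdev = %.3f\n", mean(keep), stdev(keep));
```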
Now that we have this range, we can determine the cancer cells in terms of area. We apply this to an image of a hypothetical organ with cancer cells, seen in the figure below. We first binarize the image and then apply the Watershed() function to separate connected blobs, to omit the area anomalies brought about by touching cells. We do this by feeding the watershed a gradient image obtained from the Sobel edge-detection algorithm.
Figure 7. (From left to right) Original hypothetical cancer cells image, binarized image with normalized threshold = 0.88, and the binarized image after application of the Watershed function.
We then apply the said filter range, obtained from the statistical analysis of the normal-cell image, by using FilterBySize(); the result is shown in Figure 8. It can be seen that there are still connected cells that were not segmented by the applied watershed function. This is a limitation of the Sobel edge-detection algorithm and the SearchBlobs function. In the midst of these two sources of error, the process still significantly improves cancer cell detection.
Figure 8. Area-based (left) and diameter-based (right) cell detection methods.
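A sketch of this detection step; the Sobel gradient is computed by hand with conv2, and the size filter below is a manual equivalent of the FilterBySize() call used above (IPD signatures and variable names assumed):

```
sx = [-1 0 1; -2 0 2; -1 0 1];  sy = sx';          // Sobel kernels
g  = sqrt(conv2(binOrgan, sx, 'same').^2 + conv2(binOrgan, sy, 'same').^2);
ws = Watershed(g);                                  // split touching cells
blobs = SearchBlobs(ws > 0);                        // relabel separated regions
for k = 1:max(blobs)                                // keep only cancer-sized blobs
    if length(find(blobs == k)) <= 540 then
        blobs(blobs == k) = 0;                      // discard normal-sized cells
    end
end
```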
Also, since opening deletes any object that cannot contain the structuring element, and a cell's survival therefore depends on the element's diameter, we can determine the cancer cells using a diameter-based method. It was observed that the maximum diameter for normal cells was only 13 pixels; therefore, we open the image with a structuring element of diameter greater than 13, and the result is quite remarkable for its simplicity!
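A sketch of the diameter-based method under the same assumptions:

```
se14 = CreateStructureElement('circle', 14);   // larger than any normal cell (13 px max)
cancer = OpenImage(binOrgan, se14);            // only cancer-sized cells survive the opening
```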
This has been the most tiring activity. Huhu. But yeah, the struggle really paid off. I learned so much from this activity and I had fun! I like the thought that I'm learning and growing as a soon-to-be physicist. Also, I believe that I did quite well in this activity: from explaining the points, to presenting the figures, to extending with specialized and researched functions, I grade myself a 12/10 here. :)
I would like to thank Fatima Garcia, Audrey Arcilla, Miguel Panagsagan, and Denise Musni for being with me as I do this activity. I also would like to thank Recovery Food in UP town center and Kuya Barteezy's blog for guidance. Up to the next activity!
[2] M. Soriano, "Activity 8 - Morphological Operations," Applied Physics 186. 2014.
Wednesday, December 7, 2016
Activity 7: Image Segmentation
Previous activities allowed us to analyze and enhance images through various methods and image-processing applications. Though these processes are quite efficient and reliable, sometimes, for noise reduction, color detection, and similar tasks, it is easier to use operations such as threshold application and color-histogram mapping to remove certain image imperfections. Generally, these are processes of image segmentation. Here, image segmentation was done for both grayscale and colored images.
A. Grayscale Image Segmentation
Image segmentation and processing usually deal with separating, or technically segmenting, specific parts of the entire image that we want to focus on and obtain. Here, we highlight a region of interest (ROI) for further image analysis and processing. Taking note that each image pixel has a certain color value, we can easily segment a grayscale image (such as the one below) due to the simplicity and monotonicity of the values [1].
Notice that this image tends to have darker and lighter parts. If we want to obtain the handwritten and printed parts of the check, we can do grayscale image thresholding as in the previous activities. Also, it is evident that lighter colors dominate most of the image area. We can check this by obtaining a histogram of the said image; segmentation can then be done by omitting the region where the pixel counts peak and the grayscale values cluster. This can be seen in the image below.
In Fig. 2, the omitted regions are highlighted. It can be seen that the image pixels tend to peak and cluster at values between 160 and 230 (highlighted in orange); these are the light-colored grayscale values. We also omit values beyond 230 because we know that these are even lighter parts of the image (highlighted in yellow). With this, we apply image thresholding for values less than 160 and try to see the effect of decreasing the threshold value in increments of 40, as seen in the figure below.
It can be seen in Fig. 3 above that as we decrease the grayscale threshold value, we lose more and more details. Based on personal observation, the most efficient and effective threshold value for obtaining the printed and handwritten text is 120. A value of 160 still includes unwanted features, while a value of 80 lacks certain important features.
Figure 1. Grayscale image of a check obtained from Dr. Soriano.
Figure 2. Histogram of the grayscale image in Figure 1.
Figure 3. Segmented images after grayscale value thresholding. Only grayscale values less than 160 were taken, with the threshold decreasing in increments of 40.
As you can see, grayscale image segmentation can be quite easy as long as you know the correct threshold value for the parts that you need to segment; but when it comes to colored images, segmenting isn't that easy.
B. Colored Image Segmentation
The difficulty with colored-image segmentation, as opposed to grayscale segmentation, is a matter of contrast and monochrome similarity: different colors can have similar grayscale values. This results in confusion when trying to segment colored images by grayscale pixel-value thresholding. Thus, we need not only the brightness variations of the image pixels but also the pure chromaticity of the image details. With this, we can segment the image by specific colors. According to the handout given by Dr. Soriano, we can utilize the Normalized Chromaticity Coordinates (NCC) of the image, given by:
$$r = \dfrac{R}{R + G + B},\quad g = \dfrac{G}{R + G + B},\quad b = \dfrac{B}{R + G + B}$$
with $R$, $G$, and $B$ being the red, green, and blue channels of the image, respectively [1]. It is also implied by the above equation that the sum of $r$, $g$, and $b$ is equal to 1. With this in mind, we write $b = 1 - r - g$ so that we only deal with two variables [1]. Here is a representation of the NCC using the two variables $r$ and $g$:
Figure 4. Normalized Chromaticity Coordinate 2D space with axes $r$ and $g$. Image is from Wikipedia.
As we can see in the figure above, the primary colors are obtained when $r$, $g$, or $b$ equals one, while other colors are represented by combinations of the $r$ and $g$ coordinates in the given color space. Here, we do image segmentation using two proposed methods: the Parametric and Non-parametric Probability Distribution Estimation methods.
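Both methods start from the $r$ and $g$ chromaticities. A minimal sketch of the NCC conversion (the file name is a placeholder; the division guard is a common trick against black pixels):

```
img = double(imread('image.png'));   // assumed file name
R = img(:,:,1);  G = img(:,:,2);  B = img(:,:,3);
I = R + G + B;
I(I == 0) = 1e5;                     // guard against division by zero
r = R ./ I;                          // normalized chromaticity coordinates;
g = G ./ I;                          // b = 1 - r - g, so r and g suffice
```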
B.1. Parametric Probability Distribution Estimation
In this method, a region of interest (ROI) is chosen and its corresponding color distribution is taken into account. Segmentation is done upon determining the probability that the pixel belongs to the said color distribution, which can be obtained by normalizing the histogram of the ROI with the total number of pixels. We then obtain the probability that a pixel chromaticity, either $r$ or $g$, belongs to the obtained ROI through the equation:
$$p(r) = \dfrac{1}{\sigma_r \sqrt{2\pi}} \exp \left \lbrace - \dfrac{( r-\mu_r )^2}{2\sigma_r ^{2}}\right \rbrace$$
Here, we assume independent Gaussian distributions for $r$ and $g$ and obtain $\mu_r$, $\sigma_r$ and $\mu_g$, $\sigma_g$ from the $r$-$g$ values of the cropped pixels. We then take the product of $p(r)$ and $p(g)$ to obtain the probability that a certain color pixel belongs to the color distribution of the ROI [1].
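A sketch of the parametric method ($r$ and $g$ come from the NCC step; roi_r and roi_g are assumed names for the chromaticities of the cropped ROI):

```
mu_r = mean(roi_r);  sd_r = stdev(roi_r);   // Gaussian fit for r
mu_g = mean(roi_g);  sd_g = stdev(roi_g);   // Gaussian fit for g
pr = exp(-(r - mu_r).^2 ./ (2*sd_r^2)) / (sd_r*sqrt(2*%pi));
pg = exp(-(g - mu_g).^2 ./ (2*sd_g^2)) / (sd_g*sqrt(2*%pi));
seg = pr .* pg;                             // joint probability map; threshold to binarize
```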
B.2. Non-parametric Probability Distribution Estimation: Histogram Backprojection
In contrast to the parametric method, which relies on the probability that a pixel belongs to the chosen ROI's color distribution, the non-parametric method uses the histogram itself for pixel tagging [1]. This method, specifically called Histogram Backprojection, assigns each pixel location a value based on the histogram value of its color in chromaticity space. Here, the $r$ and $g$ values are converted into integers and binned into a matrix to obtain a calibrated 2D histogram [1].
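A sketch of histogram backprojection with the 32 bins used later in this post (variable names assumed):

```
nb = 32;                                       // bins per chromaticity axis
ri = round(r*(nb-1)) + 1;                      // bin indices of every image pixel
gi = round(g*(nb-1)) + 1;
rroi = round(roi_r*(nb-1)) + 1;                // bin indices of the ROI pixels
groi = round(roi_g*(nb-1)) + 1;
hist2 = zeros(nb, nb);
for k = 1:length(rroi)                         // build the 2D chromaticity histogram
    hist2(rroi(k), groi(k)) = hist2(rroi(k), groi(k)) + 1;
end
hist2 = hist2 / sum(hist2);                    // normalize into a PDF
ind = ri + (gi - 1)*nb;                        // linear indices into hist2
seg = matrix(hist2(ind(:)), size(r, 1), size(r, 2));   // backprojected map
```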
C. Color Image Segmentation Application: Parametric Segmentation vs. Histogram Backprojection
Here, we apply the two methods to the three images shown in the figure below. The leftmost image is, by observation, the simplest type, wherein color values tend to be distinct and do not overlap. The image in the middle is more complex, with discernible color gradients and transitions. The rightmost image, on the other hand, is an image of personal choice, included to show the validity of the techniques with a personal touch. For all three images, three distinct ROIs were chosen and tested for segmentation.
Figure 5. Images to be segmented. (From left to right) Simple colored image [2], complex colored image [3], and a personal image of choice from my own photo bank.
C.1. Color Image Segmentation for Simple-type Colored Image
The figure below shows the image with three ROIs, namely the pink, yellow, and blue spherical objects. It is quite evident that for segmenting images with simple color palettes, the Histogram Backprojection method is more efficient, as it obtains the desired region of interest with ease. This is due to the effect of binning the histogram values; note that 32 bins were used in the program. Because of this binning, all color values falling within a given bin are tagged as pixels within the ROI; thus, even values of different lighting and shading can still be tagged. On the other hand, though Parametric Segmentation also produces significantly valuable results, it fails to return the whole region of interest and is limited to the color distribution with the same brightness and lighting as the chosen cropped palette. Here, we give 1 pt to Histogram Backprojection.
Figure 6. Color segmentation for the simple-type colored image, showing three different ROIs and image comparisons for the two proposed methods (Parametric and Non-parametric).
C.2. Color Image Segmentation for Complex-type Colored Image
For this part of the activity, the image of choice has much more complex color palettes that transition from one color to another. Similarly, we select three ROIs, specifically blue-green to yellow-green, violet to indigo, and yellow-orange to red, that also transition from one color to another. Here, it is easily discernible that Parametric Segmentation gave more precise, complete, and promising results. This is because parametric segmentation takes into account all the pixel values within the cropped ROI and computes the probability that each image pixel belongs to the said ROI. This makes the Parametric method more dynamic and efficient for images having ROIs of transitioning color palettes.
1 pt goes to the Parametric Method.
C.3. Color Image Segmentation for Colored Image of Personal Choice
Now we go to the tie-breaker round. In this part, we choose an image of personal choice with three ROIs, namely the sky, the clouds, and the sand. Here, we can further note the differences between the two methods. Similar to the results above, the Parametric method focuses more on the precision of the cropped ROI, while Histogram Backprojection focuses on obtaining the chosen ROI regardless of brightness differences. It is quite nice to see that even the cloud reflections visible on the surface of the sea are still segmented. Overall, for the chosen image, Histogram Backprojection gave better results due to its versatility with respect to the lighting of the chosen ROI. But for a different image, where the chosen ROI is something like the transition of the clouds to the sky, Parametric Segmentation would definitely be better than the non-parametric one.
Lastly, I would like to acknowledge and thank my father for taking the picture and providing me good food while I did the activity. I would also like to thank Tin Santos for introducing me to the Econ library; it's quite a convenient place to do papers, and the noise and AC are just right for me to be productive. I thank Denise Musni for pushing me to finish this blog. I would give myself a 12/10 for the extra effort in presenting the results well: I used Microsoft PowerPoint to join figures together and reduce their number, and extra work was done in applying the segmentation methods to real-life images. But the most important part is that I learned a lot and had quite some fun doing it in the midst of hell week. :)
[1] M. Soriano, "A7-Image Segmentation," Applied Physics 186. 2014.
[2] Simple colored image: http://imanada.com/daut/as/m/a/abstract-archives-page-5-of-8-canvas-print-art-colourful-rainbow-balls-modern-design-20x16-free-uk-pp_colourful-abstract-designers_home-decor_home-decoration-ideas-decorator-decore-decorators-collecti_797x797.jpg
[3] Complex colored image: https://hdwallpaperpoint.com/colorful-blocks-rainbow-3d-graphics-background-4k-hd-wallpaper
Thursday, October 13, 2016
Activity 6 - Properties and Applications of the 2D Fourier Transform
Hi there, visitor! This activity is a continuation, and somehow a more advanced version, of Activity 5, wherein we explore deeper properties and characteristics of the Fourier Transform.
First of all, I will not start this blog with a certain life lesson or realization, due to time limitations. Well, I guess the main point here is that sometimes, no matter how Fast we think we are in Transforming, we must realize that there will be certain moments when we need to accept our limitations and do our best in the midst of unwanted circumstances.
ANAMORPHIC PROPERTY OF THE FOURIER TRANSFORM
The anamorphic property of the FT tells us about its inverse nature: if a certain dimension is long in the digital (image) space, it will tend to be short in the Fourier space, and vice versa. We can see this in Figure 1 below, wherein the patterns are on the left, with their FTs on the right.
Figure 1. (Top to bottom) Tall and wide rectangles, and two dots symmetrically separated from the center, with their corresponding Fourier transforms.
The theoretical expectation is met upon examining the dominant, longer dimension in the Fourier space, which is most obvious for the tall and wide rectangles. The two dots have an FT that is a sinusoidal pattern along the x-axis. Anamorphism can also be seen in Figure 2, which shows a decrease in the line widths of the FT (bottom) upon increasing the gap between the two dots (top).
Figure 2. Two dots (top) and their corresponding FT (bottom), showing a decrease in the line widths in Fourier space upon increasing the dot separation in digital space.
Here's a cool GIF for further visualization of anamorphism (please take note of the aliasing due to the very high frequencies of the sinusoid):
ROTATION PROPERTY
Here we explore the rotation property of the FT upon introducing rotation parameters and adding and overlapping sinusoids. The summary of this part can be seen in Figure 3, which shows (from left to right) the FT of a sinusoid showing its peak frequencies (one at negative and one at positive y: the reason why there are 2 dots), the resulting FT upon adding a constant bias, the FT upon introducing a rotation parameter, the FT of combined sinusoids, and the FT of rotated and combined sinusoids.
Figure 3. The sinusoidal patterns (top) and their corresponding FTs (bottom).
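A minimal sketch of how one rotated sinusoid and its FT can be generated (the frequency and angle are arbitrary choices; in Scilab, fft on a matrix performs the multidimensional FFT):

```
n = 256;
x = linspace(-1, 1, n);
[X, Y] = meshgrid(x, x);
f = 4;                                            // spatial frequency, arbitrary
theta = 30 * %pi / 180;                           // rotation angle
z = sin(2*%pi*f*(Y*cos(theta) + X*sin(theta)));   // rotated sinusoid
FT = abs(fftshift(fft(z)));                       // centered FT: two symmetric peaks
```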
Shown below are GIFs for better visualization.
Sinusoid of increasing frequency with its corresponding FT
Sinusoid with increasing constant bias with its corresponding FT
Sinusoid rotated clockwise with its corresponding FT
Combined sinusoids, one in the x and the other in the y direction, with their corresponding FT
Combined and rotated sinusoids with rotational functions and their corresponding FT
Combined and rotated sinusoids with more complicated rotational functions and their corresponding FT
CONVOLUTION THEOREM
Here we revisit and dig deeper into the convolution theorem of Fourier transforms. Figure 4 summarizes what we did in this part. First, the FT of two Dirac deltas (peak points) was obtained; the said points were then replaced by a circle, a square, and a Gaussian bell curve. The sizes of these patterns were varied, and the results were found to still exhibit anamorphism; for very small sizes, they approach the FT of the Dirac deltas.
Figure 4. FTs of two Dirac deltas with the deltas replaced by circles, squares, and Gaussians of varying size.
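A sketch of one such case, two Dirac deltas replaced by Gaussians via convolution (the sizes are arbitrary choices):

```
n = 128;
z = zeros(n, n);
z(n/2, n/2 - 16) = 1;  z(n/2, n/2 + 16) = 1;    // two Dirac deltas
sig = 2;                                         // Gaussian width, arbitrary
[X, Y] = meshgrid(1:n, 1:n);
gs = exp(-((X - n/2).^2 + (Y - n/2).^2) / (2*sig^2));
zg = conv2(z, gs, 'same');                       // replace each delta with a Gaussian
FT = abs(fftshift(fft(zg)));                     // sinusoid under a Gaussian envelope
```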
LINE REMOVAL THROUGH FOURIER SPACE MASKING
Wednesday, October 5, 2016
Activity 5 - Fourier Transform Model of Image Formation
They say that the only constant thing in the universe is change. Whoever we are, wherever we are, or whatever we do, we are always prone to change. From the smallest variations at the subatomic level to the basic life processes that we do and experience, we always experience change. It seems like whatever we do, we undergo a certain transformation that only varies upon your reference point or, to put it simply, your perspective.
Some may look at change in a negative sense, while others might look at it as a positive thing. Our varying perspectives allow us to have the full grasp of the whole image. Personally, I believe that change is good as long as you at least try to understand how it happened. Change will forever be a blur unless you try to experience and engage in it, so that your "transform" will add to your "formation."
Here, we try to transform an image in a literal sense, and upon doing so, we observe and understand multiple properties that are present in the image. We use one of the most popular and easy-to-use models, the Fast Fourier Transform model of image formation.
THE FOURIER TRANSFORM MODEL
One of the most well-known classes of transform models is the Fourier Transform. It is a linear transform that recasts information into another set of linear basis functions. The function of the Fourier Transform (FT) is to convert a physical dimension into the inverse of that dimension. Upon obtaining the FT of a spatial signal, we expect the output to be the inverse space of that signal, particularly its spatial frequency.
In this activity, the FT will be explored with the use of a 2D image. The FT of an image $f(x,y)$ is given by

$$F(f_x, f_y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, e^{-i2\pi(f_x x + f_y y)}\, dx\, dy$$

where $f_x$ and $f_y$ are the spatial frequencies along $x$ and $y$, respectively. Here, we apply the 2D FT to discrete signals; thus we introduce the discrete FT given by

$$F(k,l) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, \exp\left[ -i2\pi \left( \dfrac{kx}{M} + \dfrac{ly}{N} \right) \right]$$

where $k$ and $l$ are the frequency indices along $x$ and $y$, respectively, and $M$ and $N$ are the numbers of samples along $x$ and $y$.
Due to the complexity of the FT model, an algorithm called the Fast Fourier Transform (FFT) was introduced by Cooley and Tukey for faster FT implementation. The results of this activity essentially show the various characteristics and applications of the FT model with the use of Scilab.
RESULTS
A. FFT Exploration
Figure 1. Compiled results in exploring the properties of the FFT for different images.
In this part, I decided to somehow lighten up the mood and be less formal with my writing. Haha. I think this method would be more efficient in relaying my thoughts, especially since this activity is quite a load.
So let's start with Figure 1. As you can see, I compiled the images to show the various processes in this part, wherein I started to get comfortable with the use of the FFT in Scilab. The photos were joined using the old Photo Joiner (http://old.photojoiner.net/), and Kingsoft Presentation was used to add the labels.
Upon observing the images, it can be seen that the real part of the FFT image greatly resembles the forward FFT image. This is expected because the abs() function was introduced to get the modulus of the complex numbers returned by the FFT, and the modulus of a complex number here is influenced more by its real part than by its imaginary part. This also tells us that the whole image of things is not just made up of what we obtain from our senses, what we can observe, or our own idea of reality; we still need a bit of imagination to grasp the bigger picture.
REALIZATION!!!
The fftshift() function was also important in obtaining the forward FFT. Initially, without the fftshift() command, I thought that what I was doing was wrong, but in reality, I just needed to add a simple command to get the right answer. Huhu. Another realization is boiling…
I guess in life, we focus too much on the complex details and upon experiencing failure, we overthink and overlook the simple solution. Sometimes the best solutions are the simplest ones.
Yay for simplicity!
Lastly, looking at the rightmost images of the figure, we can see that applying the FFT twice just gives a vertically inverted version of the original image. This reminds us that in life, there are two types of individuals: the A's and the Circles. The A's are those individuals who, upon exerting much effort, can allow their lives to turn upside-down; note that this can be in either a positive or a negative way. The Circles are the ones who, upon exerting much effort, still end up where they started.
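A minimal sketch of this exploration (the file name is a placeholder; in Scilab, fft on a matrix performs the multidimensional FFT):

```
im = double(imread('circle.png'));   // assumed test image
F  = fft(im);                        // forward 2D FFT
Fs = fftshift(abs(F));               // centered modulus, as shown in Figure 1
Fre = real(F);  Fim = imag(F);       // real and imaginary parts
back = abs(fft(F));                  // applying the FFT twice inverts the image
```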
For me, though this whole thing of A's and Circles is just a hypothesis, the outcome is secondary to the journey. You might end up where you started, or you may have experienced a certain inversion in life; what matters most is what you went through and the priceless knowledge and experiences that you obtained. These things are not perishable and can be considered true treasures of life.
For further exploration, I varied the size of the original image and got quite cool observations and results. I used GIFCreator.me in generating the GIF images.
Figure 2. A GIF of the letter A showing how the FT image varies with the original image size.
Figure 3. A GIF of a 2D Gaussian bell curve showing how the FT image varies with the original image size.
Figures 2 and 3 show that the image size is inversely proportional to the size of the FT image.
Figure 4. A GIF of a centered circle showing how the FT image varies with the original image size.
Here, you can see that changing the image size has effects similar to those in Fig. 2, but a highly notable change can be seen in the imaginary part of the complex FT image: as the image decreases in size, the patterns formed by the imaginary part do not only zoom in but also undergo a form of rotation that leads to pattern variations. Huhu. Beautiful :3
Figure 5. A GIF of a sinusoid along x showing how the FT image varies with the sinusoid frequency.
The FT of a sinusoid along x (Fig. 5) is composed of three dots: one bright dot in the center and two dimmer dots that "sandwich" it. It can be seen that as the frequency increases, the distance of the outer dots from the center dot also increases. For Figure 6, I played with the separation distance of the simulated double slit: upon increasing the distance between the slits, the number of maxima (bright dots) also increases.
Figure 6. A GIF of the simulated double slit showing how the FT image varies with the inter-slit distance.
Figure 7. A GIF of a square function showing how the FT image varies with the original image size.
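A sketch of the simulated double slit (the slit width and separation are arbitrary choices):

```
n = 256;
slit = zeros(n, n);
w = 2;  d = 20;                          // slit half-width and separation, arbitrary
slit(:, n/2 - d - w : n/2 - d + w) = 1;  // left slit
slit(:, n/2 + d - w : n/2 + d + w) = 1;  // right slit
FT = abs(fftshift(fft(slit)));           // two-beam interference under diffraction
```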
B. Convolution
For convolution, the image below (Figure 8) shows the effect of the aperture size of a camera on the reconstructed image. Ideally, it is better to have larger apertures to obtain higher image resolution and more detail; a larger aperture here implies that a larger fraction of the rays reflected off an object is gathered. It is also good to note the Fourier image of a centered circle in Figure 4: as the aperture decreases, the resulting image in Fourier space increases in the number of ripples, which, in connection with Fig. 8, can be seen in the smallest image (leftmost). This is expected because the convolution of two images should look like both of the convolved images.
Figure 8. The aperture of the lens (top) and the corresponding reconstructed (convolved) image of "VIP" (bottom).
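A sketch of this simulation (the file name and aperture radius are placeholders; the aperture is fftshifted so it lines up with the unshifted FT):

```
vip = double(imread('vip.png'));            // assumed "VIP" test image, n x n
n = size(vip, 1);
x = linspace(-1, 1, n);
[X, Y] = meshgrid(x, x);
circ = double(sqrt(X.^2 + Y.^2) < 0.3);     // circular aperture, radius arbitrary
FR = fft(vip) .* fftshift(circ);            // convolution theorem: multiply the FTs
out = abs(fft(FR));                         // reconstructed (and inverted) image
```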
C. Template Matching
The concept here is finding the locations in the given text where the template letter "A" matches, through the use of the correlation function. In the leftmost 2x2 image of Figure 9, it can be seen that the bright spots in the correlated image correspond to the locations where the letter "A" can be found in the text. Upon further exploration, it can be observed that the template image to be correlated should be centered. The middle and rightmost sets of 2x2 images show that the correlated images shift in the direction opposite to where the template is shifted.
Figure 9. Matched A template with the given set of texts while varying the positions of the A template.
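A sketch of template matching through the correlation theorem (file names assumed; the template is a centered "A" on a canvas the same size as the text image):

```
txt = double(imread('text.png'));            // assumed text image
tmp = double(imread('A_template.png'));      // centered "A", same size as txt
corr = abs(fftshift(ifft(fft(txt) .* conj(fft(tmp)))));  // bright spots mark matches
```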
Nothing more, nothing less. Just right.
Figure 10. Matched A template with the given set of texts while varying the sizes of the A template.
Figure 11. Matched AI and IN templates with the given set of texts.
Because sometimes, you really won't always find it :(
Figure 12. Matched meerkat face template with the given image containing 8 similar-looking meerkats, done for the grayscaled (left) and monochromatic (right) versions of the image.
D. Edge Detection
This part of the activity is like an extension of the previous part on template matching, but this time the templates are small edge-pattern matrices convolved with the image. It's important that the matrix elements add up to zero so that there would be no gray values in the resulting edge-detected image.
The results, which can be seen below, correspond to the edge orientations present in both the matrix and the VIP image. Thus, for the horizontal matrix, only horizontal edges can be observed; consequently, for the vertical and diagonal matrices, only vertical and diagonal edges can be observed, respectively. The square and inverted-square matrices are good edge-detecting matrices upon convolution because they have both horizontal and vertical parts.
Figure 13. Edge detection through the convolution method for different test matrices.
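A sketch of this convolution step, reusing the VIP image from part B (names assumed; note that each pattern sums to zero):

```
horiz = [-1 -1 -1;  2  2  2; -1 -1 -1];    // horizontal edge pattern (sums to zero)
vert  = horiz';                            // vertical edge pattern
spot  = [-1 -1 -1; -1  8 -1; -1 -1 -1];    // "square"/spot pattern
edges = abs(conv2(vip, spot, 'same'));     // highlights edges in all directions
```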
ACKNOWLEDGEMENTS
I would like to thank Tonee Hilario for suggesting a good coffee shop for me to finish this activity. The coffee shop is called Niche Cafe located near UP Manila. It has sockets, wifi, pillows, tables, coffee and air-conditioning, which sums up to a productive night. I would also like to thank Denise Musni for staying with me in the said cafe.
I would also like to acknowledge Dr. Maricor Soriano for the handout and for continually guiding us throughout the activities. I find her eagerness and fervor to teach both amusing and inspiring.
Lastly, I would like to rate myself a grade of 12/10, for I know that I did more than what was expected in the activity, and I also had fun making the blog and doing the activity itself. I would also like to post an edge-detected image of me doing this activity with my oh-so-cool hair =))