Wednesday, October 5, 2016

Activity 5 - Fourier Transform Model of Image Formation

They say that the only constant thing in the universe is change. Whoever we are, wherever we are, or whatever we do, we are always prone to change. From the smallest variations at the subatomic level to the basic life processes that we go through, we always experience change. It seems like whatever we do, we undergo a certain transformation that varies only with our reference point or, to put it simply, our perspective.

Some may look at change in a negative sense, while others might look at it as a positive thing. Our varying perspectives allow us to have the full grasp of the whole image. Personally, I believe that change is good as long as you at least try to understand how it happened. Change will forever be a blur unless you try to experience and engage in it, so that your "transform" will add to your "formation."

Here, we try to transform an image in a literal sense, and upon doing so, we observe and understand multiple properties that are present in the image. We use one of the most popular and easy-to-use models, the Fast Fourier Transform Model in Image Formation.


THE FOURIER TRANSFORM MODEL

One of the best-known classes of transform models is the Fourier Transform. It is a linear transform that recasts information into another set of linear basis functions. The function of the Fourier Transform (FT) is to convert a physical dimension into the inverse of that dimension. Upon obtaining the FT of a spatial signal, we expect the output to be in the inverse space of that signal, particularly its spatial frequency.

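In this activity, the FT will be explored with the use of a 2D image. The FT of an image f(x,y) is given by

$$F(f_x, f_y) = \int\!\!\int f(x, y)\, e^{-i 2\pi (f_x x + f_y y)}\, dx\, dy$$

where f_x and f_y are the spatial frequencies along x and y, respectively. Here, we apply the 2D FT for discrete signals; thus we introduce the discrete FT given by

$$F(k, l) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-i 2\pi \left( \frac{kx}{M} + \frac{ly}{N} \right)}$$

(up to a normalization constant that depends on convention), where k and l are the spatial-frequency indices along x and y, respectively. Here, M and N are the number of samples along x and y.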
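To make the discrete FT concrete, here is a minimal sketch of it as a direct double sum. It is written in plain Python rather than the Scilab used in this activity, it assumes the common unnormalized forward convention, and the name `dft2` is just a hypothetical helper. It is far too slow for real images, which is exactly why the FFT exists.

```python
import cmath

def dft2(f):
    """Direct 2D discrete FT of a list-of-lists image f[x][y] (slow, illustration only)."""
    M, N = len(f), len(f[0])
    F = [[0j] * N for _ in range(M)]
    for k in range(M):
        for l in range(N):
            # Sum every pixel, weighted by a complex exponential at frequency (k, l).
            F[k][l] = sum(
                f[x][y] * cmath.exp(-2j * cmath.pi * (k * x / M + l * y / N))
                for x in range(M)
                for y in range(N)
            )
    return F

# A flat (constant) 4x4 image has no spatial variation,
# so all of its energy should land in the zero-frequency term F[0][0].
flat = [[1.0] * 4 for _ in range(4)]
F_flat = dft2(flat)
```

For the flat image, only `F_flat[0][0]` is non-zero; every other term cancels out.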

Because evaluating the FT directly is computationally expensive, an algorithm called the Fast Fourier Transform (FFT) was introduced by Cooley and Tukey for faster FT implementation. The results of this activity essentially show the various characteristics and applications of the FT model with the use of Scilab.


RESULTS

A. FFT Exploration
Figure 1. Compiled results in exploring the properties of FFT for different images.
In this part, I decided to somehow lighten up the mood and be less formal with my writing. Haha. I think that this method would be more efficient in relaying my thoughts, especially since this activity is quite a load.

So let's start with figure 1. As you can see, I compiled the images to show the various processes in the part wherein I started to get comfortable with the use of FFT in Scilab. The photos were joined using the old photo joiner (http://old.photojoiner.net/) and Kingsoft Presentation was used to add the labels.

Upon observing the images, it can be seen that the real part of the FFT image greatly resembles the forward FFT image. This is expected because the abs() function was used to get the modulus of the complex FFT output. The modulus, sqrt(Re^2 + Im^2), combines the real and imaginary parts, so it resembles the real part wherever the real part is the larger of the two, which appears to be the case here. This also tells us that the whole image of things is not just made up of what we obtain from our senses, what we can observe, or our own idea of reality; we still need a bit of imagination to grasp the bigger picture.

REALIZATION!!!
The fftshift() command, which moves the zero-frequency term to the center of the image, was also important in obtaining the forward FFT image. Initially, without the fftshift() command, I thought that what I was doing was wrong, but in reality, I just needed to add a simple command to get the right answer. Huhu. Another realization is boiling…
I guess in life, we focus too much on the complex details and upon experiencing failure, we overthink and overlook the simple solution. Sometimes the best solutions are the simplest ones. 
Yay for simplicity!
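For the curious, here is a rough sketch of what an fftshift-style operation does to a 2D array. This is plain Python, not Scilab's own fftshift(), and `fftshift2` is a hypothetical name: it just circularly shifts the array by half its size along each axis so the zero-frequency term ends up in the middle.

```python
def fftshift2(F):
    """Circularly shift a 2D list so the zero-frequency term moves to the center."""
    M, N = len(F), len(F[0])
    return [[F[(r - M // 2) % M][(c - N // 2) % N] for c in range(N)]
            for r in range(M)]

# Hypothetical 4x4 spectrum whose only non-zero value is the zero-frequency term,
# which an unshifted FFT places at the corner [0][0].
spectrum = [[0] * 4 for _ in range(4)]
spectrum[0][0] = 1
centered = fftshift2(spectrum)
```

After the shift, the lone bright value sits at `centered[2][2]` instead of the corner, which is why the unshifted FFT image looks "wrong" at first glance.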

Lastly, looking at the rightmost images of the figure, we can see that each is just a vertically inverted image of the original one. This reminds us that in life, there are two types of individuals: the A's and the Circles. The A's are those individuals who, upon exerting much effort, can turn their lives upside-down. Note that this can be in either a positive or a negative way. The Circles are the ones who, upon exerting much effort, still end up with what they started.

For me, though this whole thing of A's and Circles is just a hypothesis, the outcome is just secondary to the journey. Whether you end up where you started or you experience a certain inversion in life, what matters most is what you went through and the priceless knowledge and experiences that you obtained. These things are not perishable and can be considered as true treasures of life.

For further exploration of things, I varied the size of the original image and got quite cool observations and results. I used GIFCreator.me in generating the GIF images.


Figure 2. A GIF of the letter A showing the various differences of the FT image with respect to the original image size.
Figure 3. A GIF of a 2D Gaussian bell curve showing the various differences of the FT image with respect to the original image size.

Figures 2 and 3 show that the size of the pattern in the original image is inversely proportional to the size of the pattern in its FT image: the smaller the original, the wider its FT.
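This inverse relationship can be checked with a small 1D sketch. The example below is plain Python (not the Scilab from the activity) and uses a simple 1D slit instead of the letter A or the Gaussian, which is my own simplification. It looks for the first frequency bin where the FT magnitude drops to zero, which marks the half-width of the central lobe; `dft_mag` and `first_zero` are hypothetical helper names.

```python
import cmath

def dft_mag(x):
    """Magnitude of the direct 1D discrete FT of a list of samples."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n)]

def first_zero(width, n=64):
    """Index of the first near-zero FT bin for a slit that is `width` samples wide."""
    slit = [1.0] * width + [0.0] * (n - width)
    mag = dft_mag(slit)
    return next(k for k in range(1, n) if mag[k] < 1e-9)

narrow = first_zero(4)  # narrow slit, so a wide central lobe in frequency space
wide = first_zero(8)    # slit twice as wide, so a central lobe half as wide
```

Doubling the slit width from 4 to 8 samples halves the position of the first zero, from bin 16 to bin 8.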


Figure 4. A GIF of a centered-circle showing the various differences of the FT image with respect to the original image size.
Here, you can see that though changing the image size has effects similar to those in Fig. 2, a highly notable change can be seen in the imaginary part of the complex FT image. It can be seen that as the image decreases in size, the patterns formed by the imaginary part do not only zoom in but also somehow rotate, which leads to pattern variations. Huhu. Beautiful :3


Figure 5. A GIF of a sinusoid along x showing the various differences of the FT image with respect to the original image sinusoid frequency.
The FT of a sinusoid along x (Fig. 5) is composed of three dots: one bright dot in the center and two dimmer dots that "sandwich" the bright dot. It can be seen that as the frequency increases, the distance of the outer dots from the center dot also increases. For Figure 6, I tried to play with the separation distance of the simulated double slit. It can be seen that upon increasing the distance between the slits, the number of maxima (bright dots) also increases.
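The three-dot pattern of the sinusoid can be reproduced with a small 1D sketch in plain Python (again, not the Scilab used here). I am assuming that the sinusoid image carries a constant bias so that its values stay non-negative, which is what would produce the bright center dot; the helper names are hypothetical.

```python
import cmath
import math

def dft_mag(x):
    """Magnitude of the direct 1D discrete FT of a list of samples."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n)]

def peak_bins(freq, n=32):
    """Frequency bins with non-negligible magnitude for a biased cosine."""
    # Assumed bias of 0.5 keeps the signal between 0 and 1, like a grayscale image.
    signal = [0.5 + 0.5 * math.cos(2 * math.pi * freq * t / n) for t in range(n)]
    mag = dft_mag(signal)
    return [k for k in range(n) if mag[k] > 1e-6]

low = peak_bins(3)   # bias at bin 0, sinusoid at bins 3 and 32 - 3
high = peak_bins(6)  # the outer peaks move away from bin 0 as frequency rises
```

Bin 0 is the bias term, and the two other bins are the positive and negative frequencies of the cosine. After an fftshift they would sit symmetrically on either side of the center.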

Figure 6. A GIF of the simulated double slit showing the various differences of the FT image with respect to the original inter-slit distance
It is also good to note that the square function image acts like two double slits, one along x and another along y. This can be seen in the FT image for small squares, but as the size increases, the maxima tend to dim until they are no longer observable in the FT image.

Figure 7. A GIF of square function showing the various differences of the FT image with respect to the original image size.

B. Convolution

For the convolution, the image below (Figure 8) shows the effect of a camera's aperture size on the reconstructed image. This shows that, ideally, it's better to have a larger aperture for you to have higher image resolution and more details. A larger aperture here implies that a higher percentage of the rays reflected off an object is gathered. It is also good to note the Fourier image of a centered circle in Figure 4. As you can see, as the aperture decreases, the resulting image in Fourier space increases in the number of ripples, which in connection to Fig. 8 can be seen in the image for the smallest aperture (leftmost). This is expected because the convolution of two images should resemble both of the convolved images.

Figure 8. An image showing the aperture of the lens (top) and the corresponding reconstructed (convolved) image of VIP (bottom)
The idea here is technically straightforward. If you wanna have a clearer view of things, you might want to look at the world through a lens with a wider aperture. Conversely, if your view of everything seems blurry and unsure, maybe you're just looking at the world with a smaller lens. We all need to increase our field of vision to live the fullness of life.


C. Template Matching


The concept here is to find where the template letter "A" matches in the given text through the use of the correlation function. In the leftmost 2x2 image of Figure 9, it can be seen that the resulting bright spots in the correlated image correspond to the locations where the letter "A" can be found in the text. Upon further exploration, it can be observed that the template image to be correlated should be centered. The middle and rightmost sets of 2x2 images show that the correlated images shift in the direction opposite to where the template is shifted.
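Here is a minimal 1D sketch of the same idea in plain Python (not the Scilab code used for the figures). It computes the correlation as the inverse FT of FT(text) times the complex conjugate of FT(template), which is one standard way to do it. The `text` and `template` arrays are made-up toy data, and the helper names are hypothetical.

```python
import cmath

def dft(x, sign=-1):
    """Direct 1D discrete FT; sign=+1 gives the unnormalized inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def correlate(text, template):
    """Circular cross-correlation via FT(text) * conj(FT(template))."""
    n = len(text)
    product = [a * b.conjugate() for a, b in zip(dft(text), dft(template))]
    return [v.real / n for v in dft(product, sign=+1)]

def brightest(corr):
    """Positions where the correlation reaches its maximum value."""
    top = max(corr)
    return [s for s, v in enumerate(corr) if v > top - 1e-6]

# Toy "text" containing the pattern 1, 2, 1 at positions 2 and 9.
text = [0, 0, 1, 2, 1, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]
template = [1, 2, 1] + [0] * 13   # pattern placed at the origin
shifted = [0, 1, 2, 1] + [0] * 12  # same pattern moved one sample to the right

peaks = brightest(correlate(text, template))
peaks_shifted = brightest(correlate(text, shifted))
```

The peaks land at positions 2 and 9, where the pattern sits in the text. Moving the template one sample to the right moves the peaks one sample to the left (positions 1 and 8), which is the same opposite-direction shift seen in Figure 9.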
Figure 9. Matched A template with the given set of texts while varying the positions of the A template.
I also explored template sizes, and as you can observe in Figure 10, the result is not size independent: the template should be the same size as the matching figure in the larger image for the correlation to work.
Nothing more, nothing less. Just right.
Figure 10. Matched A template with the given set of texts while varying the sizes of the A template.
Also, I just wanted to prove that the correlation function is not exclusive to single letters. I applied the same correlation method to two letters (Figure 11) and for face detection (Figure 12).
Figure 11. Matched AI and IN templates with the given set of texts.
The results show that finding one of eight similar-looking meerkats can be done using face detection with image correlation. Upon observing Figure 12, it is also good to note that the correlation process works reliably only for monochromatic (black-and-white) images (Fig. 12 right) and is not highly reliable for gray-scaled images (Fig. 12 left).
Because sometimes, really, you won't always find them :(
Figure 12. Matched meerkat face templates with the given image containing 8 similar-looking meerkats. This was done for the gray-scaled (left) and the monochromatic (right) versions of the image.

D. Edge Detection

This part of the activity is like an extension of the previous part, specifically the template matching method, but this time the template is a small matrix of values that is convolved with the image. It's important that the matrix elements add up to zero so that flat regions give zero output and there would be no gray values in the resulting edge-detected image.

The results, which can be seen below, correspond to the edge directions present in both the matrix and the VIP image. Thus, for the horizontal matrix, only horizontal edges can be observed. Likewise, for the vertical and diagonal matrices, only vertical and diagonal edges can be observed, respectively. The square and inverted square matrices are good edge-detecting matrices upon convolution because they both have horizontal and vertical parts.
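A small sketch of this behavior in plain Python (not the Scilab code from the activity) is shown below. The 3x3 matrices are my assumption of typical zero-sum horizontal and vertical patterns, not necessarily the exact ones used for Figure 13, and `convolve2d_valid` is a hypothetical helper that only computes the output where the matrix fully overlaps the image.

```python
def convolve2d_valid(image, kernel):
    """2D convolution over the region where the kernel fully overlaps the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    # True convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    return [[sum(flipped[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Assumed zero-sum patterns: each matrix adds up to 0.
horizontal = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
vertical = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]

flat = [[5] * 5 for _ in range(5)]                         # no edges at all
step = [[0] * 5 for _ in range(2)] + [[1] * 5 for _ in range(3)]  # one horizontal edge

flat_out = convolve2d_valid(flat, horizontal)
edges_h = convolve2d_valid(step, horizontal)
edges_v = convolve2d_valid(step, vertical)
```

The flat image gives all zeros because the matrix sums to zero. The horizontal matrix responds only in the rows around the step, while the vertical matrix gives all zeros for the same image since it has no vertical edges to find.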
Figure 13.  Edge detection through matrix multiplication and convolution method for different test matrices.

ACKNOWLEDGEMENTS

I would like to thank Tonee Hilario for suggesting a good coffee shop for me to finish this activity. The coffee shop is called Niche Cafe located near UP Manila. It has sockets, wifi, pillows, tables, coffee and air-conditioning, which sums up to a productive night. I would also like to thank Denise Musni for staying with me in the said cafe.

I would also like to acknowledge Dr. Maricor Soriano for the handout and for continually guiding us throughout the activities. I find her eagerness and fervor to teach both amusing and inspiring.

Lastly, I would like to rate myself a grade of 12/10, for I know that I went beyond what was expected of us in the activity, and I also had fun making the blog and doing the activity itself. I would also like to post an edge-detected image of me doing this activity with my oh-so-cool hair =))



