Fun with Filters and Frequencies!
Overview:
Photos can be decomposed into frequencies using filters, and you can do cool things with these isolated frequencies. My favorite part was extending the project to make hybrids of 3+ images! Check that out below.
Part 1: Frequency For Edges
Consider a cameraman.
Finite Difference
Here are the convolutions with the finite difference operators D_x = [1, -1] and D_y = [1, -1]^T. As we can see, convolving these with the original image yields the difference in pixel values in the x and y directions, allowing us to isolate edges. If we treat each finite difference as the x and y derivative, we can stack them into a gradient vector, and then take its magnitude: ||grad I|| = sqrt((dI/dx)^2 + (dI/dy)^2). We can then binarize this magnitude image with a threshold to isolate only the edges.
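The pipeline above can be sketched as follows. This is a minimal illustration, not the exact project code; the image array and the threshold value are placeholders.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators
D_x = np.array([[1, -1]])        # horizontal difference
D_y = np.array([[1], [-1]])      # vertical difference

def edge_map(img, threshold=0.1):
    """Convolve with D_x and D_y, take the gradient magnitude, binarize."""
    gx = convolve2d(img, D_x, mode="same", boundary="symm")
    gy = convolve2d(img, D_y, mode="same", boundary="symm")
    magnitude = np.sqrt(gx**2 + gy**2)
    return magnitude > threshold
```

The threshold trades off noise suppression against losing faint edges, which is exactly the tuning explored below.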
Gaussian Filters
We then construct a Gaussian kernel of size 11x11 with a sigma value of 2. We use this kernel in two different ways. First, we convolve this kernel with the finite difference operators to get the derivative of Gaussian (DOG) filters:
We then 1) convolve the DOG filters with the original image to get the vertical and horizontal derivative of Gaussian images, and 2) blur the original image by convolving it with the Gaussian kernel, then convolve the result with the original finite difference filters. We can also create the gradient magnitude images for these different filters.
We can then isolate edges using thresholds for both the DOG and Gaussian blur.
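By the associativity of convolution, the two orderings above should agree: blurring then differencing equals a single convolution with the DOG filter. A small sketch checking this, using the 11x11, sigma = 2 kernel from above (the random test image is just a stand-in):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """Separable 2-D Gaussian built from an outer product of 1-D Gaussians."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

G = gaussian_kernel(11, 2)
D_x = np.array([[1, -1]])
DoG_x = convolve2d(G, D_x)                       # derivative-of-Gaussian filter

img = np.random.default_rng(0).random((32, 32))
two_step = convolve2d(convolve2d(img, G), D_x)   # blur first, then difference
one_step = convolve2d(img, DoG_x)                # single DOG convolution
# two_step and one_step agree up to floating-point error
```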
DOG:
Gaussian Blur:
We note two things:
- The Gaussian blur and DOG end results are nearly identical, barring some noise due to thresholding.
- In general, blurring yields stronger edge responses than the plain finite difference, with thicker edges. It identifies edges more reliably overall.
I was curious what happens when edges are not well defined, so I tried the same filtering method on a much less clear image, where finite difference might struggle.
Here’s what happens when I test a Jackson Pollock: A FAILURE
Suppose we isolate the blue channel — it’s actually rather hard to see the edges! Let’s see how good our differences do.
Here is the comparison of finite difference and DOG filters:
And here are the results for various thresholds:
Finite Difference:
DOG:
As we can see, the DOG is significantly less accurate than the finite difference in this instance, creating "fuzzy" edges where none exist. What can we conclude? When edges are packed close together, blurring causes boundaries to smear, which makes edge isolation hard.
Part 2.1: Image Sharpening
Convolution with a Gaussian kernel acts as a low-pass filter. Subtracting this low-pass image from the original image yields the high frequencies, which can be added back to sharpen edges. Take this blurry image of the Taj Mahal, for example:
We can then take various combinations of the original image and the high frequency image to sharpen the image:
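This is the classic unsharp mask. A minimal sketch, where the sigma and the sharpening strength alpha are illustrative values rather than the ones used in the results shown:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(img, sigma=2.0, alpha=1.0):
    """Add back a scaled copy of the high frequencies (unsharp masking)."""
    low = gaussian_filter(img, sigma)   # low-pass: blurred image
    high = img - low                    # high frequencies only
    return np.clip(img + alpha * high, 0, 1)
```

Larger alpha exaggerates edges more; alpha = 0 returns the original image unchanged.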
Rather than just blurring and unblurring photos, a more interesting use case is enhancing older, blurrier photos. Let's see if this method can enhance the sharpness of some ~2010 family photos; the sharpened images are on the top right.
Here’s a progression:
We can also blur and unblur!
Did the sharpening reduce the blur in the original image too? Hard to tell. We can see what happens if we just try to sharpen the original image, without the blur:
The answer: no, the original blur didn't decrease. It actually increased, since the edges of the blur effect itself got sharper. Isn't that paradoxical!
Part 2.2: Hybrid Imaging
The human eye perceives frequencies differently depending on viewing distance. At close range, we see high frequencies very well; as we step away, the eye becomes less able to resolve them, and we fall back on the low frequencies to fill in the gaps! In each of these photos, watch as the high frequency component disappears as you step away, and the low frequency component dominates.
Here, we align the images, then apply a Gaussian filter to one image to keep its low frequencies, and subtract a Gaussian-blurred copy from the other (a Laplacian-of-Gaussian-style high-pass) to keep its high frequencies. We then add them together.
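A sketch of that construction, assuming the images are already aligned and in [0, 1]; the two sigmas are illustrative cutoff choices, not the project's tuned values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(img_low, img_high, sigma_low=6.0, sigma_high=3.0):
    """Low frequencies of one image plus high frequencies of the other."""
    low = gaussian_filter(img_low, sigma_low)                 # seen from afar
    high = img_high - gaussian_filter(img_high, sigma_high)   # seen up close
    return np.clip(low + high, 0, 1)
```

sigma_low controls how blurry the far-away image is; sigma_high controls how much detail of the close-up image survives.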
Bells and Whistles:
Let’s try combining the color channels in various permutations.
You can’t really tell very well from these images, given that the cat’s color is brown/gray and so it’s close to the high frequency image to begin with, but subjectively, it seems like color is the way to go.
Additional Images:
One of the best small LLMs on the market is Starling-7B! It competes with GPT-4o with far fewer parameters. Here we combine the two founders, Jiantao Jiao (126 prof! Bottom Left) and Jian Zhang (Stanford Prof! Top Left):
Close up you will see Jiantao, as the middle image shows. Far away you will see Jian, as the right image shows.
We can also try my favorite, Raccoon and Cat, with the mix in the middle, and how you would see the raccoon from far away (lean back for optimal effect)!
And here are the various furrier components of this image:
Here’s a failure. I tried to blend my friend (whose voice kind of sounds like Batman’s) with the real Batman. Unfortunately, because of the distinctness of Batman’s mask (i.e. triangular nose, pointy ears, distinct eyes), it is unable to blend properly with my friend. Here, |K| is the size of the Gaussian kernel, and Sig is its sigma value.
A False Dichotomy (Another Bell and Whistle): Hybrids of More than One Image
Suppose we want to pack more than 2 images into one hybrid. One might attempt to extend the natural high/low frequency split to a series of frequency bands, with each image “residing” in its own band.
To perform this natural extension of the previous part, we apply the Fourier Transform, isolate a specific frequency band for each image, and then convert it back using the Inverse Fourier Transform. Let’s see this!
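A sketch of band-pass filtering in the Fourier domain. The band edges are radii in frequency-space pixels, like the 25/45/60 cutoffs used below; the hard circular mask is the simplest choice (a soft mask would reduce ringing):

```python
import numpy as np

def bandpass(img, lo, hi):
    """Keep only frequencies whose distance from DC lies in [lo, hi)."""
    F = np.fft.fftshift(np.fft.fft2(img))           # center the spectrum
    rows, cols = img.shape
    y, x = np.ogrid[:rows, :cols]
    r = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
    mask = (r >= lo) & (r < hi)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

The multi-image hybrid is then just the sum of each image passed through its own band.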
Frequencies between 25 and 60:
Frequencies above 45:
Frequencies below 30:
Combining:
As you can see, as the image shrinks, different layers become visible at different points. In the top one, the third image is very visible. In the middle, the third image is less visible, and the first two are very visible. In the bottom image, only the first image is visible!
Part 2.3/2.4: Image Blending
Suppose we wanted to make an Orapple from combining an apple and orange:
To start, we build Gaussian and Laplacian stacks: the Gaussian stack consists of repeated applications of the Gaussian convolution, and the Laplacian stack is the difference between adjacent Gaussian levels. This yields:
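A sketch of the stack construction (stacks keep full resolution at every level, unlike pyramids). The depth and sigma here are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def stacks(img, levels=5, sigma=2.0):
    """Return (gaussian_stack, laplacian_stack) for an image."""
    gaussian = [img]
    for _ in range(levels - 1):
        gaussian.append(gaussian_filter(gaussian[-1], sigma))
    # Each Laplacian level is the difference of adjacent Gaussian levels;
    # the final level keeps the residual low frequencies so the stack sums
    # back to the original image.
    laplacian = [g0 - g1 for g0, g1 in zip(gaussian, gaussian[1:])]
    laplacian.append(gaussian[-1])
    return gaussian, laplacian
```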
We can recreate the stack from Burt & Adelson with a binary mask (pictured on left):
At each level, the combined image looks like:
Summing them together, we get the Orapple! Or maybe the Apprange.
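The per-level blend and final sum can be sketched as below: at each level, a progressively blurrier version of the mask combines the two Laplacian levels, so low frequencies blend over a wide seam and high frequencies over a narrow one. The depth and sigma are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(img_a, img_b, mask, levels=5, sigma=2.0):
    """Multiresolution blending in the style of Burt & Adelson."""
    out = np.zeros_like(img_a, dtype=float)
    ga, gb = img_a.astype(float), img_b.astype(float)
    gm = mask.astype(float)
    for level in range(levels):
        last = level == levels - 1
        # Laplacian level (residual low frequencies at the last level)
        la = ga if last else ga - gaussian_filter(ga, sigma)
        lb = gb if last else gb - gaussian_filter(gb, sigma)
        out += gm * la + (1 - gm) * lb            # blend this frequency band
        ga = gaussian_filter(ga, sigma)
        gb = gaussian_filter(gb, sigma)
        gm = gaussian_filter(gm, sigma)           # mask softens each level
    return out
```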
Now, we can experiment with some of our own masks!
TOTOBRON:
Can we blend Totoro on a branch with the GOAT?
Starry Night At Berkeley
What Did I Learn?
I got a pretty in-depth understanding of filters and frequencies, specifically low- and high-pass filters. Most importantly, I got an intuitive understanding of how changing the sigma and the number of Gaussians impacts filtering.