Convolutions are ways to transform images. There are two main applications of this technique: image editing and convolutional neural networks. Convolutional neural networks, or CNNs, use convolutions to transform the image making them perform much better at computer vision tasks.
This is a post about fast.ai lesson 6. Read all the fast.ai posts so far here.
What is convolution?
Convolution comes from the latin word convolvere which means ‘roll together’, from con-‘together’ + volvere‘roll’ (Oxford Dictionary). Mathematically, convolving is a way to compute a new value for each pixel in the image to transform it.
Convolutions use so-called kernels, “configurations”, that determine what the output of a convolution will look like. A kernel is basically an n by n dimensional matrix.
By multiplying a set of pixels with this kernel, hence the name roll together, yields the output for the “focussed” pixel (the pixel in the middle of the set):
There are many kernels that are widely used in both image editing software and deep learning. Below are some popular kernels.
The Identity Kernel
The identity kernel is, much like the identity matrix, a kernel that has no effect when used for convolution.
The mathematical definition of the identity kernel:
An example of an image convoluted by the identity kernel:
The Box Kernel
The box blur kernel is often used in image editing.
The mathematical definition of the box kernel:
An example of an image convoluted by the box kernel:
The Gaussian Blur Kernel
Like the box blur kernel, the gaussian blur kernel blurs an image. The main difference is the distribution of the pixels.
The mathematical definition of the gaussian blur kernel:
An example of an image convoluted by the gaussian blur kernel:
Unfortunately, this kernel doesn’t actually increase the resolution of the image. What sharpening means here is increasing the diversity of the image. Like box blur and gaussian blur, this kernel can be used a nice touch in image editing.
The mathematical definition of the sharpening:
An example of an image convoluted by the sharpening kernel:
Edge detection kernel
This is perhaps the most interesting kernel. It emphasizes the outlines of objects. CNNs are the main application of this kernel. When using edge detection kernels, CNNs do much better at image classification. Becaues this kernel is used in deep learning, it also got a fancy name: “Laplacian Kernel”.
The mathematical definition of edge detection kernel:
An example of an image convoluted by the edge detection kernel:
This image might look very confusing to humans, but it is actually a pretty good image to be used by computers due to the stark contrast between light and dark pixels.
Applying kernels to an image using lumpy
Start by defining the above kernels in python:
identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]]) box_blur = (1/9) * np.ones((3, 3)) gaussian_blur = (1/16) * np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) sharpening = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]]) edge_detection = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]])
Applying a kernel to a kernel can be done easily using
numpy.convolve. To load the image I used a library called
Pillow. Note that both the image and kernel need to be flattened because
numpy.convolve only accepts 1D arrays. This also means the output of the convolution needs to be reshaped in order to be displayed. Setting mode to
"same"makes sure the resolution is preserved.
def apply(img, kernel): w, h = np.asarray(img).shape img = np.asarray(img) out = np.convolve(img.flatten(), kernel.flatten(), mode='same').reshape((w, h)) out = np.array(out, dtype=np.uint8) return Image.fromarray(out)
Finally, we can apply the kernels to an image and save the results to disk.
img = Image.open('image.png').convert('F') apply(img, identity).save('identity.png') apply(img, box_blur).save('box.png') apply(img, gaussian_blur).save('gaussian.png') apply(img, sharpening).save('sharpening.png') apply(img, edge_detection).save('edge.png')
If you want to learn more about convolutions and CNNs, be sure to check out the following links:
- CS1114 Section 6: Convolution. This is a great document by Cornell University on convolutions and it also includes some exercises to test your knowledge.
- Convolutional Neural Networks (CNNs) explained. A video that explains CNNs is a very simple manner.
- Kernels - Wikipedia
A huge thanks to Sam Miserendino for proofreading this post!