July 23, 2019 · 9 mins read

Compressing images using Python

Compressing images is a neat way to shrink the size of an image while maintaining the resolution. In this tutorial we’re building an image compressor using Python, Numpy and Pillow. We’ll be using machine learning, the unsupervised K-means algorithm to be precise.

If you don’t have Numpy and Pillow installed, you can do so using the following command:

pip3 install pillow
pip3 install numpy

Start by importing the following libraries:

import os
import sys

from PIL import Image
import numpy as np

The K-means algorithm

K-means, as mentioned in the introduction, is an unsupervised machine learning algorithm. Simplified, the major difference between unsupervised and supervised machine learning algorithms is that supervised learning algorithms learn by example (labels are in the dataset), and unsupervised learning algorithms learn by trial and error; you label the data yourself.

I’ll be explaining how the algorithm works based on an example. In the illustration below there are two features, and .

Unclassified points on a 2d plane

We want to assign each item to one out of two clusters. The most natural way to do this would be something like this:

Colored points on a 2 plane

K-means is an algorithm to do exactly this. Note that is the number of clusters we want to have, hence the name K means.

We start by randomly getting training examples and in so that the points are and where is the number of features. At initialization, the points might be very close to one another, so we have to check if our result looks like how we want it to look because it might stuck be in a local optimum.

Then for a certain number of iterations, execute the following steps:

  1. For every training example, assign to the closest centroid.
  2. For every centroid , set the location to be the average of examples assigned to it.

As the algorithm progresses, generally the centroids will move to the center of the clusters and the overall distance of the examples to the clusters gets smaller.

K-means for image compression

This technique can also be used for image compression. Each pixel of the image consists of three values, R(ed), B(lue) and G(reen). Those can be seen as the points on the grid, in 3D in case of RGB. The objective of image compression in this case is to reduce the number of colors by taking the average colors that look closest to the original image. An image with fever colors takes up less disk space, which is what we want.

Compressing the image can also be a preprocessing step to another algorithm. Reducing the size of the input data generally speeds up learning.

Implementing K-means

We start by implementing a function that creates initial points for the centroids. This function takes as input X, the training examples, and chooses distinct points at random.

def initialize_K_centroids(X, K):
    """ Choose K points from X at random """
    m = len(X)
    return X[np.random.choice(m, K, replace=False), :]

Then we write a function to find the closest centroid for each training example. This is the first step of the algorithm. We take X and the centroids as input and return the the index of the closest centroid for every example in c, an m-dimensional vector.

def find_closest_centroids(X, centroids):
    m = len(X)
    c = np.zeros(m)
    for i in range(m):
        # Find distances
        distances = np.linalg.norm(X[i] - centroids, axis=1)

        # Assign closest cluster to c[i]
        c[i] = np.argmin(distances)

    return c

For the second step of the algorithm, we compute the distance of each example to ‘its’ centroid and take the average of distance for every centroid . Because we’re looping over the rows, we have to transpose the examples.

def compute_means(X, idx, K):
    _, n = X.shape
    centroids = np.zeros((K, n))
    for k in range(K):
        examples = X[np.where(idx == k)]
        mean = [np.mean(column) for column in examples.T]
        centroids[k] = mean
    return centroids

Finally, we got all the ingredients to complete the K-means algorithm. We set max_iter, the maximum number of iterations, to 10. Note that if the centroids aren’t moving anymore, we return the results because we cannot optimize any further.

def find_k_means(X, K, max_iters=10):
    centroids = initialize_K_centroids(X, K)
    previous_centroids = centroids
    for _ in range(max_iters):
        idx = find_closest_centroids(X, centroids)
        centroids = compute_means(X, idx, K)
        if centroids == previous_centroids:
            # The centroids aren't moving anymore.
            return centroids, idx
        else:
            previous_centroids = centroids

    return centroids, idx

Getting the image

In case you’ve never worked with Pillow before, don’t worry. The api is very easy.

We start by trying to open the image, which is defined as the first (and last) command line argument like so:

try:
    image_path = sys.argv[1]
    assert os.path.isfile(image_path)
except (IndexError, AssertionError):
    print('Please specify an image')

Because we want to get the raw image data, eg the rgb values for each pixel, we create a numpy from the image just like this:

image = load_image(image_path)
w, h, d = image.shape
print('Image found with width: {}, height: {}, depth: {}'.format(w, h, d))

Then we get our feature matrix . We’re reshaping the image because each pixel has the same meaning (color), so they don’t have to be presented as a grid.

X = image.reshape((w * h, d))
K = 20 # the desired number of colors in the compressed image

Finally we can make use of our algorithm and get the colors. These colors are chosen by the algorithm.

colors, _ = find_k_means(X, K, max_iters=20)

Because the indexes returned by the find_k_means function are 1 iteration behind the colors, we compute the indexes for the current colors. Each pixel has a value in corresponding, of course, to its color.

idx = find_closest_centroids(X, colors)

Once we have all the data required we reconstruct the image by substituting the color index with the color and resphaping the image back to its original dimensions. Then using the Pillow function Image.fromarray we convert the raw numbers back to an image. We also convert the indexes to integers because numpy only accepts those as indexes for matrices.

idx = np.array(idx, dtype=np.uint8)
X_reconstructed = np.array(colors[idx, :] * 255, dtype=np.uint8).reshape((w, h, d))
compressed_image = Image.fromarray(X_reconstructed)

Finally, we save the image back to the disk like so:

compressed_image.save('out.png')

Testing an image

This is the fun part, testing our program. You can choose any (png) image you want. I use this image by Valentin Pinisoara on Unsplash that I scaled down for speed purposes.

Compressed image next to uncompressed image

The file size was reduced by 71%, from 228kb to 65kb with .

To get a better intuition for this algorith, you should try out different values for and max_iter yourself.

Conclusion

As you’ve seen in this tutorial, it’s quite easy to reduce the file size of an image using unsupervised machine learning.

Check out the finished program on GitHub.