OpenCV & Python - Reading and Displaying Images
OpenCV is an open source Computer Vision and Image Processing Library made up of over 2500 algorithms. It's a foundational pillar in research and understanding for my line of work as machines often have to make a lot of their decisions based off image data alone. While we tend to use more commercialized software (scaling reasons) in our final solution, OpenCV can be a great tool to experiment, test, practice and gain a better understanding of the fundamentals of Computer Vision. In this post we are going to start with the most basic of tasks, reading and displaying images with OpenCV & Python.
FYI: the code for this tutorial is available as a Jupyter Notebook on my GitHub page.
STEP 1: Imports
Okay, let's get started. For this to work correctly we need to import the following libraries:
import cv2
import numpy as np
import matplotlib.pyplot as plt
The first library we need is the OpenCV library which is imported as cv2.
The second library import we "need" (don't really need to import it for this tutorial, but I want you to get used to seeing this) is the numpy library which is basically a library that allows for complex mathematics on arrays and matrices. Why is this important? Well, believe it or not images are just a (sometimes multi-dimensional) array of pixel values.
The final library we are going to use is Matplotlib, and more specifically the pyplot module contained in that library. Matplotlib is the de facto plotting library for Python, which makes it a great way to display images.
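To make the "images are just arrays" point concrete, here is a minimal sketch that builds two tiny images by hand with numpy. The pixel values and the variable names gray and color are made up for illustration; no image file is involved.

```python
import numpy as np

# A tiny grayscale "image": 2 rows (height) by 3 columns (width).
# uint8 is the usual 8-bit pixel type, from 0 (black) to 255 (white).
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

# A color image adds a third axis that holds the channels.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)  # set the three channel values of the top-left pixel

print(gray.shape)   # (2, 3)
print(color.shape)  # (2, 3, 3)
print(gray[0, 2])   # 255 -> the pixel at row 0, column 2
```

Note that indexing is [row, column], so height comes before width in the shape.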
STEP 2: Loading Images
Loading images with OpenCV is simple: it takes only a single line of code. There are a few things to be aware of, though. By default, OpenCV loads every image (grayscale or color) with 3 channels, so to read a grayscale image as a single channel you need to pass 0 as the second argument, after the image path. The path string can be relative or absolute. In the example below I'm using a relative path to dolphin.png.
# load an image
img = cv2.imread('images/dolphin.png')
# load an image as a single channel grayscale
img_single_channel = cv2.imread('images/dolphin.png', 0)
# print some details about the images
print('The shape of img without second arg is: {}'.format(img.shape))
print('The shape of img_single_channel is: {}'.format(img_single_channel.shape))
The output of this code block is shown at the end of this step. Let's dissect this a little bit... The .shape attribute returns the "shape" of each image array. We are also using Python's built-in string .format() method, which inserts variables into a string wherever the {} placeholders are located. The first shape is (320, 500, 3). This translates to 320px in height, 500px in width and 3 channels in depth (BGR). Even though the image is actually grayscale, OpenCV still loads it as 3 individual channels. To get around this we need to explicitly pass 0 as the second argument. But why 0? Well, it turns out the second argument is a flag, and these are the three most commonly used options:
cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be neglected. It is the default flag.
cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode
cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel
However, instead of passing these flags by name we can use their integer values as a shortcut: 1, 0, and -1 respectively.
"The shape of img without second arg is: (320, 500, 3)"
"The shape of img_single_channel is: (320, 500)"
STEP 3: DISPLAYING IMAGES W/OPENCV
First we are going to display images using the built-in OpenCV function .imshow().
cv2.imshow() takes two required arguments:
1st Argument --> The name of the window where the image will be displayed
2nd Argument --> The image to show
IMPORTANT NOTE: You can show as many images as you want at once; they just need different window names!
In addition to cv2.imshow(), a few other lines of code are required to make the display work correctly.
The first piece is the cv2.waitKey() function
It has a single argument --> the time in milliseconds
The function waits the specified number of milliseconds for a keyboard event. If you press any key within that time, the program continues. If 0 is passed, it waits indefinitely for a key stroke. Its return value can also be used to detect specific key strokes, such as whether the a key was pressed.
IMPORTANT NOTE: Besides binding keyboard events this waitKey() also processes many other GUI events, so you MUST use it to actually display the image.
The second required piece of code is the cv2.destroyAllWindows() function
This function takes no arguments and simply destroys all the windows we created (in this case one).
IMPORTANT NOTE: If you want to destroy any specific window, use the function cv2.destroyWindow() instead where you pass the exact window name as the argument.
# display the image with OpenCV imshow()
cv2.imshow('OpenCV imshow()', img)
# The OpenCV waitKey() function is a required keyboard binding
# function after imshow()
cv2.waitKey(0)
# destroy all windows command
cv2.destroyAllWindows()
From the code block above we get a new window that displays the image. You can see the title of the window is what I passed as the first argument above. You can press any key on the keyboard to jump out of the .waitKey() binding, and the program will move on and "destroy" all the windows we've created (just this one in this particular case).
STEP 4: DISPLAYING IMAGES W/MATPLOTLIB
OpenCV is not the only way to display images with Python. In fact, because images are just functions we can plot them as we do with other functions. To do this we use the Matplotlib library.
BTW: if you are working in a Jupyter Notebook environment, plotting images with Matplotlib is (IMO) the best way to display them.
The function we need is plt.imshow(). Why is it plt.imshow()? Well remember in our imports during step one above, we imported matplotlib.pyplot as plt. This line of code allows us to alias matplotlib.pyplot as plt so we can save some time on typing.
Okay, so the plt.imshow() function (not to be confused with the cv2.imshow() function) can take quite a few arguments. To learn more about them, see the Matplotlib documentation.
For the purposes of this tutorial, I'm only using one additional argument:
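cmap --> This is the color mapping, which tells Matplotlib how to turn single-channel values into colors. Note that Matplotlib only applies cmap to 2D (single-channel) arrays and ignores it for 3-channel arrays, such as the one cv2.imread() returns when the 0 flag is not passed. The dolphin image still looks gray in the plot because its three channels hold identical values.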
# first read in the image using OpenCV
img = cv2.imread('images/dolphin.png')
# Adding a title to the Plot
plt.title('Monochromatic Images in Matplotlib')
# Using the plt.imshow() to add the image plot to
# the matplotlib figure
plt.imshow(img, cmap='gray')
# This just hides x and y tick values by passing in
# empty lists to make the output a little cleaner
plt.xticks([])
plt.yticks([])
plt.show()
# print information about the image size and type
height, width, depth = img.shape
print('Image Width: {}px, Image Height: {}px, Image Depth: {}ch'.format(width, height, depth))
# openCV stores images as np.ndarray
print('Image Type: {}'.format(type(img)))
The output from the code block above will be as follows (if you are in a Jupyter Notebook). If you are running this from a script the image will appear in an interactive Matplotlib window and the text will print to the terminal.
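The cmap argument really matters for single-channel images, since Matplotlib's documentation notes that it is ignored for RGB(A) input. Here is a minimal sketch of that case. The plot_gray() helper is hypothetical, and the gradient array is a synthetic stand-in for an image read with cv2.imread(path, 0).

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_gray(image_2d, title):
    """Hypothetical helper: draw a single-channel image with a gray colormap."""
    fig, ax = plt.subplots()
    # vmin/vmax pin 0 to black and 255 to white; without them Matplotlib
    # stretches the colormap between the image's own min and max values.
    ax.imshow(image_2d, cmap='gray', vmin=0, vmax=255)
    ax.set_title(title)
    ax.set_xticks([])
    ax.set_yticks([])
    return fig, ax


# Synthetic single-channel image: a horizontal gradient, 50 rows by 256 columns.
gradient = np.tile(np.arange(256, dtype=np.uint8), (50, 1))
fig, ax = plot_gray(gradient, 'Single-channel image with cmap="gray"')
# When running as a script, call plt.show() here to open the window.
```

Leaving out cmap for a 2D array would make Matplotlib fall back to its default colormap, which is not gray.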
Okay, so now that we can display grayscale images, let's see how Matplotlib handles color images.
# Example of how matplotlib displays color images from OpenCV incorrectly
img_color = cv2.imread('images/fruit.png')
plt.title('How OpenCV images (BGR) display in Matplotlib (RGB)')
plt.imshow(img_color)
plt.show()
Code Output:
Hmm, something doesn't look quite right, I've personally never seen fruit like that before... Well, that's because color images loaded by OpenCV are in BGR (blue, green, red) order by default, while Matplotlib displays images in RGB order. As a result, a color image that is read with OpenCV and plotted directly with Matplotlib has its red and blue channels swapped. Let's see if we can figure out a way to better handle this situation.
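A single pixel is enough to see the mix-up. This small sketch uses a hand-made array (not a real image) holding one pure-red pixel in OpenCV's channel order:

```python
import numpy as np

# One pure-red pixel as OpenCV stores it: the channel order is (B, G, R).
red_bgr = np.array([[[0, 0, 255]]], dtype=np.uint8)

# Matplotlib reads the same three numbers as (R, G, B),
# so it sees red=0, green=0, blue=255, which is pure blue.
r, g, b = (int(v) for v in red_bgr[0, 0])
print('Interpreted as RGB: red={}, green={}, blue={}'.format(r, g, b))
```

The numbers never change; only the meaning assigned to each position does.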
OPTION 1: USE cv2.cvtColor()
The first option is to use OpenCV's built-in color space conversion flags. In our case we want to convert from the BGR (Blue, Green, Red) color space to RGB (Red, Green, Blue). To do this we call the cv2.cvtColor() function, passing in the image and the cv2.COLOR_BGR2RGB flag.
# Convert the color using cv2.COLOR_BGR2RGB
img_color = cv2.imread('images/fruit.png')
img_rgb = cv2.cvtColor(img_color, cv2.COLOR_BGR2RGB)
plt.title('Correct Display after converting with cv2.COLOR_BGR2RGB')
# Tip: passing in empty lists for xticks & yticks will
# turn them off
plt.imshow(img_rgb)
plt.xticks([])
plt.yticks([])
plt.show()
Code Output:
OPTION 2: SLICING
Remember when I said images are functions? Well, if you kept that in mind you may have already thought of this method on your own. What I'm referring to here is using the fact that images are stored as numpy arrays to manually reorder the channels of the image matrix with slicing operations.
Let's explain the meat and potatoes of this code here:
img_rgb_numpy = img_color[:,:,::-1]
Remember that images are arrays of height, width, and depth. For our particular example the depth is in BGR order, so what exactly is happening here? Basically we leave the height and width the same by slicing all indices with the colon (:). As for the depth, we are just reversing its order: BGR reversed is RGB!
# Reverse the color portion of the image array
img_rgb_numpy = img_color[:,:,::-1]
plt.title('Correct Display after matrix slicing the Numpy Array')
plt.imshow(img_rgb_numpy)
plt.xticks([])
plt.yticks([])
plt.show()
Code Output:
STEP 5: PULLING IT ALL TOGETHER AND SAVING IMAGES
Saving images with OpenCV is done with the cv2.imwrite() function. This function takes a relative or absolute path where you want to save the image and the image you want to save.
# write an image with imwrite
where_to_save = 'images/dolphin_2.png'
cv2.imwrite(where_to_save, img)
print('Image saved as {}'.format(where_to_save))
SOME BONUS MATERIAL!!
If you want to try everything we learned together, here's some code you can try executing from a Python script. I've commented each line to explain what's happening.
# Read an image
img = cv2.imread('images/dolphin.png')
# Show the image
cv2.imshow('Option to Save image', img)
# Prompt the user to press 's' to save the image
print("press 's' to save the image as dolphin_3.png\n")
# Bind the waitKey function
# NOTE: if you are using a 64-bit machine, this needs to be: key = cv2.waitKey(0) & 0xFF
key = cv2.waitKey(0)
# wait for the ESC key to exit
if key == 27:
    cv2.destroyAllWindows()
# wait for 's' key to save and exit
elif key == ord('s'):
    cv2.imwrite('images/dolphin_3.png', img)
    cv2.destroyAllWindows()