How to use the Python imagehash library


ImageHash is a Python library that provides tools for generating perceptual hash values for images. These hashes can be used to compare images based on their visual content, making it useful for finding similar or duplicate images.

Installation

Before you can use ImageHash, you need to install it. ImageHash depends on Pillow for image processing, so the two are usually installed together using pip:

pip install Pillow imagehash

Basic Usage of ImageHash

Generating Hashes

To generate a hash for an image, import the imagehash module and Pillow’s Image module:

from PIL import Image
import imagehash

# Load an image
image = Image.open('path/to/your/image.jpg')

# Generate a hash
hash = imagehash.average_hash(image)
print(hash)

Replace 'path/to/your/image.jpg' with the path to your image file. This example uses the average_hash function, which is good for general-purpose use. Printing the hash shows it as a hexadecimal string.
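
To give a feel for what an average hash encodes, here is a minimal pure-Python sketch of the idea: mark each pixel of a tiny grayscale grid as brighter or darker than the mean, then pack the bits into hex. It is an illustration only, not the library's implementation; the helper names toy_average_hash and bits_to_hex and the 4x4 grid are made up for this example.

```python
# Illustrative sketch of the average-hash idea, NOT imagehash's actual code.
# toy_average_hash and bits_to_hex are hypothetical helper names.

def toy_average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the mean, else 0.

    `pixels` is a small grid (list of rows) of grayscale values, standing in
    for an image that was already shrunk and converted to grayscale.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def bits_to_hex(bits):
    """Pack a list of bits into a hexadecimal string."""
    as_int = int("".join(str(bit) for bit in bits), 2)
    width = (len(bits) + 3) // 4  # number of hex digits needed
    return f"{as_int:0{width}x}"


# A made-up 4x4 "image": dark on the left, bright on the right.
grid = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
bits = toy_average_hash(grid)
print(bits_to_hex(bits))  # prints 3333
```

Each row becomes the bit pattern 0011, which is why the hex string repeats the digit 3. Two grids with the same light and dark layout would produce the same string even if their exact pixel values differed.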

Comparing Images

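You can compare two images to see how similar they are by comparing their hashes. Subtracting one hash from another gives the number of bits in which they differ (the Hamming distance):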

from PIL import Image
import imagehash

image1 = Image.open('path/to/your/first/image.jpg')
image2 = Image.open('path/to/your/second/image.jpg')
hash1 = imagehash.average_hash(image1)
hash2 = imagehash.average_hash(image2)

# Calculate the difference between the two hashes
difference = hash1 - hash2
print(difference)

# A smaller difference means the images are more similar
if difference < 10:
    print("The images are similar.")
else:
    print("The images are different.")
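
The difference printed above is a count of differing bits. As a rough sketch of that calculation, the snippet below computes the same kind of distance on plain hexadecimal strings, without using the library. The function name hamming_distance and the three hash strings are invented for illustration.

```python
# Sketch of a bit-difference (Hamming distance) count on hex strings.
# hamming_distance is a hypothetical helper, not part of imagehash.

def hamming_distance(hex_a, hex_b):
    """Count the bit positions at which two equal-length hex hashes differ."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Made-up 64-bit hashes, written as 16 hex digits each.
hash_a = "ffd8c0c0c0e0f0ff"
hash_b = "ffd8c0c0c0e0f0fe"  # same as hash_a except for the last bit
hash_c = "00273f3f3f1f0f00"  # every bit of hash_a flipped

print(hamming_distance(hash_a, hash_b))  # prints 1
print(hamming_distance(hash_a, hash_c))  # prints 64
```

With 64-bit hashes the distance ranges from 0 (identical hashes) to 64 (every bit differs), which puts the threshold of 10 used above into perspective.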

Different Types of Hashes

ImageHash provides several methods to generate different types of hashes, each with its own use case:

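  • average_hash: Compares each pixel of a small grayscale copy of the image with the mean brightness. Fast and good for general-purpose image comparison.
  • phash: Perceptual hash, computed from the low-frequency terms of a discrete cosine transform (DCT); more sensitive than average_hash.
  • dhash: Difference hash, computed from brightness differences between adjacent pixels; good for detecting changes in image edges.
  • whash: Wavelet hash, computed from a discrete wavelet transform (Haar by default); useful for detecting scaled images.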

You can experiment with these different hashing functions to see which works best for your specific application.

Advanced Usage

Customizing Hash Size

You can customize the hash size (the default is 8) to increase or decrease the granularity of the hash:

# Generate a hash with a hash size of 16 (more detailed)
hash = imagehash.average_hash(image, hash_size=16)

A larger hash size can detect finer differences between images but may be more sensitive to minor variations.
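
A practical consequence is that a distance threshold chosen for one hash size does not carry over to another: a hash with hash_size=8 has 8 × 8 = 64 bits, while hash_size=16 gives 16 × 16 = 256 bits. The sketch below scales a threshold in proportion to the bit count. The helper name scaled_threshold is hypothetical, and proportional scaling is only an assumed starting point that should be tuned on real images.

```python
# Hypothetical helper: scale a distance threshold with the number of hash bits.
# Assumes a hash built with hash_size=n holds n * n bits.

def scaled_threshold(base_threshold, base_hash_size, new_hash_size):
    """Return base_threshold rescaled for a hash with a different size."""
    base_bits = base_hash_size ** 2
    new_bits = new_hash_size ** 2
    return round(base_threshold * new_bits / base_bits)


# The threshold of 10 used with the default hash size of 8 corresponds to
# about 15.6% of the 64 bits; the same share of 256 bits is 40.
print(scaled_threshold(10, 8, 16))  # prints 40
```
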