
I have a reference image A and two target images, B and C. I tried to measure the SSIM as follows:

(To human visual perception, A and B belong to the same class, while A and C belong to different classes.)

result1 = SSIM(A, B) = 4.71027%;
result2 = SSIM(A, C) = 7.95047%;

I used the code from OpenCV: SSIM CODE
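For reference, the measurement looked roughly like this. This is only a sketch using scikit-image's structural_similarity rather than the linked OpenCV code, and the file names are placeholders:

import cv2
from skimage.metrics import structural_similarity

# Load both images as grayscale; the paths are placeholders.
a = cv2.imread("A.png", cv2.IMREAD_GRAYSCALE)
b = cv2.imread("B.png", cv2.IMREAD_GRAYSCALE)

# SSIM compares aligned local windows, so both inputs
# must have identical dimensions.
b = cv2.resize(b, (a.shape[1], a.shape[0]))

score = structural_similarity(a, b)  # in [-1, 1]; 1 means identical
print(f"SSIM(A, B) = {score:.4f}")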

I also tried comparing LBP normalized histograms of the entire image by calculating the KL divergence between the two histograms, but the results were worse.
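That pipeline looked roughly like this; a minimal sketch assuming scikit-image's local_binary_pattern and SciPy's rel_entr, with P, R, and the file names as placeholders rather than my exact settings:

import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from scipy.special import rel_entr

def lbp_histogram(path, P=8, R=1):
    """Normalized histogram of plain LBP codes for one grayscale image."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    lbp = local_binary_pattern(img, P, R, method="default")
    hist, _ = np.histogram(lbp, bins=2**P, range=(0, 2**P))
    hist = hist.astype(float) + 1e-10  # avoid zeros in the KL divergence
    return hist / hist.sum()

kl = rel_entr(lbp_histogram("A.png"), lbp_histogram("B.png")).sum()
print(f"KL(A, B) = {kl:.4f}")  # 0 means identical histograms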

Is there a way to measure the similarity without training?

Image A: [image]

Image B: [image]

Image C: [image]

EDIT:

Following @Cris Luengo's suggestion, here are the results for two LBP variants, circular and variance-based. It seems the choice of method (feature descriptor) is critical (result = 0 means identical):

result1 = LBP_CIRCULAR_HIST_KL(A, B) = 0.66;
result2 = LBP_CIRCULAR_HIST_KL(A, C) = 0.64;

result1 = LBP_VAR_HIST_KL(A, B) = 0.49;
result2 = LBP_VAR_HIST_KL(A, C) = 3.74;
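Both variants can be approximated with scikit-image. The mapping to method="uniform" (rotation-invariant uniform circular patterns) and method="var" (local variance) is an assumption, not necessarily my exact setup, and the bin count for the continuous variance values is a placeholder to tune:

import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from scipy.special import rel_entr

def lbp_hist(path, method, P=8, R=1):
    """Normalized LBP histogram for one grayscale image."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    lbp = local_binary_pattern(img, P, R, method=method)
    lbp = lbp[np.isfinite(lbp)]  # drop any non-finite values, just in case
    # "uniform" yields P + 2 discrete codes; "var" is continuous,
    # so its bin count is an arbitrary choice.
    bins = P + 2 if method == "uniform" else 16
    hist, _ = np.histogram(lbp, bins=bins)
    hist = hist.astype(float) + 1e-10  # avoid zeros in the KL divergence
    return hist / hist.sum()

for method in ("uniform", "var"):
    kl_ab = rel_entr(lbp_hist("A.png", method), lbp_hist("B.png", method)).sum()
    kl_ac = rel_entr(lbp_hist("A.png", method), lbp_hist("C.png", method)).sum()
    print(f"{method}: KL(A,B) = {kl_ab:.2f}, KL(A,C) = {kl_ac:.2f}")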
  • SSIM is meant for comparing nearly identical images that differ only in noise or blur level. If you want to compare texture, you need a texture measure. LBP is a good choice, but you might need to adjust its parameters. Note that A and B are rotated relative to each other, so you need a rotationally invariant version of LBP. Commented Feb 14, 2024 at 11:29
  • Thank you. I used the extended LBP (circular), but I am not sure if that's enough for this case. Commented Feb 15, 2024 at 5:23

1 Answer


As the comments suggest, SSIM will not work if the two images are not pixel-aligned. You can measure similarity between two unaligned images in a variety of ways. Nowadays one of the most popular is CLIP, which is what generative AI models such as Stable Diffusion are built on.

I suggest you look at this repo, which explains how to install CLIP for Python and how to extract features and similarities. The example there is for image-text similarity, but you can compute image-image similarity with something like:

import torch
import clip
from PIL import Image
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Preprocess each image into the tensor format the model expects.
image1 = preprocess(Image.open("Image1.png")).unsqueeze(0).to(device)
image2 = preprocess(Image.open("Image2.png")).unsqueeze(0).to(device)

with torch.no_grad():
    # Encode both images into CLIP's shared embedding space.
    image1_features = model.encode_image(image1)
    image2_features = model.encode_image(image2)

    # Cosine similarity of the embeddings: closer to 1 means more similar.
    sim = F.cosine_similarity(image1_features, image2_features)

print("Cosine similarity:", sim.item())

Note that this might be quite slow, depending on how many samples you have and what kind of task you want to run (brute-force retrieval might not be feasible).
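That said, for a moderate collection you can precompute every embedding in one batch and get all pairwise similarities from a single matrix product. A minimal sketch of that idea (the file names are placeholders):

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

paths = ["A.png", "B.png", "C.png"]  # placeholder file names

with torch.no_grad():
    # Encode all images in one forward pass instead of one at a time.
    batch = torch.cat([preprocess(Image.open(p)).unsqueeze(0) for p in paths]).to(device)
    feats = model.encode_image(batch)
    feats = feats / feats.norm(dim=-1, keepdim=True)  # unit-normalize the embeddings

    # With unit vectors, the matrix product gives every pairwise cosine similarity.
    sims = feats @ feats.T

print(sims.cpu().numpy())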


1 Comment

Thank you. I'll give it a try. I am also curious whether this method could work with simpler models as well.
