Agnius Vasiliauskas coding sandbox: Detecting similar entropy zones in image

While thinking about information entropy I got an idea: write an application which looks for zones of similar entropy in an image. After some time I came up with this algorithm (pseudocode):

1. Split the image into 5x5 pixel blocks.
2. Calculate the information entropy of each block (actually the sum of the entropies of the 3 color channels).
3. Find blocks with similar entropy.
4. [...] filter out small groups of blocks (seems like noise, huh?). Blah, blah...
5. Draw these similar-entropy blocks on top of the original image as a red layer.
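
To make step 2 concrete, here is a minimal sketch of the entropy of a single color channel: H = -sum(p * log2(p)), where p is the share of pixels in the block that have a given value. The name channel_entropy is made up for this illustration; the script itself does the same thing per channel with list.count.

```python
from collections import Counter
from math import log2

def channel_entropy(values):
    """Shannon entropy (in bits) of a sequence of channel values."""
    n = len(values)
    counts = Counter(values)
    # p = c / n is the share of pixels having one particular value
    return sum(-(c / n) * log2(c / n) for c in counts.values())

# A flat block carries no information; a block split evenly
# between two values carries exactly one bit per pixel.
flat = [128] * 25
half = [0, 255] * 8
print(channel_entropy(flat))   # 0.0
print(channel_entropy(half))   # 1.0
```

For a 5x5 block the entropy of one channel can never exceed log2(25), roughly 4.64 bits, so the sum over 3 channels stays below 14 bits.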

Now the real part (as you know already): the Python code which does the job [you need the PIL module to run it]:


from PIL import Image
from math import log

def entropysum(pixels):
 """
 Calculating information entropy for image
 region and returning entropy sum for
 all 3 color channels
 """
 cr = [r for (r,g,b) in pixels]
 cg = [g for (r,g,b) in pixels]
 cb = [b for (r,g,b) in pixels]
 
 er = 0.0
 eg = 0.0
 eb = 0.0
 
 for r in set(cr):
  p = float(cr.count(r))/len(cr)
  if p > 0.0: er += -p * log(p,2)

 for g in set(cg):
  p = float(cg.count(g))/len(cg)
  if p > 0.0: eg += -p * log(p,2)

 for b in set(cb):
  p = float(cb.count(b))/len(cb)
  if p > 0.0: eb += -p * log(p,2)

 return er + eg + eb

def decompose(image, block_len):
 """
 Decomposing given image into blocks of size
 block_len*block_len and returning a list of
 (entropy, x, y, block) tuples sorted by entropy
 """
 parts = []
 w, h = image.size
 
 for x in range(0, w, block_len):
  for y in range(0, h, block_len):
   locim = image.crop((x,y,x+block_len,y+block_len))
   acc = entropysum(list(locim.getdata()))
   parts.append((acc,x,y,locim))
 
 parts.sort()
 
 return parts

def similarparts(imagparts):
 """
 Detecting similar image blocks by comparing
 the entropy of neighbours in the entropy-sorted
 list. Two blocks are considered similar if their
 relative entropy difference is small but not zero.
 """
 dupl = []
 
 for i in range(len(imagparts)-1):
  acc1, x1, y1, im1 = imagparts[i]
  acc2, x2, y2, im2 = imagparts[i+1]
  
  if acc1 == acc2 == 0:
   gain = 0.0
  else:
   gain = 100.0 * (1.0 - acc1 / acc2)

  if 0.01 < gain < 0.1 :
   if imagparts[i] not in dupl:
    dupl.append(imagparts[i])
   if imagparts[i+1] not in dupl:
    dupl.append(imagparts[i+1])

 return dupl

def clusterparts(parts):
 """
 Grouping nearest images into groups.
 This is done, because we need to
 filter out very small groups. We
 want to know only big differences.
 """
 
 filtparts = []
 clust = {}
 belongs = {}
 # nothing to cluster if no similar blocks were found
 if not parts:
  return filtparts
 w,h = parts[0][3].size
 
 # assign all parts to clusters
 for i in range(len(parts)):
  acc, x, y, im = parts[i]
  # the 4 sides of the block, keyed by their end points;
  # tuples are used as keys because joined digit strings
  # could in principle be confused with one another
  sides = []
  sides.append((x, y, x+w, y))
  sides.append((x+w, y, x+w, y+h))
  sides.append((x, y+h, x+w, y+h))
  sides.append((x, y, x, y+h))
  
  # detect side already in cluster
  fc = None
  for s in sides:
   if s in belongs:
    fc = belongs[s]
    break
  
  # if this is new cluster
  if fc is None:
   fc = len(clust) + 1
   clust[fc] = 1
  else:
   clust[fc] += 1
  
  # set cluster for rectangle sides
  for s in sides:
   if s not in belongs:
    belongs[s] = fc

 # filter out small clusters
 for i in range(len(parts)):
  acc, x, y, im = parts[i]
  side = (x, y, x+w, y)
  cl = belongs[side]
  if clust[cl] > 2:
   filtparts.append(parts[i])
 
 return filtparts

def marksimilar(image, dparts):
 """
 Mark found similar image blocks on
 original image, by applying red layer
 on similar parts of image.
 """
 if dparts:
  colormask = Image.new('RGB', dparts[0][3].size,(255,0,0))
  for (acc,x,y,im) in dparts:
   im = Image.blend(im, colormask, 0.4)
   image.paste(im,(x,y))
 
 return image

if __name__ == '__main__':
 # entropysum expects (r,g,b) pixels, so force RGB mode
 im = Image.open("1.jpg").convert("RGB")
 ls = decompose(im, 5)
 dparts = similarparts(ls)
 cparts = clusterparts(dparts)
 im = marksimilar(im, cparts)
 im.show()
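
The similarity test in similarparts is easier to follow on plain numbers. The sketch below pulls the same arithmetic out into two hypothetical helpers, relative_gain and is_similar, and applies the script's 0.01 < gain < 0.1 window to a few made-up entropy sums. Note that two blocks with exactly equal entropy give a gain of 0 and so fall outside the window.

```python
def relative_gain(acc1, acc2):
    """Percentage by which acc1 falls short of acc2 (acc1 <= acc2 assumed)."""
    if acc1 == acc2 == 0:
        return 0.0
    return 100.0 * (1.0 - acc1 / acc2)

def is_similar(acc1, acc2):
    # Same window as used in similarparts.
    return 0.01 < relative_gain(acc1, acc2) < 0.1

# Made-up entropy sums, in ascending order as decompose() returns them.
print(is_similar(9.995, 10.0))   # gain is about 0.05 %, inside the window
print(is_similar(9.0, 10.0))     # gain is about 10 %, too different
print(is_similar(10.0, 10.0))    # gain is 0 %, identical entropy is skipped
```

Because the list is sorted by entropy, only neighbouring entries are compared, which is why acc1 <= acc2 can be assumed here.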

So these are the results after running this script on several images:

Conclusion

So this algorithm is an interesting tool for exploring information entropy in an image. In some cases it might also serve as a tool for finding zones of very similar texture. BTW, information entropy may also be used for hashing an image. Hashing an image is useful because it lets us search for similar images (in a database, for example) by their hash.
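
As one possible reading of the hashing idea, the sketch below builds a short bit list from per-cell entropy: a grayscale image is divided into a grid of cells, and each cell contributes a 1 if its entropy is above the median of all cells. Everything here is an assumption made for illustration (the function names, the flat row-major list of gray values as input, the grid size, the median threshold and the Hamming-distance comparison); it is not part of the script above and not an established hashing scheme.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a sequence of pixel values."""
    n = len(values)
    return sum(-(c / n) * log2(c / n) for c in Counter(values).values())

def entropy_hash(gray, width, height, grid=4):
    """gray: flat row-major list of gray values. Returns grid*grid bits."""
    cells = []
    for j in range(grid):
        for i in range(grid):
            x0, x1 = i * width // grid, (i + 1) * width // grid
            y0, y1 = j * height // grid, (j + 1) * height // grid
            block = [gray[y * width + x]
                     for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(entropy(block))
    # a cell gets bit 1 only if it is "busier" than the median cell
    median = sorted(cells)[len(cells) // 2]
    return [1 if e > median else 0 for e in cells]

def hamming(h1, h2):
    """Number of differing bits; a small distance suggests similar images."""
    return sum(a != b for a, b in zip(h1, h2))

# 8x8 test image: a textured top-left quarter, flat everywhere else.
textured = [(y * 4 + x) if (x < 4 and y < 4) else 0
            for y in range(8) for x in range(8)]
flat = [0] * 64
print(entropy_hash(textured, 8, 8, grid=2))
print(hamming(entropy_hash(textured, 8, 8, grid=2),
              entropy_hash(flat, 8, 8, grid=2)))
```

With PIL, the flat list could come from list(im.convert("L").getdata()) and the dimensions from im.size. The image is assumed to be at least grid pixels wide and high, otherwise some cells are empty.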

Have fun !


Sunday, September 14, 2008
