2015-10-05

Calculating Perceptual hashes (PHASH) and comparing image similarity


Revision as of 04:02, 5 October 2015


== Calculating Perceptual hashes (PHASH) and comparing image similarity ==




Some thoughts: when I get a list of similar pictures, I create two bat files. The first renames the files so that each set of duplicates is sequential in one directory, with names that include the original name and possibly other information. I include the file size so that the larger file sorts first and the smaller one next; that way, while stepping through the directory in an image viewer, I can simply delete the second file after verifying that it is similar. Afterwards I run the second (restore) bat file to move the files back to their correct directories. I am still experimenting with the algorithm. Currently I resize the image to 34x34 and drop the outside edge in an attempt to remove borders and lettering. I have found that once the bit difference exceeds 10 the files are usually different; if it is under 8, there is a good chance they are duplicates.
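As a rough illustration of the border-drop and bit-difference steps described above, here is a minimal Python sketch. It is not the exact algorithm used here: it assumes the image has already been decoded and resized to a 34x34 grid of grayscale values, it assumes "dropping the outside edge" means removing one pixel on each side, and it uses a simple average hash as a stand-in for the DCT-based PHASH. All function names are hypothetical.

```python
# Sketch only. Assumptions: input is a 34x34 grid of grayscale values
# (0-255), the dropped edge is 1 pixel wide, and an average hash stands
# in for the real DCT-based perceptual hash.

def crop_border(pixels, margin=1):
    """Drop `margin` rows and columns from every side (34x34 -> 32x32)."""
    return [row[margin:len(row) - margin]
            for row in pixels[margin:len(pixels) - margin]]


def average_hash(pixels, hash_size=8):
    """Build a hash_size*hash_size-bit integer from a square grayscale grid."""
    block = len(pixels) // hash_size  # e.g. 32 // 8 = 4 pixels per block side
    means = []
    for by in range(hash_size):
        for bx in range(hash_size):
            total = sum(pixels[y][x]
                        for y in range(by * block, (by + 1) * block)
                        for x in range(bx * block, (bx + 1) * block))
            means.append(total / (block * block))
    overall = sum(means) / len(means)
    bits = 0
    for m in means:
        # One bit per block: 1 if the block is brighter than the overall mean.
        bits = (bits << 1) | (1 if m > overall else 0)
    return bits


def bit_difference(hash_a, hash_b):
    """Hamming distance: number of bit positions where the hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def classify(distance, duplicate_below=8, different_above=10):
    """Apply the rule of thumb from the text: under 8 / over 10."""
    if distance < duplicate_below:
        return "likely duplicate"
    if distance > different_above:
        return "likely different"
    return "uncertain"


# Synthetic example: a gradient image and a copy with a white 1-pixel frame.
base = [[x * 4 + y * 3 for x in range(34)] for y in range(34)]
framed = [row[:] for row in base]
for i in range(34):
    framed[0][i] = framed[33][i] = framed[i][0] = framed[i][33] = 255

d = bit_difference(average_hash(crop_border(base)),
                   average_hash(crop_border(framed)))
print(d, classify(d))  # the frame is cropped away, so this prints: 0 likely duplicate
```

Distances of 8 to 10 fall between the two thresholds given above, so the sketch reports them as "uncertain" and leaves the decision to a visual check.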

Note that JPG is a lossy format, so be aware that EACH time you save the file you lose some quality. Also note that a larger image is not necessarily a better one: the other image may have been enlarged or adjusted, and the quality of the larger image may turn out to be worse than that of the smaller one. The best way to select the best picture is to compare them yourself.

