[thelist] Image duplicate filtering

Bill Moseley moseley at hank.org
Tue Feb 24 11:53:59 CST 2009


On Tue, Feb 24, 2009 at 04:35:12PM +0530, L. Mohan Arun (marun2 at gmail.com) wrote:
> Hi All,
> 
> I have 1000's of JPEG files dumped in a folder. Some JPEGs are the
> same, their filesize is the same, the content is the same, but the
> names differ.
> 
> Something like Marketa-1, Marketa-2, etc. they have different name but
> the content is the same, file size is the same. Is there a tool that
> will let me remove all JPEG duplicates in a folder retaining only the
> unique ones?

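How do you decide which copy to keep and which to delete?

Finding the duplicates is the easy part. The File::Find::Duplicates
module from CPAN will list each set of identical files; here it is run
against a photo directory here: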

$ find /media/store/photos | wc -l
8312


$ time perl -MFile::Find::Duplicates -lwe 'print join "\n--\n", map { join "\n", @{$_->files} } File::Find::Duplicates::find_duplicate_files( "/media/store/photos" );'

/media/store/photos/2007/08/31/p1030834 (modified in gimp image editor).jpg
/media/store/photos/2007/08/31/p1030834.jpg
--
/media/store/photos/2004/12/08/singing.mpg
/media/store/photos/2004/12/07/mov00011.mpg
--
/media/store/photos/2005/08/29/dsc01641.jpg
/media/store/photos/2005/08/29/dsc01641_1.jpg
/media/store/photos/2005/08/29/dsc01641_2.jpg
--
/media/store/photos/2005/02/27/test.jpg
/media/store/photos/2005/02/27/dsc00527.jpg

real    0m0.574s
user    0m0.476s
sys     0m0.100s
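
If you would rather not install a Perl module, the same idea can be
sketched in a few lines of Python: group files by size first, then hash
only the files that share a size. This is a rough sketch, not the
module's own code. The function names and the "keep the shortest path"
rule are my own assumptions for illustration, so swap in whatever rule
answers the keep/delete question for you.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root):
    """Return sorted lists of paths under root with identical contents."""
    # Pass 1: bucket regular files by size (cheap, no file reads).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files whose size is shared with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)


def split_keep_and_delete(group):
    """Hypothetical policy: keep the shortest path, ties broken by name."""
    ordered = sorted(group, key=lambda p: (len(p), p))
    return ordered[0], ordered[1:]
```

Call find_duplicate_groups() on the folder, pass each group to
split_keep_and_delete(), and print the result before removing anything.
Nothing in the sketch deletes a file; that step is left out on purpose
until the keep rule has been checked against real output.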


-- 
Bill Moseley
moseley at hank.org
Sent from my iMutt



