How to write a Perl script to filter digital images that have been processed?

Question


Last night before going to bed, I looked again at the Learning Perl Scalar Data section and came across the following sentence:

The ability to have any character in a string means that you can create, scan, and process raw binary data as strings.

The idea immediately struck me that I could have Perl scan the photos saved on my hard drive to see whether they contain the string "Adobe". By doing this, it seems I can tell which of them were photoshopped. So I tried to implement the idea and came up with the following code:

    #!perl
    use autodie;
    use strict;
    use warnings;

    {
        local $/ = "\n\n";
        my $dir = 'f:/TestPix/';
        my @pix = glob "$dir/*";
        foreach my $file (@pix) {
            open my $pic, '<', "$file";
            while (<$pic>) {
                if (/Adobe/) {
                    print "$file\n";
                }
            }
        }
    }

Surprisingly, the code really works, and it picks out the photos that were processed by Photoshop. The problem is that many pictures are edited with other utilities, and that is where I am stuck. Is there a simple but universal method to determine whether a digital image has been edited or not, something like

 if ($_ !~ /the original format/) {...}

Or do we just need to add more conditions, such as

 if (/Adobe/|/ACDSee/|/some other picture editors/)
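For what it is worth, the multi-editor check is usually written as one regex alternation rather than several matches joined with `|`. A minimal sketch, assuming each editor leaves a recognizable string in the file (the `@editors` list and the sub names are illustrative, not a complete or authoritative set):

```perl
#!perl
use strict;
use warnings;

# Hypothetical signature list; extend it with whatever strings your editors embed.
my @editors = ('Adobe', 'ACDSee');

# Build one alternation; quotemeta protects any regex metacharacters in the names.
my $pattern   = join '|', map { quotemeta } @editors;
my $editor_re = qr/($pattern)/;

# Return the first editor signature found in the raw bytes, or undef.
sub find_editor {
    my ($bytes) = @_;
    return $bytes =~ $editor_re ? $1 : undef;
}

# Slurp a file in binary mode and check it.
sub editor_in_file {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    local $/;                           # slurp mode: read the whole file at once
    my $bytes = <$fh> // '';
    close $fh;
    return find_editor($bytes);
}

# Usage: perl find_editor.pl f:/TestPix/*.jpg
for my $file (@ARGV) {
    my $editor = editor_in_file($file);
    print "$file\t$editor\n" if defined $editor;
}
```

Slurping the whole file avoids printing the same file name more than once and avoids missing a match that straddles a record boundary. It only finds images whose editor left a string behind, so it cannot prove an image is unedited.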

Any ideas on this? Or am I oversimplifying because of my rather limited programming knowledge?

Thanks, as always, for any guidance.

+4

regex perl photoshop

Mike Oct 26 '09 at 4:23


10 answers

I will not say that it is absolutely impossible to detect changes in an image, but the problem is extremely hard.

The only person I know of who claims to have an answer is Dr. Neal Krawetz, who claims that digitally modified parts of an image will have different compression error levels than the original parts. He argues that re-saving a JPEG at different quality levels will highlight these differences.

I did not find this to be the case in my own research, but perhaps you will have better results.

+4

peejaybee Oct 26 '09 at 5:37


No. There is no functional difference between a perfectly edited image and one that was that way from the very beginning. In the end it is all just a bag of pixels, plus metadata that you can delete or fake however you want.

+3

bdonlan Oct 26 '09 at 4:25


The name of the graphics program used to edit the image is not part of the image data itself but of what is called metadata. Metadata can be stored in the image file but, as others have noted, it is neither required (some programs may not store it, and some may let you choose not to store it) nor reliable: if you faked the image, you could have faked the metadata too.

Thus, the answer to your question is: "No, there is no universal way to determine whether a picture has been edited or not, although some image editing software may write its signature into the image file, and it will stay there through the carelessness of the person doing the editing."

+3

DVK Oct 26 '09 at 4:41


If you are inclined to learn more about image processing in Perl, you can take a look at some of the excellent modules that CPAN has to offer:

Image::Magick - read, manipulate and write a large number of image file formats
GD - create color drawings using a large number of graphics primitives, and emit the drawings in various formats
GD::Graph - create charts
GD::Graph3d - create 3D graphs with GD and GD::Graph

However, other utilities exist for identifying various image formats. This is more of a question for Superuser, but on the various Unix distributions you can use file to identify many different types of files, and on Mac OS X, “Graphics Converter” has not failed me yet. (It was even able to open the bizarre multi-factor X-ray of my cat's broken pelvis that I got on a disc from the vet.)
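The file utility identifies formats mostly by looking at the leading "magic" bytes of a file, and the same idea is easy to sketch in plain Perl. The following is only an illustration covering three common formats; the sub names are made up for this example, and a real tool checks far more signatures:

```perl
#!perl
use strict;
use warnings;

# Leading "magic" bytes for a few common image formats.
my @signatures = (
    [ JPEG => "\xFF\xD8\xFF" ],
    [ PNG  => "\x89PNG\r\n\x1A\n" ],
    [ GIF  => "GIF87a" ],
    [ GIF  => "GIF89a" ],
);

# Return the format name whose signature starts the given bytes, or 'unknown'.
sub sniff_format {
    my ($head) = @_;
    for my $sig (@signatures) {
        my ($name, $magic) = @$sig;
        return $name if index($head, $magic) == 0;
    }
    return 'unknown';
}

# Read the first 16 bytes of a file in binary mode and classify them.
sub sniff_file {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    read $fh, my $head, 16;
    close $fh;
    return sniff_format($head // '');
}

# Usage: perl sniff.pl picture1 picture2 ...
print "$_: ", sniff_file($_), "\n" for @ARGV;
```

This tells you what kind of file you have, regardless of its extension. It says nothing about whether the image was edited.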

+3

Ether Oct 26 '09 at 6:21


How do you know what the original format is? I am pretty sure there is no guaranteed way to find out whether an image has been modified.

I can just open the file (with my favorite programming language and file system API) and write whatever I want into it. As long as I don't break anything in the file, you will never know it happened.

Heck, I could print the image and then scan it back in; how would you tell that from the original?

+1

Carl Norum Oct 26 '09 at 4:25


As others have said, there is no way to know whether an image has been doctored. I assume that what you basically want to know is the difference between a realistic photo and one that has been enhanced or altered.

It is always possible to run an extremely complex pattern recognition algorithm that analyzes every pixel in your image and does some very sophisticated things to decide whether the image has been doctored or not. Such a solution would likely involve an AI that studies millions of photographs, both doctored ones and ones that are not, and learns from them. However, this is more of a theoretical solution and not very practical ... you would probably only see it in films. It would be extremely difficult to develop and might take years. And even if you really got something like this to work, it probably still would not be 100% correct all the time. I assume that AI technology is not yet at this level and that it may take some time until it is.

+1

Senseful Oct 26 '09 at 5:24


A little-known exiftool feature lets you recognize the originating software by analyzing the JPEG quantization tables (without relying on image metadata). It recognizes tables written by many applications. Note that some cameras may use the same quantization tables as some applications, so this is not a 100% solution, but it is worth a look. Here is an example of running exiftool on two images, the first of which was edited with Photoshop.

 > exiftool -jpegdigest a.jpg b.jpg
 ======== a.jpg
 JPEG Digest : Adobe Photoshop, Quality 10
 ======== b.jpg
 JPEG Digest : Canon EOS 30D/40D/50D/300D, Normal
     2 image files read

This will work even if the metadata has been deleted.
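To tie this back to the original question, here is a rough sketch of driving that command from Perl and keeping only the files whose digest mentions Photoshop. It assumes the exiftool command is on your PATH and that its output looks like the example above; `parse_digests` is a helper name made up for this sketch. The optional second argument covers the case where only one file is given and no "========" header line appears.

```perl
#!perl
use strict;
use warnings;

# Parse the text printed by `exiftool -jpegdigest FILES...` into { file => digest }.
# $current is an optional default file name for output without "========" headers.
sub parse_digests {
    my ($output, $current) = @_;
    my %digest;
    for my $line (split /\n/, $output) {
        if ($line =~ /^=+\s+(.+?)\s*$/) {                  # "======== a.jpg"
            $current = $1;
        }
        elsif (defined $current
               && $line =~ /^\s*JPEG Digest\s*:\s*(.+?)\s*$/) {
            $digest{$current} = $1;
        }
    }
    return \%digest;
}

# Usage: perl digest.pl a.jpg b.jpg
if (@ARGV) {
    # List-form pipe open; on Windows this needs a reasonably recent Perl.
    open my $pipe, '-|', 'exiftool', '-jpegdigest', @ARGV
        or die "Cannot run exiftool: $!";
    local $/;                                              # slurp all output
    my $digests = parse_digests(<$pipe> // '', $ARGV[0]);
    close $pipe;
    for my $file (sort keys %$digests) {
        print "$file\n" if $digests->{$file} =~ /Photoshop/;
    }
}
```

As noted above, a camera may share quantization tables with an application, so treat a match as a hint rather than proof.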

+1

Phil Harvey Oct 28 '09 at 14:05


There is existing software that uses various methods (compression artifacts, comparison against signature profiles in a camera database, etc.) to analyze the actual image data and confirm alterations. If you have access to such software, and it provides an API for external access to these analysis functions, then there is a decent chance that a Perl module exists to talk to that API. If no such module exists, one could probably be created fairly quickly.

In theory, it would also be possible to implement the image analysis code directly in native Perl, but I am not aware of anyone having done so, and I would expect you to want to write something that low-level and processor-intensive in a fully compiled language (for example, C/C++) rather than Perl.

0

Dave Sherohman Oct 26 '09 at 12:56


http://www.impulseadventure.com/photo/jpeg-snoop.html is a tool that does the job fairly well.

If any cloning was done, a difference in pixel density or concentration sometimes shows up ... on manual inspection, an area cloned in Photoshop will have an even pixel density (my point being the way pixels vary in a scanned image).

0

Stupendousman Jan 05 '10 at 10:43


user181548 · Accepted Answer · 2009-10-26T04:28:08+0000

Your best bet in Perl is probably ExifTool. It gives you access to whatever non-image information is embedded in the image. However, as other people have said, it is of course possible to strip this information out.
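As a minimal sketch of that approach, the following reads the Software tag with the Image::ExifTool module from CPAN (the library behind ExifTool) and flags files whose tag names an editor. The editor list and the sub names are assumptions made for illustration only:

```perl
#!perl
use strict;
use warnings;

# Hypothetical list of editor names to look for in the Software tag.
my $editor_re = qr/Photoshop|ACDSee|GIMP/i;

# True (1) if the Software string names one of the editors above, else 0.
sub looks_edited {
    my ($software) = @_;
    return defined $software && $software =~ $editor_re ? 1 : 0;
}

# Fetch the Software tag from a file; returns undef when the tag is absent.
sub software_tag {
    my ($file) = @_;
    require Image::ExifTool;            # loaded on demand; install from CPAN
    my $info = Image::ExifTool::ImageInfo($file, 'Software');
    return $info->{Software};
}

# Usage: perl software.pl f:/TestPix/*.jpg
for my $file (@ARGV) {
    my $software = software_tag($file);
    print "$file\t$software\n" if looks_edited($software);
}
```

Unlike scanning the raw bytes for "Adobe", this looks only at the tag where editors normally record their name, which should cut down on accidental matches in the pixel data. A missing or stripped tag still proves nothing either way.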
