Open Photography Forums  
  #1  
Old June 9th, 2010, 07:35 AM
Doug Kerr Doug Kerr is offline
Senior Member
 
Join Date: May 2006
Location: Alamogordo, New Mexico, USA
Posts: 8,418
Photoshop - character '©' in Exif metadata

This is to summarize my findings regarding a peculiarity when using Photoshop to embed in image file metadata a copyright notice including the character '©'.


Three kinds of metadata

Let me first review the three major "kinds" of metadata provided for in modern image files:

Exif metadata. This includes the familiar information about the technical circumstances of the shot - camera model, shutter speed, and so forth. It also provides for information such as what we may call a copyright notice. Its technical structure draws upon the "tag" structure found in the TIFF file format.

IPTC IIM metadata. This is the original form of metadata standardized by the International Press Telecommunications Council. Its technical structure is similar to that of the Exif metadata. It also provides for what we may call a copyright notice.

IPTC XML metadata. This is an advanced form of metadata standardized by IPTC. Its structure is based on the XML information formatting concept. It also provides for what we may call a copyright notice. (The actual data item has different formal names in all three places.)

An Exif file (that is the type we most commonly use, including for JPEG image data) can support all three types (simultaneously).

Character Sets

The applicable specification provides that, in Exif metadata, what we will call here the 'copyright notice' item shall be encoded in ASCII. To get a little ahead of the story, that means that the character '©', not being an ASCII character, cannot legitimately appear in the Exif metadata.

As we so often find, many workers have, without any formal leave from the standards, "stretched" the prescription for text items in Exif metadata to be in ASCII to mean "ASCII or ISO-8859-1", the latter being essentially the "extended ASCII" character set widely used in Windows systems.

IPTC IIM metadata text may be encoded in any of several character sets, including ASCII or Unicode UTF-8. An indicator tells which character set is in use.

IPTC XML metadata is encoded in Unicode UTF-8.

Photoshop

Photoshop, in its File Info panel, permits the user to set various data items that will be embedded as metadata in the resulting image file. There is a single field for the "copyright notice". Photoshop will then embed it as the corresponding item in all three types of metadata mentioned above.

In all three places, this text string is encoded in Unicode UTF-8 form. In the two IPTC areas this practice is perfectly in keeping with the IPTC standards (in the IPTC IIM area, the proper character set indicator is provided).

But in the Exif metadata area, the copyright notice string is also encoded in Unicode UTF-8 form. This is not accommodated by the Exif specification.

So long as only ASCII characters appear in the string, this is only of academic interest: the Unicode UTF-8 representation of an ASCII character (such as 'a') is just the same as the ASCII (or ISO-8859-1) representation - a single byte, with the obvious value.

But the situation gets more complicated when the string includes a non-ASCII character, such as '©'. For that in particular, the Unicode UTF-8 representation is a sequence of two bytes, with hexadecimal values 0xC2 and 0xA9. If that sequence is examined by an application that presumes text strings in Exif metadata to have been encoded in ISO-8859-1, it will display it as 'Â©' - not cool.
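As a rough illustration of the byte values just described, here is a minimal Python sketch. It only exercises Python's standard codecs; the variable names are made up for the example and nothing here reads an actual image file.

```python
# The copyright sign (U+00A9) in the two encodings discussed above.
symbol = "\u00a9"

utf8_bytes = symbol.encode("utf-8")          # two bytes: 0xC2 0xA9
latin1_bytes = symbol.encode("iso-8859-1")   # one byte: 0xA9

# What a reader shows when it assumes ISO-8859-1 but the writer used UTF-8:
# each of the two bytes is taken as a separate character.
misread = utf8_bytes.decode("iso-8859-1")    # 'Â©'

# An ASCII character is byte-identical in ASCII, ISO-8859-1 and UTF-8.
same_for_ascii = (
    "a".encode("ascii") == "a".encode("iso-8859-1") == "a".encode("utf-8")
)

print(utf8_bytes.hex(" "), latin1_bytes.hex(), misread, same_for_ascii)
```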

So, what should Photoshop do? The dilemma is that there is no legitimate way (that is, in conformity with the Exif specification) to embed a '©' in the Exif copyright notice item at all. But of course it would not have been practical for the Photoshop designers just to say "we cannot handle that - it is contrary to the specification".

But, given general practice in this area, it would probably have been more useful for Photoshop, when seeking to embed non-ASCII characters in Exif metadata, to use the ISO-8859-1 encoding.

Metadata reading applications

A "strict" Exif metadata-reading application, encountering the character '©' in ISO-8859-1 encoding in the Exif metadata, would reject it as a non-character (perhaps displaying instead a substitute character).

But in fact, most Exif-metadata reading applications take a practical view, and will display the character as '©'. But, faced with a Photoshop-generated file, they will display in this spot what they think they have seen in the file: 'Â©'.

Some applications, developed by those aware of the Photoshop practice (BreezeBrowser is a good example), will render the byte sequence 0xC2 0xA9 as '©', essentially treating the incoming data as being in Unicode UTF-8.

What if the file was not generated by Photoshop, and the user has placed in the copyright notice string the character '' ('Copyright 2010 . Yert') (encoded as ISO-8859-1)?

Typical Exif readers render that byte sequence as intended.

BreezeBrowser renders the resulting byte sequence as 'Copyright 2010 Yert'. (I'll spare the reader the analysis of how that happens.)

Photoshop renders it as intended (but then, if we let it rewrite the file, puts it into UTF-8 form).

(By the way, '' in Unicode UTF-8 has a two-byte representation.)
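One way a reader could cope with both kinds of file is sketched below in Python. This is only an assumed heuristic, not the documented behavior of BreezeBrowser, Photoshop, or any other application named here, and the function name decode_exif_text is hypothetical: it accepts the bytes as UTF-8 when they form valid UTF-8, and otherwise falls back to ISO-8859-1.

```python
def decode_exif_text(raw: bytes) -> str:
    """Decode an Exif text value, tolerating the encodings seen in practice.

    Hypothetical helper for illustration only.
    """
    # Exif ASCII-type values are NUL-terminated; drop the terminator.
    raw = raw.rstrip(b"\x00")
    try:
        # Valid UTF-8 (which includes pure ASCII) is taken at face value.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Otherwise assume ISO-8859-1, which maps every byte to a character.
        return raw.decode("iso-8859-1")


# A Photoshop-style value (UTF-8) and an ISO-8859-1 value give the same text.
print(decode_exif_text(b"Copyright 2010 \xc2\xa9 Yert\x00"))
print(decode_exif_text(b"Copyright 2010 \xa9 Yert\x00"))
```

The fallback works because a lone byte such as 0xA9 is not valid UTF-8. The weak spot is an ISO-8859-1 string that happens also to be valid UTF-8 (for example, one that really was meant to contain 'Â©'); this sketch would read it as UTF-8.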

What should happen?

CIPA (the issuer of the DCF file specification, essentially the form of the Exif file format used by most digital cameras today) should amend the specification to provide for general text items in Exif metadata to be encoded in ISO-8859-1 form.

Adobe should arrange Photoshop to encode text data in the Exif metadata area in ISO-8859-1 form.

There should be peace between Israel and Palestine.

So, what should we do?

There is no foolproof solution at present to this problem. Perhaps best would be for those who embed a copyright notice in their files via Photoshop to not include the character '©'. It is not a mandatory part of the copyright notice as prescribed under US copyright law. (However, it does play a role in the international scheme of copyright protection.)

#

Best regards,

Doug

Last edited by Doug Kerr; June 9th, 2010 at 08:37 AM.
  #2  
Old June 9th, 2010, 09:25 AM
Rachel Foster Rachel Foster is offline
Senior Member
 
Join Date: Sep 2007
Location: Michigan, USA
Posts: 3,574

So Alt 0 1 6 9 doesn't work? I'm confused.
  #3  
Old June 9th, 2010, 10:05 AM
Asher Kelman Asher Kelman is offline
OPF Owner/Editor-in-Chief
 
Join Date: Apr 2006
Posts: 34,113

Doug,

In simple terms, what is, in practice, wrong with adding a notice in the copyright fields of File Info in PS or in the other catalog programs?

Does it get destroyed somehow or is it merely non-conforming like premarital sex in the Catholic Church!

Asher
  #4  
Old June 9th, 2010, 12:12 PM
Doug Kerr Doug Kerr is offline

Hi, Asher,

Quote:
Originally Posted by Asher Kelman
Doug,

In simple terms, what is, in practice, wrong with adding a notice in the copyright fields of File Info in PS or in the other catalog programs?

Does it get destroyed somehow or is it merely non-conforming like premarital sex in the Catholic Church!

The "only" problem is that many Exif metadata reading programs, examining a file created by Photoshop, present this character, as encoded by Photoshop, as 'Â©', which some of us consider nikulturniy.

This typographic curiosity has none of the redeeming value of premarital sex.

Best regards,

Doug
  #5  
Old June 9th, 2010, 12:17 PM
Doug Kerr Doug Kerr is offline

Hi, Rachel,

Quote:
Originally Posted by Rachel Foster
So Alt 0 1 6 9 doesn't work? I'm confused.

Let me give a simple example.

In Photoshop, you enter in the Copyright notice field of the File Info panel your copyright notice, using Alt+0169 to include the character '©'.

When you look at the finished file with many applications that read and display the Exif (and other) metadata, you will see 'Â©' instead of '©'.

Here is an example with the Opanda Iexif metadata reader (which I use as a browser plugin to examine the metadata of posted images):



Best regards,

Doug
  #6  
Old June 9th, 2010, 01:34 PM
Rachel Foster Rachel Foster is offline

Ah, got it. Thanks.
  #7  
Old June 10th, 2010, 08:57 AM
Doug Kerr Doug Kerr is offline

Just for completeness of the record, let me describe here a collateral issue of character encoding in Exif metadata which, to the best of my knowledge, does not lead to any widespread difficulty nor need for attention from us in our usual work.

At issue is the Exif metadata item UserComment.

The Exif specification (as well as the DCF specification, an elaborated form of the Exif specification that governs common digital camera image files) provides that this data item can be in any of three coded character sets:

ASCII (meaning just that, not "ASCII or ISO-8859-1")
JIS, referring to a 16-bit Japanese-language character set
Unicode (no initial mention of UTF-8 vs. UTF-16BE vs. UTF-16LE)

The character set used is declared by a text prefix to the data item itself, 'ASCII', 'JIS', or 'UNICODE', in ASCII, built out to 8 characters/bytes with NULs.
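To make the layout of that prefix concrete, here is a minimal Python sketch that splits a raw UserComment value into its declared character set and the remaining payload. The function name split_user_comment and the PREFIXES table are hypothetical names for this illustration, and only the three prefixes listed above are recognized.

```python
# The three character code prefixes described above, each padded with NULs
# to exactly 8 bytes.
PREFIXES = {
    b"ASCII\x00\x00\x00": "ASCII",
    b"JIS\x00\x00\x00\x00\x00": "JIS",
    b"UNICODE\x00": "UNICODE",
}


def split_user_comment(raw: bytes) -> tuple[str, bytes]:
    """Return (declared character set, payload bytes) for a UserComment value."""
    prefix, payload = raw[:8], raw[8:]
    if prefix not in PREFIXES:
        raise ValueError(f"unrecognized character code prefix: {prefix!r}")
    return PREFIXES[prefix], payload


print(split_user_comment(b"ASCII\x00\x00\x00Shot at dusk"))
```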

My experience is that typical Exif editors, when adding or modifying this item, will apply:

The ASCII encoding if the string entered by the user contains only ASCII characters.

The Unicode encoding if the string entered by the user contains any characters beyond ASCII.

Although not mentioned in the standard, evidently it is the practice to use UTF-16 encoding in the Unicode mode. In the case of characters in the Basic Multilingual Plane (all those we are likely to encounter), this uniformly uses a 16-bit number (recorded as two bytes) to represent a character.

With regard to the little-endian vs. big-endian distinction (the order of the two bytes representing a 16-bit number), that is declared in an Exif file on behalf of the whole file. In the header, there is a two-byte ASCII byte order indicator, either 'II' (evocative of "Intel") for little-endian or 'MM' (evocative of "Motorola") for big-endian.
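The editor practice described above (ASCII when possible, otherwise UTF-16 in the file's byte order) might be sketched in Python as follows. The function names are hypothetical, the JIS case is left out, and the assumption that the UTF-16 payload follows the file-wide 'II'/'MM' indicator is taken from the description above rather than from any particular editor's documentation.

```python
ASCII_PREFIX = b"ASCII\x00\x00\x00"
UNICODE_PREFIX = b"UNICODE\x00"


def _utf16_codec(byte_order: str) -> str:
    """Map the file's byte order indicator to a Python UTF-16 codec name."""
    if byte_order == "II":
        return "utf-16-le"   # little-endian
    if byte_order == "MM":
        return "utf-16-be"   # big-endian
    raise ValueError("byte_order must be 'II' or 'MM'")


def encode_user_comment(text: str, byte_order: str) -> bytes:
    """Build a UserComment value the way typical editors seem to."""
    if text.isascii():
        return ASCII_PREFIX + text.encode("ascii")
    return UNICODE_PREFIX + text.encode(_utf16_codec(byte_order))


def decode_user_comment(raw: bytes, byte_order: str) -> str:
    """Read back a value written by encode_user_comment."""
    prefix, payload = raw[:8], raw[8:]
    if prefix == ASCII_PREFIX:
        return payload.decode("ascii")   # strict: rejects non-ASCII bytes
    if prefix == UNICODE_PREFIX:
        return payload.decode(_utf16_codec(byte_order))
    raise ValueError(f"unsupported character code prefix: {prefix!r}")


# The same character gives a different byte pair under each byte order.
print(encode_user_comment("\u00a9", "II").hex(" "))
print(encode_user_comment("\u00a9", "MM").hex(" "))
```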

Photoshop does not either display or allow us to enter or manipulate the Exif metadata item UserComment. Neither does it seem to tamper with existing data in that area.

For example, if the data item is declared to be encoded in ASCII but in fact contains a non-ASCII character (never mind how that got there), Photoshop does not rewrite the item in Unicode to attain orthodoxy.

Typical Exif metadata reading applications seem to interpret non-ASCII characters found in a UserComment declared as ASCII as if they were in ISO-8859-1 (a reasonable accommodation of something we should, however, not expect to encounter). I have not done a comprehensive survey in this regard.

As I said at the outset, I cannot just now imagine a situation in which the details of this are of any concern to our operations.

Best regards,

Doug
Posting images or text grants license to OPF, yet © of such remain with its creator. Still, all assembled discussion © 2006-2017 Asher Kelman (all rights reserved) Posts with new theme or unusual image might be moved/copied to a new thread!