2RO050 - Metadata Automation

by Joseph Moore

The following is a reverse-chronological record of the efforts made to automatically transfer metadata (like the Title and Artist of an artwork) directly from photo editing software to a spreadsheet.

February 9th, 2019 – Condor

To avoid data entry that requires intermediate software knowledge, I'm developing a Google spreadsheet and scripts that make entering, updating, and exporting data easier. Combined with Adobe InDesign, this will allow for the automation of binders, finding aids, and other related materials. Thus far I have familiarized myself with some of the command language for Sheets and written a small extension in JavaScript. Using this approach I extended the sheet's usefulness in the following way:

When the name of an image file is entered in column N, "imagefilenameprimary," a thumbnail image loads in column B, "thumbnail." This should help with associating works and data by sight. The thumbnails are generated by a script that programmatically resizes image files to 100 pixels on their longest side and compresses them to cut down on bandwidth. The resizing and storage currently happen on another server due to the limitations of Google Drive. Google made hosting images for use in spreadsheets very difficult on Drive in 2016 when the "Google Websites" product was phased out. The current version of Sheets lets you insert an image with the mouse but obfuscates the image name, so images cannot be loaded algorithmically in a way that is fast, scalable, and not prone to error.
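The resizing rule described above (100 pixels on the longest side, aspect ratio preserved) can be sketched as a small calculation. This is only an illustration of the arithmetic, not the script running on the server; the function name thumbnail_size and its parameters are hypothetical.

```python
def thumbnail_size(width, height, longest_side=100):
    """Return (width, height) scaled so the longer side equals longest_side.

    Hypothetical helper: only the sizing arithmetic is shown here, not the
    actual image resizing or compression.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    # Scale both sides by the same factor so the aspect ratio is preserved,
    # and never let the short side round down to zero pixels.
    new_width = max(1, round(width * longest_side / longest))
    new_height = max(1, round(height * longest_side / longest))
    return new_width, new_height


print(thumbnail_size(4000, 3000))
```

An image library would then be given these dimensions to do the actual resampling.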

Once data is entered into a given spreadsheet it can be exported as a CSV file, a simple format used to store tabular data. After a CSV has been generated it can be used to populate an InDesign template using a "data merge," similar to a mail merge driven by an Excel sheet. A template can be created in InDesign that pulls data from the CSV, e.g. “artist,” “title,” “year,” and positions and resizes linked images algorithmically. The benefit of generating binders and other materials through InDesign is that it is a standard in the print industry and has a sophisticated layout engine that can produce high-quality typography and formatting. Additionally, if a company prefers to use its own branding in a binder, etc., it is very likely that some of those assets will be available as InDesign files.
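As a rough sketch of what a merge-ready CSV might look like, the following builds one from a list of rows. It assumes the InDesign data merge convention of marking an image field by prefixing its column name with "@"; the function name rows_to_merge_csv and the sample row are hypothetical, and the export from the spreadsheet itself may be done differently.

```python
import csv
import io


def rows_to_merge_csv(rows, image_column="imagefilenameprimary"):
    """Build CSV text for a data merge from a list of dicts (hypothetical helper)."""
    if not rows:
        return ""
    fieldnames = list(rows[0].keys())
    # Assumption: the image column is flagged for InDesign with a leading "@".
    header = ["@" + name if name == image_column else name for name in fieldnames]
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # The csv module quotes any value that contains a comma or a quote.
        writer.writerow([row[name] for name in fieldnames])
    return buffer.getvalue()


sample = [{
    "artist": "Example Artist",
    "title": "Untitled, No. 2",
    "year": "1987",
    "imagefilenameprimary": "example_0001.jpg",
}]
print(rows_to_merge_csv(sample))
```

Letting the csv module handle quoting matters here because titles often contain commas.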

Planned future additions for the sheets app

  • Validating data by type

  • Validating data by format

  • Further optimization of embedded images
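The first two planned items could look something like the sketch below. The rules themselves are assumptions made for illustration (a four-digit year, and an image file name made of letters, digits, underscores, and hyphens with a common image extension); the real rules have not been decided, and validate_row is a hypothetical name.

```python
import re

# Assumed format rule: no spaces, and a common image extension.
FILENAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+\.(jpg|jpeg|png|tif|tiff)", re.IGNORECASE)
# Assumed type rule: a year is exactly four ASCII digits.
YEAR_PATTERN = re.compile(r"[0-9]{4}")


def validate_row(row):
    """Return a list of problems found in one spreadsheet row (a dict of strings)."""
    errors = []
    year = str(row.get("year", "")).strip()
    if not YEAR_PATTERN.fullmatch(year):
        errors.append("year must be a four-digit number")
    filename = str(row.get("imagefilenameprimary", "")).strip()
    if not FILENAME_PATTERN.fullmatch(filename):
        errors.append("imagefilenameprimary is not a valid image file name")
    return errors


print(validate_row({"year": "c. 1950", "imagefilenameprimary": "example 0001.jpg"}))
```

Returning a list of messages rather than stopping at the first problem lets every issue in a row be reported at once.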

Things to solve

Currently our inventory number/file name, generated by client, title, author, etc., has the potential to create naming collisions (two things with the same name). This needs to be solved as we move forward.
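Until a naming scheme that prevents collisions is settled, existing names can at least be checked for duplicates. The sketch below is one possible check, not a solution to the naming problem; find_collisions is a hypothetical name, and comparing names case-insensitively is a cautious assumption (some file systems treat "A.jpg" and "a.jpg" as the same file).

```python
from collections import Counter


def find_collisions(names):
    """Return the sorted list of names that occur more than once.

    Assumption: names are compared after trimming whitespace and lowercasing.
    """
    counts = Counter(name.strip().lower() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


print(find_collisions(["a_001.jpg", "A_001.JPG", "b_002.jpg"]))
```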

Notes on character encoding and formatting

  • Column titles should contain only lowercase letters (a-z), with no spaces and no special characters.

  • Character encoding is a complicated issue and can cause errors in any text-driven application, including when reading from and writing to CSV files. In general you do not want to export data to a CSV using ASCII encoding. While simple and easy to read, ASCII lacks the characters needed for languages other than English. Attempting to encode text with diacritics as ASCII will cause the load or export of the file to fail in most cases. Common alternatives are UTF-8 and UTF-16. UTF-8 is usually a fine choice (see caveat below).
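Both notes above can be illustrated in a few lines. This is a sketch with hypothetical helper names (normalize_column_title and can_encode): the first reduces a title to the letters a-z, and the second shows why text with diacritics cannot be written as ASCII.

```python
import re


def normalize_column_title(title):
    """Lowercase a title and drop everything that is not a-z."""
    return re.sub(r"[^a-z]", "", title.lower())


def can_encode(text, encoding):
    """Return True if text can be written in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(normalize_column_title("Image File Name (Primary)"))
for encoding in ("ascii", "utf-8", "utf-16"):
    print(encoding, can_encode("José", encoding))
```

Note that the normalization simply drops accented letters rather than transliterating them, so column titles are best written without diacritics in the first place.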

Additional technical notes

UTF-8 usually works, but this author has found that InDesign is particularly sensitive to character errors, and it may be necessary to re-encode a UTF-8 file as UTF-16. This can be done from the command line on macOS or Linux using the iconv program. The following will convert a file from UTF-8 to UTF-16:

iconv -f UTF-8 -t UTF-16 < oldfile.csv > newfile.csv

Here oldfile.csv is the file encoded as UTF-8 and newfile.csv is the same file re-encoded as UTF-16. After conversion, diacritics should be checked.
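Where iconv is not available, the same conversion can be sketched in Python, with a basic round-trip check added. This is an alternative to the command above, not what was actually used; reencode_csv is a hypothetical name. The check only confirms that the new file decodes back to the same text, so a visual check of diacritics in InDesign is still worthwhile.

```python
def reencode_csv(src_path, dst_path, src_encoding="utf-8", dst_encoding="utf-16"):
    """Re-encode a text file and report whether it reads back unchanged."""
    # newline="" keeps the CSV's line endings exactly as they were.
    # A source file that is not valid UTF-8 raises UnicodeDecodeError here.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    # Python's "utf-16" codec writes a byte order mark at the start of the file.
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    # Round-trip check: decode the new file and compare it with the original text.
    with open(dst_path, "r", encoding=dst_encoding, newline="") as check:
        return check.read() == text
```

Calling reencode_csv("oldfile.csv", "newfile.csv") mirrors the iconv command and returns True when the round trip succeeds.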

February 3rd, 2019 – Condor

After migrating most data from the Snowmass collection to Lightroom, I'm finding that this approach doesn't provide enough benefit for the time it requires to justify developing the process further.

Lightroom was developed as a single-user digital asset management program, and while it is the standard for developing raw digital images, its support for a multiuser workflow is non-existent. I attempted to work around this limitation by developing a plugin to hold the metadata required for Rookery, e.g. client, artist's name, dimensions, etc. However, user-defined metadata is not stored in the image file (.dng or .psd), as other metadata is, but in the Lightroom catalogue. This makes user-defined metadata less flexible and less useful, both as a way to attach metadata directly to the image file and as a way to share that data with others. It is possible to export the data from Lightroom as an Excel document using a third-party plugin, Listview. However, the plugin is buggy and support comes only from its author.

In conclusion, at this point using metadata stored in Lightroom's catalogue doesn't offer enough benefit in terms of making artwork metadata easily maintainable. With that in mind, I'm looking at other approaches for speeding up data entry.

February 1st, 2019 – Condor

Migrating data from Snowmass finding aid to Lightroom.

January 31st, 2019 – Condor

From: Condor

To: Bluebird

Hi Bluebird,

I found a plugin + workaround to export image and metadata to an Excel file from Lightroom. It's a little buggy but I still think it will speed up the process of creating binders and tracking data.

-Joe

January 29th, 2019 – Condor

From: Condor

To: Bluebird

Hi Bluebird,

The good news: I figured out how to write a plugin for Lightroom that adds fields for Rookery data! The idea being that I could just export the data to an Excel file and then do a mail merge for the binder. (pic attached)

The bad news: Lightroom's SDK sucks! While you can export "normal" metadata, it seems like exporting user-created metadata is a non-starter and is scheduled for a "future release." :-(

I'm looking at some other plugins that might allow me to work around the issue.

-Condor