An Embarrassment of Riches - Part II



 by: Sam Vaknin, Ph.D.

http://www.doi.org/

The DOI Foundation unveiled the DOI-EB Initiative (EB stands for e-books) at the BookExpo America 2001 show in order to, in its words:

"Determine requirements with respect to the application of unique identifiers to eBooks

Develop proofs-of-concept for the use of DOIs with eBooks

Develop technical demonstrations, possibly including a prototype eBook Registration Agency."

It is backed by a few major publishers, such as McGraw-Hill, Random House, Pearson, and Wiley.

This ostensibly modest agenda conceals a revolutionary and ambitious attempt to identify unambiguously the origin of digital content (in this case, e-books) and to link a universe of information to each and every ID number. Aware of competing efforts underway, the DOI Foundation is actively courting the likes of "indecs" (Interoperability of Data in E-Commerce Systems) and the OeBF (Open eBook Forum). Companies such as Enpia Systems of South Korea (a DOI Registration Agency) have already implemented a DOI-cum-indecs system. In November 2000, the Open Ebook Publishing Standards Initiative of the AAP (Association of American Publishers) recommended the DOI as the primary identification system for e-books' metadata. The MPEG (Moving Picture Experts Group) is said to be considering the DOI seriously in its efforts to come up with numbering and metadata standards for digital video. A DOI can be expressed as a URN (Uniform Resource Name - the IETF's syntax for generic resources) and is compatible with OpenURL (a syntax for embedding parameters such as identifiers and metadata in links). A "Namespace Dictionary" is to be published shortly. It will encompass 800 metadata elements and will cover e-books, journals, audio, and video. A working group has been set up to develop a "services definition" interface (i.e., to allow web-enabled systems, especially e-commerce and m-commerce systems, to deploy the DOI).

The DOI, in other words, is designed to be all-inclusive and all-pervasive. Each DOI number is made up of a prefix, specific to a publisher, and a suffix, which could end up painlessly assimilating the ISBN and ISSN systems (or any other numbering system and database).

Thus, a DOI can be assigned to every e-book based on its ISBN and to every part (chapter, section, or page) of every e-book. This flexibility could support Pay Per View models (such as Questia's or Fathom's), POD (Print On Demand), and academic "course packs", which comprise material from many textbooks, whether on digital media or downloadable. The DOI, in other words, can underlie D-CMS (Digital Content Management Systems) and Electronic Catalogue ID Management Systems.
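The prefix-plus-suffix structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the DOI Foundation's specification: the prefix 10.9999, the placeholder ISBN, and the ".chNN" suffix convention for chapters are all hypothetical, chosen only to show how an existing numbering scheme could be folded into the suffix and how parts of a book could receive their own identifiers.

```python
from typing import NamedTuple, Optional


class Doi(NamedTuple):
    prefix: str  # identifies the publisher
    suffix: str  # chosen by the publisher; here it carries an ISBN


def make_doi(prefix: str, isbn: str, chapter: Optional[int] = None) -> str:
    """Build a DOI string from a publisher prefix and an ISBN.

    The ".chNN" chapter convention is an assumption made for this sketch.
    """
    suffix = isbn.replace("-", "")
    if chapter is not None:
        suffix += f".ch{chapter:02d}"
    return f"{prefix}/{suffix}"


def parse_doi(doi: str) -> Doi:
    """Split a DOI at the first slash into prefix and suffix."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix pair: {doi!r}")
    return Doi(prefix, suffix)


# Hypothetical prefix and placeholder ISBN, for illustration only.
book = make_doi("10.9999", "0-00-000000-0")
chapter = make_doi("10.9999", "0-00-000000-0", chapter=3)
```

Because the suffix is opaque to everyone but the publisher, the same prefix can cover the whole book and any chapter, section, or page the publisher chooses to sell separately.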

Moreover, the DOI is a paradigm shift (though, conceptually, it was preceded by the likes of the UPC code and the ISO's HyTime multimedia standard). It blurs the borders between types of digital content. Imagine an e-novel with the video version of the novel, the soundtrack, still photographs, a tourist guide, an audio book, and other digital content embedded in it. Each content type and each segment of each content type can be identified and tagged separately and, thus, sold separately - yet all under the umbrella of the same DOI! The nightmare of DRM (digital rights management) may finally be over.

But the DOI is much more than a sophisticated tagging technology. It comes with multiple resolution (see "Embarrassment of Riches - Part I"). In other words, as opposed to a URL (Uniform Resource Locator), the DOI's destination is generated dynamically, "on the fly", in response to the user's action, and is not "hard coded" into the web page. This is because the DOI identifies content - not its location. And while the URL resolves to a single web page, the DOI resolves to a lot more, in the form of publisher-controlled (ONIX-XML) "metadata" displayed in a pop-up (JavaScript or other) screen. The metadata include everything from the author's name through the book's title, edition, blurbs, sample chapters, other promotional material, links to related products, a rights and permissions profile, e-mail contacts, and active links to retailers' web pages. Thus, every book-related web page becomes a full-fledged book retailing gateway. The "anchor document" (in which the DOI is embedded) remains uncluttered. ONIX 2.0 may contain standard metadata fields and extensions specific to e-publishing and e-books.
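To make the contrast with a URL concrete, here is a rough Python sketch of multiple resolution. Everything in it is hypothetical: the DOI, the field names, and the example.com addresses merely stand in for a publisher-controlled record, and no attempt is made to reproduce actual ONIX element names.

```python
from typing import Dict, List

# Hypothetical publisher-controlled record; field names are illustrative,
# not actual ONIX element names.
RECORDS: Dict[str, dict] = {
    "10.9999/0000000000": {
        "title": "An Example Novel",
        "author": "A. N. Author",
        "links": [
            ("Sample chapter", "https://publisher.example.com/sample"),
            ("Rights and permissions", "https://publisher.example.com/rights"),
            ("Buy from Retailer A", "https://retailer-a.example.com/item"),
            ("Buy from Retailer B", "https://retailer-b.example.com/item"),
        ],
    }
}


def resolve(doi: str) -> List[str]:
    """Return the menu lines a pop-up might show for one DOI."""
    record = RECORDS[doi]
    header = f'{record["title"]} by {record["author"]}'
    choices = [f"{label}: {url}" for label, url in record["links"]]
    return [header] + choices


menu = resolve("10.9999/0000000000")
```

A URL would correspond to just one of the entries in "links"; the single DOI yields the whole menu, and the publisher can change the record without touching the page in which the DOI is embedded.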

This latter feature - the ability to link to the systems of retailers, distributors, and other types of vendors - is the "barcode" function of the DOI. Like barcode technology, it helps to automate the supply chain and to update the inventory, ordering, billing and invoicing, accounting, and re-ordering databases and functions. Besides tracking content use and distribution, the DOI makes it possible to integrate hitherto disparate e-commerce technologies seamlessly and to facilitate interoperability among DRM systems.

The resolution itself can take place in the client's browser (using a software plug-in), in a proxy server, or in a central, dynamic server. Resolving from the client's PC, e-book reader, or PDA has the advantage of being able to respond to the user's specific condition (location, time of day, etc.). No plug-in is required when an HTTP proxy server is used - but then the DOI becomes just another URL, embedded in the page when it is created and not resolved when the user clicks on it. The most user-friendly solution is, probably, for a central server to look up values in response to a user's prompt and serve her with cascading menus or links. Admittedly, in this option, the resolution table (which DOI links to which URLs and to which content) is not really dynamic. It changes only with every server update and is static between updates. But this is a minor inconvenience. As it is, users are likely to respond with some trepidation to the need to install plug-ins and to the avalanche of information their single, innocuous mouse click generates.
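The central-server option, with a table that is static between updates, might look roughly like the following Python sketch. The class name CentralResolver, its methods, and the sample data are invented for this illustration; it only shows why lookups reflect the last server update rather than the moment of the click.

```python
from typing import Dict, List


class CentralResolver:
    """Toy central server: lookups read a table frozen at the last update."""

    def __init__(self) -> None:
        self._table: Dict[str, List[str]] = {}
        self.version = 0

    def update(self, new_table: Dict[str, List[str]]) -> None:
        # Copy the data so later changes by the caller do not leak in
        # until the next explicit update.
        self._table = {doi: list(urls) for doi, urls in new_table.items()}
        self.version += 1

    def lookup(self, doi: str) -> List[str]:
        # An unknown DOI yields an empty menu rather than an error.
        return list(self._table.get(doi, []))


# Hypothetical DOI and retailer addresses.
source = {"10.9999/0000000000": ["https://retailer-a.example.com/item"]}
server = CentralResolver()
server.update(source)

# The publisher adds a retailer, but the server has not been updated yet.
source["10.9999/0000000000"].append("https://retailer-b.example.com/item")
before = server.lookup("10.9999/0000000000")  # still reflects the old table

server.update(source)
after = server.lookup("10.9999/0000000000")  # reflects the new table
```

The gap between "before" and "after" is the minor inconvenience mentioned above: users see the table as it stood at the last update, in exchange for needing no plug-in.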

The DOI Foundation has compiled this impressive list of benefits - and beneficiaries:

"Publishers to enable cross referencing to related information, control over metadata, viral distribution and sales, easy access to content, sale of granular content

Consumers to increase value for time and money, and purchase options

Distributors to facilitate sale and distribution of materials as well as user needs

Retailers to build related materials on their sites, heighten consumer usability and copyright protection

Conversion Houses/Wholesaler Repositories to increase access to and use of metadata

DRM Vendors/Rights Clearing Houses to enable interoperability and use of standards

Data Aggregators to enable compilation of primary and secondary content and print on demand

Trade Associations facilitate dialog on social level and attend to legal and technical perspectives pertaining to multiple versions of electronic content

eBook software Developers to enable management of personal collections of eBooks including purchase receipt information as reference for quick return to retailer

Content Management System Vendors to enable internal synching with external usage

Syndicators to drive sales to retailers, add value to retail online store/sales, and increase sales for publishers"

The DOI is assigned to publishers by Registration Agencies (of which there are currently three - CrossRef and Content Directions in the United States and the aforementioned Enpia Systems in Asia). It is already widely used to cross-reference almost 5,000 periodicals with a database of 3,000,000 citations. The price is steep - it costs a publisher $200 to get a prefix and submit DOIs to the registry. But as Registration Agencies proliferate, competition is bound to slash these prices precipitously.