Thursday, October 16, 2008

LINK: The Chuck Cunningham of HTTP

If you actually read RFC 1945, the document that describes HTTP/1.0, one thing you'll notice is that section 8, where the methods are defined, has only GET, HEAD, and POST. The PUT and DELETE methods that are so important to advocates of RESTful web services are relegated to Appendix D, "Additional Features".

This appendix documents protocol elements used by some existing HTTP implementations, but not consistently and correctly across most HTTP/1.0 applications. Implementors should be aware of these features, but cannot rely upon their presence in, or interoperability with, other HTTP/1.0 applications.

This is also where a number of other familiar things, like the Accept-* family of headers, were listed. In RFC 2616 these were promoted to full-fledged features of HTTP/1.1.

But wait... Appendix D also lists a couple of methods I'd never heard of before: LINK and UNLINK. It also lists some headers (Link, Title, URI) that appear to have been intended for use with them.
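For a sense of what these might have looked like on the wire, here is a rough sketch that formats a LINK or UNLINK request as plain text. The helper name `build_link_request`, the example paths and URLs, and the `rel` value are all made up for illustration. Only the general shape (a request line followed by a Link header) is taken from Appendix D.

```python
def build_link_request(method, path, target, rel):
    """Format an HTTP/1.0 LINK or UNLINK request as text (hypothetical helper)."""
    if method not in ("LINK", "UNLINK"):
        raise ValueError("expected LINK or UNLINK")
    lines = [
        f"{method} {path} HTTP/1.0",       # request line
        f'Link: <{target}>; rel="{rel}"',  # the relationship to add or remove
        "",                                # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


# Illustrative values only; none of these URLs come from the RFC.
request_text = build_link_request(
    "LINK", "/photos/42.jpg", "http://example.org/people/alice", "Subject"
)
print(request_text)
```

An UNLINK request would look the same with the method name swapped, asking the server to drop the relationship rather than establish it.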

These features are simply missing from HTTP/1.1. They disappeared, like a TV character simply being dropped between seasons with no explanation or mention. They weren't put on a bus, they didn't die in a plane crash with no survivors on their way back home, a bridge didn't collapse on them. The powers that be just decided that they never existed at all.

The Link header is described:

The Link entity-header field provides a means for describing a relationship between the entity and some other resource. An entity may include multiple Link values. Links at the metainformation level typically indicate relationships like hierarchical structure and navigation paths.

This sounds a lot like the HTML link tag. Redundant, even.
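To make the resemblance concrete, here is a minimal sketch of a parser for Link header values, assuming the `<URI>; name="value"` shape with multiple links separated by commas. The function name `parse_link_header` and the sample values are illustrative, and the regular expressions are deliberately simple rather than a faithful rendering of the RFC grammar.

```python
import re

# One link-value: <URI> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
# One parameter: ;name="quoted value" or ;name=bare-token
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (uri, params) pairs from a Link header value."""
    links = []
    for match in _LINK_VALUE.finditer(value):
        uri, raw_params = match.group(1), match.group(2)
        params = {}
        for name, quoted, bare in _PARAM.findall(raw_params):
            params[name.lower()] = quoted if quoted else bare
        links.append((uri, params))
    return links


# Illustrative header value with two links.
header = (
    '<http://www.cern.ch/TheBook/chapter2>; rel="Previous", '
    '<mailto:timbl@w3.org>; rev="Made"; title="Tim Berners-Lee"'
)
for uri, params in parse_link_header(header):
    print(uri, params)
```

Each parsed pair carries the same information an HTML link element would: a target plus attributes such as rel, rev, and title.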

Digging around, I found this description of the HTTP of 1992, written by Tim Berners-Lee himself. It documents the protocol at a stage between "HTTP 0.9", with its GET method and no headers, and the eventual HTTP/1.0 standard.

In the section on "Object Headers", we find:

Note. It is proposed that any HTML metainformation element (allowed withing the HEAD as opposed to BODY element of the document) be a valid candidate for an HTTP object header. LINK is one example, TITLE another. One suggestion was that the isomorphism should be realized by prepending "WWW-" to the HTML element name to make the HTTP header name, and the HTML attributes imply identically named semicolon-separated MIME-style header parameters. It is open to discussion whether the "WWW-" should be inserted or not.
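The proposed mapping is simple enough to sketch in a few lines. This is only a guess at what the note has in mind: the function name `html_element_to_header`, the quoting of parameter values, and the capitalization of the header name are all assumptions, since the note leaves those details open.

```python
def html_element_to_header(element, attributes, use_prefix=True):
    """Turn an HTML HEAD element into an HTTP header line (hypothetical mapping)."""
    name = element.capitalize()
    if use_prefix:
        # The note leaves open whether "WWW-" should be prepended.
        name = "WWW-" + name
    # HTML attributes become semicolon-separated MIME-style parameters.
    params = "; ".join(f'{key}="{value}"' for key, value in attributes.items())
    return f"{name}: {params}"


# Illustrative element and attributes.
print(html_element_to_header("LINK", {"rel": "Previous", "href": "chapter2.html"}))
print(html_element_to_header("LINK", {"rel": "Previous"}, use_prefix=False))
```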

Apparently sometime between HTTP/1.0 and 1.1 they finally decided that this wasn't something that needed to be in HTTP after all.

I have to admit I'm a little bit sad. When dealing with (X)HTML these headers and the methods to put/delete them are redundant, but they could be useful for non-HTML content. One use I can imagine would be to provide the title ("alt text") of a photo as part of the HTTP response to a request for the actual image file, rather than having that text exist only in the HTML that embeds the photo. Maybe Link(s) to the homepages of the people in the photo, too. Of course, it would require the web server to maintain some kind of database in which to store this data, which might have been asking too much in 1996. But now? I think we could have used these little guys.
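The photo idea can be sketched with Python's standard `http.server` module. Everything specific here is invented for illustration: the `PHOTO_METADATA` dictionary stands in for the database, and the paths, titles, URLs, and `rel` values are placeholders. The handler answers only HEAD requests, to keep the focus on the headers.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical metadata store: request path -> title and related links.
PHOTO_METADATA = {
    "/photos/42.jpg": {
        "title": "Alice and Bob at the lake",
        "links": [
            ("http://example.org/people/alice", "Subject"),
            ("http://example.org/people/bob", "Subject"),
        ],
    },
}


def metadata_headers(path, store=PHOTO_METADATA):
    """Return (name, value) header pairs for a photo, or [] if unknown."""
    meta = store.get(path)
    if meta is None:
        return []
    headers = [("Title", meta["title"])]
    for uri, rel in meta["links"]:
        headers.append(("Link", f'<{uri}>; rel="{rel}"'))
    return headers


class PhotoHandler(BaseHTTPRequestHandler):
    """Answers HEAD requests with Title and Link headers for known photos."""

    def do_HEAD(self):
        extra = metadata_headers(self.path)
        if not extra:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "image/jpeg")
        for name, value in extra:
            self.send_header(name, value)
        self.end_headers()
```

To try it, pass `PhotoHandler` to `http.server.HTTPServer` and send a HEAD request for `/photos/42.jpg`. A client could then show the title and the people links without ever touching an HTML page.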

EDIT: It just dawned on me that understanding how things like this get abandoned, and yet the world still turns out OK, means understanding a great truth that applies to domains beyond HTTP. After all, if PUT, POST, and DELETE had been widely implemented as they were originally designed - to allow users to create and edit web pages using features built directly into their browsers - then mankind would never have needed to invent things called "wikis" and "blogs", and then where would we be?


I guess EXIF handles some of that (but only for images) and ID3 handles some of that (but only for MP3s). They are not as unified as Object Headers would have been, but they do stick with the content no matter where it goes (MP3 player, image editor, etc.).