Doing a View Source on a lot of web pages out there shows something like this, which comes from a page that the DITA Open Toolkit generated from a DITA XML file:
<html lang="en-us" xml:lang="en-us">
<head>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
  <meta name="copyright" content="(C) Copyright 2005" />
  <meta name="DC.rights.owner" content="(C) Copyright a2005" />
  <meta content="recipe" name="DC.Type" />
  <meta name="DC.Title" content="My Topic" />
  <meta name="abstract" content="Sample description of the topic." />
  <meta name="description" content="Sample description of the topic." />
  <meta content="XHTML" name="DC.Format" />
  <meta content="r1" name="DC.Identifier" />
  <link href="commonltr.css" type="text/css" rel="stylesheet" />
  <title>My Topic</title>
</head>
The head elements of many web pages hold metadata as collections of name/value pairs, with each name in the name attribute and each value in the content attribute of a meta element like these. We’ve all seen that many HTML generation routines add these head/meta elements, but what kinds of applications actually pull these name/value pairs out and do something with them? Web-focused content management systems are the only candidate I can think of; can anyone confirm that one of those uses this data, or name some other kind of application that does?
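For reference, here is a minimal sketch of what "pulling these name/value pairs out" could look like, using only Python's standard-library html.parser. The names MetaCollector and extract_meta are made up for this illustration, the sample is a trimmed copy of the markup above, and nothing here is meant to describe how any particular CMS or browser actually does it.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects (name, content) pairs from meta elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        # Skip meta elements without a name, such as the http-equiv one.
        if attr.get("name") and attr.get("content") is not None:
            self.pairs.append((attr["name"], attr["content"]))


def extract_meta(html_text):
    """Return the name/content pairs found in html_text, in document order."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.pairs


# Trimmed copy of the sample markup shown earlier.
SAMPLE = """<html lang="en-us" xml:lang="en-us"> <head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<meta content="recipe" name="DC.Type" />
<meta name="DC.Title" content="My Topic" />
<meta name="description" content="Sample description of the topic." />
<title>My Topic</title> </head>"""

for name, content in extract_meta(SAMPLE):
    print(f"{name} = {content}")
```

The attribute order inside each meta element does not matter here, which is handy because the generated markup above mixes name-first and content-first elements.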
A funny side note: a web search to find some numbers on this usage of meta tags also displays a single paid ad with the title “Meta Tags Are Dead” that links to an ad for a $79 book called “Google Secrets: How to Get a Top 10 Ranking” at the clever domain name google-secrets.com. The ad title reminded me of a certain De La Soul album name.
One limited usage, but very useful for me, is that when you bookmark a page in the Opera browser, it stores the description metadata as well as the page title and URL (if the page doesn’t have any description, you can add your own, and I often have to, by copying/pasting content from the page). The Opera UI is based on fast search for navigation, so if you are looking at your bookmarks, you just type in a relevant word or two, and it searches all of the bookmark info, including the description, to filter the bookmark tab to show only the relevant bookmarks. This works really well.
Thanks Tony! This is just the kind of thing I was curious about. I see that they document it a bit here
Zotero uses Dublin Core meta tags to import bibliographic metadata.
Just to make sure I understand what you mean by “Dublin Core meta tags”: when you tell Zotero to save information about a web page, it looks for meta elements where the @name value begins with “DC.” and, for each that it finds, it saves the @name and @content values, right?
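The behavior described in that question can be sketched in a few lines. This only illustrates the question as asked; it is not Zotero's actual translator code. The function name dublin_core_pairs is hypothetical, the input is assumed to be (name, content) pairs already pulled from the page's meta elements, and matching the prefix case-insensitively is an assumption as well.

```python
def dublin_core_pairs(meta_pairs, prefix="DC."):
    """Keep only the pairs whose name starts with prefix.

    Case-insensitive matching is an assumption made for this sketch.
    """
    wanted = prefix.lower()
    return [
        (name, content)
        for name, content in meta_pairs
        if name.lower().startswith(wanted)
    ]


# Name/content pairs taken from the sample page at the top of the post.
page_meta = [
    ("copyright", "(C) Copyright 2005"),
    ("DC.Type", "recipe"),
    ("DC.Title", "My Topic"),
    ("description", "Sample description of the topic."),
    ("DC.Format", "XHTML"),
    ("DC.Identifier", "r1"),
]

saved = dublin_core_pairs(page_meta)
for name, content in saved:
    print(f"{name} = {content}")
```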
I scrape meta tags in many of the law-related Zotero translators I’ve contributed (Bob, you would be familiar with Cornell’s LII I think).
I was also recently shot down for suggesting that publishers use simple meta tags in order to be “Zotero friendly” - see: http://groups.google.com/group/zotero-dev/browse_thread/thread/4a6f0190afc3e4a
RDF and ontologies are way sexier!
Definitely familiar with the Cornell LII. They’re doing great work, and not enough people realize that findlaw.com is owned by Thomson, so I think that the work at Cornell is the great hope for free access to online U.S. law.
“RDF and ontologies are way sexier!”
An important advantage of RDF is that the value of a name/value pair (the object) can be a URI. That URI can then serve as the subject of other triples, so you can start linking triples together to gain new information. I’ll be posting something that mentions doing this in HTML head elements shortly.
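To make the linking idea concrete, here is a small sketch that models triples as plain Python tuples rather than using a real RDF library. Apart from the Dublin Core creator property, every URI below is an invented example.org address, and the helpers objects and follow are hypothetical names for this illustration.

```python
# Each triple is (subject, predicate, object).
DOC = "http://example.org/doc/r1"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
WORKS_FOR = "http://example.org/terms/worksFor"  # invented property
ORG_NAME = "http://example.org/terms/name"       # invented property

TRIPLES = [
    # The object here is a URI, so it can be the subject of the next triple.
    (DOC, DC_CREATOR, "http://example.org/people/jane"),
    ("http://example.org/people/jane", WORKS_FOR, "http://example.org/orgs/acme"),
    ("http://example.org/orgs/acme", ORG_NAME, "Acme Corp"),
]

# The same first statement with a plain string value, as in a simple meta tag.
LITERAL_TRIPLES = [(DOC, DC_CREATOR, "Jane Example")]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def follow(triples, start, path):
    """Follow a chain of predicates from start and return what is reached."""
    current = [start]
    for predicate in path:
        current = [o for node in current for o in objects(triples, node, predicate)]
    return current


path = [DC_CREATOR, WORKS_FOR, ORG_NAME]
linked = follow(TRIPLES, DOC, path)
dead_end = follow(LITERAL_TRIPLES, DOC, path)
print(linked)
print(dead_end)
```

With the URI-valued creator, the chain runs from the document to its creator's employer; with the plain string value, there is nothing further to follow, so the second result is empty.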
Ontologies are great, but you can get a lot done without them.
We publish meta tags on the article pages of IngentaConnect. Google Scholar uses them as a way to get better bibliographic metadata than it can automatically harvest from the full text.