metadata - Dave's Blog
Blog Entries
2011 Apr 18, 4:27
"SRU is a standard XML-focused search protocol for Internet search queries, utilizing CQL (Contextual Query Language), a standard syntax for representing queries."
standards search library metadata xml uri technical library-of-congress
2011 Apr 17, 12:51
"Web-based protocols often require the discovery of host policy or metadata, where "host" is not a single resource but the entity controlling the collection of resources identified by Uniform
Resource Identifiers (URI) with a common URI host [RFC3986]."
host rfc reference metadata technical
2010 Aug 17, 3:05
I've just got a new media center PC connected directly to my television with lots of hard drive space, so I'm ripping a bunch of my DVDs to the PC so I don't have to fuss with the physical media. I'm
ripping with DVD Rip, viewing the results in Windows 7's Windows Media Center after turning on the WMC DVD Library, and using a PowerShell script I wrote to copy over cover art and metadata.
My PowerShell script follows. To use it you must do the following:
- Run Windows Media Center with the DVD in the drive and view the disc's metadata info.
- Rip each DVD to its own subdirectory of a common directory.
- The name of the subdirectory to which the DVD is ripped must match the DVD name in the metadata, except for characters that aren't allowed in Windows paths (e.g. <, >, ?, *, etc.).
- Run the script and pass the path to the common directory containing the DVD rips as the first parameter.
Running WMC and viewing the DVD's metadata forces WMC to copy the metadata off the Internet and cache it locally. After playing with Fiddler and reading this
blog post on WMC metadata I made the following script that copies metadata and cover art from the WMC cache to the corresponding
DVD rip directory.
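The naming rule in the steps above (a rip directory named after the DVD title, minus characters Windows doesn't allow in paths) can be sketched in Python. This is only an illustration, not the PowerShell script itself: the function names are hypothetical, and it assumes disallowed characters are simply dropped, which the post doesn't specify.

```python
import re
from pathlib import Path
from typing import Optional

# Characters Windows does not allow in file or directory names.
_INVALID_WINDOWS_CHARS = re.compile(r'[<>:"/\\|?*]')


def title_to_dirname(title: str) -> str:
    """Map a DVD title to a directory name (assumption: invalid characters are dropped)."""
    cleaned = _INVALID_WINDOWS_CHARS.sub("", title)
    # Windows also rejects names that end in a space or a dot.
    return cleaned.rstrip(" .")


def find_rip_dir(common_dir, title: str) -> Optional[Path]:
    """Return the subdirectory of common_dir whose name matches the title, if any."""
    wanted = title_to_dirname(title).lower()
    for child in Path(common_dir).iterdir():
        # Windows paths are case-insensitive, so compare in lower case.
        if child.is_dir() and child.name.lower() == wanted:
            return child
    return None
```

With this rule, a title such as "Face/Off" would be looked up in a directory named "FaceOff" under the common rip directory.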
Download copydvdinfo.ps1
powershell wmc technical tv dvd windows-media-center
2010 May 24, 6:29
Installable web apps make total sense given the Google Chrome OS: "An installable web app is a normal web site with a bit of extra metadata. You build and deploy this app exactly as you would build
and deploy any web app, using any server-side or client-side technologies you like. The only thing that is different about an installable web app is how the app is packaged."
technical web browser webapp google chrome
2010 Mar 13, 5:27
WebFinger is finger but for the Web...
webfinger web google finger http metadata url technical
2010 Mar 12, 2:32
Google's indexer now examines HTML5 microdata, and they provide a tool to test out your pages' microdata.
html html5 google search microformats metadata rdf technical
2010 Jan 8, 2:08
A Flickr dev talks about image metadata: the various forms, which to prefer, and how to guess at their character encodings.
unicode charset flickr photo image exif programming reference xmp technical
2009 Dec 1, 9:40
Wow: 'The fact that federal, state, and local law enforcement can obtain communications "metadata"—URLs of sites visited, e-mail message headers, numbers dialed, GPS locations, etc.—without any real
oversight or reporting requirements should be shocking, but it isn't. The courts ruled in 2005 that law enforcement doesn't need to show probable cause to obtain your physical location via the cell
phone grid. All of the aforementioned metadata can be accessed with an easy-to-obtain pen register/trap & trace order. But given the volume of requests, it's hard to imagine that the courts are
involved in all of these.'
privacy security gps phone cellphone government politics
2009 Sep 10, 8:22
Geoff Nunberg investigates issues in Google Books, and the Google Books team's manager responds in the comments. Apparently bad metadata is everywhere and is not an issue new to the Web,
user-generated content, or tagging. It's like finding Feynman lectures categorized as Death Metal on Napster back in the day.
language google library metadata catalog
2009 Sep 1, 4:25
"Each unit has a stable URI, making it possible to link to it from your own domain models in a reliable way. For each unit, the ontology defines some useful metadata including abbreviation, a link to
DBpedia and a categorization of units into groups, such as length units."
semanticweb via:connolly web unit conversion uri technical
2009 Apr 7, 9:02
I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in Firefox 3, but not quite the implementation. At a high level, it allows web apps to register themselves as
handlers of a URL scheme, so, for (the canonical) example, Gmail can register for the mailto URL scheme. I like the concept:
- Better integration of web apps with your system.
- It's easy for web apps to do.
- Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http
link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that
has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.
However, I don't like the following about the way it's currently spec'ed out:
- There's no way to know if you are the handler for a particular URL scheme which is an important question for web app URL protocol handler authors.
- There's no way to fall back to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method
to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and
backwards compatible fallback mechanism, although I don't know what that would look like.
- There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register
info:lccn/... (Library of Congress Control Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.
- Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
- And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in an URL template, like the URL template used
in OpenSearch. If an URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example
something along the lines of:
registerProtocolHandler("info:lccn/{lccnID}", "http://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
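To make the URL template idea concrete, here is a minimal Python sketch of how such a matcher might behave. It is not how any browser implements registerProtocolHandler. The function names are hypothetical, and it assumes that a {name} placeholder matches any non-empty run of characters and that captured values are percent-encoded when substituted into the handler URL. The LCCN value in the usage note is an arbitrary example.

```python
import re
from urllib.parse import quote

# Matches {name} placeholders in a template.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def template_to_regex(template):
    """Compile a URL template into a regex with one named group per placeholder."""
    parts = []
    last = 0
    for m in _PLACEHOLDER.finditer(template):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(template[last:m.start()]))
        parts.append("(?P<%s>.+)" % m.group(1))
        last = m.end()
    parts.append(re.escape(template[last:]))
    return re.compile("^" + "".join(parts) + "$")


def resolve(uri, match_template, handler_template):
    """Return the handler URL for uri, or None if uri does not match the template."""
    m = template_to_regex(match_template).match(uri)
    if m is None:
        return None
    values = m.groupdict()
    # Fill each placeholder in the handler template with the encoded captured value.
    return _PLACEHOLDER.sub(
        lambda p: quote(values[p.group(1)], safe=""), handler_template
    )
```

For instance, resolving info:lccn/2002022641 against the registration above would yield the LibraryThing search URL with q=2002022641, while a mailto: URI would not match and would return None.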
url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn
2009 Feb 23, 10:31
"This is an experimental service that makes the Library of Congress Subject Headings available as linked-data using the SKOS vocabulary. The goal of lcsh.info is to encourage experimentation and use
of LCSH on the web with the hopes of informing a similar effort at the Library of Congress to make a continually updated version available. More information about the Linked Data effort can be found
on the W3C Wiki."
library-of-congress loc semanticweb web rdf metadata library api
2008 Aug 25, 10:13
As noted previously, my page consists of the aggregation of my various feeds. While working on that code recently, it was again brought to my attention that everyone has different ways of
representing tag metadata in feeds. I made up a list of how my various feed sources represent tags, and I list that data here so that it might help others in the future.
Tag markup from various sources
| Source | Feed Type | Tag Markup Scheme | One Tag Per Element | Tag Scheme URI | Human / Machine Names | Example Markup |
| LiveJournal | Atom | atom:category | yes | no | no | (source) |
| LiveJournal | RSS 2.0 | rss2:category | yes | no | no | <category>technical</category> (source) |
| WordPress | RSS 2.0 | rss2:category | yes | no | no | (source) |
| Delicious | RSS 1.0 | dc:subject | no | no | no | <dc:subject>photosynth photos 3d tool</dc:subject> (source) |
| Delicious | RSS 2.0 | rss2:category | yes | yes | no | <category domain="http://delicious.com/SequelGuy/">hulu</category> (source) |
| Flickr | Atom | atom:category | yes | yes | no | <category term="seattle" scheme="http://www.flickr.com/photos/tags/" /> (source) |
| Flickr | RSS 2.0 | media:category | no | yes | no | <media:category scheme="urn:flickr:tags">seattle washington baseball mariners</media:category> (source) |
| YouTube | RSS 2.0 | media:category | no | no | no | <media:category label="Tags">bunny rabbit yawn cadbury</media:category> (source) |
| LibraryThing | RSS 2.0 | No explicit tag metadata. | no | no | no | n/a, (source) |
Tag markup scheme
| Tag Markup Scheme | Notes | Example |
| Atom Category; atom:category; xmlns:atom="http://www.w3.org/2005/Atom" | category/@term: Required category name. category/@scheme: Optional IRI id'ing the categorization scheme. category/@label: Optional human readable category name. | <category term="catName" scheme="tag:deletethis.net,2008:tagscheme" label="category name in human readable format"/> |
| RSS 2.0 category; rss2:category; empty namespace | category/@domain: Optional string id'ing the categorization scheme. category/text(): Required category name. The value of the element is a forward-slash-separated string that identifies a hierarchic location in the indicated taxonomy. Processors may establish conventions for the interpretation of categories. | <category domain="tag:deletethis.net,2008:tagscheme">MSFT</category> |
| Yahoo Media RSS Module category; media:category; xmlns:media="http://search.yahoo.com/mrss/" | category/text(): Required category name. category/@scheme: Optional URI id'ing the categorization scheme. category/@label: Optional human readable category name. | <media:category scheme="http://dmoz.org" label="Ace Ventura - Pet Detective">Arts/Movies/Titles/A/Ace_Ventura_Series/Ace_Ventura_-_Pet_Detective</media:category> |
| Dublin Core subject; dc:subject; xmlns:dc="http://purl.org/dc/elements/1.1/" | subject/text(): Required category name. Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. | <dc:subject>humor</dc:subject> |
Update 2009-9-14: Added WordPress to the Tag Markup table and namespaces to the Tag Markup Scheme table.
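As a rough illustration of what the tables imply for an aggregator, here is a short Python sketch that pulls tag names out of one feed entry using the four markup schemes. It is not the code behind this page. The function name is hypothetical, and splitting media:category and dc:subject text on whitespace is an assumption based on the Flickr, YouTube, and Delicious rows above, not something those specs require.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes in ElementTree's {uri}name notation.
ATOM = "{http://www.w3.org/2005/Atom}"
MEDIA = "{http://search.yahoo.com/mrss/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def extract_tags(item):
    """Return the tag names found directly under one feed entry or item element."""
    tags = []
    # Atom: one tag per element, with the name in the term attribute.
    for el in item.findall(ATOM + "category"):
        term = el.get("term")
        if term:
            tags.append(term)
    # RSS 2.0: one tag per element, with the name in the element text (no namespace).
    for el in item.findall("category"):
        if el.text and el.text.strip():
            tags.append(el.text.strip())
    # Media RSS and Dublin Core: assume several space-separated tags in one element.
    for name in (MEDIA + "category", DC + "subject"):
        for el in item.findall(name):
            tags.extend((el.text or "").split())
    return tags
```

Passing an RSS item parsed with ET.fromstring to extract_tags returns a flat list of tag strings. Scheme URIs and human readable labels are ignored in this sketch.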
feed media delicious technical atom youtube yahoo rss tag
2008 Jan 29, 7:28
A standard URI scheme for describing books.
metadata microformats openurl coins uri
2008 Jan 2, 2:13
FTA: "Seems that a number of villages in the English countryside are being overrun by errant trans-European trucks which are regularly misdirected by their GPS satnav systems onto roads that were
better suited for horse-drawn carriages than big, long-dist
gps humor article metadata blog england
2007 Apr 2, 11:50
Google Base lets you add items to Google in a database-like fashion. You add items of a particular type, where you define the type as a set of properties.
google base metadata database
2007 Apr 2, 11:48
Thinglink lets you create data on their website (photo and description) for objects and gives your object an identifier. The objects on the site are mostly physical objects but that doesn't seem to
be a requirement.
blog tagging social information metadata thinglink
2007 Feb 13, 9:51
A blog written by a librarian talking about ontology, blogging, tagging, and any other Web2.0 nonsense they like.
blog monthly folksonomy information library metadata ontology tag tagging web