Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.
Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.
HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.
HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.
Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.
XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.
SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.
AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.
Two weekends ago it was actually sunny and kind of warm so Sarah and I went down to Spud Fish and Chips and Juanita Beach Park. We ate fish and chips on the dock. I took a few pictures and this time actually put some geographical information on Flickr so now I've got a map of my tiny fish and chips journey. On the map click on the floating marks to view the associated photos.
Flickr provides access to the geo data associated with your photos via GeoRSS feeds. And Google Maps displays GeoRSS feed content on their maps allowing you even to edit the data but doesn't appear to let you easily export the GeoRSS. Live Maps does the inverse, allowing you to create and export GeoRSS data but not import it. I'd like both please. Oh well.
At the grocery store the other day Sarah and I attempted to find shallot for a recipe, but I can't tell the difference between shallot, sweet onions, yellow onions, etc. etc. We found something that we decided was the closest we'd find in the store and I believe we picked correctly because at checkout the cashier rang it up as shallot.
I think this could be a practical problem that the 20q Pocket Mind Reader should be able to solve: obtain the name of an unidentified object. When we got home I decided to test the 20q Pocket Mind Reader on shallot. Unfortunately, it told me I had an onion, but I think if these were designed for identifying unknown objects based solely on information you can obtain by looking at it, rather than requiring knowledge of seeds, where it grows, etc. it would do better. Or I could just ask someone who works at the grocery store.
I signed up for the pre-release beta and purchased a Chumby last year. Chumby looks like a cousin to a GPS unit. Its similar in size with a touch screen, but has WiFi, accelerometers, and is pillow like on the sides that aren't a screen. In practice its like an Internet alarm clock that shows you photos and videos off the Web. Its hackable in that Chumby Industries tells you about the various ways to run your own stuff on the Chumby, modifying the boot sequence (it runs Linux), turning on sshd, etc, etc. The Chumby forum too has lots of info from folks who have found interesting hacks for the device.
When you turn on the Chumby it downloads and runs the latest version of the Chumby software which lets you set alarms, play music, and display Flash widgets. The Chumby website lets anyone upload their own Flash widgets to share with the community. I tried my hand at creating one using Adobe's free Flash creation SDK but I don't know Flash and didn't have the patience to learn.
Currently my Chumby is set to wake me up at 8am on weekdays with music from ShoutCast and then displays traffic and weather. At 10am everyday it switches to showing me a slide-show of LolCats. At 11pm it switches to night mode where it displays the time in dark grey text on a black background at a reduced light level so as not to disturb me while I sleep.
I like the Chumby but I have two complaints. The first is that it forces me to learn flash in order to create anything cool rather than having a built-in Web browser or depending on a more Web friendly technology. The second complaint is about its name. At first I thought the name was stupid in a kind of silly way, but now that I'm used to the name it sounds vaguely dirty.
More ideas stolen from me in the same vein as my stolen OpenID thoughts.
Fast Pedestrian Crossing on Four Way Stops. In college I didn't have a car and every weekend I had weekly poker with friends who lived nearby so I would end up waiting to cross from one corner of a traffic lit four way stop to the opposite corner. Waiting there in the cold gave me plenty of time to consider the fastest method of getting to the opposite corner of a four-way stop. My plan was to hit the pedestrian crossing button for both directions and travel on the first one available. This only seems like a bad choice if the pedestrian crossing signal travels clockwise or counter clockwise around the four way stop. In those two cases its better to take the later of the two pedestrian signal crossings, but I have yet to see those two patterns on a real life traffic stop. I decided recently to see if my plan was actually sound and looked up info on traffic signals. But the info didn't say much other than "its complicated" and "it depends" (I'm paraphrasing). Then I found some guy's analysis of this problem. So I'm done with this and I'll continue pressing both buttons and crossing on the first pedestrian signal. Incidentally on one such night when I was waiting to cross this intersection I heard a loud multi-click sound and realized that the woman in the SUV waiting to cross the intersection next to me had just locked her doors. I guess my thinking-about-crossing-the-street face is intimidating.
Windows Searching Windows Media Center Recorded TV's Closed Captions. An Ars-Technica article on a fancy DVR described one of the DVRs features: full text search over the subtitles of the recorded TV shows. I thought implementing this for Windows Media Center recorded TV shows and Windows Search would be an interesting project to learn about video files, and extending Windows Search. As it turns out though some guy, Stephen Toub implemented Windows Search over MCE closed captions already. Stephen Toub's article is very long and describes some other very interesting related projects including 'summarizing video files' which you may want to read.