Information about URI fragments, the portion of a URI that follows the '#' and that is used to navigate within a document, is scattered across various documents that I usually have to hunt down. Instead I'll link to them all here.
Definitions. Fragments are defined in the URI RFC, which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the MIME type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from MIME type is a problem: a single URI contains at most one fragment, but over HTTP a single URI can result in the same logical resource represented in different MIME types. So there's one fragment but multiple MIME types and therefore multiple interpretations of that one fragment. The URI RFC says that if an author has a single resource available in multiple MIME types, then the author should ensure that the various representations all resolve fragments to the same logical secondary resource. Depending on which MIME types you're dealing with, this is either not easy or not possible.
HTTP. When URIs are used in HTTP, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly, protocol elements like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition, which says that the fragment is used as a view of the original resource and is consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature, such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim, given that an author who wanted to selectively restrict access to portions of a document has other options, like breaking the restricted parts out of the single resource into separate resources.
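A minimal sketch of that client-side split, using Python's standard `urllib.parse.urldefrag`. The function name `split_for_request` and the example URI are mine, purely for illustration; the point is only that the part before the '#' is what a client would put on the wire, while the fragment stays behind for the client to resolve after the response arrives.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_for_request(uri: str) -> Tuple[str, str]:
    """Return (URI to send to the server, fragment kept on the client).

    Hypothetical helper: a real HTTP client does this internally.
    """
    without_fragment, fragment = urldefrag(uri)
    return without_fragment, fragment


if __name__ == "__main__":
    request_uri, client_side = split_for_request("http://example.com/doc.html#section-2")
    print("sent to server:", request_uri)
    print("kept on client:", client_side)
```

A URI with no '#' simply comes back with an empty fragment string.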
HTML. The HTML MIME type RFC defines HTML's fragment use: a fragment refers to the element with a corresponding 'id' attribute, or to one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use, additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to consist of only alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment, so no encoding is discussed since technically it's not needed. However, practically speaking, browsers like Firefox and Internet Explorer allow names and ids containing characters outside of the defined set, including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser), especially for the percent-encoded percent.
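Here is a rough sketch of both halves of that: a check for the restricted ID/NAME character set, and a look at why percent-encoding makes matching ambiguous. The regular expression reflects my reading of HTML 4 section 6 (a leading letter followed by the characters listed above), and `is_html4_id` and `fragment_candidates` are hypothetical names, not anything a browser exposes.

```python
import re
from typing import List
from urllib.parse import unquote

# Assumption: my reading of HTML 4's ID/NAME token rule, a letter followed
# by letters, digits, hyphens, underscores, colons, and periods.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")


def is_html4_id(value: str) -> bool:
    """True if value fits the restricted ID/NAME character set."""
    return HTML4_ID.fullmatch(value) is not None


def fragment_candidates(fragment: str) -> List[str]:
    """Return the strings a lenient matcher might compare against ids.

    The raw fragment comes first, then its percent-decoded form if that
    differs. Which one a given browser uses is exactly the inconsistency.
    """
    decoded = unquote(fragment)
    return [fragment] if decoded == fragment else [fragment, decoded]


if __name__ == "__main__":
    print(is_html4_id("section-2.1"))
    print(is_html4_id("my section"))
    print(fragment_candidates("a%2525b"))
```

For a fragment like `a%2525b` the raw form, the once-decoded form, and a twice-decoded form are all different strings, which is where the percent-encoded percent gets messy.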
Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably a good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.
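To give a feel for the line-based form, here is a minimal sketch that resolves only `line=` ranges. It assumes the fragment definition in question is RFC 5147 and follows my reading of it: positions sit between lines and count from zero, so `line=10,20` selects lines 11 through 20. The `select_lines` name is hypothetical, `char=` fragments and single positions are not handled, and any integrity-check suffix after ';' is ignored rather than verified.

```python
import re
from typing import List

# Simplified subset: "line=" followed by start,end where either may be empty.
LINE_RANGE = re.compile(r"line=(\d*),(\d*)")


def select_lines(text: str, fragment: str) -> List[str]:
    """Return the lines of text selected by a 'line=start,end' fragment."""
    scheme = fragment.split(";", 1)[0]  # drop any ';length=...' style suffix
    match = LINE_RANGE.fullmatch(scheme)
    if match is None or (match.group(1) == "" and match.group(2) == ""):
        raise ValueError("unsupported fragment: " + fragment)
    # Note: splitlines() accepts more line terminators than CR, LF, CRLF.
    lines = text.splitlines()
    start = int(match.group(1)) if match.group(1) else 0
    end = int(match.group(2)) if match.group(2) else len(lines)
    # Positions are boundaries between lines, which is what slicing expects.
    return lines[start:end]


if __name__ == "__main__":
    sample = "alpha\nbeta\ngamma\ndelta\n"
    print(select_lines(sample, "line=1,3"))
```

Because the positions are boundaries rather than line numbers, they map directly onto a Python slice with no off-by-one adjustment.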
XML. XML has the XPointer framework to define its fragment structure, as noted by the XML MIME type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. It's too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple MIME type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.
SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined, but I could only find it as an ISO document, "Text of ISO/IEC FCD 21000-17 MPEG-21 FID", and not as an RFC, which is a little disturbing.
AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec, but it does seem to be in line with the spirit of the fragment in that it is a subview of the original resource and is interpreted client side. The other, hackier use of the fragment in AJAX is for cross-domain communication. The basic idea is that frames or windows from different domains may not communicate in the normal fashions, but they can view each other's URLs, and accordingly each can change its own fragment in order to send a message out to those who know where to look. IMO this is not in line with the spirit of the fragment but is a rather cool hack.
More of my thoughts have been stolen: In my previous job the customer wanted a progress bar displayed while information was copied off of proprietary hardware, during which the software didn't get any indication of progress until the copy was finished. I joked (mostly) that we could display a progress bar that continuously slows down and never quite reaches the end until we know we're done getting info from the hardware. The amount of progress would be a function of time where as time approaches infinity, progress approaches a value of at most 100 percent.
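A minimal sketch of such a function, assuming an exponential curve that closes half of the remaining distance every `half_life_seconds`. The function name, the 10 second default, and the 99 percent cap are all made up for illustration. The cap is there because floating point eventually rounds the curve up to exactly 100, which the real-number version never reaches.

```python
def zeno_progress(elapsed_seconds: float,
                  half_life_seconds: float = 10.0,
                  cap_percent: float = 99.0) -> float:
    """Fake progress in percent as a function of elapsed time only.

    Every half_life_seconds the bar covers half of what remains, so it
    keeps slowing down and is held at cap_percent until the caller knows
    the real work has finished.
    """
    if elapsed_seconds <= 0:
        return 0.0
    remaining_fraction = 0.5 ** (elapsed_seconds / half_life_seconds)
    return min(cap_percent, 100.0 * (1.0 - remaining_fraction))


if __name__ == "__main__":
    for seconds in (0, 10, 20, 30, 60, 600):
        print(seconds, round(zeno_progress(seconds), 2))
```

When the copy actually completes, the caller would jump the bar straight to 100 rather than asking this function.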
This is similar to Zeno's Paradox which says you can't cross a room because to do so first you must cross half the room, then you must cross half the remaining distance, then half the remaining again, and so on which means you must take an infinite number of steps. There's also an old joke inspired by Zeno's Paradox. The joke is the prototypical engineering vs sciences joke and is moderately humorous, but I think the fact that Wolfram has an interactive applet demonstrating the joke is funnier than the joke itself.
I recently found Lou Franco's blog post "Using Zeno's Paradox For Progress Bars", which covers the same concept as Zeno's Progress Bar but with real code. Apparently Lou wasn't making a joke and actually used this progress bar in an application. A progress bar that doesn't accurately represent progress seems dishonest. In cases like the Vista Defrag, where the software can't make a reasonable guess about how long a process will take, the software shouldn't display a progress bar.
Similarly a paper by Chris Harrison "Rethinking the Progress Bar" suggests that if a progress bar speeds up towards the end the user will perceive the operation as taking less time. The paper is interesting, but as in the previous case, I'd rather have progress accurately represented even if it means the user doesn't perceive the operation as being as fast.
Update: I should be clearer about Lou's post. He was actually making a practical and implementable suggestion as to how to handle the case of displaying progress when you have some idea of how long it will take but no indications of progress, whereas my suggestion is impractical and more of a joke concerning displaying progress with no indication of progress nor a general idea of how long it will take.
I now have search and an archive available for my site. I previously tried to set up crappy search by cheating using Yahoo Pipes, and now instead I have a slightly less crappy search that works over all of the content that I've produced on my blog, uploaded to Flickr or YouTube, or added to Delicious.
You can now read my first LiveJournal blog post or, for probably much more entertainment value, view all the photos and videos of Cadbury by searching for 'bunny'.
The search is only slightly less lame because although it searches over all my content, I still implemented it myself rather than getting a professional package. Also, the feed supports the same search and archive as my homepage so you can subscribe to a feed of Cadbury if you're so inclined and just skip all this other boring stuff. My homepage and feed implement the OpenSearch response elements and I've got an OpenSearch search provider (source) as well.
With the new features of IE8 there are several easy ways to integrate Gmail, Google's web mail service, for mail composition, searching, and monitoring that I enjoy using.
I've switched from using my own home web server, one of whose hard drives died, to using NearlyFreeSpeech.NET, an actual real live web hosting service. So far I'm very happy with them, and they give me almost exactly what I had on my own home server: ssh access, vim, php, java, etc. The only notable things they don't do are (1) cron jobs, which I currently use, and (2) SSL, which I currently don't. I can replace my cron job usage, and I suppose I'll have to reevaluate my web hosting if I ever need SSL. At the moment many of the server-side things like Vizicious will be unavailable. I'll work on getting those working again at some point.
I got a FlickrMail from Emma J. Williams a bit ago saying that they wanted to use two of my photos in their Schmap San Francisco Guide online travel guide. So now you can see two of my vacation photos on the Westfield San Francisco Shopping Center Schmap page and the Hotel Diva Schmap page.
I think it's wonderful that digital cameras are at the point where I really don't have to know much about their workings to produce a photo that's reasonable looking. And it's thanks to Flickr and searchable tags that Schmap could find my photos. Since my photos on Flickr are all licensed under a Creative Commons license named Attribution-Noncommercial-No Derivative Works 2.0 Generic, which only applies to non-commercial uses, Schmap, which is advertisement supported, kindly asked me if they could use my photos. I agreed to their license, which was human readable and included wonderful stuff: I get in-place attribution, and the license is only applicable while Schmap makes their guide freely available online.
Previously I'd only heard of folks having their Flickr photos used without their permission, so I'm glad to know that's not always the case. Or perhaps this is just Schmap's clever method of getting me to blog about them.
I've set up a minimal search page that uses a Yahoo Pipe to sort of search through my content. I say sort of search because I only get full text search over my recent item feeds and otherwise I just search over my tags.
To get real search I'm going to have to keep an archive of all my content on my own website. This is a pain, but on the other hand it will let me easily back up my content or display old items on my page. Why didn't I just use a prebuilt solution?
I saw this odd looking cute cat and it reminded me of Thom Yorke. On a related note also see the myth buster lol-cat.
Also I think the whistling puppy (~0:05) and hungry lumas transforming on Super Mario Galaxy (~0:15) sound very similar.
More ideas stolen from me in the same vein as my stolen OpenID thoughts.
Fast Pedestrian Crossing on Four Way Stops. In college I didn't have a car, and every weekend I had poker with friends who lived nearby, so I would end up waiting to cross from one corner of a traffic-lit four-way stop to the opposite corner. Waiting there in the cold gave me plenty of time to consider the fastest method of getting to the opposite corner of a four-way stop. My plan was to hit the pedestrian crossing button for both directions and travel on the first one available. This only seems like a bad choice if the pedestrian crossing signal travels clockwise or counterclockwise around the four-way stop. In those two cases it's better to take the later of the two pedestrian signal crossings, but I have yet to see those two patterns on a real-life traffic stop. I decided recently to see if my plan was actually sound and looked up info on traffic signals. But the info didn't say much other than "it's complicated" and "it depends" (I'm paraphrasing). Then I found some guy's analysis of this problem. So I'm done with this, and I'll continue pressing both buttons and crossing on the first pedestrian signal. Incidentally, on one such night when I was waiting to cross this intersection, I heard a loud multi-click sound and realized that the woman in the SUV waiting to cross the intersection next to me had just locked her doors. I guess my thinking-about-crossing-the-street face is intimidating.
Windows Searching Windows Media Center Recorded TV's Closed Captions. An Ars Technica article on a fancy DVR described one of the DVR's features: full text search over the subtitles of the recorded TV shows. I thought implementing this for Windows Media Center recorded TV shows and Windows Search would be an interesting project for learning about video files and about extending Windows Search. As it turns out, though, some guy, Stephen Toub, implemented Windows Search over MCE closed captions already. Stephen Toub's article is very long and describes some other very interesting related projects, including 'summarizing video files', which you may want to read.
Two and a half weeks ago Sarah and I went to Las Vegas, where I got to see Jesse, Pat, Chris, and (briefly because he's some kind of big shot too busy for his friends now etc) Grib from college. They're mostly in San Jose and I hadn't seen them for a while, so it was a lot of fun to hang out. We all stayed at the MGM, which is a nice hotel with some good restaurants. In other Vegas related links, Sarah added Sarah's Las Vegas restaurant reviews to her reviews and Jesse has Jesse's Vegas photos up too.
Sarah and I saw the Blue Man Group (video from a concert) and the Price is Right Live Show. The Blue Man Group was very cool, although the music was all rock with a heavy drum focus (not depicted in the videos I linked), which I got a little tired of. But despite that I really enjoyed the show. It was very funny and I totally recommend it. The Price is Right Live Show is like the regular show on TV except the recording is not televised and it's not hosted by Bob Barker or Drew Carey. So folks from the audience are still called up to play the same games and really win prizes. It was advertised as hosted by Todd Newton, B-list game show host, but was instead hosted by JD Roberto, who hosted such things as "Reality Remix" and the show "Are You Hot? The Search for America's Sexiest People". The showcase showdown included the 2008 version of my car, and thankfully I wasn't picked to compete for that because, well, I don't know where they bought the car, but I would have gotten the price very wrong. We sat right next to the stage for that show and had a good time.
For New Year's Eve Sarah and I stayed in and watched the glitched Seattle Space Needle fireworks show from a safe distance. On New Year's Day we went to a pot-luck at Todd's house and had a fun time. Todd's place is on the top of a hill and has a lovely view of Washington's snow-capped mountains.