tax page 2 - Dave's Blog

HTML vs. XHTML - WHATWG Wiki

2009 Sep 10, 6:42"Although HTML and XHTML appear to have similarities in their syntax, they are significantly different in many ways."

Scalable Vector Graphics (SVG) 1.1 (Second Edition)

2009 Aug 24, 4:57"This specification defines the features and syntax for Scalable Vector Graphics (SVG) Version 1.1, a modularized language for describing two-dimensional vector and mixed vector/raster graphics in XML."

svg graphic web xml reference w3c technical

Eat Pants - Interactive Fiction Sessions from my Server Logs

2009 Jun 29, 4:19

I've looked at my web server logs previously to see if anyone had used my Web Frotz Interpreter and until recently didn't realize that awstats (the web server log report generator) was truncating the query from my URL, so I couldn't tell that anyone was actually using it. But after grepping the logs manually I've pulled out the URLs of visitor's text adventure sessions. If you'll recall, my Web Frotz Interpreter stores the game state in the URL so its easy to see user's game states in the web server logs.

I've put some of the links up on the Web Frotz Interpreter page. Some of the interesting ones:

Apparently you can eat pants in Lost Pig.
Someone teaches me that in Zork the egg goes in the display case.
There's a creepy exchange in Galatea.

server-logs technical zork frotz pants interactive-fiction uri if

Linking to or Embedding a Portion of a Video

2009 Jun 19, 10:12

I'm excited by HTML5's video tag as are plenty of other people. Once that comes about and once media fragments are adopted, linking to or embedding a portion of a video will be as easy as using the correct fragment on your URL thanks to the Media Fragments WG who has been hard at work since the last time I looked at fragments.

However, until that work is embraced by browsers, embedding portions of videos will continue to require work specific to the site from which you are embedding the video. On the YouTube blog they wrote about how to "link to the best parts in your videos", using a fragment syntax like '#t=1m15s' to start playback of the associated video at 1 minute and 15 seconds. Of course if you want to embed part of a Hulu video it will be different. Although I haven't found an authoritative source describing the URL syntax to use, you can follow Hulu's video guide on linking to part of a video and note how the URL changes as you adjust the slider on the time-line. It looks like their syntax for linking to a Hulu page is to add '?c=[start time in seconds](:[end time in seconds])' with the colon and end time optional in order to link to a portion of a video. And the syntax for embedding appears to be "http://www.hulu.com/embed/.../[start time in seconds](/[end time in seconds])" again with the end time optional.

For more sites, check out the Media Fragments WG's list of existing applications' proprietary fragmenting schemes.

hulu technical media fragment wg url youtube video html5 uri fragment

Thoughts on registerProtocolHandler in HTML 5

2009 Apr 7, 9:02

I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:

Better integration of web apps with your system.
Its easy for web apps to do.
Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.

However, the way its currently spec'ed out I don't like the following:

There's no way to know if you are the handler for a particular URL scheme which is an important question for web app URL protocol handler authors.
There's no way to fallback to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and backwards compatible fallback mechanism, although I don't know what that would look like.
There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register info:lccn/... (Library of Congress Card Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.

Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in an URL template, like the URL template used in OpenSearch. If an URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example something along the lines of: registerProtocolHandler("info:lccn/{lccnID}", "htttp://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")

url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn

Windows PowerShell Core

2008 Nov 19, 3:34PowerShell function syntax and help

powershell reference microsoft msdn function shell

Investigation of a Few Application Protocols (Updated)

2008 Oct 25, 6:51

Windows allows for application protocols in which, through the registry, you specify a URL scheme and a command line to have that URL passed to your application. Its an easy way to hook a webbrowser up to your application. Anyone can read the doc above and then walk through the registry and pick out the application protocols but just from that info you can't tell what the application expects these URLs to look like. I did a bit of research on some of the application protocols I've seen which is listed below. Good places to look for information on URI schemes: Wikipedia URI scheme, and ESW Wiki UriSchemes.

Some Application Protocols and associated documentation.
Scheme	Name	Notes
search-ms	Windows Search Protocol	The search-ms application protocol is a convention for querying the Windows Search index. The protocol enables applications, like Microsoft Windows Explorer, to query the index with parameter-value arguments, including property arguments, previously saved searches, Advanced Query Syntax, Natural Query Syntax, and language code identifiers (LCIDs) for both the Indexer and the query itself. See the MSDN docs for search-ms for more info. Example: search-ms:query=food
Explorer.AssocProtocol.search-ms	Windows Search Protocol
OneNote	OneNote Protocol	From the OneNote help: `/hyperlink "pagetarget"` - Starts OneNote and opens the page specified by the pagetarget parameter. To obtain the hyperlink for any page in a OneNote notebook, right-click its page tab and then click Copy Hyperlink to this Page. Example: onenote:///\\GUMMO\Users\davris\Documents\OneNote%20Notebooks\OneNote%202007%20Guide\Getting%20Started%20with%20OneNote.one#section-id={692F45F5-A42A-415B-8C0D-39A10E88A30F}&end
callto	Callto Protocol	ESW Wiki Info on callto Skype callto info NetMeeting callto info Example: callto://+12125551234
itpc	iTunes Podcast	Tells iTunes to subscribe to an indicated podcast. iTunes documentation. C:\Program Files\iTunes\iTunes.exe /url "%1" Example: itpc:http://www.npr.org/rss/podcast.php?id=35
iTunes.AssocProtocol.itpc
pcast
iTunes.AssocProtocol.pcast
Magnet	Magnet URI	Magnet URL scheme described by Wikipedia. Magnet URLs identify a resource by a hash of that resource so that when used in P2P scenarios no central authority is necessary to create URIs for a resource.
mailto	Mail Protocol	RFC 2368 - Mailto URL Scheme. Mailto Syntax Opens mail programs with new message with some parameters filled in, such as the to, from, subject, and body. Example: mailto:?to=david.risney@gmail.com&subject=test&body=Test of mailto syntax
WindowsMail.Url.Mailto	Mail Protocol
MMS	mms Protocol	MSDN describes associated protocols. Wikipedia describes MMS. "C:\Program Files\Windows Media Player\wmplayer.exe" "%L" Also appears to be related to MMS cellphone messages: MMS IETF Draft.
WMP11.AssocProtocol.MMS	mms Protocol
secondlife	[SecondLife]	Opens SecondLife to the specified location, user, etc. SecondLife Wiki description of the URL scheme. "C:\Program Files\SecondLife\SecondLife.exe" -set SystemLanguage en-us -url "%1" Example: secondlife://ahern/128/128/128
skype	Skype Protocol	Open Skype to call a user or phone number. Skype's documentation Wikipedia summary of skype URL scheme "C:\Program Files\Skype\Phone\Skype.exe" "/uri:%l" Example: skype:+14035551111?call
skype-plugin	Skype Plugin Protocol Handler	Something to do with adding plugins to skype? Maybe. "C:\Program Files\Skype\Plugin Manager\skypePM.exe" "/uri:%1"
svn	SVN Protocol	Opens TortoiseSVN to browse the repository URL specified in the URL. C:\Program Files\TortoiseSVN\bin\TortoiseProc.exe /command:repobrowser /path:"%1"
svn+ssh
tsvn
webcal	Webcal Protocol	Wikipedia describes webcal URL scheme. Webcal URL scheme description. A URL that starts with webcal:// points to an Internet location that contains a calendar in iCalendar format. "C:\Program Files\Windows Calendar\wincal.exe" /webcal "%1" Example: webcal://www.lightstalkers.org/LS.ics
WindowsCalendar.UrlWebcal.1	Webcal Protocol
zune	Zune Protocol	Provides access to some Zune operations such as podcast subscription (via Zune Insider). "c:\Program Files\Zune\Zune.exe" -link:"%1" Example: zune://subscribe/?name=http://feeds.feedburner.com/wallstrip.
feed	Outlook Add RSS Feed	Identify a resource that is a feed such as Atom or RSS. Implemented by Outlook to add the indicated feed to Outlook. Feed URI scheme pre-draft document "C:\PROGRA~2\MICROS~1\Office12\OUTLOOK.EXE" /share "%1"
im	IM Protocol	RFC 3860 IM URI scheme description Like mailto but for instant messaging clients. Registered by Office Communicator but I was unable to get it to work as described in RFC 3860. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1"
tel	Tel Protocol	RFC 5341 - tel URI scheme IANA assignment RFC 3966 - tel URI scheme description Call phone numbers via the tel URI scheme. Implemented by Office Communicator. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1"

(Updated 2008-10-27: Added feed, im, and tel from Office Communicator)

technical application protocol shell url windows

QuickBase Formula Pretty Printer and Syntax Highlighter

2008 Oct 5, 9:17

Sarah asked me if I knew of a syntax highlighter for the QuickBase formula language which she uses at work. I couldn't find one but thought it might be fun to make a QuickBase Formula syntax highlighter based on the QuickBase help's description of the formula syntax. Thankfully the language is relatively simple since my skills with ANTLR, the parser generator, are rusty now and I've only used it previously for personal projects (like Javaish, the ridiculous Java based shell idea I had).

With the help of some great ANTLR examples and an ANTLR cheat sheet I was able to come up with the grammar that parses the QuickBase Formula syntax and prints out the same formula marked up with HTML SPAN tags and various CSS classes. ANTLR produces the parser in Java which I wrapped up in an applet, put in a jar, and embedded in an HTML page. The script in that page runs user input through the applet's parser and sticks the output at the bottom of the page with appropriate CSS rules to highlight and print the formula in a pretty fashion.

What I learned:

I didn't realize that Java applets are easy to use via script in an HTML page. In the JavaScript I can simply refer to publicly exposed methods on the applet and run JavaScript strings through them. It makes for a great combination: do the heavy coding in Java and do the UI in HTML. I may end up doing this again in the future.
I love ANTLRWorks, the ANTLR IDE, that didn't exist the last time I used ANTLR. It tells you about issues with your grammar as you create it, lets you easily debug the grammar running it forwards and backwards, display parse trees, and other useful things.

java technical programming quickbase language antlr antlrworks

ANTLRWorks: The ANTLR GUI Development Environment

2008 Oct 2, 9:37Cool graphical ANTLR IDE! They didn't have this the last time I used ANTLR. "ANTLRWorks is a novel grammar development environment for ANTLR v3 grammars written by Jean Bovet (with suggested use cases from Terence Parr). It combines an excellent grammar-aware editor with an interpreter for rapid prototyping and a language-agnostic debugger for isolating grammar errors. ANTLRWorks helps eliminate grammar nondeterminisms, one of the most difficult problems for beginners and experts alike, by highlighting nondeterministic paths in the syntax diagram associated with a grammar."

antlr ide graph grammar tool free download development opensource java

ANTLR Cheat Sheet - ANTLR 3 - ANTLR Project

2008 Oct 2, 9:26Cheat sheet on ANTLR's syntax. ANTLR's another language parser generator.

antlr cheat parser language grammar opensource java software syntax quickreference

QuickBase Help - Formulas in QuickBase

2008 Oct 2, 9:24Sarah uses QuickBase formulas at work and this is the language's description. Looking at making a syntax highlighter.

quickbase language reference help

Tag Metadata in Feeds

2008 Aug 25, 10:13

As noted previously, my page consists of the aggregation of my various feeds and in working on that code recently it was again brought to my attention that everyone has different ways of representing tag metadata in feeds. I made up a list of how my various feed sources represent tags and list that data here so that it might help others in the future.

Tag markup from various sources
Source	Feed Type	Tag Markup Scheme	One Tag Per Element	Tag Scheme URI	Human / Machine Names	Example Markup
LiveJournal	Atom	atom:category	yes	no	no	, (source)
LiveJournal	RSS 2.0	rss2:category	yes	no	no	`technical` (soure)
WordPress	RSS 2.0	rss2:category	yes	no	no	, (source)
Delicious	RSS 1.0	dc:subject	no	no	no	`photosynth photos 3d tool` (source)
Delicious	RSS 2.0	rss2:category	yes	yes	no	`domain="http://delicious.com/SequelGuy/"> hulu` (source)
Flickr	Atom	atom:category	yes	yes	no	`term="seattle" scheme="http://www.flickr.com/photos/tags/" />` (source)
Flickr	RSS 2.0	media:category	no	yes	no	`scheme="urn:flickr:tags"> seattle washington baseball mariners` (source)
YouTube	RSS 2.0	media:category	no	no	no	`label="Tags"> bunny rabbit yawn cadbury` (source)
LibraryThing	RSS 2.0	No explicit tag metadata.	no	no	no	n/a, (source)

Tag markup scheme
Tag Markup Scheme	Notes	Example
Atom Category atom:category `xmlns:atom="http://www.w3.org/2005/Atom"`	category/@term Required category name. category/@scheme Optional IRI id'ing the categorization scheme. category/@label Optional human readable category name.	`term="catName" scheme="tag:deletethis.net,2008:tagscheme" label="category name in human readable format"/>`
RSS 2.0 category rss2:category empty namespace	category/@domain Optional string id'ing the categorization scheme. category/text() Required category name. The value of the element is a forward-slash-separated string that identifies a hierarchic location in the indicated taxonomy. Processors may establish conventions for the interpretation of categories.	`domain="tag:deletethis.net,2008:tagscheme"> MSFT`
Yahoo Media RSS Module category media:category `xmlns:media="http://search.yahoo.com/mrss/"`	category/text() Required category name. category/@domain Optional string id'ing the categorization scheme.	`scheme="http://dmoz.org" label="Ace Ventura - Pet Detective"> Arts/Movies/Titles/A/Ace_Ventura_Series/Ace_Ventura_-_Pet_Detective`
Dublin Core subject dc:subject `xmlns:dc="http://purl.org/dc/elements/1.1/"`	subject/text() Required category name. Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary.	`humor`

Update 2009-9-14: Added WordPress to the Tag Markup table and namespaces to the Tag Markup Scheme table.

feed media delicious technical atom youtube yahoo rss tag

URI Fragment Info Roundup

2008 Apr 21, 11:53

Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.

HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.

xml text ajax technical url boring uri fragment rfc

Fragment Identification of MPEG Resources (Text of ISO/IEC FCD 21000-17 MPEG-21 FID)

2008 Apr 16, 7:09Standard describing URI fragments identifying parts of MPEG videos. Very similar syntax to XML fragments. Having trouble finding this document as anything other than a Word doc. Looks to exist only as an ISO standard.

standard fragment uri video mpeg reference iso

IPv6 Roundup: Address Syntax on Windows

2008 Jan 9, 11:34

IPv6 address syntax consists of 8 groupings of colon delimited 16-bit hex values making up the 128-bit address. An optional double colon can replace any consecutive sequence of 0 valued hex values. For example the following is a valid IPv6 address: fe80::2c02:db79

Some IPv6 addresses aren't global and in those cases need a scope ID to describe their context. These get a '%' followed by the scope ID. For example the previous example with a scope ID of '8' would be: fe80::2c02:db79%8

IPv6 addresses in URIs may appear in the host section of a URI as long as they're enclosed by square brackets. For example: http://[fe80::2c02:db79]/. The RFC explicitly notes that there isn't a way to add a scope ID to the IPv6 address in a URI. However a draft document describes adding scope IDs to IPv6 addresses in URIs. The draft document uses the IPvFuture production from the URI RFC with a 'v1' to add a new hostname syntax and a '+' instead of a '%' for delimiting the scope id. For example: http://[v1.fe80::2c02:db79+8]/. However, this is still a draft document, not a final standard, and I don't know of any system that works this way.

In Windows XPSP2 the IPv6 stack is available but disabled by default. To enable the IPv6 stack, at a command prompt run 'netsh interface ipv6 install'. In Vista IPv6 is the on by default and cannot be turned off, while the IPv4 stack is optional and may be turned off by a command similar to the previous.

Once you have IPv6 on in your OS you can turn on IPv6 for IIS6 or just use IIS7. The address ::1 refers to the local machine.

In some places in Windows like UNC paths, IPv6 addresses aren't allowed. In those cases you can use a Vista DNS IPv6 hack that lives in the OS name resolution stack that transforms particularly crafted names into IPv6 addresses. Take your IPv6 address, replace the ':'s with '-'s and the '%' with an 's' and then append '.ipv6-literal.net' to the end. For example: fe80--2c02-db79s8.ipv6-literal.net. That name will resolve to the same example I've been using in Vista. This transformation occurs inside the system's local name resolution stack so no DNS servers are involved, although Microsoft does own the ipv6-literal.net domain name.

MSDN describes IPv6 addresses in URIs in Windows and I've described IPv6 addresses in URIs in IE7. File URIs in IE7 don't support IPv6 addresses. If you want to put a scope ID in a URI in IE7 you use a '%25' to delimit the scope ID and due to a bug you must have at least two digits in your scope ID. So, to take the previous example: http://[fe80::2c02:db79%2508]/. Note that its 08 rather than just 8.

roundup ip windows ipv6 technical microsoft boring syntax