rfc page 3 - Dave's Blog

RFC 2132 - DHCP Options and BOOTP Vendor Extensions

2009 Dec 12, 2:42

"The Dynamic Host Configuration Protocol (DHCP) [1] provides a framework for passing configuration information to hosts on a TCP/IP network. Configuration parameters and other control information are carried in tagged data items that are stored in the 'options' field of the DHCP message. The data items themselves are also called 'options.'"
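
As a rough illustration of what "tagged data items" means, here is a minimal Python sketch that walks an options field as tag/length/value triples. It assumes the four-byte magic cookie has already been stripped, and the function name `parse_dhcp_options` is mine, not anything from the RFC.

```python
def parse_dhcp_options(data: bytes) -> list[tuple[int, bytes]]:
    """Split a DHCP 'options' field into (tag, value) pairs."""
    options = []
    i = 0
    while i < len(data):
        tag = data[i]
        if tag == 0:        # Pad option: a single byte with no length
            i += 1
            continue
        if tag == 255:      # End option: nothing after it is parsed
            break
        if i + 1 >= len(data):
            raise ValueError("truncated option header")
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options.append((tag, value))
        i += 2 + length
    return options


# Subnet mask (tag 1), DHCP message type (tag 53), then the End option.
sample = bytes([1, 4, 255, 255, 255, 0, 53, 1, 1, 255])
parsed = parse_dhcp_options(sample)
```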

RFC battle: Browsers vs. programming languages - cat /dev/random | grep security -

2009 Nov 25, 7:09

Relative URI resolution differences in browsers vs. programming language libraries.
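
To get a feel for the kind of comparison involved, here is a small sketch that resolves a few relative references with Python's `urllib.parse.urljoin`. The base URI is the one used in the RFC 3986 resolution examples; the helper name `resolve_all` is just for illustration, and the linked article's own test cases may differ.

```python
from urllib.parse import urljoin

BASE = "http://a/b/c/d;p?q"  # base URI from the RFC 3986 reference resolution examples


def resolve_all(base: str, references: list[str]) -> dict[str, str]:
    """Resolve each relative reference against the base URI."""
    return {ref: urljoin(base, ref) for ref in references}


results = resolve_all(BASE, ["g", "../g", "?y"])
for ref, resolved in results.items():
    print(f"{ref!r:8} -> {resolved}")
```

Running the same references through a browser and through another language's library is a quick way to spot where implementations disagree.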

via:ericlaw url uri rfc web browser programming dotnet java technical

RFC 959 - File Transfer Protocol

2009 Sep 9, 5:35

The FTP spec's section 3.5 'ERROR RECOVERY AND RESTART' describes how to resume an FTP download.
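
For what it's worth, here is a minimal sketch of resuming a download with Python's `ftplib`, whose `retrbinary` takes a `rest` argument that sends a REST command before RETR. It assumes a server that accepts a byte offset as the restart marker in binary mode; the host and file names in the usage comment are hypothetical.

```python
import os
from ftplib import FTP


def resume_download(ftp, remote_name: str, local_path: str) -> int:
    """Append the remainder of remote_name to local_path; return the restart offset."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with open(local_path, "ab") as out:
        # A non-zero offset makes ftplib send "REST <offset>" before RETR.
        # With nothing downloaded yet, rest=None requests the whole file.
        ftp.retrbinary(f"RETR {remote_name}", out.write, rest=offset or None)
    return offset


# Usage (hypothetical host and file name):
# with FTP("ftp.example.com") as ftp:
#     ftp.login()
#     resume_download(ftp, "big.iso", "big.iso")
```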

ietf reference ftp rfc resume download internet technical

RFC 1951 - DEFLATE Compressed Data Format Specification version 1.3

2009 Sep 3, 7:17

"This specification defines a lossless compressed data format that compresses data using a combination of the LZ77 algorithm and Huffman coding." Also see RFC 1950 zlib, a wrapper compression format that can use deflate, and RFC 1952 gzip, a compressed file format that can use deflate.
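
The relationship between the three formats is easy to see with Python's `zlib` module, where the `wbits` value selects the wrapper. This is a small sketch; the helper name `compress_as` and the sample payload are made up.

```python
import zlib


def compress_as(data: bytes, wbits: int) -> bytes:
    """Compress data with deflate, using the wrapper selected by wbits."""
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(data) + compressor.flush()


payload = b"hello hello hello hello"
raw = compress_as(payload, -15)  # RFC 1951: raw deflate stream, no header or trailer
zl = compress_as(payload, 15)    # RFC 1950: zlib header plus Adler-32 trailer
gz = compress_as(payload, 31)    # RFC 1952: gzip header plus CRC-32 and length trailer

for name, blob in (("deflate", raw), ("zlib", zl), ("gzip", gz)):
    print(name, len(blob), blob[:2].hex())
```

The same deflate data sits inside all three; only the framing around it changes.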

technical rfc ietf compression http deflate gzip zlib

Internationalized Resource Identifiers (IRIs)

2009 Jul 29, 5:48

The new draft IRI spec to replace RFC 3987. "To accomodate widespread current practice, additional derivative protocol elements are defined, and current practice for resolving IRI-based hypertext references in HTML are outlined."

iri uri rfc html reference technical

RFC 2483 - URI Resolution Services Necessary for URN Resolution

2009 Jul 27, 7:28

Includes the text/uri-list MIME type!
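
The format is simple enough to parse in a few lines: one URI per line, with lines starting with '#' treated as comments. Here is a sketch in Python; the function name and the sample URNs are illustrative only.

```python
def parse_uri_list(text: str) -> list[str]:
    """Return the URIs in a text/uri-list body, skipping comments and blank lines."""
    uris = []
    for line in text.splitlines():  # splitlines() also handles CRLF line endings
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris


body = "# two example entries\r\nurn:isbn:0-395-36341-1\r\nhttp://example.com/book\r\n"
uris = parse_uri_list(body)
```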

technical url uri mime reference ietf

Amazon.com: The Complete April Fools' Day RFCs: Thomas, A. Limoncelli, Peter, H. Salus: Books

2009 Apr 8, 10:40

A good gift for a particular subset of people I know. "Also has commentary from Limoncelli and some other internet gods. Worth many geek points - full of lulz!!"

gift wishlist book ietf reference rfc humor

Thoughts on registerProtocolHandler in HTML 5

2009 Apr 7, 9:02

I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in Firefox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as handlers of a URL scheme so, for (the canonical) example, Gmail can register for the mailto URL scheme. I like the concept:

Better integration of web apps with your system.
It's easy for web apps to do.
Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.

However, the way it's currently specced out, I don't like the following:

There's no way to know if you are the handler for a particular URL scheme, which is an important question for web app URL protocol handler authors.
There's no way to fall back to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and backwards compatible fallback mechanism, although I don't know what that would look like.
There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register info:lccn/... (Library of Congress Card Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.

Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in a URL template, like the URL template used in OpenSearch. If a URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example, something along the lines of: registerProtocolHandler("info:lccn/{lccnID}", "http://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
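
To make the template idea concrete, here is a rough Python sketch of how a browser-side registry might match a URI against a template and expand the handler URL. Everything here (the `HandlerRegistry` class, `template_to_regex`, the sample LCCN) is hypothetical; nothing like it exists in the HTML 5 spec.

```python
import re
from urllib.parse import quote


def template_to_regex(template: str) -> re.Pattern:
    """Turn 'info:lccn/{lccnID}' into a regex with one named group per variable."""
    parts = re.split(r"\{(\w+)\}", template)  # alternates literal text, variable name
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += f"(?P<{part}>.+)"
    return re.compile("^" + pattern + "$")


class HandlerRegistry:
    def __init__(self):
        self._handlers = []

    def register(self, match_template: str, target_template: str, title: str) -> None:
        self._handlers.append((template_to_regex(match_template), target_template, title))

    def resolve(self, uri: str):
        """Return the handler URL for uri, or None if no template matches."""
        for regex, target, _title in self._handlers:
            match = regex.match(uri)
            if match:
                for name, value in match.groupdict().items():
                    target = target.replace("{" + name + "}", quote(value, safe=""))
                return target
        return None


registry = HandlerRegistry()
registry.register(
    "info:lccn/{lccnID}",
    "http://www.librarything.com/search_works.php?q={lccnID}",
    "LibraryThing LCCN",
)
```

Because matching is done on the whole template rather than just the scheme, two sites could register different namespaces under the same info or urn scheme without clashing.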

url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn

RFC 5514 - IPv6 over Social Networks

2009 Apr 1, 10:42

Lol at actual Facebook app that does IPv6 over Facebook. "...most network users are not aware of what IPv6 is or are even afraid by IPv6 because it is unknown. On the other hand, Social Networks (like Facebook, LinkedIn, etc.) are well-known by users and the usage of those networks is huge... With IPv6 over Social Network (IPoSN): * Every user is a router with at least one loopback interface; * Every friend or connection between users will be used as a point-to-point link... A working prototype has been developed by the author and is freely available: IPv6 over Facebook Social Network [IPv6overFacebook]."

humor social network ipv6 ip iposn facebook ietf rfc

Web addresses in HTML 5

2009 Mar 23, 11:06

The HTML5 spec tells us how it is in the real world for URLs: "This specification defines various algorithms for dealing with Web addresses intended for use by HTML user agents. For historical reasons, in order to be compatible with existing Web content HTML user agents need to implement a number of processes not defined by the URI and IRI specifications [RFC3986], [RFC3987]."

html html5 url uri reference w3c

The 'Is It UTF-8?' Quick and Dirty Test

2009 Mar 6, 5:16

I've found while debugging networking in IE that it's often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8), but I rarely see the BOM on UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.

Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that it's a single byte character, the starting byte of a three byte sequence, etc.

The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this is the dirty part). For as many bytes in the string as you care to examine, check the most significant hex digit (the high nibble) of the byte:

F:: This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
E:: This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
C..D:: This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
8..B:: This is a trail byte.
0..7:: This is a single byte encoded codepoint.
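
The rules above can be sketched in Python roughly like this. The function name is mine, and unlike the by-eye version this sketch also rejects input that ends in the middle of a sequence.

```python
def quick_utf8_check(data: bytes) -> bool:
    """Apply the quick-and-dirty high-nibble rules to every byte of data."""
    expected_trail = 0
    for byte in data:
        nibble = byte >> 4
        if expected_trail:
            if 0x8 <= nibble <= 0xB:   # 8..B: trail byte, as expected
                expected_trail -= 1
            else:
                return False           # lead or single byte where a trail byte was due
        elif nibble == 0xF:
            expected_trail = 3         # byte 1 of a 4 byte sequence
        elif nibble == 0xE:
            expected_trail = 2         # byte 1 of a 3 byte sequence
        elif nibble in (0xC, 0xD):
            expected_trail = 1         # byte 1 of a 2 byte sequence
        elif nibble >= 0x8:
            return False               # stray trail byte
        # 0..7: single byte codepoint, nothing to track
    return expected_trail == 0


looks_utf8 = quick_utf8_check("héllo €".encode("utf-8"))
looks_latin1 = quick_utf8_check("été".encode("latin-1"))
```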

The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. But they won't produce false negatives. The following is the table from the Unicode spec that actually describes well-formed UTF-8.

Code Points	1st Byte	2nd Byte	3rd Byte	4th Byte
U+0000..U+007F	00..7F
U+0080..U+07FF	C2..DF	80..BF
U+0800..U+0FFF	E0	A0..BF	80..BF
U+1000..U+CFFF	E1..EC	80..BF	80..BF
U+D000..U+D7FF	ED	80..9F	80..BF
U+E000..U+FFFF	EE..EF	80..BF	80..BF
U+10000..U+3FFFF	F0	90..BF	80..BF	80..BF
U+40000..U+FFFFF	F1..F3	80..BF	80..BF	80..BF
U+100000..U+10FFFF	F4	80..8F	80..BF	80..BF
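
For comparison with the quick test, the table can be turned into a strict checker fairly directly. This is a sketch with names of my own choosing; in practice Python's built-in `bytes.decode("utf-8")` enforces the same restrictions.

```python
# One row per line of the table: the allowed range for each byte of the sequence.
WELL_FORMED = [
    ((0x00, 0x7F),),
    ((0xC2, 0xDF), (0x80, 0xBF)),
    ((0xE0, 0xE0), (0xA0, 0xBF), (0x80, 0xBF)),
    ((0xE1, 0xEC), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xED, 0xED), (0x80, 0x9F), (0x80, 0xBF)),
    ((0xEE, 0xEF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF0, 0xF0), (0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF1, 0xF3), (0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF4, 0xF4), (0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)),
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True only if data is a sequence of rows from the table."""
    i = 0
    while i < len(data):
        for row in WELL_FORMED:
            chunk = data[i : i + len(row)]
            if len(chunk) == len(row) and all(
                low <= byte <= high for byte, (low, high) in zip(chunk, row)
            ):
                i += len(row)
                break
        else:
            return False  # no row matched at this position
    return True
```

Overlong forms such as C0 80 and encoded surrogates such as ED A0 80 pass the quick nibble rules but fail here, which is exactly the false-positive gap mentioned above.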

test technical unicode boring charset utf8 encoding

draft-masinter-dated-uri-05 - names are readily assigned, offer the persistence of reference that is required by URNs, but do not require a stable authority to assign the name. The first namespace ("duri") is used to refer to URI-

2009 Feb 4, 4:30

New URN schemes with no central minting authority. duri allows you to name a resource that was identified by the specified URI at the specified date (e.g. refers to the IETF's homepage at the end of the year 2001). tdb allows you to name a physical object or entity that was described by a resource that was identified by a specified URI at the specified date (e.g. refers to IETF the organization as referenced by their homepage at the end of the year 2001). The date format is concise, but I'd prefer RFC 3339 rather than roping in another date format.

duri tdb uri url scheme reference ietf date datetime rfc

Text/Plain Fragment Bookmarklet

2008 Nov 19, 12:58

The text/plain fragment documented in RFC 5147 and described on Erik Wilde's blog piqued my interest and, like the XML fragment, I wanted to see if I could implement this in IE. In this case there's no XSLT for me to edit so, like my plain/text word wrap bookmarklet, I've implemented it as a bookmarklet. This is only a partial implementation as it doesn't implement the integrity checks.
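
The bookmarklet itself is JavaScript, but the line-selection part of the idea can be sketched in Python. This follows my reading of RFC 5147, where positions count from 0 and fall between lines, so `line=10,20` covers the 11th through 20th lines. Like the bookmarklet, it skips the integrity checks; the function name is made up.

```python
def select_lines(text: str, fragment: str) -> str:
    """Return the lines picked out by a 'line=' fragment such as 'line=10,20'."""
    scheme, _, spec = fragment.partition("=")
    if scheme != "line":
        raise ValueError("only the line= scheme is sketched here")
    spec = spec.split(";", 1)[0]  # drop any integrity check (length=, md5=)
    lines = text.splitlines(keepends=True)
    start_text, comma, end_text = spec.partition(",")
    if not comma:
        return ""  # a single position sits between lines and selects nothing
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else len(lines)
    return "".join(lines[start:end])


document = "a\nb\nc\nd\n"
middle = select_lines(document, "line=1,3")
```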

Check out my text/plain fragment bookmarklet.

text url boring bookmarklet uri plain-text javascript fragment

Investigation of a Few Application Protocols (Updated)

2008 Oct 25, 6:51

Windows allows for application protocols in which, through the registry, you specify a URL scheme and a command line to have that URL passed to your application. It's an easy way to hook a web browser up to your application. Anyone can read the doc above and then walk through the registry and pick out the application protocols, but from that info alone you can't tell what the application expects these URLs to look like. I did a bit of research on some of the application protocols I've seen, which is listed below. Good places to look for information on URI schemes: Wikipedia URI scheme, and ESW Wiki UriSchemes.

Some Application Protocols and associated documentation.
Scheme	Name	Notes
search-ms	Windows Search Protocol	The search-ms application protocol is a convention for querying the Windows Search index. The protocol enables applications, like Microsoft Windows Explorer, to query the index with parameter-value arguments, including property arguments, previously saved searches, Advanced Query Syntax, Natural Query Syntax, and language code identifiers (LCIDs) for both the Indexer and the query itself. See the MSDN docs for search-ms for more info. Example: search-ms:query=food
Explorer.AssocProtocol.search-ms	Windows Search Protocol
OneNote	OneNote Protocol	From the OneNote help: `/hyperlink "pagetarget"` - Starts OneNote and opens the page specified by the pagetarget parameter. To obtain the hyperlink for any page in a OneNote notebook, right-click its page tab and then click Copy Hyperlink to this Page. Example: onenote:///\\GUMMO\Users\davris\Documents\OneNote%20Notebooks\OneNote%202007%20Guide\Getting%20Started%20with%20OneNote.one#section-id={692F45F5-A42A-415B-8C0D-39A10E88A30F}&end
callto	Callto Protocol	ESW Wiki info on callto; Skype callto info; NetMeeting callto info. Example: callto://+12125551234
itpc	iTunes Podcast	Tells iTunes to subscribe to an indicated podcast. iTunes documentation. C:\Program Files\iTunes\iTunes.exe /url "%1" Example: itpc:http://www.npr.org/rss/podcast.php?id=35
iTunes.AssocProtocol.itpc
pcast
iTunes.AssocProtocol.pcast
Magnet	Magnet URI	Magnet URL scheme described by Wikipedia. Magnet URLs identify a resource by a hash of that resource so that when used in P2P scenarios no central authority is necessary to create URIs for a resource.
mailto	Mail Protocol	RFC 2368 - Mailto URL Scheme. Mailto Syntax Opens mail programs with new message with some parameters filled in, such as the to, from, subject, and body. Example: mailto:?to=david.risney@gmail.com&subject=test&body=Test of mailto syntax
WindowsMail.Url.Mailto	Mail Protocol
MMS	mms Protocol	MSDN describes associated protocols. Wikipedia describes MMS. "C:\Program Files\Windows Media Player\wmplayer.exe" "%L" Also appears to be related to MMS cellphone messages: MMS IETF Draft.
WMP11.AssocProtocol.MMS	mms Protocol
secondlife	[SecondLife]	Opens SecondLife to the specified location, user, etc. SecondLife Wiki description of the URL scheme. "C:\Program Files\SecondLife\SecondLife.exe" -set SystemLanguage en-us -url "%1" Example: secondlife://ahern/128/128/128
skype	Skype Protocol	Open Skype to call a user or phone number. Skype's documentation Wikipedia summary of skype URL scheme "C:\Program Files\Skype\Phone\Skype.exe" "/uri:%l" Example: skype:+14035551111?call
skype-plugin	Skype Plugin Protocol Handler	Something to do with adding plugins to skype? Maybe. "C:\Program Files\Skype\Plugin Manager\skypePM.exe" "/uri:%1"
svn	SVN Protocol	Opens TortoiseSVN to browse the repository URL specified in the URL. C:\Program Files\TortoiseSVN\bin\TortoiseProc.exe /command:repobrowser /path:"%1"
svn+ssh
tsvn
webcal	Webcal Protocol	Wikipedia describes webcal URL scheme. Webcal URL scheme description. A URL that starts with webcal:// points to an Internet location that contains a calendar in iCalendar format. "C:\Program Files\Windows Calendar\wincal.exe" /webcal "%1" Example: webcal://www.lightstalkers.org/LS.ics
WindowsCalendar.UrlWebcal.1	Webcal Protocol
zune	Zune Protocol	Provides access to some Zune operations such as podcast subscription (via Zune Insider). "c:\Program Files\Zune\Zune.exe" -link:"%1" Example: zune://subscribe/?name=http://feeds.feedburner.com/wallstrip.
feed	Outlook Add RSS Feed	Identify a resource that is a feed such as Atom or RSS. Implemented by Outlook to add the indicated feed to Outlook. Feed URI scheme pre-draft document "C:\PROGRA~2\MICROS~1\Office12\OUTLOOK.EXE" /share "%1"
im	IM Protocol	RFC 3860 - IM URI scheme description. Like mailto but for instant messaging clients. Registered by Office Communicator, but I was unable to get it to work as described in RFC 3860. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1"
tel	Tel Protocol	RFC 5341 - tel URI scheme IANA assignment; RFC 3966 - tel URI scheme description. Call phone numbers via the tel URI scheme. Implemented by Office Communicator. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1"
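
As a very simplified model of what happens when one of these URLs is followed, the sketch below looks up a command template by scheme and substitutes the URL for "%1". The two templates are copied from the table above; the `HANDLERS` dictionary and `build_command` are illustrative stand-ins for the registry lookup, and the real shell does more than a plain string replace.

```python
from urllib.parse import urlsplit

# Command templates copied from the table; the dictionary stands in for the registry.
HANDLERS = {
    "webcal": r'"C:\Program Files\Windows Calendar\wincal.exe" /webcal "%1"',
    "zune": r'"c:\Program Files\Zune\Zune.exe" -link:"%1"',
}


def build_command(url: str) -> str:
    """Return the command line that would be launched for url."""
    scheme = urlsplit(url).scheme.lower()
    template = HANDLERS.get(scheme)
    if template is None:
        raise LookupError(f"no handler registered for {scheme!r}")
    return template.replace("%1", url)


command = build_command("webcal://www.lightstalkers.org/LS.ics")
```

Since the whole URL lands on the application's command line, the handling application has to treat it as untrusted input.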

(Updated 2008-10-27: Added feed, im, and tel from Office Communicator)

technical application protocol shell url windows

Tag Metadata in Feeds

2008 Aug 25, 10:13

As noted previously, my page consists of the aggregation of my various feeds and in working on that code recently it was again brought to my attention that everyone has different ways of representing tag metadata in feeds. I made up a list of how my various feed sources represent tags and list that data here so that it might help others in the future.

Tag markup from various sources
Source	Feed Type	Tag Markup Scheme	One Tag Per Element	Tag Scheme URI	Human / Machine Names	Example Markup
LiveJournal	Atom	atom:category	yes	no	no	, (source)
LiveJournal	RSS 2.0	rss2:category	yes	no	no	`technical` (source)
WordPress	RSS 2.0	rss2:category	yes	no	no	, (source)
Delicious	RSS 1.0	dc:subject	no	no	no	`photosynth photos 3d tool` (source)
Delicious	RSS 2.0	rss2:category	yes	yes	no	`domain="http://delicious.com/SequelGuy/"> hulu` (source)
Flickr	Atom	atom:category	yes	yes	no	`term="seattle" scheme="http://www.flickr.com/photos/tags/" />` (source)
Flickr	RSS 2.0	media:category	no	yes	no	`scheme="urn:flickr:tags"> seattle washington baseball mariners` (source)
YouTube	RSS 2.0	media:category	no	no	no	`label="Tags"> bunny rabbit yawn cadbury` (source)
LibraryThing	RSS 2.0	No explicit tag metadata.	no	no	no	n/a, (source)

Tag markup scheme
Tag Markup Scheme	Notes	Example
Atom Category atom:category `xmlns:atom="http://www.w3.org/2005/Atom"`	category/@term Required category name. category/@scheme Optional IRI id'ing the categorization scheme. category/@label Optional human readable category name.	`term="catName" scheme="tag:deletethis.net,2008:tagscheme" label="category name in human readable format"/>`
RSS 2.0 category rss2:category empty namespace	category/@domain Optional string id'ing the categorization scheme. category/text() Required category name. The value of the element is a forward-slash-separated string that identifies a hierarchic location in the indicated taxonomy. Processors may establish conventions for the interpretation of categories.	`domain="tag:deletethis.net,2008:tagscheme"> MSFT`
Yahoo Media RSS Module category media:category `xmlns:media="http://search.yahoo.com/mrss/"`	category/text() Required category name. category/@scheme Optional URI id'ing the categorization scheme. category/@label Optional human readable category name.	`scheme="http://dmoz.org" label="Ace Ventura - Pet Detective"> Arts/Movies/Titles/A/Ace_Ventura_Series/Ace_Ventura_-_Pet_Detective`
Dublin Core subject dc:subject `xmlns:dc="http://purl.org/dc/elements/1.1/"`	subject/text() Required category name. Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary.	`humor`
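
Putting the tables together, here is a rough Python sketch of pulling tags out of one feed entry regardless of which of the four markup schemes it uses. The namespace URIs come from the table above; the function name is mine, and splitting media:category and dc:subject on whitespace is an assumption that matches the Delicious, Flickr, and YouTube rows but not feeds that use hierarchical category values.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "media": "http://search.yahoo.com/mrss/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_tags(item: ET.Element) -> set[str]:
    """Collect tag names from an Atom entry or RSS item element."""
    tags = set()
    # atom:category carries the tag in its term attribute, one tag per element.
    for category in item.findall("atom:category", NS):
        term = category.get("term")
        if term:
            tags.add(term)
    # rss2:category lives in the empty namespace, one tag per element.
    for category in item.findall("category"):
        if category.text:
            tags.add(category.text.strip())
    # media:category and dc:subject may pack several space-separated tags together.
    for path in ("media:category", "dc:subject"):
        for element in item.findall(path, NS):
            tags.update((element.text or "").split())
    return tags


atom_entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<category term="seattle" scheme="http://www.flickr.com/photos/tags/"/>'
    "</entry>"
)
rss_item = ET.fromstring(
    '<item xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<category>technical</category>"
    "<dc:subject>photosynth photos 3d tool</dc:subject>"
    "</item>"
)
```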

Update 2009-9-14: Added WordPress to the Tag Markup table and namespaces to the Tag Markup Scheme table.

feed media delicious technical atom youtube yahoo rss tag

RFC 3514 - The Security Flag in the IPv4 Header

2008 Jun 30, 3:57

"Firewalls, packet filters, intrusion detection systems, and the like often have difficulty distinguishing between packets that have malicious intent and those that are merely unusual. We define a security flag in the IPv4 header as a means of distinguis

humor rfc security ipv4 ip

RFC 3675 - .sex Considered Dangerous

2008 Jun 30, 3:55

FCC wants nationwide free wifi that's free of porn. They should read this. "Periodically there are proposals to mandate the use of a special top level name or an IP address bit to flag "adult" "unsafe" material or the like. This document explains why thi

domain dns rfc ietf internet porn government politics censorship

Refreshed Internet Drafts - Implementer's notes - by Yngve Nysaeter Pettersen

2008 Jun 30, 3:49

Yngve Nysaeter Pettersen briefly talks about his Opera minimal security domain RFCs: "I've just refreshed my HTTP Cookie and Cache related Internet Drafts."

rfc opera browser cookie http internet domain dns

New Cookie related Internet Drafts from Yngve N. Pettersen on 2006-03-20 (ietf-http-wg@w3.org from January to March 2006)

2008 Jun 30, 3:46

Opera's solution to minimal security domain determination: "The drafts describe 1) Opera's current "rule of thumb" implementation that uses DNS in an attempt to confirm the validity of a domain, and 2) a proposed new HTTP based lookup service that retur

opera rfc ietf cookie http internet browser dns domain

URI Fragment Info Roundup

2008 Apr 21, 11:53

Information about URI fragments, the portion of a URI that follows the '#' at the end and that is used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC, which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the MIME type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from MIME type is a problem because a single URI may contain only a single fragment, yet over HTTP a single URI can result in the same logical resource represented in different MIME types. So there's one fragment but multiple MIME types, and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple MIME types, then the author must ensure that the various representations all resolve fragments to the same logical secondary resource. Depending on which MIME types you're dealing with, this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.
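
In client code this split is usually explicit: the fragment is removed before the request is made and kept around for use once the response arrives. A minimal Python sketch using `urllib.parse.urldefrag` follows; the helper name and example URL are made up.

```python
from urllib.parse import urldefrag


def split_for_request(uri: str) -> tuple[str, str]:
    """Return (URL to send over HTTP, fragment to resolve on the client)."""
    url, fragment = urldefrag(uri)
    return url, fragment


request_url, client_fragment = split_for_request("http://example.com/doc.html#section-2")
```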

HTML. In HTML, the HTML MIME type RFC defines HTML's fragment use, which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use, additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment, so no encoding is discussed since technically it's not needed. However, practically speaking, browsers like Firefox and Internet Explorer allow for names and ids containing characters outside of the defined set, including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser), especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably a good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure, as noted by the XML MIME type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. It's too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple MIME type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined, but I could only find it as an ISO document, "Text of ISO/IEC FCD 21000-17 MPEG-21 FID", and not as an RFC, which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec, but it does seem to be in line with the spirit of the fragment in that it is a subview of the original resource and interpreted client side. The other, hackier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains, but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not in line with the spirit of the fragment but is rather a cool hack.

xml text ajax technical url boring uri fragment rfc