2010 Apr 21, 6:51Adds SHA 256 & 512 to HTTP instance digest: 'The IANA registry named "Hypertext Transfer Protocol (HTTP) Digest Algorithm Values" defines values for digest algorithms used by Instance Digests in
HTTP. Instance Digests in HTTP provide a digest, also known as a checksum or hash, of an entire representation of the current state of a resource. This document adds new values to the registry and
updates previous values.'
hash cryptography http instance-digest sha security technical ietf rfc standard 2010 Apr 6, 11:17A thread on HTTPBIS concerning about how one might standardize hotels and other such proxies that inject redirects to their own payment or T&C agreement sites.
http httpbis reference ietf network 2010 Mar 31, 7:59Defines the mime type for JSON as well as JSON itself.
technical json mimetype mime javascript ietf rfc specification 2010 Mar 3, 6:15Finally giving the javascript URI scheme its own IETF doc.
javascript ietf url uri scheme reference technical 2010 Jan 29, 3:54
Raymond Chen has some thought experiments useful for discovering various kinds of stupidity in software design:
Tim Berners-Lee's principles of Web design includes my favorite: Test of Independent Invention. This has a thought experiment containing the construction of the MMM (Multi-Media Mesh) with
MRIs (Media Resource Identifiers) and MMTP (Muli-Media Transport Protocol).
The Internet design principles (RFC 1958) includes the Robustness Principle: be strict when sending and tolerant when receiving. A good one, but applied too liberally can lead to interop issues. For instance, consider web browsers.
Imagine one browser becomes so popular that web devs create web pages and just test out their pages in this popular browser. They don't ensure their pages conform to standards and accidentally end
up depending on the manner in which this popular browser tolerantly accepts non-standard input. This non-standard behavior ends up as de facto standard and future updates to the standard
essentially has had decisions made for it.
technical design principles software development 2010 Jan 15, 7:05Section 4 has a summary table with all the various special use IPv4 address blocks.
reference rfc ipv4 ip internet ietf 2010 Jan 5, 7:42
I've made a WPAD server Fiddler extension and in a fit of creativity I've named it: WPAD Server Fiddler
Extension.
Of course you know about Fiddler, Eric's awesome HTTP debugger tool, the HTTP proxy that lets you inspect, visualize and modify the
HTTP traffic that flows through it. And on the subject you've probably definitely heard of WPAD, the Web Proxy Auto Discovery protocol
that allows web browsers like IE to use DHCP or DNS to automatically discover HTTP proxies on their network. While working on a particularly nasty WPAD bug towards the end of IE8 I really wished I
had a way to see the WPAD requests and responses and modify PAC responses in Fiddler. Well the wishes of me of the past are now fulfilled by present day me as this Fiddler extension will respond to
WPAD DHCP requests telling those clients (by default) that Fiddler is their proxy.
When I started working on this project I didn't really understand how DHCP worked especially with respect to WPAD. I won't bore you with my misconceptions: it works by having your one DHCP server
on your network respond to regular DHCP requests as well as WPAD DHCP requests. And Windows I've found runs a DHCP client service (you can start/stop it via Start|Run|'services.msc', scroll to DHCP
Client or via the command line with "net start/stop 'DHCP Client'") that caches DHCP server responses making it just slightly more difficult to test and debug my extension. If a Windows app uses
the DHCP client APIs to ask for the WPAD option, this service will send out a DHCP request and take the first DHCP server response it gets. That means that if you're on a network with a DHCP
server, my extension will be racing to respond to the client. If the DHCP server wins then the client ignores the WPAD response from my extension.
Various documents and tools I found useful while working on this:
proxy fiddler http technical debug wpad pac tool dhcp 2009 Dec 18, 9:48"Several applications extending the Hypertext Transfer Protocol (HTTP) require a feature to do partial resource modification. The existing HTTP PUT method only allows a complete replacement of a
document. This proposal adds a new HTTP method, PATCH, to modify an existing HTTP resource."
http patch ietf reference via:warren technical 2009 Dec 12, 2:42"The Dynamic Host Configuration Protocol (DHCP) [1] provides a framework for passing configuration information to hosts on a TCP/IP network. Configuration parameters and other control information are
carried in tagged data items that are stored in the 'options' field of the DHCP message. The data items themselves are also called "options.""
technical reference rfc dhcp ietf ipv4 ip 2009 Sep 9, 5:35The FTP spec's section 3.5 'ERROR RECOVERY AND RESTART' describes how to resume an FTP download.
ietf reference ftp rfc resume download internet technical 2009 Sep 3, 7:17"This specification defines a lossless compressed data format that compresses data using a combination of the LZ77 algorithm and Huffman coding." Also see RFC 1950 zlib, a wrapper compression format
that can use deflate, and RFC 1952 gzip, a compressed file format that can use deflate.
technical rfc ietf compression http deflate gzip zlib 2009 Jul 27, 7:28Includes the text/uri-list mime type!
technical url uri mime reference ietf 2009 Jun 22, 3:12HTML5's mime-sniffing is getting moved to an IETF doc: "Many web servers supply incorrect Content-Type headers with their HTTP responses. In order to be compatible with these servers, user agents
must consider the content of HTTP responses as well as the Content-Type header when determining the effective media type of the response. This document describes an algorithm for determining the
effective media type of HTTP responses that balances security and compatibility considerations."
mime mime-sniffing ietf http w3c html5 technical 2009 Jun 22, 3:09"Web/browser-security maven and coder Adam Barth has been working on implementing a content sniffer in WebKit, based on a content-sniffing algorithm that was originally specified in the HTML5 draft,
but that's now specified as a separate IETF draft that Adam is editing and that's titled, Content-Type Processing Model."
mime mime-sniffing webkit http technical 2009 Apr 29, 12:34"In this presentation, recorded at QCon San Francisco 2008, HTTPbis WG chair Mark Nottingham gives an update on the current status of the HTTP protocol in the wild, and the ongoing work to clarify
the HTTP specification."
http httpbis protocol ietf reference video authentication cookie uri url tcp sctp mark-nottingham via:ericlaw 2009 Apr 8, 10:40A good gift for a particular subset of people I know. "Also has commentary from Limoncelli and some other internet gods. Worth many geek points - full of lulz!!"
gift wishlist book ietf reference rfc humor 2009 Apr 7, 9:02
I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as
handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:
- Better integration of web apps with your system.
- Its easy for web apps to do.
- Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http
link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that
has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.
However, the way its currently spec'ed out I don't like the following:
- There's no way to know if you are the handler for a particular URL scheme which is an important question for web app URL protocol handler authors.
- There's no way to fallback to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method
to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and
backwards compatible fallback mechanism, although I don't know what that would look like.
- There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register
info:lccn/... (Library of Congress Card Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.
- Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
- And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in an URL template, like the URL template used
in OpenSearch. If an URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example
something along the lines of:
registerProtocolHandler("info:lccn/{lccnID}", "htttp://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn 2009 Apr 1, 10:42Lol at actual Facebook app that does IPv6 over Facebook. "...most network users are not aware of what IPv6 is or are even afraid by IPv6 because it is unknown. On the other hand, Social Networks
(like Facebook, LinkedIn, etc.) are well-known by users and the usage of those networks is huge... With IPv6 over Social Network (IPoSN): * Every user is a router with at least one loopback
interface; * Every friend or connection between users will be used as a point-to-point link... A working prototype has been developed by the author and is freely available: IPv6 over Facebook Social
Network [IPv6overFacebook]."
humor social network ipv6 ip iposn facebook ietf rfc 2009 Mar 6, 5:16
I've found while debugging networking in IE its often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8) but, I rarely see the BOM on
UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike
other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that its a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this
is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
-
F:
-
This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
-
E:
-
This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
-
C..D:
-
This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
-
8..B:
-
This is a trail byte.
-
0..7:
-
This is a single byte encoded codepoint.
The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. But it won't produce false negatives. The following is table
from the
Unicode spec. that actually describes well-formed UTF-8.
Code Points
|
1st Byte
|
2nd Byte
|
3rd Byte
|
4th Byte
|
U+0000..U+007F
|
00..7F
|
U+0080..U+07FF
|
C2..DF
|
80..BF
|
U+0800..U+0FFF
|
E0
|
A0..BF
|
80..BF
|
U+1000..U+CFFF
|
E1..EC
|
80..BF
|
80..BF
|
U+D000..U+D7FF
|
ED
|
80..9F
|
80..BF
|
U+E000..U+FFFF
|
EE..EF
|
80..BF
|
80..BF
|
U+10000..U+3FFFF
|
F0
|
90..BF
|
80..BF
|
80..BF
|
U+40000..U+FFFFF
|
F1..F3
|
80..BF
|
80..BF
|
80..BF
|
U+100000..U+10FFFF
|
F4
|
80..8F
|
80..BF
|
80..BF
|
test technical unicode boring charset utf8 encoding 2009 Feb 5, 8:39The long expired draft of the Web Proxy Autodiscovery Protocol (WPAD). To summarize, use DHCP and failing that DNS to find the name of a web server and on that web server find a Proxy Auto-Config
file at a well known localtion.
wpad proxy internet reference browser dns dhcp