rio page 4 - Dave's Blog

Search
My timeline on Mastodon

URI Percent Encoding Ignorance Level 2 - There is no Unencoded URI

2012 Feb 20, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.

Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:

  1. http://example.com/%74%68%65%3F%70%61%74%68?query
  2. http://example.com/the%3Fpath?query
  3. "http", "example.com", "the?path", "query"

In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.

Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.

Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.

PermalinkCommentsurl encoding uri technical percent-encoding

URI Percent-Encoding Ignorance Level 1 - Purpose

2012 Feb 15, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.

Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:

  1. http://example.com/the%20path/
  2. http://example.com/the path/
In the above the first is a valid URI while the second is not valid since a space appears directly in the URI. Depending on the context and the code through which the wannabe URI is run one may get unexpected failure.

For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:

  1. http://example.com/foo%3Fbar
  2. http://example.com/foo?bar
In the second, the question mark appears plainly and so delimits the path "/foo" from the query "bar". And in the first, the querstion mark is percent-encoded and so the path is "/foo%3Fbar".
PermalinkCommentsencoding uri technical ietf percent-encoding

URI Percent Encoding Ignorance Level 0 - Existence

2012 Feb 10, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping). The basest ignorance is with respect to the mere existence of percent-encoding. Percents in URIs are special: they always represent the start of a percent-encoded octet. That is to say, a percent is always followed by two hex digits that represents a value between 0 and 255 and doesn't show up in a URI otherwise.

The IPv6 textual syntax for scoped addresses uses the '%' to delimit the zone ID from the rest of the address. When it came time to define how to represent scoped IPv6 addresses in URIs there were two camps: Folks who wanted to use the IPv6 format as is in the URI, and those who wanted to encode or replace the '%' with a different character. The resulting thread was more lively than what shows up on the IETF URI discussion mailing list. Ultimately we went with a percent-encoded '%' which means the percent maintains its special status and singular purpose.

PermalinkCommentsencoding uri technical ietf percent-encoding ipv6

Serious Sam’s DRM Is A Giant Pink Scorpion

2011 Dec 7, 12:48

“Serious Sam 3′s DRM is brilliantly cruel, punishing only those who pirated it. By relentlessly pursuing them with a giant invincible armoured scorpion.”

PermalinkCommentsgame video-game scorpion serious-sam

"Additional HTTP Status Codes" - Mark Nottingham, Roy Fielding

2011 Nov 14, 7:51

Includes ‘511 Network Authentication Required’ for airport/hotel/coffee shop scenarios!  Am I too excited about this?

PermalinkCommentstechnical ietf http http-status-codes

Dare to Fight, A Hilarious Ninja Ambush Prank

2011 Oct 24, 2:14
Ninjas fight dirty.
PermalinkCommentstechnical

NYTimes Sues US For Refusing To Reveal Secret Interpretation Of Patriot Act (techdirt.com)

2011 Oct 20, 6:52
Wow, FTA: "Given all of this, reporter Charlie Savage of the NY Times filed a Freedom of Information Act request to find out the federal government's interpretation of its own law... and had it refused. According to the federal government, its own interpretation of the law is classified."
PermalinkCommentstechnical

Nintendo Game Maps

2011 Sep 28, 10:22They've got maps from your favorite NES games as giant images. I'm using SMB3 1-1 as my desktop background. I've got a four monitor setup now and so its tough to find desktop backgrounds but Mario levels easily cover my whole desktop.PermalinkCommentsgame videogame map nintendo nes

Telex

2011 Jul 18, 2:38Neat idea: "When the user wants to visit a blacklisted site, the client establishes an encrypted HTTPS connection to a non-blacklisted web server outside the censor’s network, which could be a normal site that the user regularly visits... The client secretly marks the connection as a Telex request by inserting a cryptographic tag into the headers. We construct this tag using a mechanism called public-key steganography... As the connection travels over the Internet en route to the non-blacklisted site, it passes through routers at various ISPs in the core of the network. We envision that some of these ISPs would deploy equipment we call Telex stations."PermalinkCommentsinternet security tools censorship technical

A true American Patriot recites Bill Pullman’s Independence Day speech around New York City  | Great Job, Internet! | The A.V. Club

2011 Jul 6, 7:28"Over this past Fourth Of July weekend, we neglected to note that it was the 15th anniversary of Roland Emmerich’s 1996 blockbuster Independence Day. New York comedian Sean Kleier remembered, and decided to make his own tribute, going to various locations around New York City—Times Square, the Brooklyn Bridge, the subway, and inside a Victoria’s Secret—reciting Bill Pullman’s rousing speech before the movie's final battle sequence, megaphone and all."
PermalinkCommentshumor video bill-pullman independence-day new-york

Three arguments against the singularity - Charlie's Diary

2011 Jul 1, 10:09"I periodically get email from folks who, having read "Accelerando", assume I am some kind of fire-breathing extropian zealot who believes in the imminence of the singularity, the uploading of the libertarians, and the rapture of the nerds. I find this mildly distressing, and so I think it's time to set the record straight and say what I really think. Short version: Santa Claus doesn't exist."PermalinkCommentsscifi singularity charles-stross future fiction

YouTube - Radiohead - Paranoid Android: YouTube Artists Mix

2011 Jun 20, 2:36A mix of 36 YouTube videos of various people playing Radiohead's Paranoid Android. It sounds good but the video too is very compelling. Also I would be psych'ed if it were my video picked to rock out at 2:50.

The mix:


Crazy bear on accoustic from 6:03


Amazing big band rendition from 2:20
PermalinkCommentsvideo remix radiohead mashup paranoid-android music

Seth Meyers & Barack Obama at White House Correspondents’ Dinner

2011 May 1, 7:51"The hilarious speeches by Seth Meyers and Barack Obama at the 2011 White House Correspondents’ Dinner. Seth and Obama really let Trump have it in their speechs. Trump’s reaction in the audience is priceless."PermalinkCommentshumor politics barack-obama seth-meyers video white-house-correspondents-dinner

HTTP framework for time-based access to resource states -- Memento

2011 Apr 30, 4:33"The HTTP-based Memento framework bridges the present and past Web by interlinking current resources with resources that encapsulate their past. It facilitates obtaining representations of prior states of a resource, available from archival resources in Web archives or version resources in content management systems, by leveraging the resource's URI and a preferred datetime. To this end, the framework introduces datetime negotiation (a variation on content negotiation), and new Relation Types for the HTTP Link header aimed at interlinking resources with their archival/version resources. It also introduces various discovery mechanisms that further support briding the present and past Web."PermalinkCommentstechnical rfc reference http header time memento archive

It's-a-him: Mario as you've never seen him before | Games | Great Job, Internet! | The A.V. Club

2010 Dec 15, 6:08PermalinkCommentshumor mario video videogame

Infovore » Something on my mind grapes

2010 Sep 27, 1:39Twitter account that retweets Kanye West's tweets with 'Liz Lemmon, ' prefixed to produce occasionally hilarious 30 Rock quote sound-a-likes.PermalinkCommentshumor 30-rock meme kanye-west mind-grapes twitter

pinvoke.net: ConsoleFunctions (kernel32)

2010 Aug 14, 5:34pInvoke.net is a wiki for the C# interop declarations for various Win32 functions.PermalinkCommentspinvoke c# csharp consle api windows reference wiki technical

The Curious History of Uniform Resource Names - IETF Journal

2010 Jul 1, 10:51"Sometimes it’s hard to judge whether an engineering effort has been successful or not. It can take years for an idea to catch on, to go from being the butt of jokes to becoming an international imperative (IPv6). Uniform Resource Names (URNs), which are part of the Uniform Resource Identifier (URI) family, are conceptually at least as old as IPv6. While not figuring in international directives for deployment, they-and the technology engineered to resolve them-are still going concerns."PermalinkCommentsietf urn uri history technical internet url

A quote from Sacramento Credit Union

2010 May 14, 8:52It really is an actual quote from the Sacramento Credit Union's website: "The answers to your Security Questions are case sensitive and cannot contain special characters like an apostrophe, or the words “insert,” “delete,” “drop,” “update,” “null,” or “select.”"

Out of context that seems hilarious, but if you read the doc the next Q/A twists it like a defense in depth rather than a 'there-I-fixed-it'.PermalinkCommentstechnical security humor sql

The Paywall

2010 May 6, 9:52Examples of various kinds of paywalls. For the one that wants you to dial a number... try it. I dare you.PermalinkCommentspaywall art web
Older EntriesNewer Entries Creative Commons License Some rights reserved.