jQuery plugin that blindly removes lines with errors and recompiles until it works
By the URI RFC there is only one way to represent a particular IPv4 address in the host of a URI. This is the standard dotted decimal notation of four bytes in decimal with no leading zeroes delimited by periods. And no leading zeros are allowed which means there's only one textual representation of a particular IPv4 address.
However as discussed in the URI RFC, there are other forms of IPv4 addresses that although not officially allowed are generally accepted. Many implementations used inet_aton to parse the address from the URI which accepts more than just dotted decimal. Instead of dotted decimal, each dot delimited part can be in decimal, octal (if preceded by a '0') or hex (if preceded by '0x' or '0X'). And that's each section individually - they don't have to match. And there need not be 4 parts: there can be between 1 and 4 (inclusive). In case of less than 4, the last part in the string represents all of the left over bytes, not just one.
For example the following are all equivalent:
The bread and butter of URI related security issues is when one part of the system disagrees with another about the interpretation of the URI. So this non-standard, non-normal form syntax has been been a great source of security issues in the past. Its mostly well known now (CreateUri normalizes these non-normal forms to dotted decimal), but occasionally a good tool for bypassing naive URI blocking systems.
One of the more limiting issues of writing client side script in the browser is the same origin limitations of XMLHttpRequest. The latest version of all browsers support a subset of CORS to allow servers to opt-in particular resources for cross-domain access. Since IE8 there's XDomainRequest and in all other browsers (including IE10) there's XHR L2's cross-origin request features. But the vast majority of resources out on the web do not opt-in using CORS headers and so client side only web apps like a podcast player or a feed reader aren't doable.
One hack-y way around this I've found is to use YQL as a CORS proxy. YQL applies the CORS header to all its responses and among its features it allows a caller to request an arbitrary XML, HTML, or JSON resource. So my network helper script first attempts to access a URI directly using XDomainRequest if that exists and XMLHttpRequest otherwise. If that fails it then tries to use XDR or XHR to access the URI via YQL. I wrap my URIs in the following manner, where type is either "html", "xml", or "json":
yqlRequest = function(uri, method, type, onComplete, onError) {
var yqlUri = "http://query.yahooapis.com/v1/public/yql?q=" +
encodeURIComponent("SELECT * FROM " + type + ' where url="' + encodeURIComponent(uri) + '"');
if (type == "html") {
yqlUri += encodeURIComponent(" and xpath='/*'");
}
else if (type == "json") {
yqlUri += "&callback=&format=json";
}
...
This
also means I can get JSON data itself without having to go through JSONP.
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).
Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.
Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:
In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.
Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.
Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).
Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.
Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:
http://example.com/the%20path/
http://example.com/the path/
For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:
http://example.com/foo%3Fbar
http://example.com/foo?bar
/foo
" from the query "bar
". And in the first, the querstion mark is percent-encoded and so
the path is "/foo%3Fbar
".
“How pervasive is it? There are about 489,000 YouTube videos that say “no copyright intended” or some variation, and about 664,000 videos have a “copyright disclaimer” citing the fair use provision in Section 107 of the Copyright Act”
Irritatingly, my G1 won't show me PDFs so I've made the Google Docs PDF viewer which will load PDFs on the web up in Google Docs. Google Docs has the useful ability to display PDFs in web browsers without any Adobe software and works (mostly) on Android.
This was very easy to put together as an Android activity. First its necessary to register the application as handling PDFs from the web. This is done via the intent-filter declaration in the manifest:
intent-filter
action android:name="android.intent.action.VIEW"/
data android:scheme="http" android:mimeType="application/pdf"/
category android:name="android.intent.category.DEFAULT"/
category android:name="android.intent.category.BROWSABLE"/
/intent-filter
The action part says my activity will view PDFs, the data part says it accepts data with the PDF mime-type and with a URL that has an HTTP scheme. The browsable category
is necessary to allow links from a browser to open this activity.
Second, the activity opens up the browser to Google Docs pointing to the PDF.
Intent intent = new Intent();
intent.setAction(getIntent().getAction());
intent.setData(Uri.parse(
"http://docs.google.com/gview?embedded=true&url=" +
percentEncodeForQuery(getIntent().getData().toString())));
startActivity(intent);
This is very simple code to invoke a new intent browsing to a newly constructed URL for the PDF in Google Docs. That was easy.Before we shipped IE8 there were no Accelerators, so we had some fun making our own for our favorite web services. I've got a small set of tips for creating Accelerators for other people's web services. I was planning on writing this up as an IE blog post, but Jon wrote a post covering a similar area so rather than write a full and coherent blog post I'll just list a few points:
There's no easy way to use local applications on a PC as the result of an accelerator or a search provider in IE8 but there is a hack-y/obvious way, that I'll describe here. Both accelerators and search providers in IE8 fill in URL templates and navigate to the resulting URL when an accelerator or search provider is executed by the user. These URLs are limited in scheme to http and https but those pages may do anything any other webpage may do. If your local application has an ActiveX control you could use that, or (as I will provide examples for) if the local application has registered for an application protocol you can redirect to that URL. In any case, unfortunately this means that you must put a webpage on the Internet in order to get an accelerator or search provider to use a local application.
For examples of the app protocol case, I've created a callto accelerator that uses whatever application is registered for the callto scheme on your system, and a Windows Search search provider that opens Explorer's search with your search query. The callto accelerator navigates to my redirection page with 'callto:' followed by the selected text in the fragment and the redirection page redirects to that callto URL. In the Windows Search search provider case the same thing happens except the fragment contains 'search-ms:query=' followed by the selected text, which starts Windows Search on your system with the selected text as the query. I've looked into app protocols previously.
I've looked at my web server logs previously to see if anyone had used my Web Frotz Interpreter and until recently didn't realize that awstats (the web server log report generator) was truncating the query from my URL, so I couldn't tell that anyone was actually using it. But after grepping the logs manually I've pulled out the URLs of visitor's text adventure sessions. If you'll recall, my Web Frotz Interpreter stores the game state in the URL so its easy to see user's game states in the web server logs.
I've put some of the links up on the Web Frotz Interpreter page. Some of the interesting ones:
Windows allows for application protocols in which, through the registry, you specify a URL scheme and a command line to have that URL passed to your application. Its an easy way to hook a webbrowser up to your application. Anyone can read the doc above and then walk through the registry and pick out the application protocols but just from that info you can't tell what the application expects these URLs to look like. I did a bit of research on some of the application protocols I've seen which is listed below. Good places to look for information on URI schemes: Wikipedia URI scheme, and ESW Wiki UriSchemes.
Scheme | Name | Notes |
---|---|---|
search-ms | Windows Search Protocol |
The search-ms application protocol is a convention for querying the Windows Search index. The protocol enables applications, like Microsoft Windows Explorer, to query the index with
parameter-value arguments, including property arguments, previously saved searches, Advanced Query Syntax, Natural Query Syntax, and language code identifiers (LCIDs) for both the Indexer and
the query itself. See the MSDN docs for search-ms for more info. Example: search-ms:query=food |
Explorer.AssocProtocol.search-ms | ||
OneNote | OneNote Protocol |
From the OneNote help: /hyperlink "pagetarget" - Starts OneNote and opens the page specified by the pagetarget parameter. To obtain the hyperlink for any page in a OneNote
notebook, right-click its page tab and then click Copy Hyperlink to this Page.Example: onenote:///\\GUMMO\Users\davris\Documents\OneNote%20Notebooks\OneNote%202007%20Guide\Getting%20Started%20with%20OneNote.one#section-id={692F45F5-A42A-415B-8C0D-39A10E88A30F}&end |
callto | Callto Protocol |
ESW Wiki Info on callto Skype callto info NetMeeting callto info Example: callto://+12125551234 |
itpc | iTunes Podcast |
Tells iTunes to subscribe to an indicated podcast. iTunes documentation. C:\Program Files\iTunes\iTunes.exe /url "%1" Example: itpc:http://www.npr.org/rss/podcast.php?id=35 |
iTunes.AssocProtocol.itpc | ||
pcast | ||
iTunes.AssocProtocol.pcast | ||
Magnet | Magnet URI | Magnet URL scheme described by Wikipedia. Magnet URLs identify a resource by a hash of that resource so that when used in P2P scenarios no central authority is necessary to create URIs for a resource. |
mailto | Mail Protocol |
RFC 2368 - Mailto URL Scheme. Mailto Syntax Opens mail programs with new message with some parameters filled in, such as the to, from, subject, and body. Example: mailto:?to=david.risney@gmail.com&subject=test&body=Test of mailto syntax |
WindowsMail.Url.Mailto | ||
MMS | mms Protocol |
MSDN describes associated protocols. Wikipedia describes MMS. "C:\Program Files\Windows Media Player\wmplayer.exe" "%L" Also appears to be related to MMS cellphone messages: MMS IETF Draft. |
WMP11.AssocProtocol.MMS | ||
secondlife | [SecondLife] |
Opens SecondLife to the specified location, user, etc. SecondLife Wiki description of the URL scheme. "C:\Program Files\SecondLife\SecondLife.exe" -set SystemLanguage en-us -url "%1" Example: secondlife://ahern/128/128/128 |
skype | Skype Protocol |
Open Skype to call a user or phone number. Skype's documentation Wikipedia summary of skype URL scheme "C:\Program Files\Skype\Phone\Skype.exe" "/uri:%l" Example: skype:+14035551111?call |
skype-plugin | Skype Plugin Protocol Handler |
Something to do with adding plugins to skype? Maybe. "C:\Program Files\Skype\Plugin Manager\skypePM.exe" "/uri:%1" |
svn | SVN Protocol |
Opens TortoiseSVN to browse the repository URL specified in the URL. C:\Program Files\TortoiseSVN\bin\TortoiseProc.exe /command:repobrowser /path:"%1" |
svn+ssh | ||
tsvn | ||
webcal | Webcal Protocol |
Wikipedia describes webcal URL scheme. Webcal URL scheme description. A URL that starts with webcal:// points to an Internet location that contains a calendar in iCalendar format. "C:\Program Files\Windows Calendar\wincal.exe" /webcal "%1" Example: webcal://www.lightstalkers.org/LS.ics |
WindowsCalendar.UrlWebcal.1 | ||
zune | Zune Protocol |
Provides access to some Zune operations such as podcast subscription (via Zune Insider). "c:\Program Files\Zune\Zune.exe" -link:"%1" Example: zune://subscribe/?name=http://feeds.feedburner.com/wallstrip. |
feed | Outlook Add RSS Feed |
Identify a resource that is a feed such as Atom or RSS. Implemented by Outlook to add the indicated feed to Outlook. Feed URI scheme pre-draft document "C:\PROGRA~2\MICROS~1\Office12\OUTLOOK.EXE" /share "%1" |
im | IM Protocol |
RFC 3860 IM URI scheme description Like mailto but for instant messaging clients. Registered by Office Communicator but I was unable to get it to work as described in RFC 3860. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1" |
tel | Tel Protocol |
RFC 5341 - tel URI scheme IANA assignment RFC 3966 - tel URI scheme description Call phone numbers via the tel URI scheme. Implemented by Office Communicator. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1" |