2009 Apr 7, 1:30I really dislike how IE deals with non-US-ASCII in URLs. I should write up a post on what exactly IE does with non-US-ASCII characters in URLs. "Just like IRIs the URL is mapped to a URI using UTF-8.
Except for the query component of the URL (the bit after the question mark). Here for legacy reasons the encoding of the document is used instead. Except if the encoding of the document is UTF-16, in
which case UTF-8 is used. Effectively, using non-ASCII characters in URLs in documents not encoded as UTF-8 or UTF-16 will give you surprising results, to say the least. Yay for browsers!"
http encoding html5 url uri unicode iri 2009 Apr 7, 12:45"First, you started up the application and picked a video clip. The video clip just sat there. As you started clapping, the video clip started playing. If you clapped at about 80 beats per minute,
the video clip played at its normal speed. If you clapped faster, the video clip ran faster. If you clapped slower, the video clip ran slower. If you stopped clapping, the video clip stopped. It was
freaky cool. Totally useless, but freaky cool."
humor clap video raymond-chen filter-graph reference-clock 2009 Apr 7, 12:12HTML5's registerProtocolHandler seems to come from a cool FireFox 3 feature: "With web protocol handlers, the web application can register the specific protocol it wants to handle. Firefox will then
prompt the user to choose which of the registered applications (web or desktop) it should use to handle the action. Any protocol, real or imaginary, can be used - mailto: is only one example,
webcal:, tel: and fax: are others."
firefox uri scheme protocol mozilla html5 registerProtocolHandler 2009 Apr 7, 11:58
This past week I finished Anathem and despite the intimidating physical size of the book (difficult to take and read on the bus) I became very engrossed and was able to finish it in several orders of
magnitude less time than
what I spent on the Baroque
Cycle. Whereas reading the Baroque Cycle you can imagine Neal Stephenson sifting through giant economic tomes (or at least that's where my mind went whenever the characters began to explain
macro-economics to one another), in Anathem you can see Neal Stephenson staying up late
pouring over philosophy of mathematics. When not
exploring philosophy, Anathem has an appropriate amount of humor, love interests, nuclear bombs, etc. as you might hope from reading Snow Crash or Diamond Age. I thoroughly enjoyed Anathem.
On the topic of made up words: I get made up words for made up things, but there's already a name for cell-phone in English: its "cell-phone". The narrator notes that the book has been translated
into English so I guess I'll blame the fictional translator. Anyway, I wasn't bothered by the made up words nearly as much as some folk. Its a good thing I'm long
out of college because I can easily imagine confusing the names of actual concepts and people with those from the book, like Hemn space for Hamming distance. Towards the beginning, the description
of slines and the post-post-apocalyptic setting reminded me briefly of Idiocracy.
Recently, I've been reading everything of Charles Stross that I can, including about a month ago, The Jennifer Morgue from the surprisingly awesome amalgamation genre of spy thriller and Lovecraft
horror. Its the second in a series set in a universe in which magic exists as a form of mathematics and follows Bob Howard programmer/hacker, cube dweller, and begrudging spy who works for a
government agency tasked to suppress this knowledge and protect the world from its use. For a taste, try a short story from the series that's freely available on Tor's website, Down on the Farm.
Coincidentally, both Anathem and the Bob Howard series take an interest in the world of Platonic ideals. In the case of Anathem (without spoiling anything) the universe of Platonic ideals, under a
different name of course, is debated by the characters to be either just a concept or an actual separate universe and later becomes the underpinning of major events in the book. In the Bob Howard
series, magic is applied mathematics that through particular proofs or computations awakens/disturbs/provokes unnamed horrors in the universe of Platonic ideals to produce some desired effect in
Bob's universe.
atrocity archives neal stephenson jennifer morgue plato bob howard anathem 2009 Apr 7, 9:02
I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as
handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:
- Better integration of web apps with your system.
- Its easy for web apps to do.
- Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http
link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that
has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.
However, the way its currently spec'ed out I don't like the following:
- There's no way to know if you are the handler for a particular URL scheme which is an important question for web app URL protocol handler authors.
- There's no way to fallback to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method
to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and
backwards compatible fallback mechanism, although I don't know what that would look like.
- There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register
info:lccn/... (Library of Congress Card Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.
- Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
- And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in an URL template, like the URL template used
in OpenSearch. If an URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example
something along the lines of:
registerProtocolHandler("info:lccn/{lccnID}", "htttp://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn 2009 Apr 1, 10:42Lol at actual Facebook app that does IPv6 over Facebook. "...most network users are not aware of what IPv6 is or are even afraid by IPv6 because it is unknown. On the other hand, Social Networks
(like Facebook, LinkedIn, etc.) are well-known by users and the usage of those networks is huge... With IPv6 over Social Network (IPoSN): * Every user is a router with at least one loopback
interface; * Every friend or connection between users will be used as a point-to-point link... A working prototype has been developed by the author and is freely available: IPv6 over Facebook Social
Network [IPv6overFacebook]."
humor social network ipv6 ip iposn facebook ietf rfc 2009 Apr 1, 9:54Lol at end of final sentence: "Two Gmail accounts can happily converse with each other for up to three messages each. Beyond that, our experiments have shown a significant decline in the quality
ranking of Autopilot's responses and further messages may commit you to dinner parties or baby namings in which you have no interest."
humor google gmail ai email 2009 Mar 25, 10:09Bookmarklet to apply the Yahoo media player to any page that has mp3 links. "Take the player with you. Run my bookmarklet that will simply insert the required javascript into the page."
bookmarklet mp3 javascript music yahoo 2009 Mar 23, 12:58Details on a particular browser exploit and how its been resolved in IE8. "One approach they presented allowed attackers to use .NET framework DLL's to allocate executable pages of memory at
predictable locations within the iexplore.exe process. They were then able to demonstrate how .NET behavior could be combined with a separate exploitable memory corruption vulnerability to run
arbitrary code."
security ie8 ie browser hack via:ericlaw 2009 Mar 23, 8:13
I've made another extension for IE8,
Outline View, which gives you a side bar in IE that displays an outline of the current page and lets you make intrapage bookmarks.
The outline is generated based on the heading tags in the document (e.g. h1, h2, etc), kind of like what W3C's Semantic data extractor
tool displays for an outline. So if the page doesn't use heading tags the way the HTML spec intended or just sticks img tags in them, then the outline doesn't look so hot. On a page that does
use headings as intended though it looks really good. For instance a section from the HTML 4 spec shows up quite nicely and I find its
actually useful to be able to jump around to the different sections. Actually, I've been surprised going to various blogs how well the outline view is actually working -- I thought a lot more
webdevs would be abusing their heading tags.
I've also added intrapage bookmarks. When you make a text selection and clear it, that selected text is added as a temporary intrapage bookmark which shows up in the correct place in the outline.
You can navigate to the bookmark or right click to make it permanent. Right now I'm storing the permanent intrapage bookmarks in IE8's new per-domain DOM storage because I wanted to avoid writing
code to synchronize a cross process store of bookmarks, it allowed me to play with the DOM storage a bit, and the bookmarks will get cleared appropriately when the user clears their history via the
control panel.
technical intrapage bookmark boring html ie8 ie extension 2009 Mar 20, 5:03"This package contains header files and libraries to help you develop Windows applications that use Windows Internet Explorer."
ie8 ie msdn microsoft development C++ com visual-studio windows 2009 Mar 20, 6:18
IE8, the software I've been working on for some time now, has finally been released at MIX09.
As I mentioned previously, I worked on
accelerators (previously named
Activities) in IE8. Looking at the
kinds of things I blog about on the IE Blog, you might also
correctly guess that I work on the networking stack. Ask me about what else I worked on during IE8 development. The past few months were very busy for me and I'm happy this is finally out.
technical internet explorer ie8 2009 Mar 20, 4:51
Working on Internet Explorer extensions in C++ & COM, I had to relearn or rediscover how to do several totally basic and important things. To save myself and possibly others trouble in the
future, here's some pertinent links and tips.
First you must choose your IE extensibility point. Here's a very short list of the few I've used:
Once you've created your COM object that implements IObjectWithSite and whatever other interfaces your extensibility point requires as described in the above links you'll see your SetSite method
get called by IE. You might want to know how to get the top level browser object from the IUnknown site object passed in via that method.
After that you may also want to listen for some events from the browser. To do this you'll need to:
- Implement the dispinterface that has the event you want. For instance DWebBrowserEvents2, or HTMLDocumentEvents, or HTMLWindowEvents2. You'll have
to search around in that area of the documentation to find the event you're looking for.
- Register for events using AtlAdvise. The object you need to subscribe to depends on the events you want. For example, DWebBrowserEvents2 come from the webbrowser object, HTMLDocumentEvents come
from the document object assuming its an HTML document (I obtained via get_Document method on the webbrowser), and
HTMLWindowEvents2 come from the window object (which oddly I obtained via calling the get_script method on the document object).
Note that depending on when your SetSite method is called the document may not exist yet. For my extension I signed up for browser events immediately and then listened for events like NavigateComplete before signing up for document and window events.
- Implement IDispatch. The Invoke method will get called with event notifications from the dispinterfaces you sign up for in AtlAdvise. Implementing Invoke manually is a slight pain as all the
parameters come in as VARIANTs and are in reverse order. There's some ATL macros that may make this easier but I didn't bother.
- Call AtlUnadvise at some point -- at the latest when SetSite is called again and your site object changes.
If you want to check if an IHTMLElement is not visible on screen due how the page is scrolled, try comparing the Body or
Document Element's client height and width,
which appears to be the dimensions of the visible document area, to the element's bounding client rect which appears to be
its position relative to the upper left corner of the visible document area. I've found this to be working for me so far, but I'm not positive that frames, iframes, zooming, editable document
areas, etc won't mess this up.
Be sure to use pointers you get from the IWebBrowser/IHTMLDocument/etc. only on the thread on which you obtained the pointer or correctly marshal the pointers to other threads to avoid weird crashes and hangs.
Obtaining the HTML document of a subframe is slightly more complicated then you might hope. On the other hand this might
be resolved by the new to IE8 method IHTMLFrameElement3::get_contentDocument
Check out Eric's IE blog post on IE extensibility which has some great links on this topic as well.
technical boring internet explorer com c++ ihtmlelement extension 2009 Mar 10, 11:26I've seen Yahoo's media player javascript widget around but until I read the dev. instructions I didn't appreciate it. You just include their js file and it finds all your links to mp3s (finer
grained and more explicit control available too), adds them to its playlist, and sticks a simple play/pause button on each link.
mp3 music ajax design yahoo javascript 2009 Mar 6, 5:16
I've found while debugging networking in IE its often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8) but, I rarely see the BOM on
UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike
other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that its a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this
is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
-
F:
-
This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
-
E:
-
This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
-
C..D:
-
This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
-
8..B:
-
This is a trail byte.
-
0..7:
-
This is a single byte encoded codepoint.
The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. But it won't produce false negatives. The following is table
from the
Unicode spec. that actually describes well-formed UTF-8.
Code Points
|
1st Byte
|
2nd Byte
|
3rd Byte
|
4th Byte
|
U+0000..U+007F
|
00..7F
|
U+0080..U+07FF
|
C2..DF
|
80..BF
|
U+0800..U+0FFF
|
E0
|
A0..BF
|
80..BF
|
U+1000..U+CFFF
|
E1..EC
|
80..BF
|
80..BF
|
U+D000..U+D7FF
|
ED
|
80..9F
|
80..BF
|
U+E000..U+FFFF
|
EE..EF
|
80..BF
|
80..BF
|
U+10000..U+3FFFF
|
F0
|
90..BF
|
80..BF
|
80..BF
|
U+40000..U+FFFFF
|
F1..F3
|
80..BF
|
80..BF
|
80..BF
|
U+100000..U+10FFFF
|
F4
|
80..8F
|
80..BF
|
80..BF
|
test technical unicode boring charset utf8 encoding 2009 Feb 28, 11:34"It's completely nuts... It's a book about what if the Rapture actually happened, and that's all I'm gonna tell you." -Junot Diaz, 2008 Pulitzer Prize Winner for Fiction
creativecommons comic literature religion magic download pdf 2009 Feb 27, 11:00Raymond Chen has a years worth of blog content written and scheduled! "To give you an idea of how far in advance I write my blog entries, I wrote this particular entry on February 13, 2008. ... this
particular entry ended up on February 27, 2009 because that was the next available open day. ... Now, with a buffer of over a year, I do have quite a bit of leeway in choosing when any particular
article is published." Humorous commentor John writes in response: "If you were to disappear off the face of the Earth, how long would it be before we knew?"
blog raymond-chen writing humor 2009 Feb 11, 10:05"With the iPhone version of WhatTheFont you can use the phone's built-in camera to photograph the text in question (or choose an existing image from your photo albums)... After confirming which
characters are used in the image, the app provides a list of possible matching fonts."
font iphone camera typography 2009 Feb 7, 10:39
On my laptop at work I often get mail with attached files the application for which I only have installed on my main computer. Tired of having to save the file on the laptop and then find it on the
network via my other computer, I wrote remoteopen two nights ago. With this I open the file on my laptop and remoteopen sends it to be opened on
my main computer. Overkill for this issue but it felt good to write a quick tool that solves my problem.
technical boring remoteopen tool 2009 Feb 5, 8:47Copy of the Netscape Navigator document (the original's long gone) describing the Proxy Auto-Config (PAC) file format and mime-type. Its a javascript file with at least one well known function that,
given a host, returns a string describing which methods are appropriate for a web browser to connect to that host.
javascript pac proxy http reference netscape navigator