2009 Apr 7, 9:02
I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as
handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:
- Better integration of web apps with your system.
- Its easy for web apps to do.
- Links to URNs can now take the user to the sites the user prefers for the sort of thing identified by the URN. For example, if I have a physical address in HTML, instead of making that an http
link to Yahoo Maps, I can make the link a geo scheme URI and those who follow the link will get their preferred mapping site that
has registered for that scheme. Actually, looking at the geo scheme's RFC, maybe I'd rather use some other URN scheme to represent the physical location, but you get the point.
However, the way its currently spec'ed out I don't like the following:
- There's no way to know if you are the handler for a particular URL scheme which is an important question for web app URL protocol handler authors.
- There's no way to fallback to an http URL in the case that a particular URL scheme isn't registered. A suggested solution to testing the registration of a scheme is for browsers to provide an additional script method
to check if a scheme is registered. I don't like the idea of writing script that walks over all my page's links and rewrites them based on that method. I'd much rather see a declarative and
backwards compatible fallback mechanism, although I don't know what that would look like.
- There's no way to register for a namespace within the urn scheme URI, the info scheme URI, or the tag scheme URI. I want to register
info:lccn/... (Library of Congress Card Number identifiers) to LibraryThing or Amazon and I want to register urn:duri:... (dated URIs) to the Web Archive, among other things.
- Will this result in a proliferation of unregistered URL schemes with clashing namespaces? The ESW Wiki notes why this would be bad.
- And last, although this is nitpickier than the rest, I don't like the '%s' syntax used in the registration method. I'd much rather pass in an URL template, like the URL template used
in OpenSearch. If an URL template is used for matching rather than registering against a particular URL scheme, this could also allow for registering a namespace within a URN. For example
something along the lines of:
registerProtocolHandler("info:lccn/{lccnID}", "htttp://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
url template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn 2009 Mar 23, 12:58Details on a particular browser exploit and how its been resolved in IE8. "One approach they presented allowed attackers to use .NET framework DLL's to allocate executable pages of memory at
predictable locations within the iexplore.exe process. They were then able to demonstrate how .NET behavior could be combined with a separate exploitable memory corruption vulnerability to run
arbitrary code."
security ie8 ie browser hack via:ericlaw 2009 Mar 23, 11:06The HTML5 spec tells us how it is in the real world for URLs: "This specification defines various algorithms for dealing with Web addresses intended for use by HTML user agents. For historical
reaons, in order to be compatible with existing Web content HTML user agents need to implement a number of processes not defined by the URI and IRI specifications [RFC3986], [RFC3987]."
html html5 url uri reference w3c 2009 Mar 20, 5:03"This package contains header files and libraries to help you develop Windows applications that use Windows Internet Explorer."
ie8 ie msdn microsoft development C++ com visual-studio windows 2009 Mar 20, 4:51
Working on Internet Explorer extensions in C++ & COM, I had to relearn or rediscover how to do several totally basic and important things. To save myself and possibly others trouble in the
future, here's some pertinent links and tips.
First you must choose your IE extensibility point. Here's a very short list of the few I've used:
Once you've created your COM object that implements IObjectWithSite and whatever other interfaces your extensibility point requires as described in the above links you'll see your SetSite method
get called by IE. You might want to know how to get the top level browser object from the IUnknown site object passed in via that method.
After that you may also want to listen for some events from the browser. To do this you'll need to:
- Implement the dispinterface that has the event you want. For instance DWebBrowserEvents2, or HTMLDocumentEvents, or HTMLWindowEvents2. You'll have
to search around in that area of the documentation to find the event you're looking for.
- Register for events using AtlAdvise. The object you need to subscribe to depends on the events you want. For example, DWebBrowserEvents2 come from the webbrowser object, HTMLDocumentEvents come
from the document object assuming its an HTML document (I obtained via get_Document method on the webbrowser), and
HTMLWindowEvents2 come from the window object (which oddly I obtained via calling the get_script method on the document object).
Note that depending on when your SetSite method is called the document may not exist yet. For my extension I signed up for browser events immediately and then listened for events like NavigateComplete before signing up for document and window events.
- Implement IDispatch. The Invoke method will get called with event notifications from the dispinterfaces you sign up for in AtlAdvise. Implementing Invoke manually is a slight pain as all the
parameters come in as VARIANTs and are in reverse order. There's some ATL macros that may make this easier but I didn't bother.
- Call AtlUnadvise at some point -- at the latest when SetSite is called again and your site object changes.
If you want to check if an IHTMLElement is not visible on screen due how the page is scrolled, try comparing the Body or
Document Element's client height and width,
which appears to be the dimensions of the visible document area, to the element's bounding client rect which appears to be
its position relative to the upper left corner of the visible document area. I've found this to be working for me so far, but I'm not positive that frames, iframes, zooming, editable document
areas, etc won't mess this up.
Be sure to use pointers you get from the IWebBrowser/IHTMLDocument/etc. only on the thread on which you obtained the pointer or correctly marshal the pointers to other threads to avoid weird crashes and hangs.
Obtaining the HTML document of a subframe is slightly more complicated then you might hope. On the other hand this might
be resolved by the new to IE8 method IHTMLFrameElement3::get_contentDocument
Check out Eric's IE blog post on IE extensibility which has some great links on this topic as well.
technical boring internet explorer com c++ ihtmlelement extension 2009 Mar 16, 4:22"This data set, contributed by Google Inc., contains English word n-grams and their observed frequency counts. The length of the n-grams ranges from unigrams (single words) to five-grams. We expect
this data will be useful for statistical language modeling, e.g., for machine translation or speech recognition, as well as for other uses." 6 DVDs for only $150 with licensing restri... ok nm.
language google statistics database text 2009 Mar 6, 1:21"BE IT FURTHER RESOLVED, That the Washington State Senate honor Jerry Holkins and Mike Krahulik for their hard work and dedication to improving the lives of hospitalized children worldwide through
their creation and continued work with Child's Play Charity"
comic charity videogames penny-arcade goverment washington senate 2009 Mar 6, 5:16
I've found while debugging networking in IE its often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8) but, I rarely see the BOM on
UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike
other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that its a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this
is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
-
F:
-
This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
-
E:
-
This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
-
C..D:
-
This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
-
8..B:
-
This is a trail byte.
-
0..7:
-
This is a single byte encoded codepoint.
The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. But it won't produce false negatives. The following is table
from the
Unicode spec. that actually describes well-formed UTF-8.
Code Points
|
1st Byte
|
2nd Byte
|
3rd Byte
|
4th Byte
|
U+0000..U+007F
|
00..7F
|
U+0080..U+07FF
|
C2..DF
|
80..BF
|
U+0800..U+0FFF
|
E0
|
A0..BF
|
80..BF
|
U+1000..U+CFFF
|
E1..EC
|
80..BF
|
80..BF
|
U+D000..U+D7FF
|
ED
|
80..9F
|
80..BF
|
U+E000..U+FFFF
|
EE..EF
|
80..BF
|
80..BF
|
U+10000..U+3FFFF
|
F0
|
90..BF
|
80..BF
|
80..BF
|
U+40000..U+FFFFF
|
F1..F3
|
80..BF
|
80..BF
|
80..BF
|
U+100000..U+10FFFF
|
F4
|
80..8F
|
80..BF
|
80..BF
|
test technical unicode boring charset utf8 encoding 2009 Feb 14, 5:41"Now, you can simply add this link tag to specify your preferred version... and Google will understand that the duplicates all refer to the canonical URL:
http://www.example.com/product.php?item=swedish-fish. Additional URL properties, like PageRank and related signals, are transferred as well."
via:mattb google link html url uri canonical canonicalization web 2009 Feb 7, 10:39
On my laptop at work I often get mail with attached files the application for which I only have installed on my main computer. Tired of having to save the file on the laptop and then find it on the
network via my other computer, I wrote remoteopen two nights ago. With this I open the file on my laptop and remoteopen sends it to be opened on
my main computer. Overkill for this issue but it felt good to write a quick tool that solves my problem.
technical boring remoteopen tool 2009 Feb 4, 4:16From Sorting it all Out wrt the weather gadget in Vista's sidebar, this link to China's laws on weather forecast: "Article 22 The State applies a unified system for the issue of public meteorological
forecast and severe weather warning... No other organizations or individuals may issue to the community such forecast or warning." "Article 25 When the media, including radio, television, newspaper
and telecommunication, issue to the community public meteorological forecast or severe weather warning, they shall use the latest meteorological information provided by a meteorological office...
Part of the revenues from the distribution of meteorological information shall be drawn to support the development of meteorological service." Whether an application is legally allowed to provide a
weather forecast is not an attribute I would have imagined necessary for a localization API.
via:michael-kaplan china law legal politics weather forecast localization 2009 Jan 27, 10:41I just noticed that Google's Feeling Lucky doesn't work if your query contains a 'site:...' entry unless the HTTP request has a referer header pointing to Google. This person noticed too and wrote a
Google App that acts like Feeling Lucky without this restriction. "It appears that Google has some secret threshold to decide when to get in the way of your destination like an angry ceiling cat
catapulting itself onto your face."
google im-feeling-lucky search http referer http-header app 2009 Jan 25, 5:39
Microsoft isn't completely shielded from our economies issues but I still have a job and
still get free soda. While that's all still the case, I decided to test Sarah's claimed ability to differentiate between Pepsi, Coke, and their diet counterparts by taste alone. I poured the four
sodas into marked cups and Sarah and I each took two runs through the cups with the following guesses.
Soda Identification Challenge Results
Drink
|
Sarah
|
Dave
|
Guess 1
|
Guess 2
|
Guess 1
|
Guess 2
|
Coke
|
Coke
|
Coke
|
Pepsi
|
Diet Pepsi
|
Diet Coke
|
Diet Coke
|
Diet Pepsi
|
Diet Coke
|
Diet Coke
|
Pepsi
|
Pepsi
|
Pepsi
|
Coke
|
Coke
|
Diet Pepsi
|
Diet Pepsi
|
Diet Coke
|
Diet Pepsi
|
Pepsi
|
Total (out of 8)
|
6
|
3
|
As you can see from the results, Sarah's claimed ability to identify Coke and Pepsi by taste is confirmed. The first run through she got completely correct and on the second run only mistook Diet
Pepsi for Diet Coke. Her excuse for the error on the second run was a tainted palate from the first run. I on the other hand was mostly incorrect. Surprisingly though my incorrect answers were
mostly consistent between run one and two. For instance I thought Pepsi was Coke in both runs.
coke microsoft waste of soda pepsi waste of time soda 2009 Jan 22, 9:57'Dubbed the "Light Car," the concept car is designed to communicate with other drivers using a system of OLED lights, which rest on the some part of the body of the car including the front and rear
light, in the form of an OLED display panel.'
car led communication sign signal 2009 Jan 22, 9:48"Revocation presents another challenge. If a system relies only on a biometric for both identity and authentication, how do you revoke that factor? Forgotten passwords can be changed; lost smartcards
can be revoked and replaced. How do you revoke a finger?"
article microsoft security identity authentication biometrics 2009 Jan 20, 2:20"Because the G1 has a compass inside, nru presents its data as a sonar-like spinning map when held parallel to the ground, but presents a snazzy augmented reality overlay when tipped up towards the
horizon. It's easier to grok when you can see it in motion; there's a video up above."
g1 phone cellphone compass geolocation video android 2009 Jan 15, 10:28Thanks to Matt, for the first time I can see myself using Twitter. Twitter app on my phone notifies me when something's posted so my build process can let me know when its done, or when sync finally
finishes, etc. I'd been meaning to setup a mini-notification system with a command line tool to my phone (w/o paying per text msg) but I didn't think of Twitter.
via:swannman api internet curl cli twitter 2009 Jan 15, 8:00Cool application that turns your sketches into graphs. I wonder if this can ever come to my phone? "Sketch a rough shape with your finger and Instaviz transforms what you drew in a split second.
Sketch a link between two shapes and Instaviz quickly redraws the graph with the best layout."
graph visualization iphone graphviz phone software application development 2009 Jan 13, 6:20"So we've progressed now from having just a Registry key entry, to having an executable, to having a randomly-named executable, to having an executable which is shuffled around a little bit on each
machine, to one that's encrypted - really more just obfuscated - to an executable that doesn't even run as an executable. It runs merely as a series of threads."
security privacy adware malware advertising ie browser scheme interview bho via:li 2009 Jan 8, 5:45"It is a newly opened high-security data center run by one of Sweden's largest ISPs, located in an old nuclear bunker deep below the bedrock of Stockholm city... The bunker was designed to be able to
withstand a near hit by a hydrogen bomb." Wait, you mean it can't take a direct hit? Lame.
sweden photos design datacenter underground bomb technology