I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:
registerProtocolHandler("info:lccn/{lccnID}", "htttp://www.librarything.com/search_works.php?q={lccnID}", "LibraryThing LCCN")
Working on Internet Explorer extensions in C++ & COM, I had to relearn or rediscover how to do several totally basic and important things. To save myself and possibly others trouble in the future, here's some pertinent links and tips.
First you must choose your IE extensibility point. Here's a very short list of the few I've used:
Once you've created your COM object that implements IObjectWithSite and whatever other interfaces your extensibility point requires as described in the above links you'll see your SetSite method get called by IE. You might want to know how to get the top level browser object from the IUnknown site object passed in via that method.
After that you may also want to listen for some events from the browser. To do this you'll need to:
If you want to check if an IHTMLElement is not visible on screen due how the page is scrolled, try comparing the Body or Document Element's client height and width, which appears to be the dimensions of the visible document area, to the element's bounding client rect which appears to be its position relative to the upper left corner of the visible document area. I've found this to be working for me so far, but I'm not positive that frames, iframes, zooming, editable document areas, etc won't mess this up.
Be sure to use pointers you get from the IWebBrowser/IHTMLDocument/etc. only on the thread on which you obtained the pointer or correctly marshal the pointers to other threads to avoid weird crashes and hangs.
Obtaining the HTML document of a subframe is slightly more complicated then you might hope. On the other hand this might be resolved by the new to IE8 method IHTMLFrameElement3::get_contentDocument
Check out Eric's IE blog post on IE extensibility which has some great links on this topic as well.
I've found while debugging networking in IE its often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8) but, I rarely see the BOM on UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that its a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
Code Points | 1st Byte | 2nd Byte | 3rd Byte | 4th Byte |
---|---|---|---|---|
U+0000..U+007F | 00..7F | |||
U+0080..U+07FF | C2..DF | 80..BF | ||
U+0800..U+0FFF | E0 | A0..BF | 80..BF | |
U+1000..U+CFFF | E1..EC | 80..BF | 80..BF | |
U+D000..U+D7FF | ED | 80..9F | 80..BF | |
U+E000..U+FFFF | EE..EF | 80..BF | 80..BF | |
U+10000..U+3FFFF | F0 | 90..BF | 80..BF | 80..BF |
U+40000..U+FFFFF | F1..F3 | 80..BF | 80..BF | 80..BF |
U+100000..U+10FFFF | F4 | 80..8F | 80..BF | 80..BF |
I've made an XSLT Meddler script in my continued XSLT adventures. Meddler is a simple and easy web server that runs whatever JScript.NET code you give it. I wrote a script that takes an indicated XSLT on the server, downloads an indicated XML from the Internet and returns the result of running that XML through the XSLT. This is useful when you want to work with something like the Zune software or IE7's feed platform which only reads feeds over the HTTP protocol. I'll give more interesting and specific examples of how this could be useful in the future.
Windows allows for application protocols in which, through the registry, you specify a URL scheme and a command line to have that URL passed to your application. Its an easy way to hook a webbrowser up to your application. Anyone can read the doc above and then walk through the registry and pick out the application protocols but just from that info you can't tell what the application expects these URLs to look like. I did a bit of research on some of the application protocols I've seen which is listed below. Good places to look for information on URI schemes: Wikipedia URI scheme, and ESW Wiki UriSchemes.
Scheme | Name | Notes |
---|---|---|
search-ms | Windows Search Protocol |
The search-ms application protocol is a convention for querying the Windows Search index. The protocol enables applications, like Microsoft Windows Explorer, to query the index with
parameter-value arguments, including property arguments, previously saved searches, Advanced Query Syntax, Natural Query Syntax, and language code identifiers (LCIDs) for both the Indexer and
the query itself. See the MSDN docs for search-ms for more info. Example: search-ms:query=food |
Explorer.AssocProtocol.search-ms | ||
OneNote | OneNote Protocol |
From the OneNote help: /hyperlink "pagetarget" - Starts OneNote and opens the page specified by the pagetarget parameter. To obtain the hyperlink for any page in a OneNote
notebook, right-click its page tab and then click Copy Hyperlink to this Page.Example: onenote:///\\GUMMO\Users\davris\Documents\OneNote%20Notebooks\OneNote%202007%20Guide\Getting%20Started%20with%20OneNote.one#section-id={692F45F5-A42A-415B-8C0D-39A10E88A30F}&end |
callto | Callto Protocol |
ESW Wiki Info on callto Skype callto info NetMeeting callto info Example: callto://+12125551234 |
itpc | iTunes Podcast |
Tells iTunes to subscribe to an indicated podcast. iTunes documentation. C:\Program Files\iTunes\iTunes.exe /url "%1" Example: itpc:http://www.npr.org/rss/podcast.php?id=35 |
iTunes.AssocProtocol.itpc | ||
pcast | ||
iTunes.AssocProtocol.pcast | ||
Magnet | Magnet URI | Magnet URL scheme described by Wikipedia. Magnet URLs identify a resource by a hash of that resource so that when used in P2P scenarios no central authority is necessary to create URIs for a resource. |
mailto | Mail Protocol |
RFC 2368 - Mailto URL Scheme. Mailto Syntax Opens mail programs with new message with some parameters filled in, such as the to, from, subject, and body. Example: mailto:?to=david.risney@gmail.com&subject=test&body=Test of mailto syntax |
WindowsMail.Url.Mailto | ||
MMS | mms Protocol |
MSDN describes associated protocols. Wikipedia describes MMS. "C:\Program Files\Windows Media Player\wmplayer.exe" "%L" Also appears to be related to MMS cellphone messages: MMS IETF Draft. |
WMP11.AssocProtocol.MMS | ||
secondlife | [SecondLife] |
Opens SecondLife to the specified location, user, etc. SecondLife Wiki description of the URL scheme. "C:\Program Files\SecondLife\SecondLife.exe" -set SystemLanguage en-us -url "%1" Example: secondlife://ahern/128/128/128 |
skype | Skype Protocol |
Open Skype to call a user or phone number. Skype's documentation Wikipedia summary of skype URL scheme "C:\Program Files\Skype\Phone\Skype.exe" "/uri:%l" Example: skype:+14035551111?call |
skype-plugin | Skype Plugin Protocol Handler |
Something to do with adding plugins to skype? Maybe. "C:\Program Files\Skype\Plugin Manager\skypePM.exe" "/uri:%1" |
svn | SVN Protocol |
Opens TortoiseSVN to browse the repository URL specified in the URL. C:\Program Files\TortoiseSVN\bin\TortoiseProc.exe /command:repobrowser /path:"%1" |
svn+ssh | ||
tsvn | ||
webcal | Webcal Protocol |
Wikipedia describes webcal URL scheme. Webcal URL scheme description. A URL that starts with webcal:// points to an Internet location that contains a calendar in iCalendar format. "C:\Program Files\Windows Calendar\wincal.exe" /webcal "%1" Example: webcal://www.lightstalkers.org/LS.ics |
WindowsCalendar.UrlWebcal.1 | ||
zune | Zune Protocol |
Provides access to some Zune operations such as podcast subscription (via Zune Insider). "c:\Program Files\Zune\Zune.exe" -link:"%1" Example: zune://subscribe/?name=http://feeds.feedburner.com/wallstrip. |
feed | Outlook Add RSS Feed |
Identify a resource that is a feed such as Atom or RSS. Implemented by Outlook to add the indicated feed to Outlook. Feed URI scheme pre-draft document "C:\PROGRA~2\MICROS~1\Office12\OUTLOOK.EXE" /share "%1" |
im | IM Protocol |
RFC 3860 IM URI scheme description Like mailto but for instant messaging clients. Registered by Office Communicator but I was unable to get it to work as described in RFC 3860. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1" |
tel | Tel Protocol |
RFC 5341 - tel URI scheme IANA assignment RFC 3966 - tel URI scheme description Call phone numbers via the tel URI scheme. Implemented by Office Communicator. "C:\Program Files (x86)\Microsoft Office Communicator\Communicator.exe" "%1" |
Sarah asked me if I knew of a syntax highlighter for the QuickBase formula language which she uses at work. I couldn't find one but thought it might be fun to make a QuickBase Formula syntax highlighter based on the QuickBase help's description of the formula syntax. Thankfully the language is relatively simple since my skills with ANTLR, the parser generator, are rusty now and I've only used it previously for personal projects (like Javaish, the ridiculous Java based shell idea I had).
With the help of some great ANTLR examples and an ANTLR cheat sheet I was able to come up with the grammar that parses the QuickBase Formula syntax and prints out the same formula marked up with HTML SPAN tags and various CSS classes. ANTLR produces the parser in Java which I wrapped up in an applet, put in a jar, and embedded in an HTML page. The script in that page runs user input through the applet's parser and sticks the output at the bottom of the page with appropriate CSS rules to highlight and print the formula in a pretty fashion.
What I learned:
I just upgraded to the Zune 3.0 software which includes games and purchasing music on the Zune via WiFi and once again I'm thrilled that the new firmware is available for old Zunes like mine. Rooting around looking at the new features I noticed Zune Badges for the first time. They're like Xbox Achievements, for example I have a Pixies Silver Artist Power Listener award for listening to the Pixies over 1000 times. I know its ridiculous but I like it, and now I want achievements for everything.
Achievements everywhere would require more developments in self-tracking. Self-trackers, folks who keep statistics on exactly when and what they eat, when and how much they exercise, anything one may track about one's self, were the topic of a Kevin Kelly Quantified Self blog post (also check out Cory Doctorow's SF short story The Things that Make Me Weak and Strange Get Engineered Away featuring a colony of self-trackers). For someone like me with a medium length attention span the data collection needs to be completely automatic or I will lose interest and stop collecting within a week. For instance, Nike iPod shoes that keep track of how many steps the wearer takes. I'll also need software to analyze, display, and share this data on a website like Mycrocosm. I don't want to have to spend extreme amounts of time to create something as wonderful as the Feltron Report (check out his statistic on how many daily measurements he takes for the report). Once we have the data we can give out achievements for everything!
Carnivore Eat at least ten different kinds of animals. |
|
Make Friends Meet at least 10% of the residents in your home town. |
|
Globetrotter Visit a city in every country. |
|
You're Old Survive at least 80 years of life. |
Of course none of the above is practical yet, but how about Delicious achievements based on the public Delicious feeds? That should be doable...
Internet Explorer 8 Beta 2 is now available! Some of the new features from this release that I really enjoy are Tab Grouping, the new address-bar, and InPrivate Subscriptions.
Tab Grouping groups tabs that are opened from the same page. For example, on a Google search results page if you open the first two links the two new tabs will be grouped with the Google search results page. If you close one of the tabs in that group focus goes to another tab in that group. Its small, but I really enjoy this feature and without knowing exactly what I wanted while using IE7 and FF2 I knew I wanted something like this. Plus the colors for the tab groups are pretty!
The new address bar and search box makes life much easier by searching through my browsing history for whatever I'm typing in. Other things are searched besides history but since I ignore favorites and use Delicious I mostly care about history. At any rate its one of the things that makes it impossible for me to go machines running IE7.
InPrivate Subscriptions allows you to subscribe to a feed of URLs from which IE should not download content. This is intended for avoiding sites that track you across websites and could sell or share your personal information, but this feature could be used for anything where the goal is to avoid a set of URLs. For example, phishing, malware sites, ad blocking, etc. etc. I think there's some interesting uses for this feature that we have yet to see.
Anyway, we're another release closer to the final IE8 and I can relax a little more.
As noted previously, my page consists of the aggregation of my various feeds and in working on that code recently it was again brought to my attention that everyone has different ways of representing tag metadata in feeds. I made up a list of how my various feed sources represent tags and list that data here so that it might help others in the future.
Source | Feed Type | Tag Markup Scheme | One Tag Per Element | Tag Scheme URI | Human / Machine Names | Example Markup |
---|---|---|---|---|---|---|
LiveJournal | Atom | atom:category | yes | no | no | , (source) |
LiveJournal | RSS 2.0 | rss2:category | yes | no | no |
technical (soure) |
WordPress | RSS 2.0 | rss2:category | yes | no | no |
, (source)
|
Delicious | RSS 1.0 | dc:subject | no | no | no |
photosynth photos 3d tool (source) |
Delicious | RSS 2.0 | rss2:category | yes | yes | no |
domain="http://delicious.com/SequelGuy/"> (source) |
Flickr | Atom | atom:category | yes | yes | no |
term="seattle" (source) |
Flickr | RSS 2.0 | media:category | no | yes | no |
scheme="urn:flickr:tags"> (source) |
YouTube | RSS 2.0 | media:category | no | no | no |
label="Tags"> (source) |
LibraryThing | RSS 2.0 | No explicit tag metadata. | no | no | no | n/a, (source) |
Tag Markup Scheme | Notes | Example |
---|---|---|
Atom Category atom:category xmlns:atom="http://www.w3.org/2005/Atom"
|
|
term="catName"
|
RSS 2.0 category rss2:category empty namespace |
|
domain="tag:deletethis.net,2008:tagscheme">
|
Yahoo Media RSS Module category media:category xmlns:media="http://search.yahoo.com/mrss/"
|
|
scheme="http://dmoz.org"
|
Dublin Core subject dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/"
|
|
humor
|
Update 2009-9-14: Added WordPress to the Tag Markup table and namespaces to the Tag Markup Scheme table.
In my Intro to Algorithms course in college the Fibonacci sequence was used as the example algorithm to which various types of algorithm creation methods were applied. As the course went on we made
better and better performing algorithms to find the nth Fibonacci number. In another course we were told about a matrix that when multiplied successively produced Fibonacci numbers. In my linear
algebra courses I realized I could diagonalize the matrix to find a non-recursive Fibonacci function. To my surprise this worked and I
found a function.
Looking online I found that of course this same function was already well known. Mostly I was irritated that after all the
algorithms we created for faster and faster Fibonacci functions we were never told about a constant time function like this.
I recently found my paper depicting this and thought it would be a good thing to use to try out MathML, a markup language for displaying math. I went to the MathML implementations page and installed a plugin for IE to display MathML and then began writing up my paper in MathML. I wrote the MathML by hand and must say that's not how its intended to be created. The language is very verbose and it took me a long time to get the page of equations transcribed.
MathML has presentation elements and content elements that can be used separately or together. I stuck to content elements and while it looked great in IE with my extension when I tried it in FireFox which has builtin MathML support it didn't render. As it turns out FireFox doesn't support MathML content elements. I had already finished creating this page by hand and wasn't about to switch to content elements. Also, in order to get IE to render a MathML document, the document needs directives at the top for specific IE extensions which is a pain. Thankfully, the W3C has a MathML cross platform stylesheet. You just include this XSL at the top of your XHTML page and it turns content elements into appropriate presentation elements, and inserts all the known IE extension goo required for you. So now my page can look lovely and all the ickiness to get it to render is contained in the W3C's XSL.