name page 6 - Dave's Blog

Search
My timeline on Mastodon

Atlas of True Names

2009 Nov 23, 2:20"The Atlas of True Names reveals the etymological roots, or original meanings, of the familiar terms on today's maps of the World, Europe, the British Isles and the United States. For instance, where you would normally expect to see the Sahara indicated, the Atlas gives you "The Tawny One", derived from Arab. es-sahra “the fawn coloured ,desert”."PermalinkCommentshumor reference map etymology translation atlas geography

Why do we have an IMG element? [dive into mark]

2009 Nov 3, 1:33'A few hours after that, Tim Berners-Lee responded: I had imagined that figues would be reprented as <a name=fig1 href="fghjkdfghj" REL="EMBED, PRESENT">Figure </a>'. Ohhhh, that would have been better.PermalinkCommentshtml history mark-pilgrim browser web images technical

You know the name, but just who were the Luddites? - Ars Technica

2009 Oct 5, 8:44Brief history of the Luddites. "Are we all Luddites now? ... If you are reading this essay on your laptop or iPhone, chances are that you aren't an unemployed weaver staring starvation in the face." Also: "The Luddites didn't oppose technology; they opposed the sudden collapse of their industry, which they blamed in part on new weaving machines." So the TV and newspaper associations and Rupert Murdoch are Luddites.PermalinkCommentshistory technology luddite

Creating Accelerators for Other People's Web Services

2009 Aug 18, 4:19

Before we shipped IE8 there were no Accelerators, so we had some fun making our own for our favorite web services. I've got a small set of tips for creating Accelerators for other people's web services. I was planning on writing this up as an IE blog post, but Jon wrote a post covering a similar area so rather than write a full and coherent blog post I'll just list a few points:

PermalinkCommentstechnical accelerator ie8 ie

PowerShell Scanning Script

2009 Jun 27, 3:42

I've hooked up the printer/scanner to the Media Center PC since I leave that on all the time anyway so we can have a networked printer. I wanted to hook up the scanner in a somewhat similar fashion but I didn't want to install HP's software (other than the drivers of course). So I've written my own script for scanning in PowerShell that does the following:

  1. Scans using the Windows Image Acquisition APIs via COM
  2. Runs OCR on the image using Microsoft Office Document Imaging via COM (which may already be on your PC if you have Office installed)
  3. Converts the image to JPEG using .NET Image APIs
  4. Stores the OCR text into the EXIF comment field using .NET Image APIs (which means Windows Search can index the image by the text in the image)
  5. Moves the image to the public share

Here's the actual code from my scan.ps1 file:

param([Switch] $ShowProgress, [switch] $OpenCompletedResult)

$filePathTemplate = "C:\users\public\pictures\scanned\scan {0} {1}.{2}";
$time = get-date -uformat "%Y-%m-%d";

[void]([reflection.assembly]::loadfile( "C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Drawing.dll"))

$deviceManager = new-object -ComObject WIA.DeviceManager
$device = $deviceManager.DeviceInfos.Item(1).Connect();

foreach ($item in $device.Items) {
        $fileIdx = 0;
        while (test-path ($filePathTemplate -f $time,$fileIdx,"*")) {
                [void](++$fileIdx);
        }

        if ($ShowProgress) { "Scanning..." }

        $image = $item.Transfer();
        $fileName = ($filePathTemplate -f $time,$fileIdx,$image.FileExtension);
        $image.SaveFile($fileName);
        clear-variable image

        if ($ShowProgress) { "Running OCR..." }

        $modiDocument = new-object -comobject modi.document;
        $modiDocument.Create($fileName);
        $modiDocument.OCR();
        if ($modiDocument.Images.Count -gt 0) {
                $ocrText = $modiDocument.Images.Item(0).Layout.Text.ToString().Trim();
                $modiDocument.Close();
                clear-variable modiDocument

                if (!($ocrText.Equals(""))) {
                        $fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $fileName
                        if (!($fileName.EndsWith(".jpg") -or $fileName.EndsWith(".jpeg"))) {
                                if ($ShowProgress) { "Converting to JPEG..." }

                                $newFileName = ($filePathTemplate -f $time,$fileIdx,"jpg");
                                $fileAsImage.Save($newFileName, [System.Drawing.Imaging.ImageFormat]::Jpeg);
                                $fileAsImage.Dispose();
                                del $fileName;

                                $fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $newFileName 
                                $fileName = $newFileName
                        }

                        if ($ShowProgress) { "Saving OCR Text..." }

                        $property = $fileAsImage.PropertyItems[0];
                        $property.Id = 40092;
                        $property.Type = 1;
                        $property.Value = [system.text.encoding]::Unicode.GetBytes($ocrText);
                        $property.Len = $property.Value.Count;
                        $fileAsImage.SetPropertyItem($property);
                        $fileAsImage.Save(($fileName + ".new"));
                        $fileAsImage.Dispose();
                        del $fileName;
                        ren ($fileName + ".new") $fileName
                }
        }
        else {
                $modiDocument.Close();
                clear-variable modiDocument
        }

        if ($ShowProgress) { "Done." }

        if ($OpenCompletedResult) {
                . $fileName;
        }
        else {
                $result = dir $fileName;
                $result | add-member -membertype noteproperty -name OCRText -value $ocrText
                $result
        }
}

I ran into a few issues:

PermalinkCommentstechnical scanner ocr .net modi powershell office wia

Bits Up!: DNS Prefetching for Firefox

2009 Jun 22, 3:28Details on Firefox's DNS prefetching: "The Firefox implementation takes this approach one step further than just pre-resolving anchor href hostnames. It uses the prefetch logic on URLs that are being included in the current document. By this I mean that it uses the prefetch logic on things like images, css, and jscript that are being loaded right away, in addition to anchor links which might be clicked on at a slightly later time."PermalinkCommentsdns dns-prefetching html performance networking firefox mozilla technical

Chromium Blog: DNS Prefetching (or Pre-Resolving)

2009 Jun 22, 2:55"To speed up browsing, Google Chrome resolves domain names before the user navigates, typically while the user is viewing a web page." In addition to noting what and how they do it, and how web devs can control it, they give a few stats on how much it helps.PermalinkCommentsgoogle dns chrome dns-prefetching browser networking performance technical

Controlling DNS prefetching - MDC

2009 Jun 22, 2:53"Firefox 3.5 performs DNS prefetching. This is a feature by which Firefox proactively performs domain name resolution on both links that the user may choose to follow as well as URLs for items referenced by the document, including images, CSS, JavaScript, and so forth."PermalinkCommentsdns firefox mozilla networking performance dns-prefetching technical

A Brief History of Microsoft's Live Search's New Domain Bing

2009 Jun 1, 11:07
Logo for bing! from 2003 via The Wayback MachineLogo for BING* from 2006 via The Wayback MachineKimberly Saia's flickr photo of the Microsoft bing search logo.
When I heard that Live Search is now Bing one of my initial thoughts was how'd they get that domain name given the unavailability of pronouncable four letter .COM domain names. Well, the names been used in the past. Here now, via the Wayback Machine is a brief, somewhat speculative, and ultimately anticlimactic history of bing.com:

The new name reminds me of the show Friends. Also, I hope they get a new favicon - I don't enjoy the stretched 'b' nor its color scheme.

PermalinkCommentsmicrosoft technical domain history search archive dns bing

308 - The Pop Vs Soda Map - Strange Maps

2009 May 31, 8:29"When on a hot summer's day you buy a carbonated beverage to quench your thirst, how do you order it? Do you ask for a soda, a pop or something else? That question lay at the basis of an article in the Journal of English Linguistics (Soda or Pop?, #24, 1996) and of a map, showing the regional variation in American English of the names given to that type of drink."PermalinkCommentsmap language visualization statistics english culture soda coke for:hellosarah

WRECK & SALVAGE

2009 May 1, 12:09"If I'm reading the pop-up window correctly, domain registrar Godaddy recommends against purchasing .tv domain names because the island of Tuvalu, which the domain represents, is sinking."PermalinkCommentshumor dns domain godaddy tv via:boingboing

Penny Arcade! - Further Education

2009 Apr 27, 3:47FYI: the official title of the 'backslash' is 'reverse-solidus', according to Unicode, ISO 10646, etc. Much cooler name IMO.PermalinkCommentspunctuation comic humor penny-arcade reverse-solidus back-slash

Flickr Visual Search in IE8

2009 Apr 10, 9:48

A while ago I promised to say how an xsltproc Meddler script would be useful and the general answer is its useful for hooking up a client application that wants data from the web in a particular XML format and the data is available on the web but in another XML format. The specific case for this post is a Flickr Search service that includes IE8 Visual Search Suggestions. IE8 wants the Visual Search Suggestions XML format and Flickr gives out search data in their Flickr web API XML format.

So I wrote an XSLT to convert from Flickr Search XML to Visual Suggestions XML and used my xsltproc Meddler script to actually apply this xslt.

After getting this all working I've placed the result in two places: (1) I've updated the xsltproc Meddler script to include this XSLT and an XML file to install it as a search provider - although you'll need to edit the XML to include your own Flickr API key. (2) I've created a service for this so you can just install the Flickr search provider if you're interested in having the functionality and don't care about the implementation. Additionally, to the search provider I've added accelerator preview support to show the Flickr slideshow which I think looks snazzy.

Doing a quick search for this it looks like there's at least one other such implementation, but mine has the distinction of being done through XSLT which I provide, updated XML namespaces to work with the released version of IE8, and I made it so you know its good.

PermalinkCommentsmeddler xml ie8 xslt flickr technical boring search suggestions

[whatwg] [WhatWG] Some additional API is needed for sites to see whether registerProtocolHandler() call was successful

2009 Apr 7, 12:14This makes plenty of sense, that a site should be able to check if a protocol handler exists for some URI scheme, but it'd be nice if this were some sort of declaritive fallback plan rather than having to do it all with script. "The HTML5 standard function registerProtocolHandler() should probably remain void as in standard, but WhatWG could invent yet another boolean protocolRegistered("area"), with the only argument (protocol name as string), to check whether a protocol is registered."PermalinkCommentshtml5 registerProtocolHandler html script url uri scheme protocol

Platonic Ideals in Anathem and The Atrocity Archives

2009 Apr 7, 11:58
The Atrocity ArchivesThe Jennifer MorgueAnathem

This past week I finished Anathem and despite the intimidating physical size of the book (difficult to take and read on the bus) I became very engrossed and was able to finish it in several orders of magnitude less time than what I spent on the Baroque Cycle. Whereas reading the Baroque Cycle you can imagine Neal Stephenson sifting through giant economic tomes (or at least that's where my mind went whenever the characters began to explain macro-economics to one another), in Anathem you can see Neal Stephenson staying up late pouring over philosophy of mathematics. When not exploring philosophy, Anathem has an appropriate amount of humor, love interests, nuclear bombs, etc. as you might hope from reading Snow Crash or Diamond Age. I thoroughly enjoyed Anathem.

On the topic of made up words: I get made up words for made up things, but there's already a name for cell-phone in English: its "cell-phone". The narrator notes that the book has been translated into English so I guess I'll blame the fictional translator. Anyway, I wasn't bothered by the made up words nearly as much as some folk. Its a good thing I'm long out of college because I can easily imagine confusing the names of actual concepts and people with those from the book, like Hemn space for Hamming distance. Towards the beginning, the description of slines and the post-post-apocalyptic setting reminded me briefly of Idiocracy.

Recently, I've been reading everything of Charles Stross that I can, including about a month ago, The Jennifer Morgue from the surprisingly awesome amalgamation genre of spy thriller and Lovecraft horror. Its the second in a series set in a universe in which magic exists as a form of mathematics and follows Bob Howard programmer/hacker, cube dweller, and begrudging spy who works for a government agency tasked to suppress this knowledge and protect the world from its use. For a taste, try a short story from the series that's freely available on Tor's website, Down on the Farm.

Coincidentally, both Anathem and the Bob Howard series take an interest in the world of Platonic ideals. In the case of Anathem (without spoiling anything) the universe of Platonic ideals, under a different name of course, is debated by the characters to be either just a concept or an actual separate universe and later becomes the underpinning of major events in the book. In the Bob Howard series, magic is applied mathematics that through particular proofs or computations awakens/disturbs/provokes unnamed horrors in the universe of Platonic ideals to produce some desired effect in Bob's universe.

PermalinkCommentsatrocity archives neal stephenson jennifer morgue plato bob howard anathem

Thoughts on registerProtocolHandler in HTML 5

2009 Apr 7, 9:02

I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:

However, the way its currently spec'ed out I don't like the following: PermalinkCommentsurl template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn

Text ads in open source code will add millions in funding to OSS movement - jorgenmodin.net

2009 Apr 1, 9:26"Open source code will contain text ads in comments and variable names, and Google and other text ad companies will scan the open source code repositories and fund the projects based on what they find in the code"PermalinkCommentshumor opensource advertising code

Internet Explorer 8 Released

2009 Mar 20, 6:18

Our Fearless Leader reveals IE8 at MIX09. Photo by DBegley.IE8, the software I've been working on for some time now, has finally been released at MIX09.

As I mentioned previously, I worked on accelerators (previously named Activities) in IE8. Looking at the kinds of things I blog about on the IE Blog, you might also correctly guess that I work on the networking stack. Ask me about what else I worked on during IE8 development. The past few months were very busy for me and I'm happy this is finally out.PermalinkCommentstechnical internet explorer ie8

Exposing RSS Comments

2009 Mar 10, 1:27Description of wfw:commentRss RSS extension: Content of the element is a URL to a feed of the comments for the particular RSS item. Exactly the sort of thing I was looking for a couple of years ago. At the time none of my web services used it, but now the Delicious v2 feed uses it! Maybe its time to reexamine this sort of thing...PermalinkCommentsrss comment feed reference blog namespace xml wfw

Subst Allows Non-Letter Drive Letters

2009 Mar 4, 2:39

I knew that the command line tool subst would create virtual drives that map to existing directories but I didn't know that subst lets you name the virtual drives with characters that aren't US-ASCII letters. For instance you can run 'subst 4: C:\windows' and then 'more 4:\win.ini' to dump C:\windows\win.ini. This also works for non-US-ASCII characters like, "C" (aka U+FF23, Fullwidth Latin Capital Letter C), which when displayed by cmd.exe via some best fit style character conversions looks just like the regular US-ASCII 'C'. None of Explorer, IE, or the common file dialogs allow the use of these odd virtual drives -- just cmd.exe, so I'm not sure how this would ever be useful but I thought it was odd and I wanted to share.

PermalinkCommentscli technical boring subst windows
Older EntriesNewer Entries Creative Commons License Some rights reserved.