solution page 2 - Dave's Blog

Search
My timeline on Mastodon

WHEN ZOMBIES ATTACK!: MATHEMATICAL MODELLING OF AN OUTBREAK OF ZOMBIE INFECTION

2009 Aug 25, 7:10Research paper modelling zombie infection. "The key difference between the models presented here and other models of infectious disease is that the dead can come back to life." Also, love the references section with "Snyder, Zack (director), 2004 Dawn of the Dead" next to things like "Bainov, D.D. & Simeonov, P.S. Impulsive Differential Equations: Asymptotic Properties of the Solutions. World Scientific, Singapore (1995)."PermalinkCommentshumor zombie research via:schneier math science health apocalypse system:filetype:pdf system:media:document

RFC 2483 - URI Resolution Services Necessary for URN Resolution

2009 Jul 27, 7:28Includes the text/uri-list mime type!PermalinkCommentstechnical url uri mime reference ietf

Blog Layout and Implementation Improvements

2009 Jul 19, 11:44

Monticello, home of Thomas Jefferson, Charlottesville, Va. (LOC) from Flickr CommonsI've redone my blog's layout to remind myself how terrible CSS is -- err I mean to play with the more advanced features of CSS 2.1 which are all now available in IE8. As part of the new layout I've included my Delicious links by default but at a smaller size and I've replaced the navigation list options with Technical, Personal and Everything as I've heard from folks that that would actually be useful. Besides the layout I've also updated the back-end, switching from my handmade PHP+XSLT+RSS/Atom monster to a slightly less horrible PHP+DB solution. As a result everything should be much much faster including search which, incidentally, is so much easier to implement outside of XSLT.

PermalinkCommentsblog database redisgn xslt mysql homepage

PowerShell Scanning Script

2009 Jun 27, 3:42

I've hooked up the printer/scanner to the Media Center PC since I leave that on all the time anyway so we can have a networked printer. I wanted to hook up the scanner in a somewhat similar fashion but I didn't want to install HP's software (other than the drivers of course). So I've written my own script for scanning in PowerShell that does the following:

  1. Scans using the Windows Image Acquisition APIs via COM
  2. Runs OCR on the image using Microsoft Office Document Imaging via COM (which may already be on your PC if you have Office installed)
  3. Converts the image to JPEG using .NET Image APIs
  4. Stores the OCR text into the EXIF comment field using .NET Image APIs (which means Windows Search can index the image by the text in the image)
  5. Moves the image to the public share

Here's the actual code from my scan.ps1 file:

param([Switch] $ShowProgress, [switch] $OpenCompletedResult)

$filePathTemplate = "C:\users\public\pictures\scanned\scan {0} {1}.{2}";
$time = get-date -uformat "%Y-%m-%d";

[void]([reflection.assembly]::loadfile( "C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Drawing.dll"))

$deviceManager = new-object -ComObject WIA.DeviceManager
$device = $deviceManager.DeviceInfos.Item(1).Connect();

foreach ($item in $device.Items) {
        $fileIdx = 0;
        while (test-path ($filePathTemplate -f $time,$fileIdx,"*")) {
                [void](++$fileIdx);
        }

        if ($ShowProgress) { "Scanning..." }

        $image = $item.Transfer();
        $fileName = ($filePathTemplate -f $time,$fileIdx,$image.FileExtension);
        $image.SaveFile($fileName);
        clear-variable image

        if ($ShowProgress) { "Running OCR..." }

        $modiDocument = new-object -comobject modi.document;
        $modiDocument.Create($fileName);
        $modiDocument.OCR();
        if ($modiDocument.Images.Count -gt 0) {
                $ocrText = $modiDocument.Images.Item(0).Layout.Text.ToString().Trim();
                $modiDocument.Close();
                clear-variable modiDocument

                if (!($ocrText.Equals(""))) {
                        $fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $fileName
                        if (!($fileName.EndsWith(".jpg") -or $fileName.EndsWith(".jpeg"))) {
                                if ($ShowProgress) { "Converting to JPEG..." }

                                $newFileName = ($filePathTemplate -f $time,$fileIdx,"jpg");
                                $fileAsImage.Save($newFileName, [System.Drawing.Imaging.ImageFormat]::Jpeg);
                                $fileAsImage.Dispose();
                                del $fileName;

                                $fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $newFileName 
                                $fileName = $newFileName
                        }

                        if ($ShowProgress) { "Saving OCR Text..." }

                        $property = $fileAsImage.PropertyItems[0];
                        $property.Id = 40092;
                        $property.Type = 1;
                        $property.Value = [system.text.encoding]::Unicode.GetBytes($ocrText);
                        $property.Len = $property.Value.Count;
                        $fileAsImage.SetPropertyItem($property);
                        $fileAsImage.Save(($fileName + ".new"));
                        $fileAsImage.Dispose();
                        del $fileName;
                        ren ($fileName + ".new") $fileName
                }
        }
        else {
                $modiDocument.Close();
                clear-variable modiDocument
        }

        if ($ShowProgress) { "Done." }

        if ($OpenCompletedResult) {
                . $fileName;
        }
        else {
                $result = dir $fileName;
                $result | add-member -membertype noteproperty -name OCRText -value $ocrText
                $result
        }
}

I ran into a few issues:

PermalinkCommentstechnical scanner ocr .net modi powershell office wia

Controlling DNS prefetching - MDC

2009 Jun 22, 2:53"Firefox 3.5 performs DNS prefetching. This is a feature by which Firefox proactively performs domain name resolution on both links that the user may choose to follow as well as URLs for items referenced by the document, including images, CSS, JavaScript, and so forth."PermalinkCommentsdns firefox mozilla networking performance dns-prefetching technical

Fallout 3 'Broken Steel' DLC Preview - Shacknews - PC Games, PlayStation, Xbox 360 and Wii video game news, previews and downloads

2009 Apr 21, 1:28Fallout 3's May 5th DLC removes old ending, adds new quests, new levels, new perks. Sounds good! "In a nutshell, Broken Steel will remove the game's ending entirely, with Bethesda's Pete Hines saying simply to fans that called for an open-ended resolution, "We got the idea." Players will still have to make the final choice, but following that climax the game will continue, presenting new epilogue quests, another 10 levels to gain, and new perks, monsters and achievements to keep the climb interesting."PermalinkCommentsgame videogame news fallout3 fallout

A Bulbdial Clock - Evil Mad Scientist Laboratories

2009 Apr 9, 8:56Someone implemented the Ironic Sans artificial sundial clock concept! "Last year David Friedman published on his blog Ironic Sans an interesting design concept for something that he called The Bulbdial Clock. That's like a sundial, but with better resolution-- not just an hour hand, but a minute and second hand as well, each given as a shadow from moving artificial light sources (bulbs). We've recently put together a working bulbdial clock, with an implementation somewhat different from that of the original concept."PermalinkCommentshowto diy clock led sundial via:swannman

Thoughts on registerProtocolHandler in HTML 5

2009 Apr 7, 9:02

I'm a big fan of the concept of registerProtocolHandler in HTML 5 and in FireFox 3, but not quite the implementation. From a high level, it allows web apps to register themselves as handlers of an URL scheme so for (the canonical) example, GMail can register for the mailto URL scheme. I like the concept:

However, the way its currently spec'ed out I don't like the following: PermalinkCommentsurl template registerprotocolhandler firefox technical url scheme protocol boring html5 uri urn

Perth's Street Art Meets the Car Park | PSFK

2009 Jan 19, 3:18"Perth street art production group Ololo approached the construction manager of an inner-city skyscraper when they heard he hated the grey walls of the recently built Condor Tower five-storey car park. The three creative friends - Hurben, Shensing and Griv proposed a far more whimsical and creative solution to the bare walls, by allowing the group and their friends to embellish the interior with street art-inspired murals."PermalinkCommentsgraffiti streetart art cultural-disobediance

Phone Replacement For Grocery Card

2008 Dec 29, 11:04

My QFC grocery card barcode is 4 46600 03506 4.Another use for my new phone is as a replacement for my grocery card, those little plastic cards with a bar code on them that the grocery store gives you to track your purchasing habits. I've previously gone to great lengths to increase space in my pockets by removing infrequently used keys and reducing my wallet to the essentials. So I was glad to get rid of the QFC card and replace it with a photo of its bar code on my phone. Since the important part of the QFC card is the bar code which is just an image of black lines, if your phone has a camera and a screen with a reasonable resolution you can take a photo of the bar code and later display it to a reader. I've so far been able to try it once and successfully at a normal checkout line, but the reaction from the checkout lady was enough that I may in the future just keep a card in my car. She was very excited, asked me what kind of phone I had, called over another checkout person and generally made a large fuss. Also the checkout people generally don't mind giving me a new card if I don't have one with me.

PermalinkCommentstechnical boring barcode phone

Microsoft Research Image Composite Editor (ICE)

2008 Sep 24, 1:44"Microsoft Image Composite Editor is an advanced panoramic image stitcher. You shoot a set of overlapping photographs of a scene from a single location, and Image Composite Editor creates a high-resolution panorama incorporating all your images at full resolution."PermalinkCommentsmicrosoft research image photo panorama tool free ice stitching

What's expected of us : Article : Nature

2008 Aug 14, 4:29Scifi short story, "What's expected of us", by Ted Chiang. Younger cousin of 'The Riddle of the Universe and Its Solution'. FTA: "Civilization now depends on self-deception."PermalinkCommentsted-chiang scifi time time-travel fiction via:boingboing

New Cookie related Internet Drafts from Yngve N. Pettersen on 2006-03-20 (ietf-http-wg@w3.org from January to March 2006)

2008 Jun 30, 3:46Opera's solution to minimal security domain determination: "The drafts describe 1) Opera's current "rule of thumb" implementation that uses DNS in an attempt to confirm the validity of a domain, and 2) a proposed new HTTP based lookup service that returPermalinkCommentsopera rfc ietf cookie http internet browser dns domain

Bluelounge - The Sanctuary

2008 Jun 9, 4:44"The Sanctuary is a beautiful, simple solution to a real everyday problem. A place to put the multitude of personal items we all carry around so they are easily located again when later needed and, always fully charged"PermalinkCommentsgift gadget design cable battery phone mp3player charger shopping product

URI Fragment Info Roundup

2008 Apr 21, 11:53

['Neverending story' by Alexandre Duret-Lutz. A framed photo of books with the droste effect applied. Licensed under creative commons.]Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.

HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.

PermalinkCommentsxml text ajax technical url boring uri fragment rfc

TED | Talks | Howard Rheingold: Way-new collaboration (video)

2008 Feb 17, 11:25How the Internet can allow new forms of collaboration, solutions to tragedy of commons, prisoner's dilemma.PermalinkCommentsvia:felix42 cooperation collaboration howard-rheingold video ted internet

Reuters Wants The World To Be Tagged - ReadWriteWeb

2008 Feb 8, 3:24FTA: "...Using a mix of natural language processing, AI techniques, and a massive databases, Reuters' solution extracts important bits of information from raw HTML pages. People, Companies, Places, and Events are really at the heart of many business articPermalinkCommentsvia:sambrook api reuters news tagging semantic semantic-web web

Personal Search with Yahoo Pipes

2008 Feb 3, 11:59

I've setup a minimal search page that uses a Yahoo Pipe to sort of search through my content. I say sort of search because I only get full text search over my recent item feeds and otherwise I just search over my tags.

To get real search I'm going to have to keep an archive of all my content on my own website. This is a pain but on the other hand it will let me easily backup my content or display old items on my page. Why didn't I just use a prebuilt solution?

PermalinkCommentsyahoo search rss yahoo-pipes homepage

IPv6 Roundup: Address Syntax on Windows

2008 Jan 9, 11:34

IPv6 address syntax consists of 8 groupings of colon delimited 16-bit hex values making up the 128-bit address. An optional double colon can replace any consecutive sequence of 0 valued hex values. For example the following is a valid IPv6 address: fe80::2c02:db79

Some IPv6 addresses aren't global and in those cases need a scope ID to describe their context. These get a '%' followed by the scope ID. For example the previous example with a scope ID of '8' would be: fe80::2c02:db79%8

IPv6 addresses in URIs may appear in the host section of a URI as long as they're enclosed by square brackets. For example: http://[fe80::2c02:db79]/. The RFC explicitly notes that there isn't a way to add a scope ID to the IPv6 address in a URI. However a draft document describes adding scope IDs to IPv6 addresses in URIs. The draft document uses the IPvFuture production from the URI RFC with a 'v1' to add a new hostname syntax and a '+' instead of a '%' for delimiting the scope id. For example: http://[v1.fe80::2c02:db79+8]/. However, this is still a draft document, not a final standard, and I don't know of any system that works this way.

In Windows XPSP2 the IPv6 stack is available but disabled by default. To enable the IPv6 stack, at a command prompt run 'netsh interface ipv6 install'. In Vista IPv6 is the on by default and cannot be turned off, while the IPv4 stack is optional and may be turned off by a command similar to the previous.

Once you have IPv6 on in your OS you can turn on IPv6 for IIS6 or just use IIS7. The address ::1 refers to the local machine.

In some places in Windows like UNC paths, IPv6 addresses aren't allowed. In those cases you can use a Vista DNS IPv6 hack that lives in the OS name resolution stack that transforms particularly crafted names into IPv6 addresses. Take your IPv6 address, replace the ':'s with '-'s and the '%' with an 's' and then append '.ipv6-literal.net' to the end. For example: fe80--2c02-db79s8.ipv6-literal.net. That name will resolve to the same example I've been using in Vista. This transformation occurs inside the system's local name resolution stack so no DNS servers are involved, although Microsoft does own the ipv6-literal.net domain name.

MSDN describes IPv6 addresses in URIs in Windows and I've described IPv6 addresses in URIs in IE7. File URIs in IE7 don't support IPv6 addresses. If you want to put a scope ID in a URI in IE7 you use a '%25' to delimit the scope ID and due to a bug you must have at least two digits in your scope ID. So, to take the previous example: http://[fe80::2c02:db79%2508]/. Note that its 08 rather than just 8.

PermalinkCommentsroundup ip windows ipv6 technical microsoft boring syntax

The Riddle of the Universe and Its Solution by CHRISTOPHER CHERNIAK

2007 Dec 31, 1:18A short short story titled 'The Riddle of the Universe and Its Solution' by CHRISTOPHER CHERNIAK. A classic (apparently) from the early 80s. No spoilers here. If you liked Snow Crash just read it.PermalinkCommentsread scifi story literature free fiction short-story rainbow
Older EntriesNewer Entries Creative Commons License Some rights reserved.