ir page 40 - Dave's Blog


The Moth Podcast

2009 Mar 6, 5:36"The Moth, a not-for-profit storytelling organization, was founded in New York in 1997 by poet and novelist George Dawes Green, who wanted to recreate in New York the feeling of sultry summer evenings on his native St. Simon's Island, Georgia, where he and a small circle of friends would gather to spin spellbinding tales on his friend Wanda's porch."PermalinkCommentsmoth podcast humor rss story nyc community

Washington State Senate Honors Penny Arcade

2009 Mar 6, 1:21"BE IT FURTHER RESOLVED, That the Washington State Senate honor Jerry Holkins and Mike Krahulik for their hard work and dedication to improving the lives of hospitalized children worldwide through their creation and continued work with Child's Play Charity"PermalinkCommentscomic charity videogames penny-arcade goverment washington senate

The 'Is It UTF-8?' Quick and Dirty Test

2009 Mar 6, 5:16

I've found while debugging networking in IE that it's often useful to quickly tell whether a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8), but I rarely see the BOM on UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.

UTF-8 is more restrictive than other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets) such as Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence. And unlike in those encodings, a UTF-8 byte taken out of context still tells you its role: a single byte character, the starting byte of a three byte sequence, a trail byte, etc.

The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this is the dirty part). For as many bytes in the string as you care to examine, check the most significant hex digit (the high four bits) of the byte:

F:
This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
E:
This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
C..D:
This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
8..B:
This is a trail byte.
0..7:
This is a single byte encoded codepoint.
The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. For example, they accept the lead bytes C0, C1, and F5..FF, and the sequence ED A0 80, all of which the full rules reject. But they won't produce false negatives. The following table from the Unicode spec actually describes well-formed UTF-8.
Code Points           1st Byte  2nd Byte  3rd Byte  4th Byte
U+0000..U+007F        00..7F
U+0080..U+07FF        C2..DF    80..BF
U+0800..U+0FFF        E0        A0..BF    80..BF
U+1000..U+CFFF        E1..EC    80..BF    80..BF
U+D000..U+D7FF        ED        80..9F    80..BF
U+E000..U+FFFF        EE..EF    80..BF    80..BF
U+10000..U+3FFFF      F0        90..BF    80..BF    80..BF
U+40000..U+FFFFF      F1..F3    80..BF    80..BF    80..BF
U+100000..U+10FFFF    F4        80..8F    80..BF    80..BF
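
For anyone who would rather run the quick rules than apply them by eye, here is a minimal Python sketch. The function name looks_like_utf8 is made up for illustration, and it assumes you pass a complete byte string: a buffer cut off in the middle of a multibyte sequence is reported as not UTF-8.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Quick and dirty UTF-8 check using only the high hex digit of each byte.

    Hypothetical helper: it follows the simplified rules, not the full
    well-formedness table, so it can return True for ill-formed input.
    """
    trail_needed = 0  # trail bytes still expected for the current sequence
    for byte in data:
        high = byte >> 4  # most significant hex digit
        if trail_needed:
            if 0x8 <= high <= 0xB:   # 8..B: trail byte, as expected
                trail_needed -= 1
            else:                    # anything else interrupts the sequence
                return False
        elif high <= 0x7:            # 0..7: single byte codepoint
            continue
        elif high in (0xC, 0xD):     # C..D: byte 1 of 2
            trail_needed = 1
        elif high == 0xE:            # E: byte 1 of 3
            trail_needed = 2
        elif high == 0xF:            # F: byte 1 of 4
            trail_needed = 3
        else:                        # 8..B: trail byte with no lead byte
            return False
    return trail_needed == 0         # assumes the input is not truncated


if __name__ == "__main__":
    print(looks_like_utf8("na\u00efve".encode("utf-8")))    # well-formed UTF-8
    print(looks_like_utf8("na\u00efve".encode("latin-1")))  # lone EF lead byte
    print(looks_like_utf8(b"\xc0\x80"))                     # overlong form slips through
```

The last call shows the dirty part: C0 80 passes the quick rules even though the table rejects it. When a strict answer matters, Python's built-in data.decode("utf-8") raises UnicodeDecodeError on such input.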

PermalinkCommentstest technical unicode boring charset utf8 encoding

Subst Allows Non-Letter Drive Letters

2009 Mar 4, 2:39

I knew that the command line tool subst would create virtual drives that map to existing directories, but I didn't know that subst lets you name the virtual drives with characters that aren't US-ASCII letters. For instance, you can run 'subst 4: C:\windows' and then 'more 4:\win.ini' to dump C:\windows\win.ini. This also works for non-US-ASCII characters like "Ｃ" (aka U+FF23, Fullwidth Latin Capital Letter C), which when displayed by cmd.exe via some best-fit-style character conversions looks just like the regular US-ASCII 'C'. None of Explorer, IE, or the common file dialogs allow the use of these odd virtual drives -- just cmd.exe, so I'm not sure how this would ever be useful, but I thought it was odd and I wanted to share.
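
To see that the fullwidth letter really is a different character from the ASCII one, here is a small Python sketch. It uses Unicode NFKC compatibility normalization as a stand-in for the look-alike folding; that is an assumption for illustration only, since cmd.exe's best-fit conversion is a separate Windows code page mechanism and not NFKC.

```python
import unicodedata

fullwidth_c = "\uff23"  # the character used as the drive "letter"

# The two characters are distinct codepoints with distinct names...
print(unicodedata.name(fullwidth_c))
print(unicodedata.name("C"))
print(fullwidth_c == "C")

# ...but compatibility normalization folds the fullwidth form onto ASCII 'C'.
# Assumption: NFKC is only an analogy for what the console display does.
folded = unicodedata.normalize("NFKC", fullwidth_c)
print(folded == "C")
```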

PermalinkCommentscli technical boring subst windows

Penn and Teller Stage

2009 Feb 28, 1:54

sequelguy posted a photo:

Penn and Teller's stage before their Las Vegas show

PermalinkCommentsvegas rio nevada pennandteller

Scott, Jesse, and Jon in Vegas

2009 Feb 28, 1:53

sequelguy posted a photo:

On the bridge in front of Treasure Island just before the first show of 'Sirens of TI' that day.

PermalinkCommentsvegas friends beer nevada collegefriends

YouTube - VMware demo showing two operating systems running on one phone

2009 Feb 27, 10:49Finally, you can play solitaire on your phone while waiting for Android to boot with VMware's mobile phone OS: "VMware has demoed its mobile virtualisation platform, which could potentially let users simultaneously run two different operating systems."PermalinkCommentsvideo vmware mobile phone cellphone os android google microsoft windows windows-ce

25 ideas: Creating An Open-Source Business Model For Newspapers

2009 Feb 26, 11:52This is what I'd like in a newspaper: "1: Focus on original content, do not rewrite wire stories or press releases." and "2: Focus on hyper-local coverage, newspapers should "own" their regional beat because they have the best contacts and the best understanding of local companies and issues."PermalinkCommentsvia:sambrook newspaper advertising business journalism internet

MyFonts Blog - Blog Archive - Introducing WhatTheFont for iPhone!

2009 Feb 11, 10:05"With the iPhone version of WhatTheFont you can use the phone's built-in camera to photograph the text in question (or choose an existing image from your photo albums)... After confirming which characters are used in the image, the app provides a list of possible matching fonts."PermalinkCommentsfont iphone camera typography

Cursebird: What the f#@! is everyone swearing about?

2009 Feb 10, 6:34Real-time stats on folks cursing on Twitter. Shows percentage change in usage by curse word.PermalinkCommentstwitter humor language swearing mashup

RemoteOpen Tool

2009 Feb 7, 10:39

On my laptop at work I often get mail with attached files for which the application is installed only on my main computer. Tired of having to save the file on the laptop and then find it on the network via my other computer, I wrote remoteopen two nights ago. With it, I open the file on my laptop and remoteopen sends it to be opened on my main computer. It's overkill for this issue, but it felt good to write a quick tool that solves my problem.

PermalinkCommentstechnical boring remoteopen tool

Web Proxy Autodiscovery Protocol IETF Draft Document

2009 Feb 5, 8:39The long-expired draft of the Web Proxy Autodiscovery Protocol (WPAD). To summarize: use DHCP, and failing that DNS, to find the name of a web server, and on that web server find a Proxy Auto-Config file at a well-known location.PermalinkCommentswpad proxy internet reference browser dns dhcp
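
The DNS half of that summary can be sketched in a few lines of Python. This is a rough illustration, not the draft's normative algorithm: the function name wpad_dns_candidates is made up, the "wpad" host name and "/wpad.dat" path are the well-known values as I understand the draft, and the rule for where to stop stripping labels is simplified to "never try a bare top-level domain".

```python
def wpad_dns_candidates(fqdn: str) -> list[str]:
    """List the PAC file URLs a client might try via DNS, most specific first.

    Hypothetical helper: strips the leftmost label of the client's host name
    one at a time and prepends the well-known 'wpad' host at each level.
    """
    labels = fqdn.strip(".").split(".")
    urls = []
    # Start after the host's own label; stop before the bare top-level label.
    for i in range(1, len(labels) - 1):
        domain = ".".join(labels[i:])
        urls.append(f"http://wpad.{domain}/wpad.dat")
    return urls


if __name__ == "__main__":
    for url in wpad_dns_candidates("johns-desktop.development.foo.com"):
        print(url)
```

A real client would try DHCP first and would need to consult something like the Public Suffix List before walking up the name, since stopping only at the top-level label still allows lookups such as wpad.co.jp.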

draft-masinter-dated-uri-05 - names are readily assigned, offer the persistence of reference that is required by URNs, but do not require a stable authority to assign the name. The first namespace ("duri") is used to refer to URI-

2009 Feb 4, 4:30New URN schemes with no central minting authority. duri allows you to name a resource that was identified by the specified URI at the specified date (e.g. refers to the IETF's homepage at the end of the year 2001). tdb allows you to name a physical object or entity that was described by a resource that was identified by a specified URI at the specified date (e.g. refers to the IETF as an organization, as referenced by its homepage at the end of the year 2001). The date format is concise, but I'd prefer RFC 3339 rather than roping in another date format.PermalinkCommentsduri tdb uri url scheme reference ietf date datetime rfc

The WHATWG Blog - Blog Archive - This Week in HTML 5 - Episode 20

2009 Feb 3, 11:15"r2719 specifies that browsers should not allow scripts to set document.domain to anything on the Public Suffix List, such as "com" or "co.jp". Essential background reading on why this is dangerous: Untraceable XSS Attacks. Most browsers already block this attack, e.g. Firefox since 3.0. [Background: Re: Setting document.domain]"PermalinkCommentshtml5 tld publicsuffix dns security html internet web reference w3c

Chessboxing

2009 Feb 2, 11:52"Chessboxing: Created in 2003 by Dutch artist Iepe Rubingh, chess boxing has 11 rounds of alternated boxing and chess. In first round, which lasts four minutes, contestants initiate the chess match. A two-minute boxing round follows. Rounds alternate until one of the players gets a checkmate or a knockout."PermalinkCommentshumor art chess boxing sport via:boingboing video youtube

Gravity Bone

2009 Jan 29, 10:22Play this game now. It's like half of a delicious club sandwich. Love the music. "To make it in Nuevos Aires, one has to have nerves of silk and the filthiest of hands. Mix together a batch of espionage, some high-speed car chases, fire-spewing assassins, and you've got one oven that'll never bake cookies again. We provide the pliers and you bring the moxie."PermalinkCommentsgame videogame quake gravity-bone humor spy espionage

DIY Pepsi Challenge

2009 Jan 25, 5:39

Microsoft isn't completely shielded from our economy's issues, but I still have a job and still get free soda. While that's all still the case, I decided to test Sarah's claimed ability to differentiate between Pepsi, Coke, and their diet counterparts by taste alone. I poured the four sodas into marked cups, and Sarah and I each took two runs through the cups with the following guesses.

Soda Identification Challenge Results
Drink             Sarah                   Dave
                  Guess 1     Guess 2     Guess 1     Guess 2
Coke              Coke        Coke        Pepsi       Diet Pepsi
Diet Coke         Diet Coke   Diet Pepsi  Diet Coke   Diet Coke
Pepsi             Pepsi       Pepsi       Coke        Coke
Diet Pepsi        Diet Pepsi  Diet Coke   Diet Pepsi  Pepsi
Total (out of 8)  6                       3

As you can see from the results, Sarah's claimed ability to identify Coke and Pepsi by taste is confirmed. She got the first run completely correct, and on the second run her only mistake was swapping Diet Coke and Diet Pepsi. Her excuse for the error on the second run was a tainted palate from the first run. I, on the other hand, was mostly incorrect. Surprisingly, though, my incorrect answers were mostly consistent between runs one and two. For instance, I thought Pepsi was Coke in both runs.

PermalinkCommentscoke microsoft waste of soda pepsi waste of time soda

PolitiFact | The Obameter: Tracking Barack Obama's Campaign Promises

2009 Jan 24, 2:42"PolitiFact has compiled about 500 promises that Barack Obama made during the campaign and is tracking their progress on our Obameter. We rate their status as No Action, In the Works or Stalled. Once we find action is completed, we rate them Promise Kept, Compromise or Promise Broken."PermalinkCommentspolitics news government obama election president tracking

The Faces of Mechanical Turk - Waxy.org

2009 Jan 23, 1:47"When you experiment with Amazon's Mechanical Turk, it feels like magic. ... Last week, I started a new Turk experiment to answer two questions: what do these people look like, and how much does it cost for someone to reveal their face?"PermalinkCommentsprivacy research amazon mechanicalturk internet photo experiment social

Pro Pants! 2K9

2009 Jan 20, 2:25"Pro Pants, now in its second year, is a counter-movement to Improv Everywhere's annual No Pants! Subway Ride. The Pro Pants mission is to inform pantsless riders about the joys and advantages of pants and to persuade them to accept pants into their lives." Don't you hate pants?PermalinkCommentshumor pants subway improv-everywhere
Creative Commons License. Some rights reserved.