2009 Mar 6, 5:36
"The Moth, a not-for-profit storytelling organization, was founded in New York in 1997 by poet and novelist George Dawes Green, who wanted to recreate in New York the feeling of sultry summer
evenings on his native St. Simon's Island, Georgia, where he and a small circle of friends would gather to spin spellbinding tales on his friend Wanda's porch."
moth podcast humor rss story nyc community 2009 Mar 6, 1:21
"BE IT FURTHER RESOLVED, That the Washington State Senate honor Jerry Holkins and Mike Krahulik for their hard work and dedication to improving the lives of hospitalized children worldwide through
their creation and continued work with Child's Play Charity"
comic charity videogames penny-arcade goverment washington senate 2009 Mar 6, 5:16
I've found while debugging networking in IE that it's often useful to quickly tell whether a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8), but I rarely see the BOM on
UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike
other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that a given byte is a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this
is the dirty part). For as many bytes in the string as you care to examine, check the most significant hex digit of the byte:
- F: This is byte 1 of a 4 byte encoded codepoint and must be followed by 3 trail bytes.
- E: This is byte 1 of a 3 byte encoded codepoint and must be followed by 2 trail bytes.
- C..D: This is byte 1 of a 2 byte encoded codepoint and must be followed by 1 trail byte.
- 8..B: This is a trail byte.
- 0..7: This is a single byte encoded codepoint.
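The quick and dirty rules above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name looks_like_utf8 is made up, and the sketch treats a sequence cut off at the end of the buffer as a failure, which the rules above don't address.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Quick and dirty check that looks only at the high hex digit of each byte."""
    trail_needed = 0  # trail bytes still owed to the most recent lead byte
    for byte in data:
        high = byte >> 4
        if 0x8 <= high <= 0xB:       # 8..B: trail byte
            if trail_needed == 0:
                return False         # trail byte without a lead byte
            trail_needed -= 1
            continue
        if trail_needed:
            return False             # lead or single byte arrived too early
        if high == 0xF:              # F: byte 1 of 4
            trail_needed = 3
        elif high == 0xE:            # E: byte 1 of 3
            trail_needed = 2
        elif high >= 0xC:            # C..D: byte 1 of 2
            trail_needed = 1
        # 0..7: single byte codepoint, nothing owed
    return trail_needed == 0         # assumption: a truncated sequence fails


print(looks_like_utf8("héllo €".encode("utf-8")))  # True
print(looks_like_utf8(b"\xe9t\xe9"))               # False (Latin-1 bytes)
print(looks_like_utf8(b"\xc0\x80"))                # True, although C0 80 is not well-formed UTF-8
```

The last call is an example of input the simple rules accept even though the full well-formed UTF-8 rules reject it.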
The simpler rules can produce false positives in some cases: that is, they'll say a string is UTF-8 when in fact it might not be. But they won't produce false negatives. The following is the table
from the Unicode spec that actually describes well-formed UTF-8.
Code Points | 1st Byte | 2nd Byte | 3rd Byte | 4th Byte
U+0000..U+007F | 00..7F
U+0080..U+07FF | C2..DF | 80..BF
U+0800..U+0FFF | E0 | A0..BF | 80..BF
U+1000..U+CFFF | E1..EC | 80..BF | 80..BF
U+D000..U+D7FF | ED | 80..9F | 80..BF
U+E000..U+FFFF | EE..EF | 80..BF | 80..BF
U+10000..U+3FFFF | F0 | 90..BF | 80..BF | 80..BF
U+40000..U+FFFFF | F1..F3 | 80..BF | 80..BF | 80..BF
U+100000..U+10FFFF | F4 | 80..8F | 80..BF | 80..BF
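For comparison, the table above can be applied mechanically. The following Python sketch copies the byte ranges from the table into a list and walks the input one sequence at a time. The names WELL_FORMED and is_well_formed_utf8 are invented for this example, and it is meant as an illustration rather than a replacement for a real decoder.

```python
# One entry per table row: (first byte range, [range for each following byte]).
WELL_FORMED = [
    ((0x00, 0x7F), []),
    ((0xC2, 0xDF), [(0x80, 0xBF)]),
    ((0xE0, 0xE0), [(0xA0, 0xBF), (0x80, 0xBF)]),
    ((0xE1, 0xEC), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xED, 0xED), [(0x80, 0x9F), (0x80, 0xBF)]),
    ((0xEE, 0xEF), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF0, 0xF0), [(0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF1, 0xF3), [(0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF4, 0xF4), [(0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)]),
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data matches a row of the table."""
    i = 0
    while i < len(data):
        # Find the table row whose first byte range contains data[i].
        for (first_lo, first_hi), trail_ranges in WELL_FORMED:
            if first_lo <= data[i] <= first_hi:
                break
        else:
            return False  # no row starts with this byte (80..C1, F5..FF)
        trail = data[i + 1 : i + 1 + len(trail_ranges)]
        if len(trail) != len(trail_ranges):
            return False  # input ends in the middle of a sequence
        for byte, (lo, hi) in zip(trail, trail_ranges):
            if not lo <= byte <= hi:
                return False
        i += 1 + len(trail_ranges)
    return True


print(is_well_formed_utf8("héllo €".encode("utf-8")))  # True
print(is_well_formed_utf8(b"\xc0\x80"))                # False (C0 is never a valid first byte)
print(is_well_formed_utf8(b"\xed\xa0\x80"))            # False (ED must be followed by 80..9F)
```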
test technical unicode boring charset utf8 encoding 2009 Mar 4, 2:39
I knew that the command line tool subst would create virtual drives that map to existing directories, but I didn't know that subst lets you name the virtual drives with characters that aren't
US-ASCII letters. For instance you can run 'subst 4: C:\windows' and then 'more 4:\win.ini' to dump C:\windows\win.ini. This also works for non-US-ASCII characters like "Ｃ" (aka U+FF23, Fullwidth Latin Capital Letter C), which when displayed by cmd.exe via some best fit style character conversions looks just like the regular US-ASCII 'C'. None of Explorer, IE, or the common file
dialogs allow the use of these odd virtual drives, just cmd.exe, so I'm not sure how this would ever be useful, but I thought it was odd and I wanted to share.
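To see why U+FF23 can pass for a plain 'C', here is a small Python sketch using the standard unicodedata module. Note the assumption: NFKC compatibility normalization is a different mechanism from whatever best fit conversion cmd.exe applies, so this only shows that Unicode itself treats the two characters as compatibility equivalents.

```python
import unicodedata

fullwidth_c = "\uFF23"

# Official Unicode name of the character.
print(unicodedata.name(fullwidth_c))               # FULLWIDTH LATIN CAPITAL LETTER C

# Compatibility normalization folds it to the US-ASCII letter.
print(unicodedata.normalize("NFKC", fullwidth_c))  # C

# Its UTF-8 encoding is a three byte sequence.
print(fullwidth_c.encode("utf-8").hex())           # efbca3
```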
cli technical boring subst windows 2009 Feb 28, 1:54
sequelguy posted a photo:
Penn and Teller's stage before their Las Vegas show
vegas rio nevada pennandteller 2009 Feb 28, 1:53
sequelguy posted a photo:
On the bridge in front of Treasure Island just before the first show of 'Sirens of TI' that day.
vegas friends beer nevada collegefriends 2009 Feb 27, 10:49
Finally, you can play solitaire on your phone while waiting for Android to boot with VMware's mobile phone OS: "VMware has demoed its mobile virtualisation platform, which could potentially let users
simultaneously run two different operating systems."
video vmware mobile phone cellphone os android google microsoft windows windows-ce 2009 Feb 26, 11:52
This is what I'd like in a newspaper: "1: Focus on original content, do not rewrite wire stories or press releases." and "2: Focus on hyper-local coverage, newspapers should "own" their regional beat
because they have the best contacts and the best understanding of local companies and issues."
via:sambrook newspaper advertising business journalism internet 2009 Feb 11, 10:05
"With the iPhone version of WhatTheFont you can use the phone's built-in camera to photograph the text in question (or choose an existing image from your photo albums)... After confirming which
characters are used in the image, the app provides a list of possible matching fonts."
font iphone camera typography 2009 Feb 10, 6:34
Real-time stats on folks cursing on Twitter. Shows percentage change in usage by curse word.
twitter humor language swearing mashup 2009 Feb 7, 10:39
On my laptop at work I often get mail with attachments for applications that I only have installed on my main computer. Tired of having to save the file on the laptop and then find it on the
network via my other computer, I wrote remoteopen two nights ago. With this I open the file on my laptop and remoteopen sends it to be opened on
my main computer. Overkill for this issue, but it felt good to write a quick tool that solves my problem.
technical boring remoteopen tool 2009 Feb 5, 8:39
The long-expired draft of the Web Proxy Autodiscovery Protocol (WPAD). To summarize, use DHCP and failing that DNS to find the name of a web server and on that web server find a Proxy Auto-Config
file at a well-known location.
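As a rough illustration of the DNS step, here is a Python sketch that lists the URLs a client might try. It rests on assumptions that the summary above does not spell out: that the well-known host label is "wpad", that the file path is /wpad.dat, and that the client strips leading labels from its own host name until two labels remain. The function name wpad_candidates and the min_labels parameter are made up for this example, and the DHCP step is not shown.

```python
def wpad_candidates(fqdn: str, min_labels: int = 2) -> list[str]:
    """List candidate Proxy Auto-Config URLs for a host name (DNS step only)."""
    labels = fqdn.strip(".").split(".")
    urls = []
    # Drop the host label first, then keep stripping leading domain labels,
    # stopping while at least min_labels labels remain (an assumption).
    for start in range(1, len(labels) - min_labels + 1):
        domain = ".".join(labels[start:])
        urls.append(f"http://wpad.{domain}/wpad.dat")
    return urls


for url in wpad_candidates("laptop.dept.example.com"):
    print(url)
# http://wpad.dept.example.com/wpad.dat
# http://wpad.example.com/wpad.dat
```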
wpad proxy internet reference browser dns dhcp 2009 Feb 4, 4:30
New URN schemes with no central minting authority. duri allows you to name a resource that was identified by the specified URI at the specified date (e.g. refers to the IETF's homepage at the end of
the year 2001). tdb allows you to name a physical object or entity that was described by a resource that was identified by a specified URI at the specified date (e.g. refers to the IETF, the organization,
as referenced by their homepage at the end of the year 2001). Date format is concise but I'd prefer RFC3339 rather than roping in another date format.
duri tdb uri url scheme reference ietf date datetime rfc 2009 Feb 3, 11:15
"r2719 specifies that browsers should not allow scripts to set document.domain to anything on the Public Suffix List, such as "com" or "co.jp". Essential background reading on why this is dangerous:
Untraceable XSS Attacks. Most browsers already block this attack, e.g. Firefox since 3.0. [Background: Re: Setting document.domain]"
html5 tld publicsuffix dns security html internet web reference w3c 2009 Feb 2, 11:52
"Chessboxing: Created in 2003 by Dutch artist Iepe Rubingh, chess boxing has 11 rounds of alternated boxing and chess. In first round, which lasts four minutes, contestants initiate the chess match.
A two-minute boxing round follows. Rounds alternate until one of the players gets a checkmate or a knockout."
humor art chess boxing sport via:boingboing video youtube 2009 Jan 29, 10:22
Play this game now. It's like half of a delicious club sandwich. Love the music. "To make it in Nuevos Aires, one has to have nerves of silk and the filthiest of hands. Mix together a batch of
espionage, some high-speed car chases, fire-spewing assassins, and you've got one oven that'll never bake cookies again. We provide the pliers and you bring the moxie."
game videogame quake gravity-bone humor spy espionage 2009 Jan 25, 5:39
Microsoft isn't completely shielded from our economy's issues, but I still have a job and
still get free soda. While that's all still the case, I decided to test Sarah's claimed ability to differentiate between Pepsi, Coke, and their diet counterparts by taste alone. I poured the four
sodas into marked cups and Sarah and I each took two runs through the cups with the following guesses.
Soda Identification Challenge Results
Drink | Sarah, Guess 1 | Sarah, Guess 2 | Dave, Guess 1 | Dave, Guess 2
Coke | Coke | Coke | Pepsi | Diet Pepsi
Diet Coke | Diet Coke | Diet Pepsi | Diet Coke | Diet Coke
Pepsi | Pepsi | Pepsi | Coke | Coke
Diet Pepsi | Diet Pepsi | Diet Coke | Diet Pepsi | Pepsi
Total (out of 8) | Sarah: 6 | Dave: 3
As you can see from the results, Sarah's claimed ability to identify Coke and Pepsi by taste is confirmed. The first run through she got completely correct, and on the second run she only swapped Diet
Pepsi and Diet Coke. Her excuse for the error on the second run was a tainted palate from the first run. I, on the other hand, was mostly incorrect. Surprisingly though, my incorrect answers were
mostly consistent between runs one and two. For instance I thought Pepsi was Coke in both runs.
coke microsoft waste of soda pepsi waste of time soda 2009 Jan 24, 2:42
"PolitiFact has compiled about 500 promises that Barack Obama made during the campaign and is tracking their progress on our Obameter. We rate their status as No Action, In the Works or Stalled. Once
we find action is completed, we rate them Promise Kept, Compromise or Promise Broken."
politics news government obama election president tracking 2009 Jan 23, 1:47
"When you experiment with Amazon's Mechanical Turk, it feels like magic. ... Last week, I started a new Turk experiment to answer two questions: what do these people look like, and how much does it
cost for someone to reveal their face?"
privacy research amazon mechanicalturk internet photo experiment social 2009 Jan 20, 2:25
"Pro Pants, now in its second year, is a counter-movement to Improv Everywhere's annual No Pants! Subway Ride. The Pro Pants mission is to inform pantsless riders about the joys and advantages of
pants and to persuade them to accept pants into their lives." Don't you hate pants?
humor pants subway improv-everywhere