I've found while debugging networking in IE that it's often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8), but I rarely see the BOM on UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that it's a single-byte character, the starting byte of a three-byte sequence, etc.
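To illustrate the "out of context" point, here is a minimal Python sketch that looks at one byte in isolation and says what role it can play. The function name `classify_byte` is made up for this example; the byte ranges are the standard well-formed UTF-8 ranges.

```python
def classify_byte(b: int) -> str:
    """Describe the role a single byte value (0..255) can play in well-formed UTF-8."""
    if b <= 0x7F:
        return "single-byte character"
    if 0x80 <= b <= 0xBF:
        return "continuation byte"
    if 0xC2 <= b <= 0xDF:
        return "lead byte of a 2-byte sequence"
    if 0xE0 <= b <= 0xEF:
        return "lead byte of a 3-byte sequence"
    if 0xF0 <= b <= 0xF4:
        return "lead byte of a 4-byte sequence"
    # C0, C1 and F5..FF never appear in well-formed UTF-8.
    return "never valid in UTF-8"


if __name__ == "__main__":
    # "é" is U+00E9, encoded as the two bytes C3 A9.
    for b in "é".encode("utf-8"):
        print(f"{b:02X}: {classify_byte(b)}")
```

Nothing like this works for an encoding such as Big5, where the same byte value can be either a lead byte or a trail byte depending on what came before it.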
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
Code Points | 1st Byte | 2nd Byte | 3rd Byte | 4th Byte |
---|---|---|---|---|
U+0000..U+007F | 00..7F | | | |
U+0080..U+07FF | C2..DF | 80..BF | | |
U+0800..U+0FFF | E0 | A0..BF | 80..BF | |
U+1000..U+CFFF | E1..EC | 80..BF | 80..BF | |
U+D000..U+D7FF | ED | 80..9F | 80..BF | |
U+E000..U+FFFF | EE..EF | 80..BF | 80..BF | |
U+10000..U+3FFFF | F0 | 90..BF | 80..BF | 80..BF |
U+40000..U+FFFFF | F1..F3 | 80..BF | 80..BF | 80..BF |
U+100000..U+10FFFF | F4 | 80..8F | 80..BF | 80..BF |
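For comparison with the quick test, the full table can be checked mechanically. The following is a rough Python sketch, not a tuned implementation; `RULES` and `is_well_formed_utf8` are names invented for this example, and each rule is one row of the table above (only the second byte ever has a narrowed range, the third and fourth are always 80..BF).

```python
# One entry per table row: (lead-byte range, second-byte range, sequence length).
RULES = [
    ((0x00, 0x7F), None, 1),
    ((0xC2, 0xDF), (0x80, 0xBF), 2),
    ((0xE0, 0xE0), (0xA0, 0xBF), 3),
    ((0xE1, 0xEC), (0x80, 0xBF), 3),
    ((0xED, 0xED), (0x80, 0x9F), 3),
    ((0xEE, 0xEF), (0x80, 0xBF), 3),
    ((0xF0, 0xF0), (0x90, 0xBF), 4),
    ((0xF1, 0xF3), (0x80, 0xBF), 4),
    ((0xF4, 0xF4), (0x80, 0x8F), 4),
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every byte of data belongs to a sequence allowed by the table."""
    i = 0
    while i < len(data):
        # Find the table row whose 1st-byte range contains this byte.
        for (lo, hi), second, length in RULES:
            if lo <= data[i] <= hi:
                break
        else:
            return False  # this byte can never start a sequence
        seq = data[i:i + length]
        if len(seq) < length:
            return False  # sequence is cut off at the end of the data
        if length > 1 and not (second[0] <= seq[1] <= second[1]):
            return False  # 2nd byte outside the range this row allows
        if any(not (0x80 <= b <= 0xBF) for b in seq[2:]):
            return False  # 3rd and 4th bytes must be plain continuation bytes
        i += length
    return True


if __name__ == "__main__":
    print(is_well_formed_utf8("héllo €".encode("utf-8")))  # well-formed text
    print(is_well_formed_utf8(b"\xc0\xaf"))                # overlong form, rejected
```

In real Python code a strict `data.decode("utf-8")` inside a `try` block does the same job; spelling the table out just makes the restrictions visible.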
PowerShell gives us a real CLI for Windows based around .NET stuff. I don't like the creation of a new shell language, but I suppose it makes sense given that they want something C#-like but not exactly C#, since that's much too verbose and strict for a CLI. One of the functions you can override is the TabExpansion function, which is used when you tab-complete commands. I really like this, and so I've added on to the standard implementation to support replacing a variable name with its value, tab completion of available commands, previous command history, and drive names (they're not restricted to just one letter in PS).
Learning the new language was a bit of a chore, but MSDN helped. A couple of things to note: a statement that has a return value that you don't do anything with implicitly becomes part of the return value of the current function. That's why there are no explicit returns in my TabExpansion function. Also, if your TabExpansion function fails or returns nothing, then the builtin TabExpansion function runs, which does just filenames. This is why you can see that the standard TabExpansion function doesn't handle normal filenames: it does extra stuff (like method and property completion on variables that represent .NET objects), but if there's no fancy extra stuff to be done it lets the builtin one take a crack.
Here's my TabExpansion function. Probably has bugs, so watch out!
function EscapePath([string] $path, [string] $original)
{
if ($path.Contains(' ') -and !$original.Contains(' '))
{
'"' + $path + '"';
}
else
{
$path;
}
}
function PathRelativeTo($pathDest, $pathCurrent)
{
if ($pathDest.PSParentPath.ToString().EndsWith($pathCurrent.Path))
{
'.\' + $pathDest.name;
}
else
{
$pathDest.FullName;
}
}
# This is the default function to use for tab expansion. It handles simple
# member expansion on variables, variable name expansion and parameter completion
# on commands. It doesn't understand strings so strings containing ; | ( or { may
# cause expansion to fail.
function TabExpansion($line, $lastWord)
{
switch -regex ($lastWord)
{
# Handle property and method expansion...
'(^.*)(\$(\w|\.)+)\.(\w*)$' {
$method = [Management.Automation.PSMemberTypes] `
'Method,CodeMethod,ScriptMethod,ParameterizedProperty'
$base = $matches[1]
$expression = $matches[2]
Invoke-Expression ('$val=' + $expression)
$pat = $matches[4] + '*'
Get-Member -inputobject $val $pat | sort membertype,name |
where { $_.name -notmatch '^[gs]et_'} |
foreach {
if ($_.MemberType -band $method)
{
# Return a method...
$base + $expression + '.' + $_.name + '('
}
else {
# Return a property...
$base + $expression + '.' + $_.name
}
}
break;
}
# Handle variable name expansion...
'(^.*\$)([\w\:]*)$' {
$prefix = $matches[1]
$varName = $matches[2]
foreach ($v in Get-Childitem ('variable:' + $varName + '*'))
{
if ($v.name -eq $varName)
{
$v.value
}
else
{
$prefix + $v.name
}
}
break;
}
# Do completion on parameters...
'^-([\w0-9]*)' {
$pat = $matches[1] + '*'
# extract the command name from the string
# first split the string into statements and pipeline elements
# This doesn't handle strings however.
$cmdlet = [regex]::Split($line, '[|;]')[-1]
# Extract the trailing unclosed block e.g. ls | foreach { cp
if ($cmdlet -match '\{([^\{\}]*)$')
{
$cmdlet = $matches[1]
}
# Extract the longest unclosed parenthetical expression...
if ($cmdlet -match '\(([^()]*)$')
{
$cmdlet = $matches[1]
}
# take the first space separated token of the remaining string
# as the command to look up. Trim any leading or trailing spaces
# so you don't get leading empty elements.
$cmdlet = $cmdlet.Trim().Split()[0]
# now get the info object for it...
$cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet)[0]
# loop resolving aliases...
while ($cmdlet.CommandType -eq 'alias') {
$cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet.Definition)[0]
}
# expand the parameter sets and emit the matching elements
foreach ($n in $cmdlet.ParameterSets | Select-Object -expand parameters)
{
$n = $n.name
if ($n -like $pat) { '-' + $n }
}
break;
}
default {
$varNameStar = $lastWord + '*';
foreach ($n in @(Get-Childitem $varNameStar))
{
$name = PathRelativeTo ($n) ($PWD);
if ($n.PSIsContainer)
{
EscapePath ($name + '\') ($lastWord);
}
else
{
EscapePath ($name) ($lastWord);
}
}
if (!$varNameStar.Contains('\'))
{
foreach ($n in @(Get-Command $varNameStar))
{
if ($n.CommandType.ToString().Equals('Application'))
{
foreach ($ext in @((cat Env:PathExt).Split(';')))
{
if ($n.Path.ToString().ToLower().EndsWith(($ext).ToString().ToLower()))
{
EscapePath($n.Path) ($lastWord);
}
}
}
else
{
EscapePath($n.Name) ($lastWord);
}
}
foreach ($n in @(Get-psdrive $varNameStar))
{
EscapePath($n.name + ":") ($lastWord);
}
}
foreach ($n in @(Get-History))
{
if ($n.CommandLine.StartsWith($line) -and $n.CommandLine -ne $line)
{
$lastWord + $n.CommandLine.Substring($line.Length);
}
}
# Add the original string to the end of the expansion list.
$lastWord;
break;
}
}
}
I just upgraded to the Zune 3.0 software, which includes games and purchasing music on the Zune via WiFi, and once again I'm thrilled that the new firmware is available for old Zunes like mine. Rooting around looking at the new features, I noticed Zune Badges for the first time. They're like Xbox Achievements. For example, I have a Pixies Silver Artist Power Listener award for listening to the Pixies over 1000 times. I know it's ridiculous but I like it, and now I want achievements for everything.
Achievements everywhere would require more developments in self-tracking. Self-trackers, folks who keep statistics on exactly when and what they eat, when and how much they exercise, anything one may track about one's self, were the topic of a Kevin Kelly Quantified Self blog post (also check out Cory Doctorow's SF short story The Things that Make Me Weak and Strange Get Engineered Away, featuring a colony of self-trackers). For someone like me with a medium-length attention span, the data collection needs to be completely automatic or I will lose interest and stop collecting within a week. Nike iPod shoes that keep track of how many steps the wearer takes are one example. I'll also need software to analyze, display, and share this data on a website like Mycrocosm. I don't want to have to spend extreme amounts of time to create something as wonderful as the Feltron Report (check out his statistic on how many daily measurements he takes for the report). Once we have the data we can give out achievements for everything!
Achievement | Description |
---|---|
Carnivore | Eat at least ten different kinds of animals. |
Make Friends | Meet at least 10% of the residents in your home town. |
Globetrotter | Visit a city in every country. |
You're Old | Survive at least 80 years of life. |
Of course none of the above is practical yet, but how about Delicious achievements based on the public Delicious feeds? That should be doable...
I had an idea for a Facebook app the other day. I wondered who actually looked at my profile and thought I could create a Facebook app that would record this information and display it. When I talked to Vishu, though, he said that this wasn't something that Facebook would be too happy with. Indeed, the Platform Policy explicitly disallows this in section 2.8. This explained why the app didn't already exist. It's probably for the best, since everyone assumes they can anonymously view Facebook profiles and would be irritated if that weren't the case.
On the topic of assumed anonymity, check out this article on the aggregation and selling off of your cell phone data including your physical location.