param([Parameter(Mandatory = $true)] [string] $Path);
$FullPathOriginal = (gp "HKLM:\System\CurrentControlSet\Control\Session Manager\Environment").Path;
if (!($FullPathOriginal.split(";") | ?{ $_ -like $Path })) {
sp "HKLM:\System\CurrentControlSet\Control\Session Manager\Environment" -name Path -value ($FullPathOriginal + ";" +
$Path);
}
Is this really the right way to do this? Feels icky:
To programmatically add or modify system environment variables, add them to the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Environment registry key, then broadcast a WM_SETTINGCHANGE message with lParam set to the string “Environment”.
By the URI RFC there is only one way to represent a particular IPv4 address in the host of a URI. This is the standard dotted decimal notation of four bytes in decimal with no leading zeroes delimited by periods. And no leading zeros are allowed which means there's only one textual representation of a particular IPv4 address.
However as discussed in the URI RFC, there are other forms of IPv4 addresses that although not officially allowed are generally accepted. Many implementations used inet_aton to parse the address from the URI which accepts more than just dotted decimal. Instead of dotted decimal, each dot delimited part can be in decimal, octal (if preceded by a '0') or hex (if preceded by '0x' or '0X'). And that's each section individually - they don't have to match. And there need not be 4 parts: there can be between 1 and 4 (inclusive). In case of less than 4, the last part in the string represents all of the left over bytes, not just one.
For example the following are all equivalent:
The bread and butter of URI related security issues is when one part of the system disagrees with another about the interpretation of the URI. So this non-standard, non-normal form syntax has been been a great source of security issues in the past. Its mostly well known now (CreateUri normalizes these non-normal forms to dotted decimal), but occasionally a good tool for bypassing naive URI blocking systems.
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).
Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.
Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:
http://example.com/the%20path/
http://example.com/the path/
For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:
http://example.com/foo%3Fbar
http://example.com/foo?bar
/foo
" from the query "bar
". And in the first, the querstion mark is percent-encoded and so
the path is "/foo%3Fbar
".
IETF draft on the contents of the User Agent HTTP header.
Irritatingly, my G1 won't show me PDFs so I've made the Google Docs PDF viewer which will load PDFs on the web up in Google Docs. Google Docs has the useful ability to display PDFs in web browsers without any Adobe software and works (mostly) on Android.
This was very easy to put together as an Android activity. First its necessary to register the application as handling PDFs from the web. This is done via the intent-filter declaration in the manifest:
intent-filter
action android:name="android.intent.action.VIEW"/
data android:scheme="http" android:mimeType="application/pdf"/
category android:name="android.intent.category.DEFAULT"/
category android:name="android.intent.category.BROWSABLE"/
/intent-filter
The action part says my activity will view PDFs, the data part says it accepts data with the PDF mime-type and with a URL that has an HTTP scheme. The browsable category
is necessary to allow links from a browser to open this activity.
Second, the activity opens up the browser to Google Docs pointing to the PDF.
Intent intent = new Intent();
intent.setAction(getIntent().getAction());
intent.setData(Uri.parse(
"http://docs.google.com/gview?embedded=true&url=" +
percentEncodeForQuery(getIntent().getData().toString())));
startActivity(intent);
This is very simple code to invoke a new intent browsing to a newly constructed URL for the PDF in Google Docs. That was easy.I built timestamp.exe, a Windows command line tool to convert between computer and human readable date/time formats mostly for working on the first run wizard for IE8. We commonly write out our dates in binary form to the registry and in order to test and debug my work it became useful to be able to determine to what date the binary value of a FILETIME or SYSTEMTIME corresponded or to produce my own binary value of a FILETIME and insert it into the registry.
For instance, to convert to a binary value:
[PS C:\] timestamp -inString 2009/08/28:10:18 -outHexValue -convert filetime
2009/08/28:10:18 as FILETIME: 00 7c c8 d1 c8 27 ca 01
Converting in the other direction, if you don't know what format the bytes are in, just feed them in and timestamp will try all conversions and list only the valid ones:
[PS C:\] timestamp -inHexValue "40 52 1c 3b"
40 52 1c 3b as FILETIME: 1601-01-01:00:01:39.171
40 52 1c 3b as Unix Time: 2001-06-05:03:30:08.000
40 52 1c 3b as DOS Time: 2009-08-28:10:18:00.000
(it also supports OLE Dates, and SYSTEMTIME which aren't listed there because the hex value isn't valid for those types). Or use the guess
option to get timestamp's best guess:
[PS C:\] timestamp -inHexValue "40 52 1c 3b" -convert guess
40 52 1c 3b as DOS Time: 2009-08-28:10:18:00.000
When I first wrote this I had a bug in my function that parses the date-time value string in which I could parse 2009-07-02:10:18 just fine, but I wouldn't be able to parse 2009-09-02:10:18 correctly. This was my code:
success = swscanf_s(timeString, L"%hi%*[\\/- ,]%hi%*[\\/- ,]%hi%*[\\/- ,Tt:.]%hi%*[:.]%hi%*[:.]%hi%*[:.]%hi",
&systemTime->wYear,
&systemTime->wMonth,
&systemTime->wDay,
&systemTime->wHour,
&systemTime->wMinute,
&systemTime->wSecond,
&systemTime->wMilliseconds) > 1;
See the problem?
To convert between these various forms yourself read The Old New Thing date conversion article or Josh Poley's date time article. I previously wrote about date formats I like and dislike.
Inspired by one of Penn's (of Penn & Teller) articles in which he mentions he has his computer tell him what he wrote in his journal that day the previous year, I've wanted to implement a similar thing with my blog. Now that, as I mentioned previously, I've updated my blog such that its much easier to implement search and such, I've added date range filtering to my site's search. So now I can easily see what on Delicious and my blog I was doing last year.
I've also otherwise updated search on this site. You can now quote terms to match an entire string, stick 'tag:' in front of a term to only match that term against tags as opposed to the title and body of the entry as well, and you can stick '-' in front of a term to indicate that it must not be found in the entry.
I've hooked up the printer/scanner to the Media Center PC since I leave that on all the time anyway so we can have a networked printer. I wanted to hook up the scanner in a somewhat similar fashion but I didn't want to install HP's software (other than the drivers of course). So I've written my own script for scanning in PowerShell that does the following:
Here's the actual code from my scan.ps1 file:
param([Switch] $ShowProgress, [switch] $OpenCompletedResult)
$filePathTemplate = "C:\users\public\pictures\scanned\scan {0} {1}.{2}";
$time = get-date -uformat "%Y-%m-%d";
[void]([reflection.assembly]::loadfile( "C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Drawing.dll"))
$deviceManager = new-object -ComObject WIA.DeviceManager
$device = $deviceManager.DeviceInfos.Item(1).Connect();
foreach ($item in $device.Items) {
$fileIdx = 0;
while (test-path ($filePathTemplate -f $time,$fileIdx,"*")) {
[void](++$fileIdx);
}
if ($ShowProgress) { "Scanning..." }
$image = $item.Transfer();
$fileName = ($filePathTemplate -f $time,$fileIdx,$image.FileExtension);
$image.SaveFile($fileName);
clear-variable image
if ($ShowProgress) { "Running OCR..." }
$modiDocument = new-object -comobject modi.document;
$modiDocument.Create($fileName);
$modiDocument.OCR();
if ($modiDocument.Images.Count -gt 0) {
$ocrText = $modiDocument.Images.Item(0).Layout.Text.ToString().Trim();
$modiDocument.Close();
clear-variable modiDocument
if (!($ocrText.Equals(""))) {
$fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $fileName
if (!($fileName.EndsWith(".jpg") -or $fileName.EndsWith(".jpeg"))) {
if ($ShowProgress) { "Converting to JPEG..." }
$newFileName = ($filePathTemplate -f $time,$fileIdx,"jpg");
$fileAsImage.Save($newFileName, [System.Drawing.Imaging.ImageFormat]::Jpeg);
$fileAsImage.Dispose();
del $fileName;
$fileAsImage = New-Object -TypeName system.drawing.bitmap -ArgumentList $newFileName
$fileName = $newFileName
}
if ($ShowProgress) { "Saving OCR Text..." }
$property = $fileAsImage.PropertyItems[0];
$property.Id = 40092;
$property.Type = 1;
$property.Value = [system.text.encoding]::Unicode.GetBytes($ocrText);
$property.Len = $property.Value.Count;
$fileAsImage.SetPropertyItem($property);
$fileAsImage.Save(($fileName + ".new"));
$fileAsImage.Dispose();
del $fileName;
ren ($fileName + ".new") $fileName
}
}
else {
$modiDocument.Close();
clear-variable modiDocument
}
if ($ShowProgress) { "Done." }
if ($OpenCompletedResult) {
. $fileName;
}
else {
$result = dir $fileName;
$result | add-member -membertype noteproperty -name OCRText -value $ocrText
$result
}
}
I ran into a few issues:
I've found while debugging networking in IE its often useful to quickly tell if a string is encoded in UTF-8. You can check for the Byte Order Mark (EF BB BF in UTF-8) but, I rarely see the BOM on UTF-8 strings. Instead I apply a quick and dirty UTF-8 test that takes advantage of the well-formed UTF-8 restrictions.
Unlike other multibyte character encoding forms (see Windows supported character sets or IANA's list of character sets), for example Big5, where sticking together any two bytes is more likely than not to give a valid byte sequence, UTF-8 is more restrictive. And unlike other multibyte character encodings, UTF-8 bytes may be taken out of context and one can still know that its a single byte character, the starting byte of a three byte sequence, etc.
The full rules for well-formed UTF-8 are a little too complicated for me to commit to memory. Instead I've got my own simpler (this is the quick part) set of rules that will be mostly correct (this is the dirty part). For as many bytes in the string as you care to examine, check the most significant digit of the byte:
Code Points | 1st Byte | 2nd Byte | 3rd Byte | 4th Byte |
---|---|---|---|---|
U+0000..U+007F | 00..7F | |||
U+0080..U+07FF | C2..DF | 80..BF | ||
U+0800..U+0FFF | E0 | A0..BF | 80..BF | |
U+1000..U+CFFF | E1..EC | 80..BF | 80..BF | |
U+D000..U+D7FF | ED | 80..9F | 80..BF | |
U+E000..U+FFFF | EE..EF | 80..BF | 80..BF | |
U+10000..U+3FFFF | F0 | 90..BF | 80..BF | 80..BF |
U+40000..U+FFFFF | F1..F3 | 80..BF | 80..BF | 80..BF |
U+100000..U+10FFFF | F4 | 80..8F | 80..BF | 80..BF |
PowerShell gives us a real CLI for Windows based around .Net stuff. I don't like the creation of a new shell language but I suppose it makes sense given that they want something C# like but not C# exactly since that's much to verbose and strict for a CLI. One of the functions you can override is the TabExpansion function which is used when you tab complete commands. I really like this and so I've added on to the standard implementation to support replacing a variable name with its value, tab completion of available commands, previous command history, and drive names (there not restricted to just one letter in PS).
Learning the new language was a bit of a chore but MSDN helped. A couple of things to note, a statement that has a return value that you don't do anything with is implicitly the return value for the current function. That's why there's no explicit return's in my TabExpansion function. Also, if you're TabExpansion function fails or returns nothing then the builtin TabExpansion function runs which does just filenames. This is why you can see that the standard TabExpansion function doesn't handle normal filenames: it does extra stuff (like method and property completion on variables that represent .Net objects) but if there's no fancy extra stuff to be done it lets the builtin one take a crack.
Here's my TabExpansion function. Probably has bugs, so watch out!
function EscapePath([string] $path, [string] $original)
{
if ($path.Contains(' ') -and !$original.Contains(' '))
{
'"' $path '"';
}
else
{
$path;
}
}
function PathRelativeTo($pathDest, $pathCurrent)
{
if ($pathDest.PSParentPath.ToString().EndsWith($pathCurrent.Path))
{
'.\' $pathDest.name;
}
else
{
$pathDest.FullName;
}
}
# This is the default function to use for tab expansion. It handles simple
# member expansion on variables, variable name expansion and parameter completion
# on commands. It doesn't understand strings so strings containing ; | ( or { may
# cause expansion to fail.
function TabExpansion($line, $lastWord)
{
switch -regex ($lastWord)
{
# Handle property and method expansion...
'(^.*)(\$(\w|\.) )\.(\w*)$' {
$method = [Management.Automation.PSMemberTypes] `
'Method,CodeMethod,ScriptMethod,ParameterizedProperty'
$base = $matches[1]
$expression = $matches[2]
Invoke-Expression ('$val=' $expression)
$pat = $matches[4] '*'
Get-Member -inputobject $val $pat | sort membertype,name |
where { $_.name -notmatch '^[gs]et_'} |
foreach {
if ($_.MemberType -band $method)
{
# Return a method...
$base $expression '.' $_.name '('
}
else {
# Return a property...
$base $expression '.' $_.name
}
}
break;
}
# Handle variable name expansion...
'(^.*\$)([\w\:]*)$' {
$prefix = $matches[1]
$varName = $matches[2]
foreach ($v in Get-Childitem ('variable:' $varName '*'))
{
if ($v.name -eq $varName)
{
$v.value
}
else
{
$prefix $v.name
}
}
break;
}
# Do completion on parameters...
'^-([\w0-9]*)' {
$pat = $matches[1] '*'
# extract the command name from the string
# first split the string into statements and pipeline elements
# This doesn't handle strings however.
$cmdlet = [regex]::Split($line, '[|;]')[-1]
# Extract the trailing unclosed block e.g. ls | foreach { cp
if ($cmdlet -match '\{([^\{\}]*)$')
{
$cmdlet = $matches[1]
}
# Extract the longest unclosed parenthetical expression...
if ($cmdlet -match '\(([^()]*)$')
{
$cmdlet = $matches[1]
}
# take the first space separated token of the remaining string
# as the command to look up. Trim any leading or trailing spaces
# so you don't get leading empty elements.
$cmdlet = $cmdlet.Trim().Split()[0]
# now get the info object for it...
$cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet)[0]
# loop resolving aliases...
while ($cmdlet.CommandType -eq 'alias') {
$cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet.Definition)[0]
}
# expand the parameter sets and emit the matching elements
foreach ($n in $cmdlet.ParameterSets | Select-Object -expand parameters)
{
$n = $n.name
if ($n -like $pat) { '-' $n }
}
break;
}
default {
$varNameStar = $lastWord '*';
foreach ($n in @(Get-Childitem $varNameStar))
{
$name = PathRelativeTo ($n) ($PWD);
if ($n.PSIsContainer)
{
EscapePath ($name '\') ($lastWord);
}
else
{
EscapePath ($name) ($lastWord);
}
}
if (!$varNameStar.Contains('\'))
{
foreach ($n in @(Get-Command $varNameStar))
{
if ($n.CommandType.ToString().Equals('Application'))
{
foreach ($ext in @((cat Env:PathExt).Split(';')))
{
if ($n.Path.ToString().ToLower().EndsWith(($ext).ToString().ToLower()))
{
EscapePath($n.Path) ($lastWord);
}
}
}
else
{
EscapePath($n.Name) ($lastWord);
}
}
foreach ($n in @(Get-psdrive $varNameStar))
{
EscapePath($n.name ":") ($lastWord);
}
}
foreach ($n in @(Get-History))
{
if ($n.CommandLine.StartsWith($line) -and $n.CommandLine -ne $line)
{
$lastWord $n.CommandLine.Substring($line.Length);
}
}
# Add the original string to the end of the expansion list.
$lastWord;
break;
}
}
}
Sarah asked me if I knew of a syntax highlighter for the QuickBase formula language which she uses at work. I couldn't find one but thought it might be fun to make a QuickBase Formula syntax highlighter based on the QuickBase help's description of the formula syntax. Thankfully the language is relatively simple since my skills with ANTLR, the parser generator, are rusty now and I've only used it previously for personal projects (like Javaish, the ridiculous Java based shell idea I had).
With the help of some great ANTLR examples and an ANTLR cheat sheet I was able to come up with the grammar that parses the QuickBase Formula syntax and prints out the same formula marked up with HTML SPAN tags and various CSS classes. ANTLR produces the parser in Java which I wrapped up in an applet, put in a jar, and embedded in an HTML page. The script in that page runs user input through the applet's parser and sticks the output at the bottom of the page with appropriate CSS rules to highlight and print the formula in a pretty fashion.
What I learned: