escape - Dave's Blog


Search

URI functions in Windows Store Applications

2013 Jul 25, 1:00

Summary

The Modern SDK contains some URI related functionality as do libraries available in particular projection languages. Unfortunately, collectively these APIs do not cover all scenarios in all languages. Specifically, JavaScript and C++ have no URI building APIs, and C++ additionally has no percent-encoding/decoding APIs.
WinRT (JS and C++)
JS Only
C++ Only
.NET Only
Parse
Build
Normalize
Equality
Relative resolution
Encode data for including in URI property
Decode data extracted from URI property
Build Query
Parse Query
The Windows.Foudnation.Uri type is not projected into .NET modern applications. Instead those applications use System.Uri and the platform ensures that it is correctly converted back and forth between Windows.Foundation.Uri as appropriate. Accordingly the column marked WinRT above is applicable to JS and C++ modern applications but not .NET modern applications. The only entries above applicable to .NET are the .NET Only column and the WwwFormUrlDecoder in the bottom left which is available to .NET.

Scenarios

Parse

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS, and by System.Uri in .NET.
Parsing a URI pulls it apart into its basic components without decoding or otherwise modifying the contents.
var uri = new Windows.Foundation.Uri("http://example.com/path%20segment1/path%20segment2?key1=value1&key2=value2");
console.log(uri.path);// /path%20segment1/path%20segment2

WsDecodeUrl (C++)

WsDecodeUrl is not suitable for general purpose URI parsing. Use Windows.Foundation.Uri instead.

Build (C#)

URI building is only available in C# via System.UriBuilder.
URI building is the inverse of URI parsing: URI building allows the developer to specify the value of basic components of a URI and the API assembles them into a URI.
To work around the lack of a URI building API developers will likely concatenate strings to form their URIs. This can lead to injection bugs if they don’t validate or encode their input properly, but if based on trusted or known input is unlikely to have issues.
����������� Uri originalUri = new Uri("http://example.com/path1/?query");
����������� UriBuilder uriBuilder = new UriBuilder(originalUri);
����������� uriBuilder.Path = "/path2/";
����������� Uri newUri = uriBuilder.Uri; // http://example.com/path2/?query

WsEncodeUrl (C++)

WsEncodeUrl, in addition to building a URI from components also does some encoding. It encodes non-US-ASCII characters as UTF8, the percent, and a subset of gen-delims based on the URI property: all :/?#[]@ are percent-encoded except :/@ in the path and :/?@ in query and fragment.
Accordingly, WsEncodeUrl is not suitable for general purpose URI building. It is acceptable to use in the following cases:
- You’re building a URI out of non-encoded URI properties and don’t care about the difference between encoded and decoded characters. For instance you’re the only one consuming the URI and you uniformly decode URI properties when consuming – for instance using WsDecodeUrl to consume the URI.
- You’re building a URI with URI properties that don’t contain any of the characters that WsEncodeUrl encodes.

Normalize

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET. Normalization is applied during construction of the Uri object.
URI normalization is the application of URI normalization rules (including DNS normalization, IDN normalization, percent-encoding normalization, etc.) to the input URI.
������� var normalizedUri = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/");
������� console.log(normalizedUri.absoluteUri); // http://example.com/path%20foo/
This is modulo Win8 812823 in which the Windows.Foundation.Uri.AbsoluteUri property returns a normalized IRI not a normalized URI. This bug does not affect System.Uri.AbsoluteUri which returns a normalized URI.

Equality

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET.
URI equality determines if two URIs are equal or not necessarily equal.
����������� var uri1 = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/"),
��������������� uri2 = new Windows.Foundation.Uri("http://example.com/path%20foo/");
����������� console.log(uri1.equals(uri2)); // true

Relative resolution

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET
Relative resolution is a function that given an absolute URI A and a relative URI B, produces a new absolute URI C. C is the combination of A and B in which the basic components specified in B override or combine with those in A under rules specified in RFC 3986.
������� var baseUri = new Windows.Foundation.Uri("http://example.com/index.html"),
��� ��������relativeUri = "/path?query#fragment",
��� ��������absoluteUri = baseUri.combineUri(relativeUri);
������� console.log(baseUri.absoluteUri);������ // http://example.com/index.html
������� console.log(absoluteUri.absoluteUri);�� // http://example.com/path?query#fragment

Encode data for including in URI property

This functionality is available in JavaScript via encodeURIComponent and in C# via System.Uri.EscapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now have Windows.Foundation.Uri.EscapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri). This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Encoding data for inclusion in a URI property is necessary when constructing a URI from data. In all the above cases the developer is dealing with a URI or substrings of a URI and so the strings are all encoded as appropriate. For instance, in the parsing example the path contains “path%20segment1” and not “path segment1”. To construct a URI one must first construct the basic components of the URI which involves encoding the data. For example, if one wanted to include “path segment / example” in the path of a URI, one must percent-encode the ‘ ‘ since it is not allowed in a URI, as well as the ‘/’ since although it is allowed, it is a delimiter and won’t be interpreted as data unless encoded.
If a developer does not have this API provided they can write it themselves. Percent-encoding methods appear simple to write, but the difficult part is getting the set of characters to encode correct, as well as handling non-US-ASCII characters.
������� var uri = new Windows.Foundation.Uri("http://example.com" +
����������� "/" + Windows.Foundation.Uri.escapeComponent("path segment / example") +
����������� "?key=" + Windows.Foundation.Uri.escapeComponent("=&?#"));
������� console.log(uri.absoluteUri); // http://example.com/path%20segment%20%2F%20example?key=%3D%26%3F%23

WsEncodeUrl (C++)

In addition to building a URI from components, WsEncodeUrl also percent-encodes some characters. However the API is not recommend for this scenario given the particular set of characters that are encoded and the convoluted nature in which a developer would have to use this API in order to use it for this purpose.
There are no general purpose scenarios for which the characters WsEncodeUrl encodes make sense: encode the %, encode a subset of gen-delims but not also encode the sub-delims. For instance this could not replace encodeURIComponent in a C++ version of the following code snippet since if ‘value’ contained ‘&’ or ‘=’ (both sub-delims) they wouldn’t be encoded and would be confused for delimiters in the name value pairs in the query:
"http://example.com/?key=" + Windows.Foundation.Uri.escapeComponent(value)
Since WsEncodeUrl produces a string URI, to obtain the property they want to encode they’d need to parse the resulting URI. WsDecodeUrl won’t work because it decodes the property but Windows.Foundation.Uri doesn’t decode. Accordingly the developer could run their string through WsEncodeUrl then Windows.Foundation.Uri to extract the property.

Decode data extracted from URI property

This functionality is available in JavaScript via decodeURIComponent and in C# via System.Uri.UnescapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now also have Windows.Foundation.Uri.UnescapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri). This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Decoding is necessary when extracting data from a parsed URI property. For example, if a URI query contains a series of name and value pairs delimited by ‘=’ between names and values, and by ‘&’ between pairs, one must first parse the query into name and value entries and then decode the values. It is necessary to make this an extra step separate from parsing the URI property so that sub-delimiters (in this case ‘&’ and ‘=’) that are encoded will be interpreted as data, and those that are decoded will be interpreted as delimiters.
If a developer does not have this API provided they can write it themselves. Percent-decoding methods appear simple to write, but have some tricky parts including correctly handling non-US-ASCII, and remembering not to decode .
In the following example, note that if unescapeComponent were called first, the encoded ‘&’ and ‘=’ would be decoded and interfere with the parsing of the name value pairs in the query.
����������� var uri = new Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
����������� uri.query.substr(1).split("&").forEach(
��������������� function (keyValueString) {
������������������� var keyValue = keyValueString.split("=");
������������������� console.log(Windows.Foundation.Uri.unescapeComponent(keyValue[0]) + ": " + Windows.Foundation.Uri.unescapeComponent(keyValue[1]));
������������������� // foo: bar
������������������� // array: ['','&','=','#']
��������������� });

WsDecodeUrl (C++)

Since WsDecodeUrl decodes all percent-encoded octets it could be used for general purpose percent-decoding but it takes a URI so would require the dev to construct a stub URI around the string they want to decode. For example they could prefix “http:///#” to their string, run it through WsDecodeUrl and then extract the fragment property. It is convoluted but will work correctly.

Parse Query

The query of a URI is often encoded as application/x-www-form-urlencoded which is percent-encoded name value pairs delimited by ‘&’ between pairs and ‘=’ between corresponding names and values.
In WinRT we have a class to parse this form of encoding using Windows.Foundation.WwwFormUrlDecoder. The queryParsed property on the Windows.Foundation.Uri class is of this type and created with the query of its Uri:
��� var uri = Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
��� uri.queryParsed.forEach(
������� function (pair) {
����������� console.log("name: " + pair.name + ", value: " + pair.value);
����������� // name: foo, value: bar
����������� // name: array, value: ['','&','=','#']
������� });
��� console.log(uri.queryParsed.getFirstValueByName("array")); // ['','&','=','#']
The QueryParsed property is only on Windows.Foundation.Uri and not System.Uri and accordingly is not available in .NET. However the Windows.Foundation.WwwFormUrlDecoder class is available in C# and can be used manually:
����������� Uri uri = new Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
����������� WwwFormUrlDecoder decoder = new WwwFormUrlDecoder(uri.Query);
���� �������foreach (IWwwFormUrlDecoderEntry entry in decoder)
����������� {
��������������� System.Diagnostics.Debug.WriteLine("name: " + entry.Name + ", value: " + entry.Value);
��������������� // name: foo, value: bar
��������������� // name: array, value: ['','&','=','#']
����������� }

Build Query

To build a query of name value pairs encoded as application/x-www-form-urlencoded there is no WinRT API to do this directly. Instead a developer must do this manually making use of the code described in “Encode data for including in URI property”.
In terms of public releases, this property is only in the RC and later builds.
For example in JavaScript a developer may write:
������� ����var uri = new Windows.Foundation.Uri("http://example.com/"),
��������������� query = "?" + Windows.Foundation.Uri.escapeComponent("array") + "=" + Windows.Foundation.Uri.escapeComponent("['','&','=','#']");
����������� console.log(uri.combine(new Windows.Foundation.Uri(query)).absoluteUri); // http://example.com/?array=%5B'%E3%84%93'%2C'%26'%2C'%3D'%2C'%23'%5D
PermalinkCommentsc# c++ javascript technical uri windows windows-runtime windows-store

Stripe CTF - XSS, CSRF (Levels 4 & 6)

2012 Sep 10, 4:43

Level 4 and level 6 of the Stripe CTF had solutions around XSS.

Level 4

Code

> Registered Users 

  • <% @registered_users.each do |user| %>
    <% last_active = user[:last_active].strftime('%H:%M:%S UTC') %>
    <% if @trusts_me.include?(user[:username]) %>

  • <%= user[:username] %>
    (password: <%= user[:password] %>, last active <%= last_active %>)
  • Issue

    The level 4 web application lets you transfer karma to another user and in doing so you are also forced to expose your password to that user. The main user page displays a list of users who have transfered karma to you along with their password. The password is not HTML encoded so we can inject HTML into that user's browser. For instance, we could create an account with the following HTML as the password which will result in XSS with that HTML:

    
    
    This HTML runs script that uses jQuery to post to the transfer URI resulting in a transfer of karma from the attacked user to the attacker user, and also the attacked user's password.

    Notes

    Code review red flags in this case included lack of encoding when using user controlled content to create HTML content, storing passwords in plain text in the database, and displaying passwords generally. By design the web app shows users passwords which is a very bad idea.

    Level 6

    Code



    ...

    def self.safe_insert(table, key_values)
    key_values.each do |key, value|
    # Just in case people try to exfiltrate
    # level07-password-holder's password
    if value.kind_of?(String) &&
    (value.include?('"') || value.include?("'"))
    raise "Value has unsafe characters"
    end
    end

    conn[table].insert(key_values)
    end

    Issue

    This web app does a much better job than the level 4 app with HTML injection. They use encoding whenever creating HTML using user controlled data, however they don't use encoding when injecting JSON data into script (see post_data initialization above). This JSON data is the last five most recent messages sent on the app so we get to inject script directly. However, the system also ensures that no strings we write contains single or double quotes so we can't get out of the string in the JSON data directly. As it turns out, HTML lets you jump out of a script block using no matter where you are in script. For instance, in the middle of a value in some JSON data we can jump out of script. But we still want to run script, so we can jump right back in. So the frame so far for the message we're going to post is the following:

    
    
PermalinkCommentscsrf encoding html internet javascript percent-encoding script security stripe-ctf technical web xss

URI Percent Encoding Ignorance Level 2 - There is no Unencoded URI

2012 Feb 20, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.

Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:

  1. http://example.com/%74%68%65%3F%70%61%74%68?query
  2. http://example.com/the%3Fpath?query
  3. "http", "example.com", "the?path", "query"

In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.

Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.

Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.

PermalinkCommentsurl encoding uri technical percent-encoding

YouTube - PengWIN

2009 Jan 14, 6:10An epic penguin escape!PermalinkCommentspenguin cute whale video youtube for:hellosarah

Tab Expansion in PowerShell

2008 Nov 18, 6:38

PowerShell gives us a real CLI for Windows based around .Net stuff. I don't like the creation of a new shell language but I suppose it makes sense given that they want something C# like but not C# exactly since that's much to verbose and strict for a CLI. One of the functions you can override is the TabExpansion function which is used when you tab complete commands. I really like this and so I've added on to the standard implementation to support replacing a variable name with its value, tab completion of available commands, previous command history, and drive names (there not restricted to just one letter in PS).

Learning the new language was a bit of a chore but MSDN helped. A couple of things to note, a statement that has a return value that you don't do anything with is implicitly the return value for the current function. That's why there's no explicit return's in my TabExpansion function. Also, if you're TabExpansion function fails or returns nothing then the builtin TabExpansion function runs which does just filenames. This is why you can see that the standard TabExpansion function doesn't handle normal filenames: it does extra stuff (like method and property completion on variables that represent .Net objects) but if there's no fancy extra stuff to be done it lets the builtin one take a crack.

Here's my TabExpansion function. Probably has bugs, so watch out!

function EscapePath([string] $path, [string] $original)
{
    if ($path.Contains(' ') -and !$original.Contains(' '))
    {
        '"'   $path   '"';
    }
    else
    {
        $path;
    }
}

function PathRelativeTo($pathDest, $pathCurrent)
{
    if ($pathDest.PSParentPath.ToString().EndsWith($pathCurrent.Path))
    {
        '.\'   $pathDest.name;
    }
    else
    {
        $pathDest.FullName;
    }
}

#  This is the default function to use for tab expansion. It handles simple
# member expansion on variables, variable name expansion and parameter completion
# on commands. It doesn't understand strings so strings containing ; | ( or { may
# cause expansion to fail.

function TabExpansion($line, $lastWord)
{
    switch -regex ($lastWord)
    {
         # Handle property and method expansion...
         '(^.*)(\$(\w|\.) )\.(\w*)$' {
             $method = [Management.Automation.PSMemberTypes] `
                 'Method,CodeMethod,ScriptMethod,ParameterizedProperty'
             $base = $matches[1]
             $expression = $matches[2]
             Invoke-Expression ('$val='   $expression)
             $pat = $matches[4]   '*'
             Get-Member -inputobject $val $pat | sort membertype,name |
                 where { $_.name -notmatch '^[gs]et_'} |
                 foreach {
                     if ($_.MemberType -band $method)
                     {
                         # Return a method...
                         $base   $expression   '.'   $_.name   '('
                     }
                     else {
                         # Return a property...
                         $base   $expression   '.'   $_.name
                     }
                 }
             break;
          }

         # Handle variable name expansion...
         '(^.*\$)([\w\:]*)$' {
             $prefix = $matches[1]
             $varName = $matches[2]
             foreach ($v in Get-Childitem ('variable:'   $varName   '*'))
             {
                 if ($v.name -eq $varName)
                 {
                     $v.value
                 }
                 else
                 {
                    $prefix   $v.name
                 }
             }
             break;
         }

         # Do completion on parameters...
         '^-([\w0-9]*)' {
             $pat = $matches[1]   '*'

             # extract the command name from the string
             # first split the string into statements and pipeline elements
             # This doesn't handle strings however.
             $cmdlet = [regex]::Split($line, '[|;]')[-1]

             #  Extract the trailing unclosed block e.g. ls | foreach { cp
             if ($cmdlet -match '\{([^\{\}]*)$')
             {
                 $cmdlet = $matches[1]
             }

             # Extract the longest unclosed parenthetical expression...
             if ($cmdlet -match '\(([^()]*)$')
             {
                 $cmdlet = $matches[1]
             }

             # take the first space separated token of the remaining string
             # as the command to look up. Trim any leading or trailing spaces
             # so you don't get leading empty elements.
             $cmdlet = $cmdlet.Trim().Split()[0]

             # now get the info object for it...
             $cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet)[0]

             # loop resolving aliases...
             while ($cmdlet.CommandType -eq 'alias') {
                 $cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet.Definition)[0]
             }

             # expand the parameter sets and emit the matching elements
             foreach ($n in $cmdlet.ParameterSets | Select-Object -expand parameters)
             {
                 $n = $n.name
                 if ($n -like $pat) { '-'   $n }
             }
             break;
         }

         default {
             $varNameStar = $lastWord   '*';

             foreach ($n in @(Get-Childitem $varNameStar))
             {
                 $name = PathRelativeTo ($n) ($PWD);

                 if ($n.PSIsContainer)
                 {
                     EscapePath ($name   '\') ($lastWord);
                 }
                 else
                 {
                     EscapePath ($name) ($lastWord);
                 }
             }

             if (!$varNameStar.Contains('\'))
             {
                foreach ($n in @(Get-Command $varNameStar))
                {
                    if ($n.CommandType.ToString().Equals('Application'))
                    {
                       foreach ($ext in @((cat Env:PathExt).Split(';')))
                       {
                          if ($n.Path.ToString().ToLower().EndsWith(($ext).ToString().ToLower()))
                          {
                              EscapePath($n.Path) ($lastWord);
                          }
                       }
                    }
                    else
                    {
                        EscapePath($n.Name) ($lastWord);
                    }
                }

                foreach ($n in @(Get-psdrive $varNameStar))
                {
                    EscapePath($n.name   ":") ($lastWord);
                }
             }

             foreach ($n in @(Get-History))
             {
                 if ($n.CommandLine.StartsWith($line) -and $n.CommandLine -ne $line)
                 {
                     $lastWord   $n.CommandLine.Substring($line.Length);
                 }
             }

             # Add the original string to the end of the expansion list.
             $lastWord;

             break;
         }
    }
}


PermalinkCommentscli technical tabexpansion powershell

URI Fragment Info Roundup

2008 Apr 21, 11:53

['Neverending story' by Alexandre Duret-Lutz. A framed photo of books with the droste effect applied. Licensed under creative commons.]Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.

HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.

PermalinkCommentsxml text ajax technical url boring uri fragment rfc

Harold & Kumar Escape from Guantanamo Bay (2008)

2008 Feb 18, 2:24I saw the poster and thought someone was photoshopping me. I actually enjoyed the first movie (minus the bathroom scene).PermalinkCommentshumor movie harold-and-kumar guantanamo-bay

Web Q&A: XPath, XML Notepad, Data Islands, Case Sensitivity, XSL, and More -- MSDN Magazine, September 2001

2008 Feb 7, 2:36To summ up the last Q&A, the one I was interested in: "Is there any way to escape the characters " and ' in an XPath expression...". And their answer is no. Lame. I thought XPath folk would have defined this.PermalinkCommentsmicrosoft msdn xpath xml article

Which which - Batch File Hackiness

2007 Aug 9, 5:41To satisfy my hands which have already learned to type *nix commands I like to install Win32 versions of common GNU utilities. Unfortunately, the which command is a rather literal port and requires you to enter the entire name of the command for which you're looking. That is 'which which' won't find itself but 'which which.exe' will. This makes this almost useless for me so I thought to write my own as a batch file. I had learned about a few goodies available in cmd.exe that I thought would make this an easy task. It turned out to be more difficult than I thought.

for /F "usebackq tokens=*" %%a in ( `"echo %PATH:;=& echo %"` ) do (
    for /F "usebackq tokens=*" %%b in ( `"echo %PATHEXT:;=& echo %"` ) do (
        if exist "%%a"\%1%%b (
            for  %%c in ( "%%a"\%1%%b ) do (
                echo %%~fc
            )
        )
    )
)
The environment variables PATH and PATHEXT hold the list of paths to search through to find commands, and the extensions of files that should be run as commands respectively. The 'for /F "usebackq tokens=*" %%a in (...) do (...)' runs the 'do' portion with %%a sequentially taking on the value of every line in the 'in' portion. That's nice, but PATH and PATHEXT don't have their elements on different lines and I don't know of a way to escape a newline character to appear in a batch file. In order to get the PATH and PATHEXT's elements onto different lines I used the %ENV:a=b% syntax which replaces occurrences of a with b in the value of ENV. I replaced the ';' delimiter with the text '& echo ' which means %PATHEXT:;=& echo% evaluates to something like "echo .COM& echo .EXE& echo .BAT& ...". I have to put the whole expression in double quotes in order to escape the '&' for appearing in the batch file. The usebackq and the backwards quotes means that the backquoted string should be replaced with the output of the execution of its content. So in that fashion I'm able to get each element of the env. variable onto new lines. The rest is pretty straight forward.

Also, it supports wildcards:
C:\Users\davris>which.cmd *hi*
C:\Windows\System32\GRAPHICS.COM
C:\Windows\System32\SearchIndexer.exe
D:\bin\which.exe
D:\bin\which.cmd
PermalinkCommentswhich cmd technical batch for

Backup Notes

2007 Jul 13, 8:30I bought an external backup drive a few weekends ago. I've previously setup a Subversion repository so I decided to move everything into the repository and then back it up. So in went the contents of all of my %USERPROFILE% and ~ directories with a bit of sorting and pruning. Not too much though given its much easier to dump in everything and search for what I want then to take the time to examine and grade each file. What follows are the notes I took while setting this up. It takes me a bit of time to look up the help on each command so I figure I'll write it all down here for the benefit of myself and potentially others...

Setting Up the Backup Drive For Linux
I first changed the filesystem on the drive to ext3. I plugged it into my USB2.0 port and ran fdisk:

sudo fdisk /dev/sda

Useful commands I used to do this follow mostly in order:
m
help
p
print current partitions
d
delete current partition
n
create new partition (I used the defaults)
w
write changes and exit
Then I formatted for ext3.

sudo mkfs.ext3 /dev/sda1

I made it easy to mount:

sudo vim /etc/fstab
# added line to end:
/dev/sda1 /media/backup ext3 rw,user,noauto 0 0

I setup the directory structure on the disk

mount /media/backup
sudo mkdir /media/backup/users
sudo mkdir /media/backup/users/dave
sudo chown dave:dave /media/backup/users/dave


After all that its easy to make a copy of the Subversion repository:

mount /media/backup
cp -Rv /home/dave/svn /media/backup/users/dave/
umount /media/backup

Next on the agenda is to add a cron job to do this regularly.

Subversion Command Reference
On a machine that has local access to the Subversion repository you can check out a specific subdirectory as follows using the file scheme:

svn co file:///home/dave/svn/trunk/web/dave%40deletethis.net/public_html

Note also that although one of my directories is named 'dave@deletethis.net' Subversion requires the '@' to be percent-encoded.
Other useful subversion commands:
svn help
help
svn list file:///home/dave/svn/
list all files in root dir of svn depot
svn list -R file:///home/dave/svn/
list all files in svn depot
svn list -R file:///home/dave/svn/ | grep \/$
list all directories
svn status
List status of all files in the working copy directory as in - modified, not in repository, etc
svn update
Brings the working copy up to date wrt the repository
svn commit
Commit changes from the working copy to the repository
svn add / move / delete
Perform the specified action -- occurs immediately


Setting up Windows Client for Auto Auth into SVN
When using an SVN client on Windows via svn+ssh its useful to have the Windows automatically generate connections to the SVN server. I use putty on my Windows machines so I read the directions on using public keys with putty.

putty.exe dave@deletethis.net
cd .ssh
vim authorized_keys # leave the putty window open for now
puttygen.exe
Click the 'generate' button
Move the mouse around until finished
Copy text in 'Public key for pasting into OpenSSH authorized_keys file:' to putty window & save & close putty window
Enter Key passphrase & Comment in puttygen
Save the private key somewhere private
pageant.exe
'Add Key' the private key just saved.



Checking out using Tortoise SVN
On one of my Windows machines I've already installed Tortoise SVN. Checking out from my SVN repository was really easy. I just right clicked in Explorer in a directory and selected "SVN Checkout...". Then in the following dialog I entered the svn URI:

svn+ssh://dave@deletethis.net/home/dave/svn/trunk/web/dave%40deletethis.net/public_html/

Note again that the '@' that is part of the directory name is percent-encoded as '%40' while the '@' in the userinfo is not.

Windows Command Line Check Out
On my media center I didn't want to install Tortoise SVN so rather I used the command line tool. I setup pageant like before the only difficulty was getting the SVN command line tool to use putty. With the default configuration you can use the SVN_SSH environment variable to point at a compliant SSH command line tool. The trick is that its interpreted as a backslash escaped string. So I set mine thusly:

set SVN_SSH=C:\\users\\dave\\bin\\putty\\plink.exe

The escaping solved the vague error I received about not being able to create the tunnel.PermalinkCommentsbackup technical personal windows svn linux subversion

Extensible Markup Language (XML) 1.0 (Third Edition)

2007 Mar 8, 1:02Defintion of the escape mechanism used in XML.PermalinkCommentsxml escape encoding reference w3c standard character-entity-reference

CSS2.1: Section 4.1.3: Syntax and basic data types - Characters and Case

2006 Apr 18, 8:49PermalinkCommentscss w3c ietf html internet web encoding escape escape-sequence

Bob Bemer

2006 Mar 27, 11:06Bob Bemer - invetor of the back slashPermalinkCommentsbob-bemer ascii back-slash reverse-solidus escape escape-sequence y2k-problem

Character entity references

2005 Jul 18, 10:56HTML Encoding (Character entity references)PermalinkCommentshtml reference specification w3c encoding escape escape-sequence character-entity-reference
Older Entries Creative Commons License Some rights reserved.