percent-encoding - Dave's Blog

Search

JavaScript Microsoft Store app StartPage

2017 Jun 22, 8:58

JavaScript Microsoft Store apps have some details related to activation that are specific to JavaScript Store apps and that are poorly documented which I’ll describe here.

StartPage syntax

The StartPage attributes in the AppxManifest.xml (Package/Applications/Application/@StartPage, Package/Applications/Extensions/Extension/@StartPage) define the HTML page entry point for that kind of activation. That is, Application/@StartPage defines the entry point for tile activation, Extension[@Category="windows.protocol"]/@StartPage defines the entry point for URI handling activation, etc. There are two kinds of supported values in StartPage attributes: relative Windows file paths and absolute URIs. If the attribute doesn’t parse as an absolute URI then it is instead interpreted as relative Windows file path.

This implies a few things that I’ll declare explicitly here. Windows file paths, unlike URIs, don’t have a query or fragment, so if you are using a relative Windows file path for your StartPage attribute you cannot include anything like ‘?param=value’ at the end. Absolute URIs use percent-encoding for reserved characters like ‘%’ and ‘#’. If you have a ‘#’ in your HTML filename then you need to percent-encode that ‘#’ for a URI and not for a relative Windows file path.

If you specify a relative Windows file path, it is turned into an ms-appx URI by changing all backslashes to forward slashes, percent-encoding reserved characters, and combining the result with a base URI of ms-appx:///. Accordingly the relative Windows file paths are relative to the root of your package. If you are using a relative Windows file path as your StartPage and need to switch to using a URI so you can include a query or fragment, you can follow the same steps above.

StartPage validity

The validity of the StartPage is not determined before activation. If the StartPage is a relative Windows file path for a file that doesn’t exist, or an absolute URI that is not in the Application Content URI Rules, or something that doesn’t parse as a Windows file path or URI, or otherwise an absolute URI that fails to resolve (404, bad hostname, etc etc) then the JavaScript app will navigate to the app’s navigation error page (perhaps more on that in a future blog post). Just to call it out explicitly because I have personally accidentally done this: StartPage URIs are not automatically included in the Application Content URI Rules and if you forget to include your StartPage in your ACUR you will always fail to navigate to that StartPage.

StartPage navigation

When your app is activated for a particular activation kind, the StartPage value from the entry in your app’s manifest that corresponds to that activation kind is used as the navigation target. If the app is not already running, the app is activated, navigated to that StartPage value and then the Windows.UI.WebUI.WebUIApplication activated event is fired (more details on the order of various events in a moment). If, however, your app is already running and an activation occurs, we navigate or don’t navigate to the corresponding StartPage depending on the current page of the app. Take the app’s current top level document’s URI and if after removing the fragment it already matches the StartPage value then we won’t navigate and will jump straight to firing the WebUIApplication activated event.

Since navigating the top-level document means destroying the current JavaScript engine instance and losing all your state, this behavior might be a problem for you. If so, you can use the MSApp.pageHandlesAllApplicationActivations(true) API to always skip navigating to the StartPage and instead always jump straight to firing the WebUIApplication activated event. This does require of course that all of your pages all handle all activation kinds about which any part of your app cares.

PermalinkComments

URI functions in Windows Store Applications

2013 Jul 25, 1:00

Summary

The Modern SDK contains some URI related functionality as do libraries available in particular projection languages. Unfortunately, collectively these APIs do not cover all scenarios in all languages. Specifically, JavaScript and C++ have no URI building APIs, and C++ additionally has no percent-encoding/decoding APIs.
WinRT (JS and C++)
JS Only
C++ Only
.NET Only
Parse
 
Build
Normalize
Equality
 
 
Relative resolution
Encode data for including in URI property
Decode data extracted from URI property
Build Query
Parse Query
The Windows.Foudnation.Uri type is not projected into .NET modern applications. Instead those applications use System.Uri and the platform ensures that it is correctly converted back and forth between Windows.Foundation.Uri as appropriate. Accordingly the column marked WinRT above is applicable to JS and C++ modern applications but not .NET modern applications. The only entries above applicable to .NET are the .NET Only column and the WwwFormUrlDecoder in the bottom left which is available to .NET.

Scenarios

Parse

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS, and by System.Uri in .NET.
Parsing a URI pulls it apart into its basic components without decoding or otherwise modifying the contents.
var uri = new Windows.Foundation.Uri("http://example.com/path%20segment1/path%20segment2?key1=value1&key2=value2");
console.log(uri.path);// /path%20segment1/path%20segment2

WsDecodeUrl (C++)

WsDecodeUrl is not suitable for general purpose URI parsing.  Use Windows.Foundation.Uri instead.

Build (C#)

URI building is only available in C# via System.UriBuilder.
URI building is the inverse of URI parsing: URI building allows the developer to specify the value of basic components of a URI and the API assembles them into a URI. 
To work around the lack of a URI building API developers will likely concatenate strings to form their URIs.  This can lead to injection bugs if they don’t validate or encode their input properly, but if based on trusted or known input is unlikely to have issues.
            Uri originalUri = new Uri("http://example.com/path1/?query");
            UriBuilder uriBuilder = new UriBuilder(originalUri);
            uriBuilder.Path = "/path2/";
            Uri newUri = uriBuilder.Uri; // http://example.com/path2/?query

WsEncodeUrl (C++)

WsEncodeUrl, in addition to building a URI from components also does some encoding.  It encodes non-US-ASCII characters as UTF8, the percent, and a subset of gen-delims based on the URI property: all :/?#[]@ are percent-encoded except :/@ in the path and :/?@ in query and fragment.
Accordingly, WsEncodeUrl is not suitable for general purpose URI building.  It is acceptable to use in the following cases:
- You’re building a URI out of non-encoded URI properties and don’t care about the difference between encoded and decoded characters.  For instance you’re the only one consuming the URI and you uniformly decode URI properties when consuming – for instance using WsDecodeUrl to consume the URI.
- You’re building a URI with URI properties that don’t contain any of the characters that WsEncodeUrl encodes.

Normalize

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET.  Normalization is applied during construction of the Uri object.
URI normalization is the application of URI normalization rules (including DNS normalization, IDN normalization, percent-encoding normalization, etc.) to the input URI.
        var normalizedUri = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/");
        console.log(normalizedUri.absoluteUri); // http://example.com/path%20foo/
This is modulo Win8 812823 in which the Windows.Foundation.Uri.AbsoluteUri property returns a normalized IRI not a normalized URI.  This bug does not affect System.Uri.AbsoluteUri which returns a normalized URI.

Equality

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET. 
URI equality determines if two URIs are equal or not necessarily equal.
            var uri1 = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/"),
                uri2 = new Windows.Foundation.Uri("http://example.com/path%20foo/");
            console.log(uri1.equals(uri2)); // true

Relative resolution

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET 
Relative resolution is a function that given an absolute URI A and a relative URI B, produces a new absolute URI C.  C is the combination of A and B in which the basic components specified in B override or combine with those in A under rules specified in RFC 3986.
        var baseUri = new Windows.Foundation.Uri("http://example.com/index.html"),
            relativeUri = "/path?query#fragment",
            absoluteUri = baseUri.combineUri(relativeUri);
        console.log(baseUri.absoluteUri);       // http://example.com/index.html
        console.log(absoluteUri.absoluteUri);   // http://example.com/path?query#fragment

Encode data for including in URI property

This functionality is available in JavaScript via encodeURIComponent and in C# via System.Uri.EscapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now have Windows.Foundation.Uri.EscapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Encoding data for inclusion in a URI property is necessary when constructing a URI from data.  In all the above cases the developer is dealing with a URI or substrings of a URI and so the strings are all encoded as appropriate. For instance, in the parsing example the path contains “path%20segment1” and not “path segment1”.  To construct a URI one must first construct the basic components of the URI which involves encoding the data.  For example, if one wanted to include “path segment / example” in the path of a URI, one must percent-encode the ‘ ‘ since it is not allowed in a URI, as well as the ‘/’ since although it is allowed, it is a delimiter and won’t be interpreted as data unless encoded.
If a developer does not have this API provided they can write it themselves.  Percent-encoding methods appear simple to write, but the difficult part is getting the set of characters to encode correct, as well as handling non-US-ASCII characters.
        var uri = new Windows.Foundation.Uri("http://example.com" +
            "/" + Windows.Foundation.Uri.escapeComponent("path segment / example") +
            "?key=" + Windows.Foundation.Uri.escapeComponent("=&?#"));
        console.log(uri.absoluteUri); // http://example.com/path%20segment%20%2F%20example?key=%3D%26%3F%23

WsEncodeUrl (C++)

In addition to building a URI from components, WsEncodeUrl also percent-encodes some characters.  However the API is not recommend for this scenario given the particular set of characters that are encoded and the convoluted nature in which a developer would have to use this API in order to use it for this purpose.
There are no general purpose scenarios for which the characters WsEncodeUrl encodes make sense: encode the %, encode a subset of gen-delims but not also encode the sub-delims.  For instance this could not replace encodeURIComponent in a C++ version of the following code snippet since if ‘value’ contained ‘&’ or ‘=’ (both sub-delims) they wouldn’t be encoded and would be confused for delimiters in the name value pairs in the query:
"http://example.com/?key=" + Windows.Foundation.Uri.escapeComponent(value)
Since WsEncodeUrl produces a string URI, to obtain the property they want to encode they’d need to parse the resulting URI.  WsDecodeUrl won’t work because it decodes the property but Windows.Foundation.Uri doesn’t decode.  Accordingly the developer could run their string through WsEncodeUrl then Windows.Foundation.Uri to extract the property.

Decode data extracted from URI property

This functionality is available in JavaScript via decodeURIComponent and in C# via System.Uri.UnescapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now also have Windows.Foundation.Uri.UnescapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Decoding is necessary when extracting data from a parsed URI property.  For example, if a URI query contains a series of name and value pairs delimited by ‘=’ between names and values, and by ‘&’ between pairs, one must first parse the query into name and value entries and then decode the values.  It is necessary to make this an extra step separate from parsing the URI property so that sub-delimiters (in this case ‘&’ and ‘=’) that are encoded will be interpreted as data, and those that are decoded will be interpreted as delimiters.
If a developer does not have this API provided they can write it themselves.  Percent-decoding methods appear simple to write, but have some tricky parts including correctly handling non-US-ASCII, and remembering not to decode .
In the following example, note that if unescapeComponent were called first, the encoded ‘&’ and ‘=’ would be decoded and interfere with the parsing of the name value pairs in the query.
            var uri = new Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            uri.query.substr(1).split("&").forEach(
                function (keyValueString) {
                    var keyValue = keyValueString.split("=");
                    console.log(Windows.Foundation.Uri.unescapeComponent(keyValue[0]) + ": " + Windows.Foundation.Uri.unescapeComponent(keyValue[1]));
                    // foo: bar
                    // array: ['','&','=','#']
                });

WsDecodeUrl (C++)

Since WsDecodeUrl decodes all percent-encoded octets it could be used for general purpose percent-decoding but it takes a URI so would require the dev to construct a stub URI around the string they want to decode.  For example they could prefix “http:///#” to their string, run it through WsDecodeUrl and then extract the fragment property.  It is convoluted but will work correctly.

Parse Query

The query of a URI is often encoded as application/x-www-form-urlencoded which is percent-encoded name value pairs delimited by ‘&’ between pairs and ‘=’ between corresponding names and values.
In WinRT we have a class to parse this form of encoding using Windows.Foundation.WwwFormUrlDecoder.  The queryParsed property on the Windows.Foundation.Uri class is of this type and created with the query of its Uri:
    var uri = Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
    uri.queryParsed.forEach(
        function (pair) {
            console.log("name: " + pair.name + ", value: " + pair.value);
            // name: foo, value: bar
            // name: array, value: ['','&','=','#']
        });
    console.log(uri.queryParsed.getFirstValueByName("array")); // ['','&','=','#']
The QueryParsed property is only on Windows.Foundation.Uri and not System.Uri and accordingly is not available in .NET.  However the Windows.Foundation.WwwFormUrlDecoder class is available in C# and can be used manually:
            Uri uri = new Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            WwwFormUrlDecoder decoder = new WwwFormUrlDecoder(uri.Query);
            foreach (IWwwFormUrlDecoderEntry entry in decoder)
            {
                System.Diagnostics.Debug.WriteLine("name: " + entry.Name + ", value: " + entry.Value);
                // name: foo, value: bar
                // name: array, value: ['','&','=','#']
            }
 

Build Query

To build a query of name value pairs encoded as application/x-www-form-urlencoded there is no WinRT API to do this directly.  Instead a developer must do this manually making use of the code described in “Encode data for including in URI property”.
In terms of public releases, this property is only in the RC and later builds.
For example in JavaScript a developer may write:
            var uri = new Windows.Foundation.Uri("http://example.com/"),
                query = "?" + Windows.Foundation.Uri.escapeComponent("array") + "=" + Windows.Foundation.Uri.escapeComponent("['','&','=','#']");
 
            console.log(uri.combine(new Windows.Foundation.Uri(query)).absoluteUri); // http://example.com/?array=%5B'%E3%84%93'%2C'%26'%2C'%3D'%2C'%23'%5D
 
PermalinkCommentsc# c++ javascript technical uri windows windows-runtime windows-store

Stripe CTF - XSS, CSRF (Levels 4 & 6)

2012 Sep 10, 4:43

Level 4 and level 6 of the Stripe CTF had solutions around XSS.

Level 4

Code

> Registered Users 

  • <% @registered_users.each do |user| %>
    <% last_active = user[:last_active].strftime('%H:%M:%S UTC') %>
    <% if @trusts_me.include?(user[:username]) %>

  • <%= user[:username] %>
    (password: <%= user[:password] %>, last active <%= last_active %>)
  • Issue

    The level 4 web application lets you transfer karma to another user and in doing so you are also forced to expose your password to that user. The main user page displays a list of users who have transfered karma to you along with their password. The password is not HTML encoded so we can inject HTML into that user's browser. For instance, we could create an account with the following HTML as the password which will result in XSS with that HTML:

    
    
    This HTML runs script that uses jQuery to post to the transfer URI resulting in a transfer of karma from the attacked user to the attacker user, and also the attacked user's password.

    Notes

    Code review red flags in this case included lack of encoding when using user controlled content to create HTML content, storing passwords in plain text in the database, and displaying passwords generally. By design the web app shows users passwords which is a very bad idea.

    Level 6

    Code



    ...

    def self.safe_insert(table, key_values)
    key_values.each do |key, value|
    # Just in case people try to exfiltrate
    # level07-password-holder's password
    if value.kind_of?(String) &&
    (value.include?('"') || value.include?("'"))
    raise "Value has unsafe characters"
    end
    end

    conn[table].insert(key_values)
    end

    Issue

    This web app does a much better job than the level 4 app with HTML injection. They use encoding whenever creating HTML using user controlled data, however they don't use encoding when injecting JSON data into script (see post_data initialization above). This JSON data is the last five most recent messages sent on the app so we get to inject script directly. However, the system also ensures that no strings we write contains single or double quotes so we can't get out of the string in the JSON data directly. As it turns out, HTML lets you jump out of a script block using no matter where you are in script. For instance, in the middle of a value in some JSON data we can jump out of script. But we still want to run script, so we can jump right back in. So the frame so far for the message we're going to post is the following:

    
    
PermalinkCommentscsrf encoding html internet javascript percent-encoding script security stripe-ctf technical web xss

URI Percent Encoding Ignorance Level 2 - There is no Unencoded URI

2012 Feb 20, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.

Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:

  1. http://example.com/%74%68%65%3F%70%61%74%68?query
  2. http://example.com/the%3Fpath?query
  3. "http", "example.com", "the?path", "query"

In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.

Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.

Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.

PermalinkCommentsurl encoding uri technical percent-encoding

URI Percent-Encoding Ignorance Level 1 - Purpose

2012 Feb 15, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.

Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:

  1. http://example.com/the%20path/
  2. http://example.com/the path/
In the above the first is a valid URI while the second is not valid since a space appears directly in the URI. Depending on the context and the code through which the wannabe URI is run one may get unexpected failure.

For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:

  1. http://example.com/foo%3Fbar
  2. http://example.com/foo?bar
In the second, the question mark appears plainly and so delimits the path "/foo" from the query "bar". And in the first, the querstion mark is percent-encoded and so the path is "/foo%3Fbar".
PermalinkCommentsencoding uri technical ietf percent-encoding

URI Percent Encoding Ignorance Level 0 - Existence

2012 Feb 10, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping). The basest ignorance is with respect to the mere existence of percent-encoding. Percents in URIs are special: they always represent the start of a percent-encoded octet. That is to say, a percent is always followed by two hex digits that represents a value between 0 and 255 and doesn't show up in a URI otherwise.

The IPv6 textual syntax for scoped addresses uses the '%' to delimit the zone ID from the rest of the address. When it came time to define how to represent scoped IPv6 addresses in URIs there were two camps: Folks who wanted to use the IPv6 format as is in the URI, and those who wanted to encode or replace the '%' with a different character. The resulting thread was more lively than what shows up on the IETF URI discussion mailing list. Ultimately we went with a percent-encoded '%' which means the percent maintains its special status and singular purpose.

PermalinkCommentsencoding uri technical ietf percent-encoding ipv6
Older Entries Creative Commons License Some rights reserved.