createuri - Dave's Blog

Search
My timeline on Mastodon

Application Content URI Rules wildcard syntax

2017 May 31, 4:48

Application Content URI Rules (ACUR from now on) defines the bounds of the web that make up the Microsoft Store application. Package content via the ms-appx URI scheme is automatically considered part of the app. But if you have content on the web via http or https you can use ACUR to declare to Windows that those URIs are also part of your application. When your app navigates to URIs on the web those URIs will be matched against the ACUR to determine if they are part of your app or not. The documentation for how matching is done on the wildcard URIs in the ACUR Rule elements is not very helpful on MSDN so here are some notes.

Rules

You can have up to 100 Rule XML elements per ApplicationContentUriRules element. Each has a Match attribute that can be up to 2084 characters long. The content of the Match attribute is parsed with CreateUri and when matching against URIs on the web additional wildcard processing is performed. I’ll call the URI from the ACUR Rule the rule URI and the URI we compare it to found during app navigation the navigation URI.

The rule URI is matched to a navigation URI by URI component: scheme, username, password, host, port, path, query, and fragment. If a component does not exist on the rule URI then it matches any value of that component in the navigation URI. For example, a rule URI with no fragment will match a navigation URI with no fragment, with an empty string fragment, or a fragment with any value in it.

Asterisk

Each component except the port may have up to 8 asterisks. Two asterisks in a row counts as an escape and will match 1 literal asterisk. For scheme, username, password, query and fragment the asterisk matches whatever it can within the component.

Host

For the host, if the host consists of exactly one single asterisk then it matches anything. Otherwise an asterisk in a host only matches within its domain name label. For example, http://*.example.com will match http://a.example.com/ but not http://b.a.example.com/ or http://example.com/. And http://*/ will match http://example.com, http://a.example.com/, and http://b.a.example.com/. However the Store places restrictions on submitting apps that use the http://* rule or rules with an asterisk in the second effective domain name label. For example, http://*.com is also restricted for Store submission.

Path

For the path, an asterisk matches within the path segment. For example, http://example.com/a/*/c will match http://example.com/a/b/c and http://example.com/a//c but not http://example.com/a/b/b/c or http://example.com/a/c

Additionally for the path, if the path ends with a slash then it matches any path that starts with that same path. For example, http://example.com/a/ will match http://example.com/a/b and http://example.com/a/b/c/d/e/, but not http://example.com/b/.

If the path doesn’t end with a slash then there is no suffix matching performed. For example, http://example.com/a will match only http://example.com/a and no URIs with a different path.

As a part of parsing the rule URI and the navigation URI, CreateUri will perform URI normalization and so the hostname and scheme will be made lower case (casing matters in all other parts of the URI and case sensitive comparisons will be performed), IDN normalization will be performed, ‘.’ and ‘..’ path segments will be resolved and other normalizations as described in the CreateUri documentation.

PermalinkCommentsapplication-content-uri-rules programming windows windows-store

Alternate IPv4 Forms - URI Host Syntax Notes

2012 Mar 14, 4:30

By the URI RFC there is only one way to represent a particular IPv4 address in the host of a URI. This is the standard dotted decimal notation of four bytes in decimal with no leading zeroes delimited by periods. And no leading zeros are allowed which means there's only one textual representation of a particular IPv4 address.

However as discussed in the URI RFC, there are other forms of IPv4 addresses that although not officially allowed are generally accepted. Many implementations used inet_aton to parse the address from the URI which accepts more than just dotted decimal. Instead of dotted decimal, each dot delimited part can be in decimal, octal (if preceded by a '0') or hex (if preceded by '0x' or '0X'). And that's each section individually - they don't have to match. And there need not be 4 parts: there can be between 1 and 4 (inclusive). In case of less than 4, the last part in the string represents all of the left over bytes, not just one.

For example the following are all equivalent:

192.168.1.1
Standard dotted decimal form
0300.0250.01.01
Octal
0xC0.0XA8.0x1.0X1
Hex
192.168.257
Fewer parts
0300.0XA8.257
All of the above

The bread and butter of URI related security issues is when one part of the system disagrees with another about the interpretation of the URI. So this non-standard, non-normal form syntax has been been a great source of security issues in the past. Its mostly well known now (CreateUri normalizes these non-normal forms to dotted decimal), but occasionally a good tool for bypassing naive URI blocking systems.

PermalinkCommentsurl inet_aton uri technical host programming ipv4

Bug Spotting: Smart pointers and parameter evaluation order

2011 Oct 19, 5:58PermalinkCommentsc++ technical bug programming smart-pointer cpp

CreateUri (MSDN)

2007 Feb 20, 11:41Documentation for one of the main functions I worked on in IE7.PermalinkCommentsme projects professional createuri iuri msdn microsoft api help documentation reference internet web uri urlmon
Older Entries Creative Commons License Some rights reserved.