example - Dave's Blog

Search
My timeline on Mastodon

Right-To-Left Override Twitter Name

2020 Oct 21, 3:50

Its rare to find devs anticipating Unicode control characters showing up in user input. And the most fun when unanticipated is the Right-To-Left Override character U+202E. Unicode characters have an implicit direction so that for example by default Hebrew characters are rendered from right to left, and English characters are rendered left to right. The override characters force an explicit direction for all the text that follows.

I chose my Twitter display name to include the HTML encoding of the Right-To-Left Override character #x202E; as a sort of joke or shout out to my favorite Unicode control character. I did not anticipate that some Twitter clients in some of their UI would fail to encode it correctly. There's no way I can remove that from my display name now.


Try it on Amazon.


How about pages that want to tell you about the U+202E. 


PermalinkCommentsUnicode

Multiple Windows in Win10 JavaScript UWP apps

2018 Mar 10, 1:47

Win10 Changes

In Win8.1 JavaScript UWP apps we supported multiple windows using MSApp DOM APIs. In Win10 we use window.open and window and a new MSApp API getViewId and the previous MSApp APIs are gone:

Win10 Win8.1
Create new window window.open MSApp.createNewView
New window object window MSAppView
viewId MSApp.getViewId(window) MSAppView.viewId

WinRT viewId

We use window.open and window for creating new windows, but then to interact with WinRT APIs we add the MSApp.getViewId API. It takes a window object as a parameter and returns a viewId number that can be used with the various Windows.UI.ViewManagement.ApplicationViewSwitcher APIs.

Delaying Visibility

Views in WinRT normally start hidden and the end developer uses something like TryShowAsStandaloneAsync to display the view once it is fully prepared. In the web world, window.open shows a window immediately and the end user can watch as content is loaded and rendered. To have your new windows act like views in WinRT and not display immediately we have added a window.open option. For example
let newWindow = window.open("https://example.com", null, "msHideView=yes");

Primary Window Differences

The primary window that is initially opened by the OS acts differently than the secondary windows that it opens:

Primary Secondary
window.open Allowed Disallowed
window.close Close app Close window
Navigation restrictions ACUR only No restrictions

The restriction on secondary windows such that they cannot open secondary windows could change in the future depending on feedback.

Same Origin Communication Restrictions

Lastly, there is a very difficult technical issue preventing us from properly supporting synchronous, same-origin, cross-window, script calls. That is, when you open a window that's same origin, script in one window is allowed to directly call functions in the other window and some of these calls will fail. postMessage calls work just fine and is the recommended way to do things if that's possible for you. Otherwise we continue to work on improving this.

PermalinkComments

MSApp.getHtmlPrintDocumentSourceAsync - JavaScript UWP app printing

2017 Oct 11, 5:49

The documentation for printing in JavaScript UWP apps is out of date as it all references MSApp.getHtmlPrintDocumentSource but that method has been replaced by MSApp.getHtmlPrintDocumentSourceAsync since WinPhone 8.1.

Background

Previous to WinPhone 8.1 the WebView's HTML content ran on the UI thread of the app. This is troublesome for rendering arbitrary web content since in the extreme case the JavaScript of some arbitrary web page might just sit in a loop and never return control to your app's UI. With WinPhone 8.1 we added off thread WebView in which the WebView runs HTML content on a separate UI thread.

Off thread WebView required changing our MSApp.getHtmlPrintDocumentSource API which could no longer synchronously produce an HtmlPrintDocumentSource. With WebViews running on their own threads it may take some time for them to generate their print content for the HtmlPrintDocumentSource and we don't want to hang the app's UI thread in the interim. So the MSApp.getHtmlPrintDocumentSource API was replaced with MSApp.getHtmlPrintDocumentSourceAsync which returns a promise the resolved value of which is the eventual HtmlPrintDocumentSource.

Sample

However, the usage of the API is otherwise unchanged. So in sample code you see referencing MSApp.getHtmlPrintDocumentSource the sample code is still reasonable but you need to call MSApp.getHtmlPrintDocumentSourceAsync instead and wait for the promise to complete. For example the PrintManager docs has an example implementing a PrintTaskRequested event handler in a JavaScript UWP app.

    function onPrintTaskRequested(printEvent) {    
var printTask = printEvent.request.createPrintTask("Print Sample", function (args) {
args.setSource(MSApp.getHtmlPrintDocumentSource(document));
});

Instead we need to obtain a deferral in the event handler so we can asynchronously wait for getHtmlPrintDocumentSourceAsync to complete:

    function onPrintTaskRequested(printEvent) {    
var printTask = printEvent.request.createPrintTask("Print Sample", function (args) {
const deferral = args.getDeferral();
MSApp.getHtmlPrintDocumentSourceAsync(document).then(htmlPrintDocumentSource => {
args.setSource(htmlPrintDocumentSource);
deferral.complete();
}, error => {
console.error("Error: " + error.message + " " + error.stack);
deferral.complete();
});
});
PermalinkCommentsjavascript MSApp printing programming uwp webview win10 windows

Application Content URI Rules rule ordering

2017 Jun 1, 1:30

Application Content URI Rules (ACUR from now on) defines the bounds on the web that make up a Microsoft Store application. The previous blog post discussed the syntax of the Rule's Match attribute and this time I'll write about the interactions between the Rules elements.

Order

A single ApplicationContentUriRules element may have up to 100 Rule child elements. When determining if a navigation URI matches any of the ACUR the last Rule in the list with a matching match wildcard URI is used. If that Rule is an include rule then the navigation URI is determined to be an application content URI and if that Rule is an exclude rule then the navigation rule is not an application content URI. For example:

Rule Type='include' Match='https://example.com/'/
Rule Type='exclude' Match='https://example.com/'/

Given the above two rules in that order, the navigation URI https://example.com/ is not an application content URI because the last matching rule is the exclude rule. Reverse the order of the rules and get the opposite result.

WindowsRuntimeAccess

In addition to determining if a navigation URI is application content or not, a Rule may also confer varying levels of WinRT access via the optional WindowsRuntimeAccess attribute which may be set to 'none', 'allowForWeb', or 'all'. If a navigation URI matches multiple different include rules only the last rule is applied even as it applies to the WindowsRuntimeAccess attribute. For example:

Rule Type='include' Match='https://example.com/' WindowsRuntimeAccess='none'/
Rule Type='include' Match='https://example.com/' WindowsRuntimeAccess='all'/

Given the above two rules in that order, the navigation URI https://example.com/ will have access to all WinRT APIs because the last matching rule wins. Reverse the rule order and the navigation URI https://example.com/ will have no access to WinRT. There is no summation or combining of multiple matching rules - only the last matching rule wins.

PermalinkCommentsapplication-content-uri-rules programming uri windows windows-store

Application Content URI Rules wildcard syntax

2017 May 31, 4:48

Application Content URI Rules (ACUR from now on) defines the bounds of the web that make up the Microsoft Store application. Package content via the ms-appx URI scheme is automatically considered part of the app. But if you have content on the web via http or https you can use ACUR to declare to Windows that those URIs are also part of your application. When your app navigates to URIs on the web those URIs will be matched against the ACUR to determine if they are part of your app or not. The documentation for how matching is done on the wildcard URIs in the ACUR Rule elements is not very helpful on MSDN so here are some notes.

Rules

You can have up to 100 Rule XML elements per ApplicationContentUriRules element. Each has a Match attribute that can be up to 2084 characters long. The content of the Match attribute is parsed with CreateUri and when matching against URIs on the web additional wildcard processing is performed. I’ll call the URI from the ACUR Rule the rule URI and the URI we compare it to found during app navigation the navigation URI.

The rule URI is matched to a navigation URI by URI component: scheme, username, password, host, port, path, query, and fragment. If a component does not exist on the rule URI then it matches any value of that component in the navigation URI. For example, a rule URI with no fragment will match a navigation URI with no fragment, with an empty string fragment, or a fragment with any value in it.

Asterisk

Each component except the port may have up to 8 asterisks. Two asterisks in a row counts as an escape and will match 1 literal asterisk. For scheme, username, password, query and fragment the asterisk matches whatever it can within the component.

Host

For the host, if the host consists of exactly one single asterisk then it matches anything. Otherwise an asterisk in a host only matches within its domain name label. For example, http://*.example.com will match http://a.example.com/ but not http://b.a.example.com/ or http://example.com/. And http://*/ will match http://example.com, http://a.example.com/, and http://b.a.example.com/. However the Store places restrictions on submitting apps that use the http://* rule or rules with an asterisk in the second effective domain name label. For example, http://*.com is also restricted for Store submission.

Path

For the path, an asterisk matches within the path segment. For example, http://example.com/a/*/c will match http://example.com/a/b/c and http://example.com/a//c but not http://example.com/a/b/b/c or http://example.com/a/c

Additionally for the path, if the path ends with a slash then it matches any path that starts with that same path. For example, http://example.com/a/ will match http://example.com/a/b and http://example.com/a/b/c/d/e/, but not http://example.com/b/.

If the path doesn’t end with a slash then there is no suffix matching performed. For example, http://example.com/a will match only http://example.com/a and no URIs with a different path.

As a part of parsing the rule URI and the navigation URI, CreateUri will perform URI normalization and so the hostname and scheme will be made lower case (casing matters in all other parts of the URI and case sensitive comparisons will be performed), IDN normalization will be performed, ‘.’ and ‘..’ path segments will be resolved and other normalizations as described in the CreateUri documentation.

PermalinkCommentsapplication-content-uri-rules programming windows windows-store

Tweet from David Risney

2016 Nov 4, 4:08
@David_Risney Example graph https://raw.githubusercontent.com/david-risney/WinMDGraph/master/examples/3/3.dot.png  of the Windows .Services.Maps namespace
PermalinkComments

WinRT Toast from PowerShell

2016 Jun 15, 3:54

I've made a PowerShell script to show system toast notifications with WinRT and PowerShell. Along the way I learned several interesting things.

First off calling WinRT from PowerShell involves a strange syntax. If you want to use a class you write [-Class-,-Namespace-,ContentType=WindowsRuntime] first to tell PowerShell about the type. For example here I create a ToastNotification object:

[void][Windows.UI.Notifications.ToastNotification,Windows.UI.Notifications,ContentType=WindowsRuntime];
$toast = New-Object Windows.UI.Notifications.ToastNotification -ArgumentList $xml;
And here I call the static method CreateToastNotifier on the ToastNotificationManager class:
[void][Windows.UI.Notifications.ToastNotificationManager,Windows.UI.Notifications,ContentType=WindowsRuntime];
$notifier = [Windows.UI.Notifications.ToastNotificationManager]::CreateToastNotifier($AppUserModelId);
With this I can call WinRT methods and this is enough to show a toast but to handle the click requires a little more work.

To handle the user clicking on the toast I need to listen to the Activated event on the Toast object. However Register-ObjectEvent doesn't handle WinRT events. To work around this I created a .NET event wrapper class to turn the WinRT event into a .NET event that Register-ObjectEvent can handle. This is based on Keith Hill's blog post on calling WinRT async methods in PowerShell. With the event wrapper class I can run the following to subscribe to the event:

function WrapToastEvent {
param($target, $eventName);

Add-Type -Path (Join-Path $myPath "PoshWinRT.dll")
$wrapper = new-object "PoshWinRT.EventWrapper[Windows.UI.Notifications.ToastNotification,System.Object]";
$wrapper.Register($target, $eventName);
}

[void](Register-ObjectEvent -InputObject (WrapToastEvent $toast "Activated") -EventName FireEvent -Action {
...
});

To handle the Activated event I want to put focus back on the PowerShell window that created the toast. To do this I need to call the Win32 function SetForegroundWindow. Doing so from PowerShell is surprisingly easy. First you must tell PowerShell about the function:

Add-Type @"
using System;
using System.Runtime.InteropServices;
public class PInvoke {
[DllImport("user32.dll")] [return: MarshalAs(UnmanagedType.Bool)]
public static extern bool SetForegroundWindow(IntPtr hwnd);
}
"@
Then to call:
[PInvoke]::SetForegroundWindow((Get-Process -id $myWindowPid).MainWindowHandle);

But figuring out the HWND to give to SetForegroundWindow isn't totally straight forward. Get-Process exposes a MainWindowHandle property but if you start a cmd.exe prompt and then run PowerShell inside of that, the PowerShell process has 0 for its MainWindowHandle property. We must follow up process parents until we find one with a MainWindowHandle:

$myWindowPid = $pid;
while ($myWindowPid -gt 0 -and (Get-Process -id $myWindowPid).MainWindowHandle -eq 0) {
$myWindowPid = (gwmi Win32_Process -filter "processid = $($myWindowPid)" | select ParentProcessId).ParentProcessId;
}
PermalinkComments.net c# powershell toast winrt

WinRT Launcher API in PowerShell

2016 Mar 31, 10:12
You can call WinRT APIs from PowerShell. Here's a short example using the WinRT Launcher API:
[Windows.System.Launcher,Windows.System,ContentType=WindowsRuntime]
$uri = New-Object System.Uri "http://example.com/"
[Windows.System.Launcher]::LaunchUriAsync($uri)
Note that like using WinRT in .NET, you use the System.Uri .NET class instead of the Windows.Foundation.Uri WinRT class which is not projected and under the covers the system will convert the System.Uri to a Windows.Foundation.Uri.
PermalinkComments

Cdb/Windbg Commands for Runtime Patching

2016 Feb 8, 1:47

You can use conditional breakpoints and debugging commands in windbg and cdb that together can amount to effectively patching a binary at runtime. This can be useful if you have symbols but you can't easily rebuild the binary. Or if the patch is small and the binary requires a great deal of time to rebuild.

Skipping code

If you want to skip a chunk of code you can set a breakpoint at the start address of the code to skip and set the breakpoint's command to change the instruction pointer register to point to the address at the end of the code to skip and go. Voila you're skipping over that code now. For example:

bp 0x6dd6879b "r @eip=0x6dd687c3 ; g"

Changing parameters

You may want to modify parameters or variables and this is simple of course. In the following example a conditional breakpoint ANDs out a bit from dwFlags. Now when we run its as if no one is passing in that flag.

bp wiwi!RelativeCrack "?? dwFlags &= 0xFDFFFFFF;g"

Slightly more difficult is to modify string values. If the new string length is the same size or smaller than the previous, you may be able to modify the string value in place. But if the string is longer or the string memory isn't writable, you'll need a new chunk of memory into which to write your new string. You can use .dvalloc to allocate some memory and ezu to write a string into the newly allocated memory. In the following example I then overwrite the register containing the parameter I want to modify:

.dvalloc 100
ezu 000002a9`d4eb0000 "mfcore.dll"
r rcx = 000002a9`d4eb0000

Calling functions

You can also use .call to actually make new calls to methods or functions. Read more about that on the Old New Thing: Stupid debugger tricks: Calling functions and methods. Again, all of this can be used in a breakpoint command to effectively patch a binary.

PermalinkCommentscdb debug technical windbg

JavaScript Types and WinRT Types

2016 Jan 21, 5:35

MSDN covers the topic of JavaScript and WinRT type conversions provided by Chakra (JavaScript Representation of Windows Runtime Types and Considerations when Using the Windows Runtime API), but for the questions I get about it I’ll try to lay out some specifics of that discussion more plainly. I’ve made a TL;DR JavaScript types and WinRT types summary table.

WinRT Conversion JavaScript
Struct ↔️ JavaScript object with matching property names
Class or interface instance JavaScript object with matching property names
Windows.Foundation.Collections.IPropertySet JavaScript object with arbitrary property names
Any DOM object

Chakra, the JavaScript engine powering the Edge browser and JavaScript Windows Store apps, does the work to project WinRT into JavaScript. It is responsible for, among other things, converting back and forth between JavaScript types and WinRT types. Some basics are intuitive, like a JavaScript string is converted back and forth with WinRT’s string representation. For other basic types check out the MSDN links at the top of the page. For structs, interface instances, class instances, and objects things are more complicated.

A struct, class instance, or interface instance in WinRT is projected into JavaScript as a JavaScript object with corresponding property names and values. This JavaScript object representation of a WinRT type can be passed into other WinRT APIs that take the same underlying type as a parameter. This JavaScript object is special in that Chakra keeps a reference to the underlying WinRT object and so it can be reused with other WinRT APIs.

However, if you start with plain JavaScript objects and want to interact with WinRT APIs that take non-basic WinRT types, your options are less plentiful. You can use a plain JavaScript object as a WinRT struct, so long as the property names on the JavaScript object match the WinRT struct’s. Chakra will implicitly create an instance of the WinRT struct for you when you call a WinRT method that takes that WinRT struct as a parameter and fill in the WinRT struct’s values with the values from the corresponding properties on your JavaScript object.

// C# WinRT component
public struct ExampleStruct
{
public string String;
public int Int;
}

public sealed class ExampleStructContainer
{
ExampleStruct value;
public void Set(ExampleStruct value)
{
this.value = value;
}

public ExampleStruct Get()
{
return this.value;
}
}

// JS code
var structContainer = new ExampleWinRTComponent.ExampleNamespace.ExampleStructContainer();
structContainer.set({ string: "abc", int: 123 });
console.log("structContainer.get(): " + JSON.stringify(structContainer.get()));
// structContainer.get(): {"string":"abc","int":123}

You cannot have a plain JavaScript object and use it as a WinRT class instance or WinRT interface instance. Chakra does not provide such a conversion even with ES6 classes.

You cannot take a JavaScript object with arbitrary property names that are unknown at compile time and don’t correspond to a specific WinRT struct and pass that into a WinRT method. If you need to do this, you have to write additional JavaScript code to explicitly convert your arbitrary JavaScript object into an array of property name and value pairs or something else that could be represented in WinRT.

However, the other direction you can do. An instance of a Windows.Foundation.Collections.IPropertySet implementation in WinRT is projected into JavaScript as a JavaScript object with property names and values corresponding to the key and value pairs in the IPropertySet. In this way you can project a WinRT object as a JavaScript object with arbitrary property names and types. But again, the reverse is not possible. Chakra will not convert an arbitrary JavaScript object into an IPropertySet.

// C# WinRT component
public sealed class PropertySetContainer
{
private Windows.Foundation.Collections.IPropertySet otherValue = null;

public Windows.Foundation.Collections.IPropertySet other
{
get
{
return otherValue;
}
set
{
otherValue = value;
}
}
}

public sealed class PropertySet : Windows.Foundation.Collections.IPropertySet
{
private IDictionary map = new Dictionary();

public PropertySet()
{
map.Add("abc", "def");
map.Add("ghi", "jkl");
map.Add("mno", "pqr");
}
// ... rest of PropertySet implementation is simple wrapper around the map member.


// JS code
var propertySet = new ExampleWinRTComponent.ExampleNamespace.PropertySet();
console.log("propertySet: " + JSON.stringify(propertySet));
// propertySet: {"abc":"def","ghi":"jkl","mno":"pqr"}

var propertySetContainer = new ExampleWinRTComponent.ExampleNamespace.PropertySetContainer();
propertySetContainer.other = propertySet;
console.log("propertySetContainer.other: " + JSON.stringify(propertySetContainer.other));
// propertySetContainer.other: {"abc":"def","ghi":"jkl","mno":"pqr"}

try {
propertySetContainer.other = { "123": "456", "789": "012" };
}
catch (e) {
console.error("Error setting propertySetContainer.other: " + e);
// Error setting propertySetContainer.other: TypeError: Type mismatch
}

There’s also no way to implicitly convert a DOM object into a WinRT type. If you want to write third party WinRT code that interacts with the DOM, you must do so indirectly and explicitly in JavaScript code that is interacting with your third party WinRT. You’ll have to extract the information you want from your DOM objects to pass into WinRT methods and similarly have to pass messages out from WinRT that say what actions the JavaScript should perform on the DOM.

PermalinkCommentschakra development javascript winrt

Tweet from David_Risney

2015 Mar 30, 10:52
Or from GitHub's POV, how else can you use this XSS? Example: Open a new window with info on howto subvert particular censorship. What else?
PermalinkComments

Debugging LoadLibrary Failures - Junfeng Zhang's Windows Programming Notes - Site Home - MSDN Blogs

2014 Feb 25, 2:22

How to turn on debug logging for LoadLibrary to diagnose failures. For example, see where in the dependency graph of a DLL LoadLibrary ran into issues.

PermalinkCommentstechnical win32 windows debugging loadlibrary

URI functions in Windows Store Applications

2013 Jul 25, 1:00

Summary

The Modern SDK contains some URI related functionality as do libraries available in particular projection languages. Unfortunately, collectively these APIs do not cover all scenarios in all languages. Specifically, JavaScript and C++ have no URI building APIs, and C++ additionally has no percent-encoding/decoding APIs.
WinRT (JS and C++)
JS Only
C++ Only
.NET Only
Parse
 
Build
Normalize
Equality
 
 
Relative resolution
Encode data for including in URI property
Decode data extracted from URI property
Build Query
Parse Query
The Windows.Foudnation.Uri type is not projected into .NET modern applications. Instead those applications use System.Uri and the platform ensures that it is correctly converted back and forth between Windows.Foundation.Uri as appropriate. Accordingly the column marked WinRT above is applicable to JS and C++ modern applications but not .NET modern applications. The only entries above applicable to .NET are the .NET Only column and the WwwFormUrlDecoder in the bottom left which is available to .NET.

Scenarios

Parse

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS, and by System.Uri in .NET.
Parsing a URI pulls it apart into its basic components without decoding or otherwise modifying the contents.
var uri = new Windows.Foundation.Uri("http://example.com/path%20segment1/path%20segment2?key1=value1&key2=value2");
console.log(uri.path);// /path%20segment1/path%20segment2

WsDecodeUrl (C++)

WsDecodeUrl is not suitable for general purpose URI parsing.  Use Windows.Foundation.Uri instead.

Build (C#)

URI building is only available in C# via System.UriBuilder.
URI building is the inverse of URI parsing: URI building allows the developer to specify the value of basic components of a URI and the API assembles them into a URI. 
To work around the lack of a URI building API developers will likely concatenate strings to form their URIs.  This can lead to injection bugs if they don’t validate or encode their input properly, but if based on trusted or known input is unlikely to have issues.
            Uri originalUri = new Uri("http://example.com/path1/?query");
            UriBuilder uriBuilder = new UriBuilder(originalUri);
            uriBuilder.Path = "/path2/";
            Uri newUri = uriBuilder.Uri; // http://example.com/path2/?query

WsEncodeUrl (C++)

WsEncodeUrl, in addition to building a URI from components also does some encoding.  It encodes non-US-ASCII characters as UTF8, the percent, and a subset of gen-delims based on the URI property: all :/?#[]@ are percent-encoded except :/@ in the path and :/?@ in query and fragment.
Accordingly, WsEncodeUrl is not suitable for general purpose URI building.  It is acceptable to use in the following cases:
- You’re building a URI out of non-encoded URI properties and don’t care about the difference between encoded and decoded characters.  For instance you’re the only one consuming the URI and you uniformly decode URI properties when consuming – for instance using WsDecodeUrl to consume the URI.
- You’re building a URI with URI properties that don’t contain any of the characters that WsEncodeUrl encodes.

Normalize

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET.  Normalization is applied during construction of the Uri object.
URI normalization is the application of URI normalization rules (including DNS normalization, IDN normalization, percent-encoding normalization, etc.) to the input URI.
        var normalizedUri = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/");
        console.log(normalizedUri.absoluteUri); // http://example.com/path%20foo/
This is modulo Win8 812823 in which the Windows.Foundation.Uri.AbsoluteUri property returns a normalized IRI not a normalized URI.  This bug does not affect System.Uri.AbsoluteUri which returns a normalized URI.

Equality

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET. 
URI equality determines if two URIs are equal or not necessarily equal.
            var uri1 = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/"),
                uri2 = new Windows.Foundation.Uri("http://example.com/path%20foo/");
            console.log(uri1.equals(uri2)); // true

Relative resolution

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET 
Relative resolution is a function that given an absolute URI A and a relative URI B, produces a new absolute URI C.  C is the combination of A and B in which the basic components specified in B override or combine with those in A under rules specified in RFC 3986.
        var baseUri = new Windows.Foundation.Uri("http://example.com/index.html"),
            relativeUri = "/path?query#fragment",
            absoluteUri = baseUri.combineUri(relativeUri);
        console.log(baseUri.absoluteUri);       // http://example.com/index.html
        console.log(absoluteUri.absoluteUri);   // http://example.com/path?query#fragment

Encode data for including in URI property

This functionality is available in JavaScript via encodeURIComponent and in C# via System.Uri.EscapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now have Windows.Foundation.Uri.EscapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Encoding data for inclusion in a URI property is necessary when constructing a URI from data.  In all the above cases the developer is dealing with a URI or substrings of a URI and so the strings are all encoded as appropriate. For instance, in the parsing example the path contains “path%20segment1” and not “path segment1”.  To construct a URI one must first construct the basic components of the URI which involves encoding the data.  For example, if one wanted to include “path segment / example” in the path of a URI, one must percent-encode the ‘ ‘ since it is not allowed in a URI, as well as the ‘/’ since although it is allowed, it is a delimiter and won’t be interpreted as data unless encoded.
If a developer does not have this API provided they can write it themselves.  Percent-encoding methods appear simple to write, but the difficult part is getting the set of characters to encode correct, as well as handling non-US-ASCII characters.
        var uri = new Windows.Foundation.Uri("http://example.com" +
            "/" + Windows.Foundation.Uri.escapeComponent("path segment / example") +
            "?key=" + Windows.Foundation.Uri.escapeComponent("=&?#"));
        console.log(uri.absoluteUri); // http://example.com/path%20segment%20%2F%20example?key=%3D%26%3F%23

WsEncodeUrl (C++)

In addition to building a URI from components, WsEncodeUrl also percent-encodes some characters.  However the API is not recommend for this scenario given the particular set of characters that are encoded and the convoluted nature in which a developer would have to use this API in order to use it for this purpose.
There are no general purpose scenarios for which the characters WsEncodeUrl encodes make sense: encode the %, encode a subset of gen-delims but not also encode the sub-delims.  For instance this could not replace encodeURIComponent in a C++ version of the following code snippet since if ‘value’ contained ‘&’ or ‘=’ (both sub-delims) they wouldn’t be encoded and would be confused for delimiters in the name value pairs in the query:
"http://example.com/?key=" + Windows.Foundation.Uri.escapeComponent(value)
Since WsEncodeUrl produces a string URI, to obtain the property they want to encode they’d need to parse the resulting URI.  WsDecodeUrl won’t work because it decodes the property but Windows.Foundation.Uri doesn’t decode.  Accordingly the developer could run their string through WsEncodeUrl then Windows.Foundation.Uri to extract the property.

Decode data extracted from URI property

This functionality is available in JavaScript via decodeURIComponent and in C# via System.Uri.UnescapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now also have Windows.Foundation.Uri.UnescapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Decoding is necessary when extracting data from a parsed URI property.  For example, if a URI query contains a series of name and value pairs delimited by ‘=’ between names and values, and by ‘&’ between pairs, one must first parse the query into name and value entries and then decode the values.  It is necessary to make this an extra step separate from parsing the URI property so that sub-delimiters (in this case ‘&’ and ‘=’) that are encoded will be interpreted as data, and those that are decoded will be interpreted as delimiters.
If a developer does not have this API provided they can write it themselves.  Percent-decoding methods appear simple to write, but have some tricky parts including correctly handling non-US-ASCII, and remembering not to decode .
In the following example, note that if unescapeComponent were called first, the encoded ‘&’ and ‘=’ would be decoded and interfere with the parsing of the name value pairs in the query.
            var uri = new Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            uri.query.substr(1).split("&").forEach(
                function (keyValueString) {
                    var keyValue = keyValueString.split("=");
                    console.log(Windows.Foundation.Uri.unescapeComponent(keyValue[0]) + ": " + Windows.Foundation.Uri.unescapeComponent(keyValue[1]));
                    // foo: bar
                    // array: ['','&','=','#']
                });

WsDecodeUrl (C++)

Since WsDecodeUrl decodes all percent-encoded octets it could be used for general purpose percent-decoding but it takes a URI so would require the dev to construct a stub URI around the string they want to decode.  For example they could prefix “http:///#” to their string, run it through WsDecodeUrl and then extract the fragment property.  It is convoluted but will work correctly.

Parse Query

The query of a URI is often encoded as application/x-www-form-urlencoded which is percent-encoded name value pairs delimited by ‘&’ between pairs and ‘=’ between corresponding names and values.
In WinRT we have a class to parse this form of encoding using Windows.Foundation.WwwFormUrlDecoder.  The queryParsed property on the Windows.Foundation.Uri class is of this type and created with the query of its Uri:
    var uri = Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
    uri.queryParsed.forEach(
        function (pair) {
            console.log("name: " + pair.name + ", value: " + pair.value);
            // name: foo, value: bar
            // name: array, value: ['','&','=','#']
        });
    console.log(uri.queryParsed.getFirstValueByName("array")); // ['','&','=','#']
The QueryParsed property is only on Windows.Foundation.Uri and not System.Uri and accordingly is not available in .NET.  However the Windows.Foundation.WwwFormUrlDecoder class is available in C# and can be used manually:
            Uri uri = new Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            WwwFormUrlDecoder decoder = new WwwFormUrlDecoder(uri.Query);
            foreach (IWwwFormUrlDecoderEntry entry in decoder)
            {
                System.Diagnostics.Debug.WriteLine("name: " + entry.Name + ", value: " + entry.Value);
                // name: foo, value: bar
                // name: array, value: ['','&','=','#']
            }
 

Build Query

To build a query of name value pairs encoded as application/x-www-form-urlencoded there is no WinRT API to do this directly.  Instead a developer must do this manually making use of the code described in “Encode data for including in URI property”.
In terms of public releases, this property is only in the RC and later builds.
For example in JavaScript a developer may write:
            var uri = new Windows.Foundation.Uri("http://example.com/"),
                query = "?" + Windows.Foundation.Uri.escapeComponent("array") + "=" + Windows.Foundation.Uri.escapeComponent("['','&','=','#']");
 
            console.log(uri.combine(new Windows.Foundation.Uri(query)).absoluteUri); // http://example.com/?array=%5B'%E3%84%93'%2C'%26'%2C'%3D'%2C'%23'%5D
 
PermalinkCommentsc# c++ javascript technical uri windows windows-runtime windows-store

Microsoft Research make breakthrough in audio speech recognition (technet.com)

2012 Jun 22, 2:33

MAVIS indexes audio and video so you can do text search over the contents. For example search for ‘metro’ in all of the BUILD conference talks.

PermalinkCommentstechnical voice-recognition microsoft research mavis search

Alternate IPv4 Forms - URI Host Syntax Notes

2012 Mar 14, 4:30

By the URI RFC there is only one way to represent a particular IPv4 address in the host of a URI. This is the standard dotted decimal notation of four bytes in decimal with no leading zeroes delimited by periods. And no leading zeros are allowed which means there's only one textual representation of a particular IPv4 address.

However as discussed in the URI RFC, there are other forms of IPv4 addresses that although not officially allowed are generally accepted. Many implementations used inet_aton to parse the address from the URI which accepts more than just dotted decimal. Instead of dotted decimal, each dot delimited part can be in decimal, octal (if preceded by a '0') or hex (if preceded by '0x' or '0X'). And that's each section individually - they don't have to match. And there need not be 4 parts: there can be between 1 and 4 (inclusive). In case of less than 4, the last part in the string represents all of the left over bytes, not just one.

For example the following are all equivalent:

192.168.1.1
Standard dotted decimal form
0300.0250.01.01
Octal
0xC0.0XA8.0x1.0X1
Hex
192.168.257
Fewer parts
0300.0XA8.257
All of the above

The bread and butter of URI related security issues is when one part of the system disagrees with another about the interpretation of the URI. So this non-standard, non-normal form syntax has been been a great source of security issues in the past. Its mostly well known now (CreateUri normalizes these non-normal forms to dotted decimal), but occasionally a good tool for bypassing naive URI blocking systems.

PermalinkCommentsurl inet_aton uri technical host programming ipv4

Web Worker Initialization Race

2012 Feb 24, 1:44

Elaborating on a previous brief post on the topic of Web Worker initialization race conditions, there's two important points to avoid a race condition when setting up a Worker:

  1. The parent starts the communication posting to the worker.
  2. The worker sets up its message handler in its first synchronous block of execution.

For example the following has no race becaues the spec guarentees that messages posted to a worker during its first synchronous block of execution will be queued and handled after that block. So the worker gets a chance to setup its onmessage handler. No race:

'parent.js':
var worker = new Worker();
worker.postMessage("initialize");

'worker.js':
onmessage = function(e) {
// ...
}

The following has a race because there's no guarentee that the parent's onmessage handler is setup before the worker executes postMessage. Race (violates 1):

'parent.js':
var worker = new Worker();
worker.onmessage = function(e) {
// ...
};

'worker.js':
postMessage("initialize");

The following has a race because the worker has no onmessage handler set in its first synchronous execution block and so the parent's postMessage may be sent before the worker sets its onmessage handler. Race (violates 2):

'parent.js':
var worker = new Worker();
worker.postMessage("initialize");

'worker.js':
setTimeout(
function() {
onmessage = function(e) {
// ...
}
},
0);
PermalinkCommentstechnical programming worker web-worker html script

URI Percent Encoding Ignorance Level 2 - There is no Unencoded URI

2012 Feb 20, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.

Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:

  1. http://example.com/%74%68%65%3F%70%61%74%68?query
  2. http://example.com/the%3Fpath?query
  3. "http", "example.com", "the?path", "query"

In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.

Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.

Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.

PermalinkCommentsurl encoding uri technical percent-encoding

URI Percent-Encoding Ignorance Level 1 - Purpose

2012 Feb 15, 4:00

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.

Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:

  1. http://example.com/the%20path/
  2. http://example.com/the path/
In the above the first is a valid URI while the second is not valid since a space appears directly in the URI. Depending on the context and the code through which the wannabe URI is run one may get unexpected failure.

For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:

  1. http://example.com/foo%3Fbar
  2. http://example.com/foo?bar
In the second, the question mark appears plainly and so delimits the path "/foo" from the query "bar". And in the first, the querstion mark is percent-encoded and so the path is "/foo%3Fbar".
PermalinkCommentsencoding uri technical ietf percent-encoding

MPAA Boss: If The Chinese Censor The Internet, Why Can't The US? (techdirt.com)

2011 Dec 10, 8:31

FTA:

The MPAA is getting pretty desperate, it seems. MPAA boss Chris Dodd was out trying to defend censoring the internet this week by using China as an example of why censorship isn’t a problem. It’s kind of shocking, really.

“When the Chinese told Google that they had to block sites or they couldn’t do [business] in their country, they managed to figure out how to block sites.”

PermalinkCommentsmpaa technical censorship

URI Empty Path Segments Matter

2011 Nov 23, 11:00

Shortly after joining the Internet Explorer team I got a bug from a PM on a popular Microsoft web server product that I'll leave unnamed (from now on UWS). The bug said that IE was handling empty path segments incorrectly by not removing them before resolving dotted path segments. For example UWS would do the following:

A.1. http://example.com/a/b//../
A.2. http://example.com/a/b/../
A.3. http://example.com/a/
In step 1 they are given a URI with dotted path segment and an empty path segment. In step 2 they remove the empty path segment, and in step 3 they resolve the dotted path segment. Whereas, given the same initial URI, IE would do the following:
B.1. http://example.com/a/b//../
B.2. http://example.com/a/b/
IE simply resolves the dotted path segment against the empty path segment and removes them both. So, how did I resolve this bug? As "By Design" of course!

The URI RFC allows path segments of zero length and does not assign them any special meaning. So generic user agents that intend to work on the web must not treat an empty path segment any different from a path segment with some text in it. In the case above IE is doing the correct thing.

That's the case for generic user agents, however servers may decide that a URI with an empty path segment returns the same resource as a the same URI without that empty path segment. Essentially they can decide to ignore empty path segments. Both IIS and Apache work this way and thus return the same resource for the following URIs:

http://exmaple.com/foo//bar///baz
http://example.com/foo/bar/baz
The issue for UWS is that it removes empty path segments before resolving dotted path segments. It must follow normal URI procedure before applying its own additional rules for empty path segments. Not doing that means they end up violating URI equivalency rules: URIs (A.1) and (B.2) are equivalent but UWS will not return the same resource for them.
PermalinkCommentsuser agent url ie uri technical web browser
Older Entries Creative Commons License Some rights reserved.