fragment - Dave's Blog

Search

JavaScript Microsoft Store app StartPage

2017 Jun 22, 8:58

JavaScript Microsoft Store apps have some details related to activation that are specific to JavaScript Store apps and that are poorly documented which I’ll describe here.

StartPage syntax

The StartPage attributes in the AppxManifest.xml (Package/Applications/Application/@StartPage, Package/Applications/Extensions/Extension/@StartPage) define the HTML page entry point for that kind of activation. That is, Application/@StartPage defines the entry point for tile activation, Extension[@Category="windows.protocol"]/@StartPage defines the entry point for URI handling activation, etc. There are two kinds of supported values in StartPage attributes: relative Windows file paths and absolute URIs. If the attribute doesn’t parse as an absolute URI then it is instead interpreted as relative Windows file path.

This implies a few things that I’ll declare explicitly here. Windows file paths, unlike URIs, don’t have a query or fragment, so if you are using a relative Windows file path for your StartPage attribute you cannot include anything like ‘?param=value’ at the end. Absolute URIs use percent-encoding for reserved characters like ‘%’ and ‘#’. If you have a ‘#’ in your HTML filename then you need to percent-encode that ‘#’ for a URI and not for a relative Windows file path.

If you specify a relative Windows file path, it is turned into an ms-appx URI by changing all backslashes to forward slashes, percent-encoding reserved characters, and combining the result with a base URI of ms-appx:///. Accordingly the relative Windows file paths are relative to the root of your package. If you are using a relative Windows file path as your StartPage and need to switch to using a URI so you can include a query or fragment, you can follow the same steps above.

StartPage validity

The validity of the StartPage is not determined before activation. If the StartPage is a relative Windows file path for a file that doesn’t exist, or an absolute URI that is not in the Application Content URI Rules, or something that doesn’t parse as a Windows file path or URI, or otherwise an absolute URI that fails to resolve (404, bad hostname, etc etc) then the JavaScript app will navigate to the app’s navigation error page (perhaps more on that in a future blog post). Just to call it out explicitly because I have personally accidentally done this: StartPage URIs are not automatically included in the Application Content URI Rules and if you forget to include your StartPage in your ACUR you will always fail to navigate to that StartPage.

StartPage navigation

When your app is activated for a particular activation kind, the StartPage value from the entry in your app’s manifest that corresponds to that activation kind is used as the navigation target. If the app is not already running, the app is activated, navigated to that StartPage value and then the Windows.UI.WebUI.WebUIApplication activated event is fired (more details on the order of various events in a moment). If, however, your app is already running and an activation occurs, we navigate or don’t navigate to the corresponding StartPage depending on the current page of the app. Take the app’s current top level document’s URI and if after removing the fragment it already matches the StartPage value then we won’t navigate and will jump straight to firing the WebUIApplication activated event.

Since navigating the top-level document means destroying the current JavaScript engine instance and losing all your state, this behavior might be a problem for you. If so, you can use the MSApp.pageHandlesAllApplicationActivations(true) API to always skip navigating to the StartPage and instead always jump straight to firing the WebUIApplication activated event. This does require of course that all of your pages all handle all activation kinds about which any part of your app cares.

PermalinkComments

Application Content URI Rules wildcard syntax

2017 May 31, 4:48

Application Content URI Rules (ACUR from now on) defines the bounds of the web that make up the Microsoft Store application. Package content via the ms-appx URI scheme is automatically considered part of the app. But if you have content on the web via http or https you can use ACUR to declare to Windows that those URIs are also part of your application. When your app navigates to URIs on the web those URIs will be matched against the ACUR to determine if they are part of your app or not. The documentation for how matching is done on the wildcard URIs in the ACUR Rule elements is not very helpful on MSDN so here are some notes.

Rules

You can have up to 100 Rule XML elements per ApplicationContentUriRules element. Each has a Match attribute that can be up to 2084 characters long. The content of the Match attribute is parsed with CreateUri and when matching against URIs on the web additional wildcard processing is performed. I’ll call the URI from the ACUR Rule the rule URI and the URI we compare it to found during app navigation the navigation URI.

The rule URI is matched to a navigation URI by URI component: scheme, username, password, host, port, path, query, and fragment. If a component does not exist on the rule URI then it matches any value of that component in the navigation URI. For example, a rule URI with no fragment will match a navigation URI with no fragment, with an empty string fragment, or a fragment with any value in it.

Asterisk

Each component except the port may have up to 8 asterisks. Two asterisks in a row counts as an escape and will match 1 literal asterisk. For scheme, username, password, query and fragment the asterisk matches whatever it can within the component.

Host

For the host, if the host consists of exactly one single asterisk then it matches anything. Otherwise an asterisk in a host only matches within its domain name label. For example, http://*.example.com will match http://a.example.com/ but not http://b.a.example.com/ or http://example.com/. And http://*/ will match http://example.com, http://a.example.com/, and http://b.a.example.com/. However the Store places restrictions on submitting apps that use the http://* rule or rules with an asterisk in the second effective domain name label. For example, http://*.com is also restricted for Store submission.

Path

For the path, an asterisk matches within the path segment. For example, http://example.com/a/*/c will match http://example.com/a/b/c and http://example.com/a//c but not http://example.com/a/b/b/c or http://example.com/a/c

Additionally for the path, if the path ends with a slash then it matches any path that starts with that same path. For example, http://example.com/a/ will match http://example.com/a/b and http://example.com/a/b/c/d/e/, but not http://example.com/b/.

If the path doesn’t end with a slash then there is no suffix matching performed. For example, http://example.com/a will match only http://example.com/a and no URIs with a different path.

As a part of parsing the rule URI and the navigation URI, CreateUri will perform URI normalization and so the hostname and scheme will be made lower case (casing matters in all other parts of the URI and case sensitive comparisons will be performed), IDN normalization will be performed, ‘.’ and ‘..’ path segments will be resolved and other normalizations as described in the CreateUri documentation.

PermalinkCommentsapplication-content-uri-rules programming windows windows-store

location.hash and location.search are bad and they should feel bad

2014 May 22, 9:25
The DOM location interface exposes the HTML document's URI parsed into its properties. However, it is ancient and has problems that bug me but otherwise rarely show up in the real world. Complaining about mostly theoretical issues is why blogging exists, so here goes:
  • The location object's search, hash, and protocol properties are all misnomers that lead to confusion about the correct terms:
    • The 'search' property returns the URI's query property. The query property isn't limited to containing search terms.
    • The 'hash' property returns the URI's fragment property. This one is just named after its delimiter. It should be called the fragment.
    • The 'protocol' property returns the URI's scheme property. A URI's scheme isn't necessarily a protocol. The http URI scheme of course uses the HTTP protocol, but the https URI scheme is the HTTP protocol over SSL/TLS - there is no HTTPS protocol. Similarly for something like mailto - there is no mailto wire protocol.
  • The 'hash' and 'search' location properties both return null in the case that their corresponding URI property doesn't exist or if its the empty string. A URI with no query property and a URI with an empty string query property that are otherwise the same, are not equal URIs and are allowed by HTTP to return different content. Similarly for the fragment. Unless the specific URI scheme defines otherwise, an empty query or hash isn't the same as no query or hash.
But like complaining about the number of minutes in an hour none of this can ever change without huge compat issues on the web. Accordingly I can only give my thanks to Anne van Kesteren and the awesome work on the URL standard moving towards a more sane (but still working practically within the constraints of compat) location object and URI parsing in the browser.
PermalinkComments

URI functions in Windows Store Applications

2013 Jul 25, 1:00

Summary

The Modern SDK contains some URI related functionality as do libraries available in particular projection languages. Unfortunately, collectively these APIs do not cover all scenarios in all languages. Specifically, JavaScript and C++ have no URI building APIs, and C++ additionally has no percent-encoding/decoding APIs.
WinRT (JS and C++)
JS Only
C++ Only
.NET Only
Parse
 
Build
Normalize
Equality
 
 
Relative resolution
Encode data for including in URI property
Decode data extracted from URI property
Build Query
Parse Query
The Windows.Foudnation.Uri type is not projected into .NET modern applications. Instead those applications use System.Uri and the platform ensures that it is correctly converted back and forth between Windows.Foundation.Uri as appropriate. Accordingly the column marked WinRT above is applicable to JS and C++ modern applications but not .NET modern applications. The only entries above applicable to .NET are the .NET Only column and the WwwFormUrlDecoder in the bottom left which is available to .NET.

Scenarios

Parse

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS, and by System.Uri in .NET.
Parsing a URI pulls it apart into its basic components without decoding or otherwise modifying the contents.
var uri = new Windows.Foundation.Uri("http://example.com/path%20segment1/path%20segment2?key1=value1&key2=value2");
console.log(uri.path);// /path%20segment1/path%20segment2

WsDecodeUrl (C++)

WsDecodeUrl is not suitable for general purpose URI parsing.  Use Windows.Foundation.Uri instead.

Build (C#)

URI building is only available in C# via System.UriBuilder.
URI building is the inverse of URI parsing: URI building allows the developer to specify the value of basic components of a URI and the API assembles them into a URI. 
To work around the lack of a URI building API developers will likely concatenate strings to form their URIs.  This can lead to injection bugs if they don’t validate or encode their input properly, but if based on trusted or known input is unlikely to have issues.
            Uri originalUri = new Uri("http://example.com/path1/?query");
            UriBuilder uriBuilder = new UriBuilder(originalUri);
            uriBuilder.Path = "/path2/";
            Uri newUri = uriBuilder.Uri; // http://example.com/path2/?query

WsEncodeUrl (C++)

WsEncodeUrl, in addition to building a URI from components also does some encoding.  It encodes non-US-ASCII characters as UTF8, the percent, and a subset of gen-delims based on the URI property: all :/?#[]@ are percent-encoded except :/@ in the path and :/?@ in query and fragment.
Accordingly, WsEncodeUrl is not suitable for general purpose URI building.  It is acceptable to use in the following cases:
- You’re building a URI out of non-encoded URI properties and don’t care about the difference between encoded and decoded characters.  For instance you’re the only one consuming the URI and you uniformly decode URI properties when consuming – for instance using WsDecodeUrl to consume the URI.
- You’re building a URI with URI properties that don’t contain any of the characters that WsEncodeUrl encodes.

Normalize

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET.  Normalization is applied during construction of the Uri object.
URI normalization is the application of URI normalization rules (including DNS normalization, IDN normalization, percent-encoding normalization, etc.) to the input URI.
        var normalizedUri = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/");
        console.log(normalizedUri.absoluteUri); // http://example.com/path%20foo/
This is modulo Win8 812823 in which the Windows.Foundation.Uri.AbsoluteUri property returns a normalized IRI not a normalized URI.  This bug does not affect System.Uri.AbsoluteUri which returns a normalized URI.

Equality

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET. 
URI equality determines if two URIs are equal or not necessarily equal.
            var uri1 = new Windows.Foundation.Uri("HTTP://EXAMPLE.COM/p%61th foo/"),
                uri2 = new Windows.Foundation.Uri("http://example.com/path%20foo/");
            console.log(uri1.equals(uri2)); // true

Relative resolution

This functionality is provided by the WinRT API Windows.Foundation.Uri in C++ and JS and by System.Uri in .NET 
Relative resolution is a function that given an absolute URI A and a relative URI B, produces a new absolute URI C.  C is the combination of A and B in which the basic components specified in B override or combine with those in A under rules specified in RFC 3986.
        var baseUri = new Windows.Foundation.Uri("http://example.com/index.html"),
            relativeUri = "/path?query#fragment",
            absoluteUri = baseUri.combineUri(relativeUri);
        console.log(baseUri.absoluteUri);       // http://example.com/index.html
        console.log(absoluteUri.absoluteUri);   // http://example.com/path?query#fragment

Encode data for including in URI property

This functionality is available in JavaScript via encodeURIComponent and in C# via System.Uri.EscapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now have Windows.Foundation.Uri.EscapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Encoding data for inclusion in a URI property is necessary when constructing a URI from data.  In all the above cases the developer is dealing with a URI or substrings of a URI and so the strings are all encoded as appropriate. For instance, in the parsing example the path contains “path%20segment1” and not “path segment1”.  To construct a URI one must first construct the basic components of the URI which involves encoding the data.  For example, if one wanted to include “path segment / example” in the path of a URI, one must percent-encode the ‘ ‘ since it is not allowed in a URI, as well as the ‘/’ since although it is allowed, it is a delimiter and won’t be interpreted as data unless encoded.
If a developer does not have this API provided they can write it themselves.  Percent-encoding methods appear simple to write, but the difficult part is getting the set of characters to encode correct, as well as handling non-US-ASCII characters.
        var uri = new Windows.Foundation.Uri("http://example.com" +
            "/" + Windows.Foundation.Uri.escapeComponent("path segment / example") +
            "?key=" + Windows.Foundation.Uri.escapeComponent("=&?#"));
        console.log(uri.absoluteUri); // http://example.com/path%20segment%20%2F%20example?key=%3D%26%3F%23

WsEncodeUrl (C++)

In addition to building a URI from components, WsEncodeUrl also percent-encodes some characters.  However the API is not recommend for this scenario given the particular set of characters that are encoded and the convoluted nature in which a developer would have to use this API in order to use it for this purpose.
There are no general purpose scenarios for which the characters WsEncodeUrl encodes make sense: encode the %, encode a subset of gen-delims but not also encode the sub-delims.  For instance this could not replace encodeURIComponent in a C++ version of the following code snippet since if ‘value’ contained ‘&’ or ‘=’ (both sub-delims) they wouldn’t be encoded and would be confused for delimiters in the name value pairs in the query:
"http://example.com/?key=" + Windows.Foundation.Uri.escapeComponent(value)
Since WsEncodeUrl produces a string URI, to obtain the property they want to encode they’d need to parse the resulting URI.  WsDecodeUrl won’t work because it decodes the property but Windows.Foundation.Uri doesn’t decode.  Accordingly the developer could run their string through WsEncodeUrl then Windows.Foundation.Uri to extract the property.

Decode data extracted from URI property

This functionality is available in JavaScript via decodeURIComponent and in C# via System.Uri.UnescapeDataString. Although the two methods mentioned above will suffice for this purpose, they do not perform exactly the same operation.
Additionally we now also have Windows.Foundation.Uri.UnescapeComponent in WinRT, which is available in JavaScript and C++ (not C# since it doesn’t have access to Windows.Foundation.Uri).  This is also slightly different from the previously mentioned mechanisms but works best for this purpose.
Decoding is necessary when extracting data from a parsed URI property.  For example, if a URI query contains a series of name and value pairs delimited by ‘=’ between names and values, and by ‘&’ between pairs, one must first parse the query into name and value entries and then decode the values.  It is necessary to make this an extra step separate from parsing the URI property so that sub-delimiters (in this case ‘&’ and ‘=’) that are encoded will be interpreted as data, and those that are decoded will be interpreted as delimiters.
If a developer does not have this API provided they can write it themselves.  Percent-decoding methods appear simple to write, but have some tricky parts including correctly handling non-US-ASCII, and remembering not to decode .
In the following example, note that if unescapeComponent were called first, the encoded ‘&’ and ‘=’ would be decoded and interfere with the parsing of the name value pairs in the query.
            var uri = new Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            uri.query.substr(1).split("&").forEach(
                function (keyValueString) {
                    var keyValue = keyValueString.split("=");
                    console.log(Windows.Foundation.Uri.unescapeComponent(keyValue[0]) + ": " + Windows.Foundation.Uri.unescapeComponent(keyValue[1]));
                    // foo: bar
                    // array: ['','&','=','#']
                });

WsDecodeUrl (C++)

Since WsDecodeUrl decodes all percent-encoded octets it could be used for general purpose percent-decoding but it takes a URI so would require the dev to construct a stub URI around the string they want to decode.  For example they could prefix “http:///#” to their string, run it through WsDecodeUrl and then extract the fragment property.  It is convoluted but will work correctly.

Parse Query

The query of a URI is often encoded as application/x-www-form-urlencoded which is percent-encoded name value pairs delimited by ‘&’ between pairs and ‘=’ between corresponding names and values.
In WinRT we have a class to parse this form of encoding using Windows.Foundation.WwwFormUrlDecoder.  The queryParsed property on the Windows.Foundation.Uri class is of this type and created with the query of its Uri:
    var uri = Windows.Foundation.Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
    uri.queryParsed.forEach(
        function (pair) {
            console.log("name: " + pair.name + ", value: " + pair.value);
            // name: foo, value: bar
            // name: array, value: ['','&','=','#']
        });
    console.log(uri.queryParsed.getFirstValueByName("array")); // ['','&','=','#']
The QueryParsed property is only on Windows.Foundation.Uri and not System.Uri and accordingly is not available in .NET.  However the Windows.Foundation.WwwFormUrlDecoder class is available in C# and can be used manually:
            Uri uri = new Uri("http://example.com/?foo=bar&array=%5B%27%E3%84%93%27%2C%27%26%27%2C%27%3D%27%2C%27%23%27%5D");
            WwwFormUrlDecoder decoder = new WwwFormUrlDecoder(uri.Query);
            foreach (IWwwFormUrlDecoderEntry entry in decoder)
            {
                System.Diagnostics.Debug.WriteLine("name: " + entry.Name + ", value: " + entry.Value);
                // name: foo, value: bar
                // name: array, value: ['','&','=','#']
            }
 

Build Query

To build a query of name value pairs encoded as application/x-www-form-urlencoded there is no WinRT API to do this directly.  Instead a developer must do this manually making use of the code described in “Encode data for including in URI property”.
In terms of public releases, this property is only in the RC and later builds.
For example in JavaScript a developer may write:
            var uri = new Windows.Foundation.Uri("http://example.com/"),
                query = "?" + Windows.Foundation.Uri.escapeComponent("array") + "=" + Windows.Foundation.Uri.escapeComponent("['','&','=','#']");
 
            console.log(uri.combine(new Windows.Foundation.Uri(query)).absoluteUri); // http://example.com/?array=%5B'%E3%84%93'%2C'%26'%2C'%3D'%2C'%23'%5D
 
PermalinkCommentsc# c++ javascript technical uri windows windows-runtime windows-store

URI Fragment Identifiers for the text/csv Media Type

2011 Apr 29, 3:55This memo defines URI fragment identifiers for text/csv MIME entities. These fragment identifiers make it possible to refer to parts of a text/csv MIME entity, identified by cell, row, column, or slice.PermalinkCommentscsv uri technical mime reference

ginger's thoughts » URI fragments vs URI queries for media fragment addressing

2009 Sep 11, 8:39"In the W3C Media Fragment Working Group (MFWG) we have had long discussions about the use of the URI query (”?”) or the URI fragment (”#”) addressing approach for addressing directly into media fragments, and the diverse new HTTP headers required to serve such URI requests, considering such side conditions as the stripping-off of fragment parameters from a URI by Web browsers, or the existence of caching Web proxies."PermalinkCommentsfragment uri via:connolly media url query http http-header

Creating Accelerators for Other People's Web Services

2009 Aug 18, 4:19

Before we shipped IE8 there were no Accelerators, so we had some fun making our own for our favorite web services. I've got a small set of tips for creating Accelerators for other people's web services. I was planning on writing this up as an IE blog post, but Jon wrote a post covering a similar area so rather than write a full and coherent blog post I'll just list a few points:

PermalinkCommentstechnical accelerator ie8 ie

IE8 Search Providers, Accelerators, and Local Applications Hack

2009 Jul 25, 3:23

There's no easy way to use local applications on a PC as the result of an accelerator or a search provider in IE8 but there is a hack-y/obvious way, that I'll describe here. Both accelerators and search providers in IE8 fill in URL templates and navigate to the resulting URL when an accelerator or search provider is executed by the user. These URLs are limited in scheme to http and https but those pages may do anything any other webpage may do. If your local application has an ActiveX control you could use that, or (as I will provide examples for) if the local application has registered for an application protocol you can redirect to that URL. In any case, unfortunately this means that you must put a webpage on the Internet in order to get an accelerator or search provider to use a local application.

For examples of the app protocol case, I've created a callto accelerator that uses whatever application is registered for the callto scheme on your system, and a Windows Search search provider that opens Explorer's search with your search query. The callto accelerator navigates to my redirection page with 'callto:' followed by the selected text in the fragment and the redirection page redirects to that callto URL. In the Windows Search search provider case the same thing happens except the fragment contains 'search-ms:query=' followed by the selected text, which starts Windows Search on your system with the selected text as the query. I've looked into app protocols previously.

PermalinkCommentstechnical callto hack accelerator search ie8

Linking to or Embedding a Portion of a Video

2009 Jun 19, 10:12

I'm excited by HTML5's video tag as are plenty of other people. Once that comes about and once media fragments are adopted, linking to or embedding a portion of a video will be as easy as using the correct fragment on your URL thanks to the Media Fragments WG who has been hard at work since the last time I looked at fragments.

However, until that work is embraced by browsers, embedding portions of videos will continue to require work specific to the site from which you are embedding the video. On the YouTube blog they wrote about how to "link to the best parts in your videos", using a fragment syntax like '#t=1m15s' to start playback of the associated video at 1 minute and 15 seconds. Of course if you want to embed part of a Hulu video it will be different. Although I haven't found an authoritative source describing the URL syntax to use, you can follow Hulu's video guide on linking to part of a video and note how the URL changes as you adjust the slider on the time-line. It looks like their syntax for linking to a Hulu page is to add '?c=[start time in seconds](:[end time in seconds])' with the colon and end time optional in order to link to a portion of a video. And the syntax for embedding appears to be "http://www.hulu.com/embed/.../[start time in seconds](/[end time in seconds])" again with the end time optional.

For more sites, check out the Media Fragments WG's list of existing applications' proprietary fragmenting schemes.

PermalinkCommentshulu technical media fragment wg url youtube video html5 uri fragment

Existing Technologies Survey - Media Fragments Working Group Wiki

2009 Jun 17, 7:17A list of how some existing sites do URL-fragment-like things.PermalinkCommentsvideo web w3c url uri fragment

Use cases and requirements for Media Fragments

2009 Jun 17, 7:16"Use cases and requirements for Media Fragments"PermalinkCommentsvideo uri fragment internet web w3c

YouTube Blog - Link To The Best Parts In Your Videos

2009 Apr 23, 6:25"To create a deep link, append the following to the end of a YouTube video URL: #t=1m15s. This says to link to the time 1:15 - you can replace the numbers before the 'm' and the 's' with anything you like."PermalinkCommentsreference video blog google youtube api url fragment link

Clickable transcript of my Canonical Link Element talk

2009 Apr 23, 6:21You can link into the middle of a YouTube video using a fragment like '#t=30m14s'. Matt combines this with his transcript...: "If you run that over your entire caption file - boom - you have a clickable transcript of your video."PermalinkCommentsvideo blog hack youtube url transcript

Text/Plain Fragment Bookmarklet

2008 Nov 19, 12:58

The text/plain fragment documented in RFC 5147 and described on Erik Wilde's blog struck my interest and, like the XML fragment, I wanted to see if I could implement this in IE. In this case there's no XSLT for me to edit so, like my plain/text word wrap bookmarklet I've implemented it as a bookmarklet. This is only a partial implementation as it doesn't implement the integrity checks.

Check out my text/plain fragment bookmarklet.

PermalinkCommentstext url boring bookmarklet uri plain-text javascript fragment

URI Fragment Info Roundup

2008 Apr 21, 11:53

['Neverending story' by Alexandre Duret-Lutz. A framed photo of books with the droste effect applied. Licensed under creative commons.]Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.

HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.

PermalinkCommentsxml text ajax technical url boring uri fragment rfc

Fragment Identification of MPEG Resources (Text of ISO/IEC FCD 21000-17 MPEG-21 FID)

2008 Apr 16, 7:09Standard describing URI fragments identifying parts of MPEG videos. Very similar syntax to XML fragments. Having trouble finding this document as anything other than a Word doc. Looks to exist only as an ISO standard.PermalinkCommentsstandard fragment uri video mpeg reference iso

dretblog: Fragment Identifiers for Plain Text Documents

2008 Apr 16, 6:58Eric Wilde talks about his text plain fragment RFC becoming a standard.PermalinkCommentsblog mime uri fragment text erik-wilde

14.3 Linking into SVG content: IRI fragments and SVG views

2008 Apr 16, 6:53SVG doc on how to make URI fragments that reference parts of an SVG document.PermalinkCommentsstandard svg w3c reference uri fragment

RFC 5147 - URI Fragment Identifiers for the text/plain Media Type

2008 Apr 16, 6:42The URI fragment for text/plain is finally a Proposed Standard!PermalinkCommentsuri fragment mime web rfc standards

Media Fragments Working Group

2008 Apr 16, 6:42A working group devoted to getting fragments to ID pieces of images or time positions or ranges in audio and video.PermalinkCommentsmime w3c standard uri fragment
Older Entries Creative Commons License Some rights reserved.