regex - Dave's Blog

Search
My timeline on Mastodon

Stripe CTF - Level 5

2012 Sep 11, 5:00

Level 5 of the Stripe CTF revolved around a design issue in an OpenID like protocol.

Code

    def authenticated?(body)
body =~ /[^\w]AUTHENTICATED[^\w]*$/
end

...

if authenticated?(body)
session[:auth_user] = username
session[:auth_host] = host
return "Remote server responded with: #{body}." \
" Authenticated as #{username}@#{host}!"

Issue

This level is an implementation of a federated identity protocol. You give it an endpoint URI and a username and password, it posts the username and password to the endpoint URI, and if the response is 'AUTHENTICATED' then access is allowed. It is easy to be authenticated on a server you control, but this level requires you to authenticate from the server running the level. This level only talks to stripe CTF servers so the first step is to upload a document to the level 2 server containing the text 'AUTHENTICATED' and we can now authenticate on a level 2 server. Notice that the level 5 server will dump out the content of the endpoint URI and that the regexp it uses to detect the text 'AUTHENTICATED' can match on that dump. Accordingly I uploaded an authenticated file to

https://level02-2.stripe-ctf.com/user-ajvivlehdt/uploads/authenticated
Using that as my endpoint URI means authenticating as level 2. I can then choose the following endpoint URI to authenticate as level 5.
https://level05-1.stripe-ctf.com/user-qtoyekwrod/?pingback=https%3A%2F%2Flevel02-2.stripe-ctf.com%2Fuser-ajvivlehdt%2Fuploads%2Fauthenticated&username=a&password=a
Navigating to that URI results in the level 5 server telling me I'm authenticated as level 2 and lists the text of the level 2 file 'AUTHENTICATED'. Feeding this back into the level 5 server as my endpoint URI means level 5 seeing 'AUTHENTICATED' coming back from a level 5 URI.

Notes

I didn't see any particular code review red flags, really the issue here is that the regular expression testing for 'AUTHENTICATED' is too permisive and the protocol itself doesn't do enough. The protocol requires only a set piece of common literal text to be returned which makes it easy for a server to accidentally fall into authenticating. Having the endpoint URI have to return variable text based on the input would make it much harder for a server to accidentally authenticate.

PermalinkCommentsinternet openid security stripe-ctf technical web

Stripe CTF - XSS, CSRF (Levels 4 & 6)

2012 Sep 10, 4:43

Level 4 and level 6 of the Stripe CTF had solutions around XSS.

Level 4

Code

> Registered Users 

    <%@registered_users.each do |user| %>
    <%last_active = user[:last_active].strftime('%H:%M:%S UTC') %>
    <%if @trusts_me.include?(user[:username]) %>

  • <%= user[:username] %>
    (password: <%= user[:password] %>, last active <%= last_active %>)
  • Issue

    The level 4 web application lets you transfer karma to another user and in doing so you are also forced to expose your password to that user. The main user page displays a list of users who have transfered karma to you along with their password. The password is not HTML encoded so we can inject HTML into that user's browser. For instance, we could create an account with the following HTML as the password which will result in XSS with that HTML:

    
    
    This HTML runs script that uses jQuery to post to the transfer URI resulting in a transfer of karma from the attacked user to the attacker user, and also the attacked user's password.

    Notes

    Code review red flags in this case included lack of encoding when using user controlled content to create HTML content, storing passwords in plain text in the database, and displaying passwords generally. By design the web app shows users passwords which is a very bad idea.

    Level 6

    Code

    
    

    ...

    def self.safe_insert(table, key_values)
    key_values.each do |key, value|
    # Just in case people try to exfiltrate
    # level07-password-holder's password
    if value.kind_of?(String) &&
    (value.include?('"') || value.include?("'"))
    raise "Value has unsafe characters"
    end
    end

    conn[table].insert(key_values)
    end

    Issue

    This web app does a much better job than the level 4 app with HTML injection. They use encoding whenever creating HTML using user controlled data, however they don't use encoding when injecting JSON data into script (see post_data initialization above). This JSON data is the last five most recent messages sent on the app so we get to inject script directly. However, the system also ensures that no strings we write contains single or double quotes so we can't get out of the string in the JSON data directly. As it turns out, HTML lets you jump out of a script block using no matter where you are in script. For instance, in the middle of a value in some JSON data we can jump out of script. But we still want to run script, so we can jump right back in. So the frame so far for the message we're going to post is the following:

    
    
    
    
PermalinkCommentscsrf encoding html internet javascript percent-encoding script security stripe-ctf technical web xss

A Perl Regular Expression That Matches Prime Numbers (catonmat.net)

2011 Dec 28, 3:36

perl -lne ‘(1x$_) =~ /^1?$|^(11+?)\1+$/ || print “$_ is prime”’

PermalinkCommentstechnical perl regexp prime math

Command line for finding missing URLACTIONs

2011 May 28, 11:00

I wanted to ensure that my switch statement in my implementation of IInternetSecurityManager::ProcessURLAction had a case for every possible documented URLACTION. I wrote the following short command line sequence to see the list of all URLACTIONs in the SDK header file not found in my source file:

grep URLACTION urlmon.idl | sed 's/.*\(URLACTION[a-zA-Z0-9_]*\).*/\1/g;' | sort | uniq > allURLACTIONs.txt
grep URLACTION MySecurityManager.cpp | sed 's/.*\(URLACTION[a-zA-Z0-9_]*\).*/\1/g;' | sort | uniq > myURLACTIONs.txt
comm -23 allURLACTIONs.txt myURLACTIONs.txt
I'm not a sed expert so I had to read the sed documentation, and I heard about comm from Kris Kowal's blog which happilly was in the Win32 GNU tools pack I already run.

But in my effort to learn and use PowerShell I found the following similar command line:

diff 
(more urlmon.idl | %{ if ($_ -cmatch "URLACTION[a-zA-Z0-9_]*") { $matches[0] } } | sort -uniq)
(more MySecurityManager.cpp | %{ if ($_ -cmatch "URLACTION[a-zA-Z0-9_]*") { $matches[0] } } | sort -uniq)
In the PowerShell version I can skip the temporary files which is nice. 'diff' is mapped to 'compare-object' which seems similar to comm but with no parameters to filter out the different streams (although this could be done more verbosely with the ?{ } filter syntax). In PowerShell uniq functionality is built into sort. The builtin -cmatch operator (c is for case sensitive) to do regexp is nice plus the side effect of generating the $matches variable with the regexp results.
PermalinkCommentspowershell tool cli technical command line

Tab Expansion in PowerShell

2008 Nov 18, 6:38

PowerShell gives us a real CLI for Windows based around .Net stuff. I don't like the creation of a new shell language but I suppose it makes sense given that they want something C# like but not C# exactly since that's much to verbose and strict for a CLI. One of the functions you can override is the TabExpansion function which is used when you tab complete commands. I really like this and so I've added on to the standard implementation to support replacing a variable name with its value, tab completion of available commands, previous command history, and drive names (there not restricted to just one letter in PS).

Learning the new language was a bit of a chore but MSDN helped. A couple of things to note, a statement that has a return value that you don't do anything with is implicitly the return value for the current function. That's why there's no explicit return's in my TabExpansion function. Also, if you're TabExpansion function fails or returns nothing then the builtin TabExpansion function runs which does just filenames. This is why you can see that the standard TabExpansion function doesn't handle normal filenames: it does extra stuff (like method and property completion on variables that represent .Net objects) but if there's no fancy extra stuff to be done it lets the builtin one take a crack.

Here's my TabExpansion function. Probably has bugs, so watch out!


function EscapePath([string] $path, [string] $original)
{
    if ($path.Contains(' ') -and !$original.Contains(' '))
    {
        '"'   $path   '"';
    }
    else
    {
        $path;
    }
}

function PathRelativeTo($pathDest, $pathCurrent)
{
    if ($pathDest.PSParentPath.ToString().EndsWith($pathCurrent.Path))
    {
        '.\'   $pathDest.name;
    }
    else
    {
        $pathDest.FullName;
    }
}

#  This is the default function to use for tab expansion. It handles simple
# member expansion on variables, variable name expansion and parameter completion
# on commands. It doesn't understand strings so strings containing ; | ( or { may
# cause expansion to fail.

function TabExpansion($line, $lastWord)
{
    switch -regex ($lastWord)
    {
         # Handle property and method expansion...
         '(^.*)(\$(\w|\.) )\.(\w*)$' {
             $method = [Management.Automation.PSMemberTypes] `
                 'Method,CodeMethod,ScriptMethod,ParameterizedProperty'
             $base = $matches[1]
             $expression = $matches[2]
             Invoke-Expression ('$val='   $expression)
             $pat = $matches[4]   '*'
             Get-Member -inputobject $val $pat | sort membertype,name |
                 where { $_.name -notmatch '^[gs]et_'} |
                 foreach {
                     if ($_.MemberType -band $method)
                     {
                         # Return a method...
                         $base   $expression   '.'   $_.name   '('
                     }
                     else {
                         # Return a property...
                         $base   $expression   '.'   $_.name
                     }
                 }
             break;
          }

         # Handle variable name expansion...
         '(^.*\$)([\w\:]*)$' {
             $prefix = $matches[1]
             $varName = $matches[2]
             foreach ($v in Get-Childitem ('variable:'   $varName   '*'))
             {
                 if ($v.name -eq $varName)
                 {
                     $v.value
                 }
                 else
                 {
                    $prefix   $v.name
                 }
             }
             break;
         }

         # Do completion on parameters...
         '^-([\w0-9]*)' {
             $pat = $matches[1]   '*'

             # extract the command name from the string
             # first split the string into statements and pipeline elements
             # This doesn't handle strings however.
             $cmdlet = [regex]::Split($line, '[|;]')[-1]

             #  Extract the trailing unclosed block e.g. ls | foreach { cp
             if ($cmdlet -match '\{([^\{\}]*)$')
             {
                 $cmdlet = $matches[1]
             }

             # Extract the longest unclosed parenthetical expression...
             if ($cmdlet -match '\(([^()]*)$')
             {
                 $cmdlet = $matches[1]
             }

             # take the first space separated token of the remaining string
             # as the command to look up. Trim any leading or trailing spaces
             # so you don't get leading empty elements.
             $cmdlet = $cmdlet.Trim().Split()[0]

             # now get the info object for it...
             $cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet)[0]

             # loop resolving aliases...
             while ($cmdlet.CommandType -eq 'alias') {
                 $cmdlet = @(Get-Command -type 'cmdlet,alias' $cmdlet.Definition)[0]
             }

             # expand the parameter sets and emit the matching elements
             foreach ($n in $cmdlet.ParameterSets | Select-Object -expand parameters)
             {
                 $n = $n.name
                 if ($n -like $pat) { '-'   $n }
             }
             break;
         }

         default {
             $varNameStar = $lastWord   '*';

             foreach ($n in @(Get-Childitem $varNameStar))
             {
                 $name = PathRelativeTo ($n) ($PWD);

                 if ($n.PSIsContainer)
                 {
                     EscapePath ($name   '\') ($lastWord);
                 }
                 else
                 {
                     EscapePath ($name) ($lastWord);
                 }
             }

             if (!$varNameStar.Contains('\'))
             {
                foreach ($n in @(Get-Command $varNameStar))
                {
                    if ($n.CommandType.ToString().Equals('Application'))
                    {
                       foreach ($ext in @((cat Env:PathExt).Split(';')))
                       {
                          if ($n.Path.ToString().ToLower().EndsWith(($ext).ToString().ToLower()))
                          {
                              EscapePath($n.Path) ($lastWord);
                          }
                       }
                    }
                    else
                    {
                        EscapePath($n.Name) ($lastWord);
                    }
                }

                foreach ($n in @(Get-psdrive $varNameStar))
                {
                    EscapePath($n.name   ":") ($lastWord);
                }
             }

             foreach ($n in @(Get-History))
             {
                 if ($n.CommandLine.StartsWith($line) -and $n.CommandLine -ne $line)
                 {
                     $lastWord   $n.CommandLine.Substring($line.Length);
                 }
             }

             # Add the original string to the end of the expansion list.
             $lastWord;

             break;
         }
    }
}

PermalinkCommentscli technical tabexpansion powershell

Regex Tester - RegexPal

2008 Jun 12, 3:24A nice in browser regex tester. Give a sample string and a regex and the matches are highlighted.PermalinkCommentsregex javascript programming development

URI Fragment Info Roundup

2008 Apr 21, 11:53

['Neverending story' by Alexandre Duret-Lutz. A framed photo of books with the droste effect applied. Licensed under creative commons.]Information about URI Fragments, the portion of URIs that follow the '#' at the end and that are used to navigate within a document, is scattered throughout various documents which I usually have to hunt down. Instead I'll link to them all here.

Definitions. Fragments are defined in the URI RFC which states that they're used to identify a secondary resource that is related to the primary resource identified by the URI as a subset of the primary, a view of the primary, or some other resource described by the primary. The interpretation of a fragment is based on the mime type of the primary resource. Tim Berners-Lee notes that determining fragment meaning from mime type is a problem because a single URI may contain a single fragment, however over HTTP a single URI can result in the same logical resource represented in different mime types. So there's one fragment but multiple mime types and so multiple interpretations of the one fragment. The URI RFC says that if an author has a single resource available in multiple mime types then the author must ensure that the various representations of a single resource must all resolve fragments to the same logical secondary resource. Depending on which mime types you're dealing with this is either not easy or not possible.

HTTP. In HTTP when URIs are used, the fragment is not included. The General Syntax section of the HTTP standard says it uses the definitions of 'URI-reference' (which includes the fragment), 'absoluteURI', and 'relativeURI' (which don't include the fragment) from the URI RFC. However, the 'URI-reference' term doesn't actually appear in the BNF for the protocol. Accordingly the headers like 'Request-URI', 'Content-Location', 'Location', and 'Referer' which include URIs are defined with 'absoluteURI' or 'relativeURI' and don't include the fragment. This is in keeping with the original fragment definition which says that the fragment is used as a view of the original resource and consequently only needed for resolution on the client. Additionally, the URI RFC explicitly notes that not including the fragment is a privacy feature such that page authors won't be able to stop clients from viewing whatever fragments the client chooses. This seems like an odd claim given that if the author wanted to selectively restrict access to portions of documents there are other options for them like breaking out the parts of a single resource to which the author wishes to restrict access into separate resources.

HTML. In HTML, the HTML mime type RFC defines HTML's fragment use which consists of fragments referring to elements with a corresponding 'id' attribute or one of a particular set of elements with a corresponding 'name' attribute. The HTML spec discusses fragment use additionally noting that the names and ids must be unique in the document and that they must consist of only US-ASCII characters. The ID and NAME attributes are further restricted in section 6 to only consist of alphanumerics, the hyphen, period, colon, and underscore. This is a subset of the characters allowed in the URI fragment so no encoding is discussed since technically its not needed. However, practically speaking, browsers like FireFox and Internet Explorer allow for names and ids containing characters outside of the defined set including characters that must be percent-encoded to appear in a URI fragment. The interpretation of percent-encoded characters in fragments for HTML documents is not consistent across browsers (or in some cases within the same browser) especially for the percent-encoded percent.

Text. Text/plain recently got a fragment definition that allows fragments to refer to particular lines or characters within a text document. The scheme no longer includes regular expressions, which disappointed me at first, but in retrospect is probably good idea for increasing the adoption of this fragment scheme and for avoiding the potential for ubiquitous DoS via regex. One of the authors also notes this on his blog. I look forward to the day when this scheme is widely implemented.

XML. XML has the XPointer framework to define its fragment structure as noted by the XML mime type definition. XPointer consists of a general scheme that contains subschemes that identify a subset of an XML document. Its too bad such a thing wasn't adopted for URI fragments in general to solve the problem of a single resource with multiple mime type representations. I wrote more about XPointer when I worked on hacking XPointer into IE.

SVG and MPEG. Through the Media Fragments Working Group I found a couple more fragment scheme definitions. SVG's fragment scheme is defined in the SVG documentation and looks similar to XML's. MPEG has one defined but I could only find it as an ISO document "Text of ISO/IEC FCD 21000-17 MPEG-12 FID" and not as an RFC which is a little disturbing.

AJAX. AJAX websites have used fragments as an escape hatch for two issues that I've seen. The first is getting a unique URL for versions of a page that are produced on the client by script. The fragment may be changed by script without forcing the page to reload. This goes outside the rules of the standards by using HTML fragments in a fashion not called out by the HTML spec. but it does seem to be inline with the spirit of the fragment in that it is a subview of the original resource and interpretted client side. The other hack-ier use of the fragment in AJAX is for cross domain communication. The basic idea is that different frames or windows may not communicate in normal fashions if they have different domains but they can view each other's URLs and accordingly can change their own fragments in order to send a message out to those who know where to look. IMO this is not inline with the spirit of the fragment but is rather a cool hack.

PermalinkCommentsxml text ajax technical url boring uri fragment rfc

Regular Expressions - Justin Rogers

2008 Mar 28, 10:39Justin's blog posts on regex in .NET.PermalinkCommentsregex justin-rogers ie programming .net

Regular expression - Wikipedia, the free encyclopedia

2008 Mar 28, 10:38Running time of regular expression matching. Most modern regex APIs do backtracking and can have exponential running time depending on the regex string.PermalinkCommentsregex programming reference wikipedia big-oh running-time

Algorithmic Complexity Attacks

2008 Mar 28, 10:35Scott A Crosby and Dan S Wallach "present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures." DoS via worst case behavior in hash tables and exponential time RegExp'sPermalinkCommentsscott-crosby dan-wallach dos programming regex research security hash

clbuttic - Google Search

2008 Feb 25, 10:08That's clbuttic. I totally didn't grok that until I read the WTF article that shows up. FTS: "Cbuttette tape tray broken(thank god for cd's {not good for clbuttic cbuttettes). So ya everyone else post whats wrong with your car. :lol: ..."PermalinkCommentsvia:swannman humor clbuttic error mistake regex

reAnimator: Regular Expression FSA Visualizer

2007 Jul 22, 8:38Animated interactive graph of the FSM used to parse any regex and corresponding string you enter.PermalinkCommentsregex flash visualization fsm interactive howto language

Sed - An Introduction and Tutorial

2005 Dec 6, 11:50PermalinkCommentsdevelopment reference regex tools unix sed tutorial

RFC 2234 (rfc2234) - Augmented BNF for Syntax Specifications: ABNF

2005 Oct 21, 1:25ABNF Sytnax DefinitionPermalinkCommentsreference rfc bnf abnf language regex

Regular Expression Library -- presented by ASPSmith.com Training

2005 Oct 21, 1:02PermalinkCommentssearch development regex language reference

EBNF - ISO 14977 Standard

2005 Oct 7, 12:33PermalinkCommentslanguage bnf ebnf iso reference specification regex

Vim Regular Expressions

2005 Sep 22, 11:48PermalinkCommentsreference regex vim tools language

Pattern (Java 2 Platform SE v1.4.2)

2005 Sep 22, 11:47Java's Regular Expression ReferencePermalinkCommentsjava regex reference development language

Welcome to the MSDN Library

2005 Sep 22, 11:45.NET's Regular Expression ReferencePermalinkCommentsregex reference development language dotnet
Older Entries Creative Commons License Some rights reserved.