WebApp Sec mailing list archives

RE: Sequence Identification Routines?

From: securityarchitect () hush com
Date: Tue, 10 Dec 2002 10:02:33 -0800


You can actually download the source code that the MIT Cookie Eaters project used for their analysis.

http://pdos.lcs.mit.edu/cookies/pubs.html

On Tue, 10 Dec 2002 08:23:54 -0800 "Dawes, Rogan (ZA - Johannesburg)" <rdawes () deloitte co za> wrote:

Here is a fairly simple Perl script that takes a sequence of "random"
tokens, determines the charset for each character position (it assumes
that all tokens are the same length), converts each character into a
number (its index within the charset for that specific character
position), and calculates the "integer value" of each token.

It then prints each token in sequence, along with the difference
between its own "value" and that of the preceding token.

You could then graph the results, to assist in determining the level
of randomness.

The fun part about this script is that it looks explicitly at the
input that YOU give it, so the more input you supply, the more
accurate its calculations will be.

Also, tokens such as

AAAAAAAAAA
AAAAAAAAAB
AAAAAAAAAC

should expand to 0, 1, 2, because the leading A's simply evaluate to
zero: each of those positions has a one-character charset, so it
contributes the digit 0 in base 1 (0*1^0 + 0*1^1, etc).
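
The positional conversion described above can be sketched on its own as
follows. This is only an illustration, not the charset.pl script itself:
the helper names position_charsets and token_value are made up for this
example, and it assumes all tokens have the same length.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: build the per-position charset (sorted, unique
# characters) from a list of equal-length tokens.
sub position_charsets {
    my @tokens = @_;
    my @seen;
    for my $token (@tokens) {
        my @c = split //, $token;
        $seen[$_]{ $c[$_] } = 1 for 0 .. $#c;
    }
    return map { join '', sort keys %$_ } @seen;
}

# Hypothetical helper: treat each character as a digit whose base is the
# size of the charset at that position, most significant position first.
sub token_value {
    my ($token, @charsets) = @_;
    my $total = Math::BigInt->new(0);
    for my $p (0 .. length($token) - 1) {
        my $set   = $charsets[$p];
        my $digit = index($set, substr($token, $p, 1));
        $total->bmul(length($set))->badd($digit);
    }
    return $total;
}

my @tokens   = qw(AAAAAAAAAA AAAAAAAAAB AAAAAAAAAC);
my @charsets = position_charsets(@tokens);
my @values   = map { token_value($_, @charsets)->bstr } @tokens;
print join(',', @values), "\n";    # prints 0,1,2
```

The first nine positions each have the single-character charset "A", so
they multiply the running total by 1 and add 0; only the last position
(charset "ABC", base 3) contributes to the value.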

I would be very interested in seeing the results of this plugged into
something like Michal Zalewski's strange attractors graphs. I have
seen some references to a similar approach, using a package called
OpenQVis, but have not had time to play with it yet.

Obvious problems:

This generates VERY large numbers, depending on the character set and
the length of the token. Differences can therefore also be quite large.
Graphing that in a way that makes any kind of sense is non-trivial, I
think (not being a statistician, of course!).

Ways of visualising the results:

Sort the token values, and plot them on a graph. One should ideally
see a "straight line" graph, most likely sparsely populated.

Sort the differences and plot them. One should again see a straight
line graph, most likely sparsely populated.

Any deviations from a straight line could indicate somewhat non-random
behaviour. This is not to say that it can help you predict what is
coming next, but it can show flaws in the generator.
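
As a rough numeric stand-in for eyeballing the sorted-values graph
described above, one could rescale the sorted values to the range 0..1
and measure how far they stray from a straight line. The sketch below
assumes the integer values have already been computed; the helper name
max_line_deviation and the fixed-point scale are arbitrary choices for
this example, and no particular threshold for "too far" is implied.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: sort the values, rescale them to 0..1, and return
# the largest distance from the ideal line y = i/(n-1).
sub max_line_deviation {
    my @sorted = sort { $a <=> $b } map { Math::BigInt->new($_) } @_;
    my $n = @sorted;
    return 0 if $n < 2;
    my $min  = $sorted[0];
    my $span = $sorted[-1] - $min;
    return 0 if $span->is_zero;
    my $scale = 1_000_000_000;    # fixed-point resolution (an assumption)
    my $worst = 0;
    for my $i (0 .. $n - 1) {
        # Integer arithmetic on BigInts avoids overflowing a native float.
        my $y     = (($sorted[$i] - $min) * $scale / $span)->numify / $scale;
        my $ideal = $i / ($n - 1);
        my $dev   = abs($y - $ideal);
        $worst = $dev if $dev > $worst;
    }
    return $worst;
}

printf "%.3f\n", max_line_deviation(qw(0 10 20 30 40));      # evenly spread
printf "%.3f\n", max_line_deviation(qw(0 1 2 3 1000000));    # clustered low
```

Evenly spread values give a deviation near 0, while values bunched at
one end of the range give a deviation approaching 1. The rescaling also
sidesteps the problem of plotting very large numbers directly.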

Alternatively, as someone mentioned diehard, take the integer values,
break them back into bytes, write them out as a byte stream, and use
that as input to diehard for extensive analysis.
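
A minimal sketch of the "break them back into bytes" step, assuming the
integer values are already available as decimal strings. The helper
names value_to_bytes and values_to_stream are invented for this example,
and the fixed big-endian width per value is one possible encoding, not
something diehard itself prescribes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: render one non-negative integer as exactly $width
# big-endian bytes, left-padded with zero bytes.
sub value_to_bytes {
    my ($value, $width) = @_;
    my $hex = Math::BigInt->new($value)->as_hex;    # e.g. "0x1f4"
    $hex =~ s/^0x//;
    die "value $value does not fit in $width bytes\n"
        if length($hex) > 2 * $width;
    $hex = ('0' x (2 * $width - length($hex))) . $hex;
    return pack('H*', $hex);
}

# Hypothetical helper: size every value to the width of the largest one
# and concatenate the results into a single byte string.
sub values_to_stream {
    my @values = @_;
    my $width = 1;
    for my $v (@values) {
        my $digits = length(Math::BigInt->new($v)->as_hex) - 2;  # drop "0x"
        my $bytes  = int(($digits + 1) / 2);
        $width = $bytes if $bytes > $width;
    }
    return join '', map { value_to_bytes($_, $width) } @values;
}

# Show the stream as hex here; a real run would write the raw bytes to a
# file opened with binmode instead.
print unpack('H*', values_to_stream(0, 500, 65535)), "\n";    # 000001f4ffff
```

Note that unless the largest possible token value happens to fill a
whole number of bytes, the most significant byte of each value will be
biased towards small numbers, which may itself show up as
"non-randomness" in the byte stream.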

I must say, when I used diehard, I was pretty much unable to evaluate
what it was telling me, as I have no idea what the tests that it is
running mean! :-)

Have fun.

Rogan

P.S. Any suggestions for improvements, especially to performance and
analysis, please send them my way, and I'll see what I can do.
P.P.S.

FWIW, I typically do something like:

for i in `seq 1 1000` ; do
  (echo HEAD /cookiegenerator HTTP/1.0; echo Otherheader: whatever; echo) |
    nc target 80 | grep Set-Cookie >> cookies
done

Post-process cookies to get just the "crumbs" :-), then run them
through the analysis below.
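
One way the post-processing step might look, as a sketch only: it
assumes each saved line has the usual "Set-Cookie: NAME=value; attrs"
shape, and the helper name crumb and the sample header lines are made
up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the value out of one "Set-Cookie:" header
# line. Returns undef when the line does not look like a cookie header.
sub crumb {
    my ($line) = @_;
    return $line =~ /^Set-Cookie:\s*[^=;\s]+=([^;\r\n]*)/i ? $1 : undef;
}

# Made-up sample lines standing in for the contents of the cookies file.
my @sample = (
    "Set-Cookie: SESSIONID=a1b2c3d4e5; path=/\r\n",
    "Set-Cookie: SESSIONID=a1b2c3d4e6; path=/\r\n",
    "Content-Type: text/html\r\n",
);

# Print one crumb per line, skipping anything that is not a cookie.
for my $line (@sample) {
    my $value = crumb($line);
    print "$value\n" if defined $value;
}
```

Wrapping the same helper in a "while (my $line = <>)" loop would turn
it into a filter that reads the cookies file and feeds one token per
line to the analysis script.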

Does anyone know of a tool that would automatically use Keep-Alives to
speed something like this up, if available, but would fall back to
opening a new connection per request when not?

0 $ cat charset.pl 
#!/usr/bin/perl -w

use strict;
use Math::BigInt ':constant';

my $verbose=0;

my %chars=();

my @charpos=();

my @cookies=();

while (my $line=<>) {
 chomp $line;
 push @cookies,$line;
 my @line=split('',$line);
 for (my $i=0; $i<length($line); $i++) {
   my $char=$line[$i];
   if (! exists $chars{$char} ) {
     $chars{$char}=1;
   } else {
     $chars{$char}++;
   }
   if (!exists $charpos[$i]->{$char}) {
     $charpos[$i]->{$char}=1;
   } else {
     $charpos[$i]->{$char}++;
   }
 }
}

if ($verbose) {
 my @chars=sort keys %chars;
 print "Overall Charset count is : ",$#chars+1,"\n";
 print "Overall Charset is : \n";
 print join('',sort keys %chars),"\n";
 
 print "\nOverall Distribution is :\n";

 foreach my $char (sort keys %chars) {
   print "$char : ",$chars{$char},"\n";
 }
}


my @charset=();

if ($verbose) { print "\n\nPositional distribution is as follows:\n\n\n"; }

for (my $i=0; $i<=$#charpos; $i++) {
 my $chars=$charpos[$i];
 my @chars=sort keys %$chars;
 $charset[$i]=join('',sort keys %$chars);

 if ($verbose) {
   print "Position $i Charset count is : ",$#chars+1,"\n";
   print "Charset is : \n";
   print $charset[$i],"\n";
 
   print "\nDistribution is :\n";
 
   foreach my $char (sort keys %$chars) {
     print "$char : ",$chars->{$char},"\n";
   }
   print "\n\n\n\n";
 }

}

my $prev=Math::BigInt->new("0");

foreach my $cookie (@cookies) {
 my $total=Math::BigInt->new("0");

 # Mixed-radix conversion: scale the running total by the size of the
 # charset at the current position, then add that character's index.
 for ( my $p=0; $p < length($cookie); $p++) {
   my ($value,$base)=charval(substr($cookie,$p,1),$charset[$p]);
   $total=$total*$base+$value;
 }
 print $cookie," : ",$total," : ",($total-$prev),"\n";
 $prev=$total;
}

exit;


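# Returns the index of $char within $charset (its digit value) and the
# length of $charset (the base for that character position).
sub charval {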
 my $char=shift;
 my $charset=shift;

 return (index($charset,$char),length($charset));
}

-----Original Message-----
From: Nick Jacobsen [mailto:nick () ethicsdesign com] 
Sent: 09 December 2002 10:52 AM
To: webappsec () securityfocus com
Subject: Sequence Identification Routines?


I was hoping one of you might have some input here...  I am black-box
testing a web app that generates a 5-character verification string
(lowercase letters and numbers only), which it then emails to the
address on file; the recipient has to type it in to continue with his
registration.  Now, I am looking for some sort of programming routines,
snippets, or programs that will look at a set of, say, 1000 such values
and tell me if there is any sensible pattern from which to predict the
next 5-character string in the sequence.  Any suggestions welcome!

Thanks,
Nick Jacobsen
Ethics Design
nick () ethicsdesign com




