Nmap Development mailing list archives

Re: Replacing passwords.lst


From: David Fifield <david () bamsoftware com>
Date: Tue, 16 Mar 2010 18:58:02 -0600

On Wed, Mar 17, 2010 at 12:48:33AM +0000, Brandon Enright wrote:
> > The sizes were not as bad as I thought at first. After stripping extra
> > spaces, we are left with
> >
> > -rw-r--r--  1 david users  88K 2010-03-16 17:13 faithwriters.lst
> > -rw-r--r--  1 david users 103K 2010-03-16 17:14 hotmail.lst
> > -rw-r--r--  1 david users 421K 2010-03-16 17:07 myspace.lst
> > -rw-r--r--  1 david users 1.9M 2010-03-16 17:18 phpbb.lst
> > -rw-r--r--  1 david users  58M 2010-03-16 17:24 rockyou.lst.bz2
> >
> > I wrote a simple program to sum the counts from several password files
> > and output the top n passwords. Using the five lists above, I
> > regenerated our nselib/data/passwords.lst. The program automatically
> > does bz2 decompression based on filename so keeping compressed lists
> > isn't inconvenient.
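
The program itself is not included in this message. As a rough sketch of
the idea only, and not the actual tool, something like the following
would sum counts across lists and pick the top n. It assumes each line
has the form "COUNT PASSWORD" (the message only says the lists carry
counts), and the function names are hypothetical.

```python
import bz2
from collections import Counter


def open_list(path):
    """Open a password list as text, decompressing if the name ends in .bz2."""
    # latin-1 maps every byte to a character, so odd bytes never raise.
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")


def read_counts(path):
    """Yield (password, count) from lines of the ASSUMED form 'COUNT PASSWORD'."""
    with open_list(path) as f:
        for line in f:
            # Drop the line ending and any padding before the count.
            parts = line.rstrip("\r\n").lstrip().split(" ", 1)
            if len(parts) != 2 or not parts[0].isdecimal():
                continue  # skip blank or malformed lines
            yield parts[1], int(parts[0])


def top_passwords(paths, n):
    """Sum raw counts over all lists and return the n most common passwords."""
    totals = Counter()
    for path in paths:
        for password, count in read_counts(path):
            totals[password] += count
    return totals.most_common(n)
```

Calling top_passwords(["phpbb.lst", "rockyou.lst.bz2"], 5000) would then
return (password, total count) pairs in descending order; the 5000 here
is an arbitrary example value.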

> Cool, it's good to handle the bz2 compression transparently.  I think
> we can't just sum the lists though without normalizing them to a
> degree.  Otherwise rockyou is weighted too strongly.
>
> Ron and I chatted off-list about this a bit.  A simple linear weight
> probably isn't the right choice because things that are only duplicated
> a few times in phpbb or myspace would get scaled up too much.

I don't understand. All of Ron's lists have counts, not just ranks. So
if a myspace password has a count of 1 or 2, it will still have a count
of 1 or 2 in the master list and end up way at the bottom.

To me, each password list is like a sample from a giant population.
That's not totally accurate because different sites have different
password policies, but the size of each sample shouldn't matter, right?
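
To make the two positions concrete, here is a minimal sketch comparing
pooled raw counts with a simple per-list normalization. The function
names and the toy numbers in the usage note are invented for
illustration and are not taken from the real lists.

```python
from collections import Counter


def pooled(lists):
    """Sum raw counts, treating all lists as one big pooled sample."""
    totals = Counter()
    for counts in lists.values():
        totals.update(counts)  # Counter.update adds counts from a mapping
    return totals


def normalized(lists):
    """Sum per-list relative frequencies, so each list has total weight 1."""
    totals = Counter()
    for counts in lists.values():
        size = sum(counts.values())
        if size == 0:
            continue  # an empty list contributes nothing
        for password, count in counts.items():
            totals[password] += count / size
    return totals


# Invented toy data: one large list and one small list.
toy = {
    "big": {"123456": 9000, "iloveyou": 1000},
    "small": {"phpbb": 8, "123456": 2},
}
```

With this toy data, pooled(toy) ranks "iloveyou" (1000) well above
"phpbb" (8), while normalized(toy) ranks "phpbb" above "iloveyou",
because the small list's few entries are scaled up to the same total
weight as the large one.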

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

