Dailydave mailing list archives

Re: approximate string matching


From: "Mateusz Berezecki" <mateuszb () gmail com>
Date: Fri, 1 Sep 2006 13:48:50 +0200

Hello Arun,

On 9/1/06, Arun Koshy <arunkoshy () gmail com> wrote:
> On 9/1/06, Mateusz Berezecki <mateuszb () gmail com> wrote:
>> Is anyone aware of a good implementation of any of these algorithms
>> in C or perhaps some opensource C library for that purpose?
>> Do you have any recommendations?
>
> Check:
>
> http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#jaccard
>
> The above links into a sourceforge project that has an implementation
>
> http://sourceforge.net/projects/simmetrics/
>
> Hope that helps


Well, sort of :-) I did check the simmetrics project, but it's in C#,
and reimplementing the interfaces and all the required tokenizer libraries
is too much effort for now.

I want something fast yet simple, like the Bitap algorithm
(http://en.wikipedia.org/wiki/Bitap_algorithm), whose fuzzy variant
matches a pattern within a given Levenshtein distance.
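
For reference, here is a minimal sketch of what a fuzzy Bitap search
could look like in C. It is not taken from simmetrics or from any
existing library. The function name bitap_fuzzy_search, the
BITAP_MAX_ERRORS limit and the return conventions are assumptions made
for illustration. It uses the shift-and formulation with one 64-bit
state word per error level, so the pattern is limited to 64 characters,
and it allows substitutions, insertions and deletions (Levenshtein
distance).

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit on the number of allowed errors. */
#define BITAP_MAX_ERRORS 8

/*
 * Search text for a substring within Levenshtein distance k of pattern.
 * Returns the index of the last text character of the first match,
 * -1 if there is no match, or -2 for unsupported arguments
 * (empty pattern, pattern longer than 64 characters, k too large,
 * or k >= pattern length, where even the empty string would match).
 */
long bitap_fuzzy_search(const char *text, const char *pattern, unsigned k)
{
    uint64_t mask[UCHAR_MAX + 1] = {0};
    uint64_t R[BITAP_MAX_ERRORS + 1];
    uint64_t accept;
    size_t m = strlen(pattern);
    size_t i;
    unsigned d;

    if (m == 0 || m > 64 || k > BITAP_MAX_ERRORS || k >= m)
        return -2;

    /* Bit j of mask[c] is set when pattern[j] == c. */
    for (i = 0; i < m; i++)
        mask[(unsigned char)pattern[i]] |= (uint64_t)1 << i;

    /* Bit j of R[d] means: pattern[0..j] matches a substring ending at
     * the current text position with at most d errors. Before any text
     * is read, the first d pattern characters can be skipped as errors. */
    for (d = 0; d <= k; d++)
        R[d] = ((uint64_t)1 << d) - 1;

    accept = (uint64_t)1 << (m - 1);

    for (i = 0; text[i] != '\0'; i++) {
        uint64_t cm = mask[(unsigned char)text[i]];
        uint64_t prev_old = R[0];          /* R[d-1] before this step */

        R[0] = ((R[0] << 1) | 1) & cm;     /* exact matching only */

        for (d = 1; d <= k; d++) {
            uint64_t cur_old = R[d];
            R[d] = (((cur_old << 1) | 1) & cm) /* characters match */
                 | prev_old                    /* extra text character */
                 | (prev_old << 1)             /* substituted character */
                 | (R[d - 1] << 1)             /* missing pattern character */
                 | 1;                          /* first char may be an error */
            prev_old = cur_old;
        }

        if (R[k] & accept)
            return (long)i;
    }
    return -1;
}
```

With this sketch, bitap_fuzzy_search("hello world", "wxrld", 1) reports
a match ending at index 10, while the same call with k = 0 finds
nothing. It only reports where a match ends. Recovering the start of the
match or the exact distance would need extra work.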

Thank you for the quick reply and for the reminder about simmetrics. If there
is no other alternative, I'll try porting it to C and post the link to the list,
so that it's available to anyone else who needs it.


thanks,
Mateusz Berezecki
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave

