Suffix Sorting Algorithm Used in Lingo

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Suffix Sorting Algorithm Used in Lingo

zivelian
Hello,

I have a question, what suffix sorting algorithm used in Lingo? Is it manber-myers algorithm or sadakane algorithm? Which one?


Thanks.
Reply | Threaded
Open this post in threaded view
|

Re: Suffix Sorting Algorithm Used in Lingo

Stanislaw Osinski
Administrator
I have a question, what suffix sorting algorithm used in Lingo? Is it
manber-myers algorithm or sadakane algorithm? Which one?

Hi,

Actually, it's neither of these -- we simply use a QuickSort with a comparator that compares string suffixes, see:

http://fisheye3.atlassian.com/browse/carrot2/trunk/core/carrot2-util-text/src/org/carrot2/text/preprocessing/SuffixSorter.java?r=trunk

Cheers,

S.

------------------------------------------------------------------------------
Stay on top of everything new and different, both inside and
around Java (TM) technology - register by April 22, and save
$200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco.
300 plus technical and hands-on sessions. Register today.
Use priority code J9JMT32. http://p.sf.net/sfu/p
_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers
Reply | Threaded
Open this post in threaded view
|

Re: Suffix Sorting Algorithm Used in Lingo

Dawid Weiss-2

>> manber-myers algorithm or sadakane algorithm? Which one?

These algorithms have better computational characteristics for texts with longer
average LCP. Simple naive sorting routines have a decent performance for
randomized input (or text-like input). That said, you may want to check
jsuffixarrays.org -- we are working on implementing some efficient suffix array
routines there.

Dawid

------------------------------------------------------------------------------
Stay on top of everything new and different, both inside and
around Java (TM) technology - register by April 22, and save
$200 on the JavaOne (SM) conference, June 2-5, 2009, San Francisco.
300 plus technical and hands-on sessions. Register today.
Use priority code J9JMT32. http://p.sf.net/sfu/p
_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers