Cluster process progress monitor?

Cluster process progress monitor?

Bogdan94202
Hi,

I am extracting clusters from a relatively large body of data: I have indexed 20K+ documents in Solr.
When I run cluster analysis over the complete data set, it completes successfully, but it takes quite some time (several minutes).
Would it be possible to implement a progress monitor in the Lingo algorithm, or in the overall cluster analysis processing, so that there is a way to tell how far along the process is? It would be really helpful when analyzing large amounts of data.

Best regards,
Bogdan

Re: Cluster process progress monitor?

Dawid Weiss-2
Carrot2 algorithms are not designed for processing large amounts of
data: we target small and medium sets of documents and specifically
require that the algorithms run fast (in nearly real time). If you
look at the Mahout project, it contains clustering algorithms whose
progress you would be able to track (for example, by measuring how
much input has been read or output written). Progress reports from
within Lingo would be quite difficult anyway, because the SVD routine
(which probably takes most of the time for large document sets) is
not implemented in our code.
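
To illustrate what tracking progress through input could look like, here
is a minimal sketch in Python. It is not Carrot2 or Mahout API: the names
track_input and print_progress are hypothetical, and it assumes the total
number of documents is known up front (for example, from the index). It
only reports how many documents the consumer has pulled so far, so it
works for algorithms that read their input incrementally, not for a single
opaque call such as an SVD.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def track_input(
    items: Iterable[T],
    total: int,
    report: Callable[[int, int], None],
    every: int = 1000,
) -> Iterator[T]:
    """Yield items unchanged and call report(done, total) periodically.

    Hypothetical helper: `done` counts items the consumer has finished
    with, because the code after `yield` runs only when the consumer
    asks for the next item (or exhausts the iterator).
    """
    done = 0
    for item in items:
        yield item
        done += 1
        # Report every `every` items, and once more at the very end.
        if done % every == 0 or done == total:
            report(done, total)


def print_progress(done: int, total: int) -> None:
    """Example reporter: print an integer percentage of documents read."""
    print(f"read {done}/{total} documents ({100 * done // total}%)")


if __name__ == "__main__":
    # Stand-in for documents fetched from an index.
    documents = (f"doc-{i}" for i in range(20_000))
    for _ in track_input(documents, 20_000, print_progress, every=5_000):
        pass  # an incremental clustering step would consume each document here
```

The wrapper leaves the data untouched, so it can sit between the document
source and whatever consumes it without changing either side.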

Dawid

On Thu, Dec 24, 2009 at 4:59 PM, Bogdan94202 <[hidden email]> wrote:

>
> Hi,
>
> I am extracting clusters from a relatively large body of data: I have
> indexed 20K+ documents in Solr.
> When I run cluster analysis over the complete data set, it completes
> successfully, but it takes quite some time (several minutes).
> Would it be possible to implement a progress monitor in the Lingo
> algorithm, or in the overall cluster analysis processing, so that there
> is a way to tell how far along the process is? It would be really
> helpful when analyzing large amounts of data.
>
> Best regards,
> Bogdan
> --
> View this message in context: http://n2.nabble.com/Cluster-process-progress-monitor-tp4213646p4213646.html
> Sent from the Carrot2 Users and Developers Forum mailing list archive at Nabble.com.
>
> _______________________________________________
> Carrot2-developers mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/carrot2-developers
>
