Clustering Text Files

Clustering Text Files

praveenr

> I am using the Indri index ( http://www.lemurproject.org/ ) and I
> need to do topic-based clustering to compare results. I am trying to
> cluster around 130 documents; I have stored the contents of the
> documents in separate text files. I was not able to find any
> documentation for clustering text files. Can anyone help me with
> this?
> praveen

------------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It is the best place to buy or sell services for
just about anything Open Source.
http://p.sf.net/sfu/Xq1LFB
_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers

Re: Clustering Text Files

praveenr
I also forgot to mention that I am trying to perform the clustering on a Linux machine and the workbench doesn't work. It says some of the library files are missing.

Re: Clustering Text Files

Stanislaw Osinski
Administrator
In reply to this post by praveenr
Hi Praveen,

The easiest way to cluster documents from plain text files would be this:

1. Make sure you are able to compile & run Carrot2 API examples: http://download.carrot2.org/head/manual/#section.integration.compiling-java-program-with-carrot2

2. Modify the org.carrot2.examples.clustering.ClusteringDocumentList example so that it reads documents from your text files.
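
As a rough illustration of step 2, the sketch below loads every .txt file from a directory into simple title/text pairs using only the Java standard library. TextFileLoader and TextDoc are hypothetical names, not Carrot2 classes, and the files are assumed to be UTF-8 encoded. How the pairs are converted into the document objects used by ClusteringDocumentList depends on the Carrot2 version and is not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: loads every *.txt file in a directory as a (title, text) pair. */
final class TextFileLoader {

    /** Minimal stand-in for a clustering input document; this is NOT a Carrot2 class. */
    static final class TextDoc {
        final String title;
        final String text;

        TextDoc(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /** Reads all regular files ending in ".txt" directly under dir, sorted by path. */
    static List<TextDoc> load(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .sorted()
                    .collect(Collectors.toList());
        }

        List<TextDoc> docs = new ArrayList<>();
        for (Path file : files) {
            // Assumption: the files are UTF-8 encoded.
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // The file name serves as the title; the whole file content is the text.
            docs.add(new TextDoc(file.getFileName().toString(), text));
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<TextDoc> docs = load(dir);
        System.out.println("Loaded " + docs.size() + " documents from " + dir);
        // Each TextDoc would then be converted into whatever document type the
        // ClusteringDocumentList example hands to the clustering algorithm.
    }
}
```

After compiling, it can be run with the directory holding the text files as its only argument, for example `java TextFileLoader /path/to/docs`.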

Cheers,

S.

On Sun, Jan 11, 2009 at 01:17, Praveen Chandar <[hidden email]> wrote:

> I am using the Indri index ( http://www.lemurproject.org/ ) and I
> need to do topic-based clustering to compare results. I am trying to
> cluster around 130 documents; I have stored the contents of the
> documents in separate text files. I was not able to find any
> documentation for clustering text files. Can anyone help me with
> this?
> praveen


Re: Clustering Text Files

Stanislaw Osinski
Administrator
In reply to this post by praveenr


On Sun, Jan 11, 2009 at 01:32, praveenr <[hidden email]> wrote:

> I also forgot to mention that I am trying to perform the clustering on a
> Linux machine and the workbench doesn't work. It says some of the library
> files are missing.

Can you provide a detailed stack trace? Here's how you can get it:

http://download.carrot2.org/head/manual/#section.troubleshooting.workbench.stacktrace

Cheers,

S.


Re: Clustering Text Files

praveenr
In reply to this post by praveenr

Now this is what I get when I run ./carrot2-workbench -vmargs -Xmx256m:

JVM terminated. Exit code=13
/usr/bin/java
-Xmx256m
-jar /home/ravichan/workbench/plugins/org.eclipse.equinox.launcher_1.0.1.R33x_v20080118.jar
-os linux
-ws gtk
-arch x86
-showsplash
-launcher /home/ravichan/workbench/carrot2-workbench
-name Carrot2-workbench
--launcher.library /home/ravichan/workbench/plugins/org.eclipse.equinox.launcher.gtk.linux.x86_1.0.3.R33x_v20080118/eclipse_1023.so
-startup /home/ravichan/workbench/plugins/org.eclipse.equinox.launcher_1.0.1.R33x_v20080118.jar
-exitdata 2e0023
-vm /usr/bin/java
-vmargs
-Xmx256m
-jar /home/ravichan/workbench/plugins/org.eclipse.equinox.launcher_1.0.1.R33x_v20080118.jar

Re: Clustering Text Files

Dawid Weiss-2

There should be a log file somewhere under the workspace location. I'm guessing
you have a 64-bit operating system and the build for Linux is 32-bit. You'll
need a 32-bit JVM to run it. I'll see what we can do to provide 64-bit versions.
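
One way to check this guess is to ask the JVM itself. The sketch below prints the architecture that the running JVM reports. JvmArchCheck is a hypothetical name, the "contains 64" test only covers common values such as amd64, x86_64 and aarch64, and sun.arch.data.model is a HotSpot-specific property that may be missing on other JVMs. The launcher output posted earlier in the thread shows -arch x86, so a JVM that reports a 64-bit architecture would be consistent with a mismatch.

```java
import java.util.Locale;

/** Hypothetical diagnostic: reports whether the running JVM looks 32-bit or 64-bit. */
final class JvmArchCheck {

    /**
     * Returns true when an os.arch value names a 64-bit architecture.
     * Heuristic only: it handles common values (amd64, x86_64, aarch64)
     * and treats values without "64" (x86, i386) as 32-bit.
     */
    static boolean is64Bit(String osArch) {
        if (osArch == null) {
            return false;
        }
        return osArch.toLowerCase(Locale.ROOT).contains("64");
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        // HotSpot-specific property; falls back to "unknown" on JVMs that lack it.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");

        System.out.println("os.arch             = " + arch);
        System.out.println("sun.arch.data.model = " + dataModel);
        System.out.println(is64Bit(arch)
                ? "This looks like a 64-bit JVM."
                : "This looks like a 32-bit JVM.");
    }
}
```

Run it with the same /usr/bin/java that the launcher uses so that the result describes the JVM the Workbench actually starts.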

Dawid


Re: Clustering Text Files

Dawid Weiss-2
In reply to this post by praveenr

Please try this -- there is an x86_64 (64-bit) version of the Workbench built there:

http://download.carrot2.org/stable/

Dawid


Re: Clustering Text Files

praveenr
Thank you, Dawid. The application is running now. I have another issue: it says the security component could not be initialized. I believe I don't have write permission to create the application's profile directory. Is there a way to change where the application stores profile information? Or could you tell me where the application stores the profile information by default?
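
To test the write-permission theory, a small check like the following may help. It reports whether a directory exists and is writable by the current user. DirWritableCheck is a hypothetical name, and defaulting to user.home is only an assumption: where the Workbench actually keeps its profile data is not stated in this thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical diagnostic: checks whether the current user can write to a directory. */
final class DirWritableCheck {

    /** Returns true only if dir exists, is a directory, and is writable. */
    static boolean isWritableDir(Path dir) {
        return Files.isDirectory(dir) && Files.isWritable(dir);
    }

    public static void main(String[] args) {
        // Assumption: the home directory is a reasonable first place to look;
        // pass a different directory as the first argument to check it instead.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("user.home"));
        System.out.println(dir + (isWritableDir(dir) ? " is writable" : " is NOT writable"));
    }
}
```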



Re: Clustering Text Files

Dawid Weiss-2

Uhm... Can you give me a stack trace again? There should be a ".log" file under
your "workspace" directory. I really don't see why it should fail to initialize
-- on my OpenSuSE (x86_64) it works just fine.
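
If the log is hard to spot, a sketch like this one lists every file named .log below a given directory. WorkspaceLogFinder is a hypothetical name, and the directory to search (the Workbench's workspace folder) has to be supplied, since its exact location is not given here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: finds files named ".log" anywhere under a directory. */
final class WorkspaceLogFinder {

    /** Walks root recursively and returns all regular files whose name is exactly ".log". */
    static List<Path> findLogs(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals(".log"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The first argument should be the workspace directory; "." is only a fallback.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path log : findLogs(root)) {
            System.out.println(log.toAbsolutePath());
        }
    }
}
```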

Dawid


Re: Clustering Text Files

praveenr
The .log file is available at http://pastebin.com/m17b3e94d

Re: Clustering Text Files

Dawid Weiss-2

This looks like something related to Eclipse (SWT)... I'll take a look, but it
doesn't seem to be something easy to solve.

D.



Re: Clustering Text Files

Gaurang Patel
In reply to this post by Stanislaw Osinski
Hey Praveen,

Were you able to modify the API example so that it reads documents from text files (the second step in Stanislaw's message)?

Let me know if so; it would save me a lot of time.

Gaurang
