How to intercept documents returned from Lucene before clustering?

Sohe
Hi,

I need to preprocess the documents returned from Lucene before they are clustered in Carrot2. Could you please suggest where I should start in the Carrot2 source code (which classes and methods I need to work with) in order to get the documents from Lucene, process them with my application, and then forward them to Carrot2 for clustering? Thank you so much in advance.

Sincerely,
Pongdej

Re: How to intercept documents returned from Lucene before clustering?

Stanislaw Osinski
Administrator
Hi Pongdej,

> I need to preprocess the documents returned from Lucene before they are
> clustered in Carrot2. Could you please suggest where I should start in the
> Carrot2 source code (which classes and methods I need to work with) in
> order to get the documents from Lucene, process them with my application,
> and then forward them to Carrot2 for clustering? Thank you so much in
> advance.

This happens in the SimpleFieldMapper class:

http://fisheye3.atlassian.com/browse/carrot2/trunk/core/carrot2-source-lucene/src/org/carrot2/source/lucene/SimpleFieldMapper.java?r=3963#l181

You may want to modify the map() method of SimpleFieldMapper or create your own implementation of IFieldMapper and pass it to LuceneDocumentSource.
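
As a rough illustration of the second option (wrapping the default mapping and preprocessing its output), here is a minimal, self-contained sketch. Note that `FieldMapper`, `DefaultMapper` and `PreprocessingMapper` are hypothetical stand-ins that use plain maps so the example runs without Carrot2 or Lucene on the classpath. The real IFieldMapper works with Lucene and Carrot2 document types, and its exact signature is not shown in this thread, so treat this as the shape of the idea rather than the actual API.

```java
import java.util.HashMap;
import java.util.Map;

public final class PreprocessingMapperSketch {

    /** Hypothetical stand-in for IFieldMapper: copies fields of one Lucene hit into a target document. */
    interface FieldMapper {
        void map(Map<String, String> luceneFields, Map<String, String> target);
    }

    /** Hypothetical stand-in for the default mapping (the role SimpleFieldMapper plays). */
    static final class DefaultMapper implements FieldMapper {
        @Override
        public void map(Map<String, String> luceneFields, Map<String, String> target) {
            // Assumed field names, chosen only for this example.
            target.put("title", luceneFields.get("title"));
            target.put("snippet", luceneFields.get("content"));
        }
    }

    /** Runs the wrapped mapper first, then applies custom preprocessing to every mapped value. */
    static final class PreprocessingMapper implements FieldMapper {
        private final FieldMapper delegate;

        PreprocessingMapper(FieldMapper delegate) {
            this.delegate = delegate;
        }

        @Override
        public void map(Map<String, String> luceneFields, Map<String, String> target) {
            delegate.map(luceneFields, target);
            for (Map.Entry<String, String> entry : target.entrySet()) {
                if (entry.getValue() != null) {
                    entry.setValue(preprocess(entry.getValue()));
                }
            }
        }

        /** Placeholder preprocessing: collapse runs of whitespace and trim. Replace with your own logic. */
        static String preprocess(String text) {
            return text.replaceAll("\\s+", " ").trim();
        }
    }

    public static void main(String[] args) {
        Map<String, String> hit = new HashMap<>();
        hit.put("title", "  Lucene   hit \n title ");
        hit.put("content", "Some\t\tbody   text");

        Map<String, String> document = new HashMap<>();
        new PreprocessingMapper(new DefaultMapper()).map(hit, document);

        System.out.println(document.get("title"));
        System.out.println(document.get("snippet"));
    }
}
```

With the real classes, the same pattern would presumably mean implementing IFieldMapper, delegating to a SimpleFieldMapper instance inside map(), and then adjusting the resulting document before it reaches the clustering algorithm.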

Cheers,

S.

