I modified carrot-examples.ClusteringDataFromDocumentSources to make it work on Google Search results, using GoogleDocumentSource.class as the document source.
I have two questions:
1) Every time I run it I get 15, 16, or 17 clusters, no matter what the query term is. Does this depend on some parameter? How can I make the clustering engine return a varying number of clusters depending on the input documents? Does the number of clusters depend only on the number of input documents?
2) How can I find out how many documents (snippets) are considered in a clustering run? What does the following line set? attributes.put(AttributeNames.RESULTS, 1000);