The topic of document clustering quality measures is very broad. Some
notion of "precision" and "recall" is typically used if you have some
ground truth (reference "ideal" clustering). Whether this is a good
measure of quality is debatable -- see this paper for some discussion:
As for calculating these metrics with Carrot2, see the
carrot2-output-metrics subproject, which contains several measures of
quality (including precision/recall). As the tests of that project
show, you pass the reference partitions for each document via
its attributes (PARTITIONS) and then invoke a metric of your choice.
The unit tests will guide you as to how this can be done.
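
To make the "precision" and "recall" idea concrete, here is a minimal, library-independent sketch in Python. It is not Carrot2's implementation and does not use its API. The function name, the document ids and the partition labels are all made up for illustration, and scoring a cluster against its dominant reference partition is just one common convention, assumed here.

```python
from collections import Counter


def cluster_precision_recall(cluster_docs, partition_of):
    """Score one cluster against its dominant reference partition.

    cluster_docs: ids of the documents placed in the cluster (hypothetical).
    partition_of: dict mapping every document id to its ground-truth partition.
    """
    # Count how many documents of each reference partition the cluster holds.
    counts = Counter(partition_of[doc] for doc in cluster_docs)
    dominant, hits = counts.most_common(1)[0]

    # Size of the dominant partition across the whole collection.
    partition_size = sum(1 for p in partition_of.values() if p == dominant)

    precision = hits / len(cluster_docs)  # share of the cluster that is "on topic"
    recall = hits / partition_size        # share of the partition that was captured
    return dominant, precision, recall


# Illustrative ground truth: three "sports" documents, two "politics" documents.
partition_of = {
    "d1": "sports", "d2": "sports", "d3": "sports",
    "d4": "politics", "d5": "politics",
}
cluster = ["d1", "d2", "d4"]

dominant, precision, recall = cluster_precision_recall(cluster, partition_of)
print(dominant, precision, recall)
```

In this example the cluster holds two of the three "sports" documents plus one stray "politics" document, so both precision and recall come out at 2/3. Whether such per-cluster scores reflect clustering quality well is the debatable part mentioned above.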