Probable bug in UniqueLabelAssigner

seyfullahd
Hello again :)

I think I have found a bug in UniqueLabelAssigner. Here it is.

As far as I can see, the phraseCos matrix is not column-length-normalized.
The assignLabels method is supposed to find as many unique pairs as the desired cluster count, taking the pairs in decreasing order of their value. But I believe this should be done on a column-length-normalized phrase matrix.
Now consider a phraseCos matrix like the one below. The first selected pair should be phraseCos(3,1), but with the current code it will be phraseCos(4,3), which causes the algorithm to select different labels.

         0    1          2    3    4
    0 [  1    0.000001   3    50   2 ]
    1 [  1    0.000002   3    50   2 ]
    2 [  1    0.000001   3    50   2 ]
    3 [  1    50         3    50   2 ]
    4 [  1    0.000001   3    51   2 ]
    5 [  1    0.000001   3    50   2 ]
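
To make the example easy to check, here is a minimal sketch in Python. It is not the Carrot2 code. The helper names (argmax_pair, normalize_columns) are made up for illustration. It also assumes that "column-length-normalized" means scaling each column to unit Euclidean length, and that the first selected pair is simply the largest entry of the matrix.

```python
import math

# Example matrix from this post: rows 0..5, columns 0..4.
phrase_cos = [
    [1, 0.000001, 3, 50, 2],
    [1, 0.000002, 3, 50, 2],
    [1, 0.000001, 3, 50, 2],
    [1, 50,       3, 50, 2],
    [1, 0.000001, 3, 51, 2],
    [1, 0.000001, 3, 50, 2],
]


def argmax_pair(matrix):
    """Return (row, column) of the largest entry (hypothetical helper)."""
    best = max(
        ((value, r, c) for r, row in enumerate(matrix) for c, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


def normalize_columns(matrix):
    """Scale each column to unit Euclidean length (assumed meaning of
    column-length normalization); all-zero columns are left unchanged."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[c] ** 2 for row in matrix)) for c in range(n_cols)]
    return [
        [row[c] / norms[c] if norms[c] else row[c] for c in range(n_cols)]
        for row in matrix
    ]


raw_pick = argmax_pair(phrase_cos)
normalized_pick = argmax_pair(normalize_columns(phrase_cos))
print(raw_pick, normalized_pick)
```

On the raw matrix the largest entry is 51 at (4,3). After normalization, column 1 is dominated by its single large value, so (3,1) becomes roughly 1.0 while (4,3) drops to roughly 0.415, and the first pick changes to (3,1).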

Do you think it is a bug, too, or am I missing something?
Thanks,

Seyfullah