Lingo phrase matrix

Lingo phrase matrix

Pen Skol
Hi there,

I am not able to understand how the phrase matrix P in Lingo is calculated. According to the documents describing the matrix construction, should the DF of a term be taken from the whole data set or only from the pseudo-documents?

It is not clear how the second column in the following P matrix is calculated:

[Image: phrase matrix P]

I understand that after obtaining the frequencies of terms and phrases for the P matrix, tf-idf weighting and column length normalization are applied, but even that does not give me the same figures as in this matrix. I would appreciate your help.

Best,

Skold

_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers

Fwd: Lingo phrase matrix

Stanislaw Osinski
Administrator
Hi Skold,

The description in the thesis could indeed have been better. Here are the missing bits:

* for the P matrix calculation, the TF is the frequency of each term within the specific phrase, which means it will be 1 most of the time (unless some word appears more than once in the phrase)
* the IDF factor is computed from the input documents only (the phrase in question is not counted as a document)
* the logarithm in IDF is base 2
* each column is normalized to unit Euclidean length (so that computing cosine similarity is then a simple matrix multiplication)
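
For anyone who wants to check the four points above numerically, here is a minimal Python sketch of how one column of P could be computed. It is not the Carrot2 code: the function name `phrase_column`, the vocabulary, and the document frequencies (7 input documents, "information" in 3 of them, "retrieval" in 2) are illustrative assumptions, not figures quoted in this thread.

```python
import math


def phrase_column(phrase_terms, vocabulary, doc_freq, num_docs):
    """Return one unit-length tf-idf column of P for a phrase (hypothetical helper)."""
    weights = []
    for term in vocabulary:
        tf = phrase_terms.count(term)  # TF is counted inside the phrase itself
        idf = math.log2(num_docs / doc_freq[term])  # base-2 IDF, input documents only
        weights.append(tf * idf)
    norm = math.sqrt(sum(w * w for w in weights))  # Euclidean length of the column
    return [w / norm for w in weights] if norm > 0 else weights


# Illustrative counts only (assumed, not taken from the thread).
vocabulary = ["information", "singular", "value", "computations", "retrieval"]
doc_freq = {"information": 3, "singular": 3, "value": 3,
            "computations": 2, "retrieval": 2}
num_docs = 7

col = phrase_column(["information", "retrieval"], vocabulary, doc_freq, num_docs)
print([round(x, 2) for x in col])  # [0.56, 0.0, 0.0, 0.0, 0.83]

# With unit-length columns, the cosine similarity of two columns is their dot product.
single = phrase_column(["information"], vocabulary, doc_freq, num_docs)
cosine = sum(a * b for a, b in zip(col, single))
```

Because every column has length 1, stacking the columns into a matrix and multiplying gives all pairwise cosine similarities in one step, which is the point of the last bullet.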

If you apply all the above, you should get the results shown in the thesis. I recreated the complete calculation for the second column ("Information Retrieval" phrase) in this spreadsheet: https://docs.google.com/spreadsheets/d/1vcmzXlhi-71ivptNzlGx3vlrCHOfsGJgWuU1VuMsW_g/edit#gid=0.

Finally, the implementation of Lingo you'll find in current Carrot2 differs a bit from the one described in our papers. The changes are briefly mentioned in this paper: http://www.ijcte.org/papers/842-IT022.pdf.

Thanks,

Stanislaw




Re: Fwd: Lingo phrase matrix

Pen Skol
Hi Stanislaw,

Thank you for your answer. I will go through these.

Cheers,


