Carrot2 DCS - Solr

FAVORY , XAVIER
Hi,

I am trying to set up Carrot2 to add its functionality to a Django web application.
I think the Document Clustering Server (DCS) is the most appropriate option in my case, as I already have a system working with Django (linked to Solr and other components).
Launching the server works, and I tested the interface from the browser. However, when I run a query with Solr as the "Document source", I get this error:

 HTTP ERROR 500

Problem accessing /dcs/rest. Reason:

Could not perform processing: javax.xml.transform.TransformerException: javax.xml.transform.TransformerException: com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: The declaration for the entity "ContentType" must end with '>'.

I am running Solr on: http://localhost:8080
Carrot2 is running on: http://localhost:9090

In the "war/carrot2-dcs.war/WEB-INF/suites/source-solr-attributes.xml" file, I have this:

<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>

      <label>overridden-attributes</label>

      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://localhost:8080/#/fs2"/>
      </attribute>

      <attribute key="SolrDocumentSource.solrSummaryFieldName">
        <value type="java.lang.String" value="content"/>
      </attribute>

      <attribute key="SolrDocumentSource.solrTitleFieldName">
        <value type="java.lang.String" value="title"/>
      </attribute>

    </value-set>
  </attribute-set>
</attribute-sets>

http://localhost:8080/#/fs2 is the URL I use to access the Solr core interface from the browser.
It is not clear to me what the problem is. Do I have to specify anything else in Carrot2?
Or do I need to do something with Solr? (I have not changed anything in the Solr configuration; maybe that is the problem?)
I do not find the DCS documentation very clear, but I am quite new to this kind of technology, so apologies if my question is stupid...



Thank you,
Xavier Favory

------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers

Re: Carrot2 DCS - Solr

Dawid Weiss
Your URL to Solr needs to be pointing at the search handler (API
endpoint). Typically it'd be:

http://localhost:8983/solr/select

What you currently have is not right: it is the URL of the browser
interface, not of the search API. The error you're getting basically
means the returned content is not valid XML.
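
As a rough sketch of that change applied to the configuration from the first message: only the serviceUrlBase value differs, and the other attributes stay as they were. The host and port come from the original post. The /solr/fs2/select path (with fs2 taken to be the core name) is an assumption that depends on how Solr is deployed, so check it against the URL that returns raw XML search results in your browser.

```xml
<!-- Sketch only: the /solr/fs2/select path is an assumption; adjust it to your deployment. -->
<attribute-sets default="overridden-attributes">
  <attribute-set id="overridden-attributes">
    <value-set>
      <label>overridden-attributes</label>

      <!-- Point at the search handler (API endpoint), not the browser interface URL. -->
      <attribute key="SolrDocumentSource.serviceUrlBase">
        <value type="java.lang.String" value="http://localhost:8080/solr/fs2/select"/>
      </attribute>
    </value-set>
  </attribute-set>
</attribute-sets>
```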

Also, you can install clustering on Solr side so that it'll just
return clustered result. See here:

https://cwiki.apache.org/confluence/display/solr/Result+Clustering

Dawid

Re: Carrot2 DCS - Solr

FAVORY , XAVIER
Thank you for your quick answer.
It is working now.

However, it seems that Carrot2 has no information about the fields of my documents (and therefore does not produce any useful clustering). Do I have to specify it in the source-solr-attributes.xml file?


I might consider installing Carrot2 on the Solr side, as it seems to be well integrated.


Thanks again,
Xavier

Re: Carrot2 DCS - Solr

FAVORY , XAVIER
I think I got it.
Thank you!

Re: Carrot2 DCS - Solr

Dawid Weiss
In reply to this post by FAVORY , XAVIER
> Do I have to specify it in the source-solr-attributes.xml file ?

Yes. There are actually commented out sections in that file that will
guide you. There is also the documentation of the document source,
here:

http://download.carrot2.org/head/manual/#section.component.solr
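
For illustration, a field mapping inside the value-set of source-solr-attributes.xml might look like the fragment below. The title and summary keys are the ones already used in the first message of this thread. The solrUrlFieldName key is given from memory of the Solr document source attributes and should be verified against the manual section linked above. The field names title, content, and url are placeholders and need to match fields that are actually stored in your Solr schema.

```xml
<!-- Sketch only: field names are placeholders; they must match stored fields in the Solr schema. -->
<value-set>
  <label>overridden-attributes</label>

  <!-- Field used as the document title during clustering. -->
  <attribute key="SolrDocumentSource.solrTitleFieldName">
    <value type="java.lang.String" value="title"/>
  </attribute>

  <!-- Field used as the document summary (the main text that gets clustered). -->
  <attribute key="SolrDocumentSource.solrSummaryFieldName">
    <value type="java.lang.String" value="content"/>
  </attribute>

  <!-- Assumed key: field holding the document URL; verify the name in the manual. -->
  <attribute key="SolrDocumentSource.solrUrlFieldName">
    <value type="java.lang.String" value="url"/>
  </attribute>
</value-set>
```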

D.
