Confused Setting Up External Source?

Max
Hi,

I'm very new to using Carrot2 and would deeply appreciate some help in understanding how to get the document clustering server working correctly using my own external document source.

When I go to localhost:8080 I'm shown this screen.

<http://carrot2-users-and-developers-forum.607571.n2.nabble.com/file/n7577722/c2_1.png>

I want to cluster the information in my own website's RSS feed, so I choose "External document source", and for the second portion, labeled "Document source", I select "XML" from the drop-down.

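As for the query, I don't want to run any kind of query; I just want to cluster my website's feed, so I leave the query field blank. The rest of the settings seem fine, so I leave them as they are.

<http://carrot2-users-and-developers-forum.607571.n2.nabble.com/file/n7577722/c2_2.png>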

The first problem is that the Cluster button is not clickable if I leave the query blank. The second is that if I do enter a query and run the clustering, I get the following 500 error.

<http://carrot2-users-and-developers-forum.607571.n2.nabble.com/file/n7577722/c2_3.png>

I know I have to provide the URL of my XML source, but I'm not sure exactly how to set that up. I've read through the manual and done many searches in the forum, but I can't wrap my mind around how to set this up the right way. I see there are examples provided, but I'm not sure what to do with them.

Would someone help explain, step by step, how to set up the Document Clustering Server if I want to provide an external XML source (my website's feed URL), and how to run it without having to enter a query?

Thank you,
Max


Re: Confused Setting Up External Source?

Stanislaw Osinski
Administrator
Hi Max,

The quick start screen is there just to show a few typical use cases, such as querying an external search engine or feeding documents in directly. For quick experiments with RSS, I recommend the Workbench:


To do the same with the DCS, you'd need to build the request URL yourself, something like:
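A hedged sketch of how such a request URL might be put together in Python. The endpoint path `/dcs/rest` and the parameter names `dcs.source`, `XmlDocumentSource.xml` and `XmlDocumentSource.xslt` are assumptions to be checked against the DCS documentation for the installed version; the feed and stylesheet addresses are placeholders, and `build_dcs_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_dcs_url(base_url, feed_url, xslt_url, query=""):
    """Assemble a DCS request URL for an external XML source.

    All parameter names below are assumptions, not confirmed in this thread.
    """
    params = {
        "dcs.source": "xml",                 # assumed id of the XML document source
        "XmlDocumentSource.xml": feed_url,   # assumed key: where the RSS feed lives
        "XmlDocumentSource.xslt": xslt_url,  # assumed key: RSS-to-Carrot2 stylesheet
    }
    if query:
        # The query is only sent when one is given.
        params["query"] = query
    # urlencode percent-escapes the nested URLs so they survive as parameter values.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_dcs_url(
        "http://localhost:8080/dcs/rest",            # placeholder DCS endpoint
        "http://example.com/feed.rss",               # placeholder feed URL
        "http://localhost:8000/rss-to-carrot2.xsl",  # placeholder stylesheet URL
    ))
```

The important detail is the escaping: the feed and stylesheet addresses are themselves URLs, so they have to be percent-encoded before being embedded in the request. Whether the DCS accepts a request with no `query` at all is not established here.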


Note that the XSLT transforming the RSS to the Carrot2 format needs to be hosted somewhere; in this case I've used a local server.
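If hosting a stylesheet is inconvenient, one alternative (a different technique from the XSLT approach described here) is to convert the feed to Carrot2 XML ahead of time. A minimal Python sketch, assuming a plain RSS 2.0 feed without namespaces, and assuming the Carrot2 input format is a `searchresult` root containing `document` elements with `title`, `url` and `snippet` children; `rss_to_carrot2` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def rss_to_carrot2(rss_text, query=""):
    """Convert RSS 2.0 text into a Carrot2-style <searchresult> document."""
    rss = ET.fromstring(rss_text)
    root = ET.Element("searchresult")
    ET.SubElement(root, "query").text = query
    # RSS 2.0 <item> elements carry no namespace; Atom or RSS 1.0 (RDF)
    # feeds would need namespace-aware lookups instead.
    for index, item in enumerate(rss.iter("item")):
        doc = ET.SubElement(root, "document", id=str(index))
        ET.SubElement(doc, "title").text = item.findtext("title", default="")
        ET.SubElement(doc, "url").text = item.findtext("link", default="")
        ET.SubElement(doc, "snippet").text = item.findtext("description", default="")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<rss version="2.0"><channel><title>Example</title>'
        "<item><title>First post</title><link>http://example.com/1</link>"
        "<description>Some text to cluster.</description></item>"
        "</channel></rss>"
    )
    print(rss_to_carrot2(sample))
```

The converted document could then be handed to the DCS as ready-made Carrot2 XML, which would sidestep the need for a hosted stylesheet; how exactly it is uploaded depends on the DCS version and is not covered here.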

Thanks,

Staszek

On Fri, Oct 5, 2012 at 5:36 PM, Max <[hidden email]> wrote:
--
View this message in context: http://carrot2-users-and-developers-forum.607571.n2.nabble.com/Confused-Setting-Up-External-Source-tp7577722.html
Sent from the Carrot2 Users and Developers Forum mailing list archive at Nabble.com.

_______________________________________________
Carrot2-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/carrot2-developers



Carrot2 Webapp + embedded Jetty

jredondo
In reply to this post by Max
Can you guys give some guidelines for building the Carrot2 webapp with
embedded Jetty (if it has not been done yet)?

Thank you again...

Jorge



Re: Carrot2 Webapp + embedded Jetty

Dawid Weiss-2
We use embedded Jetty for development, so it's kind of there: see
WebApp.java in the source distribution. Also, the DCS is bundled as a
Jetty application ('ant dcs'), so that's a start. To be honest, I don't
see much benefit in doing this for the webapp though -- a WAR file is
easy to deploy on Jetty or any other web container and most people
will be familiar with it. Can you provide some rationale for a
stand-alone embedded Jetty distribution?

Dawid

On Wed, Oct 17, 2012 at 6:17 PM,  <[hidden email]> wrote:

> Can you guys give some guide lines to build the carrot2 webapp with
> embedded Jetty (if it has not been done yet)?
>
> Thank you again...
>
> Jorge

Re: Carrot2 Webapp + embedded Jetty

jredondo
> Can you provide some rationale for a
> stand-alone embedded jetty distribution?

Thanks for your answer.
I think there are many, but almost all of them have something to do with
the particularities of our network admin (:P)... almost all... except two:
1) Although it's not really significant with a small number of requests,
isn't performance better with embedded Jetty than with, for example, WAR +
Tomcat?
2) Doesn't the stand-alone webapp have fewer software dependencies? I mean,
when replicating the installation of the whole web site system (in our case,
two webapps: a Python one and Carrot2) on different machines.

Thank you again.

Jorge



Re: Carrot2 Webapp + embedded Jetty

Dawid Weiss-2
> I think there are many but almost all has something to do with
> particularities of our network's admin (:P)... almost all... except two:

An embedded Jetty running the webapp is identical to a stand-alone,
preconfigured Jetty together with a WAR file deployed on startup.
There's no difference, really.

> 1) Although not really significant with small number of requests, there is
> no better performance with embedded jetty than, for example, with WAR +
> Tomcat?

No, the performance will most likely be the same. Clustering itself
will most likely be the bottleneck, not the network interface or
middleware.

> 2) Has not the stand-alone webapp less software dependencies? I mean, in

No, they'll be essentially the same thing. You can trim Jetty to have
only the required stuff (we only need servlet and filter support as far
as I remember, no fancy stuff).

If you need to go really lightweight then you can try
http://winstone.sourceforge.net/. I don't have any experience with it
but it seems to have everything our webapp should need.

Dawid


Re: Carrot2 Webapp + embedded Jetty

jredondo
Good, thank you very much once again!
