How to Convert search results into XML format - Carrot2-developers Digest, Vol 22, Issue 19


Gyanendra Tripathi

Dear Dawid,

I followed the steps you listed in your previous mail, repeated here:

1) query your search engine in perl,
2) format the results to XML,
3) send it to the DCS (via REST interface for example),
4) parse the results.

Now I have a problem at step 2: my search engine shows its results as an HTML page, so how can I capture the results and convert them into the desired XML format? Please point me to any script that could solve this problem.

I also need guidance in the following areas:

1) After getting the clustered results from the DCS, how can I display them in a tree-like view?
2) Do we need a separate server for the DCS, or can both (search engine and DCS) be configured on the same server?

Thank you, with my best regards.

Gyanendra

On 3/15/08, [hidden email] <[hidden email]> wrote:
Send Carrot2-developers mailing list submissions to
       [hidden email]

To subscribe or unsubscribe via the World Wide Web, visit
       https://lists.sourceforge.net/lists/listinfo/carrot2-developers
or, via email, send a message with subject or body 'help' to
       [hidden email]

You can reach the person managing the list at
       [hidden email]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Carrot2-developers digest..."


Today's Topics:

  1. Let me try - Carrot2-developers Digest, Vol 22, Issue 13
     (Gyanendra Tripathi)
  2. [JIRA][Carrot2] Work started: (CARROT-167) Basic editor
     displaying results (Urszula Krukar (JIRA))
  3. [JIRA][Carrot2] Resolved: (CARROT-167) Basic editor
     displaying results (Urszula Krukar (JIRA))
  4. [JIRA][Carrot2] Resolved: (CARROT-166) Static Search view
     (Urszula Krukar (JIRA))
  5. [JIRA][Carrot2] Work started: (CARROT-166) Static Search  view
     (Urszula Krukar (JIRA))
  6. [JIRA][Carrot2] Resolved: (CARROT-165) Creating Source and
     Algorithm Extention Points (Urszula Krukar (JIRA))


----------------------------------------------------------------------

Message: 1
Date: Sat, 15 Mar 2008 15:21:28 +0530
From: "Gyanendra Tripathi" <[hidden email]>
Subject: [C2-devel] Let me try - Carrot2-developers Digest, Vol 22,
       Issue 13
To: [hidden email]
Message-ID:
       <[hidden email]>
Content-Type: text/plain; charset="iso-8859-1"

Dear Dawid,

Thanks for your prompt reply.

Let me try what you suggested in your mail. I will get back to you about my
later problems.

Thanks once again

Gyanendra.


On 3/14/08, [hidden email] <
[hidden email]> wrote:
>
> Today's Topics:
>
>   1. How to configure DCS with my Search engien. (Gyanendra Tripathi)
>   2. Re: How to configure DCS with my Search engien. (Dawid Weiss)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 13 Mar 2008 10:39:15 +0530
> From: "Gyanendra Tripathi" <[hidden email]>
> Subject: [C2-devel] How to configure DCS with my Search engien.
> To: [hidden email]
> Message-ID:
>        <[hidden email]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi!
>
> I checked out the complete 'carrot2' folder from the SVN repository on
> 4th March 2008, and want to apply clustering to the results of my search
> engine.
> At present I am working with the DCS; the DCS example page now runs well
> on my localhost.
> My problem is that I want to configure the DCS to work with my search
> engine, which is written entirely in Perl, so please guide me on how to
> start.
>
>
>
> --
> Thanks and Regards
> Gyanendra Tripathi
> Mob.91+ 9971303398
> [hidden email]
>
> ------------------------------
>
> Message: 2
> Date: Thu, 13 Mar 2008 09:19:54 +0100
> From: Dawid Weiss <[hidden email]>
> Subject: Re: [C2-devel] How to configure DCS with my Search engien.
> To: Carrot2-developers <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
>
> The search engine does not matter for the DCS -- DCS accepts an input that
> is the result of a search engine (snippets in XML), so you can basically
> write a script that will:
>
> 1) query your search engine in perl,
> 2) format the results to XML,
> 3) send it to the DCS (via REST interface for example),
> 4) parse the results.
>
> D.
>
>
> End of Carrot2-developers Digest, Vol 22, Issue 13
> **************************************************
>



--
Thanks and Regards
Gyanendra Tripathi
Mob. 9971303398
[hidden email]

------------------------------

Message: 2
Date: Sat, 15 Mar 2008 17:08:48 +0100 (CET)
From: "Urszula Krukar (JIRA)" <[hidden email]>
Subject: [C2-devel] [JIRA][Carrot2] Work started: (CARROT-167) Basic
       editor displaying results
To: [hidden email]
Message-ID: <861155639.1205597328110.JavaMail.jira@ophelia>
Content-Type: text/plain; charset=UTF-8


    [ http://issues.carrot2.org/browse/CARROT-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on CARROT-167 started by Urszula Krukar.

> Basic editor displaying results
> -------------------------------
>
>                 Key: CARROT-167
>                 URL: http://issues.carrot2.org/browse/CARROT-167
>             Project: Carrot2
>          Issue Type: New Feature
>          Components: Eclipse browser
>            Reporter: Urszula Krukar
>            Assignee: Urszula Krukar
>             Fix For: 3.0 M1
>
>
> Split view:
> - on the left: clusters tree (label with document count, images)
> - on the right: document list inside embedded browser
> Document list should be filtered according to selection on clusters tree.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.carrot2.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira





------------------------------

Message: 3
Date: Sat, 15 Mar 2008 17:15:47 +0100 (CET)
From: "Urszula Krukar (JIRA)" <[hidden email]>
Subject: [C2-devel] [JIRA][Carrot2] Resolved: (CARROT-167) Basic
       editor displaying results
To: [hidden email]
Message-ID: <1649274806.1205597747664.JavaMail.jira@ophelia>
Content-Type: text/plain; charset=UTF-8


    [ http://issues.carrot2.org/browse/CARROT-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Urszula Krukar resolved CARROT-167.
-----------------------------------

   Resolution: Fixed
     Assignee: Dawid Weiss  (was: Urszula Krukar)

I think that this part is done.
Right now it looks like this:
- on the left: tree with cluster hierarchy (images, labels, count)
- on the right: embedded browser with document list, initially all documents are displayed, when one cluster is selected on tree component, only documents from this cluster are displayed (recursively) and cluster label is added. Links are opened in separate Internal Browser Editor, it has navigation, status and location bars.

If you would like something changed about this editor, please write it in a comment to this issue and reopen it.

> Basic editor displaying results
> -------------------------------
>
>                 Key: CARROT-167
>                 URL: http://issues.carrot2.org/browse/CARROT-167
>             Project: Carrot2
>          Issue Type: New Feature
>          Components: Eclipse browser
>            Reporter: Urszula Krukar
>            Assignee: Dawid Weiss
>             Fix For: 3.0 M1
>
>
> Split view:
> - on the left: clusters tree (label with document count, images)
> - on the right: document list inside embedded browser
> Document list should be filtered according to selection on clusters tree.






------------------------------

Message: 4
Date: Sat, 15 Mar 2008 17:31:53 +0100 (CET)
From: "Urszula Krukar (JIRA)" <[hidden email]>
Subject: [C2-devel] [JIRA][Carrot2] Resolved: (CARROT-166) Static
       Search view
To: [hidden email]
Message-ID: <1579669247.1205598713273.JavaMail.jira@ophelia>
Content-Type: text/plain; charset=UTF-8


    [ http://issues.carrot2.org/browse/CARROT-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Urszula Krukar resolved CARROT-166.
-----------------------------------

   Resolution: Fixed
     Assignee: Dawid Weiss  (was: Urszula Krukar)

If you would like something changed about this view, please write it in a comment to this issue and reopen it.

> Static Search view
> ------------------
>
>                 Key: CARROT-166
>                 URL: http://issues.carrot2.org/browse/CARROT-166
>             Project: Carrot2
>          Issue Type: New Feature
>          Components: Eclipse browser
>            Reporter: Urszula Krukar
>            Assignee: Dawid Weiss
>             Fix For: 3.0 M1
>
>
> Creating static search view:
> - combobox with list of available sources (list created from plugin registry)
> - combobox with list of available algorithms (list created from plugin registry)
> - textbox for inserting query string
> - Search button






------------------------------

Message: 5
Date: Sat, 15 Mar 2008 17:31:47 +0100 (CET)
From: "Urszula Krukar (JIRA)" <[hidden email]>
Subject: [C2-devel] [JIRA][Carrot2] Work started: (CARROT-166) Static
       Search  view
To: [hidden email]
Message-ID: <1828498879.1205598707630.JavaMail.jira@ophelia>
Content-Type: text/plain; charset=UTF-8


    [ http://issues.carrot2.org/browse/CARROT-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on CARROT-166 started by Urszula Krukar.

> Static Search view
> ------------------
>
>                 Key: CARROT-166
>                 URL: http://issues.carrot2.org/browse/CARROT-166
>             Project: Carrot2
>          Issue Type: New Feature
>          Components: Eclipse browser
>            Reporter: Urszula Krukar
>            Assignee: Urszula Krukar
>             Fix For: 3.0 M1
>
>
> Creating static search view:
> - combobox with list of available sources (list created from plugin registry)
> - combobox with list of available algorithms (list created from plugin registry)
> - textbox for inserting query string
> - Search button






------------------------------

Message: 6
Date: Sat, 15 Mar 2008 18:05:47 +0100 (CET)
From: "Urszula Krukar (JIRA)" <[hidden email]>
Subject: [C2-devel] [JIRA][Carrot2] Resolved: (CARROT-165) Creating
       Source and Algorithm Extention Points
To: [hidden email]
Message-ID: <223670271.1205600747642.JavaMail.jira@ophelia>
Content-Type: text/plain; charset=UTF-8


    [ http://issues.carrot2.org/browse/CARROT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Urszula Krukar resolved CARROT-165.
-----------------------------------

   Resolution: Fixed
     Assignee: Dawid Weiss  (was: Urszula Krukar)

Done.

If you would like something changed about these EPs, please write it in a comment to this issue and reopen it.

> Creating Source and Algorithm Extention Points
> ----------------------------------------------
>
>                 Key: CARROT-165
>                 URL: http://issues.carrot2.org/browse/CARROT-165
>             Project: Carrot2
>          Issue Type: New Feature
>          Components: Eclipse browser
>            Reporter: Urszula Krukar
>            Assignee: Dawid Weiss
>             Fix For: 3.0 M1
>
>
> Create definition for document source EP and algorithm source EP, so that new sources and documents can be added to workbench without the need to reimplement it.






------------------------------



End of Carrot2-developers Digest, Vol 22, Issue 19
**************************************************



--
Thanks and Regards
Gyanendra Tripathi
Mob. 9971303398
[hidden email]

Re: How to Convert search results into XML format - Carrot2-developers Digest, Vol 22, Issue 19

Dawid Weiss-2

> Now I have problem at step number *2*, my search engine showes results as a
> HTML page, so how I can catch results and convert it into desired XML

I'm sure your engine has a functionality of returning semi-structured data (like
XML). It does not make sense to go back from HTML to XML, really. What is this
engine? Did you write it yourself? If not, check its documentation.
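
If the engine can hand back structured hits (title, URL, snippet), step 2 reduces to serializing them. Below is a minimal Python sketch of that serialization; the same approach carries over to Perl. The element names (`searchresult`, `query`, `document`, `title`, `url`, `snippet`) are an assumption about the DCS input format and should be checked against the documentation shipped with your DCS version. `results_to_xml` and the dictionary keys are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET


def results_to_xml(query, results):
    """Serialize search hits into an XML string.

    `results` is a list of dicts with "title", "url" and "snippet" keys
    (hypothetical shape; adapt it to whatever your engine returns).
    Element names are assumed, not confirmed by this thread.
    """
    root = ET.Element("searchresult")
    ET.SubElement(root, "query").text = query
    for index, hit in enumerate(results):
        # Each hit gets a sequential id so clusters can refer back to it.
        doc = ET.SubElement(root, "document", id=str(index))
        ET.SubElement(doc, "title").text = hit["title"]
        ET.SubElement(doc, "url").text = hit["url"]
        ET.SubElement(doc, "snippet").text = hit["snippet"]
    # ElementTree escapes &, < and > in text, so raw snippets are safe.
    return ET.tostring(root, encoding="unicode")
```

Building the tree with an XML library rather than string concatenation matters here, because snippets and URLs routinely contain `&` and `<`.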

> 1) After getting the clustered results form DCS, how I can give it to tree
> like view.

I don't know, really. This depends on your application (it's your code after all).
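
As one possible starting point, the clustered output can be walked recursively and printed with indentation, which is already a tree-like view. The Python sketch below assumes the DCS response nests `group` elements, each carrying its label in `title/phrase` and listing its members as `document` children; that layout is an assumption and should be verified against real output from your DCS. `cluster_lines` and `render_tree` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def cluster_lines(group, depth=0):
    """Return indented text lines for one cluster and its sub-clusters."""
    label = group.findtext("title/phrase", default="(unlabeled)")
    # Count only the documents listed directly under this cluster.
    count = len(group.findall("document"))
    lines = ["  " * depth + f"{label} ({count})"]
    for child in group.findall("group"):
        lines.extend(cluster_lines(child, depth + 1))
    return lines


def render_tree(xml_text):
    """Parse a clustered response and return the whole tree as lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for top_level in root.findall("group"):
        lines.extend(cluster_lines(top_level))
    return lines
```

Calling `print("\n".join(render_tree(response_text)))` shows the hierarchy in a terminal; the same recursion can emit nested HTML lists instead of indented lines.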

> 2) Do we need a separate server for DCS or Both (Search Engine and DCS) can
>     configured on same server.

No, it can run on the same server.
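
With both on one machine, step 3 is an HTTP POST to localhost. The Python sketch below shows the shape of such a call. The endpoint URL and the form field name used as defaults are assumptions, not confirmed anywhere in this thread, so substitute the values documented for your DCS version. `build_dcs_request` and `cluster` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def build_dcs_request(xml_text,
                      url="http://localhost:8080/dcs/rest",  # assumed endpoint
                      field="dcs.c2stream"):                 # assumed field name
    """Build a form-encoded POST request carrying the XML document."""
    body = urllib.parse.urlencode({field: xml_text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def cluster(xml_text, **kwargs):
    """Send the XML to the DCS and return the response body as text."""
    request = build_dcs_request(xml_text, **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from sending makes it easy to inspect what would be posted before a DCS instance is running.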
