NullPointerException in ContextClassLoaderLocator


NullPointerException in ContextClassLoaderLocator

aw
Hello Dawid,
Hello everybody,

I developed a (still simple) API in C for the Carrot2 framework, using JNI to integrate Carrot2 into other C/C++ programs. It works pretty well in a metasearch framework written in pure C, and from the command line I have experienced no problems so far.
Now I have developed a PHP5 extension as a module for a project, to integrate Carrot2 with PHP applications more easily, and here I run into a problem when actually executing the clustering engine (via controller.process(...)).

I initialize everything as usual, add the search terms and clustering parameters to the attributes map, add the documents (as said, all through JNI) and call the following Java function to execute the clustering engine:

<snip>
public static void XTN_STC_execute() {
  final Controller controller = ControllerFactory.createSimple();
  Class<?> algorithm = STCClusteringAlgorithm.class;
  XTN_STC_attributes.put(AttributeNames.DOCUMENTS, XTN_document_list );
  cluster_result = controller.process(XTN_STC_attributes, algorithm);
}
</snip>

The exception is thrown while executing the clustering algorithm:

--
XTN/Carrot2 Error
JNI call CallStaticVoidMethod failed at jvm.c:1136

XTN/Carrot2 JAVA EXCEPTION - Stacktrace follows:
java.lang.NullPointerException
        at org.carrot2.util.resource.ContextClassLoaderLocator.getAll(ContextClassLoaderLocator.java:28)
        at org.carrot2.util.resource.ResourceUtils.getFirst(ResourceUtils.java:105)
        at org.carrot2.text.linguistic.LexicalResources.loadStopLabels(LexicalResources.java:143)
        at org.carrot2.text.linguistic.LexicalResources.load(LexicalResources.java:77)
        at org.carrot2.text.linguistic.DefaultLanguageModelFactory.getLanguageModel(DefaultLanguageModelFactory.java:164)
        at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:90)
        at org.carrot2.clustering.stc.STCClusteringAlgorithm.cluster(STCClusteringAlgorithm.java:239)
        at org.carrot2.clustering.stc.STCClusteringAlgorithm.access$000(STCClusteringAlgorithm.java:61)
        at org.carrot2.clustering.stc.STCClusteringAlgorithm$2.process(STCClusteringAlgorithm.java:205)
        at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:222)
        at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:110)
        at org.carrot2.clustering.stc.STCClusteringAlgorithm.process(STCClusteringAlgorithm.java:198)
        at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:101)
        at org.carrot2.core.Controller.process(Controller.java:287)
        at org.carrot2.core.Controller.process(Controller.java:180)
        at org.carrot2.xtn.clustering.CarrotSTC_to_XTN.XTN_STC_execute(Unknown Source)
FORCING core dump!

--

Once again, this happens when running the C code as a PHP5 extension (Apache module). I create a Java VM at module startup (when PHP gets loaded at Apache startup) and implemented the Carrot2 extension functions as a PHP5 class. This means the constructor of the PHP Carrot2 class attaches to the JVM and the destructor detaches from it again. I use the thread safety mechanism provided by the Zend API (Zend object store). The methods are wrappers around the Carrot2 C API functions, such as adding clustering parameters, adding documents, executing the Lingo or STC algorithm, and getting the cluster results in various formats (PHP array, XML or JSON).
I have to mention that I did not have this exception before I used the attach/detach calls. The actual clustering process, feeding documents and getting the results from Carrot2 had worked fine until then. But I had to change the mechanism because of the re-creation of the JVM in the Apache/PHP environment: I got segfaults when re-creating the JVM, so I decided to rewrite the basic module and use the AttachCurrentThread/DetachCurrentThread functions. That is also more efficient anyway!

Development platform is Ubuntu Linux 10.04 LTS (Lucid), JRE version 6.0_24-b07, PHP version 5.2.16. At this time I use the Carrot2 API version 3.4.1.

So I am a little stuck for now and do not quite understand why I get that exception and how I can solve the issue. Any hints are greatly appreciated. If I missed something or something needs to be more specific, please let me know. I would be happy to get that PHP5 extension working.

Andreas W. Wylach


Re: NullPointerException in ContextClassLoaderLocator

Dawid Weiss-2

Hi Andreas,

What you describe is a pretty wicked scenario... are you sure it was worth the effort? (I mean -- wouldn't going through a local network/ REST protocol be an easier way to integrate with PHP)?

This said, the error you're getting is caused by a NULL context class loader; most likely you'll need to set it if you start the JVM via JNI -- Thread.currentThread().setContextClassLoader()?

Dawid


On Sun, Mar 20, 2011 at 6:02 PM, aw <[hidden email]> wrote:
[...]

Re: NullPointerException in ContextClassLoaderLocator

Dawid Weiss-2

Oh, one more thing -- we will fix this in the trunk so that no NPE occurs even if the class loader is NULL.

Dawid
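
The trunk change mentioned above is not shown in this thread. As a rough illustration only, a null-safe lookup could fall back to the system class loader when the thread has no context class loader. The class and method names below (SafeContextResources, effectiveLoader, getAll) are hypothetical and are not Carrot2 API.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; an assumption for illustration, not the actual Carrot2 fix. */
final class SafeContextResources {
    /** Returns the context class loader, or the system class loader if none is set. */
    static ClassLoader effectiveLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : ClassLoader.getSystemClassLoader();
    }

    /** Lists all resources with the given name without dereferencing a null loader. */
    static List<URL> getAll(String name) {
        ClassLoader cl = effectiveLoader();
        if (cl == null) {
            return Collections.emptyList();
        }
        try {
            return Collections.list(cl.getResources(name));
        } catch (IOException e) {
            // Surface I/O problems as unchecked so callers are not forced to handle them.
            throw new IllegalStateException("Could not list resources: " + name, e);
        }
    }

    public static void main(String[] args) {
        // Simulate a thread that has no context class loader.
        Thread.currentThread().setContextClassLoader(null);
        System.out.println(getAll("java/lang/Object.class"));
    }
}
```

Whether falling back to the system class loader is appropriate depends on where the Carrot2 JARs sit on the class path of the embedded JVM.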

On Sun, Mar 20, 2011 at 8:39 PM, Dawid Weiss <[hidden email]> wrote:
[...]

Re: NullPointerException in ContextClassLoaderLocator

Dawid Weiss-2

This is more complicated than I thought because the NPE occurs here:

final ClassLoader cl = Thread.currentThread().getContextClassLoader();

and it's the Thread.currentThread() that is returning null. I believe what happens is that you're calling from a native thread into a Java method -- if so, you'll probably need to set up a thread context of some sort, because Thread.currentThread() should return a non-null value. Check the JNI documentation (and let us know, out of curiosity).

Dawid
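
One way to find out which of the two references is actually null would be a small diagnostic method, called through JNI from the attached native thread. This is only a sketch; ThreadContextProbe is a hypothetical name, and the first branch is expected to be unreachable for a properly attached thread.

```java
/** Hypothetical diagnostic, meant to be called via JNI from the attached native thread. */
final class ThreadContextProbe {
    /** Describes the current thread and its context class loader without dereferencing a null. */
    static String describe() {
        Thread t = Thread.currentThread();
        if (t == null) {
            return "currentThread() returned null";
        }
        ClassLoader cl = t.getContextClassLoader();
        String loader = (cl == null) ? "null" : cl.getClass().getName();
        return "thread=" + t.getName() + ", contextClassLoader=" + loader;
    }

    public static void main(String[] args) {
        System.out.println(describe());
        // Simulate the situation from the stack trace: no context class loader on this thread.
        Thread.currentThread().setContextClassLoader(null);
        System.out.println(describe());
    }
}
```

If the output ends in contextClassLoader=null, the NPE would come from using the loader rather than from Thread.currentThread() itself.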

On Sun, Mar 20, 2011 at 8:40 PM, Dawid Weiss <[hidden email]> wrote:
[...]

Re: NullPointerException in ContextClassLoaderLocator

aw
In reply to this post by Dawid Weiss-2
Hi Dawid,

your hint made my day :-) The PHP5 module now runs without problems!

In the Java class that contains all the methods I call through JNI, I added the following two lines to the method I call for initialization at construction (PHP5 constructor):

        ClassLoader Cl  = ClassLoader.getSystemClassLoader();
        Thread.currentThread().setContextClassLoader(Cl);

And that seems to solve the issue.
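
The two lines above could also be wrapped in a small helper that only touches the thread when no context class loader is set, so it is safe to call at the start of every JNI entry point. This is a sketch under that assumption; JniThreadInit and ensureContextClassLoader are hypothetical names.

```java
/** Hypothetical wrapper around the two-line fix above. */
final class JniThreadInit {
    /**
     * Gives the current thread a context class loader if it has none
     * and returns the loader that is in effect afterwards.
     */
    static ClassLoader ensureContextClassLoader() {
        Thread t = Thread.currentThread();
        ClassLoader cl = t.getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
            t.setContextClassLoader(cl);
        }
        return cl;
    }

    public static void main(String[] args) {
        // Simulate a freshly attached native thread without a context class loader.
        Thread.currentThread().setContextClassLoader(null);
        System.out.println(ensureContextClassLoader() != null);
    }
}
```

Because the context class loader is a per-thread setting, each newly attached thread would need this call once; an existing loader is left untouched.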

At first I did not clearly understand why this was happening (I am not a Java pro), but with your hints I seem to get it now. As far as I understand, when a thread is attached through JNI, the bootstrap class loader is used and cannot locate the requested classes, and then an exception occurs (would that be a ClassNotFoundException?).
So in short, threads attached from JNI code (that is, non-Java threads) have a null context class loader unless it is set explicitly. In such a context it happens that Thread.currentThread() returns NULL, therefore we get that NPE in Carrot2. Is my thinking correct?

I do not know, but maybe that should be mentioned somewhere.

I had to smile when I read "pretty wicked scenario". I think working with JNI and a set of small Java methods (these methods really just contain a few lines) that communicate with the Carrot2 framework is straightforward. On the C side I have a few JNI helper methods and C functions that make up a kind of core API to the Java Carrot2 framework. And as I said, that opens the door for all C or C++ programs to connect to Carrot2 without using the clustering server (DCS). And now I have created a PHP5 class (PHPCarro2) that has a few methods to work with Carrot2 directly.
It is mandatory to write pretty clean code on the C side to prevent memory leaks and other errors such as uninitialized values, because it is not possible (as far as I have experienced) to use valgrind with JNI code. The JVM cannot be initialized under valgrind, which does not surprise me.
I will extend and improve the C API and the PHP5 extension, so basically I think it is worth the effort :-)

For now in PHP, with the PHPCarro2 module loaded in Apache, it goes like this:
<?php
$carrot2 = new phpacrrot2("data mining");
$carrot2->set_parameter("maxClusters", "10");
...
foreach ($r['result_set']['documents'] as $k => $v) {
  $carrot2->add_document( $v['doc_id'], $v['title'], $v['snippet'], $v['url'] );
}
$carrot2->execute("XTN_STC_execute"); // or: $carrot2->execute("XTN_Lingo_execute");
print $carrot2->get_num_cluster();
$result = $carrot2->get_searchresult_as_array();
$result_json = $carrot2->get_searchresult_as_json();
$result_xml = $carrot2->get_searchresult_as_xml();
?>

If you like I will keep you posted.

Thanks again for your help!

Andreas

Re: NullPointerException in ContextClassLoaderLocator

Stanislaw Osinski
Administrator
Hi Andreas,

Well done with the JNI interface! I must say that the art of C/C++ coding is declining these days, which is a shame...

If I understand correctly, there is one JVM started upon module initialization that handles all the clustering calls. If that's the case, you should be able to get some performance improvement by replacing the simple controller with a pooling one like this:

private static final Controller controller = ControllerFactory.getPooling();

public static void XTN_STC_execute() {
 Class<?> algorithm = STCClusteringAlgorithm.class;
 XTN_STC_attributes.put(AttributeNames.DOCUMENTS, XTN_document_list );
 cluster_result = controller.process(XTN_STC_attributes, algorithm);
}

When you're using the simple controller, every time you initiate clustering, a new instance of the clustering algorithm is created that needs to read and initialize language files etc., which takes some time. With the pooling controller the initialization happens only once, so subsequent calls should be slightly faster.
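
The difference described above can be illustrated without Carrot2 at all. The sketch below uses a hypothetical stand-in class (ExpensiveAlgorithm) and reduces pooling to a single shared instance; it only shows the "initialize once, reuse" idea and is not how the Carrot2 controllers are implemented.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of per-call creation versus reuse; no Carrot2 classes involved. */
final class ReuseSketch {
    /** Stand-in for a component with costly setup, such as loading language resources. */
    static final class ExpensiveAlgorithm {
        static final AtomicInteger INITIALIZATIONS = new AtomicInteger();

        ExpensiveAlgorithm() {
            INITIALIZATIONS.incrementAndGet(); // counts how often setup runs
        }

        int process(String input) {
            return input.length(); // placeholder for the real work
        }
    }

    /** One instance created up front and reused by every call to processShared. */
    private static final ExpensiveAlgorithm SHARED = new ExpensiveAlgorithm();

    /** "Simple" style: a new instance, and therefore a new initialization, per call. */
    static int processSimple(String input) {
        return new ExpensiveAlgorithm().process(input);
    }

    /** Reuse style: initialization happened once, when this class was loaded. */
    static int processShared(String input) {
        return SHARED.process(input);
    }

    public static void main(String[] args) {
        processShared("data mining");
        processShared("clustering");
        processSimple("data mining");
        processSimple("clustering");
        System.out.println("initializations: " + ExpensiveAlgorithm.INITIALIZATIONS.get());
    }
}
```

Sharing one instance is only safe here because process keeps no state; a real pool would hand out separate instances to concurrent callers.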

Cheers,

Staszek

On Mon, Mar 21, 2011 at 05:56, aw <[hidden email]> wrote:
[...]

Re: NullPointerException in ContextClassLoaderLocator

aw
Hi Stanislaw,

I feel a bit old now after reading your message, still sticking to C/C++ (remember the old days? ... the C programming language ...) :-)) You're surely right that things are much simpler in Java than in a low-level language like C. I just like the speed of C; effectively coded, it is like a bullet. I guess I just like the hard way :-))

Regarding your suggestion: yes, when Apache/PHP is started (as I said, PHP5 runs as a module in my development environment), the JVM is created at module initialization. This is because one process may only run _one_ JVM at a time. As I mentioned in my first post, I had to rewrite the initialization part of the C Carrot2 API core because of problems re-creating the JVM all the time (at PHP5 class construction/destruction). It was a waste anyway: why always destroy and re-create a JVM (with all the costs that come with that) instead of using the attach/detach thread functionality? With this change, it is more efficient and way faster (I still need to do some benchmarking on that). Also, for stable processing of larger document sets, I set a 256 MB maximum heap at JVM startup (via the -Xmx256m option). Well, I need to do heavy testing on all that!

I changed the processing controller to "ControllerFactory.createPooling();". I cannot find the method "getPooling()". Is that new? I am using API version 3.4.1 at this time. I will upgrade to a more recent version soon; before that, I just wanted to get the complete system running.

Andreas

Re: NullPointerException in ContextClassLoaderLocator

Stanislaw Osinski
Administrator

> I feel a bit old now after reading your message, still sticking to
> C/C++ (remember the old days? ... the C programming language ...) :-))

No worries, I'm old enough to have coded Motorola 68000 assembly and C on an Amiga :-)
 
> Regarding your suggestion: yes, when Apache/PHP is started (as I said, PHP5
> runs as a module in my development environment), the JVM is created at
> module initialization. This is because one process may only run _one_ JVM
> at a time. As I mentioned in my first post, I had to rewrite the
> initialization part of the C Carrot2 API core because of problems
> re-creating the JVM all the time (at PHP5 class construction/destruction).
> It was a waste anyway: why always destroy and re-create a JVM (with all
> the costs that come with that) instead of using the attach/detach thread
> functionality? With this change, it is more efficient and way faster (I
> still need to do some benchmarking on that). Also, for stable processing of
> larger document sets, I set a 256 MB maximum heap at JVM startup (via the
> -Xmx256m option). Well, I need to do heavy testing on all that!

Creating a JVM for each clustering call would be a huge overhead. Also, in such cases, the Just In Time compiler wouldn't have a chance to optimize the Java code at runtime (this happens for frequently executed areas), slowing things down even more. -Xmx256m at startup is a good idea too.
 
> I changed the processing controller to "ControllerFactory.createPooling();".
> I cannot find the method "getPooling()". Is that new? I am using API version
> 3.4.1 at this time. I will upgrade to a more recent version soon; before
> that, I just wanted to get the complete system running.

Oh, sorry, I was writing off the top of my head, it's createPooling(), not getPooling(). There are two variants actually:

http://download.carrot2.org/stable/javadoc/org/carrot2/core/ControllerFactory.html#createPooling%28%29  (garbage-collected pool, variable pool size)

http://download.carrot2.org/stable/javadoc/org/carrot2/core/ControllerFactory.html#createPooling%28int%29  (fixed pool size)
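Whichever variant is used, a pooling controller presumably only pays off when the same instance serves many calls, so it would be created once rather than inside every execute method. A minimal, Carrot2-free sketch of that idea follows; `ExpensiveEngine` is a hypothetical stand-in for the controller, and the class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedControllerSketch {
    /** Hypothetical stand-in for an expensive, reusable engine object (not a Carrot2 class). */
    static final class ExpensiveEngine {
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveEngine() {
            CREATED.incrementAndGet();
        }

        String process(String query) {
            return "clusters for " + query;
        }
    }

    /** Initialization-on-demand holder: the JVM runs this initializer once, lazily and thread-safely. */
    private static final class Holder {
        static final ExpensiveEngine INSTANCE = new ExpensiveEngine();
    }

    /** Static entry point in the style of the methods invoked through JNI. */
    public static String execute(String query) {
        return Holder.INSTANCE.process(query);
    }

    /** Number of engine instances constructed so far. */
    public static int enginesCreated() {
        return ExpensiveEngine.CREATED.get();
    }

    public static void main(String[] args) {
        execute("data mining");
        execute("clustering");
        System.out.println("engines created: " + enginesCreated()); // prints "engines created: 1"
    }
}
```

The holder idiom avoids explicit locking: class initialization is already serialized by the JVM, so concurrent first calls still end up with a single instance.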

Cheers,

S.


Re: NullPointerException in ContextClassLoaderLocator

Dawid Weiss-2
>> I feel a bit old now after reading your message, still sticking to
>> C/C++ (remember the old days? ... the C programming language ...) :-))
>
> No worries, I'm old enough to have coded Motorola 68000 assembly and C on an Amiga :-)

And I'm old enough to have used no C at all, writing assembly code byte by byte from memory :)

I'm glad we could be of help. Would you consider donating this code to the project (or making it available via GitHub, for example)? Or is it proprietary? I don't know how many other people would find it useful, but who knows.

As for JNI linking -- there are libraries to help you with stub classes and all that mess... I can't recall the name off the top of my head, but I remember it made things a lot easier (although it was a commercial library).

Dawid


Re: NullPointerException in ContextClassLoaderLocator

aw
OK, reading that makes me feel comfortable again :-)

Well, I guess the problems with executing the clustering process are not over. There seem to be no problems at all with Apache running as a single process (starting one worker with the -X option). Everything runs fine and cleanly so far: clustering different sets of about 100 or 200 documents (I fetch them via Google) in the same process, reloading the PHP script with the Carrot2 methods. I experience no segfaults or other misbehaviour.

But when running Apache in regular multi-process mode, the controller's process method does not return, as if it were in an infinite loop or stuck at some point. Whether I use Apache-MPM-prefork (the non-threaded model) or Apache-MPM-worker (the threaded model) makes no difference; both show the same symptom. I cannot see any unusual memory usage or an Apache process consuming too much CPU, etc. I am not sure what really happens, so I need to investigate more. Is there any logging/debugging mechanism I could use with the Carrot2 API? What could stop controller.process from returning to the caller? Maybe I need to do some locking on the Apache/PHP side; maybe there is some interference with threading. Maybe you guys have another good hint for me?

@Dawid: This module is intended to be integrated into a customer project that I work on; I will try to integrate it into the search engine of the customer's intranet this summer (it is kind of a side project, which is why I have a few months for it). If that happens and the code is clean enough and ready for production use, I will be happy to donate it to the Carrot2 project. I am sure some people would like to use it in their PHP applications.

Andreas

Re: NullPointerException in ContextClassLoaderLocator

Stanislaw Osinski
Administrator

> Well, I guess the problems with executing the clustering process are not
> over. There seem to be no problems at all with Apache running as a single
> process (starting one worker with the -X option). Everything runs fine and
> cleanly so far: clustering different sets of about 100 or 200 documents (I
> fetch them via Google) in the same process, reloading the PHP script with
> the Carrot2 methods. I experience no segfaults or other misbehaviour.
>
> But when running Apache in regular multi-process mode, the controller's
> process method does not return, as if it were in an infinite loop or stuck
> at some point. Whether I use Apache-MPM-prefork (the non-threaded model) or
> Apache-MPM-worker (the threaded model) makes no difference; both show the
> same symptom. I cannot see any unusual memory usage or an Apache process
> consuming too much CPU, etc. I am not sure what really happens, so I need
> to investigate more. Is there any logging/debugging mechanism I could use
> with the Carrot2 API?

Yes, there is. The simplest way to see the logs would be to put a file like this:

https://carrot2.svn.sourceforge.net/svnroot/carrot2/trunk/applications/carrot2-examples/src/log4j.xml

at the root of the classpath (directly in the src/ directory). If you change <priority value="warn" /> to <priority value="debug" />, you should see some output on the console. You can also configure a file appender (some examples: http://wiki.apache.org/logging-log4j/Log4jXmlFormat) if that would make things easier.
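To check whether the file really sits at the root of the classpath the JVM was started with, a small lookup like the following may help. It is only a sketch using plain JDK calls; the class name `ClasspathResourceCheck` is made up for illustration. The fallback to the system class loader is there on the assumption that threads attached through JNI may have no context class loader set.

```java
import java.net.URL;

public class ClasspathResourceCheck {
    /** Returns the URL the class loader resolves for the resource name, or null if it is not visible. */
    public static URL locate(String name) {
        // Natively attached threads may report a null context class loader,
        // so fall back to the system class loader in that case.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(name);
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "log4j.xml";
        URL url = locate(name);
        System.out.println(name + " -> " + (url != null ? url : "not found on the classpath"));
        // The effective classpath, to compare against the directory the file was copied to.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run with the same classpath settings as the embedding program; a "not found" result would suggest the directory holding log4j.xml is not a classpath entry.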

> What could stop controller.process from returning to the caller? Maybe I
> need to do some locking on the Apache/PHP side; maybe there is some
> interference with threading. Maybe you guys have another good hint for me?

I can't think of a good reason for the clustering to hang like this at the moment. Maybe the logs will reveal something? Also, can you try reverting to ControllerFactory.createSimple() and running the multithreaded test? If there's a deadlock caused by the pooling controller, then the simple controller should work fine.
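Independently of the Carrot2 logs, the JDK's own management API can report Java-level deadlocks and thread stacks. The sketch below assumes it can be invoked from a second thread while process() is stuck (for example through another JNI call); `HangDiagnostics` is a made-up name. It would not help if the JVM itself is wedged, which is one possibility to keep in mind if the JVM was created before Apache forked its worker processes, since fork() carries over only the calling thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class HangDiagnostics {
    /** Returns a text report of deadlocked threads, or of all live threads if no deadlock is found. */
    public static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        long[] deadlocked = threads.findDeadlockedThreads(); // null when no deadlock is detected
        if (deadlocked != null) {
            out.append("Deadlocked threads:\n");
            for (ThreadInfo info : threads.getThreadInfo(deadlocked, Integer.MAX_VALUE)) {
                append(out, info);
            }
        } else {
            out.append("No deadlock detected; all threads:\n");
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                append(out, info);
            }
        }
        return out.toString();
    }

    private static void append(StringBuilder out, ThreadInfo info) {
        if (info == null) {
            return; // the thread ended between listing and inspection
        }
        out.append('"').append(info.getThreadName()).append("\" ")
           .append(info.getThreadState()).append('\n');
        for (StackTraceElement frame : info.getStackTrace()) {
            out.append("    at ").append(frame).append('\n');
        }
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

A thread shown as BLOCKED or WAITING inside the clustering call would point at Java-side locking; if every Java thread looks idle, the hang is more likely on the native side.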

Cheers,

S.


Re: NullPointerException in ContextClassLoaderLocator

aw
Hi Stanislaw,

In fact, I am working with the API (JAR file), so I was not sure and downloaded the most recent Carrot2 source code from SVN. I built the complete Carrot2 source, copied my XTN/Carrot2 class (not the source) into the tmp/classes/ ... class path, modified my C XTN/Carrot2 test program (edited the class path and extpath directories to point to the new Carrot2 version's libraries, JAR files, etc.), compiled it and ran it. It all runs fine, just like before (so at this point, using the new Carrot2 version is no problem at all).

If I place that log4j.xml file into the "tmp/classes" folder (which is the root of the classpath, right?), nothing happens. I also tried to put it right into the Carrot2 core class path, but no luck. All I get at the point where controller.process executes is the following:

... <snip>
XTN_ce_java_execute:[1504] start clustering, execute algorithm: XTN_STC_execute
XTN_CallVoidMethod:[1139]  * GetStaticMethodID: XTN_STC_execute ()V
XTN_CallVoidMethod:[1143]  * CallStaticVoidMethod: argument (null)

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/projects/ioc3.de/carrot2/full/stable/lib/org.slf4j/slf4j-nop-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/projects/ioc3.de/carrot2/full/stable/lib/org.slf4j/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
XTN_CallVoidMethod:[1147]  * CallStaticVoidMethod returned

XTN_ce_java_execute:[1519] clustering done
... </snip>

I do not get any debugging output, which I think is just due to a misplaced log4j.xml file. As I read it, the file should be placed where the build process (ant) stores the classes. Isn't that the "tmp/classes" folder?
Sorry if I misunderstand some basics here; maybe it is just a dumb mistake on my side. What am I doing wrong?

I cannot tell you how curious I am to see the debug output of the multi-threaded Apache environment to track down the issue ...

Regarding ControllerFactory.createSimple() versus createPooling(): that was the first thing I changed when I saw that the XTN/Carrot2 module was not working in the multi-threaded environment. But no change, it hangs at the same point.

Please give me some advice on getting that log4j debug output running. That is my only option for now ...

Thanks again for your help! I appreciate that!

Andreas

Re: NullPointerException in ContextClassLoaderLocator

Stanislaw Osinski
Administrator

> If I place that log4j.xml file into the "tmp/classes" folder (which is the
> root of the classpath, right?), nothing happens.

Yes, that's what I meant.
 
> I also tried to put it right into the Carrot2 core class path, but no luck.
> All I get at the point where controller.process executes is the following:
>
> ...
> XTN_ce_java_execute:[1504] start clustering, execute algorithm: XTN_STC_execute
> XTN_CallVoidMethod:[1139]  * GetStaticMethodID: XTN_STC_execute ()V
> XTN_CallVoidMethod:[1143]  * CallStaticVoidMethod: argument (null)
>
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/projects/ioc3.de/carrot2/full/stable/lib/org.slf4j/slf4j-nop-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/projects/ioc3.de/carrot2/full/stable/lib/org.slf4j/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

Oh, this is something I forgot to mention. Please delete the slf4j-nop-1.5.8.jar from your lib/ directory, it prevents the output of logs (I've just fixed that problem in the upcoming 3.5.0 release).
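To confirm that only one binding is left after deleting the JAR, the classpath can be scanned for the binder class named in the warning above. This is a rough sketch using only JDK calls; the class name `Slf4jBindingCheck` is made up, and the resource path is taken from the SLF4J 1.5.8 messages quoted in this thread.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class Slf4jBindingCheck {
    /** Resource shipped by each SLF4J 1.5.x binding JAR, as seen in the warning output. */
    static final String BINDER = "org/slf4j/impl/StaticLoggerBinder.class";

    /** Lists every copy of the named resource visible to the current class loader. */
    public static List<URL> findAll(String resource) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader(); // e.g. on a natively attached thread
        }
        List<URL> found = new ArrayList<URL>();
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new IllegalStateException("Cannot scan the classpath", e);
        }
        return found;
    }

    public static void main(String[] args) {
        List<URL> bindings = findAll(BINDER);
        System.out.println(bindings.size() + " SLF4J binding(s) visible:");
        for (URL url : bindings) {
            System.out.println("  " + url);
        }
    }
}
```

With the nop binding removed, the expectation would be a single entry pointing into slf4j-log4j12-1.5.8.jar.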
 
> XTN_CallVoidMethod:[1147]  * CallStaticVoidMethod returned
>
> XTN_ce_java_execute:[1519] clustering done
> ...
>
> I do not get any debugging output, which I think is just due to a misplaced
> log4j.xml file. As I read it, the file should be placed where the build
> process (ant) stores the classes. Isn't that the "tmp/classes" folder?
> Sorry if I misunderstand some basics here; maybe it is just a dumb
> mistake on my side. What am I doing wrong?

No, I forgot to mention deleting the JAR, as described above; apologies for that.
 
> Regarding ControllerFactory.createSimple() versus createPooling(): that was
> the first thing I changed when I saw that the XTN/Carrot2 module was not
> working in the multi-threaded environment. But no change, it hangs at the
> same point.

Oh, this means it's probably not some deadlock in the pooling controller then. Things are getting interesting :-)
 
> Please give me some advice on getting that log4j debug output running.
> That is my only option for now ...

I hope you see some debug output now. Chances are that we'd need to insert some more detailed logging at a few places (I hope the initial logs will give me some hints about these), recompile Carrot2 and see what happens.

S.


Re: NullPointerException in ContextClassLoaderLocator

aw
Hi Stanislaw,

I got the debugging working now.

While testing in the multi-threaded environment, I found a few more points where the XTN/Carrot2 PHP module seems to stop processing, for example while adding documents to Carrot2 or, a few reloads later, even when attaching to the JVM (JNI's AttachCurrentThread, which is executed when the PHP5 constructor is called).
So I think the problem is somewhere else and maybe not in Carrot2/Java itself. There still seems to be some problem with JNI in the multi-threaded environment, at least the way I have coded it so far. To my knowledge, I implemented everything pretty cleanly (the Java code I run through JNI to control Carrot2 is just 4 methods along with some static class variables such as the processing controller object, the document list and the ProcessingResult object; nothing complicated), so my guess now is that I have to take a deeper look at the JNI stuff in the threaded environment. Running all that as a single (Apache) process or from my metasearch framework (there, I only use threads to fetch search results in parallel), I experience no problem at all. But that is single-threaded and another story.
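One thing worth ruling out with static class variables is that, in a threaded worker, two requests could write to the same document list at once. A minimal sketch of keeping that state per thread instead is shown below. The names are hypothetical, and it assumes each request stays on one attached thread from adding documents to executing, which may not hold if threads are detached and re-attached in between.

```java
import java.util.ArrayList;
import java.util.List;

public class RequestStateSketch {
    /** Per-thread document list: each attached thread sees its own instance. */
    private static final ThreadLocal<List<String>> DOCUMENTS =
            ThreadLocal.withInitial(ArrayList::new);

    /** Adds a document to the calling thread's list only. */
    public static void addDocument(String title) {
        DOCUMENTS.get().add(title);
    }

    /** Stand-in for the clustering call: returns how many documents this thread collected. */
    public static int execute() {
        List<String> docs = DOCUMENTS.get();
        int count = docs.size();
        docs.clear(); // reset for the next request handled by this thread
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable request = () -> {
            addDocument("doc A");
            addDocument("doc B");
            System.out.println(Thread.currentThread().getName()
                    + " clustered " + execute() + " documents");
        };
        Thread t1 = new Thread(request, "worker-1");
        Thread t2 = new Thread(request, "worker-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Each worker reports 2 documents because neither sees the other's list; with a single shared static list the counts would depend on thread timing.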

I will have to investigate this further, and when I find a solution and see more stable runtime behaviour, I will get back to you. I'll keep you guys posted about the project.

Andreas

Re: NullPointerException in ContextClassLoaderLocator

Dawid Weiss-2
Thanks for the info, Andreas. I don't know how to handle JNI in a
multi-threaded situation, but I think it's not straightforward since
the JVM itself is a multi-threaded beast, so synchronization/locking
is definitely part of it. You may want to ask around on Java-dev
mailing lists:

http://mail.openjdk.java.net/mailman/listinfo/

Unfortunately I can't tell you which one would apply best, but folks
there are very helpful and knowledgeable.

Dawid

On Wed, Mar 23, 2011 at 4:56 AM, aw <[hidden email]> wrote:

> Hi Stanislaw,
>
> I got the debugging working now.
>
> While testing in the multi-threaded environment, I found a few more
> points where the XTN/Carrot2 PHP module seems to stop processing, for
> example while adding documents to Carrot2 or, a few reloads later, even
> when attaching to the JVM (JNI's AttachCurrentThread, which is executed
> when the PHP5 constructor is called).
> So I think the problem is somewhere else and maybe not in Carrot2/Java
> itself. There still seems to be some problem with JNI in the multi-threaded
> environment, at least the way I have coded it so far. To my knowledge, I
> implemented everything pretty cleanly (the Java code I run through JNI to
> control Carrot2 is just 4 methods along with some static class variables
> such as the processing controller object, the document list and the
> ProcessingResult object; nothing complicated), so my guess now is that I
> have to take a deeper look at the JNI stuff in the threaded environment.
> Running all that as a single (Apache) process or from my metasearch
> framework (there, I only use threads to fetch search results in parallel),
> I experience no problem at all. But that is single-threaded and another
> story.
>
> I will have to investigate this further, and when I find a solution and
> see more stable runtime behaviour, I will get back to you. I'll keep you
> guys posted about the project.
>
> Andreas