solr dataimport handler jndi websphere - solr

I am trying to use jndiName attribute in db-data-config.xml. This works great in tomcat. However having issues in websphere.
Following exception is thrown
"Make sure that a J2EE application does not execute JNDI operations on "java:" names within static code blocks or in threads created by that J2EE application. Such code does not necessarily run on the thread of a server application request and therefore is not supported by JNDI operations on "java:" names. [Root exception is javax.naming.NameNotFoundException: Name comp/env/jdbc not found in context "java:"."
It seems like websphere has issues accessing jndi resource from Static code. Has anyone experienced this ?
DataImporter E org.apache.solr.common.SolrException log Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: <REMOVE SQL from here>
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: javax.naming.ConfigurationException: A JNDI operation on a "java:" name cannot be completed because the server runtime is not able to associate the operation's thread with any J2EE application component. This condition can occur when the JNDI client using the "java:" name is not executed on the thread of a server application request. Make sure that a J2EE application does not execute JNDI operations on "java:" names within static code blocks or in threads created by that J2EE application. Such code does not necessarily run on the thread of a server application request and therefore is not supported by JNDI operations on "java:" names. [Root exception is javax.naming.NameNotFoundException: Name comp/env/jdbc not found in context "java:".]
at com.ibm.ws.naming.java.javaURLContextImpl.throwConfigurationExceptionWithDefaultJavaNS(javaURLContextImpl.java:428)
at com.ibm.ws.naming.java.javaURLContextImpl.lookup(javaURLContextImpl.java:399)
at com.ibm.ws.naming.java.javaURLContextRoot.lookup(javaURLContextRoot.java:214)
at com.ibm.ws.naming.java.javaURLContextRoot.lookup(javaURLContextRoot.java:154)
at javax.naming.InitialContext.lookup(InitialContext.java:436)
at org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:140)
at org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:128)
at org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:363)
at org.apache.solr.handler.dataimport.JdbcDataSource.access$200(JdbcDataSource.java:39)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:240)
... 11 more
Caused by: javax.naming.NameNotFoundException: Name comp/env/jdbc not found in context "java:".
at com.ibm.ws.naming.ipbase.NameSpace.getParentCtxInternal(NameSpace.java:1837)
at com.ibm.ws.naming.ipbase.NameSpace.lookupInternal(NameSpace.java:1166)
at com.ibm.ws.naming.ipbase.NameSpace.lookup(NameSpace.java:1095)
at com.ibm.ws.naming.urlbase.UrlContextImpl.lookup(UrlContextImpl.java:1233)
at com.ibm.ws.naming.java.javaURLContextImpl.lookup(javaURLContextImpl.java:395)
... 19 more

Solr is using unmanaged threads here and is doing a JNDI lookup in the java: namespace in one of these threads. That is not supported in WebSphere because WebSphere identifies the java: namespace based on the thread performing the lookup. Tomcat does this based on the thread context class loader, which explains the difference in behavior. Note that unmanaged threads are not allowed by the J2EE specs, so that WebSphere's behavior is actually standards compliant.

Related

DB2 Communication Error

We recently developed an application which will run a query in DB2 and send a mail to the corresponding recipient. It works well in our local system and QA region. But in production, few queries failed (even if it's rare, like once in week). It throws the exception below.
Exception InnerDetails:
ERROR [40003] [IBM][CLI Driver] SQL30081N A communication error has
been detected. Communication protocol being used: "TCP/IP".
Communication API being used: "SOCKETS". Location where the error was
detected: "111.111.111.111". Communication function detecting the
error: "recv". Protocol specific error code(s): "10004", "", "".
SQLSTATE=08001
Since error occurs only in production and not very often, we are not sure whether it is the code or a setting issue. Do you have any idea?
We recently discussed this issue with our IBM rep. After looking in their internal knowledge base, he suggested we add "Interrupt=0" to our connection string, based on recommendations given to other customers that had the same problem.
The default value for Interrupt was 1 before v10.5 FP2 and still is for most connections. They changed the default value to 2 for connections to z/OS (mainframe) in FP2.
We're using C# and the connection string properties for the IBM Data Server Driver for .Net can be found here. I'm sure there is a similar property for their drivers for other languages.
This page from the IBM docs goes into a bit more detail about the setting.
We haven't seen the issue since we recently added the property, but it was always intermittent so I can't yet confidently say that the problem is fixed. Time will tell...
That particular error (SQL30081N) is just a generic message that indicates a network issue between your DB2 client and the server. In this case, you want to look at the Protocol specific error code(s). Here, it looks like you're on Windows, and that particular code (10004) isn't given in the IBM documentation.
So, if you google "windows network error codes", you'll find this page, which says:
WSAEINTR
10004
Interrupted function call.
A blocking operation was interrupted by a call to WSACancelBlockingCall.
Which links to this page with more information on that specific function (emphasis mine):
The WSACancelBlockingCall function has been removed in compliance
with the Windows Sockets 2 specification, revision 2.2.0.
The function is not exported directly by WS2_32.DLL and Windows
Sockets 2 applications should not use this function. Windows Sockets
1.1 applications that call this function are still supported through the WINSOCK.DLL and WSOCK32.DLL.
Blocking hooks are generally used to keep a single-threaded GUI
application responsive during calls to blocking functions. Instead of
using blocking hooks, an applications should use a separate thread
(separate from the main GUI thread) for network activity.
I'm guessing that your application may be blocking for a longer time in your production application than your other environments, and something along the way is causing the interrupt.
Hopefully this leads you down the right path...
I spent hours to solve the same problem and fixed it. I use a Windows exe (developed with C#.NET) to run a SELECT query from a DB2 database and I sometimes got this error. Finally I realized that my problem is a time out error. Error with protocol code "10004" message, sometimes occurs if query execution is longer than 30 seconds which is default timeout value. Maybe the interruption call on the "Windows Socket Error Codes" page occurs for time out mechanism. I add aline to set an acceptable timeout value and got rid off this annoying error. I hope it helps other.
Here is my code fix :
...
connDb.Open();
DB2Command cmdDb = new DB2Command(QueryText,connDb);
cmdDb.CommandTimeout = 300; //I added this line.
using (DB2DataReader readerDb = cmdDb.ExecuteReader())
{
...

Does a 'Broken pipe' exception cancel my job?

Currently I am running a Flink program on a remote cluster of 4 machines using 144 TaskSlots. After running for around 30 minutes I received the following error:
INFO
org.apache.flink.runtime.jobmanager.web.JobManagerInfoServlet - Info
server for jobmanager: Failed to write json updates for job
b2eaff8539c8c9b696826e69fb40ca14, because
org.eclipse.jetty.io.RuntimeIOException:
org.eclipse.jetty.io.EofException at
org.eclipse.jetty.io.UncheckedPrintWriter.setError(UncheckedPrintWriter.java:107)
at
org.eclipse.jetty.io.UncheckedPrintWriter.write(UncheckedPrintWriter.java:280)
at
org.eclipse.jetty.io.UncheckedPrintWriter.write(UncheckedPrintWriter.java:295)
at
org.apache.flink.runtime.jobmanager.web.JobManagerInfoServlet.writeJsonUpdatesForJob(JobManagerInfoServlet.java:588)
at
org.apache.flink.runtime.jobmanager.web.JobManagerInfoServlet.doGet(JobManagerInfoServlet.java:209)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:734) at
javax.servlet.http.HttpServlet.service(HttpServlet.java:847) at
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at
org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
at org.eclipse.jetty.server.Server.handle(Server.java:352) at
org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
at
org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
at
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
at
org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
at
org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
at java.lang.Thread.run(Thread.java:745) Caused by:
org.eclipse.jetty.io.EofException at
org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:905)
at
org.eclipse.jetty.http.AbstractGenerator.flush(AbstractGenerator.java:427)
at org.eclipse.jetty.server.HttpOutput.flush(HttpOutput.java:78) at
org.eclipse.jetty.server.HttpConnection$Output.flush(HttpConnection.java:1139)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:159) at
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:86) at
java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:154)
at org.eclipse.jetty.server.HttpWriter.write(HttpWriter.java:258) at
org.eclipse.jetty.server.HttpWriter.write(HttpWriter.java:107) at
org.eclipse.jetty.io.UncheckedPrintWriter.write(UncheckedPrintWriter.java:271)
... 24 more Caused by: java.io.IOException: Broken pipe at
sun.nio.ch.FileDispatcherImpl.write0(Native Method) at
sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) at
sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) at
sun.nio.ch.IOUtil.write(IOUtil.java:51) at
sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) at
org.eclipse.jetty.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:185)
at
org.eclipse.jetty.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:256)
at
org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:849)
... 33 more
I know that java.io.IOException: Broken pipe means that the JobManager lost some kind of connection so I guess the whole job failed and I have to restart it. Although I think the process is not running anymore the WebInterface still lists it as running. Additionally the JobManager is still present when I use jps to identify my running processes on the cluster. So my question is if my job is lost and whether this error is happening randomly sometimes or whether my program caused it.
EDIT: My TaskManagers still send Heartbeats every few seconds and seem to be running.
It's actually a problem of the JobManagerInfoServlet, Flink's web server, which cannot sent the latest JSON updates of the requested job to your browser because of the java.io.IOException: Broken pipe at sun.nio.ch.FileDispatcherImpl.write0(Native Method). Thus, only the GET request to the server failed.
Such a failure should not affect the execution of the currently running Flink job. Simply refreshing your browser (with Flink's web UI) should send another GET request which then hopefully completes successfully.

Multiple instance writing on same mounted folder using camel-file component

We have requirement to write to single file using multiple instance of camel interface running simultaneously.
The file is on windows shared file system which has been mounted on JBoss server using SMB.
We are using camel file component to write file from each instance as a local file.
Below is the endpoint URI in camel context
file:/fuse/server/location/proc?fileName=abc.csv&fileExist=Append
The file generate has no issues when the write is happening from single instance, but in case of multiple instance it add junk characters to the file at random lines.
We are using JBoss Fuse 6.0.0 and the interface have been written using camel 2.10 version.
How can this be fixed? Is this the issue with SMB mount or the interface need to handle it.
I've had a look at the source code of the relevant camel component (https://github.com/apache/camel/tree/master/camel-core/src/main/java/org/apache/camel/component/file) & there isn't any built in support for concurrent access to a single file from multiple JVMs. Concurrent access from a single JVM is handled though.
I think you have two basic options to address your requirement:
Write some code to support shared access to a single file. The camel file component looks like it was built with extension in mind, or you could just create a standalone component to do this.
As #Namphibian suggests use some queuing system to serialise your writes (though I don't think seda will work, as it doesn't span JVMS.
My solution would be to use ActiveMQ. Each instance of your application would send messages to a single, shared queue. Some other processes would then consume the messages from MQ & write them to disk.
With a single process consuming all MQ messages there would be no concurrent writes to the filesystem.
A more robust solution would to run ActiveMQ in a cluster (possibly with a node in each of your application instances). Look at "JMSXGroupID" to prevent concurrent consumption of messages.

Apache stops processing requests (mod_wsgi?)

At some point my site, running on Apache2 with mod_wsgi just stops processing requests. The connection to server is maintained and client waits for responce, but it never is returned by apache. The server at this time is at 0% CPU, and nothing is processing. I think, apache just sends request to queue and never gets them out of there.
When I perform apache2ctl graceful the problem does not resolve. Only after apache2ctl restart.
My site is a 4 instance wsgi application of Pyramid and 2 instances of Zope 3. It is running normaly and does not have speed problems, that I am aware of.
versions:
Ubuntu 10.04
apache2 2.2.14-5ubuntu8.9
libapache2-mod-wsgi 2.8-2ubuntu1
Sounds like you are using embedded mode to run the multiple applications and you are using third party C extensions that have problems in sub interpreters, resulting in potential deadlock. Else your code is internally deadlocking or blocking on external services and never returning, causing exhaustion of available processes/threads.
For a start, you should look at using daemon mode and delegate each web application to a distinct daemon process group and then forcing each to run in the main interpreter.
See:
http://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
Otherwise use debugging tips described in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
for getting stack traces about what application is doing.

JDBC connection hanging

One of our customers has a newish problem: the application grinds to halt. Thread dump shows that all threads are hanging on network IO in JDBC calls.
We/I have never seen these 'network IO' hangs. Typically a slow machine w/ DB problems has either a) one or two long-running queries or b) some type of lock/deadlock. In either of these cases the threads 'hang' on different methods. I have never seen all 30+ threads hanging on network IO.
Below I have included an excerpt from the thread dump. All HTTP threads are hanging on the same java.net.SocketInputStream.read call.
I talked to their dba and sysadmin. According to them 'nothing has changed' in the environment recently which would cause this problem.
db environment
MSSQL 2005 64-bit Service Pack 2
Driver: sqljdbc.jar : 1.0 809 102
Note: they are running an older jdbc driver. AFAIK they tried upgrading from 1.0 to the 1.2 driver but had some other problem.
other environment issues
They're running both the app server and the db server in VMWare VM's. I don't know how this setup affects network performance.
Apparently this is the only application with this problem. I don't know anything else about their network architecture.
Questions
* any insights on this problem?
* if it is network, any next steps for analyzing?
Appendix A: Excerpt from Thread dump
All HTTP connections are hanging on the same method:
"TP-Processor31" daemon prio=5 tid=0x04085b78 nid=0x970 runnable [0x0764d000..0x0764fd6c]
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.microsoft.sqlserver.jdbc.DBComms.receive(Unknown Source)
at com.microsoft.sqlserver.jdbc.IOBuffer.sendCommand(Unknown Source)
- locked (a com.microsoft.sqlserver.jdbc.DBComms)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.sendExecute(Unknown Source)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteQuery(Unknown Source)
We've had similar issues, and traced them back to a buggy JDK update (1.6.29).
We downloaded 1.6.27 (http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase6-419409.html#jdk-6u27-oth-JPR), re-set the JAVA_HOME environment and we were back on track.
It is the changes to JSSE in version 1.6 u29. The change is to partially patch CVE-2011-3389 and CVE-2011-3560.
If are not deploying applets and using java web-start. You might be able to just use the 1.6 u27 jsse.jar. You're still going to have the vulnerability, but it will allow the sqljdbc.jar and sqljdbc4.jar to work.
The other options to migrate to Java 7, the sqljdbc4.jar does work in that environment.
The patch is not complete fix. So expect more changes in future patches.

Resources