Job failed when index with solr - solr

I have integrated nutch with hadoop and ran a crawl with it. But when I want to create solrindex for the data, I got Job Failed. This is my command:
$bin/nutch solrindex http://namenode:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*
the directory 'crawl' stores the data and it located in HDFS. I just used the example within the solr release. And it works fine when ran the command in single node mode(not in distributed mode). Here is a fragment of output and forgive me for posting it in such a mussy way:
12/04/28 13:51:22 INFO mapred.JobClient: map 100% reduce 23%
12/04/28 13:51:30 INFO mapred.JobClient: Task Id : attempt_201204212112_0076_r_000000_0, Status : FAILED
java.io.IOException
at org.apache.nutch.indexer.solr.SolrWriter.makeIOException(SolrWriter.java:103)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:98)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:93)
... 9 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(Unknown Source)
at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:79)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:121)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:706)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:386)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:422)
... 13 more

Related

Ping Failed - com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure

Trying to connect to through database development(data source explorer) in eclipse and during test connection i found ping failed.
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:836)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:456)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:246)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:197)
at org.eclipse.datatools.connectivity.drivers.jdbc.JDBCConnection.createConnection(JDBCConnection.java:328)
at org.eclipse.datatools.connectivity.DriverConnectionBase.internalCreateConnection(DriverConnectionBase.java:105)
at org.eclipse.datatools.connectivity.DriverConnectionBase.open(DriverConnectionBase.java:54)
at org.eclipse.datatools.connectivity.drivers.jdbc.JDBCConnection.open(JDBCConnection.java:96)
at org.eclipse.datatools.enablement.internal.mysql.connection.JDBCMySQLConnectionFactory.createConnection(JDBCMySQLConnectionFactory.java:28)
at org.eclipse.datatools.connectivity.internal.ConnectionFactoryProvider.createConnection(ConnectionFactoryProvider.java:83)
at org.eclipse.datatools.connectivity.internal.ConnectionProfile.createConnection(ConnectionProfile.java:359)
at org.eclipse.datatools.connectivity.ui.PingJob.createTestConnection(PingJob.java:76)
at org.eclipse.datatools.connectivity.ui.PingJob.run(PingJob.java:59)
at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:91)
at com.mysql.cj.NativeSession.connect(NativeSession.java:144)
at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:956)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:826)
... 13 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:155)
at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:65)
... 16 more

Cant connect to salesforce using kettle spoon

I'm trying to connect using kettle spoon installed in one of our db server but I'm getting the error below. I tried using the same .ktr file on my laptop and it is working fine..
I'm sure that the username and password are correct..
java.lang.Exception:
Error connecting to Salesforce!
; nested exception is:
java.net.ConnectException: Connection timed out: connect
at org.pentaho.di.ui.trans.steps.salesforceinput.SalesforceInputDialog.test(SalesforceInputDialog.java:1396)
at org.pentaho.di.ui.trans.steps.salesforceinput.SalesforceInputDialog.access$2000(SalesforceInputDialog.java:96)
at org.pentaho.di.ui.trans.steps.salesforceinput.SalesforceInputDialog$23.handleEvent(SalesforceInputDialog.java:1229)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.trans.steps.salesforceinput.SalesforceInputDialog.open(SalesforceInputDialog.java:1292)
at org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:136)
at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:7756)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:2756)
at org.pentaho.di.ui.spoon.trans.TransGraph.mouseDoubleClick(TransGraph.java:705)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1183)
at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:6968)
at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:567)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.pentaho.commons.launcher.Launcher.main(Launcher.java:134)
seems kettle is unable to reach the salesforce endpoint on the network.
try from the server where is kettle, open a console and try:
curl -i 'https://yourSalesforceWebservice.foo/services/Soap/u/21.0'
or
traceroute yourSalesforceWebservice.foo
(put the domain or the ip )

Solr setting up standalone zooKeeper

Im trying to setup solr cluster on jetty but with a separate standalone(not ensemble zookeeper). I manage to start zooKeeper, but I cannot upload config to zooKeeper - I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFacto
ry
at org.apache.solr.common.cloud.SolrZkClient.<clinit>(SolrZkClient.java:
66)
at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:163)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 2 more
I use cloud-scripts - zkcli.bat. Any idea why???
You have to provide one of the various SLF4J implementation .jar files in the classpath..
Try adding slf4j-log4j12.jar and slf4j-api-1.6.6.jar to lib directory of Jetty.

Google App Engine error

I get an error while trying an application on Google App Engine as below...any idea why?
My hunch is since I'm behind a proxy, I'm not able to connect to the appengine. If thats the case how do i set it up?
Initializing App Engine server
Apr 25, 2012 1:07:32 PM com.google.appengine.tools.info.RemoteVersionFactory getVersion
INFO: Unable to access http://appengine.google.com/api/updatecheck?runtime=java&release=1.5.4&timestamp=1315604504&api_versions=['1.0']
java.net.ConnectException: Connection refused: connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(Unknown Source)
at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
at com.google.appengine.tools.info.RemoteVersionFactory.getVersion(RemoteVersionFactory.java:76)
at com.google.appengine.tools.info.UpdateCheck.checkForUpdates(UpdateCheck.java:99)
at com.google.appengine.tools.info.UpdateCheck.doNagScreen(UpdateCheck.java:174)
at com.google.appengine.tools.info.UpdateCheck.maybePrintNagScreen(UpdateCheck.java:142)
at com.google.appengine.tools.development.gwt.AppEngineLauncher.maybePerformUpdateCheck(AppEngineLauncher.java:115)
at com.google.appengine.tools.development.gwt.AppEngineLauncher.start(AppEngineLauncher.java:81)
at com.google.gwt.dev.DevMode.doStartUpServer(DevMode.java:509)
at com.google.gwt.dev.DevModeBase.startUp(DevModeBase.java:1068)
at com.google.gwt.dev.DevModeBase.run(DevModeBase.java:811)
at com.google.gwt.dev.DevMode.main(DevMode.java:311)
By default the development server checks for updates during startup, you can disable it using --disable_update_check
You can use the appcfg command from appengine-sdk to pass the proxy. For me it worked.
Example :
appcfg --proxy=10.35.71.201:8080 update ../target/exploded_war
At the end of https://code.google.com/p/googleappengine/issues/detail?id=3131
Solution: create a file with name: .appcfg_no_nag
In the user path System.getProperty("user.home"), disable the update action.

SOLR DIH, error 500

First of all there is a conflict between Solr and MSSQL server in using port 8080.
So I changed the Tomcat port to the default Solr port 8983.
Then the problem is when I install the adventure work data base and
then do all the configuration according to this tutorial:
http://www.chrisumbel.com/article/lucene_solr_sql_server
I then get this error and it does not even open the admin page.
HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml ------------------------------------------------------------- org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField implemented using StrField at org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:158) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:522) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:594) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94) at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372) at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:98) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4584) at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5262) at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5257) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) ------------------------------------------------------------- org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:423) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:459) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:563) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94) at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372) at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:98) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4584) at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5262) at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5257) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler at java.net.URLClassLoader$1.run(Unknown Source) at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.net.FactoryURLClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373) ... 21 more
Read first lines of log. Your configuration is missing uniqueKey field. How to set read at:
Solr: QueryElevationComponent requires StrField uniqueKeyField error
Issue about Schema.xml uniqueKey field
SchemaXml - The_Unique_Key_Field
DataImportHandler - Example

Resources