Camel service call giving 400 error - apache-camel

I am using Camel serviceCall to call my Eureka VIP. The host and port are resolved as expected, but the response is 400 and an error is thrown. My serviceCall looks like the one below:
from("direct:servicecall")
.setBody().constant(null)
.setHeader(Exchange.HTTP_METHOD, constant(HttpMethods.GET))
.serviceCall().name("eureka_vip_name")
.expression()
.simple("http4:${header.CamelServiceCallServiceHost}:${header.CamelServiceCallServicePort}"
+ "service call path and parameters")
.end()
Here are the error logs:
org.apache.camel.http.common.HttpOperationFailedException: HTTP operation failed invoking http://172.31.20.241:8080/<------url path and parameteres------> with statusCode: 400
at org.apache.camel.component.http4.HttpProducer.populateHttpOperationFailedException(HttpProducer.java:312) ~[camel-http4-2.20.1.jar!/:2.20.1]
at org.apache.camel.component.http4.HttpProducer.process(HttpProducer.java:207) ~[camel-http4-2.20.1.jar!/:2.20.1]
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.processor.SendDynamicProcessor$1.doInAsyncProducer(SendDynamicProcessor.java:132) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.impl.ProducerCache.doInAsyncProducer(ProducerCache.java:445) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.processor.SendDynamicProcessor.process(SendDynamicProcessor.java:127) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.impl.cloud.DefaultServiceCallProcessor.execute(DefaultServiceCallProcessor.java:184) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.impl.cloud.DefaultServiceCallProcessor.lambda$process$0(DefaultServiceCallProcessor.java:164) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.spring.cloud.CamelSpringCloudServiceLoadBalancer.lambda$process$0(CamelSpringCloudServiceLoadBalancer.java:66) ~[camel-spring-cloud-2.20.1.jar!/:2.20.1]
at org.springframework.cloud.netflix.ribbon.RibbonLoadBalancerClient.execute(RibbonLoadBalancerClient.java:98) ~[spring-cloud-netflix-core-1.3.4.RELEASE.jar!/:1.3.4.RELEASE]
at org.springframework.cloud.netflix.ribbon.RibbonLoadBalancerClient.execute(RibbonLoadBalancerClient.java:80) ~[spring-cloud-netflix-core-1.3.4.RELEASE.jar!/:1.3.4.RELEASE]
at org.apache.camel.spring.cloud.CamelSpringCloudServiceLoadBalancer.process(CamelSpringCloudServiceLoadBalancer.java:66) ~[camel-spring-cloud-2.20.1.jar!/:2.20.1]
at org.apache.camel.impl.cloud.DefaultServiceCallProcessor.process(DefaultServiceCallProcessor.java:164) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:76) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.processor.RedeliveryErrorHandler$AsyncRedeliveryTask.call(RedeliveryErrorHandler.java:203) ~[camel-core-2.20.1.jar!/:2.20.1]
at org.apache.camel.processor.RedeliveryErrorHandler$AsyncRedeliveryTask.call(RedeliveryErrorHandler.java:171) ~[camel-core-2.20.1.jar!/:2.20.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_161]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_161]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
I tried hitting the same URL from Postman (Chrome extension), using the EC2 instance DNS name as the host, and it gives the expected response: 200 OK with the expected body.
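Since Postman with the same host succeeds, the 400 likely comes from how the URL is assembled rather than from the service itself. Below is a minimal sketch (the /my/service/path and query values are hypothetical placeholders for the real ones) that moves the path and query string into Camel's dedicated HTTP headers, so the http4 producer handles the encoding instead of it being concatenated into the endpoint URI:
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.http4.HttpMethods;

public class ServiceCallDebugRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("direct:servicecall")
            .setBody().constant(null)
            .setHeader(Exchange.HTTP_METHOD, constant(HttpMethods.GET))
            // Hypothetical path and query; the http4 producer appends
            // CamelHttpPath to the endpoint and uses CamelHttpQuery as the
            // query string, encoding them consistently.
            .setHeader(Exchange.HTTP_PATH, constant("/my/service/path"))
            .setHeader(Exchange.HTTP_QUERY, constant("param1=value1&param2=value2"))
            .serviceCall().name("eureka_vip_name")
                .expression()
                .simple("http4:${header.CamelServiceCallServiceHost}:${header.CamelServiceCallServicePort}")
            .end();
    }
}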

Related

Error: java.io.IOException: UT000036: Connection terminated parsing multipart data

I'm working with React JS and JHipster (Spring Boot), and I'm trying to upload a file using MultipartFile.
The request method is POST, with Content-Type: multipart/form-data; boundary=----WebKitFormBoundarykyl9Y0dZ1t72LO7Q, and the server runs behind nginx.
Here is the code. The controller:
@PostMapping(value = "/uploadCustomer", consumes = "multipart/form-data")
public void uploadMultipart(@RequestPart("file") MultipartFile file, @RequestPart("tags") String tags) {
    log.debug("tags {}", tags);
    myService.saveCustomerFromCSV(file, tags);
}
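To rule the controller in or out, one option is to exercise it directly with Spring's MockMvc, bypassing nginx and the gateway entirely. A minimal sketch, assuming a standard Spring Boot test setup and a hypothetical CSV payload (the part names match the @RequestPart names above):
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.mock.web.MockMultipartFile;
import org.springframework.test.context.junit4.SpringRunner;
import org.springframework.test.web.servlet.MockMvc;

import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.multipart;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

@RunWith(SpringRunner.class)
@SpringBootTest
@AutoConfigureMockMvc
public class UploadCustomerResourceIT {

    @Autowired
    private MockMvc mockMvc;

    @Test
    public void uploadsCsvWithTags() throws Exception {
        // Hypothetical CSV content; "file" and "tags" must match the
        // @RequestPart names in the controller above.
        MockMultipartFile file = new MockMultipartFile(
                "file", "customers.csv", "text/csv", "id,name\n1,Alice\n".getBytes());
        MockMultipartFile tags = new MockMultipartFile(
                "tags", "", "text/plain", "import,csv".getBytes());

        mockMvc.perform(multipart("/uploadCustomer").file(file).file(tags))
               .andExpect(status().isOk());
    }
}
If this passes while the end-to-end upload fails, the problem sits in front of the controller (gateway or nginx), which is where the solution below ended up.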
Part of the configuration file:
spring:
  servlet:
    multipart:
      enabled: false
      file-size-threshold: 2KB
      max-file-size: 15MB
      max-request-size: 50MB
ribbon:
  eureka:
    enabled: true
  ReadTimeout: 300000
  connection-timeout: 30000
# See http://cloud.spring.io/spring-cloud-netflix/spring-cloud-netflix.html
zuul: # those values must be configured depending on the application specific needs
  sensitive-headers: Cookie,Set-Cookie # see https://github.com/spring-cloud/spring-cloud-netflix/issues/3126
  ignored-headers: Access-Control-Allow-Credentials, Access-Control-Allow-Origin
  host:
    max-total-connections: 1000
    max-per-route-connections: 100
    connect-timeout-millis: 60000
    socket-timeout-millis: 60000
  semaphore:
    max-semaphores: 500
But with Undertow, it throws a RuntimeException. The exception message:
java.lang.RuntimeException: java.io.IOException: UT000036: Connection terminated parsing multipart data
at io.undertow.servlet.spec.HttpServletRequestImpl.parseFormData(HttpServletRequestImpl.java:798)
at io.undertow.servlet.spec.HttpServletRequestImpl.getParameter(HttpServletRequestImpl.java:665)
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:84)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.filterAndRecordMetrics(WebMvcMetricsFilter.java:117)
at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:106)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:65)
at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:336)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: UT000036: Connection terminated parsing multipart data
at io.undertow.server.handlers.form.MultiPartParserDefinition$MultiPartUploadHandler.parseBlocking(MultiPartParserDefinition.java:228)
at io.undertow.servlet.spec.HttpServletRequestImpl.parseFormData(HttpServletRequestImpl.java:792)
... 42 common frames omitted
Solution:
In my case, the solution was to remove the Spring servlet multipart configuration from the gateway. The stack trace suggests why: HiddenHttpMethodFilter calls getParameter(), which forces Undertow to parse the multipart body at the gateway, where multipart handling had been disabled.
#spring:
#  servlet:
#    multipart:
#      enabled: false
#      file-size-threshold: 2KB
#      max-file-size: 15MB
#      max-request-size: 50MB

Issue with Apache Camel HTTPS REST API with username and password

I have the following piece of code, built to connect to an HTTPS REST endpoint using Apache Camel. The problem is that I get a 401 error when it runs.
from("timer:learnTimer?period=100s")
.to("log:?level=INFO&showBody=true")
.setHeader("currentTime", simple(currentTime))
.setHeader(Exchange.CONTENT_TYPE,constant("application/json"))
.setHeader(Exchange.HTTP_METHOD, constant("GET"))
.setHeader(Exchange.HTTP_URI, simple("https://xxxxxx/api/siem/offenses?filter=status%20%3D%20%22OPEN%22%20and%20start_time%20%3E%201543647979000?&authMethod=Basic&authUsername=xxxxx&authPassword=xxxxx"))
.to("https://xxxxxxx/api/siem/offenses?filter=status%20%3D%20%22OPEN%22%20and%20start_time%20%3E%201543647979000?&authMethod=Basic&authUsername=xxxx&authPassword=xxxx").convertBodyTo(String.class)
.to("log:?level=INFO&showBody=true");
The error I am receiving is the following stack trace:
org.apache.camel.http.common.HttpOperationFailedException: HTTP operation failed invoking https://xx.xx.xx.xx/api/siem/offenses?filter=status+%3D+%22OPEN%22+and+start_time+%3E+1543647979000%3F with statusCode: 401
at org.apache.camel.component.http.HttpProducer.populateHttpOperationFailedException(HttpProducer.java:243)
at org.apache.camel.component.http.HttpProducer.process(HttpProducer.java:165)
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:148)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:548)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:138)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:101)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201)
at org.apache.camel.component.timer.TimerConsumer.sendTimerExchange(TimerConsumer.java:197)
at org.apache.camel.component.timer.TimerConsumer$1.run(TimerConsumer.java:79)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
15:16| WARN | CamelLogger.java 213 | Error processing exchange. Exchange[ID-zabbixproxy-node2-1544019394005-0-1]. Caused by: [org.apache.camel.http.common.HttpOperationFailedException - HTTP operation failed invoking https://xx.xx.xx.xx/api/siem/offenses?filter=status+%3D+%22OPEN%22+and+start_time+%3E+1543647979000%3F with statusCode: 401]
org.apache.camel.http.common.HttpOperationFailedException: HTTP operation failed invoking https://10.96.40.66/api/siem/offenses?filter=status+%3D+%22OPEN%22+and+start_time+%3E+1543647979000%3F with statusCode: 401
at org.apache.camel.component.http.HttpProducer.populateHttpOperationFailedException(HttpProducer.java:243)
at org.apache.camel.component.http.HttpProducer.process(HttpProducer.java:165)
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:148)
Are you sure you should set these headers before making the REST call? Unnecessary request headers in the IN message may cause issues.
Exchange exchange = ExchangeBuilder.anExchange(camelContext)
        .withPattern(ExchangePattern.InOut)               // set the exchange pattern you need
        .withHeader(Exchange.HTTP_METHOD, HttpMethod.GET) // add only the headers and properties you actually need
        .build();

producer.send("the rest endpoint", exchange); // producer is a ProducerTemplate
In the code above you set the ExchangePattern and only the required headers and properties.
Hope this helps.
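Note also that the failing URL in the stack trace ends in %3F, an encoded "?": the stray "?" before &authMethod=... means the auth options were swallowed into the query string instead of being read as endpoint options. A hedged sketch (credentials and host are placeholders, matching the masked values in the question) that drops the duplicated HTTP_URI header and sends Basic auth explicitly as a header:
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

public class OffensesRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Placeholder credentials, Base64-encoded once for an explicit Basic auth header.
        String token = Base64.getEncoder()
                .encodeToString("username:password".getBytes(StandardCharsets.UTF_8));

        from("timer:learnTimer?period=100s")
            .setHeader(Exchange.CONTENT_TYPE, constant("application/json"))
            .setHeader(Exchange.HTTP_METHOD, constant("GET"))
            .setHeader("Authorization", constant("Basic " + token))
            // One URI only (no duplicated HTTP_URI header) and no stray '?' before the options.
            .to("https://xxxxxx/api/siem/offenses?filter=status%20%3D%20%22OPEN%22%20and%20start_time%20%3E%201543647979000")
            .convertBodyTo(String.class)
            .to("log:?level=INFO&showBody=true");
    }
}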

Flink 1.5.0-SNAPSHOT RestClient - Received response is abnormal

The application's job can be submitted to the standalone cluster via the Spring Boot application's main method in the development environment (e.g., Eclipse Luna), but mvn spring-boot:run prints the exception below. The bug occurs on CentOS 6.8 with JDK 1.8.0_161; the system prints the following exception log:
09:07:20.755 tysc_log [Flink-RestClusterClient-IO-thread-4] ERROR o.a.flink.runtime.rest.RestClient - Received response was neither of the expected type ([simple type, class org.apache.flink.runtime.rest.messages.job.JobExecutionResultResponseBody]) nor an error. Response=org.apache.flink.runtime.rest.RestClient$JsonResponse@2ac43968
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "status" (class org.apache.flink.runtime.rest.messages.ErrorResponseBody), not marked as ignorable (one known property: "errors"])
at [Source: N/A; line: -1, column: -1] (through reference chain: org.apache.flink.runtime.rest.messages.ErrorResponseBody["status"])
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:62)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.DeserializationContext.reportUnknownProperty(DeserializationContext.java:851)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1085)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1392)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperties(BeanDeserializerBase.java:1346)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:455)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1127)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:298)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:133)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3779)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2050)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:2547)
at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:225)
at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:210)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Error integrating Solr 5.5.0 with Nutch 1.13: 'Connection pool shut down'

I had a problem when I tried to integrate Solr with Nutch:
Nutch version: 1.13
Solr version: 5.5.0 (as recommended by the official documentation: https://wiki.apache.org/nutch/NutchTutorial#Verify_your_Nutch_installation)
The error is:
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance
solr.zookeeper.hosts : URL of the Zookeeper quorum
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
Indexer: number of documents indexed, deleted, or skipped:
Indexer: finished at 2017-11-30 01:34:49, elapsed: 00:00:01
Cleaning up index if possible
apache-nutch-1.13/bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawling_dir/crawldb
SolrIndexer: deleting 1/1 documents
ERROR CleaningJob: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)
Error running:
apache-nutch-1.13/bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawling_dir/crawldb
Failed with exit value 255.
on the log file:
2017-11-30 01:34:50,851 WARN output.FileOutputCommitter - Output Path is null in cleanupJob()
2017-11-30 01:34:50,851 WARN mapred.LocalJobRunner - job_local531807742_0001
java.lang.Exception: java.lang.IllegalStateException: Connection pool shut down
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IllegalStateException: Connection pool shut down
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:169)
at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:202)
at org.apache.http.impl.conn.PoolingClientConnectionManager.requestConnection(PoolingClientConnectionManager.java:184)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:481)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:482)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:463)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:191)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:179)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:117)
at org.apache.nutch.indexer.CleaningJob$DeleterReducer.close(CleaningJob.java:122)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2017-11-30 01:34:51,458 ERROR indexer.CleaningJob - CleaningJob: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)
Do you have any idea?
I had the same problem, and yours is probably due to the same cause:
https://issues.apache.org/jira/browse/NUTCH-2269
Try applying the patch and the error should be gone.
From my findings, it appears to be a bug. Here's a blog post that explains it well: https://reformatcode.com/code/apache-configuration/apache-nutch-112-with-apache-solr-621-give-an-error
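For context, the trace above shows SolrIndexWriter.close() issuing a commit against an HttpSolrClient whose connection pool has already been shut down. The following is only a hedged illustration of that failure mode and a defensive pattern around it, not the actual NUTCH-2269 patch:
import java.io.Closeable;
import java.io.IOException;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;

public class GuardedSolrWriter implements Closeable {
    private final SolrClient client;
    private boolean closed;

    public GuardedSolrWriter(SolrClient client) {
        this.client = client;
    }

    @Override
    public synchronized void close() throws IOException {
        if (closed) {
            return; // a second close (e.g. from IOUtils.cleanup) becomes a no-op
        }
        closed = true;
        try {
            client.commit(); // commit while the connection pool is still open
        } catch (SolrServerException e) {
            throw new IOException(e);
        } finally {
            client.close();
        }
    }
}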

NUTCH 1.13 fetch of url failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=http

fetch of httpurl failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=http
at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:85)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:285)
Using queue mode : byHost
fetch of httpsurl failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=https
at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:85)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:285)
I get the above result while running Nutch 1.13 with Solr 6.6.0. The command I used is:
bin/crawl -i -D solr.server.url=http://myip/solr/nutch/ urls/ crawl 2
Below is the plugin section in my nutch-site.xml:
<name>plugin.includes</name>
<value>
protocol-(http|httpclient)|urlfilter-regex|parse-(html)|index-(basic|anchor)|indexer-solr|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)
</value>
Below are my file contents:
[root@localhost apache-nutch-1.13]# ls plugins
creativecommons index-more nutch-extensionpoints protocol-file scoring-similarity urlnormalizer-ajax
feed index-replace parse-ext protocol-ftp subcollection urlnormalizer-basic
headings index-static parsefilter-naivebayes protocol-htmlunit tld urlnormalizer-host
index-anchor language-identifier parsefilter-regex protocol-http urlfilter-automaton urlnormalizer-pass
index-basic lib-htmlunit parse-html protocol-httpclient urlfilter-domain urlnormalizer-protocol
indexer-cloudsearch lib-http parse-js protocol-interactiveselenium urlfilter-domainblacklist urlnormalizer-querystring
indexer-dummy lib-nekohtml parse-metatags protocol-selenium urlfilter-ignoreexempt urlnormalizer-regex
indexer-elastic lib-regex-filter parse-replace publish-rabbitmq urlfilter-prefix urlnormalizer-slash
indexer-solr lib-selenium parse-swf publish-rabitmq urlfilter-regex
index-geoip lib-xml parse-tika scoring-depth urlfilter-suffix
index-links microformats-reltag parse-zip scoring-link urlfilter-validator
index-metadata mimetype-filter plugin scoring-opic urlmeta
I'm stuck on this issue. As you can see, I have included both protocol-(http|httpclient), but fetching the URL still fails. Thanks in advance.
NEWER ISSUE: hadoop.log
2017-09-01 14:35:07,172 INFO solr.SolrIndexWriter - SolrIndexer: deleting 1/1 documents
2017-09-01 14:35:07,321 WARN output.FileOutputCommitter - Output Path is null in cleanupJob()
2017-09-01 14:35:07,323 WARN mapred.LocalJobRunner - job_local1176811933_0001
java.lang.Exception: java.lang.IllegalStateException: Connection pool shut down
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IllegalStateException: Connection pool shut down
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:169)
at org.apache.http.pool.AbstractConnPool.lease(AbstractConnPool.java:202)
at org.apache.http.impl.conn.PoolingClientConnectionManager.requestConnection(PoolingClientConnectionManager.java:184)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:481)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:482)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:463)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:191)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:179)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:117)
at org.apache.nutch.indexer.CleaningJob$DeleterReducer.close(CleaningJob.java:122)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2017-09-01 14:35:07,679 ERROR indexer.CleaningJob - CleaningJob: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.CleaningJob.delete(CleaningJob.java:174)
at org.apache.nutch.indexer.CleaningJob.run(CleaningJob.java:197)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.CleaningJob.main(CleaningJob.java:208)
I somehow solved the issue. I think the whitespace in nutch-site.xml was causing the problem. Here is the new plugin.includes section, for others coming here:
<name>plugin.includes</name>
<value>protocol-http|protocol-httpclient|urlfilter-regex|parse-(html)|index-(basic|anchor)|indexer-solr|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
