Error exporting data from Google Cloud Bigtable

While going through the Google docs, I'm getting the below stack trace on the final export command (executed from the master instance with appropriate env variables set).
${HADOOP_HOME}/bin/hadoop jar ${HADOOP_BIGTABLE_JAR} export-table -libjars ${HADOOP_BIGTABLE_JAR} <table-name> <gs://bucket>
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-install/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-install/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016-02-08 23:39:39,068 INFO [main] mapreduce.Export: versions=1, starttime=0, endtime=9223372036854775807, keepDeletedCells=false
2016-02-08 23:39:39,213 INFO [main] gcs.GoogleHadoopFileSystemBase: GHFS version: 1.4.4-hadoop2
java.lang.IllegalAccessError: tried to access field sun.security.ssl.Handshaker.localSupportedSignAlgs from class sun.security.ssl.ClientHandshaker
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:278)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:913)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:849)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1035)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:93)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:972)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getBucket(GoogleCloudStorageImpl.java:1599)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1554)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage.getItemInfo(CacheSupplementedGoogleCloudStorage.java:547)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1042)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.exists(GoogleCloudStorageFileSystem.java:383)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configureBuckets(GoogleHadoopFileSystemBase.java:1650)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.configureBuckets(GoogleHadoopFileSystem.java:71)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1598)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:783)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:746)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:104)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:241)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:509)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:207)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:168)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:291)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:92)
at org.apache.hadoop.hbase.mapreduce.IdentityTableMapper.initJob(IdentityTableMapper.java:51)
at org.apache.hadoop.hbase.mapreduce.Export.createSubmittableJob(Export.java:75)
at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:187)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
at com.google.cloud.bigtable.mapreduce.Driver.main(Driver.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Here's my environment variable setup, in case it's helpful:
export HBASE_HOME=/home/hadoop/hbase-install
export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath`
export HADOOP_HOME=/home/hadoop/hadoop-install
export HADOOP_CLIENT_OPTS="-Xbootclasspath/p:${HBASE_HOME}/lib/bigtable/alpn-boot-7.1.3.v20150130.jar"
export HADOOP_BIGTABLE_JAR=${HBASE_HOME}/lib/bigtable/bigtable-hbase-mapreduce-0.2.2-shaded.jar
export HADOOP_HBASE_JAR=${HBASE_HOME}/lib/hbase-server-1.1.2.jar
Also, when I run hbase shell and then list tables, it just hangs and never returns the list of tables. This is what happens:
~$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-install/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-install/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016-02-09 00:02:01,334 INFO [main] grpc.BigtableSession: Opening connection for projectId mystical-height-89421, zoneId us-central1-b, clusterId twitter-data, on data host bigtable.googleapis.com, table admin host bigtabletableadmin.googleapis.com.
2016-02-09 00:02:01,358 INFO [BigtableSession-startup-0] grpc.BigtableSession: gRPC is using the JDK provider (alpn-boot jar)
2016-02-09 00:02:01,648 INFO [bigtable-connection-shared-executor-pool1-t2] io.RefreshingOAuth2CredentialsInterceptor: Refreshing the OAuth token
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
hbase(main):001:0> list
TABLE
I've tried:
- Double-checking that ALPN and the environment variables are set appropriately.
- Double-checking hbase-site.xml and hbase-env.sh to make sure nothing looks wrong.
I also tried connecting to my cluster (as I was previously able to do following these directions) from ANOTHER gcloud instance, but I can't seem to get that to work now either... (it also hangs)
user@gcloud-instance:hbase-1.1.2$ bin/hbase shell
2016-02-09 00:07:03,506 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-02-09 00:07:03,913 INFO [main] grpc.BigtableSession: Opening connection for projectId <project>, zoneId us-central1-b, clusterId <cluster>, on data host bigtable.googleapis.com, table admin host bigtabletableadmin.googleapis.com.
2016-02-09 00:07:04,039 INFO [BigtableSession-startup-0] grpc.BigtableSession: gRPC is using the JDK provider (alpn-boot jar)
2016-02-09 00:07:05,138 INFO [Credentials-Refresh-0] io.RefreshingOAuth2CredentialsInterceptor: Refreshing the OAuth token
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
hbase(main):001:0> list
TABLE
Feb 09, 2016 12:07:08 AM com.google.bigtable.repackaged.io.grpc.internal.TransportSet$1 run
INFO: Created transport com.google.bigtable.repackaged.io.grpc.netty.NettyClientTransport@7b480442(bigtabletableadmin.googleapis.com/64.233.183.219:443) for bigtabletableadmin.googleapis.com/64.233.183.219:443
Any ideas about what I'm doing wrong? It looks like an access issue - how do I fix it?
Thanks!

You can spin up a Dataproc cluster with Bigtable enabled by following these instructions, then:
1. ssh to the master with ./cluster.sh ssh.
2. Run hbase shell to verify that all is in order.
3. Run hadoop jar ${HADOOP_BIGTABLE_JAR} export-table -libjars ${HADOOP_BIGTABLE_JAR} <table-name> gs://<bucket>/some-folder.
4. Run gsutil ls gs://<bucket>/some-folder/** and check whether _SUCCESS exists. If so, the remaining files are your data.
5. exit from your cluster master.
6. Run ./cluster.sh delete to get rid of the cluster, if you no longer require it.
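Putting those steps together, a minimal sketch of the whole sequence (assuming the cluster.sh helper from the referenced instructions; the table and bucket names are placeholders):
./cluster.sh ssh                      # log in to the cluster master

# On the master: sanity-check with the HBase shell, then run the export.
echo 'list' | hbase shell             # the table should appear in the output
hadoop jar ${HADOOP_BIGTABLE_JAR} export-table \
  -libjars ${HADOOP_BIGTABLE_JAR} my-table gs://my-bucket/some-folder

# A _SUCCESS marker means the export completed; the other files are the data.
gsutil ls gs://my-bucket/some-folder/**

exit                                  # leave the master
./cluster.sh delete                   # tear the cluster down when done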
You ran into a problem with the weekly Java runtime update, which has since been corrected.

Related

Unable to use extended_choice_parameter.ExtendedChoiceParameterValue for security reasons in jenkins

Please find the error below, which we get while trying to pass the parameters during a build:
java.lang.UnsupportedOperationException: Refusing to marshal com.cwctravel.hudson.plugins.extended_choice_parameter.ExtendedChoiceParameterValue for security reasons; see https://jenkins.io/redirect/class-filter/
at hudson.util.XStream2$BlacklistedTypesConverter.marshal(XStream2.java:541)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:88)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.writeItem(AbstractCollectionConverter.java:64)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.marshal(CollectionConverter.java:74)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:84)
at hudson.util.RobustReflectionConverter.marshallField(RobustReflectionConverter.java:264)
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:251)
Caused: java.lang.RuntimeException: Failed to serialize hudson.model.ParametersAction#parameters for class hudson.model.ParametersAction
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:255)
at hudson.util.RobustReflectionConverter$2.visit(RobustReflectionConverter.java:223)
at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.visitSerializableFields(PureJavaReflectionProvider.java:138)
at hudson.util.RobustReflectionConverter.doMarshal(RobustReflectionConverter.java:209)
at hudson.util.RobustReflectionConverter.marshal(RobustReflectionConverter.java:150)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:88)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.writeItem(AbstractCollectionConverter.java:64)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.marshal(CollectionConverter.java:74)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller$1.convertAnother(AbstractReferenceMarshaller.java:84)
at hudson.util.RobustReflectionConverter.marshallField(RobustReflectionConverter.java:264)
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:251)
Caused: java.lang.RuntimeException: Failed to serialize hudson.model.Actionable#actions for class org.jenkinsci.plugins.workflow.job.WorkflowRun
at hudson.util.RobustReflectionConverter$2.writeField(RobustReflectionConverter.java:255)
at hudson.util.RobustReflectionConverter$2.visit(RobustReflectionConverter.java:223)
at com.thoughtworks.xstream.converters.reflection.PureJavaReflectionProvider.visitSerializableFields(PureJavaReflectionProvider.java:138)
at hudson.util.RobustReflectionConverter.doMarshal(RobustReflectionConverter.java:209)
at hudson.util.RobustReflectionConverter.marshal(RobustReflectionConverter.java:150)
at com.thoughtworks.xstream.core.AbstractReferenceMarshaller.convert(AbstractReferenceMarshaller.java:69)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:58)
at com.thoughtworks.xstream.core.TreeMarshaller.convertAnother(TreeMarshaller.java:43)
at com.thoughtworks.xstream.core.TreeMarshaller.start(TreeMarshaller.java:82)
at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.marshal(AbstractTreeMarshallingStrategy.java:37)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:1026)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:1015)
at com.thoughtworks.xstream.XStream.toXML(XStream.java:988)
at hudson.XmlFile.write(XmlFile.java:195)
at hudson.model.Run.save(Run.java:2077)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl.forRun(EnvActionImpl.java:136)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:149)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:142)
at org.jenkinsci.plugins.workflow.cps.CpsScript.getProperty(CpsScript.java:121)
at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:174)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:456)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:355)
at org.kohsuke.groovy.sandbox.GroovyInterceptor.onGetProperty(GroovyInterceptor.java:68)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:354)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:353)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:357)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
Caused: java.io.IOException
at hudson.XmlFile.write(XmlFile.java:202)
at hudson.model.Run.save(Run.java:2077)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl.forRun(EnvActionImpl.java:136)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:149)
at org.jenkinsci.plugins.workflow.cps.EnvActionImpl$Binder.getValue(EnvActionImpl.java:142)
at org.jenkinsci.plugins.workflow.cps.CpsScript.getProperty(CpsScript.java:121)
at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:174)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:456)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:355)
at org.kohsuke.groovy.sandbox.GroovyInterceptor.onGetProperty(GroovyInterceptor.java:68)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:354)
at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:353)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:357)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:333)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
at WorkflowScript.run(WorkflowScript:8)
at ___cps.transform___(Native Method)
at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.get(PropertyishBlock.java:74)
at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.fixName(PropertyishBlock.java:66)
at jdk.internal.reflect.GeneratedMethodAccessor629.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
at com.cloudbees.groovy.cps.Next.step(Next.java:83)
at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:174)
at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:163)
at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:129)
at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:268)
at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:163)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:19)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:35)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable$1.call(SandboxContinuable.java:32)
at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.GroovySandbox.runInSandbox(GroovySandbox.java:237)
at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:32)
at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:174)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:331)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.access$100(CpsThreadGroup.java:82)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:243)
at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:231)
at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:64)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:136)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Finished: FAILURE
We have started to see a security issue related to extended choice parameters in Jenkins, which take a JSON array as input. Please find the error above. Can you please help with this? FYI, there was no Jenkins or plugin update done recently. This started happening all of a sudden (2 weeks back), and all the "extended choice parameters" got deleted without any clue in the Jenkins build.
Extended Choice Parameter plugin version - 0.78
Jenkins version - 2.263.1
The issue was fixed by replacing the Extended Choice Parameter jar file with the latest version, which has a fix for the extended_choice_parameter.ExtendedChoiceParameterValue parameters. Also, make sure to whitelist the class via JDK_JAVA_OPTIONS in the .bash_profile of your Jenkins machine, as sketched below.
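For reference, a sketch of that whitelist entry with the quoting fixed (this assumes the hudson.remoting.ClassFilter system property mechanism and a JDK that honors JDK_JAVA_OPTIONS; adjust for your environment):
# In the .bash_profile of the user running Jenkins: whitelist the parameter
# value class so XStream will serialize it again.
export JDK_JAVA_OPTIONS="-Dhudson.remoting.ClassFilter=com.cwctravel.hudson.plugins.extended_choice_parameter.ExtendedChoiceParameterValue"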

[Flink] Task manager initialization failed

I am new to Flink. I am trying to run the Flink example on my local PC (Windows).
However, after I run start-cluster.bat and log in to the dashboard, it shows 0 task managers.
I checked the log, and it seems the task manager fails to initialize:
2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:72)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerSecurely$2(TaskManagerRunner.java:322)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:321)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) is not set
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
at java.util.Arrays$ArrayList.forEach(Arrays.java:3880)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
... 7 more
2020-02-21 23:03:14,217 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Basically, it looks like the required option 'taskmanager.cpu.cores' is not set. However, I can't find this property in flink-conf.yaml or in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html) either.
I am using flink 1.10.0. Any help would be highly appreciated!
That configuration option is intended for internal use only -- it shouldn't be user configured, which is why it isn't documented.
The Windows start-cluster.bat is failing because of a bug introduced in Flink 1.10. See https://jira.apache.org/jira/browse/FLINK-15925.
One workaround is to use the bash script, start-cluster.sh, instead.
See also this mailing list thread: https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E
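As a sketch of that workaround (assuming a bash environment on Windows such as WSL, Git Bash, or Cygwin; the version in the path is illustrative):
# Use the shell scripts instead of start-cluster.bat:
cd flink-1.10.0
./bin/start-cluster.sh    # starts a local JobManager and TaskManager
# The dashboard at http://localhost:8081 should now show one task manager.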

How to monitor Apache Flink in AWS EMR (ElasticMapReduce)?

I currently have Flink set up with a job running on EMR, and I'm now trying to add monitoring by sending metrics off to Prometheus.
I have come across an issue with running Flink on EMR. I'm using Terraform to provision EMR (I run Ansible afterwards to download and run a job). Out of the box, it does not look like EMR's Flink distribution includes the optional jars (flink-metrics-prometheus, flink-cep, etc.).
Looking at Flink's documentation, it says
"In order to use this reporter you must copy /opt/flink-metrics-prometheus-1.6.1.jar into the /lib folder of your Flink distribution"
https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/metrics.html#prometheuspushgateway-orgapacheflinkmetricsprometheusprometheuspushgatewayreporter
But when logging into the EMR master node, neither /etc/flink nor /usr/lib/flink has a directory called opt, and I cannot see flink-metrics-prometheus-1.6.1.jar anywhere.
I know Flink has other optional libs you'd usually have to copy if you want to use them, such as flink-cep, but I'm not sure how to do this when using EMR.
This is the exception I get, which I believe is because it cannot find the metrics jar on its classpath.
java.lang.ClassNotFoundException: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.flink.runtime.metrics.MetricRegistryImpl.<init>(MetricRegistryImpl.java:144)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createMetricRegistry(ClusterEntrypoint.java:419)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:276)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:227)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
at org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint.main(YarnSessionClusterEntrypoint.java:137)
EMR resource in Terraform:
resource "aws_emr_cluster" "emr_flink" {
  name          = "ce-emr-flink-arn"
  release_label = "emr-5.20.0" # 5.21.0 is not found, could be a region thing
  applications  = ["Flink"]

  ec2_attributes {
    key_name                          = "ce_test"
    subnet_id                         = "${aws_subnet.ce_test_subnet_public.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
    emr_managed_master_security_group = "${aws_security_group.allow_all_vpc.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all_vpc.id}"
    additional_master_security_groups = "${aws_security_group.external_connectivity.id}"
    additional_slave_security_groups  = "${aws_security_group.external_connectivity.id}"
  }

  ebs_root_volume_size = 100
  master_instance_type = "m4.xlarge"
  core_instance_type   = "m4.xlarge"
  core_instance_count  = 2
  service_role         = "${aws_iam_role.iam_emr_service_role.arn}"

  configurations_json = <<EOF
[
  {
    "Classification": "flink-conf",
    "Properties": {
      "parallelism.default": "8",
      "state.backend": "RocksDB",
      "state.backend.async": "true",
      "state.backend.incremental": "true",
      "state.savepoints.dir": "file:///savepoints",
      "state.checkpoints.dir": "file:///checkpoints",
      "web.submit.enable": "true",
      "metrics.reporter.promgateway.class": "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter",
      "metrics.reporter.promgateway.host": "${aws_instance.monitoring.private_ip}",
      "metrics.reporter.promgateway.port": "9091",
      "metrics.reporter.promgateway.jobName": "ce-test",
      "metrics.reporter.promgateway.randomJobNameSuffix": "true",
      "metrics.reporter.promgateway.deleteOnShutdown": "false"
    }
  }
]
EOF
}
I suspect I may have to download the jar in the bootstrap stage, but I wanted to check this first and see if there are any examples of this being done.
I haven't used Terraform, but note that you typically need to provision (set up jars on) both the master and the slaves in EMR. One way to figure out where EMR thinks jars should go is to log onto a slave while a job is running, do ps auxwww | grep java, find the TaskManager process, look at the jars added to the classpath when it launched, and find where those are located on the server. Or at least that worked for me in the past.
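If you do end up staging the jar at provision time, a hypothetical bootstrap-style script might look like this (the bucket, jar version, and target path are assumptions; note that EMR bootstrap actions run before applications are installed, so on some releases you may need to run this as a step once /usr/lib/flink exists):
#!/bin/bash
# Hypothetical: copy the Prometheus metrics reporter onto each node.
set -euo pipefail
JAR=flink-metrics-prometheus-1.6.1.jar            # version is an assumption
aws s3 cp "s3://my-bucket/flink-jars/${JAR}" /tmp/
sudo mv "/tmp/${JAR}" /usr/lib/flink/lib/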
I selected the EMR release emr-5.24.0 and succeeded in monitoring with the InfluxDB .jar.
I copied the .jar file to the /usr/lib/flink/lib folder and restarted the Flink cluster with the following bash command (with sudo permission):
/usr/lib/flink/bin/stop-cluster.sh && /usr/lib/flink/bin/start-cluster.sh
I assume that you can solve your problem with the same steps for Prometheus.
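A minimal sketch of those same steps for the Prometheus reporter, using the paths from the listing below (the 1.8.0 version matches the Flink shipped with emr-5.24.0 here; verify it against your release):
# Copy the bundled reporter from opt/ into lib/, then restart Flink:
sudo cp /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar /usr/lib/flink/lib/
sudo /usr/lib/flink/bin/stop-cluster.sh && sudo /usr/lib/flink/bin/start-cluster.sh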
[ec2-user@ip-10-0-11-17 ~]$ cd /usr/lib/flink/opt/flink-metrics-
flink-metrics-datadog-1.8.0.jar flink-metrics-influxdb-1.8.0.jar flink-metrics-slf4j-1.8.0.jar
flink-metrics-graphite-1.8.0.jar flink-metrics-prometheus-1.8.0.jar flink-metrics-statsd-1.8.0.jar
[ec2-user@ip-10-0-11-17 ~]$ ll /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar
-rw-r--r-- 1 root root 101984 may 14 19:21 /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar
[ec2-user@ip-10-0-11-17 ~]$ uname -a
Linux ip-10-0-11-17 4.14.114-83.126.amzn1.x86_64 #1 SMP Tue May 7 02:26:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Spark - jdbc write fails in Yarn cluster mode but works in spark-shell

I am using Spark 1.6.2, Hadoop 2.6, Scala 2.10.5 and Java 1.7
I am using JDBC to read data from MSSQL and this works without any problem:
val hqlContext = new HiveContext(sc)
val url = "jdbc:sqlserver://1.1.1.1:1111;database=CIQOwnershipProcessing;user=OwnershipUser;password=Ownership123"
val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
val df1 = hqlContext.read.format("jdbc").options(
  Map("url" -> url, "driver" -> driver,
    "dbtable" -> "(select * from OwnershipStandardization_PositionSequence_tbl) as ps")).load()
And while writing the dataframe back to MSSQL, I am using the JDBC write as shown below. This works fine in spark-shell but fails when I do spark-submit in yarn-cluster mode. What am I missing?
val prop = new java.util.Properties
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
This is what my spark-submit command looks like. As you can see, I am passing the SQL JDBC jar path too, and I have also specified the JDBC jar path in the "spark.executor.extraClassPath" property in spark-defaults.conf on all nodes of the cluster. Since the JDBC read works, I doubt it has anything to do with the classpaths.
spark-submit --class com.spgmi.csd.OshpStdCarryOver --master yarn --deploy-mode cluster --conf spark.yarn.executor.memoryOverhead=2048 --num-executors 1 --executor-cores 2 --driver-memory 3g --executor-memory 8g --jars $SPARK_HOME/lib/datanucleus-api-jdo-3.2.6.jar,$SPARK_HOME/lib/datanucleus-core-3.2.10.jar,$SPARK_HOME/lib/datanucleus-rdbms-3.2.9.jar,/usr/share/java/sqljdbc_4.1/enu/sqljdbc41.jar --files $SPARK_HOME/conf/hive-site.xml $SPARK_HOME/lib/spark-poc2-17.1.0.jar
The error thrown in yarn-cluster mode is:
17/01/05 10:21:31 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.InstantiationException: org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
java.lang.InstantiationException: org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
at java.lang.Class.newInstance(Class.java:368)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:53)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at com.spgmi.csd.OshpStdCarryOver$.main(SparkOshpStdCarryOver.scala:175)
at com.spgmi.csd.OshpStdCarryOver.main(SparkOshpStdCarryOver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:558)
I was facing the same issue. I resolved it by setting the driver connection property in prop:
val prop = new java.util.Properties
prop.setProperty("driver","com.mysql.jdbc.Driver")
Now pass this prop in:
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
Your problem feels very similar to SPARK-14204 and SPARK-14162 -- although that bug was supposed to be fixed in Spark 1.6.2 (?!)
With a Type 4 JDBC driver you should not have to explicitly mention the "driver" property; the JAR should automatically register the URL prefix that it supports (here jdbc:sqlserver:).
But because of the bug, the Spark JDBC module may not use that "registration" to find the driver that implicitly matches the URL.
In other words: for reading, you force the "driver" property and the connection works; for writing, you don't force it, and it does not work. Aha!

OpenEJB 4.5.1: NameNotFoundException

It's the first time I've used the OpenEJB container system. When I use the lookup method of the InitialContext, I get a NameNotFoundException. I've read lots of examples and tutorials, and in every example the lookup looks like:
initialContext.lookup("NameOfBean");
Now I've found another solution, which uses the lookup as in the following code snippet; this works for me too.
initialContext.lookup("java:global/classpath.ear/ProjectName/NameofBean");
The question is: why doesn't the first version work for me, and what have I done wrong?
Excerpts of the OpenEJB log:
INFO - ********************************************************************************
INFO - OpenEJB http://openejb.apache.org/
INFO - Startup: Sat Dec 22 13:17:59 CET 2012
INFO - Copyright 1999-2012 (C) Apache OpenEJB Project, All Rights Reserved.
INFO - Version: 4.5.1
INFO - Build date: 20121209
INFO - Build time: 08:47
INFO - ********************************************************************************
INFO - openejb.home = D:\workspace\ProjectName
INFO - openejb.base = D:\workspace\ProjectName
INFO - Succeeded in installing singleton service
INFO - Cannot find the configuration file [conf/openejb.xml]. Will attempt to create one for the beans deployed.
INFO - Configuring Service(id=Default Security Service, type=SecurityService, provider-id=Default Security Service)
INFO - Configuring Service(id=Default Transaction Manager, type=TransactionManager, provider-id=Default Transaction Manager)
INFO - Using 'openejb.deployments.classpath.include=.*'
INFO - Found EjbModule in classpath: D:\workspace\ProjectName\build\classes
INFO - Searched 17 classpath urls in 2184 milliseconds. Average 128 milliseconds per url.
INFO - Beginning load: D:\workspace\ProjectName\build\classes
INFO - Configuring enterprise application: D:\workspace\ProjectName\classpath.ear
WARNING - Method 'lookup' is not available for 'javax.annotation.Resource'. Probably using an older Runtime.
INFO - Auto-deploying ejb NameOfBean: EjbDeployment(deployment-id=NameOfBean)
[... AUTHOR'S NOTE: SOME MORE BEANS]
INFO - Assembling app: D:\workspace\ProjectName\classpath.ear
INFO - Hibernate Validator 4.2.0.Final
INFO - Ignoring XML configuration.
JAVA AGENT NOT INSTALLED. The JPA Persistence Provider requested installation of a ClassFileTransformer which requires a JavaAgent. See http://openejb.apache.org/3.0/javaagent.html
INFO - OpenJPA dynamically loaded a validation provider.
INFO - Jndi(name=NameOfBeanRemote) --> Ejb(deployment-id=NameofBean)
INFO - Jndi(name=global/classpath.ear/ProjectName/NameOfBean!de.mypath.stateless.NameOfBeanInterface) --> Ejb(deployment-id=NameofBean)
INFO - Jndi(name=global/classpath.ear/ProjectName/NameofBean) --> Ejb(deployment-id=NameOfBean)
[... AUTHOR'S NOTE: SAME FOR OTHER BEANS]
INFO - OpenWebBeans Container is starting...
INFO - Adding OpenWebBeansPlugin : [CdiPlugin]
INFO - All injection points are validated successfully.
INFO - OpenWebBeans Container has started, it took 250 ms.
INFO - Created Ejb(deployment-id=NameOfBean, ejb-name=NameOfBean, container=Default Stateless Container)
[... AUTHOR'S NOTE: SAME FOR OTHER BEANS]
INFO - Quartz scheduler 'OpenEJB-TimerService-Scheduler' initialized from an externally provided properties instance.
INFO - Quartz scheduler version: 2.1.6
INFO - Scheduler OpenEJB-TimerService-Scheduler_$_OpenEJB started.
INFO - Started Ejb(deployment-id=NameOfBean, ejb-name=NameOfBean, container=Default Stateless Container)
[... AUTHOR'S NOTE: SAME FOR OTHER BEANS]
This is my test class:
public class NamerOfBeanOpenEJBTest {

    private static InitialContext initialContext;

    @BeforeClass
    public static void setUp() throws Exception {
        Properties properties = new Properties();
        properties.setProperty(Context.INITIAL_CONTEXT_FACTORY, "org.apache.openejb.client.LocalInitialContextFactory");
        properties.setProperty("openejb.deployments.classpath.include", ".*");
        initialContext = new InitialContext(properties);
    }

    @Test
    public void testBean() throws NamingException {
        Object object = initialContext.lookup("java:global/classpath.ear/ProjectName/NameOfBean");
        assertNotNull(object);
        assertTrue(object instanceof NameOfBean);
    }

    @AfterClass
    public static void afterClass() throws Exception {
        if (initialContext != null) {
            initialContext.close();
        }
    }
}
Does anyone have tips or solutions for me?
Thanks a lot.
Edit:
In JBoss AS 7.1 the lookup can be done as in this example:
new InitialContext().lookup("ejb:/ProjectName//NameOfBean!de." + "mypath.sessionbean.stateless.NameOfBeanInterface");
Isn't that possible in OpenEJB? Do I have to change every lookup call in every bean to run a local test with OpenEJB? That wouldn't be very efficient or time-saving.
Problem solved!
The structure of the lookup name is {deploymentId}{interfaceType.annotationName}. Therefore, in my case it must be
initialContext.lookup("NameOfBeanLocal");
or
initialContext.lookup("NameOfBeanRemote");
depending on the type of the interface.
To solve the problem with JBoss, you can switch from the default lookup
new InitialContext().lookup("ejb:/ProjectName//NameOfBean!de." + "mypath.sessionbean.stateless.NameOfBeanInterface");
to something more flexible like dependency lookup or dependency injection and use the @EJB annotation. Both ways are supported by JBoss and OpenEJB.
