java.lang.ClassNotFoundException: org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer when trying to run a Spark job in Zeppelin

1. Summarizing the problem
I built Zeppelin from source by running the command below.
mvn clean package -DskipTests -Pspark-2.3 -Pscala-2.11
The build was successful.
I launched Apache Zeppelin on a Kubernetes cluster, and zeppelin-server starts fine.
But when I try to run a Spark notebook, the Spark interpreter pod goes into the Completed/Succeeded state, with the errors below in spark-interpreter.log:
WARN [2020-03-06 00:42:37,683] ({main} Logging.scala[logWarning]:87) - Failed to load org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.
java.lang.ClassNotFoundException: org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
2. Describe what you've tried
I have not found any resolution, so I have not been able to try a fix yet.
Any suggestions or ideas would be highly appreciated.

I figured out the issue and resolved it by adding --jars with the interpreter and Spark jars in the zeppelin-env.sh script, but then ran into a different issue.
The interpreter now starts but is unable to launch executors.
Below is the error message; any input would be appreciated.
java.lang.NoClassDefFoundError: org/sonatype/aether/resolution/DependencyResolutionException
Thank you.
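For readers hitting the same ClassNotFoundException, the --jars fix mentioned above could be sketched roughly as follows in zeppelin-env.sh. This is only a sketch under assumptions: SPARK_SUBMIT_OPTIONS is the Zeppelin variable for passing extra options to spark-submit, but the join_jars helper and both default directory paths are hypothetical and not taken from the original post.

```shell
# join_jars DIR...: print every *.jar found directly in the given
# directories as one comma-separated list (the format --jars expects).
join_jars() {
  local list="" dir jar
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      [ -e "$jar" ] || continue   # skip the unexpanded glob when a directory has no jars
      list="${list:+$list,}$jar"
    done
  done
  printf '%s\n' "$list"
}

# Hypothetical locations; adjust to wherever your build and image put the jars.
ZEPPELIN_HOME="${ZEPPELIN_HOME:-/zeppelin}"
SPARK_HOME="${SPARK_HOME:-/spark}"

jars="$(join_jars "$ZEPPELIN_HOME/interpreter/spark" "$SPARK_HOME/jars")"
if [ -n "$jars" ]; then
  # Only set the option when at least one jar was actually found.
  export SPARK_SUBMIT_OPTIONS="--jars $jars"
fi
```

If neither directory contains jars, the variable is left untouched, which makes a wrong path easier to spot than an empty --jars argument.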

Related

Does upgrading Apache Flink require updating pom.xml?

I've just upgraded Flink from version 1.9.1 to 1.11.2 (using Docker).
I already have many Flink jobs running on version 1.9.1.
When I try to upgrade to 1.11.1 and re-run my job, it shows an error:
2020-11-12 06:49:17,731 WARN org.apache.zookeeper.ClientCnxn [] - SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/jaas-1135609831848314731.conf'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
2020-11-12 06:49:17,739 INFO org.apache.zookeeper.ClientCnxn [] - Opening socket connection to server xxxxxx:2181
2020-11-12 06:49:17,741 ERROR org.apache.curator.ConnectionState [] - Authentication failed
And this is the error after deploying my flink job:
Caused by: java.lang.RuntimeException: API paths not defined
and also:
java.lang.NoSuchMethodError: org.apache.flink.api.common.state.OperatorStateStore.getSerializableListState(Ljava/lang/String;)Lorg/apache/flink/api/common/state/ListState;
Do I need to change the pom of every one of my Flink jobs?
Is there any workaround that does not require changing my source code?
Thanks
Yes, you have to rebuild your Flink jobs whenever you update the Flink version used to run them. The Flink libraries your job is built against should be exactly the same version as the one used by the Job Manager and Task Managers; a NoSuchMethodError like the one above is the typical symptom of a job compiled against one version running on another.
If you are trying to automate deployments for a CI/CD pipeline, you could inject the version number into the pom.xml using an environment variable, but doing things like that can make it hard to debug when things go wrong.
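One way to do that injection is a Maven property overridden on the command line. Here is a minimal sketch, assuming each job's pom.xml declares its Flink dependencies with <version>${flink.version}</version>. The flink_build_cmd helper and the FLINK_VERSION variable are hypothetical names, and the script only prints the command instead of running it.

```shell
# flink_build_cmd VERSION: print the Maven command for the given Flink version.
flink_build_cmd() {
  local version="$1"
  # Reject anything that is not a plain x.y.z version, so a typo in the
  # environment fails here rather than deep inside the Maven build.
  if ! expr "x$version" : 'x[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*$' >/dev/null; then
    echo "invalid Flink version: '$version'" >&2
    return 1
  fi
  # -Dflink.version overrides the <flink.version> property from the pom.
  printf 'mvn clean package -Dflink.version=%s\n' "$version"
}

# FLINK_VERSION is expected to come from the CI environment; the default
# here is only an example value.
flink_build_cmd "${FLINK_VERSION:-1.11.2}"
```

Logging the resolved version like this in the pipeline output helps with the debugging concern raised above, because it records which Flink version each artifact was built against.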

SSH Agent Plugin v1.17 with Jenkins Declarative Pipeline not working with Windows

I have been having issues getting my multibranch pipeline to run git commands with an SSH key via the SSH Agent plugin on Windows.
I can successfully run git clone over SSH from Git Bash on the Windows server that runs Jenkins.
In my pipeline log I get the following error when trying to use the sshagent plugin:
[ssh-agent] Looking for ssh-agent implementation...
Could not find ssh-agent: IOException: Cannot run program "ssh-agent": CreateProcess error=2, The system cannot find the file specified
Check if ssh-agent is installed and in PATH
[ssh-agent] FATAL: Could not find a suitable ssh-agent provider
I have seen that installing Apache Tomcat Native libraries has helped some people, but the steps for doing so are not very descriptive.
Any help is appreciated. Thanks!

[AWS Glue]: org.apache.thrift.TApplicationException: Internal error processing createInterpreter

I'm trying to use zeppelin-0.8.0 to connect to an AWS Glue development endpoint, and the error below occurs when I execute a cell.
There is no helpful message that explains what the problem could be. Any leads appreciated.
172318_1906434757 is finished, status: ERROR, exception: java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing createInterpreter, result: %text org.apache.thrift.TApplicationException: Internal error processing createInterpreter
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:209)
at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:192)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$2.call(RemoteInterpreter.java:169)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$2.call(RemoteInterpreter.java:165)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:165)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
UPDATE: As the answer below suggests, it looks like 0.8.0 doesn't work with Glue yet. I had problems running 0.7.x as well: the javax.ws.rs package threw a bunch of MethodNotFoundException errors when running with Java 8 (switching to Java 7 with update-alternatives did not help either). But when running inside a JDK 7 Docker container it worked with no problems, and I was able to connect to my dev endpoint. I would highly appreciate it if anyone could clarify the root cause.
Could you please provide more information, such as where the Zeppelin instance is located? Is it running on your desktop/laptop, or as an AWS notebook server? Also, did you try connecting with Zeppelin 0.7.3, as mentioned in this AWS forum thread:
https://forums.aws.amazon.com/thread.jspa?threadID=285128
According to that thread, dated Jul 2018, AWS Glue doesn't yet support Zeppelin 0.8.
I am assuming all other configuration and environment settings are done as needed. I can help more if you provide additional info.
UPDATE:
For help setting up a local development environment and Zeppelin notebook, please refer to the linked guides, including the one on setting up Zeppelin on Windows.
Once the Zeppelin notebook is set up, establish an SSH connection (using the AWS Glue DevEndpoint URL) so that you have access to the data catalog, crawlers, etc., and also to the S3 bucket where your data resides. Then you can create your Python scripts in the Zeppelin notebook and run them from Zeppelin.
You can use the dev instance provided by Glue, but you may incur additional costs for it (EC2 instance charges).
Environment settings (updated in response to comments):
JAVA_HOME=E:\Java7\jre7
Path=E:\Python27;E:\Python27\Lib;E:\Python27\Scripts;
PYTHONPATH=E:\spark-2.1.0-bin-hadoop2.7\python;E:\spark-2.1.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip;E:\spark-2.1.0-bin-hadoop2.7\python\lib\pyspark.zip
SPARK_HOME=E:\spark-2.1.0-bin-hadoop2.7
Change the drive name and folders accordingly. Let me know if any help is needed.

Trouble Configuring Hbase Interpreter on Apache Zeppelin

I installed both Apache Zeppelin and HBase via Homebrew, and each works on its own. I can use the HBase shell on the command line and open Zeppelin. I tested Zeppelin with Spark and it worked fine.
However, I can't work out how to configure the HBase interpreter. I tried to follow the tutorials given by Zeppelin, but it didn't work and I got an error message.
I tried to resolve this by resetting the interpreter in the interpreter menu, but that did not help. Any help is appreciated.
UPDATE:
I was able to resolve the problem by adding the following dependencies to the Hbase interpreter on Apache-zeppelin:
/usr/local/hbase-1.2.0/lib/hbase-client-1.2.0.jar
/usr/local/hbase-1.2.0/lib/hbase-protocol-1.2.0.jar
/usr/local/hbase-1.2.0/lib/hbase-common-1.2.0.jar
Note: /usr/local/hbase-1.2.0 is the home directory of Hbase
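Before adding the three dependencies above in the interpreter settings, it may be worth confirming that the jars exist at those paths. A small sketch, where missing_hbase_jars is a hypothetical helper name and the path and version are the ones from this post:

```shell
# missing_hbase_jars LIB_DIR VERSION: print the path of each expected jar
# that is not present; prints nothing when all three exist.
missing_hbase_jars() {
  local lib_dir="$1" version="$2" name jar
  for name in hbase-client hbase-protocol hbase-common; do
    jar="$lib_dir/$name-$version.jar"
    if [ ! -f "$jar" ]; then
      printf '%s\n' "$jar"
    fi
  done
}

# Path and version as given in the post; adjust for your installation.
missing_hbase_jars /usr/local/hbase-1.2.0/lib 1.2.0
```

Any path it prints is a dependency that Zeppelin would fail to load, so an empty output means all three jars are in place.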
Reference:
https://stochasticcoder.com/2018/02/12/adding-hbase-interpreter-to-zeppelin-hortonworks/ (Thanks to @Alan)
For a complete guide to installing and configuring the HBase interpreter on Apache Zeppelin, see my repo: https://github.com/bixingxie/hbase-zeppelin/blob/master/README.md

Apache Zeppelin running on Spark throws a Java ConnectionException

I want to ask a question about installing and using Apache Zeppelin.
I downloaded zeppelin-0.5.5-incubating-bin-all.
I configured export JAVA_HOME=/sparkDemo/java-1.8.0-openjdk in zeppelin-env.sh and zeppelin.server.port 8084 in zeppelin-site.xml. I didn't configure SPARK_HOME in zeppelin-env.sh because I want to use Zeppelin's embedded Spark libraries.
But when I run the Zeppelin tutorial code in my browser window, an error occurs.
Even when I configure SPARK_HOME and export MASTER in zeppelin-env.sh and create a new interpreter in the Zeppelin web UI, the same error occurs.
Thanks a lot for your response!
Stack Trace here
As mentioned in other answers, the most likely cause is that the interpreter process quit due to some error.
More details on the particular error can be found in:
the interpreter process log under
./logs/zeppelin-interpreter-<interpreter name>-<username>-<hostname>.log
and the ZeppelinServer process log under
./logs/zeppelin-<username>-<hostname>.log
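To pull the relevant lines out of both logs at once, something like the following could help. It is a sketch only: show_log_errors is a hypothetical helper, and the search pattern (ERROR or Exception) is an assumption about what the failure looks like in your logs.

```shell
# show_log_errors LOG_DIR: print lines mentioning ERROR or an exception from
# the Zeppelin server and interpreter logs, prefixed with file name and line number.
show_log_errors() {
  local log_dir="${1:-./logs}"
  # zeppelin-*.log matches both the interpreter logs and the server log.
  # -H: always show the file name, -n: show line numbers, -E: extended regex.
  grep -HnE 'ERROR|Exception' "$log_dir"/zeppelin-*.log 2>/dev/null
}

# ZEPPELIN_LOG_DIR is Zeppelin's log directory setting; fall back to ./logs.
show_log_errors "${ZEPPELIN_LOG_DIR:-./logs}" \
  || echo "no matching lines (or no zeppelin-*.log files in that directory)"
```

Start with the first exception in the interpreter log, since the connection error shown in the notebook is usually only a consequence of the interpreter process having already exited.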
