How to set Zeppelin interpreter dependencies via configuration - apache-zeppelin

I'm trying to deploy Zeppelin 0.7.3 and add ojdbc7.jar and some custom jar dependencies automatically.
I'm wondering if there's a configuration item in zeppelin-env.sh or zeppelin-site.xml that could do this.
Example: copy zeppelin-0.7.3-bin-all to /data and copy ojdbc7.jar to /data/zeppelin-0.7.3-bin-all/interpreter/jdbc/ojdbc7.jar. Then, in the interpreter settings, when I open the Spark interpreter, I should see the dependency/artifact added according to the configuration:
/data/zeppelin-0.7.3-bin-all/interpreter/jdbc/ojdbc7.jar
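As far as I know, neither zeppelin-env.sh nor zeppelin-site.xml has a key for per-interpreter jar dependencies in 0.7.x; the dependencies shown on the interpreter page are persisted in conf/interpreter.json, which Zeppelin reads at startup, so one option is to pre-seed that file during deployment. A minimal sketch of the relevant fragment (the interpreter id "2CKEKWY8Z" and the surrounding structure are illustrative placeholders; stop Zeppelin before editing the file):

"interpreterSettings": {
  "2CKEKWY8Z": {
    "name": "jdbc",
    "group": "jdbc",
    "dependencies": [
      {
        "groupArtifactVersion": "/data/zeppelin-0.7.3-bin-all/interpreter/jdbc/ojdbc7.jar",
        "local": false
      }
    ]
  }
}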

Related

Custom configuration file flink-conf.yaml

I need to specify different Flink settings for different applications. In other words, each application should run with its own custom flink-conf.yaml file. What is the proper way to do this?
I found some old recommendations to declare FLINK_CONF_DIR pointing to a custom directory with Flink configuration files (for example: How could I override configuration value in Apache Flink?). However, the official Flink documentation does not mention the FLINK_CONF_DIR variable at all (as of Flink 1.13), so I doubt that this approach is officially recommended and supported by the Flink developers.
UPDATE 1: Details on how the application is run
I am running Flink on YARN in the Application mode. Here is how I launch the application:
"$flink_home/bin/flink" \
run-application \
--target yarn-application \
--class com.example.App1
The out-of-the-box Flink configuration is located in the $flink_home/conf directory. As I have several applications App1, App2, ..., I want them to use their respective Flink configurations instead of the out-of-the-box configuration.
TL;DR: The paragraph about FLINK_CONF_DIR was accidentally removed when the Flink on YARN docs were rewritten for the Flink 1.12 release. It is still the intended and supported way to establish per-application settings in YARN clusters.
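For example, each application can be given its own copy of the configuration directory, selected at launch time via FLINK_CONF_DIR. A minimal sketch (the directory layout and the application jar path are assumptions):

# Per-application conf directory, e.g. a copy of $flink_home/conf
# with an edited flink-conf.yaml
export FLINK_CONF_DIR="/etc/flink/conf-app1"

"$flink_home/bin/flink" \
  run-application \
  --target yarn-application \
  --class com.example.App1 \
  /path/to/app1.jar  # hypothetical application jar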
Other ways to override the configuration:
You can override the settings specified in the cluster's flink-conf.yaml file with settings you specify on the command line, as described in this answer.
You can also override specific settings from the global configuration in your code, e.g.:
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

Configuration conf = new Configuration();
conf.setString("state.backend", "filesystem");
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
You can also load all of the settings in a flink-conf.yaml file from your application code, via
FileSystem.initialize(GlobalConfiguration.loadConfiguration("/path/to/conf/directory"));
And with Kubernetes you can mount different ConfigMaps for different applications.
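For instance, a sketch of the Kubernetes approach (the ConfigMap name, the settings, and the container details are assumptions; /opt/flink/conf is the conf directory in the stock Flink images):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app1-flink-conf
data:
  flink-conf.yaml: |
    state.backend: filesystem
    parallelism.default: 4
---
# Pod spec fragment: mount the ConfigMap over the Flink conf directory
spec:
  containers:
    - name: flink-main-container
      volumeMounts:
        - name: flink-conf
          mountPath: /opt/flink/conf
  volumes:
    - name: flink-conf
      configMap:
        name: app1-flink-conf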

Flink logging - Using Log4j2

We are running a Flink (1.9.1) application on AWS EMR (5.29) using YARN. We use a common logging adaptor throughout all the components in our project (including the Flink application), and it uses Log4j2.
From the documentation, I see that there are 3 configuration files.
log4j.properties
log4j-yarn-session.properties
log4j-cli.properties
I understand that I will have to modify log4j.properties for the JobManager and TaskManager logs, and log4j-cli.properties for code that does not run on the cluster (i.e. the command-line client).
Now given this situation,
How do I pass my log4j2.properties?
Do we replace the logging jars in the lib folder with log4j2 jars?
Not a solid solution, but here is a workaround: if the log4j.properties file in the conf folder is deleted, the log4j2 configuration file inside the jar on the classpath is used instead. But be careful when you have multiple jars on the classpath that contain a log4j2 properties file.
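For reference, a minimal log4j2.properties of the kind that could be bundled in the application jar might look like this (a sketch; the appender, pattern, and levels are assumptions):

# Minimal Log4j2 configuration in properties format
status = error
appender.console.type = Console
appender.console.name = STDOUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %-5p %c{1} - %m%n
rootLogger.level = info
rootLogger.appenderRef.stdout.ref = STDOUT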

Artifactory Generic Download - VSTS Task Failing

I have set up a very basic task in a VSTS build definition with the following simple steps and objective:
Set up and successfully test an endpoint to our Artifactory repository.
Implement a VSTS "Artifactory Generic Download" Task to retrieve a single jar file from the Artifactory repository.
Drop the jar file in staging directory of the build agent.
The file spec source, based on an example from the JFrog website (www.jfrog.com) and set up as a task configuration, is very basic.
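It was something along these lines (a hypothetical reconstruction: the repository name "list" comes from the answer below, while the path, jar name, and target are made up):

{
  "files": [
    {
      "pattern": "list/com/example/my-lib/1.0/my-lib-1.0.jar",
      "target": "$(Build.ArtifactStagingDirectory)/"
    }
  ]
}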
Unfortunately, triggering this build job fails horribly, and I simply can't figure out why. I would appreciate some help with this.
It seems that no artifacts were found and the task fails due to the configured "Fail task if no dependencies were downloaded" flag. If you wish to change this behavior, you can uncheck the flag in your task configuration.
As for the artifacts not being downloaded, make sure a repository called "list" exists and that a jar file matching the provided pattern exists in it.
More information about file-specs can be found here.

Running Solr with Jetty

I'm having a little trouble understanding how Solr fits in with Jetty, and why I can't seem to get the start.jar in the distribution package to work.
I can run all of the example configurations via java -jar start.jar. However, when I try to run something like the following --
java -Dsolr.solr.home=/Users/jwwest/solr -jar $(brew --prefix solr)/libexec/example/start.jar
-- the following error occurs:
java.io.FileNotFoundException: No XML configuration files specified in start.config or command line.
at org.eclipse.jetty.start.Main.start(Main.java:506)
at org.eclipse.jetty.start.Main.main(Main.java:95)
I opened up the start.jar file, and there is a start.config file inside the jar which I assume should handle this configuration for me. I don't understand why it works when run from inside the distribution's example directory, but not outside of it.
You also need to define the jetty.home property. Try:
java -Dsolr.solr.home=/Users/jwwest/solr -Djetty.home=$(brew --prefix solr)/libexec/example -jar $(brew --prefix solr)/libexec/example/start.jar
(Note that JVM system properties must appear before -jar; anything after the jar is passed to the application as arguments rather than to the JVM.)
You can see the effective command line start.jar generates by using the --dry-run command line flag.
java -jar start.jar --dry-run
That will output everything with full path names so you can run it from outside the directory.
Source: http://www.eclipse.org/jetty/documentation/9.0.0.M3/advanced-jetty-start.html
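For example (a sketch; the paths follow the Homebrew layout from the question):

cd $(brew --prefix solr)/libexec/example
java -Dsolr.solr.home=/Users/jwwest/solr -jar start.jar --dry-run > /tmp/start-solr.sh
# the generated command uses absolute paths, so it can be run from any directory
sh /tmp/start-solr.sh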
The start.jar is a jetty specific mechanism that works to build out all the classpath requirements for starting up Jetty. It is generally only used in the scope of the jetty distribution. Pulling the start.jar out of the configuration and placing it somewhere else renders the default configuration of the start.config rather moot.
My understanding of Solr is that it bundles itself with a distribution of jetty, placing what it needs to run into the distribution and repackages it as its own. They may have a custom start.config file that further adds its own locations for classpath resources and the like, or not.
The exception you are seeing stems from the start.config file expecting an etc/ directory containing jetty.xml-formatted XML files, which are used to configure the Jetty process.
Jetty often being used in an embedded format has little to do with this issue; it is simply a common use case because Jetty is incredibly easy to embed into an application. Embedded instances of Jetty rarely (if ever) leverage a start.jar; instead, it is up to the embedding application to manage its own classpath.
First change into the folder where start.jar is located, then execute the same command.
Jetty is often used as an embedded container. If you want to use Jetty, a good start would be to copy the example directory and rename it to whatever you want; the solr directory inside it holds the basic configuration.
Otherwise, it is recommended to use Tomcat and the solr.war file.

Using JSVC to daemonize a Java app packaged with the Maven One-Jar Plugin

Here is the problem:
I have packaged my Java application into a single jar using the Maven plugin One-Jar.
Now I want to run the application as a Unix Daemon using JSVC, i.e. Apache Commons Daemon.
I am using JSVC as follows (which works for jars made with the Maven Assembly Plugin, etc.):
jsvc -user $USER -home $HOME -pidfile $PID_PATH -cp $PATH_TO_ONE_JAR my.package.MyClass
The error is this:
jsvc.exec error: Cannot find daemon loader org/apache/commons/daemon/support/DaemonLoader
jsvc.exec error: Service exit with a return value of 1
Does anyone know if it is even possible to use JSVC and One-Jar together, since One-Jar uses a custom class loader? The jar runs just fine when I run java -jar my-one-jar.jar.
What can be done?
Thank you for any insight!
I had to add all the jar dependencies to jsvc's classpath option. It seems jsvc doesn't pick up jars nested inside another jar.
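For example (a sketch, assuming the dependency jars have been copied into a lib/ directory, e.g. with the maven-dependency-plugin's copy-dependencies goal):

# build a classpath from the application jar plus every jar in lib/
CP="$PATH_TO_ONE_JAR:$(echo lib/*.jar | tr ' ' ':')"
jsvc -user $USER -home $HOME -pidfile $PID_PATH -cp "$CP" my.package.MyClass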
If you use the (poorly documented) Maven Shade Plugin instead of One-Jar (the two achieve similar results), it should solve your problem. It unpacks the dependent jars and stores the class files directly in the fat jar (rather than having jars within the jar). I have used it to create an executable jar for running under JSVC with some success.
Of course, things are seldom as simple as they sound. With the Shade plugin, you may have to do some work to relocate classes when there are conflicts in your dependency tree, or use resource transformers to handle your non-Java resource files. But hopefully not.
(Of course Mkyong.com has a guide on this)
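A minimal Shade plugin configuration might look like this (a sketch; the main class is taken from the jsvc command in the question, and the plugin version is simply a recent one):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- sets Main-Class in the manifest of the shaded jar -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>my.package.MyClass</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>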
