I currently have Flink set up with a job running on EMR, and I'm now trying to add monitoring by sending metrics off to Prometheus.
I have come across an issue with running Flink on EMR. I'm using Terraform to provision EMR (I run Ansible afterwards to download and run a job). Out of the box, it does not look like EMR's Flink distribution includes the optional jars (flink-metrics-prometheus, flink-cep, etc.).
Looking at Flink's documentation, it says
"In order to use this reporter you must copy /opt/flink-metrics-prometheus-1.6.1.jar into the /lib folder of your Flink distribution"
https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/metrics.html#prometheuspushgateway-orgapacheflinkmetricsprometheusprometheuspushgatewayreporter
But when logging into the EMR master node, neither /etc/flink nor /usr/lib/flink has a directory called opt, and I cannot see flink-metrics-prometheus-1.6.1.jar anywhere.
I know Flink has other optional libs, such as flink-cep, that you'd usually have to copy if you want to use them, but I'm not sure how to do this when using EMR.
This is the exception I get, which I believe is because it cannot find the metrics jar on its classpath.
java.lang.ClassNotFoundException: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.flink.runtime.metrics.MetricRegistryImpl.<init>(MetricRegistryImpl.java:144)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createMetricRegistry(ClusterEntrypoint.java:419)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:276)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:227)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
at org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint.main(YarnSessionClusterEntrypoint.java:137)
EMR resource in Terraform:
resource "aws_emr_cluster" "emr_flink" {
name = "ce-emr-flink-arn"
release_label = "emr-5.20.0" # 5.21.0 is not found, could be a region thing
applications = ["Flink"]
ec2_attributes {
key_name = "ce_test"
subnet_id = "${aws_subnet.ce_test_subnet_public.id}"
instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
emr_managed_master_security_group = "${aws_security_group.allow_all_vpc.id}"
emr_managed_slave_security_group = "${aws_security_group.allow_all_vpc.id}"
additional_master_security_groups = "${aws_security_group.external_connectivity.id}"
additional_slave_security_groups = "${aws_security_group.external_connectivity.id}"
}
ebs_root_volume_size = 100
master_instance_type = "m4.xlarge"
core_instance_type = "m4.xlarge"
core_instance_count = 2
service_role = "${aws_iam_role.iam_emr_service_role.arn}"
configurations_json = <<EOF
[
{
"Classification": "flink-conf",
"Properties": {
"parallelism.default": "8",
"state.backend": "RocksDB",
"state.backend.async": "true",
"state.backend.incremental": "true",
"state.savepoints.dir": "file:///savepoints",
"state.checkpoints.dir": "file:///checkpoints",
"web.submit.enable": "true",
"metrics.reporter.promgateway.class": "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter",
"metrics.reporter.promgateway.host": "${aws_instance.monitoring.private_ip}",
"metrics.reporter.promgateway.port": "9091",
"metrics.reporter.promgateway.jobName": "ce-test",
"metrics.reporter.promgateway.randomJobNameSuffix": "true",
"metrics.reporter.promgateway.deleteOnShutdown": "false"
}
}
]
EOF
}
I suspect I may have to download the jar in the bootstrap stage, but wanted to check this first and see if there are any examples of this being done.
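If it does come to that, something like the following is roughly what I have in mind (a sketch only; the bucket and jar names are placeholders), run on each node from the Ansible play (or an EMR step) once Flink is installed:
# placeholders: replace the bucket, and verify the exact reporter jar for the Flink version on the cluster
aws s3 cp s3://<your-bucket>/flink-metrics-prometheus-1.6.1.jar /tmp/
sudo mv /tmp/flink-metrics-prometheus-1.6.1.jar /usr/lib/flink/lib/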
I haven't used Terraform, but note that you typically need to provision (set up jars) on both the master and the slaves in EMR. One way to figure out where EMR thinks jars should go is to log onto a slave when a job is running, do ps auxwww | grep java, find the TaskManager process, look at the jars added to the classpath when it launched, and find where those are located on the server. Or at least that worked for me in the past.
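For example, something along these lines (illustrative; the exact process name varies with the Flink version and deployment mode):
ps auxwww | grep java
# look for the Flink TaskManager/TaskExecutor JVM and inspect its -classpath argument;
# the flink jars on that classpath show which lib directory EMR actually uses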
I selected the EMR release emr-5.24.0, and monitoring with the InfluxDB jar succeeded.
I copied the .jar file into the /usr/lib/flink/lib folder and restarted the Flink cluster with the following bash command (with sudo permissions):
/usr/lib/flink/bin/stop-cluster.sh && /usr/lib/flink/bin/start-cluster.sh
I assume you can solve your issue with the same steps for Prometheus.
[ec2-user@ip-10-0-11-17 ~]$ cd /usr/lib/flink/opt/flink-metrics-
flink-metrics-datadog-1.8.0.jar flink-metrics-influxdb-1.8.0.jar flink-metrics-slf4j-1.8.0.jar
flink-metrics-graphite-1.8.0.jar flink-metrics-prometheus-1.8.0.jar flink-metrics-statsd-1.8.0.jar
[ec2-user@ip-10-0-11-17 ~]$ ll /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar
-rw-r--r-- 1 root root 101984 may 14 19:21 /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar
[ec2-user@ip-10-0-11-17 ~]$ uname -a
Linux ip-10-0-11-17 4.14.114-83.126.amzn1.x86_64 #1 SMP Tue May 7 02:26:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
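For Prometheus the same steps would look something like this (jar version as shipped with emr-5.24.0; adjust for your release):
# copy the reporter into Flink's lib directory, then restart the cluster
sudo cp /usr/lib/flink/opt/flink-metrics-prometheus-1.8.0.jar /usr/lib/flink/lib/
sudo /usr/lib/flink/bin/stop-cluster.sh && sudo /usr/lib/flink/bin/start-cluster.sh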
Related
When trying to run a Solr SQL query,
SELECT count(*) from products;
we are seeing the below exception from the Solr server. Ours is a SolrCloud setup.
Solr version: solr-8.8.2-PATCH2
solr-solrj version: 8.8.2
The complete stack trace is below:
2023-02-09 14:21:34.824 ERROR (qtp1209411469-15) [c:products s:shard3 r:core_node12 x:otmm_shard3_replica_n10] o.a.s.s.HttpSolrCall java.lang.RuntimeException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.solr.handler.sql.SolrRules
at org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:746)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:592)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:357)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
at org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.solr.handler.sql.SolrRules
at org.apache.solr.handler.sql.SolrTableScan.register(SolrTableScan.java:72)
at org.apache.calcite.plan.AbstractRelOptPlanner.onNewClass(AbstractRelOptPlanner.java:239)
at org.apache.calcite.plan.volcano.VolcanoPlanner.onNewClass(VolcanoPlanner.java:464)
at org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:230)
at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224)
at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
at org.apache.calcite.plan.volcano.VolcanoPlanner.setRoot(VolcanoPlanner.java:265)
at org.apache.calcite.tools.Programs.lambda$standard$3(Programs.java:262)
at org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:331)
at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:166)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:297)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:642)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:508)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:478)
at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:556)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
at org.apache.solr.client.solrj.io.stream.JDBCStream.open(JDBCStream.java:278)
at org.apache.solr.client.solrj.io.stream.ExceptionStream.open(ExceptionStream.java:52)
at org.apache.solr.handler.StreamHandler$TimerStream.open(StreamHandler.java:465)
at org.apache.solr.client.solrj.io.stream.TupleStream.writeMap(TupleStream.java:82)
at org.apache.solr.common.util.JsonTextWriter.writeMap(JsonTextWriter.java:164)
at org.apache.solr.common.util.TextWriter.writeMap(TextWriter.java:216)
at org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:69)
at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:153)
at org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:387)
at org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:293)
at org.apache.solr.response.JSONWriter.writeResponse(JSONWriter.java:73)
at org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:66)
at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:890)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:583)
... 40 more
Kindly let me know what is going wrong with this.
I am using SolrJ, and I have an application that supports only JDBC, so I am in a situation where I have to query Solr over JDBC via SolrJ.
It is basically a BIRT reporting tool, where we can define SQL and the output is automatically mapped to the report.
This is what we are trying to do.
I added the below in solr.in.sh:
SOLR_OPTS="$SOLR_OPTS -Dsolr.modules=sql"
And in the Solr admin UI, under Args in the System dashboard, I can see:
-Dsolr.modules=sql
But it is still not working.
{
"responseHeader":{
"status":0,
"QTime":2},
"config":{"requestHandler":{"/export":{
"class":"solr.ExportHandler",
"useParams":"_EXPORT",
"components":["query"],
"invariants":{
"rq":"{!xport}",
"distrib":false},
"name":"/export",
"_useParamsExpanded_":{"_EXPORT":"[NOT AVAILABLE]"},
"_effectiveParams_":{
"distrib":"false",
"rq":"{!xport}"}}}}}
Edit: I just realized you had linked to the wrong docs for your version. The way you are trying to add modules was not available until 9.0 (https://issues.apache.org/jira/browse/SOLR-15914). The 8.8 docs are at https://solr.apache.org/guide/8_8/. Note that until recently this feature was called Parallel SQL: https://solr.apache.org/guide/8_8/parallel-sql-interface.html
For 8.8 you shouldn't need to configure modules since at that time /sql was an implicitly loaded request handler.
https://solr.apache.org/guide/8_8/implicit-requesthandlers.html
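If it helps, hitting the implicit /sql handler directly is a quick way to separate handler problems from JDBC/driver problems. A sketch (the host/port is an assumption; the collection name comes from your query):
# expects a JSON tuple stream back if the /sql handler is working
curl --data-urlencode 'stmt=SELECT count(*) FROM products' \
  "http://localhost:8983/solr/products/sql?aggregationMode=facet"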
You may need to verify if the implicit handlers have been (mis)configured via the request parameters API (https://solr.apache.org/guide/8_8/implicit-requesthandlers.html#how-to-edit-implicit-handler-paramsets)
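A sketch of that check against your collection (default port assumed; componentName/expandParams expand a single implicit handler via the Config API, and /config/params shows any paramsets that might be overriding it):
curl "http://localhost:8983/solr/products/config/requestHandler?componentName=/sql&expandParams=true"
curl "http://localhost:8983/solr/products/config/params"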
The below refers to the 9.x versions of Solr:
java.lang.NoClassDefFoundError: Could not initialize class org.apache.solr.handler.sql.SolrRules
Probably indicates that you have not successfully loaded the sql module. Normally one enables modules at the very bottom of solr.in.sh not solr.sh as you said in your comments. solr.in.sh is the place where Solr intends for you to set up environment variables. One really shouldn't ever need to modify solr.sh directly, and doing so may make upgrades difficult in the future.
Check that there isn't another set of enablements in solr.in.sh that is overwriting what you've tried to do in solr.sh. Also check that you aren't enabling any other modules in other ways (several methods are shown here: https://solr.apache.org/guide/solr/latest/configuration-guide/solr-modules.html). You should pick one way of enabling modules (sysprops, solr.in.sh, solr.xml or solrconfig.xml) and then enable all modules that way to avoid having to understand any complicated precedence logic if possible. I don't know the precedence order, but I can probably figure it out if you have other modules and absolutely can't avoid using more than one method.
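For example, picking solr.in.sh as the single place (a 9.x sketch; put every module you need in the one comma-separated list and remove any competing -Dsolr.modules entries from SOLR_OPTS):
# solr.in.sh
SOLR_MODULES="sql"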
Also, you still haven't told us where you got the version ending in -PATCH2. This version spec sounds like you are working with some folks who are compiling their own custom Solr, so you should be sure to understand what they've changed in case they've customized something about how or where Solr loads jar files (not very likely, but one never knows).
I am new to Flink. I am trying to run the Flink example on my local PC (Windows).
However, after I run start-cluster.bat and log in to the dashboard, it shows 0 task managers.
I checked the log and it seems the TaskManager fails to initialize:
2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:72)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerSecurely$2(TaskManagerRunner.java:322)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:321)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) is not set
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
at java.util.Arrays$ArrayList.forEach(Arrays.java:3880)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
... 7 more
2020-02-21 23:03:14,217 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Basically, it looks like a required option 'taskmanager.cpu.cores' is not set. However, I can't find this property in flink-conf.yaml or in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html) either.
I am using Flink 1.10.0. Any help would be highly appreciated!
That configuration option is intended for internal use only -- it shouldn't be user configured, which is why it isn't documented.
The Windows start-cluster.bat is failing because of a bug introduced in Flink 1.10. See https://jira.apache.org/jira/browse/FLINK-15925.
One workaround is to use the bash script, start-cluster.sh, instead.
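For example, if WSL (or Cygwin/Git Bash) is available on the Windows box, a sketch of that workaround (the install path is illustrative):
# from a bash shell on the same machine
cd /mnt/c/flink-1.10.0
./bin/start-cluster.sh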
See also this mailing list thread: https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E
Setting up JanusGraph, I noticed the following in the console:
09:04:12,175 INFO ReflectiveConfigOptionLoader:173 - Loaded and initialized config classes: 10 OK out of 12 attempts in PT0.023S
09:04:12,230 INFO Reflections:224 - Reflections took 28 ms to scan 1 urls, producing 2 keys and 2 values
09:04:12,291 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.index-name=entity (Type: GLOBAL_OFFLINE) is overridden by globally managed value (janusgraph). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,294 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.backend=solr (Type: GLOBAL_OFFLINE) is overridden by globally managed value (elasticsearch). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,300 INFO CassandraThriftStoreManager:628 - Closed Thrift connection pooler.
and then I see the following:
Exception in thread "main" java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
How do I stop using Elasticsearch and switch to Solr?
My properties file is as follows:
index.search.backend=solr
index.search.directory=/path/to/directory/for/solr/index/something
index.search.index-name=something
index.search.solr.mode=http
index.search.solr.http-urls=http://127.0.0.1:8983/solr
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
The answer to this is basically the same as this one for Titan. JanusGraph was forked from Titan.
You are probably trying to connect to an existing graph that was previously configured to use Elasticsearch. By default, the keyspace is named janusgraph.
1) You could connect to a different keyspace by updating conf/janusgraph-cassandra.properties
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
storage.cassandra.keyspace=mygraph
2) You could drop the existing keyspace. If you used bin/janusgraph.sh start from the quick start directions (which starts a single node Cassandra and a single node Elasticsearch),
bin/janusgraph.sh clean
Or if you have a standalone Cassandra installation:
$CASSANDRA_HOME/bin/cqlsh -e 'drop keyspace if exists janusgraph'
Then you would be able to connect with the default conf/janusgraph-cassandra.properties.
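If you want to confirm which keyspaces exist before dropping anything, a quick check with cqlsh (assuming a standalone Cassandra) is:
$CASSANDRA_HOME/bin/cqlsh -e 'describe keyspaces'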
I am using Spark 1.6.2, Hadoop 2.6, Scala 2.10.5 and Java 1.7
I am using JDBC to read data from MSSQL and this works without any problem:
val hqlContext = new HiveContext(sc)
val url = "jdbc:sqlserver://1.1.1.1:1111;database=CIQOwnershipProcessing;user=OwnershipUser;password=Ownership123"
val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
val df1 = hqlContext.read.format("jdbc").options(
Map("url" -> url, "driver" -> driver,
"dbtable" -> "(select * from OwnershipStandardization_PositionSequence_tbl) as ps")).load()
And while writing the dataframe back to MSSQL, I am using the JDBC write as shown below. This works fine in spark-shell but fails when I do spark-submit in yarn-cluster mode. What am I missing?
val prop = new java.util.Properties
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
This is what my spark-submit command looks like. As you can see, I am passing the SQL JDBC jar path too, and I have also specified the JDBC jar path in the "spark.executor.extraClassPath" property in spark-defaults.conf on all nodes of the cluster. Since the JDBC read is working, I doubt it has anything to do with the classpaths.
spark-submit --class com.spgmi.csd.OshpStdCarryOver --master yarn --deploy-mode cluster --conf spark.yarn.executor.memoryOverhead=2048 --num-executors 1 --executor-cores 2 --driver-memory 3g --executor-memory 8g --jars $SPARK_HOME/lib/datanucleus-api-jdo-3.2.6.jar,$SPARK_HOME/lib/datanucleus-core-3.2.10.jar,$SPARK_HOME/lib/datanucleus-rdbms-3.2.9.jar,/usr/share/java/sqljdbc_4.1/enu/sqljdbc41.jar --files $SPARK_HOME/conf/hive-site.xml $SPARK_HOME/lib/spark-poc2-17.1.0.jar
The error thrown in yarn-cluster mode is:
17/01/05 10:21:31 ERROR yarn.ApplicationMaster: User class threw
exception: java.lang.InstantiationException:
org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
java.lang.InstantiationException:
org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper
at java.lang.Class.newInstance(Class.java:368)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:46)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:53)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at com.spgmi.csd.OshpStdCarryOver$.main(SparkOshpStdCarryOver.scala:175)
at com.spgmi.csd.OshpStdCarryOver.main(SparkOshpStdCarryOver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:558)
I was facing the same issue. I resolved it by setting the driver connection property in prop.
val prop = new java.util.Properties
prop.setProperty("driver","com.mysql.jdbc.Driver")
Now pass this prop in:
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop",prop)
Your problem feels very similar to SPARK-14204 and SPARK-14162 -- although that bug was supposed to be fixed in Spark 1.6.2 (?!)
With a Type 4 JDBC driver you should not have to explicitly mention the "driver" property; the JAR should automatically register the URL prefix that it supports (here jdbc:sqlserver:).
But because of the bug, the Spark JDBC module may not use that "registration" to find the driver that implicitly matches the URL.
In other words: for reading, you force the "driver" property and the connection works; for writing, you don't force it, and it does not work. Aha!
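Applied to the question's setup, that means forcing the driver for the write as well; a sketch using the SQL Server driver class from the sqljdbc41.jar already on the --jars list:
val prop = new java.util.Properties
// force the JDBC driver class so the write doesn't rely on the broken DriverWrapper lookup
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
df1.write.mode("Overwrite").jdbc(url, "CIQOwnershipProcessing.dbo.df_sparkop", prop)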
While going through the Google docs, I'm getting the below stack trace on the final export command (executed from the master instance with appropriate env variables set).
${HADOOP_HOME}/bin/hadoop jar ${HADOOP_BIGTABLE_JAR} export-table -libjars ${HADOOP_BIGTABLE_JAR} <table-name> <gs://bucket>
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-install/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-install/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016-02-08 23:39:39,068 INFO [main] mapreduce.Export: versions=1, starttime=0, endtime=9223372036854775807, keepDeletedCells=false
2016-02-08 23:39:39,213 INFO [main] gcs.GoogleHadoopFileSystemBase: GHFS version: 1.4.4-hadoop2
java.lang.IllegalAccessError: tried to access field sun.security.ssl.Handshaker.localSupportedSignAlgs from class sun.security.ssl.ClientHandshaker
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:278)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:913)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:849)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1035)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:93)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:972)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getBucket(GoogleCloudStorageImpl.java:1599)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1554)
at com.google.cloud.hadoop.gcsio.CacheSupplementedGoogleCloudStorage.getItemInfo(CacheSupplementedGoogleCloudStorage.java:547)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.getFileInfo(GoogleCloudStorageFileSystem.java:1042)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.exists(GoogleCloudStorageFileSystem.java:383)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configureBuckets(GoogleHadoopFileSystemBase.java:1650)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.configureBuckets(GoogleHadoopFileSystem.java:71)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1598)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:783)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:746)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:352)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:104)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:241)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:509)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:207)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:168)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:291)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:92)
at org.apache.hadoop.hbase.mapreduce.IdentityTableMapper.initJob(IdentityTableMapper.java:51)
at org.apache.hadoop.hbase.mapreduce.Export.createSubmittableJob(Export.java:75)
at org.apache.hadoop.hbase.mapreduce.Export.main(Export.java:187)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
at com.google.cloud.bigtable.mapreduce.Driver.main(Driver.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Here's my ENV var set up in case it's helpful:
export HBASE_HOME=/home/hadoop/hbase-install
export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath`
export HADOOP_HOME=/home/hadoop/hadoop-install
export HADOOP_CLIENT_OPTS="-Xbootclasspath/p:${HBASE_HOME}/lib/bigtable/alpn-boot-7.1.3.v20150130.jar"
export HADOOP_BIGTABLE_JAR=${HBASE_HOME}/lib/bigtable/bigtable-hbase-mapreduce-0.2.2-shaded.jar
export HADOOP_HBASE_JAR=${HBASE_HOME}/lib/hbase-server-1.1.2.jar
Also, when I try to run hbase shell and then list tables, it just hangs and doesn't fetch the list of tables. This is what happens:
~$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-install/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-install/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016-02-09 00:02:01,334 INFO [main] grpc.BigtableSession: Opening connection for projectId mystical-height-89421, zoneId us-central1-b, clusterId twitter-data, on data host bigtable.googleapis.com, table admin host bigtabletableadmin.googleapis.com.
2016-02-09 00:02:01,358 INFO [BigtableSession-startup-0] grpc.BigtableSession: gRPC is using the JDK provider (alpn-boot jar)
2016-02-09 00:02:01,648 INFO [bigtable-connection-shared-executor-pool1-t2] io.RefreshingOAuth2CredentialsInterceptor: Refreshing the OAuth token
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
hbase(main):001:0> list
TABLE
I've tried:
Double-checking that ALPN and the env variables are appropriately set
Double-checking hbase-site.xml and hbase-env.sh to make sure nothing looks wrong
I also tried connecting to my cluster (as I was previously able to, following these directions) from ANOTHER gcloud instance, but I can't get that to work now either... (it also hangs):
user#gcloud-instance:hbase-1.1.2$ bin/hbase shell
2016-02-09 00:07:03,506 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-02-09 00:07:03,913 INFO [main] grpc.BigtableSession: Opening connection for projectId <project>, zoneId us-central1-b, clusterId <cluster>, on data host bigtable.googleapis.com, table admin host bigtabletableadmin.googleapis.com.
2016-02-09 00:07:04,039 INFO [BigtableSession-startup-0] grpc.BigtableSession: gRPC is using the JDK provider (alpn-boot jar)
2016-02-09 00:07:05,138 INFO [Credentials-Refresh-0] io.RefreshingOAuth2CredentialsInterceptor: Refreshing the OAuth token
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
hbase(main):001:0> list
TABLE
Feb 09, 2016 12:07:08 AM com.google.bigtable.repackaged.io.grpc.internal.TransportSet$1 run
INFO: Created transport com.google.bigtable.repackaged.io.grpc.netty.NettyClientTransport#7b480442(bigtabletableadmin.googleapis.com/64.233.183.219:443) for bigtabletableadmin.googleapis.com/64.233.183.219:443
Any ideas about what I'm doing wrong? It looks like an access issue - how do I fix it?
Thanks!
You can spin up a Dataproc cluster with Bigtable enabled by following these instructions. Then:
1) SSH to the master with ./cluster.sh ssh
2) Run hbase shell to verify that all is in order.
3) Run hadoop jar ${HADOOP_BIGTABLE_JAR} export-table -libjars ${HADOOP_BIGTABLE_JAR} <table-name> gs://<bucket>/some-folder
4) Run gsutil ls gs://<bucket>/some-folder/** and see if _SUCCESS exists. If so, the remaining files are your data.
5) exit from your cluster master.
6) Run ./cluster.sh delete to get rid of the cluster, if you no longer require it.
You ran into a problem with the weekly Java runtime update, which has since been corrected.