Spark - Reading from SQL Server using com.microsoft.azure

Spark - Reading from SQL Server using com.microsoft.azure - sql-server

I am trying to read from a table using com.microsoft.azure. Below is the code snippet
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._
import com.microsoft.azure.sqldb.spark.query._
import org.apache.spark.sql.functions.to_date
val spark = SparkSession.builder().master("local[*]").appName("DbApp").getOrCreate()
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")
val config = Config(Map(
"url" -> "jdbc:sqlserver://localhost:1433",
"databaseName" -> "Student",
"dbTable" -> "dbo.MemberDetail",
"authentication" -> "SqlPassword",
"user" -> "test",
"password" -> "****"
))
val df = spark.sqlContext.read.sqlDB(config)
println("Total rows: " + df.count)
However I am getting below error
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Product$class
at com.microsoft.azure.sqldb.spark.config.SqlDBConfigBuilder.<init>(SqlDBConfigBuilder.scala:31)
at com.microsoft.azure.sqldb.spark.config.Config$.apply(Config.scala:254)
at com.microsoft.azure.sqldb.spark.config.Config$.apply(Config.scala:235)
at DbApp$.main(DbApp.scala:55)
at DbApp.main(DbApp.scala)
MSSQL JDBC Version: mssql-jdbc-7.2.2.jre8
azure-sqldb-spark version: 1.0.2
Could anyone kindly guide me what am I doing wrong.?

The class doesn't seem to be set in your config nor specified anywhere else. Class.forName just validates presence of the JDBC driver. The driver is also for microsoft.sqlserver, which is different library.
Consider using this:
import com.microsoft.sqlserver.jdbc.SQLServerDriver
import java.util.Properties
val jdbcHostname = "localhost"
val jdbcPort = 1433
val jdbcDatabase = "Student"
val jdbcTable = "dbo.MemberDetail"
val MyDBUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"
val MyDBProperties = new Properties()
MyDBProperties.put("user", "test")
MyDBProperties.put("password", "****")
MyDBProperties.setProperty("Driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
val df = spark.read.jdbc(MyDBUrl, jdbcTable, MyDBProperties)
This approach was most stable in my environment (using Databricks and Azure SQL DB).
Related knowledgebase article available here.

Since you are using azure-sqldb-spark to connect to SQL server.
All connection properties in Microsoft JDBC Driver for SQL Server are supported in this connector. Add connection properties as fields in the com.microsoft.azure.sqldb.spark.config.Config object.
You don't need to create the jdbc driver Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver") again.
Your cold should be like this:
import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._
val config = Config(Map(
"url" -> "locaohost",
"databaseName" -> "MyDatabase",
"dbTable" -> "dbo.Clients",
"user" -> "username",
"password" -> "*********",
"connectTimeout" -> "5", //seconds
"queryTimeout" -> "5" //seconds
))
val collection = sqlContext.read.sqlDB(config)
collection.show()
Please ref:
Connect Spark to SQL DB using the connector
azure-sqldb-spark
Hope this helps.

This issue is due to version (versions are mentioned in the question itself) conflict between com.microsoft.azure.sqldb and com.microsoft.jdbc driver, after downloading com.microsoft.azure.sqldb with all its dependencies from below link it worked.
Note: com.microsoft.azure.sqldb works on Java 8, I downgraded my java runtime version.
Click here to com.microsoft.azure.sqldb with all dependencies

Related

Snowflake Scala Connector(snowpark)

I am new to scala and trying to create a snowflake connector in scala using snowflake snowpark scala library.
Here is my simple code
package com.abc.commons.rest.snowflake
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._
object ScalaConnector {
def main(args: Array[String]): Unit = {
// Replace the <placeholders> below.
val configs = Map (
"URL" -> "https://xxx.snowflakecomputing.com:443",
"USER" -> "xxx",
"PASSWORD" -> "xxx",
"ROLE" -> "xxx",
"WAREHOUSE" -> "xxx",
"DB" -> "xxx",
"SCHEMA" -> "xxx"
)
val session = Session.builder.configs(configs).create
session.sql("show tables").show()
session.close();
}
}
I have provided correct credentials and then on running the above code in IntelliJ I am getting below errors:
Exception in thread "main" java.lang.ExceptionInInitializerError
at com.abc.commons.rest.snowflake.ScalaConnector$.main(ScalaConnector.scala:17)
at com.abc.commons.rest.snowflake.ScalaConnector.main(ScalaConnector.scala)
Caused by: java.lang.NullPointerException
at com.snowflake.snowpark.Session$.getActiveSession(Session.scala:1204)
at com.snowflake.snowpark.SnowparkClientException.<init>(SnowparkClientException.scala:15)
at com.snowflake.snowpark.internal.ErrorMessage$.createException(ErrorMessage.scala:380)
at com.snowflake.snowpark.internal.ErrorMessage$.MISC_SCALA_VERSION_NOT_SUPPORTED(ErrorMessage.scala:340)
at com.snowflake.snowpark.internal.Utils$.checkScalaVersionCompatibility(Utils.scala:243)
at com.snowflake.snowpark.internal.Utils$.checkScalaVersionCompatibility(Utils.scala:233)
at com.snowflake.snowpark.Session$.<init>(Session.scala:1129)
at com.snowflake.snowpark.Session$.<clinit>(Session.scala)
... 2 more
libraryDependencies I use in build.sbt
"com.snowflake" % "snowpark" % "1.4.0"
Can anyone point out the issue in my code?

There's no issue with the code, you're just using an unsupported Scala version. From your stacktrace:
MISC_SCALA_VERSION_NOT_SUPPORTED
We only support Scala 2.12 as explained here.

Export Data from Hadoop using sql-spark-connector (Apache)

I am trying to export data from Hadoop to MS SQL using Apache Spark SQL Connector as instructed here sql-spark-connector which fails with exception java.lang.NoSuchMethodError: com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.writeToServer (Lcom/microsoft/sqlserver/jdbc/ISQLServerBulkRecord;)V
According to official documentation Supported Versions
My Development Environment:
Hadoop Version: 2.7.0
Spark Version: 2.4.5
Scala Version: 2.11.12
MS SQL Version: 2016
My Code:
package com.company.test
import org.apache.spark.sql.SparkSession
object TestETL {
def main(args:Array[String]):Unit = {
val spark:SparkSession = SparkSession
.builder()
.getOrCreate()
import spark.implicits._
// create DataFrame
val export_df = Seq(1,2,3).toDF("id")
export_df.show(5)
// Connection String
val server_name = "jdbc:sqlserver://ip_address:port"
val database_name = "database"
val url = server_name + ";" + "databaseName=" + database_name + ";"
export_df.write
.format("com.microsoft.sqlserver.jdbc.spark")
.mode("append")
.option("url", url)
.option("dbtable", "export_test")
.option("user", "username")
.option("password", "password")
.save()
}
}
My SBT
build.sbt
Command line argument I executed
/mapr/abc.company.com/user/dir/spark-2.4.5/bin/spark-submit --class com.company.test.TestETL /mapr/abc.company.com/user/dir/project/TestSparkSqlConnector.jar
JDBC Exception
I de-compiled the mssql-jdbc-8.2.0.jre8.jar to check if it is missing the SQLServerBulkCopy.writeToServer method implementation but that doesn't see to be the case.
Any insights on how I can fix this?

it is a compatability error please to reffer to this link it will explain the error or just choose compatible versions. gitHub link

How correctly connect to Oracle 12g database in Play Framework?

I am new in Play Framework (Scala) and need some advise.
I use Scala 2.12 and Play Framework 2.6.20. I need to use several databases in my project. Right now I connected MySQL database as it says in documentation. How correctly connect project to remote Oracle 12g database?
application.conf:
db {
mysql.driver = com.mysql.cj.jdbc.Driver
mysql.url = "jdbc:mysql://host:port/database?characterEncoding=UTF-8"
mysql.username = "username"
mysql.password = "password"
}
First of all to lib folder I put ojdbc8.jar file from oracle website.
Then add libraryDependencies += "com.oracle" % "ojdbc8" % "12.1.0.1" code to sbt.build file. Finally I wrote settings to aplication.conf file.
After that step I notice error in terminal:
[error] (*:update) sbt.ResolveException: unresolved dependency: com.oracle#ojdbc8;12.1.0.1: not found
[error] Total time: 6 s, completed 10.11.2018 16:48:30
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
EDIT:
application.conf:
db {
mysql.driver = com.mysql.cj.jdbc.Driver
mysql.url = "jdbc:mysql://#host:#port/#database?characterEncoding=UTF-8"
mysql.username = "#username"
mysql.password = "#password"
oracle.driver = oracle.jdbc.driver.OracleDriver
oracle.url = "jdbc:oracle:thin:#host:#port/#sid"
oracle.username = "#username"
oracle.password = "#password"
}
ERROR:
play.api.UnexpectedException: Unexpected exception[CreationException: Unable to create injector, see the following errors:
1) No implementation for play.api.db.Database was bound.
while locating play.api.db.Database
for the 1st parameter of controllers.GetMarkersController.<init>(GetMarkersController.scala:14)
while locating controllers.GetMarkersController
for the 7th parameter of router.Routes.<init>(Routes.scala:45)
at play.api.inject.RoutesProvider$.bindingsFromConfiguration(BuiltinModule.scala:121):
Binding(class router.Routes to self) (via modules: com.google.inject.util.Modules$OverrideModule -> play.api.inject.guice.GuiceableModuleConversions$$anon$1)
GetMarkersController.scala:
package controllers
import javax.inject._
import akka.actor.ActorSystem
import play.api.Configuration
import play.api.mvc.{AbstractController, ControllerComponents}
import play.api.libs.ws._
import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future, Promise}
import services._
import play.api.db.Database
class GetMarkersController #Inject()(db: Database, conf: Configuration, ws: WSClient, cc: ControllerComponents, actorSystem: ActorSystem)(implicit exec: ExecutionContext) extends AbstractController(cc) {
def getMarkersValues(start_date: String, end_date: String) = Action.async {
getValues(1.second, start_date: String, end_date: String).map {
message => Ok(message)
}
}
private def getValues(delayTime: FiniteDuration, start_date: String, end_date: String): Future[String] = {
val promise: Promise[String] = Promise[String]()
val service: GetMarkersService = new GetMarkersService(db)
actorSystem.scheduler.scheduleOnce(delayTime) {
promise.success(service.get_markers(start_date, end_date))
}(actorSystem.dispatcher)
promise.future
}
}

You cannot access Oracle without credentials. You need to have an account with Oracle. Then add something like the following to your build.sbt file
resolvers += "Oracle" at "https://maven.oracle.com"
credentials += Credentials("Oracle", "maven.oracle.com", "username", "password")
More information about accessing the OTN: https://docs.oracle.com/middleware/1213/core/MAVEN/config_maven_repo.htm#MAVEN9012
If you have the hard coded jar, you don't need to include as a dependency. See unmanagedDependencies https://www.scala-sbt.org/1.x/docs/Library-Dependencies.html

Spark query sql server

i'm trying to query SQL server using Spark/scala and running into an issue
here is the code
import org.apache.spark.SparkContext
object temp {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("temp").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val jdbcSqlConnStr = "jdbc:sqlserver://XXX.XXX.XXX.XXX;databaseName=test;user=XX;password=XXXXXXX;"
val jdbcDbTable = "[test].dbo.[Persons]"
val jdbcDF = sqlContext.read.format("jdbc").options(
Map("url" -> jdbcSqlConnStr,
"dbtable" -> jdbcDbTable)).load()
jdbcDF.show(10)
println("Complete")
}
}
below is the error and i assume it is complaining about main method - but why ?how to fix it.
error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.ObjectRef.create(Ljava/lang/Object;)Lscala/runtime/ObjectRef;
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:888)
at org.apache.spark.sql.SQLContext.(SQLContext.scala:70)
at apachetika.temp$.main(sqltemp.scala:24)
at apachetika.temp.main(sqltemp.scala)
18/09/28 16:04:40 INFO spark.SparkContext: Invoking stop() from shutdown hook

As far as I can tell this is due to a scala version mismatch
The library compiled with spark_core dependence with scala 2.11 instead of scala 2.10. Use scala 2.11.8+.
Hope this helps.

Error in Jenkins using SQL Server Driver

I'm trying to run a Groovy script to connect to Microsoft SQL Server within Jenkins and insert new data . I want to use the SQL Server driver and I placed the driver in Jenkins\war\WEB-INF\lib. I get an error when I Tried to run the following code in the build step - Execute Groovy Script:
import groovy.sql.Sql
import com.microsoft.sqlserver.jdbc.SQLServerDriver
class Connection {
def sqlConnection
def route = "xxxx"
def user = "xxxx"
def password = "xxxxx"
def driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
Connection(db){
this.route+=db.toString()
this.sqlConnection = Sql.newInstance( route, user, password, driver )
}
static main(args) {
Connection con = new Connection("nameDataBase")
}
}
The error is:
1 error org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
E:\Jenkins\workspace\pruebaDB\hudson1035758401251924782.groovy: 2:
unable to resolve class com.microsoft.sqlserver.jdbc.SQLServerDriver # line 2, column 1.
import com.microsoft.sqlserver.jdbc.SQLServerDriver`

The most evident case of such exception is that necessary class not yet in classpath during script execution.
During a which build step script is executed?
If my assumptions are right and you can't shift script execution time to the other build step (when all necessary classes/libs will be in the cp) try to use class loader.
Here is a few links which should help you with this:
class lodding fun
class loading in groovy

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Spark - Reading from SQL Server using com.microsoft.azure - sql-server

Related

Snowflake Scala Connector(snowpark)

Export Data from Hadoop using sql-spark-connector (Apache)

How correctly connect to Oracle 12g database in Play Framework?

Spark query sql server

Error in Jenkins using SQL Server Driver

Categories

Resources