How to use SQL JDBC driver with Spark and Zeppelin - sql-server

I am trying to read data from SQL Server and process it using Spark. I am using Zeppelin to write my Scala commands. I have never worked with Java, Spark, or Zeppelin, so I am having a hard time figuring out the issues.
I installed Spark on my machine and everything seems to be working, as I can get to spark-shell successfully. I have installed Zeppelin via Docker and this also seems to be working, as I can create a new notebook, run "sc", and see the SparkContext type printed.
Now I want to read data from SQL Server. I am planning to use the azure-sqldb-spark connector but I am not sure how to use it. I am trying to add it as an interpreter in Zeppelin, but I am not sure what the required properties are or how to use it.
This is what I did so far.
Downloaded the jar file from GitHub repo. (I am not able to run this on my machine as this is complaining that there is no manifest file)
Copied this jar file to the container running zeppelin
Tried to create an interpreter in Zeppelin
Here are the properties:
I am specifying the dependency on jar file like this.
I tried to play with the properties a bit but no luck. I am not even sure if this is the right way to do this.
I am trying to run the following query but I am running into a "suitable driver not found" error.
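A "suitable driver not found" error can appear when the JDBC driver class is not named explicitly, or when the driver jar is not on the interpreter's classpath. The following is a minimal sketch, in Python rather than the Scala used in the notebook, of the options that Spark's generic JDBC data source expects (this is not the azure-sqldb-spark connector mentioned above). The host, database, table, and credentials are placeholders, and `sqlserver_jdbc_options` is a hypothetical helper introduced only for illustration.

```python
def sqlserver_jdbc_options(host, database, table, user, password, port=1433):
    """Build the option map for Spark's generic JDBC data source.

    All argument values are placeholders; this helper is hypothetical.
    """
    return {
        # SQL Server JDBC URL: jdbc:sqlserver://host:port;databaseName=db
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Name the driver class explicitly so Spark does not have to guess it.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


options = sqlserver_jdbc_options(
    "myserver.example.com", "mydb", "dbo.mytable", "me", "secret"
)
print(options["url"])

# With a PySpark session (assumed to exist as `spark`) and the Microsoft JDBC
# driver jar on the classpath, the options could be used like this:
# df = spark.read.format("jdbc").options(**options).load()
```

The Microsoft JDBC driver jar still has to be visible to the Spark interpreter; naming the class does not help if the jar cannot be loaded.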

Related

Is there any way to change the SQL Server name in Python code based on the environment?

In my Python script I establish a connection to the DEV SQL Server, and I call this script from my SSIS package. Now I want to deploy the project to the Production server.
Q: How can the SQL Server connection be switched from Development to Production automatically, without editing the script manually? Is there a way for the script to read which environment it is running in?
Please help me out with this, thank you.
Use sys.argv and pass it as a command-line parameter.
Pull it from an environment variable via os.environ.
Read it from a config file with configparser.
Without any sample code, it's hard to say what the right approach should be, but I would favor a command-line parameter, as that allows you to provide the value from the SSIS package (instead of defining configurations in both SSIS space and Python space).
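The three options above can be combined in one lookup. This is a minimal sketch, not the asker's script: the `--server` flag, the `SQL_SERVER_NAME` variable, the `[database]` config section, and the server names are all hypothetical. It uses argparse, which reads sys.argv by default.

```python
import argparse
import configparser
import os


def resolve_server(argv=None, env=None, config_path=None, default="DEV-SQL01"):
    """Return the SQL Server name, checking the sources in priority order."""
    env = os.environ if env is None else env

    # 1. Command-line parameter, e.g. supplied by the SSIS package.
    #    When argv is None, argparse falls back to sys.argv[1:].
    parser = argparse.ArgumentParser()
    parser.add_argument("--server")
    args, _unknown = parser.parse_known_args(argv)
    if args.server:
        return args.server

    # 2. Environment variable defined on the machine (hypothetical name).
    if env.get("SQL_SERVER_NAME"):
        return env["SQL_SERVER_NAME"]

    # 3. Config file with a [database] section (hypothetical layout).
    if config_path:
        config = configparser.ConfigParser()
        config.read(config_path)
        if config.has_option("database", "server"):
            return config.get("database", "server")

    # Nothing was supplied, so fall back to the development server.
    return default


print(resolve_server(["--server", "PROD-SQL01"], env={}))
```

From SSIS, the script would then be invoked with something like `python script.py --server PROD-SQL01`, so the value lives only in the package.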

Vim dadbod configuring adapters

I'm trying to configure the plugin dadbod (https://github.com/tpope/vim-dadbod) and must confess I don't know vimscript well enough to comprehend the code :(
I'm stuck on configuring the database adapters. Irrespective of what URL I try, I just get the message
DB: no adapter for SQL Server.
I've also tried SQLite and Postgres with the same results.
In the wiki, there's a statement: "Supports a modern array of backends", which makes me think that perhaps I haven't configured "the backend"? I have the JDBC SQL Server driver installed, and have set a JAVA_HOME environment variable, which works fine with DBeaver and with Azure Data Studio.
I haven't been able to find anything on the web about how to configure dadbod beyond the command structure. Am I missing something obvious about how the plugin works?
Your help greatly appreciated!
The vim-dadbod plugin was definitely not installed correctly. I did a clean install of Vim, then installed the package manager Vundle. Following Vundle's instructions I was able to install vim-dadbod.
I'll be posting a followup later, but the issue is no longer the plugin itself!

How To use JMeter for Database server performance testing?

I am testing the performance of an Oracle 12c database using JMeter. I am totally new to JMeter. For testing I have created a JAR file from a Java program. The Java program uses the JDBC driver to connect to the Oracle database.
In JMeter, I have added a Thread Group, and inside the thread group I have added a Java Request as the sampler. Am I following the right procedure?
Even if my procedure is right, I get an error when I check the results in the Table and Tree views. I have attached snapshots of the table and tree in JMeter.
Normally people use the JDBC Connection Configuration element and the JDBC Request sampler for database load testing. See Building a Database Test Plan for more information.
However, JMeter is very flexible and your approach is also viable. In order to troubleshoot your problem:
First of all, every time you face a problem with JMeter, take a look at the jmeter.log file. In the vast majority of cases it contains enough information to work out why the JMeter test has failed.
If your JAR doesn't contain the Oracle JDBC driver, you need to put the Oracle JDBC driver on the JMeter classpath as well. A JMeter restart is required to pick up the Oracle JDBC driver jar.
You can run JMeter with the debugger enabled like:
java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8888 -jar ApacheJMeter.jar -t your_testplan.jmx
and use your favourite IDE to connect to the machine running JMeter on port 8888, step through your code, and see where the errors live. See the How to Debug your Apache JMeter Script article for more tips on getting to the bottom of your JMeter test failures.
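As a companion to the jmeter.log advice above, here is a minimal sketch of a filter that keeps only the ERROR and WARN lines. It assumes each log line starts with a date, a time, and then the level; the sample lines are made up for illustration and are not real JMeter output.

```python
import re

# Assumed line shape: "<date> <time> <LEVEL> <logger>: <message>"
LEVEL_PATTERN = re.compile(r"^\S+ \S+ (ERROR|WARN)\s")


def find_problems(lines):
    """Return the lines logged at ERROR or WARN level, without newlines."""
    return [line.rstrip("\n") for line in lines if LEVEL_PATTERN.match(line)]


# Hypothetical sample lines in the assumed shape.
sample = [
    "2024-01-01 10:00:00,000 INFO example.Startup: test plan loaded\n",
    "2024-01-01 10:00:01,000 ERROR example.Sampler: could not connect\n",
    "2024-01-01 10:00:02,000 WARN example.Sampler: retrying\n",
]
problems = find_problems(sample)
for line in problems:
    print(line)

# Against a real log file the call would look like:
# with open("jmeter.log", encoding="utf-8") as log_file:
#     problems = find_problems(log_file)
```

Stack traces that follow an ERROR line do not start with a timestamp, so this filter drops them; it only points at where to start reading in the full log.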

Filezilla/Wildfly server deployment error (Possible database dependencies)

This is my first time on here. I am having an issue deploying a Java application I made in MyEclipse. I am using Filezilla to host my Wildfly 9.0.2 test server. I exported my project to a .war file, and upon dragging it into the test server I am met with a deployment.failed file. Viewed in Notepad, the file declares "Services with missing/unavailable dependencies". One such error can be seen below:
[ "jboss.naming.context.java.module.myproject.myproject.env.common.jdbc.database_connection is missing [jboss.naming.context.java.database.connection] "
There are five of these similar errors, and each points to a different database connection of some type that I am not using within my project. I understand the issue, but I do not know where these dependencies are declared or how I can go about removing them.
Any help will be greatly appreciated.
Kind Regards,
Paul
Creating the WAR file will use the project's deployment assembly (assuming you're using MyEclipse 2013 or later). Right click on the project and select Properties. Then go to the MyEclipse/Deployment Assembly page. This will have all of the files that are added to the deployment (or to the WAR file).
However, the message seems to suggest that the project is using a database connection which can't be found when running on the server. My first thought was that you're using the inbuilt Derby database but don't have that running when you run on Wildfly. But you say that you're not using a database. Also, I'm not familiar with how Filezilla can host a J2EE server; I thought Filezilla was an FTP client and server solution. Perhaps you could give more details, if this answer doesn't help.

How to transfer a ssis package from Dev to Prod?

I'm trying to move my packages to production using a configuration file, but the file is only partly applied and the results still go to the DEV server.
Does anybody know what to do?
It is difficult to isolate the cause of your issues without access to your configuration files.
What I suggest you do is make use of package configurations that reference a database within your environment. The databases themselves can then be referenced using environment variables that are unique to each environment.
This is a brilliant time saver and a good way to centrally manage the configuration of all your SSIS packages. Take a look at the following reference for details.
http://www.mssqltips.com/tip.asp?tip=1405
Once configured, you can deploy the same identical package between dev and production without needing to apply a single modification to the SSIS package or mess around with configuration files.
You could still have hard-coded connections in your package even though you are using a configuration file. You'll need to check every connection as well.
You can also go the long way around. Go into Integration Services and export the stored package to its dtsx file. Then you can open the file in any good text editor, do a find/replace on your server name, and go back into Integration Services and import the updated package. A lot of the time it's just easier...
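The find/replace step can also be scripted instead of done in a text editor. This is a minimal sketch that assumes the exported dtsx file is UTF-8 XML and that the server name appears as plain text in it; the server names and the connection string fragment are hypothetical.

```python
from pathlib import Path


def replace_server_name(package_text, old_server, new_server):
    """Return the updated package text and the number of replacements made."""
    count = package_text.count(old_server)
    return package_text.replace(old_server, new_server), count


def retarget_package(path, old_server, new_server):
    """Rewrite an exported .dtsx file in place; return the replacement count."""
    package_file = Path(path)
    text = package_file.read_text(encoding="utf-8")  # encoding is an assumption
    updated, count = replace_server_name(text, old_server, new_server)
    package_file.write_text(updated, encoding="utf-8")
    return count


# Hypothetical connection string fragment as it might appear in a package.
sample = "Data Source=DEVSQL01;Initial Catalog=Sales;"
updated, count = replace_server_name(sample, "DEVSQL01", "PRODSQL01")
print(updated, count)
```

Checking the returned count is a cheap way to notice a package where the server name was not found at all, or was found more often than expected.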
Thanks, everybody, for answering. I managed to solve this problem in an ugly way, by editing the packages on the server, but I would much prefer a more elegant solution. I am now trying the environment variable approach, which seems great, but the wizard I get is different from the one shown in the link, and I don't know how to continue (I'm using Visual Studio 2005). Besides, I tried an XML configuration file, but the package run fails even on the source machine, so I'm stuck!
My personal technique has been to first have a single config file that points the package to a SQL-based package configuration (the connection string to the config DB). Subsequent entries in the package config use the SQL store to load their settings. I have a script that goes into the XML of the packages and preps them for deployment to stage or prod. A config file holds the name of the package configuration's initial file config entry and where the stage and prod configuration DB configuration files are located. The script produces two subdirectories, one for stage and one for prod. Each directory has a copy of the solution packages modified for their particular deployment.
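The answer does not show its script, so the following is only a sketch of the general idea: copy every package into one subdirectory per environment and swap the path of the initial config file. It treats the packages as plain UTF-8 text rather than parsing the XML, and every path and file name in it is hypothetical.

```python
import tempfile
from pathlib import Path


def prepare_deployments(solution_dir, output_dir, dev_config, env_configs):
    """Copy each .dtsx package into one subdirectory per environment,
    replacing the dev config-file path with that environment's path."""
    written = []
    for env_name, env_config in env_configs.items():
        target_dir = Path(output_dir) / env_name
        target_dir.mkdir(parents=True, exist_ok=True)
        for package in sorted(Path(solution_dir).glob("*.dtsx")):
            text = package.read_text(encoding="utf-8")
            target = target_dir / package.name
            target.write_text(text.replace(dev_config, env_config), encoding="utf-8")
            written.append(target)
    return written


# Demonstration on a throwaway directory with one made-up package.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "solution"
    source.mkdir()
    (source / "Load.dtsx").write_text(
        r"ConfigString=C:\cfg\dev.dtsConfig", encoding="utf-8"
    )
    files = prepare_deployments(
        source,
        Path(tmp) / "out",
        r"C:\cfg\dev.dtsConfig",
        {"stage": r"C:\cfg\stage.dtsConfig", "prod": r"C:\cfg\prod.dtsConfig"},
    )
    results = {f.parent.name: f.read_text(encoding="utf-8") for f in files}

print(results["prod"])
```

Because the source packages are never modified, the same run can be repeated for every release without touching the development copies.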
Also! Don't forget to turn off encryption in the package files!

Resources