Problem:
I receive multiple sets of flat files on a weekly basis that need to be imported into my database. The flat files I receive are not in a conventional format for import, so they need to be run through a script and parsed into a more SQL-friendly format. These flat files are usually JSON, TXT, XML, LOG, etc.
Current Solutions
Currently I have a Windows Forms application and another GUI to transform the files and bulk import them into SQL tables. However, I'm finding it unreliable to ask users to import data, and I would much rather automate the tasks.
More recently, I have been creating SSIS packages. This has proven to be much faster and more useful, since I can add script components that let me manually parse whatever flat files I throw at them. My issue is finding a way to automate this. I have no control over the server where my database is hosted, so I'm unable to deploy the packages there to bring in the files. Currently, I'm just running the packages manually on my local machine to get the data in.
Needed Solution
I need a way to automate getting these flat files in. Originally I wanted to request an FTP server for the files to be dumped in. The files would then be picked up by my packages and imported into the SQL Server DB. However, since I have no control over any of the local folders on that server, it seems impossible for me to automate this. Is there a better way to solve this? Could I build something custom in C#, Python, PowerShell, etc.? I'm very new to the scene, and trying to find a solution for this problem has been a nightmare.
Related
I have a SQL Server database that already contains data. I want to start versioning it. I know I can use a Database Project in Visual Studio, and by importing the database I can generate SQL scripts.
But what about data in the database? I tried to make some Data-Tier Application Files, but when I try to import it in my DB project in Visual Studio I am getting this error:
Import Data-Tier Application File - This operation is not supported for packages containing data
So how do I import data? There has to be some way, because when I extract a DAC file there is an option called Extract Schema and Data, so there must be a way to use this data afterwards.
Or maybe post-deployment scripts are the only option?
Greetings
Your only option for this at this time is to use post-deploy scripts to populate those tables, taking into account the fact that the scripts need to be able to run multiple times without re-inserting data. A temp table/table variable and a MERGE statement are probably your best bets if you might have changes to the reference data, otherwise a left join might suffice.
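As a rough illustration of the re-runnable "left join" idea, here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the `status` table, the `staging_status` temp table, and the sample rows are all hypothetical. In a real SSDT post-deploy script the same anti-join would be written in T-SQL, or replaced by MERGE if existing rows may also need updating.

```python
import sqlite3

# Hypothetical reference data; in practice this would live in source control.
REFERENCE_ROWS = [(1, "Open"), (2, "Closed"), (3, "On hold")]


def seed_status(conn):
    """Insert only the reference rows missing from `status`; return how many were added."""
    before = conn.execute("SELECT COUNT(*) FROM status").fetchone()[0]

    # Stage the desired rows first, then compare them with the target table.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS staging_status (id INTEGER, name TEXT)")
    conn.execute("DELETE FROM staging_status")
    conn.executemany("INSERT INTO staging_status (id, name) VALUES (?, ?)", REFERENCE_ROWS)

    # Anti-join: rows with no match in the target come back with t.id IS NULL.
    conn.execute(
        """
        INSERT INTO status (id, name)
        SELECT s.id, s.name
        FROM staging_status AS s
        LEFT JOIN status AS t ON t.id = s.id
        WHERE t.id IS NULL
        """
    )
    conn.commit()

    after = conn.execute("SELECT COUNT(*) FROM status").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (id INTEGER PRIMARY KEY, name TEXT)")
first_run = seed_status(conn)   # adds the missing rows
second_run = seed_status(conn)  # finds nothing left to add
```

Running the seed twice is the point of the exercise: the second call should add nothing, which is the property a post-deploy script needs.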
Others have tried to include reference data, but it's a pretty hard problem to solve in a manner that works well for everyone. I know others like Ed Elliott have written some stuff that can turn those on/off as needed so you're not always including all reference data every time. You could also look into a post-post-deploy scenario where after your publish and post-deploy, you run a separate script that updates the data from static files. They'd still be in source control, but not necessarily part of your SSDT project. You'd have to remember to run that script in your builds, though.
I know for a while we had a database that solely had the lookup tables populated so we could reference that and do data compares if needed, but that still requires someone to maintain those values in an ongoing manner.
I need to have users import Excel/CSV files to my database.
Currently, I have a VB.NET application that lets me import only CSV files into our database. Rather than scaling this application to keep fitting my needs, and deploying it to users to import data, I'm considering switching to SSIS.
How do I deploy packages so that my users are able to use them to import Excel/CSV files? I know SSIS is not intended to be a front end, so should I not use it for my needs? Is it only used for SQL Developers to import data?
Also, my users have no experience with SQL or using a database. They are used to putting their Excel files on SharePoint or passing them around via email. I just introduced them to SSRS, which works wonderfully as a reporting service, but I need a simple and reliable import process.
Probably not for a few reasons:
You'd have to deploy the SSIS runtime for the package to run - this is not something that is usually done. You'd probably have to pay a licence cost.
SSIS stores metadata (i.e. the type and number of columns in the source and target). If this metadata changes then the package will usually fail.
SSIS is a server tool. It's not really built for user feedback.
Excel as a source is difficult for two reasons:
It has no validation. Users can put anything they want in it, including invalid or missing values
Excel drivers work out metadata by inspecting rows on the fly and this is sometimes incorrect (I'm sure you've already encountered this in your program).
A custom-built solution requires more maintenance but has a lot more flexibility, and you probably need this flexibility given that you have Excel sources.
If your Excel files are guaranteed to be clean every time, and all of your users use a single SQL Server (with a single licensed install of SSIS), then it might be practical.
Added to reflect discussion below:
In this case you have consistent data files coming from elsewhere that need to be automatically uploaded into the database. SSIS can help in this case with the following proven pattern:
A user (or process) saves the file to a specific shared folder
A package, scheduled to run every (say) one minute in SQL Agent, imports all files in that folder
If the import is successful, the file is moved to a 'successful' folder
If the import is unsuccessful, the file is moved to a 'failed' folder
This way, a thick client app doesn't need to be deployed to everyone. Instead any user can drop the file (if they have share access), and it will be automatically pulled in.
Users can also confirm that the file was successful by checking the folder.
Here's an example of a package that imports all files in a folder and moves them when complete:
SSIS - How to loop through files in folder and get path+file names and finally execute stored Procedure with parameter as Path + Filename
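For readers who cannot schedule a package in SQL Agent, the same drop-folder pattern can be sketched as a plain script. This is only an illustration under stated assumptions: `process_drop_folder` and the `import_file` callback are hypothetical names, and the callback is expected to raise an exception when a file cannot be imported.

```python
import shutil
from pathlib import Path


def process_drop_folder(inbox, import_file):
    """Import every file in `inbox`, then move it to 'successful' or 'failed'."""
    inbox = Path(inbox)
    ok_dir = inbox / "successful"
    failed_dir = inbox / "failed"
    ok_dir.mkdir(exist_ok=True)
    failed_dir.mkdir(exist_ok=True)

    results = {}
    # Materialise the file list first so that moving files does not disturb the loop.
    for path in sorted(p for p in inbox.iterdir() if p.is_file()):
        try:
            import_file(path)  # hypothetical callback that loads one file into the database
            target = ok_dir
        except Exception:
            # Any import error sends the file to 'failed' for later inspection.
            target = failed_dir
        shutil.move(str(path), str(target / path.name))
        results[path.name] = target.name
    return results
```

A scheduler such as Windows Task Scheduler could call this every minute. The sketch does not handle two files with the same name arriving on different days, so a real version would probably add a timestamp to the moved file name.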
The overall goal is to have data from an automated daily Cognos report stored in a database so that I am able to report not only on that day but also historical data if I so choose. My general thought is that if I can find a way to automatically add the new daily data to an existing Excel file, I can then use that as my data source and create a dashboard in Tableau. However, I don't have any programming experience, so I'm floundering here.
I'm committed to using Tableau, but I chose Excel only because I'm more familiar with that program than others, along with the fact that an Excel output file is an option in Cognos. If you have better ideas, please don't hesitate to suggest them along with why you believe it's a better idea.
Update: I'm still jumping through hoops to try to get read-only access to the backend database to make this process a lot more efficient, but in the meantime I've moved forward with the long method utilizing Cognos.
I was able to leverage a coworker to create a system file folder to automatically save the Cognos reports to, and then I scheduled a job to run the reports I need. Each of those now saves into a folder in a shared network drive (so my entire team has access to the files), and I wrote a series of macros to append the data each day from those feeder files in the shared drive to a Master File. Now all that's left is to create a Tableau dashboard using the Master File as the data source and I'll have what I need.
Thanks for all your help!
I'm posting this as an answer because it's just too much to leave as a comment.
What you need are 3 things.
Figure out how to have Cognos run your report and download your Excel file.
Use Visual Studio with BIDS (which is the suite of SQL analysis, reporting, and integration services) to automate all the stuff you need to do to append your Excel files, etc... Then you can use the same tools to import that data to your SQL server.
In fact, if all you're doing is trying to get this data into SQL, you can skip the Append Excel part, and just append the data directly to your SQL table.
Once your package is built, you can save it as an automated job on your SQL server to run whenever you wish.
Tableau can use your SQL server as a data source. Once you have that updated, you can run your reports.
I have a directory of SAS7BDAT files, about 300 of them, which I need to import into a SQL Server table. Unfortunately, the date field is not part of the dataset but is in the filename. So I need to parse the filename, get the date, and append it to each dataset at the time of import.
Is SSIS a good candidate for this? If so, do I use a Foreach Loop for this? How do I parse the filename and append the date?
For individual files, I can easily use SQL Server Management Studio to import them. I could do the same for this exercise and then handle the date when loading to the final table, but I am hoping there is a much cleaner solution.
Is there any other backend way of handling this without SAS installed? Python or otherwise?
TIA
[Solved]
I came across an article that mentioned R's sas7bdat package.
Using that, I could successfully load all the files along with the filename into an R list using "ldply".
After some data frame manipulation, I could load all the files into SQL Server using sqlSave.
The files are very small in size. So performance wasn't much of an issue, although I suspect it can be for larger volumes.
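Since the question also asked about Python, here is a minimal sketch of the filename-parsing step only. The naming scheme is an assumption (an 8-digit YYYYMMDD date somewhere in the file name, as in the made-up `claims_20190131.sas7bdat`), and `date_from_filename` and `tag_rows` are hypothetical helpers. Reading the SAS files themselves is left out.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed naming scheme: an 8-digit YYYYMMDD date somewhere in the file name.
DATE_PATTERN = re.compile(r"(\d{8})")


def date_from_filename(path):
    """Return the date embedded in a file name, or raise ValueError if there is none."""
    match = DATE_PATTERN.search(Path(path).stem)
    if match is None:
        raise ValueError(f"no YYYYMMDD date in {path!r}")
    # strptime also rejects impossible dates such as 20191340.
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def tag_rows(rows, path):
    """Return copies of the row dicts with a 'file_date' key taken from the file name."""
    file_date = date_from_filename(path)
    return [{**row, "file_date": file_date} for row in rows]
```

For example, `tag_rows([{"id": 1}], "claims_20190131.sas7bdat")` would attach 31 January 2019 to the row before it is loaded into the table.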
I have a JSON file of data which I have pulled from an API and I would very much like to just dump this data into an SQL Server.
The reason it's SQL Server specifically is that the database is already in place for the current project. I have spent time googling this and searching on here but was unable to find anything useful thus far. I'm familiar with Python but I'm open to any solution.
TLDR: I'm interested in which languages and packages provide easy solutions to automate loading JSON into a SQL Server table. Do you have any suggestions, or know of any packages that already achieve this?
You can use something like SSIS to accomplish this (you may already have it) by writing a script task. This could do custom parsing and then load the data into the correct table. This can be easily automated. I mention SSIS because it's very easy to add future tasks to this, if that's ever required.
Alternatively you could create a script outside of the database (e.g. in Python) that parses the JSON, connects to the database through ODBC/OLEDB and writes the records. This can be automated using Task Scheduler or something similar. An example implementation of this could use pyodbc.
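A minimal sketch of that script approach follows. To keep it runnable without a SQL Server instance, it uses Python's built-in `sqlite3` as a stand-in connection. With pyodbc you would instead obtain the connection from `pyodbc.connect(...)` with your own connection string, and the `?` parameter markers work the same way there. The `products` table, its columns, and the sample JSON are all made up for illustration.

```python
import json
import sqlite3


def load_json_records(conn, json_text):
    """Parse a JSON array of objects and insert one row per object."""
    records = json.loads(json_text)
    # Pick out the expected fields; a missing 'price' becomes NULL.
    rows = [(r["id"], r["name"], r.get("price")) for r in records]
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO products (id, name, price) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Stand-in database; swap this for a pyodbc connection to reach SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

sample = '[{"id": 1, "name": "widget", "price": 2.5}, {"id": 2, "name": "gadget"}]'
loaded = load_json_records(conn, sample)
```

Passing the values as parameters rather than building the INSERT string by hand avoids quoting problems with whatever the API returns.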
You can use a WCF web service to send JSON data to SQL Server.
See the links below; I hope they will be helpful.
http://www.codeproject.com/Articles/167159/How-to-create-a-JSON-WCF-RESTful-Service-in-sec
http://mikesknowledgebase.azurewebsites.net/pages/Services/WebServices-Page2.htm
You can't directly fetch JSON data in SQL Server; instead, you can use a WCF service.