I need to have users import Excel/CSV files to my database.
Currently, I have a VB.net application that imports only CSV files into our database. Rather than scaling this application to keep up with my needs and deploying it to users so they can import data, I'm considering switching to SSIS.
How do I deploy packages so that my users are able to use them to import Excel/CSV files? I know SSIS is not intended to be a front end, so should I not use it for my needs? Is it only used for SQL Developers to import data?
Also, my users have no experience with SQL or using a database. They are used to putting their Excel files on SharePoint or passing them around via email. I just introduced them to SSRS, which works wonderfully as a reporting service, but I need a simple and reliable import process.
Probably not, for a few reasons:
You'd have to deploy the SSIS runtime for the package to run, which is not something that is usually done. You'd probably have to pay a licence cost.
SSIS stores metadata (i.e. the type and number of columns in the source and target). If this metadata changes then the package will usually fail
SSIS is a server tool. It's not really built for user feedback
Excel as a source is difficult for two reasons:
It has no validation. Users can put anything they want in it, including invalid or missing values
Excel drivers work out metadata by inspecting rows on the fly and this is sometimes incorrect (I'm sure you've already encountered this in your program)
A custom-built solution requires more maintenance but has a lot more flexibility, and you probably need this flexibility given that you have Excel sources.
If your Excel files are guaranteed to be clean every time, and all of your users use a single SQL Server (with a single licensed install of SSIS), then it might be practical.
Added to reflect discussion below:
In this case you have consistent data files coming from elsewhere that need to be automatically uploaded into the database. SSIS can help in this case with the following proven pattern:
A user (or process) saves the file to a specific shared folder
A package, scheduled to run every (say) one minute in SQL Agent, imports all files in that folder
If the import is successful, the file is moved to a 'successful' folder
If the import is unsuccessful, the file is moved to a 'failed' folder
This way, a thick client app doesn't need to be deployed to everyone. Instead any user can drop the file (if they have share access), and it will be automatically pulled in
Users can also confirm that the file was successful by checking the folder
Here's an example of a package that imports all files in a folder and moves them when complete:
SSIS - How to loop through files in folder and get path+file names and finally execute stored Procedure with parameter as Path + Filename
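Outside SSIS, the same drop-folder pattern can be sketched in a few lines of Python. This is only an illustration of the loop, import, and move steps, not an SSIS package: SQLite stands in for the real database, and the `imports` table, its `name`/`amount` columns, and the folder names are assumptions made up for the example.

```python
import csv
import shutil
import sqlite3
from pathlib import Path


def import_csv(conn: sqlite3.Connection, path: Path) -> int:
    """Load one CSV file into the hypothetical `imports` table; return the row count."""
    with path.open(newline="") as handle:
        rows = [
            (path.name, r["name"], float(r["amount"]))
            for r in csv.DictReader(handle)
        ]
    with conn:  # commit on success, roll back if the insert fails
        conn.executemany(
            "INSERT INTO imports (source_file, name, amount) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def process_folder(conn, inbox: Path, ok_dir: Path, failed_dir: Path) -> dict:
    """Import every CSV in `inbox`, then move it to `ok_dir` or `failed_dir`."""
    ok_dir.mkdir(parents=True, exist_ok=True)
    failed_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for path in sorted(inbox.glob("*.csv")):
        try:
            import_csv(conn, path)
            target = ok_dir
        except (KeyError, TypeError, ValueError, csv.Error, sqlite3.Error):
            # Missing column, bad number, malformed file, or database error.
            target = failed_dir
        shutil.move(str(path), str(target / path.name))
        results[path.name] = target.name
    return results
```

A scheduler (SQL Agent, Task Scheduler, cron) would call `process_folder` every minute or so; users then check the two folders to see what happened to their file.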
I have a SQL Server database that already contains data. I want to start versioning it. I know I can use a Database project in Visual Studio, and by importing the database I can generate SQL scripts.
But what about data in the database? I tried to make some Data-Tier Application Files, but when I try to import it in my DB project in Visual Studio I am getting this error:
Import Data-Tier Application File - This operation is not supported for packages containing data
So how do I import data? There has to be some way, because when I extract a DAC file there is an option called Extract Schema and Data, so there has to be a way to use this data afterwards.
Or maybe post-deployment scripts are the only option?
Greetings
Your only option for this at this time is to use post-deploy scripts to populate those tables, taking into account the fact that the scripts need to be able to run multiple times without re-inserting data. A temp table/table variable and a MERGE statement are probably your best bets if you might have changes to the reference data, otherwise a left join might suffice.
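The "safe to run multiple times" requirement can be illustrated with a small sketch. It uses Python with SQLite purely as a stand-in so the example is runnable; in an SSDT post-deploy script the same idea would be written in T-SQL, typically with a temp table and a MERGE statement. The `status` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical reference data kept in source control.
REFERENCE_ROWS = [(1, "Open"), (2, "Closed"), (3, "On hold")]


def seed_status(conn: sqlite3.Connection, rows) -> None:
    """Bring the `status` lookup table in line with `rows`; safe to run repeatedly."""
    with conn:
        conn.execute(
            "CREATE TEMP TABLE IF NOT EXISTS staging "
            "(id INTEGER PRIMARY KEY, label TEXT)"
        )
        conn.execute("DELETE FROM staging")
        conn.executemany("INSERT INTO staging (id, label) VALUES (?, ?)", rows)
        # Update labels that changed since the last deployment.
        conn.execute(
            """UPDATE status
               SET label = (SELECT s.label FROM staging AS s
                            WHERE s.id = status.id)
               WHERE EXISTS (SELECT 1 FROM staging AS s
                             WHERE s.id = status.id
                               AND s.label <> status.label)"""
        )
        # Insert only rows that are not present yet (the left-join check).
        conn.execute(
            """INSERT INTO status (id, label)
               SELECT s.id, s.label
               FROM staging AS s
               LEFT JOIN status AS t ON t.id = s.id
               WHERE t.id IS NULL"""
        )
```

Running `seed_status` twice with the same rows leaves the table unchanged the second time. Rows that exist in the table but not in the reference list are deliberately left alone here; whether to delete them is a separate decision.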
Others have tried to include reference data, but it's a pretty hard problem to solve in a manner that works well for everyone. I know others like Ed Elliott have written some stuff that can turn those on/off as needed so you're not always including all reference data every time. You could also look into a post-post-deploy scenario where after your publish and post-deploy, you run a separate script that updates the data from static files. They'd still be in source control, but not necessarily part of your SSDT project. You'd have to remember to run that script in your builds, though.
I know for a while we had a database that solely had the lookup tables populated so we could reference that and do data compares if needed, but that still requires someone to maintain those values in an ongoing manner.
Problem:
I receive multiple sets of flat files on a weekly basis that need to be imported into my database. The flat files I receive are not in a conventional format for import, so they need to be run through a script and parsed into a more SQL-friendly format. These flat files are usually JSON, TXT, XML, LOG, etc.
Current Solutions
Currently I have a Windows Forms application and another GUI to transform the files and bulk import them into SQL tables. However, I'm finding it unreliable to ask users to import data, and I would much rather automate the tasks.
More recently, I have been creating SSIS packages. This has proven to be much faster and more useful, since I can add script components that let me manually parse whatever flat files I throw at them. My issue is finding a way to automate this. I have no control over the server where my database is hosted, so I'm unable to deploy the packages there to bring in the files. Currently, I'm just running the packages manually on my local machine to get the data in.
Needed Solution
I need a way to automate getting these flat files in. Originally I wanted to request an FTP server for the files to be dumped on. The files would then be picked up by my packages and imported into the SQL Server DB. However, since I have no control over any of the local folders on that server, it seems impossible for me to automate this. Is there a better way to solve this? Could I build something custom in C#, Python, PowerShell, etc.? I'm very new to the scene, and trying to find a solution for this problem has been a nightmare.
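If the custom route were taken in Python, the parsing step might look roughly like the sketch below: one small parser per file type, picked by extension, each returning plain rows ready for a bulk insert. The file layouts assumed here (a JSON array of flat objects, `<record>` elements in XML, `key=value;key=value` lines in TXT/LOG) are invented for illustration; the real files would need their own parsers.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_json(text):
    # Assumes a top-level JSON array of flat objects.
    return [dict(item) for item in json.loads(text)]


def parse_xml(text):
    # Assumes <records><record><field>value</field>...</record></records>.
    root = ET.fromstring(text)
    return [
        {child.tag: child.text for child in record}
        for record in root.iter("record")
    ]


def parse_lines(text):
    # Assumes one "key=value;key=value" record per non-empty line.
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.append(dict(pair.split("=", 1) for pair in line.strip().split(";")))
    return rows


PARSERS = {
    ".json": parse_json,
    ".xml": parse_xml,
    ".txt": parse_lines,
    ".log": parse_lines,
}


def parse_file(path: Path):
    """Return a list of dict rows for one file, with the parser chosen by extension."""
    try:
        parser = PARSERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"no parser registered for {path.name}") from None
    return parser(path.read_text(encoding="utf-8"))
```

Because the script only needs a network connection to the database, it could be scheduled on any machine you do control rather than on the database server itself.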
The overall goal is to have data from an automated daily Cognos report stored in a database so that I am able to report not only on that day but also historical data if I so choose. My general thought is that if I can find a way to automatically add the new daily data to an existing Excel file, I can then use that as my data source and create a dashboard in Tableau. However, I don't have any programming experience, so I'm floundering here.
I'm committed to using Tableau, but I chose Excel only because I'm more familiar with that program than others, along with the fact that an Excel output file is an option in Cognos. If you have better ideas, please don't hesitate to suggest them along with why you believe it's a better idea.
Update: I'm still jumping through hoops to try to get read-only access to the backend database to make this process a lot more efficient, but in the meantime I've moved forward with the long method utilizing Cognos.
I was able to leverage a coworker to create a system file folder to automatically save the Cognos reports to, and then I scheduled a job to run the reports I need. Each of those now saves into a folder in a shared network drive (so my entire team has access to the files), and I wrote a series of macros to append the data each day from those feeder files in the shared drive to a Master File. Now all that's left is to create a Tableau dashboard using the Master File as the data source and I'll have what I need.
Thanks for all your help!
I'm posting this as an answer because it's just too much to leave as a comment.
What you need are 3 things.
Figure out how to have COGNOS run your report and download your Excel file.
Use Visual Studio with BIDS (which is the suite of SQL analysis, reporting, and integration services) to automate all the stuff you need to do to append your Excel files, etc... Then you can use the same tools to import that data to your SQL server.
In fact, if all you're doing is trying to get this data into SQL, you can skip the Append Excel part, and just append the data directly to your SQL table.
Once your package is built, you can save it as an automated job on your SQL server to run whenever you wish.
Tableau can use your SQL server as a data source. Once you have that updated, you can run your reports.
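The "skip the Excel append and load straight into a table" idea can be sketched as follows. This is not an SSIS package: it is a minimal Python illustration with SQLite standing in for SQL Server, and it assumes the daily report is available as CSV text with hypothetical `metric` and `value` columns. Each load is tagged with its report date so history accumulates, and reloading a date replaces that day's rows instead of duplicating them.

```python
import csv
import io
import sqlite3


def append_daily_report(conn: sqlite3.Connection, report_date: str, csv_text: str) -> int:
    """Store one day's report in `report_history`; return the number of rows loaded."""
    rows = [
        (report_date, r["metric"], float(r["value"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    with conn:  # one transaction: the old rows only disappear if the insert succeeds
        conn.execute(
            "DELETE FROM report_history WHERE report_date = ?", (report_date,)
        )
        conn.executemany(
            "INSERT INTO report_history (report_date, metric, value) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)
```

Tableau would then read `report_history` directly and filter or trend on `report_date`.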
I've been searching around and haven't found anything on my scenario that I understand:
I have a list of all of the Oracle databases and corresponding servers that my company owns (about 80 servers and 150 databases). I am trying to figure out which one a specific file is being downloaded from (via a webpage).
I am a mechanical engineer, not a software person, so if you could ELI5 that would be very helpful.
Specifically I need the SID name, but figuring out the server name would also be helpful.
Your question is kind of tricky. If you're downloading the file from a web application (I'm assuming it is a Java webapp), the Oracle database could be acting either as the data store or as a report server that generates Oracle reports directly.
In the first case, you need to find out what kind of file you are downloading.
Is it a PDF? An Excel file? A plain text file, or something else? The best approach is to check the file link and then work out what software is generating the file. It could be any back-end software, such as POI (for generating Excel files), or even a direct file link with no Oracle involvement at all.
Also, in this case the file is usually generated on the back end by a servlet. You need to ask the developers which report or file-generating engine they are using. If an Oracle database is also involved, it is usually providing the data for that report or file engine.
In the second case, you can just check the URL and give it to the webmaster, asking which Oracle server it uses. That is usually configured in the web server.
I want to create a stored procedure (on SQL Server 2005) that fetches a file from an FTP site, saves it locally and then runs an SSIS package to import the contents of the file into a table.
I'm after some suggestions on how to fetch the file by calling a stored procedure. Should I use SQL CLR, call an SSIS package that does it, xp_cmdshell, or something else?
I'd like this process to be as generic as possible, so we can use it over and over again.
I second the SSIS route.
Anything you save by making a generic FTP routine is great, but unless all your files have the same layout, you will not be able to easily import differing files with a single reusable SSIS package anyway. In SSIS you can do all the error handling and logging in one place. Otherwise you would have to handle the FTP step and its errors outside the package, and then, if that succeeds, move on to the import package, where you have to handle errors anyway.
I would suggest that you pursue the SSIS route. All the components and technologies that you would need are already created for you to use.
You could also add a layer of validation and data transformations prior to importing the data into your database, should you wish.
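For comparison, here is what the fetch step alone looks like when done outside SSIS. This is a hedged sketch using Python's standard `ftplib`, not the SSIS FTP Task the answers recommend; the host, credentials, and file names are placeholders supplied by the caller. It downloads to a temporary `.part` name first so that a downstream import never picks up a half-written file.

```python
from ftplib import FTP
from pathlib import Path


def fetch_file(ftp, remote_name: str, local_dir: Path) -> Path:
    """Download `remote_name` via a connected ftplib.FTP object; return the local path."""
    local_dir.mkdir(parents=True, exist_ok=True)
    target = local_dir / Path(remote_name).name
    partial = target.with_name(target.name + ".part")
    with partial.open("wb") as handle:
        ftp.retrbinary(f"RETR {remote_name}", handle.write)
    partial.replace(target)  # expose the file only once it is complete
    return target


def fetch_from_server(host, user, password, remote_name, local_dir: Path) -> Path:
    # Host and credentials are placeholders; nothing here is specific to one server.
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return fetch_file(ftp, remote_name, local_dir)
```

Keeping `fetch_file` separate from the connection logic makes it reusable across servers and easy to exercise with a stand-in FTP object.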