Automate CSV file upload into SQL Server with changing file name

I want to automate uploading certain files into my SQL Server every day, but each file has a different name.
While researching I found that a simple approach is to schedule a BULK INSERT statement to run every day, but I don't know how to handle the changing file name in the query. I'm not too familiar with the Windows command prompt, but I'm open to using it as part of a solution.
The file name follows a pattern like mmddyyyyfile, with the mmddyyyy part changing to match the day's date.

We use this technique in our system bulk loads when we have regular file extracts to import, similar to what you describe. If you have access to and are willing to use xp_cmdshell (which it sounds like you are), then something like the following allows for dynamic file names, and you don't have to worry about what your date pattern is:
SET NOCOUNT ON;
DECLARE @cmdstr VARCHAR(1024) = 'dir c:\upload /B'; -- set your own folder path here
DECLARE @FileName VARCHAR(1024);
DROP TABLE IF EXISTS #CmdOutput;
CREATE TABLE #CmdOutput (CmdOutput VARCHAR(1024));
INSERT #CmdOutput EXEC master..xp_cmdshell @cmdstr;
DECLARE FILES CURSOR FAST_FORWARD FOR
    SELECT CmdOutput
    FROM #CmdOutput
    WHERE CmdOutput IS NOT NULL;
OPEN FILES;
FETCH NEXT FROM FILES INTO @FileName;
WHILE @@FETCH_STATUS = 0
BEGIN
    /*
    Use dynamic SQL to do your bulk load here based on the value of @FileName
    */
    FETCH NEXT FROM FILES INTO @FileName;
END;
CLOSE FILES;
DEALLOCATE FILES;
DROP TABLE #CmdOutput;
This will blindly take every file in the folder and include it in the cursor iterations. If the folder containing your .csv files will also hold files that you don't want to import, you can easily add filtering to the WHERE clause that defines the cursor to limit the files.
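To fill in the commented placeholder, a minimal sketch of the dynamic bulk load might look like this (the target table dbo.DailyUpload, the folder path, and the CSV options are assumptions to adapt to your schema):
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'BULK INSERT dbo.DailyUpload '  -- hypothetical target table
         + N'FROM ''c:\upload\' + @FileName + N''' '
         + N'WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', FIRSTROW = 2);'; -- FIRSTROW = 2 assumes a header row
EXEC sp_executesql @sql;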
Finally, the obligatory warning about enabling and using xp_cmdshell on your SQL Server instance: I won't go into the details here (there is ample information that can be searched out), but suffice it to say it is a security concern and needs to be used with an understanding of the risks involved.

Related

Azcopy move or remove specific files from SQL

I need to move files from one blob container to another. Not copy: when I move from A to B, the files should end up in B and nothing should be left in A. This is basic, but not directly possible in Azure Blob Storage. Please let me know if it's possible. I am calling it from SQL Server using AzCopy version 10.3.2.
Because of this, I need to copy files from A to B and then remove the files from A. There are 2 problems.
1) I only want certain files to go from A to B.
DECLARE @Program varchar(200) = 'C:\azcopy.exe'
DECLARE @Source varchar(max) = '"https://myblob.blob.core.windows.net/test/myfolder/*?SAS"'
DECLARE @Destination varchar(max) = '"https://myblob.blob.core.windows.net/test/archive?SAS"'
DECLARE @Cmd varchar(5000)
SELECT @Cmd = @Program + ' cp ' + @Source + ' ' + @Destination + ' --recursive'
PRINT @Cmd
EXECUTE master..xp_cmdshell @Cmd
So when I use myfolder/*, it takes all the files. When I try myfolder/*.pdf, it says:
failed to parse user input due to error: cannot use wildcards in the path section of the URL except in trailing "/*". If you wish to use * in your URL, manually encode it to %2A
When I try myfolder/%2A.pdf or myfolder/%2Apdf, it still gives an error.
2) When the copy does run, it reports:
INFO: Failed to create one or more destination container(s). Your transfers may still succeed if the container already exists.
But the destination folder is already there. And in the log file it says,
RESPONSE Status: 403 This request is not authorized to perform this operation.
For AzCopy version 10.3.2:
1. To copy specific files, like only .pdf files, add --include-pattern "*.pdf" to your command. Also remember to remove the wildcard * from the @Source variable, so @Source should be '"https://myblob.blob.core.windows.net/test/myfolder?SAS"'.
The completed command looks like this (adapt it to your SQL command):
azcopy cp "https://xx.blob.core.windows.net/test1/folder1?sas" "https://xx.blob.core.windows.net/test1/archive1?sas" --include-pattern "*.pdf" --recursive=true
2. To delete specific blobs, like only .pdf files, add --include-pattern "*.pdf" to your azcopy rm command in the same way.
Also, there is no move command in azcopy: you copy first, then delete. You can achieve this with the above 2 commands.
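Putting the two steps together from SQL, a sketch of the "move" based on the question's variables (assuming the SAS tokens grant read/delete on the source and write on the destination; a 403 usually means they don't):
DECLARE @Program varchar(200) = 'C:\azcopy.exe'
DECLARE @Source varchar(max) = '"https://myblob.blob.core.windows.net/test/myfolder?SAS"'
DECLARE @Destination varchar(max) = '"https://myblob.blob.core.windows.net/test/archive?SAS"'
DECLARE @Cmd varchar(5000)
-- step 1: copy only the .pdf blobs to the archive container
SELECT @Cmd = @Program + ' cp ' + @Source + ' ' + @Destination + ' --include-pattern "*.pdf" --recursive=true'
EXECUTE master..xp_cmdshell @Cmd
-- step 2: remove the same blobs from the source (copy + delete = move)
SELECT @Cmd = @Program + ' rm ' + @Source + ' --include-pattern "*.pdf" --recursive=true'
EXECUTE master..xp_cmdshell @Cmd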

Schedule importing flat files with different names into SQL server 2014

I am a beginner in SQL Server and my scripting is not very polished yet, so I need suggestions on the issue below.
I receive files from a remote server to my machine (around 700/day) named as follows:
ABCD.100.1601310200
ABCD.101.1601310210
ABCD.102.1601310215
Naming Convention:
Here the first part, 'ABCD', remains the same; the middle part is a sequence id that increments with every file; the last part is a timestamp.
File structure
The files do not have any specific extension but can be opened with Notepad/Excel, so they can be called flat files. Each file consists of 95 columns and a fixed 20,000 rows, with some garbage values in the top 4 and bottom 4 rows of column 1.
Now I need to make a database in SQL Server into which I can import the data from these flat files on a schedule. Suggestions needed.
There are probably other ways of doing this, but this is one way:
Create a format file for your tables. You only need to create it once. Use this file in the import script in step 2.
Create an import script based on OPENROWSET(BULK '<file_name>', FORMATFILE = '<format_file>').
Schedule the script from step 2 in SQL Server to run against the database you want the data imported into.
Create the format file
The following script creates a format file in C:\Temp\imp.fmt, based on an existing table, to be used in the next step (replace TEST_TT with the database you are importing to). It assumes , as the field separator; if the files use tab as the separator, remove the -t, switch.
DECLARE @cmd VARCHAR(8000);
SET @cmd = 'BCP TEST_TT.dbo.[ABCD.100.1601310200] format nul -f "C:\Temp\imp.fmt" -c -t, -T -S ' + (SELECT @@SERVERNAME);
EXEC master..xp_cmdshell @cmd;
Before executing this you will need to reconfigure SQL Server to allow the xp_cmdshell extended stored procedure. You only need to do this once.
EXEC sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO
EXEC sp_configure 'xp_cmdshell', 1
GO
RECONFIGURE
GO
Import script
This script assumes:
The files need to be imported to separate tables
The files are located in C:\Temp
The format file is C:\Temp\imp.fmt (generated in the previous step)
SET NOCOUNT ON;
DECLARE @store_path VARCHAR(256) = 'C:\Temp';
DECLARE @files TABLE (fn NVARCHAR(256));
DECLARE @list_cmd VARCHAR(256) = 'DIR ' + @store_path + '\ABCD.* /B';
INSERT INTO @files EXEC master..xp_cmdshell @list_cmd;
DECLARE @fullcmd NVARCHAR(MAX);
SET @fullcmd = (
    SELECT
        'IF OBJECT_ID(''' + QUOTENAME(fn) + ''',''U'') IS NOT NULL DROP TABLE ' + QUOTENAME(fn) + ';' +
        'SELECT * INTO ' + QUOTENAME(fn) + ' ' +
        'FROM OPENROWSET(BULK ''' + @store_path + '\' + fn + ''',FORMATFILE=''C:\Temp\imp.fmt'') AS tt;'
    FROM
        @files
    WHERE
        fn IS NOT NULL
    FOR XML PATH('')
);
EXEC sp_executesql @fullcmd;
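For step 3, one way to schedule this is a SQL Server Agent job. A minimal sketch, assuming the script above has been wrapped in a (hypothetical) stored procedure dbo.usp_ImportFlatFiles in the TEST_TT database, with the job name and 6:00 AM run time as placeholders:
USE msdb;
EXEC dbo.sp_add_job @job_name = N'Daily flat file import';
EXEC dbo.sp_add_jobstep
    @job_name = N'Daily flat file import',
    @step_name = N'Run import script',
    @subsystem = N'TSQL',
    @database_name = N'TEST_TT',
    @command = N'EXEC dbo.usp_ImportFlatFiles;'; -- hypothetical wrapper for the script above
EXEC dbo.sp_add_jobschedule
    @job_name = N'Daily flat file import',
    @name = N'Daily at 6 AM',
    @freq_type = 4,              -- daily
    @freq_interval = 1,          -- every 1 day
    @active_start_time = 060000; -- HHMMSS
EXEC dbo.sp_add_jobserver @job_name = N'Daily flat file import';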

Run a SQL command against databases from a table

Before I start, I need to point out that I am a SQL noob. I can write basic statements, but anything past JOIN statements is probably fairly new to me.
That said, I have cobbled together a script that deletes records from tables. The script itself does what it needs to do; however, when I run it, I change the "USE" line to whatever database is next, stepping through the databases manually. I use a command that populates a temporary table with a list of database names for reference.
How can I run my script against each database name in the temporary table directly, preferably all from a single stored procedure?
Well, one option is to use a cursor to grab all the database names (excluding databases you don't want to execute against) and dynamic SQL to execute the script for each database. It unfortunately has to be dynamic because you can't just do a WHILE loop with USE @dbname /* your magic code */ FETCH NEXT FROM MyCursor INTO @dbname...; it errors out on USE @dbname, so you actually have to EXEC a variable instead:
DECLARE @dbname varchar(max)
DECLARE @executeme nvarchar(max)
DECLARE DBCursor CURSOR FOR
    SELECT name
    FROM sys.databases
    WHERE name NOT IN ('master', 'tempdb', 'model', 'msdb'); -- add additional exclusions here
OPEN DBCursor;
FETCH NEXT FROM DBCursor INTO @dbname;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @executeme =
        N'use ' + @dbname + '
        --YOUR SCRIPT HERE
        '
    EXEC sp_executesql @executeme
    FETCH NEXT FROM DBCursor INTO @dbname;
END;
CLOSE DBCursor;
DEALLOCATE DBCursor;
GO
This will loop across all databases, executing your script in each, until none are left in the list. You can exclude certain databases in the SELECT statement for the cursor (like master, etc.) and add any additional exclusion logic as you see fit.
Additionally, you can implement this in a stored procedure, so all you have to do is run the stored procedure, sit back and drink your favorite drink while it does the heavy lifting for you ;)
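A minimal sketch of that stored-procedure wrapper, with the procedure name and @script parameter as assumptions (QUOTENAME guards against unusual database names):
CREATE PROCEDURE dbo.usp_RunOnAllDatabases
    @script NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @dbname sysname;
    DECLARE @executeme NVARCHAR(MAX);
    DECLARE DBCursor CURSOR FAST_FORWARD FOR
        SELECT name FROM sys.databases
        WHERE name NOT IN ('master', 'tempdb', 'model', 'msdb');
    OPEN DBCursor;
    FETCH NEXT FROM DBCursor INTO @dbname;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @executeme = N'USE ' + QUOTENAME(@dbname) + N'; ' + @script;
        EXEC sp_executesql @executeme; -- database context reverts when the batch ends
        FETCH NEXT FROM DBCursor INTO @dbname;
    END;
    CLOSE DBCursor;
    DEALLOCATE DBCursor;
END;
-- usage: EXEC dbo.usp_RunOnAllDatabases @script = N'/* YOUR SCRIPT HERE */';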

SSIS Bulk Insert Task Editor XML File

Good Day,
I am attempting to use SSIS to bulk insert XML files into a SQL database. Within the Bulk Insert Task editor there are two options beneath the Connection section of the window, under the heading Format: "Specify" and "Use File". "Specify" appears to apply to traditional delimited files, and I am thinking it is not applicable to XML files (?). For the other option, "Use File", what would I need to do in relation to my source file?
Thank you.
I was able to achieve my goal of bulk inserting all XML files from a directory using the following script within an "Execute SQL Task". Ensure that within the "Execute SQL Task" the option BypassPrepare is set to True. Also, in the Parameter Mapping section your parameter name MUST be preceded by an "@" character, and you cannot refer to the variable in your SQL by the name you defined: a question mark must be used (if you make multiple variable calls in the same script, you have to adjust how the question marks are mapped). Within the For Each Loop container where the Execute SQL Task resides, go to the Collection section, define the folder where the files you want to load reside (Folder), and define your files (Files) as *.xml. In the Parameter Mapping section, register the user-defined variable that contains the file path (remember, the variable definition must contain an actual file).
declare @sql nvarchar(max);
set @sql = '
INSERT INTO testXMLwithOpenXML(XMLData, LoadedDateTime)
SELECT CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE()
FROM OPENROWSET(BULK ''' + ? + ''', SINGLE_BLOB) AS x;'
exec(@sql)
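For reference, the target table implied by that INSERT might look something like this (the exact column definitions are assumptions):
CREATE TABLE dbo.testXMLwithOpenXML (
    Id INT IDENTITY(1,1) PRIMARY KEY, -- hypothetical surrogate key
    XMLData XML,
    LoadedDateTime DATETIME
);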

Can EXEC master..xp_cmdshell be used on a set of data?

I have a single Windows shell command I'd like to run (via EXEC master..xp_cmdshell) once for each row in a table, using information from various fields to build the command.
I'm relatively new to writing T-SQL programs (as opposed to individual queries) and can't quite get my head around the syntax for this, or whether it's even possible/recommended.
I tried creating a single-column table variable and populating each row with the command I want to run, but I'm stuck on how to iterate over this table variable and actually run the commands. Googling around has proven unhelpful.
Thanks in advance for any help!
You could always use a cursor:
USE Northwind
DECLARE @name VARCHAR(32)
DECLARE @command VARCHAR(100)
DECLARE shell_cursor CURSOR FOR
SELECT LastName FROM Employees
OPEN shell_cursor
FETCH NEXT FROM shell_cursor INTO @name
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @command = 'echo ' + @name
    EXEC master.dbo.xp_cmdshell @command
    FETCH NEXT FROM shell_cursor INTO @name
END
CLOSE shell_cursor
DEALLOCATE shell_cursor
GO
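If the command needs values from several fields, the same pattern extends naturally. A sketch assuming a hypothetical table dbo.FileJobs with SourcePath and DestPath columns:
DECLARE @src VARCHAR(260)
DECLARE @dst VARCHAR(260)
DECLARE @command VARCHAR(1024)
DECLARE job_cursor CURSOR FOR
SELECT SourcePath, DestPath FROM dbo.FileJobs
OPEN job_cursor
FETCH NEXT FROM job_cursor INTO @src, @dst
WHILE @@FETCH_STATUS = 0
BEGIN
    -- build the shell command from the row's fields
    SET @command = 'copy "' + @src + '" "' + @dst + '"'
    EXEC master.dbo.xp_cmdshell @command
    FETCH NEXT FROM job_cursor INTO @src, @dst
END
CLOSE job_cursor
DEALLOCATE job_cursor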
Is this a one-off job? If so, you might be better off coming at it from the reverse direction: instead of writing a stored procedure that calls xp_cmdshell to run some program against table data, consider writing a program that works against the table data directly. If one scripting product comes to mind, it's PowerShell. It has integrated support for any database the Windows platform supports, and you'll find a ton of scripts on www.poshcode.org for that kind of thing.
On the other hand, if this is something to be scheduled, I guess there's nothing hugely wrong with your idea, apart from the fact that xp_cmdshell is disabled out of the box in SQL Server these days. Re-enabling it opens your server to a whole new world of exploits, especially if that table data is sourced from a web form or some other questionable source.
-Oisin
