Bulk insert multiple .csv files into a table for each file - sql-server

I need a stored procedure that will do a bulk insert for multiple .csv files into tables in SQL Server. The .csv files all sit in a folder. The .csv files are tab delimited. I am able to iterate through the folder and create a record in a table named FileNames with each file in it. I get an error when it gets to the BULK INSERT code.
It's an incorrect syntax error near '-'.
BULK INSERT FILE01-K FROM C:\Temp\CSV_FILES\FILE01-K.csv')
My procedure:
declare @filename varchar(255),
@path varchar(255),
@sql varchar(8000),
@cmd varchar(1000)
--get the list of files to process
SET @path = 'C:\Temp\CSV_FILES\'
SET @cmd = 'dir ' + @path + '*.csv /b'
--clear the FileNames table
DELETE FROM FileNames
INSERT INTO FileNames(FileName)
EXEC Master..xp_cmdShell @cmd
UPDATE FileNames SET FilePath = @path where FilePath is null
DELETE FROM FileNames WHERE FileName is null
--cursor loop
declare c1 cursor for SELECT FilePath,FileName FROM FileNames where FileName like '%.csv%' order by FileName desc
open c1
fetch next from c1 into @path,@filename
While @@fetch_status <> -1
begin
set @sql = 'BULK INSERT '+ Replace(@filename, '.csv','')+' FROM '+@Path+@filename+''')'
print @sql
exec (@sql)
fetch next from c1 into @path,@filename
end
close c1
deallocate c1
I've searched and found some examples, including some on this site, which I used to try to build mine.
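One plausible reading of the error: the generated statement leaves the table name FILE01-K unquoted, so the parser stops at the hyphen, and the file path has no opening quote. Below is a minimal sketch of building the statement with QUOTENAME and a fully quoted path. The WITH options (tab field terminator, newline row terminator) are assumptions based on the files being tab delimited, and the statement is only printed, not executed.

```sql
DECLARE @path nvarchar(255) = N'C:\Temp\CSV_FILES\';
DECLARE @filename nvarchar(255) = N'FILE01-K.csv';

-- Table name derived from the file name, as in the procedure above.
DECLARE @table sysname = REPLACE(@filename, N'.csv', N'');

-- QUOTENAME wraps the name in brackets so the hyphen is accepted;
-- the path is wrapped in single quotes, with embedded quotes doubled.
DECLARE @sql nvarchar(max) =
      N'BULK INSERT ' + QUOTENAME(@table)
    + N' FROM ''' + REPLACE(@path + @filename, N'''', N'''''') + N''''
    + N' WITH (FIELDTERMINATOR = ''\t'', ROWTERMINATOR = ''\n'');';

PRINT @sql;
-- EXEC sys.sp_executesql @sql;  -- uncomment once the printed statement looks right
```

For the sample values this prints BULK INSERT [FILE01-K] FROM 'C:\Temp\CSV_FILES\FILE01-K.csv' WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n'); and the same expression can replace the SET @sql line inside the cursor loop.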

Related

How to migrate attachments stored on a fileshare, referenced in MS Access, to SQL Server

I have an MS Access database that we're converting to a SQL Server backend. This database has an Attachments table with a few simple columns:
PK, FK to MainTable.RecordID, Description, filename
Attachments are stored in a fileshare. VBA code uses a hardcoded filepath and ShellExecute to save attachments to a directory, under a RecordID subfolder.
We're moving to store attachments in SQL Server using filestream.
I need to move these attachments from fileshare, to SQL Server, while maintaining RecordID integrity. SQL Server tables and columns are already set up.
These attachments vary in extensions (.msg, .doc, .xlsx, .pdf)
I've been looking into "OPENROWSET" but every example I've seen uses only one file.
I've been looking into SSMA but can't find what I'm looking for.
Any references/reference articles or code resources I can use/repurpose would be greatly appreciated.
Sounds like you want to write an SQL stored procedure that will find all files in a given file path, iterate over those files, and insert the file into a table.
This article will help in general: https://www.mssqltips.com/sqlservertip/5432/stored-procedure-to-import-files-into-a-sql-server-filestream-enabled-table/
This article is about xp_dirtree: https://www.sqlservercentral.com/blogs/how-to-use-xp_dirtree-to-list-all-files-in-a-folder
Here's sample code to read the file system from SQL. THIS IS UNTESTED CODE, you'll need to modify to your needs but it gives you some idea of how to do the loops and read in files.
--You will need xp_cmdshell enabled on SQL Server if it is not already.
USE master
GO
EXEC sp_configure 'show advanced options',1
RECONFIGURE WITH OVERRIDE
EXEC sp_configure 'xp_cmdshell',1
RECONFIGURE WITH OVERRIDE
GO
--Create a variable to hold the pickup folder.
DECLARE @PickupDirectory nvarchar(512) = '\\folder_containing_files_or_folders\';
--Create a table variable to hold the files found in the pickup directory.
PRINT 'Parsing directory to identify most recent file.';
DECLARE @DirTree TABLE (
id int IDENTITY(1,1)
, subdirectory nvarchar(512)
, depth int
, isfile bit
);
--Table variables for the files in the current folder and the folders already handled.
DECLARE @filesToProcess TABLE (subdirectory nvarchar(512), depth int, isfile bit);
DECLARE @processedFolders TABLE (folder_name nvarchar(512));
--Enumerate the pickup directory.
INSERT @DirTree
EXEC master.sys.xp_dirtree @PickupDirectory,1,1 --Second parameter is depth.
--Create variables to loop through folders and files.
DECLARE @folderCount int;
DECLARE @folderName nvarchar(max);
DECLARE @folderPath nvarchar(max);
DECLARE @i int = 0;
DECLARE @fileCount int;
DECLARE @fileName NVARCHAR(max);
DECLARE @filePath varchar(max);
DECLARE @j int = 0;
DECLARE @RecordID nvarchar(50);
DECLARE @SQLText NVARCHAR(max);
SET @folderCount = (SELECT Count(*) FROM @DirTree WHERE isfile = 0);
WHILE ( @i < @folderCount )
BEGIN
--Get the next folder to process.
SET @folderName = (
SELECT TOP 1 subdirectory
FROM @DirTree as dt
LEFT OUTER JOIN @processedFolders as pf
on pf.folder_name = dt.subdirectory
WHERE isfile = 0
AND pf.folder_name IS NULL
);
--Get the RecordID from the folder name.
SET @RecordID = @folderName; --Edit this to get the RecordID from your folder structure.
--Concat root path and new folder to get files from.
SET @folderPath = @PickupDirectory + @folderName + '\';
--Enumerate this subdirectory to process files from.
INSERT @filesToProcess
EXEC master.sys.xp_dirtree @folderPath,1,1
--Get count of files to loop through, and reset the file counter for this folder.
SET @fileCount = (SELECT COUNT(*) FROM @filesToProcess WHERE isfile = 1);
SET @j = 0;
WHILE (@j < @fileCount)
BEGIN
--Get next filename.
SET @fileName = (SELECT TOP 1 subdirectory FROM @filesToProcess WHERE isfile = 1);
--Concat the whole file path.
SET @filePath = @folderPath + @fileName;
SET @SQLText = '
INSERT INTO [table_name](RecordID,[filename],[filestreamCol])
SELECT
''' + @RecordID + '''
, ''' + @fileName + '''
, BulkColumn
FROM OPENROWSET(Bulk ''' + @filePath + ''', Single_Blob) as tb'
EXEC Sp_executesql @SQLText
DELETE FROM @filesToProcess
WHERE subdirectory = @fileName;
SET @j = @j + 1;
END
--Clear any leftover rows (such as subfolders) before the next folder.
DELETE FROM @filesToProcess;
INSERT INTO @processedFolders (folder_name)
SELECT @folderName;
PRINT 'Folder complete: ' + @folderName;
SET @i = @i + 1
END
I think you want to parse just a root directory with the xp_dirtree command above. That will display all the subdirectories which should contain the "RecordID". Read the RecordID into a variable, then parse each of those subdirectories to get the actual files. If you want more detailed code, you'll have to show some examples of the directory structure and the destination table.

Dynamic bulk insert of multiple csv files from different location folders

I have multiple csv files in different folders. I want to bulk insert them dynamically into a single table in SQL Server.
I have done it for a single CSV file. Can someone help me out?
Here's something to get you started. You can read up on xp_dirtree and cursors to see how they work. If your files are spread across different parent folders, or different drives, you'll need an additional cursor to go get them...
---------------------------------------------------------------------------------------------------------------
--Set some variables
---------------------------------------------------------------------------------------------------------------
DECLARE @fileLocation VARCHAR(128) = '\\server\e$\data\' --location of files (parent folder)
DECLARE @sql NVARCHAR(4000) --dynamic sql variable
DECLARE @fileName VARCHAR(128) --full file name variable if you want to use this
---------------------------------------------------------------------------------------------------------------
--Get a list of all the file names in the directory
---------------------------------------------------------------------------------------------------------------
IF OBJECT_ID('tempdb..#FileNames') IS NOT NULL DROP TABLE #FileNames
CREATE TABLE #FileNames (
id int IDENTITY(1,1)
,subdirectory nvarchar(512)
,depth int
,isfile bit)
INSERT #FileNames (subdirectory,depth,isfile)
EXEC xp_dirtree @fileLocation, 1, 1
--Here's all the files and folders. Note isFile field.
select * from #FileNames
---------------------------------------------------------------------------------------------------------------
--Create a cursor to fetch the file names
---------------------------------------------------------------------------------------------------------------
DECLARE c CURSOR FOR
select subdirectory from #FileNames where isfile = 1
OPEN c
FETCH NEXT FROM c INTO @fileName
---------------------------------------------------------------------------------------------------------------
--For each file, bulk insert to the proper view, update the proper table, update the log, etc...
---------------------------------------------------------------------------------------------------------------
WHILE @@FETCH_STATUS = 0
BEGIN
--do your bulk insert work
FETCH NEXT FROM c INTO @fileName
END
CLOSE c
DEALLOCATE c
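The placeholder inside the loop could be filled in along these lines. This is a sketch only: dbo.TargetTable, the sample file names and the comma/newline terminators are hypothetical, and the statements are returned as rows rather than executed so they can be inspected first.

```sql
DECLARE @fileLocation varchar(128) = '\\server\e$\data\';

-- Stand-in for the rows xp_dirtree would return (hypothetical sample data).
DECLARE @Files TABLE (subdirectory nvarchar(512), isfile bit);
INSERT @Files (subdirectory, isfile)
VALUES (N'jan.csv', 1), (N'archive', 0), (N'feb.csv', 1), (N'notes.txt', 1);

-- One BULK INSERT statement per .csv file, all targeting the same table.
-- Single quotes in the path are doubled so the generated literal stays valid.
SELECT N'BULK INSERT dbo.TargetTable FROM '''
     + REPLACE(@fileLocation + subdirectory, N'''', N'''''')
     + N''' WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'');' AS bulk_insert_sql
FROM @Files
WHERE isfile = 1
  AND subdirectory LIKE N'%.csv';
```

With the sample rows this yields two statements, for jan.csv and feb.csv. Inside the cursor loop the same expression can be assigned to the dynamic SQL variable and run with sp_executesql.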

Cannot get the correct files in the folder using SQL xp_cmdshell command

I have files in my computer's folder with following names:
XXX.IN.txt
YYY.TEST.NUM.txt
ABC.AA.Z100.X.E999567777.Y001.txt
ABC.AA.Z100.X.E999568888.Y002.txt
ABC.AA.Z100.X.E999568888.Y003.txt
I want to write a SQL statement that would insert files that have the above described structure into a table, so I can later on write some logic on them.
I already used the command line statement inside of my stored proc to check if files exist:
EXEC master.dbo.xp_fileexist @fullPath, @exist OUTPUT
SET @exist = CAST(@exist AS BIT)
Now, I need to find certain files containing string in their names. I have the statement to do that:
DECLARE @cmdLine VARCHAR(200)
DECLARE @fullPath VARCHAR(900) = '\\my_network_path\MyDir\'
DECLARE @filter VARCHAR(100) = 'ABC.AA.Z100.X.*.txt'
SET @cmdLine = 'dir "' + @fullPath + '"'
EXEC master..xp_cmdshell @cmdLine
Above command should give me the following files:
ABC.AA.Z100.X.E999567777.Y001.txt
ABC.AA.Z100.X.E999568888.Y002.txt
ABC.AA.Z100.X.E999568888.Y003.txt
CREATE TABLE #FileDetails
(
data VARCHAR(MAX)
)
INSERT #FileDetails(data) EXEC master..xp_cmdshell @cmdLine
But it lists all the .txt files in the folder.
How would I list only the files that I need?
First of all, @cmdLine should be declared longer than @fullPath, since it has to hold the full path plus the rest of the command.
Second of all, unless I am looking at it wrong or you didn't correct it here, the @filter variable isn't being used, so the command shows every file regardless of name or extension.
My Code:
DECLARE @cmdLine VARCHAR(2000)
DECLARE @fullPath VARCHAR(1000) = '\\my_network_path\MyDir\'
DECLARE @filter VARCHAR(100) = 'ABC.AA.Z100.X.*.txt'
SET @cmdLine = 'dir "' + @fullPath + @filter + '"'
EXEC master..xp_cmdshell @cmdLine
My Output (keep in mind I created a Test.txt in the same folder):
10-10-2017 12:17 0 ABC.AA.Z100.X.Y001.txt
10-10-2017 12:18 0 ABC.AA.Z100.X.Y002.txt
10-10-2017 12:18 0 ABC.AA.Z100.X.Y003.txt
I would do this with either a CLR proc or an SSIS Script that uses the FileSystemObject to iterate through the files, filter for the ones you want and build a SQL String and execute it.
I don't know any way to do what you want with just straight TSQL.
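As a possible plain T-SQL alternative, once the directory listing has been captured into a table (for example with dir /b, so each row holds only a file name), the filter can be applied with LIKE instead of in the shell. The sketch below uses a table variable filled with the sample names from the question in place of real xp_cmdshell output, which is an assumption made so that it runs without file system access.

```sql
-- Stand-in for the captured output of: dir /b "\\my_network_path\MyDir\"
DECLARE @FileDetails TABLE (data varchar(max));
INSERT @FileDetails (data)
VALUES ('XXX.IN.txt'),
       ('YYY.TEST.NUM.txt'),
       ('ABC.AA.Z100.X.E999567777.Y001.txt'),
       ('ABC.AA.Z100.X.E999568888.Y002.txt'),
       ('ABC.AA.Z100.X.E999568888.Y003.txt'),
       (NULL);  -- xp_cmdshell output usually ends with a NULL row

-- In LIKE, % matches any run of characters and the dots are literal,
-- so this pattern plays the role of the ABC.AA.Z100.X.*.txt wildcard.
SELECT data
FROM @FileDetails
WHERE data LIKE 'ABC.AA.Z100.X.%.txt';
```

The query returns only the three ABC.AA.Z100.X files. The NULL row is dropped as well, because NULL never satisfies LIKE.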

How do I save base64-encoded image data to a file in a dynamically named subfolder

I have a table containing base64-encoded jpegs as well as some other data. The base64 string is stored in a VARCHAR(MAX) column.
How can I save these images out to actual files inside folders that are dynamically named using other data from the table in a stored procedure?
I found the answer by combining a lot of small tips from different places and wanted to collate them all here as no one seemed to have the full process.
I have two tables, called Photos and PhotoBinary. Photos contains at least a PhotoID BIGINT, the Base64 data VARCHAR(MAX), and something to get the FolderName from - NVARCHAR(15) - increase as needed. I also have a BIT field to mark them as isProcessed. PhotoBinary has a single VARBINARY(MAX) column and should be empty.
The second table serves two purposes: it holds the converted base64-encoded image in binary format, and it works around the fact that BCP will not let you skip columns when exporting data from a table using a "format file" to specify the column formats. So in my case the data has to sit in a table of its own. I did try taking it from a view, but had the aforementioned issue with not being allowed to skip the id column.
The main stored procedure has a bcp command that depends upon an .fmt file created using the following SQL. I'm pretty sure I had to edit the resulting file with a plain text editor to change the 8 to a 0 that indicates the prefix length after SQLBINARY. I couldn't just use the -n switch on the command in the main stored procedure as it resulted in an 8 byte prefix being put into the resulting file, which made it an invalid jpeg. So, I used the edited format file to get around that.
DECLARE @command VARCHAR(4000);
SET @command = 'bcp DB.dbo.PhotoBinary format nul -T -n -f "A:\pathto\photobinary.fmt"';
EXEC xp_cmdshell @command;
I then have the following in a stored procedure that I run to export the images into an appropriate folder:
DECLARE @command VARCHAR(4000),
@photoId BIGINT,
@imageFileName VARCHAR(128),
@folderName NVARCHAR(15),
@basePath NVARCHAR(500),
@fullPath NVARCHAR(500),
@dbServerName NVARCHAR(100);
DECLARE @directories TABLE (directory nvarchar(255), depth INT);
-- The location of the output folder
SET @basePath = '\\server\share';
-- The server that the photobinary db is on
SET @dbServerName = 'localhost';
-- @basePath values, get the folders already in the output folder
INSERT INTO @directories(directory, depth) EXEC master.sys.xp_dirtree @basePath;
-- Cursor for each image in table that hasn't already been exported
DECLARE photo_cursor CURSOR FOR
SELECT PhotoID,
'some_image_' + CAST(PhotoID AS NVARCHAR) + '.jpg',
FolderName
FROM dbo.Photos
WHERE isProcessed = 0;
OPEN photo_cursor
FETCH NEXT FROM photo_cursor
INTO @photoId,
@imageFileName,
@folderName;
WHILE (@@FETCH_STATUS = 0) -- Cursor loop
BEGIN
-- Create the @basePath directory
IF NOT EXISTS (SELECT * FROM @directories WHERE directory = @folderName)
BEGIN
SET @fullPath = @basePath + '\' + @folderName;
EXEC master.dbo.xp_create_subdir @fullPath;
END
-- move and convert the base64 encoded image to a separate table in binary format
-- it should be the only row in the table
INSERT INTO DB.dbo.PhotoBinary (PhotoBinary)
SELECT CAST(N'' AS xml).value('xs:base64Binary(sql:column("Base64"))', 'varbinary(max)')
FROM DB.dbo.Photos
WHERE PhotoID = @photoId;
-- This command uses the command-line BCP tool to "bulk export" the image data in binary to an "archive" file that just happens to be a jpg
SET @command = 'bcp "SELECT TOP 1 PhotoBinary FROM DB.dbo.PhotoBinary" queryout "' + @basePath + '\' + @folderName + '\' + @imageFileName + '" -T -S ' + @dbServerName + ' -f "A:\pathto\photobinary.fmt"';
EXEC xp_cmdshell @command;
-- clean up the photo data
DELETE FROM DB.dbo.PhotoBinary;
-- mark photo as processed
UPDATE DB.dbo.Photos SET isProcessed = 1 WHERE PhotoID = @photoId;
FETCH NEXT FROM photo_cursor
INTO @photoId,
@imageFileName,
@folderName;
END -- cursor loop
CLOSE photo_cursor
DEALLOCATE photo_cursor
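The base64-to-binary conversion used inside the loop can be checked in isolation before wiring it into the export. A small sketch, assuming a variable in place of the Base64 column (hence sql:variable rather than sql:column) and a short test string instead of real image data:

```sql
DECLARE @b64 varchar(max) = 'SGVsbG8=';  -- base64 for the ASCII text "Hello"

-- xs:base64Binary decodes the string; .value() returns it as varbinary.
SELECT CAST(N'' AS xml).value(
           'xs:base64Binary(sql:variable("@b64"))', 'varbinary(max)'
       ) AS decoded;  -- 0x48656C6C6F
```

If the decoded bytes of a real row start with 0xFFD8, the usual JPEG signature, the stored string is probably a plain base64 JPEG with no data-URI prefix in front of it.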

How can I delete files from a location where the file already exist in database?

How can I delete files whose names already exist in a database table column (Filename)?
Example: on drive C:\Data there are 100 Word documents, and 70 of these documents will be found in the database as DMS.Filename.
If directory.filename = table.filename then the file should be deleted. In this case we have to delete 70 Word documents. The procedure should run as a daily task and check new files against the database.
How can I check and delete the files?
Here is my new code:
It seems you can't delete files from cmd when the file name contains spaces or blanks. I think that is what these messages mean:
Could Not Find C:\Data\Integration
Could Not Find C:\Windows\system32\Lettre
DECLARE @image_files TABLE (file_path VARCHAR(MAX))
DECLARE @file_path VARCHAR(MAX), @cmd VARCHAR(MAX)
INSERT INTO @image_files (file_path)
EXEC xp_cmdshell 'dir C:\Data\*.doc /b /s /x'
DECLARE file_cursor CURSOR FOR
SELECT file_path FROM @image_files
WHERE file_path IN
(
select 'C:\Data\' + QC_DESCRIPTION +'.doc' from tbl_France where QC_DESCRIPTION is not null
)
OPEN file_cursor
FETCH NEXT FROM file_cursor INTO @file_path
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @cmd = 'EXEC xp_cmdshell ''del ' + @file_path + ''''
EXEC(@cmd)
FETCH NEXT FROM file_cursor INTO @file_path
END
CLOSE file_cursor
DEALLOCATE file_cursor
Haven't tested this code, but it should give you a start. xp_cmdshell allows you to execute shell commands from SQL Server:
You first need to enable it (credit pero):
-- To allow advanced options to be changed.
EXEC sp_configure 'show advanced options', 1
GO
-- To update the currently configured value for advanced options.
RECONFIGURE
GO
-- To enable the feature.
EXEC sp_configure 'xp_cmdshell', 1
GO
-- To update the currently configured value for this feature.
RECONFIGURE
GO
Then use a cursor to go through your table and delete files:
DECLARE @FileName varchar(200)
DECLARE @Command varchar(300)
DECLARE FileName_Cursor CURSOR FOR
SELECT [FileName] FROM MyTable
OPEN FileName_Cursor
FETCH NEXT FROM FileName_Cursor INTO @FileName
--Loop while the previous fetch succeeded.
WHILE @@FETCH_STATUS = 0
BEGIN
SET @Command = 'del "' + @FileName + '"'
EXEC xp_cmdshell @Command
FETCH NEXT FROM FileName_Cursor INTO @FileName
END
CLOSE FileName_Cursor
DEALLOCATE FileName_Cursor
Note that there are security risks with this approach. It does not handle escape characters or double quotes in the FileName, so a crafted file name could inject extra shell commands. It's best to use SSIS to read your table and delete files, or do it through application code.
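If the deletion has to stay in T-SQL, one way to reduce that risk is to build the del command only for names that contain no shell metacharacters and to flag the rest for review. The sketch below is an illustration under assumptions: the sample names are made up, the character list is conservative rather than exhaustive, and the commands are only returned, not executed.

```sql
-- Hypothetical sample of file names read from the table.
DECLARE @Names TABLE (FileName varchar(200));
INSERT @Names (FileName)
VALUES ('C:\Data\Integration Report.doc'),
       ('C:\Data\Lettre.doc'),
       ('C:\Data\x.doc" & echo injected & rem "');

-- Inside [...] the characters " & | < > ^ % are matched literally.
-- Names containing any of them get NULL instead of a command.
SELECT FileName,
       CASE WHEN FileName LIKE '%["&|<>^%]%'
            THEN NULL                         -- rejected: review manually
            ELSE 'del "' + FileName + '"'     -- quotes keep names with spaces intact
       END AS del_command
FROM @Names;
```

The first two rows produce quoted del commands, including the name with a space, while the third row is rejected because it contains a double quote and ampersands.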
