Import images from folder into SQL Server table - sql-server

I've been searching for this on Google but haven't found a good explanation, so here's my issue.
I need to import product images, which are in a folder, into SQL Server. I tried to use xp_cmdshell but without success.
My images are in C:\users\user.name\Images and are named after the product ID, like [product_id].jpg. They're going to be inserted into a table with the product ID and the image binary as columns.
I just need to list the images in the folder, convert them to binary, and insert them into the table with the file name (as the product_id).
My questions are:
How do I list the images in the folder?
How do I access a folder with dots in its name (like user.name)?
How do I convert the images to binary in order to store them in the database (if SQL Server doesn't do that automatically)?
Thanks in advance

I figured I'd try an xp_cmdshell-based approach just for kicks. I came up with something that does appear to work for me, so I'd be curious to know what problems you ran into when you tried using xp_cmdshell. See the comments for an explanation of what's going on here.
-- I'm going to assume you already have a destination table like this one set up.
create table Images (fname nvarchar(max), data varbinary(max));
go
-- Set the directory whose images you want to load. The rest of this code assumes that @directory
-- has a terminating backslash.
declare @directory nvarchar(max) = N'D:\Images\';
-- Query the names of all .JPG files in the given directory. The dir command's /b switch omits all
-- data from the output save for the filenames. Note that directories can contain single-quotes, so
-- we need the REPLACE to avoid terminating the string literal too early.
declare @filenames table (fname varchar(max));
declare @shellCommand nvarchar(max) = N'exec xp_cmdshell ''dir ' + replace(@directory, '''', '''''') + '*.jpg /b''';
insert @filenames exec(@shellCommand);
-- Construct and execute a batch of SQL statements to load the filenames and the contents of the
-- corresponding files into the Images table. I found when I called dir /b via xp_cmdshell above, I
-- always got a null back in the final row, which is why I check for fname IS NOT NULL here.
declare @sql nvarchar(max) = '';
with EscapedNameCTE as (select fname = replace(@directory + fname, '''', '''''') from @filenames where fname is not null)
select
    @sql = @sql + N'insert Images (fname, data) values (''' + E.fname + ''', (select X.* from openrowset(bulk ''' + E.fname + N''', single_blob) X)); '
from
    EscapedNameCTE E;
exec(@sql);
I started with an empty Images table. Here's what I had after running the above:
Now I'm not claiming this is necessarily the best way to go about doing this; the link provided by @nscheaffer might be more appropriate, and I'll be reading it myself since I'm not familiar with SSIS. But perhaps this will help illustrate the kind of approach you were initially trying for.
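For comparison, the same list-and-read steps can be done outside the database. The following is only a minimal Python sketch, not part of the answer above: `collect_images` is a hypothetical helper name, and actually inserting the pairs would still need a SQL Server driver, which is not shown. It illustrates that a folder name containing dots (like user.name) needs no special handling and that "converting to binary" is just reading the file's bytes.

```python
from pathlib import Path


def collect_images(folder):
    """Return (product_id, image_bytes) pairs for every .jpg file in folder.

    Hypothetical helper: the product id is taken from the file name,
    matching the [product_id].jpg naming described in the question.
    """
    pairs = []
    for path in sorted(Path(folder).glob("*.jpg")):
        # path.stem drops the extension, e.g. "1042.jpg" -> "1042".
        # read_bytes() returns the raw file content, which is what a
        # varbinary(max) column would store.
        pairs.append((path.stem, path.read_bytes()))
    return pairs
```

Each pair could then be sent to the server as the two parameters of an ordinary parameterized INSERT.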

Related

Need help/explanation on how to properly add parameter to dynamic SQL Query

I'm searching through databases, and I reached out to StackOverflow for help learning how to run dynamic SQL across them.
The answer I got was extremely useful, but it didn't come with an explanation of what exactly was happening or why. Now I'm trying to add another parameter to the code and I'm having trouble adding it.
I would like someone to help me correct the parameter so it's input correctly and explain what is going on in the dynamic SQL statement.
Let's get to the problem with the query. Everything works great as it was written by another StackOverflow poster. But now I need to add a condition that the 'ControlID' column must start with the letter Q.
With the way I've put it in now, I'm getting an error that states the column 'ControlID' does not exist. But when I check to see if ControlID is the correct column name
select * FROM [EDDS1111111].[EDDSDBO].[Document] where ControlID like 'Q%'
I do get results. So it's not an invalid column name.
I designed this input of 'ControlID' to be similar to the way artifact ID was added earlier in the code so I'm confused as to why I'm getting this error.
-- this is used to add line breaks to make code easier to read
DECLARE @NewLine AS NVARCHAR(MAX) = CHAR(10)
-- to hold your dynamic SQL for all rows/inserts at once
DECLARE @sql NVARCHAR(MAX) = N'';
-- create temp table to insert your dynamic SQL results into
IF OBJECT_ID('tempdb..#DatabaseSizes', 'U') IS NOT NULL
    DROP TABLE #DatabaseSizes;
create table #DatabaseSizes(
    controlid nvarchar(128),
    fileSize DECIMAL (10,6),
    extractedTextSize DECIMAL(10,6)
)
SELECT @sql = @sql + N'' +
'select SUM(fileSize)/1024/1024/1024 as fileSize,
SUM(extractedTextSize)/1024/1024 as extractedTextSize ' + @NewLine +
'FROM [EDDS' + CAST(ArtifactID as nvarchar(128)) + '].[EDDSDBO].[Document] ed' + @NewLine +
'where ed.CreatedDate >= (select CONVERT(varchar,dateadd(d,-(day(getdate())),getdate()),106)) and ed.controlid = '+Cast(Controlid as nvarchar(128))+'%' + @NewLine + @NewLine
FROM edds.eddsdbo.[Case]
WHERE name like '%Review%' and (StatusCodeArtifactID = '1780779' or StatusCodeArtifactID = '1034288')
--controlid always needs to begin with a Q
-- for testing/validating
PRINT @sql
INSERT INTO #DatabaseSizes (
    controlid,
    fileSize,
    extractedTextSize
)
-- executes all the dynamic SQL we just generated
EXEC sys.sp_executesql @sql;
With the code as I've written it so far, I'd expect ControlID to be matched against values that begin with a Q.
I tried to copy the part where it was done correctly, so I'm a little confused. Any help to increase my understanding is greatly appreciated.
Thank you for your time
There is not a column for ControlID within edds.eddsdbo.[case]. But it does exist within all databases located within FROM [EDDS' + CAST(ArtifactID as nvarchar(128)) + '].[EDDSDBO].[Document]. CreatedDate also does not exist within the .[Case] tables but does exist within the .Document databases. That is why I put the search for ControlID next to the where statement for the .Document section of the query
This is what was returned when I try to run the code.
Invalid column name 'Controlid'.
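One way to see why the column is reported as invalid is to separate what is evaluated while the statement text is being built from what is evaluated when the built text runs. In the concatenation above, anything outside the quotes (such as ArtifactID, and Controlid in the new condition) is read from edds.eddsdbo.[Case] at build time, while anything inside the quotes is only resolved against [Document] when the generated statement executes. The Python sketch below mimics that string building; `build_size_queries` is a hypothetical name, and the shortened SELECT list is an assumption made for brevity.

```python
def build_size_queries(artifact_ids):
    """Build one SELECT per workspace database, mimicking the T-SQL concatenation.

    artifact_ids plays the role of the ArtifactID values read from the
    [Case] table while the statement text is being generated.
    """
    statements = []
    for artifact_id in artifact_ids:
        # Build time: only artifact_id is substituted into the text.
        from_clause = f"FROM [EDDS{int(artifact_id)}].[EDDSDBO].[Document] ed "
        # Run time: the filter is plain text inside the generated statement,
        # so ControlID is looked up in [Document], where it exists.
        where_clause = "WHERE ed.ControlID LIKE 'Q%';"
        statements.append("SELECT SUM(fileSize) AS fileSize " + from_clause + where_clause)
    return "\n".join(statements)


print(build_size_queries([1111111]))
```

In the T-SQL version, keeping the filter inside the string would mean writing it as literal text with doubled quotes, for example `ed.ControlID like ''Q%''`, rather than concatenating a Controlid column value from [Case].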

How to insert multiple PDF files into database using SSMS

I have a folder on my PC that contains nearly 300000 pdf files (~25GB). By using a SQL script, I want to insert all those pdf files into my database, including filename and the PDF file itself.
I have followed many tutorials and articles out there, but all of them require the filename to be defined one-by-one for each file.
Upload files article.
I want to have a stored procedure like this one:
Create PROCEDURE [dbo].[Importfiles] (
    @FolderPath NVARCHAR (1000)
)
AS
BEGIN
    DECLARE @tsql NVARCHAR (2000);
    SET NOCOUNT ON
    SET @tsql = 'insert into dokumentet ( name, files) ' +
        ' SELECT ' + -- I would like to select all files here (including their file name)
        'FROM Openrowset( Bulk ' + '''' + @FolderPath + '''' + ', Single_Blob) as pdf'
    EXEC (@tsql)
    SET NOCOUNT OFF
END
My database table schema, and source folder is like this:
Name is the file name, Files is the pdf file.
They should look like this on the database table:
Once again my goal is to create a SQL script to take the file and its name automatically, and insert them in the table. But I'm having difficulties.
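OPENROWSET(BULK ...) reads one file per call, so the file name has to be supplied per file somewhere. One possible approach is to generate the statements rather than type them. The Python sketch below is only an illustration under assumptions: `quote_literal` and `build_import_statements` are hypothetical helper names, the table and column names are taken from the procedure above, and the generated paths must be readable by the SQL Server service itself, since the server opens the files.

```python
from pathlib import Path


def quote_literal(text):
    """Wrap text in single quotes for T-SQL, doubling any embedded quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_import_statements(folder):
    """Return one INSERT ... OPENROWSET statement per .pdf file in folder."""
    statements = []
    for path in sorted(Path(folder).glob("*.pdf")):
        # BulkColumn is the column that SINGLE_BLOB exposes for the file content.
        statements.append(
            "INSERT INTO dokumentet (name, files) "
            f"SELECT {quote_literal(path.name)}, BulkColumn "
            f"FROM OPENROWSET(BULK {quote_literal(str(path))}, SINGLE_BLOB) AS pdf;"
        )
    return statements
```

With roughly 300000 files it would probably make sense to write the statements out in batches rather than as one script.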

How do I save base64-encoded image data to a file in a dynamically named subfolder

I have a table containing base64-encoded jpegs as well as some other data. The base64 string is stored in a VARCHAR(MAX) column.
How can I save these images out to actual files inside folders that are dynamically named using other data from the table in a stored procedure?
I found the answer by combining a lot of small tips from different places and wanted to collate them all here as no one seemed to have the full process.
I have two tables, called Photos and PhotoBinary. Photos contains at least a PhotoID BIGINT, the Base64 data VARCHAR(MAX), and something to get the FolderName from - NVARCHAR(15) - increase as needed. I also have a BIT field to mark them as isProcessed. PhotoBinary has a single VARBINARY(MAX) column and should be empty.
The second table serves two purposes: it holds the converted base64-encoded image in binary format, and it lets me work around the fact that BCP will not let you skip columns when exporting data from a table using a "format file" to specify the column formats. So in my case the data has to sit in a table all on its own. I did try taking it from a view but had the aforementioned issue with not being allowed to skip the id column.
The main stored procedure has a bcp command that depends upon an .fmt file created using the following SQL. I'm pretty sure I had to edit the resulting file with a plain text editor to change the 8 to a 0 that indicates the prefix length after SQLBINARY. I couldn't just use the -n switch on the command in the main stored procedure as it resulted in an 8 byte prefix being put into the resulting file, which made it an invalid jpeg. So, I used the edited format file to get around that.
DECLARE @command VARCHAR(4000);
SET @command = 'bcp DB.dbo.PhotoBinary format nul -T -n -f "A:\pathto\photobinary.fmt"';
EXEC xp_cmdshell @command;
I then have the following in a stored procedure that I run to export the images into an appropriate folder:
DECLARE @command VARCHAR(4000),
    @photoId BIGINT,
    @imageFileName VARCHAR(128),
    @folderName NVARCHAR(15),
    @basePath NVARCHAR(500),
    @fullPath NVARCHAR(500),
    @dbServerName NVARCHAR(100);
DECLARE @directories TABLE (directory nvarchar(255), depth INT);
-- The location of the output folder
SET @basePath = '\\server\share';
-- The server that the photobinary db is on
SET @dbServerName = 'localhost';
-- Get the folders that already exist under @basePath
INSERT INTO @directories(directory, depth) EXEC master.sys.xp_dirtree @basePath;
-- Cursor for each image in table that hasn't already been exported
DECLARE photo_cursor CURSOR FOR
    SELECT PhotoID,
        'some_image_' + CAST(PhotoID AS NVARCHAR) + '.jpg',
        FolderName
    FROM dbo.Photos
    WHERE isProcessed = 0;
OPEN photo_cursor
FETCH NEXT FROM photo_cursor
    INTO @photoId,
        @imageFileName,
        @folderName;
WHILE (@@FETCH_STATUS = 0) -- Cursor loop
BEGIN
    -- Create the subfolder under @basePath if it is not already there
    IF NOT EXISTS (SELECT * FROM @directories WHERE directory = @folderName)
    BEGIN
        SET @fullPath = @basePath + '\' + @folderName;
        EXEC master.dbo.xp_create_subdir @fullPath;
    END
    -- move and convert the base64 encoded image to a separate table in binary format
    -- it should be the only row in the table
    INSERT INTO DB.dbo.PhotoBinary (PhotoBinary)
    SELECT CAST(N'' AS xml).value('xs:base64Binary(sql:column("Base64"))', 'varbinary(max)')
    FROM DB.dbo.Photos
    WHERE PhotoID = @photoId;
    -- This command uses the command-line BCP tool to "bulk export" the image data in binary to an "archive" file that just happens to be a jpg
    SET @command = 'bcp "SELECT TOP 1 PhotoBinary FROM DB.dbo.PhotoBinary" queryout "' + @basePath + '\' + @folderName + '\' + @imageFileName + '" -T -S ' + @dbServerName + ' -f "A:\pathto\photobinary.fmt"';
    EXEC xp_cmdshell @command;
    -- clean up the photo data
    DELETE FROM DB.dbo.PhotoBinary;
    -- mark photo as processed
    UPDATE DB.dbo.Photos SET isProcessed = 1 WHERE PhotoID = @photoId;
    FETCH NEXT FROM photo_cursor
        INTO @photoId,
            @imageFileName,
            @folderName;
END -- cursor loop
CLOSE photo_cursor
DEALLOCATE photo_cursor
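The per-row work in the loop above (create the folder if missing, decode the base64 text, write the bytes with no length prefix) can also be sketched outside SQL Server. The Python below is only an illustration under assumptions: `export_photo` is a hypothetical function name, and reading the rows from the Photos table is left out.

```python
import base64
from pathlib import Path


def export_photo(base_path, folder_name, file_name, base64_data):
    """Decode one base64 string and write it to base_path/folder_name/file_name."""
    target_dir = Path(base_path) / folder_name
    # Create the dynamically named subfolder only if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file_name
    # b64decode returns the raw bytes, so the file starts with the image
    # data itself rather than a length prefix.
    target.write_bytes(base64.b64decode(base64_data))
    return target
```

A caller would loop over the unprocessed rows, call this once per row, and mark the row as processed afterwards, mirroring the cursor loop.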

Appending image ID to image file name in SQL Server

Using xp_dirtree and some workarounds, I am able to get all the images into a temporary table. Now I have to insert them into the main table, which has an identity column ImageID, and at the same time prepend that ImageID, followed by '_', to the original image file name.
I am using a query something like this to insert into the main table:
insert into tblImage(ImagePath)
SELECT fullpath + '\' + subdirectory
FROM #Directory ORDER BY fullpath,subdirectory;
Currently I am using loops in C# .NET to do this: inserting one image, getting the image ID (scope_identity), and prepending it to the file name.
But it takes a long time as the number of images increases.
Is there a better way to accomplish this in SQL Server only?
Assuming after the insert, the data in tblImage will look something like this?
ImageID ImagePath
1 somefilename.jpg
2 someotherfilename.jpg
The identity value will only be available after the insert, so you can do a bulk update after the initial bulk insert. The identity value is not available during an insert.
UPDATE tblImage SET
ImagePath = CONVERT(VARCHAR(10), ImageId) + '_' + ImagePath
WHERE (You'll have to specify a where clause to limit the update to the newly created records)
or you can add a 3rd column to tblImage
UPDATE tblImage SET
NewFileName = CONVERT(VARCHAR(10), ImageId) + '_' + ImagePath
WHERE NewFileName IS NULL
If you need to change the filename on disk, I can think of two possible solutions.
The first, and quickest to implement, would be to use xp_cmdshell to execute the rename statements. (This assumes ImagePath has the original filename, and NewFileName has the new filename.)
DECLARE @Cmd NVARCHAR(MAX) = '';
SELECT @Cmd = @Cmd + 'EXEC xp_cmdshell ''rename "' + ImagePath + '" "' + NewFileName + '"''
'
FROM tblImage
EXEC (@Cmd);
To use xp_cmdshell, you need to enable it first, as it's disabled by default. It is an advanced option, so 'show advanced options' has to be turned on before it can be set.
EXEC sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO
EXEC sp_configure 'xp_cmdshell', 1
GO
RECONFIGURE
GO
The second option to rename files on disk would be to setup a FileTable in SQL Server that points to the directory containing the files you want to rename. Then the renaming of the file becomes a simple update statement.
You can read more about FileTables here
There's also a couple of really comprehensive walkthroughs about setting up FileTables.
Hope that helps.
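If the rename on disk ends up being done by a client program anyway, the step itself is small. The following Python sketch is an assumption-laden illustration, not the approach from the answer: `prefix_with_id` is a hypothetical name, and the (ImageID, ImagePath) pairs are assumed to have been read back from tblImage after the bulk insert.

```python
from pathlib import Path


def prefix_with_id(image_id, image_path):
    """Rename a file on disk to '<image_id>_<original file name>'."""
    source = Path(image_path)
    # with_name keeps the directory and swaps only the final path component.
    target = source.with_name(f"{image_id}_{source.name}")
    source.rename(target)
    return target
```

Because the IDs are fetched in one query after a set-based insert, the loop no longer needs one database round trip per image.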

How to create text files of database rows?

I have a database table with a column named File Content, and many rows. What I need is to create a text file for each row of File Content column.
Example:
Sr.  File Name  File Content
1.   FN1        Hello
2.   FN2        Good Morning
3.   FN3        How are you?
4.   FN4        Where are you?
Suppose I have 4 rows; then 4 text files should be created (with any names we want).
File1.txt should have the text "Hello" in it.
File2.txt should have the text "Good Morning" in it.
File3.txt should have the text "How are you?" in it.
File4.txt should have the text "Where are you?" in it.
Although you said you need to do it in TSQL, I wouldn't do it that way if possible. Ram has shown you one solution, but it has the disadvantages that you need to use xp_cmdshell and that the SQL Server service account needs permission to access the file system in whatever location you want to have the files.
My suggestion would be to write a script or small program in your preferred language (PowerShell, Perl, Python, C#, whatever) and use that instead. TSQL as a language is simply badly suited for manipulating files or handling anything outside the database. It is obviously possible (CLR procedures are another way), but you often run into problems with permissions, encodings and other issues that are much easier to deal with in an external language.
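As a rough illustration of that suggestion, here is a minimal Python sketch. It assumes the rows have already been fetched as (file name, file content) pairs with whatever database driver is in use; `write_rows_to_files` is a hypothetical name, and UTF-8 is an assumed encoding choice.

```python
from pathlib import Path


def write_rows_to_files(rows, out_dir):
    """Write one <file name>.txt per (file_name, content) row into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for file_name, content in rows:
        target = out / f"{file_name}.txt"
        # The encoding is stated explicitly so the result does not depend
        # on the platform default.
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

With the four example rows this would produce FN1.txt through FN4.txt, each containing the matching File Content value.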
This can be done with BCP OUT syntax of SQL server.
For the setup: just make sure that you have xp_cmdshell exec permissions on the server. This can be checked from master.sys.configurations table. Also change filelocation path corresponding to your server or network share. I checked and was able to generate 4 files as there are 4 records in the table.
use master
go
declare @DSQL Nvarchar(max)
declare @counter int
declare @maxrows int
declare @filename Nvarchar(30)
select @counter=1, @maxrows = 0
create table t1 (
    sno int identity(1,1) not null,
    filename varchar(5),
    filecontent varchar(100)
)
insert into t1
select 'FN1', 'Hello'
UNION
select 'FN2', 'Good Morning'
UNION
select 'FN3', 'How are you?'
UNION
select 'FN4', 'Where are you?'
select @maxrows = count(*) from t1
--SELECT * FROM T1
-- export one file per row, using the row's filename column as the file name
while (@counter <= @maxrows)
begin
    select @filename = filename from t1
    where sno = @counter
    select @DSQL = N'exec xp_cmdshell' + ' ''bcp "select filecontent from master.dbo.T1 where sno = ' + cast(@counter as nvarchar(10)) + '" queryout "d:\temp\' + @filename + '.txt" -T -c -S home-e93994b54f'''
    print @DSQL
    exec sp_executesql @DSQL
    select @counter = @counter + 1
end
drop table t1
