SQL Server (Transact-SQL) difference between Filestream and BLOB - sql-server

Could anyone explain in laymans terms, what is the difference between those two data types and moreso, which are the pros and cons of using either of those to store files in a database.
If context is needed then I am creating a web app, where users can upload a multitude of different data, like images, Excel files, .docx etc.

Blobs are stored in a varbinary(MAX) column with the value stored in data pages inside the database data file(s).
With FILESTREAM, values are stored as individual files separately on the filesystem, with an individual file for each row and value. These files are managed internally by SQL Server and can be stored and retrieved using T-SQL just like normal varbinary(MAX) values or with Win32 APIs.
There are also FileTables, which is a specialized table with a predefined schema on top of FILESTREAM. FileTables provide T-SQL access like blobs and FILESTREAM and can optionally be access via a SQL Server managed UNC path, similarly to a normal Windows share. Creating/deleing files via the share inserts/deletes rows from the file table and visa-versa.

Related

Any reason to NOT use FileTable (as opposed to plain FileStream) in SQL Server? [duplicate]

I want to store images in a sql database. The size of the image is between 50kb to 1mb. I was reading about a FileStream and a FileTable but I don't know which to choose. Each row will have 2 images and some other fields.
The images will never be updated/deleted and about 3000 rows will be inserted a day.
Which is recommend in this situation?
Originally it was always a bad idea to store files (= binary data) in a database. The usual workaround is to store the filepath in the database and ensure that a file actually exists at that path. It wás possible to store files in the database though, with the varbinary(MAX) data type.
sqlfilestream was introduced in sql-server-2008 and handles the varbinary column by not storing the data in the database files (only a pointer), but in a different file on the filesystem, dramatically improving the performance.
filetable was introduced with sql-server-2012 and is an enhancement over filestream, because it provides metadata directly to SQL and it allows access to the files outside of SQL (you can browse to the files).
Advice: Definitely leverage FileStream, and it might not be a bad idea to use FileTable as well.
More reading (short): http://www.databasejournal.com/features/mssql/filestream-and-filetable-in-sql-server-2012.html
In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The size and use of the data determines whether you should use database storage or file system storage.
If the following conditions are true, you should consider using FILESTREAM:
Objects that are being stored are, on average, larger than 1 MB.
Fast read access is important.
You are developing applications that use a middle tier for application logic.
For smaller objects, storing varbinary(max) BLOBs in the database
often provides better streaming performance.
Benefits of the FILETABLE:
Windows API compatibility for file data stored within a SQL Server database. Windows API compatibility includes the following:
Non-transactional streaming access and in-place updates to FILESTREAM data.
A hierarchical namespace of directories and files.
Storage of file attributes, such as created date and modified date.
Support for Windows file and directory management APIs.
Compatibility with other SQL Server features including management tools, services, and relational query capabilities over FILESTREAM and file attribute data.
It depends. I personally will preffer link to the image inside the table. It is more simple and the files from the directory can be backed up separately.
You have to take into account several things:
How you will process images. Having only link allows you easily incorporates imges inside web pages (with propper config of the Web server).
How much are the images - if they are stored in the DB and they are a lot - this will increase the size of the DB and backups.
Are the images change oftenly - in that case it may be better to have them inside DB to have actual state of the backup inside DB.

Load all CSVs from path on local drive into AzureSQL DB w/Auto Create Tables

I frequently need to validate CSVs submitted from clients to make sure that the headers and values in the file meet our specifications. Typically I do this by using the Import/Export Wizard and have the wizard create the table based on the CSV (file name becomes table name, and the headers become the column names). Then we run a set of stored procedures that checks the information_schema for said table(s) and matches that up with our specs, etc.
Most of the time, this involves loading multiple files at a time for a client, which becomes very time consuming and laborious very quickly when using the import/export wizard. I tried using an xp_cmshell sql script to load everything from a path at once to have the same result, but xp_cmshell is not supported by AzureSQL DB.
https://learn.microsoft.com/en-us/azure/azure-sql/load-from-csv-with-bcp
The above says that one can load using bcp, but it also requires the table to exist before the import... I need the table structure to mimic the CSV. Any ideas here?
Thanks
If you want to load the data into your target SQL db, then you can use Azure Data Factory[ADF] to upload your CSV files to Azure Blob Storage, and then use Copy Data Activity to load that data in CSV files into Azure SQL db tables - without creating those tables upfront.
ADF supports 'auto create' of sink tables. See this, and this

Is it possible to feed a database table from a XML file?

We have some (stable) data that is saved in some generic database (database that contains a database structure and its data). To be used, this data must be re-written. Currently, we have an application that export this data to XML files to some very specific location.
We need to add this data to some databases. I know it's possible to load XML inside tables, but we'd like a direct link between the XML files and the database tables (reducing data duplication and risk of seeing people update the generated tables instead of using proper methods).
Is that possible?
Would it be very slow?
You can use SSIS to import XML files into database tables. This will work well if the xml files conform to a schema.
https://www.mssqltips.com/sqlservertip/3141/importing-xml-documents-using-sql-server-integration-services/

FileStream vs FileTable

I want to store images in a sql database. The size of the image is between 50kb to 1mb. I was reading about a FileStream and a FileTable but I don't know which to choose. Each row will have 2 images and some other fields.
The images will never be updated/deleted and about 3000 rows will be inserted a day.
Which is recommend in this situation?
Originally it was always a bad idea to store files (= binary data) in a database. The usual workaround is to store the filepath in the database and ensure that a file actually exists at that path. It wás possible to store files in the database though, with the varbinary(MAX) data type.
sqlfilestream was introduced in sql-server-2008 and handles the varbinary column by not storing the data in the database files (only a pointer), but in a different file on the filesystem, dramatically improving the performance.
filetable was introduced with sql-server-2012 and is an enhancement over filestream, because it provides metadata directly to SQL and it allows access to the files outside of SQL (you can browse to the files).
Advice: Definitely leverage FileStream, and it might not be a bad idea to use FileTable as well.
More reading (short): http://www.databasejournal.com/features/mssql/filestream-and-filetable-in-sql-server-2012.html
In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The size and use of the data determines whether you should use database storage or file system storage.
If the following conditions are true, you should consider using FILESTREAM:
Objects that are being stored are, on average, larger than 1 MB.
Fast read access is important.
You are developing applications that use a middle tier for application logic.
For smaller objects, storing varbinary(max) BLOBs in the database
often provides better streaming performance.
Benefits of the FILETABLE:
Windows API compatibility for file data stored within a SQL Server database. Windows API compatibility includes the following:
Non-transactional streaming access and in-place updates to FILESTREAM data.
A hierarchical namespace of directories and files.
Storage of file attributes, such as created date and modified date.
Support for Windows file and directory management APIs.
Compatibility with other SQL Server features including management tools, services, and relational query capabilities over FILESTREAM and file attribute data.
It depends. I personally will preffer link to the image inside the table. It is more simple and the files from the directory can be backed up separately.
You have to take into account several things:
How you will process images. Having only link allows you easily incorporates imges inside web pages (with propper config of the Web server).
How much are the images - if they are stored in the DB and they are a lot - this will increase the size of the DB and backups.
Are the images change oftenly - in that case it may be better to have them inside DB to have actual state of the backup inside DB.

Sql Server FileStream: Architecture for document and image

In my project i want use sql server filestream for file document like pdf,docx, excel, and also for image file.
In generally for file document i use a single table.
My question is for image i have to use another table or it's ok to use the same table?
Thank!
If you want to store your files with filestream, you can use a single table for that.
I don't see any conditions for having multiple tables for different file types in your question.

Resources