Should image binaries be stored as BLOBS in a SQL Server? - sql-server

If a database-driven application needs to reference images (e.g. JPGs, PNGs, etc.), should these images just be stored in a file system with their path referenced in the database, or should the images actually be stored in the database as BLOBs?

There's a really good paper by Microsoft Research called To Blob or Not To Blob.
Their conclusion after a large number of performance tests and analysis is this:
if your pictures or documents are typically below 256 KB in size, storing them in a database VARBINARY column is more efficient
if your pictures or documents are typically over 1 MB in size, storing them in the filesystem is more efficient (and with SQL Server 2008's FILESTREAM attribute, they're still under transactional control and part of the database)
in between those two, it's a bit of a toss-up depending on your use
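As a rough illustration, the rule of thumb above could be written as a small helper. This is only a sketch: the function name, constant names and return labels are made up for this example and are not part of any SQL Server API, and the thresholds are simply the 256 KB and 1 MB figures quoted above.

```python
# Thresholds from the rule of thumb above. All names here are hypothetical.
SMALL_LIMIT = 256 * 1024   # 256 KB
LARGE_LIMIT = 1024 * 1024  # 1 MB


def suggest_storage(typical_size_bytes: int) -> str:
    """Return a storage suggestion for a typical object size in bytes."""
    if typical_size_bytes < SMALL_LIMIT:
        return "database"    # e.g. a VARBINARY column
    if typical_size_bytes > LARGE_LIMIT:
        return "filesystem"  # plain files, or FILESTREAM on SQL Server 2008+
    return "either"          # the toss-up range: measure your own workload
```

For example, `suggest_storage(100 * 1024)` returns `"database"` and `suggest_storage(5 * 1024 * 1024)` returns `"filesystem"`.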
If you decide to put your pictures into a SQL Server table, I would strongly recommend using a separate table for storing those pictures - do not store the employee photo in the employee table - keep them in a separate table. That way, the Employee table can stay lean and mean and very efficient, assuming you don't always need to select the employee photo, too, as part of your queries.
For filegroups, check out Files and Filegroup Architecture for an intro. Basically, you would either create your database with a separate filegroup for large data structures right from the beginning, or add an additional filegroup later. Let's call it "LARGE_DATA".
Now, whenever you have a new table to create which needs to store VARCHAR(MAX) or VARBINARY(MAX) columns, you can specify this filegroup for the large data:
CREATE TABLE dbo.YourTable
(....... define the fields here ......)
ON Data -- the basic "Data" filegroup for the regular data
TEXTIMAGE_ON LARGE_DATA -- the filegroup for large chunks of data
Check out the MSDN intro on filegroups, and play around with it!

Related

How does an SQLite DB save the data of multiple tables in a single file?

I am working on a project to create a simplified version of the SQLite database. I got stuck when trying to figure out how it manages to store data of multiple tables with different schemas in a single file. I suppose it should be using some indexes to map the data of different tables. Can someone shed more light on how it's actually done? Thanks.
Edit: I suppose there is already an explanation in the docs, but I'm looking for an easier way to understand it better and faster.
In SQLite, the schema is the list of all entities (tables, views, etc.) for the database as a whole; a database does not consist of many schemas, one per entity.
The data itself is stored in pages, each page being owned by an entity. It is these blocks that are saved.
The default page size is 4K. You will notice that the file size will always be a multiple of 4K. You could also experiment: create a database with some tables, note its size, then add some data, and if the added data does not require another page, see that the size of the file is the same. This demonstrates how it's all about pages rather than a linear/contiguous stream of data.
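The experiment described above can be sketched with Python's built-in sqlite3 module. The table name, file name and row counts are arbitrary choices for this illustration. The script reads the real page size with PRAGMA page_size rather than assuming 4K, and it assumes the default rollback-journal mode, where the file on disk is exactly page_size times page_count after a commit.

```python
import os
import sqlite3
import tempfile


def file_size_in_pages(db_path: str) -> tuple[int, int, int]:
    """Return (file_size_bytes, page_size, page_count) for a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    finally:
        conn.close()
    return os.path.getsize(db_path), page_size, page_count


def demo() -> list[tuple[int, int, int]]:
    """Create a throwaway database and record its size as rows are added."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pages_demo.db")  # arbitrary file name
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
        conn.commit()
        conn.close()
        readings = [file_size_in_pages(path)]  # empty table
        for batch in (1, 2000):                # one row, then many rows
            conn = sqlite3.connect(path)
            conn.executemany(
                "INSERT INTO t (payload) VALUES (?)", [("x" * 50,)] * batch
            )
            conn.commit()
            conn.close()
            readings.append(file_size_in_pages(path))
        return readings


if __name__ == "__main__":
    for size, page_size, page_count in demo():
        print(size, page_size, page_count)
```

The single short row fits into the table's existing root page, so the first two readings should show the same file size; only the larger batch forces SQLite to allocate more pages.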
The schema is saved in a table called sqlite_master. This table has columns :-
type (the type, e.g. table, index, etc.),
name (the name given to the entity),
tbl_name (the table to which the entity applies),
rootpage (the number of the entity's root page, i.e. where its data starts),
sql (the SQL used to generate the entity, if any)
Note that another schema table, sqlite_temp_master, may also exist if there are temporary tables.
For example :-
Using SELECT * FROM sqlite_master; could result in something like :-
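A minimal way to see this for yourself, using Python's built-in sqlite3 module. The customer and invoice tables and the index are invented purely for this illustration; the column names in the query are the ones listed above.

```python
import sqlite3


def list_schema(conn: sqlite3.Connection) -> list[tuple]:
    """Return (type, name, tbl_name, rootpage, sql) for every schema entry."""
    return conn.execute(
        "SELECT type, name, tbl_name, rootpage, sql "
        "FROM sqlite_master ORDER BY rootpage"
    ).fetchall()


# Hypothetical example schema: two tables with different columns plus one index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("CREATE INDEX idx_invoice_customer ON invoice (customer_id)")

for row in list_schema(conn):
    print(row)
```

Each printed row describes one entity. The two tables and the index each get their own rootpage value, which is how tables with different schemas can share a single file.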
See also section 2.6, "Storage Of The SQL Database Schema", of the SQLite database file format documentation.

SQL Server Create Dynamic Table with different table names based on a template or an existing table

My team is creating a high-volume data processing tool. The idea is to take a 30,000-line batch file, bulk load it into a table and then process the records using parallel processing.
The part I'm stuck on is creating dynamic tables. For each batch file we receive, I need to create a new physical table with a unique table name. The tables will be purged from our system by a separate process after they are completed.
I have the base structure for the table and I intend to create unique table names using a combination of date/time stamp and a guid (dashes converted to underscore characters).
I could do this easily enough in a stored procedure but I'm wondering if there is a better way.
Here is what I have considered...
Templates in SQL Server Management Studio. This is a GUI tool built into Management Studio (Ctrl+Alt+T from Management Studio) that allows you to define different SQL objects, including a table, and specify parameters. This seems like it would work; however, it appears to be a GUI-only tool and not something that I could call from a stored procedure.
Stored Procedure. I could put everything into a stored procedure, build my table name and schema into an nvarchar(max) string and use sp_executesql to create the table. This might be the way to accomplish my goal, but I wonder if there is a better way.
Stored Procedure with an existing table as a template. I could define a base table and then query sys.columns and sys.types to create a string representing the new table. This would allow me to add columns to the base table without having to update my stored procedure. I'm not sure if this is a better approach.
I'm wondering if any Stack Overflow folks have solved a similar requirement. What are your recommendations?
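For illustration, the naming scheme and the dynamic-SQL idea described above could be sketched on the client side like this. Everything here is an assumption made for the example: the column list stands in for the real base structure (which the question does not show), and the function names and the "Batch" prefix are invented. The resulting string is what would be handed to sp_executesql or a database driver.

```python
import uuid
from datetime import datetime

# Hypothetical column list standing in for the real base table structure.
BASE_COLUMNS = [
    ("RecordId", "INT IDENTITY(1,1) PRIMARY KEY"),
    ("RawLine", "NVARCHAR(MAX) NOT NULL"),
    ("ProcessedAt", "DATETIME2 NULL"),
]


def make_table_name(prefix: str, when: datetime, guid: uuid.UUID) -> str:
    """Combine a timestamp and a GUID (dashes -> underscores) into a table name."""
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{stamp}_{str(guid).replace('-', '_')}"


def quote_name(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_create_table(table_name: str, columns=BASE_COLUMNS, schema: str = "dbo") -> str:
    """Return CREATE TABLE text for one batch table."""
    cols = ",\n    ".join(f"{quote_name(col)} {col_type}" for col, col_type in columns)
    return (
        f"CREATE TABLE {quote_name(schema)}.{quote_name(table_name)} (\n"
        f"    {cols}\n);"
    )


if __name__ == "__main__":
    name = make_table_name("Batch", datetime.now(), uuid.uuid4())
    print(build_create_table(name))
```

Quoting every identifier keeps the generated statement valid even if a name contains unusual characters, which matters whenever a table name is assembled from runtime values.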

FILESTREAM vs. local file storage in SQL Server?

My application plays video files once users have registered (the files are larger than 100 MB).
Is it better to store them on the hard drive and keep the file path in the database?
Or
should I store them in the database as the FILESTREAM type?
When data is stored in the database, is it more secure against manipulation than when it is stored on the hard drive?
How can I protect the data against manipulation?
Thanks.
1 - depends on how you define "better". In general, I prefer to store binary assets in the database so they are backed up alongside the associated data, but cache them on the file system. Streaming the binary data out of SQL Server for a page request is a real performance hog, and it doesn't really scale.
2 - if an attacker can get to your hard drive, your entire system is compromised - storing things in the database will offer no significant additional security.
3 - that's a whole question in its own right. Too wide for Stack Overflow...
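The "store in the database, cache on the file system" idea from point 1 could look roughly like the sketch below. To keep it runnable it uses Python's built-in sqlite3 module as a stand-in for SQL Server; the assets table, its columns and the function name are invented for the example, and cache invalidation (what to do when the stored asset changes) is deliberately left out.

```python
import os
import sqlite3


def get_asset_path(conn: sqlite3.Connection, asset_id: int, cache_dir: str) -> str:
    """Return a file path for the asset, filling the cache from the database on a miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, f"{asset_id}.bin")
    if os.path.exists(path):
        return path  # cache hit: no database round trip
    # Hypothetical table: assets(id INTEGER PRIMARY KEY, data BLOB)
    row = conn.execute(
        "SELECT data FROM assets WHERE id = ?", (asset_id,)
    ).fetchone()
    if row is None:
        raise KeyError(asset_id)
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:  # write under a temporary name first,
        fh.write(row[0])              # so readers never see a partial file
    os.replace(tmp_path, path)
    return path
```

The web server can then serve the returned path directly, so the database is only hit the first time each asset is requested.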

Combining several sqlite databases (one table per file) into one big sqlite database

How do you combine several SQLite databases (one table per file) into one big SQLite database containing all the tables? E.g. you have the database files db1.dat, db2.dat, db3.dat, ... and you want to create one file, dbNew.dat, which contains the tables from all of db1, db2, ...
Several similar questions have been asked on various forums. I posted this question (with an answer) for a particular reason: when you are dealing with several tables that have many indexed fields, recreating every index correctly in the destination database's tables gets confusing. You may miss one or two indexes, which is just annoying. The method below can also handle large amounts of data, i.e. when your tables really are gigabytes in size. The steps are:
1. Download SQLite Expert: http://www.sqliteexpert.com/download.html
2. Create a new database dbNew: File -> New Database
3. Load the first SQLite database db1 (containing a single table): File -> Open Database
4. Click on the 'DDL' option. It gives you the list of commands needed to create the particular SQLite table CONTENT.
5. Copy these commands and select the 'SQL' option. Paste the commands there. Change the name of the destination table from the default name CONTENT to DEST (or whatever you want).
6. Click on 'Execute SQL'. This should give you a copy of the table CONTENT in db1 under the name DEST. The main benefit of doing it this way is that all the indexes are also created on the DEST table, just as they were on the CONTENT table.
7. Now just click and drag the DEST table from the database db1 to the database dbNew.
8. Now just delete the database db1.
9. Go back to step 3 and repeat with the next database, db2, and so on.
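If a scripted route is preferred over the GUI, the same idea (reuse the stored DDL so that no index is forgotten) can be sketched with Python's built-in sqlite3 module and ATTACH DATABASE. This is an alternative to the SQLite Expert workflow above, not a description of what that tool does internally. The function names are made up, and the sketch assumes that table and index names are already unique across the source files; it does not rename tables, so identically named tables (such as CONTENT above) would have to be renamed first. Views and triggers are not copied.

```python
import sqlite3


def _quote(name: str) -> str:
    """Double-quote an SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def merge_databases(dest_path: str, source_paths: list[str]) -> None:
    """Copy every table (with rows) and explicit index from each source into dest."""
    dest = sqlite3.connect(dest_path)
    try:
        for i, src_path in enumerate(source_paths):
            alias = f"src{i}"
            dest.execute(f"ATTACH DATABASE ? AS {alias}", (src_path,))
            # sqlite_master keeps the original CREATE statements in its sql column.
            objects = [
                (obj_type, name, sql)
                for obj_type, name, sql in dest.execute(
                    f"SELECT type, name, sql FROM {alias}.sqlite_master"
                ).fetchall()
                if sql is not None and not name.startswith("sqlite_")
            ]
            for obj_type, name, sql in objects:
                if obj_type == "table":
                    dest.execute(sql)  # unqualified name, so created in main
                    dest.execute(
                        f"INSERT INTO main.{_quote(name)} "
                        f"SELECT * FROM {alias}.{_quote(name)}"
                    )
            for obj_type, name, sql in objects:
                if obj_type == "index":
                    dest.execute(sql)  # build indexes after the bulk copy
            dest.commit()
            dest.execute(f"DETACH DATABASE {alias}")
    finally:
        dest.close()
```

A call such as `merge_databases("dbNew.dat", ["db1.dat", "db2.dat", "db3.dat"])` would then produce the combined file. Creating the indexes after the rows are copied is a design choice that usually makes large copies faster than inserting into an already indexed table.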

tables with multiple varbinary columns

I have a table with a VARBINARY(MAX) column that has the FILESTREAM attribute. Now I need to store other binary data, but without the FILESTREAM attribute. If I add another VARBINARY(MAX) column to the same table, would there be any performance issue? Would I get better performance by keeping the FILESTREAM column in one table and creating a separate table to store the other VARBINARY(MAX) data?
For your first question: yes, you can.
FILESTREAM is a feature that was new in SQL Server 2008; SQL Server 2012 builds on it with a feature called FileTable.
I tested it. With this feature the database manages the files, and uploads ran at about 5 MB/s.
For your other column, if you do not enable FILESTREAM, the file is converted to binary and stored in the SQL Server data file.
With FILESTREAM enabled, the file is stored on the server's file system and managed by SQL Server.
For your second question, I am not 100% sure, but using FILESTREAM should be more efficient; you need to pay attention to backup and storage.
A year ago I implemented this in our system and I still have the schema; if you want, I can send it to you.
Sorry, my English is not good.
Your performance might be affected if you add another VARBINARY(MAX) column to the same table.
When the FILESTREAM attribute is set, SQL Server stores the BLOB data in the NT file system and keeps a pointer to the file in the table. This allows SQL Server to take advantage of the NTFS I/O streaming capabilities and reduces overhead on the SQL engine.
The MAX types (varchar, nvarchar and varbinary), and in your case VARBINARY(MAX), cannot be stored internally as a contiguous memory area, since they can grow up to 2 GB. So they have to be represented by a streaming interface, and that can affect performance considerably.
If you are sure your files are small, you can go for VARBINARY(MAX); otherwise, if they are larger than 2 GB, FILESTREAM is the best option for you.
And yes, I would suggest you create a separate table to store the other VARBINARY(MAX) data.
