FileTable size vs database size in SQL Server

If I create a FileTable in SQL Server 2012 and then drop a 4 GB file onto the NTFS share it exposes (i.e. into the FILESTREAM directory), would that entire 4 GB file be read into the table's file_stream column?
Is SQL Server in fact making a copy of my 4 GB file? Or does the file_stream column hold a pointer to my 4 GB file, which it only starts reading when queried?
I'm just trying to figure out whether adding 100 GB of data to the file system would add 100 GB to the size of my database.
Can someone explain how this works? Even better, can you point me to docs with more detail than the MS/MSDN 'how-to' material?

The FileTable's stream data is stored in individual files on disk using the SQL Server FILESTREAM feature, not in the normal SQL Server data files. Those files are managed internally by SQL Server.
You can think of the file_stream column as a pointer to the file. The stream data is still part of the database, though, and will be backed up along with the rest of the database.
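To see where the bytes actually land, you can compare the database's data file with its FILESTREAM container on disk. Here is a minimal sketch, assuming FILESTREAM is already enabled at the instance level; every database, filegroup, and path name in it is hypothetical:

    -- The FILESTREAM filegroup points at a directory, not at an .mdf/.ndf file.
    CREATE DATABASE FileTableDemo
    ON PRIMARY (NAME = ftd_data, FILENAME = 'C:\Data\ftd_data.mdf'),
    FILEGROUP ftd_fsg CONTAINS FILESTREAM (NAME = ftd_fs, FILENAME = 'C:\Data\ftd_fs')
    WITH FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'FileTableDemo');
    GO
    USE FileTableDemo;
    CREATE TABLE Documents AS FILETABLE
        WITH (FILETABLE_DIRECTORY = N'Documents');
    GO
    -- After copying a 4 GB file into the exposed share, the row reports its size,
    -- but the bytes live under C:\Data\ftd_fs, not inside ftd_data.mdf.
    SELECT name, cached_file_size / 1048576.0 AS size_mb
    FROM Documents
    WHERE is_directory = 0;

cached_file_size and is_directory are fixed columns that every FileTable exposes, so the final query works on any FileTable.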

No, your 4 GB file will be stored only once, as a file on disk. There will be a small amount of metadata stored in the database, but that's all.
MSDN link.

Related

Any reason to NOT use FileTable (as opposed to plain FileStream) in SQL Server? [duplicate]

I want to store images in a SQL database. Each image is between 50 KB and 1 MB. I have been reading about FileStream and FileTable, but I don't know which to choose. Each row will have 2 images and some other fields.
The images will never be updated or deleted, and about 3000 rows will be inserted per day.
Which is recommended in this situation?
Originally it was considered a bad idea to store files (i.e. binary data) in a database. The usual workaround was to store the file path in the database and ensure that a file actually exists at that path. It was possible to store files in the database, though, with the varbinary(MAX) data type.
FILESTREAM was introduced in SQL Server 2008; it handles the varbinary column by keeping only a pointer in the database files and storing the data itself in a separate file on the filesystem, dramatically improving performance.
FileTable was introduced with SQL Server 2012 as an enhancement over FILESTREAM: it exposes file metadata directly to SQL and allows access to the files from outside SQL (you can browse to them).
Advice: definitely use FILESTREAM, and it may well be worth using FileTable on top of it.
More reading (short): http://www.databasejournal.com/features/mssql/filestream-and-filetable-in-sql-server-2012.html
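For comparison, a plain FILESTREAM column for the two-images-per-row case might look like the sketch below. It is only an illustration: the table and column names are made up, and it assumes the database already has a FILESTREAM filegroup.

    -- A FILESTREAM column requires a ROWGUIDCOL column with a unique constraint.
    CREATE TABLE dbo.ProductImages
    (
        RowGuid   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
        ProductId INT NOT NULL,
        Thumbnail VARBINARY(MAX) FILESTREAM NULL,  -- bytes go to the FILESTREAM container
        FullImage VARBINARY(MAX) FILESTREAM NULL   -- bytes go to the FILESTREAM container
    );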
In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The size and use of the data determines whether you should use database storage or file system storage.
If the following conditions are true, you should consider using FILESTREAM:
Objects that are being stored are, on average, larger than 1 MB.
Fast read access is important.
You are developing applications that use a middle tier for application logic.
For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.
Benefits of the FILETABLE:
Windows API compatibility for file data stored within a SQL Server database. Windows API compatibility includes the following:
Non-transactional streaming access and in-place updates to FILESTREAM data.
A hierarchical namespace of directories and files.
Storage of file attributes, such as created date and modified date.
Support for Windows file and directory management APIs.
Compatibility with other SQL Server features including management tools, services, and relational query capabilities over FILESTREAM and file attribute data.
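One practical consequence of that Windows API compatibility: SQL can hand other applications a UNC path to a file stored in a FileTable. A small sketch, where dbo.Documents is a hypothetical FileTable; FileTableRootPath() and GetFileNamespacePath() are the documented helper functions:

    -- Build the full Windows share path for a row so external apps can open it directly.
    SELECT FileTableRootPath() + file_stream.GetFileNamespacePath() AS unc_path
    FROM dbo.Documents
    WHERE name = N'example.jpg';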
It depends. I personally prefer storing a link to the image in the table. It is simpler, and the files in the directory can be backed up separately.
You have to take several things into account:
How you will process the images. Storing only a link makes it easy to embed the images in web pages (with proper configuration of the web server).
How large and how numerous the images are. If they are stored in the DB and there are a lot of them, this will increase the size of the DB and its backups.
Whether the images change often. In that case it may be better to keep them inside the DB, so a backup of the DB captures their current state.

Restore data dumped from an Oracle database instance to a SQL Server database instance?

I would like to know the steps for restoring data dumped from an Oracle database into a SQL Server database.
Our goal is to get data from an external Oracle database outside our organization. Due to security concerns, the team that manages the data source refused to let us transfer the data over an ODBC server link. Instead, they dumped the tables we need so we can restore the data inside our organization. Each table's dump includes a .sql file that creates the table and constraints, a .ctl file, and one or more .ldr files.
An extra complication: one of the tables contains a BLOB column storing many binary files, such as PDFs. This column accounts for most of the size of the dump; otherwise I could have asked them to send us the data in Excel directly.
Can someone suggest what route we should take?
Either get them to export the data in an open format, or load it into an Oracle instance you have full control over and export it from there. The .ctl and .ldr files suggest they used the old SQL*Loader.
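If they can re-export each table as a delimited flat file, loading it into SQL Server is straightforward, and the BLOBs can be shipped as individual files and pulled in one at a time. A rough sketch with entirely hypothetical paths, table, and column names:

    -- Load one table from a CSV dump (assumes the target table already exists).
    BULK INSERT dbo.ImportedTable
    FROM 'C:\dumps\imported_table.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- Pull one exported binary file into the BLOB column.
    UPDATE dbo.ImportedTable
    SET PdfData = (SELECT BulkColumn
                   FROM OPENROWSET(BULK 'C:\dumps\blobs\12345.pdf', SINGLE_BLOB) AS b)
    WHERE Id = 12345;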

How to handle an Excel file after data exported to SQL Server?

I have thousands of Excel files whose data needs to be imported into SQL Server. The files range from 250 KB to 50 MB in size.
Currently I store the files in a location on the server and import each file's content into SQL Server. Once the data is imported, the physical file remains on the system for future reference.
But the files now occupy more than 25 GB of our server space, and I don't want to delete the source files.
Can anyone help me sort out this problem?
I'm planning to convert each source file to bytes and store those bytes in SQL Server, but I don't know whether that is the right way to handle it.
CSV is the best way to keep your files. Try converting your .xls files to .csv; the same thing happened to me, and that is how I resolved it.
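If you do decide to keep the originals in the database as bytes, the standard way to read a file into a varbinary(max) column is OPENROWSET ... SINGLE_BLOB. A minimal sketch with a hypothetical table and path:

    -- Store one source workbook as raw bytes alongside its name.
    INSERT INTO dbo.SourceFiles (FileName, FileBytes)
    SELECT N'report_2014.xls', BulkColumn
    FROM OPENROWSET(BULK 'C:\ExcelFiles\report_2014.xls', SINGLE_BLOB) AS f;

Note this does not save any space by itself; consider FILESTREAM (discussed in the threads above) if the total volume keeps growing.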

Old data stored in database file

How can I ensure that data I've erased from the DB tables is no longer stored in the .mdf file (and the other database files) on the hard disk?
Here's my situation:
My client used to store unencrypted credit card data in their database (SQL Server). Thanks to PCI requirements, they now encrypt all that data. However, the .mdf file still has some of the old, unencrypted CC data written to it.
We've verified that there are no more CCs in the database; we've shrunk the database; we've backed it up to a file and restored it into a new database; we've even run sp_clean_db_free_space.
Yet when we analyze the persisted file on disk, we still find a handful of unencrypted CCs that are not stored in the DB: they are not part of SPs, views, or UDFs, and they do not appear in any table metadata.
So, my question: how can I ensure all the "bad" CC data is gone? Or, more generally, how do I force MSSQL to store only current data and clean the file of any "garbage"?
Based on what you've done, I'd suggest creating a new database and moving all your data into it.
That way you know you're working only with your new data, and no legacy data can linger in the files.
Have you tried freeing up unused space in the database files (and log files)?
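For reference, a hedged sketch of what that looks like in T-SQL; the database and logical file names are hypothetical. Shrinking releases space but does not overwrite old page contents, while sp_clean_db_free_space zeroes out residual bytes on deallocated pages and in ghost records:

    USE YourDatabase;
    -- Release unused space back to the OS (does not wipe old page contents).
    DBCC SHRINKFILE (YourDatabase_Data, TRUNCATEONLY);
    -- Overwrite residual data left on deallocated pages and in ghost records.
    EXEC sp_clean_db_free_space @dbname = N'YourDatabase';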
To be absolutely sure:
dump your data in some textual format, such as CSV
search the CSV for any unencrypted data & remove it
create a new empty database
load the CSV into the new database
script out the database
bulk copy the data out to flat files
look in the flat files for unencrypted data
drop the database
delete the database files with a secure delete: http://www.snapfiles.com/Freeware/security/fwerase.html
create a new database on the server with your scripts
load the data from the flat files
If you are interested in this topic, I recommend:
Threats to Privacy in the Forensic Analysis of Database Systems, Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data.
http://www.cs.umass.edu/~miklau/pubs/sigmod2007LMS/stahlberg07forensicDB.pdf
