Avoiding loading whole blob to memory - sql-server

I store large files (50-500 MB) in a database. Once a file has been fetched, the application doesn't need to hold the whole thing in memory. How do I fetch a table row (or specifically the installer stored in the row) without loading the entire file into RAM, i.e. as a sort of buffered download into a file?
I haven't found a solution that avoids loading the whole file so far. Instead I forward requests to a Flask server that loads the entire file and then lets the application instance download it into a file. However, this doesn't seem like a very good solution.

You are probably looking for FILESTREAM (SQL Server):
FILESTREAM enables SQL Server-based applications to store unstructured data, such as documents and images, on the file system. Applications can leverage the rich streaming APIs and performance of the file system and at the same time maintain transactional consistency between the unstructured data and corresponding structured data.
It is relevant here because SQL Server on Windows can stream file data to Windows clients without having to load the whole file into SQL Server's memory:
The Win32 streaming support works in the context of a SQL Server transaction. Within a transaction, you can use FILESTREAM functions to obtain a logical UNC file system path of a file. You then use the OpenSqlFilestream API to obtain a file handle. This handle can then be used by Win32 file streaming interfaces, such as ReadFile() and WriteFile(), to access and update the file by way of the file system.
Do note that at this time it is not supported on SQL Server 2017 for Linux.
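Independently of FILESTREAM, the general idea of a buffered download can be sketched in Python as below. This is only an illustration under stated assumptions: `fetch_chunk` is a hypothetical callable, not part of any driver. With SQL Server it might wrap a query along the lines of `SELECT SUBSTRING(installer, ?, ?) FROM installers WHERE id = ?` (table and column names are made up), but that part is neither shown nor tested here. The loop holds at most one chunk in memory at a time.

```python
import io
from typing import BinaryIO, Callable

CHUNK_SIZE = 1024 * 1024  # 1 MiB per round trip; an arbitrary choice


def stream_blob(fetch_chunk: Callable[[int, int], bytes],
                out: BinaryIO,
                chunk_size: int = CHUNK_SIZE) -> int:
    """Copy a blob to `out` piece by piece and return the byte count.

    `fetch_chunk(offset, length)` is a hypothetical callable returning up to
    `length` bytes starting at the 0-based `offset`, and b"" (or None) once
    the end of the blob is reached. T-SQL SUBSTRING is 1-based, so a wrapper
    around such a query would pass offset + 1.
    """
    offset = 0
    while True:
        chunk = fetch_chunk(offset, chunk_size)
        if not chunk:
            break
        out.write(chunk)
        offset += len(chunk)
    return offset


# Stand-in for the database: slices of an in-memory byte string.
blob = bytes(range(256)) * 40
sink = io.BytesIO()
written = stream_blob(lambda off, n: blob[off:off + n], sink, chunk_size=4096)
```

In a real application `sink` would be a file opened in binary write mode, and the chunk size would be a trade-off between round trips and memory use.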

Related

Backing up an SQLite3 database on embed

I am currently working on a C project that contains an SQLite3 database with WAL enabled. We have an HTTP web interface over which you shall be able to get an online backup of the database. Currently, the database file is reachable over HTTP, which is bad in many ways. My task now is to implement a new backup algorithm.
There is the SQLite Online Backup API, which seems pretty nice: you open two database connections and copy one database to the other. In my setup, however, I can't be sure that there is enough space to copy the entire database, since it may hold a lot of statistics and multimedia files. For me, the best solution would be to open an SQLite connection that is connected directly to stdout, so that I could back up the database through CGI.
However, I didn't find a way in the SQLite3 API to open a database connection on special files like stdout. What would be the best practice for backing up the database? How do you perform online backups of your SQLite3 databases?
Thanks in advance!
If you need to have some special target interface for the backup, you can implement your custom VFS interface that does what you need. See the parameters for sqlite3_open_v2() where you can pass in the name of a VFS.
(See https://www.sqlite.org/c3ref/vfs.html for details about VFS and the OS interface used by SQLite.)
Basically every sqlite3_backup_step() call will write some blocks of data, and you would need to transfer those to your target database in some way.
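To illustrate the stepwise copying, here is a small sketch using Python's sqlite3 module, whose Connection.backup() wraps the same online backup API. It copies between two in-memory databases a few pages per step. It does not implement a custom VFS, and the table name, row contents and page count are invented for the example.

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE stats (id INTEGER PRIMARY KEY, payload BLOB)")
src.executemany("INSERT INTO stats (payload) VALUES (?)",
                [(bytes(2000),) for _ in range(50)])
src.commit()

dst = sqlite3.connect(":memory:")
steps = []  # pages still to copy, recorded after each backup step


def progress(status, remaining, total):
    # Called by sqlite3 after every step of the backup.
    steps.append(remaining)


# pages=8 copies at most 8 pages per step instead of everything in one go.
src.backup(dst, pages=8, progress=progress)

copied = dst.execute("SELECT COUNT(*) FROM stats").fetchone()[0]
```

Each step only touches a handful of pages, which is the point where a custom VFS on the destination side would get the chance to forward the written blocks elsewhere.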

How can I serialize an in-memory SQLite Database?

I have an in-memory SQLite database which I want to serialize and send to another computer. Is this possible without writing the database out to disk and reading the file from there?
You could use the online backup API to transfer the in-memory database to a file-based database created in shared memory (on Linux, in /dev/shm for instance), avoiding disk operations. This pseudo-file is then transferred to the remote host (again placed in /dev/shm), and the online backup API is used in the other direction to load the file-based database into your target in-memory database.
See:
http://www.sqlite.org/backup.html
http://www.sqlite.org/c3ref/backup_finish.html
AFAIK, there is no API to perform online/load without intermediate databases.
The sqlite3 shell program contains a .dump command that "dumps the database in an SQL text format." You can use the source code for .dump (it is public domain) to create your own serializer.
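As a sketch of the .dump approach without the shell, Python's sqlite3 module offers Connection.iterdump(), which yields the same kind of SQL text. The example below serializes an in-memory database to a string and replays it into a second in-memory database that stands in for the remote machine. The table and rows are invented for illustration.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",)])
source.commit()

# Serialize: the dump is plain SQL text, so any transport can carry it.
payload = "\n".join(source.iterdump())

# Deserialize on the "other computer": replay the SQL into a fresh database.
target = sqlite3.connect(":memory:")
target.executescript(payload)
names = [row[0] for row in target.execute("SELECT name FROM items ORDER BY id")]
```

A text dump is larger and slower to rebuild than a binary page copy, so this suits small databases better than large ones.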

WinForms application design - moving documents from SQL Server to file storage

I have a standard WinForms application that connects to a SQL Server. The application allows users to upload documents which are currently stored in the database, in a table using an image column.
I need to change this approach so the documents are stored as files and a link to the file is stored in the database table.
With the current approach, users are shielded from how a document is stored when they upload it: because they have a connection to the database, they don't need to know where the files are kept, and no special directory permissions are required. If I set up a network share for the documents, I want to avoid IT issues such as users needing access to that directory in order to upload or open documents.
What are the options available to do this? I thought of having a temporary database where the documents are uploaded to in the same way as the current approach and then a process running on the server to save these to the file store. This database could then be deleted and recreated to reclaim any space. Are there any better approaches?
ADDITIONAL INFO: There is no web server element to my application so I do not think a WCF service is possible
Is there a reason why you want to get the files out of the database in the first place?
How about still saving them in SQL Server, but using a FILESTREAM column instead of IMAGE?
Quote from the link:
FILESTREAM enables SQL Server-based applications to store unstructured data, such as documents and images, on the file system. Applications can leverage the rich streaming APIs and performance of the file system and at the same time maintain transactional consistency between the unstructured data and corresponding structured data.
FILESTREAM integrates the SQL Server Database Engine with an NTFS file system by storing varbinary(max) binary large object (BLOB) data as files on the file system. Transact-SQL statements can insert, update, query, search, and back up FILESTREAM data. Win32 file system interfaces provide streaming access to the data.
FILESTREAM uses the NT system cache for caching file data. This helps reduce any effect that FILESTREAM data might have on Database Engine performance. The SQL Server buffer pool is not used; therefore, this memory is available for query processing.
So you would get the best out of both worlds:
The files would be stored as files on the hard disk (probably faster than storing them in the database), but you don't have to care about file shares, permissions etc.
Note that you need at least SQL Server 2008 to use FILESTREAM.
I can tell you how I implemented this task. I wrote a WCF service which is used to send archived files. So, if I were you, I would create a similar service that can save files and send them back. This is easy, but you must make sure that the account the WCF service runs under has permission to read and write the files.
You could just have your application pass the object to a procedure (CLR maybe) in the database which then writes the data out to the location of your choosing without storing the file contents. That way you still have a layer of abstraction between the file store and the application but you don't need to have a process which cleans up after you.
Alternatively a WCF/web service could be created which the application connects to. A web method could be used to accept the file contents and write them to the correct place, it could return the path to the file or some file identifier.
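Independently of WCF, the core of such a web method might look like the sketch below: it copies the uploaded stream into a storage directory in chunks and returns a generated identifier that the application can store in the database instead of the file contents. It is written in Python only for brevity; `store_document` and the flat directory layout are hypothetical.

```python
import io
import os
import shutil
import tempfile
import uuid


def store_document(upload, store_dir):
    """Write `upload` (a binary file-like object) under `store_dir`.

    Returns a generated identifier to save in the database row.
    """
    doc_id = uuid.uuid4().hex
    path = os.path.join(store_dir, doc_id)
    with open(path, "wb") as out:
        # copyfileobj copies in fixed-size chunks rather than all at once.
        shutil.copyfileobj(upload, out, 64 * 1024)
    return doc_id


# Usage with a temporary directory standing in for the file store.
store_dir = tempfile.mkdtemp()
doc_id = store_document(io.BytesIO(b"example document"), store_dir)
with open(os.path.join(store_dir, doc_id), "rb") as f:
    stored = f.read()
```

Returning an opaque identifier rather than a path keeps clients unaware of where the files actually live, so only the service account needs access to the directory.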

Is it possible to access the FILESTREAM share?

What I mean is being able to access it through Windows Explorer or other programs. I believe the answer is that it isn't possible. But I really want to know why it's not allowed. It seems that the files could be made available read-only through the network share.
You can't access the Filestream share directly and explore around. Any open to a Filestream file needs to be done using the path retrieved from SQL Server and by using NtCreateFile (or a wrapper) with the appropriate transaction context passed in through the EABuffer.
It is possible to create a new share and point it to the physical location of the files, however this is pretty pointless as there's no supported way to resolve a SQL Filestream row to a physical file location (the RsFx filter driver handles these conversions internally), the file location may change at any time due to concurrent updates / partition changes, and you'll need to relax security on the folder to an unacceptable level. It can also cause corruptions in the database if you move or delete files without the knowledge of SQL Server. Any locks held on physical files will interfere with deletes as mentioned in dportas' comment.
I agree it would be great to be able to browse a namespace of the Filestream files through explorer and open files directly through applications without requiring an application rewrite.
Yes it is possible. The point of filestream however is that you get that access via the filestream API rather than direct through the filesystem. Bear in mind that the file name could change without warning - for example updates may cause a new filestream file to be created. Possibly if you are holding file system locks (even shared locks) on a file that is needed by SQL Server then that may cause a contention problem. So if you access the data direct through the file system the results will be unsupported and may be unreliable - but then again it might work :-)
Yes it is possible if you are also using FileTables (I am using Sql Express 2017). When in Sql Server Configuration Manager, right click on your server instance, select Properties, and then go to the FILESTREAM tab. Check the "Allow remote clients access to FILESTREAM data". You may have to stop/start your instance. Now you can browse to the share, which is named according to your instance (in my case SqlExpress). In my database (SimioPortal) I had created a file (BlobStore) where I stored my files.
So, at the command prompt I can now type: dir \\localhost\sqlexpress\SimioPortal\blobstore and see a list of my files. You can do a similar thing in File Explorer.

Is Streaming Video possible with Sql Filestream?

We have stored all media in Sql Filestream, but now we'll need Video and Audio streaming... Will this be possible with Sql Filestream or will I have to take all of the Video and Audio out of the database?
Which technology would you use to enable Video/Audio Streaming?
WebORB
FluorineFX
Wowza (way better I think than the first two)
IIS Media (haven't looked into this yet)
When using IIS Media, it's not possible to store the data in a SQL FILESTREAM.
For further details check here.
It's probably much the same with the rest of your suggested solutions, since all of them need to re-encode the material to enable streaming (if it's not already in the necessary format).
You actually have 2 problems:
Re-encoding the videos into a format that lets you stream them via the server platform you choose. Just for this part you need to extract the files from the DB, since the encoding tools can't be fed from a database, even if it's a SQL FILESTREAM.
Storing the encoded files somewhere the media servers can access them. Again, they don't allow a SQL Server as a data source; they probably have their own storage infrastructure or use the file system.
Conclusion:
The FileStream is extremely helpful when you have full control over server/client, but sadly not in your case.
You will probably have to extract all files from the DB.
The FileTable feature in SQL Server "Denali" (not yet released) is designed specifically for this scenario (amongst others).
There's a good overview link here: Using FileTables to Manage Unstructured FILESTREAM Data.
This will allow you to directly access and play these files through a provided UNC path without requiring any changes to the application, so you can use any of the above mentioned streaming servers.

Resources