Extract DICOM Image From RayStation Microsoft SQL Database

I was given a backup RayStation database, RS_Patients.bak, and am trying to extract and view the DICOM images that are stored in it. The trouble is two-fold: I don't know which one of the 2,000+ fields (or combinations of fields) refer to the images themselves, and even if I did know where the images were, I don't know how to extract them from the database into .dcm files.
From examining the schema, I found a few fields that are large varbinary fields (BLOBs), and I think they might be the fields I'm looking for. FILESTREAM is enabled for the database and there is an FS directory. I've tried to export those fields into files using the bcp utility, but that hasn't produced valid DICOMs.
Does anyone have any experience with this sort of database/image structure? Any other suggestions for pulling out and viewing the image? Do you think the image would be made up of a couple of fields instead of just one? There are fields next to what we believe is the image field, with headers for the DICOM image: in the table called ImageStack, next to a field called PixelData, there are fields called PixelSize, SlicePosition, NrPixels, etc.
Also, if you can think of another place to ask this, I would appreciate that too.
Edit: per @mcNets' suggestion, the bcp command:
DECLARE @Command Varchar(400)
SET @Command = 'bcp "SELECT TOP 1 PixelData FROM RayStationPatientDB.dbo.ImageStack" queryout "C:\Users\Administrator\Documents\test.dcm" -S WIN-123ABC\MSSQLSERVER01 -T -w'
EXEC xp_cmdshell @Command

Generally speaking, you're not going to be able to use SQL Server results to write image data directly. bcp.exe isn't going to help you, either. You need to either use something that understands that the binary string is raw file data, or, because this is a FILESTREAM, use something that will give you the path to the file on the SQL Server. I have limited experience with FILESTREAM, but here's what I would do.
I can't definitively answer which field to use. That will depend on the application. If we assume that the DICOM images are stored in a FILESTREAM, then you can find the available FILESTREAM columns with this:
select t.name TableName
      ,c.name ColumnName
from sys.tables t
join sys.columns c
    on c.object_id = t.object_id
where c.is_filestream = 1
If we also assume that DICOM images are stored as raw image files -- i.e., as a complete binary version of what they would be if they were saved on a PACS disc -- then you can run this to determine the path for each file by the ID:
select TableName_Id
      ,ColumnName.PathName()
from TableName
The doc for the PathName() function of FILESTREAM columns is here.
If you instead want to pull the data through SQL Server in a traditional sense, then I would probably use a PowerShell script to do it. This has the advantage of letting you use arbitrary data from the server to name the files. This method also has the advantage that it will work on any binary or varbinary column. As a disadvantage, this method will be slower and uses more disk space, because the server has to read the data, send it to the client, and then the client writes the data to disk:
$SqlQuery = "select Name, FileData from TableName";
$OutputPath = 'C:\OutputPath';
$SqlServer = 'ServerName';
$SqlDatabase = 'DatabaseName';
$SqlConnectionString = 'Data Source={0};Initial Catalog={1};Integrated Security=SSPI' -f $SqlServer, $SqlDatabase;
$SqlCommand = New-Object -TypeName System.Data.SqlClient.SqlCommand;
$SqlCommand.CommandText = $SqlQuery;
$SqlConnection = New-Object -TypeName System.Data.SqlClient.SqlConnection -ArgumentList $SqlConnectionString;
$SqlCommand.Connection = $SqlConnection;
$SqlConnection.Open();
$SqlDataReader = $SqlCommand.ExecuteReader();
while ($SqlDataReader.Read()) {
    $OutputFileName = Join-Path -Path $OutputPath -ChildPath "$($SqlDataReader['Name']).dcm";
    [System.IO.File]::WriteAllBytes($OutputFileName, $SqlDataReader['FileData']);
}
$SqlConnection.Close();
$SqlConnection.Dispose();
It's also possible to use FILESTREAM functions to return Win32 API handles, but I have never done that.
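For reference, here's a minimal sketch of that streaming approach, assuming the DICOM blob lives in dbo.ImageStack.PixelData (names taken from the question) and that integrated security works against the server. The transaction is required because the context returned by GET_FILESTREAM_TRANSACTED_CONTEXT() is only valid inside one:
$conn = New-Object System.Data.SqlClient.SqlConnection 'Data Source=ServerName;Initial Catalog=RayStationPatientDB;Integrated Security=SSPI';
$conn.Open();
$tran = $conn.BeginTransaction();  # streaming context is only valid inside a transaction
$cmd = $conn.CreateCommand();
$cmd.Transaction = $tran;
$cmd.CommandText = 'SELECT TOP 1 PixelData.PathName(), GET_FILESTREAM_TRANSACTED_CONTEXT() FROM dbo.ImageStack';
$reader = $cmd.ExecuteReader();
[void]$reader.Read();
$path = $reader.GetString(0);            # logical FILESTREAM path, not a regular file path
$context = [byte[]]$reader.GetValue(1);  # transaction context for the Win32 handle
$reader.Close();
$fs = New-Object System.Data.SqlTypes.SqlFileStream($path, $context, [System.IO.FileAccess]::Read);
$out = [System.IO.File]::Create('C:\OutputPath\test.dcm');
$fs.CopyTo($out);
$out.Dispose();
$fs.Dispose();
$tran.Commit();
$conn.Close();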

Related

How do I attach varbinary(max) data from SQL to my SMTPMessage in powershell?

I have a varbinary(max) column in my SQL server containing binary that looks like the following:
0x255044462D312E340D0A25E2E3CFD30D...
I want to retrieve this value in PowerShell so that I can run logic that sends out an email.
#Get the attachment binary into a variable from sql server
$Query = "SELECT AttachmentBinary FROM [dbo].MyFiles"
$e= Invoke-Sqlcmd -Query $Query -ServerInstance $server (etc.etc.)
Running $e.AttachmentBinary in PowerShell gives me a list of decimal numbers, which I assume is how PowerShell displays the binary:
37
80
68
70
45
(and so on)
The problem now is that when I prepare the file and send the email..
$contentStream = New-Object System.IO.MemoryStream(,$e.Attachmentbinary)
$MailAttachment = New-Object System.Net.Mail.Attachment($contentStream, "application/pdf")
$SMTPMessage.Attachments.Add($MailAttachment)
.. it sends me a 1 KB file. That is to say:
$e.AttachmentBinary.Length #Returns 1024 for some reason..
I know for sure the binary data in the database is correct. The problem seems to be in the second part when adding the attachment.
How can this be corrected?
Invoke-Sqlcmd needs a -MaxBinaryLength value, as the default size of 1024 (1 KB) may be too small for the varbinary.
The resulting array $e.AttachmentBinary needs to be cast to binary explicitly, as @Theo suggested:
[Byte[]]$e.Attachmentbinary
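Putting the two fixes together, a minimal sketch; $server and $database are placeholders, and the attachment file name ('report.pdf') is made up for illustration:
$Query = "SELECT AttachmentBinary FROM [dbo].MyFiles"
$e = Invoke-Sqlcmd -Query $Query -ServerInstance $server -Database $database -MaxBinaryLength ([int]::MaxValue)
[Byte[]]$bytes = $e.AttachmentBinary   # explicit cast to a byte array
$contentStream = New-Object System.IO.MemoryStream(,$bytes)
$MailAttachment = New-Object System.Net.Mail.Attachment($contentStream, 'report.pdf', 'application/pdf')
$SMTPMessage.Attachments.Add($MailAttachment)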

How to bulk import many Word documents into SQL Server database table

I need to import ~50,000 Word documents (.doc and .docx) from a single directory into a SQL Server 2016 database table so that I can use full text indexing and then search the documents' contents.
Since this is a one-off task and the database won't be required for long I'm not concerned with performance or the arguments for using FILESTREAM or FileTables.
I've just created a database with a single table:
CREATE TABLE [dbo].[MyDocument]
(
[ID] INT IDENTITY(1,1) NOT NULL,
[DocumentName] NVARCHAR(255) NOT NULL,
[Extension] NCHAR(10) NOT NULL,
[DocumentContent] VARBINARY(MAX) NOT NULL,
CONSTRAINT [PK_MyDocument] PRIMARY KEY CLUSTERED ([ID] ASC)
)
Now I'm looking for a way to get my documents into the table. There are plenty of examples online for importing a single document into a SQL Server database table using OPENROWSET, but they require me to specify a name for the file, which is obviously no use for my requirements.
I can't believe there isn't a well-documented and straightforward way to do this, but a couple of hours of searching haven't turned anything up, which is starting to make me doubt this is even possible. Surely it is?
Can anybody give me an example snippet of T-SQL for importing multiple files into the database? Or suggest how else it might be achieved?
Below is a PowerShell script to import all ".docx" files in the specified folder using a parameterized query along with a FileStream parameter value to stream file contents to the database rather than loading the entire file contents into client memory.
# import all documents in specified directory using file stream parameter
try {
    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    $insertQuery = @"
INSERT INTO dbo.MyDocument (DocumentName, Extension, DocumentContent)
VALUES(@DocumentName, @Extension, @DocumentContent);
"@
    $connection = New-Object System.Data.SqlClient.SqlConnection("Data Source=.;Initial Catalog=YourDatabase;Integrated Security=SSPI")
    $command = New-Object System.Data.SqlClient.SqlCommand($insertQuery, $connection)
    $documentNameParameter = $command.Parameters.Add("@DocumentName", [System.Data.SqlDbType]::NVarChar, 255)
    $documentExtensionParameter = $command.Parameters.Add("@Extension", [System.Data.SqlDbType]::NVarChar, 10)
    $documentContentParameter = $command.Parameters.Add("@DocumentContent", [System.Data.SqlDbType]::VarBinary, -1)
    $connection.Open()
    $filesToImport = Get-ChildItem "E:\DocumentsToImport\*.docx"
    $importedFileCount = 0
    foreach ($fileToImport in $filesToImport) {
        $documentContentStream = [System.IO.File]::Open($fileToImport.FullName, [System.IO.FileMode]::Open)
        $documentNameParameter.Value = [System.IO.Path]::GetFileNameWithoutExtension($fileToImport.FullName)
        $documentExtensionParameter.Value = [System.IO.Path]::GetExtension($fileToImport.Name)
        $documentContentParameter.Value = $documentContentStream
        [void]$command.ExecuteNonQuery()
        $documentContentStream.Close()
        $importedFileCount += 1
    }
    $connection.Close()
    $timer.Stop()
    Write-Host "$importedFileCount files imported. Duration $($timer.Elapsed)."
}
catch {
    throw
}
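As a quick sanity check after the run, assuming the SqlServer (or sqlps) module is available and with the database name as a placeholder, something like this should confirm the row count and total bytes imported:
# compare ImportedRows/TotalBytes against the source folder
Invoke-Sqlcmd -ServerInstance '.' -Database 'YourDatabase' -Query 'SELECT COUNT(*) AS ImportedRows, SUM(DATALENGTH(DocumentContent)) AS TotalBytes FROM dbo.MyDocument;'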

How to extract SQL from SQL Server XML Deadlock Graphs

I have some SQL deadlocks I am trying to capture mediaName from. The deadlock report is in XML, but the attribute I need is buried in XML, then SQL, then XML again. Here is an example.
The XPath for where the SQL starts is /deadlock/process-list/process/inputbuf, and the SQL is:
SET DEADLOCK_PRIORITY 8;
EXEC spM_Ext_InsertUpdateXML N'<mediaRecords><media
title="This Is the title" mediaType="0"
creationTime="2018-03-16T00:59:43" origSOM="01:00:00;00" notes="Air Date:
2018-03-18 &#xa;Air Window: 3 &#xa;" mediaName="This is what i need"
><mediaInstances><mediaInstance directory="here"
duration="00:28:40;11" version="1" position="00:00:00;00" mediaSetId="34"
creationStartTime="2018-03-16T00:59:43;25" creationEndTime="2018-03-
16T00:59:43;25"/></mediaInstances><properties><
classifications><classification category="HD" classification="Content
Resolution"/></classifications><markups><markup
name=""><Item duration="00:00:10;00" orderNo="1"
type="Dynamic" som="00:59:50;00" comment=""
name="Segment"/></markup><markup
name="Segment"><markupItem duration="00:08:41;10" orderNo="2"
type="Dynamic" som="01:00:00;00" comment="Main Title and Segment 1 |
ID:SEDC" name="Segment"/></markup><markup
name="Black"><markup
See how the XML isn't using < and > for the elements but &lt; and &gt;, which adds complexity.
I am trying to extract only mediaName from this report but can't get past the above-mentioned XPath with PowerShell. Was hoping someone might have an idea. I was using:
$xml = [xml](Get-Content "C:\Users\user\desktop\test.xml")
$xml.SelectNodes('/deadlock/process-list/process/inputbuf') | select mediaName
I have also tried piping Select-Xml to Where-Object, but I don't think I am using the right $_ input.
With the help of tomalak and the answer below, this is the fixed and working parsing script:
#report file location, edited by user when needed
$DeadlockReport = "C:\Users\User\Desktop\xml_report1.xml"
# Create object to load the XML from the deadlock report and find the SQL within
$xml = New-Object xml
$xml.Load($DeadlockReport)
$inputbuf = $xml.SelectNodes('//deadlock/process-list/process/inputbuf')
$value = $inputbuf.'#text'
#find the internal XML and replace bad values, SQL, and truncation with RE
$value = $value -replace "^[\s\S]*?N'","" -replace "';\s*$","" -replace "<markup.*$","</properties></media></mediaRecords>"
#append root elements to $value
$fix = "<root>" + $value + "</root>"
#Load the XML after it's been corrected
$payload = New-Object xml
$payload.LoadXml($fix)
#find the nodes in the xml for mediaName
$mediaName = $payload.SelectNodes('//root/mediaRecords/media/@mediaName')
#iterate through and output all media names
foreach ($i in $mediaName)
{
    $i.Value
}
What you have is:
an XML file,
which contains a string value,
which is SQL,
which contains another string value,
which is XML again.
So let's peel the onion.
First off, please never load XML files like this:
# this is bad code, don't use
$xml = [xml](Get-Content "C:\Users\user\desktop\test.xml")
XML has sophisticated file-encoding detection, and you are short-circuiting that by letting PowerShell load the file. This can lead to data breaking silently, because PowerShell's Get-Content has no idea what the actual encoding of the XML file is. (Sometimes the above works, sometimes it doesn't. "It works for me" doesn't mean that you're doing it right, it means that you're being lucky.)
This is the correct way:
$xml = New-Object xml
$xml.Load("C:\Users\user\desktop\test.xml")
Here the XmlDocument object will take care of loading the file and transparently adapt to any encoding it might have. Nothing can break and you don't have to worry about file encodings.
Second, don't let the looks of the XML file in a text editor deceive you. As indicated, /deadlock/process-list/process/inputbuf contains a string as far as XML is concerned, the < and > and all the rest will be there when you look at the actual text value of the element.
$inputbuf = $xml.SelectSingleNode('/deadlock/process-list/process/inputbuf')
$value = $inputbuf.'#text'
Write-Host $value
Would print something like this, which is SQL:
SET DEADLOCK_PRIORITY 8;
EXEC spM_Ext_InsertUpdateXML N'<mediaRecords><media
title="This Is the title" mediaType="0"
creationTime="2018-03-16T00:59:43" origSOM="01:00:00;00" notes="Air Date:
2018-03-18
Air Window: 3
" mediaName="This is what i need"
><mediaInstances><mediaInstance directory="here"
duration="00:28:40;11" version="1" position="00:00:00;00" mediaSetId="34"
creationStartTime="2018-03-16T00:59:43;25" creationEndTime="2018-03-
16T00:59:43;25"/></mediaInstances><properties><
classifications><classification category="HD" classification="Content
Resolution"/></classifications><markups><markup
name=""><Item duration="00:00:10;00" orderNo="1"
type="Dynamic" som="00:59:50;00" comment=""
name="Segment"/></markup><markup
name="Segment"><markupItem duration="00:08:41;10" orderNo="2"
type="Dynamic" som="01:00:00;00" comment="Main Title and Segment 1 |
ID:SEDC" name="Segment"/></markup><markup
name="Black"><markup ...
</mediaRecords>';
And the XML you are interested in is actually a string inside this SQL. If the SQL follows this pattern...
SET DEADLOCK_PRIORITY 8;
EXEC spM_Ext_InsertUpdateXML N'<...>';
...we need to do three things in order to get to the XML payload:
Remove the enclosing SQL statements
Replace any '' with ' (because the '' is the escaped quote in SQL strings)
Pray that the part in between does not contain any other SQL expressions
So
$value = $value -replace "^[\s\S]*?N'","" -replace "';\s*$","" -replace "''","'"
would remove everything up to and including N' and the '; at the end, as well as replace all the duplicated single quotes (if any) with normal single quotes.
Adapt the regular expressions as needed. Replacing the SQL parts with regex isn't exactly clean, but if the expected input is very limited, like in this case, it'll do.
Write-Host $value
Now we should have a string that is actually XML. Let's parse it. This time it's already in memory, so there isn't any file encoding to pay attention to, and it's all right to cast it to XML directly:
$payload = [xml]$value
And now we can query it for the value you are interested in:
$mediaName = $payload.SelectSingleNode("/mediaRecords/media/#mediaName")
Write-Host $mediaName
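If the payload ever contains more than one media element, the same XPath works with SelectNodes; a small sketch:
# enumerate every mediaName attribute instead of just the first one
foreach ($attr in $payload.SelectNodes('/mediaRecords/media/@mediaName')) {
    $attr.Value
}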

How to backup statistics in sql?

Let's just say I have a table A with some data in it in SSMS. There are sub-objects such as columns, constraints, triggers, indexes, statistics, etc.
I want to create a similar table with the same properties as table A. I know I need to go to Script Table As -> Create To -> New Query Window to duplicate the table structure.
However, after doing that, I realized the statistics in my new table are empty even though there are statistics on table A. Did I miss something?
You can script out the statistics blob with the following bit of PowerShell (which I yoinked from an old blog post of mine):
pushd;
import-module sqlps -disablenamechecking;
popd;
$opts = new-object Microsoft.SqlServer.Management.SMO.ScriptingOptions;
$opts.OptimizerData = $true;
$server = new-object Microsoft.SqlServer.Management.SMO.Server ".";
$database = $server.Databases["AdventureWorks2008R2"];
foreach ($table in $database.Tables) {
    foreach ($stat in $table.Statistics) {
        $stat.Script($opts);
    }
}
The above will script out all statistics (including the histogram data) for all tables in the AdventureWorks2008R2 database. You should be able to tailor it to your needs.
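For example, to scope it to a single table and save the output for replay against the new database, something like this should work, reusing $server and $opts from the script above; the database, schema, table, and output path are placeholders:
# script statistics (with histogram data) for one table to a .sql file
$table = $server.Databases['YourDatabase'].Tables['A', 'dbo'];
$table.Statistics | foreach-object { $_.Script($opts) } | out-file 'C:\OutputPath\A_statistics.sql';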

How to stop truncating of data when outputting XML from ReportServer using data returned from a stored procedure with Powershell (large XML file)

I have been twiddling with a fairly simple idea: export ReportServer reports from the underlying database, then find their dependent stored procedures and export those as well.
However, when testing initially I found that the XML data for the report itself is truncated by the standard way I export things to files, and I think I may be using an incorrect method.
The code is fairly simple at this point, and I am using a simplified report called "ChartReport":
Import-Module 'sqlps'
$saveloc = "$home\savedata\filename.txt"
$dataquery = @"
DECLARE @name NVARCHAR(MAX) = 'ChartReport',
        @path NVARCHAR(MAX) = '/ChartReport'
SELECT CAST(CAST(c.Content AS VARBINARY(MAX)) AS XML) [ReportData], c.Name, c.Path
FROM ReportServer.dbo.Catalog c
WHERE c.Name = @name
AND c.Path LIKE @path+'%'
"@
Invoke-SQLCMD -Query $dataquery | select ReportData | Out-File $saveloc
I have verified the query returns XML (The underlying XML file itself is over 25000 characters, and I would be happy to provide a link to it if anyone is interested), however when I save the file I get something like:
Column1
<Report xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner" xmlns:cl="http://schemas.microsof...
I have attempted to use some of the ideas already posted on SO, such as:
redirecting with > $somefile (Powershell 2: Easy way to direct every bit of output to a file?)
Out-File with an explicit width (Powershell Add-Content Truncating Output)
using Format-Table with -AutoSize and -Wrap
Each of these fails at some point (though the Format-Table method gets pretty far before it truncates).
I would definitely consider some sort of XML-specific solution, but really I think I am just missing some information. As far as I am concerned, this is a file of "stuff" and I want to write said file to disk after it is loaded into the object.
Would iterating over some sort of line break and writing each line of the object to a file be the idiomatic answer?
Use the -MaxCharLength parameter of the Invoke-Sqlcmd command. By default it is 4000.
See Invoke-SqlCmd doesn't return long string?
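Applied to the script above, that would look something like this; -ExpandProperty also sidesteps the column-width truncation that the default table formatting introduces when piping to Out-File:
# raise the 4,000-character default and write the raw XML string
Invoke-SQLCMD -Query $dataquery -MaxCharLength ([int]::MaxValue) |
    Select-Object -ExpandProperty ReportData |
    Out-File $saveloc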
