I have a process that runs every hour. As part of that process, it iterates over a text file containing about 100K strings and needs to check whether each line already exists in a specific table in a SQL Server database that has about 30M records.
I have 2 options:
Option 1: SELECT all strings from the table, load them into memory, and then check each line in the file against that in-memory data.
Downside: It eats up the machine's memory.
Option 2: Query the database for each line in the 100K text file (assuming the table is indexed correctly).
Downside: It requires many requests (100K of them) to the database.
If I use option 2, can SQL Server handle this number of requests?
What is the preferred way to overcome this issue?

Below is PowerShell example code for another option: bulk insert the strings into a temp table and perform the lookups as a single set-based SELECT query. I would expect this method to typically run in a few seconds, depending on your infrastructure.
$connectionString = "Data Source=.;Initial Catalog=YourDatabase;Integrated Security=SSPI"
$connection = New-Object System.Data.SqlClient.SqlConnection($connectionString)
# load strings from file into a DataTable
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$dataTable = New-Object System.Data.DataTable
($dataTable.Columns.Add("StringData", [System.Type]::GetType("System.String"))).MaxLength = 20
$streamReader = New-Object System.IO.StreamReader("C:\temp\temp_strings.txt")
while ($streamReader.Peek() -ge 0) {
    $string = $streamReader.ReadLine()
    $row = $dataTable.NewRow()
    $row[0] = $string
    [void]$dataTable.Rows.Add($row)
}
$streamReader.Close()
Write-Host "DataTable load completed. Duration $($timer.Elapsed.ToString())"
# bulk insert strings into temp table
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$connection.Open()
$command = New-Object System.Data.SqlClient.SqlCommand("CREATE TABLE #temp_strings(StringValue varchar(20));", $connection)
[void]$command.ExecuteNonQuery()
$bcp = New-Object System.Data.SqlClient.SqlBulkCopy($connection)
$bcp.DestinationTableName = "#temp_strings"
$bcp.WriteToServer($dataTable)
$bcp.Close()
Write-Host "BCP completed. Duration $($timer.Elapsed.ToString())"
# execute set-based lookup query and return found/notfound for each string
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$command.CommandText = @"
SELECT
      strings.StringValue
    , CASE
          WHEN YourTable.YourTableKey IS NOT NULL THEN CAST(1 AS bit)
          ELSE CAST(0 AS bit)
      END AS Found
FROM #temp_strings AS strings
LEFT JOIN dbo.YourTable ON strings.StringValue = YourTable.YourTableKey;
"@
$reader = $command.ExecuteReader()
while($reader.Read()) {
    Write-Host "String $($reader["StringValue"]) found: $($reader["Found"])"
}
$reader.Close()
$connection.Close()
Write-Host "Lookups completed. Duration $($timer.Elapsed.ToString())"
As an alternative to bulk insert, you could pass the strings using a table-valued parameter (or XML, JSON, or delimited values) for use in the query.
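The table-valued parameter route could look roughly like the sketch below. It assumes a user-defined table type named dbo.StringList (hypothetical; it would have to be created beforehand, for example with CREATE TYPE dbo.StringList AS TABLE (StringValue varchar(20))) and reuses the YourTable/YourTableKey placeholders from the code above. ConvertTo-StringTable and Find-ExistingStrings are illustrative names, not part of any library.

```powershell
# Build a single-column DataTable from a list of strings.
# The column name matches the hypothetical table type dbo.StringList.
function ConvertTo-StringTable {
    param([string[]]$Strings)
    $table = New-Object System.Data.DataTable
    [void]$table.Columns.Add("StringValue", [string])
    foreach ($s in $Strings) {
        [void]$table.Rows.Add($s)
    }
    # The leading comma stops PowerShell from unrolling the DataTable into rows
    return ,$table
}

# Send the strings as one structured parameter and emit those that exist.
# Assumes dbo.StringList, dbo.YourTable and YourTableKey exist on the server.
function Find-ExistingStrings {
    param([System.Data.DataTable]$StringTable, [string]$ConnectionString)
    $connection = New-Object System.Data.SqlClient.SqlConnection($ConnectionString)
    try {
        $connection.Open()
        $command = $connection.CreateCommand()
        $command.CommandText = "SELECT s.StringValue FROM @strings AS s JOIN dbo.YourTable AS t ON t.YourTableKey = s.StringValue;"
        $parameter = $command.Parameters.Add("@strings", [System.Data.SqlDbType]::Structured)
        $parameter.TypeName = "dbo.StringList"
        $parameter.Value = $StringTable
        $reader = $command.ExecuteReader()
        while ($reader.Read()) { $reader["StringValue"] }
        $reader.Close()
    }
    finally {
        $connection.Dispose()
    }
}

# Only the DataTable is built here; the lookup itself needs a live server
$stringTable = ConvertTo-StringTable -Strings @("alpha", "beta", "gamma")
```

With a real connection string, `Find-ExistingStrings -StringTable $stringTable -ConnectionString $connectionString` would make a single round trip instead of 100K separate requests.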


SqlBulkCopy Excessive Memory Consumption even with EnableStreaming and low BatchSize

I am trying to bulk load data from Oracle to SQL Server using SqlBulkCopy from PowerShell (SqlServer module).
On small data sets everything works fine, but on big data sets, even with BatchSize and streaming set, SqlBulkCopy takes all the available memory until it fails with an out-of-memory error.
The notify function also seems to give no response, so I guess that even with EnableStreaming = $true the process first loads everything into memory...
What did I miss?
$current = Get-Date
#copy table from Oracle table to SQL Server table
add-type -path "D:\oracle\product\12.1.0\client_1\odp.net\managed\common\Oracle.ManagedDataAccess.dll";
#define oracle connection string
$conn_str = "cstr"
# query for oracle table
$qry = "
SELECT ... FROM ... source
WHERE source.ISSYNTHETIC=0 AND source.VALIDFROM >= TO_Date('2019-01-01','yyyy-mm-dd')
";
# key (on the left side) is the source column while value (on the right side) is the target column
[hashtable] $mapping = @{'ID'='ID';'CREATEDT'='CREATEDT';'MODIFIEDDT'='MODIFIEDDT'};
$adapter = new-object Oracle.ManagedDataAccess.Client.OracleDataAdapter($qry, $conn_str);
#$info = new-object Oracle.ManagedDataAccess.Client;
#Write-Host ( $info | Format-Table | Out-String)
$dtbl = new-object System.Data.DataTable('MYTABLE');
#this Fill method will populate the $dtbl with the query $qry result
$adapter.Fill($dtbl);
#define sql server target instance
$sqlconn = "cstr";
$sqlbc = new-object system.data.sqlclient.Sqlbulkcopy($sqlconn)
$sqlbc.BatchSize = 1000;
$sqlbc.EnableStreaming = $true;
$sqlbc.NotifyAfter = 1000;
$sqlbc.DestinationTableName = 'MYTABLE'; # assumption: target table named like the DataTable above
#need to tell $sqlbc the column mapping info
foreach ($k in $mapping.keys)
{
    $colMapping = new-object System.Data.SqlClient.SqlBulkCopyColumnMapping($k, $mapping[$k]);
    $sqlbc.ColumnMappings.Add($colMapping) | out-null
}
$sqlbc.WriteToServer($dtbl);
$sqlbc.Close();
$end= Get-Date
$diff= New-TimeSpan -Start $current -End $end
Write-Output "import needed : $diff"
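Thanks to Jeroen, I changed the code like this, and now it no longer consumes excessive memory. The OracleDataAdapter.Fill call loaded the whole result set into the DataTable before the copy started; passing a data reader to WriteToServer streams the rows instead:
$oraConn = New-Object Oracle.ManagedDataAccess.Client.OracleConnection($conn_str);
$oraConn.Open();
$command = $oraConn.CreateCommand();
$command.CommandText = $qry;
$reader = $command.ExecuteReader()
$sqlbc.WriteToServer($reader);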
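As for the silent notifications: NotifyAfter only sets how often the SqlRowsCopied event is raised, and as far as the posted code shows no handler was attached, so nothing would be printed either way. A minimal sketch of the streaming copy with a progress handler follows; Copy-ReaderToSqlServer is a hypothetical helper name, and the connection string and table name are whatever your environment uses.

```powershell
# Stream rows from a data reader into SQL Server and report progress.
# $Reader is expected to implement System.Data.IDataReader
# (for example the OracleDataReader returned by ExecuteReader).
function Copy-ReaderToSqlServer {
    param(
        $Reader,
        [string]$ConnectionString,
        [string]$DestinationTable,
        [int]$BatchSize = 1000
    )
    $bulk = New-Object System.Data.SqlClient.SqlBulkCopy($ConnectionString)
    try {
        $bulk.DestinationTableName = $DestinationTable
        $bulk.BatchSize = $BatchSize
        $bulk.EnableStreaming = $true
        $bulk.NotifyAfter = $BatchSize
        # Without a handler, NotifyAfter produces no visible output
        $bulk.add_SqlRowsCopied({
            param($source, $rowsCopiedArgs)
            Write-Host "$($rowsCopiedArgs.RowsCopied) rows copied"
        })
        $bulk.WriteToServer($Reader)
    }
    finally {
        $bulk.Close()
    }
}
```

It would be called with the reader from the snippet above, along the lines of `Copy-ReaderToSqlServer -Reader $reader -ConnectionString $sqlconn -DestinationTable <target table>`.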

Output XML directly to file from SQL server

Output the results (XML) of a stored procedure to a file.
I have a stored procedure in SQL Server that creates an XML document. It currently displays the resulting XML, and I have to save it to a file manually.
I have tried calling the procedure from PowerShell, as in this question. This works for small files but not for large ones (>1 GB), because PowerShell tries to store the entire result in a variable and quickly runs out of memory.
I'm opening this as a new question because I think there should be a way of doing this within SQL Server (or a better way of doing it with PowerShell).
You shouldn't use a stored procedure here. Just use better PowerShell. You can stream large types to and from SQL Server with SqlClient. So you just need to drop down and use ADO.NET instead of using the invoke-sqlcmd convenience method.
$conString = "server=localhost;database=tempdb;integrated security=true"
$sql = @"
select top (1000*1000) *
from sys.messages m
cross join sys.objects o
for xml auto
"@
$fn = "c:\temp\out.xml"
$con = new-object System.Data.SqlClient.SqlConnection
$con.connectionstring = $conString
$con.Open()
$cmd = $con.createcommand()
$cmd.CommandText = $sql
$cmd.CommandTimeout = 0
$rdr = $cmd.ExecuteXmlReader()
$w = new-object System.Xml.XmlTextWriter($fn,[System.Text.Encoding]::UTF8)
# copy the XML from the reader to the file as it arrives, without buffering it all
$w.WriteNode($rdr, $true)
$w.Close()
$rdr.Close()
$con.Close()
write-host "Process Memory: $( [System.GC]::GetTotalMemory($false) )"
write-host "File Size: $( (ls $fn)[0].Length )"
Process Memory: 34738200
File Size: 468194885
Another solution, if you can use it: build your XML file in a temporary table line by line, then output the result and read it line by line from PowerShell or other code:
SQL example:
-- Stored procedure
/*** Cleanup: ***********************************************************
IF EXISTS ( SELECT name FROM sysobjects
            WHERE type = 'P' AND name = 'procTEST' )
    DROP PROCEDURE procTEST
*** Cleanup: ***********************************************************/
CREATE PROCEDURE procTEST
AS
SET NOCOUNT ON
-- Temp table and column names are illustrative
CREATE TABLE #XmlLines (LineText nvarchar(max), LineNumber int)
INSERT INTO #XmlLines
SELECT 'Line 1',1
UNION ALL
SELECT 'Line 2',2
UNION ALL
SELECT 'Line 3',3
SELECT LineText FROM #XmlLines ORDER BY LineNumber
/*** Tests **************************************************************
sp_helptext procTEST
EXEC procTEST
*** Tests **************************************************************/
PowerShell script:
$readconn = New-Object System.Data.OleDb.OleDbConnection
$writeconn = New-Object System.Data.OleDb.OleDbConnection
[string]$connstr="Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=TEST;Data Source=.\XXXXX;Workstation ID=OMEGA2"
$readconn.connectionstring = $connstr
$readconn.open()
$readcmd = New-Object system.Data.OleDb.OleDbCommand
$readcmd.connection = $readconn
$readcmd.commandtext='EXEC procTEST'
$reader = $readcmd.executereader()
# generate header
$hash = @{}
for ($i=0;$i -lt $reader.FieldCount;$i++){
    $hash[$reader.getname($i)] = ''
}
# read the result one row at a time and emit one object per row
$dbrecords=while($reader.read()) {
    for ($i=0;$i -lt $reader.FieldCount;$i++){
        $hash[$reader.getname($i)] = $reader.GetValue($i)
    }
    New-Object PSObject -Property $hash
}
$reader.close()
$readconn.close()

Powershell function to import csv file to SQL Server database table

I have created a PowerShell function that bulk copies data from a .csv file (first row is the header), and inserts the data in to a SQL Server database table.
See my code:
function BulkCsvImport($sqlserver, $database, $table, $csvfile, $csvdelimiter, $firstrowcolumnnames) {
    Write-Host "Bulk Import Started."
    $elapsed = [System.Diagnostics.Stopwatch]::StartNew()
    # 50k worked fastest and kept memory usage to a minimum
    $batchsize = 50000
    # Build the sqlbulkcopy connection, and set the timeout to infinite
    $connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;"
    # Wipe the bulk insert table first
    Invoke-Sqlcmd -Query "TRUNCATE TABLE $table" -ServerInstance $sqlserver -Database $database
    $bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
    $bulkcopy.DestinationTableName = $table
    $bulkcopy.bulkcopyTimeout = 0
    $bulkcopy.batchsize = $batchsize
    # Create the datatable, and autogenerate the columns.
    $datatable = New-Object System.Data.DataTable
    # Open the text file from disk
    $reader = New-Object System.IO.StreamReader($csvfile)
    $columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
    if ($firstrowcolumnnames -eq $true) { $null = $reader.readLine() }
    foreach ($column in $columns) {
        $null = $datatable.Columns.Add()
    }
    # Read in the data, line by line
    while (($line = $reader.ReadLine()) -ne $null) {
        $null = $datatable.Rows.Add($line.Split($csvdelimiter))
        $i++
        if (($i % $batchsize) -eq 0) {
            $bulkcopy.WriteToServer($datatable)
            Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())."
            $datatable.Clear()
        }
    }
    # Add in all the remaining rows since the last clear
    if($datatable.Rows.Count -gt 0) {
        $bulkcopy.WriteToServer($datatable)
        $datatable.Clear()
    }
    # Clean Up
    $reader.Close(); $reader.Dispose()
    $bulkcopy.Close(); $bulkcopy.Dispose()
    $datatable.Dispose()
    Write-Host "Bulk Import Completed. $i rows have been inserted into the database."
    # Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())"
    # Sometimes the Garbage Collector takes too long to clear the huge datatable.
    $i = 0
    [System.GC]::Collect()
}
I would like to modify the above so that the data is matched by column name: the column names in the .csv file are identical to the column names in the SQL Server database table. At the moment the data is being imported into the wrong database columns.
Could I get some assistance with what I need to change in the above function to achieve this?
I would use an existing open source solution:
Import-DbaCsv - dbatools.io
Efficiently imports very large (and small) CSV files into SQL Server.
Import-DbaCsv takes advantage of .NET's super fast SqlBulkCopy class to import CSV files into SQL Server.
By default, the bulk copy tries to automap columns. When it doesn't work as desired, this parameter will help.
PS C:\> $columns = @{
>> Text = 'FirstName'
>> Number = 'PhoneNumber'
>> }
PS C:\> Import-DbaCsv -Path c:\temp\supersmall.csv -SqlInstance sql2016 -Database tempdb -ColumnMap $columns -BatchSize 50000 -Table table_name -Truncate
The CSV column 'Text' is inserted into SQL column 'FirstName' and CSV column Number is inserted into the SQL Column 'PhoneNumber'. All other columns are ignored and therefore null or default values.
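If you would rather keep the original function than add a dependency, the mismatch could also be addressed by naming the DataTable columns after the CSV header and adding one column mapping per name. The sketch below shows only that part. New-CsvDataTable and Add-NameMappings are hypothetical helper names, and it assumes the header names match the table's column names exactly and that the delimiter never appears inside a field.

```powershell
# Create a DataTable whose column names come from the CSV header line.
function New-CsvDataTable {
    param([string]$HeaderLine, [string]$Delimiter = ",")
    $table = New-Object System.Data.DataTable
    # -split takes a regex, so the delimiter is escaped to match literally
    foreach ($name in ($HeaderLine -split [regex]::Escape($Delimiter))) {
        [void]$table.Columns.Add($name.Trim())
    }
    # The leading comma stops PowerShell from unrolling the DataTable into rows
    return ,$table
}

# Map each DataTable column to the destination column of the same name.
# $BulkCopy is expected to be a System.Data.SqlClient.SqlBulkCopy instance.
function Add-NameMappings {
    param($BulkCopy, [System.Data.DataTable]$Table)
    foreach ($column in $Table.Columns) {
        [void]$BulkCopy.ColumnMappings.Add($column.ColumnName, $column.ColumnName)
    }
}

# Made-up header and row, just to show the resulting shape
$datatable = New-CsvDataTable -HeaderLine "FirstName,PhoneNumber"
[void]$datatable.Rows.Add("Ada", "555-0100")
```

Inside BulkCsvImport this would stand in for the `$datatable.Columns.Add()` loop, with `Add-NameMappings $bulkcopy $datatable` called once before the first WriteToServer, so that columns are matched by name rather than by position.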

Using Powershell to Bulk Import Large CSV into SQL Server

I came across a post discussing how to use Powershell to bulk import massive data relatively fast. I have a typical csv file with about 5 million rows formatted in the usual way.
I keep getting the same error messages regardless of whether I import a txt or a csv file. Playing around with the csvdelimiter/firstRowColumnNames settings also created its own issues.
I've spent hours trying to figure out how to get it to work with MY csv files and I keep getting the same error messages no matter what I try. All field names accept Null and they are identical in every way between the table and csv file. I do not have a primary key for the database.
# Database variables
$sqlserver = "SERVERNAMEHERE"
$database = "autos"
$table = "AgedAutos"
# CSV variables
$csvfile = "C:\temp\aged.csv"
$csvdelimiter = "',"
$firstRowColumnNames = $true
################### No need to modify anything below ###################
Write-Host "Script started..."
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()
# 50k worked fastest and kept memory usage to a minimum
$batchsize = 50000
# Build the sqlbulkcopy connection, and set the timeout to infinite
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;"
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$bulkcopy.DestinationTableName = $table
$bulkcopy.bulkcopyTimeout = 0
$bulkcopy.batchsize = $batchsize
# Create the datatable, and autogenerate the columns.
$datatable = New-Object System.Data.DataTable
# Open the text file from disk
$reader = New-Object System.IO.StreamReader($csvfile)
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)
if ($firstRowColumnNames -eq $true) { $null = $reader.readLine() }
foreach ($column in $columns) {
    $null = $datatable.Columns.Add()
}
# Read in the data, line by line
while (($line = $reader.ReadLine()) -ne $null) {
    $null = $datatable.Rows.Add($line.Split($csvdelimiter))
    $i++; if (($i % $batchsize) -eq 0) {
        $bulkcopy.WriteToServer($datatable)
        Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())."
        $datatable.Clear()
    }
}
# Add in all the remaining rows since the last clear
if($datatable.Rows.Count -gt 0) {
    $bulkcopy.WriteToServer($datatable)
    $datatable.Clear()
}
# Clean Up
$reader.Close(); $reader.Dispose()
$bulkcopy.Close(); $bulkcopy.Dispose()
$datatable.Dispose()
Write-Host "Script complete. $i rows have been inserted into the database."
Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())"
# Sometimes the Garbage Collector takes too long to clear the huge datatable.
[System.GC]::Collect()
Error message listed below.
Exception calling "WriteToServer" with "1" argument(s): "The given value of type String from the data source cannot be converted to
type date of the specified target column."
At C:\powershell_scripts\batch_csv_import-code1-working-test for auto table.ps1:43 char:3
+ $bulkcopy.WriteToServer($datatable)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : InvalidOperationException
340000 rows have been inserted in 00:00:03.5156162
I have no idea what that error means since I cannot find anything useful on Google. I'm thinking one of the columns might be listed incorrectly in SQL Server, but I could be wrong.
Please help me figure out the problem. Thanks.
You are getting all the data in the first column because your value for $csvdelimiter is incorrect.
you have: $csvdelimiter = "',"
it should be: $csvdelimiter = ","
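A quick way to check a delimiter against one sample line before running the whole import is sketched below. The sample row is made up, and the -split operator is used here in place of the script's .Split() call because, once escaped, it always treats the delimiter as a single literal pattern.

```powershell
$sampleLine = "2015-01-31,Honda,Civic"   # made-up sample row

# -split takes a regular expression, so each delimiter is escaped to match literally
$wrong = @($sampleLine -split [regex]::Escape("',"))
$right = @($sampleLine -split [regex]::Escape(","))

Write-Host ("Two-character delimiter: {0} field(s)" -f $wrong.Count)
Write-Host ("Comma delimiter:         {0} field(s)" -f $right.Count)
```

If the field count does not equal the number of columns in the destination table, the delimiter (or the file) is the first thing to look at.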

Specify insert into list with different sql source and dest connection in Powershell

I have this powershell script that would work if my DEST table ONLY had the columns listed in the select from my SOURCE server, but the DEST table has more. I haven't been able to find anything that gives examples on how to specify the columns from my dest table I want to insert into. Note that the SourceServer and DestServer are not linked servers.
Param (
    #[parameter(Mandatory = $true)]
    [string] $SrcServer = "SourceServer",
    #[parameter(Mandatory = $true)]
    [string] $SrcDatabase = "SourceDb",
    #[parameter(Mandatory = $true)]
    [string] $SrcTable = "stage.InternalNotes",
    #[parameter(Mandatory = $true)]
    [string] $DestServer = "DestServer",
    #[parameter(Mandatory = $true)]
    [string] $DestDatabase = "DestDb",
    #[parameter(Mandatory = $true)]
    [string] $DestTable = "dbo.InternalNotes"
)
Function ConnectionString([string] $ServerName, [string] $DbName)
{
    "Data Source=$ServerName;Initial Catalog=$DbName;Integrated Security=True;User ID=$UID;Password=$PWD;"
}
$SrcConnStr = ConnectionString $SrcServer $SrcDatabase
$SrcConn = New-Object System.Data.SqlClient.SQLConnection($SrcConnStr)
$CmdText = "SELECT
    ...
    ,IsReadOnly = 0
FROM
    stage.InternalNotes AS ino"
$SqlCommand = New-Object system.Data.SqlClient.SqlCommand($CmdText, $SrcConn)
$SrcConn.Open()
[System.Data.SqlClient.SqlDataReader] $SqlReader = $SqlCommand.ExecuteReader()
Try
{
    $DestConnStr = ConnectionString $DestServer $DestDatabase
    $bulkCopy = New-Object Data.SqlClient.SqlBulkCopy($DestConnStr, [System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity)
    $bulkCopy.DestinationTableName = $DestTable
    $bulkCopy.WriteToServer($SqlReader)
}
Catch [System.Exception]
{
    $ex = $_.Exception
    Write-Host $ex.Message
}
Finally
{
    Write-Host "Table $SrcTable in $SrcDatabase database on $SrcServer has been copied to table $DestTable in $DestDatabase database on $DestServer"
    $SqlReader.Close()
    $SrcConn.Close()
    $bulkCopy.Close()
}
Essentially, I need to be able to do this:
INSERT INTO dbo.InternalNotes (...) --DEST Server table
SELECT ...
    ,IsReadOnly = 0
FROM stage.InternalNotes AS ino --SOURCE Server table
Edits after getting everything to work based on the accepted answer:
For some reason it didn't like the line:
$bulkCopy = New-Object -TypeName Data.SqlClient.SqlBulkCopy -ArgumentList $DestSqlConnection, [System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity, $DestSqlTransaction;
It gave the error:
Cannot convert argument "1", with value:
"[System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity", for
"SqlBulkCopy" to type "System.Data.SqlClient.SqlBulkCopyOptions":
"Cannot convert value
"[System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity" to type
"System.Data.SqlClient.SqlBulkCopyOptions". Error: "Unable to match
the identifier name
[System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity to a valid
enumerator name. Specify one of the following enumerator names and try
again: Default, KeepIdentity, CheckConstraints, TableLock, KeepNulls,
FireTriggers, UseInternalTransaction,
So Instead I changed it to this, and everything worked:
$bulkCopy = New-Object Data.SqlClient.SqlBulkCopy($DestSqlConnection, [System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity,$DestSqlTransaction)
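A likely explanation, judging by the error text: after -ArgumentList PowerShell parses in argument mode, where a bare [Type]::Member token is read as a literal string rather than evaluated. Wrapping it in parentheses forces evaluation. The sketch below illustrates this with System.DayOfWeek and Write-Output so that it runs without a database; both are stand-ins for the SqlBulkCopy call.

```powershell
# Argument mode: the bare token is passed along as a plain string
$bare = Write-Output [System.DayOfWeek]::Monday

# Parentheses switch to expression mode, so the enum value itself is passed
$wrapped = Write-Output ([System.DayOfWeek]::Monday)

Write-Host "bare:    $($bare.GetType().FullName)"
Write-Host "wrapped: $($wrapped.GetType().FullName)"
```

By the same logic, `-ArgumentList $DestSqlConnection, ([System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity), $DestSqlTransaction` should also be accepted, since the enum is then evaluated before it reaches the constructor.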
To do manual column mapping, you need to populate SqlBulkCopy.ColumnMappings. If you don't specify the mapping, then as far as I know SqlBulkCopy will assume the first column in the select list or DataRow goes into the first ordinal column of the destination table.
For example:
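$bulkCopy.DestinationTableName = $DestTable;
# One call per column: Add(sourceColumn, destinationColumn)
[void]$bulkCopy.ColumnMappings.Add('IsReadOnly', 'IsReadOnly');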
However, there are a number of other issues with your script.
Your connection string authentication section is nonsense:
`Integrated Security=True; User ID=$UID; Password=$PWD;`
Integrated Security=True says, "Use passthrough Windows authentication with currently logged on user." User ID=$UID; Password=$PWD; says, "Use SQL authentication with the specified username and password." You can't do both.
You should specify only one or the other.
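One way the ConnectionString function could be restructured so that it never mixes the two modes is sketched below. New-ConnectionString, $UserName and $Password are illustrative names, and the login values in the usage lines are made up; note that $PWD in the original script is also an automatic PowerShell variable holding the current directory, which is a second reason to avoid it as a password variable.

```powershell
# Build a connection string that uses exactly one authentication mode.
function New-ConnectionString {
    param(
        [string]$ServerName,
        [string]$DbName,
        [string]$UserName,
        [string]$Password
    )
    $base = "Data Source=$ServerName;Initial Catalog=$DbName;"
    if ($UserName) {
        # SQL authentication: explicit user name and password
        return $base + "User ID=$UserName;Password=$Password;"
    }
    # Windows authentication: credentials of the current user
    return $base + "Integrated Security=True;"
}

$windowsAuth = New-ConnectionString -ServerName "SourceServer" -DbName "SourceDb"
$sqlAuth = New-ConnectionString -ServerName "DestServer" -DbName "DestDb" -UserName "loader" -Password "example"
```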
$SqlCommand = New-Object system.Data.SqlClient.SqlCommand($CmdText, $SrcConn)
$bulkCopy = New-Object Data.SqlClient.SqlBulkCopy($DestConnStr, [System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity)
I may be wrong, but I'm pretty sure you're trying to pass two variables as one argument here. Just like with your ConnectionString function, I don't think you want parentheses here. In any case it's syntactically confusing. Do this instead:
$SqlCommand = New-Object -TypeName System.Data.SqlClient.SqlCommand -ArgumentList $CmdText, $SrcConn
# the enum value needs its own parentheses so that it is evaluated as an expression
$bulkCopy = New-Object -TypeName Data.SqlClient.SqlBulkCopy -ArgumentList $DestConnStr, ([System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity)
Speaking of that last one, I have another issue with it. SqlBulkCopy is powerful, but you really have to hold its hand. By default, SqlBulkCopy doesn't run with any transaction benefits. That means that if it errors in the middle, well, too bad, your data has been partially updated. You can enable internal transactions, but then only the most recent batch of the inserts will be rolled back. You really need to manage your own transaction to get an all-or-nothing result.
So you'll end up with something like this:
Try {
    $DestConnStr = ConnectionString $DestServer $DestDatabase
    # We have to open the connection before we can create the transaction
    $DestSqlConnection = New-Object -TypeName System.Data.SqlClient.SqlConnection -ArgumentList $DestConnStr;
    $DestSqlConnection.Open();
    $DestSqlTransaction = $DestSqlConnection.BeginTransaction();
    $bulkCopy = New-Object -TypeName Data.SqlClient.SqlBulkCopy -ArgumentList $DestSqlConnection, ([System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity), $DestSqlTransaction;
    $bulkCopy.DestinationTableName = $DestTable
    Try {
        $bulkCopy.WriteToServer($SqlReader)
        # Commit on success
        $DestSqlTransaction.Commit()
    }
    Catch {
        # Rollback on error
        $DestSqlTransaction.Rollback()
        # Rethrow the error to the outer catch block
        throw ($_);
    }
}
Catch [System.Exception] {
    $ex = $_.Exception
    Write-Host $ex.Message
}
Finally {
    $SqlReader.Close()
    $SrcConn.Close()
    $bulkCopy.Close()
    $DestSqlConnection.Close()
}
I'd probably rewrite the above further because I don't like nested try blocks, but for a quick and dirty rewrite this will work. I don't think you'll run into any distributed transaction problems doing this, but I may be wrong. I tend to use SSIS or linked servers when I need this sort of data pump.
