Npgsql: insert bit datatype using BeginBinaryImport for bulk data insertion

I have been trying to implement a bulk insert operation for a PostgreSQL database using Npgsql version 3.1.2, but I am facing one issue ('insufficient data left in message')
caused by a datatype mismatch for the column paymentdone (bit(1)) in the PostgreSQL table. I have tried the bool, char, and integer datatypes (C#), but they all produce the same error.
Code for bulk data insertion
public void BulkInsert(string connectionString, DataTable dataTable)
{
using (var npgsqlConn = new NpgsqlConnection(connectionString))
{
npgsqlConn.Open();
var commandFormat = string.Format(CultureInfo.InvariantCulture, "COPY {0} {1} FROM STDIN BINARY", "logging.testtable", "(firstName,LastName,LogDateTime,RowStatus,active,id,paymentdone)");
using (var writer = npgsqlConn.BeginBinaryImport(commandFormat))
{
foreach (DataRow item in dataTable.Rows)
{
writer.WriteRow(item.ItemArray);
}
}
npgsqlConn.Close();
}
}
DataTable Function
private static void BulkInsert()
{
DataTable table = new DataTable();
table.Columns.Add("firstName", typeof(String));
table.Columns.Add("LastName", typeof(String));
table.Columns.Add("LogDateTime", typeof(DateTime));
table.Columns.Add("RowStatus", typeof(int));
table.Columns.Add("active", typeof(bool));
table.Columns.Add("id", typeof(long));
table.Columns.Add("paymentdone", typeof(bool));
var dataRow = table.NewRow();
dataRow[0] = "Test";
dataRow[1] = "Temp";
dataRow[2] = DateTime.Now;
dataRow[3] = 1;
dataRow[4] = true;
dataRow[5] = 10;
dataRow[6] = true;
table.Rows.Add(dataRow);
BulkInsert(ConfigurationManager.ConnectionStrings["StoreEntities"].ConnectionString, table);
}

This is probably happening because when Npgsql sees a boolean, its default is to send a PostgreSQL boolean and not a BIT(1). When using binary COPY, you must write exactly the types PostgreSQL expects.
One solution is probably to use .NET BitArray instead of boolean. Npgsql will infer PostgreSQL BIT() from that type and everything should work.
But a safer solution is simply to call StartRow() and then to use the overload of Write() which accepts an NpgsqlDbType. This allows you to unambiguously specify which PostgreSQL type you want to send.
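A minimal sketch of the explicit-type approach, reusing the table and column list from the question. The class name, the method name BulkInsertExplicit, the ToBit helper, and the PostgreSQL column types chosen for the first six columns are all assumptions, since the question only states that paymentdone is bit(1). It sends the bit value as a one-element BitArray together with NpgsqlDbType.Bit.

```csharp
using System;
using System.Collections;
using System.Data;
using Npgsql;
using NpgsqlTypes;

public static class BitBulkInsertSketch
{
    // Hypothetical helper: wraps a .NET bool in a one-bit BitArray for a bit(1) column.
    public static BitArray ToBit(bool value)
    {
        return new BitArray(new[] { value });
    }

    public static void BulkInsertExplicit(string connectionString, DataTable dataTable)
    {
        const string copyCommand =
            "COPY logging.testtable (firstName,LastName,LogDateTime,RowStatus,active,id,paymentdone) FROM STDIN BINARY";

        using (var conn = new NpgsqlConnection(connectionString))
        {
            conn.Open();
            using (var writer = conn.BeginBinaryImport(copyCommand))
            {
                foreach (DataRow row in dataTable.Rows)
                {
                    writer.StartRow();
                    // The NpgsqlDbType values below are assumptions about the table definition.
                    writer.Write((string)row["firstName"], NpgsqlDbType.Text);
                    writer.Write((string)row["LastName"], NpgsqlDbType.Text);
                    writer.Write((DateTime)row["LogDateTime"], NpgsqlDbType.Timestamp);
                    writer.Write((int)row["RowStatus"], NpgsqlDbType.Integer);
                    writer.Write((bool)row["active"], NpgsqlDbType.Boolean);
                    writer.Write((long)row["id"], NpgsqlDbType.Bigint);
                    // paymentdone is bit(1): send a bit string, not a boolean.
                    writer.Write(ToBit((bool)row["paymentdone"]), NpgsqlDbType.Bit);
                }
                // On Npgsql 4.0 and later, call writer.Complete() here;
                // in 3.x, disposing the writer finishes the import.
            }
        }
    }
}
```

The sketch assumes no cell is null; a null cell would need writer.WriteNull() instead of a typed Write call.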

Related

best solution for multiple insert update solution

I am struggling to understand C# and Npgsql as a beginner. I am following code examples like this one:
// Insert some data
using (var cmd = new NpgsqlCommand())
{ cmd.Connection = conn;
cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p)";
cmd.Parameters.AddWithValue("p", "Hello world");
cmd.ExecuteNonQuery();
}
The syntax for more than one insert & update statement like this is clear so far:
cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p);INSERT INTO data1...;INSERT into data2... and so on";
But what is the right solution for a loop that executes one statement per iteration?
This does not work:
// Insert some data
using (var cmd = new NpgsqlCommand())
{
foreach (var s in SomeStringCollectionOrWhatever)
{
cmd.Connection = conn;
cmd.CommandText = "INSERT INTO data (some_field) VALUES (@p)";
cmd.Parameters.AddWithValue("p", s);
cmd.ExecuteNonQuery();
}
}
It seems the values will be "concatenated" or remembered. I cannot see any possibility to "clear" the existing cmd-object.
My second solution would be to wrap the whole "using" block into the loop. But every cycle would create a new object. That seems ugly to me.
So what is the best solution for my problem?
To insert lots of rows efficiently, take a look at Npgsql's bulk copy feature - the API is more suitable (and more efficient) for inserting large numbers of rows than concatenating INSERT statements into a batch like you're trying to do.
If you want to rerun the same SQL with changing parameter values, you can do the following:
using (var cmd = new NpgsqlCommand("INSERT INTO data (some_field) VALUES (@p)", conn))
{
var p = new NpgsqlParameter("p", DbType.String); // Adjust DbType according to type
cmd.Parameters.Add(p);
cmd.Prepare(); // This is optional but will optimize the statement for repeated use
foreach(var s in SomeStringCollectionOrWhatever)
{
p.Value = s;
cmd.ExecuteNonQuery();
}
}
If you need to insert lots of rows and performance is key, then I would recommend Npgsql's bulk copy capability, as @Shay mentioned. But if you are looking for a quick way to do this without bulk copy, I would recommend using Dapper.
Consider the example below.
Let's say you have a class called Event and a list of events to add.
List<Event> eventsToInsert = new List<Event>
{
new Event() { EventId = 1, EventName = "Bday1" },
new Event() { EventId = 2, EventName = "Bday2" },
new Event() { EventId = 3, EventName = "Bday3" }
};
The snippet that would add the list to the DB is shown below.
var sqlInsert = "Insert into events( eventid, eventname ) values (@EventId, @EventName)";
using (IDbConnection conn = new NpgsqlConnection(cs))
{
conn.Open();
// Execute is an extension method supplied by Dapper
// This code will add all the entries in the eventsToInsert List and match up the values based on property name. Only caveat is that the property names of the POCO should match the placeholder names in the SQL Statement.
conn.Execute(sqlInsert, eventsToInsert);
// If we want to retrieve the data back into the list
List<Event> eventsAdded;
// This Dapper extension returns an IEnumerable, so I convert it to a List.
eventsAdded = conn.Query<Event>("Select * from events").ToList();
foreach( var row in eventsAdded)
{
Console.WriteLine($"{row.EventId} {row.EventName} was added");
}
}
-HTH

SQL Server 2008 changed table name bizarre behavior

I renamed one of my tables, then entered some data and queried it through a view. To my surprise, the data did not show up. I tried renaming the table back to its original name, with no luck; the same thing happened.
Finally, I tried retyping the data in one of the columns and then ran the view, and the data showed up. The problem is that I now need to re-enter the data in that column every time a row is inserted, which is obviously not workable.
Here is the code I used to add the data:
tblcsv.Columns.AddRange(new DataColumn[7] { new DataColumn("unit_name", typeof(string)), new DataColumn("unit", typeof(string)), new DataColumn("adrress", typeof(string)), new DataColumn("latitude", typeof(string))
,new DataColumn("longitude" , typeof(string)) , new DataColumn("region" , typeof(string)) , new DataColumn("linkid" , typeof(string))});
string ReadCSV = File.ReadAllText(forex);
foreach (string csvRow in ReadCSV.Split('\n'))
{
if (!string.IsNullOrEmpty(csvRow))
{
//Adding each row into datatable
tblcsv.Rows.Add();
int count = 0;
foreach (string FileRec in csvRow.Split(','))
{
tblcsv.Rows[tblcsv.Rows.Count - 1][count] = FileRec;
if (count == 5)
{
tblcsv.Rows[tblcsv.Rows.Count - 1][6] = link;
}
count++;
}
}
}
string consString = ConfigurationManager.ConnectionStrings["diposlConnectionString"].ConnectionString;
using (SqlConnection con = new SqlConnection(consString))
{
using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(con))
{
//Set the database table name
sqlBulkCopy.DestinationTableName = "dbo.FRIENDLY_FORCES";
//[OPTIONAL]: Map the Excel columns with that of the database table
sqlBulkCopy.ColumnMappings.Add("unit_name", "unit_name");
sqlBulkCopy.ColumnMappings.Add("unit", "unit");
sqlBulkCopy.ColumnMappings.Add("adrress", "adrress");
sqlBulkCopy.ColumnMappings.Add("latitude", "latitude");
sqlBulkCopy.ColumnMappings.Add("longitude", "longitude");
sqlBulkCopy.ColumnMappings.Add("region", "region");
sqlBulkCopy.ColumnMappings.Add("linkid", "linkid");
con.Open();
sqlBulkCopy.WriteToServer(tblcsv);
con.Close();
}
}
The region column is the one where I manually edited the data.
Did renaming the table do something to my data?
Or am I just missing something?
Thank you

Pass Dictionary<string,int> to Stored Procedure T-SQL

I have an MVC application. In an action I have a Dictionary<string,int>. The key is the ID and the value is the sortOrderNumber. I want to create a stored procedure that takes each key (id), finds that record in the database, and sets its orderNumber column to the value from the Dictionary. I want to call the stored procedure once and pass all the data to it, instead of calling it many times to update the data.
Do you have any ideas?
Thanks!
The accepted answer of using a TVP is generally correct, but needs some clarification based on the amount of data being passed in. Using a DataTable is fine (not to mention quick and easy) for smaller sets of data, but for larger sets it does not scale given that it duplicates the dataset by placing it in the DataTable simply for the means of passing it to SQL Server. So, for larger sets of data there is an option to stream the contents of any custom collection. The only real requirement is that you need to define the structure in terms of SqlDb types and iterate through the collection, both of which are fairly trivial steps.
A simplistic overview of the minimal structure is shown below, which is an adaptation of the answer I posted on How can I insert 10 million records in the shortest time possible?, which deals with importing data from a file and is hence slightly different as the data is not currently in memory. As you can see from the code below, this setup is not overly complicated yet highly flexible as well as efficient and scalable.
SQL object # 1: Define the structure
-- First: You need a User-Defined Table Type
CREATE TYPE dbo.IDsAndOrderNumbers AS TABLE
(
ID NVARCHAR(4000) NOT NULL,
SortOrderNumber INT NOT NULL
);
GO
SQL object # 2: Use the structure
-- Second: Use the UDTT as an input param to an import proc.
-- Hence "Table-Valued Parameter" (TVP)
CREATE PROCEDURE dbo.ImportData (
@ImportTable dbo.IDsAndOrderNumbers READONLY
)
AS
SET NOCOUNT ON;
-- maybe clear out the table first?
TRUNCATE TABLE SchemaName.TableName;
INSERT INTO SchemaName.TableName (ID, SortOrderNumber)
SELECT tmp.ID,
tmp.SortOrderNumber
FROM @ImportTable tmp;
-- OR --
-- some other T-SQL
-- optional return data (@NumUpdates and @NumInserts must be declared and set by your own logic)
SELECT @NumUpdates AS [RowsUpdated],
@NumInserts AS [RowsInserted];
GO
C# code, Part 1: Define the iterator/sender
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;
private static IEnumerable<SqlDataRecord> SendRows(Dictionary<string,int> RowData)
{
    SqlMetaData[] _TvpSchema = new SqlMetaData[] {
        new SqlMetaData("ID", SqlDbType.NVarChar, 4000),
        new SqlMetaData("SortOrderNumber", SqlDbType.Int)
    };
    SqlDataRecord _DataRecord = new SqlDataRecord(_TvpSchema);
    // read a row, send a row
    foreach (KeyValuePair<string,int> _CurrentRow in RowData)
    {
        // You shouldn't need to call "_DataRecord = new SqlDataRecord" as
        // SQL Server already received the row when "yield return" was called.
        // Unlike BCP and BULK INSERT, you have the option here to create an
        // object, do manipulation(s) / validation(s) on the object, then pass
        // the object to the DB or discard via "continue" if invalid.
        _DataRecord.SetString(0, _CurrentRow.Key);   // Key holds the ID
        _DataRecord.SetInt32(1, _CurrentRow.Value);  // Value holds the sortOrderNumber
        yield return _DataRecord;
    }
}
C# code, Part 2: Use the iterator/sender
public static void LoadData(Dictionary<string,int> MyCollection)
{
    SqlConnection _Connection = new SqlConnection("{connection string}");
    SqlCommand _Command = new SqlCommand("ImportData", _Connection);
    SqlDataReader _Reader = null; // only needed if getting data back from proc call
    SqlParameter _TVParam = new SqlParameter();
    _TVParam.ParameterName = "@ImportTable";
    // _TVParam.TypeName = "IDsAndOrderNumbers"; //optional for CommandType.StoredProcedure
    _TVParam.SqlDbType = SqlDbType.Structured;
    _TVParam.Value = SendRows(MyCollection); // method return value is streamed data
    _Command.Parameters.Add(_TVParam);
    _Command.CommandType = CommandType.StoredProcedure;
    try
    {
        _Connection.Open();
        // Either send the data and move on with life:
        _Command.ExecuteNonQuery();
        // OR, to get data back from a SELECT or OUTPUT clause, use this instead:
        // _Reader = _Command.ExecuteReader();
        // Do something with _Reader: If using INSERT or MERGE in the Stored Proc, use an
        // OUTPUT clause to return INSERTED.[RowNum], INSERTED.[ID] (where [RowNum] is an
        // IDENTITY), then fill a new Dictionary<string, int>(ID, RowNumber) from
        // _Reader.GetString(0) and _Reader.GetInt32(1). Return that instead of void.
    }
    finally
    {
        if (_Reader != null) { _Reader.Dispose(); } // only set if getting data back from proc call
        _Command.Dispose();
        _Connection.Dispose();
    }
}
Using Table Valued parameters is really not that complex.
Given this SQL:
CREATE TYPE MyTableType AS TABLE (ID nvarchar(25), OrderNumber int)
GO
CREATE PROCEDURE MyTableProc (@myTable MyTableType READONLY)
AS
BEGIN
SELECT * FROM @myTable
END
The program below shows how relatively easy it is; for demo purposes it just selects back the values you sent in. I am sure you can easily abstract this away in your case.
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
namespace TVPSample
{
class Program
{
static void Main(string[] args)
{
//setup some data
var dict = new Dictionary<string, int>();
for (int x = 0; x < 10; x++)
{
dict.Add(x.ToString(),x+100);
}
//convert to DataTable
var dt = ConvertToDataTable(dict);
using (SqlConnection conn = new SqlConnection("[Your Connection String here]"))
{
conn.Open();
using (SqlCommand comm = new SqlCommand("MyTableProc",conn))
{
comm.CommandType=CommandType.StoredProcedure;
var param = comm.Parameters.AddWithValue("myTable", dt);
//this is the most important part:
param.SqlDbType = SqlDbType.Structured;
var reader = comm.ExecuteReader(); //or NonQuery, etc.
while (reader.Read())
{
Console.WriteLine("{0} {1}", reader["ID"], reader["OrderNumber"]);
}
}
}
}
//I am sure there is a more elegant way of doing this.
private static DataTable ConvertToDataTable(Dictionary<string, int> dict)
{
var dt = new DataTable();
dt.Columns.Add("ID",typeof(string));
dt.Columns.Add("OrderNumber", typeof(Int32));
foreach (var pair in dict)
{
var row = dt.NewRow();
row["ID"] = pair.Key;
row["OrderNumber"] = pair.Value;
dt.Rows.Add(row);
}
return dt;
}
}
}
Produces
0 100
1 101
2 102
3 103
4 104
5 105
6 106
7 107
8 108
9 109
Stored procedures do not support arrays as inputs. Googling gives a couple of hacks using XML or comma separated strings, but those are hacks.
A more SQLish way to do this is to create a temporary table (named e.g. #Orders) and insert all the data into that one. Then you can call the SP using the same open SQL connection, and inside the SP use the #Orders table to read the values.
Another solution is to use Table-Valued Parameters but that requires some more SQL to setup so I think it is probably easier to use the temp table approach.
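The temp table approach could look roughly like the sketch below. The procedure name dbo.UpdateSortOrders, the #Orders column definitions, and the class and method names are hypothetical; the procedure is assumed to read from #Orders itself, which it can do because it runs on the same open connection.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class TempTableSketch
{
    public static void SaveSortOrders(string connectionString, Dictionary<string, int> sortOrders)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Created without parameters so it is sent as a plain batch and the
            // temp table stays alive for the rest of this connection's session.
            using (var create = new SqlCommand(
                "CREATE TABLE #Orders (ID nvarchar(25) NOT NULL, SortOrderNumber int NOT NULL);", conn))
            {
                create.ExecuteNonQuery();
            }

            // Fill the temp table, reusing one command and two parameters.
            using (var insert = new SqlCommand(
                "INSERT INTO #Orders (ID, SortOrderNumber) VALUES (@id, @sortOrder);", conn))
            {
                var id = insert.Parameters.Add("@id", SqlDbType.NVarChar, 25);
                var sortOrder = insert.Parameters.Add("@sortOrder", SqlDbType.Int);
                foreach (var pair in sortOrders)
                {
                    id.Value = pair.Key;
                    sortOrder.Value = pair.Value;
                    insert.ExecuteNonQuery();
                }
            }

            // Hypothetical procedure that joins #Orders to the real table and updates it.
            using (var call = new SqlCommand("dbo.UpdateSortOrders", conn))
            {
                call.CommandType = CommandType.StoredProcedure;
                call.ExecuteNonQuery();
            }
        }
    }
}
```

The stored procedure is called only once, but filling #Orders this way still costs one round trip per row, so for large dictionaries SqlBulkCopy into #Orders or a table-valued parameter is likely the better fit.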

SqlServer Converting XML to varbinary and parsing it in .NET (C#)

Consider the following code:
[Test]
public void StackOverflowQuestionTest()
{
const string connectionString = "enter your connection string if you wanna test this code";
byte[] result = null;
using (var connection = new SqlConnection(connectionString))
{
connection.Open();
using (var sqlCommand = new SqlCommand("declare @xml as xml = '<xml/>' SELECT convert(varbinary(max), @xml) as value"))
//using (var sqlCommand = new SqlCommand("SELECT convert(varbinary(max), N'<xml/>') as value"))
{
sqlCommand.Connection = connection;
using (SqlDataReader reader = sqlCommand.ExecuteReader())
{
while (reader.Read())
{
result = (byte[])reader["value"];
}
reader.Close();
}
}
}
string decodedString = new UnicodeEncoding(false, true).GetString(result);
var document = XElement.Parse(decodedString);
}
If I run this test I get an XmlException with the message "Data at the root level is invalid. Line 1, position 1." As it turns out, the problem is the "0xFFFE" preamble, which is treated as an invalid character.
Note that if I use the commented-out command instead, everything works fine, which seems strange to me. It looks like SQL Server stores XML strings in UCS-2 with a BOM, while it stores nvarchar values without one.
The main question is: how can I decode this byte array into a string that does not contain this preamble (BOM)?
In case anyone will need this in future, the following code works:
using(var ms = new MemoryStream(result))
{
using (var sr = new StreamReader(ms, Encoding.Unicode, true))
{
decodedString = sr.ReadToEnd();
}
}
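The difference between the two decoding paths can be reproduced without a database. The sketch below builds the byte array by hand (a UTF-16LE BOM followed by the text) as a stand-in for the query result; the class name and the DecodeUtf16 helper are hypothetical. It shows that Encoding.GetString keeps the BOM as a leading U+FEFF character, while a StreamReader consumes it.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class BomDecodingSketch
{
    // Decodes UTF-16 bytes through a StreamReader, which skips a leading byte order mark.
    public static string DecodeUtf16(byte[] bytes)
    {
        using (var ms = new MemoryStream(bytes))
        using (var sr = new StreamReader(ms, Encoding.Unicode, true))
        {
            return sr.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Simulated query result: UTF-16LE BOM (FF FE) followed by "<xml/>".
        byte[] result = Encoding.Unicode.GetPreamble()
            .Concat(Encoding.Unicode.GetBytes("<xml/>"))
            .ToArray();

        // GetString does not strip the BOM, so the string starts with U+FEFF.
        string raw = new UnicodeEncoding(false, true).GetString(result);
        Console.WriteLine(raw[0] == '\uFEFF'); // True

        // The StreamReader path yields clean text that XElement.Parse accepts.
        string decoded = DecodeUtf16(result);
        XElement document = XElement.Parse(decoded);
        Console.WriteLine(document.Name); // xml
    }
}
```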

Is it possible to use `SqlDbType.Structured` to pass Table-Valued Parameters in NHibernate?

I want to pass a collection of ids to a stored procedure that will be mapped using NHibernate. This technique was introduced in SQL Server 2008 (see Table-Valued Parameters). I just don't want to pass multiple ids within an nvarchar parameter and then split its value on the SQL Server side.
My first, ad hoc, idea was to implement my own IType.
public class Sql2008Structured : IType {
private static readonly SqlType[] x = new[] { new SqlType(DbType.Object) };
public SqlType[] SqlTypes(NHibernate.Engine.IMapping mapping) {
return x;
}
public bool IsCollectionType {
get { return true; }
}
public int GetColumnSpan(NHibernate.Engine.IMapping mapping) {
return 1;
}
public void NullSafeSet(DbCommand st, object value, int index, NHibernate.Engine.ISessionImplementor session) {
var s = st as SqlCommand;
if (s != null) {
s.Parameters[index].SqlDbType = SqlDbType.Structured;
s.Parameters[index].TypeName = "IntTable";
s.Parameters[index].Value = value;
}
else {
throw new NotImplementedException();
}
}
#region IType Members...
#region ICacheAssembler Members...
}
No more methods are implemented; a throw new NotImplementedException(); is in all the rest. Next, I created a simple extension for IQuery.
public static class StructuredExtensions {
private static readonly Sql2008Structured structured = new Sql2008Structured();
public static IQuery SetStructured(this IQuery query, string name, DataTable dt) {
return query.SetParameter(name, dt, structured);
}
}
Typical usage for me is
DataTable dt = ...;
ISession s = ...;
var l = s.CreateSQLQuery("EXEC some_sp @id = :id, @par1 = :par1")
.SetStructured("id", dt)
.SetParameter("par1", ...)
.SetResultTransformer(Transformers.AliasToBean<SomeEntity>())
.List<SomeEntity>();
Ok, but what is an "IntTable"? It's the name of SQL type created to pass table value arguments.
CREATE TYPE IntTable AS TABLE
(
ID INT
);
And some_sp could be like
CREATE PROCEDURE some_sp
@id IntTable READONLY,
@par1 ...
AS
BEGIN
...
END
It only works with SQL Server 2008 or later, of course, and in this particular implementation with a single-column DataTable.
var dt = new DataTable();
dt.Columns.Add("ID", typeof(int));
It's POC only, not a complete solution, but it works and might be useful when customized. If someone knows a better/shorter solution let us know.
A simpler solution than the accepted answer would be to use ADO.NET. NHibernate allows users to enlist IDbCommands into NHibernate transactions.
DataTable myIntsDataTable = new DataTable();
myIntsDataTable.Columns.Add("ID", typeof(int));
// ... Add rows to DataTable
ISession session = sessionFactory.GetSession();
using(ITransaction transaction = session.BeginTransaction())
{
IDbCommand command = new SqlCommand("StoredProcedureName");
command.Connection = session.Connection;
command.CommandType = CommandType.StoredProcedure;
var parameter = new SqlParameter();
parameter.ParameterName = "IntTable";
parameter.SqlDbType = SqlDbType.Structured;
parameter.Value = myIntsDataTable;
command.Parameters.Add(parameter);
session.Transaction.Enlist(command);
command.ExecuteNonQuery();
}
For my case, my stored procedure needs to be called in the middle of an open transaction.
If there is an open transaction, this code works because it is automatically reusing the existing transaction of the NHibernate session:
NHibernateSession.GetNamedQuery("SaveStoredProc")
.SetInt64("spData", 500)
.ExecuteUpdate();
However, for my new Stored Procedure, the parameter is not as simple as an Int64. It's a table-valued-parameter (User Defined Table Type)
My problem is that I cannot find the proper Set function.
I tried SetParameter("spData", tvpObj), but it's returning this error:
Could not determine a type for class: …
Anyways, after some trial and error, this approach below seems to work.
The Enlist() function is the key in this approach. It basically tells the SqlCommand to use the existing transaction. Without it, there will be an error saying
ExecuteNonQuery requires the command to have a transaction when the
connection assigned to the command is in a pending local transaction…
using (SqlCommand cmd = NHibernateSession.Connection.CreateCommand() as SqlCommand)
{
cmd.CommandText = "MyStoredProc";
NHibernateSession.Transaction.Enlist(cmd); // Because there is a pending transaction
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(new SqlParameter("@wiData", SqlDbType.Structured) { Value = wiSnSqlList });
int affected = cmd.ExecuteNonQuery();
}
Since I am using the SqlParameter class with this approach, SqlDbType.Structured is available.
This is the function that produces wiSnSqlList:
private IEnumerable<SqlDataRecord> TransformWiSnListToSql(IList<SHWorkInstructionSnapshot> wiSnList)
{
if (wiSnList == null)
{
yield break;
}
var schema = new[]
{
new SqlMetaData("OriginalId", SqlDbType.BigInt), //0
new SqlMetaData("ReportId", SqlDbType.BigInt), //1
new SqlMetaData("Description", SqlDbType.NVarChar, SqlMetaData.Max), //2 (string type to match SetSqlString below; the length is an assumption)
};
SqlDataRecord row = new SqlDataRecord(schema);
foreach (var wi in wiSnList)
{
row.SetSqlInt64(0, wi.OriginalId);
row.SetSqlInt64(1, wi.ShiftHandoverReportId);
if (wi.Description == null)
{
row.SetDBNull(2);
}
else
{
row.SetSqlString(2, wi.Description);
}
yield return row;
}
}
You can pass collections of values without the hassle.
Example:
var ids = new[] {1, 2, 3};
var query = session.CreateQuery("from Foo where id in (:ids)");
query.SetParameterList("ids", ids);
NHibernate will create a parameter for each element.
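To illustrate what "a parameter for each element" means, here is a hand-written sketch of the same expansion. This is not NHibernate's code, and the placeholder names and the BuildPlaceholders helper are made up for the example; it only shows a list of three ids turning into three separate placeholders.

```csharp
using System;
using System.Linq;

public static class InClauseSketch
{
    // Builds "@p0, @p1, ..." with one placeholder per element.
    public static string BuildPlaceholders(int count)
    {
        return string.Join(", ", Enumerable.Range(0, count).Select(i => "@p" + i));
    }

    public static void Main()
    {
        var ids = new[] { 1, 2, 3 };
        string sql = "SELECT * FROM Foo WHERE id IN (" + BuildPlaceholders(ids.Length) + ")";
        Console.WriteLine(sql); // SELECT * FROM Foo WHERE id IN (@p0, @p1, @p2)
    }
}
```

Because every element becomes its own parameter, very long lists can run into the server's limit on parameters per command (2100 for SQL Server), which is one reason to prefer a table-valued parameter for large collections.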

Resources