Update uses previously autogenerated ID on Oracle

We're having a strange problem in Oracle. I'll sketch some (simplified) context first:
Consider this mapping to an Entity:
public EntityMap()
{
    Table("EntityTable");
    Id(x => x.Id)
        .Column("entityID")
        .GeneratedBy.Native("ENTITYID").UnsavedValue(0);
    Map(x => x.SomeBoolean).Column("SomeBoolean");
}
and this code:
var entity = new Entity();
using (var transaction = new TransactionScope(TransactionScopeOption.Required))
{
    Session.Save(entity);
    transaction.Complete();
}

// A lot of code

if (someCondition)
{
    using (var transaction = new TransactionScope(TransactionScopeOption.Required))
    {
        entity.SomeBoolean = true;
        Session.Update(entity);
        transaction.Complete();
    }
}
This code is called a few times. The first time it generates the following queries:
select ENTITYID.nextval from dual
INSERT INTO Entity
(SomeBoolean, EntityID)
VALUES (0, 1216)
UPDATE Entity
SET SomeBoolean = 1
WHERE EntityID = 1216
The second time it is called, only these queries are generated (someCondition is false):
select ENTITYID.nextval from dual
INSERT INTO Entity
(SomeBoolean, EntityID)
VALUES (0, 1217)
And now the trouble begins: from now on, each insert uses the correct autoincremented value, but the update always uses 1217.
select ENTITYID.nextval from dual
INSERT INTO Entity
(SomeBoolean, EntityID)
VALUES (0, 1218)
UPDATE Entity
SET SomeBoolean = 1
WHERE EntityID = 1217
And of course, this is not what we want to happen. If I inspect the value of the Id while debugging, it contains the correct autoincremented value. Somehow, deep in the bowels of NHibernate, the incorrect id is assigned to the WHERE clause...
The strange part is that this only happens on Oracle. If I switch NHibernate to MsSql, everything works like a charm.

So I found out what happened. NHibernate changed its default connection release mode between versions 1.x and 2.x. Instead of closing the connection when the session is disposed, the connection is now closed after each transaction. However, we were manually coordinating our transactions, which apparently caused trouble in Oracle.
This question has some extra information, and this entry in the NHibernate documentation also clarifies how the connections are handled:
As of NHibernate, if your application manages transactions through .NET APIs such as System.Transactions library, ConnectionReleaseMode.AfterTransaction may cause NHibernate to open and close several connections during one transaction, leading to unnecessary overhead and transaction promotion from local to distributed. Specifying ConnectionReleaseMode.OnClose will revert to the legacy behavior and prevent this problem from occurring.
This blog post is what got me looking in the right direction.
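For reference, a minimal sketch of switching back to the legacy release mode through the connection.release_mode property (the surrounding bootstrap code is an assumption; only the property name and value come from the documentation):
using NHibernate.Cfg;

// Sketch: release connections only when the session is closed, instead of
// after each transaction (the 2.x default). Assumes the rest of the
// session-factory bootstrap (mappings, dialect, etc.) is configured elsewhere.
var cfg = new Configuration().Configure();

// Environment.ReleaseConnections is the constant for "connection.release_mode".
cfg.SetProperty(NHibernate.Cfg.Environment.ReleaseConnections, "on_close");

var sessionFactory = cfg.BuildSessionFactory();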

Related

Entity Framework Insert Into Table With AFTER INSERT Trigger

I am working on a Web API with Entity Framework 6 that does a "bulk" insert of under 500 records at any given time to a Microsoft SQL Server table. The DbContext.SaveChanges() method will insert all the records into a table in a couple of seconds, so I have no issues with that. However, when the method is called to insert the same number of records into the same table with a semi-extensive trigger attached to it, the process can take many minutes. The trigger joins against other tables, inserts into other tables, and then deletes the newly inserted record.
I do not have much control over the table or the trigger, so I am looking for suggestions on how to improve performance. I suggested moving the trigger logic into a stored procedure and having the trigger call that stored procedure, but I am uncertain whether that will achieve any gains.
EDIT: I understand my question was kind of generic, so I will post some of my code in case it helps. The SQL is not mine, so I will see what I can actually post.
Here is the part of my Web API method that does the call to SaveChanges():
string[] stringArray = results[0].Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
var profileObjs = db.Set<T_ProfileStaging>();
foreach (var s in stringArray)
{
    string[] columns = s.Split(new[] { ",", "\t" }, StringSplitOptions.None);

    // Nine columns are read below (indexes 0-8), so the length check must match.
    if (columns.Length == 9)
    {
        T_ProfileStaging profileObj = new T_ProfileStaging();
        profileObj.CompanyCode = columns[0];
        profileObj.SubmittedBy = columns[1];
        profileObj.VersionName = columns[2];
        profileObj.DMName = columns[3];
        profileObj.Zone = columns[4];
        profileObj.DMCode = columns[5];
        profileObj.ProfileName = columns[6];
        profileObj.Advertiser = columns[7];
        profileObj.OriginalInsertDate = columns[8];
        profileObjs.Add(profileObj);
    }
}

try
{
    db.SaveChanges();
    return Ok();
}
catch (Exception e)
{
    return Content(HttpStatusCode.BadRequest, "SQL Server Insert Exception");
}
When you load with SaveChanges(), EF sends each row in a separate INSERT statement, so a statement-level trigger runs once per inserted row.
To work around this you need to either:
use a bulk load API from the client instead of EF's SaveChanges() (SqlBulkCopy directly, or one of the many EF extensions that wrap it), or
configure EF to insert into a different (staging) table and then INSERT ... SELECT into the target table, so the trigger fires only once per batch.
A sketch of the SqlBulkCopy option follows.
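A minimal sketch of the SqlBulkCopy route, assuming the rows from the question have already been parsed into a List<T_ProfileStaging> named rows (the table name dbo.T_ProfileStaging and the connectionString variable are assumptions):
using System.Data;
using Microsoft.Data.SqlClient; // System.Data.SqlClient on .NET Framework

// Sketch: stream the parsed rows with SqlBulkCopy instead of EF's SaveChanges().
var table = new DataTable();
table.Columns.Add("CompanyCode", typeof(string));
table.Columns.Add("SubmittedBy", typeof(string));
table.Columns.Add("ProfileName", typeof(string));
// ...add the remaining columns the same way

foreach (var row in rows)
{
    table.Rows.Add(row.CompanyCode, row.SubmittedBy, row.ProfileName /* , ... */);
}

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.T_ProfileStaging"; // assumed table name
        foreach (DataColumn column in table.Columns)
        {
            bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
        }
        bulkCopy.WriteToServer(table);
    }
}
Note that SqlBulkCopy does not fire triggers unless you pass SqlBulkCopyOptions.FireTriggers; if you do, a statement-level trigger runs once per batch rather than once per inserted row.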

Audit trail with Entity Framework Core

I have an ASP.NET Core 2.0 app using Entity Framework Core on a SQL Server database.
I have to trace and audit everything the users do to the data. My goal is to have an automatic mechanism that records everything that happens.
For example, if I have the table Animals, I want a parallel table "Audit_animals" where you can find all the info about the data, the operation type (add, delete, edit) and the user who made the change.
I already built this some time ago in Django + MySQL, but now the environment is different. I found this and it seems interesting, but I'd like to know if there are better ways and which is the best approach to do this in EF Core.
UPDATE
I'm trying this and something happens, but I have some problems.
I added this:
services.AddMvc().AddJsonOptions(options => {
    options.SerializerSettings.ReferenceLoopHandling = ReferenceLoopHandling.Ignore;
});

public Mydb_Context(DbContextOptions<Mydb_Context> options) : base(options)
{
    Audit.EntityFramework.Configuration.Setup()
        .ForContext<Mydb_Context>(config => config
            .IncludeEntityObjects()
            .AuditEventType("Mydb_Context:Mydb"))
        .UseOptOut();
}

public MyRepository(Mydb_Context context)
{
    _context = context;
    _context.AddAuditCustomField("UserName", "pippo");
}
I also created a table to insert the audits (only one, to test this tool), but the only thing I got is what you see in the image: a list of .json files with the data I created... why?
Read the documentation:
Event Output
To configure the output persistence mechanism please see Configuration and Data Providers sections.
Then, in the documentation on Configuration:
If you don't specify a Data Provider, a default FileDataProvider will be used to write the events as .json files into the current working directory. (emphasis mine)
Long and short, follow the documentation to configure the data provider you'd like to use.
If you are going to map the audit table (Audit_Animals) to the same EF context as the audited Animals table, you can use the EntityFramework Data Provider included on the same Audit.EntityFramework library.
Check the documentation here:
Entity Framework Data Provider
If you plan to store the audit logs in
the same database as the audited entities, you can use the
EntityFrameworkDataProvider. Use this if you plan to store the audit
trails for each entity type in a table with similar structure.
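As a rough illustration of that provider (not a drop-in configuration; the fluent methods follow the library's documented setup, and the "Audit_" prefix mapping is an assumption matching the question):
// Sketch: store audit events in the same database through the EF data provider.
// Assumes the Audit.EntityFramework package is referenced and that audit tables
// follow the "Audit_" + entity-name convention used in the question.
Audit.Core.Configuration.Setup()
    .UseEntityFramework(ef => ef
        .AuditTypeNameMapper(typeName => "Audit_" + typeName)
        .AuditEntityAction((auditEvent, entry, auditEntity) =>
        {
            // Copy whatever metadata you need onto the audit entity here,
            // e.g. the action (Insert/Update/Delete) and a timestamp.
            ((dynamic)auditEntity).AuditAction = entry.Action;
            ((dynamic)auditEntity).AuditDate = DateTime.UtcNow;
        }));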
There is another library that can audit EF contexts in a similar way, take a look: zzzprojects/EntityFramework-Plus.
Cannot recommend one over the other since they provide different features (and I'm the owner of the audit.net library).
Update:
.NET 6 and Entity Framework Core 6.0 support SQL Server temporal tables out of the box.
See this answer for examples:
https://stackoverflow.com/a/70017768/3850405
Original:
You could have a look at Temporal tables (system-versioned temporal tables) if you are using SQL Server 2016 or later, or Azure SQL.
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15
From documentation:
Database feature that brings built-in support for providing
information about data stored in the table at any point in time rather
than only the data that is correct at the current moment in time.
Temporal is a database feature that was introduced in ANSI SQL 2011.
There is currently an open issue to support this out of the box:
https://github.com/dotnet/efcore/issues/4693
There are third-party options available today, but since they are not from Microsoft there is of course a risk that they won't be supported in future versions.
https://github.com/Adam-Langley/efcore-temporal-query
https://github.com/findulov/EntityFrameworkCore.TemporalTables
I solved it like this:
If you use the included Visual Studio 2019 LocalDB (Microsoft SQL Server 2016 (13.1.4001.0) LocalDB) you will need to upgrade if you use cascading DELETE or UPDATE. This is because temporal tables with cascading actions are not supported in that version.
Complete guide for upgrading here:
https://stackoverflow.com/a/64210519/3850405
Start by adding a new empty migration. I prefer to use Package Manager Console (PMC):
Add-Migration "Temporal tables"
Should look like this:
public partial class Temporaltables : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
    }
}
Then edit the migration like this:
public partial class Temporaltables : Migration
{
    List<string> tablesToUpdate = new List<string>
    {
        "Images",
        "Languages",
        "Questions",
        "Texts",
        "Medias",
    };

    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.Sql("CREATE SCHEMA History");
        foreach (var table in tablesToUpdate)
        {
            string alterStatement = $@"ALTER TABLE [{table}] ADD SysStartTime datetime2(0) GENERATED ALWAYS AS ROW START HIDDEN
                CONSTRAINT DF_{table}_SysStart DEFAULT GETDATE(), SysEndTime datetime2(0) GENERATED ALWAYS AS ROW END HIDDEN
                CONSTRAINT DF_{table}_SysEnd DEFAULT CONVERT(datetime2 (0), '9999-12-31 23:59:59'),
                PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)";
            migrationBuilder.Sql(alterStatement);
            alterStatement = $@"ALTER TABLE [{table}] SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = History.[{table}]));";
            migrationBuilder.Sql(alterStatement);
        }
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        foreach (var table in tablesToUpdate)
        {
            string alterStatement = $@"ALTER TABLE [{table}] SET (SYSTEM_VERSIONING = OFF);";
            migrationBuilder.Sql(alterStatement);
            alterStatement = $@"ALTER TABLE [{table}] DROP PERIOD FOR SYSTEM_TIME";
            migrationBuilder.Sql(alterStatement);
            alterStatement = $@"ALTER TABLE [{table}] DROP DF_{table}_SysStart, DF_{table}_SysEnd";
            migrationBuilder.Sql(alterStatement);
            alterStatement = $@"ALTER TABLE [{table}] DROP COLUMN SysStartTime, COLUMN SysEndTime";
            migrationBuilder.Sql(alterStatement);
            alterStatement = $@"DROP TABLE History.[{table}]";
            migrationBuilder.Sql(alterStatement);
        }
        migrationBuilder.Sql("DROP SCHEMA History");
    }
}
tablesToUpdate should contain every table you need history for.
Then run the Update-Database command.
Original source, slightly modified to escape table names with square brackets etc.:
https://intellitect.com/updating-sql-database-use-temporal-tables-entity-framework-migration/
Testing Create, Update and Delete will then show a complete history.
[HttpGet]
public async Task<ActionResult<string>> Test()
{
    var identifier1 = "OATestar123";
    var identifier2 = "OATestar12345";

    var newQuestion = new Question()
    {
        Identifier = identifier1
    };
    _dbContext.Questions.Add(newQuestion);
    await _dbContext.SaveChangesAsync();

    var question = await _dbContext.Questions.FirstOrDefaultAsync(x => x.Identifier == identifier1);
    question.Identifier = identifier2;
    await _dbContext.SaveChangesAsync();

    question = await _dbContext.Questions.FirstOrDefaultAsync(x => x.Identifier == identifier2);
    _dbContext.Entry(question).State = EntityState.Deleted;
    await _dbContext.SaveChangesAsync();

    return Ok();
}
Tested it a few times; the history tables log each Create, Update and Delete.
This solution has a huge advantage, in my opinion: it is not Object Relational Mapper (ORM) specific and you will even get history if you write plain SQL.
The history tables are also read-only by default, so there is less chance of a corrupt audit trail. If you try to modify one you get: Cannot update rows in a temporal history table.
If you need access to the data you can use your preferred ORM to fetch it or audit via plain SQL, as sketched below.
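A minimal sketch of the plain SQL route, reading the full history of a row with the FOR SYSTEM_TIME clause over an ADO.NET connection (table and column names follow the example above; connectionString is an assumption):
using Microsoft.Data.SqlClient; // System.Data.SqlClient on .NET Framework

// Sketch: FOR SYSTEM_TIME ALL returns the current and all historical versions.
// SysStartTime/SysEndTime are HIDDEN, so they must be selected explicitly.
const string sql = @"
    SELECT Identifier, SysStartTime, SysEndTime
    FROM Questions FOR SYSTEM_TIME ALL
    WHERE Identifier LIKE 'OATestar%'
    ORDER BY SysStartTime";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(sql, connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine($"{reader.GetString(0)}: {reader.GetDateTime(1):u} -> {reader.GetDateTime(2):u}");
        }
    }
}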

SQL CLR Trigger - get source table

I am creating a DB synchronization engine using SQL CLR Triggers in Microsoft SQL Server 2012. These triggers do not call a stored procedure or function (and thereby have access to the INSERTED and DELETED pseudo-tables but do not have access to @@PROCID).
Differences here, for reference.
This "sync engine" uses mapping tables to determine what the table and field maps are for this sync job. In order to determine the target table and fields (from my mapping table) I need to get the source table name from the trigger itself. I have come across many answers on Stack Overflow and other sites that say that this isn't possible. But, I've found one website that provides a clue:
Potential Solution:
using (SqlConnection lConnection = new SqlConnection(@"context connection=true"))
{
    lConnection.Open(); // the context connection still has to be opened explicitly
    SqlCommand cmd = new SqlCommand("SELECT object_name(resource_associated_entity_id) FROM sys.dm_tran_locks WHERE request_session_id = @@spid and resource_type = 'OBJECT'", lConnection);
    cmd.CommandType = CommandType.Text;
    var obj = cmd.ExecuteScalar();
}
This does in fact return the correct table name.
Question:
My question is, how reliable is this potential solution? Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process ID? Will it stand up to multiple executions of the same and/or different triggers within the database?
From these sites, it seems the process Id is in fact limited to the open connection, which doesn't overlap: here, here, and here.
Will this be a safe method to get my source table?
Why?
As I've noticed similar questions, but all without a valid answer for my specific situation (except that one). Most of the comments on those sites ask "Why?", and in order to preempt that, here is why:
This synchronization engine operates on a single DB and can push changes to target tables, transforming the data with user-defined transformations, automatic source-to-target type casting and parsing and can even use the CSharpCodeProvider to execute methods also stored in those mapping tables for transforming data. It is already built, quite robust and has good performance metrics for what we are doing. I'm now trying to build it out to allow for 1:n table changes (including extension tables requiring the same Id as the 'master' table) and am trying to "genericise" the code. Previously each trigger had a "target table" definition hard coded in it and I was using my mapping tables to determine the source. Now I'd like to get the source table and use my mapping tables to determine all the target tables. This is used in a medium-load environment and pushes changes to a "Change Order Book" which a separate server process picks up to finish the CRUD operation.
Edit
As mentioned in the comments, the query listed above is quite "iffy". It will often (after a SQL Server restart, for example) return system objects like syscolpars or sysidxstats. But, it seems that in the dm_tran_locks table there's always an associated resource_type of 'RID' (Row ID) with the same object_name. My current query which works reliably so far is the following (will update if this changes or doesn't work under high load testing):
select t1.ObjectName FROM (
    SELECT object_name(resource_associated_entity_id) as ObjectName
    FROM sys.dm_tran_locks WHERE resource_type = 'OBJECT' and request_session_id = @@spid
) t1 inner join (
    SELECT OBJECT_NAME(partitions.OBJECT_ID) as ObjectName
    FROM sys.dm_tran_locks
    INNER JOIN sys.partitions ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
    WHERE resource_type = 'RID'
) t2 on t1.ObjectName = t2.ObjectName
If this is always the case, I'll have to find that out during testing.
How reliable is this potential solution?
While I do not have time to set up a test case to show it not working, I find this approach (even taking into account the query in the Edit section) "iffy" (i.e. not guaranteed to always be reliable).
The main concerns are:
cascading (whether recursive or not) Trigger executions
User (i.e. Explicit / Implicit) transactions
Sub-processes (i.e. EXEC and sp_executesql)
These scenarios allow for multiple objects to be locked, all at the same time.
Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process ID?
and (from a comment on the question):
I think I can join my query up with the sys.partitions and get a dm_trans_lock that has a type of 'RID' with an object name that will match up to the one in my original query.
And here is why it shouldn't be entirely reliable: the Session ID (i.e. @@SPID) is constant for all of the requests on that Connection. So all sub-processes (i.e. EXEC calls, sp_executesql, Triggers, etc.) will be on the same @@SPID / session_id. So, between sub-processes and User Transactions, you can very easily get locks on multiple resources, all on the same Session ID.
The reason I say "resources" instead of "OBJECT" or even "RID" is that locks can occur on: rows, pages, keys, tables, schemas, stored procedures, the database itself, etc. More than one thing can be considered an "OBJECT", and it is possible that you will have page locks instead of row locks.
Will it stand up to multiple executions of the same and/or different triggers within the database?
As long as these executions occur in different Sessions, then they are a non-issue.
ALL THAT BEING SAID, I can see where simple testing would show that your current method is reliable. However, it should also be easy enough to add more detailed tests that include an explicit transaction that first does some DML on another table, or have a trigger on one table do some DML on one of these tables, etc.
Unfortunately, there is no built-in mechanism that provides the same functionality that @@PROCID does for T-SQL Triggers. I have come up with a scheme that should allow for getting the parent table for a SQLCLR Trigger (that takes into account these various issues), but haven't had a chance to test it out. It requires using a T-SQL trigger, set as the "first" trigger, to set info that can be discovered by the SQLCLR Trigger.
A simpler form can be constructed using CONTEXT_INFO, if you are not already using it for something else (and if you don't already have a "first" Trigger set). In this approach you would still create a T-SQL Trigger, and then set it as the "first" Trigger using sp_settriggerorder. In this Trigger you SET CONTEXT_INFO to the name of the table that is the parent of @@PROCID. You can then read CONTEXT_INFO() on a Context Connection in a SQLCLR Trigger. If there are multiple levels of Triggers then the value of CONTEXT_INFO will get overwritten, so reading that value must be the first thing you do in each SQLCLR Trigger. A sketch of the reading side follows.
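As a rough sketch of the reading side only (the companion T-SQL trigger that calls SET CONTEXT_INFO with the table name and is ordered first via sp_settriggerorder is assumed and not shown; the UTF-16 encoding of the value is also an assumption):
using System.Data.SqlClient;
using System.Text;

// Sketch: inside the SQLCLR trigger body, read the CONTEXT_INFO value that the
// "first" T-SQL trigger set to the parent table name.
string parentTable = null;
using (var connection = new SqlConnection("context connection=true"))
{
    connection.Open();
    using (var cmd = new SqlCommand("SELECT CONTEXT_INFO();", connection))
    {
        var raw = cmd.ExecuteScalar() as byte[];
        if (raw != null)
        {
            // CONTEXT_INFO is a zero-padded varbinary(128); trim the padding.
            parentTable = Encoding.Unicode.GetString(raw).TrimEnd('\0');
        }
    }
}
// parentTable can now be used to look up the mapping rows for this sync job.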
This is an old thread, but it is an FAQ and I think I have a better solution. Essentially it uses the schema of the inserted or deleted table to find the base table by doing a hash of the column names and comparing the hash with the hashes of tables with a CLR trigger on them.
Code snippet below - at some point I will probably put the whole solution on Git (it sends a message to Azure Service Bus when the trigger fires).
private const string colqry = "select top 1 * from inserted union all select top 1 * from deleted";
private const string hashqry = "WITH cols as ( " +
    "select top 100000 c.object_id, column_id, c.[name] " +
    "from sys.columns c " +
    "JOIN sys.objects ot on (c.object_id = ot.parent_object_id and ot.type = 'TA') " +
    "order by c.object_id, column_id ) " +
    "SELECT s.[name] + '.' + o.[name] as 'TableName', CONVERT(NCHAR(32), HASHBYTES('MD5', STRING_AGG(CONVERT(NCHAR(32), HASHBYTES('MD5', cols.[name]), 2), '|')), 2) as 'MD5Hash' " +
    "FROM cols " +
    "JOIN sys.objects o on (cols.object_id = o.object_id) " +
    "JOIN sys.schemas s on (o.schema_id = s.schema_id) " +
    "WHERE o.is_ms_shipped = 0 " +
    "GROUP BY s.[name], o.[name]";

public static void trgSendSBMsg()
{
    string table = "";
    SqlCommand cmd;
    SqlDataReader rdr;
    SqlTriggerContext trigContxt = SqlContext.TriggerContext;
    SqlPipe p = SqlContext.Pipe;

    using (SqlConnection con = new SqlConnection("context connection=true"))
    {
        try
        {
            con.Open();
            string tblhash = "";

            // Hash the column names of the inserted/deleted pseudo-table.
            using (cmd = new SqlCommand(colqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    if (rdr.Read())
                    {
                        MD5 hash = MD5.Create();
                        StringBuilder hashstr = new StringBuilder(250);
                        for (int i = 0; i < rdr.FieldCount; i++)
                        {
                            if (i > 0) hashstr.Append("|");
                            hashstr.Append(GetMD5Hash(hash, rdr.GetName(i)));
                        }
                        tblhash = GetMD5Hash(hash, hashstr.ToString().ToUpper()).ToUpper();
                    }
                    rdr.Close();
                }
            }

            // Compare against the hash of every table that has a CLR trigger on it.
            using (cmd = new SqlCommand(hashqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    while (rdr.Read())
                    {
                        string hash = rdr.GetString(1).ToUpper();
                        if (hash == tblhash)
                        {
                            table = rdr.GetString(0);
                            break;
                        }
                    }
                    rdr.Close();
                }
            }

            if (table.Length == 0)
            {
                p.Send("Error: Unable to find table that CLR trigger is on. Message not sent!");
                return;
            }
            ….
HTH

Linq to sql testing stored procedures - call to procedure, verify and rollback in one transaction

I'm trying to use LINQ to SQL for integration testing of stored procedures. I'm calling an updating stored procedure and then retrieving the updated row from the database to verify the change. All of this should happen in one transaction so that I can roll back the transaction after the verification.
The code fails in the assert because the row I retrieved does not seem to be updated. I know that my SP works when called from ordinary code. Is it even possible to see the updated row in the same transaction?
I'm using SQL Server 2008 and used sqlmetal.exe to create the LINQ to SQL mapping.
I've tried many different things, and right now my code looks like the following:
DbTransaction transaction = null;
try
{
    var context =
        new DbConnection(
            ConfigurationManager.ConnectionStrings["MyConnectionString"].ConnectionString);
    context.Connection.Open();
    transaction = context.Connection.BeginTransaction();
    context.Transaction = transaction;

    const string newUserName = "TestUserName";
    context.SpUpdateUserName(136049, newUserName);
    context.SubmitChanges();

    // select to verify
    var user =
        (from d in context.Users where d.NUserId == 136049 select d).First();
    Assert.IsTrue(user.UserName == newUserName);
}
finally
{
    if (transaction != null) transaction.Rollback();
}
I believe you are running into a stale DataContext issue.
Your update is done through a stored procedure, so your context does not "see" the changes and has no way to update the Users it is tracking.
If you use a new DataContext to do the assert, it usually works well. However, since you are using a transaction, you probably have to enlist the second DataContext in the same transaction, as sketched below.
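A minimal sketch of that idea, assuming the sqlmetal-generated context type is called MyDataContext (the type name and the surrounding test wiring are assumptions):
// Sketch: verify with a second, fresh context enlisted in the same connection
// and transaction, so the stale first context is never consulted.
const string newUserName = "TestUserName";
context.SpUpdateUserName(136049, newUserName);
context.SubmitChanges();

// MyDataContext is a placeholder for the sqlmetal-generated context type.
using (var verifyContext = new MyDataContext(context.Connection))
{
    verifyContext.Transaction = transaction; // reuse the already-open transaction

    var user = verifyContext.Users.First(d => d.NUserId == 136049);
    Assert.IsTrue(user.UserName == newUserName);
}

// The finally block from the original test still rolls the transaction back.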

Correct method of deleting over 2100 rows (by ID) with Dapper

I am trying to use Dapper to support my data access for my server app.
My server app has another application that drops records into my database at a rate of 400 per minute.
My app pulls them out in batches, processes them, and then deletes them from the database.
Since data continues to flow into the database while I am processing, I don't have a good way to say delete from myTable where allProcessed = true.
However, I do know the PK values of the rows to delete, so I want to do a delete from myTable where Id in @listToDelete.
The problem is that if my server goes down for even 6 minutes, then I have over 2100 rows to delete.
Since Dapper takes my @listToDelete and turns each element into a parameter, my call to delete fails. (Causing my data purging to get even further behind.)
What is the best way to deal with this in Dapper?
NOTES:
I have looked at Table-Valued Parameters but from what I can see, they are not very performant. This piece of my architecture is the bottleneck of my system and it needs to be very, very fast.
One option is to create a temp table on the server and then use the bulk load facility to upload all the IDs into that table at once. Then use a join, EXISTS or IN clause to delete only the records that you uploaded into your temp table.
Bulk loads are a well-optimized path in SQL Server and it should be very fast.
For example:
Execute the statement CREATE TABLE #RowsToDelete(ID INT PRIMARY KEY)
Use a bulk load to insert keys into #RowsToDelete
Execute DELETE FROM myTable where Id IN (SELECT ID FROM #RowsToDelete)
Execute DROP TABLE #RowsToDelete (the table will also be automatically dropped if you close the session)
(Assuming Dapper) code example:
conn.Open();

var columnName = "ID";

conn.Execute(string.Format("CREATE TABLE #{0}s({0} INT PRIMARY KEY)", columnName));

using (var bulkCopy = new SqlBulkCopy(conn))
{
    bulkCopy.BatchSize = ids.Count;
    bulkCopy.DestinationTableName = string.Format("#{0}s", columnName);

    var table = new DataTable();
    table.Columns.Add(columnName, typeof(int));
    bulkCopy.ColumnMappings.Add(columnName, columnName);

    foreach (var id in ids)
    {
        table.Rows.Add(id);
    }

    bulkCopy.WriteToServer(table);
}

// or do other things with your table instead of deleting here
conn.Execute(string.Format(@"DELETE FROM myTable where Id IN
    (SELECT {0} FROM #{0}s)", columnName));

conn.Execute(string.Format("DROP TABLE #{0}s", columnName));
To get this code working, I went dark side.
Since Dapper turns my list into parameters, and SQL Server can't handle that many parameters (I had never even needed double-digit parameter counts before), I had to go with dynamic SQL.
So here was my solution:
string listOfIdsJoined = "("+String.Join(",", listOfIds.ToArray())+")";
connection.Execute("delete from myTable where Id in " + listOfIdsJoined);
Before everyone grabs their torches and pitchforks, let me explain.
This code runs on a server whose only input is a data feed from a Mainframe system.
The list I am dynamically creating is a list of longs/bigints.
The longs/bigints are from an Identity column.
I know constructing dynamic SQL is bad juju, but in this case, I just can't see how it leads to a security risk.
Dapper expects a list of objects with the parameter as a property, so in the above case a list of objects with an Id property will work:
connection.Execute("delete from myTable where Id in (@Id)", listOfIds.AsEnumerable().Select(i => new { Id = i }).ToList());
This will work; Dapper executes the delete once per element of the list.
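For completeness, Dapper can also expand an enumerable parameter inside an IN clause by itself; chunking the id list keeps each statement safely under SQL Server's roughly 2100-parameter limit. A minimal sketch (the chunk size is an arbitrary assumption):
// Sketch: Dapper expands "in @Ids" into one parameter per value, so delete in
// chunks to stay below the parameter limit. Requires using Dapper; and System.Linq.
const int chunkSize = 1000; // assumption: comfortably below ~2100 parameters

foreach (var chunk in listOfIds
    .Select((id, index) => new { id, index })
    .GroupBy(x => x.index / chunkSize, x => x.id))
{
    connection.Execute(
        "delete from myTable where Id in @Ids",
        new { Ids = chunk.ToList() });
}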
