Concurrent updates on a single staging table - sql-server

I am developing a service application (VB.NET) which pulls information from a source and imports it into a SQL Server database.
The process can involve one or more “batches” of information at a time (the number and size of batches in any given “run” is arbitrary, based on a queue maintained elsewhere).
Each batch is assigned an identifier (BatchID) so that the set of records in the staging table belonging to that batch can be easily identified.
The ETL process for each batch is sequential in nature: the raw data is bulk inserted into a staging table, and then a series of stored procedures perform updates on a number of columns until the data is ready for import.
These stored procedures are called in sequence by the service and are generally simple UPDATE commands.
Each SP takes the BatchID as an input parameter and specifies it as the criteria for inclusion in each UPDATE, à la:
UPDATE dbo.stgTable
SET FieldOne = (CASE
                    WHEN S.[FieldOne] IS NULL THEN T1.FieldOne
                    ELSE S.[FieldOne]
                END),
    FieldTwo = (CASE
                    WHEN S.[FieldTwo] IS NULL THEN T2.FieldTwo
                    ELSE S.[FieldTwo]
                END)
FROM dbo.stgTable AS S
LEFT JOIN dbo.someTable T1 ON S.[SomeField] = T1.[SomeField]
LEFT JOIN dbo.someOtherTable T2 ON S.[SomeOtherField] = T2.[SomeOtherField]
WHERE S.BatchID = @BatchID
Some of the SPs also refer to functions (both scalar and table-valued), and all incorporate a TRY / CATCH structure so I can tell from the output parameters whether a particular SP has failed.
The final SP is a MERGE operation that moves the enriched data from the staging table into the production table (again, specific to the provided BatchID).
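For illustration, the final step looks roughly like this (heavily simplified, with made-up column names, but following the same pattern as the other SPs: @BatchID parameter, TRY / CATCH, and an output parameter to signal failure):

CREATE PROCEDURE dbo.uspMergeBatch
    @BatchID INT,
    @ErrorMessage NVARCHAR(4000) = NULL OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        MERGE dbo.prodTable AS T
        USING (SELECT * FROM dbo.stgTable WHERE BatchID = @BatchID) AS S
            ON T.BusinessKey = S.BusinessKey            -- made-up key
        WHEN MATCHED THEN
            UPDATE SET T.FieldOne = S.FieldOne,
                       T.FieldTwo = S.FieldTwo
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (BusinessKey, FieldOne, FieldTwo)
            VALUES (S.BusinessKey, S.FieldOne, S.FieldTwo);
    END TRY
    BEGIN CATCH
        SET @ErrorMessage = ERROR_MESSAGE();            -- surfaced to the service via the output parameter
    END CATCH
END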
I would like to thread this process in the service so that a large batch doesn’t hold up smaller batches in the same run.
I figured there should be no issue with this, as no thread could ever attempt to process records in the staging table that could be targeted by another thread (no race conditions).
However, I’ve noticed that, when I do thread the process, arbitrary steps on arbitrary batches seem to fail (but no error is recorded from the output of the SP).
The failures are inconsistent; e.g. sometimes batches 2, 3 & 5 will fail (on SPs 3, 5 & 7 respectively); other times it will be different batches, each at different steps in the sequence.
When I import the batches sequentially, they all import perfectly fine – always!
I can’t figure out if this is an issue on the service side (VB.NET) – e.g. is each thread opening an independent connection to the DB, or could they be sharing the same one? (I’ve set it up so that each one should be independent…)
Or if the issue is on the SQL Server side – e.g. is it not feasible for concurrent SP calls to manipulate data in the same table, even though, as described above, no thread/batch will ever touch records belonging to another thread/batch?
(On this point – I tried using CTEs to create subsets of data from the staging table based on the BatchID and apply the UPDATEs to those instead, but the exact same behaviour occurred.)
WITH CTE AS (
    SELECT *
    FROM dbo.stgTable
    WHERE BatchID = @BatchID
)
UPDATE CTE...
Or maybe the problem is that multiple SPs are calling the same function at the same time, and that is why one or more of them are failing (I don’t see why that would be a problem, though)?
Any suggestions would be very gratefully received – I’ve been playing around with this all week and I can’t for the life of me determine precisely what the problem might be!
Update to include sample service code
This is the code in the service class where the threading is initiated
For Each ItemInScope In ScopedItems
    With ItemInScope
        _batches(_batchCount) = New Batch(.Parameter1, .Parameter2, .ParameterX)
        With _batches(_batchCount)
            If .Initiate() Then
                _doneEvents(_batchCount) = New ManualResetEvent(False)
                Dim _batchWriter = New BatchWriter(_batches(_batchCount), _doneEvents(_batchCount))
                ThreadPool.QueueUserWorkItem(AddressOf _batchWriter.ThreadPoolCallBack, _batchCount)
            Else
                _doneEvents(_batchCount) = New ManualResetEvent(True)
            End If
        End With
    End With
    _batchCount += 1
Next

WaitHandle.WaitAll(_doneEvents)
Here is the BatchWriter class
Public Class BatchWriter

    Private _batch As Batch
    Private _doneEvent As ManualResetEvent

    Public Sub New(ByRef batch As Batch, ByVal doneEvent As ManualResetEvent)
        _batch = batch
        _doneEvent = doneEvent
    End Sub

    Public Sub ThreadPoolCallBack(ByVal threadContext As Object)
        Dim threadIndex As Integer = CType(threadContext, Integer)
        With _batch
            If .PrepareBatch() Then
                If .WriteTextOutput() Then
                    .ProcessBatch()
                End If
            End If
        End With
        _doneEvent.Set()
    End Sub

End Class
The PrepareBatch and WriteTextOutput functions of the Batch class are entirely contained within the service application - it is only the ProcessBatch function where the service starts to interact with the database (via Entity Framework)
Here is that function
Public Sub ProcessBatch()
    ' Confirm that a file is ready for import
    If My.Computer.FileSystem.FileExists(_filePath) Then
        Dim dbModel As New DatabaseModel
        With dbModel
            ' Pass the batch to the staging table in the database
            If .StageBatch(_batchID, _filePath) Then
                ' First update (results recorded for event log)
                If .UpdateOne(_batchID) Then
                    _stepOneUpdates = .RetUpdates.Value
                    ' Second update (results recorded for event log)
                    If .UpdateTwo(_batchID) Then
                        _stepTwoUpdates = .RetUpdates.Value
                        ' Third update (results recorded for event log)
                        If .UpdateThree(_batchID) Then
                            _stepThreeUpdates = .RetUpdates.Value
                            ....
End Sub


How do I set the correct transaction level?

I am using Dapper on ADO.NET. So at present I am doing the following:
using (IDbConnection conn = new SqlConnection("MyConnectionString"))
{
    conn.Open();
    using (IDbTransaction transaction = conn.BeginTransaction())
    {
        // ...
However, there are various isolation levels that a transaction can be set to; I think these are the available settings.
My first question is how do I set the transaction level (where I am using Dapper)?
My second question is what is the correct level for each of the following cases? In each of these cases we have multiple instances of a web worker (Azure) service running that will be hitting the DB at the same time.
I need to run monthly charges on subscriptions. So, in a transaction, I need to read a record and, if it's due for a charge, create the invoice record and mark the record as processed. Any other read of that record for the same purpose needs to fail. But any other reads of that record that are just using it to verify that it is active need to succeed.
So what transaction do I use for the access that will be updating the processed column? And what transaction do I use for the other access that just needs to verify that the record is active?
In this case it's fine if a conflict causes the charge to not be run (we'll get it the next day). But it is critical that we not charge someone twice. And it is critical that the read to verify that the record is active succeed immediately while the other operation is in its transaction.
I need to update a record where I am setting just a couple of columns. One use case is setting a new password hash for a user record. It's fine if other access occurs during this, except for deleting the record (I think that's the only problem use case). If another web service is also updating, that's the user's problem for doing this in two places simultaneously.
But it's key that the record stays consistent. This includes the use case of "set NumUses = NumUses + @ParamNum", so it needs to treat the read, calculation, and write of the column value as an atomic action. And if I am setting three column values, they all get written together.
1) Assuming that the invoicing process is an SP with multiple statements, your best bet is to create another "lock" table to store the fact that the invoicing job is already running, e.g.:
CREATE TABLE InvoicingJob( JobStarted DATETIME, IsRunning BIT NOT NULL )

-- Table will only ever have one record
INSERT INTO InvoicingJob
SELECT NULL, 0

EXEC InvoicingProcess

ALTER PROCEDURE InvoicingProcess
AS
BEGIN
    DECLARE @InvoicingJob TABLE( IsRunning BIT )

    -- Try to acquire lock
    UPDATE InvoicingJob WITH( TABLOCK )
    SET JobStarted = GETDATE(), IsRunning = 1
    OUTPUT INSERTED.IsRunning INTO @InvoicingJob( IsRunning )
    WHERE IsRunning = 0
    -- job has been running for more than a day i.e. likely crashed without releasing the lock
    -- OR ( IsRunning = 1 AND JobStarted <= DATEADD( DAY, -1, GETDATE()))

    IF NOT EXISTS( SELECT * FROM @InvoicingJob )
    BEGIN
        PRINT 'Another Job is already running'
        RETURN
    END
    ELSE
        RAISERROR( 'Start Job', 0, 0 ) WITH NOWAIT

    -- Do invoicing tasks
    WAITFOR DELAY '00:01:00' -- to simulate execution time

    -- Release lock
    UPDATE InvoicingJob
    SET IsRunning = 0
END
2) Read about how transactions work: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/transactions-transact-sql?view=sql-server-2017
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017
Your second question is quite broad.
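Regarding the first question (setting the level from Dapper): Dapper runs on top of plain ADO.NET, so you can pass the level directly, e.g. conn.BeginTransaction(IsolationLevel.Serializable), or set it in T-SQL with SET TRANSACTION ISOLATION LEVEL. A rough sketch for the charging case (table and column names are made up); the filtered UPDATE "claims" the row atomically, so a second charging attempt finds nothing to update:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- the default; shown only to illustrate the syntax
BEGIN TRANSACTION;

-- Claim the subscription: only one caller can flip Processed from 0 to 1.
UPDATE dbo.Subscription
SET    Processed = 1
OUTPUT inserted.SubscriptionId
WHERE  SubscriptionId = @SubscriptionId
  AND  Processed = 0;

-- Create the invoice row here only if the UPDATE above affected a row
-- (check @@ROWCOUNT or the OUTPUT result).

COMMIT TRANSACTION;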

is there something faster than Enumerable.Except<TSource> Method?

I have a program that downloads data from a server database to a client database. The server database has been growing recently.
In that program, there is an option to either download all data OR download data for a specific time period (the user can select how many days back from today). If the user selects all, the program truncates the client database table and inserts all data using bulk copy. That part is OK.
The problem is when the user selects a specific time period (each record has a created date-time): the program has to compare the two tables and divide the server's records into two sets – records that already exist in the client table and records that do not. What I'm going to do is:
insert the non-existing data directly into the client DB (I'm using bulk insert), and insert the existing data into a temporary table using bulk copy, then update the client's table from that temporary table. My actual problem occurs when dividing the server's table. This is how I did it:
updateTable = (From c In dt_from_server.AsEnumerable()
Join o In Dt_from_client.AsEnumerable()
On c.Field(Of String)("BARCODE").Trim() Equals o.Field(Of String)("BARCODE").Trim()
And c.Field(Of String)("ITEM_CODE").Trim() Equals o.Field(Of String)("ITEM_CODE").Trim()
Select c).CopyToDataTable()
insertTable = dt_server.AsEnumerable()
.Except(updateTable.AsEnumerable(), DataRowComparer.Default)
.CopyToDataTable()
(There are normally over 1M records in the server table.)
When there are over 1 million records, the update part takes an acceptable time, around 10 minutes (yes, it takes 5 GB of RAM, but in this case that's OK considering the performance).
But the insert part seems to take days, just to assign the insertTable (a DataTable). This is the issue.
The AsEnumerable().Except() part takes a long time, and I couldn't find a way to speed this process up. I'm not sure I explained this correctly. Could anyone give me some advice on this?
Since you have commented that dt_from_server and dt_server are actually the same DataTable, you don't need to compare all values of all DataRows with each other, which is what DataRowComparer.Default does. You can use Except without the second comparer parameter; then only references are compared, which is much faster.
You also don't need the two CopyToDataTable calls, which create two additional big DataTables in memory; process the rows one after the other instead.
Here is a different approach using Linq's left-outer join, which is more efficient:
Dim query = From rServ In dt_from_server.AsEnumerable()
            Group Join rClient In Dt_from_client.AsEnumerable()
                On rServ.Field(Of String)("BARCODE").Trim() Equals rClient.Field(Of String)("BARCODE").Trim() And
                   rServ.Field(Of String)("ITEM_CODE").Trim() Equals rClient.Field(Of String)("ITEM_CODE").Trim()
                Into ClientGroup = Group
            From client In ClientGroup.DefaultIfEmpty()
            Select New With {.ServerRow = rServ, .InsertRow = client Is Nothing}

Dim insertOrUpdateRows = query.ToLookup(Function(x) x.InsertRow, Function(x) x.ServerRow)
Dim insertRows = insertOrUpdateRows(True).CopyToDataTable()  ' CopyToDataTable redundant if you process the rows immediately now
Dim updateRows = insertOrUpdateRows(False).CopyToDataTable() ' CopyToDataTable redundant if you process the rows immediately now
But in general the most scalable and efficient approach would be to not load everything into memory at once and then process it all, but to use database paging (or a stored procedure) to process only parts of it in memory; otherwise it's likely that you will encounter an OutOfMemoryException sooner or later.
C# as requested:
var query = from rServ in dt_from_server.AsEnumerable()
join rClient in Dt_from_client.AsEnumerable()
on new { BarCode = rServ.Field<string>("BARCODE").Trim(), ItemCode = rServ.Field<string>("ITEM_CODE").Trim() }
equals new { BarCode = rClient.Field<string>("BARCODE").Trim(), ItemCode = rClient.Field<string>("ITEM_CODE").Trim() }
into clientGroup
from client in clientGroup.DefaultIfEmpty()
select new { ServerRow = rServ, InsertRow = client == null };
var insertOrUpdateRows = query.ToLookup(x => x.InsertRow, x => x.ServerRow);
var insertRows = insertOrUpdateRows[true].CopyToDataTable(); // CopyToDataTable redundant if you process rows immediately now
var updateRows = insertOrUpdateRows[false].CopyToDataTable(); // CopyToDataTable redundant if you process rows immediately now
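In general, though, the most scalable route is to let the database do the split. Purely as an illustration, assuming the server rows are first bulk-copied into a staging table (all table and column names here are made up):

-- New rows (not yet in the client table): insert directly.
INSERT INTO dbo.ClientTable (BARCODE, ITEM_CODE /*, other columns */)
SELECT s.BARCODE, s.ITEM_CODE /*, other columns */
FROM dbo.ServerStaging AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.ClientTable AS c
                  WHERE c.BARCODE = s.BARCODE
                    AND c.ITEM_CODE = s.ITEM_CODE);

-- Existing rows: update in place from the staged copy.
UPDATE c
SET    c.SomeColumn = s.SomeColumn        -- placeholder payload column
FROM   dbo.ClientTable AS c
INNER JOIN dbo.ServerStaging AS s
        ON c.BARCODE = s.BARCODE
       AND c.ITEM_CODE = s.ITEM_CODE;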

Using Multiple variables using for loop in SSIS

I am very new to SSIS and below is what I am trying to do.
I have a huge OLEDB database as the source, and the tables I want to access are divided into multiple areas.
For example:
Select * from employee where area='01' and sub_area='02'
What I want to do is use the same query to loop through all the areas and sub-areas and insert the data into a SQL Server table using SSIS.
What I have tried so far is creating multiple variables for each area and sub-area, and then using a For Loop container to loop through all the variables. I am trying to loop through because the database is huge, and dividing it by area/sub-area gives faster results than querying the whole database, which takes forever to execute.
I went through quite a few examples of the For Loop container, but I am still not sure how to loop through and let the package know to go to the next set of variables (area and sub-area in my case) after finishing the first set.
Any help will be appreciated. Thank you so much.
Looks like you want to bring your data over in chunks using area and sub-area IDs. Here is what you would need to do. Declare the following variables -
VariableName,Type
objAreaSubAreaList,Object
sArea,String
sSub_Area,String
sSQL_GetEmployeeDataByAreaSubArea,String
For the last variable, set the following expression in the expression window -
"select * from dbo.Employee where Area = '" + @[User::sArea] + "' and Sub_Area = '" + @[User::sSub_Area] + "'"
Steps to do -
1. Drag an Execute SQL Task with your 'SourceDB' connection. Set the following properties -
General Tab -
SQLStatement - select Area, Sub_Area from dbo.Employee group by Area, Sub_Area
ResultSet - Full Result Set
Result Set Tab -
Click on 'Add', assign it the variable objAreaSubAreaList, and set the 'Result Name' to 0
2. Drag a 'Foreach Loop Container' and connect the above task to it.
Collection
Enumerator - Foreach ADO Enumerator
Set the variable as - objAreaSubAreaList
Variable Mappings
Add the two variables sArea, sSub_Area in that order.
3. Within the Foreach Loop container, drag a 'Data Flow Task'. Open the Data Flow Task, then drag and drop an OLE DB Source and set the connection. Set the Data Access Mode to 'SQL command from variable'. Set the variable to sSQL_GetEmployeeDataByAreaSubArea and go ahead with the rest of the tasks.
Here is what worked for me. I created variables named Counter and CounterLimit. I set Counter to 0 and CounterLimit to 4. Then, in the For Loop container, InitExpression was set to @Counter = 0, EvalExpression to @Counter < @CounterLimit, and AssignExpression to @Counter = @Counter + 1.
Then, in the WHERE condition, I used a CASE statement with the @Counter variable to say that if @Counter = 0 then fetch area = 1 and sub-area = 2, and so on.
Here, area and sub-area are also variables. This way I used the For Loop container to loop through all the areas and sub-areas to get the results. Not sure if it is the best way, but it solves my issue.
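Roughly, once the counter value is substituted in, the WHERE clause ends up shaped like this (the area/sub-area values are placeholders; the real ones come from the package variables):

DECLARE @Counter INT = 0;   -- supplied by the SSIS package variable on each iteration

SELECT *
FROM dbo.Employee
WHERE Area     = CASE @Counter WHEN 0 THEN '01' WHEN 1 THEN '03' END
  AND Sub_Area = CASE @Counter WHEN 0 THEN '02' WHEN 1 THEN '04' END;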

SQL CLR Trigger - get source table

I am creating a DB synchronization engine using SQL CLR triggers in Microsoft SQL Server 2012. These triggers do not call a stored procedure or function (and thereby have access to the INSERTED and DELETED pseudo-tables but do not have access to @@PROCID).
Differences here, for reference.
This "sync engine" uses mapping tables to determine what the table and field maps are for this sync job. In order to determine the target table and fields (from my mapping table) I need to get the source table name from the trigger itself. I have come across many answers on Stack Overflow and other sites that say that this isn't possible. But, I've found one website that provides a clue:
Potential Solution:
using (SqlConnection lConnection = new SqlConnection(@"context connection=true"))
{
    SqlCommand cmd = new SqlCommand("SELECT object_name(resource_associated_entity_id) FROM sys.dm_tran_locks WHERE request_session_id = @@spid and resource_type = 'OBJECT'", lConnection);
    cmd.CommandType = CommandType.Text;
    var obj = cmd.ExecuteScalar();
}
This does in fact return the correct table name.
Question:
My question is, how reliable is this potential solution? Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process ID? Will it stand up to multiple executions of the same and/or different triggers within the database?
From these sites, it seems the process Id is in fact limited to the open connection, which doesn't overlap: here, here, and here.
Will this be a safe method to get my source table?
Why?
I've noticed similar questions, but all without a valid answer for my specific situation (except that one). Most of the comments on those sites ask "Why?", and in order to preempt that, here is why:
This synchronization engine operates on a single DB and can push changes to target tables, transforming the data with user-defined transformations, automatic source-to-target type casting and parsing and can even use the CSharpCodeProvider to execute methods also stored in those mapping tables for transforming data. It is already built, quite robust and has good performance metrics for what we are doing. I'm now trying to build it out to allow for 1:n table changes (including extension tables requiring the same Id as the 'master' table) and am trying to "genericise" the code. Previously each trigger had a "target table" definition hard coded in it and I was using my mapping tables to determine the source. Now I'd like to get the source table and use my mapping tables to determine all the target tables. This is used in a medium-load environment and pushes changes to a "Change Order Book" which a separate server process picks up to finish the CRUD operation.
Edit
As mentioned in the comments, the query listed above is quite "iffy". It will often (after a SQL Server restart, for example) return system objects like syscolpars or sysidxstats. But, it seems that in the dm_tran_locks table there's always an associated resource_type of 'RID' (Row ID) with the same object_name. My current query which works reliably so far is the following (will update if this changes or doesn't work under high load testing):
SELECT t1.ObjectName
FROM (
    SELECT object_name(resource_associated_entity_id) AS ObjectName
    FROM sys.dm_tran_locks
    WHERE resource_type = 'OBJECT' AND request_session_id = @@spid
) t1
INNER JOIN (
    SELECT OBJECT_NAME(partitions.OBJECT_ID) AS ObjectName
    FROM sys.dm_tran_locks
    INNER JOIN sys.partitions ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
    WHERE resource_type = 'RID'
) t2 ON t1.ObjectName = t2.ObjectName
If this is always the case, I'll have to find that out during testing.
How reliable is this potential solution?
While I do not have time to set up a test case to show it not working, I find this approach (even taking into account the query in the Edit section) "iffy" (i.e. not guaranteed to always be reliable).
The main concerns are:
cascading (whether recursive or not) Trigger executions
User (i.e. Explicit / Implicit) transactions
Sub-processes (i.e. EXEC and sp_executesql)
These scenarios allow for multiple objects to be locked, all at the same time.
Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process ID?
and (from a comment on the question):
I think I can join my query up with the sys.partitions and get a dm_trans_lock that has a type of 'RID' with an object name that will match up to the one in my original query.
And here is why it shouldn't be entirely reliable: the Session ID (i.e. @@SPID) is constant for all of the requests on that Connection. So all sub-processes (i.e. EXEC calls, sp_executesql, Triggers, etc.) will be on the same @@SPID / session_id. So, between sub-processes and User Transactions, you can very easily get locks on multiple resources, all on the same Session ID.
The reason I say "resources" instead of "OBJECT" or even "RID" is that locks can occur on: rows, pages, keys, tables, schemas, stored procedures, the database itself, etc. More than one thing can be considered an "OBJECT", and it is possible that you will have page locks instead of row locks.
Will it stand up to multiple executions of the same and/or different triggers within the database?
As long as these executions occur in different Sessions, then they are a non-issue.
ALL THAT BEING SAID, I can see where simple testing would show that your current method is reliable. However, it should also be easy enough to add more detailed tests that include an explicit transaction that first does some DML on another table, or have a trigger on one table do some DML on one of these tables, etc.
Unfortunately, there is no built-in mechanism that provides the same functionality that @@PROCID does for T-SQL Triggers. I have come up with a scheme that should allow for getting the parent table for a SQLCLR Trigger (one that takes these various issues into account), but haven't had a chance to test it out. It requires using a T-SQL trigger, set as the "first" trigger, to set info that can be discovered by the SQLCLR Trigger.
A simpler form can be constructed using CONTEXT_INFO, if you are not already using it for something else (and if you don't already have a "first" Trigger set). In this approach you would still create a T-SQL Trigger, and then set it as the "first" Trigger using sp_settriggerorder. In this Trigger you SET CONTEXT_INFO to the table name that is the parent of @@PROCID. You can then read CONTEXT_INFO() on a Context Connection in a SQLCLR Trigger. If there are multiple levels of Triggers then the value of CONTEXT_INFO will get overwritten, so reading that value must be the first thing you do in each SQLCLR Trigger.
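A rough sketch of that CONTEXT_INFO approach (dbo.SourceTable is a placeholder, and this is untested, per the caveat above); the T-SQL trigger resolves its own parent table from @@PROCID, stores it in CONTEXT_INFO, and is then set as the "first" trigger so the SQLCLR trigger can read it:

CREATE TRIGGER dbo.trg_SourceTable_SetContext
ON dbo.SourceTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @ctx VARBINARY(128);

    -- @@PROCID is this T-SQL trigger; its parent_id is the table it sits on.
    SELECT @ctx = CONVERT(VARBINARY(128),
                          OBJECT_SCHEMA_NAME(parent_id) + N'.' + OBJECT_NAME(parent_id))
    FROM sys.triggers
    WHERE object_id = @@PROCID;

    SET CONTEXT_INFO @ctx;   -- read back via CONTEXT_INFO() in the SQLCLR trigger
END
GO

-- Make sure this trigger fires before the SQLCLR trigger.
EXEC sp_settriggerorder @triggername = N'dbo.trg_SourceTable_SetContext', @order = N'First', @stmttype = N'INSERT';
EXEC sp_settriggerorder @triggername = N'dbo.trg_SourceTable_SetContext', @order = N'First', @stmttype = N'UPDATE';
EXEC sp_settriggerorder @triggername = N'dbo.trg_SourceTable_SetContext', @order = N'First', @stmttype = N'DELETE';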
This is an old thread, but it is an FAQ and I think I have a better solution. Essentially it uses the schema of the inserted or deleted table to find the base table by doing a hash of the column names and comparing the hash with the hashes of tables with a CLR trigger on them.
Code snippet below - at some point I will probably put the whole solution on Git (it sends a message to Azure Service Bus when the trigger fires).
private const string colqry = "select top 1 * from inserted union all select top 1 * from deleted";
private const string hashqry = "WITH cols as ( "+
"select top 100000 c.object_id, column_id, c.[name] "+
"from sys.columns c "+
"JOIN sys.objects ot on (c.object_id= ot.parent_object_id and ot.type= 'TA') " +
"order by c.object_id, column_id ) "+
"SELECT s.[name] + '.' + o.[name] as 'TableName', CONVERT(NCHAR(32), HASHBYTES('MD5',STRING_AGG(CONVERT(NCHAR(32), HASHBYTES('MD5', cols.[name]), 2), '|')),2) as 'MD5Hash' " +
"FROM cols "+
"JOIN sys.objects o on (cols.object_id= o.object_id) "+
"JOIN sys.schemas s on (o.schema_id= s.schema_id) "+
"WHERE o.is_ms_shipped = 0 "+
"GROUP BY s.[name], o.[name]";
public static void trgSendSBMsg()
{
    string table = "";
    SqlCommand cmd;
    SqlDataReader rdr;
    SqlTriggerContext trigContxt = SqlContext.TriggerContext;
    SqlPipe p = SqlContext.Pipe;

    using (SqlConnection con = new SqlConnection("context connection=true"))
    {
        try
        {
            con.Open();
            string tblhash = "";

            // Hash the column names of the inserted/deleted pseudo-table
            using (cmd = new SqlCommand(colqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    if (rdr.Read())
                    {
                        MD5 hash = MD5.Create();
                        StringBuilder hashstr = new StringBuilder(250);
                        for (int i = 0; i < rdr.FieldCount; i++)
                        {
                            if (i > 0) hashstr.Append("|");
                            hashstr.Append(GetMD5Hash(hash, rdr.GetName(i)));
                        }
                        tblhash = GetMD5Hash(hash, hashstr.ToString().ToUpper()).ToUpper();
                    }
                    rdr.Close();
                }
            }

            // Compare against the hashes of all tables that have a CLR trigger on them
            using (cmd = new SqlCommand(hashqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    while (rdr.Read())
                    {
                        string hash = rdr.GetString(1).ToUpper();
                        if (hash == tblhash)
                        {
                            table = rdr.GetString(0);
                            break;
                        }
                    }
                    rdr.Close();
                }
            }

            if (table.Length == 0)
            {
                p.Send("Error: Unable to find table that CLR trigger is on. Message not sent!");
                return;
            }
            ….
HTH

Same data is inserted during insert

I have a couple of INSERT queries which are merged in a transaction. The first of those inserts creates a new product article number by incrementing the highest number in the table by one. Unfortunately, I just noticed during tests that if two users from two different applications click the button which triggers my transaction's method, they can both get the same new product number. How can I avoid that situation? Is there something like a lock on the first insertion, so that if the first user accesses the table to insert, other users are restricted and have to wait in a queue until the first user's insert is finished? Is there something like that? Besides, I thought that if someone inserts, other users are not able to insert. I made comments in the code so you can understand.
Part of my transaction query below:
Public Sub ProcessArticle(ByRef artikel As ArticlesVariations)

    Dim strcon = New AppSettingsReader().GetValue("ConnectionString", GetType(System.String)).ToString()

    Using connection As New SqlConnection(strcon)
        connection.Open()
        Using transaction = connection.BeginTransaction()
            Try
                For Each kvp As KeyValuePair(Of Integer, Artikel) In artikel.collection
                    articleIndex = kvp.Key
                    Dim art As Artikel = kvp.Value

                    Using cmd As New SqlCommand("INSERT INTO tbArtikel (Nummer) VALUES (@Nummer);Select Scope_Identity()", transaction.Connection)
                        cmd.CommandType = CommandType.Text
                        cmd.Connection = connection
                        cmd.Transaction = transaction

                        'Get next product number from table tbArtikel (this will be the new product number)'
                        Dim NewArtNummer As String = New DALArtikel().GetNewArtikelNumber(transaction)
                        art.Nummer = NewArtNummer
                        cmd.Parameters.AddWithValue("@Nummer", art.Nummer)

                        'Get inserted product id for the other different inserts below'
                        newArticleRowId = CInt(cmd.ExecuteScalar())

                        '....
                        other INSERT queries to other tables ...
                        ...'
                transaction.Commit()
            Catch ex As Exception
                transaction.Rollback()
                Throw 'Rethrow exception.'
            End Try
        End Using
    End Using
End Sub
Just about the only way to assure that users are not assigned the same values is to issue them from the server when the row is inserted. That is the entire premise behind the server issuing auto-increment (AI) values for PKs.
BUT since your thing is a multi-segment "numeric string", that presents a problem. Rather than tearing the string apart to find the Max()+1 for one segment with a WHERE clause on parts of the string, consider something like this:
Start with a table used to increment and issue the values:
{DocId Int, SegmentB int, SegmentC Int}
This will simply track the values to use in the other table. Then a stored procedure to create/increment a new code (MySQL - this is a conceptual answer):
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetNextProductCode`(in docId int,
                                                                 in Minr int,
                                                                 in Rev int)
BEGIN
    SET @maxR = 0;
    SET @retCode = '';

    if Minr = -1 then
        Start transaction;
            SET @maxR = (SELECT Max(SegmentB) FROM articlecode WHERE MainId = docId) + 1;
            UPDATE articlecode SET SegmentB = @maxR WHERE MainId = docId;
        Commit;

        Select concat(Cast(docId As char), '.',
                      Cast(@maxR AS char), '.',
                      Cast(Rev As char)
                     );
    end if;
END
This is a rough idea of the process. As such, it only works on the second segment (I don't know what happens when you create a NEW SegmentB - does SegmentC reset to 1???). The idea is:
pass numbers so there is no need to tear up a string
pass -1 for the segment you need the next value for
the sp gets the Max()+1 and updates the counter table so the next user will get a new value
If for some reason you end up not saving the row, there will be gaps
the sp uses a transaction (probably only needs to protect the update) so that only 1 update can happen at a time
returns the new code. It could just return the two values, but you're going to glue them together anyway.
There is still much to do:
It only does SegmentB
For a NEW DocId (-1), insert a new row with 1000 and 1(?) defaults
Same for a NEW segmentB (whatever it is): insert a new row for that DocId with default values
To get a new code before you insert a row:
cmd.CommandType = CommandType.StoredProcedure
cmd.Parameters.Add("docId", MySqlDbType.Int32).Value = 3
cmd.Parameters.Add("Minr", MySqlDbType.Int32).Value = -1
cmd.Parameters.Add("Rev", MySqlDbType.Int32).Value = 1
dbcon.Open()
Using rdr = cmd.ExecuteReader()
rdr.Read()
Console.WriteLine(rdr(0))
End Using
The obvious downside is that each insert requires you to hit the DB in order to...well save to the DB. If they were int values it could be a Trigger.
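Since the question itself is SQL Server, the same counter-table idea would look roughly like this in T-SQL (still conceptual, names made up); the single UPDATE ... OUTPUT increments and returns the new segment value in one atomic statement, so two concurrent callers cannot receive the same number:

DECLARE @DocId INT = 3, @Rev INT = 1;
DECLARE @next TABLE (SegmentB INT);

UPDATE dbo.ArticleCode
SET    SegmentB = SegmentB + 1
OUTPUT inserted.SegmentB INTO @next
WHERE  DocId = @DocId;

SELECT CONCAT(@DocId, '.', SegmentB, '.', @Rev) AS NewCode
FROM   @next;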
I'm a SQL developer and my VB skills are about fifteen years out of date, but instead of creating the incremented number yourself in VB, just let SQL generate it with an IDENTITY field. SQL will never allow duplicates, and then you just need to return SCOPE_IDENTITY():
ALTER TABLE dbo.tbArtikel
ADD [ArtikelID] INT IDENTITY(1,1) PRIMARY KEY;
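A small sketch of the resulting insert (the column here is just a placeholder); SCOPE_IDENTITY() returns the value issued to the current session and scope, so concurrent inserts each get their own number:

INSERT INTO dbo.tbArtikel (SomeOtherColumn)   -- placeholder column; ArtikelID is generated by the server
VALUES (@SomeValue);

SELECT SCOPE_IDENTITY() AS NewArtikelID;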
I have two suggestions:
First suggestion: move your code to a stored procedure. This way all your users will execute the same transaction, where you can set your isolation level the way you want. Read this.
Second suggestion: I would create a unique index on your Nummer field. This way, when someone tries to insert a duplicate value, it will raise an error that you can deal with by telling the user to retry the same operation, or by retrying it automatically.
Trying to lock the record or the table for your operation is not advisable; however, you can check this article on CodeProject, where you might find what you are looking for. Make sure that you provide a mechanism for releasing all locks if your program stops in the middle of the transaction.
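A minimal sketch of that second suggestion:

-- Any attempt to insert a duplicate Nummer now fails (error 2601/2627), which the application can catch and retry.
CREATE UNIQUE INDEX UX_tbArtikel_Nummer ON dbo.tbArtikel (Nummer);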
