I have an Excel file with thousands of rows which I need to use to delete/update/insert rows in some tables.
The Excel file provides the following data: provider_id, country_name, locale, property1, property2.
The tables which need to be updated are:
provider_country with columns : provider_country_id, provider_id, country_id, property1, property2 and
provider_country_language with columns : provider_country_language_id, provider_country_id, language_id.
I can also use table country with columns (for joins): country_id, country_name.
And table language with columns: language_id, locale, country_id.
The fields which need to be updated are country_id, language_id, property1 and property2 (from provider_country and provider_country_language).
I have created a temporary table with all the data from the Excel file:
CREATE TABLE #TempProviderCountryLanguage(
[provider_id] int NULL,
[country_name] nvarchar(50) NULL,
[locale] nvarchar(10) NULL,
[property1] int NULL,
[property2] decimal(5,2) NULL
)
INSERT INTO #TempProviderCountryLanguage VALUES
(1,N'Brazil',N'en-br',4,NULL)
INSERT INTO #TempProviderCountryLanguage VALUES
(1,N'Brazil',N'pt-br',4,NULL)
INSERT INTO #TempProviderCountryLanguage VALUES
(1,N'Denmark',N'da-dk',4,12.21)
INSERT INTO #TempProviderCountryLanguage VALUES
(2,N'Denmark',N'da-dk',5,14.21)
......
MERGE [provider_country] AS TARGET
USING (
SELECT tb.provider_id
,c.country_id
,l.language_id
,tb.property1
,tb.property2
FROM #TempProviderCountryLanguage tb
INNER JOIN country c ON c.country_name = tb.country_name
INNER JOIN language l ON l.locale = tb.locale
) AS SOURCE
ON (
TARGET.provider_id = SOURCE.provider_id AND
TARGET.country_id = SOURCE.country_id
)
WHEN MATCHED
THEN
UPDATE
SET TARGET.country_id = SOURCE.country_id,
TARGET.property1 = SOURCE.property1,
TARGET.property2 = SOURCE.property2
WHEN NOT MATCHED BY TARGET
THEN
INSERT (
provider_id
,country_id
,property1
,property2
)
VALUES (
SOURCE.provider_id
,SOURCE.country_id
,SOURCE.property1
,SOURCE.property2
)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
For the provider_country_language I plan to make another merge.
I am trying to update the tables using merge but I have a problem because I cannot make a unique select here (somehow I would need the language_id as well):
ON (
TARGET.provider_id = SOURCE.provider_id AND
TARGET.country_id = SOURCE.country_id
)
And the error is :
The MERGE statement attempted to UPDATE or DELETE the same row more
than once. This happens when a target row matches more than one source
row. A MERGE statement cannot UPDATE/DELETE the same row of the target
table multiple times. Refine the ON clause to ensure a target row
matches at most one source row, or use the GROUP BY clause to group
the source rows.
How can I make this work and make sure all the tables are updated correctly? (Not necessarily using MERGE.)
And from a performance point of view, what would be the best approach? (The INSERT INTO alone will be performed thousands of times...)
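One possible way around the error, sketched under assumptions rather than tested against the real schema: give each MERGE a source that is unique on the columns in its ON clause. For provider_country that means grouping the staging rows by provider and country and leaving language_id out; for provider_country_language, a second MERGE joins back to provider_country. The sketch assumes that property1 and property2 are identical on every locale row of the same provider and country (MAX simply picks one), that the locale join is meant to be l.locale = tb.locale, that provider_country_id is generated automatically, and that provider_country_language has a foreign key to provider_country, which is why orphaned language rows are deleted first.

```sql
-- Step 0 (assumes a foreign key): remove language rows whose parent
-- provider/country pair no longer appears in the staging table.
DELETE pcl
FROM provider_country_language AS pcl
INNER JOIN provider_country AS pc
    ON pc.provider_country_id = pcl.provider_country_id
WHERE NOT EXISTS (
    SELECT 1
    FROM #TempProviderCountryLanguage AS tb
    INNER JOIN country AS c ON c.country_name = tb.country_name
    WHERE tb.provider_id = pc.provider_id
      AND c.country_id = pc.country_id
);

-- Step 1: exactly one source row per (provider_id, country_id).
MERGE provider_country AS TARGET
USING (
    SELECT tb.provider_id,
           c.country_id,
           MAX(tb.property1) AS property1,  -- assumed equal across locales
           MAX(tb.property2) AS property2
    FROM #TempProviderCountryLanguage AS tb
    INNER JOIN country AS c ON c.country_name = tb.country_name
    GROUP BY tb.provider_id, c.country_id
) AS SOURCE
ON  TARGET.provider_id = SOURCE.provider_id
AND TARGET.country_id  = SOURCE.country_id
WHEN MATCHED THEN
    UPDATE SET TARGET.property1 = SOURCE.property1,
               TARGET.property2 = SOURCE.property2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (provider_id, country_id, property1, property2)
    VALUES (SOURCE.provider_id, SOURCE.country_id,
            SOURCE.property1, SOURCE.property2)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

-- Step 2: one source row per (provider_country_id, language_id).
MERGE provider_country_language AS TARGET
USING (
    SELECT DISTINCT pc.provider_country_id, l.language_id
    FROM #TempProviderCountryLanguage AS tb
    INNER JOIN country AS c ON c.country_name = tb.country_name
    INNER JOIN language AS l ON l.locale = tb.locale
    INNER JOIN provider_country AS pc
        ON pc.provider_id = tb.provider_id
       AND pc.country_id  = c.country_id
) AS SOURCE
ON  TARGET.provider_country_id = SOURCE.provider_country_id
AND TARGET.language_id         = SOURCE.language_id
WHEN NOT MATCHED BY TARGET THEN
    INSERT (provider_country_id, language_id)
    VALUES (SOURCE.provider_country_id, SOURCE.language_id)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

On the performance side, a few set-based statements against the staging table are generally cheaper than row-by-row processing. The staging load itself can be shortened by putting several rows in each INSERT ... VALUES statement (SQL Server accepts up to 1000 rows per statement) or by bulk loading the file.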
Related
I need to generate a sequence number in the database without using SEQUENCE or IDENTITY.
There is a table in the database called File where all the files that users send in different areas of the system are stored.
It contains the id (primary key), name, type, folder, number, hash...
CREATE TABLE dbo.[File]
(
FileId uniqueidentifier NOT NULL,
Name nvarchar(30) NOT NULL,
FileTypeId int NOT NULL,
FileFolderId int NOT NULL,
Number int NOT NULL,
Hash nvarchar(50) NOT NULL
...
) ON [PRIMARY]
For each feature there is a table that extends the File table with additional properties; ContractFile is one example.
It shares its id with the File table, adds a few more fields, and holds the id of the Contract table, which creates the relation.
CREATE TABLE dbo.ContractFile
(
FileId uniqueidentifier NOT NULL,
ContractId uniqueidentifier NOT NULL
...
) ON [PRIMARY]
The file name should follow this pattern:
050#H4G5H4G244#001.pdf
050#H4G5H4G244#002.pdf
060#H4G5H4G244#001.pdf
The first 3 digits are a code from the FileType table.
The characters in the middle are the code from the Contract table.
The last 3 digits are the sequence number of the file.
The sequence is therefore counted per FileType and Contract.
So I created an insert trigger on the ContractFile table that gets the highest number for that FileType and Contract, adds 1, and sets the Number field of the File table.
The file name is then updated in the same trigger:
CREATE TRIGGER [dbo].[tgContractFileInsert]
ON [dbo].[ContractFile]
FOR INSERT
AS
BEGIN
SET NOCOUNT ON
-- Assign the next Number per (ContractId, FileTypeId).
-- [File] is bracketed because FILE is a reserved keyword.
UPDATE dbo.[File]
SET Number = COALESCE(
(SELECT MAX(AR.Number)
FROM dbo.ContractFile NOA
INNER JOIN dbo.[File] AR
ON AR.FileId = NOA.FileId
WHERE NOA.ContractId = I.ContractId AND
AR.FileTypeId = T.FileTypeId
),
0) + 1
FROM dbo.[File] T WITH (XLOCK)
INNER JOIN Inserted I
ON I.FileId = T.FileId
WHERE T.Number IS NULL
-- Rebuild the file name from the type code, contract code and number.
UPDATE dbo.[File]
SET Name = dbo.fnFileName(AP.Code, NOB.Code, T.Number, T.Name)
FROM dbo.[File] T
INNER JOIN Inserted I
ON I.FileId = T.FileId
INNER JOIN dbo.FileType AP
ON AP.FileTypeId = T.FileTypeId
INNER JOIN dbo.Contract NOB
ON NOB.ContractId = I.ContractId
END
At first it works, but when a large volume is being inserted, there is a deadlock.
From what I can see, inserting more than one record at a time also ends up giving the records the same number, since the Inserted table contains both records and the +1 does not account for this.
How could I solve this? What is the best way to do it?
I need to avoid the deadlock, keep the sequence correct even when more than one record is inserted at a time, and still get good performance.
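A sketch of how the multi-row part might be handled, assuming the schema above and, as in the original trigger, that new rows arrive with Number still NULL: ROW_NUMBER() ranks the inserted rows inside each (ContractId, FileTypeId) group, and that rank is added to the current maximum, so rows inserted together get consecutive numbers. The alias Seq and the UPDLOCK, HOLDLOCK hints are choices made for this sketch; the hints narrow the window for two sessions reading the same maximum but do not by themselves rule out deadlocks.

```sql
ALTER TRIGGER [dbo].[tgContractFileInsert]
ON [dbo].[ContractFile]
FOR INSERT
AS
BEGIN
    SET NOCOUNT ON;

    WITH NewFiles AS (
        -- Rank the new rows inside each (ContractId, FileTypeId) group: 1, 2, 3, ...
        SELECT F.FileId,
               I.ContractId,
               F.FileTypeId,
               ROW_NUMBER() OVER (PARTITION BY I.ContractId, F.FileTypeId
                                  ORDER BY F.FileId) AS Seq
        FROM Inserted I
        INNER JOIN dbo.[File] F
            ON F.FileId = I.FileId
        WHERE F.Number IS NULL
    )
    UPDATE T
    SET Number = N.Seq + COALESCE(
            -- Highest number already assigned in this group; new rows are
            -- still NULL, so MAX ignores them.
            (SELECT MAX(AR.Number)
             FROM dbo.ContractFile NOA
             INNER JOIN dbo.[File] AR WITH (UPDLOCK, HOLDLOCK)
                 ON AR.FileId = NOA.FileId
             WHERE NOA.ContractId = N.ContractId
               AND AR.FileTypeId = N.FileTypeId),
            0)
    FROM dbo.[File] T
    INNER JOIN NewFiles N
        ON N.FileId = T.FileId;

    -- The Name update from the original trigger would follow here unchanged.
END
```

If concurrent inserts for the same contract are common, serializing them explicitly (for example with sp_getapplock keyed on the contract) or adding a unique constraint on the number together with a retry are other options worth evaluating.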
I currently have a stored procedure that compares my target table (Ticket_Report) to my data source table (New_Tickets).
I am using a MERGE INTO statement to compare the two. When it finds a match between the two tables, it updates the current row in the target table with the corresponding data from the source table. If it doesn't find a match, it inserts the data from the source table into the target table.
MERGE INTO Ticket_REPORT T1
USING #New_Tickets T2
ON T1.TICKET_NO=T2.TICKET_NO
WHEN MATCHED THEN
UPDATE SET
T1.TICKET_NO = T2.TICKET_NO,
T1.ASSIGNED_GROUP = T2.ASSIGNED_GROUP,
T1.ASSIGNEE = T2.ASSIGNEE,
T1.FNAME = T2.FNAME,
T1.LNAME = T2.LNAME
WHEN NOT MATCHED THEN
INSERT VALUES(
T2.TICKET_NO,
T2.ASSIGNED_GROUP,
T2.ASSIGNEE,
T2.FNAME,
T2.LNAME
);
I need to change this so that when a match is found on the ticket number, instead of just updating the row, I A) replace the current row in the target table by deleting it, then B) insert the corresponding row from the source table.
I currently have
MERGE INTO Ticket_REPORT T1
USING #New_Tickets T2
ON T1.Ticket_NO=T2.Ticket_NO
WHEN MATCHED THEN DELETE
-- Now I need to replace what I deleted with the row from the source table
This deletes the row from the target table. Now I want to insert the corresponding row from the source table, but I am having trouble doing multiple things inside the WHEN MATCHED clause. Does anyone know how I can accomplish this?
*Side note: when matched, I could insert the row from the source, but then how would I delete the original?
Strictly a solution for your boss:
First, put the matched records into a temp table.
Next, use a MERGE query to delete the matched records from the target table and insert the unmatched records.
Finally, insert the records from the temp table into the target table.
Try something like this:
select * into #temp
from #New_Tickets T2
where exists(select 1
from Ticket_REPORT T1
where T1.Ticket_NO=T2.Ticket_NO)
MERGE INTO Ticket_REPORT T1
USING #New_Tickets T2
ON T1.TICKET_NO=T2.TICKET_NO
WHEN MATCHED THEN DELETE
WHEN NOT MATCHED THEN
INSERT VALUES(
T2.TICKET_NO,
T2.ASSIGNED_GROUP,
T2.ASSIGNEE,
T2.FNAME,
T2.LNAME
);
insert into Ticket_REPORT (col1,col2,..)
select col1,col2,..
from #temp
Note:
What has to be done is to delete the matched records from the target table and then insert all the records from the source table into the target table.
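The note above can also be sketched without MERGE at all, as a DELETE followed by an INSERT inside one transaction. This is only a sketch: it assumes Ticket_REPORT has exactly the five columns used in the question's MERGE and that #New_Tickets holds at most one row per TICKET_NO.

```sql
BEGIN TRANSACTION;

-- Remove every target row that has a replacement in the source.
DELETE T1
FROM Ticket_REPORT AS T1
WHERE EXISTS (SELECT 1
              FROM #New_Tickets AS T2
              WHERE T2.TICKET_NO = T1.TICKET_NO);

-- Insert all source rows: replacements and brand-new tickets alike.
INSERT INTO Ticket_REPORT (TICKET_NO, ASSIGNED_GROUP, ASSIGNEE, FNAME, LNAME)
SELECT TICKET_NO, ASSIGNED_GROUP, ASSIGNEE, FNAME, LNAME
FROM #New_Tickets;

COMMIT TRANSACTION;
```

Wrapping both statements in a transaction keeps other sessions from seeing the table with the matched rows missing.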
One approach is to use your MERGE statement as a subquery. You can filter its output based on the action (INSERT, UPDATE or DELETE), and the filtered records can then be inserted, updated or deleted by the outer query.
This is a common technique for loading slowly changing dimensions into a data warehouse.
My example uses the following temp tables:
SETUP
/* Create and populate sample tables.
*/
CREATE TABLE #NewTicket
(
Id INT
)
;
CREATE TABLE #TicketReport
(
Id INT
)
;
INSERT INTO #NewTicket
(
Id
)
VALUES
(1),
(2),
(3)
;
INSERT INTO #TicketReport
(
Id
)
VALUES
(3),
(4),
(5)
;
In the example, 1, 2 and 3 appear in #NewTicket; 3, 4 and 5 appear in #TicketReport. 3 is deleted by the MERGE statement and re-inserted by the outer query.
EXAMPLE
/* Filter the results from the sub query
* for deleted records.
* These are then appended in the main
* outer query.
*/
INSERT INTO #TicketReport
(
Id
)
SELECT
Id
FROM
(
/* MERGE statements can be used as a sub query.
* You'll need the OUTPUT clause for this to work.
* The column $action describes what happened to each record.
*/
MERGE
#TicketReport AS t
USING #NewTicket AS s ON s.Id = t.Id
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED BY TARGET THEN
INSERT
(
Id
)
VALUES
(
Id
)
OUTPUT
$action,
s.*
) AS r
WHERE
[$Action] = 'DELETE'
;
/* View the final result.
*/
SELECT
*
FROM
#TicketReport
ORDER BY
Id
;
We have a table where we store all our exceptions (message, stack trace, etc.). The table is getting big and we would like to reduce its size.
There are plenty of repeated stack traces, messages, etc., but enabling compression produces only a modest size reduction (10%). I think much bigger savings would be possible if SQL Server could somehow intern the strings in a per-column hash table.
I could get some of the benefit by normalizing the table and extracting the stack traces into another one, but exception messages, exception types, etc. are also repeated.
Is there a way to enable string interning for a column in SQL Server?
There is no built-in way to do this. You could easily do something like:
SELECT MessageID = IDENTITY(INT, 1, 1), Message
INTO dbo.Messages
FROM dbo.HugeTable GROUP BY Message;
ALTER TABLE dbo.HugeTable ADD MessageID INT;
UPDATE h
SET h.MessageID = m.MessageID
FROM dbo.HugeTable AS h
INNER JOIN dbo.Messages AS m
ON h.Message = m.Message;
ALTER TABLE dbo.HugeTable DROP COLUMN Message;
Now you'll need to do a few things:
Change your logging procedure to perform an upsert to the Messages table
Add proper indexes to the messages table (wasn't sure of Message data type) and PK
Add FK to MessageID column
Rebuild indexes on HugeTable to reclaim space
Do this in a test environment first!
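The first item in the list, the upsert into the Messages table, might look roughly like this. The procedure name dbo.LogException, its parameter and the nvarchar(4000) type are assumptions made for the sketch, and the final insert lists only MessageID because the other columns of HugeTable are not shown in the question.

```sql
-- Hypothetical logging procedure; name, parameter and types are assumptions.
CREATE PROCEDURE dbo.LogException
    @Message nvarchar(4000)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @MessageID int;

    BEGIN TRANSACTION;

    -- UPDLOCK + HOLDLOCK keep another session from inserting the same
    -- message between the lookup and the insert.
    SELECT @MessageID = MessageID
    FROM dbo.Messages WITH (UPDLOCK, HOLDLOCK)
    WHERE Message = @Message;

    IF @MessageID IS NULL
    BEGIN
        INSERT dbo.Messages (Message) VALUES (@Message);
        SET @MessageID = SCOPE_IDENTITY();
    END

    COMMIT TRANSACTION;

    -- Other HugeTable columns omitted; they are not part of the question.
    INSERT dbo.HugeTable (MessageID) VALUES (@MessageID);
END
```

The locking hints are only effective with an index on Message; without one the lookup scans and locks far more than the one value.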
Aaron's posting answers the question of how to add interning to a table, but afterwards you will need to modify your application code and stored procedures to work with the new schema.
...or so you might think. You can actually create a VIEW that returns data matching the old schema, and you can also support INSERT operations on the view too, which are translated into child operations on the Messages and HugeTable tables. For readability I'll use the names InternedStrings and ExceptionLogs for the tables.
So if the old table was this:
CREATE TABLE ExceptionLogs (
LogId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
Message nvarchar(1024) NOT NULL,
ExceptionType nvarchar(512) NOT NULL,
StackTrace nvarchar(max) NOT NULL -- nvarchar(n) only allows n up to 4000
)
And the new tables are:
CREATE TABLE InternedStrings (
StringId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
Value nvarchar(max) NOT NULL
)
CREATE TABLE ExceptionLogs2 ( -- note the new name
LogId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
Message int NOT NULL,
ExceptionType int NOT NULL,
StackTrace int NOT NULL
)
Add an index to InternedStrings to make the value lookups faster:
CREATE UNIQUE NONCLUSTERED INDEX IX_U_InternedStrings_Value ON InternedStrings ( Value ASC )
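One caveat: SQL Server does not accept an nvarchar(max) column as an index key, so the index above is rejected as long as Value is declared nvarchar(max). A possible workaround, sketched here under assumptions, is to index a persisted hash of the value instead. The column name ValueHash, the index name and the choice of SHA2_256 are illustrative, and hashing nvarchar(max) input longer than 8000 bytes requires SQL Server 2016 or later.

```sql
-- Persisted computed column holding a fixed-size hash of the string.
ALTER TABLE InternedStrings
    ADD ValueHash AS CAST(HASHBYTES('SHA2_256', Value) AS binary(32)) PERSISTED;

-- Unique index on the 32-byte hash instead of the unbounded string.
CREATE UNIQUE NONCLUSTERED INDEX IX_U_InternedStrings_ValueHash
    ON InternedStrings ( ValueHash ASC );

-- Lookups seek on the hash first, then confirm the actual value.
DECLARE @value nvarchar(max) = N'System.NullReferenceException';
SELECT StringId
FROM InternedStrings
WHERE ValueHash = CAST(HASHBYTES('SHA2_256', @value) AS binary(32))
  AND Value = @value;
```

Note that the hash is computed over the exact bytes, so two strings that differ only in case hash differently even under a case-insensitive collation.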
Then you would also have a VIEW:
CREATE VIEW ExceptionLogs AS
SELECT
LogId,
MessageStrings .Value AS Message,
ExceptionTypeStrings.Value AS ExceptionType,
StackTraceStrings .Value AS StackTrace
FROM
ExceptionLogs2
INNER JOIN InternedStrings AS MessageStrings ON
MessageStrings.StringId = ExceptionLogs2.Message
INNER JOIN InternedStrings AS ExceptionTypeStrings ON
ExceptionTypeStrings.StringId = ExceptionLogs2.ExceptionType
INNER JOIN InternedStrings AS StackTraceStrings ON
StackTraceStrings.StringId = ExceptionLogs2.StackTrace
And to handle INSERT operations from unmodified clients:
CREATE TRIGGER ExceptionLogsInsertHandler
ON ExceptionLogs INSTEAD OF INSERT AS
BEGIN
SET NOCOUNT ON
-- Single-row inserts only: copy the three strings out of the inserted pseudo-table.
DECLARE @message nvarchar(max), @exceptionType nvarchar(max), @stackTrace nvarchar(max)
SELECT @message = Message, @exceptionType = ExceptionType, @stackTrace = StackTrace FROM inserted
-- Look up each string; add it to InternedStrings if it is not there yet.
DECLARE @messageId int = ( SELECT StringId FROM InternedStrings WHERE Value = @message )
IF @messageId IS NULL
BEGIN
INSERT INTO InternedStrings ( Value ) VALUES ( @message )
SET @messageId = SCOPE_IDENTITY()
END
DECLARE @exceptionTypeId int = ( SELECT StringId FROM InternedStrings WHERE Value = @exceptionType )
IF @exceptionTypeId IS NULL
BEGIN
INSERT INTO InternedStrings ( Value ) VALUES ( @exceptionType )
SET @exceptionTypeId = SCOPE_IDENTITY()
END
DECLARE @stackTraceId int = ( SELECT StringId FROM InternedStrings WHERE Value = @stackTrace )
IF @stackTraceId IS NULL
BEGIN
INSERT INTO InternedStrings ( Value ) VALUES ( @stackTrace )
SET @stackTraceId = SCOPE_IDENTITY()
END
INSERT INTO ExceptionLogs2 ( Message, ExceptionType, StackTrace )
VALUES ( @messageId, @exceptionTypeId, @stackTraceId )
END
Note this TRIGGER can be improved: it only supports single-row insertions and is not entirely concurrency-safe. Because existing data is never mutated, the only risk is that two sessions try to add the same string to the InternedStrings table at the same time, in which case the UNIQUE index makes one of the inserts fail. There are different possible ways to handle this, such as using a TRANSACTION and adding the HOLDLOCK and UPDLOCK hints to the lookup queries.
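A set-based variant that also copes with multi-row inserts could look like the sketch below. It assumes the InternedStrings and ExceptionLogs2 tables and the view from above; the UPDLOCK, HOLDLOCK hints are one of the possible ways to narrow the race on InternedStrings and have not been load-tested.

```sql
ALTER TRIGGER ExceptionLogsInsertHandler
ON ExceptionLogs INSTEAD OF INSERT AS
BEGIN
    SET NOCOUNT ON;

    -- Intern every string that is not stored yet. UNION removes duplicates
    -- across the three columns and across the inserted rows.
    INSERT INTO InternedStrings ( Value )
    SELECT v.Value
    FROM ( SELECT Message AS Value FROM inserted
           UNION
           SELECT ExceptionType FROM inserted
           UNION
           SELECT StackTrace FROM inserted ) AS v
    WHERE NOT EXISTS ( SELECT 1
                       FROM InternedStrings AS s WITH (UPDLOCK, HOLDLOCK)
                       WHERE s.Value = v.Value );

    -- Store one log row per inserted row, replacing strings by their ids.
    INSERT INTO ExceptionLogs2 ( Message, ExceptionType, StackTrace )
    SELECT m.StringId, t.StringId, st.StringId
    FROM inserted AS i
    INNER JOIN InternedStrings AS m  ON m.Value  = i.Message
    INNER JOIN InternedStrings AS t  ON t.Value  = i.ExceptionType
    INNER JOIN InternedStrings AS st ON st.Value = i.StackTrace;
END
```

Because the trigger body runs inside the transaction of the triggering INSERT, the locks taken by the hints are held until that statement commits.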
I'm using MERGE in my query and doing an INSERT in the WHEN NOT MATCHED THEN clause, but then I would like to get the identity of the inserted row and do another INSERT into a different table. The query for now is:
ALTER PROCEDURE [dbo].[BulkMergeOffers]
@data ImportDataType READONLY
AS
SET NOCOUNT ON;
DECLARE @cid int = 0
MERGE dbo.oferta AS target
USING @data AS source
ON (target.nr_oferty = source.nr_oferty)
WHEN NOT MATCHED THEN
INSERT (nr_oferty,rynek,typ_transakcji, typ_nieruchomosci,cena,powierzchnia, rok_budowy, wojewodztwo, miasto, dzielnica, ulica, opis, wspolrzedne, film, zrodlo, KontaktStore, data, forma_wlasnosci, stan_techniczny, liczba_pokoi, liczba_peter, pietro, material, kuchnia, pow_dzialki, typ_dzialki, woda,gaz, prad,sila, przeznaczenie,lokal_dane)
VALUES (source.nr_oferty,source.rynek,source.typ_transakcji, source.typ_nieruchomosci,source.cena,source.powierzchnia, source.rok_budowy, source.wojewodztwo, miasto, source.dzielnica, source.ulica, source.opis, source.wspolrzedne, source.film, source.zrodlo, source.KontaktStore, source.data, source.forma_wlasnosci, source.stan_techniczny, source.liczba_pokoi, source.liczba_peter, source.pietro, source.material, source.kuchnia, source.pow_dzialki, source.typ_dzialki, source.woda,source.gaz, source.prad,source.sila, source.przeznaczenie,source.lokal_dane);
So as you can see, I need to insert some values into the target table based on the source data, then take the inserted identity and insert it into another table, also based on some source data. Something like this, just after the first insert:
SET @cid = SCOPE_IDENTITY();
if source.photo is not null
begin
insert into dbo.photos(offerID, [file]) values (@cid, source.photo);
end
But I can't put it together: I no longer have access to the source, and the IF statement shows this error:
"the multi-part identifier
source.photo can not be bound"
even though the column is there. Just for clarity, ImportDataType is a table-valued parameter.
Please help.
If you don't need the WHEN MATCHED part of the MERGE statement in your query, there's no real reason to use MERGE. You could use INSERT with an outer join or a NOT EXISTS predicate.
In either case, you can use the OUTPUT clause to retrieve the inserted identity value and pass it on to a second query.
I've extended your example:
<stored procedure header - unchanged>
--declare a table variable to hold the inserted values data
DECLARE @newData TABLE
(nr_oferty int
,newid int
) -- I'm guessing the datatype for both columns
MERGE dbo.oferta AS target
USING @data AS source
ON (target.nr_oferty = source.nr_oferty)
WHEN NOT MATCHED THEN
INSERT (nr_oferty,rynek,typ_transakcji, typ_nieruchomosci,cena,powierzchnia, rok_budowy, wojewodztwo, miasto, dzielnica, ulica, opis, wspolrzedne, film, zrodlo, KontaktStore, data, forma_wlasnosci, stan_techniczny, liczba_pokoi, liczba_peter, pietro, material, kuchnia, pow_dzialki, typ_dzialki, woda,gaz, prad,sila, przeznaczenie,lokal_dane)
VALUES (source.nr_oferty,source.rynek,source.typ_transakcji, source.typ_nieruchomosci,source.cena,source.powierzchnia, source.rok_budowy, source.wojewodztwo, miasto, source.dzielnica, source.ulica, source.opis, source.wspolrzedne, source.film, source.zrodlo, source.KontaktStore, source.data, source.forma_wlasnosci, source.stan_techniczny, source.liczba_pokoi, source.liczba_peter, source.pietro, source.material, source.kuchnia, source.pow_dzialki, source.typ_dzialki, source.woda,source.gaz, source.prad,source.sila, source.przeznaczenie,source.lokal_dane)
OUTPUT inserted.nr_oferty, inserted.<tableId> INTO @newData;
-- replace <tableId> with the name of the identity column in dbo.oferta
insert into dbo.photos(offerID, [file])
SELECT nd.newid, pt.photo
FROM @data AS pt
JOIN @newData AS nd
ON nd.nr_oferty = pt.nr_oferty
WHERE pt.photo IS NOT NULL
I have a stored procedure in which I have to insert 3 strings into 3 different tables, one string per table.
In each table, a unique primary key (row id) is generated when the value is inserted.
The primary keys of the first two tables are foreign keys in the third table, so they must not be null.
In my SP, inserting the values and generating the row ids (primary keys) works.
Now I have to pass the two primary keys as values (foreign keys) into the third table, but they come back null.
Here is my SP:
(1st Table)
INSERT INTO [kk_admin].[FA_Master](FA_Name,FA_CSession,FA_MSession) Values
(@FA_Name,@Session_Id,@Session_Id)
SELECT @FA_ID=FA_ID FROM [kk_admin].[FA_Master] where FA_Name=@FA_Name
(2nd Table)
INSERT INTO [kk_admin].[Dept_Master](Dept_Name,Dept_CSession,Dept_MSession) Values
(@Dept_Name,@Session_Id,@Session_Id)
SELECT @Dept_id=Dept_id from [kk_admin].[Dept_Master] where Dept_Name=@Dept_Name
(3rd Table)
INSERT INTO [kk_admin].[Category_Master] (FA_Id,Dept_Id,Category_Name,Category_CSession,Category_MSession) Values (@FA_ID,@Dept_Id,@Category_Name,@Session_Id,@Session_Id)
Hope everyone understood what I have explained.
Please help me, I am running out of time.
Thank You in Advance,
Brightsy
You can use an OUTPUT clause (assuming you're using SQL Server 2005 or later) to capture the primary keys of the two rows you're inserting with the first two queries. You can capture the values into temporary tables. [I previously wrote that you could use a regular variable, but that's not supported.] Example code:
CREATE TABLE #FA_Master_ID ( ID int );
CREATE TABLE #Dept_Master_ID ( ID int );
-- FA_ID and Dept_id are the key column names used in the question.
INSERT kk_admin.FA_Master ( FA_Name, FA_CSession, FA_MSession )
OUTPUT INSERTED.FA_ID INTO #FA_Master_ID
VALUES ( @FA_Name, @Session_Id, @Session_Id );
INSERT kk_admin.Dept_Master ( Dept_Name, Dept_CSession, Dept_MSession )
OUTPUT INSERTED.Dept_id INTO #Dept_Master_ID
VALUES ( @Dept_Name, @Session_Id, @Session_Id );
INSERT kk_admin.Category_Master ( FA_Id, Dept_Id, Category_Name, Category_CSession, Category_MSession )
SELECT FA_Id = ( SELECT TOP 1 ID FROM #FA_Master_ID ),
Dept_Id = ( SELECT TOP 1 ID FROM #Dept_Master_ID ),
Category_Name = @Category_Name,
Category_CSession = @Session_Id,
Category_MSession = @Session_Id;
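If the two keys are IDENTITY columns (an assumption; the question only says they are generated on insert), a shorter variant is to read SCOPE_IDENTITY() into plain variables right after each single-row insert. The fragment below is meant to sit inside the stored procedure, where @FA_Name, @Dept_Name, @Category_Name and @Session_Id are already declared as parameters.

```sql
DECLARE @FA_ID int, @Dept_Id int;

INSERT kk_admin.FA_Master ( FA_Name, FA_CSession, FA_MSession )
VALUES ( @FA_Name, @Session_Id, @Session_Id );
-- Identity value generated by the insert above, in this scope only.
SET @FA_ID = SCOPE_IDENTITY();

INSERT kk_admin.Dept_Master ( Dept_Name, Dept_CSession, Dept_MSession )
VALUES ( @Dept_Name, @Session_Id, @Session_Id );
SET @Dept_Id = SCOPE_IDENTITY();

INSERT kk_admin.Category_Master
    ( FA_Id, Dept_Id, Category_Name, Category_CSession, Category_MSession )
VALUES
    ( @FA_ID, @Dept_Id, @Category_Name, @Session_Id, @Session_Id );
```

SCOPE_IDENTITY() must be read immediately after the insert it belongs to, before any other statement in the same scope generates a new identity value.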