I need to import XML data into SQL Server 2012. The import works correctly, but I want to avoid importing the same data twice. I already tried WHERE NOT EXISTS, but it didn't work.
The import:
INSERT INTO dbo.tXMLImport(cText)
SELECT cast(CONVERT(XML,x.BulkColumn,2) AS varchar(max))
FROM OPENROWSET (BULK 'D:\XML\Data.xml', SINGLE_BLOB) AS x
XML file content:
<?xml version="1.0" encoding="UTF-8"?>
<tOrder>
<cName>Name1</cName>
<cID>100</cID>
</tOrder>
Now I need to check whether the cID value 100 from the XML file already exists in the cOrderNumber column of dbo.tOrder:
cOrderNumber
1 100
2 101
3 102
The following extension does not work:
WHERE NOT EXISTS(SELECT *
FROM dbo.tOrder
WHERE x.value('(/tOrder/cID)') = dbo.tOrder.CorderNumber)
If the value already exists, nothing should be imported. Can someone help me with this?
Thanks in advance.
I'm not sure if I really get this... If the same cOrderNumber exists already, wouldn't you try to update the existing row? Something like you'd do with MERGE?
But something like this might be what you are looking for:
WHERE NOT EXISTS(SELECT 1 FROM dbo.tOrder
WHERE x.exist(N'/tOrder[cID/text()=sql:column("cOrderNumber")]')=1)
(Untested air code)
This checks whether there is any record in tOrder for which the XML column x contains a node <tOrder><cID> whose value equals the current cOrderNumber.
T-SQL extends XQuery with the sql:column() function, which lets you use a column value of the current row within the query. There's sql:variable() too.
The XML method .exist() checks whether the XML contains at least one match for the given condition and returns 1 or 0.
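A minimal stand-alone sketch of these two pieces, assuming an untyped XML variable and an INT variable (the names @x and @id are made up for illustration, and like the snippet above this is not tested against your data):

```sql
-- Hypothetical example: @x and @id are illustrative names only.
DECLARE @x XML = N'<tOrder><cName>Name1</cName><cID>100</cID></tOrder>';
DECLARE @id INT = 100;

-- exist() returns 1 if the XQuery finds at least one node, otherwise 0.
-- sql:variable() passes the T-SQL variable into the XQuery expression.
SELECT @x.exist(N'/tOrder[cID/text() = sql:variable("@id")]') AS HasOrder;
```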
UPDATE
After reading your question once again, I'm not sure if I got this correctly... Please check the following. If this doesn't help, please use my code to set up a stand-alone sample to reproduce your issue:
A dummy table with some orders
DECLARE @YourTable TABLE(cOrderNumber INT, OrderName VARCHAR(100));
INSERT INTO @YourTable VALUES
(100,'Order 100')
,(200,'Order 200')
,(300,'Order 300')
--Try to insert an XML with the existing OrderNumber=100
DECLARE @xml100 XML=
'<tOrder>
<cName>Name1</cName>
<cID>100</cID>
</tOrder>';
INSERT INTO @YourTable(cOrderNumber,OrderName)
SELECT @xml100.value('(/tOrder/cID/text())[1]','int')
,@xml100.value('(/tOrder/cName/text())[1]','varchar(100)')
WHERE NOT EXISTS(SELECT 1 FROM @YourTable AS t2
WHERE t2.cOrderNumber=@xml100.value('(/tOrder/cID/text())[1]','int'));
--Same code as above, but the order number does not exist yet
DECLARE @xml101 XML=
'<tOrder>
<cName>Name1</cName>
<cID>101</cID>
</tOrder>';
INSERT INTO @YourTable(cOrderNumber,OrderName)
SELECT @xml101.value('(/tOrder/cID/text())[1]','int')
,@xml101.value('(/tOrder/cName/text())[1]','varchar(100)')
WHERE NOT EXISTS(SELECT 1 FROM @YourTable AS t2
WHERE t2.cOrderNumber=@xml101.value('(/tOrder/cID/text())[1]','int'));
--check the result
SELECT *
FROM @YourTable;
nr name
-------------
100 Order 100
200 Order 200
300 Order 300
101 Name1
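Applied to the file import from the question, this might look roughly like the following. It is only a sketch under assumptions: it reuses the file path, dbo.tXMLImport and dbo.tOrder from the question, assumes cOrderNumber is an INT, and the variable name @xml is made up here.

```sql
-- Load the file once into an XML variable (path taken from the question).
DECLARE @xml XML =
(
    SELECT CONVERT(XML, x.BulkColumn, 2)
    FROM OPENROWSET(BULK 'D:\XML\Data.xml', SINGLE_BLOB) AS x
);

-- Insert only if the cID from the file is not yet present in dbo.tOrder.
INSERT INTO dbo.tXMLImport(cText)
SELECT CAST(@xml AS VARCHAR(MAX))
WHERE NOT EXISTS(SELECT 1
                 FROM dbo.tOrder AS o
                 WHERE o.cOrderNumber = @xml.value('(/tOrder/cID/text())[1]','int'));
```

Reading the file into a variable first keeps the XML available to both the SELECT list and the NOT EXISTS check.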
Related
I have data in a table column. I select it with SELECT DISTINCT and also wrap the column in LTRIM(RTRIM(col_name)), but I still get duplicate values.
How can we identify why it is happening and how we can avoid it?
I tried the RTRIM, LTRIM, and UPPER functions, but they didn't help.
Query:
select distinct LTRIM(RTRIM(serverstatus))
from SQLInventory
Output:
Development
Staging
Test
Pre-Production
UNKNOWN
NULL
Need to be decommissioned
Production
Pre-Production
Decommissioned
Non-Production
Unsupported Edition
Looks like there's a Unicode character in there somewhere. I initially copied and pasted the values out as varchar and ran the following:
SELECT DISTINCT serverstatus
FROM (VALUES('Development'),
('Staging'),
('Test'),
('Pre-Production'),
('UNKNOWN'),
('NULL'),
('Need to be decommissioned'),
('Production'),
(''),
('Pre-Production'),
('Decommissioned'),
('Non-Production'),
('Unsupported Edition'))V(serverstatus);
This, interestingly, returned the values below:
Development
Staging
Test
Pre-Production
UNKNOWN
NULL
Need to be decommissioned
Production
Pre-Produc?tion
Decommissioned
Non-Production
Unsupported Edition
Note that one of the values is Pre-Produc?tion, meaning that there is a Unicode character between the c and the t.
So, let's find out what it is:
SELECT 'Pre-Production', N'Pre-Production',
UNICODE(SUBSTRING(N'Pre-Production',11,1));
The UNICODE function returns 8203, which is a zero-width space. I assume you want to remove these, so you can update your data by doing:
UPDATE SQLInventory
SET serverstatus = REPLACE(serverstatus, NCHAR(8203), N'');
Now your first query should work as you expect.
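Before running the update, it can be useful to see which rows are affected. A minimal sketch, assuming serverstatus is an NVARCHAR column; the binary collation is an assumption added here so that the comparison cannot ignore the invisible character:

```sql
-- List the rows that contain a zero-width space (code point 8203).
-- The BIN2 collation forces a code-point comparison (assumption, see above).
SELECT serverstatus
FROM SQLInventory
WHERE serverstatus COLLATE Latin1_General_BIN2 LIKE N'%' + NCHAR(8203) + N'%';
```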
(I also suggest you might want a lookup table for your statuses, with a foreign key, so that this can't happen again.)
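A rough sketch of such a lookup table, to be run after the data has been cleaned. The table name ServerStatus, the column name StatusName and the constraint name are hypothetical, and NVARCHAR(50) is an assumption: the lookup column has to match the actual type and length of serverstatus.

```sql
-- Hypothetical lookup table; adjust NVARCHAR(50) to the real type of serverstatus.
CREATE TABLE ServerStatus
(
    StatusName NVARCHAR(50) NOT NULL PRIMARY KEY
);

-- Seed it with the (cleaned) values currently in use.
INSERT INTO ServerStatus (StatusName)
SELECT DISTINCT serverstatus
FROM SQLInventory
WHERE serverstatus IS NOT NULL;

-- From now on only values present in the lookup table are accepted.
ALTER TABLE SQLInventory
ADD CONSTRAINT FK_SQLInventory_ServerStatus
    FOREIGN KEY (serverstatus) REFERENCES ServerStatus (StatusName);
```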
I deal with this type of thing all the time. For stuff like this, NGrams8K, PatReplace8K (both user-defined functions that have to be installed first, not built-ins) and PATINDEX are your best friends.
Putting what you posted in a table variable we can analyze the problem:
DECLARE @table TABLE (txtID INT IDENTITY, txt NVARCHAR(100));
INSERT @table (txt)
VALUES ('Development'),('Staging'),('Test'),('Pre-Production'),('UNKNOWN'),(NULL),
('Need to be decommissioned'),('Production'),(''),('Pre-Production'),('Decommissioned'),
('Non-Production'),('Unsupported Edition');
This query will identify items with characters other than A-Z, spaces and hyphens:
SELECT t.txtID, t.txt
FROM @table AS t
WHERE PATINDEX('%[^a-zA-Z -]%',t.txt) > 0;
This returns:
txtID txt
----------- -------------------------------------------
10 Pre-Production
To identify the bad character we can use NGrams8k like this:
SELECT t.txtID, t.txt, ng.position, ng.token -- ,UNICODE(ng.token)
FROM @table AS t
CROSS APPLY dbo.NGrams8K(t.txt,1) AS ng
WHERE PATINDEX('%[^a-zA-Z -]%',ng.token)>0;
Which returns:
txtID txt position token
------ ----------------- -------------------- ---------
10 Pre-Production 11 ?
PatReplace8K makes cleaning up stuff like this quite easy and quick. First note this query:
SELECT OldString = t.txt, p.NewString
FROM @table AS t
CROSS APPLY dbo.patReplace8K(t.txt,'%[^a-zA-Z -]%','') AS p
WHERE PATINDEX('%[^a-zA-Z -]%',t.txt) > 0;
Which returns this on my system:
OldString NewString
------------------ ----------------
Pre-Produc?tion Pre-Production
To fix the problem you can use patreplace8K like this:
UPDATE t
SET txt = p.newString
FROM @table AS t
CROSS APPLY dbo.patReplace8K(t.txt,'%[^a-zA-Z -]%','') AS p
WHERE PATINDEX('%[^a-zA-Z -]%',t.txt) > 0;
CREATE TABLE SportsEvent
(ID INT, Name NVARCHAR(20), Results XML);
GO
DECLARE @Results XML=
'<Athletics>
<Event ID="001" Name="100m">
<Gold>John Doe</Gold>
<Silver>Harry Smith</Silver>
<Bronze>Kenneth Brown</Bronze>
</Event>
<Event ID="002" Name="High Jump">
<Gold>Sarah Jones</Gold>
<Silver>Janice Johnson</Silver>
<Bronze>Alicia Armstrong</Bronze>
</Event>
</Athletics>'
INSERT INTO SportsEvent
VALUES(1, 'AthleticsDay', @Results);
SELECT * FROM SportsEvent;
If I want to pull out an element based on the event ID, no problem:
SELECT Results.query('(/Athletics/Event[@ID="001"]/Gold)')
FROM SportsEvent
WHERE ID = 1
I can do the same with a relative reference:
SELECT Results.query('(Athletics/Event)[1]')
FROM SportsEvent
WHERE ID = 1
But what if I want to pull the event Name based on either a relative or an absolute reference? These two attempts:
SELECT Results.query('(Athletics/Event[@Name])[@ID="001"]')
FROM SportsEvent
WHERE ID = 1
SELECT Results.query('(Athletics/Event[@Name])[1]')
FROM SportsEvent
WHERE ID = 1
...both bring back ALL the data for that event.
I tried using the value method:
SELECT Results.value('(/Athletics/Event/@Name)[1]','VARCHAR(20)')
FROM SportsEvent
WHERE ID = 1
...but this only works for a relative reference, i.e. in this case the first set of results in the XML.
What if I want to specify an event ID and return just the event name (either as an XML fragment or as data/value)?
OK, so for some (many?) this will seem blindingly obvious, but I'll post my answer in case it stops someone else from spending a frustrating couple of hours going round and round in circles (as I have just done)...
SELECT Results.value('(/Athletics/Event[@ID="001"]/@Name)[1]','VARCHAR(20)')
FROM SportsEvent
WHERE ID = 1
SELECT Results.value('(/Athletics/Event[@Name="100m"]/@ID)[1]','VARCHAR(20)')
FROM SportsEvent
WHERE ID = 1
SELECT Results.value('(/Athletics/Event[@Name="High Jump"]/@ID)[1]','VARCHAR(20)')
FROM SportsEvent
WHERE ID = 1
SELECT Results.value('(/Athletics/Event[@ID="002"]/@Name)[1]','VARCHAR(20)')
FROM SportsEvent
WHERE ID = 1
From what I can ascertain, you can't use the query method with a path that ends in an attribute to return just the value of that attribute: the query would have to return the attribute outside of its element, which is not allowed.
You can, however, use the query method as follows. The '[1]' (singleton) at the end of the path isn't obligatory here (it is with the value method). What you get is the whole fragment that contains the specified attribute.
SELECT Results.query('(/Athletics/Event[@ID="002"][@Name])[1]')
FROM SportsEvent
WHERE ID = 1
SELECT Results.query('(/Athletics/Event[@Name="100m"][@Name])[1]')
FROM SportsEvent
WHERE ID = 1
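For what it's worth, one variation that may get just the attribute value out of the query method is to wrap the path in the XQuery data() function, which atomizes the attribute instead of returning the attribute node itself. This is a sketch added as an aside, not one of the attempts above:

```sql
-- data() returns the attribute's atomic value rather than the attribute node.
SELECT Results.query('data(/Athletics/Event[@ID="001"]/@Name)')
FROM SportsEvent
WHERE ID = 1
```

The result is still of type XML (a text fragment), so the value method remains the better fit when a plain VARCHAR is needed.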
Hope this helps someone.
Any comments, corrections or additions are welcomed.
Thanks.
I have a quick question. I've written a stored procedure to interrogate a table, check if there are any records already based on 2 key fields and if not to add a record.
So currently my code looks like this.
select @counter=count(*) from f03 where someid NOT IN (select someid from ReportedEventGPQ)
I would like to know how to change this counter so that it checks not just someid in the ReportedEventGPQ table but also another field called TimepointID.
That way, when the stored procedure runs, it checks whether the combination of user ID (someid) and TimepointID is already in ReportedEventGPQ. A user can enter data in f03 at up to 10 timepoints, and each timepoint gets its own row in the f03 table with a TimepointID.
Any help with this would be much appreciated.
Try using WHERE NOT EXISTS:
select @counter=count(*) from f03
where not exists(select * from ReportedEventGPQ where someid =f03.someid and TimepointID=f03.TimepointID)
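If the goal is to add the missing rows rather than just count them, the same predicate can drive the insert directly. A sketch, assuming ReportedEventGPQ only needs the columns someid and TimepointID filled (its real column list is not shown in the question):

```sql
-- Insert every (someid, TimepointID) pair from f03 that is not yet reported.
INSERT INTO ReportedEventGPQ (someid, TimepointID)
SELECT f.someid, f.TimepointID
FROM f03 AS f
WHERE NOT EXISTS (SELECT 1
                  FROM ReportedEventGPQ AS r
                  WHERE r.someid = f.someid
                    AND r.TimepointID = f.TimepointID);
```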
You can use this:
select @counter=count(*) from f03 where 1> (select COUNT(*) from ReportedEventGPQ where id=someid and other_column=value)
I need to know if there is any way to have a SEQUENCE or something like it, as we have in Oracle. The idea is to get one number and then use it as a key to save some records in a table. Each time we need to save data in that table, we first get the next number from the sequence and then use that same number to save some records. It is not an IDENTITY column.
For example:
[ID] [SEQUENCE ID] [Code] [Value]
1 1 A 232
2 1 B 454
3 1 C 565
The next time someone needs to add records, the next SEQUENCE ID should be 2. Is there any way to do this? The sequence could be a GUID as well, as far as I am concerned.
As Guillelon points out, the best way to do this in SQL Server is with an identity column.
You can simply define a column as being identity. When a new row is inserted, the identity is automatically incremented.
The difference is that the identity is updated on every row, not just some rows. To be honest, I think this is a much better approach. Your example suggests that you are storing both an entity and its details in the same table.
The SequenceId should be the primary identity key in another table. This value can then be used for insertion into this table.
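A minimal sketch of that layout, assuming hypothetical table names (SequenceHeader, Detail) and column types guessed from the example rows in the question:

```sql
-- Hypothetical schema: all names and types are illustrative only.
CREATE TABLE SequenceHeader
(
    SequenceId INT IDENTITY(1,1) PRIMARY KEY,
    CreatedAt  DATETIME NOT NULL DEFAULT GETDATE()
);

CREATE TABLE Detail
(
    [ID]       INT IDENTITY(1,1) PRIMARY KEY,
    SequenceId INT NOT NULL REFERENCES SequenceHeader (SequenceId),
    [Code]     CHAR(1) NOT NULL,
    [Value]    INT NOT NULL
);

-- One header row per batch yields the shared number.
DECLARE @SequenceId INT;
INSERT INTO SequenceHeader DEFAULT VALUES;
SET @SequenceId = SCOPE_IDENTITY();

-- All detail rows of the batch reuse that number.
INSERT INTO Detail (SequenceId, [Code], [Value])
VALUES (@SequenceId, 'A', 232),
       (@SequenceId, 'B', 454),
       (@SequenceId, 'C', 565);
```

SCOPE_IDENTITY() is used rather than @@IDENTITY so that identity values generated by triggers in other scopes are not picked up.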
This can be done in multiple ways. The following are the ones I can think of:
Creating a trigger and thereby computing the next possible value
Adding a computed column along with a function that retrieves the next value of the sequence
Here is an article that presents various solutions
One possible way is to do something like this:
-- Example 1
DECLARE @Var INT
SET @Var = (SELECT MAX(ID) + 1 FROM tbl);
INSERT INTO tbl VALUES (@Var,'Record 1')
INSERT INTO tbl VALUES (@Var,'Record 2')
INSERT INTO tbl VALUES (@Var,'Record 3')
-- Example 2
INSERT INTO #temp VALUES (1,2)
INSERT INTO #temp VALUES (1,2)
INSERT INTO ActualTable (col1, col2, sequence)
SELECT temp.*, (SELECT MAX(ID) + 1 FROM ActualTable)
FROM #temp temp
-- Example 3
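DECLARE @var int
DECLARE @out TABLE (sequence int) -- OUTPUT cannot assign to a scalar variable, so capture into a table variable
INSERT INTO ActualTable (col1, col2, sequence)
OUTPUT inserted.sequence INTO @out
SELECT 1, 2, (SELECT MAX(ID) + 1 FROM ActualTable)
SELECT @var = sequence FROM @out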
The first two examples rely on batch updating. But based on your comment, I have added example 3 which is a single input initially. You can then use the sequence that was inserted to insert the rest of the records. If you have never used an output, please reply in comments and I will expand further.
I would isolate all of the above inside a transaction.
If you were using SQL Server 2012, you could use a SEQUENCE object as shown here.
Forgive any syntax errors; I don't have SSMS installed.
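A rough sketch of the SEQUENCE route, assuming SQL Server 2012 or later and the two-column tbl from Example 1; the sequence name dbo.BatchSequence is made up for illustration:

```sql
-- Hypothetical sequence name; requires SQL Server 2012 or later.
CREATE SEQUENCE dbo.BatchSequence AS INT START WITH 1 INCREMENT BY 1;
GO

-- Fetch one number and reuse it for every row of the batch.
DECLARE @SequenceId INT = NEXT VALUE FOR dbo.BatchSequence;

INSERT INTO tbl VALUES (@SequenceId, 'Record 1');
INSERT INTO tbl VALUES (@SequenceId, 'Record 2');
INSERT INTO tbl VALUES (@SequenceId, 'Record 3');
```

Unlike the MAX(ID) + 1 approach, two sessions calling NEXT VALUE FOR at the same time do not receive the same number.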
I am going to start off by displaying my table structures:
Numbers Table:
Id AccountId MobileNr FirstName LastName AttributeKeyValues Labels
--- ---------- ----------- ---------- ----------- ------------------- -------
490 2000046 2XXXXXXXXXX Eon du Plessis <attrs /> <lbls>
<lbl>Meep11e</lbl>
<lbl>43210</lbl>
<lbl>1234</lbl>
<lbl>Label 5</lbl>
<lbl>Label 6 (edit)</lbl>
</lbls>
-----------------------------------------------------------------------------
Labels Table:
Id AccountId Label RGB LastAssigned LastMessage
----------- ----------- ----------------- ------ ----------------------- ------------
91 2000046 Meep11e 000000 2013-04-15 13:42:06.660 NULL
-------------------------------------------------------------------------------------
This is the issue
Every number can have multiple labels assigned to it, and these are stored as untyped XML. In Numbers.Labels (//lbls/lbl/text()) you will notice that the text matches the text in Labels.Label.
This is the stored procedure that updates the Numbers.Labels column. It is run by an external application I am busy writing. The XML structure is generated by this external application, depending on which rows are read from the Labels table.
CREATE PROCEDURE [dbo].[UpdateLabels]
@Id INT,
@Labels XML
AS
BEGIN
UPDATE
Numbers
SET
Labels = @Labels
WHERE
Id = @Id
UPDATE
Labels
SET
LastAssigned = GETDATE()
WHERE
label
IN
(SELECT @Labels.value('(//lbls/lbl)[1]', 'VARCHAR(100)'))
END
The issue arises if two people log on to the same account, each with their own session. Suppose User 1 is about to run this update stored procedure, but just before User 1 presses the button, User 2 deletes one of the labels in the Labels table that was included in User 1's update. The XML will then include the deleted label, which can be problematic when I query the numbers again (the RGB column is queried when I display the number, since the label is marked up in jQuery to have a hexadecimal-colored background).
My thought was to check whether the rows included in the built-up XML exist before committing the update. How can I achieve this in T-SQL? Or is there a better way you can recommend?
EDIT
Our table structure is intentionally denormalized, there are no foreign key constraints.
EDIT 2
Ok, it would seem my question is a bit hard, or that I brained too hard and got the dumb :). I will try and simplify.
In the Labels column in Numbers, every <lbl> element must exist within the Labels table.
When updating the Labels column in Numbers, if a label is found in the XML which does not exist in the Labels table, an error must be raised.
The XML is pre-formed in my application, meaning that every time the update is run, the old XML in the Labels column in Numbers will be REPLACED with the new XML generated by my application.
This is where I need to check whether there are label nodes in my XML which no longer exist within the Labels table.
I would check to see if there are rows in your xml that are not in the real table (in the database) before trying anything. And if you find something, exit out early.
Here is a Northwind example.
Use Northwind
GO
DECLARE @data XML;
SET @data =
N'
<root>
<Order>
<OrderId>10248</OrderId>
<CustomerId>VINET</CustomerId>
</Order>
<Order>
<OrderId>-9999</OrderId>
<CustomerId>CHOPS</CustomerId>
</Order>
</root>';
/* select * from dbo.Orders */
declare @Holder table ( OrderId int, CustomerId nchar(5) )
Insert Into @Holder (OrderId , CustomerId )
SELECT
T.myAlias.value('(./OrderId)[1]', 'int') AS OrderId
, T.myAlias.value('(./CustomerId)[1]', 'nchar(5)') AS CustomerId
FROM
@data.nodes('//root/Order') AS T(myAlias);
if exists (select null from @Holder h where not exists (select null from dbo.Orders realTable where realTable.OrderID = h.OrderId ))
BEGIN
print 'you have rows in your xml that are not in the real table. raise an error here'
END
Else
BEGIN
print 'Using the data'
Update dbo.Orders Set CustomerID = h.CustomerId
From dbo.Orders o , @Holder h
Where o.OrderID = h.OrderId
END
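Applied to the Labels case from the question, the same idea might look roughly like this. It is a sketch under assumptions: it reuses the Numbers and Labels tables and the <lbls>/<lbl> structure from the question, the two DECLAREs stand in for the procedure parameters, and the error message text is made up.

```sql
-- Stand-ins for the stored procedure parameters (sample values from the question).
DECLARE @Id INT = 490;
DECLARE @Labels XML = N'<lbls><lbl>Meep11e</lbl><lbl>1234</lbl></lbls>';

-- Shred the labels and look for any that are missing from the Labels table.
IF EXISTS (SELECT 1
           FROM (SELECT T.lbl.value('(./text())[1]', 'VARCHAR(100)') AS Label
                 FROM @Labels.nodes('/lbls/lbl') AS T(lbl)) AS x
           WHERE NOT EXISTS (SELECT 1 FROM Labels AS l WHERE l.Label = x.Label))
BEGIN
    -- Exit early instead of storing XML that references a deleted label.
    RAISERROR('The XML contains a label that no longer exists in the Labels table.', 16, 1);
    RETURN;
END

UPDATE Numbers
SET Labels = @Labels
WHERE Id = @Id;
```

To close the gap between the check and the update when two sessions act at once, the check and the UPDATE would need to run inside one transaction with suitable locking; that part is left out of the sketch.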