Looping through SQL OpenXML to get Field Names - sql-server

I have a stored procedure that will receive an xml document I will be processing. Depending on the different processes that will call this procedure, I will be inserting data into a temp table from this openxml prepared document that will be from one group or another.
Ex: It could be:
FROM OPENXML(@idoc, '/data/BeneInfoGroup/BeneInfo', 1)
OR it could be:
FROM OPENXML(@idoc, '/data/PersonalInfoGroup/PersonalInfo', 1)
Only one group will come through. Depending on this group - I want to create an SQL string (to be EXECUTED) that inserts into a temp table (already created) and contains field names from the openxml doc.
So my code (in the 2 examples above) could look like:
strSQL = N'INSERT INTO #tmpTbl (BeneID, BeneSSN, BeneDate)
SELECT BeneID, BeneSSN, BeneDate
FROM OPENXML(@idoc, ''/data/BeneInfoGroup/BeneInfo'', 1)
WITH (BeneID nvarchar(50) ''BeneID'', BeneSSN nvarchar(50) ''BeneSSN'', BeneDate nvarchar(50) ''BeneDate'')'
Or it could look like:
strSQL = N'INSERT INTO #tmpTbl (PersonID, PersonSSN, PersonDate)
SELECT PersonID, PersonSSN, PersonDate
FROM OPENXML(@idoc, ''/data/PersonInfoGroup/PersonInfo'', 1)
WITH (PersonID nvarchar(50) ''PersonID'', PersonSSN nvarchar(50) ''PersonSSN'', PersonDate nvarchar(50) ''PersonDate'')'
The 2 examples above are minimal. It won't be "Person" vs "Bene" headers. The fields could be named anything in either group. So I'd like to loop through the openxml document and have the fieldnames go into my string where appropriate. I can take care of the data type details, etc. I just need to pull the field names from the openxml doc I've prepared.
I do NOT need an alternative method using something other than OPENXML, please.
The basic question is - how do I pull the field name from the document in a way that's not hard coded depending on the xml doc I receive?
I don't want to create a table of the field names.
I just want to go through a loop and reference every fieldname that exists for that particular group in my openxml.
No need to worry about #tmpTbl - I have created this with all possible fields that might be inserted into it.
Thanks in advance! Daniel
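One possible way to get the field names while staying with OPENXML is to call it without a WITH clause, which returns the edge table (columns such as id, parentid, nodetype, localname). The sketch below assumes the fields are child elements of each row element (as the ColPattern values in the question suggest) and that they are flat, with no nested elements. The sample XML, the cursor name and the @cols variable are made up for illustration.

```sql
-- Hypothetical sample document; in the real procedure @idoc is already prepared.
DECLARE @idoc int;
DECLARE @xml nvarchar(max) =
    N'<data><BeneInfoGroup><BeneInfo>' +
    N'<BeneID>1</BeneID><BeneSSN>123</BeneSSN><BeneDate>2020-01-01</BeneDate>' +
    N'</BeneInfo></BeneInfoGroup></data>';
EXEC sp_xml_preparedocument @idoc OUTPUT, @xml;

DECLARE @name sysname;
DECLARE @cols nvarchar(max) = N'';

-- No WITH clause: OPENXML returns the edge table.
-- nodetype = 1 keeps element nodes and skips the #text nodes beneath them.
-- The wildcard path matches the field elements of whichever group arrived.
DECLARE field_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT localname
    FROM OPENXML(@idoc, N'/data/*/*/*')
    WHERE nodetype = 1;

OPEN field_cursor;
FETCH NEXT FROM field_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Append each field name to a comma-separated column list.
    SET @cols = @cols
              + CASE WHEN @cols = N'' THEN N'' ELSE N', ' END
              + QUOTENAME(@name);
    FETCH NEXT FROM field_cursor INTO @name;
END
CLOSE field_cursor;
DEALLOCATE field_cursor;

PRINT @cols;  -- a bracketed, comma-separated list of the field names found

EXEC sp_xml_removedocument @idoc;
```

The same loop body could also build the WITH (...) column definitions. If the fields arrive as attributes rather than child elements, the filter would presumably need nodetype = 2 and a row pattern one level higher.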

Related

Update on key violation in Stored Procedure using BULK INSERT & Trigger

I have a stored procedure that performs a bulk insert of a large number of DNS log entries. I wish to summarise this raw data in a new table for analysis. The new table takes a given log entry for FQDN and Record Type and holds one record only with a hitcount.
Source table might include 100 rows of:
FQDN, Type
www.microsoft.com,A
Destination table would have:
FQDN, Type, HitCount
www.microsoft.com, A, 100
The SP establishes a unique ID made up of [FQDN] +'|'+ [Type], which is then used as the primary key in the destination table.
My plan was to have the SP fire a trigger that did an UPDATE...IF @@ROWCOUNT=0...INSERT. However, that of course failed because the trigger receives all the [inserted] rows as a single set, so it always throws a key violation error.
I'm having trouble getting my head around a solution and need some fresh eyes and better skills to take a look. The bulk insert SP works just fine and the raw data is exactly as desired. However trying to come up with a suitable method to create the summary data is beyond my present skills/mindset.
I have several tens of TB of data to process, so I don't see the summary as something we could do dynamically with a SELECT COUNT, which is why I started down the trigger route.
The relevant code in the SP is driven by a cursor consisting of a list of compressed log files needing to be decompressed and bulk-inserted, and is as follows:
-- Bulk insert to a view because bulk insert cannot populate the UID field
SET @strDynamicSQL = 'BULK INSERT [DNS_Raw_Logs].[dbo].[vwtblRawQueryLogData] FROM ''' + @strTarFolder + '\' + @strLogFileName + ''' WITH (FIRSTROW = 1, FIELDTERMINATOR = '' '', ROWTERMINATOR = ''0x0a'', ERRORFILE = ''' + @strTarFolder + '\' + @strErrorFile + ''', TABLOCK)'
--PRINT @strDynamicSQL
EXEC (@strDynamicSQL)
-- Update [UID] field after the bulk insert
UPDATE [DNS_Raw_Logs].[dbo].[tblRawQueryLogData]
SET [UID] = [FQDN] + '|' + [Type]
FROM [tblRawQueryLogData]
WHERE [UID] IS NULL
I know that the UPDATE...IF @@ROWCOUNT=0...INSERT solution is wrong because it assumes that the input data is a single row. I'd appreciate help on a way to do this.
Thank you
First, at that scale, make sure you understand columnstore tables. They are very highly compressed and fast to scan.
Then write a query that reads from the raw table and returns the summarized data, for example as a view:
create or alter view DnsSummary
as
select FQDN, Type, count(*) HitCount
from tblRawQueryLogData
group by FQDN, Type
Then if querying that view directly is too expensive, write a stored procedure that loads a table after each bulk insert. Or make the view into an indexed view.
Thanks for the answer David, obvious when someone else looks at it!
I ran the view-based solution with 14M records (about 4 hours' worth) and it took 40 seconds to return, so I think I'll modify the SP to drop and re-create the summary table each time it runs the bulk insert.
The source table also includes a timestamp for each entry. I would like to grab the earliest and latest times associated with each UID and add that to the summary.
My current summary query (courtesy of David) looks like this:
SELECT [UID], [FQDN], [Type], COUNT([UID]) AS [HitCount]
FROM [DNS_Raw_Logs].[dbo].tblRawQueryLogData
GROUP BY [UID], [FQDN], [Type]
ORDER BY COUNT([UID]) DESC
And returns:
UID, FQDN, Type, HitCount
www.microsoft.com|A, www.microsoft.com, A, 100
If I wanted to grab the earliest and latest times, then I think I'm looking at nesting 3 queries to grab the earliest time (SELECT TOP N...ORDER BY... ASC), the latest time (SELECT TOP N...ORDER BY... DESC) and the hitcount. Is there a more efficient way of doing this, before I try and wrap my head around this route?
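One option that avoids nesting is to add MIN and MAX to the same grouped query, since several aggregates can be computed in one pass over each group. The sketch below uses a table variable standing in for tblRawQueryLogData. The timestamp column name LogTime, the output aliases and the sample rows are assumptions, because the real column name isn't given above.

```sql
-- Stand-in for the raw log table; LogTime is a hypothetical column name.
DECLARE @tblRawQueryLogData TABLE (
    [UID]   varchar(300),
    FQDN    varchar(255),
    [Type]  varchar(10),
    LogTime datetime2(0)
);

INSERT INTO @tblRawQueryLogData ([UID], FQDN, [Type], LogTime)
VALUES ('www.microsoft.com|A', 'www.microsoft.com', 'A', '2020-01-01T10:00:00'),
       ('www.microsoft.com|A', 'www.microsoft.com', 'A', '2020-01-01T10:05:00'),
       ('www.microsoft.com|A', 'www.microsoft.com', 'A', '2020-01-01T09:55:00');

-- Count, earliest and latest time per UID in a single GROUP BY.
SELECT [UID], FQDN, [Type],
       COUNT(*)     AS HitCount,
       MIN(LogTime) AS FirstSeen,
       MAX(LogTime) AS LastSeen
FROM @tblRawQueryLogData
GROUP BY [UID], FQDN, [Type]
ORDER BY HitCount DESC;
```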

Inserting a distinct entry into db.table results into implicit conversion error in SQL Server

I have a requirement where I need to insert new entries found in a record into its master table and map the ID of the identifier to the main table.
For instance, consider the example below:
-- Insert into Category Master if not exists
INSERT INTO tblCategoryMaster (Category,
CreatedBy,
CreatedDate,
UpdatedBy,
UpdatedDate)
SELECT DISTINCT
(category),
SERVERPROPERTY('MACHINENAME'),
GETDATE(),
SERVERPROPERTY('MACHINENAME'),
GETDATE()
FROM tblTempDataStaging stg
WHERE category IS NOT NULL
AND NOT EXISTS (SELECT 1 FROM tblCategoryMaster ctg WHERE ctg.Category = stg.category);
After executing the SELECT query we get a list of distinct entries, and every time a new entry is added to the staging table, the master table should be populated accordingly.
The server is not allowing me to insert; it gives me this error:
Msg 257, Level 16, State 3, Line 39
Implicit conversion from data type sql_variant to nvarchar(max) is not allowed. Use the CONVERT function to run this query.
The data type of the relevant staging table fields is NVARCHAR(MAX), except for the date fields, which are datetime.
I tried using CONVERT, but I'm unsure how to use it with DISTINCT in the picture.
Can you suggest how I can resolve this issue?
The error is telling you the problem: SERVERPROPERTY('MACHINENAME') returns the datatype sql_variant:
SELECT system_type_name
FROM sys.dm_exec_describe_first_result_set(N'SELECT SERVERPROPERTY(''MACHINENAME'') AS MachineName',NULL,NULL);
The underlying data type is nvarchar (though it certainly won't be 2GB of storage for the name of a machine!), as can be seen here:
SELECT SQL_VARIANT_PROPERTY(SERVERPROPERTY('MACHINENAME'),'Basetype')
You need to explicitly convert the value. For example:
CONVERT(nvarchar(256),SERVERPROPERTY('MACHINENAME'))
I do suggest you change the data type of your column CreatedBy, and I assume UpdatedBy, from nvarchar(MAX) to something like an nvarchar(256); you don't need 2GB of characters (about 1 Billion) to store that information.
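To show where the CONVERT sits when DISTINCT is involved, here is a minimal sketch of the full statement. Table variables stand in for tblTempDataStaging and tblCategoryMaster, and their column definitions and sample rows are assumptions made only so the snippet runs on its own. DISTINCT applies to the whole select list, so the conversion goes directly on each SERVERPROPERTY call.

```sql
-- Hypothetical stand-ins for the staging and master tables.
DECLARE @tblTempDataStaging TABLE (category nvarchar(max));
DECLARE @tblCategoryMaster TABLE (
    Category    nvarchar(max),
    CreatedBy   nvarchar(256),
    CreatedDate datetime,
    UpdatedBy   nvarchar(256),
    UpdatedDate datetime
);

INSERT INTO @tblTempDataStaging (category)
VALUES (N'Books'), (N'Books'), (N'Toys'), (NULL);

-- Same INSERT as in the question, with the sql_variant values converted.
INSERT INTO @tblCategoryMaster (Category, CreatedBy, CreatedDate, UpdatedBy, UpdatedDate)
SELECT DISTINCT
       stg.category,
       CONVERT(nvarchar(256), SERVERPROPERTY('MachineName')),
       GETDATE(),
       CONVERT(nvarchar(256), SERVERPROPERTY('MachineName')),
       GETDATE()
FROM @tblTempDataStaging AS stg
WHERE stg.category IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM @tblCategoryMaster AS ctg
                  WHERE ctg.Category = stg.category);

SELECT Category, CreatedBy FROM @tblCategoryMaster;
```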

T-SQL joining external variables into tables

I've come from application development and been thrust into web development, and I'm getting my head around asymmetrical data requests/returns and how to handle them.
I need to make a number of SQL requests and thought the best way to manage which ones are returned would be to insert a UUID or something similar into the returned SQL table.
Also, in general I'm pretty basic with SQL, but I want to add an external value into my returned table, where @ext would be the external data added in from the original request.
SELECT *
FROM
@ext AS uuid,
dbo.Orders
WHERE ....
expected return table
uuid: 12234
customer: jack
orderNo: 774
postAddy: 123 Albert St
...
The error I'm always getting is: Must declare the table variable "@ext".
Is this the right approach or am I just doing something dumb?
The error message you are getting is telling you that you haven't declared the table variable @ext. This is because you've used a variable name (with the @ prefix) in the FROM clause, where it's expecting a table or other table-like object (i.e. table, view, table variable, TVF, etc.).
The @ext variable appears to be a scalar (single-valued) variable, so it isn't recognised in the FROM clause. You should try something like this instead:
SELECT
-- scalar values and column names / aliases go here
@ext AS uuid, *
FROM
-- only tables, views, table variables, TVF's etc go here
dbo.Orders
WHERE ....
Note that if your query returns multiple rows, they will all have the same value for uuid. This may or may not be desirable, and there may be better ways to achieve what you want, in terms of managing the data that is returned from multiple queries, but this is best posed in another question once you have a working example.
Make sure you know what @ext is for you and how to properly reference it.
If it's a scalar value, you can use it in expressions:
DECLARE @ext INT = 5
SELECT
@ext AS ScalarValue,
@ext + 10 AS ScalarOperation,
@ext + S.SomeColumn AS ScalarOperationWithTableColumn
FROM
SomeTable AS S
If it's a table variable, you can reference it as a table (as in your example):
DECLARE @ext TABLE (
FirstValue INT,
SecondValue VARCHAR(100))
INSERT INTO @ext (
FirstValue,
SecondValue)
VALUES
(10, 'SomeText'),
(20, 'AnotherText')
SELECT
E.FirstValue,
E.SecondValue
FROM
@ext AS E
/*
LEFT JOIN ....
WHERE
....
*/

SSRS multiple single cells

I am trying to figure out best way to add multiple fields to a SSRS report.
Report has some plots and tablix which are populated from queries but now I have been asked to add a table with ~20 values. The problem is that I need to have them in a specific order/layout (that I cannot obtain by sorting) and they might need to have a description added above which will be static text (not from the DB).
I would like to avoid situation where I keep 20 copy of the same query which returns single cell where the only difference would be in:
WHERE myTable.partID = xxxx
Any chance I could keep a single query which takes that string like a parameter which I could specify somehow via expression or by any other means?
Not a classical SSRS parameter as I need a different one for each cell...
Or will I need to create 20 queries to fetch all those single values and then put them as separate textfields on the report?
When I've done this in the past, I build a single query that gets all the data I need with some kind of key.
For example I might have a list of captions and values, one per row, that I need to display as part of a report page. The dataset query might look something like ...
-- Key is a reserved word in T-SQL, so the column name is bracketed
DECLARE @t TABLE([Key] varchar(20), Amount float, Caption varchar(100))
INSERT INTO @t
SELECT 'TotalSales', SUM(Amount), NULL AS Caption FROM myTable WHERE CountryID = @CountryID
UNION
SELECT 'Currency', NULL, CurrencyCode FROM myCurrencyTable WHERE CountryID = @CountryID
UNION
SELECT 'Population', Population, NULL FROM myPopualtionTable WHERE CountryID = @CountryID
SELECT * FROM @t
The resulting dataset would look like this.
Key Amount Caption
'TotalSales' 12345 NULL
'Currency' NULL 'GBP'
'Population' 62.3 NULL
Let's say we call this dataset dsStuff; then in each cell/textbox the expression would simply be something like:
=LOOKUP("Population", Fields!Key.Value, Fields!Amount.Value, "dsStuff")
or
=LOOKUP("Currency", Fields!Key.Value, Fields!Caption.Value, "dsStuff")

Sql Server - Capture an XML query and save it to a table?

I would like to run a (extensive) query that produces one line of XML. This XML represents about 12 tables worth of relational data. When I "delete" a row from the top level table I would like to "capture" the state of the data at that moment (in XML), save it to an archive table, then delete all the child table data and finally mark the top level table/row as "isDeleted=1". We have other data hanging off of the parent table that CANNOT be deleted and cannot lose the relation to the top table.
I have most of the XML worked out - see my post here on that can of worms.
Now, how can I capture that into an archive table?
CREATE TABLE CampaignArchive(CampaignID int, XmlData XML)
INSERT INTO CampaignArchive(CampaignID, XmlData)
SELECT CampaignID, (really long list of columns) FROM (all my tables) FOR XML PATH ...
This just does not work. :)
Any suggestions?
TIA
You need a subquery to wrap all that XML creation into one single scalar that goes into column XmlData. Again, use TYPE to create a scalar of type XML not a string:
INSERT INTO CampaignArchive(CampaignID, XmlData)
SELECT CampaignID, (
select (really long list of columns)
FROM (all my tables) FOR XML PATH ..., type);
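As a concrete illustration of the shape described above, the following sketch archives one row per campaign, with the XML built by a correlated subquery that ends in FOR XML PATH(...), TYPE. The table variables, the Name column, the 'Campaign' element name and the sample rows are all hypothetical; the real query would join the dozen or so tables mentioned in the question.

```sql
-- Hypothetical stand-ins for the top-level table and the archive table.
DECLARE @Campaign TABLE (CampaignID int, Name nvarchar(50));
DECLARE @CampaignArchive TABLE (CampaignID int, XmlData xml);

INSERT INTO @Campaign (CampaignID, Name)
VALUES (1, N'Spring'), (2, N'Summer');

-- The subquery returns a single scalar of type xml per outer row
-- because of the TYPE directive.
INSERT INTO @CampaignArchive (CampaignID, XmlData)
SELECT c.CampaignID,
       (SELECT c2.CampaignID, c2.Name
        FROM @Campaign AS c2
        WHERE c2.CampaignID = c.CampaignID
        FOR XML PATH('Campaign'), TYPE)
FROM @Campaign AS c;

SELECT CampaignID, XmlData FROM @CampaignArchive;
```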
