XML containing an ampersand breaks SQL XML Schema validation - sql-server

I have a stored proc that receives XML as an input. An XML Schema Collection within SQL Server validates the XML received.
One of the items referenced in the XML string is a product code, and this can contain an ampersand. I'm receiving an error on the product code when I try to extract the data. This is what I'm currently using:
DECLARE @cursor CURSOR;
SET @cursor = CURSOR FAST_FORWARD READ_ONLY FOR
SELECT Quotenbr = COALESCE(items.value('(@quotenbr)[1]', 'varchar(25)'), 'Skipped')
, QuotelineItem = COALESCE(items.value('(@quotelineitem)[1]', 'varchar(25)'), 'Skipped')
, Branch = COALESCE(items.value('(@branch)[1]', 'varchar(25)'), 'Skipped')
, Partnbr = items.value('(@partnbr)[1]','varchar(25)') -- Part Number can contain & character, which breaks this - need alternate
, Qty = items.value('(@qty)[1]','decimal(9,2)')
, Unit = items.value('(@unit)[1]','nvarchar(25)')
FROM @XMLString.nodes('/request/items/item') AS XTbl(items);
OPEN @cursor;
FETCH NEXT FROM @cursor
INTO @quotenbr
, @quotelineitem
, @Branch
, @Partnbr
, @qty
, @unit;
The code fails to handle the Partnbr input correctly. Is there something I can do for partnbr so it doesn't break when the string contains an ampersand?
I'm assuming the schema validation element is swapping the & for '&amp;' - is there a way to fix this?
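For what it's worth, a correctly escaped ampersand should round-trip through value() without any special handling - the XQuery methods decode entities automatically. A minimal standalone check (made-up XML, not the real feed):
DECLARE @x XML = N'<request><items><item partnbr="A&amp;B" qty="2" unit="EA"/></items></request>';
-- value() decodes &amp; back to a literal & on extraction
SELECT Partnbr = items.value('(@partnbr)[1]', 'varchar(25)')
FROM @x.nodes('/request/items/item') AS XTbl(items);
-- Returns: A&B
So if the feed contains a raw, unescaped &, the failure presumably happens when the document is parsed and validated, before value() ever runs, and the fix is for the sender to escape it as &amp;.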

Related

Read a flat file containing an array in Azure Data Factory

I need to read a pipe-delimited file where we have an array repeating 30 times. I need to access these array elements, change their sequence, and send them in the output file.
E.g.
Tanya|1|Pen|2|Book|3|Eraser
Raj|11|Eraser|22|Bottle
In the above example, the first field is the customer name. After that we have an array of items ordered - order ID and item name.
Could you please suggest how to read these array elements individually so they can be processed further?
You can use a Copy activity in an Azure Data Factory pipeline in which the source is a DelimitedText dataset and the sink is a JSON dataset. If your source and destination files are located in Azure Blob Storage, create a Linked Service to connect the files to Azure Data Factory.
In the source dataset properties, set the Column Delimiter to Pipe (|).
In the sink dataset (JSON type) settings, just specify the output location path where the file will be saved after conversion.
In the copy activity's sink tab, select the File pattern 'Array of objects', then trigger the pipeline.
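For orientation, the Copy activity in the pipeline JSON might look roughly like this - the dataset names are hypothetical placeholders and the exact property layout can vary by ADF version, so treat it as a sketch rather than a working pipeline:
{
    "name": "CopyPipeDelimitedToJson",
    "type": "Copy",
    "inputs": [ { "referenceName": "PipeDelimitedSource", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "JsonOutput", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": {
            "type": "JsonSink",
            "formatSettings": { "type": "JsonWriteSettings", "filePattern": "arrayOfObjects" }
        }
    }
}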
If you can control the output file format then you're better off using two (or three) delimiters: one for the person and another for the order items. If you can get a third, use it to split the order line from the item name.
Assuming the data format:
Person Name | 1:M [Order]
And each Order is
order line | item name
You can simply ingest the entire row into a single nvarchar(max) column and then use SQL to break out the data you need.
The following is one such example.
declare @tbl table (d nvarchar(max));
insert into @tbl values('Tanya|1|Pen|2|Book|3|Eraser'),('Raj|11|Eraser|22|Bottle');
declare @base table (id int, person varchar(100), total_orders int, raw_orders varchar(max));
declare @output table (id int, person varchar(100), item_id int, item varchar(100));
with a as
(
select
CHARINDEX('|',d) idx
,d
,ROW_NUMBER() over (order by d) as id /*Or newid()*/
from @tbl
), b as
(
select
id
,SUBSTRING(d,0,idx) person
,SUBSTRING(d,idx+1,LEN(d)-idx+1) order_array
from a
), c as
(
select id, person, order_array
,(select count(1) from string_split(order_array,'|')) / 2 orders
from b
)
insert into @base (id,person,total_orders,raw_orders)
select id,person,orders,order_array from c
declare @total_persons int = (select count(1) from @base);
declare @person_enu int = 1;
while @person_enu <= @total_persons
BEGIN
declare @total_orders int = (select total_orders from @base where id = @person_enu);
declare @raw_orders nvarchar(max) = (select raw_orders from @base where id = @person_enu);
declare @order_enu int = 1;
declare @i int = 1;
print CONCAT('Person ', @person_enu, '. Total orders: ', @total_orders);
while @order_enu <= @total_orders
begin
--declare @id int = (select value from string_split(@raw_orders,'|',1) where ordinal = @i);
--declare @val varchar(100) = (select value from string_split(@raw_orders,'|',1) where ordinal = @i+1);
--print concat('Will process order ',@order_enu);
--print concat('ID:',@i, ' Value:', @i+1)
--print concat('ID:',@id, ' Value:', @val)
INSERT INTO @output (id,person,item_id,item)
select b.id,b.person,n.value [item_id], v.value [item] from @base b
cross apply string_split(b.raw_orders,'|',1) n
cross apply string_split(b.raw_orders,'|',1) v
where b.id = @person_enu and n.ordinal = @i and v.ordinal = @i+1;
set @order_enu += 1;
set @i += 2;
end
set @person_enu += 1;
END
select * from @output;

In T-SQL is there a way to say that if this parameter is equal to "XYZ" then use "XYZ" but if not return all?

I'm currently creating stored procedures on the SQL Server using linked servers, OPENQUERY statements, and temporary tables. My goal is to have one source that will be consumed by multiple third-party consumers so that everyone is viewing the same data.
The problem is that some instances need a specific WHERE clause while others don't. Is there a way to declare the parameter so that the WHERE clause is effectively nullified when it's blank but applied when it's populated? I've tried making the parameter equal to "%", "%?%", etc., but nothing seems to work.
I would also like to point out that this is an Oracle database that I'm querying from a Microsoft SQL Server. My code is below, and the parameter @WINS is what I'm trying to nullify if left blank:
DECLARE @query_start DATETIME;
DECLARE @query_end DATETIME;
DECLARE @query_wins NVARCHAR(MAX);
SET @query_start = '7/1/2020';
SET @query_end = '7/15/2020';
SET @query_wins = 'F6666';
DECLARE @START_DATE NVARCHAR(MAX) = CONVERT(VARCHAR,@query_start,105)
DECLARE @END_DATE NVARCHAR(MAX) = CONVERT(VARCHAR,@query_end,105)
DECLARE @WINS NVARCHAR(MAX) = @query_wins
DECLARE @SqlCommand NVARCHAR(MAX) =
'
SELECT
*
FROM
OPENQUERY
(
PDB,
'' SELECT
T1.WELL_NUM
, D2.WELL_NAME
, T1.DAILY_RDG_DATE
, T1.GROSS_OIL_BBLS
, T1.GROSS_GAS_MCF
, T1.GROSS_WTR_BBLS
, T1.TUBING_PRESS
, T1.CASING_PRESS
, T1.GAS_LINE_PRESS
, T1.CHOKE,T1.CHOKE_SIZE AS CHOKE2
, T2.GAS_PROD_FORECAST
, T2.OIL_PROD_FORECAST
, T2.WTR_PROD_FORECAST
FROM
(PDB.T003031 T1
INNER JOIN WINS.DW_ANORM_ROWL#WINP_DBLINK.WORLD D2
ON T1.WELL_NUM = D2.WINS_NO
AND T1.CMPL_NUM = D2.CMPL_NO)
LEFT JOIN PDB.T000057 T2 ON T1.WELL_NUM = T2.WELL_NUM
AND T1.CMPL_NUM = T2.CMPL_NUM
AND T2.FORECAST_DATE=T1.DAILY_RDG_DATE
WHERE
D2.HOLE_DIRECTION = ''''HORIZONTAL''''
AND D2.ASSET_GROUP = ''''Powder River Basin''''
AND T1.DAILY_RDG_DATE > TO_DATE(''''' + CONVERT(VARCHAR,@START_DATE,105) + ''''',''''DD-MM-YYYY'''') - 2
AND T1.DAILY_RDG_DATE < TO_DATE(''''' + CONVERT(VARCHAR,@END_DATE,105) + ''''',''''DD-MM-YYYY'''')
AND D2.OPER_NON_OPER = ''''OPERATED''''
AND T1.WELL_NUM = ''''' + @WINS + '''''
''
)
'
PRINT @SqlCommand
DROP TABLE IF EXISTS #temp
CREATE TABLE #temp (
WELL_NUM NVARCHAR(MAX)
, WELL_NAME NVARCHAR(MAX)
, DAILY_RDG_DATE DATETIME
, GROSS_OIL_BBLS FLOAT
, GROSS_GAS_MCF FLOAT
, GROSS_WTR_BBLS FLOAT
, TUBING_PRESS FLOAT
, CASING_PRESS FLOAT
, GAS_LINE_PRESS FLOAT
, CHOKE1 FLOAT
, CHOKE2 FLOAT
, GAS_PROD_FORECAST FLOAT
, OIL_PROD_FORECAST FLOAT
, WTR_PROD_FORECAST FLOAT
)
PRINT @SqlCommand
INSERT INTO #temp
EXEC sp_executesql @SqlCommand
SELECT
WELL_NUM
, WELL_NAME
, DAILY_RDG_DATE
, ISNULL(GROSS_OIL_BBLS,0) AS 'GROSS_OIL_BBLS'
, ISNULL(GROSS_GAS_MCF,0) AS 'GROSS_GAS_MCF'
, ISNULL(GROSS_WTR_BBLS,0) AS 'GROSS_WTR_BBLS'
, ISNULL(TUBING_PRESS,0) AS 'TUBING_PRESS'
, ISNULL(CASING_PRESS,0) AS 'CASING_PRESS'
, ISNULL(GAS_LINE_PRESS,0) AS 'GAS_LINE_PRESS'
, ISNULL(CHOKE1,0) AS 'CHOKE1'
, ISNULL(CHOKE2,0) AS 'CHOKE2'
, ISNULL(GAS_PROD_FORECAST,0) AS 'GAS_PROD_FORECAST'
, ISNULL(OIL_PROD_FORECAST,0) AS 'OIL_PROD_FORECAST'
, ISNULL(WTR_PROD_FORECAST,0) AS 'WTR_PROD_FORECAST'
FROM #temp
ORDER BY
DAILY_RDG_DATE ASC
DROP TABLE IF EXISTS #temp
You can set up an optional parameter like this in SQL Server:
WHERE ISNULL(@parameter, column_name) = column_name
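As a quick illustration (a self-contained sketch with throwaway inline data, not the asker's tables) - note that this variant also skips rows where the column itself is NULL, which may or may not matter here:
DECLARE @parameter NVARCHAR(10) = NULL; -- NULL means "no filter"
SELECT WELL_NUM
FROM (VALUES ('F6666'), ('F7777')) AS T1(WELL_NUM)
WHERE ISNULL(@parameter, WELL_NUM) = WELL_NUM;
-- @parameter = NULL returns both rows; @parameter = 'F6666' returns one.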
OK, if I abstract your problem properly, you're trying to apply different filtering logic based on the value of a parameter.
You might consider using a cursor for this: iterate over the table and apply if/else logic as needed.
Do you really need to compose the SQL command as a string? Maybe you're only doing that to share the code here, hopefully?
Anyway, it sounds like this would work for you:
WHERE (T1.WELL_NUM = @WINS OR @WINS IS NULL) AND
...other conditions...
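A minimal, runnable sketch of that pattern (again with inline throwaway data rather than the OPENQUERY source):
DECLARE @WINS NVARCHAR(10) = NULL; -- NULL means "return all wells"
SELECT WELL_NUM
FROM (VALUES ('F6666'), ('F7777')) AS T1(WELL_NUM)
WHERE (T1.WELL_NUM = @WINS OR @WINS IS NULL);
-- @WINS = NULL returns both rows; @WINS = 'F6666' returns only the first.
Since the original query splices @WINS into a dynamic SQL string, the equivalent move there is to append the AND T1.WELL_NUM = ... line to the string only when @WINS is non-blank.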

How to copy files from one location to another when the file does not exist in the target

I am trying to create a script that will go through the MANY files and their locations, check whether each file exists, and if it does not exist, copy it from the location where it does.
All of the files exist in at least one location. For example, document ID 8675309 may exist in repository 5, but it needs to also exist in repository 12, while document 9035768 exists in 12 but also needs to exist in 5. So far I have been writing to a temporary table, collecting all of the document IDs, their locations, and whether they exist or not. Now I need to fix the data by copying files to the correct locations. As there are over 250,000 of them, manually copying isn't feasible. I am also not allowed to download any 3rd-party tools for this task. Below is what I have so far, which pulls the correct data. This is also the first time I have used a cursor, so if there are any suggestions, please let me know!
BEGIN TRANSACTION
DECLARE @document_id INT
DECLARE @repository_id INT
DECLARE @root_access varchar(50)
DECLARE @location varchar(50)
DECLARE @expected_location varchar(100)
DECLARE @VerificationCursor CURSOR
SET
@VerificationCursor = CURSOR FAST_FORWARD FOR
(SELECT object_id, repository_id, location
FROM m3_object_repository WHERE creatortime >= '2018-01-01' AND creatortime <= '2018-12-31')
OPEN @VerificationCursor
FETCH NEXT FROM @VerificationCursor INTO @document_id, @repository_id, @location
print 'CREATING TEMPORARY TABLE'
CREATE TABLE #Verification_Files
(
document_id INT,
repository_id INT,
file_exists VARCHAR(50),
expected_location VARCHAR(100)
)
print 'BEGINNING TASKS'
print 'TESTING IF DOCUMENTS EXIST, THIS MAY TAKE A WHILE:'
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO #Verification_Files (document_id, repository_id, file_exists, expected_location)
VALUES (@document_id, @repository_id, (SELECT dbo.fc_FileExists(
(
SELECT dev.root_access FROM m3_repositories rep
LEFT JOIN m3_device dev ON rep.m2_id = dev.m2_id AND rep.name = dev.name
WHERE rep.repository_id = @repository_id AND rep.m2_id = 2
)
+ 'EbImages\' + @location + cast(@document_id as varchar))),
(
SELECT dev.root_access FROM m3_repositories rep
LEFT JOIN m3_device dev ON rep.m2_id = dev.m2_id AND rep.name = dev.name
WHERE rep.repository_id = @repository_id AND rep.m2_id = 2
)
+ 'EbImages\' + @location + cast(@document_id as varchar)
);
FETCH NEXT FROM @VerificationCursor INTO @document_id, @repository_id, @location
END
print 'TABLE RECORDS ADDED'
print 'CONVERTING BIT VALUES TO TRUE/FALSE'
UPDATE #Verification_Files
SET file_exists = 'FALSE' WHERE file_exists = '0'
UPDATE #Verification_Files
SET file_exists = 'TRUE' WHERE file_exists = '1'
CLOSE @VerificationCursor
DEALLOCATE @VerificationCursor
So have you now got a list of pairs of where the files are and where they need to be? Just do a
SELECT CONCAT('copy "', file_location, '" "', file_copy_to, '"') FROM temp_table
which will generate 250,000 DOS copy commands in the SSMS grid. Save the results to a file called 'go.bat' and double-click it.
If you don't know whether the files exist or not, you can chuck some IF EXIST into your CONCAT etc. - see "How to verify if a file exists in a batch file?"
Sometimes it's easier to just write some SQL that generates some other kind of "code" and then take the results and run them, especially if it's a one-off operation.
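A sketch of that with the existence check folded in - temp_table, file_location, and file_copy_to are the answer's hypothetical names, not the asker's actual schema:
-- Emits one batch line per file; copy runs only when the target is missing.
SELECT CONCAT('IF NOT EXIST "', file_copy_to, '" copy "', file_location, '" "', file_copy_to, '"')
FROM temp_table;
-- Save the result set (without headers) as go.bat and run it.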

SQL: Cast VARCHAR to XML removes CDATA section [duplicate]

When I generate XML in SQL Server 2008 R2 using FOR XML EXPLICIT (because my consumer wants one of the elements wrapped in CDATA) and store the results in an XML variable, the data I want wrapped in CDATA tags no longer appears wrapped in CDATA tags. If I don't push the FOR XML EXPLICIT results into an XML variable then the CDATA tags are retained. I am using the @Xml variable as an SqlParameter from .NET.
In this example, the first select (Select @Xml) does not have Line2 wrapped in CDATA tags. But the second select (the same query used to populate the @Xml variable) does have the CDATA tags wrapping the Line2 column.
Declare @Xml Xml
Begin Try
Drop Table #MyTempTable
End Try
Begin Catch
End Catch
Select
'Record' As Record
, 'Line1' As Line1
, 'Line2' As Line2
Into
#MyTempTable
Select @Xml =
(
Select
x.Tag
, x.Parent
, x.[Root!1]
, x.[Record!2!Line1!Element]
, x.[Record!2!Line2!cdata]
From
(
Select
1 As Tag, Null As Parent
, Null As [Root!1]
, Null As [Record!2!Line1!Element]
, Null As [Record!2!Line2!cdata]
From
#MyTempTable
Union
Select
2 As Tag, 1 As Parent
, Null As [Root!1]
, Line1 As [Record!2!Line1!Element]
, Line2 As [Record!2!Line2!cdata]
From
#MyTempTable
) x
For
Xml Explicit
)
Select @Xml
Select
x.Tag
, x.Parent
, x.[Root!1]
, x.[Record!2!Line1!Element]
, x.[Record!2!Line2!cdata]
From
(
Select
1 As Tag, Null As Parent
, Null As [Root!1]
, Null As [Record!2!Line1!Element]
, Null As [Record!2!Line2!cdata]
From
#MyTempTable
Union
Select
2 As Tag, 1 As Parent
, Null As [Root!1]
, Line1 As [Record!2!Line1!Element]
, Line2 As [Record!2!Line2!cdata]
From
#MyTempTable
) x
For
Xml Explicit
Begin Try
Drop Table #MyTempTable
End Try
Begin Catch
End Catch
You can't. The XML data type does not preserve CDATA sections.
Have a look here for a discussion about the subject.
http://social.msdn.microsoft.com/forums/en-US/sqlxml/thread/e22efff3-192e-468e-b173-ced52ada857f/
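A minimal sketch demonstrating the behavior (standalone, not the asker's query): the XML type parses a CDATA section into an ordinary text node, so serializing the variable entity-escapes the content instead:
DECLARE @Xml XML = N'<Root><Line2><![CDATA[a & b]]></Line2></Root>';
SELECT @Xml;
-- Returns <Root><Line2>a &amp; b</Line2></Root> - the CDATA wrapper is gone,
-- though the parsed character content is identical.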

How to find low-performance code in my T-SQL procedure

I have some nested procedures that show low performance. In order to find the bottleneck, I inserted debug marks into the T-SQL code to measure the duration of the chunks of code I suspect are slow. The debug marks look like this:
select @start_point = GETDATE() -- start measuring point
---
open @session_license_fee_cur -- suspected chunk of code
---
select @end_point = GETDATE() -- end measuring point
select @duration = datediff(ms, @start_point, @end_point)
select @log_info_total = 'Opening cursor license_fee (bills_supp_create_license_fee) (@class_id = ' + cast(@class_id as nvarchar) + ')';
exec bills_supp_save_calculation_log @duration, @log_info_total, @house_id, @account_id, @log_level -- procedure for creating the log (a simple insert into log table pes_bl_bills_calculation_log_total)
After running the procedures, I query the pes_bl_bills_calculation_log_total table to find the slowest code. It looks like this:
set @session_license_fee_cur = cursor static for
select activity_id
, addendum_id
, service_id
, active_from
, active_to
from dbo.bills_supp_get_activate_license_fee_for_sessions_by_house(@active_from, @active_to, @house_id)
select @start_point = GETDATE()
---
open @session_license_fee_cur
---
select @end_point = GETDATE()
select @duration = datediff(ms, @start_point, @end_point)
select @log_info_total = 'Opening cursor license_fee (bills_supp_create_license_fee) (@class_id = ' + cast(@class_id as nvarchar) + ')';
exec bills_supp_save_calculation_log @duration, @log_info_total, @house_id, @account_id, @log_level
In other words, open @session_license_fee_cur works very slowly (about 501980 ms).
I'm trying to run this chunk of code with the given parameters in SQL Server Management Studio in order to look at the query plan and try to optimize it. I run it like this:
declare @active_from date = '01.03.2014'
declare @active_to date = '01.04.2014'
declare @house_id integer = 11927
select activity_id
, addendum_id
, service_id
, active_from
, active_to
from dbo.bills_supp_get_activate_license_fee_for_sessions_by_house(@active_from, @active_to, @house_id)
But it works very fast (returns 3000 records in about 0 (zero) seconds).
What is the difference between opening the cursor in the procedure
open @session_license_fee_cur
and running it in SQL Server Management Studio like this?
declare @active_from date = '01.03.2014'
declare @active_to date = '01.04.2014'
declare @house_id integer = 11927
select activity_id
, addendum_id
, service_id
, active_from
, active_to
from dbo.bills_supp_get_activate_license_fee_for_sessions_by_house(@active_from, @active_to, @house_id)
Where is my bottleneck?
Find Top 5 expensive Queries from a Read IO perspective
http://www.sqlservercentral.com/scripts/DMVs/102045/
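In the same spirit as that script, a minimal sketch against the standard plan-cache DMVs that surfaces the statements with the most logical reads (a generic query, not the linked script itself):
SELECT TOP (5)
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;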
