Update SQL server xml column using XQUERY? - sql-server

I have a simple XML document in an XML column:
<Bands>
<Band>
<Name>beatles</Name>
<Num>4</Num>
<Score>5</Score>
</Band>
<Band>
<Name>doors</Name>
<Num>4</Num>
<Score>3</Score>
</Band>
</Bands>
I have managed to update the column with:
-- just update the name to the id
UPDATE tbl1
SET [myXml].modify('replace value of (/Bands/Band/Name/text())[1]
with sql:column("id")')
All fine.
Question #1
How can I use this query to update the value to id + "lalala"?
UPDATE tbl1
SET [myXml].modify('replace value of (/Bands/Band/Name/text())[1]
with sql:column("id") + "lalala"')
Error = XQuery [tbl1.myXml.modify()]: The argument of '+' must be of a single numeric primitive type
Question #2
Let's say I don't want to update the first record ([1]), but want to apply the same update as above only where Score > 4.
Of course, I can write this in the XPath:
replace value of (/Bands/Band[Score>4]/Name/text())[1]
But I don't want to do it in the XPath. Isn't there a normal way of doing this with a WHERE clause?
Something like:
UPDATE tbl1
SET [myXml].modify('replace value of (/Bands/Band/Name/text())[1]
with sql:column("id") where [...score>4...]')

If you want to concatenate strings you should use concat, and if id in your case is an integer you need to cast it to a string inside the concat call.
The WHERE clause filters which rows of the table are updated; it cannot specify which nodes to update inside the XML. That has to be done in the XQuery expression. You can, however, use exist() in the WHERE clause to restrict the update to the rows that really need it.
update tbl1
set myXml.modify('replace value of (/Bands/Band[Score > 4]/Name/text())[1]
with concat(string(sql:column("id")), "lalalala")')
where myXml.exist('/Bands/Band[Score > 4]') = 1

Q1:
;with t as (select convert(varchar(10),id) + 'lalala' id2, * from #tbl1)
UPDATE t
SET [myXml].modify('replace value of (/Bands/Band/Name/text())[1]
with sql:column("id2")');
Note: Do you realise that this updates only the first band's name, not all bands?
Q2:
No, you cannot. Note that (/Bands/Band[Score<4]/Name/text())[1] (I changed the comparison to <) specifically targets the doors band in the example XML in your question. A WHERE clause, on the other hand, works across the whole XML value rather than at a particular level in the path. For example, here is a very wrong interpretation:
;with t as (
select a.*, n.m.value('.','int') Score
from #tbl1 a
cross apply myXml.nodes('/Bands/Band/Score') n(m) -- assume singular
)
UPDATE t
SET [myXml].modify('replace value of (/Bands/Band/Name/text())[1]
with sql:column("id")')
where Score < 4
Because there is at least one Band in the xml with a score < 4, the XML gets updated. HOWEVER, because xml.modify only works ONCE on the first match, the first band's name gets updated, not the one matching the score filter.
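Since modify() changes only one node per call, a common workaround is to repeat the UPDATE until no matching node is left. The following is a minimal sketch, not a definitive implementation. It assumes a table variable @tbl1 standing in for tbl1 and a sample document in which both bands score above 4 (both invented for illustration). The extra predicate on Name/text() is an addition of this sketch so that already-updated bands stop matching and the loop terminates.

```sql
-- Hypothetical stand-in for tbl1; the sample data is invented for this sketch.
DECLARE @tbl1 TABLE (id int PRIMARY KEY, myXml xml);

INSERT @tbl1 (id, myXml)
VALUES (1, N'<Bands>
  <Band><Name>beatles</Name><Num>4</Num><Score>5</Score></Band>
  <Band><Name>doors</Name><Num>4</Num><Score>6</Score></Band>
</Bands>');

-- Each pass rewrites the first qualifying band whose name has not been replaced yet.
-- The loop ends once no row contains such a band any more.
WHILE EXISTS (
    SELECT *
    FROM @tbl1
    WHERE myXml.exist('/Bands/Band[Score > 4][Name/text() != concat(string(sql:column("id")), "lalala")]') = 1
)
BEGIN
    UPDATE @tbl1
    SET myXml.modify('replace value of
        (/Bands/Band[Score > 4][Name/text() != concat(string(sql:column("id")), "lalala")]/Name/text())[1]
        with concat(string(sql:column("id")), "lalala")')
    WHERE myXml.exist('/Bands/Band[Score > 4][Name/text() != concat(string(sql:column("id")), "lalala")]') = 1;
END;

SELECT id, myXml FROM @tbl1;
```

Each iteration is a set-based UPDATE over all rows that still qualify, so the number of passes equals the largest number of matching bands in any single row, not the total number of bands in the table.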


Order of XML nodes from document preserved in insert?

If I do:
INSERT INTO dst
SELECT blah
FROM src
CROSS APPLY xmlcolumn.nodes('blah')
where dst has an identity column, can one say for certain that the identity column order matches the order of the nodes from the original xml document?
I think the answer is no, there are no guarantees and that to ensure the ordering is able to be retained, some ordering information needs to also be extracted from the XML at the same time the nodes are enumerated.
There's no way to see it explicitly in an execution plan, but the id column returned by the nodes() method is a varbinary(900) OrdPath, which does encapsulate the original xml document order.
The solution offered by Mikael Eriksson on the related question Does the `nodes()` method keep the document order? relies on the OrdPath to provide an ORDER BY clause necessary to determine how identity values are assigned for the INSERT.
A slightly more compact usage follows:
CREATE TABLE #T
(
ID integer IDENTITY,
Fruit nvarchar(10) NOT NULL
);
DECLARE @xml xml =
N'
<Fruits>
<Apple />
<Banana />
<Orange />
<Pear />
</Fruits>
';
INSERT #T
(Fruit)
SELECT
N.n.value('local-name(.)', 'nvarchar(10)')
FROM @xml.nodes('/Fruits/*') AS N (n)
ORDER BY
ROW_NUMBER() OVER (ORDER BY N.n);
SELECT
T.ID,
T.Fruit
FROM #T AS T
ORDER BY
T.ID;
Using the OrdPath this way is presently undocumented, but the technique is sound in principle:
The OrdPath reflects document order.
The ROW_NUMBER computes sequence values ordered by OrdPath*.
The ORDER BY clause uses the row number sequence.
Identity values are assigned to rows as per the ORDER BY.
To be clear, this holds even if parallelism is employed. As Mikael says, the dubious aspect is using id in the ROW_NUMBER since id is not documented to be the OrdPath.
* The ordering is not shown in plans, but optimizer output using TF 8607 contains:
ScaOp_SeqFunc row_number order[CALC:QCOL: XML Reader with XPath filter.id ASC]
Under the current implementation of .nodes, the XML nodes are generated in document order. The result is always joined to the original data using a nested loops join, which also always runs in order.
Furthermore, inserts are generally serial (they go parallel only under very specific circumstances, usually when the target table is empty, and never when an IDENTITY value is being generated).
Therefore there is no reason why the server would ever return rows in a different order than the document order. You can see from this fiddle that that is what happens.
That being said, there is no guarantee that the implementation of .nodes won't change, or that inserts won't go parallel in future, as neither behaviour is documented anywhere as being guaranteed. So I wouldn't rely on it without an explicit ORDER BY, and you do not have a column to order it on.
Using an ORDER BY would guarantee it. The docs state: "INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted."
Even using ROW_NUMBER as some have recommended is also not guaranteed. The only real solution is to get the document order directly from XQuery.
The problem is that SQL Server's version of XQuery does not allow using position() as a result, only in a predicate. Instead, you can use a hack involving the << node-order comparison operator.
For example:
SELECT T.X.value('text()[1]', 'nvarchar(100)') as RowLabel,
T.X.value('let $i := . return count(../*[. << $i]) + 1', 'int') as RowNumber
FROM src
CROSS APPLY xmlcolumn.nodes('blah') as T(X);
What this does is:
Assigns the current node . to the variable $i
Takes all the nodes in ../*, i.e. all children of the parent of this node
Keeps only those that precede $i in document order ([. << $i])
Counts them
Adds 1 to make the result one-based
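Putting the pieces together, the position computed in XQuery can drive the ORDER BY of the INSERT, which is what determines how identity values are assigned. A minimal sketch under stated assumptions: the temp table #Fruit and the sample document are invented for illustration, and local-name(.) is used as the label because the sample elements are empty.

```sql
-- Hypothetical target table and sample document, invented for this sketch.
CREATE TABLE #Fruit
(
    ID    int IDENTITY PRIMARY KEY,
    Fruit nvarchar(10) NOT NULL
);

DECLARE @xml xml = N'<Fruits><Apple /><Banana /><Orange /><Pear /></Fruits>';

INSERT #Fruit (Fruit)
SELECT F.Fruit
FROM
(
    SELECT
        N.n.value('local-name(.)', 'nvarchar(10)') AS Fruit,
        -- Number of siblings that precede this node, plus one.
        N.n.value('let $i := . return count(../*[. << $i]) + 1', 'int') AS Position
    FROM @xml.nodes('/Fruits/*') AS N (n)
) AS F
ORDER BY F.Position;   -- identity values follow this order

SELECT ID, Fruit
FROM #Fruit
ORDER BY ID;
```

The count is recomputed for every node, so the cost grows quadratically with the number of siblings; for very large documents that trade-off may matter.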

Need to Add Values to Certain Items

I have a table in which I need to add the same values to a whole bunch of items.
(In a nutshell: if an item doesn't have a UNIT of "CTN", I want to add the same values I have listed to it.)
I thought the following would work, but it doesn't :(
Any idea what I am doing wrong?
INSERT INTO ICUNIT
(UNIT,AUDTDATE,AUDTTIME,AUDTUSER,AUDTORG,CONVERSION)
VALUES ('CTN','20220509','22513927','ADMIN','AU','1')
WHERE ITEMNO In '0','etc','etc','etc'
If I understand correctly you might want to use INSERT INTO ... SELECT from original table with your condition.
INSERT INTO ICUNIT (UNIT,AUDTDATE,AUDTTIME,AUDTUSER,AUDTORG,CONVERSION)
SELECT 'CTN','20220509','22513927','ADMIN','AU','1'
FROM ICUNIT
WHERE ITEMNO In ('0','etc','etc','etc')
The query you need starts by selecting the filtered items, so something like the following is your starting point:
select <?> from dbo.ICUNIT as icu where icu.UNIT <> 'CTN' order by ...;
Notice the use of schema name, terminators, and table aliases - all best practices. I will guess that a given "item" can have multiple rows in this table so long as UNIT is unique within ITEMNO. Correct? If so, the above query won't work. So let's try slightly more complicated filtering.
select distinct icu.ITEMNO
from dbo.ICUNIT as icu
where not exists (select * from dbo.ICUNIT as ctns
where ctns.ITEMNO = icu.ITEMNO -- correlating the subquery
and ctns.UNIT = 'CTN')
order by ...;
There are other ways to do this, but that is one common way. That query will produce a resultset of all ITEMNO values in your table that do not already have a row where UNIT is "CTN". If you need to filter that for specific ITEMNO values, you simply adjust the WHERE clause. If that works correctly, you can use it with your insert statement to insert the desired rows.
insert into dbo.ICUNIT (...)
select distinct icu.ITEMNO, 'CTN', '20220509', '22513927', 'ADMIN', 'AU', '1'
from ...
;
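For reference, the outline above could be filled in roughly as follows. This is a sketch under assumptions: #ICUNIT is a hypothetical miniature of the real table, the column types and sample rows are guesses, and ITEMNO is assumed to be the item key column.

```sql
-- Hypothetical miniature of ICUNIT; column types are guesses for this sketch.
CREATE TABLE #ICUNIT
(
    ITEMNO     varchar(24) NOT NULL,
    UNIT       varchar(10) NOT NULL,
    AUDTDATE   varchar(8)  NOT NULL,
    AUDTTIME   varchar(8)  NOT NULL,
    AUDTUSER   varchar(8)  NOT NULL,
    AUDTORG    varchar(6)  NOT NULL,
    CONVERSION varchar(10) NOT NULL,
    PRIMARY KEY (ITEMNO, UNIT)
);

INSERT #ICUNIT VALUES
('A100', 'EACH', '20220101', '10000000', 'ADMIN', 'AU', '1'),
('A100', 'CTN',  '20220101', '10000000', 'ADMIN', 'AU', '12'),
('B200', 'EACH', '20220101', '10000000', 'ADMIN', 'AU', '1');

-- Add a CTN row for every item that does not have one yet (only B200 in this sample).
INSERT INTO #ICUNIT (ITEMNO, UNIT, AUDTDATE, AUDTTIME, AUDTUSER, AUDTORG, CONVERSION)
SELECT DISTINCT icu.ITEMNO, 'CTN', '20220509', '22513927', 'ADMIN', 'AU', '1'
FROM #ICUNIT AS icu
WHERE NOT EXISTS (SELECT *
                  FROM #ICUNIT AS ctns
                  WHERE ctns.ITEMNO = icu.ITEMNO   -- correlating the subquery
                    AND ctns.UNIT = 'CTN');

SELECT ITEMNO, UNIT, CONVERSION
FROM #ICUNIT
ORDER BY ITEMNO, UNIT;
```

Because the NOT EXISTS check makes the statement skip items that already have a CTN row, running it a second time inserts nothing.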

Highlight Duplicate Values in a NetSuite Saved Search

I am looking for a way to highlight duplicates in a NetSuite saved search. The duplicates are in a column called "ACCOUNT" populated with text values.
NetSuite permits adding fields (columns) to the search using a stripped down version of SQL Server. It also permits conditional highlighting of entire rows using the same code. However I don't see an obvious way to compare values between rows of data.
Although duplicates can be grouped together in a summary report and identified by a count of 2 or more, I want to show duplicate lines separately and highlight each.
The closest thing I found was a clever formula that calculates a running total here:
sum/* comment */({amount})
OVER(PARTITION BY {name}
ORDER BY {internalid}
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
I wonder if it's possible to sort results by the field being checked for duplicates and adapt this code to identify changes in the "ACCOUNT" field between a row and the previous row.
Any ideas? Thanks!
This post has been edited. I have left the progression as a learning experience about NetSuite.
Original - plain SQL way - not suitable for NetSuite
Does something like this meet your needs? The test data assumes looking for duplicates on id1 and id2. Note: This does not work in NetSuite as it supports limited SQL functions. See comments for links.
declare @table table (id1 int, id2 int, value int);
insert @table values
(1,1,11),
(1,2,12),
(1,3,13),
(2,1,21),
(2,2,22),
(2,3,23),
(1,3,1313);
--select * from @table order by id1, id2;
select t.*,
case when dups.id1 is not null then 1 else 0 end is_dup --identify dups when there is a matching dup record
from @table t
left join ( --subquery to find duplicates
select id1, id2
from @table
group by id1, id2
having count(1) > 1
) dups
on dups.id1 = t.id1
and dups.id2 = t.id2
order by t.id1, t.id2;
First Edit - NetSuite target but in SQL.
This was a SQL test based on the example syntax provided in the question, since I do not have NetSuite to test against. This will give you a value greater than 1 on each duplicate row using a similar syntax. Note: This will give the appropriate answer but not in NetSuite.
select t.*,
sum(1) over (partition by id1, id2)
from @table t
order by t.id1, t.id2;
Second Edit - Working NetSuite version
After some back and forth here is a version that works in NetSuite:
sum/* comment */(1) OVER(PARTITION BY {name})
This will also give a value greater than 1 on any row that is a duplicate.
Explanation
This works by summing the value 1 on each row included in the partition. The partition column(s) should be what you consider a duplicate. If only one column makes a duplicate (e.g. user ID) then use as above. If multiple columns make a duplicate (e.g. first name, last name, city) then use a comma-separated list in the partition. SQL will basically group the rows by the partition and add up the 1s in the sum/* comment */(1). The example provided in the question sums an actual column. By summing 1 instead we will get the value 1 when there is only 1 ID in the partition. Anything higher is a duplicate. I guess you could call this field duplicate count.
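To illustrate the multi-column case described above in plain T-SQL (not NetSuite formula syntax), here is a small sketch. The table variable @people and its rows are invented for illustration.

```sql
-- Hypothetical sample data; a duplicate is defined by all three columns together.
DECLARE @people TABLE (first_name varchar(20), last_name varchar(20), city varchar(20));

INSERT @people VALUES
('Ann', 'Lee', 'Oslo'),
('Ann', 'Lee', 'Oslo'),
('Ann', 'Lee', 'Rome'),
('Bob', 'Ray', 'Rome');

SELECT
    p.first_name,
    p.last_name,
    p.city,
    -- Adds up a 1 for every row sharing the same first name, last name and city.
    SUM(1) OVER (PARTITION BY p.first_name, p.last_name, p.city) AS duplicate_count
FROM @people AS p
ORDER BY p.first_name, p.last_name, p.city;
```

Rows whose duplicate_count is greater than 1 are the ones to highlight; the Ann Lee row in Rome is not a duplicate here because the city differs.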

Update a view doesn't work

I'm working on a view which is then updated by the user. This update basically changes the value of a column. But right now it doesn't let me do that and produces this:
Update or insert of view or function '' failed because it contains a derived or constant field.
I know this is because I have a constant in the select statement but is there a way to get around it? Please help
This is my code for the view
CREATE VIEW Schema.View1
AS
SELECT
CONVERT(varchar(20), l.jtpName) AS JobType,
CONVERT(varchar(10), ' <All> ') AS SubCategory,
CONVERT(varchar(3), CASE WHEN a.jtpName = l.jtpName AND a.subName = ' <All> ' THEN 'Yes' ELSE 'No' END) AS AutoProcess
FROM Schema.JobType l
LEFT JOIN Schema.Table1 a ON l.jtpName = a.jtpName
UNION
SELECT
CONVERT(varchar(20), a.jtpName) AS JobType,
CONVERT(varchar(10), a.subName) AS SubCategory,
CONVERT(varchar(3), CASE WHEN b.jtpName = a.jtpName AND b.subName = a.subName THEN 'Yes' ELSE 'No' END) AS AutoProcess
FROM Schema.SubCategory a
LEFT JOIN fds.Table1 b ON a.subName = b.subName
GO
Finally the update statement:
UPDATE Schema.View1 SET AUTOPROCESS = Case WHEN AUTOPROCESS = 'Yes' Then 'No' END Where JOBTYPE = 'Transport' and SUBCATEGORY= 'Cargo'
Thank You
You cannot update a column that is the result of a computation.
According to MSDN, one of the conditions for a view column to be updatable is this:
Any modifications, including UPDATE, INSERT, and DELETE statements, must reference columns from only one base table.
The columns being modified in the view must directly reference the underlying data in the table columns. The columns cannot be derived in any other way, such as through the following:
An aggregate function: AVG, COUNT, SUM, MIN, MAX, GROUPING, STDEV, STDEVP, VAR, and VARP.
A computation. The column cannot be computed from an expression that uses other columns. Columns that are formed by using the set operators UNION, UNION ALL, CROSSJOIN, EXCEPT, and INTERSECT amount to a computation and are also not updatable.
The columns being modified are not affected by GROUP BY, HAVING, or DISTINCT clauses.
TOP is not used anywhere in the select_statement of the view together with the WITH CHECK OPTION clause.
Here, not only does your view use UNION, but the AutoProcess field you are trying to update is actually the result of a CASE expression that uses two fields. It makes no sense to try to update that.
I would recommend that you use a stored procedure to perform write operations. Or, as Damien suggests, you could use an INSTEAD OF trigger on the view too.
You have to create a TRIGGER and manually apply the changes from the inserted and deleted pseudo-tables against the base tables yourself.
There is no way for SQL Server to work backwards from your CONVERT functions to the original fields. You cannot update a view this way.
If the view contained your jtpName and subName fields, you might be able to update just those fields.
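The INSTEAD OF trigger suggestion can be sketched as follows. This is a simplified illustration, not the asker's actual schema: the tables dbo.SubCategoryDemo and dbo.AutoRule, the view dbo.AutoProcessView and the trigger name are all hypothetical, and the sketch assumes that "Yes" simply means a matching row exists in the rule table. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical base tables: a row in AutoRule means "auto-process this combination".
CREATE TABLE dbo.SubCategoryDemo (jtpName varchar(20) NOT NULL, subName varchar(10) NOT NULL);
CREATE TABLE dbo.AutoRule (jtpName varchar(20) NOT NULL, subName varchar(10) NOT NULL,
                           PRIMARY KEY (jtpName, subName));
INSERT dbo.SubCategoryDemo VALUES ('Transport', 'Cargo');
INSERT dbo.AutoRule VALUES ('Transport', 'Cargo');
GO
CREATE VIEW dbo.AutoProcessView
AS
SELECT s.jtpName AS JobType,
       s.subName AS SubCategory,
       CONVERT(varchar(3), CASE WHEN r.jtpName IS NOT NULL THEN 'Yes' ELSE 'No' END) AS AutoProcess
FROM dbo.SubCategoryDemo AS s
LEFT JOIN dbo.AutoRule AS r
       ON r.jtpName = s.jtpName AND r.subName = s.subName;
GO
CREATE TRIGGER dbo.AutoProcessView_Update ON dbo.AutoProcessView
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- AutoProcess set to 'No': remove the matching rule rows.
    DELETE r
    FROM dbo.AutoRule AS r
    JOIN inserted AS i
      ON i.JobType = r.jtpName AND i.SubCategory = r.subName
    WHERE i.AutoProcess = 'No';

    -- AutoProcess set to 'Yes': add the rule row if it is missing.
    INSERT dbo.AutoRule (jtpName, subName)
    SELECT i.JobType, i.SubCategory
    FROM inserted AS i
    WHERE i.AutoProcess = 'Yes'
      AND NOT EXISTS (SELECT * FROM dbo.AutoRule AS r
                      WHERE r.jtpName = i.JobType AND r.subName = i.SubCategory);
END;
GO
-- The update now runs the trigger instead of failing on the derived column.
UPDATE dbo.AutoProcessView
SET AutoProcess = 'No'
WHERE JobType = 'Transport' AND SubCategory = 'Cargo';

SELECT JobType, SubCategory, AutoProcess FROM dbo.AutoProcessView;
```

The trigger translates a change to the derived Yes/No value into the insert or delete on the base table that would produce that value, which is the mapping SQL Server cannot infer on its own.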

SQL Server Reference a Calculated Column

I have a select statement with calculated columns and I would like to use the value of one calculated column in another. Is this possible? Here is a contrived example to show what I am trying to do.
SELECT [calcval1] = CASE Statement, [calcval2] = [calcval1] * .25
No.
All the results of a single row from a select are atomic. That is, you can view them all as if they occur in parallel and cannot depend on each other.
If you're referring to computed columns, then you need to update the formula's input for the result to change during a select.
Think of computed columns as macros or mini-views which inject a little calculation whenever you call them.
For example, these columns will be identical, always:
-- assume that 'Calc' is a computed column equal to Salary*.25
SELECT Calc, Salary*.25 Calc2 FROM YourTable
Also keep in mind that the persisted option doesn't change any of this. It keeps the value around which is nice for indexing, but the atomicity doesn't change.
Unfortunately not really, but a workaround that is sometimes worth it is
SELECT [calcval1], [calcval1] * .25 AS [calcval2]
FROM (SELECT [calcval1] = CASE Statement FROM whatever WHERE whatever) AS t
Yes it's possible.
Use the WITH Statement for nested selects:
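A minimal sketch of the WITH approach. The inline VALUES rows and the CASE expression are invented stand-ins for the real table and calculation.

```sql
-- The CTE computes calcval1 once; the outer SELECT can then reference it by name.
WITH c AS
(
    SELECT
        v.Salary,
        CASE WHEN v.Salary >= 1000 THEN v.Salary ELSE 0 END AS calcval1
    FROM (VALUES (500), (2000)) AS v (Salary)   -- hypothetical sample rows
)
SELECT
    c.Salary,
    c.calcval1,
    c.calcval1 * .25 AS calcval2
FROM c;
```

If the WITH clause is not the first statement in a batch, the preceding statement must be terminated with a semicolon.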
Two ways I can think of to do that. First, understand that the calcval1 column does not exist as far as SQL Server is concerned until the statement has run, so it cannot be used directly as shown in your example. So you can put the calculation in there twice: once for calcval1 and once as a substitute for calcval1 in the calcval2 calculation.
The other way is to make a derived table with calcval1 in it and then calculate calcval2 outside the derived table, something like:
select calcval1 * .25 as calcval2, calcval1, field1, field2
from (select CASE Statement as calcval1, field1, field2 from mytable) a
You'll need to test both for performance.
You should use an outer apply instead of a subselect:
select V.calc,V.calc*0.25 from FOO outer apply (select case Statement as calc) V
You can't "reset" the value of a calculated column in a SELECT clause, if that's what you're trying to do. The value of a calculated column is determined by its formula, and you can't redefine that formula in a SELECT clause. Note that in SQL Server a computed column's formula cannot itself reference another computed column, so a shared expression has to be repeated in each definition. If all you want to do is output a value based on two calculated columns (as the syntax in your question reads), then the
"[calcval2]"
in
SELECT [calcval1] = CASE Statement, [calcval2] = [calcval1] * .25
would just become a column alias in the output of the SELECT clause.
Or are you asking how to define the formula for one calculated column based on another?
