MSSQL Data type conversion - sql-server

I have a pair of databases (one mssql and one oracle), run by different teams. Some data is now synchronized regularly by a stored procedure in the mssql database. This stored procedure calls a very large
MERGE [mssqltable].[Mytable] as s
USING THEORACLETABLE.BLA as t
ON t.[R_ID] = s.[R_ID]
WHEN MATCHED THEN UPDATE SET [Field1] = s.[Field1], ..., [Brokenfield] = s.[BrokenField]
WHEN NOT MATCHED BY TARGET THEN
... another big statement
Brokenfield was a numeric field until today and could take the values NULL, 0, 1, ..., 24.
Today the oracle team introduced a breaking change for some reason: they changed the type of the column to string, and it now holds the values NULL, "", "ALFA", "BRAVO", ... Of course, the sync broke.
What is the easiest way to fix the sync here? I (Mysql team lead, frontend expert but not so much in databases) would usually ask one of our database experts, but all of them are ill right now, and the fix must go online today.
I thought of a stored procedure like CONVERT_BROKENFIELD_INT_TO_STRING or so, based on some switch-case, which could be called in that merge statement, but I am not sure how to do that.
Edit/Clarification:
What I need is a reusable chunk of SQL code (a stored procedure) that takes an input of "ALFA" and returns 1, "BRAVO" -> 2, etc., to avoid writing huge ifs in more than one place.

If you cannot simplify the logic for the correct values the way @RichardHansell described, you can create a crosswalk table that maps BrokenField to the correct values. Then you can use a common table expression or subquery with a left join to that crosswalk in the merge.
create table dbo.BrokenField_Crosswalk (
BrokenField varchar(32) not null primary key
, CorrectedValue int
);
insert into dbo.BrokenField_Crosswalk (BrokenField,CorrectedValue) values
('ALFA', 1)
, ('ALPHA', 1)
, ('BRAVO', 2)
...
go
And your code for the merge would look something like this:
;with cte as (
select o.R_ID
, o.Field1
, BrokenField = cast(isnull(c.CorrectedValue,o.BrokenField) as int)
....
from oracle_table.bla as o
left join dbo.BrokenField_Crosswalk as c
on c.BrokenField = o.BrokenField
)
merge into [mssqltable].[Mytable] t
using cte as s
on t.[R_ID] = s.[R_ID]
when matched
then update set
[Field1] = s.[Field1]
, ...
, [Brokenfield] = s.[BrokenField]
when not matched by target
then
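The question's edit asks for a reusable piece of code that turns "ALFA" into 1, "BRAVO" into 2, and so on. As a rough sketch only, the crosswalk table above could be wrapped in a scalar function. The name dbo.ConvertBrokenField is made up for illustration, and the function assumes dbo.BrokenField_Crosswalk has been created and filled as shown earlier.

```sql
-- Hypothetical helper (the name is an assumption, not from the original answer).
-- Looks up the numeric value for a string such as 'ALFA' in the crosswalk table.
-- Returns NULL when the input is NULL or has no crosswalk entry.
create function dbo.ConvertBrokenField (@value varchar(32))
returns int
as
begin
    return (
        select c.CorrectedValue
        from dbo.BrokenField_Crosswalk as c
        where c.BrokenField = @value
    );
end;
go

-- Example call
select dbo.ConvertBrokenField('ALFA') as CorrectedValue;
```

A scalar function like this is evaluated once per row, so for a large MERGE the left join to the crosswalk is likely to perform better. The function is mainly useful when the same mapping is needed in several places.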

If they are using names with a letter at the start that goes in a sequence:
A = 1
B = 2
C = 3
etc.
Then you could do something like this:
MERGE [mssqltable].[Mytable] as s
USING THEORACLETABLE.BLA as t
ON ASCII(LEFT(t.[R_ID], 1)) - ASCII('A') + 1 = s.[R_ID]
WHEN MATCHED THEN UPDATE SET [Field1] = s.[Field1], ..., [Brokenfield] = s.[BrokenField]
WHEN NOT MATCHED BY TARGET THEN
... another big statement
Edit: but actually I re-read your question and you are talking about [Brokenfield] being the problem column, so my solution wouldn't work.
I don't really understand now, as it seems that the MERGE statement is updating the oracle table with numbers, so surely you need the mapping to work the other way, i.e. 1 -> ALFA, 2 -> BRAVO, etc.?


Postgres - CRUD operations with arrays of composite types

One really neat feature of Postgres that I have only just discovered is the ability to define composite types, also referred to in the docs as ROWS and as RECORDS. Consider the following example
CREATE TYPE dow_id AS
(
tslot smallint,
day smallint
);
Now consider the following tables
CREATE SEQUENCE test_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 2147483647 START 1 CACHE 1;
CREATE TABLE test_simple_array
(
id integer DEFAULT nextval('test_id_seq') NOT NULL,
dx integer []
);
CREATE TABLE test_composite_simple
(
id integer DEFAULT nextval('test_id_seq') NOT NULL,
dx dow_id
);
CREATE TABLE test_composite_array
(
id integer DEFAULT nextval('test_id_seq') NOT NULL,
dx dow_id[]
);
CRUD operations on the first two tables are relatively straightforward. For example
INSERT INTO test_simple_array (dx) VALUES ('{1,1}');
INSERT INTO test_composite_simple (dx) VALUES (ROW(1,1));
However, I have not been able to figure out how to perform CRUD ops when the table has an array of records/composite types as in test_composite_array. I have tried
INSERT INTO test_composite_array (dx) VALUES(ARRAY(ROW(1,1),ROW(1,2)));
which fails with the message
ERROR: syntax error at or near "ROW"
and
INSERT INTO test_composite_array (dx) VALUES("{(1,1),(1,2)}");
which fails with the message
ERROR: column "{(1,1),(1,2)}" does not exist
and
INSERT INTO test_composite_array (dx) VALUES('{"(1,1)","(1,2)"}');
which appears to work though it leaves me feeling confused since a subsequent
SELECT dx FROM test_composite_array
returns what appears to be a string result {"(1,1)","(1,2)"} although a further query such as
SELECT id FROM test_composite_array WHERE (dx[1]).tslot = 1;
works. I also tried the following
SELECT (dx[1]).day FROM test_composite_array;
UPDATE test_composite_array SET dx[1].day = 99 WHERE (dx[1]).tslot = 1;
SELECT (dx[1]).day FROM test_composite_array;
which works while
UPDATE test_composite_array SET (dx[1]).day = 99 WHERE (dx[1]).tslot = 1;
fails. I find that I am figuring out how to manipulate arrays of records/composite types in Postgres by trial and error and, although the Postgres documentation is generally excellent, there appears to be no clear discussion of this topic in it. I'd be much obliged to anyone who can point me to an authoritative discussion of how to manipulate arrays of composite types in Postgres.
That apart, are there any unexpected gotchas when working with such arrays?
You need square brackets with ARRAY:
ARRAY[ROW(1,1)::dow_id,ROW(1,2)::dow_id]
A warning: composite types are a great feature, but you will make your life harder if you overuse them. As soon as you want to use elements of a composite type in WHERE or JOIN conditions, you are doing something wrong, and you are going to suffer. There are good reasons for normalizing relational data.
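To put the square-bracket constructor into complete statements, here is a minimal sketch against the question's test_composite_array table and dow_id type. The unnest query and the whole-element update are additions for illustration and are not part of the original answer.

```sql
-- Insert an array of two dow_id values using the ARRAY[...] constructor
INSERT INTO test_composite_array (dx)
VALUES (ARRAY[ROW(1,1)::dow_id, ROW(1,2)::dow_id]);

-- Replace a whole array element with a new composite value
UPDATE test_composite_array
SET dx[1] = ROW(1,99)::dow_id
WHERE (dx[1]).tslot = 1;

-- Expand the array so each element becomes a row with tslot and day columns
SELECT t.id, e.tslot, e.day
FROM test_composite_array AS t
CROSS JOIN LATERAL unnest(t.dx) AS e;
```

The unnest form shows that the stored value is a real array of composites, even though a plain SELECT dx prints it in its text representation.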

How to find rows in ms-sql with another rows' value followingly?

I have created an SQL table to trace the operation history of objects. It has two tracing-code columns: the first is the object's own (self) tracing code, and the second is the incoming tracing code, i.e. the code passed from the source object to the target. I created this so that I can look up the route of operations through the objects. You can see the tracing sample table below:
I need to write an SQL query that shows the whole route in one result. Starting from a selected self code, its incoming code is the self code of the previous rows. There may be more than one incoming code for a self code, and I want to trace all of them, continuing until the search returns nothing.
I tried a select query like the one below, but I am new to SQL and need your help.
SELECT [TracingCode.Self],
[TracingCode.Incoming],
[EquipmentNo]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = '028.001.19.2.3')));
To do this kind of parent/child thing to any level without explicitly coding all levels you need to use a recursive CTE.
More details here
https://www.red-gate.com/simple-talk/sql/t-sql-programming/sql-server-cte-basics/
Here is some test data and a solution I came up with. Note that three records actually match 028.001.19.2.3
If this doesn't do what you need please explain further with sample data.
DECLARE @Sample TABLE (
TC_Self CHAR(14) NOT NULL,
TC_In CHAR(14) NOT NULL,
EquipmentNo INT NOT NULL
);
INSERT INTO @Sample (TC_Self, TC_In, EquipmentNo)
VALUES
('028.001.19.2.3','026.003.19.2.2',96),
('028.001.19.2.3','026.001.19.2.2',96),
('028.001.19.2.3','026.002.19.2.2',96),
('028.001.19.2.2','026.002.19.2.1',96),
('028.001.19.2.2','026.002.19.2.1',96),
('028.001.19.2.1','026.002.19.1.1',96),
('026.003.19.2.2','024.501.19.2.5',117),
('024.501.19.2.5','024.501.19.2.6',999),
('024.501.19.2.6','024.501.19.2.7',998);
WITH CTE (RecordType, TC_Self, TC_In, EquipmentNo)
AS
(
-- This is the 'root'
SELECT 'Root' RecordType, TC_Self, TC_In, EquipmentNo FROM @Sample
WHERE TC_Self = '028.001.19.2.3'
UNION ALL
-- Recursive step: follow each incoming code to the rows where it is the self code
SELECT 'Leaf' RecordType, S.TC_Self, S.TC_In, S.EquipmentNo FROM @Sample S
INNER JOIN CTE
ON S.TC_Self = CTE.TC_In
)
SELECT * FROM CTE;
Also please note that most of the time spent on this answer went into generating the sample data.
In future, people are far more likely to help if you post the sample data generation yourself when asking questions.

checking if the same row exists in the table or not [duplicate]

I've got a table with data named energydata
it has just three columns
(webmeterID, DateTime, kWh)
I have a new set of updated data in a table temp_energydata.
The DateTime and the webmeterID stay the same. But the kWh values need updating from temp_energydata table.
How do I write the T-SQL for this the correct way?
Assuming you want an actual SQL Server MERGE statement:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh);
If you also want to delete records in the target that aren't in the source:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Because this has become a bit more popular, I feel like I should expand this answer a bit with some caveats to be aware of.
First, there are several blogs which report concurrency issues with the MERGE statement in older versions of SQL Server. I do not know if this issue has ever been addressed in later editions. Either way, this can largely be worked around by specifying the HOLDLOCK or SERIALIZABLE lock hint:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
[...]
You can also accomplish the same thing with more restrictive transaction isolation levels.
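As a rough sketch of that isolation-level alternative, the same MERGE can run inside a SERIALIZABLE transaction instead of carrying the hint. This reuses the energydata and temp_energydata tables from the question and is an untested illustration. Note that the isolation level stays in effect for the session until it is changed again.

```sql
-- Raise the isolation level for this session, then run the MERGE in a transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

MERGE INTO dbo.energydata AS target
USING dbo.temp_energydata AS source
    ON target.webmeterID = source.webmeterID
   AND target.DateTime = source.DateTime
WHEN MATCHED THEN
    UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
    INSERT (webmeterID, DateTime, kWh)
    VALUES (source.webmeterID, source.DateTime, source.kWh);

COMMIT TRANSACTION;

-- Restore the default level for the rest of the session
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```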
There are several other known issues with MERGE. (Note that since Microsoft nuked Connect and didn't link issues in the old system to issues in the new system, these older issues are hard to track down. Thanks, Microsoft!) From what I can tell, most of them are not common problems or can be worked around with the same locking hints as above, but I haven't tested them.
As it is, even though I've never had any problems with the MERGE statement myself, I always use the WITH (HOLDLOCK) hint now, and I prefer to use the statement only in the most straightforward of cases.
I often use Bacon Bits' great answer, as I just cannot memorize the syntax.
But I usually add a CTE to make the DELETE part more useful, because very often you will want to apply the merge only to a part of the target table.
WITH target as (
SELECT * FROM dbo.energydata WHERE DateTime > GETDATE()
)
MERGE INTO target WITH (HOLDLOCK)
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
If you just need to update your records in energydata based on the data in temp_energydata, assuming that temp_energydata doesn't contain any new records, then try this:
UPDATE e SET e.kWh = t.kWh
FROM energydata e INNER JOIN
temp_energydata t ON e.webmeterID = t.webmeterID AND
e.DateTime = t.DateTime
Here is working sqlfiddle
But if temp_energydata contains new records and you need to insert them into energydata, preferably with one statement, then you should definitely go with the answer that Bacon Bits gave.
UPDATE ed
SET ed.kWh = ted.kWh
FROM energydata ed
INNER JOIN temp_energydata ted ON ted.webmeterID = ed.webmeterID
AND ted.DateTime = ed.DateTime
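Update energydata set kWh = (select temp.kWh from temp_energydata as temp
where temp.webmeterID = energydata.webmeterID and temp.DateTime = energydata.DateTime)
where exists (select 1 from temp_energydata as temp
where temp.webmeterID = energydata.webmeterID and temp.DateTime = energydata.DateTime);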
THE CORRECT WAY IS :
UPDATE test1
INNER JOIN test2 ON (test1.id = test2.id)
SET test1.data = test2.data

Strange Behaviour on MSSQL Stored Procedure using Conditional WHERE with CONTAINS (Full Text Index)

I need some help from a MS SQL Master...
Short version:
When I execute a conditional Where followed by a Contains, my query takes 1 minute (normally it takes 200 milliseconds).
With this query, everything works fine:
Where
Contains(table.product_name, @search_word)
But with a conditional Where, it takes 1 minute to execute:
Where
(@ExecuteWhereStatement = 0 Or Contains(table.product_name, @search_word))
Long Version:
I'm using a stored procedure that receives some parameters. This stored procedure queries a really large table, but everything is indexed properly and the query has performed very well so far.
The main query is fairly big, so I want to make the WHERE clause as smart as possible, to avoid repeating the same statement multiple times.
The whole idea of the database is a history of purchases made by the State. So this query involves 3 tables:
Table 1 (table_purchase) - The purchase itself
id_purchase int (PK)
date_purchase datetime
buyer_code int (Nullable)
Table 2 (table_purchase_product) - The Items of a Purchase
id_product int (PK)
id_purchase int (FK of table_purchase)
product_quantity int (Nullable)
product_name varchar(255) (Nullable) (Full-Text-Indexed)
product_description varchar(2000) (Nullable) (Full-Text-Indexed)
id_product_bid_winner int (FK of table_product_bid)
Table 3 (table_product_bid) - The Bids for Each product of a Purchase
id_product_bid int (PK)
id_product int (FK of table_purchase_product)
product_brand varchar(255) (Nullable) (Full-Text-Indexed)
bid_value decimal (20,6)
So basically, we have a "Purchase" that has several "Products" (or Items), and each "Product" has some "Bids" (or Prices).
And here is the bad girl (the SQL stored procedure):
ALTER PROCEDURE [dbo].[procPesquisaFullText]
@search_date datetime,
@search_word varchar(8000),
@search_brand varchar(255),
@only_one_bid bit = 0,
@search_buyer_code int = 0,
@quantityFrom decimal(20,6) = 0,
@quantityTo decimal(20,6) = 0
AS
BEGIN
SET NOCOUNT ON;
-- Skip the word search only when a buyer code is given and no search word is supplied
Declare @ExecuteWordSearch AS bit;
if (@search_buyer_code != 0 And @search_word = '')
begin
Set @ExecuteWordSearch = 0;
Set @search_word = 'nothing';
end
else
begin
Set @ExecuteWordSearch = 1;
end
-- Skip the brand search when no brand is supplied
Declare @ExecuteBrandSearch AS bit;
if (@search_brand = '')
begin
Set @ExecuteBrandSearch = 0;
Set @search_brand = 'nothing';
end
else
begin
Set @ExecuteBrandSearch = 1;
end
begin
SELECT
pp.id_product,
pp.id_purchase,
pp.product_description
FROM
table_purchase_product pp
inner join table_purchase p on p.id_purchase = pp.id_purchase
WHERE
(p.date_purchase >= @search_date)
and (@search_buyer_code = 0 or (p.buyer_code = @search_buyer_code))
and (@quantityFrom = 0 or (pp.product_quantity >= @quantityFrom))
and (@quantityTo = 0 or (pp.product_quantity <= @quantityTo))
and (contains(pp.product_description, @search_word) or contains(pp.product_name, @search_word))
and (@only_one_bid = 0
or ((Select COUNT(*) From table_product_bid Where table_product_bid.id_product = pp.id_product) = 1))
and (@ExecuteBrandSearch = 0 Or (exists(
select 1
from table_product_bid ppb
where ppb.id_product_bid = pp.id_product_bid_winner
and contains(ppb.product_brand, @search_brand)
)
))
ORDER BY p.date_purchase DESC
end
END
So far, so good...
At the beginning I set two variables that are used inside the query.
The first checks whether the user specified a "Buyer Code" AND didn't specify a "Search Word" (in which case neither the product's description nor the product's name is checked).
The second checks whether the user specified a specific brand. If so, the winning bid's brand is checked against the user's one.
Observation: you'll notice that when the search word is empty, I set it to "nothing". I do this because an empty search term in Contains throws an exception, even when the Contains is not executed (I tested this in another, completely isolated query too).
As you can see, my user is able to search for:
- "Products" of some distinct buyer's "Purchase" (passing the @search_buyer_code parameter)
- A "Product" that contains a distinct word in its name or description
- A "Product" that has the Winner Bid of a specific Brand
- A "Product" that has only 1 bid at all
- A "Product" with a maximum and minimum quantity
You'll also notice that I used a lot of conditions INSIDE the Where, producing a very dynamic Where, instead of using a big If/Else statement and repeating a lot of code. (I guess some "Googlers" will land here looking for conditional Wheres, and if so, I'm glad to help!)
OK, so all of this works great. The query executes flawlessly. But here is the strange, tricky issue:
I want the user to be able to specify only a "Buyer Code" for the purchase, with no word to search for in the product (which is what the first piece of code in the stored procedure sets up). So I change the condition from:
and (contains(pp.product_description, @search_word) or contains(pp.product_name, @search_word))
To:
and (@ExecuteWordSearch = 0 Or (contains(pp.product_description, @search_word) or contains(pp.product_name, @search_word)))
The query then takes nearly 1 minute! (The query above executes in about 200 milliseconds.)
But WHY? I use the same logic in all the conditional Wheres. I also use the same flag/variable logic to decide when to execute the Where clause in both the word search and the brand search, and the brand search works PERFECTLY! So why does my query take 1 minute only when the condition is followed by a Contains?
And this issue is not related to the amount of data, because I tried removing the entire Contains condition, which lets a lot of data return, and it takes 1 second at most.
Oh, and it's Microsoft SQL Server 2008 R2.
Thanks for reading this far!
I cannot find the documentation I had around a very similar issue, but it sounded so familiar that I at least wanted to share what I remembered. Part of the issue is that in SQL Server the full-text search engine is separate from the regular query execution engine, so when you mix the two, performance can tank in some cases. This is particularly true when the condition is an 'OR' rather than an 'AND'. (I remember hitting this exact situation.) Conditional ANDs worked fine. But for OR, it's as if each condition gets evaluated repeatedly, row by row.
Among the workarounds, one is, as already suggested, create your sql dynamically before execution.
Another would be to break the full-text and non-full text conditions into two search functions (literally UDF's) and then do whatever is needed (INTERSECT, EXCEPT, etc) with the two resultsets.
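The dynamic SQL workaround mentioned above could look roughly like the following sketch. It covers only the date filter and the word search from the question's procedure, the sample parameter values are made up, and @sql is an illustrative variable name. The idea is that the CONTAINS predicate is only added to the statement text when it is actually needed, so the optimizer never sees the "flag OR CONTAINS" pattern.

```sql
-- Sample inputs (assumed values, for illustration only)
DECLARE @search_date datetime = '20200101';
DECLARE @search_word varchar(8000) = 'paper';
DECLARE @ExecuteWordSearch bit = 1;

-- Base statement without any full-text predicate
DECLARE @sql nvarchar(max) = N'
SELECT pp.id_product, pp.id_purchase, pp.product_description
FROM table_purchase_product AS pp
INNER JOIN table_purchase AS p ON p.id_purchase = pp.id_purchase
WHERE p.date_purchase >= @search_date';

-- Append the CONTAINS condition only when a word search is requested
IF @ExecuteWordSearch = 1
    SET @sql += N'
  AND (CONTAINS(pp.product_description, @search_word)
       OR CONTAINS(pp.product_name, @search_word))';

SET @sql += N'
ORDER BY p.date_purchase DESC;';

-- Values are passed as parameters, not concatenated into the statement text
EXEC sys.sp_executesql
    @sql,
    N'@search_date datetime, @search_word varchar(8000)',
    @search_date = @search_date,
    @search_word = @search_word;
```

Because the values go through sp_executesql parameters, the search word is never spliced into the SQL text, which avoids injection problems and lets each statement shape get its own cached plan.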
Try changing your WHERE clause to use a CASE expression, e.g.:
WHERE
CASE
WHEN @ExecuteWhereStatement = 0 THEN 1
WHEN @ExecuteWhereStatement = 1 THEN
CASE
WHEN CONTAINS([table].product_name, @search_word) THEN 1
ELSE 0
END
END = 1;

NVarchar Prefix causes wrong index to be selected

I have an entity framework query that has this at the heart of it:
SELECT 1 AS dummy
FROM [dbo].[WidgetOrder] AS widgets
WHERE widgets.[SomeOtherOrderId] = N'SOME VALUE HERE'
The execution plan for this chooses an index that is a composite of three columns. This takes 10 to 12 seconds.
However, there is an index that is just [SomeOtherOrderId] with a few other columns in the "include". That is the index that should be used. And when I run the following queries it is used:
SELECT 1 AS dummy
FROM [dbo].[WidgetOrder] AS widgets
WHERE widgets.[SomeOtherOrderId] = CAST(N'SOME VALUE HERE' AS VARCHAR(200))
SELECT 1 AS dummy
FROM [dbo].[WidgetOrder] AS widgets
WHERE widgets.[SomeOtherOrderId] = 'SOME VALUE HERE'
This returns instantly. And it uses the index that is just SomeOtherOrderId
So, my problem is that I can't really change how Entity Framework makes the query.
Is there something I can do from an indexing point of view that could cause the correct index to be selected?
As far as I know, since version 4.0, EF doesn't generate unicode parameters for non-unicode columns. But you can always force non-unicode parameters by DbFunctions.AsNonUnicode (prior to EF6, DbFunctions is EntityFunctions):
from o in db.WidgetOrder
where o.SomeOtherOrderId == DbFunctions.AsNonUnicode(param)
select o
Try something like ....
SELECT 1 AS dummy
FROM [dbo].[WidgetOrder] AS widgets WITH (INDEX(Target_Index_Name))
WHERE widgets.[SomeOtherOrderId] = N'SOME VALUE HERE'
This table hint tells SQL Server explicitly which index to use to get the results.
