Find records where string values are similar - sql-server

I am aggregating errors in order to track total number of errors being logged. I am currently trying to create a query that finds/groups several records that have similar value in order to record them as one error and not several individual ones. The only difference is an id or 2, which is why they are being grouped. The database gets injected with errors from our system via app insights and analytic stream. I have a table that holds the "template" error that will be used to find and group those specific errors. Not all errors being recorded need a template because they are being grouped appropriately because they are exactly the same. When trying to use like to find the errors, it seems to be having a hard time with the dashes. I am having a hard time finding info to help me with this issue. I've tried to use the replace for removing the dashes, but that doesn't work because the errors are too long.
Sample template:
Auto Resubmit for % failed with ' Object reference not set to an instance of an object. '
Sample error:
Auto Resubmit for 004e9e2d-3704-4cfd-a90d-42520203df79 - 18723191 failed with ' Object reference not set to an instance of an object. '
Auto Resubmit for 0130e64e-64e6-4a23-88a4-51fba823705b - 18734821 failed with ' Object reference not set to an instance of an object. '
Auto Resubmit for 11809bf5672f4e98987119dbd06e5d78 - 17359076 failed with ' Object reference not set to an instance of an object. '
Sample Query:
select top 1000 * from errorTable where error like 'Auto Resubmit for % failed with '' Object reference not set to an instance of an object.'

If the only thing that's differing is the parameter values, you could normalize the string before storing/checking it. There are lots of utilities around that can do that. One example is the ExtractSQLTemplate function in our free SDU Tools. You don't need to use the toolkit. Just grab the code for that function as an example. You can see it here: https://youtu.be/yX5q00m_uCA
The other option is to use a full text index on the string instead. You can use functions like FREETEXTTABLE and/or FREETEXT to find similarities between the strings.

One of the main reason that you are not able to use Template Table is that you have basically fail the very purpose of Template by storing error in this manner.
In other word, your table are not Normalize nor any relation is define between
2 tables.
Auto Resubmit for 004e9e2d-3704-4cfd-a90d-42520203df79 - 18723191 failed with
' Object reference not set to an instance of an object. '
Your error table design should be like,
TemplateID -- id column of Template table
Module -- Name of module from where error originated
SubmoduleName-- Method name
ErrorMessage-- Original error message like "Object reference not set to an instance of an object".
CustomError -- Auto Resubmit for 004e9e2d-3704-4cfd-a90d-42520203df79 - 18723191 failed with
This way you can easily join Template Table with Error Table using column ErrorMessage.
ErrorMessage : It should always contain original message thrown by `exception`.
You can customize your Error Table as per your requirement.It will take only little effort.
It would be very nice if you can store TemplateID (FK) in Error table.
All problem will vanish at once.
In short keep table in such a manner that it is easy to join them without string manipulation.
It is not clear, in what pattern % will be replace.
create table #Template(Templateid int identity(1,1),Template varchar(500))
insert into #Template values ('Auto Resubmit for % failed with ''Object reference not set to an instance of an object.''')
--select * from #Template
create table #ErrorLog(Errorid int identity(1,1),ErrorMessage varchar(500))
insert into #ErrorLog values
('Auto Resubmit for 004e9e2d-3704-4cfd-a90d-42520203df79 - 18723191 failed with ''Object reference not set to an instance of an object.''')
,('Auto Resubmit for 0130e64e-64e6-4a23-88a4-51fba823705b - 18734821 failed with ''Object reference not set to an instance of an object.''')
,('Auto Resubmit for 11809bf5672f4e98987119dbd06e5d78 - 17359076 failed with ''Object reference not set to an instance of an object.''' )
,('Auto Resubmit for itemid2 - 18137385 failed with '' Access Denied: userId=''12602174''' )
Use any split string function to split Template table by '%'
and store it in #temp table
create table #temp(Errorid int identity(1,1),ErrMsg varchar(500),rownum int)
insert into #temp
select value,row_number()over(order by (Select null))rn from #Template t
cross apply(select replace(value,'failed with','')value from string_split(t.Template,'%'))ca
select t1.* from
(
select el.ErrorMessage,c.ErrMsg,c.rownum from #ErrorLog EL
inner join #temp c on el.ErrorMessage like '%' +ErrMsg+'%' and c.rownum=1
)t1
inner join #temp c1 on t1.ErrorMessage like '%' +ltrim(c1.ErrMsg)+'%' and c1.rownum=2
select * from #temp
drop table #Template,#ErrorLog,#temp
In fact,for writing near perfect query 1 need to analyze data and write as many Test cases for it.

Related

Django model based on an SQL table-valued function using MyModel.objects.raw()

If it's relevant I'm using Django with Django Rest Framework, django-mssql-backend and pyodbc
I am building some read only models of a legacy database using fairly complex queries and Django's MyModel.objects.raw() functionality. Initially I was executing the query as a Select query which was working well, however I received a request to try and do the same thing but with a table-valued function from within the database.
Executing this:
MyModel.objects.raw(select * from dbo.f_mytablefunction)
Gives the error: Invalid object name 'myapp_mymodel'.
Looking deeper into the local variables at time of error it looks like this SQL is generated:
'SELECT [myapp_mymodel].[Field1], '
'[myapp_mymodel].[Field2] FROM '
'[myapp_mymodel] WHERE '
'[myapp_mymodel].[Field1] = %s'
The model itself is mapped properly to the query as executing the equivalent:
MyModel.objects.raw(select * from dbo.mytable)
Returns data as expected, and dbo.f_mytablefunction is defined as:
CREATE FUNCTION dbo.f_mytablefunction
(
#param1 = NULL etc etc
)
RETURNS TABLE
AS
RETURN
(
SELECT
field1, field2 etc etc
FROM
dbo.mytable
)
If anyone has any explanation as to why these two modes of operation are treated substantially differently then I would be very pleased to find out.
Guess you've figured this out by now (see docs):
MyModel.objects.raw('select * from dbo.f_mytablefunction(%s)', [1])
If you'd like to map your table valued function to a model, this gist has a quite thorough approach, though no license is mentioned.
Once you've pointed your model 'objects' to the new TableFunctionManager and added the 'function_args' OrderedDict (see tests in gist), you can query it as follows:
MyModel.objects.all().table_function(param1=1)
For anyone wondering about use cases for table valued functions, try searching for 'your_db_vendor tvf'.

How to find rows in ms-sql with another rows' value followingly?

I have created an Sql table to trace objects' operation history. I have two columns; first one is the self tracing code and second tracing code is the tracing code for the code coming from source object to target. I created this to be able to look up the route of operations through the objects. You can see the tracing sample table below:
I need to create an sql code to query to show all the route in one table. When I first select the self code, it will be the incoming code for previous rows. There may be more than one incoming code to self and I want to be able to trace all. And I want to reach end until my search is null.
I tried select query like below but I am so new sql and need your help.
SELECT [TracingCode.Self],
[TracingCode.Incoming],
[EquipmentNo]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = (SELECT [TracingCode.Incoming]
FROM [MKP_PROCESS_PRODUCT_REPORTS].[dbo].[ProductionTracing.Main]
WHERE [TracingCode.Self] = '028.001.19.2.3')));
To do this kind of parent/child thing to any level without explicitly coding all levels you need to use a recursive CTE.
More details here
https://www.red-gate.com/simple-talk/sql/t-sql-programming/sql-server-cte-basics/
Here is some test data and a solution I came up with. Note that three records actually match 028.001.19.2.3
If this doesn't do what you need please explain further with sample data.
DECLARE #Sample TABLE (
TC_Self CHAR(14) NOT NULL,
TC_In CHAR(14) NOT NULL,
EquipmentNo INT NOT NULL
);
INSERT INTO #Sample (TC_Self, TC_In, EquipmentNo)
VALUES
('028.001.19.2.3','026.003.19.2.2',96),
('028.001.19.2.3','026.001.19.2.2',96),
('028.001.19.2.3','026.002.19.2.2',96),
('028.001.19.2.2','026.002.19.2.1',96),
('028.001.19.2.2','026.002.19.2.1',96),
('028.001.19.2.1','026.002.19.1.1',96),
('026.003.19.2.2','024.501.19.2.5',117),
('024.501.19.2.5','024.501.19.2.6',999),
('024.501.19.2.6','024.501.19.2.7',998);
WITH CTE (RecordType, TC_Self, TC_In, EquipmentNo)
AS
(
-- This is the 'root'
SELECT 'Root' RecordType, TC_Self, TC_In, EquipmentNo FROM #Sample
WHERE TC_Self = '028.001.19.2.3'
UNION ALL
SELECT 'Leaf' RecordType, S.TC_Self, S.TC_In, S.EquipmentNo FROM #Sample S
INNER JOIN CTE
ON S.TC_Self = CTE.TC_In
)
SELECT * FROM CTE;
Also please note that most of the time to generate this answer was taken in generating the sample data to use.
In future when asking questions, people are far more likely to help if you post this sample data generation yourself

SSIS Foreach Loop failure

I have created a lookup for a list of IDs and a subsequent Foreach loop to run an sql stmt for each ID.
My variable for catching the list of IDs is called MissingRecordIDs and is of type Object. In the Foreach container I map each value to a variable called RecordID of type Int32. No fancy scripts - I followed these instructions: https://www.simple-talk.com/sql/ssis/implementing-foreach-looping-logic-in-ssis-/ (without the file loading part - I am just running an SQL stmt).
It runs fine from within SSIS, but when I deploy it my Integration Services Catalogue in MSSQL it fails.
This is the error I get when running from SQL Mgt Studio:
I thought I could just put a Precendence Constraint after MissingRecordsIDs get filled to check for NULL and skip the Foreach loop if necessary - but I can't figure out how to check for NULL in an Object variable?
Here is the Variable declaration and Object getting enumerated:
And here is the Variable mapping:
The SQL stmt that is in 'Lookup missing Orders':
select distinct cast(od.order_id as int) as order_id
from invman_staging.staging.invman_OrderDetails_cdc od
LEFT OUTER JOIN invman_staging.staging.invman_Orders_cdc o
on o.order_id = od.order_id and o.BatchID = ?
where od.BatchID = ?
and o.order_id is null
and od.order_id is not null
In the current environment this query returns nothing - there are no missing Orders, so I don't want to go into the 'Foreach Order Loop' at all.
This is a known issue Microsoft is aware of: https://connect.microsoft.com/SQLServer/feedback/details/742282/ssis-2012-wont-assign-null-values-from-sql-queries-to-variables-of-type-string-inside-a-foreach-loop
I would suggest to add an ISNULL(RecordID, 0) to the query as well as set an expression to the component "Load missing Orders" in order to enable it only when RecordID != 0.
In my case it wasn't NULL causing the problem, the ID value which I loaded from database was stored as nvarchar(50), even if it was a integer, I attempted to use it as integer in SSIS and it kept giving me the same error message, this worked for me:
SELECT CAST(id as INT) FROM dbo.Table

Source data type "200" not found error when exporting query results to excel Microsoft SQL Server 2012

I am very new to Microsoft SQL Server and am using 2012 Management Studio. I get the error above when I try to export query results to an excel file using the wizard. I have seen solutions posted elsewhere for this error but do not know enough to figure out how to implement the solutions recommended. Can somebody please walk me through one of these solutions step by step?
I believe my problem is that the SQL Server Import and Export Wizard Does Not Recognise Varchar and NVarchar which I believe is the data type for the columns that I am receiving errors for.
Source Type 200 in SQL Server Import and Export Wizard?
http://connect.microsoft.com/SQLServer/feedback/details/775897/sql-server-import-and-export-wizard-does-not-recognise-varchar-and-nvarchar#
Query:
SELECT licenseEntitlement.entID, licenseEntitlement.entStartDate, entEndDate, quote.quoteId, quote.accountId, quote.clientId, quote.clientName, quote.contactName,
quote.contactEmail, quote.extReference, quote.purchaseOrderNumber, quote.linkedTicket
FROM licenseEntitlement INNER JOIN
quote ON quote.quoteId = SUBSTRING(licenseEntitlement.entComments, 12, PATINDEX('% Created%', licenseEntitlement.entComments) - 12)
inner join sophos521.dbo.computersanddeletedcomputers on computersanddeletedcomputers.name = entid and IsNumeric(computersanddeletedcomputers.name) = 1
WHERE (licenseEntitlement.entType = 'AVS') AND (licenseEntitlement.entComments LIKE 'OV Order + %') and entenddate < '4/1/2014'
ORDER BY licenseEntitlement.entEndDate
Error:
TITLE: SQL Server Import and Export Wizard
------------------------------
Column information for the source and the destination data could not be retrieved, or the data types of source columns were not mapped correctly to those available on the destination provider.
[Query] -> `Query`:
- Column "accountId": Source data type "200" was not found in the data type mapping file.
- Column "clientId": Source data type "200" was not found in the data type mapping file.
- Column "clientName": Source data type "200" was not found in the data type mapping file.
- Column "contactName": Source data type "200" was not found in the data type mapping file.
- Column "contactEmail": Source data type "200" was not found in the data type mapping file.
- Column "extReference": Source data type "200" was not found in the data type mapping file.
- Column "purchaseOrderNumber": Source data type "200" was not found in the data type mapping file.
- Column "linkedTicket": Source data type "200" was not found in the data type mapping file.
If any more details are needed please let me know
So, implementing the suggestion at the StackOverflow link you gave, of turning the query into a View, here's an example of what that could look like (with some code formatting ;) --
CREATE VIEW [dbo].[test__View_1]
AS
SELECT LIC.entID, LIC.entStartDate, entEndDate,
quote.quoteId, quote.accountId, quote.clientId, quote.clientName,
quote.contactName, quote.contactEmail, quote.extReference,
quote.purchaseOrderNumber, quote.linkedTicket
FROM [dbo].licenseEntitlement LIC WITH(NOLOCK)
INNER JOIN [dbo].quote WITH(NOLOCK)
ON quote.quoteId = SUBSTRING(LIC.entComments, 12,
PATINDEX('% Created%', LIC.entComments) - 12)
INNER JOIN sophos521.dbo.computersanddeletedcomputers COMPS WITH(NOLOCK)
ON COMPS.name = entid and IsNumeric(COMPS.name) = 1
WHERE (LIC.entType = 'AVS')
AND (LIC.entComments LIKE 'OV Order + %')
and (entenddate < '4/1/2014')
ORDER BY LIC.entEndDate
GO
Then, you would export from test__View_1 (or whatever real name you choose for it), as if test__View_1 was the table name.
FYI, after the first time you've executed the above -- after you've "created" the view -- then from then on, the view's first line (during modifications) changes, from CREATE VIEW, to ALTER VIEW.
((And, aside from the bug question... in your WHERE clause, did you intend entComments LIKE 'OV Order + %', or was that really intended to be entComments LIKE 'OV Order%'? I've made that change, in the alternative example code, below.))
Note: if you're going to be exporting repeatedly (or re-using) the output from one run, and especially if your query is slow or hogs the machine... then instead of a VIEW, you might prefer a SELECT INTO, to create a table once, which can be quickly re-used. (I would also choose SELECT INTO rather than CREATE VIEW, when developing a one-time-only query for export.)
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'zz_LIC_ENT_DETAIL')
DROP TABLE [dbo].zz_LIC_ENT_DETAIL
SELECT LIC.entID, LIC.entStartDate, LIC.entEndDate,
quote.quoteId, quote.accountId, quote.clientId, quote.clientName,
quote.contactName, quote.contactEmail, quote.extReference,
quote.purchaseOrderNumber, quote.linkedTicket
INTO [dbo].zz_LIC_ENT_DETAIL
FROM [dbo].licenseEntitlement LIC WITH(NOLOCK)
INNER JOIN [dbo].quote WITH(NOLOCK)
ON quote.quoteId = SUBSTRING(LIC.entComments, 12,
PATINDEX('% Created%', LIC.entComments) - 12)
INNER JOIN sophos521.dbo.computersanddeletedcomputers COMPS WITH(NOLOCK)
ON COMPS.name = LIC.entid and IsNumeric(COMPS.name) = 1
WHERE (LIC.entType = 'AVS')
AND (LIC.entComments LIKE 'OV Order%')
and (LIC.entenddate < '4/1/2014')
ORDER BY LIC.entEndDate
Then, you would of course export from table zz_LIC_ENT_DETAIL (or whatever table name you chose).
Hope that helps...
It might be easier to right click query results window and choosing Save Results As (CSV)..
To append the column names in the first row you'd also need to modify your query in this way (note the cast for int or datetime columns):
select 'col1', 'col2', 'col3'
union all
select cast(id as varchar(10)), name, cast(someinfo as varchar(28))
from Question1355876

How do I update an XML column in sql server by checking for the value of two nodes including one which needs to do a contains (like) comparison

I have an xml column called OrderXML in an Orders table...
there is an XML XPath like this in the table...
/Order/InternalInformation/InternalOrderBreakout/InternalOrderHeader/InternalOrderDetails/InternalOrderDetail
There InternalOrderDetails contains many InternalOrderDetail nodes like this...
<InternalOrderDetails>
<InternalOrderDetail>
<Item_Number>FBL11REFBK</Item_Number>
<CountOfNumber>10</CountOfNumber>
<PriceLevel>FREE</PriceLevel>
</InternalOrderDetail>
<InternalOrderDetail>
<Item_Number>FCL13COTRGUID</Item_Number>
<CountOfNumber>2</CountOfNumber>
<PriceLevel>NONFREE</PriceLevel>
</InternalOrderDetail>
</InternalOrderDetails>
My end goal is to modify the XML in the OrderXML column IF the Item_Number of the node contains COTRGUID (like '%COTRGUID') AND the PriceLevel=NONFREE. If that condition is met I want to change the PriceLevel column to equal FREE.
I am having trouble with both creating the xpath expression that finds the correct nodes (using OrderXML.value or OrderXML.exist functions) and updating the XML using the OrderXML.modify function).
I have tried the following for the where clause:
WHERE OrderXML.value('(/Order/InternalInformation/InternalOrderBreakout/InternalOrderHeader/InternalOrderDetails/InternalOrderDetail/Item_Number/node())[1]','nvarchar(64)') like '%13COTRGUID'
That does work, but it seems to me that I need to ALSO include my second condition (PriceLevel=NONFREE) in the same where clause and I cannot figure out how to do it. Perhaps I can put in an AND for the second condition like this...
AND OrderXML.value('(/Order/InternalInformation/InternalOrderBreakout/InternalOrderHeader/InternalOrderDetails/InternalOrderDetail/PriceLevel/node())[1]','nvarchar(64)') = 'NONFREE'
but I am afraid it will end up operating like an OR since it is an XML query.
Once I get the WHERE clause right I will update the column using a SET like this:
UPDATE Orders SET orderXml.modify('replace value of (/Order/InternalInformation/InternalOrderBreakout/InternalOrderHeader/InternalOrderDetails/InternalOrderDetail/PriceLevel[1]/text())[1] with "NONFREE"')
However, I ran this statement on some test data and none of the XML columns where updated (even though it said zz rows effected).
I have been at this for several hours to no avail. Help is appreciated. Thanks.
if you don't have more than one node with your condition in each row of Orders table, you can use this:
update orders set
data.modify('
replace value of
(
/Order/InternalInformation/InternalOrderBreakout/
InternalOrderHeader/InternalOrderDetails/
InternalOrderDetail[
Item_Number[contains(., "COTRGUID")] and
PriceLevel="NONFREE"
]/PriceLevel/text()
)[1]
with "FREE"
');
sql fiddle demo
If you could have more than one node in one row, there're a several possible solutions, none of each is really elegant, sadly.
You can reconstruct all xmls in table - sql fiddle demo
or you can do your updates in the loop - sql fiddle demo
This may get you off the hump.
Replace #HolderTable with the name of your table.
SELECT T2.myAlias.query('./../PriceLevel[1]').value('.' , 'varchar(64)') as MyXmlFragmentValue
FROM #HolderTable
CROSS APPLY OrderXML.nodes('/InternalOrderDetails/InternalOrderDetail/Item_Number') as T2(myAlias)
SELECT T2.myAlias.query('.') as MyXmlFragment
FROM #HolderTable
CROSS APPLY OrderXML.nodes('/InternalOrderDetails/InternalOrderDetail/Item_Number') as T2(myAlias)
EDIT:
UPDATE
#HolderTable
SET
OrderXML.modify('replace value of (/InternalOrderDetails/InternalOrderDetail/PriceLevel/text())[1] with "MyNewValue"')
WHERE
OrderXML.value('(/InternalOrderDetails/InternalOrderDetail/PriceLevel)[1]', 'varchar(64)') = 'FREE'
print ##ROWCOUNT
Your issue is the [1] in the above.
Why did I put it there?
Here is a sentence from the URL listed below.
Note that the target being updated must be, at most, one node that is explicitly specified in the path expression by adding a "[1]" at the end of the expression.
http://msdn.microsoft.com/en-us/library/ms190675.aspx
EDIT.
I think I've discovered the the root of your frustration. (No fix, just the problem).
Note below, the second query works.
So I think the [1] is some cases is saying "only ~~search~~ the first node".....and not (as you and I were hoping)...... "use the first node..after you find a match".
UPDATE
#HolderTable
SET
OrderXML.modify('replace value of (/InternalOrderDetails/InternalOrderDetail/PriceLevel/text())[1] with "MyNewValue001"')
WHERE
OrderXML.value('(/InternalOrderDetails/InternalOrderDetail/PriceLevel[text() = "NONFREE"])[1]', 'varchar(64)') = 'NONFREE'
/* and OrderXML.value('(/InternalOrderDetails/InternalOrderDetail/Item_Number)[1]', 'varchar(64)') like '%COTRGUID' */
UPDATE
#HolderTable
SET
OrderXML.modify('replace value of (/InternalOrderDetails/InternalOrderDetail/PriceLevel/text())[1] with "MyNewValue002"')
WHERE
OrderXML.value('(/InternalOrderDetails/InternalOrderDetail/PriceLevel[text() = "FREE"])[1]', 'varchar(64)') = 'FREE'
Try this :
;with InternalOrderDetail as (SELECT id,
Tbl.Col.value('Item_Number[1]', 'varchar(40)') Item_Number,
Tbl.Col.value('CountOfNumber[1]', 'int') CountOfNumber,
case
when Tbl.Col.value('Item_Number[1]', 'varchar(40)') like '%COTRGUID'
and Tbl.Col.value('PriceLevel[1]', 'varchar(40)')='NONFREE'
then 'FREE'
else
Tbl.Col.value('PriceLevel[1]', 'varchar(40)')
end
PriceLevel
FROM (select id ,orderxml from demo)
as a cross apply orderxml.nodes('//InternalOrderDetail')
as
tbl(col) ) ,
cte_data as(SELECT
ID,
'<InternalOrderDetails>'+(SELECT ITEM_NUMBER,COUNTOFNUMBER,PRICELEVEL
FROM InternalOrderDetail
where ID=Results.ID
FOR XML AUTO, ELEMENTS)+'</InternalOrderDetails>' as XML_data
FROM InternalOrderDetail Results
GROUP BY ID)
update demo set orderxml=cast(xml_data as xml)
from demo
inner join cte_data on demo.id=cte_data.id
where cast(orderxml as varchar(2000))!=xml_data;
select * from demo;
SQL Fiddle
I have handled following cases :
1. As required both where clause in question.
2. It will update all <Item_Number> like '%COTRGUID' and <PriceLevel>= NONFREE in one
node, not just the first one.
It may require minor changes for your data and tables.

Resources