I have a large database system in SQL Server 2012 and a front end interface written in ASP. Users can query the database and I also have a dictionary to pick up typos and alternate spellings.
Historically users could have a single word search, or a multi word AND or OR search. I have just finished writing an exact multi-word search, and the search includes the multiple variations from the dictionary.
In the examples below I am therefore searching for "stroke unit" or "storke unit" in various fields. Note that the working version is text that I cut and pasted from the output of the non-working version, and is not normally present. Apologies for the length of the statements; I just have a lot of fields to search. The AND and OR searches have much longer statements, but I have never had any problem with them.
I have absolutely no idea why one version works and the other doesn't, so I would be grateful for any tips. (I also don't normally check for errors after every line of a connection statement; I was just hoping something might appear.)
My code is currently:
Err.Clear
LineOut("Version 1: Not working")
Session("Version1")="INSERT INTO dbo.Temp_RCT_Existing_Results "+Session("DORIS_RCT_SEL")+Session("DORIS_RCT_FROM2")+" ORDER BY Acronym, Year DESC, Authors, Title"
LineOut("<BR>-<BR>["+Session("Version1")+"]")
SET conn =Server.CreateObject("ADODB.Connection")
LineOut("Error Number on Create ="+CStr(Err.Number))
conn.Open adserverlocn
LineOut("Error Number on Open="+CStr(Err.Number))
conn.Execute Session("Version1")
LineOut("Error Number on Execute="+CStr(Err.Number))
conn.Close
SET conn=Nothing
LineOut("<BR>-<BR>Version 2: Working")
Session("Version2")="SELECT DISTINCT REF.REF_ID AS RefID, REF.REF_LTITLE AS Title, REF.REF_AUTHRS AS Authors, REF.REF_YEAR AS Year, STUDY.ST_ACRONYM AS Acronym, REF.REF_CITATION AS Citation,REF.REF_URL,REF.REF_PUBID, STUDY.ST_INTDTLS AS StudyDets, STUDY.ST_ISRCTN ,STUDY.ST_NCT, STUDY.ST_Registers,STUDY.ST_UKCRN,STUDY.ST_NAME,STUDY.ST_SYSREV,STUDY.ST_STATUS,REF.REF_DORIS_ORDER, 'Bob2' AS BelongsTo,ST_DOI, '2016-11-08' AS SearchDate, STUDY.ST_SIZE, REF_ORTITL AS OTitle FROM INTERV INNER JOIN ST_INT ON INTERV.INT_ID = ST_INT.[INT] INNER JOIN STUDY ON ST_INT.ST = STUDY.STUDY_ID INNER JOIN REF_ST ON STUDY.STUDY_ID = REF_ST.ST INNER JOIN REF ON REF_ST.REF = REF.REF_ID WHERE STUDY.ST_CONFID='OPEN' AND REF.REF_PUBSTA='PUBLISHED' AND (((REF.REF_LTITLE LIKE '%stroke UNIT %' OR STUDY.ST_INTDTLS LIKE '%stroke UNIT %' OR INTERV.INT_METHOD LIKE '%stroke UNIT %' OR INTERV.INT_CODE1 LIKE '%stroke UNIT %' OR INTERV.INT_CODE2 LIKE '%stroke UNIT %' OR INTERV.INT_CODE3 LIKE '%stroke UNIT %' OR INTERV.INT_CODE4 LIKE '%stroke UNIT %' OR INTERV.INT_DISEASE LIKE '%stroke UNIT %' OR INTERV.INT_CONDITION LIKE '%stroke UNIT %')) OR ((REF.REF_LTITLE LIKE '%storke UNIT %' OR STUDY.ST_INTDTLS LIKE '%storke UNIT %' OR INTERV.INT_METHOD LIKE '%storke UNIT %' OR INTERV.INT_CODE1 LIKE '%storke UNIT %' OR INTERV.INT_CODE2 LIKE '%storke UNIT %' OR INTERV.INT_CODE3 LIKE '%storke UNIT %' OR INTERV.INT_CODE4 LIKE '%storke UNIT %' OR INTERV.INT_DISEASE LIKE '%storke UNIT %' OR INTERV.INT_CONDITION LIKE '%storke UNIT %'))) AND (Study.st_sysrev='-') AND (study.st_status='COMPLETED')"
Session("Version2")="INSERT INTO dbo.Temp_RCT_Existing_Results "+Session("Version2")+" ORDER BY Acronym, Year DESC, Authors, Title"
LineOut("<BR>-<BR>["+Session("Version2")+"]")
SET conn =Server.CreateObject("ADODB.Connection")
conn.Open adserverlocn
conn.Execute Session("Version2")
conn.Close
SET conn=Nothing
The output for this is
Version 1: Not working
INSERT INTO DBO.temp_rct_existing_results
SELECT DISTINCT ref.ref_id AS RefID,
ref.ref_ltitle AS Title,
ref.ref_authrs AS Authors,
ref.ref_year AS Year,
study.st_acronym AS Acronym,
ref.ref_citation AS Citation,
ref.ref_url,
ref.ref_pubid,
study.st_intdtls AS StudyDets,
study.st_isrctn,
study.st_nct,
study.st_registers,
study.st_ukcrn,
study.st_name,
study.st_sysrev,
study.st_status,
ref.ref_doris_order,
'Bob' AS BelongsTo,
st_doi,
'2016-11-08' AS SearchDate,
study.st_size,
ref_ortitl AS OTitle
FROM interv
INNER JOIN st_int
ON interv.int_id = st_int.[int]
INNER JOIN study
ON st_int.st = study.study_id
INNER JOIN ref_st
ON study.study_id = ref_st.st
INNER JOIN ref
ON ref_st.ref = ref.ref_id
WHERE study.st_confid = 'OPEN'
AND ref.ref_pubsta = 'PUBLISHED'
AND ( (( ref.ref_ltitle LIKE '%stroke UNIT %'
OR study.st_intdtls LIKE '%stroke UNIT %'
OR interv.int_method LIKE '%stroke UNIT %'
OR interv.int_code1 LIKE '%stroke UNIT %'
OR interv.int_code2 LIKE '%stroke UNIT %'
OR interv.int_code3 LIKE '%stroke UNIT %'
OR interv.int_code4 LIKE '%stroke UNIT %'
OR interv.int_disease LIKE '%stroke UNIT %'
OR interv.int_condition LIKE '%stroke UNIT %' ))
OR (( ref.ref_ltitle LIKE '%storke UNIT %'
OR study.st_intdtls LIKE '%storke UNIT %'
OR interv.int_method LIKE '%storke UNIT %'
OR interv.int_code1 LIKE '%storke UNIT %'
OR interv.int_code2 LIKE '%storke UNIT %'
OR interv.int_code3 LIKE '%storke UNIT %'
OR interv.int_code4 LIKE '%storke UNIT %'
OR interv.int_disease LIKE '%storke UNIT %'
OR interv.int_condition LIKE '%storke UNIT %' )) )
AND ( study.st_sysrev = '-' )
AND ( study.st_status = 'COMPLETED' )
ORDER BY acronym,
year DESC,
authors,
title
Error Number on Create =0
Error Number on Open=0
Error Number on Execute=0
Version 2: Working
INSERT INTO DBO.temp_rct_existing_results
SELECT DISTINCT ref.ref_id AS RefID,
ref.ref_ltitle AS Title,
ref.ref_authrs AS Authors,
ref.ref_year AS Year,
study.st_acronym AS Acronym,
ref.ref_citation AS Citation,
ref.ref_url,
ref.ref_pubid,
study.st_intdtls AS StudyDets,
study.st_isrctn,
study.st_nct,
study.st_registers,
study.st_ukcrn,
study.st_name,
study.st_sysrev,
study.st_status,
ref.ref_doris_order,
'Bob2' AS BelongsTo,
st_doi,
'2016-11-08' AS SearchDate,
study.st_size,
ref_ortitl AS OTitle
FROM interv
INNER JOIN st_int
ON interv.int_id = st_int.[int]
INNER JOIN study
ON st_int.st = study.study_id
INNER JOIN ref_st
ON study.study_id = ref_st.st
INNER JOIN ref
ON ref_st.ref = ref.ref_id
WHERE study.st_confid = 'OPEN'
AND ref.ref_pubsta = 'PUBLISHED'
AND ( (( ref.ref_ltitle LIKE '%stroke UNIT %'
OR study.st_intdtls LIKE '%stroke UNIT %'
OR interv.int_method LIKE '%stroke UNIT %'
OR interv.int_code1 LIKE '%stroke UNIT %'
OR interv.int_code2 LIKE '%stroke UNIT %'
OR interv.int_code3 LIKE '%stroke UNIT %'
OR interv.int_code4 LIKE '%stroke UNIT %'
OR interv.int_disease LIKE '%stroke UNIT %'
OR interv.int_condition LIKE '%stroke UNIT %' ))
OR (( ref.ref_ltitle LIKE '%storke UNIT %'
OR study.st_intdtls LIKE '%storke UNIT %'
OR interv.int_method LIKE '%storke UNIT %'
OR interv.int_code1 LIKE '%storke UNIT %'
OR interv.int_code2 LIKE '%storke UNIT %'
OR interv.int_code3 LIKE '%storke UNIT %'
OR interv.int_code4 LIKE '%storke UNIT %'
OR interv.int_disease LIKE '%storke UNIT %'
OR interv.int_condition LIKE '%storke UNIT %' )) )
AND ( study.st_sysrev = '-' )
AND ( study.st_status = 'COMPLETED' )
ORDER BY acronym,
year DESC,
authors,
title
If I cut and paste the non-working version into SQL Server Management Studio, it works perfectly. I have also swapped the order of the working and non-working versions to see whether it makes any difference.
Many thanks in advance.
I'm trying to convert the query below to Snowflake, but what I came up with keeps giving me an error that it couldn't convert '04/17/22' to a numeric value.
SQL:
SELECT
user_id AS u_id,
Substring(Max( CONVERT(VARCHAR(10), system_modstamp, 121) +
CASE -- Categorizing all of the team roles
WHEN team_member_role LIKE 'AM%'
OR team_member_role LIKE '%AM %'
OR team_member_role LIKE 'ASR%'
THEN 'AM Sales'
WHEN team_member_role LIKE '%fsr%'
THEN 'FSR'
WHEN team_member_role LIKE '%RSD%'
AND team_member_role NOT LIKE '%parts%'
THEN 'AC Sales'
WHEN team_member_role LIKE 'RSA%'
THEN 'AC Sales'
ELSE team_member_role
END
), 11, 99) AS team_groups, Max(system_modstamp) AS SYSTEM_MODSTAMP
FROM S_SFDC_ACCOUNT_TEAM
GROUP BY user_id
Snowflake:
SELECT
user_id AS u_id,
SUBSTR(Max( TO_VARCHAR( system_modstamp,'YYYY-MM-DD') +
CASE WHEN team_member_role LIKE 'AM%' OR team_member_role LIKE '%AM %' OR team_member_role LIKE 'ASR%' THEN 'AM Sales' WHEN team_member_role LIKE '%fsr%' THEN 'FSR' WHEN team_member_role LIKE '%RSD%' AND team_member_role NOT LIKE '%parts%' THEN 'AC Sales' WHEN team_member_role LIKE 'RSA%' THEN 'AC Sales' ELSE team_member_role END
), 11, 99) AS team_groups, Max(system_modstamp) AS SYSTEM_MODSTAMP
FROM S_SFDC_ACCOUNT_TEAM
GROUP BY user_id
For closure, as expressed in the comments:
The issue was using + for string concatenation; in Snowflake you need to use || (or CONCAT) instead.
The error "couldn't convert '...' to a numeric value" shows that TO_VARCHAR did turn the date into a string, and that Snowflake then tried to convert that string to a number in order to apply the + operator.
Thanks Pankaj and Mike!
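For reference, a sketch of the Snowflake query from the question with + swapped for ||. Table and column names are taken from the question; it has not been run against the real data, so treat it as a starting point rather than a verified conversion.

```sql
SELECT
    user_id AS u_id,
    -- The 10-character date prefix makes MAX pick the most recent row;
    -- SUBSTR(..., 11, 99) then strips that prefix off again.
    SUBSTR(MAX(TO_VARCHAR(system_modstamp, 'YYYY-MM-DD') ||
        CASE -- Categorizing all of the team roles
            WHEN team_member_role LIKE 'AM%'
              OR team_member_role LIKE '%AM %'
              OR team_member_role LIKE 'ASR%'
                THEN 'AM Sales'
            WHEN team_member_role LIKE '%fsr%'
                THEN 'FSR'
            WHEN team_member_role LIKE '%RSD%'
             AND team_member_role NOT LIKE '%parts%'
                THEN 'AC Sales'
            WHEN team_member_role LIKE 'RSA%'
                THEN 'AC Sales'
            ELSE team_member_role
        END
    ), 11, 99) AS team_groups,
    MAX(system_modstamp) AS system_modstamp
FROM S_SFDC_ACCOUNT_TEAM
GROUP BY user_id;
```

One thing to watch: LIKE in Snowflake is case-sensitive, whereas in SQL Server it depends on the collation. If patterns such as '%fsr%' matched upper-case roles in SQL Server, ILIKE may be needed to get the same rows.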
I am using SQL Server Management Studio. I have inherited a query that has a section that looks like:
...
ISNULL(
CASE
WHEN LOWER(PersonClass.Detail) LIKE '%student%'
THEN SUBSTRING((
SELECT DISTINCT
' / '+STUDENT_TERM.DEPT_NAME
FROM Warehouse.STUDENT_TERM STUDENT_TERM
INNER JOIN Warehouse.TERM TERM
ON STUDENT_TERM.TERM_CD = TERM.TERM_CD
AND TERM.TERM_START_DT <= @fyEnd
AND ISNULL(TERM.TERM_END_DT, GETDATE()) >= @fyStart
WHERE Persons.DWPersonId = STUDENT_TERM.DWPERSID FOR
XML PATH('')
), 4, 100000)
END, '') AS StudentHome,
...
This is finding a student's "home department". There is the possibility that a student could have more than one home so the above works a bit like MySQL's group_concat.
My question is about an unintended artifact of the query. Several departments have names in the data warehouse that have embedded ampersands & in them like:
A & B
The result of the query, though, is "HTML encoded", turning "A & B" into "A &amp; B".
If I run the inner query the result is as expected with a simple ampersand and not the encoded form. I am guessing that the FOR XML is doing the encoding.
Is there a way to do the group_concat without having the result encoded?
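You can read the value back out of the XML instead of letting it be cast to a string. Add TYPE to FOR XML PATH('') so the subquery returns xml, then call .value() on it: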
ISNULL(
CASE
WHEN LOWER(PersonClass.Detail) LIKE '%student%'
THEN SUBSTRING((
SELECT DISTINCT
' / '+STUDENT_TERM.DEPT_NAME
FROM Warehouse.STUDENT_TERM STUDENT_TERM
INNER JOIN Warehouse.TERM TERM
ON STUDENT_TERM.TERM_CD = TERM.TERM_CD
AND TERM.TERM_START_DT <= @fyEnd
AND ISNULL(TERM.TERM_END_DT, GETDATE()) >= @fyStart
WHERE Persons.DWPersonId = STUDENT_TERM.DWPERSID FOR
XML PATH(''),TYPE
).value(N'.','nvarchar(max)') , 4, 100000)
END, '') AS StudentHome,
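A minimal, self-contained sketch of the difference, using a made-up table variable (@Dept and its two rows are hypothetical and only stand in for the warehouse tables):

```sql
-- Hypothetical sample data standing in for STUDENT_TERM.DEPT_NAME
DECLARE @Dept TABLE (DeptName nvarchar(50));
INSERT INTO @Dept (DeptName) VALUES (N'A & B'), (N'C & D');

SELECT
    -- Plain FOR XML PATH('') yields a string in which & is entitized
    SUBSTRING((
        SELECT ' / ' + DeptName
        FROM @Dept
        ORDER BY DeptName
        FOR XML PATH('')
    ), 4, 100000) AS Encoded,
    -- TYPE returns an xml value; .value() reads its text with entities decoded
    SUBSTRING((
        SELECT ' / ' + DeptName
        FROM @Dept
        ORDER BY DeptName
        FOR XML PATH(''), TYPE
    ).value(N'.', 'nvarchar(max)'), 4, 100000) AS Decoded;
```

The Encoded column comes back as "A &amp;amp; B / C &amp;amp; D" and the Decoded column as "A & B / C & D". The same applies to < and >, which plain FOR XML also entitizes.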
I am working with a set of vehicle data that uses the following query:
SELECT
VIN_NUM AS [Registration VIN]
,REGION_IND AS [Location of Registration]
,REG_CHANGE AS [Changed Location Since Last Check]
,CASE
WHEN REG_CHANGE = '' THEN REGION_IND
ELSE REG_CHANGE
END AS [Final Location]
FROM
dbo.All_Tests
WHERE
VIN_NUM LIKE '1FM%' AND
CASE
WHEN REGION_IND = '1' THEN 'Upstate'
WHEN REGION_IND = '2' THEN 'Downstate'
ELSE 'Unknown'
END = 'Downstate'
The query pulls from a table a vehicle VIN (VIN_NUM) and whether it is located in one of two regions (REGION_IND), "1" or "2". It also pulls a column, "REG_CHANGE" checking if the vehicle registration has changed location between the two regions since last report. All three come from the same table.
REG_CHANGE is blank (not NULL) if there was no change, and contains the new region location, '1' or '2', if there was a change. This is used in a CASE statement with REGION_IND to give a current location to all vehicles in the database, alias name [Final Location].
The code works if I want the original regions, since REGION_IND is a table column. However, I can't filter on [Final Location], because a WHERE clause can't reference a column alias defined in the SELECT list. I'm thinking this would need a subquery construct within the SELECT columns, but I'm not certain how it would be structured.
Does anyone have any suggestions?
A useful approach for this is to use an APPLY operator within the FROM clause, which then does permit the use of that column alias within the WHERE clause:
SELECT
VIN_NUM AS [registration vin]
, REGION_IND AS [location of registration]
, REG_CHANGE AS [changed location since last check]
, ca.[final location]
FROM dbo.All_Tests
CROSS APPLY (
SELECT
CASE
WHEN REG_CHANGE = '' THEN REGION_IND
ELSE REG_CHANGE
END AS [final location]
) ca
WHERE VIN_NUM LIKE '1FM%'
AND ca.[final location] = '2' -- '2' = Downstate; the column holds the region code, not the name
Any further uses of APPLY that follow can also use these column aliases.
btw: Although a SQL SELECT query is written starting with the SELECT clause, that clause is logically evaluated after the FROM and WHERE clauses. Defining a column alias in the FROM clause therefore makes that alias available much earlier in the processing of the query.
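As a sketch of how a later APPLY can build on an earlier alias: here a second CROSS APPLY maps the final region code to its name, so the WHERE clause can filter on the name directly. The table variable @All_Tests, its column types and the sample rows are made up for illustration; only the column names come from the question.

```sql
-- Hypothetical stand-in for dbo.All_Tests
DECLARE @All_Tests TABLE (
    VIN_NUM    varchar(17),
    REGION_IND varchar(1),
    REG_CHANGE varchar(1)
);
INSERT INTO @All_Tests (VIN_NUM, REGION_IND, REG_CHANGE) VALUES
    ('1FMAAAAAAAAAAAAA1', '1', ''),   -- stays Upstate
    ('1FMAAAAAAAAAAAAA2', '1', '2'),  -- moved to Downstate
    ('1FMAAAAAAAAAAAAA3', '2', '1'),  -- moved to Upstate
    ('2GCAAAAAAAAAAAAA4', '2', '');   -- Downstate, but not a 1FM VIN

SELECT
    t.VIN_NUM    AS [Registration VIN]
  , t.REGION_IND AS [Location of Registration]
  , t.REG_CHANGE AS [Changed Location Since Last Check]
  , loc.[Final Location]
  , reg.[Final Region]
FROM @All_Tests AS t
CROSS APPLY (
    -- First alias: the current region code
    SELECT CASE
               WHEN t.REG_CHANGE = '' THEN t.REGION_IND
               ELSE t.REG_CHANGE
           END AS [Final Location]
) AS loc
CROSS APPLY (
    -- Second alias, built on the first: the region name
    SELECT CASE loc.[Final Location]
               WHEN '1' THEN 'Upstate'
               WHEN '2' THEN 'Downstate'
               ELSE 'Unknown'
           END AS [Final Region]
) AS reg
WHERE t.VIN_NUM LIKE '1FM%'
  AND reg.[Final Region] = 'Downstate';
```

With the sample rows above, only the second vehicle is returned, since it is the only 1FM VIN whose final region code is '2'.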
You can also repeat the logic directly in the WHERE clause, like below:
SELECT
VIN_NUM AS [Registration VIN]
,REGION_IND AS [Location of Registration]
,REG_CHANGE AS [Changed Location Since Last Check]
,CASE
WHEN REG_CHANGE = '' THEN REGION_IND
ELSE REG_CHANGE
END AS [Final Location]
FROM
dbo.All_Tests
WHERE
VIN_NUM LIKE '1FM%' AND
( -- '2' = Downstate
(REG_CHANGE = '' AND REGION_IND = '2') OR
(REG_CHANGE = '2')
)
I have data like :
My table
My final results should be like this:
My SQL Statement:
SELECT 'Q'+cast([Month_Quarter] as varchar) Month_Quarter,COALESCE([Zugänge],0) Zugänge,COALESCE([Abgänge],0) Abgänge
FROM
(
SELECT DATEPART(QUARTER,[Monat]) [Month_Quarter],
[Zu-, Abgang],
Count(DISTINCT [Projektdefinition DB]) NoProjects
FROM AbZugänge
GROUP BY DATEPART(QUARTER,[Monat]), [Zu-, Abgang]
) proj
PIVOT (SUM(NoProjects) FOR [Zu-, Abgang] IN (Zugänge, Abgänge)) As pvt
WHERE [Month_Quarter] is not null
ORDER BY Month_Quarter
But with this statement I get the results without the amount columns for Zugang and Abgang:
How can I edit the statement to get the aggregation amount columns?
I suppose you can just wrap your query inside another select statement, then use GROUP BY. Something like this:
SELECT Month, SUM(ISNULL(column_name,0))
FROM (Your Query in here)
GROUP BY Month
I'm not sure I understand the point of the PIVOT in your original query. It looks like a plain conditional aggregate is all that is required. See if this is what you need.
SELECT DATENAME(MONTH,Monat) [Month]
, sum(case when [Zu-, Abgang] = 'Zugänge' then [Zu-, Abgang] else 0 end) as Zugänge
, SUM(case when [Zu-, Abgang] = 'Abgänge' then [Zu-, Abgang] else 0 end) as Abgänge
, SUM([GWU aktuell]) as [GWU Total]
, SUM([GWU Planung aktuell]) AS [Plan Total]
, COUNT(DISTINCT [Projektdefinition DB]) NoProjects
FROM AbZugänge
group by DATENAME(MONTH,Monat)
I have a query that I am working on and it is displaying performance issues that I would not have expected. Here is the query so far.
INSERT INTO #Bridge (PolicyNumber, ProducerCode, BridgeDate, EffectiveDate, FirstName, LastName, LicenseNumber, BirthDate, Address, City, State, ZipCode)
SELECT tab.col.value('@PolicyNumber', 'VARCHAR(10)') AS PolicyNumber,
tab.col.value('@ProducerCode','VARCHAR(10)') as ProducerCode,
tab.col.value('@BridgeDate','DATETIME') AS BridgeDate,
tab.col.value('@EffectiveDate', 'DATETIME') as EffectiveDate,
tab.col.value('@FirstName', 'VARCHAR(200)') as FirstName,
tab.col.value('@LastName', 'VARCHAR(200)') as LastName,
CASE
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%0000%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%1111%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%2222%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%3333%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%4444%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%5555%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%6666%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%7777%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%8888%' THEN NULL
WHEN tab.col.value('@LicenseNumber','VARCHAR(50)') LIKE '%9999%' THEN NULL
ELSE tab.col.value('@LicenseNumber','VARCHAR(50)')
END as LicenseNumber,
tab.col.value('@BirthDate','DATETIME') as BirthDate,
REPLACE(tab.col.value('@Address1','VARCHAR(300)'), ' APT ',' #') as Address1,
tab.col.value('@City','VARCHAR(300)') as City,
tab.col.value('@State','VARCHAR(5)') as State,
tab.col.value('@ZipCode','VARCHAR(10)') as Zip
FROM @xml.nodes('//rows/datarow') as tab(col)
SELECT B.PolicyNumber,
B.ProducerCode,
B.BridgeDate,
B.EffectiveDate,
H.current_policy,
H.cancel_date,
H.first_eff_date,
H.display_address,
H.city,
H.state,
H.zip
FROM #Bridge B
LEFT JOIN (
SELECT P.policy_id,
P.current_policy,
CASE
WHEN A.pobox <> '' THEN 'PO BOX ' + REPLACE(A.pobox,'PO BOX ','')
ELSE RTRIM(A.house_num + ' ' + A.street_name + ' ' + CASE
WHEN A.apt_num = '' THEN ''
ELSE '#' + A.apt_num
END)
END as display_address,
A.pobox,
A.house_num,
A.street_name,
A.apt_num,
A.city,
MAX(A.policyimage_num) as policimage_num, --this is just to limit the results to the most recent
S.state,
A.zip,
P.first_eff_date,
P.cancel_date
FROM Diamond.dbo.Policy P WITH (NOLOCK)
LEFT JOIN Diamond.dbo.Address A WITH (NOLOCK)
ON P.policy_id = A.policy_id
AND A.nameaddresssource_id = 3
LEFT JOIN Diamond.dbo.State S WITH (NOLOCK)
ON A.state_id = S.state_id
WHERE A.state_id IS NOT NULL
AND P.current_policy NOT IN (SELECT PolicyNumber FROM #Bridge)
GROUP BY P.policy_id,
P.current_policy,
P.cancel_date,
P.first_eff_date,
A.pobox,
A.house_num,
A.street_name,
A.apt_num,
A.city,
S.state,
A.zip) AS H
ON B.Address = H.display_address
AND B.State = H.state
AND B.City = H.city
AND SUBSTRING(B.ZipCode,1,5) = SUBSTRING(H.Zip,1,5)
AND B.PolicyNumber != H.current_policy
WHERE H.current_policy IS NOT NULL
This query, run by itself, finishes in about 1:30 seconds. But if I add the following to the WHERE clause
AND B.EffectiveDate != H.first_eff_date
Suddenly the query takes far longer to return results. (We are at over 15 minutes and still going while I am writing this.) I would think that simply adding a clause to weed out a few additional rows wouldn't have such a drastic effect, but apparently it does. I know how to get around it; I am just curious whether anyone has any ideas as to why it has this effect.
Without hands-on access I can only guess at this, but here are some places where I think you can tidy up and probably shave off some run time.
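1, You duplicate the effort required to make sure policy numbers don't match. The NOT IN inside the subquery already drops every policy that appears anywhere in #Bridge, so the != in the join can never filter out anything more and can be removed. The reverse is not true: the != only compares the rows the join has paired up, so it cannot stand in for the NOT IN without possibly changing the results.
i.e. with this in the subquery:
AND P.current_policy NOT IN (SELECT PolicyNumber FROM #Bridge)
this in the join is redundant:
AND B.PolicyNumber != H.current_policy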
2, It's worth a try to remove all that grouping from your sub query - you don't actually use policimage_num for anything. So why do the grouping? If you are worried that many rows are returned from Address, then you can use DISTINCT on your column set instead, that may be faster.
3, Is A.state_id a nullable value? If not consider trying an INNER JOIN to Address and removing the null check.
4, In all honesty I'm not seeing an obvious reason for that subquery at all, it seems to be over-complicating matters. Can you not simply join the tables together without it (again using DISTINCT if required)?
In other words get tweaking, I bet you can get it below the original run time if you try a few of these ideas.
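Putting points 1 to 3 together, the second query might be reshaped roughly as below. This is an untested sketch against the tables named in the question, not a verified rewrite. It assumes that duplicate address rows only need collapsing (DISTINCT) rather than picking the latest policyimage_num, and that inner joins to Address and State are acceptable because the original filters on A.state_id IS NOT NULL and joins on H.state anyway. Because DISTINCT works on fewer columns than the original GROUP BY, it may return fewer duplicate rows than the original.

```sql
SELECT B.PolicyNumber,
       B.ProducerCode,
       B.BridgeDate,
       B.EffectiveDate,
       H.current_policy,
       H.cancel_date,
       H.first_eff_date,
       H.display_address,
       H.city,
       H.state,
       H.zip
FROM #Bridge AS B
INNER JOIN (
    -- DISTINCT replaces the GROUP BY / MAX(policyimage_num) that was never used
    SELECT DISTINCT
           P.current_policy,
           P.cancel_date,
           P.first_eff_date,
           CASE
               WHEN A.pobox <> '' THEN 'PO BOX ' + REPLACE(A.pobox, 'PO BOX ', '')
               ELSE RTRIM(A.house_num + ' ' + A.street_name + ' '
                          + CASE WHEN A.apt_num = '' THEN '' ELSE '#' + A.apt_num END)
           END AS display_address,
           A.city,
           S.state,
           A.zip
    FROM Diamond.dbo.Policy AS P WITH (NOLOCK)
    INNER JOIN Diamond.dbo.Address AS A WITH (NOLOCK)
            ON P.policy_id = A.policy_id
           AND A.nameaddresssource_id = 3
    -- Inner join on state_id makes the separate IS NOT NULL check unnecessary
    INNER JOIN Diamond.dbo.State AS S WITH (NOLOCK)
            ON A.state_id = S.state_id
    -- This filter already guarantees B.PolicyNumber <> H.current_policy
    WHERE P.current_policy NOT IN (SELECT PolicyNumber FROM #Bridge)
) AS H
   ON B.Address = H.display_address
  AND B.State = H.state
  AND B.City = H.city
  AND SUBSTRING(B.ZipCode, 1, 5) = SUBSTRING(H.zip, 1, 5);
```

The extra condition from the question (B.EffectiveDate != H.first_eff_date) would go at the end of the join conditions. Comparing the execution plans with and without it should show whether the optimizer switches to a different join strategy once the predicate is added.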