Can I use 'OR' statement under CASE WHEN THEN clause? - snowflake-cloud-data-platform

I have a table my_table on Snowflake:
COMPANY   TRANSACTIONS
AMERICAN  321-AMERICAN EAGLE 123
NIKE      080* NIKE_74093
AMERICAN  00 AMERICANEAGLE_42
ADIDAS    0101ADIDAS **093
AMERICAN  987 AMERICAN AIRLINE_4
AMERICAN  17 AMERICAN-EXPRESS 02
AMERICAN  09 AMERICAN-EAGLE_42
AMERICAN  0* AMERICANAIRLINE **7
AMERICAN  101AMERICAN EXPRESS *9
COCA      98*COCA COLA __4237
The COMPANY column is the company's abbreviation (basically the first word of the company's name).
The TRANSACTIONS column holds the transaction names as they appear in my dataset. Each name carries some prefix and suffix because of different processing methods.
For the COMPANY value "AMERICAN", the TRANSACTIONS column can contain "American Eagle", "American Airline", "American Express", etc.
How can I keep only the American Eagle transactions among the rows where COMPANY is "AMERICAN", while still keeping every row for all other companies?
Resulting table I am looking for:
COMPANY   TRANSACTIONS
AMERICAN  321-AMERICAN EAGLE 123
NIKE      080* NIKE_74093
AMERICAN  00 AMERICANEAGLE_42
ADIDAS    0101ADIDAS **093
AMERICAN  09 AMERICAN-EAGLE_42
COCA      98*COCA COLA __4237
Below is my SQL query. The challenge is that even for American Eagle, the TRANSACTIONS value can look like "AMERICAN EAGLE" (space in the middle), "AMERICANEAGLE" (no space), "AMERICAN-EAGLE" (hyphen in the middle), etc. I therefore tried to use CASE WHEN ... THEN (... OR ...) in the query, but it fails with errors.
SELECT *
FROM my_table
WHERE
transactions LIKE CONCAT('%',
CASE WHEN company = 'AMERICAN' THEN ('AMERICAN EAGLE' OR 'AMERICANEAGLE' OR 'AMERICAN-EAGLE')
ELSE company END, '%')
Can I use "OR" inside the THEN clause, given that CASE WHEN ... THEN returns only a single value?

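OR is a Boolean operator, so it cannot produce a list of alternative strings inside THEN; a CASE expression returns one value per row. I would handle company 'AMERICAN' separately with ordinary OR conditions covering your three spellings, and let all other companies through unchanged: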
with my_table as (
select 'AMERICAN' as COMPANY, '321-AMERICAN EAGLE 123' as TRANSACTIONS
union all select 'NIKE', '080* NIKE_74093'
union all select 'AMERICAN', '00 AMERICANEAGLE_42'
union all select 'ADIDAS', '0101ADIDAS **093'
union all select 'AMERICAN', '987 AMERICAN AIRLINE_4'
union all select 'AMERICAN', '17 AMERICAN-EXPRESS 02'
union all select 'AMERICAN', '09 AMERICAN-EAGLE_42'
union all select 'AMERICAN', '0* AMERICANAIRLINE **7'
union all select 'AMERICAN', '101AMERICAN EXPRESS *9'
union all select 'COCA', '98*COCA COLA __4237'
)
select *
from my_table
where ( ( company = 'AMERICAN' and (transactions like '%AMERICAN EAGLE%'
or transactions like '%AMERICANEAGLE%'
or transactions like '%AMERICAN-EAGLE%')
)
OR company <> 'AMERICAN'
);
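The same filter can be tried locally without a Snowflake account. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in engine (an assumption, not Snowflake), and the function name keep_american_eagle is made up for illustration. The WHERE clause is the one from the answer above.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("AMERICAN", "321-AMERICAN EAGLE 123"),
    ("NIKE", "080* NIKE_74093"),
    ("AMERICAN", "00 AMERICANEAGLE_42"),
    ("ADIDAS", "0101ADIDAS **093"),
    ("AMERICAN", "987 AMERICAN AIRLINE_4"),
    ("AMERICAN", "17 AMERICAN-EXPRESS 02"),
    ("AMERICAN", "09 AMERICAN-EAGLE_42"),
    ("AMERICAN", "0* AMERICANAIRLINE **7"),
    ("AMERICAN", "101AMERICAN EXPRESS *9"),
    ("COCA", "98*COCA COLA __4237"),
]


def keep_american_eagle(rows):
    """Keep every non-AMERICAN row, plus AMERICAN rows that mention American Eagle."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for Snowflake here
    conn.execute("CREATE TABLE my_table (company TEXT, transactions TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT company, transactions
        FROM my_table
        WHERE (company = 'AMERICAN'
               AND (transactions LIKE '%AMERICAN EAGLE%'
                    OR transactions LIKE '%AMERICANEAGLE%'
                    OR transactions LIKE '%AMERICAN-EAGLE%'))
           OR company <> 'AMERICAN'
        ORDER BY rowid
        """
    ).fetchall()
    conn.close()
    return result


kept = keep_american_eagle(ROWS)
for company, transactions in kept:
    print(company, "|", transactions)
```

Each OR sits between complete LIKE predicates, so every branch is a Boolean condition and not a bare string.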

Related

Power Bi : Filter a SQL Server table which contains a string

I have 2 tables in SQL Server:
Table Country, which contains the name of the country:
Table Society, which contains the name of the society and the names of the countries where the society worked:
In Power BI, I have to create a Country filter (US, Germany, France, UK, ...) that filters the Society table:
For example, if I put "US" in the Country filter, my matrix will show Society A and Society B.
If I put "France" in the Country filter, I will get Society B and Society C.
(My first idea was to add binary fields "IsInThisCountry" in SQL Server and then use these fields as filters.)
Something like this:
CASE WHEN country LIKE '%US%' THEN 1 ELSE 0 END 'IsUS'
But the issue is that if I have 50 countries, I will have to create 50 filters.
If you have SQL Server with compatibility level 130 or higher (which provides string_split), you can try something like this in your data model to split the delimited countries in your societies table:
;with countries as (
    select 'germany' as country
    union all
    select 'sweden'
),
societies as (
    select 'A' as society, 'germany-sweden' as countries
    union all
    select 'B', 'sweden'
),
societyByCountry as (
    -- one row per (society, country) pair
    select sc.society, s.value as Country
    from societies sc
    cross apply string_split(sc.countries, '-') s
)
select c.country, s.society
from countries c
inner join societyByCountry s on s.Country = c.country
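The split-then-join idea can be sketched outside SQL Server as well. The Python below is an illustration only, using the same two sample societies as the query above; the function name societies_by_country and the delimiter parameter are hypothetical.

```python
from collections import defaultdict

# Same sample data as the SQL above.
countries = ["germany", "sweden"]
societies = {"A": "germany-sweden", "B": "sweden"}


def societies_by_country(countries, societies, delimiter="-"):
    """Split each delimited country list and group societies under each known country."""
    known = set(countries)
    result = defaultdict(list)
    for society, joined in societies.items():
        for country in joined.split(delimiter):  # plays the role of string_split
            if country in known:                 # plays the role of the inner join
                result[country].append(society)
    return dict(result)


mapping = societies_by_country(countries, societies)
print(mapping)
```

With one row per (society, country) pair, a single Country filter can drive the Society table; no per-country flag column is needed.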

how to create a table join on elements in an Array in Google BigQuery

I have some data, contact_IDs, that are in an Array of Structs inside a table called deals as in the example below.
WITH deals AS (
Select "012345" as deal_ID,
[STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
AS associations)
SELECT
deal_ID,
ARRAY(
SELECT AS STRUCT
( SELECT STRING_AGG(CAST(id AS STRING), ', ')
FROM t.contact_ID id
) AS contact_ID
FROM d.associations t
) AS contacts
FROM deals d
The query above takes the contact_IDs in the associations array and lists them separated by commas.
Row deal_ID contacts.contact_ID
1 012345 123, 678, 810
But my problem now is that I need to replace the contact_IDs with first and last names from another table called contacts, which looks like the following, where contact_ID is INT64 and the name fields are Strings.
contact_id first_name last_name
123 Jane Doe
678 John Smith
810 Alice Acre
I've attempted doing it with a subquery like this:
WITH deals AS (
Select "012345" as deal_ID,
[STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
AS associations)
SELECT
deal_ID,
ARRAY(
SELECT AS STRUCT
company_ID,
( SELECT STRING_AGG(
(select concat(c.first_name, " ", c.last_name)
from contacts c
where c.contact_id=id), ', ')
FROM t.contact_ID id
) AS contact_name
FROM d.associations t
) AS contacts
FROM deals d
But this gives an error "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN." But I can't figure out how to make a join between deals.associations.contact_ID and contacts.contact_id when the thing I need to be joining on is inside the deals.associations array...
Thanks in advance for any guidance.
Below is for BigQuery Standard SQL
#standardSQL
SELECT deal_ID,
ARRAY_AGG(STRUCT(company_ID, contact_name)) AS contacts
FROM (
SELECT
deal_ID,
ANY_VALUE(company_ID) AS company_ID,
STRING_AGG(FORMAT('%s %s', IFNULL(first_name, ''), IFNULL(last_name, '')), ', ') AS contact_name
FROM deals d,
d.associations AS contact,
contact.contact_ID AS contact_ID
LEFT JOIN contacts c
USING(contact_ID)
GROUP BY deal_ID, FORMAT('%t', company_ID)
)
GROUP BY deal_ID
When applied to the sample data from your question, the output is:
Row deal_ID contacts.company_ID contacts.contact_name
1 012345 abc Jane Doe, John Smith, Alice Acre
Note that the following
FROM deals d,
d.associations AS contact,
contact.contact_ID AS contact_ID
is a shortcut for
FROM deals,
UNNEST(associations) AS contact,
UNNEST(contact_ID) AS contact_ID
When possible, I prefer not to write explicit UNNEST() in the query text.
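To make the unnest, join, and aggregate steps concrete, here is a simplified Python sketch over the question's sample data. It is not BigQuery code: the function name contact_names is made up, the contacts table is modeled as a dict, and it resolves names per association without grouping by company_ID as the SQL does.

```python
# Sample data from the question.
deals = [
    {
        "deal_ID": "012345",
        "associations": [
            {"company_ID": ["abc"], "contact_ID": [123, 678, 810]},
        ],
    },
]
contacts = {
    123: ("Jane", "Doe"),
    678: ("John", "Smith"),
    810: ("Alice", "Acre"),
}


def contact_names(deals, contacts):
    """Replace each contact_ID with 'first last', joined by ', ' per association."""
    output = []
    for deal in deals:
        resolved = []
        for association in deal["associations"]:          # UNNEST(associations)
            names = []
            for contact_id in association["contact_ID"]:  # UNNEST(contact_ID)
                # LEFT JOIN: unknown ids fall back to empty strings, like IFNULL(..., '')
                first, last = contacts.get(contact_id, ("", ""))
                names.append(f"{first} {last}")
            resolved.append(
                {
                    "company_ID": association["company_ID"],
                    "contact_name": ", ".join(names),     # STRING_AGG(..., ', ')
                }
            )
        output.append({"deal_ID": deal["deal_ID"], "contacts": resolved})
    return output


result = contact_names(deals, contacts)
print(result[0]["contacts"][0]["contact_name"])
```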

Perform union and count from two tables

I have two tables:
Table A 'Real Orders':
Contract,
Order ID,
YYYYMM
Table B 'Uploads':
Contract,
Order ID,
YYYYMM
I want to make a query with the following structure:
Contract, YYYYMM, Nb of Real Orders, Nb of Uploads
How can I accomplish that, knowing that some contracts from Table A don't appear in Table B and vice versa? I am using SQL Server 2012.
I tried splitting the code into two subqueries; let me know if this works:
select coalesce(a.contract, b.contract) as contract,
       coalesce(a.YYYYMM, b.YYYYMM) as YYYYMM,
       coalesce(a.no_real_orders, 0) as no_real_orders,
       coalesce(b.no_uploads, 0) as no_uploads
from
(
    select contract, YYYYMM, count(order_id) as no_real_orders
    from Table_a
    group by contract, YYYYMM
) as a
full join
(
    select contract, YYYYMM, count(order_id) as no_uploads
    from Table_b
    group by contract, YYYYMM
) as b
    on a.contract = b.contract
   and a.YYYYMM = b.YYYYMM
Let me know if this works
select coalesce( a.contract,b.contract) as contract,
coalesce(a.YYYYMM,b.YYYYMM) as YYYYMM,
count (a.order_id) as no_real_orders,
count (b.order_id) as no_uploads
from Table_a as a
full join Table_b as b
on a.contract = b.contract
and a.YYYYMM = b.YYYYMM
group by coalesce(a.contract,b.contract), coalesce(a.YYYYMM,b.YYYYMM)
Unfortunately, I get the same count for both columns. Here is an excerpt from the output:
Contract YYYYMM Nb Real Orders Nb Uploads
Contract_x 201701 17 17
Contract_x 201612 72 72
Contract_y 201702 196 196
Contract_y 201612 345 345
Contract_y 201701 264 264
The code is:
select coalesce(a.Contract_Code, b.Contract_Code) as Contract,
       coalesce(a.OIA_Creation_Date_YYYYMM, b.Ordear_Creation_YYYYMM) as YearMonth,
       count(a.OIA_Order_Number) as Nb_Real_Orders,
       count(b.Order_Number) as Nb_Uploads
from Raw_Data_A as a
full join Raw_Data_B as b
    on a.Contract_Code = b.Contract_Code
   and a.OIA_Creation_Date_YYYYMM = b.Ordear_Creation_YYYYMM
group by coalesce(a.Contract_Code, b.Contract_Code),
         coalesce(a.OIA_Creation_Date_YYYYMM, b.Ordear_Creation_YYYYMM)
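Identical counts are what a join on raw rows produces: for a contract and month with m real orders and n uploads, the join yields m * n rows, and both COUNTs then see m * n non-null values. The Python sketch below uses made-up sample rows (not the asker's data) and hypothetical function names to show the alternative of counting each table separately and only then combining the keys.

```python
from collections import Counter

# Hypothetical sample rows: (contract, yyyymm, order_id).
real_orders = [
    ("Contract_x", "201701", 1),
    ("Contract_x", "201701", 2),
    ("Contract_y", "201612", 3),
]
uploads = [
    ("Contract_x", "201701", 10),
    ("Contract_z", "201702", 11),
]


def count_per_key(rows):
    """Count rows per (contract, yyyymm), like GROUP BY contract, YYYYMM."""
    return Counter((contract, yyyymm) for contract, yyyymm, _ in rows)


def combine_counts(real_orders, uploads):
    """Aggregate each table first, then merge on the union of keys (a full join)."""
    real = count_per_key(real_orders)
    up = count_per_key(uploads)
    keys = sorted(set(real) | set(up))
    # Keys missing on one side get 0, like COALESCE(count, 0).
    return [(c, m, real.get((c, m), 0), up.get((c, m), 0)) for c, m in keys]


summary = combine_counts(real_orders, uploads)
for row in summary:
    print(row)
```

Because each side is reduced to one row per key before the merge, no row multiplication can occur.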

how to match the values of two fields from two different tables to return another value associated with either one

I am having trouble in SQL Server retrieving a field of values by matching up two fields from different tables.
Here is my description:
Table A contains
ProductID ProductName
01 Health insurance1
02 Health insurance2
03 Health insurance3
o4 Car Insurance1
o5 Car Insurance2
06 Property Insurance1
07 Property Insurance2
Table B only contains
ProductName
Health Insurance1 Yr 10- 11
TTK Health Insurance Yr 2
Health Insurance3 Yr 5-6
Car Insurance1 Yr 3
Car Insurance Yr 4
Car Insurance3 Yr 4-5
Property Insurance Yr 1
Property Insurance3 Yr 5
I want the query to return the ProductID from Table A aligned with the matching ProductName in Table B. Notice that the values in the two ProductName fields are not exactly the same but look very similar.
Following is the script I tried, using the LIKE operator, but it returned redundant ProductIDs; it seems that the LIKE operator does not process anything after 'insurance'.
select distinct
a.productID, b.productname
from
tableA a,
tableB b
where
b.productname like '%' + a.productname+ '%'
or a.productname like '%' + b.productname+ '%'
order by a.productID
Please help me solve this problem. Thank you in advance!!
Try this code
SELECT DISTINCT
    a.productID, b.productname
FROM tableA a
INNER JOIN tableB b
    ON b.productname LIKE '%' + a.productname + '%'
ORDER BY a.productID
Hope this helps you!
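To see which pairs a one-directional substring match produces on the question's sample data, here is a small Python sketch. It assumes a case-insensitive comparison, as a case-insensitive SQL Server collation would give; the function name match_products is hypothetical.

```python
# Sample data from the question, with product IDs kept exactly as posted.
table_a = [
    ("01", "Health insurance1"),
    ("02", "Health insurance2"),
    ("03", "Health insurance3"),
    ("o4", "Car Insurance1"),
    ("o5", "Car Insurance2"),
    ("06", "Property Insurance1"),
    ("07", "Property Insurance2"),
]
table_b = [
    "Health Insurance1 Yr 10- 11",
    "TTK Health Insurance Yr 2",
    "Health Insurance3 Yr 5-6",
    "Car Insurance1 Yr 3",
    "Car Insurance Yr 4",
    "Car Insurance3 Yr 4-5",
    "Property Insurance Yr 1",
    "Property Insurance3 Yr 5",
]


def match_products(table_a, table_b):
    """Pair each Table A id with Table B names that contain the Table A name."""
    matches = []
    for product_id, short_name in table_a:
        for long_name in table_b:
            # Equivalent of: b.productname LIKE '%' + a.productname + '%'
            if short_name.lower() in long_name.lower():
                matches.append((product_id, long_name))
    return matches


pairs = match_products(table_a, table_b)
for pair in pairs:
    print(pair)
```

Only Table B names that contain a full Table A name are paired, so rows such as "Car Insurance Yr 4" stay unmatched under this rule.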

Query ignores filtering in WHERE-clause

I get unexpected results from a query against Microsoft SQL Server 2008 (10.0.1600.22 / Service Pack 2).
Could it be a bug?
I have tried to create a similar query to replicate the problem - without success. So I guess there's nothing wrong with the query itself, but something else is causing this somewhat strange behaviour.
I hope for some suggestions on what might be causing the problem.
First, take a look at this working example:
DROP TABLE #TempType
DROP TABLE #TempData
SELECT * INTO #TempType FROM (
SELECT 1 AS Id, 10 AS TempDataFK, 'Do' AS Name, 'CODE1' AS Type UNION
SELECT 2 AS Id, 10 AS TempDataFK, 'Re' AS Name, 'CODE2' AS Type UNION
SELECT 3 AS Id, 20 AS TempDataFK, 'Mi' AS Name, 'CODE2' AS Type UNION
SELECT 5 AS Id, 10 AS TempDataFK, 'Fa' AS Name, 'CODE3' AS Type UNION
SELECT 6 AS Id, 20 AS TempDataFK, 'So' AS Name, 'CODE4' AS Type
) sub
SELECT * INTO #TempData FROM (
SELECT 10 AS Id, 150 AS Number UNION
SELECT 20 AS Id, 150 AS Number UNION
SELECT 30 AS Id, 150 AS Number UNION
SELECT 40 AS Id, 180 AS Number
) sub
SELECT C1.Name Name1,
C2.Name Name2,
C3.Name Name3,
C4.Name Name4,
#TempData.Id,
#TempData.Number
FROM #TempData
LEFT JOIN #TempType C1 (NOLOCK)
ON #TempData.Id = C1.TempDataFK
AND C1.Type = 'CODE1'
LEFT JOIN #TempType C2 (NOLOCK)
ON #TempData.Id = C2.TempDataFK
AND C2.Type = 'CODE2'
LEFT JOIN #TempType C3 (NOLOCK)
ON #TempData.Id = C3.TempDataFK
AND C3.Type = 'CODE3'
LEFT JOIN #TempType C4 (NOLOCK)
ON #TempData.Id = C4.TempDataFK
AND C4.Type = 'CODE4'
WHERE 1=1
AND (#TempData.Number = 150)
AND (C1.Name='Mi' OR C2.Name='Mi' OR C3.Name='Mi' OR C4.Name='Mi')
As you can see, I'm LEFT JOINing the table TempType to TempData four times, each time giving it a different alias. I then filter by a certain number (150) and require that at least one of the Names is 'Mi'.
The result is as expected:
--------------------------------------------
Name1 Name2 Name3 Name4 Id Number
NULL Mi NULL So 20 150
--------------------------------------------
Nevertheless, if I run a similar query against my client's database, I get:
--------------------------------------------
Name1 Name2 Name3 Name4 Id Number
Do Re Fa NULL 10 150
NULL Mi NULL So 20 150
NULL NULL NULL NULL 30 150
--------------------------------------------
It is as if the part similar to (C1.Name='Mi' OR C2.Name='Mi' OR C3.Name='Mi' OR C4.Name='Mi') is not used in the filtering.
Different query, but same structure
On the client's database the tables are not temporary and they already contain data. The tables also have various other fields, but I'm still only left joining with one other table. So the structure is the same as above.
Couldn't replicate in new database
I tried to replicate the problem by creating a new database with only the tables and fields affected, but had no success in replicating it.
Schemas
It is also worth mentioning that the client's database contains several schemas. Both tables similar to TempData and TempType belong to the same schema (not dbo).
Also, the field similar to TempType.Type is a foreign key to another table in another schema, but as we're not joining with that table, I don't see how it would be relevant.
Another peculiarity
If I put C1.Name etc, as part of my SELECT, like this:
SELECT C1.Name Name1,
C2.Name Name2,
C3.Name Name3,
C4.Name Name4,
#TempData.Id,
#TempData.Number,
CASE WHEN (C1.Name = 'Mi' AND C1.Name IS NOT NULL) THEN 1 ELSE 0 END,
CASE WHEN (C2.Name = 'Mi' AND C2.Name IS NOT NULL) THEN 1 ELSE 0 END,
CASE WHEN (C3.Name = 'Mi' AND C3.Name IS NOT NULL) THEN 1 ELSE 0 END,
CASE WHEN (C4.Name = 'Mi' AND C4.Name IS NOT NULL) THEN 1 ELSE 0 END
I get the expected result against my client's database (only one row):
Name1 Name2 Name3 Name4 Id Number (No column name) (No column name) (No column name) (No column name)
NULL Mi NULL So 20 150 0 1 0 0
Any suggestions what could be causing this?
Please make sure the ANSI_NULLS setting is the same in your local database and in your client's. A predicate written as "Name IS NOT NULL" returns the desired result either way, whereas comparisons written as "Name = NULL" or "Name <> NULL" depend on the ANSI_NULLS setting.
See the ANSI_NULLS documentation: http://msdn.microsoft.com/en-us/library/ms188048(v=sql.105).aspx
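For reference, the snippet below shows how NULL comparisons evaluate under ANSI rules. It is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in, which always follows the ANSI behavior, so it corresponds to ANSI_NULLS ON and cannot reproduce the OFF setting. The function name null_comparisons and the dictionary labels are made up.

```python
import sqlite3


def null_comparisons():
    """Evaluate a few NULL comparisons under ANSI (three-valued) logic."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
    row = conn.execute(
        """
        SELECT NULL = 'Mi',                       -- unknown
               NULL <> 'Mi',                      -- unknown
               NULL IS NOT NULL,                  -- false
               'Mi' IS NOT NULL,                  -- true
               (NULL = 'Mi') OR (NULL = 'Mi'),    -- unknown OR unknown
               (NULL = 'Mi') OR ('Mi' = 'Mi')     -- unknown OR true
        """
    ).fetchone()
    conn.close()
    labels = [
        "null_eq", "null_ne", "null_is_not_null",
        "value_is_not_null", "unknown_or_unknown", "unknown_or_true",
    ]
    return dict(zip(labels, row))


results = null_comparisons()
for label, value in results.items():
    print(label, "->", value)  # None means unknown; 1 is true; 0 is false
```

A WHERE clause keeps a row only when its condition is true, so rows whose OR chain evaluates to unknown are filtered out under these rules.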
