Using WHERE with Group By and Having - sql-server

Let's say I have a table with 3 columns (a, b, c) with the following values:
+---+------+---+
| a | b | c |
+---+------+---+
| 1 | 5 | 1 |
| 1 | NULL | 1 |
| 2 | NULL | 0 |
| 2 | NULL | 0 |
| 3 | NULL | 5 |
| 3 | NULL | 5 |
+---+------+---+
My desired output: 3
I want to select only those distinct values from column a for which every single occurrence of the value has NULL in column b, provided that the value in c is not 0. So in my desired output, "1" is excluded because one of its rows has a "5" in column b, even though its 2nd row has NULL. And "2" is excluded because its value of c is 0.
The query that I'm currently using, which is not working:
SELECT a FROM tab WHERE c!=0 GROUP BY a HAVING COUNT(b) = 0

You can do this using HAVING clause:
SELECT a
FROM tbl
GROUP BY a
HAVING
SUM(CASE
WHEN b IS NOT NULL OR c = 0 THEN 1
ELSE 0 END
) = 0
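To see why this HAVING clause picks only "3", it can help to look at the per-group count it compares against 0. The following is a minimal sketch, assuming the sample data from the question is loaded into a table variable (the name @tbl and the column alias violations are mine, not from the original answer):

```sql
-- Sample data from the question, loaded into a hypothetical table variable.
DECLARE @tbl TABLE (a INT, b INT, c INT);
INSERT INTO @tbl (a, b, c)
SELECT 1, 5,    1 UNION ALL
SELECT 1, NULL, 1 UNION ALL
SELECT 2, NULL, 0 UNION ALL
SELECT 2, NULL, 0 UNION ALL
SELECT 3, NULL, 5 UNION ALL
SELECT 3, NULL, 5;

-- Count, per value of a, the rows that break the rule
-- (b is not NULL, or c is 0). Only groups with a count of 0 qualify.
SELECT a
      ,SUM(CASE WHEN b IS NOT NULL OR c = 0 THEN 1 ELSE 0 END) AS violations
FROM @tbl
GROUP BY a
ORDER BY a;
-- a = 1 -> 1 violation, a = 2 -> 2 violations, a = 3 -> 0 violations
```

Moving the SUM into the HAVING clause and comparing it with 0, as in the answer, keeps only a = 3.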

I think this is the having clause that you want:
select a
from table t
group by a
having count(case when c <> 0 then b end) = 0 and
max(c) > 0
This assumes that c is non-negative.
However, it is not entirely clear why "2" doesn't meet your condition. There are no rows where "c" is not zero. Hence, all such rows have NULL values.

DECLARE @Table TABLE (
A INT
,B INT
,C INT
)
INSERT INTO @Table SELECT 1,5,1
INSERT INTO @Table SELECT 1,NULL,1
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 3,NULL,5
INSERT INTO @Table SELECT 3,NULL,5
SELECT
a,max(b) [MaxB],max(C) [MaxC]
FROM @Table
GROUP BY A
HAVING max(b) IS NULL AND ISNULL(max(C),1)<>0

Although you've got 3 answers already, I decided to contribute my 2c...
Ghost's query comes out as the most efficient when I check it in SQL Server Query Analyzer. However, I suspect that if your data set changes, Ghost's query may not return exactly what you require, based on what you've written.
I think the query below is what you're looking for at the lowest execution cost, going by your written requirements rather than the sample data you've provided. (Note: this query's performance is similar to that of Felix's and Gordon's answers; however, I haven't included a conditional CASE expression in my HAVING clause.)
SELECT DISTINCT(a) FROM intTable
GROUP BY a
HAVING SUM(ISNULL(b,0))=0 AND SUM(c)<>0
Hope this helps!
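One thing worth checking before relying on a SUM-based test: ISNULL(b,0) makes a real 0 in column b indistinguishable from NULL. The sketch below uses made-up rows (group 4 is hypothetical and not part of the question's data) to compare the SUM-based check with a COUNT-based one, since COUNT(b) counts only non-NULL values:

```sql
-- Hypothetical data: group 4 contains a non-NULL b that happens to be 0.
DECLARE @t TABLE (a INT, b INT, c INT);
INSERT INTO @t (a, b, c)
SELECT 3, NULL, 5 UNION ALL
SELECT 3, NULL, 5 UNION ALL
SELECT 4, 0,    1 UNION ALL
SELECT 4, NULL, 1;

-- SUM-based check: the 0 in group 4 is treated like a NULL.
SELECT a
FROM @t
GROUP BY a
HAVING SUM(ISNULL(b, 0)) = 0 AND SUM(c) <> 0;   -- returns 3 and 4

-- COUNT-based check: any non-NULL b disqualifies the group.
SELECT a
FROM @t
GROUP BY a
HAVING COUNT(b) = 0 AND SUM(c) <> 0;            -- returns 3
```

If b can never be 0 (and its values can never cancel out to 0), the two checks agree.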

Related

How can I do a custom order in Snowflake?

In Snowflake, how do I define a custom sorting order?
ID Language Text
0 ENU a
0 JPN b
0 DAN c
1 ENU d
1 JPN e
1 DAN f
2 etc...
Here I want to return all rows sorted by Language in this order: Language = ENU comes first, then JPN, and lastly DAN.
Is this even possible?
I would like to order by language so that the languages cycle in this order: ENU, JPN, DAN, ENU, JPN, DAN, ENU, JPN, DAN
NOT: ENU,ENU,ENU,JPN,JPN,JPN,DAN,DAN,DAN
I liked array_position solution of Phil Coulson. It's also possible to use DECODE:
create or replace table mydata ( ID number, Language varchar, Text varchar )
as select * from values
(0, 'JPN' , 'b'),
(0, 'DAN' , 'c' ),
(0, 'ENU' , 'a'),
(1 , 'JPN' , 'e'),
(1 , 'ENU' , 'd'),
(1 , 'DAN' , 'f');
select * from
mydata order by ID, DECODE(Language,'ENU',0,'JPN',1,'DAN',2 );
+----+----------+------+
| ID | LANGUAGE | TEXT |
+----+----------+------+
| 0 | ENU | a |
| 0 | JPN | b |
| 0 | DAN | c |
| 1 | ENU | d |
| 1 | JPN | e |
| 1 | DAN | f |
+----+----------+------+
You basically need 2 levels of sort. I am using an array to arrange the languages in the order I want, and then array_position to assign every language an index by which the rows will be sorted. You can achieve the same using either a case expression or decode. To make sure the languages don't repeat within the same id, we use row_number. You can comment out the row_number() line if that's not a requirement.
with cte (id, lang) as
(select 0,'JPN' union all
select 0,'ENU' union all
select 0,'DAN' union all
select 0,'ENU' union all
select 0,'JPN' union all
select 0,'DAN' union all
select 1,'JPN' union all
select 1,'ENU' union all
select 1,'DAN' union all
select 1,'ENU' union all
select 1,'JPN' union all
select 1,'DAN')
select *
from cte
order by id,
row_number() over (partition by id, array_position(lang::variant,['ENU','JPN','DAN']) order by lang), --in case you want languages to not repeat within each id
array_position(lang::variant,['ENU','JPN','DAN'])
This is not to say other answers are wrong.
But here's yet another one, using the ANSI SQL 'CASE' expression:
SELECT * FROM "Example"
ORDER BY
CASE "Language" WHEN 'ENU' THEN 1
WHEN 'JPN' THEN 2
WHEN 'DAN' THEN 3
ELSE 4 END
,"Language";
Notice the "Language" code is used as a disambiguation for 'other' languages not specified.
It's good defensive programming when dealing with CASE to deal with ELSE.
The ultimate most flexible answer is to have a table with a collation order for languages in it.
Collation order columns are common in many applications.
I've seen them used for everything from the multiple parties to a contract, who should appear in a specified order, to (of course) the positional order of columns in a table of metadata.
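A lookup table along those lines could be sketched as follows. This is only an illustration: the table language_sort and its column sort_order are hypothetical names, and mydata is the sample table created in the DECODE answer above.

```sql
-- Hypothetical lookup table holding the desired position of each language.
create or replace table language_sort ( Language varchar, sort_order number )
as select * from values
('ENU', 1),
('JPN', 2),
('DAN', 3);

-- Join the data to the lookup table and sort by the stored position.
select d.*
from mydata d
left join language_sort s
  on s.Language = d.Language
order by d.ID,
         coalesce(s.sort_order, 999), -- languages missing from the lookup sort last
         d.Language;                  -- tie-breaker among unlisted languages
```

Changing the order then becomes a data change (an update to language_sort) rather than a change to every query.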

Checking Palindrome

We have a string variable that holds a string like the one below:
Temp Table Temp | Temp1 Table1 Temp1 | Temp2 Table2 Temp2 | ABD EFG EFG
Now we need to check how many palindromes exist in this particular string.
Can you help me work out how to fetch the number of palindromes?
Note: a pipe "|" follows every complete substring.
Answer should be: 3
In the query I have written, I used the Reverse() / Replace() functions, but I can't work out how to split the string at every pipe symbol.
So please help me with that; I am a beginner in SQL Server.
It seems you are confusing your requirement with searching for palindromes, so I have put together a solution to your question, as well as a few methods in case anyone else comes across this question looking for an answer relating to actual palindromes:
Answer to your question as it is here
To do this, you can split your string on the delimiter and then split the result again on the spaces (I have included the function I've used here at the end). With this ordered list of words, you can compare the words in order to the words in reverse order to see if they are the same:
declare @s nvarchar(100) = 'Temp Table Temp | Temp1 Table1 Temp1 | Temp2 Table2 Temp2 | ABD EFG EFG';
with w as
(
select s.item as s
,ss.rn
,row_number() over (partition by s.item order by ss.rn desc) as rrn
,ss.item as w
from dbo.fn_StringSplit4k(@s,'|',null) as s
cross apply dbo.fn_StringSplit4k(ltrim(rtrim(s.item)),' ',null) as ss
)
select w.s
,case when sum(case when w.w = wr.w then 1 else 0 end) = max(w.rn) then 1 else 0 end as p
from w
join w as wr
on w.s = wr.s
and w.rn = wr.rrn
group by w.s
order by w.s
Which outputs:
+----------------------+---+
| s | p |
+----------------------+---+
| ABD EFG EFG | 0 |
| Temp1 Table1 Temp1 | 1 |
| Temp2 Table2 Temp2 | 1 |
| Temp Table Temp | 1 |
+----------------------+---+
Solution for actual palindromes
Firstly, to check whether a string value is a proper palindrome (i.e. spelled the same forwards and backwards), you only need a trivial comparison of the original string with its reversed value, which in the example below correctly outputs 1:
declare @p nvarchar(100) = 'Temp Table elbaT pmeT';
select case when @p = reverse(@p)
then 1
else 0
end as p
To do this across a set of delimited values within the same string, you should firstly feel bad for storing your data in a delimited string within your database and contemplate why you are doing this. Seriously, it's incredibly bad design and you should fix it as soon as possible. Once you have done that you can apply the above technique.
If that is genuinely unavoidable, however, you can split your string using one of many set-based table-valued functions and then apply the above operation to the output:
declare @ps nvarchar(100) = 'Temp Table elbaT pmeT | Temp1 Table1 1elbaT 1pmeT | Temp2 Table2 Temp2 | ABD EFG EFG';
select ltrim(rtrim(s.item)) as s
,case when ltrim(rtrim(s.item)) = reverse(ltrim(rtrim(s.item))) then 1 else 0 end as p
from dbo.fn_StringSplit4k(@ps,'|',null) as s
Which outputs:
+---------------------------+---+
| s | p |
+---------------------------+---+
| Temp Table elbaT pmeT | 1 |
| Temp1 Table1 1elbaT 1pmeT | 1 |
| Temp2 Table2 Temp2 | 0 |
| ABD EFG EFG | 0 |
+---------------------------+---+
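If all you need is the number of palindromes rather than a row per value, the same comparison can be wrapped in a COUNT. The sketch below is an assumption on my part rather than part of the original answer: it swaps the custom split function for the built-in STRING_SPLIT, which is only available on SQL Server 2016 or later (database compatibility level 130 and up).

```sql
DECLARE @ps nvarchar(100) = N'Temp Table elbaT pmeT | Temp1 Table1 1elbaT 1pmeT | Temp2 Table2 Temp2 | ABD EFG EFG';

-- STRING_SPLIT returns one row per delimited value in a column named "value".
-- Trim each value, then keep only those that equal their own reverse.
SELECT COUNT(*) AS PalindromeCount
FROM STRING_SPLIT(@ps, N'|') AS s
WHERE LTRIM(RTRIM(s.value)) = REVERSE(LTRIM(RTRIM(s.value)));
-- PalindromeCount = 2 for this input
```

On older versions, the same WHERE clause works against dbo.fn_StringSplit4k by using its item column instead of value.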
String split function
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value to return, null returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;

Reverse order of a XML Column in SQL Server

In a SQL Server table, I have an XML column that records statuses as they happen (the first is the oldest, the last is the current status).
I have to write a stored procedure that returns the statuses: newest first, oldest last.
This is what I wrote:
ALTER PROCEDURE [dbo].[GetDeliveryStatus]
@invoiceID nvarchar(255)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @xml xml
SET @xml = (SELECT statusXML
FROM Purchase
WHERE invoiceID = @invoiceID )
SELECT
t.n.value('text()[1]', 'nvarchar(50)') as DeliveryStatus
FROM
@xml.nodes('/statuses/status') as t(n)
ORDER BY
DeliveryStatus DESC
END
Example of value in the statusXML column:
<statuses>
<status>A</status>
<status>B</status>
<status>A</status>
<status>B</status>
<status>C</status>
</statuses>
I want the procedure to return:
C
B
A
B
A
With ORDER BY ... DESC it returns the statuses in reverse ALPHABETICAL order (C B B A A).
How should I correct my procedure?
Create a sequence for the nodes based on the existing order then reverse it.
WITH [x] AS (
SELECT
t.n.value('text()[1]', 'nvarchar(50)') as DeliveryStatus
,ROW_NUMBER() OVER (ORDER BY t.n.value('..', 'NVARCHAR(100)')) AS [Order]
FROM
@xml.nodes('/statuses/status') as t(n)
)
SELECT
DeliveryStatus
FROM [x]
ORDER BY [x].[Order] DESC
... results ...
DeliveryStatus
C
B
A
B
A
There is no need to declare a variable first. You can (and you should!) read the needed values from your table column directly. Best would be an inline table-valued function (rather than an SP just to read something...). Its advantages:
Better performance
Inlineable
You can query many InvoiceIDs at once
Set-based
Try this (I drop the mock table at the end - careful with real data!):
CREATE TABLE Purchase(ID INT IDENTITY,statusXML XML, InvocieID INT, OtherValues VARCHAR(100));
INSERT INTO Purchase VALUES('<statuses>
<status>A</status>
<status>B</status>
<status>A</status>
<status>B</status>
<status>C</status>
</statuses>',100,'Other values of your row');
GO
WITH NumberedStatus AS
(
SELECT ID
,InvocieID
, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Nr
,stat.value('.','nvarchar(max)') AS [Status]
,OtherValues
FROM Purchase
CROSS APPLY statusXML.nodes('/statuses/status') AS A(stat)
WHERE InvocieID=100
)
SELECT *
FROM NumberedStatus
ORDER BY Nr DESC
GO
--Clean-Up
--DROP TABLE Purchase;
The result
+----+-----------+----+--------+--------------------------+
| ID | InvocieID | Nr | Status | OtherValues |
+----+-----------+----+--------+--------------------------+
| 1 | 100 | 5 | C | Other values of your row |
| 1 | 100 | 4 | B | Other values of your row |
| 1 | 100 | 3 | A | Other values of your row |
| 1 | 100 | 2 | B | Other values of your row |
| 1 | 100 | 1 | A | Other values of your row |
+----+-----------+----+--------+--------------------------+
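The inline table-valued function recommended above could look roughly like this. It is a sketch under assumptions: the function name dbo.GetDeliveryStatuses and its parameter are hypothetical, it runs against the mock Purchase table from this answer (including that table's InvocieID spelling), and, like the query above, it relies on ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) following the document order of the nodes, which works in practice but is not a documented guarantee.

```sql
-- Hypothetical inline table-valued function; name and parameter are illustrative.
CREATE FUNCTION dbo.GetDeliveryStatuses (@InvocieID INT)
RETURNS TABLE
AS
RETURN
    SELECT p.ID
          ,p.InvocieID
          -- Number the status nodes per Purchase row.
          ,ROW_NUMBER() OVER (PARTITION BY p.ID ORDER BY (SELECT NULL)) AS Nr
          ,A.stat.value('text()[1]', 'nvarchar(50)') AS [Status]
    FROM Purchase AS p
    CROSS APPLY p.statusXML.nodes('/statuses/status') AS A(stat)
    WHERE p.InvocieID = @InvocieID;
GO

-- Newest status first for one invoice.
SELECT [Status]
FROM dbo.GetDeliveryStatuses(100)
ORDER BY Nr DESC;
```

Because the function is inline, it can also be used with CROSS APPLY against a list of invoice IDs instead of being called once per invoice.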

SQL Server - Transpose rows into columns

I've searched high and low for an answer to this so apologies if it's already answered!
I have the following result from a query in SQL 2005:
ID
1234
1235
1236
1267
1278
What I want is
column1|column2|column3|column4|column5
---------------------------------------
1234 |1235 |1236 |1267 |1278
I can't quite get my head around the pivot operator but this looks like it's going to be involved. I can work with there being only 5 rows for now but a bonus would be for it to be dynamic, i.e. can scale to x rows.
EDIT:
What I'm ultimately after is assigning the values of each resulting column to variables, e.g.
DECLARE @id1 int, @id2 int, @id3 int, @id4 int, @id5 int
SELECT @id1 = column1, @id2 = column2, @id3 = column3, @id4 = column4,
@id5 = column5 FROM [transposed_table]
You also need a value field in your query for each id to aggregate on. Then you can do something like this:
select [1234], [1235]
from
(
-- replace code below with your query, e.g. select id, value from table
select
id = 1234,
value = 1
union
select
id = 1235,
value = 2
) a
pivot
(
avg(value) for id in ([1234], [1235])
) as pvt
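The pivot above needs the ID values to be known in advance. If the goal is only to put the first, second, third... row into @id1, @id2, @id3..., one alternative is to number the rows and pivot on the position instead. This is a sketch under assumptions, not the original answer's method: @ids is a hypothetical stand-in for the query result, and the rows are assumed to be ordered by ID. ROW_NUMBER() is available from SQL Server 2005 onwards.

```sql
-- Stand-in for the original query's result; @ids is a hypothetical name.
DECLARE @ids TABLE (ID INT);
INSERT INTO @ids (ID)
SELECT 1234 UNION ALL
SELECT 1235 UNION ALL
SELECT 1236 UNION ALL
SELECT 1267 UNION ALL
SELECT 1278;

DECLARE @id1 INT, @id2 INT, @id3 INT, @id4 INT, @id5 INT;

-- Number the rows, then pick each position with a conditional aggregate.
SELECT @id1 = MAX(CASE WHEN rn = 1 THEN ID END)
      ,@id2 = MAX(CASE WHEN rn = 2 THEN ID END)
      ,@id3 = MAX(CASE WHEN rn = 3 THEN ID END)
      ,@id4 = MAX(CASE WHEN rn = 4 THEN ID END)
      ,@id5 = MAX(CASE WHEN rn = 5 THEN ID END)
FROM (SELECT ID, ROW_NUMBER() OVER (ORDER BY ID) AS rn FROM @ids) AS numbered;

SELECT @id1 AS column1, @id2 AS column2, @id3 AS column3, @id4 AS column4, @id5 AS column5;
-- 1234 | 1235 | 1236 | 1267 | 1278
```

If there are fewer than five rows, the remaining variables are simply left NULL.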
I think you'll find the answer in this answer to a slightly different question: Generate "scatter plot" result of members against sets from SQL query
The answer uses Dynamic SQL. Check out the last link in mellamokb's answer: http://www.sqlfiddle.com/#!3/c136d/14 where he creates column names from row data.
In case you have a grouped flat data structure that you want to group transpose, like such:
GRP | ID
---------------
1 | 1234
1 | 1235
1 | 1236
1 | 1267
1 | 1278
2 | 1234
2 | 1235
2 | 1267
2 | 1289
And you want its group transposition to appear like:
GRP | Column 1 | Column 2 | Column 3 | Column 4 | Column 5
-------------------------------------------------------------
1 | 1234 | 1235 | 1236 | 1267 | 1278
2 | 1234 | 1235 | NULL | 1267 | NULL
You can accomplish it with a query like this:
SELECT
Column1.GRP,
Column1.ID AS column1,
Column2.ID AS column2,
Column3.ID AS column3,
Column4.ID AS column4,
Column5.ID AS column5
FROM
(SELECT GRP, ID FROM FlatTable WHERE ID = 1234) AS Column1
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1235) AS Column2
ON Column1.GRP = Column2.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1236) AS Column3
ON Column1.GRP = Column3.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1267) AS Column4
ON Column1.GRP = Column4.GRP
LEFT OUTER JOIN
(SELECT GRP, ID FROM FlatTable WHERE ID = 1278) AS Column5
ON Column1.GRP = Column5.GRP
(1) This assumes you know ahead of time which columns you will want — notice that I intentionally left out ID = 1289 from this example
(2) This basically uses a bunch of left outer joins to append 1 column at a time, thus creating the transposition. The left outer joins (rather than inner joins) allow for some columns to be null if they don't have corresponding values from the flat table, without affecting any subsequent columns.
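An alternative to the chain of left outer joins is conditional aggregation, which reads the flat table once and also keeps groups that lack ID 1234 (the join version starts from the 1234 rows, so such groups would drop out). A minimal sketch, assuming the same sample data loaded into a hypothetical table variable @FlatTable:

```sql
-- Sample data from above, in a hypothetical table variable.
DECLARE @FlatTable TABLE (GRP INT, ID INT);
INSERT INTO @FlatTable (GRP, ID)
SELECT 1, 1234 UNION ALL
SELECT 1, 1235 UNION ALL
SELECT 1, 1236 UNION ALL
SELECT 1, 1267 UNION ALL
SELECT 1, 1278 UNION ALL
SELECT 2, 1234 UNION ALL
SELECT 2, 1235 UNION ALL
SELECT 2, 1267 UNION ALL
SELECT 2, 1289;

-- One output column per known ID; MAX ignores the NULLs produced by CASE.
SELECT GRP
      ,MAX(CASE WHEN ID = 1234 THEN ID END) AS column1
      ,MAX(CASE WHEN ID = 1235 THEN ID END) AS column2
      ,MAX(CASE WHEN ID = 1236 THEN ID END) AS column3
      ,MAX(CASE WHEN ID = 1267 THEN ID END) AS column4
      ,MAX(CASE WHEN ID = 1278 THEN ID END) AS column5
FROM @FlatTable
GROUP BY GRP
ORDER BY GRP;
-- GRP 1: 1234 | 1235 | 1236 | 1267 | 1278
-- GRP 2: 1234 | 1235 | NULL | 1267 | NULL
```

As with the join version, the list of IDs has to be known ahead of time, and ID 1289 is intentionally left out.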

Find "regional" relationships in SQL data using a query, or SSIS

Edit for clarification: I am compiling data weekly, based on Zip_Code, but some Zip_Codes are redundant. I know I should be able to compile a small amount of data, and derive the redundant zip_codes if I can establish relationships.
I want to define a zip code's region by the unique set of items and values that appear in that zip code, in order to create a "Region Table"
I am looking to find relationships by zip code with certain data. Ultimately, I have tables which include similar values for many zip codes.
I have data similar to:
ItemCode |Value | Zip_Code
-----------|-------|-------
1 |10 | 1
2 |15 | 1
3 |5 | 1
1 |10 | 2
2 |15 | 2
3 |5 | 2
1 |10 | 3
2 |10 | 3
3 |15 | 3
Or to simplify the idea, I could even concatenate ItemCode + Value into unique values:
ItemCode+
Value | Zip_Code
A | 1
B | 1
C | 1
A | 2
B | 2
C | 2
A | 3
D | 3
E | 3
As you can see, Zip_Code 1 and 2 have the same distinct ItemCode and Value. Zip_Code 3 however, has different values for certain ItemCodes.
I need to create a table that establishes a relationship between Zip_Codes that contain the same data.
The final table will look something like:
Zip_Code | Region
1 | 1
2 | 1
3 | 2
4 | 2
5 | 1
6 | 3
...etc
This will allow me to collect data only once for each unique Region, and derive the zip_code appropriately.
Things I'm doing now:
I am currently using a query similar to a join that compares two zip codes at a time, using something along the lines of:
SELECT a.ItemCode
,a.value
,a.zip_code
,b.ItemCode
,b.value
,b.zip_code
FROM mytable as a, mytable as b -- select from table twice, similar to a join
WHERE a.zip_code = 1 -- left table will have all ItemCode and Value from zip 1
AND b.zip_code = 2 -- right table will have all ItemCode and Value from zip 2
AND a.ItemCode = b.ItemCode -- matches rows on ItemCode
AND a.Value != b.Value
ORDER BY a.ItemCode
This returns nothing if the two zip codes have exactly the same ItemCode and Value, and returns a slew of differences between the two zip codes if there are differences.
This needs to move from a manual process to an automated process however, as I am now working with more than 100 zip_codes.
I do not have much programming experience in specific languages, so tools in SSIS are somewhat limited to me. I have some experience using the Fuzzy tools, and feel like there might be something in Fuzzy Grouping that might shine a light on apparent regions, but can't figure out how to set it up.
Does anyone have any suggestions? I have access to SQL Server and its related tools, and Visual Studio. I am trying to avoid writing a program to automate this, as my C# skills are relatively nooby, but will figure it out if necessary.
Sorry for being so verbose: This is my first Question, and the page I agreed to in order to ask a question suggested to explain in detail, and talk about what I've tried...
Thanks in advance for any help I might receive.
Give this a shot (I used the simplified example, but this can easily be expanded). I think the real interesting part of this code is the recursive CTE...
;with matches as (
--Find all pairs of zip_codes that have matching values.
select d1.ZipCode zc1, d2.ZipCode zc2
from data d1
join data d2 on d1.Val=d2.Val
group by d1.ZipCode, d2.ZipCode
having count(*) = (select count(distinct Val) from data where zipcode = d1.Zipcode)
), cte as (
--Trace each zip_code to its "smallest" matching zip_code id.
select zc1 tempRegionID, zc2 ZipCode
from matches
where zc1<=zc2
UNION ALL
select c.tempRegionID, m.zc2
from cte c
join matches m on c.ZipCode=m.zc1
and c.ZipCode!=m.zc2
where m.zc1<=m.zc2
)
--For each zip_code, use its smallest matching zip_code as its region.
select zipCode, min(tempRegionID) as regionID
from cte
group by ZipCode
Demonstrating that there's a use for everything, though normally it makes me cringe: concatenate the values for each zip code into a single field. Store ZipCode and ConcatenatedValues in a lookup table (PK on the one, UQ on the other). Now you can assess which zip codes are in the same region by grouping on ConcatenatedValues.
Here's a simple function to concatenate text data:
CREATE TYPE dbo.List AS TABLE
(
Item VARCHAR(1000)
)
GO
CREATE FUNCTION dbo.Implode (@List dbo.List READONLY, @Separator VARCHAR(10) = ',') RETURNS VARCHAR(MAX)
AS BEGIN
DECLARE @Concat VARCHAR(MAX)
SELECT @Concat = CASE WHEN Item IS NULL THEN @Concat ELSE COALESCE(@Concat + @Separator, '') + Item END FROM @List
RETURN @Concat
END
GO
DECLARE @List AS dbo.List
INSERT INTO @List (Item) VALUES ('A'), ('B'), ('C'), ('D')
SELECT dbo.Implode(@List, ',')
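To tie this back to the region table, here is one way the "concatenate, then group" idea could be sketched end to end. It is an illustration under assumptions: it uses the simplified sample data from the question in a hypothetical table variable @data, builds the concatenated value with FOR XML PATH instead of the Implode function above, and numbers the regions with DENSE_RANK. The multi-row VALUES insert needs SQL Server 2008 or later, and FOR XML PATH would entitize characters such as & or < if they appeared in the values.

```sql
-- Simplified sample data from the question; @data is a hypothetical name.
DECLARE @data TABLE (Val VARCHAR(10), ZipCode INT);
INSERT INTO @data (Val, ZipCode) VALUES
('A', 1), ('B', 1), ('C', 1),
('A', 2), ('B', 2), ('C', 2),
('A', 3), ('D', 3), ('E', 3);

WITH signatures AS (
    -- One row per zip code with its distinct values concatenated in sorted order.
    SELECT z.ZipCode
          ,STUFF((SELECT ',' + d.Val
                  FROM @data AS d
                  WHERE d.ZipCode = z.ZipCode
                  GROUP BY d.Val
                  ORDER BY d.Val
                  FOR XML PATH('')), 1, 1, '') AS ConcatenatedValues
    FROM (SELECT DISTINCT ZipCode FROM @data) AS z
)
-- Zip codes with the same concatenated value get the same region number.
SELECT ZipCode
      ,DENSE_RANK() OVER (ORDER BY ConcatenatedValues) AS Region
FROM signatures
ORDER BY ZipCode;
-- ZipCode 1 -> Region 1, ZipCode 2 -> Region 1, ZipCode 3 -> Region 2
```

The region numbers here depend on the sort order of the concatenated values, so they may change as new data arrives; storing the result in the lookup table described above keeps them stable.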
