I have strings in my database that represent days of the week, like this:
1234567 (all days of the week)
1230567 (all days but Thursday, EU standard - day 1 is Monday)
0000067 (no day apart from Saturday and Sunday)
And I need to write an SQL query that checks for overlaps.
For instance:
1234500 and 0000067 are NOT overlapping,
while 1234500 and 0030000 ARE overlapping (the 3),
and 1234500 and 0000567 ARE overlapping (the 5).
Each entry has an ID, customer number, and this weekday representation.
I was thinking something like this:
SELECT
*
FROM dbo.Customers c
JOIN dbo.Customers c2 ON c.CustomerNumber = c2.CustomerNumber
AND c.Days <> c2.Days
WHERE 1 = 1
AND ...?
That gets me two entries for the same customer, but when I come to the WHERE clause I draw a blank. Finding a substring (for instance 3) in both Days fields is very easy, but when any one of the 7 positions can overlap and I have to exclude 0 (not an active day), I get confused.
I need some help.
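One way of doing it is to compare the Days strings character by character, ignoring 0s by replacing them with values that can never match ('8' on one side, '9' on the other). The query below returns pairs of rows for the same customer that have no overlapping days; to find the overlapping pairs instead, negate the condition by wrapping it in NOT (...).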
SELECT
*
FROM Customers c
JOIN Customers c2 ON c.CustomerNumber = c2.CustomerNumber
and c.days <> c2.days
where
(
REPLACE (substring (c.[days],1,1),'0','8') <> REPLACE (substring (c2.[days],1,1) ,'0','9')
AND
REPLACE (substring (c.[days],2,1),'0','8') <> REPLACE (substring (c2.[days],2,1) ,'0','9')
AND
REPLACE (substring (c.[days],3,1),'0','8') <> REPLACE (substring (c2.[days],3,1) ,'0','9')
AND
REPLACE (substring (c.[days],4,1),'0','8') <> REPLACE (substring (c2.[days],4,1) ,'0','9')
AND
REPLACE (substring (c.[days],5,1),'0','8') <> REPLACE (substring (c2.[days],5,1) ,'0','9')
AND
REPLACE (substring (c.[days],6,1),'0','8') <> REPLACE (substring (c2.[days],6,1) ,'0','9')
AND
REPLACE (substring (c.[days],7,1),'0','8') <> REPLACE (substring (c2.[days],7,1) ,'0','9')
)
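The position-by-position rule is easy to check outside the database. As a minimal sketch in Python (the helper name `days_overlap` is hypothetical and not part of the answer above), two day strings overlap when some position holds the same non-zero digit in both:

```python
def days_overlap(a: str, b: str) -> bool:
    """Return True if two 7-character day strings share an active day."""
    # A position counts only when it is non-zero; '0' means "not active"
    # and must never be treated as a match.
    return any(x != "0" and x == y for x, y in zip(a, b))


if __name__ == "__main__":
    for left, right in [("1234500", "0000067"),
                        ("1234500", "0030000"),
                        ("1234500", "0000567")]:
        print(left, right, days_overlap(left, right))
```

The SQL above tests the opposite condition (every position differs), which is why it returns the non-overlapping pairs.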
This uses two common table expressions: one for a tiny tally table, the other to split Days into one row per position with cross apply() and substring() against that tally table. The count() over() windowed aggregate then counts the occurrences of each Day per CustomerNumber, and the final select shows each overlapping Day:
;with n as (
select i from (values (1),(2),(3),(4),(5),(6),(7)) t(i)
)
, expand as (
select c.CustomerNumber, c.Days, d.Day
, cnt = count(*) over (partition by c.CustomerNumber, d.Day)
from Customers c
cross apply (
select Day = substring(c.Days,n.i,1)
from n
) d
where d.Day > 0
)
select *
from expand
where cnt > 1
rextester demo: http://rextester.com/SZUANG12356
with test setup:
create table Customers (customernumber int, days char(7))
insert into Customers values
(1,'1234500')
,(1,'0000067') -- NOT overlapping
,(2,'1234500')
,(2,'0030000') -- IS overlapping (the 3).
,(3,'1234500')
,(3,'0000567') -- IS overlapping (the 5).
;
returns:
+----------------+---------+-----+-----+
| CustomerNumber | Days | Day | cnt |
+----------------+---------+-----+-----+
| 2 | 1234500 | 3 | 2 |
| 2 | 0030000 | 3 | 2 |
| 3 | 1234500 | 5 | 2 |
| 3 | 0000567 | 5 | 2 |
+----------------+---------+-----+-----+
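For readers who want to trace the expand-then-count idea step by step, here is a rough Python sketch of the same logic. It is only an illustration: the function name `overlapping_days` and the tuple layout are assumptions, and a `Counter` stands in for the windowed count.

```python
from collections import Counter


def overlapping_days(rows):
    """rows: iterable of (customer_number, days) pairs.

    Returns (customer_number, days, day, cnt) for every active day that
    appears more than once for the same customer.
    """
    # Expand each string into one entry per non-zero character,
    # which is what cross apply + substring does in the query.
    expanded = [(cust, days, ch)
                for cust, days in rows
                for ch in days if ch != "0"]
    # Count occurrences per (customer, day), like
    # count(*) over (partition by CustomerNumber, Day).
    counts = Counter((cust, ch) for cust, _, ch in expanded)
    return [(cust, days, ch, counts[cust, ch])
            for cust, days, ch in expanded
            if counts[cust, ch] > 1]


if __name__ == "__main__":
    sample = [(1, "1234500"), (1, "0000067"),
              (2, "1234500"), (2, "0030000"),
              (3, "1234500"), (3, "0000567")]
    for row in overlapping_days(sample):
        print(row)
```

With the test setup above this yields the same four rows as the query: day 3 twice for customer 2 and day 5 twice for customer 3.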
Reference:
common table expression
table value constructor (values (...),(...))
cross apply()
over()
The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
Without any DDL (underlying table structure) it's impossible to understand where the data lives that you'll be comparing. That said, what you are trying to do will be simple using ngrams8k.
Note this query:
declare @searchstring char(7) = '1234500';
select * from dbo.ngrams8k(@searchstring,1);
returns
position token
----------- ------
1 1
2 2
3 3
4 4
5 5
6 0
7 0
with that in mind, this will help you:
-- sample data
declare @days table (daystring char(7));
insert @days values ('0000067'),('0030000'),('0000567');
declare @searchstring char(7) = '1234500';
-- how to break down and compare the strings
select
searchstring = @searchstring,
overlapstring = OverlapCheck.daystring,
overlapPosition = OverlapCheck.position,
overlapValue = OverlapCheck.token
from dbo.ngrams8k(@searchstring, 1) search
join
(
select *
from @days d
cross apply dbo.ngrams8k(d.daystring,1)
where token <> 0
) OverlapCheck on search.position = OverlapCheck.position
and search.token = OverlapCheck.token;
Returns:
searchstring overlapstring overlapPosition overlapValue
------------ ------------- -------------------- ---------------
1234500 0030000 3 3
1234500 0000567 5 5
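The join on position and token can be mimicked in a few lines of Python. This is a sketch only: `unigrams` is a hypothetical stand-in for what `ngrams8k(..., 1)` returns in the output shown earlier (one position/token pair per character), not a port of that function.

```python
def unigrams(s):
    """(position, token) pairs, 1-based, one per character of s."""
    return [(i, ch) for i, ch in enumerate(s, start=1)]


def overlaps(search, candidates):
    """Rows of (searchstring, overlapstring, position, token)."""
    search_tokens = set(unigrams(search))
    return [(search, cand, pos, tok)
            for cand in candidates
            for pos, tok in unigrams(cand)
            # Skip inactive days, then require the same token at the
            # same position in the search string.
            if tok != "0" and (pos, tok) in search_tokens]


if __name__ == "__main__":
    for row in overlaps("1234500", ["0000067", "0030000", "0000567"]):
        print(row)
```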
I have a stored procedure in SQL Server 2014 that selects some rows from a table with pagination, along with total row count:
SELECT
[...], COUNT(*) OVER () AS RowCount
FROM
[...]
WHERE
[...]
ORDER BY
[...]
OFFSET ([..]) ROWS FETCH NEXT 3 ROWS ONLY
Output:
+----+------+----------+
| ID | Name | RowCount |
+----+------+----------+
| 1 | Bob | 55 |
| 123| John | 55 |
| 99 | Jack | 55 |
+----+------+----------+
I would like to return results with actual data only, passing RowCount in an output parameter.
+----+------+
| ID | Name |
+----+------+
| 1 | Bob |
| 123| John |
| 99 | Jack |
+----+------+
@OutRowCount = 55
I tried with a CTE, but CTE is available only within the first SELECT:
WITH CTE AS
(
SELECT [...], COUNT(*) OVER () AS RowCount
FROM [...]
WHERE [...]
ORDER BY [...]
OFFSET ([..]) ROWS FETCH NEXT 3 ROWS ONLY
)
SELECT
ID, Name
FROM
CTE
SET @OutRowCount = (SELECT TOP 1 RowCount FROM CTE) -- here CTE is no longer defined
How can I do this? I think I can use temp table but I wonder if in this case performance might be an issue.
The "total row count" you have in mind is a bit unclear. Typically when paging you also display the total number of (filtered) rows, e.g. "Showing 3 of 42 Blue Widgets". That doesn't involve Max.
A CTE can have multiple queries, e.g.:
with
AllRows as ( -- All of the filtered rows.
select ..., Count(*) over (...) as [RowCount] -- Bracketed: ROWCOUNT is a reserved keyword.
from ...
where ... -- Filter criteria.
),
SinglePage as ( -- One page of filtered rows.
select ...
from AllRows
order by ... -- Order here to get the correct rows in the page.
offset (...) rows fetch next 3 rows only
)
select SP.Id, SP.Name,
( select Count(42) from AllRows ) as TotalRowCount -- Constant over all rows.
from SinglePage as SP
order by ...; -- Keep the rows in the desired order.
Re: SET @OutRowCount = (SELECT TOP 1 RowCount FROM CTE)
Note that TOP 1 without order by isn't guaranteed to pick the row you have in mind.
Thanks to @Larnu and @Stu, I solved this using a table variable, this way:
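CREATE PROCEDURE MyProc
@OutRowCount INT OUTPUT
AS
BEGIN
-- Holds one page of rows plus the total count repeated on each row.
DECLARE @TempTbl TABLE (
ID INT,
Name VARCHAR(MAX),
[RowCount] INT -- Bracketed: ROWCOUNT is a reserved keyword.
)
INSERT INTO
@TempTbl
SELECT
ID,
Name,
COUNT(*) OVER() AS TotRighe
FROM
MyTable
WHERE
[...]
ORDER BY
Name
OFFSET ([...]) ROWS FETCH NEXT 3 ROWS ONLY
-- Result set with the actual data only.
SELECT
ID,
Name
FROM
@TempTbl
-- Every row carries the same count; 0 when the page is empty.
SET @OutRowCount = ISNULL((SELECT TOP 1 [RowCount] FROM @TempTbl), 0)
END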
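The shape of the result (data rows in one place, total count passed separately) can be tried out quickly with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not the procedure above: it runs on SQLite, so it uses `LIMIT`/`OFFSET` instead of `OFFSET ... FETCH`, it issues a separate `COUNT(*)` query rather than a windowed count, and the function name `fetch_page` and the sample rows are made up for illustration.

```python
import sqlite3


def fetch_page(conn, offset, page_size=3):
    """Return (rows, total): one page of (ID, Name) and the total row count."""
    # The total plays the role of the output parameter.
    total = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
    # The page contains the actual data only, without a count column.
    rows = conn.execute(
        "SELECT ID, Name FROM MyTable ORDER BY Name LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return rows, total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (ID INTEGER, Name TEXT)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?)",
        [(1, "Bob"), (123, "John"), (99, "Jack"), (7, "Zed"), (8, "Amy")],
    )
    page, total = fetch_page(conn, offset=0)
    print(page, total)
```

Unlike the windowed count, a separate count query still reports the total when the requested page is empty.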
I have a Table Animals
Id | Name | Count | -- (other columns not relevant)
1 | horse | 11
2 | giraffe | 20
I want to try to insert or update values from a CSV string
Is it possible to do something like the following in 1 query?
;with results as
(
select * from
(
values ('horse'), ('giraffe'), ('lion')
)
animal_csv(aName)
left join animals on
animals.[Name] = animal_csv.aName
)
update results
set
[Count] = 1 + animals.[Count]
-- various other columns are set here
where Id is not null
--else
--insert into results ([Name], [Count]) values (results.aName, 1)
-- (essentially Where id is null)
It looks like what you're looking for is a table variable or temporary table rather than a common table expression.
If I understand your problem correctly, you are building a result set based on data you're getting from a CSV, merging it by incrementing values, and then returning that result set.
As I read your code, it looks as if your results would look like this:
aName | Id | Name | Count
horse | 1 | horse | 12
giraffe | 2 | giraffe | 21
lion | | |
I think what you're looking for in your final result set is this:
Name | Count
horse | 12
giraffe | 21
lion | 1
First, you can get from your csv and table to a resultset in a single CTE statement:
;WITH animal_csv AS (SELECT * FROM (VALUES('horse'),('giraffe'), ('lion')) a(aName))
SELECT ISNULL(Name, aName) Name
, CASE WHEN [Count] IS NULL THEN 1 ELSE 1 + [Count] END [Count]
FROM animal_csv
LEFT JOIN animals
ON Name = animal_csv.aName
Or, if you want to build your resultset using a table variable:
DECLARE @Results TABLE
(
Name VARCHAR(30)
, Count INT
)
;WITH animal_csv AS (SELECT * FROM (VALUES('horse'),('giraffe'), ('lion')) a(aName))
INSERT @Results
SELECT ISNULL(Name, aName) Name
, CASE WHEN [Count] IS NULL THEN 1 ELSE 1 + [Count] END [Count]
FROM animal_csv
LEFT JOIN animals
ON Name = animal_csv.aName
SELECT * FROM @Results
Or, if you just want to use a temporary table, you can build it like this (temp tables are deleted when the connection is released/closed or when they're explicitly dropped):
;WITH animal_csv AS (SELECT * FROM (VALUES('horse'),('giraffe'), ('lion')) a(aName))
SELECT ISNULL(Name, aName) Name
, CASE WHEN [Count] IS NULL THEN 1 ELSE 1 + [Count] END [Count]
INTO #results
FROM animal_csv
LEFT JOIN animals
ON Name = animal_csv.aName
SELECT * FROM #results
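The rule applied by the `ISNULL`/`CASE` pair in all three variants is small enough to state in Python. A minimal sketch, assuming the existing table is available as a name-to-count dictionary (the function name `merged_counts` is hypothetical):

```python
def merged_counts(existing, csv_names):
    """existing: dict of name -> count; csv_names: names from the CSV string."""
    # A name missing from the table starts at 1; otherwise the stored
    # count is incremented by 1, mirroring the CASE expression.
    return {name: existing.get(name, 0) + 1 for name in csv_names}


if __name__ == "__main__":
    animals = {"horse": 11, "giraffe": 20}
    print(merged_counts(animals, ["horse", "giraffe", "lion"]))
```

For the sample data this gives horse 12, giraffe 21 and lion 1, the result set described above.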
We have a string variable that holds a string like the one below:
String-like >>
Temp Table Temp | Temp1 Table1 Temp1 | Temp2 Table2 Temp2 | ABD EFG EFG
Now we need to check how many palindromes exist in this string.
Can you help me work out how to fetch the number of palindromes it contains?
Note: a pipe "|" marks the end of each complete string.
Answer should be: 3
In the query I have written I used the Reverse() / Replace() functions, but I can't work out how to split the string at every pipe symbol.
So please help me with that; I am a beginner in SQL Server.
It seems you are confusing your requirement with searching for palindromes, so I have put together a solution to your question as well as a few methods should anyone else come across this question looking for an answer relating to actual palindromes:
Answer to your question as it is here
To do this, you can split your string on the delimiter and then split the result again on the spaces (I have included the function I've used here at the end). With this ordered list of words, you can compare the words in order to the words in reverse order to see if they are the same:
declare @s nvarchar(100) = 'Temp Table Temp | Temp1 Table1 Temp1 | Temp2 Table2 Temp2 | ABD EFG EFG';
with w as
(
select s.item as s
,ss.rn
,row_number() over (partition by s.item order by ss.rn desc) as rrn
,ss.item as w
from dbo.fn_StringSplit4k(@s,'|',null) as s
cross apply dbo.fn_StringSplit4k(ltrim(rtrim(s.item)),' ',null) as ss
)
select w.s
,case when sum(case when w.w = wr.w then 1 else 0 end) = max(w.rn) then 1 else 0 end as p
from w
join w as wr
on w.s = wr.s
and w.rn = wr.rrn
group by w.s
order by w.s
Which outputs:
+----------------------+---+
| s | p |
+----------------------+---+
| ABD EFG EFG | 0 |
| Temp1 Table1 Temp1 | 1 |
| Temp2 Table2 Temp2 | 1 |
| Temp Table Temp | 1 |
+----------------------+---+
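The same word-order comparison can be sketched in Python to make the logic easier to follow. This is an illustration under assumptions rather than a translation of the query: the function names are hypothetical, and string comparison here is case-sensitive, whereas in SQL Server it depends on the collation.

```python
def is_word_palindrome(segment):
    """True when the words read the same in forward and reverse order."""
    words = segment.split()
    return words == words[::-1]


def count_word_palindromes(s, delimiter="|"):
    """Count delimiter-separated segments whose word order is symmetric."""
    return sum(is_word_palindrome(part) for part in s.split(delimiter))


if __name__ == "__main__":
    text = "Temp Table Temp | Temp1 Table1 Temp1 | Temp2 Table2 Temp2 | ABD EFG EFG"
    print(count_word_palindromes(text))
```

For the string in the question this counts 3, matching the three rows flagged with p = 1.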
Solution for actual palindromes
Firstly, to check whether a string value is a proper palindrome (i.e. spelled the same forwards and backwards), you only need a trivial comparison of the original string with its reverse value, which in the example below correctly outputs 1:
declare @p nvarchar(100) = 'Temp Table elbaT pmeT';
select case when @p = reverse(@p)
then 1
else 0
end as p
To do this across a set of delimited values within the same string, you should firstly feel bad for storing your data in a delimited string within your database and contemplate why you are doing this. Seriously, it's incredibly bad design and you should fix it as soon as possible. Once you have done that you can apply the above technique.
If that is genuinely unavoidable, however, you can split your string using one of many set-based table-valued functions and then apply the above operation to the output:
declare @ps nvarchar(100) = 'Temp Table elbaT pmeT | Temp1 Table1 1elbaT 1pmeT | Temp2 Table2 Temp2 | ABD EFG EFG';
select ltrim(rtrim(s.item)) as s
,case when ltrim(rtrim(s.item)) = reverse(ltrim(rtrim(s.item))) then 1 else 0 end as p
from dbo.fn_StringSplit4k(@ps,'|',null) as s
Which outputs:
+---------------------------+---+
| s | p |
+---------------------------+---+
| Temp Table elbaT pmeT | 1 |
| Temp1 Table1 1elbaT 1pmeT | 1 |
| Temp2 Table2 Temp2 | 0 |
| ABD EFG EFG | 0 |
+---------------------------+---+
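As a cross-check of the character-level test, here is a short Python sketch of the same split, trim and reverse steps. The function name `char_palindrome_flags` is hypothetical, and the comparison is case-sensitive, which may differ from your SQL Server collation.

```python
def char_palindrome_flags(s, delimiter="|"):
    """Return (segment, flag) pairs; flag is 1 for a true palindrome."""
    # Trim each segment first, as ltrim(rtrim(...)) does in the query.
    parts = [p.strip() for p in s.split(delimiter)]
    return [(p, int(p == p[::-1])) for p in parts]


if __name__ == "__main__":
    text = ("Temp Table elbaT pmeT | Temp1 Table1 1elbaT 1pmeT"
            " | Temp2 Table2 Temp2 | ABD EFG EFG")
    for segment, flag in char_palindrome_flags(text):
        print(segment, flag)
```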
String split function
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value to return, null returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;
Let's say I have a table with 3 columns (a, b, c) with following values:
+---+------+---+
| a | b | c |
+---+------+---+
| 1 | 5 | 1 |
| 1 | NULL | 1 |
| 2 | NULL | 0 |
| 2 | NULL | 0 |
| 3 | NULL | 5 |
| 3 | NULL | 5 |
+---+------+---+
My desired output: 3
I want to select only those distinct values from column a for which every single occurrence of this value has NULL in column b given that value in c is not 0. Therefore from my desired output, "1" won't come in because there is a "5" in column b even though there is a NULL for the 2nd occurrence of "1". And "2" won't come in because the value of c is 0
The query that I'm using currently which is not working:
SELECT a FROM tab WHERE c!=0 GROUP BY a HAVING COUNT(b) = 0
You can do this using a HAVING clause:
SQL Fiddle
SELECT a
FROM tbl
GROUP BY a
HAVING
SUM(CASE
WHEN b IS NOT NULL OR c = 0 THEN 1
ELSE 0 END
) = 0
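To see the conditional sum at work on the sample rows, the query can be run as-is against an in-memory SQLite database from Python. This is only a convenience sketch: SQLite stands in for SQL Server here, an `ORDER BY a` is added for a stable result, and the wrapper name `qualifying_values` is made up.

```python
import sqlite3


def qualifying_values(rows):
    """Run the HAVING query over (a, b, c) rows and return matching a values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    # Each "bad" row (b present, or c equal to 0) contributes 1 to the sum,
    # so a group qualifies only when it has no bad rows at all.
    result = conn.execute(
        """
        SELECT a
        FROM tbl
        GROUP BY a
        HAVING SUM(CASE WHEN b IS NOT NULL OR c = 0 THEN 1 ELSE 0 END) = 0
        ORDER BY a
        """
    ).fetchall()
    conn.close()
    return [a for (a,) in result]


if __name__ == "__main__":
    sample = [(1, 5, 1), (1, None, 1), (2, None, 0),
              (2, None, 0), (3, None, 5), (3, None, 5)]
    print(qualifying_values(sample))
```

For the six sample rows only the value 3 survives, which is the desired output.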
I think this is the having clause that you want:
select a
from tab t
group by a
having count(case when c <> 0 then b end) = 0 and
max(c) > 0
This assumes that c is non-negative.
However, it is not entirely clear why "2" doesn't meet your condition. There are no rows where "c" is not zero. Hence, all such rows have NULL values.
DECLARE @Table TABLE (
A INT
,B INT
,C INT
)
INSERT INTO @Table SELECT 1,5,1
INSERT INTO @Table SELECT 1,NULL,1
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 2,NULL,0
INSERT INTO @Table SELECT 3,NULL,5
INSERT INTO @Table SELECT 3,NULL,5
SELECT
a,max(b) [MaxB],max(C) [MaxC]
FROM @Table
GROUP BY A
HAVING max(b) IS NULL AND ISNULL(max(C),1)<>0
Although you've got 3 answers already, I decided to contribute my 2c...
The query from Ghost comes out most efficient when I check in SQL Server Query Analyzer; however, I suspect that if your data set changes, Ghost's query may not be exactly what you require based on what you've written.
I think the query below is what you're looking for at the lowest execution cost, basing this on your written requirements rather than the data example you've provided. (Note: this query's performance is similar to Felix's and Gordon's answers; however, I haven't included a conditional CASE expression in my HAVING clause.)
SELECT DISTINCT(a) FROM intTable
GROUP BY a
HAVING SUM(ISNULL(b,0))=0 AND SUM(c)<>0
Hope this helps!
What I require is the min & max of the BaseDate where AvailabletoSell = 1 and there are 3 or more consecutive days still available to sell. However, a day needs to be excluded from the sum if the property's changeover day falls on the same day as the BaseDate, as we are only interested in the gaps that we can't sell due to changeover restrictions. The data would have to be grouped by Code, as we have over 1,000 properties. BaseDates are for 2015 & 2016.
NB: Some properties have more than 1 changeoverDay & are currently held in one column comma separated i.e. Saturday, Sunday
Example Data:-
DECLARE @sampleData TABLE (
Code VARCHAR(5) NOT NULL
, BaseDate DATE NOT NULL
, DayName VARCHAR(9) NOT NULL
, ChangeoverDay VARCHAR(8) NOT NULL
, AvailabletoSell BIT NOT NULL
);
INSERT INTO @sampleData VALUES
('PERCH','2015-05-06','Wednesday','Saturday',0),
('PERCH','2015-05-07','Thursday','Saturday',0),
('PERCH','2015-05-08','Friday','Saturday',0),
('PERCH','2015-05-09','Saturday','Saturday',1), -- Not this one as changeover day is the same as the BaseDate
('PERCH','2015-05-10','Sunday','Saturday',1),
('PERCH','2015-05-11','Monday','Saturday',1),
('PERCH','2015-05-12','Tuesday','Saturday',0),
('PERCH','2015-05-13','Wednesday','Saturday',0),
('PERCH','2015-05-14','Thursday','Saturday',1), -- This one = 3
('PERCH','2015-05-15','Friday','Saturday',1),
('PERCH','2015-05-16','Saturday','Saturday',1),
('PERCH','2015-05-17','Sunday','Saturday',0),
('PERCH','2015-05-18','Monday','Saturday',1), -- This one = 4
('PERCH','2015-05-19','Tuesday','Saturday',1),
('PERCH','2015-05-20','Wednesday','Saturday',1),
('PERCH','2015-05-21','Thursday','Saturday',1),
('PERCH','2015-05-22','Friday','Saturday',0),
('PERCH','2015-05-23','Saturday','Saturday',0),
('PERCH','2015-05-24','Sunday','Saturday',0),
('PERCH','2015-05-25','Monday','Saturday',0),
('PERCH','2015-05-26','Tuesday','Saturday',0),
('PERCH','2015-05-27','Wednesday','Saturday',1), -- Not this one, as only 2 consecutive days
('PERCH','2015-05-28','Thursday','Saturday',1),
('PERCH','2015-05-29','Friday','Saturday',0),
('PERCH','2015-05-30','Saturday','Saturday',0);
I would require the output as below:-
+-------+---------------+-------------+----------------------+
| Code | StartBaseDate | EndBaseDate | TotalAvailabletoSell |
+-------+---------------+-------------+----------------------+
| PERCH | 14/05/2015 | 16/05/2015 | 3 |
| PERCH | 18/05/2015 | 21/05/2015 | 4 |
+-------+---------------+-------------+----------------------+
This gives you what you want. But I feel there's a way to reduce the number of times it touches the table
WITH Groupings AS (
SELECT
Code
,LastChange
,MIN(BaseDate) AS StartBaseDate
,MAX(BaseDate) AS EndBaseDate
,COUNT(*) AS DaysInPeriod
FROM
@sampleData AS s1
CROSS APPLY (
SELECT
MAX(BaseDate) AS LastChange
FROM
@sampleData AS cv
WHERE
s1.BaseDate > cv.BaseDate
AND s1.AvailabletoSell != cv.AvailabletoSell
AND s1.Code = cv.Code
) AS cv
WHERE
s1.AvailabletoSell = 1
GROUP BY
Code
,LastChange
)
SELECT
g.Code
,g.StartBaseDate
,g.EndBaseDate
,CASE WHEN a.DayName = a.ChangeoverDay THEN DaysInPeriod - 1 ELSE DaysInPeriod END AS TotalAvailableToSell
FROM
Groupings AS g
INNER JOIN @sampleData AS a
ON a.BaseDate = g.StartBaseDate AND a.Code = g.Code
WHERE
CASE WHEN a.DayName = a.ChangeoverDay THEN DaysInPeriod - 1 ELSE DaysInPeriod END > 2
The logic is pretty much:
Find the last date where the AvailableToSell flag flipped before "this row"
Group into sets by those dates and count the rows in it
Decrement by 1 if the start date has DayName as the ChangeoverDay
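The three steps above can be sketched in Python to make the grouping easier to follow. This is an illustration only, not the query itself: it assumes one row per calendar day per Code, already sorted by Code and BaseDate, and the function name `sellable_gaps` and the tuple layout are hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby


def sellable_gaps(rows, min_days=3):
    """rows: (code, base_date, day_name, changeover_day, available) tuples,
    sorted by code and base_date. Returns (code, start, end, days) tuples."""
    results = []
    for code, code_rows in groupby(rows, key=lambda r: r[0]):
        # Consecutive rows with the same flag form one run, i.e. the rows
        # that share the same "last date the flag flipped".
        for available, run in groupby(code_rows, key=lambda r: r[4]):
            run = list(run)
            if not available:
                continue
            days = len(run)
            # Decrement by 1 if the run starts on the changeover day.
            if run[0][2] == run[0][3]:
                days -= 1
            if days >= min_days:
                results.append((code, run[0][1], run[-1][1], days))
    return results


if __name__ == "__main__":
    names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
    # Availability flags for 2015-05-06 through 2015-05-30, as in the sample data.
    flags = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1,
             1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0]
    start = date(2015, 5, 6)
    rows = []
    for i, flag in enumerate(flags):
        d = start + timedelta(days=i)
        rows.append(("PERCH", d, names[d.weekday()], "Saturday", flag))
    for gap in sellable_gaps(rows):
        print(gap)
```

On the sample data this reports the two periods from the desired output: 14 to 16 May (3 days) and 18 to 21 May (4 days).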
I haven't accounted for your note about ChangeoverDay being a comma-separated field. There are plenty of resources on breaking that out into rows which you could then join to. But I think you also need to spell out what should happen in that scenario when DayName is in the list of ChangeoverDays.