Consider the following example. It is a "QueryLog" table that stores interactions between users and two different products, N and R:
Id  Date                     UserID  Product
--------------------------------------------------
0   2013-06-09 14:50:24.000  100     N
1   2013-06-09 15:27:23.000  100     N
2   2013-06-09 15:29:23.000  100     N
3   2013-06-17 15:31:23.000  100     N
4   2013-06-17 15:32:23.000  100     N
5   2014-05-19 15:30:23.000  250     N
6   2014-07-19 15:27:23.000  250     N
7   2014-07-19 15:27:23.000  333     R
8   2014-08-19 15:27:23.000  333     R
Expected results :
Count
-----
1
(Only UserID 250 meets my criteria.)
If a user interacts 10 times with the product within a single month, he does not meet my criteria.
To summarize, I am looking for:
The number of distinct users who interacted with product N in more than one month (whatever the number of interactions the user had during any single month).
This is the code I've tried:
select distinct v.UserID, v.mois , v.annee
from
(select c.UserID , c. mois, c.annee, COUNT(c.UserID) as frequence
from
(
SELECT
datepart(month,[DATE]) as mois,
datepart(YEAR,[DATE]) as annee ,
Username,
UserID,
Product
FROM QueryLog
where Product = 'N'
) c
group by c.UserID, c.annee, c.mois
) v
group by v.UserID, v.mois, v.annee
Try this:
DECLARE @YourTable table (Id int, [Date] datetime, UserID int, Product char(1))
INSERT INTO @YourTable VALUES (0,'2013-06-09 14:50:24',100 ,'N')
,(1,'2013-06-09 15:27:23',100 ,'N')
,(2,'2013-06-09 15:29:23',100 ,'N')
,(3,'2013-06-17 15:31:23',100 ,'N')
,(4,'2013-06-17 15:32:23',100 ,'N')
,(5,'2014-05-19 15:30:23',250 ,'N')
,(6,'2014-07-19 15:27:23',250 ,'N')
,(7,'2014-07-19 15:27:23',333 ,'R')
,(8,'2014-08-19 15:27:23',333 ,'R')
;WITH MultiMonthUsers AS
(
    select
        UserID
    FROM (select            -- one row per user per calendar month
              UserID
          FROM @YourTable
          WHERE product='N'
          GROUP BY UserID, YEAR([Date]),MONTH([Date])
         )dt2
    GROUP BY UserID
    HAVING COUNT(*)>1       -- keep users with more than one month
)
SELECT COUNT(*) FROM MultiMonthUsers
Depending on the number of rows and the available indexes, this may run slowly: wrapping [Date] in YEAR() and MONTH() keeps the optimizer from using an index on that column for the grouping.
I think this will do it, but I need a better dataset to test with:
-- returns one row per qualifying user (the value is that user's month count)
SELECT COUNT(*)
FROM (
    --roll all month/user records into single row
    SELECT UserID, datediff(month, 0, [date]) As MonthGroup
    FROM QueryLog
    WHERE Product='N'
    GROUP BY datediff(month, 0, [date]), UserId
) t
-- look for users with multiple rows
GROUP BY UserID
HAVING COUNT(UserID) > 1
Seems like there should be a way to roll this up further, to avoid the need for the nested select.
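One way to drop a level of nesting, sketched here under the assumption that the table looks like the one in the question: a user has activity in more than one calendar month exactly when the month of the earliest interaction differs from the month of the latest one, so a single GROUP BY with MIN and MAX is enough. The table variable @QueryLog and its rows are only there to make the snippet self-contained.

```sql
-- @QueryLog stands in for the real QueryLog table; rows copied from the question
DECLARE @QueryLog table (Id int, [Date] datetime, UserID int, Product char(1));
INSERT INTO @QueryLog VALUES
 (0,'2013-06-09T14:50:24',100,'N'),
 (1,'2013-06-09T15:27:23',100,'N'),
 (2,'2013-06-09T15:29:23',100,'N'),
 (3,'2013-06-17T15:31:23',100,'N'),
 (4,'2013-06-17T15:32:23',100,'N'),
 (5,'2014-05-19T15:30:23',250,'N'),
 (6,'2014-07-19T15:27:23',250,'N'),
 (7,'2014-07-19T15:27:23',333,'R'),
 (8,'2014-08-19T15:27:23',333,'R');

SELECT COUNT(*) AS [Count]
FROM (
    SELECT UserID
    FROM @QueryLog
    WHERE Product = 'N'
    GROUP BY UserID
    -- DATEDIFF(month, ...) counts month boundaries crossed,
    -- so > 0 means the user was active in at least two calendar months
    HAVING DATEDIFF(month, MIN([Date]), MAX([Date])) > 0
) AS MultiMonthUsers;
```

With the sample rows this returns 1 (only UserID 250). The derived table is still needed to count the users, but the per-month grouping step is gone.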
I'm trying to sum totals in a way that goes beyond a basic "group by" or "case" statement.
Here's an example dataset:
Amt  Cust_id  Ranking  PlanType
10   1        1        Term
6    1        2        Variable
8    1        3        Variable
7    1        4        Variable
12   1        5        Term
6    1        6        Variable
10   1        7        Variable
The objective is to return the max sum where the plan type is 'Variable' and
the Ranking numbers are adjacent to each other.
So the answer to the example would be the sum of rows 2-4 which returns 21.
The answer is not the sum of all variable plan types, because row 5 is a 'Term' which breaks it apart.
So I'd like to end with a dataset like below to handle multiple groups of customers:
Amt Cust_ID
21 1
30 2
45 3
Here's where I'm stuck; this returns the wrong answer:
Create Table #tb (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10))
INSERT INTO #tb
VALUES (10,1,1,'Term'),
(6,1,2,'Variable'),
(8,1,3,'Variable'),
(7,1,4,'Variable'),
(12,1,5,'Term'),
(6,1,6,'Variable'),
(10,1,7,'Variable'),
(10,2,1,'Term'),
(6,2,2,'Variable'),
(7,2,4,'Variable'),
(12,2,5,'Term'),
(6,2,6,'Variable'),
(50,2,7,'Variable')
select
( SELECT SUM(Amt) FROM #tb as t2
WHERE t2.Cust_ID=t1.Cust_ID AND t2.Ranking<=t1.Ranking AND
t2.PlanType='Variable') RollingAmt
,Cust_ID, Ranking, Amt, PlanType
from #tb as t1
order by Cust_ID, Ranking
The query computes a rolling sum ordered by "Ranking" where PlanType = 'Variable'. Unfortunately it accumulates all 'Variable' rows together, which is not what I need.
When it reaches a PlanType of 'Term', the sum needs to start over within each group.
In order to do this you need to use a gaps-and-islands technique to generate a "group id" based on consecutive runs of the same PlanType, then you can sum and sort based on that new group id.
Try this:
DECLARE @data TABLE (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10))
INSERT INTO @data
VALUES (10,1,1,'Term'),
(6,1,2,'Variable'),
(8,1,3,'Variable'),
(7,1,4,'Variable'),
(12,1,5,'Term'),
(6,1,6,'Variable'),
(10,1,7,'Variable'),
(10,2,1,'Term'),
(6,2,2,'Variable'),
(7,2,4,'Variable'),
(12,2,5,'Term'),
(6,2,6,'Variable'),
(50,2,7,'Variable')
;WITH X AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Cust_id,PlanType ORDER BY Ranking)
- ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY Ranking) groupID /* Assign a groupID to consecutive runs of PlanTypes by Cust_id */
FROM @data
), Y AS
(
SELECT *, SUM(Amt) OVER(PARTITION BY Cust_id,groupID) AS AmtSum /* Sum Amt by Cust/groupID */
FROM X
WHERE PlanType='Variable'
), Z AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY Cust_id ORDER BY AmtSum DESC) AS RN /* Assign a row number (1) to highest AmtSum by Cust */
FROM Y
)
SELECT AmtSum, Cust_id
FROM Z
WHERE RN=1 /* Only select RN=1 to get highest value by cust_id/groupId */
If you are curious about how this all works, you can comment out the last SELECT and run SELECT * FROM X, then SELECT * FROM Y, and so on, to see what each step does along the way; note that only one SELECT can follow the entire CTE structure.
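Note that the query above treats rows as consecutive by their position in the Ranking order, so for Cust_id 2 the rows ranked 2 and 4 land in the same island even though Ranking 3 is missing. If "adjacent" is meant to require consecutive Ranking values (an assumption about the requirement, and one that also assumes Ranking is unique per customer), a possible variant subtracts a row number from Ranking itself within the 'Variable' rows. A minimal sketch, reusing the sample data; the CTE names Islands and Sums are made up for illustration:

```sql
DECLARE @data TABLE (Amt INT, Cust_id TINYINT, Ranking INT, PlanType VARCHAR(10));
INSERT INTO @data
VALUES (10,1,1,'Term'), (6,1,2,'Variable'), (8,1,3,'Variable'), (7,1,4,'Variable'),
       (12,1,5,'Term'), (6,1,6,'Variable'), (10,1,7,'Variable'),
       (10,2,1,'Term'), (6,2,2,'Variable'), (7,2,4,'Variable'),
       (12,2,5,'Term'), (6,2,6,'Variable'), (50,2,7,'Variable');

WITH Islands AS
(
    SELECT Cust_id, Amt,
           -- Ranking minus the row's position among the customer's 'Variable' rows
           -- stays constant only while Ranking increases by exactly 1
           Ranking - ROW_NUMBER() OVER (PARTITION BY Cust_id ORDER BY Ranking) AS groupID
    FROM @data
    WHERE PlanType = 'Variable'
), Sums AS
(
    SELECT Cust_id, groupID, SUM(Amt) AS AmtSum   -- one total per island
    FROM Islands
    GROUP BY Cust_id, groupID
)
SELECT MAX(AmtSum) AS Amt, Cust_id                -- largest island per customer
FROM Sums
GROUP BY Cust_id;
```

On the sample data both versions return 21 for Cust_id 1 and 56 for Cust_id 2; the results can differ when a gap in Ranking falls inside a run of 'Variable' rows.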
I am trying to grab a series of dates and the corresponding values (if any) that exist in my database.
I have two parameters - today (date using getDate()) - and a number of days (integer). For this example, I'm using the value 10 for the days.
Code to get the sequential dates for 10 days after today:
SELECT top 10 DATEADD(DAY, ROW_NUMBER()
OVER (ORDER BY object_id), REPLACE(getDate(),'-','')) as Alldays
FROM sys.all_objects
I now need to look up several values for each day in the sequential days code, which may or may not exist in the time table (we assume 8 hours for all dates, unless otherwise specified). The lookup would be on the field recordDateTime. If no "hours" value exists in the table cap_time for that date, I need to return a default value of 8 as the number of hours. Here's the base query:
SELECT u.FullName as UserName, d2.department,
recordDateTime, ISNULL(hours,8) as hours
FROM cap_time c
left join user u on c.userID = u.userid
left join dept d2 on u.deptID = d2.DeptID
WHERE c.userid = 38 AND u.deptID = 1
My end result for the next 10 days should be something like:
Date (sequential), Department, UserName, Number of Hours
I can accomplish this using TSQL and a temp table, but I'd like to see if this can be done in a single statement. Any help is appreciated.
Without any DDL or sample data it's hard to determine exactly what you need.
I think this will get you pretty close (note my comments):
-- sample data
------------------------------------------------------------------------------------------
DECLARE @table TABLE
(
  fullName   varchar(10),
  department varchar(10),
  [hours]    tinyint,
  somedate   date
);
INSERT @table VALUES
('bob', 'sales', 5, getdate()+1),
('Sue', 'marketing', 3, getdate()+2),
('Sue', 'sales', 12, getdate()+4),
('Craig', 'sales', 4, getdate()+8),
('Joe', 'sales', 18, getdate()+9),
('Fred', 'sales', 10, getdate()+10);
--SELECT * FROM @table
;
-- solution
------------------------------------------------------------------------------------------
WITH alldays([day]) AS -- logic to get your dates for a LEFT date table
(
  SELECT TOP (10)
    CAST(DATEADD
    (
      DAY,
      ROW_NUMBER() OVER (ORDER BY object_id),
      getdate()
    ) AS date)
  FROM sys.all_objects
)
SELECT d.[day], t.fullName, department, [hours] = ISNULL([hours], 8)
FROM alldays d
LEFT JOIN @table t ON d.[day] = t.somedate;
Results:
day        fullName   department hours
---------- ---------- ---------- -----
2017-04-12 bob        sales      5
2017-04-13 Sue        marketing  3
2017-04-14 NULL       NULL       8
2017-04-15 Sue        sales      12
2017-04-16 NULL       NULL       8
2017-04-17 NULL       NULL       8
2017-04-18 NULL       NULL       8
2017-04-19 Craig      sales      4
2017-04-20 Joe        sales      18
2017-04-21 Fred       sales      10
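Applied to the tables named in the question, the same idea might look roughly like the sketch below. It is untested against the real schema: the table and column names (cap_time, user, dept, recordDateTime, hours, and so on) are taken from the question, while the variables @days and @userId are introduced here for illustration. The calendar CTE is the driving row source, so every date yields a row even when cap_time has nothing for it.

```sql
-- Sketch only: names come from the question; not run against the real tables
DECLARE @days int = 10;      -- number of days to generate
DECLARE @userId int = 38;    -- user to report on

WITH alldays([day]) AS
(
    SELECT TOP (@days)
        CAST(DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY object_id), GETDATE()) AS date)
    FROM sys.all_objects
)
SELECT d.[day], d2.department, u.FullName AS UserName,
       ISNULL(c.[hours], 8) AS [hours]        -- default of 8 when no cap_time row matches
FROM alldays d
CROSS JOIN [user] u                            -- USER is a reserved word, hence the brackets
LEFT JOIN dept d2 ON d2.DeptID = u.deptID
LEFT JOIN cap_time c
       ON c.userID = u.userid
      AND CAST(c.recordDateTime AS date) = d.[day]
WHERE u.userid = @userId AND u.deptID = 1
ORDER BY d.[day];
```

The date match sits in the ON clause rather than in WHERE on purpose: filtering on cap_time columns in WHERE would discard the dates that have no recorded hours.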
Maybe use a subquery with an IN clause, like:
SELECT u.FullName as UserName, d2.department,
recordDateTime, ISNULL(hours,8) as hours
FROM cap_time c
left join user u on c.userID = u.userid
left join dept d2 on u.deptID = d2.DeptID
WHERE c.userid = 38 AND u.deptID = 1 and recordDateTime in
(SELECT top 10 DATEADD(DAY, ROW_NUMBER()
OVER (ORDER BY object_id), REPLACE(getDate(),'-','')) as Alldays
FROM sys.all_objects)
My data is like below:
ClassId ClassName StudentId Subject SubjectId
-----------------------------------------------------
1 ESL 12 English 20
1 ESL 13 Science 30
1 ESL 12 Social 40
1 ESL 12 Maths 50
Required output: parameters are Subject column values
ClassId ClassName TotalStudents SubjectIds
-----------------------------------------------
1 ESL 2 20, 40, 50, 30
When one student takes multiple subjects, that student should be counted only once. In the data above, student id 12 takes multiple subjects but is counted once, so the TotalStudents value is 2 (1 for student id 12 and 1 for student id 13).
I am not looking for how to display subjectIds column value in comma separated string.
Thanks in advance
Use COUNT(DISTINCT ...) for the student count, then STUFF with FOR XML PATH to combine the subject ids:
declare @temp table
(ClassId int,ClassName nvarchar(max),StudentId int,Subject nvarchar(max), SubjectId int)
insert into @temp values (1,'ESL',12,'English' , 20 )
insert into @temp values (1,'ESL',13,'Science' , 30 )
insert into @temp values (1,'ESL',12,'Social ' , 40 )
insert into @temp values (1,'ESL',12,'Maths ' , 50 )
select t.ClassId,t.ClassName,COUNT(DISTINCT t.StudentId) CNT,
STUFF( (SELECT ',' + CAST(t1.SubjectId AS NVARCHAR(10))
FROM @temp t1
WHERE t1.ClassId = t.ClassId -- correlate on the class being grouped
FOR XML PATH('')),
1, 1, '') SubjectIdS
from @temp t
GROUP BY t.ClassId,t.ClassName
DISTINCT can be applied inside aggregate functions.
SELECT COUNT(DISTINCT column_name) FROM table_name;
If you don't need to display the SubjectIds, then you need to use a GROUP BY clause to group the resultset by ClassId and ClassName.
SELECT ClassId, ClassName, COUNT(distinct StudentId) as TotalStudents
FROM MyTable
GROUP BY ClassId, ClassName
See this example at SqlFiddle
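If the comma-separated SubjectIds column is wanted as well, and the server is SQL Server 2017 or later (an assumption, since the question does not state the version), STRING_AGG can sit next to COUNT(DISTINCT ...) in the same GROUP BY. A minimal sketch, with the table variable @Classes standing in for the real table:

```sql
-- @Classes is a stand-in for the real table; rows copied from the question
DECLARE @Classes table
    (ClassId int, ClassName nvarchar(50), StudentId int, Subject nvarchar(50), SubjectId int);
INSERT INTO @Classes VALUES
    (1, N'ESL', 12, N'English', 20),
    (1, N'ESL', 13, N'Science', 30),
    (1, N'ESL', 12, N'Social',  40),
    (1, N'ESL', 12, N'Maths',   50);

SELECT ClassId,
       ClassName,
       COUNT(DISTINCT StudentId) AS TotalStudents,   -- each student counted once
       STRING_AGG(CAST(SubjectId AS varchar(10)), ', ')
           WITHIN GROUP (ORDER BY SubjectId) AS SubjectIds
FROM @Classes
GROUP BY ClassId, ClassName;
```

For the sample rows this gives TotalStudents = 2 and SubjectIds = '20, 30, 40, 50'. STRING_AGG does not remove duplicates, so if two students take the same subject its id would appear twice unless the rows are de-duplicated first.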
I'm trying to randomly select a few rows for each Id stored in a table, where an Id can have multiple rows in that table. It's difficult to explain in words, so let me show you with an example:
Example from the table :
Id Review
1 Text11
1 Text12
1 Text13
2 Text21
3 Text31
3 Text32
4 Text41
5 Text51
6 Text61
6 Text62
6 Text63
Result expected :
Id Review
1 Text11
1 Text13
2 Text21
3 Text32
4 Text41
5 Text51
6 Text62
In fact, the table contains thousands of rows. Some Ids have only one review, but others can have hundreds. I would like to select 10% of the reviews for each Id, and still select at least one row for every Id that has only 1-9 reviews (I saw that SELECT TOP 10 percent FROM table ORDER BY NEWID() includes a row even if it's alone).
I have read some Stack Overflow topics and I think I have to use a subquery, but I haven't found the correct solution.
Thanks in advance.
Regards.
Try this:
DECLARE @t table(Id int, Review char(6))
INSERT @t values
(1,'Text11'),
(1,'Text12'),
(1,'Text13'),
(2,'Text21'),
(3,'Text31'),
(3,'Text32'),
(4,'Text41'),
(5,'Text51'),
(6,'Text61'),
(6,'Text62'),
(6,'Text63')
;WITH CTE AS
(
SELECT
id, Review,
row_number() over (partition by id order by newid()) rn, -- random order within each id
count(*) over (partition by id) cnt                      -- number of reviews for the id
FROM @t
)
SELECT id, Review
FROM CTE
WHERE rn <= (cnt / 10) + 1
Result(random):
id Review
1 Text12
2 Text21
3 Text31
4 Text41
5 Text51
6 Text63
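The filter rn <= (cnt / 10) + 1 uses integer division, so an Id with exactly 10 reviews gets 2 rows and one with 20 gets 3. If the intent is "10% rounded up, but never fewer than one row" (an assumption about the requirement), a possible variant is to use CEILING instead. A minimal sketch on the same sample data:

```sql
DECLARE @t table (Id int, Review char(6));
INSERT @t VALUES
(1,'Text11'), (1,'Text12'), (1,'Text13'),
(2,'Text21'),
(3,'Text31'), (3,'Text32'),
(4,'Text41'),
(5,'Text51'),
(6,'Text61'), (6,'Text62'), (6,'Text63');

WITH CTE AS
(
    SELECT Id, Review,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY NEWID()) AS rn,  -- random order per Id
           COUNT(*)     OVER (PARTITION BY Id)                  AS cnt  -- reviews per Id
    FROM @t
)
SELECT Id, Review
FROM CTE
WHERE rn <= CEILING(cnt * 0.1);   -- 1 row for 1-10 reviews, 2 rows for 11-20, and so on
```

With the sample data every Id has fewer than 10 reviews, so both filters return one random row per Id; they only diverge at exact multiples of 10.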
I would like to ask the community the following:
I have a table Productivity with the following columns:
SerialNumber (Primary Key) - Identity Column
Processed_Time_Stamp - DateTime
Login_ID - nvarchar(255)
Order_Number - Float
Order_Location - Float
Status - nvarchar(255)
I am using this query:
SELECT
Login_ID, COUNT(Login_ID) as [Total Number]
FROM
Productivity
WHERE
Processed_Time_Stamp >= '2014-12-03 10:30:00.000'
AND Processed_Time_Stamp <= '2014-12-04 10:30:00.000'
GROUP BY
Login_ID
This gives me these results:
Login_ID Total Number
------------------------
Zohaib 10
XYZ 20
However, I want to break this up in the following format:
Login_ID 10 AM 11 AM
-------------------------
zohaib 5 5
XYZ 7 13
Thanks in advance
You can try this:
WITH DATA
AS
(
SELECT Login_ID, 1 AS N, DATEPART(HOUR, Processed_Time_Stamp) H
FROM Productivity
WHERE Processed_Time_Stamp >='2014-12-03 10:30:00.000' and Processed_Time_Stamp <= '2014-12-04 10:30:00.000'
)
SELECT Login_ID, [10], [11]
FROM DATA
PIVOT (SUM(N) FOR H IN ([10], [11])) AS P
Hope this helps.
Cheers.
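An alternative to PIVOT is conditional aggregation, which also lets the columns carry the "10 AM" / "11 AM" names from the question and yields 0 rather than NULL when a login has no rows in an hour. A minimal sketch: @Productivity and its rows are hypothetical stand-ins for the real Productivity table, reduced to the two columns used here.

```sql
-- Hypothetical sample rows; only the two columns the query needs
DECLARE @Productivity table (Login_ID nvarchar(255), Processed_Time_Stamp datetime);
INSERT INTO @Productivity VALUES
    (N'Zohaib', '2014-12-03T10:35:00'),
    (N'Zohaib', '2014-12-03T11:05:00'),
    (N'XYZ',    '2014-12-03T10:45:00');

SELECT Login_ID,
       -- each CASE contributes 1 only for rows that fall in the given hour
       SUM(CASE WHEN DATEPART(HOUR, Processed_Time_Stamp) = 10 THEN 1 ELSE 0 END) AS [10 AM],
       SUM(CASE WHEN DATEPART(HOUR, Processed_Time_Stamp) = 11 THEN 1 ELSE 0 END) AS [11 AM]
FROM @Productivity
WHERE Processed_Time_Stamp >= '2014-12-03T10:30:00'
  AND Processed_Time_Stamp <= '2014-12-04T10:30:00'
GROUP BY Login_ID;
```

Because the date range from the question spans 24 hours, the hour-10 column mixes rows from 10:30-10:59 on December 3 with rows from 10:00-10:30 on December 4; the same holds for the PIVOT version. Narrowing the WHERE range or grouping by date as well would separate them.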