How to count values from rows and display the result in columns - sql-server

please help me
Oracle | Status | other columns |
41 | A |
52 | W |
41 | A |
52 | W |
41 | W |
__________________
I need a resulting query that shows the count of Status in every Oracle like this:
Oracle | Total(A) | Total(W) |
41 | 2 | 1 |
52 | 0 | 2 |

Try this
with CTE AS
( select oracle,status from TableName)
select * from CTE
Pivot
(count(status) for status in ([A],[W]) ) as pvt

Try this:
select oracle, count(distinct status)
from your_table
group by oracle
You won't have the exact same result but you will have all the data you need to build that table.
You could also try some window functions.

There are at least 2 ways to get that data:
1.
SELECT COALESCE(t1.oracle, t2.oracle) As oracle, ISNULL(Total_A, 0) As [Total (A)], ISNULL(Total_W,0) As [Total (W)]
FROM
(
SELECT oracle, count(status) As Total_A
FROM TableName
WHERE status = 'A'
GROUP BY oracle
) t1 FULL OUTER JOIN
(
SELECT oracle, count(status) As Total_W
FROM TableName
WHERE status = 'W'
GROUP BY oracle
) t2 ON(t1.oracle = t2.oracle)
This will give you the table you asked for (the FULL OUTER JOIN keeps values such as 52 that have no rows for one of the statuses; an INNER JOIN would drop them). However, I would not recommend it, as it is a terrible mess.
2.
SELECT oracle, status, COUNT(status)
FROM TableName
GROUP BY oracle, status
This will give you a table that you can then arrange in code to look like the table you requested with very little effort. It is also much cleaner, and it will easily handle a new status if one is introduced to the system.
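A third option, not taken from the answers above, is conditional aggregation: one SUM(CASE ...) per status. The sketch below runs it against an in-memory SQLite database from Python purely so it can be executed anywhere; the SQL itself should carry over to SQL Server, but treat that as an assumption and test it there. The table name TableName comes from the answers above; the aliases total_a and total_w are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (oracle INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO TableName (oracle, status) VALUES (?, ?)",
    [(41, "A"), (52, "W"), (41, "A"), (52, "W"), (41, "W")],
)

# Conditional aggregation: each SUM counts only the rows whose status matches,
# so an oracle value with no 'A' rows still appears, with a count of 0.
pivot_sql = """
SELECT oracle,
       SUM(CASE WHEN status = 'A' THEN 1 ELSE 0 END) AS total_a,
       SUM(CASE WHEN status = 'W' THEN 1 ELSE 0 END) AS total_w
FROM TableName
GROUP BY oracle
ORDER BY oracle
"""
totals = conn.execute(pivot_sql).fetchall()
for row in totals:
    print(row)
```

Adding a new status means adding one more SUM(CASE ...) line, which is the main cost of this approach compared with grouping by oracle and status.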

Related

Optimize SQL query Select on Select Case

I was looking at some threads here that mention query optimization, but I couldn't resolve my problem.
I need to perform a query in SQL Server that involves using a select case in my primary select. This is the description of the main table:
WS:
| Oid | model_code | product_code | year |
In my query, I need to select all of these columns plus an extra column that indicates whether, by some criteria, the values from my main table exist in my other table. Let me describe the other table first and then explain what I mean by this.
TA:
| Oid | model_code | product_code | year |
Both tables have matching columns, so for example, if on my table WS I have this result:
| Oid | model_code | product_code | year |
| 1 | 13 | 123 | 2018 |
And on my TA table I have this:
| Oid | model_code | product_code | year |
| 1 | 25 | 134 | 2016 |
| 2 | 13 | 123 | 2018 |
| 3 | 67 | 582 | 2017 |
I need to print an "Exist" result on that row because the row in my main table matches exactly on these 3 column values.
So my query on that row should print something like this:
| model_code | product_code | year | Exist |
| 13 | 123 | 2018 | Yes |
The query I was trying to use to make this happen, was this:
SELECT
WS.Oid, WS.model_code, WS.product_code, Ws.year,
(SELECT
CASE
WHEN EXISTS (SELECT 1 FROM TA
WHERE TA.model_code = Ws.model_code
AND TA.product_code = Ws.product_code
AND TA.[Year] = Ws.[Year])
THEN 'Yes'
ELSE 'No'
END) as 'Exist'
FROM
Ws
And it works. The problem is that my real tables have more columns and more rows (about 960,000). For example, a query over around 50,000 elements (using this query) takes more than a minute, while the same query with the same elements but without the select case takes about 2 seconds, so the difference is immense.
I'm sure that a more viable way to achieve this exists, in less time, but I don't know how. Any recommendations?
Unless already there, an index on ta (model_code, product_code, year) might help.
CREATE INDEX ta_model_code_product_code_year
ON ta (model_code,
product_code,
year);
Though chances are that the optimizer already rewrites your query in such a way, another thing you could try is to (explicitly) rewrite the query using a left join. I assume oid is NOT NULL in ta.
SELECT ws.oid,
ws.model_code,
ws.product_code,
ws.year,
CASE
WHEN ta.oid IS NULL THEN
'No'
ELSE
'Yes'
END exist
FROM ws
LEFT JOIN ta
ON ta.model_code = ws.model_code
AND ta.product_code = ws.product_code
AND ta.year = ws.year;
With that you want the index from above and maybe try one on ws (model_code, product_code, year) too.
CREATE INDEX ws_model_code_product_code_year
ON ws (model_code,
product_code,
year);
You might also want to play with the order of the columns in the indexes. If for a column more distinct values exist in ta, put it before a column where fewer distinct values exist in ta. But keep the order in both indexes identical, i.e. if you shift a column in the index on ta also move it in the index on ws the same way.
What you want to do is join the two tables together, instead of looking for a matching record for each record. Try something like this:
SELECT
WS.model_code, WS.product_code, Ws.year,
CASE
WHEN TA.OID IS NOT NULL THEN 'Yes'
ELSE 'No'
END As 'Exist'
FROM WS LEFT OUTER JOIN TA ON
TA.model_code = Ws.model_code
AND TA.product_code = Ws.product_code
AND TA.[Year] = Ws.[Year]
That will print all of the records from the WS table, and if there's a matching record in the TA table, the 'Exist' column will say 'Yes', otherwise it will say 'No'.
This uses one query to do everything. Your original approach would do a completely separate sub-query to check the TA table, and that is creating your performance issue.
You may also want to look at putting indexes on these 3 fields in each table to make the matching go even faster.
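One caveat when swapping EXISTS for a LEFT JOIN: if TA can contain more than one row with the same (model_code, product_code, year), the join returns the WS row once per match, while EXISTS returns it once. The sketch below shows this on an in-memory SQLite database driven from Python. The duplicate TA row (Oid 4) and the second WS row are hypothetical, added only to make the effect visible; whether your real TA has such duplicates is something to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WS (Oid INTEGER, model_code INTEGER, product_code INTEGER, year INTEGER);
CREATE TABLE TA (Oid INTEGER, model_code INTEGER, product_code INTEGER, year INTEGER);
INSERT INTO WS VALUES (1, 13, 123, 2018), (2, 99, 999, 2015);
-- Hypothetical data: TA rows 2 and 4 both match WS row 1.
INSERT INTO TA VALUES (1, 25, 134, 2016), (2, 13, 123, 2018),
                      (3, 67, 582, 2017), (4, 13, 123, 2018);
""")

# Original style: one Yes/No per WS row, no matter how many TA rows match.
exists_rows = conn.execute("""
SELECT WS.Oid,
       CASE WHEN EXISTS (SELECT 1 FROM TA
                         WHERE TA.model_code = WS.model_code
                           AND TA.product_code = WS.product_code
                           AND TA.year = WS.year)
            THEN 'Yes' ELSE 'No' END AS exist
FROM WS
ORDER BY WS.Oid
""").fetchall()

# LEFT JOIN style: a WS row is repeated once per matching TA row.
join_rows = conn.execute("""
SELECT WS.Oid,
       CASE WHEN TA.Oid IS NULL THEN 'No' ELSE 'Yes' END AS exist
FROM WS
LEFT JOIN TA
       ON TA.model_code = WS.model_code
      AND TA.product_code = WS.product_code
      AND TA.year = WS.year
ORDER BY WS.Oid
""").fetchall()

print(exists_rows)
print(join_rows)
```

If the three columns are unique in TA, both forms return the same rows and the choice comes down to which one the optimizer handles better.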

UPDATE JOIN statement for DB2

I am using DB2, and am a beginner in SQL. I have two tables here:
Table1:
ID | PageID
------------
1 | 101
2 | 102
3 | 103
4 | 104
Table2:
ID | SRCID | PageID
--------------------
1 | 2 | 179
2 | 3 | 103
3 | 3 | 109
Table2 and Table1 have different numbers of records. Table2.SRCID corresponds to Table1.ID.
I would like to update the PageID in Table2 to follow what is stated in PageID of Table1, based on the SRCID.
My end result of Table2 should be:
ID | SRCID | PageID
--------------------
1 | 2 | 102
2 | 3 | 103
3 | 3 | 103
How do I do this in SQL for DB2?
I tried:
UPDATE table2
SET PageID = (SELECT t1.PageID from table1 as t1 join table2 as t2
WHERE t2.SCRID = t1.ID);
But the above doesn't work as I get:
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0811N The result of a scalar fullselect, SELECT INTO statement, or VALUES
INTO statement is more than one row. SQLSTATE=21000
The problem here is there is no unique column for me to join such that each column gets a unique result..or so it seems to me. Please help? :(
Try this:
UPDATE table2
SET table2.PageID =
(SELECT t1.PageID
FROM table1 t1
WHERE t1.id = table2.SCRID)
WHERE EXISTS(
SELECT 'TABLE1PAGE'
FROM table1 t1
WHERE t1.id = table2.SCRID)
I've added the EXISTS clause to prevent NULL from being assigned to PageID for rows of table2 that have no match in table1.
As a SQL Server loyalist, I've been struggling with DB2's seeming inability to update a table with information from another table--the update with join that's so easy in SSMS.
I finally discovered a workaround that functions perfectly instead: the MERGE statement. I usually find IBM's support documents impenetrable, or at least not friendly reading, but the explanation at their MERGE website was actually quite clear: https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/sqlp/rbafymerge.htm
Hope this helps you as much as it did me.
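To see what the correlated-subquery UPDATE with the EXISTS guard does, here is a small sketch that runs the same shape of statement on an in-memory SQLite database from Python. It is not DB2, so treat it only as an illustration of the logic. The column is spelled SRCID as in the table listing, and row 4 of Table2 is a hypothetical extra row with no match in Table1, added to show that the guard leaves it alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, PageID INTEGER);
CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, SRCID INTEGER, PageID INTEGER);
INSERT INTO Table1 VALUES (1, 101), (2, 102), (3, 103), (4, 104);
-- Row 4 is hypothetical: its SRCID has no match in Table1.
INSERT INTO Table2 VALUES (1, 2, 179), (2, 3, 103), (3, 3, 109), (4, 9, 555);
""")

# The subquery is correlated on the row being updated, so it returns at most
# one PageID per row (Table1.ID is unique). The EXISTS guard skips rows that
# have no match instead of setting their PageID to NULL.
conn.execute("""
UPDATE Table2
SET PageID = (SELECT t1.PageID FROM Table1 t1 WHERE t1.ID = Table2.SRCID)
WHERE EXISTS (SELECT 1 FROM Table1 t1 WHERE t1.ID = Table2.SRCID)
""")

updated = conn.execute(
    "SELECT ID, SRCID, PageID FROM Table2 ORDER BY ID"
).fetchall()
for row in updated:
    print(row)
```

The original attempt failed because its subquery was not correlated with the row being updated, so it returned every joined row at once.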

SQL Server: how to create sequence number column

I have a Sales table with the following data:
| SalesId | CustomerId | Amount |
|---------|------------|--------|
| 1 | 1 | 100 |
| 2 | 2 | 75 |
| 3 | 1 | 30 |
| 4 | 3 | 49 |
| 5 | 1 | 93 |
I would like to insert a column into this table that tells us the number of times the customer has made a purchase. So it'll be like:
| SalesId | CustomerId | Amount | SalesNum |
|---------|------------|--------|----------|
| 1 | 1 | 100 | 1 |
| 2 | 2 | 75 | 1 |
| 3 | 1 | 30 | 2 |
| 4 | 3 | 49 | 1 |
| 5 | 1 | 93 | 3 |
So I can see that in salesId = 5, that is the 3rd transaction for customerId = 1. How can I write such a query to insert / update such column? I am on MS SQL but I am also interested in the MYSQL solution should I need to do this there in the future.
Thank you.
ps. Apology for the table formatting. Couldn't figure out how to format it nicely.
You need ROW_NUMBER() to assign a sequence number. I'd strongly advise against storing this value, though, since you will need to recalculate it with every update. Instead, you may be best off creating a view if you need it regularly:
CREATE VIEW dbo.SalesWithRank
AS
SELECT SalesID,
CustomerID,
Amount,
SalesNum = ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY SalesID)
FROM Sales;
GO
SQL Server Example on SQL Fiddle
ROW_NUMBER() will not assign duplicates in the same group. For example, if you were numbering the rows based on Amount and you had two sales for the same customer that are both 100, they would not get the same SalesNum; in the absence of any other ordering criteria in your ROW_NUMBER() function, their relative order is arbitrary. If you want sales with the same amount to have the same SalesNum, then you need to use either RANK or DENSE_RANK. DENSE_RANK will have no gaps in the sequence, e.g. 1, 1, 2, 2, 3, whereas RANK will start at the corresponding position, e.g. 1, 1, 3, 3, 5.
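The difference between the three functions is easiest to see side by side. The sketch below uses an in-memory SQLite database from Python (SQLite 3.25 or later is needed for window functions); the five rows are hypothetical, chosen so that one customer has tied amounts. The same OVER clauses should work in SQL Server, though that is an assumption to verify there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute(
    "CREATE TABLE Sales (SalesId INTEGER, CustomerId INTEGER, Amount INTEGER)"
)
# Hypothetical data: a single customer with two pairs of tied amounts.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 100), (3, 1, 30), (4, 1, 30), (5, 1, 10)],
)

# ROW_NUMBER gets SalesId as a tie-breaker so its output is deterministic;
# RANK and DENSE_RANK give tied amounts the same value.
ranking_sql = """
SELECT SalesId,
       ROW_NUMBER() OVER (PARTITION BY CustomerId
                          ORDER BY Amount DESC, SalesId) AS row_num,
       RANK()       OVER (PARTITION BY CustomerId ORDER BY Amount DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY CustomerId ORDER BY Amount DESC) AS dense_rnk
FROM Sales
ORDER BY SalesId
"""
ranked = conn.execute(ranking_sql).fetchall()
for row in ranked:
    print(row)
```

Reading down the columns gives 1, 2, 3, 4, 5 for ROW_NUMBER, 1, 1, 3, 3, 5 for RANK and 1, 1, 2, 2, 3 for DENSE_RANK.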
If you must do this as an update then you can use:
WITH CTE AS
( SELECT SalesID,
CustomerID,
Amount,
SalesNum,
NewSalesNum = ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY SalesID)
FROM Sales
)
UPDATE CTE
SET SalesNum = NewSalesNum;
SQL Server Update Example on SQL Fiddle
MySQL does not have ranking functions (before version 8.0, which added window functions such as ROW_NUMBER()), so you need to use user variables to achieve a rank by keeping track of the value from the previous row. This is not allowed in views, so you would just need to repeat this logic wherever you needed the row number:
SELECT s.SalesID,
s.Amount,
@r:= CASE WHEN @c = s.CustomerID THEN @r + 1 ELSE 1 END AS SalesNum,
@c:= CustomerID AS CustomerID
FROM Sales AS s
CROSS JOIN (SELECT @c:= 0, @r:= 0) AS var
ORDER BY s.CustomerID, s.SalesID;
The order by is critical here, which means in order to order the results without affecting the ranking you need to use a subquery:
SELECT SalesID,
Amount,
CustomerID,
SalesNum
FROM ( SELECT s.SalesID,
s.Amount,
@r:= CASE WHEN @c = s.CustomerID THEN @r + 1 ELSE 1 END AS SalesNum,
@c:= CustomerID AS CustomerID
FROM Sales AS s
CROSS JOIN (SELECT @c:= 0, @r:= 0) AS var
ORDER BY s.CustomerID, s.SalesID
) AS s
ORDER BY s.SalesID;
MySQL Example on SQL Fiddle
Again, I would recommend against storing the value, but if you must in MySQL you would use:
UPDATE Sales
INNER JOIN
( SELECT s.SalesID,
@r:= CASE WHEN @c = s.CustomerID THEN @r + 1 ELSE 1 END AS NewSalesNum,
@c:= CustomerID AS CustomerID
FROM Sales AS s
CROSS JOIN (SELECT @c:= 0, @r:= 0) AS var
ORDER BY s.CustomerID, s.SalesID
) AS s2
ON Sales.SalesID = s2.SalesID
SET SalesNum = s2.NewSalesNum;
MySQL Update Example on SQL Fiddle
Using Subquery,
Select *, (Select count(customerid)
from ##tmp t
where t.salesid <= s.salesid
and t.customerid = s.customerid)
from ##tmp s
Try this -
SELECT SalesId, CustomerId, Amount,
SalesNum = ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY SalesId)
FROM YOURTABLE
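The correlated-subquery answer and the ROW_NUMBER() answers should produce the same SalesNum as long as SalesId is unique. A quick way to convince yourself is to run both on the sample data; the sketch below does that with an in-memory SQLite database from Python (SQLite 3.25+ for the window function). The table is named Sales here as in the question; that and the use of COUNT(*) are choices made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute(
    "CREATE TABLE Sales (SalesId INTEGER, CustomerId INTEGER, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 2, 75), (3, 1, 30), (4, 3, 49), (5, 1, 93)],
)

# Window-function version: number each customer's sales in SalesId order.
window_rows = conn.execute("""
SELECT SalesId, CustomerId, Amount,
       ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY SalesId) AS SalesNum
FROM Sales
ORDER BY SalesId
""").fetchall()

# Correlated-subquery version: count this customer's sales up to this one.
subquery_rows = conn.execute("""
SELECT s.SalesId, s.CustomerId, s.Amount,
       (SELECT COUNT(*)
        FROM Sales t
        WHERE t.CustomerId = s.CustomerId
          AND t.SalesId <= s.SalesId) AS SalesNum
FROM Sales s
ORDER BY s.SalesId
""").fetchall()

for row in window_rows:
    print(row)
```

The subquery form rescans the customer's rows for every sale, so on large tables the window function is usually the better choice where it is available.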

SQL Server making rows into columns

I'm trying to take three tables that I have and show the data in a way the user asked me to do it. The tables look like this. (I should add that I am using MS SQL Server)
First Table: The ID is varchar, since it's an ID they use to identify assets and they use numbers as well as letters.
aID| status | group |
-----------------------
1 | acti | group1 |
2 | inac | group2 |
A3 | acti | group1 |
Second Table: This table is fixed. It has around 20 values and the IDs are all numbers
atID| traitname |
------------------
1 | trait1 |
2 | trait2 |
3 | trait3 |
Third Table: This table is used to identify the traits the assets in the first table have. The fields that have the same name as fields in the above tables are obviously linked.
tID| aID | atID | trait |
----------------------------------
1 | 1 | 1 | NAME |
2 | 1 | 2 | INFO |
3 | 2 | 3 | GOES |
4 | 2 | 1 | HERE |
5 | A3 | 2 | HAHA |
Now, the user wants the program to output the data in the following format:
aID| status | group | trait1 | trait2 | trait 3
-------------------------------------------------
1 | acti | group1 | NAME | INFO | NULL
2 | inac | group2 | HERE | NULL | GOES
A3 | acti | group1 | NULL | HAHA | NULL
I understand that to achieve this, I have to use the Pivot command in SQL. However, I've read and tried to understand it but I just can't seem to get it. Especially the part where it asks for a MAX value. I don't get why I need that MAX.
Also, the examples I've seen are for one table. I'm not sure if I can do it with three tables. I do have a query that joins all three of them with the information I need. However, I don't know how to proceed from there. Please, any help with this will be appreciated. Thank you.
There are several ways that you can get the result, including using the PIVOT function.
You can use an aggregate function with a CASE expression:
select t1.aid, t1.status, t1.[group],
max(case when t2.traitname = 'trait1' then t3.trait end) trait1,
max(case when t2.traitname = 'trait2' then t3.trait end) trait2,
max(case when t2.traitname = 'trait3' then t3.trait end) trait3
from table1 t1
inner join table3 t3
on t1.aid = t3.aid
inner join table2 t2
on t3.atid = t2.atid
group by t1.aid, t1.status, t1.[group];
See SQL Fiddle with Demo
The PIVOT function requires an aggregate function; this is why you would need to use either the MIN or MAX function (since you have a string value).
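To see why an aggregate is needed at all, it helps to look at the rows before they are grouped: each trait of an asset sits on its own row, with NULL in the other trait columns, and MAX simply picks the one non-NULL value per column when the rows are collapsed. The sketch below shows both stages using the CASE version of the query on an in-memory SQLite database from Python. The table names table1, table2 and table3 follow the answer above; SQLite is only a stand-in for SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (aID TEXT, status TEXT, "group" TEXT);
CREATE TABLE table2 (atID INTEGER, traitname TEXT);
CREATE TABLE table3 (tID INTEGER, aID TEXT, atID INTEGER, trait TEXT);
INSERT INTO table1 VALUES ('1','acti','group1'), ('2','inac','group2'),
                          ('A3','acti','group1');
INSERT INTO table2 VALUES (1,'trait1'), (2,'trait2'), (3,'trait3');
INSERT INTO table3 VALUES (1,'1',1,'NAME'), (2,'1',2,'INFO'), (3,'2',3,'GOES'),
                          (4,'2',1,'HERE'), (5,'A3',2,'HAHA');
""")

# Stage 1: no aggregate. Asset '1' has two traits, so it yields two rows,
# each with a value in only one of the trait columns.
ungrouped = conn.execute("""
SELECT t1.aID,
       CASE WHEN t2.traitname = 'trait1' THEN t3.trait END,
       CASE WHEN t2.traitname = 'trait2' THEN t3.trait END,
       CASE WHEN t2.traitname = 'trait3' THEN t3.trait END
FROM table1 t1
JOIN table3 t3 ON t1.aID = t3.aID
JOIN table2 t2 ON t3.atID = t2.atID
WHERE t1.aID = '1'
ORDER BY t3.tID
""").fetchall()

# Stage 2: GROUP BY collapses those rows; MAX ignores the NULLs and keeps
# the single non-NULL value in each trait column.
grouped = conn.execute("""
SELECT t1.aID, t1.status, t1."group",
       MAX(CASE WHEN t2.traitname = 'trait1' THEN t3.trait END) AS trait1,
       MAX(CASE WHEN t2.traitname = 'trait2' THEN t3.trait END) AS trait2,
       MAX(CASE WHEN t2.traitname = 'trait3' THEN t3.trait END) AS trait3
FROM table1 t1
JOIN table3 t3 ON t1.aID = t3.aID
JOIN table2 t2 ON t3.atID = t2.atID
GROUP BY t1.aID, t1.status, t1."group"
ORDER BY t1.aID
""").fetchall()

print(ungrouped)
for row in grouped:
    print(row)
```

MIN would work equally well here, because each asset has at most one value per trait.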
If you have a limited number of traitnames then you could hard-code the query:
select aid, status, [group],
trait1, trait2, trait3
from
(
select t1.aid,
t1.status,
t1.[group],
t2.traitname,
t3.trait
from table1 t1
inner join table3 t3
on t1.aid = t3.aid
inner join table2 t2
on t3.atid = t2.atid
) d
pivot
(
max(trait)
for traitname in (trait1, trait2, trait3)
) piv;
See SQL Fiddle with Demo.
If you have an unknown number of values, then you will want to look at using dynamic SQL to get the final result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(traitname)
from Table2
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT aid, status, [group],' + @cols + '
from
(
select t1.aid,
t1.status,
t1.[group],
t2.traitname,
t3.trait
from table1 t1
inner join table3 t3
on t1.aid = t3.aid
inner join table2 t2
on t3.atid = t2.atid
) x
pivot
(
max(trait)
for traitname in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo

Efficient Date Comparisons in SQL

I hope this question provides all of the necessary information, but please do request more if anything is unclear. This is my first question on stack overflow so please bear with me.
I am running this query on SQL Server 2005.
I have a large derived dataset (I'll provide a small subset later) which has 4 fields:
ID,
Year,
StartDate,
EndDate
Within this data set the ID may (correctly) appear multiple times with different date combinations.
The question I have is: what ways are there to identify whether a record is 'new', i.e. its start date does not fall between the start and end date of any other record for the same ID?
For an example, take the data set below (I hope this table comes out correctly!):
+----+------+------------+------------+
| ID | Year | Start Date | End Date |
+----+------+------------+------------+
| 1 | 2007 | 01/01/2007 | 10/10/2007 |
| 1 | 2007 | 01/01/2007 | 05/04/2007 |
| 1 | 2007 | 05/04/2007 | 08/10/2007 |
| 1 | 2007 | 15/10/2007 | 20/10/2007 |
| 1 | 2007 | 25/10/2007 | 01/01/2008 |
| 2 | 2007 | 01/01/2007 | 01/01/2008 |
| 2 | 2008 | 01/01/2008 | 15/07/2008 |
| 2 | 2008 | 10/06/2008 | 01/01/2009 |
+----+------+------------+------------+
If we say nothing existed before 2007 then Row 1 and Row 6 are 'new' at that time.
Rows 2,3,7 and 8 are not 'new' as they either join the end of a previous record or overlap it to form a continuous date period (take rows 6 and 7 there are no 'breaks' between 01/01/2008 and 01/01/2009)
Rows 4 and 5 would each be considered a new record, as neither attaches directly to the end of the previous period for ID 1 nor overlaps any of the other periods.
Currently to get this data set I have to put all of my data into temporary tables and then join them together on various fields to remove the records I don't want.
Firstly I remove rows where the startdate equals the enddate of another row for that ID (This would get rid of rows 3 and 7)
Then I remove rows where the start date is between the startdate and enddate of other records for that ID (this would remove rows 2 and 8)
That would leave me with Rows 1,4,5 and 6 as the 'new' records, which is correct.
Is there a more efficient way to do this, such as in some sort of loop, CTE or (cough) cursor?
As per the above, if there is anything unclear don't hesitate to ask and I will try and provide you with the information you request.
Try
;with cte as
(
Select *, row_number() over (partition by id order by startdate) rn from yourtable
)
select distinct t1.*
from cte t1
left join cte t2
on t1.ID = t2.ID
and t1.EndDate>=t2.StartDate and t1.StartDate<=t2.EndDate
and t1.rn<>t2.rn
where t2.ID is null
or t1.rn=1
This should work if you have a unique identifier for each row:
select * from
tbl t3
left outer join
(
select distinct t1.id as id_inside, t1.recno as recno_inside
from
tbl t1 inner join
tbl t2 on
t1.id = t2.id and
(t1.startdate <> t2.startdate or t1.enddate <> t2.enddate) and
(t1.startdate >= t2.startdate and t1.enddate <= t2.enddate)
) t4 on
t3.id = t4.id_inside and
t3.recno = t4.recno_inside
where
id_inside is null and
recno_inside is null
sqlfiddle
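Another way to express the rule from the question is a single NOT EXISTS: a row is 'new' when no earlier row for the same ID is still running at its start date. The sketch below tries that on the sample data with an in-memory SQLite database from Python. It relies on two assumptions not in the question: a unique RowNo column (hypothetical, used to break the tie between rows 1 and 2, which share a start date) and dates stored as ISO strings so that plain comparison works in SQLite. The table name Periods is also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Periods (RowNo INTEGER, ID INTEGER, StartDate TEXT, EndDate TEXT)"
)
# Sample data from the question, converted from dd/mm/yyyy to ISO dates.
conn.executemany("INSERT INTO Periods VALUES (?, ?, ?, ?)", [
    (1, 1, "2007-01-01", "2007-10-10"),
    (2, 1, "2007-01-01", "2007-04-05"),
    (3, 1, "2007-04-05", "2007-10-08"),
    (4, 1, "2007-10-15", "2007-10-20"),
    (5, 1, "2007-10-25", "2008-01-01"),
    (6, 2, "2007-01-01", "2008-01-01"),
    (7, 2, "2008-01-01", "2008-07-15"),
    (8, 2, "2008-06-10", "2009-01-01"),
])

# A row is 'new' if no earlier row for the same ID (earlier start date, or
# same start date and lower RowNo) ends on or after this row's start date.
new_sql = """
SELECT p.RowNo
FROM Periods p
WHERE NOT EXISTS (
    SELECT 1
    FROM Periods q
    WHERE q.ID = p.ID
      AND (q.StartDate < p.StartDate
           OR (q.StartDate = p.StartDate AND q.RowNo < p.RowNo))
      AND q.EndDate >= p.StartDate
)
ORDER BY p.RowNo
"""
new_rows = [r[0] for r in conn.execute(new_sql)]
print(new_rows)
```

On the sample data this keeps rows 1, 4, 5 and 6. Whether it is faster than the temporary-table approach on the real data set depends on indexing, so it is worth comparing execution plans before switching.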

Resources