Using PIVOT with SQL Server without Aggregate function - sql-server

I'm stuck on using PIVOT in a simple example (which I give in entirety below). Full disclosure, I got this from https://www.hackerrank.com/. I picked it precisely because I want to get more familiar with PIVOT and this looked like a simple example! I've looked at numerous posts on the subject, and have been using this to crib off: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/b76a4668-d0c3-4c51-8d86-117d5c181e69/pivot-without-aggregate-function?forum=transactsql but don't seem to be able to get things quite right. Here is the table:
TABLE OCCUPATIONS
Name Occupation
Samantha Doctor
Julia Actor
Maria Actor
Meera Singer
Ashley Professor
Ketty Professor
Christeen Professor
Jane Actor
Jenny Doctor
Priya Singer
The task is to have the output with columns Doctor, Professor, Singer or Actor (in that order). If you run out of data for one or more columns, put NULL. Here is the expected output (copied directly from the site).
Jenny Ashley Meera Jane
Samantha Christeen Priya Julia
NULL Ketty NULL Maria
As an aside, it appears they want the results without column headers (I'm not sure!).
Here is the latest iteration of what I have tried:
SELECT [Doctor], [Professor],[Singer], [Actor]
FROM
(SELECT [Name], [Occupation] from OCCUPATIONS) as pvtsource
PIVOT
( MAX([Name]) FOR [Occupation] IN ([Doctor], [Professor],[Singer], [Actor]) ) AS p
and it yields:
Doctor Professor Singer Actor
Samantha Ketty Priya Maria
I'm not surprised by this incorrect result. After all, I did say in my query MAX. I assume it's just picking the MAX name for each profession based on the alphabetical sort. Maria is a "bigger" actor than Julia or Jane for example if you based it on the alphabet. But when I remove the MAX, I get an error ("Incorrect syntax..."). How does one do this?
Thanks!
Bonus questions
1. Good, gentle articles on PIVOT? I clearly haven't gotten it through my thick head. Eventually, I do want to be able to do more complicated pivots where I SUM or take MAX.
2. How to display results without column headers?
3. I'd also be interested in how to do this without PIVOT if there is a simple way.

You need to "feed" the pivot an X-axis, a Y-axis, and a value. Here we create the row key (the Y-axis) via dense_rank().
Example
Create Table #YourTable ([Name] varchar(50),[Occupation] varchar(50))
Insert Into #YourTable Values
('Samantha','Doctor')
,('Julia','Actor')
,('Maria','Actor')
,('Meera','Singer')
,('Ashley','Professor')
,('Ketty','Professor')
,('Christeen','Professor')
,('Jane','Actor')
,('Jenny','Doctor')
,('Priya','Singer')
Select *
from (Select *
,RN = dense_rank() over (partition by occupation order by name)
From #YourTable
) src
Pivot (max(Name) for Occupation in ([Doctor], [Professor],[Singer], [Actor]) ) pvt
Returns
RN Doctor Professor Singer Actor
1 Jenny Ashley Meera Jane
2 Samantha Christeen Priya Julia
3 NULL Ketty NULL Maria
NOTE:
If you don't want RN in your results, rather than the top SELECT *, you can specify the desired columns
SELECT [Doctor], [Professor],[Singer], [Actor]
From (...) src
Pivot (...) pvt
EDIT - Commentary
If you run the inner query
Select *
,RN = dense_rank() over (partition by occupation order by name)
From #YourTable
Order By RN
You'll get
Name Occupation RN
Jane Actor 1
Jenny Doctor 1
Ashley Professor 1
Meera Singer 1
Priya Singer 2
Christeen Professor 2
Samantha Doctor 2
Julia Actor 2
Maria Actor 3
Ketty Professor 3
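RN becomes the Y-axis, Occupation becomes the X-axis, and Name is the value.
PIVOT always aggregates: it implicitly groups by every source column that is neither the pivoted column nor the value column. Here that leaves only RN, so we get one output row per RN, and since each RN/Occupation pair holds a single name in this data, max(Name) simply returns that name.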
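For bonus question 3, the same result can be produced without PIVOT by conditional aggregation. Below is a minimal sketch, not tested on SQL Server: it runs the query through Python's built-in sqlite3 module, an assumption made purely so the example is self-contained (it needs a SQLite build with window functions, 3.25 or later). The SELECT uses only standard SQL and should carry over to T-SQL, but treat that as an expectation rather than a guarantee. ROW_NUMBER is used here in place of dense_rank so that two identical names in one occupation would still get separate row keys.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OCCUPATIONS (Name TEXT, Occupation TEXT)")
conn.executemany(
    "INSERT INTO OCCUPATIONS VALUES (?, ?)",
    [
        ("Samantha", "Doctor"), ("Julia", "Actor"), ("Maria", "Actor"),
        ("Meera", "Singer"), ("Ashley", "Professor"), ("Ketty", "Professor"),
        ("Christeen", "Professor"), ("Jane", "Actor"), ("Jenny", "Doctor"),
        ("Priya", "Singer"),
    ],
)

# RN numbers the names inside each occupation. Grouping by RN yields one
# output row per rank, and each CASE picks the name for one output column.
query = """
SELECT MAX(CASE WHEN Occupation = 'Doctor'    THEN Name END) AS Doctor,
       MAX(CASE WHEN Occupation = 'Professor' THEN Name END) AS Professor,
       MAX(CASE WHEN Occupation = 'Singer'    THEN Name END) AS Singer,
       MAX(CASE WHEN Occupation = 'Actor'     THEN Name END) AS Actor
FROM (SELECT Name, Occupation,
             ROW_NUMBER() OVER (PARTITION BY Occupation ORDER BY Name) AS RN
      FROM OCCUPATIONS) AS src
GROUP BY RN
ORDER BY RN
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The three rows match the expected output in the question, with Python's None standing in for NULL. RN is used for grouping and ordering only and is never selected, so it does not appear in the result.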

Related

SQL - aggregate related accounts - manually set up ID

I have two tables:
Account & Amount column
list of related accounts
Data samples:
Account | Amount
--------+---------
001 | $100
002 | $150
003 | $200
004 | $300
Account | Related Account
--------+------------------
001 | 002
002 | 003
003 | 002
My goal is to be able to aggregate all related accounts. From table two - 001,002 & 003 are actually all related to each other. What I would like to be able to do is to get a sum of all related accounts. Possibly ID 001 to 003 as Account #1, so I can aggregate them.
Result below
ID | Account | Amount
-----+-----------+--------
#1 | 001 | $100
#1 | 002 | $150
#1 | 003 | $200
#2 | 004 | $300
I can then manipulate the above table as below (final result)
ID | Amount
-----+--------
#1 | $450
#2 | $300
I tried doing a join, but it doesn't quite achieve what I want. I still have a problem relating account 001 with 003 (they are indirectly related because 002 is related to both 001 and 003).
If anyone can point me in the right direction, it will be much appreciated.
Well, you really made this harder than it should be.
If you could change the data in the second table, so it will not contain reversed duplicates (in your sample data - 2,3 and 3,2) it would simplify the solution.
If you could refactor both tables into a single table, where the related column is a self referencing nullable foreign key, it would simplify the solution even more.
Let's assume for a minute you can't do either, and you have to work with the data as provided. So the first thing you want to do is to ignore the reversed duplicates in the second table. This can be done using a common table expression and a couple of case expressions.
First, create and populate sample tables (Please save us this step in your future questions):
CREATE TABLE #TAccount
(
Account int,
Amount int
)
INSERT INTO #TAccount (Account, Amount) VALUES
(1, 100),
(2, 150),
(3, 200),
(4, 300)
CREATE TABLE #TRelatedAccounts
(
Account int,
Related int
)
INSERT INTO #TRelatedAccounts (Account, Related) VALUES
(1,2),
(2,3),
(3,2)
You want to get only the first two records from the #TRelatedAccounts table.
This is the AccountAndRelated CTE.
Now, you want to left join the #TAccount table with the results of this query, so for each Account we will have the Account, the Amount, and the Related Account or NULL, if the account is not related to any other account or it's the first on the relationship chain.
This is the CTERecursiveBase CTE.
Then, based on that you can create a recursive CTE (called CTERecursive), and finally select the sum of amount from the recursive CTE based on the root of the recursion.
Here is the entire script:
;WITH AccountAndRelated AS
(
SELECT DISTINCT CASE WHEN Account > Related THEN Account Else Related END As Account,
CASE WHEN Account > Related THEN Related Else Account END As Related
FROM #TRelatedAccounts
)
, CTERecursiveBase AS
(
SELECT A.Account, Related, Amount
FROM #TAccount As A
LEFT JOIN AccountAndRelated As R ON A.Account = R.Account
)
, CTERecursive AS
(
SELECT Account As Id, Account, Related, Amount
FROM CTERecursiveBase
WHERE Related IS NULL
UNION ALL
SELECT Id, B.Account, B.Related, B.Amount
FROM CTERecursiveBase AS B
JOIN CTERecursive AS R ON B.Related = R.Account
)
SELECT Id, SUM(Amount) As TotalAmount
FROM CTERecursive
GROUP BY Id
Results:
Id TotalAmount
1 450
4 300
You can see a live demo on rextester.
Now, let's assume you can modify the data in the second table. You can use the AccountAndRelated CTE to find the records you need to keep in the #TRelatedAccounts table. Once only those records remain, you can skip the AccountAndRelated CTE and use #TRelatedAccounts directly in the CTERecursiveBase CTE.
You can see a live demo of that as well.
Finally, let's assume you can refactor your database. In that case, I would recommend joining the two tables together - so your #TAccount table would look like this:
Account Amount Related
1 100 NULL
2 150 1
3 200 2
4 300 NULL
Then you only need the recursive cte.
Here is a live demo of that option as well.
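The single-table option can be sketched as follows. This is a hedged illustration rather than the code behind the linked demo: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example is self-contained), and the table and CTE names simply mirror the ones used above. SQLite spells the clause WITH RECURSIVE, whereas T-SQL uses plain WITH.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TAccount (Account INTEGER, Amount INTEGER, Related INTEGER)"
)
# The refactored table: Related points at the previous account in the chain.
conn.executemany(
    "INSERT INTO TAccount VALUES (?, ?, ?)",
    [(1, 100, None), (2, 150, 1), (3, 200, 2), (4, 300, None)],
)

query = """
WITH RECURSIVE CTERecursive (Id, Account, Amount) AS (
    -- Anchor: accounts that start a chain act as their own group Id.
    SELECT Account, Account, Amount
    FROM TAccount
    WHERE Related IS NULL
    UNION ALL
    -- Recursive step: attach each account that points at one already found.
    SELECT R.Id, A.Account, A.Amount
    FROM TAccount AS A
    JOIN CTERecursive AS R ON A.Related = R.Account
)
SELECT Id, SUM(Amount) AS TotalAmount
FROM CTERecursive
GROUP BY Id
ORDER BY Id
"""
totals = conn.execute(query).fetchall()
print(totals)
```

With the sample data this gives 450 for the group rooted at account 1 and 300 for account 4, the same totals as the full script above.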

Compare Two SQL Tables for Unique Cells and Update Master Table

I'm using SQL Server 2017 and I've been trying to figure this out for hours. My goal is to compare 2 tables and only insert NEW rows based on UNIQUE cells. All the columns have an ID number, but I have not assigned a primary key. My goal is to ONLY add extra rows containing UNIQUE cells if none of the criteria match. This is how my tables are set up now.
Old-Data (Table name is Test1)
FName LNname Address City State Zipcode Phone Phone2 ID
Frank Smith 444 Main Y'all TX 77484 281-788-9898 NULL 1
Thomas Parker 343 Tire Y'all TX 77484 281-788-5453 NULL 2
Ben Krull 232 Wheel Y'all TX 77484 281-788-9535 NULL 3
New-Data (Table name is Test2)
FName LNname Address City State Zipcode Phone Phone2 ID
Frank Smith 444 Main Y'all TX 77484 281-788-9898 NULL 1
Thomas Parker 343 Tire Y'all TX 77484 281-788-5453 NULL 2
Ben Krull 232 Wheel Y'all TX 77484 281-788-9535 NULL 3
Juan Roberto 444 Gas Y'all TX 77484 281-788-3434 NULL 4
Ben Krull 232 Wheel Y'all TX 77484 281-788-9535 713-545-4353 5
As you can see, IDs 1, 2 and 3 are identical in both tables. ID-4 is a completely unique row, as is ID-5 because of the Phone2 entry. I found some code and modified it a bit to match the headers I care about, to help me determine which entries are duplicates. This is the code that has been driving me crazy.
INSERT TEST1 (Name
,Last_Name
,Address
,City
,State
,Zip_Code
,Phone
,Phone2
)
SELECT Name
,Last_Name
,Address
,City
,State
,Zip_Code
,Phone
,Phone2
FROM TEST2
WHERE TEST2.NAME not in (select Name from test1)
AND TEST2.Address not in (select Address from test1)
AND TEST2.City not in (select City from test1)
AND TEST2.State not in (select State from test1)
AND TEST2.Zip_Code not in (select Zip_Code from test1)
AND TEST2.Phone not in (select Phone from test1)
AND TEST2.Phone2 not in (select phone2 from test1)
I'm trying to match all the fields, and if a unique CELL is found the new row is entered into the old_data table. I see no errors after executing it, but nothing happens either. Interestingly enough, if I remove all the code below the line that says "WHERE TEST2.NAME not in (select Name from test1)", ID-4 (Juan Roberto) is transferred over, but nothing happens with ID-5.
I'm really starting to think WHERE cannot be used to compare the duplicates and modify or add entries, but I could be wrong. A merge feature would be awesome, but I'm happy with just the former since I could always run a different script to clean up the table for dupes. I'm hoping somebody might be able to point me in the right direction since I've got millions of rows in different tables that need to be compared and trimmed down. Thanks.
Try the following code. I am not sure it will work for you, because I have not tested it.
SELECT * INTO #TEMP FROM Test2 WITH (NOLOCK);
-- Remove the rows that already exist in Test1
DELETE #TEMP
FROM #TEMP
INNER JOIN Test1
ON #TEMP.NAME = Test1.NAME
AND #TEMP.Address = Test1.Address
AND #TEMP.City = Test1.City
AND #TEMP.State = Test1.State
AND #TEMP.Zip_Code = Test1.Zip_Code
AND #TEMP.Phone = Test1.Phone
-- NULL = NULL is never true, so match empty Phone2 values explicitly
AND (#TEMP.Phone2 = Test1.Phone2 OR (#TEMP.Phone2 IS NULL AND Test1.Phone2 IS NULL));
-- Whatever is left in #TEMP is new
INSERT INTO Test1
SELECT * FROM #TEMP;
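As for why the NOT IN version in the question inserts nothing: a NOT IN test against a column that contains NULL never evaluates to true, and Phone2 is NULL in every Test1 row. The sketch below reproduces that and shows EXCEPT as one possible alternative, since EXCEPT compares whole rows and treats NULLs as equal. It is an illustration under assumptions: Python's built-in sqlite3 module stands in for SQL Server, and the tables are trimmed to four columns to keep it short. On the real tables you would presumably list the columns explicitly and leave ID out, because the IDs differ between the two tables.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
cols = "Name TEXT, Last_Name TEXT, Phone TEXT, Phone2 TEXT"  # trimmed column list
conn.execute(f"CREATE TABLE Test1 ({cols})")
conn.execute(f"CREATE TABLE Test2 ({cols})")
old_rows = [
    ("Frank", "Smith", "281-788-9898", None),
    ("Thomas", "Parker", "281-788-5453", None),
    ("Ben", "Krull", "281-788-9535", None),
]
new_rows = old_rows + [
    ("Juan", "Roberto", "281-788-3434", None),
    ("Ben", "Krull", "281-788-9535", "713-545-4353"),
]
conn.executemany("INSERT INTO Test1 VALUES (?, ?, ?, ?)", old_rows)
conn.executemany("INSERT INTO Test2 VALUES (?, ?, ?, ?)", new_rows)

# NOT IN against a list containing NULL is never true, so no row qualifies.
not_in = conn.execute(
    "SELECT Name FROM Test2 WHERE Phone2 NOT IN (SELECT Phone2 FROM Test1)"
).fetchall()

# EXCEPT keeps the Test2 rows that have no identical row in Test1.
conn.execute("INSERT INTO Test1 SELECT * FROM Test2 EXCEPT SELECT * FROM Test1")
merged = conn.execute(
    "SELECT Name, Phone2 FROM Test1 ORDER BY Name, Phone2"
).fetchall()
print(not_in)
print(merged)
```

After the insert, Test1 holds five rows: the original three plus Juan Roberto and the second Ben Krull row with a Phone2 value.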

Getting ROW_NUMBER to repeat if field meets a condition

I need ROW_NUMBER to assign data to a specific user if a condition is met.
ROW_NUMBER will increment normally until a duplicate value is found. When the duplicate value is found, I need it to use the same ROW_NUMBER until a new value is found.
For instance...
When using
SELECT ROW_NUMBER() OVER (ORDER BY COMPANY) AS rownum
,Company
,Contact
FROM TABLE
We can obviously expect this result
rownum Company Contact
1 BOB'S BURGERS BOB
2 STEVE'S SARDINES STEVE
3 STEVE'S SARDINES JERRY
4 STEVE'S SARDINES MARY
5 LARRY'S LOBSTER LARRY
6 CHRIS' COWS CHRIS
What I'm trying to get is this. Whenever the Company name doesn't change, repeat the ROW_NUMBER and continue to increment the number when the company does change
rownum Company Contact
1 BOB'S BURGERS BOB
2 STEVE'S SARDINES STEVE
2 STEVE'S SARDINES JERRY
2 STEVE'S SARDINES MARY
3 LARRY'S LOBSTER LARRY
4 CHRIS' COWS CHRIS
I'm using this condition to see if the company matches the previous company name. It returns a 2 if the condition is true
ROW_NUMBER() OVER (PARTITION BY COMPANY ORDER BY COMPANY) AS SameCompany
You want DENSE_RANK, not ROW_NUMBER. DENSE_RANK gives rows with the same Company the same number and leaves no gaps between numbers. Try this:
SELECT DENSE_RANK() OVER (ORDER BY COMPANY) AS rownum
,Company
,Contact
FROM TABLE
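To see the two functions side by side, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; it needs SQLite 3.25 or later for window functions). The Contacts table name is made up for the example. Note that ORDER BY Company numbers the companies alphabetically, so the numbers follow that order rather than the order shown in the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Contacts (Company TEXT, Contact TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO Contacts VALUES (?, ?)",
    [
        ("BOB'S BURGERS", "BOB"), ("STEVE'S SARDINES", "STEVE"),
        ("STEVE'S SARDINES", "JERRY"), ("STEVE'S SARDINES", "MARY"),
        ("LARRY'S LOBSTER", "LARRY"), ("CHRIS' COWS", "CHRIS"),
    ],
)

# ROW_NUMBER counts every row; DENSE_RANK repeats the number for equal companies.
rows = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY Company) AS row_num,
           DENSE_RANK() OVER (ORDER BY Company) AS dense_rnk,
           Company
    FROM Contacts
    ORDER BY Company, row_num
    """
).fetchall()
for row in rows:
    print(row)
```

The three STEVE'S SARDINES rows share one dense_rnk value while row_num keeps counting.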

sql server pivot string from one column to three columns

I've been approaching a problem perhaps in the wrong way. I've researched pivot examples
http://www.codeproject.com/Tips/500811/Simple-Way-To-Use-Pivot-In-SQL-Query
How to create a pivot query in sql server without aggregate function
but they aren't the type I'm looking for.. or perhaps I'm approaching this in the wrong way, and I'm new to sql server.
I want to transform:
Student:
studid | firstname | lastname | school
-----------------------------------------
1 mike lee harvard
1 mike lee ucdavis
1 mike lee sfsu
2 peter pan chico
2 peter pan ulloa
3 peter smith ucb
Desired output: (note for school, want only 3 columns max.)
studid| firstname | lastname | school1 | school2 | school3
---------------------------------------------------------------------
1 mike lee Harvard ucdavis sfsu
2 peter pan chico ulloa
3 peter smith ucb
The tutorials I see show the use of Sum(), count() ... but I have no idea how to pivot string values of one column and put them into three columns.
You can get the results you desire by taking max(school) for each pivot value. I'm guessing the pivot value you want is rank over school partitioned by student. This would be the query for that:
select * from
(select *, rank() over (partition by studid order by school) rank from student) r
pivot (max(school) for rank in ([1],[2],[3])) pv
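Note that max doesn't actually change anything here: in this data each student has at most one school per rank, so the query would return the same results if you replaced it with min. The pivot syntax simply requires an aggregate function. The resulting columns are named 1, 2 and 3; alias them in the select list if you want school1, school2 and school3.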
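The claim that the aggregate is only a formality can be checked with a small sketch. It is an illustration under assumptions: Python's built-in sqlite3 module stands in for SQL Server, SQLite has no PIVOT so conditional aggregation plays its role, and ROW_NUMBER replaces rank so that a repeated school would not produce a tie. The pivot helper is a made-up name for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (studid INTEGER, firstname TEXT, lastname TEXT, school TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "mike", "lee", "harvard"), (1, "mike", "lee", "ucdavis"),
        (1, "mike", "lee", "sfsu"), (2, "peter", "pan", "chico"),
        (2, "peter", "pan", "ulloa"), (3, "peter", "smith", "ucb"),
    ],
)


def pivot(agg):
    """Pivot schools into three columns using the given aggregate (MIN or MAX)."""
    # Each CASE sees at most one non-NULL school per student, so the choice
    # of aggregate cannot change the value that comes out.
    sql = f"""
    SELECT studid, firstname, lastname,
           {agg}(CASE WHEN rn = 1 THEN school END) AS school1,
           {agg}(CASE WHEN rn = 2 THEN school END) AS school2,
           {agg}(CASE WHEN rn = 3 THEN school END) AS school3
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY studid ORDER BY school) AS rn
          FROM student) AS r
    GROUP BY studid, firstname, lastname
    ORDER BY studid
    """
    return conn.execute(sql).fetchall()


with_max = pivot("MAX")
with_min = pivot("MIN")
print(with_max == with_min)
for row in with_max:
    print(row)
```

Because the numbering is ordered by school, the schools come out alphabetically within each student (harvard, sfsu, ucdavis for student 1), which differs from the order in the desired output. Keeping the original order would need some column that records it.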

Crystal Report do not do the sum on Database "Cache"

I am using Crystal report to do a sum over 3 columns. The table structure looks like:
table #test (Country VARCHAR(10), Name VARCHAR(10), Weight VARCHAR(10), Qty INT)
I wrote a query in the crystal command pane when I do the connection:
SELECT Country, SUM(Qty) As Qty, Name, Weight FROM #test GROUP BY Country, Name, Weight
I should get something like:
CANADA 2 John 200
US 1 John 160
US 2 Mike 180
US 6 Sam 90
However, Crystal Reports does not sum the field. Instead it pulls every single row, and the result looks as if I had written the query:
SELECT Country, Qty, Name, Weight FROM #test
CANADA 1 John 200
CANADA 1 John 200
US 1 John 160
US 2 Mike 180
US 3 Sam 90
US 3 Sam 90
By the way, the backend database is called "Cache". It might be because there are some hidden characters, but I cannot see them. I have used replace (char(10)), replace (char(13)) and trim to try to clean them.
I also tried to pull the table columns directly without writing the query, but I do not know how to sum over three columns (Country, Name and Weight). I only know how to sum over one column. By the way, the request does not want the details, only the sum over these three columns.
First group by country.
Create one more group by quantity
Create one more group by name
Place weight in details and take the sum for all 3 groups if you need or only particular group
Suppress the details.

Resources