SQL Get row that have maximum values over two columns - snowflake-cloud-data-platform

I have a table like this in Snowflake. It supports ANSI SQL, so don't worry if this DB isn't familiar to you.
Salesman  Customer          Country
Brown     Super Company     UK
Brown     Another customer  UK
Smith     Contoso           US
Brown     Test company      US
I need to find the country in which each salesman has the most customers. The desired result of the query would look like this:
Salesman  Country  cnt(country)
Brown     UK       2
Smith     US       1
I've come up with this
SELECT
salesman,
country,
max(count(country))
FROM
customertable
GROUP BY
salesman, country
But nested aggregate functions aren't supported, and I've already read some quite good reasons for that. I just cannot find another way to do it.

QUALIFY could be used to filter the highest value per salesman:
SELECT salesman,
country,
count(country) AS cnt
FROM customertable
GROUP BY salesman, country
QUALIFY RANK() OVER(PARTITION BY salesman ORDER BY cnt DESC) = 1

Regarding your question, I guess you want to count customers per country rather than counting the country column itself.
This should do the job, using window functions and QUALIFY (see the Window Functions documentation):
CREATE OR REPLACE TABLE customers (salesman STRING, customer STRING, country STRING);
INSERT INTO customers
VALUES
('Brown', 'Super Company', 'UK'),
('Brown', 'Another customer', 'UK'),
('Smith', 'Contoso', 'US'),
('Brown', 'Test company', 'US')
;
SELECT
salesman,
country,
COUNT(customer) AS nb_customer
FROM customers
GROUP BY
salesman,
country
QUALIFY RANK() OVER (PARTITION BY salesman ORDER BY nb_customer DESC) = 1
;
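For engines without QUALIFY, the same filter can be expressed with derived tables. The following is a minimal sketch, not the answerer's code: it assumes Python's bundled SQLite (3.25 or later, for window functions) as a stand-in for Snowflake, and the aliases counts, ranked and rnk are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (salesman TEXT, customer TEXT, country TEXT);
    INSERT INTO customers VALUES
        ('Brown', 'Super Company', 'UK'),
        ('Brown', 'Another customer', 'UK'),
        ('Smith', 'Contoso', 'US'),
        ('Brown', 'Test company', 'US');
""")

# SQLite has no QUALIFY, so: count per (salesman, country), rank the counts
# per salesman in a derived table, then filter on the rank in the outer query.
query = """
    SELECT salesman, country, nb_customer
    FROM (
        SELECT salesman, country, nb_customer,
               RANK() OVER (PARTITION BY salesman
                            ORDER BY nb_customer DESC) AS rnk
        FROM (
            SELECT salesman, country, COUNT(customer) AS nb_customer
            FROM customers
            GROUP BY salesman, country
        ) AS counts
    ) AS ranked
    WHERE rnk = 1
    ORDER BY salesman
"""
top_countries = conn.execute(query).fetchall()
print(top_countries)
```

Because RANK() is used, a salesman with two countries tied for the highest count would get both rows back.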

Related

Every customer, one item, multiple orders

My question is actually healthcare related, but I'm going to use a couple of mock customer- and sales-driven examples to make it more practical for most users on this site.
Say I have one table with all the following columns:
CustomerID
CustomerName
OrderID
OrderDate
Region
ItemID
ItemName
SalespersonID
SalespersonName
Since everything is in the same table, I don't believe that any joins are necessary.
Question
Let's say that I run a store that pays its salespeople on commission and requires customers to make appointments. The commission is based on the number of times a customer has returned to that salesperson.
How can I find out which customers have only had appointments with the same sales person and have never seen a different sales person?
Alternatively, I can provide another approach to solving this problem.
Is there a way to find out if there are customers that have only bought one item multiple times for separate orders?
For example, say Customer A, Jennifer Smith, really likes one specialty product, Extra Blue Caviar. Jennifer buys Extra Blue Caviar every three months and has never bought another item in the store. Also, let's say Customer B, John Smith, buys Extra Fragrant Jasmine Rice every week and has never bought another item in the store.
How do I get a list of Jennifer Smith and John Smith along with Extra Blue Caviar and Extra Fragrant Jasmine Rice based on the criteria that they have only ever bought one item from the store multiple times?
Thank you so much!
For option #2
create table #t ( c_ID int, c_Name varchar(20),
o_ID int, o_date datetime, r int,
i_ID int, i_name varchar(20),
s_ID int, s_Name varchar(20))
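;with cte as ( select
c_ID,
c_Name,
i_ID,
i_name,
-- SQL Server does not allow COUNT(DISTINCT ...) with OVER, so compare MIN and MAX instead
min(i_ID) over (partition by c_ID) min_item,
max(i_ID) over (partition by c_ID) max_item,
count(*) over (partition by c_ID, i_ID) number_of_purchases
from #t)
select distinct
c_ID,
c_Name,
i_ID,
i_name,
number_of_purchases
from cte where min_item = max_item -- the customer has only ever bought one item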
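The same question can also be approached with GROUP BY and HAVING instead of window functions. This is only a sketch under assumptions: it runs on SQLite through Python rather than SQL Server, the table name orders and the sample rows (including "Other Customer") are hypothetical, and it adds the "multiple orders" condition from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (c_ID INT, c_Name TEXT, o_ID INT, i_ID INT, i_name TEXT);
    INSERT INTO orders VALUES
        (1, 'Jennifer Smith', 10, 100, 'Extra Blue Caviar'),
        (1, 'Jennifer Smith', 11, 100, 'Extra Blue Caviar'),
        (2, 'John Smith',     12, 200, 'Extra Fragrant Jasmine Rice'),
        (2, 'John Smith',     13, 200, 'Extra Fragrant Jasmine Rice'),
        (3, 'Other Customer', 14, 100, 'Extra Blue Caviar'),
        (3, 'Other Customer', 15, 200, 'Extra Fragrant Jasmine Rice');
""")

# One group per customer: keep customers with exactly one distinct item
# spread over more than one order. MIN() just picks that single item.
query = """
    SELECT c_ID, c_Name,
           MIN(i_ID)   AS i_ID,
           MIN(i_name) AS i_name,
           COUNT(DISTINCT o_ID) AS number_of_orders
    FROM orders
    GROUP BY c_ID, c_Name
    HAVING COUNT(DISTINCT i_ID) = 1
       AND COUNT(DISTINCT o_ID) > 1
    ORDER BY c_ID
"""
single_item_customers = conn.execute(query).fetchall()
print(single_item_customers)
```

Swapping i_ID for the salesperson column in the HAVING clause would answer the first variant of the question (customers who have only ever seen one salesperson) in the same way.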

How to construct an SQL query for finding which company has the most employees?

I have the following tables and I would like to find the company which has the most workers. I am fairly new to sql and I would like some help on constructing the query. Any briefing would be appreciated on which keywords to use or how to begin with writing the query. I would like to
“Find the company that has the most workers.”
worker(worker_name, city, street)
work for(worker_name, company_name, salary)
company(company_name, city)
manages( worker_name, manage_name)
This will get you the company with the most employees:
select top 1 company_name,
count(*) as nbr_of_employees
from work_for -- adjust to the real table name; an unquoted work-for is not a valid identifier
group by company_name
order by 2 desc
For a more detailed answer, please add sample data and the expected result to your question.
How it works:
The group by company_name groups all records with the same company_name together. Because of that, count(*) gives you the number of records in work_for for each group (that is, the number of workers for each company).
The order by 2 desc sorts by the second column, so the company_name with the most employees ends up at the top of the list.
Finally, the top 1 in the select returns only the first record in that list.
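The group, sort and take-the-first steps can be tried on a few rows. This is a small sketch assuming SQLite via Python, where LIMIT 1 plays the role of SQL Server's TOP 1; the table name work_for and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work_for (worker_name TEXT, company_name TEXT, salary INT);
    INSERT INTO work_for VALUES
        ('Ann', 'Acme',    100),
        ('Bob', 'Acme',    120),
        ('Cid', 'Initech',  90);
""")

# Group rows per company, sort by the head count, keep the first row.
query = """
    SELECT company_name, COUNT(*) AS nbr_of_employees
    FROM work_for
    GROUP BY company_name
    ORDER BY nbr_of_employees DESC
    LIMIT 1
"""
largest_company = conn.execute(query).fetchone()
print(largest_company)
```

If two companies tie for the highest count, this returns only one of them, and which one is not defined; in SQL Server, TOP 1 WITH TIES would return all of the tied rows.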

SQL Server 2008 similar to group_concat topic fill distinct colums from two separate tables

Sorry... first time here and amateur...
I have two different tables (contact_data_a and contact_data_b) from two different divisions as follows:
contact_data_a
id customer contact
11200 Müller KG Hans
11201 Huber GmbH Patrick
11203 Gruber GmbH Manu
11205 Meyer GmbH Manu
contact_data_b
id customer contact
11200 Müller E. Peter
11202 Schubert AG Louis
11204 E.Schmidt Louis
11205 Mayer GmbH Peter
What I would like to have in the end is something like this:
contact_data_all
id customer contact_a contact_b
11200 Müller KG Hans Peter
11201 Huber GmbH Patrick 0
11202 Schubert AG 0 Louis
11203 Gruber GmbH Manu 0
11204 E. Schmidt 0 Louis
11205 Meyer GmbH Manu Peter
"id" is clear and distinct, but names in column "customer" might vary (incl. misspellings). This is no problem. Information could come from either table. My problem are the contact columns. Contacts from list contact_data_a should appear in column contact_a (or Null if they do not exist) and contacts from list contact_data_b should appear in column contact_b (or Null).
A friend said I might use
`SELECT id, customer, GROUP_CONCAT(contact_a) as contact_a,GROUP_CONCAT(contact_b) as contact_b FROM
(SELECT id, customer, contact_a, null as contact_b FROM contact_data_a
UNION
SELECT id, customer, null as contact_a, contact_b FROM contact_data_b)
GROUP BY id ORDER BY id`
But I only have SQL 2008, so CONCAT is not available yet.
Thank you in advance for any help or idea!!!
Try this one -
SELECT
id
, MAX(customer) AS customer -- spellings can differ between the two tables, so keep one per id
, MAX(contact_a) AS contact_a
, MAX(contact_b) AS contact_b
FROM (
SELECT id, customer, contact AS contact_a, '0' AS contact_b
FROM contact_data_a
UNION ALL
SELECT id, customer, '0' AS contact_a, contact AS contact_b
FROM contact_data_b
) t
GROUP BY id
ORDER BY id
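Since the customer spelling can differ between the two divisions, it may help to choose explicitly which table's spelling wins. The following is a sketch of that idea under assumptions, not the answerer's query: it runs on SQLite through Python, and the src marker column and the rule "prefer table a's spelling" are choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_data_a (id INT, customer TEXT, contact TEXT);
    CREATE TABLE contact_data_b (id INT, customer TEXT, contact TEXT);
    INSERT INTO contact_data_a VALUES
        (11200, 'Müller KG', 'Hans'), (11201, 'Huber GmbH', 'Patrick'),
        (11203, 'Gruber GmbH', 'Manu'), (11205, 'Meyer GmbH', 'Manu');
    INSERT INTO contact_data_b VALUES
        (11200, 'Müller E.', 'Peter'), (11202, 'Schubert AG', 'Louis'),
        (11204, 'E.Schmidt', 'Louis'), (11205, 'Mayer GmbH', 'Peter');
""")

# Tag each row with its source table, then pivot per id with conditional MAX.
query = """
    SELECT id,
           COALESCE(MAX(CASE WHEN src = 'a' THEN customer END),
                    MAX(customer)) AS customer,
           MAX(CASE WHEN src = 'a' THEN contact ELSE '0' END) AS contact_a,
           MAX(CASE WHEN src = 'b' THEN contact ELSE '0' END) AS contact_b
    FROM (
        SELECT 'a' AS src, id, customer, contact FROM contact_data_a
        UNION ALL
        SELECT 'b' AS src, id, customer, contact FROM contact_data_b
    ) AS t
    GROUP BY id
    ORDER BY id
"""
contact_data_all = conn.execute(query).fetchall()
for row in contact_data_all:
    print(row)
```

The '0' placeholder mirrors the desired output in the question; returning NULL instead only requires dropping the ELSE '0' branches.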

Is it possible to do this query without a temp table?

If I have a table of data like this
tableid author book pubdate
1 1 The Hobbit 1923
2 1 Fellowship 1925
3 2 Foundation Trilogy 1947
4 2 I Robot 1942
5 3 Frankenstein 1889
6 3 Frankenstein 2 1894
Is there a query that would get me the following without having to use a temp table, table variable or cte?
tableid author book pubdate
1 1 The Hobbit 1923
4 2 I Robot 1942
5 3 Frankenstein 1889
So I want min(ranking) grouping by person and ending up with book for that min(ranking) value.
OK, the data I gave initially was flawed. Instead of a ranking column I'll have a date column. I need the book published earliest by author.
Missed that a CTE was not valid (but not sure why). How about as a subquery?
SELECT tableid, author, book, pubdate
FROM
(
SELECT
tableid, author, book, pubdate,
rn = ROW_NUMBER() OVER
(
PARTITION BY author
ORDER BY pubdate
)
FROM dbo.src -- replace this with the real table name
) AS x
WHERE rn = 1
ORDER BY tableid;
Original:
;WITH x AS
(
SELECT
tableid, author, book, pubdate,
rn = ROW_NUMBER() OVER
(
PARTITION BY author
ORDER BY pubdate
)
FROM dbo.src -- replace this with the real table name
)
SELECT tableid, author, book, pubdate
FROM x
WHERE rn = 1
ORDER BY tableid;
If you want to return multiple rows when there is a tie for earliest book, use RANK() in place of ROW_NUMBER(). If there is a tie and you want to return only one row, add tie-breaker columns to the ORDER BY within OVER().
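The difference between the two functions only shows up when there is a tie. Here is a minimal sketch, assuming SQLite 3.25+ via Python in place of SQL Server; row 7 ("Hypothetical Tie") is invented to create a tie for author 3, and the helper earliest() exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (tableid INT, author INT, book TEXT, pubdate INT);
    INSERT INTO src VALUES
        (1, 1, 'The Hobbit', 1923), (2, 1, 'Fellowship', 1925),
        (3, 2, 'Foundation Trilogy', 1947), (4, 2, 'I Robot', 1942),
        (5, 3, 'Frankenstein', 1889), (6, 3, 'Frankenstein 2', 1894),
        (7, 3, 'Hypothetical Tie', 1889);
""")

def earliest(func, order_by):
    """Return the tableids ranked first per author by the given window function."""
    query = f"""
        SELECT tableid
        FROM (
            SELECT tableid,
                   {func}() OVER (PARTITION BY author ORDER BY {order_by}) AS rn
            FROM src
        ) AS x
        WHERE rn = 1
        ORDER BY tableid
    """
    return [row[0] for row in conn.execute(query)]

# RANK() keeps both of author 3's books from 1889.
with_rank = earliest("RANK", "pubdate")
# ROW_NUMBER() with tableid as tie-breaker keeps exactly one row per author.
with_row_number = earliest("ROW_NUMBER", "pubdate, tableid")
print(with_rank, with_row_number)
```

Without the tableid tie-breaker, ROW_NUMBER() would still return one row for author 3, but which of the two 1889 books it picks would not be guaranteed.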
select * from table where ranking = 1
EDIT
Are you looking for this query to work in situations where there is no row with ranking = 1 for a given person? In that case, try this:
-- a window function alias cannot be used in the WHERE of the same query, so wrap it in a derived table
select * from (
select *, RANK() OVER (Partition By personid order by ranking asc) as sqlrank
from table
) t
where sqlrank = 1
EDIT OF MY EDIT:
This will work for the earliest pub date:
select * from (
select *, RANK() OVER (Partition By author order by pubdate asc) as sqlrank
from table
) t
where sqlrank = 1
SELECT tableid,author,book,pubdate FROM my_table as my_table1 WHERE pubdate =
(SELECT MIN(pubdate) FROM my_table as my_table2 WHERE my_table1.author = my_table2.author);
WITH min_table as
(
SELECT author, min(pubdate) as min_pubdate
FROM table
GROUP BY author
)
SELECT t.tableid, t.author, t.book, t.pubdate
FROM table t INNER JOIN min_table mt on t.author = mt.author and t.pubdate = mt.min_pubdate
Your sample data may be overly simplistic. You talk about 'min(ranking)', but in all your examples the minimum ranking for each personid is 1, so the answers you have received so far short-circuit the issue and simply select for ranking = 1. You don't state it in your "requirements", but it sounds like the minimum rank value for any particular personid may not necessarily be 1, correct? Also, you don't mention whether a person can rank two or more books with the same minimum rank, so answers will be incomplete due to this missing requirement.
If my psychic abilities are accurate, then you might want to try something like this (untested obviously):
SELECT UNKTBL.tableid, UNKTBL.personid, UNKTBL.book, UNKTBL.ranking
FROM UnknownTable UNKTBL INNER JOIN
(SELECT personid, min(ranking) as ranking
FROM UnknownTable GROUP BY personid) MINRANK
ON UNKTBL.personid = MINRANK.personid AND UNKTBL.ranking = MINRANK.ranking
This will return all the rows for each person where the ranking value is the minimum value for that person. So if the minimum ranking for person 6 is 2, and there are two books for that person with that ranking, then both book rows will be returned.
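The tie behaviour described above can be checked on a few rows. This is a sketch assuming SQLite via Python; the rows for persons 6 and 7 are hypothetical, chosen so that person 6 has a minimum ranking of 2 shared by two books.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UnknownTable (tableid INT, personid INT, book TEXT, ranking INT);
    INSERT INTO UnknownTable VALUES
        (1, 6, 'Book A', 2), (2, 6, 'Book B', 2), (3, 6, 'Book C', 5),
        (4, 7, 'Book D', 1), (5, 7, 'Book E', 3);
""")

# Join every row to its person's minimum ranking; ties all survive the join.
query = """
    SELECT UNKTBL.tableid, UNKTBL.personid, UNKTBL.book, UNKTBL.ranking
    FROM UnknownTable AS UNKTBL
    INNER JOIN (
        SELECT personid, MIN(ranking) AS ranking
        FROM UnknownTable
        GROUP BY personid
    ) AS MINRANK
        ON UNKTBL.personid = MINRANK.personid
       AND UNKTBL.ranking = MINRANK.ranking
    ORDER BY UNKTBL.tableid
"""
min_rank_rows = conn.execute(query).fetchall()
print(min_rank_rows)
```

Person 6 contributes two rows because both books share the minimum ranking of 2, while person 7 contributes one.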
If these are not, in fact, your requirements, then please edit your question with more details and example data. Thanks!
Edit
Based on your change in requirements/example data, the SQL above should still work, if you change the column names appropriately. You still don't mention if an author can have two books in the same year (i.e. a prolific author such as Stephen King), so the SQL I have here will give multiple rows if the same author publishes two books in the same year, and that year is the earliest year of publication for that author.
SELECT * FROM my_table WHERE ranking = 1
ZING!
Seriously though I don't follow your question - can you provide a more elaborate or complicated example? I think I'm missing something obvious.

SQL Query that can return intersecting data

I have a hard time finding a good question title - let me just show you what I have and what the desired outcome is. I hope this can be done in SQL (I have SQL Server 2008).
1) I have a table called Contacts and in that table I have fields like these:
FirstName, LastName, CompanyName
2) Some demo data:
FirstName LastName CompanyName
John Smith Smith Corp
Paul Wade
Marc Andrews Microsoft
Bill Gates Microsoft
Steve Gibbs Smith Corp
Diane Rowe ABC Inc.
3) I want to get an intersecting list of people and companies, but companies only once. This would look like this:
Name
ABC Inc.
Bill Gates
Diane Rowe
John Smith
Marc Andrews
Microsoft
Smith Corp
Steve Gibbs
Paul Wade
Can I do this with SQL? How?
You take all the person names, and then also add all the companies
SELECT FirstName + ' ' + LastName AS Name FROM Contacts
UNION ALL
SELECT DISTINCT CompanyName FROM Contacts
WHERE CompanyName IS NOT NULL
The DISTINCT keyword ensures that companies are output only once, and the WHERE clause removes rows where no company info is known.
If a person has the same name as a company, then this will output a duplicate. If you don't want that, then change UNION ALL to UNION, and any name will be output only once.
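The effect of UNION ALL versus UNION can be seen when a person's name collides with a company name. This is a minimal sketch assuming SQLite via Python, where || is the string concatenation operator (SQL Server 2008 uses +); the last row, a contact literally named "ABC Inc.", is hypothetical and exists only to force a collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (FirstName TEXT, LastName TEXT, CompanyName TEXT);
    INSERT INTO Contacts VALUES
        ('John',  'Smith',   'Smith Corp'),
        ('Paul',  'Wade',    NULL),
        ('Marc',  'Andrews', 'Microsoft'),
        ('Bill',  'Gates',   'Microsoft'),
        ('Steve', 'Gibbs',   'Smith Corp'),
        ('Diane', 'Rowe',    'ABC Inc.'),
        ('ABC',   'Inc.',    'ABC Inc.');  -- hypothetical name collision
""")

template = """
    SELECT FirstName || ' ' || LastName AS Name FROM Contacts
    {set_operator}
    SELECT DISTINCT CompanyName FROM Contacts
    WHERE CompanyName IS NOT NULL
"""

def names(set_operator):
    """Run the combined query with the given set operator and return the names."""
    return [row[0] for row in conn.execute(template.format(set_operator=set_operator))]

with_union_all = names("UNION ALL")  # keeps the colliding name twice
with_union = names("UNION")          # removes duplicates across both halves
print(len(with_union_all), len(with_union))
```

Neither form guarantees an output order, so an ORDER BY Name would still be needed to get the sorted list shown in the question.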
I'm not sure what you mean by "intersecting," but you can easily get the results you describe as the union of two queries against that same table.
select
t.firstname + ' ' + t.lastname
from
mytable t
union
select
t.CompanyName
from
mytable t
Edit: UNION should make each SELECT distinct by default.
Does this do what you need?
SELECT FirstName + ' ' + LastName AS Name
FROM Contacts
UNION
SELECT CompanyName
FROM Contacts
(The UNION rather than UNION ALL will ensure distinctness of both top and bottom parts. mdma's answer will work if you do need the possibility of duplicate people names. You might need to add an ORDER BY Name depending on your needs)

Resources