SQL Server: REPLACE comma and combine it with similar row

I have a badly structured table. I am trying to modify it as little as possible, because the real problem is more complicated. I'm working in SQL Server 2005.
Here is my table:
tbl_item
id | item
---+------
1 | car
2 | car
3 | car, => This is the row I am focusing on
4 | jet
5 | jet, car => This is just an example of what the comma is for
Query :
SELECT item, count(*) AS sum
FROM tbl_item
GROUP BY item
ORDER BY sum DESC
Result:
item | sum
--------+-------
car | 2
car, | 1
jet | 1
jet, car | 1
What I want is like this:
item | sum
---------+--------
car | 3
jet | 1
jet, car | 1 => actually I don't care about this, but this just for example
I tried :
SELECT REPLACE(item,',',''), count(*) AS sum
FROM tbl_item
GROUP BY item
ORDER BY sum DESC
But the result is:
item | sum
---------+------
car | 2
car | 1 => still on a separate row
jet | 1
jet car | 1
I can do this easily in PHP, but I wonder how to do it in pure SQL Server.
Thanks in advance!

Try:
GROUP BY REPLACE(item,',','')
This will normalize the items, then group on them.

You need to add it to the group by clause:
SELECT REPLACE(item,',',''), count(*) AS sum
FROM tbl_item
GROUP BY replace(item,',','')
ORDER BY sum DESC
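As a quick check of the idea above, here is a minimal sketch that runs the same query against the sample rows. It assumes an in-memory SQLite database driven from Python rather than SQL Server 2005; REPLACE behaves the same way for this case, and the aliases clean_item and total are made up for the illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_item (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO tbl_item (id, item) VALUES (?, ?)",
    [(1, "car"), (2, "car"), (3, "car,"), (4, "jet"), (5, "jet, car")],
)

# Group on the cleaned value, not on the raw column, so that
# 'car' and 'car,' end up in the same group.
rows = conn.execute(
    """
    SELECT REPLACE(item, ',', '') AS clean_item, COUNT(*) AS total
    FROM tbl_item
    GROUP BY REPLACE(item, ',', '')
    ORDER BY total DESC, clean_item
    """
).fetchall()

for clean_item, total in rows:
    print(clean_item, total)
```

Note that REPLACE only strips the comma, so 'jet, car' becomes 'jet car' and stays in its own group.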

Related

Displaying data in a different manner

So I have a table that binds ProductId and GroupId. A product can be assigned to any of 5 groups (1-5).
If the product doesn't exist in the table, it's not assigned to any group.
ProductId | GroupId
-------------------
100 | 1
100 | 2
200 | 1
200 | 2
200 | 3
200 | 4
200 | 5
Taking a look at this table, we know that Product that goes by id 100 is assigned to 2 groups (1,2) and the product of id 200 is assigned to 5 groups (1-5).
I'm trying to write a query that will display each product in separate row, together with columns for all of the 5 groups and a bit value that contains information if the product belongs to the group or not (0,1). A visualization of the result I need:
ProductId | IsGroup1 | IsGroup2 | IsGroup3 | IsGroup4 | IsGroup5
-----------------------------------------------------------------
100 | 1 | 1 | 0 | 0 | 0 -- this belongs to groups 1, 2
200 | 1 | 1 | 1 | 1 | 1 -- this belongs to all of the groups
I know I could probably solve it by self-joining 5 times on each distinct product, but I'm wondering if there's a more elegant way of solving it.
Any tips would be greatly appreciated.
You could use a pivot. Since you only have 5 groups you don't need a dynamic pivot.
select
ProductId
,IsGroup1 = iif([1] is null,0,1)
,IsGroup2 = iif([2] is null,0,1)
,IsGroup3 = iif([3] is null,0,1)
,IsGroup4 = iif([4] is null,0,1)
,IsGroup5 = iif([5] is null,0,1)
from
(select ProductID, GroupId from mytable) x
pivot
(max(GroupId) for GroupId in ([1],[2],[3],[4],[5])) p
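For comparison, the same result can also be produced with conditional aggregation, which is a different technique from the PIVOT shown above and does not need IIF. The sketch below is an assumption-laden illustration: it uses an in-memory SQLite database from Python instead of SQL Server, and reuses the table name mytable from the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ProductId INTEGER, GroupId INTEGER)")
conn.executemany(
    "INSERT INTO mytable (ProductId, GroupId) VALUES (?, ?)",
    [(100, 1), (100, 2), (200, 1), (200, 2), (200, 3), (200, 4), (200, 5)],
)

# One MAX(CASE ...) per group: 1 if any row for the product has that
# GroupId, otherwise 0.
query = """
SELECT ProductId,
       MAX(CASE WHEN GroupId = 1 THEN 1 ELSE 0 END) AS IsGroup1,
       MAX(CASE WHEN GroupId = 2 THEN 1 ELSE 0 END) AS IsGroup2,
       MAX(CASE WHEN GroupId = 3 THEN 1 ELSE 0 END) AS IsGroup3,
       MAX(CASE WHEN GroupId = 4 THEN 1 ELSE 0 END) AS IsGroup4,
       MAX(CASE WHEN GroupId = 5 THEN 1 ELSE 0 END) AS IsGroup5
FROM mytable
GROUP BY ProductId
ORDER BY ProductId
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

The CASE expressions are standard SQL, so the query text should also run on SQL Server versions that predate IIF.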

SQL Server: Parent id with the least number of children

I have two tables Client and Instructor.
Client table :
id_client|name_client|FK_instructor
---------+-----------+------------
1 | Clinton | 2
2 | Gates | 1
3 | Bush | 1
4 | Clinton | 2
5 | Obama | 1
6 | Jack | 3
Instructor table :
id_instructor|name_instructor
-------------+---------------
1 | Sara
2 | Sam
3 | Dean
4 | Julie
5 | Jake
I want to select the 3 instructors who have the least number of clients associated.
Thank you in advance.
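Now that you mentioned you're using SQL Server: in addition to the GROUP BY and ORDER BY, you need a TOP(3) on your SELECT. Use a LEFT JOIN and count the client column, so that instructors with no clients are counted as 0 instead of being dropped from the result.
SELECT TOP(3) i.id_instructor, i.name_instructor
FROM Instructor i
LEFT JOIN Client c ON c.FK_instructor = i.id_instructor
GROUP BY i.id_instructor, i.name_instructor
ORDER BY COUNT(c.id_client) -- implicitly ascending; 0 for instructors with no clients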
Note that I added the instructor id to the group by compared to the other answer in case more than one instructor has the same name.
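One point worth checking with the sample data is what happens to instructors who have no clients at all (Julie and Jake). The sketch below is only an illustration under assumptions: it uses an in-memory SQLite database from Python, so LIMIT 3 replaces TOP(3), and the alias client_count and the tie-breaker on id_instructor are additions for the example. It uses a LEFT JOIN so that zero-client instructors are kept.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Instructor (id_instructor INTEGER PRIMARY KEY, name_instructor TEXT);
    CREATE TABLE Client (id_client INTEGER PRIMARY KEY, name_client TEXT, FK_instructor INTEGER);
    INSERT INTO Instructor VALUES (1, 'Sara'), (2, 'Sam'), (3, 'Dean'), (4, 'Julie'), (5, 'Jake');
    INSERT INTO Client VALUES
        (1, 'Clinton', 2), (2, 'Gates', 1), (3, 'Bush', 1),
        (4, 'Clinton', 2), (5, 'Obama', 1), (6, 'Jack', 3);
    """
)

# LEFT JOIN keeps instructors without clients; COUNT(c.id_client)
# ignores the NULLs produced for them, so their count is 0.
rows = conn.execute(
    """
    SELECT i.id_instructor, i.name_instructor, COUNT(c.id_client) AS client_count
    FROM Instructor i
    LEFT JOIN Client c ON c.FK_instructor = i.id_instructor
    GROUP BY i.id_instructor, i.name_instructor
    ORDER BY client_count, i.id_instructor
    LIMIT 3
    """
).fetchall()

for row in rows:
    print(row)
```

With an inner JOIN instead, Julie and Jake would not appear, and the three rows returned would be Dean, Sam and Sara.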
If you are working with Netezza, you could try:
SELECT name_instructor, COUNT(id_client)
FROM instructor_table
LEFT JOIN client_table on instructor_table.id_instructor = client_table.FK_instructor
GROUP BY name_instructor
ORDER BY COUNT(id_client) ASC
LIMIT 3
There is great documentation for Netezza here:
http://www-304.ibm.com/support/knowledgecenter/SSULQD_7.2.0/com.ibm.nz.dbu.doc/c_dbuser_sql_grammar.html
There are also SQL tutorials here:
http://www.w3schools.com/sql/

Find the top ranked unique item for each grouping in a set

Given the following dataset which contains a series of products for a customer, along with a number of related products for each, I want to pick the top ranked unique Related Product ID for each of the Product IDs.
Sample Data
This table shows what the data looks like for a single Customer. There will be multiple Customers.
The items selected in yellow are an example of what the results would look like for this example Customer ID.
So, a single Product ID may have multiple Related Product IDs. For a single customer with, say 6 Product IDs, I want to return the top ranked Related Product ID for each individual Product ID.
Rules
The catch is, that I want to eliminate duplication as much as possible. So if the same Related Product ID is the top ranked for more than one Product ID, the selection should move down to the next highest ranked Related Product ID.
The goal is to, where possible, provide a unique (within each Customer ID) Related Product ID for each Product ID.
Where it is not possible for a unique Related Product ID to be selected (because there are only duplicate Related Product IDs available), then the top ranked should be selected.
Results
For Product 2, the Related Product ID 23194 is the highest ranked, but it is not unique, so is skipped in favour of 23287. For Product 4, we could use either 23194 or 23300, but because neither is unique, we take the highest ranked item.
I've tried doing this using a recursive CTE, but this will iterate through the items and allocate the Related Product on the first Products before finding out if the Related Products are repeated later in the set.
How else can I approach this?
You can use ROW_NUMBER and COUNT OVER():
;WITH Cte AS(
SELECT *,
RN = (RelatedProductRanking + COUNT(*) OVER(PARTITION BY ProductID)) *
COUNT(*) OVER(PARTITION BY RelatedProductID)
FROM tbl
),
CteRnk AS(
SELECT *,
RNK = ROW_NUMBER() OVER(PARTITION BY ProductID ORDER BY RN)
FROM Cte
)
SELECT
CustomerID, ProductRanking, ProductID, RelatedProductRanking, RelatedProductID
FROM CteRnk
WHERE RNK = 1
ORDER BY ProductRanking, RelatedProductRanking
RESULT
| CustomerID | ProductRanking | ProductID | RelatedProductRanking | RelatedProductID |
|------------|----------------|-----------|-----------------------|------------------|
| 12436 | 1 | 14553 | 1 | 14481 |
| 12436 | 2 | 33017 | 2 | 23287 |
| 12436 | 3 | 14203 | 1 | 14289 |
| 12436 | 4 | 23038 | 1 | 23194 |
| 12436 | 5 | 15120 | 1 | 14520 |
| 12436 | 6 | 23014 | 1 | 23300 |

SQL Query to get data based on multiple filters

I have following Product table and ProductTag tables -
ID | Product
--------------
1 | Product_A
2 | Product_B
3 | Product_C
TagID | ProductID
----------------------
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
Now I need a SQL query that returns all products that have both Tag 1 and Tag 2. The result should be as given below -
ProductID | Product
------------------------
2 | Product_B
3 | Product_C
Please suggest how I can write an MS SQL query for this.
SELECT p.ID, p.Product
FROM Product p
INNER JOIN ProductTag pt
ON p.ID = pt.ProductID
WHERE pt.TagID IN (1, 2) -- <== Tags you want to find
GROUP BY p.ID, p.Product
HAVING COUNT(*) = 2 -- <== number of tags listed in the WHERE clause
However, if a TagID can appear more than once for the same product, you need to count only the distinct tags:
HAVING COUNT(DISTINCT pt.TagID) = 2
More on: SQL of Relational Division
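The filter-then-count approach above can be tried against the sample tables with a small sketch. It assumes an in-memory SQLite database driven from Python in place of MS SQL, and uses the COUNT(DISTINCT ...) form so that repeated tag rows would not inflate the count.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MS SQL here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Product (ID INTEGER PRIMARY KEY, Product TEXT);
    CREATE TABLE ProductTag (TagID INTEGER, ProductID INTEGER);
    INSERT INTO Product VALUES (1, 'Product_A'), (2, 'Product_B'), (3, 'Product_C');
    INSERT INTO ProductTag VALUES (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2);
    """
)

# Keep only rows for the wanted tags, then require that a product
# matched every one of them (2 distinct tags here).
rows = conn.execute(
    """
    SELECT p.ID, p.Product
    FROM Product p
    INNER JOIN ProductTag pt ON p.ID = pt.ProductID
    WHERE pt.TagID IN (1, 2)
    GROUP BY p.ID, p.Product
    HAVING COUNT(DISTINCT pt.TagID) = 2
    ORDER BY p.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Product_A is excluded because it only carries tag 2 out of the two requested tags.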

Finding duplicate in SQL Server Table

I have a table
+--------+--------+--------+--------+--------+
| Market | Sales1 | Sales2 | Sales3 | Sales4 |
+--------+--------+--------+--------+--------+
| 68 | 1 | 2 | 3 | 4 |
| 630 | 5 | 3 | 7 | 8 |
| 190 | 9 | 10 | 11 | 12 |
+--------+--------+--------+--------+--------+
I want to find duplicates between all the above sales fields. In above example markets 68 and 630 have a duplicate Sales value that is 3.
My problem is displaying the Market having duplicate sales.
This problem would be incredibly simple to solve if you normalised your table.
Then you would just have the columns Market | Sales, or if the 1, 2, 3, 4 are important you could have Market | Quarter | Sales (or some other relevant column name).
Given that your table isn't in this format, you could use a CTE to make it so and then select from it, e.g.
WITH cte AS (
SELECT Market, Sales1 AS Sales FROM MarketSales
UNION ALL
SELECT Market, Sales2 FROM MarketSales
UNION ALL
SELECT Market, Sales3 FROM MarketSales
UNION ALL
SELECT Market, Sales4 FROM MarketSales
)
SELECT a.Market
,b.Market
FROM cte a
INNER JOIN cte b ON b.Market > a.Market
WHERE a.Sales = b.Sales
You can easily do this without the CTE, you just need a big where clause comparing all the combinations of Sales columns.
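The unpivot-then-self-join idea can be checked against the sample rows with the following sketch. It assumes an in-memory SQLite database driven from Python rather than SQL Server, reuses the table name MarketSales from the answer, and adds the shared Sales value to the output as an extra column for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MarketSales "
    "(Market INTEGER, Sales1 INTEGER, Sales2 INTEGER, Sales3 INTEGER, Sales4 INTEGER)"
)
conn.executemany(
    "INSERT INTO MarketSales VALUES (?, ?, ?, ?, ?)",
    [(68, 1, 2, 3, 4), (630, 5, 3, 7, 8), (190, 9, 10, 11, 12)],
)

# The CTE turns the four Sales columns into one row per (Market, Sales);
# the self-join then pairs different markets that share a Sales value.
rows = conn.execute(
    """
    WITH cte AS (
        SELECT Market, Sales1 AS Sales FROM MarketSales
        UNION ALL
        SELECT Market, Sales2 FROM MarketSales
        UNION ALL
        SELECT Market, Sales3 FROM MarketSales
        UNION ALL
        SELECT Market, Sales4 FROM MarketSales
    )
    SELECT a.Market, b.Market, a.Sales
    FROM cte a
    INNER JOIN cte b ON b.Market > a.Market AND a.Sales = b.Sales
    ORDER BY a.Market, b.Market, a.Sales
    """
).fetchall()

for row in rows:
    print(row)
```

The condition b.Market > a.Market makes each pair of markets appear once instead of twice.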
Supposing the data set is not too big,
make a new temporary table combining all the data, with the columns:
Sales
Market
Then group by Sales and keep only the values that occur more than once:
select Sales, Count(*) as Qty
from #temporary
group by Sales
having Count(*) > 1
