Duplicated rows on getting Categories of a Publication - sql-server

I have a table of Publications
Id | Title | Content ...
1 | 'Ex title 1' | 'example content 1'
2 | 'Ex title 2' | 'example content 2'
And a table of Categories
CategoryId | PublicationId
1 | 1
2 | 1
2 | 2
3 | 2
So a Publication could have one or many categories.
I am trying to get the first 10 publications and their categories in a single query, like this:
SELECT [Publication].Id, [Publication].Title, [Publication].Content, [PublicationCategory].CategoryId
FROM [Publication]
LEFT JOIN [PublicationCategory] ON [Publication].Id = [PublicationCategory].Id
ORDER BY [Publication].Id DESC
But I am getting duplicated values because of the different category IDs. What is the best way to get 10 publications and their categories without getting duplicated rows? (Because of the duplicated rows, I get duplicated publications.)

You can first pick the TOP 10 publications in a derived table and then JOIN it with the category table, as in the following query, to get all of their categories.
SELECT [Publication].*, [PublicationCategory].[categoryid]
FROM (
SELECT TOP 10 [Publication].id, [Publication].Title, [Publication].Content
FROM Publications [Publication]
ORDER BY [Publication].Id DESC
) [Publication]
INNER JOIN Categories [PublicationCategory]
ON [Publication].id = [PublicationCategory].publicationid

Use a CTE to number your publications, and then JOIN onto your table PublicationCategory and filter on the value of ROW_NUMBER():
WITH RNs AS (
SELECT P.Id, P.Title, P.Content,
ROW_NUMBER() OVER (ORDER BY P.Id DESC) AS RN
FROM Publication P)
SELECT RNs.Id, RNs.Title, RNs.Content,
PC.CategoryId
FROM RNs
LEFT JOIN PublicationCategory PC ON RNs.Id = PC.PublicationId
WHERE RNs.RN <= 10;

I think the best answer is @PSK's, but what if a publication is not categorized? (A weird case, but it could happen if it is not validated.) You can use a LEFT JOIN instead so that you always get the 10 publications; a publication with no category is still returned, with a NULL category.
SELECT [Publication].*, [PublicationCategory].[categoryid]
FROM (
SELECT TOP 10 [Publication].id, [Publication].Title, [Publication].Content
FROM Publications [Publication]
ORDER BY [Publication].Id DESC
) [Publication]
LEFT JOIN Categories [PublicationCategory]
ON [Publication].id = [PublicationCategory].publicationid
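To check that limiting before the join behaves as described, here is a minimal sketch that runs the same pattern against a throwaway database. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, so TOP 10 is written as LIMIT (LIMIT 2 here to keep the data small), and the third, uncategorized publication is a made-up row added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Publication (Id INTEGER PRIMARY KEY, Title TEXT, Content TEXT);
CREATE TABLE PublicationCategory (CategoryId INTEGER, PublicationId INTEGER);
INSERT INTO Publication VALUES
    (1, 'Ex title 1', 'example content 1'),
    (2, 'Ex title 2', 'example content 2'),
    (3, 'Ex title 3', 'example content 3');  -- hypothetical: has no category
INSERT INTO PublicationCategory VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# Limit the publications first, then join, so the limit counts
# publications rather than publication/category pairs.
TOP_N_THEN_JOIN = """
SELECT p.Id, p.Title, pc.CategoryId
FROM (SELECT Id, Title FROM Publication ORDER BY Id DESC LIMIT 2) AS p
LEFT JOIN PublicationCategory AS pc ON pc.PublicationId = p.Id
ORDER BY p.Id DESC, pc.CategoryId
"""
rows = conn.execute(TOP_N_THEN_JOIN).fetchall()
for row in rows:
    print(row)
```

The two newest publications both come back: the uncategorized one as a single row with a NULL (None) category, the other once per category.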


Find the average and total revenue by each sub-category for the categories which are among top 5 categories in terms of quantity sold? <SQL Server>

For this question, I have two tables, as follows:
prod_cat_info --- This table has the following columns:
prod_cat : It contains the products' category names
prod_cat_id : It contains the products' category ID. Note that every product category has been assigned a unique ID. For example, say I have the product categories Books, Sports and Electronics; these 3 product categories will be assigned the product category IDs 1, 2 and 3 respectively.
prod_subcat : It contains products' subcategories
prod_subcat_id : It contains products' subcategories ID
Now, how are these product subcategories stored? For example, say that for product category "Books" I have 3 product subcategories: "Novels", "Schoolbooks" and "Fiction". In this case too, each product subcategory is assigned an ID like 1, 2, 3 and so on.
Transactions --- This is another table which has the following columns :
total_amt : It contains amount paid by customer when a transaction took place.
Qty : It contains quantities ordered by customer of a particular product.
prod_subcat_id : It contains products' subcategories ID
prod_cat_id : It contains the products' category ID.
Cust_ID : It contains customer ID [Irrelevant column in case of this question]
What I did was break this question into 2 parts and write 2 separate queries, given below. I am not able to figure out how to join these 2 queries to get the output.
In Query1, I have fetched all the product subcategories with their average and total revenue.
In Query2, I have fetched the top 5 product categories based on quantities sold.
Now I feel that Query2 can be used as a subquery inside the WHERE clause of Query1.
But it may require some modifications, because as far as I know ORDER BY can't be used in a subquery, and the result of a subquery must be a single value.
Therefore, I need some help with how to combine or modify these queries to get the result.
-- Query1
select P.prod_subcat as Product_SubCategory,
AVG(cast(total_amt as float)) as Average_Revenue,
SUM(cast(total_amt as float)) as Total_Revenue
from Transactions as T
INNER JOIN prod_Cat_info as P
ON T.prod_cat_code = P.prod_cat_code AND T.prod_subcat_code = P.prod_sub_cat_code
group by P.prod_subcat
-- Query2
select top 5 P.prod_cat, sum(Cast(Qty as int)) AS Quantities_sold from
prod_cat_info as P
inner join Transactions as T
ON P.prod_cat_code = T.prod_cat_code AND P.prod_sub_cat_code = T.prod_subcat_code
group by P.prod_cat
order by sum(Cast(Qty as int)) desc
If you have a TOP operator with ORDER BY, which is exactly your case, then you can use ORDER BY in a subquery, because in this case the ORDER BY is used to determine the rows returned by the TOP clause.
And for multiple values you can use the IN operator:
select P.prod_subcat as Product_SubCategory,
AVG(cast(total_amt as float)) as Average_Revenue,
SUM(cast(total_amt as float)) as Total_Revenue
from Transactions as T
INNER JOIN prod_Cat_info as P
ON T.prod_cat_code = P.prod_cat_code AND T.prod_subcat_code = P.prod_sub_cat_code
WHERE P.prod_cat_code IN (
select top 5 P.prod_cat_code
from prod_cat_info as P
inner join Transactions as T
ON P.prod_cat_code = T.prod_cat_code AND P.prod_sub_cat_code = T.prod_subcat_code
group by P.prod_cat_code
order by sum(Cast(Qty as int)) desc
)
group by P.prod_subcat
Select prod_cat, prod_subcat, avg(total_amt) as average_amount, sum(total_amt) as total_amount
From transactions as t
inner join prod_cat_info as p
on t.prod_subcat_code = p.prod_sub_cat_code and t.prod_cat_code = p.prod_cat_code
Where prod_cat in
(Select Top 5 prod_cat
From transactions as t
inner join prod_cat_info as p
on t.prod_subcat_code = p.prod_sub_cat_code and t.prod_cat_code = p.prod_cat_code
Where total_amt > 0 and qty > 0
Group by prod_cat
Order by sum(qty) desc)
Group by prod_cat, prod_subcat
Order by prod_cat asc;
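The idea shared by both answers, a TOP ... ORDER BY subquery feeding an IN filter, can be tried out on a few rows. This is only a sketch under assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, so TOP 5 becomes LIMIT (LIMIT 2 here), the column names follow the queries above rather than the table description, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod_cat_info (
    prod_cat TEXT, prod_cat_code INTEGER,
    prod_subcat TEXT, prod_sub_cat_code INTEGER);
CREATE TABLE Transactions (
    prod_cat_code INTEGER, prod_subcat_code INTEGER,
    Qty INTEGER, total_amt REAL);
-- Invented sample data.
INSERT INTO prod_cat_info VALUES
    ('Books', 1, 'Novels', 1), ('Books', 1, 'Fiction', 2),
    ('Sports', 2, 'Balls', 1), ('Electronics', 3, 'Phones', 1);
INSERT INTO Transactions VALUES
    (1, 1, 5, 50.0), (1, 1, 1, 10.0), (1, 2, 4, 80.0),
    (2, 1, 2, 40.0), (3, 1, 7, 700.0);
""")

# The inner query ranks categories by units sold and keeps the top 2;
# the outer query aggregates revenue per subcategory of those categories.
QUERY = """
SELECT p.prod_cat, p.prod_subcat,
       AVG(t.total_amt) AS average_revenue,
       SUM(t.total_amt) AS total_revenue
FROM Transactions AS t
JOIN prod_cat_info AS p
  ON t.prod_cat_code = p.prod_cat_code
 AND t.prod_subcat_code = p.prod_sub_cat_code
WHERE p.prod_cat_code IN (
    SELECT prod_cat_code
    FROM Transactions
    GROUP BY prod_cat_code
    ORDER BY SUM(Qty) DESC
    LIMIT 2)
GROUP BY p.prod_cat, p.prod_subcat
ORDER BY p.prod_cat, p.prod_subcat
"""
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

With this data, Books (10 units) and Electronics (7 units) make the cut, so the Sports subcategory is left out of the result.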

SQL query to bring back all orders only if all order details are in a list

I have an orders table that contains a lot of order specific info that is irrelevant to the question. However, I then have an orderDetails table that has a foreign key (orders.id == orderDetails.orderId). In this orderDetails, a customer can select to order lots of flavors of a product, each flavor gets a new entry in this table linked back to the main order.
What I want to do is select all the orders where ALL the flavors are present in the order. So, if an order has apple, peach and orange and I query for apple and peach, it wouldn't return that order because orange wasn't in my query.
I have tried subqueries and so on, but I feel like the only way to solve it is with looping each order and looking at the details, but that is horribly inefficient. Any thoughts?
SELECT *
FROM orders
WHERE id IN (SELECT orderId
FROM orderdetails
WHERE flavor IN ('apple', 'peach', 'orange'))
AND isInvoiced = 1
AND isShipped = 0
AND isOnHold = 0
So, if I don't have any peach in stock, I want to see orders that do not contain any peach:
SELECT *
FROM orders
WHERE id IN (SELECT orderId
FROM orderdetails
WHERE flavor IN ('apple', 'orange'))
AND isInvoiced = 1
AND isShipped = 0
AND isOnHold = 0
The problem with the existing query here is that it returns everything because it just says, sure, you asked for apple... sure you asked for orange and this order contains those so I will return it. I need it to be ALL or nothing.
In the real database, the flavors are IDs; I just simplified it for this example.
Database tables were requested... I'll go ahead and list them as they really exist.
One more edit, this is my original failed attempt:
select * from orders WHERE id in
(select orderId from orderdetails where flavorId in
(/* list of in-stock flavor IDs */))
and isInvoiced = 1 and isShipped = 0 and isOnHold = 0
That list of IDs would change based on what flavors are actually in stock.
Basically, you can just GROUP BY the order with the condition HAVING COUNT(*) = 3, so only orders with those 3 flavors will be listed:
select *
from orders o
where exists
(
select x.orderId
from orderdetails x
where x.orderId = o.id
and x.flavor in ('apple', 'peach', 'orange')
group by x.orderId
having count(*) = 3
)
and isInvoiced = 1
and isShipped = 0
and isOnHold = 0
You can use the COUNT function to make sure all flavors are represented.
select o.*
from orders o
inner join
(
select orderId, count(*) as flavorCount
from orderdetails
where flavor in ('apple', 'peach', 'orange')
group by orderId
) as t1
on o.id = t1.orderId
and isInvoiced = 1
and isShipped = 0
and isOnHold = 0
and t1.flavorCount = 3;
It would be simpler if you have a list of out-of-stock and in-stock flavors.
So if, for example, 'peach' is out of stock and 'apple' and 'orange' are in stock, the following query will produce orders that have only 'apple' or 'orange':
SELECT * FROM orders
WHERE id IN (SELECT orderId FROM orderdetails WHERE flavor IN ('apple','orange') ) -- in stock
AND id NOT IN (SELECT orderId FROM orderdetails WHERE flavor IN ('peach') ) -- out of stock
What do you think?
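If the real requirement is "every flavor on the order is in stock", a NOT EXISTS check needs only the in-stock list. This is a different technique from the IN / NOT IN pair above, sketched with Python's sqlite3 as a stand-in for SQL Server; the sample orders are invented and the status flags are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY);
CREATE TABLE orderdetails (orderId INTEGER, flavor TEXT);
-- Invented sample data.
INSERT INTO orders VALUES (1), (2), (3);
INSERT INTO orderdetails VALUES
    (1, 'apple'), (1, 'peach'), (1, 'orange'),
    (2, 'apple'), (2, 'orange'),
    (3, 'apple');
""")

in_stock = ["apple", "orange"]
placeholders = ", ".join("?" for _ in in_stock)

# Keep an order only if it has at least one detail row and
# no detail row whose flavor is outside the in-stock list.
query = f"""
SELECT o.id
FROM orders AS o
WHERE EXISTS (SELECT 1 FROM orderdetails AS d WHERE d.orderId = o.id)
  AND NOT EXISTS (
      SELECT 1 FROM orderdetails AS d
      WHERE d.orderId = o.id AND d.flavor NOT IN ({placeholders}))
ORDER BY o.id
"""
fillable = [row[0] for row in conn.execute(query, in_stock)]
print(fillable)
```

Order 1 is dropped because it contains peach; orders 2 and 3 contain nothing but in-stock flavors and are returned.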
The question completely changed via this comment "I realize now I left out one very important part in that each order may have 1 or any number of flavors, not all have to be present."
The following query meets the original requirements: "What I want to do is select all the orders where ALL the flavors are present in the order" & "I need it to be ALL or nothing."
An order might have more than one item referring to a flavor ("apple pie", and "apple cake" for example), so I recommend you use 3 case expressions in a having clause to guard against this, whilst still achieving your objective:
select o.*
from orders as o
inner join (
select orderId
from orderdetails
group by orderId
having sum(case when flavor = 'apple' then 1 else 0 end) > 0
and sum(case when flavor = 'peach' then 1 else 0 end) > 0
and sum(case when flavor = 'orange' then 1 else 0 end) > 0
) as od on o.id = od.orderid
where o.isInvoiced = 1
and o.isShipped = 0
and o.isOnHold = 0
Note that the use of an inner join limits the orders listed to only those that refer to all 3 flavors.
This query is demonstrated here: http://rextester.com/AKMM54555
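To see why the per-flavor CASE sums are safer than a plain row count, here is a small sketch that runs both on the same rows. It uses SQLite through Python's sqlite3 as a stand-in for SQL Server; the item column and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orderdetails (orderId INTEGER, item TEXT, flavor TEXT);
-- Invented data: order 2 has two apple items but no orange.
INSERT INTO orderdetails VALUES
    (1, 'pie', 'apple'), (1, 'pie', 'peach'), (1, 'pie', 'orange'),
    (2, 'pie', 'apple'), (2, 'cake', 'apple'), (2, 'pie', 'peach'),
    (3, 'pie', 'apple'), (3, 'pie', 'orange');
""")

def order_ids(sql):
    """Run a query that returns one orderId per row and collect the ids."""
    return [row[0] for row in conn.execute(sql)]

# Counting rows: three matching rows is not the same as three flavors.
by_row_count = order_ids("""
    SELECT orderId FROM orderdetails
    WHERE flavor IN ('apple', 'peach', 'orange')
    GROUP BY orderId
    HAVING COUNT(*) = 3
    ORDER BY orderId""")

# One conditional sum per flavor: each flavor must appear at least once.
by_case_sums = order_ids("""
    SELECT orderId FROM orderdetails
    GROUP BY orderId
    HAVING SUM(CASE WHEN flavor = 'apple' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN flavor = 'peach' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN flavor = 'orange' THEN 1 ELSE 0 END) > 0
    ORDER BY orderId""")

print(by_row_count, by_case_sums)
```

The row count lets order 2 through because of its duplicate apple rows, while the CASE sums return only order 1.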

MSSQL/TSQL Joining against a subquery

I'm analyzing IIS log files from SharePoint and need to match each entry to its SPWeb.
This SQL code works for a single value (@var1):
DECLARE @var1 varchar(128);
SET @var1 = '/sites/Site1/Subsite1/Subsite2/Documents/marketing.docx';
SELECT
TOP 1 *,
charindex(urlstub, @var1) as found
FROM spwebs
WHERE
charindex(urlstub, @var1) = 1
order by
urlstub DESC;
I'm looking for a way to get this to work for a table's worth of data instead of just the single variable @var1.
Example data
IISlog: (this is the table I'd like to take the place of @var1 above)
The expected outcome of the above would be:
For each record in the IISLog table,
find the best (deepest) matching record from the spwebs table:
|table | matchingSPweb |
|---------------------------------------------------------| --------------------------------|
| /sites/Site1/Subsite1/Subsite2/Documents/marketing.docx | /sites/Site1/Subsite1/Subsite2/ |
| /sites/Site1/Subsite1/Subsite2/Documents/sales.docx | /sites/Site1/Subsite1/Subsite2/ |
| /sites/Site1/Subsite1/Subsite2/Documents/hr.docx | /sites/Site1/Subsite1/Subsite2/ |
| /sites/Site1/research/funding.docx | /sites/Site1 |
I've tried
select iislogs2.*, spwebs.urlstub
from iislogs2
inner join
(
select TOP 1 urlstub, csURIStem as found
from spwebs
where charindex(urlstub, iislogs2.csUriStem) = 1
order by urlstub DESC
) as x
on x.csuristem = iislogs2.csUriStem
but this just errors; it doesn't seem to understand csUriStem in the context of the subselect statement.
The easiest ways to fix your issue are either to change your current query to use a subquery in the select statement, e.g.:
SELECT iislogs2.*,
urlstub = (SELECT TOP 1 urlstub FROM spwebs WHERE CHARINDEX(urlstub, iislogs2.csUriStem) = 1 ORDER BY urlstub DESC)
from iislogs2;
... or change your current join to a cross apply, e.g.:
SELECT iislogs2.*, x.urlstub
from iislogs2
cross apply (SELECT TOP 1 urlstub FROM spwebs WHERE CHARINDEX(urlstub, iislogs2.csUriStem) = 1 ORDER BY urlstub DESC) AS x;
The query optimiser might do all sorts of weird sorts and spools, so one option to avoid that might be to use an explicit join with a CTE and then left join this back to your original table. For example:
WITH CTE AS
(
SELECT i.csUriStem, s.urlstub, RN = ROW_NUMBER() OVER (PARTITION BY i.csUriStem ORDER BY s.urlstub DESC)
FROM iislogs2 AS i
JOIN spwebs AS s
ON i.csUriStem LIKE s.urlstub + '%'
)
SELECT i.*, c.urlstub
FROM iislogs2 AS i
LEFT JOIN CTE AS c
ON c.csUriStem = i.csUriStem
AND c.RN = 1;
Unfortunately, with strings and substrings, it's hard to get an execution plan that is really optimal for what you want to do, but I expect this sort of query will perform better with indexes than the other two.
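For a quick check of the "deepest matching stub" logic, the correlated-subquery form can be run on a few rows. This is a sketch under assumptions: SQLite (Python's sqlite3) stands in for SQL Server, so CHARINDEX(urlstub, csUriStem) = 1 is written as instr(csUriStem, urlstub) = 1 and TOP 1 as LIMIT 1; it orders by stub length rather than by urlstub DESC to make "deepest" explicit; the spwebs rows and the unmatched log row are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE spwebs (urlstub TEXT);
CREATE TABLE iislogs2 (csUriStem TEXT);
-- Assumed site stubs; the real spwebs content is not shown in the question.
INSERT INTO spwebs VALUES
    ('/sites/Site1'),
    ('/sites/Site1/Subsite1/'),
    ('/sites/Site1/Subsite1/Subsite2/');
INSERT INTO iislogs2 VALUES
    ('/sites/Site1/Subsite1/Subsite2/Documents/marketing.docx'),
    ('/sites/Site1/research/funding.docx'),
    ('/other/readme.txt');
""")

# For each log row, pick the longest stub that the URL starts with.
query = """
SELECT i.csUriStem,
       (SELECT s.urlstub
        FROM spwebs AS s
        WHERE instr(i.csUriStem, s.urlstub) = 1
        ORDER BY length(s.urlstub) DESC
        LIMIT 1) AS matchingSPweb
FROM iislogs2 AS i
ORDER BY i.csUriStem
"""
matches = dict(conn.execute(query).fetchall())
for url, stub in matches.items():
    print(url, "->", stub)
```

A log row with no matching stub still appears, with a NULL (None) match, which is the behaviour of the subquery-in-SELECT form; CROSS APPLY would drop such rows.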

How to select distinct records from tables which are inner join with each other

I have 3 Tables
News (News_ID,Title,Article,Attatchment,Publish_Status,Sort_Order,Date,Read_Status)
Category (Category_ID,Category_Name,Parent)
I want to select the top 4 records from the News table whose Category_ID in the News_Category table is 12, and also those whose Parent in the Category table is 12.
I have used the following query:
SELECT DISTINCT TOP 4 News_Category.ID,
News.News_ID, News.Title, News.Article, News.Attatchment, News.Publish_Status, News.Sort_Order, News.Read_Status, News.[Date]
FROM News
INNER JOIN News_Category ON News_Category.News_ID = News.News_ID
INNER JOIN Category ON News_Category.Category = Category.Category_ID
WHERE News_Category.Category in
( select Category_ID
from Category
where Category_ID=12
union
select Category_ID
from Category
where Parent=12 )
But when I insert a single news item into two categories that have Parent=12, it shows duplicate data. Please help me.
Leaving out the News_Category.ID should fix it, I believe.
SELECT DISTINCT TOP 4
News.News_ID, News.Title, News.Article, News.Attatchment, News.Publish_Status, News.Sort_Order, News.Read_Status, News.[Date]
FROM News
INNER JOIN News_Category ON News_Category.News_ID = News.News_ID
INNER JOIN Category ON News_Category.Category = Category.Category_ID
WHERE News_Category.Category in
( select Category_ID
from Category
where Category_ID=12
or Parent=12 )
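The duplicates come from the join itself: a news item linked to two matching categories produces two joined rows. Here is a sketch of that effect, together with an EXISTS variant that never multiplies rows (an alternative to DISTINCT, not what the answer above uses). It assumes SQLite via Python's sqlite3 as a stand-in for SQL Server, with LIMIT 4 in place of TOP 4; the News_Category columns are taken from the queries above, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE News (News_ID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Category (
    Category_ID INTEGER PRIMARY KEY, Category_Name TEXT, Parent INTEGER);
CREATE TABLE News_Category (News_ID INTEGER, Category INTEGER);
-- Invented data: news 1 sits in two child categories of 12.
INSERT INTO News VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO Category VALUES
    (12, 'Root', NULL), (13, 'Child 1', 12),
    (14, 'Child 2', 12), (20, 'Other', 5);
INSERT INTO News_Category VALUES (1, 13), (1, 14), (2, 12), (3, 20);
""")

# Plain join: one output row per matching (news, category) pair.
joined = [r[0] for r in conn.execute("""
    SELECT n.News_ID
    FROM News AS n
    JOIN News_Category AS nc ON nc.News_ID = n.News_ID
    JOIN Category AS c ON nc.Category = c.Category_ID
    WHERE c.Category_ID = 12 OR c.Parent = 12
    ORDER BY n.News_ID""")]

# EXISTS: each news row is tested once, so it can appear only once.
via_exists = [r[0] for r in conn.execute("""
    SELECT n.News_ID
    FROM News AS n
    WHERE EXISTS (
        SELECT 1
        FROM News_Category AS nc
        JOIN Category AS c ON c.Category_ID = nc.Category
        WHERE nc.News_ID = n.News_ID
          AND (c.Category_ID = 12 OR c.Parent = 12))
    ORDER BY n.News_ID
    LIMIT 4""")]

print(joined, via_exists)
```

The join lists news 1 twice, while the EXISTS form returns each qualifying news item once without needing DISTINCT.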

Count Function with Linked Table

I am attempting to count the number of parts ordered by each of our customers in 2013. However, the returned results seem to be grouped by order number and not by trader. I am using the following select statement:
SELECT orders.traderid, COUNT(orderitems.partid) AS configuredparts
FROM orders LEFT JOIN orderitems
ON orders.id = orderitems.orderid AND orders.ordertype = orderitems.ordertype
WHERE (orderitems.partid LIKE N'P%') AND (YEAR(orders.createddate) = 2013)
GROUP BY orders.traderid, orderitems.partid, orders.ordertype
HAVING (orders.ordertype = N'SO')
ORDER BY orders.traderid
An example of my results:
traderid configuredparts
800001 3
800001 3
800001 2
800001 1
A00002 1
A00002 2
Any help is much appreciated
This answer gives the total number (count) of part ids in table [order items] by trader id in table [orders].
trader_id part_id total_parts
800001 P001 3
800001 P002 2
A00002 P001 1
If table [order items] has a qty column, you should change COUNT(oi.partid) to SUM(oi.qty).
SELECT
o.traderid as trader_id,
oi.partid as part_id,
COUNT(oi.partid) AS total_parts
FROM
orders as o
LEFT JOIN
orderitems as oi
ON
o.id = oi.orderid AND
o.ordertype = oi.ordertype
WHERE
(oi.partid LIKE N'P%') AND
(o.createddate >= '20130101') AND
(o.createddate < '20140101') AND
(o.ordertype = N'SO')
GROUP BY
o.traderid, oi.partid
ORDER BY
o.traderid, oi.partid
Last but not least, why does the caption have a linked table (server)?
If you are using a linked server you will need to use 4 part notation.
Give this a go:
SELECT orders.traderid, COUNT(orderitems.partid) AS configuredparts
FROM orders LEFT JOIN orderitems
ON orders.id = orderitems.orderid AND orders.ordertype = orderitems.ordertype
WHERE (orderitems.partid LIKE N'P%') AND (orders.createddate >= '20130101')
AND (orders.createddate < '20140101') AND (orders.ordertype = N'SO')
GROUP BY orders.traderid
ORDER BY orders.traderid
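The difference between the result in the question and the one wanted comes down to the GROUP BY list. A small sketch that runs both groupings side by side, assuming SQLite via Python's sqlite3 as a stand-in for SQL Server (so dates are ISO text strings); all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER, traderid TEXT, ordertype TEXT, createddate TEXT);
CREATE TABLE orderitems (orderid INTEGER, ordertype TEXT, partid TEXT);
-- Invented data: order 4 falls outside 2013, part X900 fails the P% filter.
INSERT INTO orders VALUES
    (1, '800001', 'SO', '2013-03-01'),
    (2, '800001', 'SO', '2013-06-01'),
    (3, 'A00002', 'SO', '2013-02-01'),
    (4, 'A00002', 'SO', '2012-12-31');
INSERT INTO orderitems VALUES
    (1, 'SO', 'P001'), (1, 'SO', 'P002'),
    (2, 'SO', 'P001'), (2, 'SO', 'X900'),
    (3, 'SO', 'P001'), (4, 'SO', 'P001');
""")

# Shared FROM / WHERE part; only SELECT and GROUP BY differ below.
BASE = """
FROM orders AS o
JOIN orderitems AS oi
  ON o.id = oi.orderid AND o.ordertype = oi.ordertype
WHERE oi.partid LIKE 'P%'
  AND o.createddate >= '2013-01-01' AND o.createddate < '2014-01-01'
  AND o.ordertype = 'SO'
"""

# Extra columns in GROUP BY give one row per trader AND part.
per_trader_and_part = conn.execute(
    "SELECT o.traderid, oi.partid, COUNT(oi.partid) " + BASE +
    "GROUP BY o.traderid, oi.partid ORDER BY o.traderid, oi.partid"
).fetchall()

# Grouping by trader alone gives one total per trader.
per_trader = conn.execute(
    "SELECT o.traderid, COUNT(oi.partid) " + BASE +
    "GROUP BY o.traderid ORDER BY o.traderid"
).fetchall()

print(per_trader_and_part)
print(per_trader)
```

Every column added to GROUP BY splits a trader's total into more rows, which is why the original query returned several lines for trader 800001.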
