Top results from multiple elections in T-SQL - sql-server

I have a table with votes from multiple sectors (think of each sector as a state or similar) on multiple candidates. Each sector has multiple candidates, each with a different vote count.
Here is my table (simplified)
CREATE TABLE [Results]
(
[SectorID] BIGINT,
[CandidateID] BIGINT,
[VoteCount] BIGINT,
[Newness] DATETIME
)
I keep sector and candidate metadata in another table, but I need to find the highest-voted candidate for each sector so that I can join those tables into a view.
The best candidate in each sector is determined by [VoteCount]; if two candidates have the same number of votes, the tie is broken by [Newness]. The result must contain exactly one row per sector, and I have to be able to use it in a view, joined with the metadata.
How do I obtain the highest voted candidate from each sector?

Assuming that you are using SQL Server 2005 or greater, you want to do this with row_number():
select r.*
from (select r.*,
row_number() over (partition by SectorId order by VoteCount desc, Newness desc) as seqnum
from Results r
) r
where seqnum = 1
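
Since the question asks for something usable in a view, here is a minimal sketch of how the row_number() query might be wrapped in one. The metadata tables Sectors(SectorID, SectorName) and Candidates(CandidateID, CandidateName) and the view name SectorWinners are hypothetical, because the question does not show them; the sketch also assumes the candidate column is spelled CandidateID.

```sql
-- Hypothetical metadata tables (not shown in the question):
--   Sectors(SectorID, SectorName), Candidates(CandidateID, CandidateName)
CREATE VIEW SectorWinners
AS
WITH Ranked AS (
    SELECT r.SectorID,
           r.CandidateID,
           r.VoteCount,
           -- Rank candidates inside each sector: most votes first,
           -- ties broken by the newest row.
           ROW_NUMBER() OVER (PARTITION BY r.SectorID
                              ORDER BY r.VoteCount DESC, r.Newness DESC) AS seqnum
    FROM Results AS r
)
SELECT s.SectorID, s.SectorName, c.CandidateID, c.CandidateName, k.VoteCount
FROM Ranked AS k
INNER JOIN Sectors AS s ON s.SectorID = k.SectorID
INNER JOIN Candidates AS c ON c.CandidateID = k.CandidateID
WHERE k.seqnum = 1;  -- keep only the top-ranked row per sector
```

Because ROW_NUMBER() assigns distinct numbers within a partition, this keeps exactly one row per sector even when both VoteCount and Newness tie; in that case the winner is chosen arbitrarily.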

This would work and can be joined:
SELECT
TR.SectorID, TR.CandidateID
FROM
tblResults TR
INNER JOIN ...
INNER JOIN ...
WHERE CandidateID =
(
SELECT TOP 1 CandidateID
FROM tblResults TRSUB
WHERE TRSUB.SectorID = TR.SectorID
ORDER BY VoteCount DESC, Newness DESC
)

Related

SQL Project using a where clause

I'm new to SQL and still learning, and I've been stuck on this for a few days now. Any advice would be appreciated. I attached images of the tables and the goal I'm trying to achieve.
OrderItem And Product Table
Order And OrderItem Table(https://i.stack.imgur.com/pdbMT.png)
Scenario: Our boss would like to see the OrderNumber, OrderDate, ProductName, UnitPrice and Quantity for products that have TotalAmounts larger than the average.
Create a query with a subquery in the WHERE clause. OrderNumber, OrderDate and TotalAmount come from the Order table. ProductName comes from the Product table. UnitPrice and Quantity come from the OrderItem table.
This is the code I came up with, but it repeats product names endlessly and displays the wrong information.
USE TestCorp;
SELECT DISTINCT OrderNumber,
OrderDate,
ProductName,
i.UnitPrice,
Quantity,
TotalAmount
FROM [Order], Product
JOIN OrderItem i ON Product.UnitPrice = i.UnitPrice
WHERE TotalAmount < ( SELECT AVG(TotalAmount)
FROM [Order]
)
ORDER BY TotalAmount DESC;
Best guess, assuming the join keys and fields that were not provided:
SELECT O.OrderNumber, O.orderDate, P.ProductName, OI.UnitPrice, OI.Quantity, O.TotalAmount
FROM [Order] O
INNER JOIN OrderItem OI
on O.ID = OI.orderID
INNER JOIN Product P
on P.ID= OI.ProductID
CROSS JOIN (SELECT avg(TotalAmount) AvgTotalAmount FROM [Order]) z
WHERE O.TotalAmount > z.AvgTotalAmount
Notes:
You're mixing join notations. Don't use the comma syntax and INNER JOIN together; that mixes the older ANSI-89 join style with the newer ANSI-92 style.
I'm not sure why you have a cross join to Product to begin with.
You don't specify how to join Order to OrderItem.
It seems very odd to be joining on price. Join on the order ID or product ID instead, maybe?
You could cross join to an "average" result so it's available on every record. (I aliased this inline view "z" in my attempt.)
The query above starts from all orders. An order is included only if it has at least one associated order item, and an order item is included only if its ProductID matches a record in Product. If for some reason an order item doesn't have a related entry in the Product table, it is excluded.
I use a cross join to get the average because it is computed once and joined to every record.
If we put the query in the WHERE clause, it may be executed once for every record (unless the DB engine's optimizer figures it out and generates a better plan).
I assume:
Order.ID relates to OrderItem.OrderID
OrderItem.productID relates to Product.ID
Order.TotalAmount is what we are wanting to "Average" and compare against
Every Order has an Order Item entry
Every Order Item entry has a related product.
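
Since the assignment explicitly asks for a subquery in the WHERE clause, here is a sketch of that variant. It relies on the same assumed keys as the answer (Order.ID, OrderItem.OrderID, OrderItem.ProductID, Product.ID), none of which are confirmed by the question.

```sql
-- Assumed keys (not confirmed by the question):
--   [Order].ID = OrderItem.OrderID and Product.ID = OrderItem.ProductID
SELECT O.OrderNumber,
       O.OrderDate,
       P.ProductName,
       OI.UnitPrice,
       OI.Quantity,
       O.TotalAmount
FROM [Order] AS O
INNER JOIN OrderItem AS OI ON OI.OrderID = O.ID
INNER JOIN Product AS P ON P.ID = OI.ProductID
-- Uncorrelated subquery: it does not reference the outer row,
-- so it yields a single average for the whole Order table.
WHERE O.TotalAmount > (SELECT AVG(TotalAmount) FROM [Order])
ORDER BY O.TotalAmount DESC;
```

Note the comparison is `>`, matching "larger than the average"; the attempt in the question used `<`.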

Get top 10 unique vendor data based on sub query which return vendor id

I have two tables, FileMaster and VendorMaster.
VendorMaster holds the vendor id and other details. FileMaster holds file-related data, with 'Vendorid' as a foreign key. Now I want to fetch the top 10 rows from FileMaster, one per vendor (one record for one vendor).
I have tried the query below, but it returns 10 records with duplicate VendorId values:
select top 10 * from FileMaster where VendorId in (select top 10 VendorId from VendorMaster)
You can use ROW_NUMBER. I assumed a FileID column as the identity of FileMaster. By the way, you don't need the subquery against VendorMaster:
SELECT TOP 10 * FROM (
select *,
ROW_NUMBER() OVER(PARTITION BY VendorID ORDER BY FileID) AS RN
FROM FileMaster ) AS T
WHERE RN = 1
ORDER BY FileID
Here is a simple way that avoids the subquery. (Another option is to use a CTE.)
SELECT DISTINCT FM.*
FROM FileMaster FM WITH(NOLOCK)
INNER JOIN [dbo].[VendorMaster] VM WITH(NOLOCK)
ON FM.VendorId = VM.VendorId
ORDER BY FM.VendorId ASC
OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY
For more details on OFFSET, check this link.

SQL performance problem: Select N rows until find distinct 200 customer

I have table ORDERS with these columns:
id | Customer | product
I want to calculate a number N of rows in the ORDERS table. N should be just big enough that the last N rows of the table contain 200 distinct CUSTOMER values. I wrote the following query using max(ID), but it takes too many seconds to run. I think it is not optimized, because I have thousands of rows and every time I have to GROUP BY over the whole table just to find a single ID:
select count(*) as N from ORDERS where id > (
select top 1 maxid from
(select distinct top 200 CUSTOMER,max(id) as maxid from ORDERS group by CUSTOMER order by maxid desc) x
order by maxid asc
)
Is there another way to handle this with better performance?
This is a huge guess, but this will at least return a result:
WITH Customers AS(
SELECT TOP 200
CUSTOMER,
MAX(id) AS MaxID
FROM ORDERS
GROUP BY CUSTOMER
ORDER BY MaxID DESC)
SELECT COUNT(*)
FROM ORDERS O
WHERE EXISTS (SELECT 1
FROM Customers C
WHERE C.MaxID = O.id);
If it returns the correct results but still runs slowly, post the DDL of your table, including the DDL for your indexes. I also suggest posting the query plan by using Paste the Plan.
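
If the goal is the number of rows at the bottom of the table needed to cover 200 distinct customers, one possible sketch is to find the smallest "last id" among the 200 most recently seen customers and count every row from that id onward. This assumes id is unique and increases with insertion order, which the question implies but does not state; the CTE name LastOrderPerCustomer is made up for illustration.

```sql
-- Assumption: id is unique and increasing, so "bottom of the table"
-- means the rows with the largest id values.
WITH LastOrderPerCustomer AS (
    -- The 200 customers whose most recent order is newest.
    SELECT TOP (200)
           CUSTOMER,
           MAX(id) AS MaxID
    FROM ORDERS
    GROUP BY CUSTOMER
    ORDER BY MaxID DESC
)
SELECT COUNT(*) AS N
FROM ORDERS
-- Count all rows from the oldest of those 200 "last orders" onward.
WHERE id >= (SELECT MIN(MaxID) FROM LastOrderPerCustomer);
```

The comparison uses `>=` so that the boundary row, which belongs to the 200th customer, is counted. The GROUP BY still has to read every customer once, so an index on (CUSTOMER, id) may help; whether it does depends on the actual table and plan.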

Get the id of the row with the max value with two grouping

We have a data structure with four columns:
ContractorName, ProjectCode, InvoiceID, OrderID
We want to group the data by both the ContractorName and ProjectCode columns, and then get the InvoiceID of the row with MAX(OrderID) in each group.
You could use ROW_NUMBER:
SELECT ContractorName, ProjectName, OrderId, InvoiceId
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY ContractorName, ProjectName
ORDER BY OrderId DESC) AS rn
FROM tab
) AS sub
WHERE rn = 1;
ROW_NUMBER() is what I would call the canonical solution. In many cases, an old-fashioned solution has better performance:
select t.*
from t
where t.orderid = (select max(t2.orderid)
from t t2
where t2.contractorname = t.contractorname and
t2.projectname = t.projectname
);
This is especially true if there is an index on (contractorname, projectname, orderid).
Why is this faster? Basically, SQL Server can scan the table doing a lookup in an index. The lookup is really fast because the index is designed for it, so the scan is just a little faster than a full table scan.
When using row_number(), SQL Server has to scan the table to calculate the row number (and that can use the index, so it might be fast). But then it has to go back to the table to fetch the columns and apply the where clause. So, even if it uses an index, it is doing more work.
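
For reference, the index mentioned above might be created roughly as follows. The index name is made up, and the INCLUDE column is an addition that is not part of the answer: it is there so that invoiceid can be read from the index without going back to the table.

```sql
-- Hypothetical index name; key columns follow the answer's suggestion.
CREATE NONCLUSTERED INDEX IX_t_contractor_project_order
    ON t (contractorname, projectname, orderid)
    INCLUDE (invoiceid);  -- assumption: lets the index supply invoiceid as well
```

With the key ordered this way, the correlated max(orderid) lookup for each (contractorname, projectname) pair can be answered by seeking to the end of that pair's range in the index.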
EDIT:
I should also point out that this can be done without a subquery:
select distinct contractorname, projectname,
max(orderid) over (partition by contractorname, projectname) as latest_order,
first_value(invoiceid) over (partition by contractorname, projectname order by orderid desc) as latest_invoice
from t;
Unfortunately, SQL Server doesn't offer first_value() as an aggregation function, but you can use select distinct and get the same effect.

Unique vs MAX in SQL statement

I have a table with three columns:
PERSON
VISITOR
DATE
The table is basically a transactional table. The following is true:
There are multiple rows per person
There are multiple rows per visitor
There are multiple rows of a given person/visitor combination.
Assumed unique person/date combination
What I need is
I want visitor for each Person's MAX Date.
I cannot have multiple persons in the output.
Person must be unique.
visitor may repeat.
I have tried:
SELECT
ROW_NUMBER() OVER (PARTITION BY PERSON, VISITOR ORDER BY Date DESC) row_num,
PERSON,
VISITOR as VISITOR
FROM
`TABLE`
ORDER BY
PERSON
Maybe this; I'm not sure I fully understand the question. Sample data and expected results would help.
You said you want only one row per person, with the visitor from that person's max date, so the row with row_num = 1 is the record with the max date. Since we partition by person alone, it does not matter if person A had three visitors: only the person and their most recent visitor will be listed.
WITH cte as (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY Date DESC) row_num
, PERSON
, VISITOR as VISITOR
FROM `TABLE`)
SELECT *
FROM cte
WHERE row_Num = 1
I think this can be done with a CROSS APPLY too, though I'm not as good at using them yet (SQL Server syntax):
SELECT C.Person, C.Visitor, C.Date
FROM (SELECT DISTINCT Person FROM [TABLE]) A
CROSS APPLY (SELECT TOP 1 *
FROM [TABLE] B
WHERE B.Person = A.Person
ORDER BY B.Date DESC) C
Essentially, the inner query runs once for each person in the outer query and returns only the topmost record for that person, which is the one with the newest date.
select a.* from myTable as a inner join (
SELECT person, max(date) as maxDate from myTable group by person
) as b
on a.date = b.maxDate
and a.person = b.person;
I am weak in reading and writing English.
In my opinion the answer may be:
SELECT `PERSON`, `VISITOR`, MAX(`DATE`) AS `DATE`
FROM `TABLE`
GROUP BY `PERSON`, `VISITOR`;

Resources