I have the following table in SQL Server 2005. One order can have multiple containers. A container can be either Plastic or Wood (new types may be added in the future).
I need to list the following columns -
OrderID, ContainerType, ContainerCOUNT and ContainerID.
Since I need to list the ContainerID also, the following group by approach won’t work.
DECLARE @OrderCoarntainers TABLE (OrderID INT, ContainerID INT, ContainerType VARCHAR(10))
INSERT INTO @OrderCoarntainers VALUES (1,101,'Plastic')
INSERT INTO @OrderCoarntainers VALUES (1,102,'Wood')
INSERT INTO @OrderCoarntainers VALUES (1,103,'Wood')
INSERT INTO @OrderCoarntainers VALUES (2,104,'Plastic')
SELECT OrderID,ContainerType,COUNT(DISTINCT ContainerID) AS ContainerCOUNT
FROM @OrderCoarntainers
GROUP BY OrderID,ContainerType
What is the best way to achieve this?
Note: Upgrading SQL Server version is not an option for me.
Expected Result
You should be able to use a windowed function
SELECT OrderID,
ContainerType,
COUNT(ContainerID) OVER (PARTITION BY OrderID, ContainerType) AS ContainerCOUNT,
ContainerID
FROM @OrderCoarntainers
I really don't know the SQL Server dialect of SQL that well, but I can suggest something pretty basic that may work. It relies on a self-join, which is not optimal for performance but will get the job done if the table is not huge or performance is not critical. The real problem here is that the table design is a poor fit for the data you are managing, as this should not all be in one table. But anyway:
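SELECT o1.OrderID, o1.ContainerType, COUNT(o2.ContainerID) AS ContainerCOUNT, o1.ContainerID
FROM @OrderCoarntainers o1 JOIN @OrderCoarntainers o2
ON o1.OrderID = o2.OrderID AND o1.ContainerType = o2.ContainerType
-- every non-aggregated column in the SELECT list must also appear in the GROUP BY
GROUP BY o1.OrderID, o1.ContainerType, o1.ContainerID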
I tried using the SQL below to insert values from one table, importTable, into another table, POInvoicing. It appears that the way this query below works is it checks the POInvoicing table for any possible duplicates from the importTable and for those entries that are not duplicates, it inserts them into the table. The end result is SQL inserting duplicates that already exist in importTable. Is there a way to tell SQL Server to check the table for a possible duplicate entry, if not, add the next row. Then check the table for a duplicate entry, if not, add the next row. I know this will be slower but speed isn't an issue.
INSERT INTO POInvoicing
(VendorID, InvoiceNo)
SELECT dbo.importTable.VendorID,
dbo.importTable.InvoiceNo
FROM dbo.importTable
WHERE NOT EXISTS (SELECT VendorID,
InvoiceNo
FROM POInvoicing
WHERE POInvoicing.VendorID = dbo.importTable.VendorID AND
POInvoicing.InvoiceNo = dbo.importTable.InvoiceNo)
This isn't exactly the functionality I was hoping for. What I want is for the query to insert a row into the table and then check for "duplicates" before inserting the next row. What constitutes a duplicate in the importTable would be the combination of VendorID and InvoiceNo. There are about a dozen different columns in importTable and technically each row is distinct, so DISTINCT won't work here.
I can't simply remove duplicates from the importTable for a couple of reasons not relevant to the question above (though I can provide it if necessary), so that method is out.
If you really don't care (or refuse to tell us) how you want to decide between two rows with the same VendorID and InvoiceNo values, you can pick an arbitrary row like this:
;WITH NewRows AS
(
SELECT VendorID, InvoiceNo, InvoiceDate, /* ... other columns ... */
rn = ROW_NUMBER() OVER (PARTITION BY VendorID, InvoiceNo ORDER BY (SELECT NULL))
FROM dbo.importTable AS i
WHERE NOT EXISTS (SELECT 1 FROM dbo.POInvoicing AS p
WHERE p.VendorID = i.VendorID
AND p.InvoiceNo = i.InvoiceNo)
)
INSERT dbo.POInvoicing(VendorID, InvoiceNo, InvoiceDate /* , ... other columns ... */)
SELECT VendorID, InvoiceNo, InvoiceDate /* , ... other columns */
FROM NewRows
WHERE rn = 1;
If you later decide there is a specific row you want in the case of duplicates, you can swap out (SELECT NULL) for something else. For example, to take the row with the latest invoice date:
OVER (PARTITION BY VendorID, InvoiceNo ORDER BY InvoiceDate DESC)
Again, I wasn't asking questions here to be annoying, it was to help you get the solution you need. If you want SQL Server to pick between two duplicates, you can either tell it how to pick, or you'll have to accept arbitrary / non-deterministic results. You should not jump the fence for looping / cursors just because the first thing you tried didn't work the way you wanted it to.
Also please always specify the schema and use sensible table aliases.
Add a primary key or unique constraint to your table to prevent duplicate rows from being inserted.
You can also use the DISTINCT keyword in your SELECT query to avoid this.
Duplicate rows can also be eliminated by using GROUP BY or the ROW_NUMBER() function.
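As a rough sketch of the constraint and GROUP BY suggestions, assuming that (VendorID, InvoiceNo) is what defines a duplicate and that InvoiceDate exists in both tables, as in the other queries in this thread. The constraint name UQ_POInvoicing_Vendor_Invoice is made up for illustration, and MAX(InvoiceDate) is only one possible way to pick a value per key:

```sql
-- Assumption: (VendorID, InvoiceNo) identifies an invoice.
-- The constraint name is hypothetical. Adding it fails if the table
-- already contains duplicate (VendorID, InvoiceNo) pairs.
ALTER TABLE dbo.POInvoicing
    ADD CONSTRAINT UQ_POInvoicing_Vendor_Invoice UNIQUE (VendorID, InvoiceNo);

-- GROUP BY collapses duplicates inside the import to one row per key.
-- MAX(InvoiceDate) is a deterministic, but otherwise arbitrary, choice.
INSERT INTO dbo.POInvoicing (VendorID, InvoiceNo, InvoiceDate)
SELECT i.VendorID, i.InvoiceNo, MAX(i.InvoiceDate)
FROM dbo.importTable AS i
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.POInvoicing AS p
                  WHERE p.VendorID = i.VendorID
                    AND p.InvoiceNo = i.InvoiceNo)
GROUP BY i.VendorID, i.InvoiceNo;
```

The constraint on its own does not skip duplicates: an INSERT that tries to add a duplicate key fails as a whole, so the de-duplicating SELECT is still needed.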
Using DISTINCT Keyword
INSERT INTO POInvoicing
(VendorID, InvoiceNo, InvoiceDate)
SELECT DISTINCT dbo.importTable.VendorID,
dbo.importTable.InvoiceNo,
dbo.importTable.InvoiceDate
FROM dbo.importTable
WHERE NOT EXISTS (SELECT VendorID,
InvoiceNo
FROM POInvoicing
WHERE POInvoicing.VendorID = dbo.importTable.VendorID
AND
POInvoicing.InvoiceNo = dbo.importTable.InvoiceNo)
Try this LEFT JOIN, which keeps only the import rows that have no match in POInvoicing (an INNER JOIN on <> would instead pair each import row with every non-matching row):
INSERT INTO POInvoicing
(VendorID, InvoiceNo, InvoiceDate)
SELECT IM.VendorID,
IM.InvoiceNo,
IM.InvoiceDate
FROM dbo.importTable IM
LEFT JOIN POInvoicing S ON S.VendorID = IM.VendorID
AND
S.InvoiceNo = IM.InvoiceNo
WHERE S.VendorID IS NULL
This is a problem that has troubled me several times in the past, and I have always wondered whether a solution is possible.
I have a query using several tables, and one of the values is a mobile phone number.
I have name, address, etc. I also have income information in the table, which is used for a summary in Excel.
The problem occurs when a contact has more than one mobile number: this creates extra rows in which most of the data is duplicated, including the income.
Question: is it possible for the query to identify whether a contact has more than one number and, if so, create a new column with the second mobile number?
Effectively, this would return each contact's information on one row and create new columns.
My SQL is intermediate and I cannot think of a solution, so I thought I would ask.
Many thanks
I am pretty sure this isn't the best possible solution, since we don't know how many records you have in your dataset and I didn't have much time. It is just an idea of how you can solve your original problem of two different numbers for the same customer.
declare @t table (id int
,firstName varchar(20)
,lastName varchar(20)
,phoneNumber varchar(20)
,income money)
insert into @t values
(1,'John','Doe','1234567',50)
,(1,'John','Doe','6789856',50)
,(2,'Mike','Smith','5687456',150)
,(3,'Stela','Hodhson','3334445',500)
,(4,'Nick','Slotter','5556667',550)
,(4,'Nick','Slotter','8889991',550)
,(5,'Abraham','Lincoln','4578912',52)
,(6,'Ronald','Regan','6987456',587)
,(7,'Thomas','Jefferson','8745612',300);
with a as(
select id
,max(phoneNumber) maxPhone
from @t group by id
),
b as(
select id
,min(phoneNumber) minPhone
from @t group by id
)
SELECT distinct t.id
,t.firstName
,t.lastName
,t.income
,a.maxPhone as phoneNumber1
,case when b.minPhone = a.maxPhone then ''
else b.minPhone end as phoneNumber2
from @t t
inner join a a on a.id = t.id
inner join b b on b.id = t.id
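A possible alternative, sketched here on a cut-down copy of the same sample data: number each contact's phone numbers with ROW_NUMBER() and pivot them with MAX(CASE ...). This is a swapped-in technique rather than the approach above, and the CTE name numbered and the column rn are made up for the illustration. The multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
DECLARE @t TABLE (id INT, firstName VARCHAR(20), lastName VARCHAR(20),
                  phoneNumber VARCHAR(20), income MONEY);
INSERT INTO @t VALUES
 (1,'John','Doe','1234567',50),
 (1,'John','Doe','6789856',50),
 (2,'Mike','Smith','5687456',150);

-- Number the phone numbers within each contact (1, 2, ...).
WITH numbered AS (
    SELECT id, firstName, lastName, income, phoneNumber,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY phoneNumber) AS rn
    FROM @t
)
-- One row per contact; each numbered phone lands in its own column.
-- phoneNumber2 is NULL for contacts that have a single number.
SELECT id, firstName, lastName, income,
       MAX(CASE WHEN rn = 1 THEN phoneNumber END) AS phoneNumber1,
       MAX(CASE WHEN rn = 2 THEN phoneNumber END) AS phoneNumber2
FROM numbered
GROUP BY id, firstName, lastName, income;
```

A third number could be handled by adding another MAX(CASE WHEN rn = 3 ...) column, at the cost of fixing the maximum number of phone columns in the query text.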
Hello, I'm struggling to get the query below right. What I want is to return rows with unique names and surnames. What I get is all rows, including duplicates.
This is my SQL:
DECLARE @tmp AS TABLE (Name VARCHAR(100), Surname VARCHAR(100))
INSERT INTO @tmp
SELECT CustomerName,CustomerSurname FROM Customers
WHERE
NOT EXISTS
(SELECT Name,Surname
FROM @tmp
WHERE Name=CustomerName
AND Surname=CustomerSurname
GROUP BY Name,Surname )
Please can someone point me in the right direction here.
//Desperate (I tried without GROUP BY as well but get same result)
DISTINCT would do the trick.
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
If you only want the records that really don't have duplicates (as opposed to getting duplicates represented as a single record) you could use GROUP BY and HAVING:
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1
At first, I thought that @David's answer was what you wanted. But rereading your comments, perhaps you want all combinations of names and surnames:
SELECT n.CustomerName, s.CustomerSurname
FROM
( SELECT DISTINCT CustomerName
FROM Customers
) AS n
CROSS JOIN
( SELECT DISTINCT CustomerSurname
FROM Customers
) AS s ;
Are you doing that while your @tmp table is still empty?
If so: your entire SELECT is fully evaluated before the INSERT happens. It doesn't "run the query and get one row, insert the row, run the query and get another row, insert the row, etc."
If you want to insert only customers that have no duplicates, use that same Customers table in your NOT EXISTS clause:
SELECT c.CustomerName,c.CustomerSurname FROM Customers c
WHERE
NOT EXISTS
(SELECT 1
FROM Customers c1
WHERE c.CustomerName = c1.CustomerName
AND c.CustomerSurname = c1.CustomerSurname
AND c.Id <> c1.Id)
If you want to insert a unique set of customers, use "distinct"
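A minimal sketch of that DISTINCT variant, reusing the @tmp table variable and the Customers columns from the question (the alias c is only for readability):

```sql
DECLARE @tmp TABLE (Name VARCHAR(100), Surname VARCHAR(100));

-- The SELECT is evaluated in full first; DISTINCT then leaves one row
-- per (CustomerName, CustomerSurname) pair to be inserted.
INSERT INTO @tmp (Name, Surname)
SELECT DISTINCT c.CustomerName, c.CustomerSurname
FROM Customers AS c;
```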
Typically, if you're doing a WHERE NOT EXISTS, WHERE EXISTS, or WHERE NOT IN subquery, you should use what is called a "correlated subquery", as in ypercube's answer above, where table aliases are used for both the inner and outer tables (and the inner table is joined to the outer table). ypercube gave a good example.
And often, NOT EXISTS is preferred over NOT IN (unless the WHERE NOT IN is selecting from a totally unrelated table that you can't join on.)
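One common reason for that preference is NULL handling. The sketch below uses two made-up table variables, @main and @excluded, and assumes the default ANSI_NULLS ON setting:

```sql
DECLARE @main TABLE (val INT);
DECLARE @excluded TABLE (val INT NULL);
INSERT INTO @main VALUES (1);
INSERT INTO @main VALUES (2);
INSERT INTO @excluded VALUES (1);
INSERT INTO @excluded VALUES (NULL);

-- NOT IN: returns no rows at all. For val = 2 the comparison with NULL
-- is UNKNOWN, so the whole NOT IN predicate is never TRUE.
SELECT m.val
FROM @main AS m
WHERE m.val NOT IN (SELECT e.val FROM @excluded AS e);

-- NOT EXISTS: returns the row with val = 2, because the NULL row
-- simply never satisfies e.val = m.val.
SELECT m.val
FROM @main AS m
WHERE NOT EXISTS (SELECT 1 FROM @excluded AS e WHERE e.val = m.val);
```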
Sometimes, if you're tempted to do a WHERE EXISTS (SELECT from a small table with no duplicate values in the column), you could do the same thing by joining the main query to that table on the column you want in the EXISTS. This is not always the best or safest solution: it might make the query slower if there are many rows in that table, and it could produce many duplicate rows if there are duplicate values for that column in the joined table. In that case you'd have to add DISTINCT to the main query, which makes it sort the data on all columns. Not efficient at all.
And, similarly, the WHERE NOT IN or NOT EXISTS correlated subqueries can be replaced (often with the same or a very similar execution plan) by a LEFT OUTER JOIN to the table you were going to subquery, plus a WHERE <joined table>.<column> IS NULL.
You have to be careful using that, but you don't need a DISTINCT. Frankly, I prefer to use the WHERE NOT IN subqueries or NOT EXISTS correlated subqueries, because the syntax makes the intention clear and it's hard to go wrong.
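To make the LEFT OUTER JOIN form concrete, here is a small sketch with two hypothetical table variables, @customers and @orders, that finds customers without orders both ways:

```sql
DECLARE @customers TABLE (Id INT, Name VARCHAR(100));
DECLARE @orders TABLE (CustomerId INT);
INSERT INTO @customers VALUES (1, 'Ann');
INSERT INTO @customers VALUES (2, 'Bob');
INSERT INTO @orders VALUES (1);
INSERT INTO @orders VALUES (1);

-- Anti-join via LEFT OUTER JOIN: unmatched customers get NULLs on the
-- @orders side, and the WHERE keeps only those rows.
SELECT c.Id, c.Name
FROM @customers AS c
LEFT OUTER JOIN @orders AS o ON o.CustomerId = c.Id
WHERE o.CustomerId IS NULL;

-- The same result written as a correlated NOT EXISTS.
SELECT c.Id, c.Name
FROM @customers AS c
WHERE NOT EXISTS (SELECT 1 FROM @orders AS o WHERE o.CustomerId = c.Id);
```

Both queries return only Bob. No DISTINCT is needed in the join form, because a customer with several matching orders is filtered out entirely rather than repeated.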
And you do not need a DISTINCT in the SELECT inside such subqueries (correlated or not). It would be a waste of processing (and for WHERE EXISTS or WHERE IN subqueries, the SQL optimizer would ignore it anyway and just use the first value that matched for each row in the outer query). (Hope that makes sense.)
I have searched for paging in SQL Server. Most of the solutions I found look like this one:
What is the best way to paginate results in SQL Server
But they don't meet my needs.
Here is my situation:
I work with JasperReports. To export a report, I just pass any SELECT query into the template, and it generates the report automatically.
For example, I have a select query like this:
Select * from tableA
I don't know any of the column names in tableA, so I can't use
Select ROW_NUMBER() Over (Order By columnName)
I also don't want it ordered by any column.
Can anyone help me do this?
PS: Oracle has rownum, which is very helpful in this case.
Select * from tableA where rownum > 100 and rownum <200
Paging with Oracle
You should use ROW_NUMBER with an ORDER BY - because without an ORDER BY there is no determinism in how rows are returned. You can run the same query three times and get the results back in three different orders. Especially if merry-go-round scans come into play.
So unless you want your report to have the possibility of showing the same rows to users on multiple pages, or some rows never on any page, you need to find a way to order the result set to make it deterministic.
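A minimal sketch of deterministic paging along those lines, assuming the table has a unique column to order by. The table variable @test, its columns, and the page bounds @first and @last are made up for the example:

```sql
DECLARE @test TABLE (id INT PRIMARY KEY, value1 VARCHAR(100));
INSERT INTO @test VALUES (1, 'a');
INSERT INTO @test VALUES (2, 'b');
INSERT INTO @test VALUES (3, 'c');
INSERT INTO @test VALUES (4, 'd');
INSERT INTO @test VALUES (5, 'e');

DECLARE @first INT, @last INT;
SET @first = 3;  -- first row of the requested page
SET @last = 4;   -- last row of the requested page

-- Ordering by a unique column gives every row a stable row number,
-- so a row cannot appear on two pages or be missing from all of them.
SELECT p.id, p.value1
FROM (SELECT ROW_NUMBER() OVER (ORDER BY t.id) AS rownumber,
             t.id, t.value1
      FROM @test AS t) AS p
WHERE p.rownumber BETWEEN @first AND @last
ORDER BY p.rownumber;
```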
In my opinion, you can query the metadata to find out which columns a table has, and then pick a suitable one for the ORDER BY to depend on.
For a script that lists the columns of a table, refer to: How can I get column names from a table in SQL Server?
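One way to do that lookup is through the INFORMATION_SCHEMA views. The sketch below assumes the table is dbo.tableA (the schema name is a guess) and lists all columns, then the primary key columns, which are usually the safest choice for the ORDER BY if the table has a primary key:

```sql
-- All columns of the table, in declaration order.
SELECT c.COLUMN_NAME, c.ORDINAL_POSITION, c.DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE c.TABLE_SCHEMA = 'dbo'      -- assumed schema
  AND c.TABLE_NAME = 'tableA'
ORDER BY c.ORDINAL_POSITION;

-- Primary key columns, if any: unique by definition, so ordering by
-- them gives a deterministic row order.
SELECT k.COLUMN_NAME
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS k
  ON k.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA
 AND k.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
WHERE tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
  AND tc.TABLE_SCHEMA = 'dbo'     -- assumed schema
  AND tc.TABLE_NAME = 'tableA'
ORDER BY k.ORDINAL_POSITION;
```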
Check out this link
http://msdn.microsoft.com/en-us/library/ms186734.aspx
SQL Server has a similar function, ROW_NUMBER, though it behaves a bit differently.
SQL Server provides no guarantee of row order unless you have specified a column in the ORDER BY clause. I would recommend that you use an ORDER BY clause on a column that has unique values.
Thanks for all your help. Because an ORDER BY is required when paging in MS SQL Server, I used ResultSetMetaData to get the column names and did the paging that way.
You can use the query below as well.
declare @test table(
id int,
value1 varchar(100),
value2 int)
insert into @test values(1,'10/50',50)
insert into @test values(2,'10/60',60)
insert into @test values(3,'10/60',61)
insert into @test values(4,'10/60',10)
insert into @test values(5,'10/60',11)
insert into @test values(6,'10/60',09)
select *
from ( select row_number() over (order by (select 0)) as rownumber,* from @test )test
where test.rownumber<=5
Please consider the below example:
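CREATE VIEW VW_YearlySales
AS
-- TotalAmount is an illustrative alias: a view needs a name for every column
SELECT 2011 AS YearNo, ProductID, SUM(Amount) AS TotalAmount FROM InvoiceTable2011 GROUP BY ProductID
UNION ALL
SELECT 2012 AS YearNo, ProductID, SUM(Amount) AS TotalAmount FROM InvoiceTable2012 GROUP BY ProductID
UNION ALL
SELECT 2013 AS YearNo, ProductID, SUM(Amount) AS TotalAmount FROM InvoiceTable2013 GROUP BY ProductID
GO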
The InvoiceTable2013 doesn't exist actually and I don't want to create it right now, it will be created automatically when recording the first invoice for year 2013.
Can anyone help me on how to specify a condition that will verify the existence of the table before doing the UNION ALL?
Many thanks for your help.
As others have correctly said, you can't achieve this with a view, because the select statement has to reference a concrete set of tables - and if any of them don't exist, the query will fail to execute.
It seems to me like your problem is more fundamental. Clearly there should conceptually be exactly one InvoiceTable, with rows for different dates. Separating this out into different logical tables by year is presumably something that's been done for optimisation (unless the columns are different, which I very much doubt).
In this case, partitioning seems like the way to remedy this problem (partitioning large tables by year/quarter/month is the canonical example). This would let you have a single InvoiceTable logically, yet specify that SQL Server should store the data behind the scenes as if it were different tables split out by year. You get the best of both worlds - an accurate model, and fast performance - and this makes your view definition simple.
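A rough sketch of what that could look like. Everything here is an assumption made for illustration: the object names pfInvoiceYear and psInvoiceYear, the columns InvoiceID and InvoiceDate, the alias TotalAmount, and placing every partition on the PRIMARY filegroup. Table partitioning also depends on the SQL Server edition (in older versions it was an Enterprise edition feature), so check that before relying on it.

```sql
-- Boundary values split the data into: before 2012, 2012, 2013 onward.
CREATE PARTITION FUNCTION pfInvoiceYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('20120101', '20130101');

-- For simplicity, map every partition to the same filegroup.
CREATE PARTITION SCHEME psInvoiceYear
AS PARTITION pfInvoiceYear ALL TO ([PRIMARY]);

-- One logical invoice table, stored by year behind the scenes.
CREATE TABLE dbo.InvoiceTable (
    InvoiceID   INT      NOT NULL,
    InvoiceDate DATETIME NOT NULL,
    ProductID   INT      NOT NULL,
    Amount      MONEY    NOT NULL
) ON psInvoiceYear (InvoiceDate);
GO

-- The view no longer needs one SELECT per year.
CREATE VIEW dbo.VW_YearlySales
AS
SELECT YEAR(InvoiceDate) AS YearNo, ProductID, SUM(Amount) AS TotalAmount
FROM dbo.InvoiceTable
GROUP BY YEAR(InvoiceDate), ProductID;
GO
```

With this layout, a new year needs at most a new boundary value in the partition function, and the view definition stays unchanged.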
No, to my knowledge it's not possible in a view; you have to use a stored procedure. In a stored procedure you can check whether the table exists and, based on that, change your SQL.
EDIT:
CREATE PROCEDURE GetYearlySales
AS
IF (EXISTS (SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'InvoiceTable2013'))
BEGIN
SELECT 2011 AS YearNo, ProductID, SUM(Amount) FROM InvoiceTable2011 GROUP BY ProductID
UNION ALL
SELECT 2012 AS YearNo, ProductID, SUM(Amount) FROM InvoiceTable2012 GROUP BY ProductID
UNION ALL
SELECT 2013 AS YearNo, ProductID, SUM(Amount) FROM InvoiceTable2013 GROUP BY ProductID
END
ELSE
BEGIN
SELECT 2011 AS YearNo, ProductID, SUM(Amount) FROM InvoiceTable2011 GROUP BY ProductID
UNION ALL
SELECT 2012 AS YearNo, ProductID, SUM(Amount) FROM InvoiceTable2012 GROUP BY ProductID
END
It looks like you want a table for every year, and you want the stored procedure's query to keep working without modifying the procedure. This is slightly risky: you will have to maintain the naming convention at all times. In this case, query INFORMATION_SCHEMA.TABLES for TABLE_NAME LIKE 'InvoiceTable%'. Get the matching names, loop through them appending the fixed SQL for each table, and then execute the dynamic SQL as is done here: http://www.vishalseth.com/post/2008/07/10/Dynamic-SQL-sp_executesql.aspx
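The loop described above might be sketched as follows. It assumes the yearly tables live in the dbo schema and are named InvoiceTable followed by a four-digit year. The variable names and the TotalAmount alias are made up, and the pattern is tightened to four digits so that unrelated tables with the same prefix are skipped.

```sql
DECLARE @sql NVARCHAR(MAX), @name SYSNAME;
SET @sql = N'';

-- Start with the first matching table name (NULL if there is none).
SELECT @name = MIN(TABLE_NAME)
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_TYPE = 'BASE TABLE'
  AND TABLE_NAME LIKE 'InvoiceTable[0-9][0-9][0-9][0-9]';

WHILE @name IS NOT NULL
BEGIN
    IF @sql <> N'' SET @sql = @sql + N' UNION ALL ';

    -- The last four characters of the table name are the year.
    -- QUOTENAME guards the table name inside the dynamic SQL.
    SET @sql = @sql
        + N'SELECT ' + RIGHT(@name, 4) + N' AS YearNo, ProductID, '
        + N'SUM(Amount) AS TotalAmount FROM dbo.' + QUOTENAME(@name)
        + N' GROUP BY ProductID';

    -- Move on to the next matching table name, if any.
    SELECT @name = MIN(TABLE_NAME)
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA = 'dbo'
      AND TABLE_TYPE = 'BASE TABLE'
      AND TABLE_NAME LIKE 'InvoiceTable[0-9][0-9][0-9][0-9]'
      AND TABLE_NAME > @name;
END

-- Run the combined query only if at least one table was found.
IF @sql <> N''
    EXEC sp_executesql @sql;
```

Wrapped in a stored procedure, this picks up InvoiceTable2013 automatically once it exists, as long as the naming convention holds.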