SQL DISTINCT didn't work - sql-server

Here I have a table called tblemployee with id, name, and salary columns. Every row has a different name (no name repeats), but every row holds the same integer value in the salary column (40,000).
Table tblemployee structure
name|salary
-----------
max |40000
rob |40000
jon |40000
Now what I want is all the names from the name column, but only one salary value from the salary column, as shown below:
name|salary
-----------
max |40000
rob |
jon |
SQL Server query I have tried, which didn't give the expected output:
select DISTINCT salary,name from tblabca

Declare @tblemployee table (name varchar(25), salary int)
Insert Into @tblemployee values
('max',40000),
('rob',40000),
('jon',40000),
('joseph',25000),
('mary',25000)

Select Name
      ,Salary = case when RN = 1 then cast(Salary as varchar(25)) else '' end
From (
      Select *
            ,RN = Row_Number() over (Partition By Salary Order By Name)
            ,DR = Dense_Rank() over (Order By Salary)
      From @tblemployee
     ) A
Order by DR Desc, RN
Returns
Name Salary
jon 40000
max
rob
joseph 25000
mary

"GROUP BY" and group_concat would suit your case. Please try like this
select salary, group_concat(name) from tblabca group by salary;
Reference: GROUP BY, GROUP_CONCAT
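Note that GROUP_CONCAT is a MySQL function and does not exist in SQL Server (the question is tagged sql-server). On SQL Server 2017 and later, STRING_AGG is the closest built-in equivalent; a minimal sketch against the question's tblemployee table:
-- SQL Server 2017+ equivalent of the MySQL GROUP_CONCAT approach (assumes the question's tblemployee table)
SELECT salary, STRING_AGG(name, ',') AS names
FROM tblemployee
GROUP BY salary;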

You will never get the result you stated, because the DISTINCT operator works on a set, not on an individual column. In a relational database you always work with sets.
So it is the combination of Salary and Name that is treated as distinct.
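To illustrate with the question's sample data (a sketch; every salary/name pair is already unique, so DISTINCT removes nothing):
-- DISTINCT compares whole rows, so all three (salary, name) pairs come back unchanged
SELECT DISTINCT salary, name
FROM tblemployee;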
But if you want, you can get the names as a comma-separated list like below:
SELECT T1.SALARY
     , STUFF((SELECT ',' + NAME
              FROM TABLEA T2
              WHERE T1.SALARY = T2.SALARY
              FOR XML PATH('')), 1, 1, '') AS NAMES
FROM TABLEA T1
GROUP BY T1.SALARY

As others have already stated, you are definitely not looking for the DISTINCT operator.
The DISTINCT operator works on the entire result row, meaning you get rows that are unique when all columns are considered together.
Although with some rework you might end up with the result you want, do you really want the result in such a non-uniform shape? Getting a full list of names in the name column but only one salary in the salary column does not look like a nice result set to work with.
Maybe you should adjust your application code to account for the change you want to make in the query.

declare @tblemployee Table(
    id int identity(1,1) primary key not null,
    name nvarchar(MAX) not null,
    salary int not null
);
declare @Result Table(
    name nvarchar(MAX) not null,
    salaryString nvarchar(MAX)
);

insert into @tblemployee(name,salary) values ('joseph',25000);
insert into @tblemployee(name,salary) values ('mary',25000);
insert into @tblemployee(name,salary) values ('Max',40000);
insert into @tblemployee(name,salary) values ('rob',40000);
insert into @tblemployee(name,salary) values ('jon',40000);

declare @LastSalary int = 0;
declare @name nvarchar(MAX);
declare @salary int;

DECLARE iterator CURSOR LOCAL FAST_FORWARD FOR
    SELECT name, salary
    FROM @tblemployee
    ORDER BY salary desc

OPEN iterator
FETCH NEXT FROM iterator INTO @name, @salary
WHILE @@FETCH_STATUS = 0
BEGIN
    IF (@salary != @LastSalary)
    BEGIN
        -- first row of a new salary group: show the salary
        SET @LastSalary = @salary
        insert into @Result(name, salaryString)
        values (@name, cast(@salary as nvarchar(25)));
    END
    ELSE
    BEGIN
        -- same salary as the previous row: leave the salary column blank
        insert into @Result(name, salaryString)
        values (@name, '');
    END
    FETCH NEXT FROM iterator INTO @name, @salary
END
CLOSE iterator
DEALLOCATE iterator

Select * from @Result

Related

How can I pass a variable reference out of a batch using a trigger so that I wouldn't need to hardcode a value?

Seems like I couldn't get my issue across properly, so I've decided to re-ask my question in a different shape. I have two tables named SALES_TABLE and PRODUCT_TABLE. Whenever I sell an item from the product table, the number of items sold (@sale_count) is subtracted from the total number of that item in stock (pr_stock) and the result is reflected in PRODUCT_TABLE, while the product id, sale count, and product name are supposed to be inserted into SALES_TABLE by the statement that fires the trigger:
INSERT INTO DBO.SALES_TABLE (SALE_COUNT, PROD_ID, Prod_name) VALUES (3, 4, @prd_name)
However, the @prd_name reference, which was initialized with Product_name from PRODUCT_TABLE where PRODUCT_id = @PRD_ID, gives an error because it is outside the BEGIN and END block:
"Must declare the scalar variable @prd_name."
So how can the @prd_name variable be passed out of the batch, so that I can avoid hardcoding Product_name into SALES_TABLE?
alter TRIGGER DBO.TRG_STOCK
ON DBO.SALES_TABLE
AFTER INSERT
AS
BEGIN
    DECLARE @SALE_COUNT INT
    DECLARE @PRD_ID INT
    DECLARE @prd_name varchar(20)
    SELECT @PRD_ID = PROD_ID, @SALE_COUNT = SALE_COUNT FROM INSERTED
    SELECT @prd_name = PRODUCT_NAME FROM PRODUCT_TABLE WHERE PRODUCT_id = @PRD_ID
    UPDATE PRODUCT_TABLE SET PR_STOCK = PR_STOCK - @SALE_COUNT WHERE PRODUCT_id = @PRD_ID
END

INSERT INTO DBO.SALES_TABLE (SALE_COUNT, PROD_ID, Prod_name) VALUES (3, 4, @prd_name)
Note that:
INSERT INTO DBO.SALES_TABLE (SALE_COUNT, PROD_ID, Prod_name) VALUES (3, 4, @prd_name)
is not part of the trigger; it is the statement that should fire the script in the BEGIN and END block.
You need to handle the case where multiple rows are inserted in a single statement, as SQL Server has statement triggers, not row triggers. So something like this:
alter TRIGGER DBO.TRG_STOCK
ON DBO.SALES_TABLE
AFTER INSERT
AS
BEGIN
with sold as
(
select prod_id, sum(sale_count) sale_count
from inserted
group by prod_id
), prod as
(
select p.prod_id, p.pr_stock, sold.sale_count
from sold
join product_table p
on sold.prod_id = p.prod_id
)
update prod set pr_stock = pr_stock - sale_count;
END
Additionally, you appear to be storing the product_name in the sales table alongside the product_id, which you shouldn't need to do. But you can make that work by changing the joins.
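For the statement that fires the trigger, one way to avoid the @prd_name variable entirely is to let the INSERT look the name up itself (a sketch using the question's table and column names; this is not part of the trigger above):
-- No scalar variable needed: pull the product name as part of the INSERT that fires the trigger
INSERT INTO DBO.SALES_TABLE (SALE_COUNT, PROD_ID, Prod_name)
SELECT 3, p.PRODUCT_id, p.PRODUCT_NAME
FROM DBO.PRODUCT_TABLE p
WHERE p.PRODUCT_id = 4;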

Inserting a random number of rows in SQL Server via a join to an integer list is inconsistent

I am creating a database with sample data. Each time I run the stored procedure to generate some new data for my sample database, I would like to clear out and repopulate table B ("Item") based on all the rows in table A ("Product").
If table A contained the rows with primary key values 1, 2, 3, 4, and 5, I would want table B to have a foreign key for table A and insert a random number of rows into table B for each table A row. (We are essentially stocking the shelves with a random number of "item" for any given "product.")
I am using code from this answer to generate a list of numbers. I join to the results of this function to create the rows to insert:
WITH cte AS
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS i
    FROM
        sys.columns c1 CROSS JOIN sys.columns c2 CROSS JOIN sys.columns c3
)
SELECT i
FROM cte
WHERE
    i BETWEEN @p_Min AND @p_Max AND
    i % @p_Increment = 0
Random numbers are generated in a view (to get around the limitations of functions) as follows:
-- Mock.NewGuid view
SELECT id = ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT))
And a function that returns the random numbers:
-- Mock.GetRandomInt(min, max) function definition
DECLARE @random int;
SELECT @random = Id % (@MaxValue - @MinValue + 1) FROM Mock.NewGuid;
RETURN @random + @MinValue;
However, when you look at this code and execute it...
WITH Products AS
(
SELECT ProductId, ItemCount = Mock.GetRandomInt(1,5)
FROM Product.Product
)
SELECT A = Products.ProductId, B = i
FROM Products
JOIN (SELECT i FROM Mock.GetIntList(1,5,1)) Temp ON
i < Products.ItemCount
ORDER BY ProductId, i
... this returns some inconsistent results!
A,B
1,1
1,2
1,3
2,1
2,2
3,2 <-- where is 1?
3,3
4,1
5,3 <-- where is 1, 2?
6,1
I would expect that, for every product id, the JOIN results in 1-5 rows. However, it seems like values get skipped! This is even more apparent with larger data sets. I was originally trying to generate 20-50 rows in Item for each Product row, but this resulted in only 30-40 rows for each product.
The question: Any idea why this is happening? Each product should have a random number of rows (between 1 and 5) inserted for it and the B value should be sequential! Instead, some numbers are missing!
This issue also happens if I store numbers in a table I created and then join to that, or if I use a recursive CTE.
I am using SQL Server 2008R2, but I believe I see the same issue on my 2012 database as well. Compatibility levels are 2008 and 2012 respectively.
This is a fun problem. I've dealt with this in a roundabout way a number of times. I am sure there is a way to avoid a cursor, but why not. This is a cheap problem memory-wise so long as @RandomMaxRecords doesn't get huge or you have a significant number of product records. If the data in the Item table is meaningless, then I would suggest truncating it where I define the #Item hash table. And obviously you will pull from your Product table, not the #Product hash table I have created for testing.
This is a fantastic article and describes in detail how I arrive at my solution. Less Than Dot Blog
CODE
--This is your product table with 5 random products
IF OBJECT_ID('tempdb..#Product') IS NOT NULL DROP TABLE #Product
CREATE TABLE #Product
(
ProductID INT PRIMARY KEY IDENTITY(1,1),
ProductName VARCHAR(25),
ProductDescription VARCHAR(max)
)
INSERT INTO #Product (ProductName,ProductDescription) VALUES ('Product Name 1','Product Description 1'),
('Product Name 2','Product Description 2'),
('Product Name 3','Product Description 3'),
('Product Name 4','Product Description 4'),
('Product Name 5','Product Description 5')
--This is your item table. This would probably just be a truncate statement so that your table is reset for the new values to go in
IF OBJECT_ID ('tempdb..#Item') IS NOT NULL DROP TABLE #Item
CREATE TABLE #Item
(
ItemID INT PRIMARY KEY IDENTITY(1,1),
FK_ProductID INT NOT NULL,
ItemName VARCHAR(25),
ItemDescription VARCHAR(max)
)
--Declare a bunch of variables for the cursor and insert into the item table process
DECLARE @ProductID INT
DECLARE @ProductName VARCHAR(25)
DECLARE @ProductDescription VARCHAR(max)
DECLARE @RandomItemCount INT
DECLARE @RowEnumerator INT
DECLARE @RandomMaxRecords INT = 10
--We declare a cursor to iterate over the records in product and generate random amounts of items
DECLARE ItemCursor CURSOR
FOR SELECT * FROM #Product
OPEN ItemCursor
FETCH NEXT FROM ItemCursor INTO @ProductID, @ProductName, @ProductDescription
WHILE (@@FETCH_STATUS <> -1)
BEGIN
    --Get the random number into the variable. And we only want 1 or more records. Mod division will produce a 0.
    SELECT @RandomItemCount = ABS(CHECKSUM(NewID())) % @RandomMaxRecords
    SELECT @RandomItemCount = CASE @RandomItemCount WHEN 0 THEN 1 ELSE @RandomItemCount END
    --Iterate on the RowEnumerator to the RandomItemCount and insert item rows
    SET @RowEnumerator = 1
    WHILE (@RowEnumerator <= @RandomItemCount)
    BEGIN
        INSERT INTO #Item (FK_ProductID, ItemName, ItemDescription)
        SELECT @ProductID, REPLACE(@ProductName,'Product','Item'), REPLACE(@ProductDescription,'Product','Item')
        SELECT @RowEnumerator = @RowEnumerator + 1
    END
    FETCH NEXT FROM ItemCursor INTO @ProductID, @ProductName, @ProductDescription
END
CLOSE ItemCursor
DEALLOCATE ItemCursor
GO
--Look at the result
SELECT
*
FROM
#Product AS P
RIGHT JOIN #Item AS I ON (P.ProductID = I.FK_ProductID)
--Cleanup
DROP TABLE #Product
DROP TABLE #Item
It looks like a LEFT OUTER JOIN to GetIntList (as opposed to INNER JOIN) fixes the problem I am having.
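Applied to the query from the question, that change is just swapping the join type (a sketch of the fix described above; whether the random values behave differently can depend on the query plan, so verify the output on your build):
-- Same query as before, but with LEFT OUTER JOIN instead of INNER JOIN
WITH Products AS
(
    SELECT ProductId, ItemCount = Mock.GetRandomInt(1,5)
    FROM Product.Product
)
SELECT A = Products.ProductId, B = i
FROM Products
LEFT OUTER JOIN (SELECT i FROM Mock.GetIntList(1,5,1)) Temp
    ON i < Products.ItemCount
ORDER BY ProductId, i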

How to query with unknown combination of optional parameters in SQL Server without cursors?

I have a search that has three input fields (for arguments sake, let's say LastName, Last4Ssn, and DateOfBirth). These three input fields are in a dynamic grid where the user can choose to search for one or more combinations of these three fields. For example, a user might search based on the representation below:
LastName Last4Ssn DateOfBirth
-------- -------- -----------
Smith NULL 1/1/1970
Smithers 1234 NULL
NULL 5678 2/2/1980
In the example, the first row represents a search by LastName and DateOfBirth; the second, by LastName and Last4Ssn; and the third, by Last4Ssn and DateOfBirth. This example is a bit contrived, as the real-world scenario has four fields. At least two of the fields must be filled with search data (don't worry about how that is validated), and it is possible that all fields are filled out.
Without using cursors, how does one use that data to join to existing tables using the given values in each row as the filter? Currently, I have a cursor that goes through each row of the above table, performs the join based on the columns that have values, and inserts the found data into a temp table. Something like this:
CREATE TABLE #results (
Id INT,
LastName VARCHAR (26),
Last4Ssn VARCHAR (4),
DateOfBirth DATETIME
)
DECLARE @lastName VARCHAR (26)
DECLARE @last4Ssn VARCHAR (4)
DECLARE @dateOfBirth DATETIME

DECLARE search CURSOR FOR
    SELECT LastName, Last4Ssn, DateOfBirth
    FROM #searchData

OPEN search
FETCH NEXT FROM search
INTO @lastName, @last4Ssn, @dateOfBirth
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #results
    SELECT s.Id, s.LastName, s.Last4Ssn, s.DateOfBirth
    FROM SomeTable s
    WHERE Last4Ssn = ISNULL(@last4Ssn, Last4Ssn)
      AND DateOfBirth = ISNULL(@dateOfBirth, DateOfBirth)
      AND (
            LastName = ISNULL(@lastName, LastName)
            OR LastName LIKE @lastName + '%'
          )
    FETCH NEXT FROM search
    INTO @lastName, @last4Ssn, @dateOfBirth
END
CLOSE search
DEALLOCATE search
I was hoping there was some way to avoid the cursor to make the code a bit more readable. Performance is not an issue as the table used to search will never have more than 5-10 records in it, but I would think that for more than a few, it'd be more efficient to query the data all at once rather than one row at a time. The SomeData table in my example can be very large.
I don't see why you can't just join the two tables together:
CREATE TABLE #results (
Id INT,
LastName VARCHAR (26),
Last4Ssn VARCHAR (4),
DateOfBirth DATETIME
)
INSERT INTO #results
select s.id, s.lastname, s.last4ssn, s.dateofbirth
from SomeTable s
join #searchData d
ON s.last4ssn = isnull(d.last4ssn, s.last4ssn)
AND s.dateofbirth = isnull(d.dateofbirth, s.dateofbirth)
AND (s.lastname = isnull(d.lastname, s.lastname)
     OR s.lastname like d.lastname + '%')
EDIT:
Since the data is large, we'll need some good indices. One index isn't good enough since you effectively have 3 clauses OR'd together. So the first step is to create those indices:
CREATE TABLE SomeData (
Id INT identity(1,1),
LastName VARCHAR (26),
Last4Ssn VARCHAR (4),
DateOfBirth DATETIME
)
create nonclustered index ssnlookup on somedata (last4ssn)
create nonclustered index lastnamelookup on somedata (lastname)
create nonclustered index doblookup on somedata (dateofbirth)
The next step involves crafting the query to use those indices. I'm not sure what the best way here is, but I think it's to have 4 queries union'd together:
with searchbyssn as (
select somedata.* from somedata join #searchData
on somedata.last4ssn = #searchData.last4ssn
), searchbyexactlastname as (
select somedata.* from somedata join #searchData
on somedata.lastname = #searchData.lastname
), searchbystartlastname as (
select somedata.* from somedata join #searchData
on somedata.lastname like #searchdata.lastname + '%'
), searchbydob as (
select somedata.* from somedata join #searchData
on somedata.dateofbirth = #searchData.dateofbirth
), s as (
select * from searchbyssn
union select * from searchbyexactlastname
union select * from searchbystartlastname
union select * from searchbydob
)
select s.id, s.lastname, s.last4ssn, s.dateofbirth
from s
join #searchData d
ON (d.last4ssn is null or s.last4ssn = d.last4ssn)
AND s.dateofbirth = isnull(d.dateofbirth, s.dateofbirth)
AND (s.lastname = isnull(d.lastname, s.lastname)
OR s.lastname like d.lastname + '%')
Here's a fiddle showing the 4 index seeks: http://sqlfiddle.com/#!6/3741d/3
It shows significant resource usage for the union, but I think that would be negligible compared to the index scans for large tables. It wouldn't let me generate more than a few hundred rows of sample data. Since the number of result rows is small, it is not expensive to join to #searchData at the end and filter all the results again.

coalesce two records into one

I have a table that stores two values, 'total' and 'owing', for each customer. Data is uploaded to the table using two files: one brings in 'total' and the other brings in 'owing'. This means I have two records for each customerID:
customerID    Total    Owing
----------    -----    -----
1234          1000     NULL
1234          NULL     200
I want to write a stored procedure that merges the two records together:
customerID    Total    Owing
----------    -----    -----
1234          1000     200
I have seen examples using COALESCE, so I put together something like this:
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    --Variable declarations
    DECLARE @customer_id varchar(20)
    DECLARE @total decimal(15,8)
    DECLARE @owing decimal(15,8)
    DECLARE @customer_name_date varchar(255)
    DECLARE @organisation varchar(4)
    DECLARE @country_code varchar(2)
    DECLARE @created_date datetime

    --Other Variables
    DECLARE @totals_staging_id int

    --Get the id of the first row in the staging table
    SELECT @totals_staging_id = MIN(totals_staging_id)
    FROM TOTALS_STAGING

    --iterate through the staging table
    WHILE @totals_staging_id is not null
    BEGIN
        UPDATE TOTALS_STAGING
        SET total = coalesce(@total, total),
            owing = coalesce(@owing, owing)
        WHERE totals_staging_id = @totals_staging_id
    END
END
Any Ideas?
SELECT t1.customerId, t1.total, t2.owing FROM test t1 JOIN test t2 ON ( t1.customerId = t2.customerId) WHERE t1.total IS NOT NULL AND t2.owing IS NOT NULL
I'm wondering why you aren't just using an UPDATE when the second file is loaded?
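A minimal sketch of that idea, assuming the second file lands in its own staging table (OWING_STAGING and the customer_id column are assumed names; only TOTALS_STAGING appears in the question):
-- Hypothetical second-file load: update the existing row instead of inserting another one
UPDATE t
SET t.owing = s.owing
FROM TOTALS_STAGING t
JOIN OWING_STAGING s          -- assumed name for wherever the 'owing' file is loaded
    ON s.customer_id = t.customer_id;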
Except for COUNT, aggregate functions ignore null values. Aggregate functions are frequently used with the GROUP BY clause of the SELECT statement. (MSDN)
So you don't need to worry about null values when summing. The following will merge your records together. Fiddle demo:
select customerId,
sum(Total) Total,
sum(Owing) Owing
from T
Group by customerId
Try this :
CREATE TABLE #Temp
(
CustomerId int,
Total int,
Owing int
)
insert into #Temp
values (1024,100,null),(1024,null,200),(1025,10,null)
Create Table #Final
(
CustomerId int,
Total int,
Owing int
)
insert into #Final
values (1025,100,50)
MERGE #Final AS F
USING
(SELECT customerid,sum(Total) Total,sum(owing) owing FROM #Temp
group by #Temp.customerid
) AS a
ON (F.customerid = a.customerid)
WHEN MATCHED THEN UPDATE SET F.Total = F.Total + isnull(a.Total,0)
,F.Owing = F.Owing + isnull(a.Owing,0)
WHEN NOT MATCHED THEN
INSERT (CustomerId,Total,Owing)
VALUES (a.customerid,a.Total,a.owing);
select * from #Final
drop table #Temp
drop table #Final
This should work:
SELECT CustomerID,
COALESCE(total1, total2) AS Total,
COALESCE(owing1, owing2) AS Owing
FROM
(SELECT row1.CustomerID AS CustomerID,
row1.Total AS total1,
row2.Total AS total2,
row1.Owing AS owing1,
row2.Owing AS owing2
FROM YourTable row1 INNER JOIN YourTable row2 ON row1.CustomerID = row2.CustomerID
WHERE row1.Total IS NULL AND row2.Total IS NOT NULL) temp
--Note: Alter the WHERE clause as necessary to ensure row1 and row2 are unique.
...but note that you'll need some mechanism to ensure row1 and row2 are unique. My WHERE clause is an example based on the data you provided. You'll need to tweak this to add something more specific to your business rules.

Can I use a @table variable in SQL Server Report Builder?

Using SQL Server 2008 Reporting services:
I'm trying to write a report that displays some correlated data, so I thought to use a @table variable like so:
DECLARE @Results TABLE (Number int
                       ,Name nvarchar(250)
                       ,Total1 money
                       ,Total2 money
                       )

insert into @Results (Number, Name, Total1)
select number, name, sum(total)
from table1
group by number, name

update r
set r.Total2 = s.total2
from @Results r
join (select number, total2 = sum(total) from table2 group by number) s
    on s.number = r.Number

select * from @Results
However, Report Builder keeps asking me to enter a value for the variable @Results. Is this at all possible?
EDIT: As suggested by KM, I've used a stored procedure to solve my immediate problem, but the original question still stands: can I use @table variables in Report Builder?
No. Report Builder will second-guess you and treat @Results as a parameter.
Put all of that in a stored procedure and have Report Builder call that procedure. If you have many rows to process you might be better off (performance-wise) with a #temp table where you create a clustered primary key on Number (or would it be Number + Name? I'm not sure from your example code).
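A minimal sketch of that wrapper, reusing the query from the question (the procedure name is made up; the Report Builder dataset would just be EXEC dbo.GetTotalsReport):
-- Hypothetical wrapper procedure: Report Builder calls this, so it never sees the @Results variable
CREATE PROCEDURE dbo.GetTotalsReport
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Results TABLE (Number int, Name nvarchar(250), Total1 money, Total2 money);

    INSERT INTO @Results (Number, Name, Total1)
    SELECT number, name, SUM(total)
    FROM table1
    GROUP BY number, name;

    UPDATE r
    SET r.Total2 = s.total2
    FROM @Results r
    JOIN (SELECT number, total2 = SUM(total) FROM table2 GROUP BY number) s
        ON s.number = r.Number;

    SELECT Number, Name, Total1, Total2 FROM @Results;
END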
EDIT
You could try to do everything in one SELECT and send that to Report Builder; this should be the fastest approach (no temp tables):
select
dt.number, dt.name, dt.total1, s.total2
from (select
number, name, sum(total) AS total1
from table1
group by number, name
) dt
LEFT OUTER JOIN (select
number, sum(total) AS total2
from table2
GROUP BY number --<<OP code didn't have this, but is it needed??
) s ON dt.number=s.number
I've seen this problem as well. It seems SSRS is a bit case-sensitive. If you ensure that your table variable is declared and referenced everywhere with the same letter casing, you will clear up the prompt for a parameter.
You can use table variables in an SSRS dataset query, as in my code below, where I add the "empty" records needed to keep the group footer in a fixed position (the sample uses the pubs database):
DECLARE @NumberOfLines INT
DECLARE @RowsToProcess INT
DECLARE @CurrentRow INT
DECLARE @CurRow INT
DECLARE @cntMax INT
DECLARE @NumberOfRecords INT
DECLARE @SelectedType char(12)
DECLARE @varTable TABLE (# int, type char(12), ord int)
DECLARE @table1 TABLE (type char(12), title varchar(80), ord int)
DECLARE @table2 TABLE (type char(12), title varchar(80), ord int)

INSERT INTO @varTable
SELECT count(type) as '#', type, count(type) FROM titles GROUP BY type ORDER BY type

SELECT @cntMax = max(#) from @varTable

INSERT into @table1 (type, title, ord) SELECT type, N'', 1 FROM titles
INSERT into @table2 (type, title, ord) SELECT type, title, 1 FROM titles

SET @CurrentRow = 0
SET @SelectedType = N''
SET @NumberOfLines = @RowsPerPage

SELECT @RowsToProcess = COUNT(*) from @varTable

WHILE @CurrentRow < @RowsToProcess
BEGIN
    SET @CurrentRow = @CurrentRow + 1

    SELECT TOP 1 @NumberOfRecords = ord, @SelectedType = type
    FROM @varTable WHERE type > @SelectedType

    SET @CurRow = 0
    WHILE @CurRow < (@NumberOfLines - @NumberOfRecords % @NumberOfLines) % @NumberOfLines
    BEGIN
        SET @CurRow = @CurRow + 1
        INSERT into @table2 (type, title, ord)
        SELECT type, '', 2
        FROM @varTable WHERE type = @SelectedType
    END
END

SELECT type, title FROM @table2 ORDER BY type ASC, ord ASC, title ASC
Why can't you just UNION the two resultsets?
How about using a table valued function rather than a stored proc?
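A sketch of what that could look like, reusing the single-SELECT form from the earlier answer (the function name is made up):
-- Hypothetical inline table-valued function; the Report Builder dataset can simply SELECT from it
CREATE FUNCTION dbo.fnTotalsReport()
RETURNS TABLE
AS
RETURN
(
    SELECT dt.number, dt.name, dt.total1, s.total2
    FROM (SELECT number, name, SUM(total) AS total1
          FROM table1
          GROUP BY number, name) dt
    LEFT OUTER JOIN (SELECT number, SUM(total) AS total2
                     FROM table2
                     GROUP BY number) s ON dt.number = s.number
);
-- usage in the dataset query: SELECT * FROM dbo.fnTotalsReport()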
It's possible; just declare your table variable with a '@@' prefix. Example:
DECLARE @@results TABLE (Number int
                        ,Name nvarchar(250)
                        ,Total1 money
                        ,Total2 money
                        )

insert into @@results (Number, Name, Total1)
select number, name, sum(total)
from table1
group by number, name

update r
set r.Total2 = s.total2
from @@results r
join (select number, total2 = sum(total) from table2 group by number) s
    on s.number = r.Number

select * from @@results
