Sybase: Get the first data in a Group


I'm having a problem getting the first row of each group in my data set. I'm using Sybase as my data source.
I tried the two queries below, but neither works.
SELECT Id, Product, RANK() OVER (PARTITION BY Id ORDER BY Id) FROM ProductTbl
SELECT Id, Product FROM ProductTbl as X
WHERE Id = (
SELECT min(Id)
FROM ProductTbl WHERE Id = X.Id
)
Below is the sample data I'm working with.
Id Product
1111 Apple
1111 Orange
1111 Banana
2222 Guava
2222 JackFruit
2222 Grape
3333 ProductA
3333 ProductB
My expected output is:
Id Product
1111 Apple
2222 Guava
3333 ProductA

Why not:
SELECT Id, MIN(Product) FROM ProductTbl GROUP BY Id
Have I misunderstood?

select Id,Product from(
SELECT
Id,
Product,
RANK() OVER (PARTITION BY Id ORDER BY Product) as first_row
FROM ProductTbl group by Id,Product)x
where first_row = 1
Output:
Id Product
1111,Apple
2222,Grape
3333,ProductA
With ORDER BY Product, Grape sorts before Guava (R comes before U), so the first row for 2222 can't be Guava.
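To check this result without a Sybase instance, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in. That substitution is an assumption: SQLite is not Sybase, but MIN with GROUP BY should behave the same way on this data. It loads the sample rows from the question and runs the MIN(Product) query suggested above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Sybase (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductTbl (Id INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO ProductTbl (Id, Product) VALUES (?, ?)",
    [
        (1111, "Apple"), (1111, "Orange"), (1111, "Banana"),
        (2222, "Guava"), (2222, "JackFruit"), (2222, "Grape"),
        (3333, "ProductA"), (3333, "ProductB"),
    ],
)

# Alphabetically smallest Product per Id.
first_per_group = conn.execute(
    "SELECT Id, MIN(Product) FROM ProductTbl GROUP BY Id ORDER BY Id"
).fetchall()

for row in first_per_group:
    print(row)
```

A table has no inherent row order, so "first" only has meaning relative to an ORDER BY. Getting Guava for 2222 would need an extra column that records the intended order.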

Related

How to update a record based on the values in the previous and next rows

Below is a table with the columns UniqueId, Number, Complete date, and UserId. The data is sorted by UserId, then Complete date.
I want to update the Number column where UniqueId = 456 to 'xxxx 1111' if the rows immediately before and after the NULL row have the same Number. If the previous or next row is NULL, or the two differ, no update should happen.
Basically, I want to order the rows by UserId and Complete date, find the rows where Number is NULL, compare the previous and next rows, and set the NULL to their shared Number if they match. If they don't match, leave the Number as NULL. Thank you.
Original
UniqueId Number Complete date UserId
----------------------------------------------------------
123 xxxx 1111 2022-03-17 11:19:07.000 11011
456 NULL 2022-03-17 11:22:50.000 11011
789 xxxx 1111 2022-03-17 11:28:32.000 11011
Expected output:
UniqueId Number Complete date UserId
----------------------------------------------------------
123 xxxx 1111 2022-03-17 11:19:07.000 11011
456 xxxx 1111 2022-03-17 11:22:50.000 11011
789 xxxx 1111 2022-03-17 11:28:32.000 11011
SQL Server 2017 - T-SQL
I have tried PARTITION BY with MIN and MAX, but I was not able to get exactly what I want.
select
UserId, completedate, number,
newremarks = max(userid) over (partition by number order by userid, completedate)
from
foo
order by
userId, CompleteDate
I tried something like the code shown above.

You could use LAG and LEAD here with an updatable CTE:
with n as (
select *,
Lag(number) over(partition by UserId order by [Complete date]) pn,
Lead(number) over(partition by UserId order by [Complete date]) nn
from t
)
update n set number = pn
from n
where pn = nn and number is null;
See Demo Fiddle
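For readers who want to trace the logic outside the database, here is a rough Python sketch of the same previous/next comparison. It is not the T-SQL above, only an emulation; the function name fill_gaps and the dictionary keys are made up for illustration. As with LAG and LEAD, neighbours are read from the original rows, so a fill never cascades into the next gap.

```python
from itertools import groupby
from operator import itemgetter


def fill_gaps(rows):
    """Return copies of rows where a None 'number' is filled in when the
    previous and next rows of the same user carry the same non-null number."""
    ordered = sorted(rows, key=itemgetter("user_id", "complete_date"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("user_id")):
        group = list(group)
        for i, row in enumerate(group):
            # Emulate LAG / LEAD within the user's partition.
            prev_number = group[i - 1]["number"] if i > 0 else None
            next_number = group[i + 1]["number"] if i + 1 < len(group) else None
            filled = dict(row)
            if (row["number"] is None
                    and prev_number is not None
                    and prev_number == next_number):
                filled["number"] = prev_number
            result.append(filled)
    return result


# Sample rows from the question (key names are hypothetical).
rows = [
    {"unique_id": 123, "number": "xxxx 1111",
     "complete_date": "2022-03-17 11:19:07.000", "user_id": 11011},
    {"unique_id": 456, "number": None,
     "complete_date": "2022-03-17 11:22:50.000", "user_id": 11011},
    {"unique_id": 789, "number": "xxxx 1111",
     "complete_date": "2022-03-17 11:28:32.000", "user_id": 11011},
]

updated = fill_gaps(rows)
for row in updated:
    print(row["unique_id"], row["number"])
```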

SQL Server : join with top record selection

Clientcode Emailaddress Accountcode clientname phoneno
----------------------------------------------------------------
AAA ragu#bib.com 100 Berjeya 90909090
AAA ragu1#bib.com 100 Berjeya 90909090
AAABBB jkkjkj#bib.com 200 Berjeya sooo 3222
CCCC dfdf#bib.com 200 Berjeya klkl 123
dddd sdsdsd#bib.com 33300 Berjeya penn 33333
This is the data in my table. Where several rows share the same client code and account code, I need to keep only one of their email addresses. For example, ragu#bib.com and ragu1#bib.com have the same client code and account code but different email addresses; I need to show only one of them, along with all the other records. Please suggest a suitable query.

You can use TOP (1) WITH TIES as below:
Select top (1) with ties * from yourtable
order by row_number() over(partition by ClientCode,AccountCode order by EmailAddress)
With a subquery, you can do it like this:
Select * from (
Select *, RowN = Row_Number() over(partition by ClientCode, AccountCode order by EmailAddress) from yourtable
) a where a.RowN = 1
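The "row number 1 per partition" idea can be sketched in plain Python as follows. This is only an illustration of the logic, not SQL Server behaviour: the function name one_per_group is hypothetical, and Python compares strings by code point, which may order the email addresses differently from a SQL Server collation.

```python
# Sample rows from the question:
# (Clientcode, Emailaddress, Accountcode, clientname, phoneno)
rows = [
    ("AAA", "ragu#bib.com", 100, "Berjeya", "90909090"),
    ("AAA", "ragu1#bib.com", 100, "Berjeya", "90909090"),
    ("AAABBB", "jkkjkj#bib.com", 200, "Berjeya sooo", "3222"),
    ("CCCC", "dfdf#bib.com", 200, "Berjeya klkl", "123"),
    ("dddd", "sdsdsd#bib.com", 33300, "Berjeya penn", "33333"),
]


def one_per_group(rows):
    """Keep the row with the smallest email per (client code, account code)."""
    best = {}
    for row in rows:
        key = (row[0], row[2])
        # Equivalent of ORDER BY EmailAddress followed by RowN = 1.
        if key not in best or row[1] < best[key][1]:
            best[key] = row
    return sorted(best.values())


deduplicated = one_per_group(rows)
for row in deduplicated:
    print(row)
```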

How to do pagination based on a calculated column

My problem reduces to this simplified version.
I have a table in SQL Server 2008 like this (it's a denormalized table built specifically for my search):
ItemId | CategoryId | Descr | ExtendedDescription
0001 | 1 | Mouse X | Blue mouse
0002 | 1 | Blue Pen | Beautiful ....
0003 | 2 | Blue Pencil | Pencil with ...
0004 | 2 | Eraser | Eraser with ....
I need to search for a word (like "Blue") in this table, assign a rank to each result based on where the word appears, and group the results by CategoryId, summing the rank.
I am able to do that; the problem arises when I try to paginate the result.
This is the stored procedure I tried (the search word is hard-coded for now, but I know how to make it a parameter; I also know how to filter on ID to get pagination):
CREATE PROCEDURE [dbo].[spSearch]
AS
BEGIN
SET NOCOUNT ON;
SELECT
[CategoryId],
SUM(
CASE WHEN (PATINDEX('%Blue%', Descr) > 0) THEN 100 ELSE 0 END +
CASE WHEN (PATINDEX('%Blue%', ExtendedDescription) > 0) THEN 10 ELSE 0 END
) AS Ranking,
ROW_NUMBER() OVER(ORDER BY CategoryId DESC) ID
FROM [dbo].[Data]
where (Descr like '%Blue%' or
ExtendedDescription like '%Blue%')
GROUP BY CategoryId
ORDER BY Ranking DESC
END
With this sp I get the following result:
CategoryId | Ranking | ID
0001 | 110 | 2
0002 | 100 | 1
The problem is: to paginate the result, I need the ID (ROW_NUMBER) to be generated in descending order of Ranking, whereas here it is generated in the order of CategoryId.
If I try to change the sp in this way:
ROW_NUMBER() OVER(ORDER BY Ranking DESC) ID
I can't save the sp because Ranking is not a column.
Do you have any hints?

I think we can solve this with a common table expression (CTE): compute Ranking inside the CTE, then number the rows by Ranking in the outer query.
Try this:
;WITH cte AS
(
SELECT
[CategoryId],
SUM(
CASE WHEN (PATINDEX('%Blue%', Descr) > 0) THEN 100 ELSE 0 END +
CASE WHEN (PATINDEX('%Blue%', ExtendedDescription) > 0) THEN 10 ELSE 0 END
) AS Ranking,
ROW_NUMBER() OVER(ORDER BY CategoryId DESC) ID
FROM [dbo].[Data]
where (Descr like '%Blue%' or
ExtendedDescription like '%Blue%')
GROUP BY CategoryId
)
SELECT ROW_NUMBER () OVER(ORDER BY Ranking DESC) AS r_no,
Ranking,
ID
FROM cte
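The two-step shape of that query (aggregate first, number the rows afterwards, then slice a page) can be sketched in Python like this. It is an illustration under assumptions: search_page, page and page_size are hypothetical names, matching is made case-insensitive to mimic a typical SQL Server collation, and ties in Ranking are broken by CategoryId, which the original query does not specify.

```python
from collections import defaultdict

# Sample rows from the question: (CategoryId, Descr, ExtendedDescription)
items = [
    (1, "Mouse X", "Blue mouse"),
    (1, "Blue Pen", "Beautiful ...."),
    (2, "Blue Pencil", "Pencil with ..."),
    (2, "Eraser", "Eraser with ...."),
]


def search_page(items, word, page, page_size):
    """Rank categories by where `word` appears, number them by rank, and
    return one page of (row_no, category_id, ranking) tuples."""
    needle = word.lower()
    ranking = defaultdict(int)
    for category_id, descr, extended in items:
        score = (100 if needle in descr.lower() else 0) \
            + (10 if needle in extended.lower() else 0)
        if score:  # same effect as the WHERE ... LIKE filter
            ranking[category_id] += score
    # Step 1 (the CTE): one aggregated row per category.
    # Step 2 (the outer query): number the rows by Ranking descending.
    ordered = sorted(ranking.items(), key=lambda kv: (-kv[1], kv[0]))
    numbered = [(row_no, cat, rank)
                for row_no, (cat, rank) in enumerate(ordered, start=1)]
    start = (page - 1) * page_size
    return numbered[start:start + page_size]


first_page = search_page(items, "Blue", page=1, page_size=1)
second_page = search_page(items, "Blue", page=2, page_size=1)
print(first_page, second_page)
```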

SQL Server 2008 how to select top [column value] and random record?

I'm using SQL Server 2008. I want to select random rows, where the number of rows to select depends on a column value in another table. How can I do this?
My SQL statement is something like this, but it's wrong:
select top b.number a.name, a.link_id
from A a
left join B b on b.link_id = a.link_id
order by newid()
Here are my tables and the expected result.
Table A:
name link_id
james 100
albert 100
susan 100
simon 101
tom 101
fion 101
Table B:
link_id number
100 2
101 1
Expected result:
when run 1st time, result may be:
name link_id
james 100
susan 100
fion 101
2nd time result may be:
albert 100
susan 100
simon 101
3rd time could be:
james 100
albert 100
fion 101
Explanation:
In table B, link_id 100 has number 2,
meaning that 2 random records should be selected from table A for link_id = 100,
and 1 random record for link_id = 101.

You can use the ROW_NUMBER() function:
SELECT A.name, A.link_id
FROM(
SELECT name,link_id, ROW_NUMBER()OVER(PARTITION BY link_id ORDER BY NEWID()) rn
FROM dbo.tblA
) AS A
JOIN dbo.tblB AS B
ON A.link_id = B.link_id
WHERE A.rn <= B.number;
Here is a SqlFiddle to show this in action: http://sqlfiddle.com/#!3/92eac/2

Try this:
SELECT a.*
FROM b
CROSS APPLY
(
SELECT TOP (b.number) a.*
FROM a
WHERE a.link_id = b.link_id
ORDER BY
NEWID()
) a
Also see: SQLFiddle
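Both answers boil down to "shuffle each group, then take the first N, where N comes from table B". A small Python sketch of that idea, using random.sample instead of ORDER BY NEWID() (the function name sample_per_group is hypothetical):

```python
import random
from collections import defaultdict

# Sample data from the question.
table_a = [
    ("james", 100), ("albert", 100), ("susan", 100),
    ("simon", 101), ("tom", 101), ("fion", 101),
]
table_b = [(100, 2), (101, 1)]


def sample_per_group(table_a, table_b):
    """Pick `number` random names from table A for each link_id in table B."""
    by_link = defaultdict(list)
    for name, link_id in table_a:
        by_link[link_id].append(name)
    picked = []
    for link_id, number in table_b:
        names = by_link.get(link_id, [])
        # Never ask for more rows than the group contains.
        for name in random.sample(names, min(number, len(names))):
            picked.append((name, link_id))
    return picked


result = sample_per_group(table_a, table_b)
print(result)
```

Each run may return different names, but always two rows for link_id 100 and one for link_id 101.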

SQL query like GROUP BY with OR condition

I'll try to describe the real situation. In our company we have a reservation system with a table, let's call it Customers, where e-mail and phone contacts are saved with each incoming order. That's a part of the system I can't change. My problem is how to count unique customers. By a unique customer I mean a group of people who share either the same e-mail or the same phone number.
Example 1: In real life you can imagine Tom and Sandra, who are married. Tom, who ordered 4 products, entered 3 different e-mail addresses and 2 different phone numbers in our reservation system. One of those phone numbers is shared with Sandra (as a home phone), so I can presume they are connected somehow. Besides this shared phone number, Sandra also entered her private one, and she used a single e-mail address for both of her orders. For me this means counting all of the following rows as one unique customer. So in fact a unique customer may grow into a whole family.
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 tom#email.com +44 111 111 First row
1 tommy#email.com +44 111 111 Same phone, different e-mail
2 thomas#email.com +44 111 111 Same phone, different e-mail
3 thomas#email.com +44 222 222 Same e-mail, different phone
4 sandra#email.com +44 222 222 Same phone, different e-mail
5 sandra#email.com +44 333 333 Same e-mail, different phone
As ypercube said, I will probably need recursion to count all of these unique customers.
Example 2: Here is an example of what I want to do. Is it possible to get the count of unique customers without recursion, for instance by using a cursor, or is recursion necessary?
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 linsey#email.com +44 111 111 ─┐
1 louise#email.com +44 111 111 ├─ 1. unique customer
2 louise#email.com +44 222 222 ─┘
---- ------------------- -------------- ------------------------------
3 steven#email.com +44 333 333 ─┐
4 steven#email.com +44 444 444 ├─ 2. unique customer
5 sandra#email.com +44 444 444 ─┘
---- ------------------- -------------- ------------------------------
6 george#email.com +44 555 555 ─── 3. unique customer
---- ------------------- -------------- ------------------------------
7 xavier#email.com +44 666 666 ─┐
8 xavier#email.com +44 777 777 ├─ 4. unique customer
9 xavier#email.com +44 888 888 ─┘
---- ------------------- -------------- ------------------------------
10 robert#email.com +44 999 999 ─┐
11 miriam#email.com +44 999 999 ├─ 5. unique customer
12 sherry#email.com +44 999 999 ─┘
---- ------------------- -------------- ------------------------------
----------------------------------------------------------------------
Result ∑ = 5 unique customers
----------------------------------------------------------------------
I've tried a query with GROUP BY, but I don't know how to group the result by either the first or the second column. I'm looking for something like:
SELECT COUNT(*) FROM Customers
GROUP BY Email OR Phone
Thanks again for any suggestions.
P.S.
I really appreciate the answers given before I completely rewrote this question. They may no longer correspond to the updated post, so please don't downvote them (the question itself is fair game :). Thanks, and sorry for my wrong start.

Here is a full solution using a recursive CTE.
;WITH Nodes AS
(
SELECT DENSE_RANK() OVER (ORDER BY Part, PartRank) SetId
, [ID]
FROM
(
SELECT [ID], 1 Part, DENSE_RANK() OVER (ORDER BY [E-mail]) PartRank
FROM dbo.Customer
UNION ALL
SELECT [ID], 2, DENSE_RANK() OVER (ORDER BY Phone) PartRank
FROM dbo.Customer
) A
),
Links AS
(
SELECT DISTINCT A.Id, B.Id LinkedId
FROM Nodes A
JOIN Nodes B ON B.SetId = A.SetId AND B.Id < A.Id
),
Routes AS
(
SELECT DISTINCT Id, Id LinkedId
FROM dbo.Customer
UNION ALL
SELECT DISTINCT Id, LinkedId
FROM Links
UNION ALL
SELECT A.Id, B.LinkedId
FROM Links A
JOIN Routes B ON B.Id = A.LinkedId AND B.LinkedId < A.Id
),
TransitiveClosure AS
(
SELECT Id, Id LinkedId
FROM Links
UNION
SELECT LinkedId Id, LinkedId
FROM Links
UNION
SELECT Id, LinkedId
FROM Routes
),
UniqueCustomers AS
(
SELECT Id, MIN(LinkedId) UniqueCustomerId
FROM TransitiveClosure
GROUP BY Id
)
SELECT A.Id, A.[E-mail], A.Phone, B.UniqueCustomerId
FROM dbo.Customer A
JOIN UniqueCustomers B ON B.Id = A.Id
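What the question describes is counting connected components: rows are linked whenever they share an e-mail or a phone. Outside the database this is commonly done with a union-find (disjoint-set) structure rather than recursion. The sketch below is that alternative technique, not a translation of the recursive CTE above; count_unique_customers is a hypothetical name, and the data is Example 2 from the question.

```python
def count_unique_customers(rows):
    """Count groups of rows connected by a shared e-mail or phone number.

    rows: iterable of (row_id, email, phone) tuples.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    rows = list(rows)
    for row_id, email, phone in rows:
        # Link each row to its e-mail node and its phone node; rows sharing
        # either value end up in the same set.
        union(("id", row_id), ("email", email))
        union(("id", row_id), ("phone", phone))
    return len({find(("id", row_id)) for row_id, _, _ in rows})


# Example 2 from the question.
example_2 = [
    (0, "linsey#email.com", "+44 111 111"),
    (1, "louise#email.com", "+44 111 111"),
    (2, "louise#email.com", "+44 222 222"),
    (3, "steven#email.com", "+44 333 333"),
    (4, "steven#email.com", "+44 444 444"),
    (5, "sandra#email.com", "+44 444 444"),
    (6, "george#email.com", "+44 555 555"),
    (7, "xavier#email.com", "+44 666 666"),
    (8, "xavier#email.com", "+44 777 777"),
    (9, "xavier#email.com", "+44 888 888"),
    (10, "robert#email.com", "+44 999 999"),
    (11, "miriam#email.com", "+44 999 999"),
    (12, "sherry#email.com", "+44 999 999"),
]

unique_count = count_unique_customers(example_2)
print(unique_count)
```

The design choice here is to treat e-mails and phones as nodes alongside the rows, so a single pass over the table is enough and no pairwise join between rows is needed.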

Finding groups that share the same Phone:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Phone) AS GroupPhone
FROM
MyTable
ORDER BY
GroupPhone
, ID
Finding groups that share the same Name:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Name) AS GroupName
FROM
MyTable
ORDER BY
GroupName
, ID
Now, for the (complex) query you describe, let's say we have a table like this instead:
ID Name Phone
---- ------------- -------------
0 Kate +44 333 333
1 Sandra +44 000 000
2 Thomas +44 222 222
3 Robert +44 000 000
4 Thomas +44 444 444
5 George +44 222 222
6 Kate +44 000 000
7 Robert +44 444 444
--------------------------------
Should all these be in one group? They all share a name or phone with someone else, forming a "chain" of related persons:
0-6 same name
6-1-3 same phone
3-7 same name
7-4 same phone
4-2 same name
2-5 same phone

For the dataset in the example you could write something like this:
;WITH Temp AS (
SELECT Name, Phone,
DENSE_RANK() OVER (ORDER BY Name) AS NameGroup,
DENSE_RANK() OVER (ORDER BY Phone) AS PhoneGroup
FROM MyTable)
SELECT MAX(Phone), MAX(Name), COUNT(*)
FROM Temp
GROUP BY NameGroup, PhoneGroup

I don't know if this is the best solution, but here it is:
SELECT
MyTable.ID, MyTable.Name, MyTable.Phone,
CASE WHEN N.No = 1 AND P.No = 1 THEN 1
WHEN N.No = 1 AND P.No > 1 THEN 2
WHEN N.No > 1 OR P.No > 1 THEN 3
END as GroupRes
FROM
MyTable
JOIN (SELECT Name, count(Name) No FROM MyTable GROUP BY Name) N on MyTable.Name = N.Name
JOIN (SELECT Phone, count(Phone) No FROM MyTable GROUP BY Phone) P on MyTable.Phone = P.Phone
The drawback is that the joins are made on varchar columns, which could increase execution time.

Here is my solution:
SELECT p.LastName, P.FirstName, P.HomePhone,
CASE
WHEN ph.PhoneCount=1 THEN
CASE
WHEN n.NameCount=1 THEN 'unique name and phone'
ELSE 'common name'
END
ELSE
CASE
WHEN n.NameCount=1 THEN 'common phone'
ELSE 'common phone and name'
END
END
FROM Contacts p
INNER JOIN
(SELECT HomePhone, count(LastName) as PhoneCount
FROM Contacts
GROUP BY HomePhone) ph ON ph.HomePhone = p.HomePhone
INNER JOIN
(SELECT FirstName, count(LastName) as NameCount
FROM Contacts
GROUP BY FirstName) n ON n.FirstName = p.FirstName
Output:
LastN FirstN Phone Comment
Hoover Brenda 8138282334 unique name and phone
Washington Brian 9044563211 common name
Roosevelt Brian 7737653279 common name
Reagan Charles 7734567869 unique name and phone
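The same classification can be reproduced in a few lines of Python with collections.Counter, which may help when checking the labels by hand. This is only a sketch of the logic in the query above; the function name classify is hypothetical, and the data is the four rows shown in the output.

```python
from collections import Counter

# Rows from the output above: (LastName, FirstName, HomePhone)
contacts = [
    ("Hoover", "Brenda", "8138282334"),
    ("Washington", "Brian", "9044563211"),
    ("Roosevelt", "Brian", "7737653279"),
    ("Reagan", "Charles", "7734567869"),
]


def classify(contacts):
    """Label each contact by whether its first name and phone are shared."""
    phone_counts = Counter(phone for _, _, phone in contacts)
    name_counts = Counter(first for _, first, _ in contacts)
    labelled = []
    for last, first, phone in contacts:
        unique_phone = phone_counts[phone] == 1
        unique_name = name_counts[first] == 1
        if unique_phone:
            label = "unique name and phone" if unique_name else "common name"
        else:
            label = "common phone" if unique_name else "common phone and name"
        labelled.append((last, first, phone, label))
    return labelled


labelled = classify(contacts)
for row in labelled:
    print(row)
```

Like the query, this only reports whether a value is shared; it does not merge chains of contacts into one group.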