T-SQL grouping by either of two columns

I have a table where I hold chat history as follows:
Id From To Text Hour
=================================================
1 A B Msg_A_B1 00:01
2 A B Msg_A_B2 00:02
3 B A Msg_B_A1 00:03
4 A B Msg_A_B3 00:05
5 C A Msg_C_A1 00:11
6 A C Msg_A_C1 00:12
7 C A Msg_C_A2 00:14
8 D B Msg_D_B1 00:17
I want to create a chat header list from this data for a specific user. The rules are:
I want to get the start (first) "Hour" of each chat
and the last message of each chat for that specific user,
ordered by "Hour" ascending.
For example if the user is "A" I want to get
Correspondant Text Hour
=======================================
B Msg_A_B3 00:01
C Msg_C_A2 00:11
Or for user "B" :
Correspondant Text Hour
=======================================
A Msg_A_B3 00:01
D Msg_D_B1 00:17
I can possibly do it by using Temporary tables, but I am seeking a simpler and faster solution.
This information might lead to the use of Stored Procedures, but a proper use of Views is also accepted.

What you are missing is a grouping column to mark a chat between A and B as "belonging together" without looking if A or B is the From or the To.
It is your table design, which makes things difficult. Below my suggestion I will add some hints how this might be done better:
Your mockup to simulate your issue:
CREATE TABLE #mockupTable(Id INT,[From] VARCHAR(100),[To] VARCHAR(100),[Text] VARCHAR(100),[Hour] TIME(0));
INSERT INTO #mockupTable VALUES
(1,'A','B','Msg_A_B1','00:01')
,(2,'A','B','Msg_A_B2','00:02')
,(3,'B','A','Msg_B_A1','00:03')
,(4,'A','B','Msg_A_B3','00:05')
,(5,'C','A','Msg_C_A1','00:11')
,(6,'A','C','Msg_A_C1','00:12')
,(7,'C','A','Msg_C_A2','00:14')
,(8,'D','B','Msg_D_B1','00:17');
--The query
WITH cte AS
(
SELECT t.*
,CONCAT(CASE WHEN t.[From]>t.[To] THEN t.[To] ELSE t.[From] END,'-',CASE WHEN t.[From]>t.[To] THEN t.[From] ELSE t.[To] END) AS ChatID
FROM #mockupTable t
)
,FindFirstAndLast AS
(
SELECT cte1.ChatID
,(SELECT TOP 1 Id FROM cte cte2 WHERE cte2.ChatID=cte1.ChatID ORDER BY cte2.[Hour] ASC) AS FirstId
,(SELECT TOP 1 Id FROM cte cte2 WHERE cte2.ChatID=cte1.ChatID ORDER BY cte2.[Hour] DESC) AS LastId
FROM cte cte1
GROUP BY cte1.ChatID
)
SELECT fal.ChatID
,tFirst.[From] AS FirstFrom
,tFirst.[To] AS FirstTo
,tFirst.[Hour] AS FirstHour
,tLast.[From] AS LastFrom
,tLast.[To] AS LastTo
,tLast.[Text] AS LastText
FROM FindFirstAndLast fal
INNER JOIN #mockupTable tFirst ON fal.FirstId=tFirst.Id
INNER JOIN #mockupTable tLast ON fal.LastId=tLast.Id;
The idea in short:
The first CTE will create a ChatID by concatenating the From and the To in a sorted way. Doing so a message from A to B will get the same ChatID as a message from B to A.
The second CTE will use a correlated sub-query to find the first and the last message id, grouped for the previously computed ChatID.
The final SELECT will use these message ids to join the appropriate rows.
The result comes with everything you need. It's up to you to put it in the format needed:
+--------+-----------+---------+-----------+----------+--------+----------+
| ChatID | FirstFrom | FirstTo | FirstHour | LastFrom | LastTo | LastText |
+--------+-----------+---------+-----------+----------+--------+----------+
| A-B | A | B | 00:01:00 | A | B | Msg_A_B3 |
+--------+-----------+---------+-----------+----------+--------+----------+
| A-C | C | A | 00:11:00 | C | A | Msg_C_A2 |
+--------+-----------+---------+-----------+----------+--------+----------+
| B-D | D | B | 00:17:00 | D | B | Msg_D_B1 |
+--------+-----------+---------+-----------+----------+--------+----------+
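One possible way to get from here to the header list in the question is to look at the rows from a single user's point of view. The following is only a sketch: the temp table #chat and the variable @User are hypothetical names introduced for illustration, and the ROW_NUMBER() approach is a swapped-in alternative to the correlated sub-queries above, not the query from this answer.

```sql
-- Hypothetical temp table holding the sample data from the question.
CREATE TABLE #chat (Id INT, [From] VARCHAR(100), [To] VARCHAR(100), [Text] VARCHAR(100), [Hour] TIME(0));
INSERT INTO #chat VALUES
 (1,'A','B','Msg_A_B1','00:01')
,(2,'A','B','Msg_A_B2','00:02')
,(3,'B','A','Msg_B_A1','00:03')
,(4,'A','B','Msg_A_B3','00:05')
,(5,'C','A','Msg_C_A1','00:11')
,(6,'A','C','Msg_A_C1','00:12')
,(7,'C','A','Msg_C_A2','00:14')
,(8,'D','B','Msg_D_B1','00:17');

-- Hypothetical parameter: the user whose chat headers are wanted.
DECLARE @User VARCHAR(100) = 'A';

WITH mine AS
(
    -- Keep only rows the user takes part in and name the other side.
    SELECT CASE WHEN [From] = @User THEN [To] ELSE [From] END AS Correspondant
          ,Id, [Text], [Hour]
    FROM #chat
    WHERE @User IN ([From], [To])
)
,ranked AS
(
    -- First hour per correspondent, plus a rank that puts the latest message first.
    SELECT Correspondant, [Text]
          ,MIN([Hour]) OVER (PARTITION BY Correspondant) AS FirstHour
          ,ROW_NUMBER() OVER (PARTITION BY Correspondant ORDER BY [Hour] DESC, Id DESC) AS rn
    FROM mine
)
SELECT Correspondant, [Text], FirstHour AS [Hour]
FROM ranked
WHERE rn = 1
ORDER BY FirstHour;
```

With the sample data and @User = 'A' this should return one row for B (Msg_A_B3, 00:01:00) and one for C (Msg_C_A2, 00:11:00). The same statement could be wrapped in an inline table-valued function or a stored procedure taking the user as a parameter.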
Some ideas about the design
I would use
one table Person for your chatting persons.
a second table Chat for a chat with a ChatID.
one m:n mapping table ChattingPerson with JoinTime, a ChatID and a PersonID, both as FKs. Here you can set timestamps like LastAction or mark the status (active, has left, ...)
one more table Message for the messages with time, text, and ChatPersonID as FK.
Your advantages
The opener can explicitly invite more persons (or limit it to one for a person2person chat), or just wait for participants.
Starting a chat creates the row in the Chat table, the first row in the ChattingPerson table to mark the opener, and possibly a first message row.
Following messages add a row to the ChattingPerson table (for a new participant, if it does not exist yet) and the message row.
The ID of the ChattingPerson table (ChatPersonID) will give you the ChatID and the PersonID.
You can filter per chat and/or by person.
There can be separate chats between A and B over time.
You can control the type of chat with a PersonCount constraint.
You can enforce that a new ChattingPerson can only be added by the opener.
You can create certain chat types (like "person2person") with a template.
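The table layout described above could look roughly like the following DDL. This is a sketch only: the table names and ChatID, PersonID, ChatPersonID, JoinTime and LastAction come from the text, while every other column name, all data types, and the unique constraint are assumptions made for illustration.

```sql
-- One row per person who can take part in chats.
CREATE TABLE Person
(
    PersonID INT IDENTITY PRIMARY KEY
   ,Name     VARCHAR(100) NOT NULL           -- assumed column
);

-- One row per chat; a new chat between the same persons gets a new ChatID.
CREATE TABLE Chat
(
    ChatID   INT IDENTITY PRIMARY KEY
   ,OpenedAt DATETIME2(0) NOT NULL           -- assumed column
);

-- m:n mapping between Chat and Person.
CREATE TABLE ChattingPerson
(
    ChatPersonID INT IDENTITY PRIMARY KEY
   ,ChatID       INT NOT NULL REFERENCES Chat(ChatID)
   ,PersonID     INT NOT NULL REFERENCES Person(PersonID)
   ,JoinTime     DATETIME2(0) NOT NULL
   ,LastAction   DATETIME2(0) NULL
   ,[Status]     VARCHAR(20) NOT NULL DEFAULT 'active'   -- e.g. 'active', 'has left'
   ,CONSTRAINT UQ_ChattingPerson UNIQUE (ChatID, PersonID) -- assumed: a person joins a chat once
);

-- One row per message; the sender and the chat follow from ChatPersonID.
CREATE TABLE Message
(
    MessageID    INT IDENTITY PRIMARY KEY    -- assumed column
   ,ChatPersonID INT NOT NULL REFERENCES ChattingPerson(ChatPersonID)
   ,SentAt       DATETIME2(0) NOT NULL       -- assumed column name for the message time
   ,[Text]       VARCHAR(100) NOT NULL
);
```

With this layout the "first hour" and "last message" of a chat become plain aggregates over Message joined to ChattingPerson and grouped by ChatID, with no need to compare From and To.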
Happy Coding :-)

Let's do it by creating a view:
First let's load the data in table t1:
create table t1 (Id int,[From] varchar(10),[To] varchar(10),Text varchar(100),Hour time(0))
insert into t1 values (1,'A','B','Msg_A_B1','00:01')
insert into t1 values (2,'A','B','Msg_A_B2','00:02')
insert into t1 values (3,'B','A','Msg_B_A1','00:03')
insert into t1 values (4,'A','B','Msg_A_B3','00:05')
insert into t1 values (5,'C','A','Msg_C_A1','00:11')
insert into t1 values (6,'A','C','Msg_A_C1','00:12')
insert into t1 values (7,'C','A','Msg_C_A2','00:14')
insert into t1 values (8,'D','B','Msg_D_B1','00:17')
Then let's create the view
create view vChats
as
with cte as (
select left([text],len([text])-1) as chat,
t.*
from t1 as t
),
cte2 as (
select chat,
min(hour) as minHour,
max(text) as maxText
from cte
group by chat
),
cte3 as (select distinct [From] as [User]
from t1
UNION
select distinct [To] as [User]
from t1
)
select c3.[User],
t.[To] as Correspondant,
c.maxText as [Text],
c.minHour as [Hour]
from cte2 as c
inner join t1 as t ON c.maxText = t.[Text]
inner join cte3 as c3 ON c3.[User] = t.[From]
UNION
select c3.[User],
t.[From] as Correspondant,
c.maxText as [Text],
c.minHour as [Hour]
from cte2 as c
inner join t1 as t ON c.maxText = t.[Text]
inner join cte3 as c3 ON c3.[User] = t.[To]
After that you can use it to get all the communications for each user like this:
select *
from vChats
where [User] = 'A'

Related

Simplify multiple joins

I have a Claims table with 70 columns, 16 of which contain diagnosis codes. The codes mean nothing, so I need to pull the descriptions for each code located in a separate table.
There has to be a simpler way of pulling these code descriptions:
-- This is the claims table
FROM
[database].[schema].[claimtable] AS claim
-- [StagingDB].[schema].[Diagnosis] table where the codes located
-- [ICD10_CODE] column contains the code
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag1 ON claim.[ICDDiag1] = diag1.[ICD10_CODE]
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag2 ON claim.[ICDDiag2] = diag2.[ICD10_CODE]
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag3 ON claim.[ICDDiag3] = diag3.[ICD10_CODE]
-- and so on, up to ....
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag16 ON claim.[ICDDiag16] = diag16.[ICD10_CODE]
-- reported column will be [code_desc]
-- ie:
-- diag1.[code_desc] AS Diagnosis1
-- diag2.[code_desc] AS Diagnosis2
-- diag3.[code_desc] AS Diagnosis3
-- diag4.[code_desc] AS Diagnosis4
-- etc.

I think what you are doing is already correct in the given scenario.
Another way, from a programming point of view, which you can try and compare for performance:
i) Unpivot the Claim table on those 16 diagnosis code columns.
ii) Join the unpivoted column with [StagingDB].[schema].[Diagnosis].
Another way can be to put the [StagingDB].[schema].[Diagnosis] table into a #temp table
instead of touching the large staging table 16 times.
But data analysis has to be done to decide whether either of these actually helps.

You can go for UNPIVOT of the claimTable and then join with Diagnosis table.
TEST SETUP
create table #claimTable(ClaimId INT, Diag1 VARCHAR(10), Diag2 VARCHAR(10))
CREATE table #Diagnosis(code VARCHAR(10), code_Desc VARCHAR(255))
INSERT INTO #ClaimTable
VALUES (1, 'Fever','Cold'), (2, 'Headache','toothache');
INSERT INTO #Diagnosis
VALUEs ('Fever','Fever Desc'), ('cold','cold desc'),('headache','headache desc'),('toothache','toothache desc');
Query to Run
;WITH CTE_Claims AS
(SELECT ClaimId,DiagnosisNumeral, code
FROM #claimTable
UNPIVOT
(
code FOR DiagnosisNumeral in ([Diag1],[Diag2])
) as t
)
SELECT c.ClaimId, c.code, d.code_Desc
FROM CTE_Claims AS c
INNER JOIN #Diagnosis as d
on c.code = d.code
ResultSet
+---------+-----------+----------------+
| ClaimId | code | code_Desc |
+---------+-----------+----------------+
| 1 | Fever | Fever Desc |
| 1 | Cold | cold desc |
| 2 | Headache | headache desc |
| 2 | toothache | toothache desc |
+---------+-----------+----------------+
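If the report still needs one row per claim with Diagnosis1, Diagnosis2, ... as columns, the unpivoted rows can be folded back with conditional aggregation. The sketch below is an assumption-laden variant, not part of the answer above: it uses CROSS APPLY (VALUES ...) instead of UNPIVOT so that the slot number is explicit, a LEFT JOIN so that codes without a description do not drop the claim, and hypothetical temp table names #claims and #diag with two diagnosis columns instead of 16.

```sql
-- Hypothetical sample tables (two diagnosis slots instead of 16).
CREATE TABLE #claims (ClaimId INT, Diag1 VARCHAR(10), Diag2 VARCHAR(10));
CREATE TABLE #diag   (code VARCHAR(10), code_Desc VARCHAR(255));

INSERT INTO #claims VALUES (1, 'Fever', 'Cold'), (2, 'Headache', 'Toothache');
INSERT INTO #diag   VALUES ('Fever', 'Fever desc'), ('Cold', 'Cold desc'),
                           ('Headache', 'Headache desc'), ('Toothache', 'Toothache desc');

SELECT c.ClaimId
      ,MAX(CASE WHEN v.Slot = 1 THEN d.code_Desc END) AS Diagnosis1
      ,MAX(CASE WHEN v.Slot = 2 THEN d.code_Desc END) AS Diagnosis2
FROM #claims AS c
-- Turn the diagnosis columns into rows, remembering which slot each code came from.
CROSS APPLY (VALUES (1, c.Diag1), (2, c.Diag2)) AS v (Slot, code)
-- The description table is joined once, not once per slot.
LEFT JOIN #diag AS d ON d.code = v.code
GROUP BY c.ClaimId
ORDER BY c.ClaimId;
```

For the real table the VALUES list and the CASE expressions would each grow to 16 entries. Whether this beats 16 plain LEFT JOINs is something to measure rather than assume.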

Find one record that exists as two records in another vendor database

I have two vendor databases that have become horribly out-of-sync over the years that I'm trying to correct. A single customer can have multiple id_numbers, and these IDs exist in both vendor databases. All of the IDs for a single customer are correctly attached to one customer record in the Vendor1 database (meaning they belong to the same customer_code). The problem, however, is that those same IDs might be split amongst multiple customers in the Vendor2 database, which is incorrect. I will need to merge those multiple customers together in the Vendor2 database.
I'm trying to identify which customers are represented as two or more customers in the second vendor database. So far I have joined the two together, but I can't figure out how to find only customers that have two or more distinct MemberInternalKeys for the same customer_code.
Here's what I have so far:
select top 10
c.customer_code,
i.id_number,
cc.MemberInternalKey
from Vendor1.dbo.customer_info as c
join Vendor1.dbo.customer_ids as i
on c.customer_code = i.customer_code
join Vendor2.dbo.Clubcard as cc
on (i.id_number collate Latin1_General_CI_AS_KS) = cc.ClubCardId
where i.id_code = 'PS'
In the example below, I would expect to only get back the last two rows in the table. The first two rows should not be included in the results because they have the same MemberInternalKey for both records and belong to the same customer_code. The third row should also not be included since there is a 1-1 match between both vendor databases.
customer_code | id_number | MemberInternalKey
--------------|-----------|------------------
5549032 | 4000 | 4926877
5549032 | 4001 | 4926877
5031101 | 4007 | 2379218
2831779 | 4029 | 1763760
2831779 | 4062 | 4950922
Any help is greatly appreciated.

If I understand correctly, you can use window functions for this logic:
select c.*
from (select c.customer_code, i.id_number, cc.MemberInternalKey,
min(cc.MemberInternalKey) over (partition by c.customer_code) as minmik,
max(cc.MemberInternalKey) over (partition by c.customer_code) as maxmik
from Vendor1.dbo.customer_info c join
Vendor1.dbo.customer_ids i
on c.customer_code = i.customer_code join
Vendor2.dbo.Clubcard as cc
on (i.id_number collate Latin1_General_CI_AS_KS) = cc.ClubCardId
where i.id_code = 'PS'
) c
where minmik <> maxmik;
This calculates the minimum and maximum MemberInternalKey for each customer_code. The outer where then returns only rows where these are different.

Another option is
Create Table #YourTable (customer_code int, id_number int, MemberInternalKey int)
Insert Into #YourTable values
(5549032,4000,4926877),
(5549032,4001,4926877),
(5031101,4007,2379218),
(2831779,4029,1763760),
(2831779,4062,4950922)
Select A.*
From #YourTable A
Join (
Select customer_code
From #YourTable
Group By customer_code
Having min(MemberInternalKey)<>max(MemberInternalKey)
) B on A.customer_code=B.customer_code
Returns
customer_code id_number MemberInternalKey
2831779 4029 1763760
2831779 4062 4950922
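The requirement "two or more distinct MemberInternalKeys for the same customer_code" can also be written almost literally with COUNT(DISTINCT ...). This is a sketch under the assumption that the joined result has been put into a hypothetical temp table #matched; it is a variant of the min/max test above, not a different result.

```sql
-- Hypothetical temp table holding the joined rows from the question.
CREATE TABLE #matched (customer_code INT, id_number INT, MemberInternalKey INT);
INSERT INTO #matched VALUES
 (5549032, 4000, 4926877)
,(5549032, 4001, 4926877)
,(5031101, 4007, 2379218)
,(2831779, 4029, 1763760)
,(2831779, 4062, 4950922);

SELECT m.*
FROM #matched AS m
WHERE m.customer_code IN
(
    -- Customers whose ids point at more than one Vendor2 member.
    SELECT customer_code
    FROM #matched
    GROUP BY customer_code
    HAVING COUNT(DISTINCT MemberInternalKey) >= 2
)
ORDER BY m.customer_code, m.id_number;
```

With the sample rows this should return only the two rows for customer_code 2831779.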

select resultset of counts by array param in postgres

I've been searching for this and it seems like it should be something simple, but apparently not so much. I want to return a resultSet within PostgreSQL 9.4.x using an array parameter so:
| id | count |
--------------
| 1 | 22 |
--------------
| 2 | 14 |
--------------
| 14 | 3 |
where I'm submitting a parameter of {'1','2','14'}.
Using something (clearly not) like:
SELECT id, count(a.*)
FROM tablename a
WHERE a.id::int IN array('{1,2,14}'::int);
I want to test it first of course, and then write it as a storedProc (function) to make this simple.

Forget it, here is the answer:
SELECT a.id,
COUNT(a.id)
FROM tableName a
WHERE a.id IN
(SELECT b.id
FROM tableName b
WHERE b.id = ANY('{1,2,14}'::int[])
)
GROUP BY a.id;

You can simplify to:
SELECT id, count(*) AS ct
FROM tbl
WHERE id = ANY('{1,2,14}'::int[])
GROUP BY 1;
More:
Check if value exists in Postgres array
To include IDs from the input array that are not found I suggest unnest() followed by a LEFT JOIN:
SELECT id, count(t.id) AS ct
FROM unnest('{1,2,14}'::int[]) id
LEFT JOIN tbl t USING (id)
GROUP BY 1;
Related:
Preserve all elements of an array while (left) joining to a table
If there can be NULL values in the array parameter as well as in the id column (which would be an odd design), you'd need (slower!) NULL-safe comparison:
SELECT id, count(t.id) AS ct
FROM unnest('{1,2,14}'::int[]) id
LEFT JOIN tbl t ON t.id IS NOT DISTINCT FROM id.id
GROUP BY 1;
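Since the question mentions wrapping this in a function, the unnest() variant could be packaged roughly as follows. This is a sketch: the function name count_by_ids and the parameter name _ids are made up for illustration, and tbl with an integer id column is the same assumption the queries above make.

```sql
-- Hypothetical function name and parameter; returns one row per input id,
-- including ids that do not occur in tbl (count 0).
CREATE FUNCTION count_by_ids(_ids int[])
  RETURNS TABLE (id int, ct bigint)
  LANGUAGE sql STABLE AS
$func$
SELECT i.id, count(t.id)
FROM   unnest(_ids) AS i(id)
LEFT   JOIN tbl t ON t.id = i.id
GROUP  BY i.id;
$func$;

-- Usage:
SELECT * FROM count_by_ids('{1,2,14}');
```

Column references inside the body are table-qualified on purpose, so they cannot be confused with the output column names declared in RETURNS TABLE.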

SQL Pulling the latest information and information from another table

I have a record table that is recording changes within a table. I can pull the data from the first table fine; however, when I try to join in another table to add some of its column information, it stops displaying the information.
PartNumber | PartDesc | value | date
1 | test | 1 | 3/4/2015
I wanted to include the aisle tags from the location table
PartNumber| AisleTag | AisleTagTwo
1 | A1 | N/A
Here is what I have as my SQL statement so far
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
where Date = (select max(Date) from InvRecord where t1.PartNumber = InvRecord.PartNumber)
order by t1.PartNumber
it is coming up blank, my original sql statement doesn't include anything from t2. I am not sure what approach to take to get the data combined. Any help is much appreciated, thank you!
This should be the end result:
PartNumber | PartDesc | value | date | AisleTag | AisleTagTwo
1 | test | 1 | 3/4/2015 | A1 | N/A

Pull the most recent row (based on Date) for each PartNumber in Table A and append data from Table B (joined on PartNumber):
SELECT *
FROM (
SELECT A.PartNumber
, A.PartDesc
, A.NewValue
, A.Date
, B.AisleTag
, B.AisleTagTwo
, DateSeq = ROW_NUMBER() OVER(PARTITION BY A.PartNumber ORDER BY A.Date DESC)
FROM InvRecord A
LEFT JOIN PartAisleListTbl B
ON A.PartNumber = B.PartNumber
) A
WHERE A.DateSeq = 1
ORDER BY A.PartNumber

Are you returning no records at all, or only records with AisleTag and AisleTagTwo as null?
Your sentence "it is coming up blank, my original sql statement doesn't include anything from t2." makes it sound like you're getting records with nulls for the t2 fields.
If you are, then you probably have a record in t2 that has nulls for those fields.
For troubleshooting purposes, try running the query without the WHERE clause:
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
order by t1.PartNumber
If you do get records, your problem is with the WHERE clause. If you don't, your problem is with the PartNumber fields in InvRecord and PartAisleListTbl not matching.
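If the join turns out to be the culprit, an anti-join shows which part numbers have no partner in the aisle table. The sketch below uses hypothetical temp tables and assumed column types, with a second part added only to demonstrate a mismatch.

```sql
-- Hypothetical stand-ins for InvRecord and PartAisleListTbl (column types are assumptions).
CREATE TABLE #InvRecord        (PartNumber INT, PartDesc VARCHAR(50), NewValue INT, [Date] DATE);
CREATE TABLE #PartAisleListTbl (PartNumber INT, AisleTag VARCHAR(10), AisleTagTwo VARCHAR(10));

INSERT INTO #InvRecord        VALUES (1, 'test', 1, '20150304'), (2, 'other', 5, '20150305');
INSERT INTO #PartAisleListTbl VALUES (1, 'A1', 'N/A');

-- Part numbers that an INNER JOIN would silently drop.
SELECT DISTINCT r.PartNumber
FROM #InvRecord AS r
WHERE NOT EXISTS
(
    SELECT 1
    FROM #PartAisleListTbl AS a
    WHERE a.PartNumber = r.PartNumber
);
```

Here only PartNumber 2 should come back. If every part number of the real table comes back, the two PartNumber columns probably differ in type, padding, or formatting.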

Not sure why yours isn't working... is Date in both t1 and t2 by any chance?
Here it is refactored to use an inline view instead of a correlated subquery; I wonder if it makes a difference.
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2
ON t1.PartNumber = t2.PartNumber
JOIN (select max(Date) mdate, PartNumber from InvRecord GROUP BY PartNumber) t3
on t3.partNumber= T1.PartNumber
and T3.mdate = T1.Date
order by t1.PartNumber

Query in SQL Server with multiple join tables

I have 3 tables:
Table Position:
KodePosition | NamePosition | UserLogin
========================================
0037 Master A winz\alfa
0038 Master B winz\beta
0043 Master C winz\carlie
Table UserBackup (PJS):
KodePosition | UserOrigin | UserChange | StartDate | EndDate
================================================================
0037 winz\alfa winz\carlie 10-10-2014 17-10-2014
Table History:
IdHistory | KodePosition | StartDate | EndDate | User | Comment
===============================================================================
19F5FCFC 0038 14-10-2014 14-10-2014 winz\beta i not agree...
19F5FCFC 0043 15-10-2014 15-10-2014 winz\carlie i agree...
I want to display data like this :
Name | Date | Position | Comment
===================================================
winz\beta 14-10-2014 Master B i not agree...
winz\carlie 15-10-2014 Master A i agree...
Description :
Please note the data in table UserBackup (PJS).
If StartDate in table History falls between StartDate and EndDate in table UserBackup (PJS), and UserChange is the same as User, then NamePosition should be taken from table Position using the KodePosition of table UserBackup (PJS).
For now, I have a stored procedure like this, but it doesn't display the data I need.
select
A.IdHistory, A.StartDate, B.NamePosition, B.UserLogin, A.Comment
from
History as A
left join
Position as B on A.KodePosition = B.KodePosition
Where
A.IdHistory = '19F5FCFC'
order by
A.StartDate asc
Please help me guys... Thanks...

If you want to query three tables, you can do it like this (sorry for not using your data, but I don't understand how the tables are connected, or your explanation).
SELECT a.Column1, a.Column2, b.Column1
FROM table1 AS a
LEFT JOIN table2 AS b
ON a.Column1=b.Column1
WHERE a.Column1 IN (
SELECT c.Column1
FROM table3 AS c
INNER JOIN table2 as b
ON b.Column1=c.Column1
WHERE c.Column2 LIKE 'MyTarget%');
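You have to think in two SELECTs; don't try to put everything in one. Btw., SQL Server and the other mainstream databases do support several joins in one statement, so the split is a matter of clarity, not a limitation.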

Without more information about what you are trying to do, this is a little bit of a stab in the dark...but based on your description, I believe you are trying to first filter the rows in the UserBackup(PJS) table based on whether the corresponding History.KodePosition record falls within the StartDate and EndDate of the corresponding UserBackup(PJS) record. Then, based on the returned KodePositions, you want to retrieve the related records from the Position table. I'm not sure this is the complete picture of what you are looking for, but hopefully this gets you further along:
;WITH cteData
AS (
SELECT u.KodePosition, h.IDHistory, h.StartDate, h.Comment
FROM [UserBackup(PJS)] u
INNER JOIN History h ON h.[User] = u.UserChange AND (h.StartDate >= u.StartDate AND h.StartDate <= u.EndDate)
)
SELECT c.IDHistory, c.StartDate, p.NamePosition, p.UserLogin, c.Comment
FROM Position p
INNER JOIN cteData c ON c.KodePosition = p.KodePosition
