Group two rows whose last two column values differ - sql-server

I have the table below, which I populated with data from several different tables.
Policy_Number      | Name_Of_Client | Email   | Phone
BEI/BGAMMQ/0000431 | Test, Lda      | t#t.com | NULL
BEI/BGAMMQ/0000431 | Test, Lda      | NULL    | 1212121212
Can someone please help me to get the result as below?
Policy_Number      | Name_Of_Client | Email   | Phone
BEI/BGAMMQ/0000431 | Test, Lda      | t#t.com | 1212121212
Thank you in advance

Aggregate by policy number and name of client, then select the max of the email and phone. MAX ignores NULL values, so each group keeps its one non-NULL email and its one non-NULL phone.
SELECT Policy_Number, Name_Of_Client, MAX(Email) AS Email, MAX(Phone) AS Phone
FROM yourTable
GROUP BY Policy_Number, Name_Of_Client;
By the way, your table in its current state might imply that there is some sort of design or data gathering problem. The output you want is the version you probably should be using.
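As a rough check of the aggregation above, here is a minimal sketch that runs the same query against the two sample rows. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; both ignore NULLs in MAX, but the in-memory setup is only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(Policy_Number TEXT, Name_Of_Client TEXT, Email TEXT, Phone TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("BEI/BGAMMQ/0000431", "Test, Lda", "t#t.com", None),
        ("BEI/BGAMMQ/0000431", "Test, Lda", None, "1212121212"),
    ],
)

# MAX skips NULLs, so the two partial rows collapse into one complete row.
merged_rows = conn.execute(
    """
    SELECT Policy_Number, Name_Of_Client, MAX(Email) AS Email, MAX(Phone) AS Phone
    FROM yourTable
    GROUP BY Policy_Number, Name_Of_Client
    """
).fetchall()
print(merged_rows)
```

Note that if a policy ever had two different non-NULL emails, MAX would silently pick the alphabetically last one.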

Related

Why doesn't this "not in" clause do what I expect?

I'm writing what I think should be a simple query using the "not in" operator, and it's not doing what I expect.
Background:
I have two tables, Contact and Company.
Contact includes columns ContactID (person's identity) and CompanyID (which company they work for)
CompanyID values are expected to be equivalent to the CompanyIDs in the Company table
I want to write a query that checks how many people from the Contact table that have an "invalid" CompanyID (i.e., listed as working for a Company that isn't in the Company table)
I have a working query that does this:
select
count(ContactID)
from
Contact left join Company on Contact.CompanyID = Company.CompanyID
where
Company.CompanyID is null;
This query returns the value 2725538, which I believe to be the correct answer (I've done some simple "show me the top 10 rows" debugging, and it appears to be counting the right rows).
I wrote a second query which I expected to return the same result:
select
count(ContactID)
from
Contact
where
CompanyID not in
(select
CompanyID
from
Company)
However, this query instead returns 0.
To help me debug this, I checked two additional queries.
First, I tried commenting out the WHERE clause, which should give me all of the ContactIDs, regardless of whether they work for an invalid company:
select
count(ContactID)
from
Contact
This query returns 29722995.
Second, I tried removing the NOT from my query, which should give me the inverse of what I'm looking for (i.e., it should count the Contacts who work for valid companies):
select
count(ContactID)
from
Contact
where
CompanyID in
(select
CompanyID
from
Company)
This query returns 26997457.
Notably, these two numbers differ by exactly 2725538, the number returned by the first, working query. This is what I would expect if my second query was working. The total number of Contacts, minus the number of Contacts whose CompanyIDs are in the Company table, should equal the number of Contacts whose CompanyIDs are not in the Company table, shouldn't it?
So, why is the "not in" version of the query returning 0 instead of the correct answer?
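The likely cause is a NULL CompanyID in the Company table. NOT IN does not behave as you might expect with NULLs: comparing any value with NULL yields UNKNOWN rather than TRUE or FALSE, so once the subquery returns even one NULL, the NOT IN predicate can never be true and no rows qualify.
Try the following: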
select
count(ContactID)
from
Contact
where
CompanyID not in
(select
ISNULL(CompanyID,'')
from
Company)
You can see the example in db<>fiddle here.
More details are available here.
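The NULL behaviour described above can be reproduced on a tiny data set. This is a sketch only: SQLite (via Python's sqlite3) stands in for SQL Server, and the sample rows are made up. Because SQLite has no two-argument ISNULL, the sketch swaps in two other common workarounds, filtering NULLs out of the subquery and NOT EXISTS, rather than the ISNULL rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (CompanyID INTEGER);
CREATE TABLE Contact (ContactID INTEGER, CompanyID INTEGER);
INSERT INTO Company VALUES (1), (2), (NULL);          -- one NULL CompanyID
INSERT INTO Contact VALUES (10, 1), (11, 2), (12, 99), (13, 98);
""")

def count(sql):
    """Run a single-value COUNT query and return the number."""
    return conn.execute(sql).fetchone()[0]

# The NULL in the subquery makes NOT IN evaluate to UNKNOWN for 99 and 98.
not_in_count = count(
    "SELECT COUNT(ContactID) FROM Contact "
    "WHERE CompanyID NOT IN (SELECT CompanyID FROM Company)"
)

# Workaround 1: keep NULLs out of the subquery.
filtered_count = count(
    "SELECT COUNT(ContactID) FROM Contact "
    "WHERE CompanyID NOT IN "
    "(SELECT CompanyID FROM Company WHERE CompanyID IS NOT NULL)"
)

# Workaround 2: NOT EXISTS is not affected by NULLs in Company.
not_exists_count = count(
    "SELECT COUNT(ContactID) FROM Contact c "
    "WHERE NOT EXISTS "
    "(SELECT 1 FROM Company co WHERE co.CompanyID = c.CompanyID)"
)

print(not_in_count, filtered_count, not_exists_count)
```

Contacts whose own CompanyID is NULL are a separate case: the LEFT JOIN and NOT EXISTS forms count them, while the filtered NOT IN form does not.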

show values of two columns from two different tables into one column in sql server

There are two tables: 1) participant and 2) logindatetime.
On a user's first login, the datetime value is inserted into the participant table (which has a datetime column) along with other user data such as name, location, contact number, and email. For any subsequent login by the same user, we insert the datetime value into logindatetime, to keep a record of how many times the user has logged in. Now I have to show all the login times (the first login and the subsequent ones) in a single column, along with the name, location, contact number, and email of that user.
(I do have an identity column in the participant table.)
I have tried the following query:
select a.firstname as 'Name', a.Email as 'Email', a.Address1 as 'Location',
a.MobileNo as 'Contact', COALESCE(a.datetime, b.datetime) as DateTime
from eventonline.participant a, eventonline.logindatetime b
where a.Id = b.Rid
but it shows the first login time multiple times.
You need to do something like this to fetch the first and then the other logons separately:
select a.firstname as Name, a.Email, a.Address1 as Location,
a.MobileNo as Contact, a.datetime
from eventonline.participant a
union all
select a.firstname as Name, a.Email, a.Address1 as Location,
a.MobileNo as Contact, b.datetime
from eventonline.participant a
join eventonline.logindatetime b on a.Id = b.Rid
It might be easier just to add the first logon to logindatetime.
James Z's answer gives the solution, but here is a further note on why your approach isn't working. You're joining the two tables and using coalesce(a.DateTime, b.DateTime) to display the login time. If the user has logged in before, a.DateTime has a non-null value, and coalesce(x,y) only uses y if x is null. But that's not the real problem. The first login time needs a record of its own in the logindatetime table. If the user logs in 5 times, your logindatetime table will have 4 rows, and that's all you'll see when joining the two tables. You need to either save the first login time as a row in logindatetime, or use a UNION to force that first login time to be added as an extra row.
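To make the difference concrete, the sketch below runs both approaches for one user with a first login and two later logins. It is an illustration under assumptions: SQLite (via Python's sqlite3) replaces SQL Server, the eventonline schema prefix is dropped, and the tables are trimmed to the columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE participant (Id INTEGER PRIMARY KEY, firstname TEXT, datetime TEXT);
CREATE TABLE logindatetime (Rid INTEGER, datetime TEXT);
INSERT INTO participant VALUES (1, 'Ann', '2020-01-01 09:00');
INSERT INTO logindatetime VALUES (1, '2020-01-02 09:00'), (1, '2020-01-03 09:00');
""")

# Original approach: a.datetime is never NULL, so COALESCE always returns it,
# once per joined logindatetime row.
coalesce_rows = conn.execute("""
    SELECT a.firstname, COALESCE(a.datetime, b.datetime)
    FROM participant a
    JOIN logindatetime b ON a.Id = b.Rid
""").fetchall()

# UNION ALL approach: first login from participant, the rest from logindatetime.
union_rows = conn.execute("""
    SELECT a.firstname, a.datetime
    FROM participant a
    UNION ALL
    SELECT a.firstname, b.datetime
    FROM participant a
    JOIN logindatetime b ON a.Id = b.Rid
    ORDER BY 2
""").fetchall()

print(coalesce_rows)
print(union_rows)
```

The join returns the first login time twice and no other times, while the UNION ALL version returns all three distinct login times.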
I faced the same issue and solved it based on James Z's answer:
select a.start_date as start, a.end_date as end, a.rooms_id as roomid
from maintenance a
UNION ALL
SELECT b.check_in as start, b.check_out as end, b.room_id AS roomid
from reservations b
You could use joins in your SQL statement. I only learnt them 3 days ago and they are very useful!
It would look something like this:
SELECT * FROM table1
LEFT JOIN table2 ON table1.column1 = table2.column1
WHERE table1.column1 = '$yourVar';
Here table1.column1 could be the id, and the corresponding column in the second table would be the foreign key that links it to the first table. The query retrieves the rows of the first table together with all the data from the second table that matches the ON criteria.

Assistance With query across 2 tables within specified date range

I have 2 tables in a large SQL database and I need to query across them, but I am struggling, to be honest. Here are the parameters:
Table 1 - Live Policies
Table 2 - Email Addresses
Common Pivot = Client number which is present in both tables.
From Table 1 I need to retrieve the following fields:
Client Number
Ref Number
Name
Postcode
Inception date
Policy Type (= 'PC')
Select Client, Ptype, Ref, Incep, [Name], Postcode
from [Live Policies]
where Ptype = 'PC'
This works fine.
From Table 2 I need to retrieve:
Webaddr
My question is: how do I return the email address for the required records from the second table by referencing the client number? (The client number is the same for all records.) The second part of the statement is where I'm getting stuck. I'm aware of the JOIN statement, but when I try it I get nowhere. Help most appreciated!
Use a JOIN:
select L.Client, L.Ptype, L.Ref, L.Incep, L.[Name], L.Postcode, E.Webaddr
from [Live Policies] as L
JOIN [Email Addresses] as E
ON L.Client = E.Client
where L.Ptype = 'PC'
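A small runnable sketch of the join above, assuming SQLite (via Python's sqlite3) in place of SQL Server and a few invented rows. It also shows a LEFT JOIN variant, which is not part of the answer above but may matter if some clients have no row in [Email Addresses]: a plain JOIN drops those policies, a LEFT JOIN keeps them with a NULL Webaddr.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [Live Policies] (Client INTEGER, Ptype TEXT, Ref TEXT);
CREATE TABLE [Email Addresses] (Client INTEGER, Webaddr TEXT);
INSERT INTO [Live Policies] VALUES (1, 'PC', 'R1'), (2, 'PC', 'R2'), (3, 'HH', 'R3');
INSERT INTO [Email Addresses] VALUES (1, 'one@example.com'), (3, 'three@example.com');
""")

# Same query shape as the answer; only the join keyword varies.
query = """
SELECT L.Client, L.Ref, E.Webaddr
FROM [Live Policies] AS L
{join} [Email Addresses] AS E ON L.Client = E.Client
WHERE L.Ptype = 'PC'
ORDER BY L.Client
"""

inner_rows = conn.execute(query.format(join="JOIN")).fetchall()
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # only 'PC' policies whose client has an email row
print(left_rows)   # all 'PC' policies; missing emails come back as None
```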

In SQL how to change null values in a view

Hi, I have two columns, one called BankNo and the other called BranchNo. I'm running a view that selects the bank numbers along with their branches. One bank doesn't have a branch number, so I would like that value to come up as "No Branch" instead of NULL. The BranchNo column is a numeric field. Please help.
SELECT Coalesce(Cast(BranchNo as varchar(11)), 'No Branch') As BranchNo
...
Coalesce() function: http://msdn.microsoft.com/en-us/library/ms190349.aspx
Try something like this:
Select BankNo, ISNULL(Cast(BranchNo as nvarchar(10)), 'No Branch') As BranchNo from BankTable
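The cast is needed because the replacement text and the numeric column must end up as the same type. A minimal sketch of the COALESCE form, assuming SQLite (via Python's sqlite3) instead of SQL Server and two made-up banks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BankTable (BankNo INTEGER, BranchNo NUMERIC);
INSERT INTO BankTable VALUES (100, 7), (200, NULL);
""")

# Cast the number to text first, then substitute the label for NULLs.
rows = conn.execute("""
    SELECT BankNo,
           COALESCE(CAST(BranchNo AS varchar(11)), 'No Branch') AS BranchNo
    FROM BankTable
    ORDER BY BankNo
""").fetchall()

print(rows)
```

The BranchNo column in the result is text for every row, so anything consuming the view should no longer treat it as a number.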

Efficient checking of possible duplicate entities

I have a requirement to produce a list of possible duplicates before a user saves an entity to the database and warn them of the possible duplicates.
There are 7 criteria on which we should check for duplicates, and if at least 3 match we should flag this up to the user.
The criteria will all match on ID, so there is no fuzzy string matching needed, but my problem comes from the fact that there are many possible ways (99 ways if I've done my sums correctly) for at least 3 items to match from the list of 7 possibles.
I don't want to have to do 99 separate db queries to find my search results, nor do I want to bring the whole lot back from the db and filter on the client side. We're probably only talking of a few tens of thousands of records at present, but this will grow into the millions as the system matures.
Anyone got any thoughts on a nice efficient way to do this?
I was considering a simple OR query to get the records where at least one field matches from the db and then doing some processing on the client to filter it some more, but a few of the fields have very low cardinality and won't actually reduce the numbers by a huge amount.
Thanks
Jon
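The figure of 99 in the question checks out: it is the number of ways to pick which 3, 4, 5, 6, or 7 of the 7 criteria match. A short sketch of that arithmetic (Python is used here purely as a calculator):

```python
from math import comb

# Number of ways to choose which k of the 7 criteria match, for k = 3..7.
ways_per_size = {k: comb(7, k) for k in range(3, 8)}
total_ways = sum(ways_per_size.values())

print(ways_per_size)  # {3: 35, 4: 35, 5: 21, 6: 7, 7: 1}
print(total_ways)     # 99
```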
OR and CASE summing will work, but they are quite inefficient since they don't use indexes.
You need to use UNION ALL for the indexes to be usable.
If a user enters name, phone, email and address into the database, and you want to check all records that match at least 3 of these fields, you issue:
SELECT i.*
FROM (
SELECT id, COUNT(*) AS cnt
FROM (
SELECT id
FROM t_info t
WHERE name = 'Eve Chianese'
UNION ALL
SELECT id
FROM t_info t
WHERE phone = '+15558000042'
UNION ALL
SELECT id
FROM t_info t
WHERE email = '42#example.com'
UNION ALL
SELECT id
FROM t_info t
WHERE address = '42 North Lane'
) q
GROUP BY
id
HAVING COUNT(*) >= 3
) dq
JOIN t_info i
ON i.id = dq.id
This will use indexes on these fields and the query will be fast.
See this article in my blog for details:
Matching 3 of 4: how to match a record which matches at least 3 of 4 possible conditions
Also see this question the article is based upon.
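Here is a small runnable sketch of the UNION ALL counting technique from the first query, under stated assumptions: SQLite (via Python's sqlite3) instead of the original database, three invented rows, and bound parameters in place of literal values. Each branch contributes one row per matching id, so COUNT(*) per id is the number of matching fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_info (id INTEGER PRIMARY KEY, name TEXT, phone TEXT,
                     email TEXT, address TEXT);
INSERT INTO t_info VALUES
  (1, 'Eve Chianese', '+15558000042', '42#example.com', '1 Other Road'),
  (2, 'Eve Chianese', '+15550000000', 'other#example.com', '42 North Lane'),
  (3, 'Someone Else', '+15558000042', '42#example.com', '42 North Lane');
""")

candidate = {
    "name": "Eve Chianese",
    "phone": "+15558000042",
    "email": "42#example.com",
    "address": "42 North Lane",
}

# One indexable equality lookup per field; ids appearing 3+ times are matches.
matching_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT i.id
        FROM (
            SELECT id, COUNT(*) AS cnt
            FROM (
                SELECT id FROM t_info WHERE name = :name
                UNION ALL
                SELECT id FROM t_info WHERE phone = :phone
                UNION ALL
                SELECT id FROM t_info WHERE email = :email
                UNION ALL
                SELECT id FROM t_info WHERE address = :address
            ) q
            GROUP BY id
            HAVING COUNT(*) >= 3
        ) dq
        JOIN t_info i ON i.id = dq.id
        ORDER BY i.id
        """,
        candidate,
    )
]

print(matching_ids)
```

Row 2 matches only on name and address, so it is excluded; rows 1 and 3 each match on three fields. The sketch creates no indexes, so it shows only the logic, not the performance claim.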
If you want to have a list of DISTINCT values in the existing data, you just wrap this query into a subquery:
SELECT i1.*
FROM t_info i1
WHERE EXISTS
(
SELECT 1
FROM (
SELECT id
FROM t_info t
WHERE name = i1.name
UNION ALL
SELECT id
FROM t_info t
WHERE phone = i1.phone
UNION ALL
SELECT id
FROM t_info t
WHERE email = i1.email
UNION ALL
SELECT id
FROM t_info t
WHERE address = i1.address
) q
GROUP BY
id
HAVING COUNT(*) >= 3
)
Note that this DISTINCT is not transitive: if A matches B and B matches C, this does not mean that A matches C.
You might want something like the following:
SELECT id
FROM
(select id, CASE fld1 WHEN input1 THEN 1 ELSE 0 END "rule1",
CASE fld2 WHEN input2 THEN 1 ELSE 0 END "rule2",
...,
CASE fld7 WHEN input7 THEN 1 ELSE 0 END "rule7"
FROM table) t
WHERE rule1+rule2+rule3+...+rule7 >= 3
This isn't tested, but it shows a way to tackle this.
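A runnable sketch of the same CASE-summing idea, for comparison with the UNION approach. Everything here is an assumption made for illustration: SQLite via Python's sqlite3, a hypothetical table named entity with columns fld1 to fld7, and integer IDs as the criteria values. The seven CASE expressions are generated in Python rather than written out by hand.

```python
import sqlite3

fields = [f"fld{i}" for i in range(1, 8)]  # hypothetical column names

conn = sqlite3.connect(":memory:")
columns = ", ".join(f"{name} INTEGER" for name in fields)
conn.execute(f"CREATE TABLE entity (id INTEGER PRIMARY KEY, {columns})")
conn.executemany(
    "INSERT INTO entity VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 2, 3, 0, 0, 0, 0),     # 3 criteria match
        (2, 1, 2, 0, 0, 0, 0, 0),     # 2 criteria match
        (3, 0, 0, 0, 4, 5, 6, 7),     # 4 criteria match
        (4, 1, 0, 0, 0, 0, 0, None),  # 1 match; NULL never equals the input
    ],
)

# Each CASE yields 1 for a matching field and 0 otherwise; the sum is the score.
score = " + ".join(f"CASE {name} WHEN ? THEN 1 ELSE 0 END" for name in fields)
sql = f"SELECT id FROM entity WHERE {score} >= 3 ORDER BY id"

new_entity = (1, 2, 3, 4, 5, 6, 7)  # values of the entity about to be saved
possible_duplicates = [row[0] for row in conn.execute(sql, new_entity)]

print(possible_duplicates)
```

This form evaluates every row, which is why it tends to scan the whole table as the data grows.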
What DBMS are you using? Some support enforcing such constraints with server-side code.
Have you considered using a stored procedure with a cursor? You could then do your OR query and then step through the records one-by-one looking for matches. Using a stored procedure would allow you to do all the checking on the server.
However, I think a table scan with millions of records is always going to be slow. I think you should work out which of the 7 fields are most likely to match and make sure these are indexed.
I'm assuming your system is trying to match tag ids of a certain post, or something similar. This is a many-to-many relationship, and you should have three tables to handle it: one for the posts, one for the tags, and one for the relationship between posts and tags.
If my assumptions are correct then the best way to handle this is:
SELECT postid, count(tagid) as common_tag_count
FROM posts_to_tags
WHERE tagid IN (tag1, tag2, tag3, ...)
GROUP BY postid
HAVING count(tagid) >= 3;

Resources