I have a table with 3 columns and the first column is 'name'. Some names are entered twice, some 3 times and some more than that. I would like to keep only one value for each name and delete the extra rows.
There are no primary keys or id column.
There are about 1 million rows in the table.
I would like to delete them using one query (preferably) in SQL 14. Can someone help, please?
Name column2 column3
Suzy
Suzy
Suzy
John
John
George
George
George
George
Would like to have it as:
Name column2 column3
Suzy
John
George
Many thanks in advance
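You can use the ROW_NUMBER() function. Try this:
WITH CTE AS (
    -- Number the rows within each name; which duplicate gets RN = 1 is arbitrary
    SELECT NAME,
           column2,
           column3,
           RN = ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME)
    FROM <YourTableName>
)
-- Deleting through the CTE removes the underlying rows, leaving one row per name
DELETE FROM CTE
WHERE RN > 1;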
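Before running a delete like this against a million rows, it may help to preview what it would remove. Below is a minimal sketch of that check; the temp table #People and its sample rows are made up for illustration and are not from the question.

```sql
-- Hypothetical demo table; #People and its contents are illustrative only.
CREATE TABLE #People (Name varchar(50), column2 varchar(50), column3 varchar(50));

INSERT INTO #People (Name, column2, column3)
VALUES ('Suzy', NULL, NULL), ('Suzy', NULL, NULL), ('Suzy', NULL, NULL),
       ('John', NULL, NULL), ('John', NULL, NULL),
       ('George', NULL, NULL), ('George', NULL, NULL);

-- Preview: count the rows per name that a DELETE ... WHERE RN > 1 would remove.
WITH CTE AS (
    SELECT Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS RN
    FROM #People
)
SELECT Name, COUNT(*) AS rows_to_delete
FROM CTE
WHERE RN > 1
GROUP BY Name;

DROP TABLE #People;
```

Once the counts look right, the same CTE can be used with DELETE instead of SELECT.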
I am trying to write a function that compares two tables which share a common column with the same name and ID values.
Table 1: CompanyRecords
CompanyRecordsID  CompanyId  CompanyName  CompanyProcessID
-----------------------------------------------------------
1                 222        Sears        123
2                 333        JCPenny      456
Table 2: JointCompanies
JointCompaniesID  CompanyId  CompanyName  CompanyProcessID
-----------------------------------------------------------
3                 222        KMart        123
4                 444        Walmart      001
They both use the same foreign key, CompanyProcessID, with value 123.
How do I write a SELECT statement that, given a CompanyProcessID, tells me whether the CompanyId has changed for that CompanyProcessID?
I assume it is a join between the two tables with a WHERE on CompanyProcessID.
Thanks for any help.
Is this what you want?
-- Returns 1 if any joined pair of rows has differing names, otherwise 0
select max(case when cr.CompanyName = jc.CompanyName then 0 else 1 end) as name_not_same
from CompanyRecords cr join
     JointCompanies jc
     on cr.CompanyProcessID = jc.CompanyProcessID
where cr.CompanyProcessID = ?
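Since the question asks specifically about CompanyId, the same pattern can compare that column instead. This is only a sketch: the temp tables, the @CompanyProcessID variable and the varchar type for CompanyProcessID (guessed from the value 001) are assumptions, not something the question confirms.

```sql
-- Hypothetical temp-table copies of the two tables from the question.
CREATE TABLE #CompanyRecords (CompanyRecordsID int, CompanyId int,
                              CompanyName varchar(50), CompanyProcessID varchar(10));
CREATE TABLE #JointCompanies (JointCompaniesID int, CompanyId int,
                              CompanyName varchar(50), CompanyProcessID varchar(10));

INSERT INTO #CompanyRecords VALUES (1, 222, 'Sears', '123'), (2, 333, 'JCPenny', '456');
INSERT INTO #JointCompanies VALUES (3, 222, 'KMart', '123'), (4, 444, 'Walmart', '001');

DECLARE @CompanyProcessID varchar(10) = '123';  -- hypothetical parameter

-- 1 if any joined pair has a different CompanyId, 0 if all pairs match
SELECT MAX(CASE WHEN cr.CompanyId = jc.CompanyId THEN 0 ELSE 1 END) AS company_id_changed
FROM #CompanyRecords cr
JOIN #JointCompanies jc
    ON cr.CompanyProcessID = jc.CompanyProcessID
WHERE cr.CompanyProcessID = @CompanyProcessID;

DROP TABLE #CompanyRecords;
DROP TABLE #JointCompanies;
```

Note that MAX returns NULL when no rows match the given CompanyProcessID in both tables, so the caller would need to treat NULL as "nothing to compare".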
Good day
I have:
TableX
Column1
John Smith 007
Tera Name 111
Bob Eva 554
I need
TableX
Column1 Column2
John Smith 007 007
Tera Name 111 111
Bob Eva 554 554
I wrote the code below, but it does not work. I think a join may be needed to match up the columns.
ALTER TABLE [dbo].[TableX]
ADD Column2 varchar (50);
UPDATE [dbo].[TableX] SET
Column1=Column2
WHERE select SUBSTRING([Column1], PATINDEX('%[0-9]%', [Column1]
), LEN([column1]))
Thanks for help
If the number you want to extract is always at the end, then you can use:
PATINDEX('%[^0-9]%', REVERSE(Column1))
to get the index of the first character that is not a number, starting from the end.
So, to extract the number you can use:
RIGHT(Column1, PATINDEX('%[^0-9]%', REVERSE(Column1)) - 1)
Hence, the UPDATE will look like this:
UPDATE [dbo].[TableX]
SET Column2 = RIGHT(Column1, PATINDEX('%[^0-9]%', REVERSE(Column1)) - 1)
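One thing to watch for: if a value consists only of digits, PATINDEX finds no non-digit character and returns 0, so RIGHT is called with a length of -1 and fails. A possible workaround, shown here as a sketch, is to append a non-digit sentinel to the reversed string. The table variable, the extra sample rows and the RTRIM calls are my own additions for illustration.

```sql
-- Hypothetical sample data: the question's rows plus two edge cases.
DECLARE @TableX TABLE (Column1 varchar(50), Column2 varchar(50));

INSERT INTO @TableX (Column1)
VALUES ('John Smith 007'), ('Tera Name 111'), ('Bob Eva 554'),
       ('12345'),        -- digits only
       ('No digits');    -- no trailing number at all

-- The appended 'x' guarantees PATINDEX always finds a non-digit,
-- so the length passed to RIGHT is never negative.
UPDATE @TableX
SET Column2 = RIGHT(RTRIM(Column1),
                    PATINDEX('%[^0-9]%', REVERSE(RTRIM(Column1)) + 'x') - 1);

SELECT Column1, Column2 FROM @TableX;
```

With the sentinel, an all-digit value should be copied whole and a value with no trailing digits should yield an empty string rather than an error.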
Assuming the part you need is always the last 3 characters:
UPDATE [dbo].[TableX] SET
column2 = RIGHT(RTRIM(column1), 3)
I have a table with 3 columns and the first column is 'name'. Some names are entered twice, some 3 times and some more than that. I would like to keep only one value for each name and delete the extra rows based on the values of Column 2 and 3. If column 2 and 3 are null, I would like to delete that row.
There are no primary keys or id column.
There are about 2.75 million rows in the table.
I would like to delete them using one query (preferably) in SQL 14. Can someone help, please?
Name column2 column3
Suzy english null
Suzy null null
Suzy null 5
John null null
John 7 7
George null benson
George null null
George benson null
George 5 benson
Would like to have it as:
Name column2 column3
Suzy english null
Suzy null 5
John 7 7
George benson null
George 5 benson
Many thanks in advance.
Partition over name with an appropriate ORDER BY:
WITH cte AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY name
               ORDER BY CASE
                            WHEN column2 = 'null' AND column3 = 'null' THEN 3
                            WHEN column3 = 'null' THEN 2
                            WHEN column2 = 'null' THEN 1
                            ELSE 0
                        END
           ) AS num
    FROM mytable
)
DELETE FROM cte WHERE num > 1;
This deletes duplicates, keeping one row per name in this order of preference:
both column2 and column3 not null (an arbitrary one is kept if there are several)
only column3 not null
only column2 not null
both column2 and column3 null
Note that this query assumes (based on comments on the question) that your "null" values are actually the text string 'null' and not an SQL NULL.
If they are actual NULLs, replace = 'null' with IS NULL.
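For the case where the values are real SQL NULLs, the IS NULL variant could look like the sketch below. It uses the question's column names (column2 and column3); the temp table #mytable and its rows are only a small illustrative subset of the question's data.

```sql
-- Hypothetical demo table with real NULLs instead of the string 'null'.
CREATE TABLE #mytable (name varchar(50), column2 varchar(50), column3 varchar(50));

INSERT INTO #mytable (name, column2, column3)
VALUES ('Suzy', 'english', NULL), ('Suzy', NULL, NULL), ('Suzy', NULL, '5'),
       ('John', NULL, NULL), ('John', '7', '7');

WITH cte AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY name
               ORDER BY CASE
                            WHEN column2 IS NULL AND column3 IS NULL THEN 3
                            WHEN column3 IS NULL THEN 2
                            WHEN column2 IS NULL THEN 1
                            ELSE 0
                        END
           ) AS num
    FROM #mytable
)
DELETE FROM cte WHERE num > 1;  -- keeps the best-ranked row per name

SELECT name, column2, column3 FROM #mytable;

DROP TABLE #mytable;
```

Keep in mind that this approach leaves exactly one row per name, which is fewer rows than the expected output in the question shows for Suzy and George.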
DELETE FROM yourtable
WHERE column2 IS NULL AND column3 IS NULL
The above query is based on this part of the question:
I would like to keep only one value for each name and delete the extra rows based on the values of Column 2 and 3. If column 2 and 3 are null, I would like to delete that row.
I have two rows in my table which are exact duplicates with the exception of a date field. I want to find these records and delete the older record by hopefully comparing the dates.
For example I have the following data
ctrc_num | Ctrc_name | some_date
---------------------------------------
12345 | John R | 2011-01-12
12345 | John R | 2012-01-12
56789 | Sam S | 2011-01-12
56789 | Sam S | 2012-01-12
Now the idea is to find duplicates with a different 'some_date' field and delete the older records. The final output should look something like this.
ctrc_num | Ctrc_name | some_date
---------------------------------------
12345 | John R | 2012-01-12
56789 | Sam S | 2012-01-12
Also note that my table does not have a primary key, it was originally created this way, not sure why, and it has to fit inside a stored procedure.
If you look at this:
SELECT * FROM <tablename> WHERE some_date IN
(
SELECT MAX(some_date) FROM <tablename> GROUP BY ctrc_num,ctrc_name
HAVING COUNT(ctrc_num) > 1
AND COUNT(ctrc_name) > 1
)
You can see that it selects the most recent row of each duplicate pair. If you switch the subquery to MIN(some_date) and use it in a DELETE, you remove the older row of each pair instead:
DELETE FROM <tablename> WHERE some_date IN
(
SELECT MIN(some_date) FROM <tablename> GROUP BY ctrc_num,ctrc_name
HAVING COUNT(ctrc_num) > 1
AND COUNT(ctrc_name) > 1
)
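One caveat with the IN form: the subquery is not tied to a particular ctrc_num and ctrc_name, so any row whose date happens to equal one of the returned minimum dates would be deleted as well, even if that row has no duplicate. A correlated version avoids this. The sketch below uses a made-up temp table #contracts, with one extra non-duplicated row added to show the difference.

```sql
-- Hypothetical demo table; the 'Ann T' row is an addition for illustration.
CREATE TABLE #contracts (ctrc_num int, ctrc_name varchar(6), some_date date);

INSERT INTO #contracts (ctrc_num, ctrc_name, some_date)
VALUES (12345, 'John R', '2011-01-12'),
       (12345, 'John R', '2012-01-12'),
       (56789, 'Sam S',  '2011-01-12'),
       (56789, 'Sam S',  '2012-01-12'),
       (99999, 'Ann T',  '2011-01-12');  -- not a duplicate, shares an old date

-- Delete a row only if a newer row exists for the same ctrc_num and ctrc_name.
DELETE t
FROM #contracts AS t
WHERE EXISTS (
    SELECT 1
    FROM #contracts AS newer
    WHERE newer.ctrc_num  = t.ctrc_num
      AND newer.ctrc_name = t.ctrc_name
      AND newer.some_date > t.some_date
);

SELECT ctrc_num, ctrc_name, some_date FROM #contracts;

DROP TABLE #contracts;
```

Here the 'Ann T' row should survive, whereas the uncorrelated IN version would remove it because its date matches a minimum date from another group.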
This is for SQL Server
CREATE TABLE StackOverFlow
([ctrc_num] int, [Ctrc_name] varchar(6), [some_date] datetime)
;
INSERT INTO StackOverFlow
([ctrc_num], [Ctrc_name], [some_date])
SELECT 12345, 'John R', '2011-01-12 00:00:00' UNION ALL
SELECT 12345, 'John R', '2012-01-12 00:00:00' UNION ALL
SELECT 56789, 'Sam S', '2011-01-12 00:00:00' UNION ALL
SELECT 56789, 'Sam S', '2012-01-12 00:00:00'
;WITH RankedByDate AS
(
SELECT ctrc_num
,Ctrc_name
,some_date
,ROW_NUMBER() OVER(PARTITION BY Ctrc_num, Ctrc_name ORDER BY some_date DESC) AS rNum
FROM StackOverFlow
)
DELETE
FROM RankedByDate
WHERE rNum > 1
SELECT
[ctrc_num]
, [Ctrc_name]
, [some_date]
FROM StackOverFlow
And here is the sql fiddle to test it http://sqlfiddle.com/#!6/32718/6
What I tried to do here is:
rank the records by date in descending order within each ctrc_num and Ctrc_name group
delete those that are older (keeping the latest)
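Since the question mentions that this has to fit inside a stored procedure, the same statement could be wrapped as in the sketch below. The procedure name dbo.RemoveOlderDuplicates is hypothetical, and the sketch assumes the StackOverFlow table from the example above already exists.

```sql
-- Hypothetical procedure name; CREATE PROCEDURE must be the first statement in its batch.
CREATE PROCEDURE dbo.RemoveOlderDuplicates
AS
BEGIN
    SET NOCOUNT ON;

    -- Rank rows per (ctrc_num, Ctrc_name), newest first, and delete the rest.
    WITH RankedByDate AS (
        SELECT ROW_NUMBER() OVER (PARTITION BY ctrc_num, Ctrc_name
                                  ORDER BY some_date DESC) AS rNum
        FROM StackOverFlow
    )
    DELETE FROM RankedByDate
    WHERE rNum > 1;
END;
```

It can then be run with EXEC dbo.RemoveOlderDuplicates;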
Update table1
set column1 = 'abc', column2 = 25
where column3 IN ('John','Kate','Tim')
Column3 contains John twice (two associated rows/records); similarly, it has Kate three times and Tim twice.
How can I adjust the query so that the update affects only the first row with John, the first row with Kate and the first with Tim?
For reference, here is table1:
column1  column2  column3
aa       2        John (!)
affd     24       John
dfd      5        Tim (!)
ss       77       Kate (!)
s        4        Tim
s        1        Kate
sds      34       Kate
I want to update only the rows marked with (!)
I am especially interested in MS Access, but I am also curious how this is done in SQL Server in case it differs. Thank you!
SQL Server solution. Note: you must have a unique identity column (or some set of unique columns) for this to work.
UPDATE table1
SET column1 = 'abc',
    column2 = 25
WHERE id IN (SELECT id
             FROM (SELECT id,
                          -- PARTITION BY restarts the numbering for each name,
                          -- so every name gets its own "first" row
                          ROW_NUMBER() OVER (
                              PARTITION BY column3
                              ORDER BY rowyouwanttoorderby) AS rownum
                   FROM table1
                   WHERE column3 IN ('John', 'Kate', 'Tim')) AS temp
             WHERE rownum = 1)
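In SQL Server the same idea can also be written as an update through a CTE, which avoids the IN subquery. This is a sketch under assumptions: the temp table #table1 and its id IDENTITY column are hypothetical, and "first" is taken to mean the lowest id, which is assumed to reflect insertion order. It does not apply to MS Access.

```sql
-- Hypothetical demo table; the id column is an assumption, not in the question.
CREATE TABLE #table1 (id int IDENTITY(1,1) PRIMARY KEY,
                      column1 varchar(10), column2 int, column3 varchar(10));

INSERT INTO #table1 (column1, column2, column3)
VALUES ('aa', 2, 'John'), ('affd', 24, 'John'), ('dfd', 5, 'Tim'),
       ('ss', 77, 'Kate'), ('s', 4, 'Tim'), ('s', 1, 'Kate'), ('sds', 34, 'Kate');

-- Number the rows per name by id, then update only the first row of each name.
WITH Ranked AS (
    SELECT column1, column2,
           ROW_NUMBER() OVER (PARTITION BY column3 ORDER BY id) AS rn
    FROM #table1
    WHERE column3 IN ('John', 'Kate', 'Tim')
)
UPDATE Ranked
SET column1 = 'abc',
    column2 = 25
WHERE rn = 1;

SELECT id, column1, column2, column3 FROM #table1 ORDER BY id;

DROP TABLE #table1;
```

With this sample data, the rows updated should be the ones marked (!) in the question.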