I have two tables, A and B, with the same fields, for example A (bin, storage, plant) and B (bin, storage, plant). When I checked the data, table A has 5238 rows and B has 5249 rows, so I don't know which 11 rows are missing. I need help writing a query that finds those missing rows.
Thanks for the help in advance.
You can use the EXCEPT operator for your problem:
SELECT bin
FROM tableB
EXCEPT
SELECT bin
FROM tableA;
This returns all bins that are in tableB but not in tableA.
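If the difference might be in storage or plant rather than in bin, the same EXCEPT idea can compare all three columns, and in both directions. Here is a minimal sketch using Python's built-in sqlite3 module purely as a test bed; the helper name missing_rows and the sample data are made up for illustration, and the SQL itself should carry over to SQL Server.

```python
import sqlite3


def missing_rows(rows_a, rows_b):
    """Return (rows only in tableB, rows only in tableA), comparing all columns."""
    conn = sqlite3.connect(":memory:")
    for name in ("tableA", "tableB"):
        conn.execute(f"CREATE TABLE {name} (bin TEXT, storage TEXT, plant TEXT)")
    conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO tableB VALUES (?, ?, ?)", rows_b)

    def except_query(left, right):
        # Rows of `left` that have no identical row in `right`.
        sql = (
            f"SELECT bin, storage, plant FROM {left} "
            f"EXCEPT SELECT bin, storage, plant FROM {right}"
        )
        return sorted(conn.execute(sql).fetchall())

    only_in_b = except_query("tableB", "tableA")
    only_in_a = except_query("tableA", "tableB")
    conn.close()
    return only_in_b, only_in_a
```

Keep in mind that EXCEPT removes duplicates, so if the extra 11 rows in B are exact copies of rows that also exist in A, they will not show up here; a GROUP BY with COUNT(*) on each table would be needed to spot those.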
select *
from tableA
full outer join tableB on tableA.bin = tableB.bin
where tableA.bin is null or tableB.bin is null
SQL Server supports a full outer join. You can select all records from both tables and limit the result to those rows where the join finds no match in the other table.
I still have much to learn in database work, so please be kind.
I am attempting to combine two tables that have similar data, but wanted to be sure that I wasn't duplicating any entries. I decided to use the query below to see how many names were already in the target table
select A.Name
From SourceTable A
where Name NOT IN
(
select B.Name
From [Production].[dbo].[DestinationTable] B
)
This returned 0 rows, so I assumed that every Name was already in the target table. But when I changed the query to
select A.Name
From SourceTable A
where Name IN
(
select B.Name
From [Production].[dbo].[DestinationTable] B
)
I got back about half of the total rows in the source table. How can these two totals not add up to the total number of rows in the source table? I assumed duplicate names, but the numbers still don't add up. What could I be missing here?
Kamil's answer is a good explanation of what's going on with IN and NOT IN. But a better way to see if your destination table is missing any names from the source table would be to use a LEFT JOIN and check for NULL.
The query would look like this:
SELECT A.Name
FROM SourceTable A
LEFT JOIN [Production].[dbo].[DestinationTable] B ON A.Name = B.Name
WHERE B.Name IS NULL
This would return all names from your source that aren't in your destination.
The two queries don't add up to the total row count because you have NULL values in your DestinationTable.
In general, the queries omit any check for NULL values, and that is the reason: if the subquery returns even one NULL, NOT IN can never evaluate to true, so it returns no rows. You could add OR name is null to see it.
Check it using
select count(*) from destinationtable where name is null
Alternatively you could perform a CROSS JOIN and see for yourself where the data doesn't match and inspect why
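To see the NULL effect in isolation, here is a small sketch that runs the NOT IN query from the question next to a NOT EXISTS version. It uses Python's built-in sqlite3 module only as a convenient sandbox; the helper name compare_names and the sample names are invented for the demo, and DestinationTable is created locally rather than in [Production].[dbo].

```python
import sqlite3


def compare_names(source_names, destination_names):
    """Run NOT IN and NOT EXISTS side by side and return both result lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SourceTable (Name TEXT)")
    conn.execute("CREATE TABLE DestinationTable (Name TEXT)")
    conn.executemany(
        "INSERT INTO SourceTable VALUES (?)", [(n,) for n in source_names]
    )
    conn.executemany(
        "INSERT INTO DestinationTable VALUES (?)", [(n,) for n in destination_names]
    )

    # NOT IN: a single NULL in the subquery makes every non-matching
    # comparison evaluate to NULL (unknown), so no row passes the filter.
    not_in = conn.execute(
        "SELECT A.Name FROM SourceTable A "
        "WHERE A.Name NOT IN (SELECT B.Name FROM DestinationTable B)"
    ).fetchall()

    # NOT EXISTS: only asks whether a matching row exists, so NULLs in
    # DestinationTable do not affect the result.
    not_exists = conn.execute(
        "SELECT A.Name FROM SourceTable A "
        "WHERE NOT EXISTS (SELECT 1 FROM DestinationTable B WHERE B.Name = A.Name)"
    ).fetchall()
    conn.close()
    return sorted(r[0] for r in not_in), sorted(r[0] for r in not_exists)
```

With source names Ann, Bob, Cy and destination names Ann plus one NULL, the NOT IN query comes back empty while NOT EXISTS reports Bob and Cy. Adding WHERE B.Name IS NOT NULL inside the NOT IN subquery would be another way to get the expected answer.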
Suppose there are two tables. Table A has table_code as its PK, and table_code is an FK in Table B.
How can a query be designed so that the results display all values of table_code that are in Table A but not in Table B?
I tried all three joins.
I tried the criteria Is Null and Is Not Null.
Try:
SELECT tableA.table_code
FROM tableA LEFT JOIN tableB ON tableA.table_code = tableB.table_code
WHERE (((tableB.table_code) Is Null));
If this doesn't help, show us the SQL you tried.
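For anyone who wants to try the LEFT JOIN ... Is Null pattern on sample data first, here is a rough sketch using Python's built-in sqlite3 module. The helper name codes_without_children, the id column on tableB, and the sample values are assumptions made for the demo, not part of the original schema.

```python
import sqlite3


def codes_without_children(parent_codes, child_codes):
    """Return table_code values present in tableA but never referenced in tableB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableA (table_code INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE tableB ("
        "id INTEGER PRIMARY KEY, "
        "table_code INTEGER REFERENCES tableA(table_code))"
    )
    conn.executemany("INSERT INTO tableA VALUES (?)", [(c,) for c in parent_codes])
    conn.executemany(
        "INSERT INTO tableB (table_code) VALUES (?)", [(c,) for c in child_codes]
    )
    # The LEFT JOIN keeps every tableA row; unmatched rows get NULL in the
    # tableB columns, and the WHERE clause keeps only those.
    rows = conn.execute(
        "SELECT tableA.table_code "
        "FROM tableA LEFT JOIN tableB ON tableA.table_code = tableB.table_code "
        "WHERE tableB.table_code IS NULL "
        "ORDER BY tableA.table_code"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

The key detail is that the Is Null test goes on the column from the right-hand table (tableB), and the join must be a LEFT JOIN starting from tableA; an INNER JOIN discards the unmatched rows before the WHERE clause ever sees them.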
I have a couple of tables that I am looking to get information from. Here is the rundown: in table 1 I have a bunch of columns that I am pulling data from, and one of them is a user ID (a number) identifying the last user to modify a record. From table 2 I want to pull in the name of that user based on the ID from table 1 (table 2 has both the userID and the username).
So my final query would have the columns from table 1 as well as the username from table 2, to show which user last edited the record. I assume this has to be done with a nested select statement, but for the life of me I cannot come up with the correct syntax.
Can anyone help me out?
Thanks
Jeff
Yes, you need a very basic join that links both tables together.
Select t1.UserID,
t2.UserName
FROM table1 t1 INNER JOIN
table2 t2 ON t1.userid=t2.userid
select t1.*, t2.{username} from table1 as t1
join table2 as t2 on t1.{userId}=t2.{userid};
Replace {username} with the actual name of the user-name column,
and likewise {userId} with the appropriate column name in each table.
Hope it helps you.
This is a standard inner join query. To learn more, consider reading: http://www.w3schools.com/sql/
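Here is a small runnable sketch of that join, using Python's built-in sqlite3 module as a scratch database. The columns record_id and title, the helper name records_with_usernames, and the sample rows are all hypothetical stand-ins for whatever table 1 really contains.

```python
import sqlite3


def records_with_usernames(records, users):
    """Return (record_id, title, username) for each record whose userid is in table2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (record_id INTEGER, title TEXT, userid INTEGER)")
    conn.execute("CREATE TABLE table2 (userid INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", records)
    conn.executemany("INSERT INTO table2 VALUES (?, ?)", users)
    # The join condition pairs each record with the user who last modified it.
    rows = conn.execute(
        "SELECT t1.record_id, t1.title, t2.username "
        "FROM table1 t1 INNER JOIN table2 t2 ON t1.userid = t2.userid "
        "ORDER BY t1.record_id"
    ).fetchall()
    conn.close()
    return rows
```

Note that an INNER JOIN silently drops records whose userid has no row in table 2. If those records should still appear (with an empty username), a LEFT JOIN from table 1 is probably the better fit.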
The following query returns >7000 rows when each table only has 340 rows.
SELECT Config.Spec, TempTable.Spec FROM Config INNER JOIN TempTable ON Config.Spec = TempTable.Spec
Why would this happen? If an INNER JOIN only returns a row when there is a match in both tables, why would it return multiple rows for a match?
If there is more than one row with the same Spec value in TempTable for the same Spec value in Config, then you will get duplicate rows, and vice versa.
Are the Spec field values non-unique? This might explain why the query returns too many results; with duplicates you get an effective cross product for those values.
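The multiplication is easy to reproduce on a toy data set: each Spec value contributes (rows in Config) x (rows in TempTable) rows to the join. Below is a quick sketch using Python's built-in sqlite3 module; the helper name join_row_count and the sample values are invented for illustration.

```python
import sqlite3


def join_row_count(config_specs, temp_specs):
    """Return (joined row count, duplicated Specs in Config, duplicated Specs in TempTable)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Config (Spec TEXT)")
    conn.execute("CREATE TABLE TempTable (Spec TEXT)")
    conn.executemany("INSERT INTO Config VALUES (?)", [(s,) for s in config_specs])
    conn.executemany("INSERT INTO TempTable VALUES (?)", [(s,) for s in temp_specs])

    joined = conn.execute(
        "SELECT COUNT(*) FROM Config "
        "INNER JOIN TempTable ON Config.Spec = TempTable.Spec"
    ).fetchone()[0]

    def duplicates(table):
        # Spec values that occur more than once, with their row counts.
        return conn.execute(
            f"SELECT Spec, COUNT(*) FROM {table} "
            "GROUP BY Spec HAVING COUNT(*) > 1 ORDER BY Spec"
        ).fetchall()

    result = (joined, duplicates("Config"), duplicates("TempTable"))
    conn.close()
    return result
```

With Config holding X, X, Y and TempTable holding X, X, X, Y, the join returns 2 x 3 + 1 x 1 = 7 rows from tables of only 3 and 4 rows. Running the GROUP BY ... HAVING COUNT(*) > 1 query against the real tables should show which Spec values are responsible.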
I have a master table A, with ~9 million rows. Another table B (same structure) has ~28K rows from table A. What would be the best way to remove all contents of B from table A?
The combination of all columns (~10) is unique. There is nothing more in the form of a unique key.
If you have sufficient rights you can create a new table and rename that one to A. To create the new table you can use the following script:
CREATE TABLE TEMP_A AS
SELECT *
FROM A
MINUS
SELECT *
FROM B
This should perform pretty well.
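Here is a rough end-to-end sketch of the create-and-rename approach, run against an in-memory SQLite database via Python's built-in sqlite3 module. SQLite spells the set difference EXCEPT where Oracle uses MINUS; the two-column layout, the helper name rebuild_without_b, and the sample rows are assumptions made only for the demo.

```python
import sqlite3


def rebuild_without_b(rows_a, rows_b):
    """Rebuild table A as (A minus B) via a new table, then return A's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (col1 INTEGER, col2 TEXT)")
    conn.execute("CREATE TABLE B (col1 INTEGER, col2 TEXT)")
    conn.executemany("INSERT INTO A VALUES (?, ?)", rows_a)
    conn.executemany("INSERT INTO B VALUES (?, ?)", rows_b)

    # Build the replacement table from the set difference
    # (EXCEPT here plays the role of Oracle's MINUS).
    conn.execute("CREATE TABLE TEMP_A AS SELECT * FROM A EXCEPT SELECT * FROM B")
    # Swap the new table in under the old name.
    conn.execute("DROP TABLE A")
    conn.execute("ALTER TABLE TEMP_A RENAME TO A")

    remaining = sorted(conn.execute("SELECT col1, col2 FROM A").fetchall())
    conn.close()
    return remaining
```

Two caveats: a table created with CREATE TABLE ... AS SELECT does not inherit the original's indexes, constraints, or grants, so those have to be recreated after the rename; and the set difference also collapses duplicate rows, which is harmless here only because the combination of all columns is stated to be unique.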
DELETE FROM TableA WHERE ID IN(SELECT ID FROM TableB)
Should work. Might take a while though.
One way is to just list out all the columns:
delete from A
where exists (select 1 from B where B.Col1 = A.Col1
AND B.Col2 = A.Col2
AND B.Col3 = A.Col3
AND B.Col4 = A.Col4)
Delete t2
from t1
inner join t2
on t1.col1 = t2.col1
and t1.col2 = t2.col2
and t1.col3 = t2.col3
and t1.col4 = t2.col4
and t1.col5 = t2.col5
and t1.col6 = t2.col6
and t1.col7 = t2.col7
and t1.col8 = t2.col8
and t1.col9 = t2.col9
and t1.col10 = t2.col10
This is likely to be very slow, because you would need every column indexed, which is highly unlikely in an environment where a table this size has no primary key, so do it during off-peak hours. What possessed you to have a table with 9 million records and no primary key?
If this is something you'll have to do on a regular basis, the first choice should be to try to improve the database design (looking for primary keys, trying to get the "join" condition to be on as few columns as possible).
If that is not possible, the distinct second option is to figure out the "selectivity" of each of the columns (i.e. how many different values each column has; 'name' would be more selective than 'address country', which in turn would be more selective than 'male/female').
The general type of statement I'd suggest would be like this:
Delete from tableA
where exists (select * from tableB
where tableA.colx1 = tableB.colx1
and tableA.colx2 = tableB.colx2
-- ... and so on for colx3 through colx9 ...
and tableA.colx10 = tableB.colx10);
The idea is to list the columns in order of selectivity and build an index on colx1, colx2, etc. on tableB. The exact number of columns to include in the index on tableB would be a matter of trial and measurement. (Offset the time for building the index on tableB against the improved time of the delete statement.)
If this is just a one time operation, I'd just pick one of the slow methods outlined above. It's probably not worth the effort to think too much about this when you can just start a statement before going home ...
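To make the index-plus-EXISTS suggestion concrete, here is a scaled-down sketch with three columns instead of ten, run in an in-memory SQLite database through Python's built-in sqlite3 module. The index name, the choice of colx1 and colx2 as the "most selective" columns, the helper name delete_matching, and the sample rows are all assumptions for illustration; the real selectivity would have to be measured.

```python
import sqlite3


def delete_matching(rows_a, rows_b):
    """Delete tableA rows that also exist in tableB; return (deleted count, remaining rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableA (colx1 INTEGER, colx2 TEXT, colx3 TEXT)")
    conn.execute("CREATE TABLE tableB (colx1 INTEGER, colx2 TEXT, colx3 TEXT)")
    conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO tableB VALUES (?, ?, ?)", rows_b)

    # Index the smaller table on the columns assumed to be most selective,
    # so each probe from tableA can narrow tableB quickly.
    conn.execute("CREATE INDEX idx_b_selective ON tableB (colx1, colx2)")

    cur = conn.execute(
        "DELETE FROM tableA WHERE EXISTS ("
        "SELECT 1 FROM tableB "
        "WHERE tableA.colx1 = tableB.colx1 "
        "AND tableA.colx2 = tableB.colx2 "
        "AND tableA.colx3 = tableB.colx3)"
    )
    deleted = cur.rowcount
    conn.commit()

    remaining = sorted(conn.execute("SELECT * FROM tableA").fetchall())
    conn.close()
    return deleted, remaining
```

One thing to watch for with any of the column-by-column comparisons in this thread: = never matches NULL against NULL, so rows that contain NULL in a compared column will not be deleted unless the condition handles NULLs explicitly.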
Is there a key value (or values) that can be used?
something like
DELETE a
FROM tableA a
INNER JOIN tableB b
on b.id = a.id