Query to compare differences of two columns from two different tables

I am trying to write a UNION ALL query over two differently named columns in two different tables.
I would like to compare the "MTRL" column in the USER_EXCEL table against the "short_material_number" column in the IM_EXCEL table, and have the query return only the values that differ between the two columns. Both columns hold material numbers; only the column names differ between the tables.
What I have so far is:
(SELECT [MTRL] FROM dbo.USER_EXCEL
EXCEPT
SELECT [short_material_number] FROM dbo.IM_Excel)
UNION ALL
(SELECT [short_material_number] FROM dbo.IM_Excel
EXCEPT
SELECT [MTRL] FROM dbo.USER_EXCEL)
However, when I run that query I receive this error message:
Msg 8114, Level 16, State 5, Line 22
Error converting data type varchar to float.

You're almost certainly getting this error because one of your two columns is a FLOAT data type, while the other is VARCHAR. When you compare two columns of different data types, SQL Server performs an implicit conversion. FLOAT has higher data type precedence than VARCHAR, so the VARCHAR values are converted to FLOAT, and any value that is not a valid number raises this error.
To get this working, you need to convert the float to a varchar, like in the example below.
(
SELECT [MTRL] FROM dbo.USER_EXCEL
EXCEPT
SELECT CAST([short_material_number] AS VARCHAR(18)) FROM dbo.IM_Excel
)
UNION ALL
(
SELECT CAST([short_material_number] AS VARCHAR(18)) FROM dbo.IM_Excel
EXCEPT
SELECT [MTRL] FROM dbo.USER_EXCEL
)
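One caveat with a plain CAST from FLOAT to VARCHAR: SQL Server keeps at most six significant digits and switches to scientific notation for larger values (1234567 becomes 1.23457e+006), so longer material numbers would never match. Below is a minimal sketch of a workaround. It assumes that MTRL is the VARCHAR column, that short_material_number is the FLOAT column and holds whole numbers only, and that TRY_CONVERT is available (SQL Server 2012 or later).

```sql
-- Assumption: MTRL is VARCHAR, short_material_number is FLOAT (whole numbers).

-- 1) List the MTRL values that cannot be read as numbers. These are the
--    values that make the implicit VARCHAR-to-FLOAT conversion fail (Msg 8114).
SELECT [MTRL]
FROM dbo.USER_EXCEL
WHERE [MTRL] IS NOT NULL
  AND TRY_CONVERT(FLOAT, [MTRL]) IS NULL;

-- 2) Compare as text, going through DECIMAL(18, 0) so the float is written
--    out digit by digit instead of in scientific notation.
(
    SELECT [MTRL] FROM dbo.USER_EXCEL
    EXCEPT
    SELECT CAST(CAST([short_material_number] AS DECIMAL(18, 0)) AS VARCHAR(20)) FROM dbo.IM_Excel
)
UNION ALL
(
    SELECT CAST(CAST([short_material_number] AS DECIMAL(18, 0)) AS VARCHAR(20)) FROM dbo.IM_Excel
    EXCEPT
    SELECT [MTRL] FROM dbo.USER_EXCEL
);
```

The comparison is still textual, so MTRL values with leading zeros or surrounding spaces will be reported as differences even when the underlying number is the same.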

From your question I understand that you are trying to compare two columns, but your query returns only one column. I would recommend the following query, which shows the differences side by side:
SELECT ue.[MTRL], ie.[short_material_number]
FROM dbo.IM_Excel ie
FULL OUTER JOIN
dbo.USER_EXCEL ue
ON CAST(ie.[short_material_number] AS VARCHAR(20)) = ue.[MTRL]
WHERE ie.[short_material_number] IS NULL
OR ue.[MTRL] IS NULL
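If the goal is mainly to see which table each unmatched value came from, the EXCEPT approach can also carry a label. This is a sketch under the same assumption as the answers here (short_material_number is the FLOAT column and holds whole numbers); the found_in and material column names are made up for illustration.

```sql
-- Going through DECIMAL(18, 0) avoids the scientific notation that a direct
-- FLOAT-to-VARCHAR cast produces for values with more than six digits.
SELECT 'only in USER_EXCEL' AS found_in, u.material
FROM (
    SELECT [MTRL] AS material FROM dbo.USER_EXCEL
    EXCEPT
    SELECT CAST(CAST([short_material_number] AS DECIMAL(18, 0)) AS VARCHAR(20)) FROM dbo.IM_Excel
) AS u
UNION ALL
SELECT 'only in IM_Excel' AS found_in, i.material
FROM (
    SELECT CAST(CAST([short_material_number] AS DECIMAL(18, 0)) AS VARCHAR(20)) AS material FROM dbo.IM_Excel
    EXCEPT
    SELECT [MTRL] FROM dbo.USER_EXCEL
) AS i;
```

Unlike the FULL OUTER JOIN, this returns one row per distinct unmatched value, because EXCEPT removes duplicates.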

Related

Null records in data comparison

I am building a VIEW by comparing two tables: TABLE 1 has a column RUN_DATE and TABLE 2 has a column FT_Data_cut, both of date type. I need to fetch the latest data from Table 2 into the view. My condition below is not working.
SELECT
FROM "XYZ"."DATA_LAKE"."STE_INCOMING"
WHERE FT_DATA_CUT = (
select max(RUN_DATE)
from "XYZ"."OUTBOUND"."CONTROL_TABLE"
where release_flag = 'RELEASED'
and table_name='STE_INCOMING');
On a quick check, the subquery works fine on its own:
select max(RUN_DATE)
from "XYZ"."OUTBOUND"."CONTROL_TABLE"
where release_flag = 'RELEASED'
and table_name='STE_INCOMING'
The subquery returns the expected date, but the comparison against it fails: the full query returns zero rows, even though the same date (e.g. 2022-02-21) exists in both columns. The data type is the same in both tables, as mentioned above, so I don't know why the two dates do not compare as equal.

How can I check and remove duplicate rows?

I have a problem with a fairly big table that has some null values in 3 columns: one datetime2 column and 2 float columns.
The nice, simple statement from a similar question returns only the 2 rows where datetime2 is null, and nothing else (the same goes for a lot of others I tried):
-- allRemainingCols stands for the list of every column except RowId
DELETE MyTable
FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, allRemainingCols
FROM MyTable
GROUP BY allRemainingCols
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
It seems to work when the datetime2 column has no nulls.
There is a manual workaround, but is there any way to write a query or procedure for this using T-SQL only?
SELECT id,remainingColumns
FROM table
order BY remainingColumns
Then compare all columns in Excel (15 in my case; I placed =ROW() in the first column as a check, put the formula next to the last column, and auto-filtered for TRUE): =AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2)
Or compare 3 rows like this and select all non-unique rows:
=OR(
AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2);
AND(B2=B3;C2=C3;D2=D3;E2=E3;F2=F3;G2=G3;H2=H3;I2=I3;J2=J3;K2=K3;L2=L3;M2=M3;N2=N3;O2=O3;P2=P3)
)

It took quite a lot of work to find the answer for my particular data...
Most of the float numbers were slightly different.
That is hard to spot, but a simple CAST(column as binary) can show these invisible differences.
For example, values that all display as 96,6666666666667 turn out to be 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAD in one row and 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAB in another, etc.
And the visible value 96.6666666666667, typed in as a literal, returns something different again (presumably because the literal is a decimal rather than a float):
0x0000000000000000000000000000000000000F0D0001AB6A489F2D6F0300
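Building on that finding, one way to remove the near-duplicates in T-SQL alone is to group on rounded float values. The sketch below is only an illustration: the column names DateCol, FloatCol1 and FloatCol2 are hypothetical, and rounding to 6 decimals is an arbitrary tolerance to adjust for the data.

```sql
-- Hypothetical schema: MyTable(RowId, DateCol, FloatCol1, FloatCol2).

-- Inspect the raw bits of a float column to spot values that print the same
-- but are stored differently.
SELECT RowId, FloatCol1, CAST(FloatCol1 AS VARBINARY(8)) AS FloatCol1Bits
FROM MyTable
ORDER BY FloatCol1;

-- Delete duplicates, treating floats as equal when they agree to 6 decimals.
-- PARTITION BY puts NULLs into the same group, so rows with a NULL date are
-- compared with each other as well. The lowest RowId in each group is kept.
WITH Ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY DateCol, ROUND(FloatCol1, 6), ROUND(FloatCol2, 6)
               ORDER BY RowId
           ) AS rn
    FROM MyTable
)
DELETE FROM Ranked
WHERE rn > 1;
```

Rounding is not a perfect tolerance test: two values that sit on either side of a rounding boundary still land in different groups, so it is worth running the inner SELECT on its own before deleting anything.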

Group by an evaluated field (sql server) [duplicate]

Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's rule of normalizing, so that data is represented only once in the database, ought to extend to code as well. I'd want to write that calculation expression only once (in the SELECT column list) and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.

One of the reasons is that ORDER BY is the last thing that runs in a SQL query. Here is the order of operations:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
So once you have the columns from the SELECT clause, you can use ordinal positioning.
EDIT: added this based on the comment.
Take this for example:
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, but it won't run:
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.

There are a lot of elementary inconsistencies in SQL, and the use of scalars is one of them. For example, one might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be similar queries (the difference between the two can be made infinitesimally small, after all), but they are not.

I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.

Use aliases (unquoted, because a quoted 'seen_year' in GROUP BY is read as a string constant):
SELECT DATEPART(YEAR,LastSeenOn) AS seen_year, COUNT(*) AS Total
FROM Employee AS e
GROUP BY seen_year
** EDIT **
If GROUP BY on an alias is not allowed in your database (SQL Server rejects it), here's a workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP
BY seen_year

Databases that don't support this are basically choosing not to. I understand the order in which the various steps are processed, but it is very easy (as many databases have shown) to parse the SQL, understand it, and apply the translation for you. Where it's really a pain is when a column is a long CASE expression: having to repeat that in the GROUP BY clause is super annoying. Yes, you can use the nested-query workaround that someone demonstrated above, but at this point not supporting GROUP BY column numbers is just a lack of care for your users.

Issue in union operation

I have a query in my database like this:
SELECT 0 AS [DocumentType],'Select Document Type' [DocumentTypeX],0 ,0
UNION
SELECT dbo.tbDocumentType.*
FROM dbo.tbDocumentType where Site=@Site
It throws the error message "All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists."

The first and foremost rules for a UNION operation:
1. Both queries must return the same number of columns.
2. Corresponding columns of both queries must have compatible data types.
3. Never use TableName.*; specify the column names instead.
Please check your query against these.

Instead of
SELECT dbo.tbDocumentType.*
select the columns that match your UNION fields:
SELECT dbo.tbDocumentType.[DocumentType],
dbo.tbDocumentType.[DocumentTypeX],
dbo.tbDocumentType.[Something1],
null -- Or use any value you want if the table doesn't have the column
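Put together, the whole statement could look like the sketch below. The real column list of tbDocumentType is not shown in the question, so Something1 and Something2 are placeholders, and the type and value of @Site are assumed purely for illustration.

```sql
-- Hypothetical: tbDocumentType has the columns DocumentType, DocumentTypeX,
-- Something1 and Something2; replace them with the real column names.
DECLARE @Site INT = 1;  -- assumed type and value

-- The placeholder row supplies exactly one expression per column of the
-- second SELECT, with compatible data types, so the UNION lines up.
SELECT 0 AS [DocumentType], 'Select Document Type' AS [DocumentTypeX], 0 AS [Something1], 0 AS [Something2]
UNION
SELECT t.[DocumentType], t.[DocumentTypeX], t.[Something1], t.[Something2]
FROM dbo.tbDocumentType AS t
WHERE t.Site = @Site;
```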

how to use all columns (*) in select statement from different tables using UNION ALL?

I have 10 tables, of which 4 have 99 columns and 6 have 100 columns. I have to combine them using UNION ALL. When I execute the SQL query I get the error below:
Msg 205, Level 16, State 1, Line 6
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
I understand that the error is caused by the differing number of columns. I tried using NULL as Column100 but I still get the same error.
Can anyone suggest how to use * and UNION ALL in a SQL query?
Thanks.

If the extra column happens to be at the beginning or end and the other columns are in exactly the same order, then you can add the column manually:
select t99.*, 't99' as col
from t99
union all
select t100.*
from t100;
But really, is it that hard to list the columns? An explicit column list is much less prone to error. And, it will work regardless of where the 100th column appears.
You can get the list in SQL Server Management Studio by clicking on the table name. You can also run a query such as:
select column_name
from information_schema.columns
where table_name = 't99';
And then use the column names to construct the query (I often use a spreadsheet for this purpose).
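The spreadsheet step can also be done in SQL. The sketch below first finds which column the smaller table lacks and then builds a ready-to-paste column list. It reuses the t99 and t100 example names from above and assumes SQL Server 2017 or later for STRING_AGG.

```sql
-- Which columns exist in t100 but not in t99?
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 't100'
EXCEPT
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 't99';

-- Build a comma-separated, bracket-quoted column list in table order.
-- The cast to NVARCHAR(MAX) keeps long lists from hitting the 4000-character limit.
SELECT STRING_AGG(CAST(QUOTENAME(COLUMN_NAME) AS NVARCHAR(MAX)), ', ')
           WITHIN GROUP (ORDER BY ORDINAL_POSITION) AS column_list
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 't99';
```

If tables with the same name exist in more than one schema, add a TABLE_SCHEMA filter to each query.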

UNION requires that the columns before and after it MATCH.
You cannot do a union of 99 columns with 100 columns. You have to either provide a dummy value for the 100th column that does not exist in that table, or tell the DB to skip that column.
So add to the select on the smaller table:
NULL AS [missing-column-name]
Or list all the common columns by hand, omitting the columns that do not exist in both.
