Null records in data comparison - snowflake-cloud-data-platform

I am building a view by comparing two tables on two date columns: RUN_DATE (table 1) and FT_DATA_CUT (table 2). I need to fetch the latest data from table 2 into the view, but the condition below is not working.
SELECT *
FROM "XYZ"."DATA_LAKE"."STE_INCOMING"
WHERE FT_DATA_CUT = (
select max(RUN_DATE)
from "XYZ"."OUTBOUND"."CONTROL_TABLE"
where release_flag = 'RELEASED'
and table_name='STE_INCOMING');
On a brief check, the subquery on its own works fine:
select max(RUN_DATE)
from "XYZ"."OUTBOUND"."CONTROL_TABLE"
where release_flag = 'RELEASED'
and table_name='STE_INCOMING'
The subquery returns the expected date, but the comparison against FT_DATA_CUT fails: the full query returns zero rows, even though both columns contain a common date (for example 2022-02-21). As mentioned above, both columns have the same data type, so I don't understand why the two dates do not compare as equal.
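The thread has no answer for this question, so the following is only a sketch of one possible cause, not a diagnosis: if one of the two columns actually carries a time component (or is stored as text or a timestamp), an equality test finds no rows until both sides are reduced to a plain date. The example uses Python's sqlite3 module as a stand-in for Snowflake, with simplified table definitions and sample rows that are assumptions.

```python
import sqlite3

# In-memory stand-in for the two tables; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE control_table (run_date TEXT, release_flag TEXT, table_name TEXT);
CREATE TABLE ste_incoming (id INTEGER, ft_data_cut TEXT);
INSERT INTO control_table VALUES ('2022-02-21 08:15:00', 'RELEASED', 'STE_INCOMING');
INSERT INTO ste_incoming VALUES (1, '2022-02-21'), (2, '2022-02-14');
""")

SUBQUERY = """
    SELECT max({expr}) FROM control_table
    WHERE release_flag = 'RELEASED' AND table_name = 'STE_INCOMING'
"""

# Direct comparison: the hidden time part on run_date prevents any match.
raw = conn.execute(
    "SELECT id FROM ste_incoming WHERE ft_data_cut = ("
    + SUBQUERY.format(expr="run_date") + ")"
).fetchall()

# Both sides reduced to a date before comparing.
normalized = conn.execute(
    "SELECT id FROM ste_incoming WHERE date(ft_data_cut) = ("
    + SUBQUERY.format(expr="date(run_date)") + ")"
).fetchall()

print(raw)         # no rows
print(normalized)  # the row dated 2022-02-21
```

In Snowflake the same idea would be to cast both sides, for example `FT_DATA_CUT::date = (select max(RUN_DATE::date) ...)`. Whether that applies here depends on what the columns really contain, which the question does not show.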

Related

SQL Server: Comparing dates from multiple records

When a part is created in a table ("ASC_PMA_TBL"), a number is auto-generated. Any "sub-parts" that are subsequently created then have an associated number. So for example, the "master" part might be 18245, and it may have several subparts which would be "18245-50", or "18245-40", etc. Subparts are always identified by having the master part number, followed by a '-' then a two-digit number. Each sub part has a date associated with it ("EO_DATE"). All I want to do is display records where the "master" dates don't match each of the sub-parts dates. All data is in the one table "ASC_PMA_TBL".
Normally this would be easily achieved using a join. However in the database, the subparts are not related to their master through the use of foreign keys, so I'm having to find a different way of doing things.
Furthermore, the date field is a date/ time field, so to compare them I first have to convert the field into a date only field. I can do this, but then am unable to use the alias in my query!
Any help is much appreciated :)
I have tried creating temporary tables and using subqueries, but cannot solve this problem :(
UPDATE: Managed to solve the problem using temporary tables, truncating the part number of the sub-parts to match the master parts, and then joining the two to compare the dates. Might be messy, but it works!
SELECT
PMA_PART_ONLY,
CONVERT(DATE,PMA_EFFECT_DATE_OFF) As 'EO_DATE'
INTO
##MParts
FROM
ASC_PMA_TBL
WHERE
(PMA_PROC_CODE = 'M') AND
(PMA_EFFECT_DATE_OFF IS NOT NULL)
SELECT
PMA_PART_ONLY,
CONVERT(DATE,PMA_EFFECT_DATE_OFF) As 'EO_DATE',
SUBSTRING(PMA_PART_ONLY,0,CHARINDEX('-',PMA_PART_ONLY,0)) As 'MP_NO'
INTO
##SParts
FROM
ASC_PMA_TBL
WHERE
(PMA_PROC_CODE = 'S') AND
(PMA_EFFECT_DATE_OFF IS NOT NULL)
SELECT
##SParts.PMA_PART_ONLY As 'SUB_PART_NO',
##MParts.EO_DATE As 'M_PART_DATE',
##SParts.EO_DATE As 'S_PART_DATE'
FROM
##MParts INNER JOIN ##SParts ON ##SParts.MP_NO = ##MParts.PMA_PART_ONLY
WHERE
(##MParts.EO_DATE <> ##SParts.EO_DATE)
ORDER BY
SUB_PART_NO DESC
DROP TABLE ##MParts
DROP TABLE ##SParts
If you want to compare just dates and not times you gotta convert the dates:
select *
from ASC_PMA_TBL master
inner join ASC_PMA_TBL parts
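ON parts.number like CAST(master.number AS VARCHAR(30)) + '-%'
where CAST(master.EO_DATE AS DATE) <> CAST(parts.EO_DATE AS DATE)
That's the main idea: get all master and part rows where the part number is like the master number followed by the separator.
The question describes '-' as the separator, which needs no escaping. If the separator were an underscore, it would have to be written as [_], because "_" is a single-character wildcard in LIKE.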
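The self-join idea above can be tried out end to end. This is a minimal sketch using Python's sqlite3 module rather than SQL Server, so the string concatenation (`||`) and the `date()` function differ from T-SQL; the column names `part_no`, `proc_code` and `eo_date` and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asc_pma_tbl (part_no TEXT, proc_code TEXT, eo_date TEXT);
INSERT INTO asc_pma_tbl VALUES
    ('18245',    'M', '2020-01-15 09:30:00'),
    ('18245-40', 'S', '2020-01-15 17:45:00'),  -- same day, different time
    ('18245-50', 'S', '2020-02-01 08:00:00'),  -- different day
    ('182450',   'M', '2020-03-01 00:00:00');  -- unrelated master part
""")

# Join each master to its sub-parts by prefix + '-', then compare
# the date portion only, ignoring the time of day.
mismatches = conn.execute("""
    SELECT s.part_no, date(m.eo_date), date(s.eo_date)
    FROM asc_pma_tbl AS m
    JOIN asc_pma_tbl AS s
      ON s.part_no LIKE m.part_no || '-%'
    WHERE m.proc_code = 'M'
      AND s.proc_code = 'S'
      AND date(m.eo_date) <> date(s.eo_date)
    ORDER BY s.part_no
""").fetchall()

print(mismatches)
```

Including the '-' in the pattern matters: without it, master 18245 would also match the unrelated part 182450.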

Combine multiple date columns into one date column

I want to combine multiple date columns by taking the least date value among them, excluding NULL values. I have tried various approaches, such as 'case when' and the 'Min' function, but can't weed out the NULL values. I am not looking for the first non-NULL value either. What makes matters worse is that the 'LEAST' function is not available in Netezza.
My dummy data(not highlighted columns), my desired output(highlighted columns) is shown in the table below:
The MIN and MAX functions on a Netezza system work as simple scalar functions (as well as columnar functions), and you can give them multiple arguments, which is essentially the same as LEAST. This example covers the MIN function, but MAX works the same way:
select min(a)
from (
select min(min(1,2),NULL) a
union all
select 10
union all
select NULL
) x
Result is 10
To disregard possible NULL values, wrap each argument in NVL (the same as the DB2 COALESCE function) with an "infinite" value of some sort. Replace line 3 in the above statement with this:
select min(min(1,2),nvl(NULL,100)) a
Result is now 1
In your case you should be able to do this:
select patient_id,min(
nvl(INDEX_DT,'9999-12-31'),
nvl(PRE_FEVER_DT,'9999-12-31'),
nvl(POST_FEVER_DT,'9999-12-31'),
nvl(PRE_DIARR_DT,'9999-12-31'),
nvl(POST_DIARR_DT,'9999-12-31'),
nvl(PRE_COUGH_DT,'9999-12-31'),
nvl(POST_COUGH_DT,'9999-12-31')
) as signs_DT
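The sentinel approach above can be checked on a small data set. The sketch below uses Python's sqlite3 module, whose multi-argument `min()` also returns NULL when any argument is NULL, in place of Netezza; the table name `patient_dates`, the reduced column set and the sample rows are assumptions. It also adds one step that is not in the answer: `nullif` turns the sentinel back into NULL for rows where every date is missing.

```python
import sqlite3

SENTINEL = "9999-12-31"  # "infinite" date used to neutralise NULLs

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dates (
    patient_id INTEGER, index_dt TEXT, pre_fever_dt TEXT, post_fever_dt TEXT
);
INSERT INTO patient_dates VALUES
    (1, '2021-03-10', NULL, '2021-03-02'),
    (2, NULL, '2021-05-20', NULL),
    (3, NULL, NULL, NULL);
""")

# coalesce() plays the role of NVL; nullif() restores NULL when
# nothing but the sentinel was available.
rows = conn.execute("""
    SELECT patient_id,
           nullif(min(coalesce(index_dt, :s),
                      coalesce(pre_fever_dt, :s),
                      coalesce(post_fever_dt, :s)), :s) AS signs_dt
    FROM patient_dates
    ORDER BY patient_id
""", {"s": SENTINEL}).fetchall()

print(rows)
```

Without the `nullif` step, patient 3 would be reported with the date 9999-12-31, which may or may not be acceptable downstream.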
The least() function is available as a part of the SQL extension toolkit.
Here is the link to the least() function -> https://www.ibm.com/docs/en/netezza?topic=functions-least
Here are the installation instructions -> https://www.ibm.com/docs/en/psfa/7.1.0?topic=setup-installing-netezza-sql-extensions-toolkit
The SQL toolkit can be downloaded from IBM Fix Central.

How can I check and remove duplicate rows?

I have a problem with quite a big table that has some NULL values in 3 columns: one datetime2 column and 2 float columns.
A nice, simple query from a similar question only catches the 2 rows where the datetime2 column is NULL, but nothing else (the same goes for a lot of other suggestions):
DELETE MyTable FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, allRemainingCols
FROM MyTable
GROUP BY allRemainingCols
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
It seems to work when the datetime2 column has no NULLs.
There is a manual workaround, but is there any way to write a query or procedure for this using T-SQL only?
SELECT id,remainingColumns
FROM table
order BY remainingColumns
Then compare all columns in Excel (15 in my case; I placed =ROW() in the first column as a check, and put the formula next to the last column with an auto filter for TRUE values): =AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2)
Or compare 3 rows like this and select all non-unique rows:
=OR(
AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2);
AND(B2=B3;C2=C3;D2=D3;E2=E3;F2=F3;G2=G3;H2=H3;I2=I3;J2=J3;K2=K3;L2=L3;M2=M3;N2=N3;O2=O3;P2=P3)
)
It took quite a lot of work to find the answer for my particular data: most of the float numbers were slightly different.
This is hard to spot, but a simple CAST(column AS binary) shows these invisible differences.
Like 96,6666666666667 vs 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAD vs 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAB etc.
And the visible 96.6666666666667 can come back as something different again:
0x0000000000000000000000000000000000000F0D0001AB6A489F2D6F0300
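The effect described above, floats that display identically but differ in their stored bits, can be reproduced outside the database. This is a small Python sketch, not the T-SQL from the thread; the helper names `float_bits` and `dedupe_key`, the sample rows and the choice of 6 decimal places are all assumptions for illustration.

```python
import struct

def float_bits(value: float) -> str:
    """Return the IEEE 754 double encoding of value as 16 hex digits."""
    return struct.pack(">d", value).hex().upper()

a = 96.6666666666667  # the value as it is displayed
b = 290 / 3           # a computed value that displays the same at 15 digits

same_display = f"{a:.15g}" == f"{b:.15g}"    # identical when shown
same_bits = float_bits(a) == float_bits(b)   # different underneath

def dedupe_key(row, places=6):
    """Round float fields so near-identical rows share one key."""
    return tuple(round(v, places) if isinstance(v, float) else v for v in row)

rows = [(1, a), (1, b), (2, 5.0)]  # hypothetical (id, value) rows
unique = list({dedupe_key(r): r for r in rows}.values())

print(same_display, same_bits)
print(float_bits(a), float_bits(b))
print(len(unique))
```

The same principle carries over to SQL: grouping on a rounded value (for example `ROUND(col, 6)`) instead of the raw float lets such rows be treated as duplicates. How many decimal places are safe depends on the data.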

MSRS column group

I'm trying to prepare a report like the one in the image below
Report1
When I preview the report, I get three additional columns between the Reservations column and the first type of stock_description
Report2
Currently the select part of my T-SQL query contains:
sum(units),
sum(units_required),
sum(units_avaliable)
I know that T-SQL aggregates ignore NULL values. But when I change the query to:
sum(isnull (units,0)),
sum(isnull (units_required,0)),
sum(isnull (units_avaliable,0))
then I get 0 in those additional columns instead of NULL. When the query returns a value, it appears where it should be: under one of the stock_description columns.
What should I do to remove those three columns between Reservations and stock_location?
This happens because your data contains rows where the Stock_description field is NULL. You can add a condition to your T-SQL to exclude rows with a NULL stock description.
SELECT ....
FROM ....
JOIN ....
WHERE .....
AND TableName.Stock_Description IS NOT NULL
One thing you need to watch and test, though, is what happens if there are units under a NULL Stock_description.
You can also handle this in SSRS by filtering at either the Tablix or the data source, but doing it in the SQL itself is much better.
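To see why a NULL description produces an extra group (and therefore extra columns in a column group), here is a minimal sketch using Python's sqlite3 module instead of SQL Server. The `stock` table and its rows are assumptions; only the grouping behaviour is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock (stock_description TEXT, units INTEGER);
INSERT INTO stock VALUES
    ('Shelf A', 5), ('Shelf A', NULL), ('Shelf B', 3), (NULL, NULL);
""")

QUERY = """
    SELECT stock_description, sum(units)
    FROM stock {where}
    GROUP BY stock_description
    ORDER BY stock_description
"""

# Without a filter, the NULL description forms a group of its own.
all_groups = conn.execute(QUERY.format(where="")).fetchall()

# Excluding NULL descriptions removes that group.
filtered = conn.execute(
    QUERY.format(where="WHERE stock_description IS NOT NULL")
).fetchall()

print(all_groups)
print(filtered)
```

Note that sum(units) still skips the NULL units inside 'Shelf A'; the filter only removes the group whose key is NULL.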

Query to compare differences of two columns from two different tables

I am attempting to create a UNION ALL query, on two differently named columns in two different tables.
I would like to take the "MTRL" column in the table USER_EXCEL and compare it against the "short_material_number" column from the IM_EXCEL table. I would then like the query to only return the differences between the two columns. Both columns house material numbers but are named differently (column wise) in the tables.
What I have so far is:
(SELECT [MTRL] FROM dbo.USER_EXCEL
EXCEPT
SELECT [short_material_number] FROM dbo.IM_Excel)
UNION ALL
(SELECT [short_material_number] FROM dbo.IM_Excel
EXCEPT
SELECT [MTRL] FROM dbo.USER_EXCEL)
However, when trying to run that query I receive an error message that states:
Msg 8114, Level 16, State 5, Line 22
Error converting data type varchar to float.
You're almost certainly getting this error because one of your two columns is a FLOAT data type, while the other is VARCHAR. This article would be a good place to start reading about implicit conversions, which happen when you try to compare two columns that are of different data types.
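To get this working, you need to convert the float to a varchar, as in the example below. Be aware that SQL Server's default float-to-varchar conversion keeps at most six digits and may switch to scientific notation, so for longer material numbers it is safer to cast through an exact numeric type such as DECIMAL first.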
(
SELECT [MTRL] FROM dbo.USER_EXCEL
EXCEPT
SELECT CAST([short_material_number] AS VARCHAR(18)) FROM dbo.IM_Excel
)
UNION ALL
(
SELECT CAST([short_material_number] AS VARCHAR(18)) FROM dbo.IM_Excel
EXCEPT
SELECT [MTRL] FROM dbo.USER_EXCEL
)
From your question I understand that you are trying to compare two columns but are returning only one. I would recommend the following query to compare the differences side by side:
SELECT ue.[MTRL], ie.[short_material_number]
FROM dbo.IM_Excel ie
FULL OUTER JOIN
dbo.USER_EXCEL ue
ON CAST(ie.[short_material_number] AS VARCHAR(20)) = ue.[MTRL]
WHERE ie.[short_material_number] IS NULL
OR ue.[MTRL] IS NULL
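The logic behind both answers (bring the two columns to a common type, then keep what appears on only one side) can be sketched outside the database as well. The Python below is an illustration only: the sample values and the `normalize` helper are assumptions, and sets are used because EXCEPT also works on distinct values.

```python
def normalize(value) -> str:
    """Render a material number as text; whole floats lose the trailing '.0'."""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value).strip()

user_excel_mtrl = ["1001", "1002", "A-77"]  # hypothetical varchar column
im_excel_short = [1001.0, 1003.0]           # hypothetical float column

left = {normalize(v) for v in user_excel_mtrl}
right = {normalize(v) for v in im_excel_short}

only_in_user = sorted(left - right)  # like the first EXCEPT
only_in_im = sorted(right - left)    # like the second EXCEPT

print(only_in_user)
print(only_in_im)
```

A value such as 'A-77' shows why the conversion has to go from float to text and not the other way round: it cannot be converted to a float at all, which is what triggers the error in the original query.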

Resources