Converting 'date' stored as integer (number of days since 1 Jan 1970) in Avro to Snowflake 'date' type - snowflake-cloud-data-platform

I've a requirement to migrate data from some on-premise databases to the cloud. Some of the data in the tables is stored as 'date' in format yyyy-mm-dd.
We are converting the data stored in the tables into Avro format and then it's copied into Snowflake.
In Avro, a date is stored as an integer (the Avro date logical type).
When I try to push the data into Snowflake, it's unable to convert that integer back into a date. I get the following error: 'Failed to cast VARIANT 13707 to date',
where 13707 is the number of days since Jan 1 1970.
Thanks!

You need to calculate the date value based on the variant value. You can use DATEADD for this purpose:
https://docs.snowflake.com/en/sql-reference/functions/dateadd.html
create table avro_test ( x date );
insert into avro_test(x)
select dateadd('day',parse_json('13707'),'1970-01-01');
select * from avro_test;
+------------+
| X          |
+------------+
| 2007-07-13 |
+------------+
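To check values outside Snowflake, the same epoch-plus-days arithmetic can be sketched in Python. This only illustrates the calculation that DATEADD performs above; the helper name avro_days_to_date is made up for this example.

```python
from datetime import date, timedelta

# Avro's date logical type counts days from the Unix epoch.
UNIX_EPOCH = date(1970, 1, 1)


def avro_days_to_date(days: int) -> date:
    """Hypothetical helper: convert an Avro day count to a calendar date."""
    return UNIX_EPOCH + timedelta(days=days)


print(avro_days_to_date(13707))  # 2007-07-13
```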

If the format of the input parameter is a string that contains an integer:
After the string is converted to an integer, the integer is treated as a number of seconds, milliseconds, microseconds, or nanoseconds after the start of the Unix epoch (1970-01-01 00:00:00.000000000 UTC).
If the integer is less than 31536000000 (the number of milliseconds in a year), then the value is treated as a number of seconds.
If the value is greater than or equal to 31536000000 and less than 31536000000000, then the value is treated as milliseconds.
If the value is greater than or equal to 31536000000000 and less than 31536000000000000, then the value is treated as microseconds.
If the value is greater than or equal to 31536000000000000, then the value is treated as nanoseconds.
If more than one row is evaluated (for example, if the input is the column name of a table that contains more than two rows), the first processed value determines whether all subsequent values are treated as seconds, milliseconds, microseconds, or nanoseconds.
If the first value is greater than or equal to 31536000000, then all values will be treated as milliseconds, even if some remaining values are less than 31536000000. Similar logic applies for microseconds and nanoseconds.
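The thresholds quoted above can be sketched as a small Python function. This is not Snowflake code, only an illustration of the rule as stated; the name unit_for_integer is hypothetical. Under that rule a day count such as 13707 falls in the seconds range, so it would not be read as a number of days.

```python
# Number of milliseconds in a 365-day year, the base threshold in the rule above.
MS_PER_YEAR = 31_536_000_000


def unit_for_integer(value: int) -> str:
    """Hypothetical helper: which unit the quoted rule assigns to an integer."""
    if value < MS_PER_YEAR:
        return "seconds"
    if value < MS_PER_YEAR * 1_000:
        return "milliseconds"
    if value < MS_PER_YEAR * 1_000_000:
        return "microseconds"
    return "nanoseconds"


print(unit_for_integer(13707))          # seconds
print(unit_for_integer(1184284800000))  # milliseconds
```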

Related

Postgres Date Type Value

I want to retrieve a date type from a Postgres table using libpq PQexecParams() in binary mode (please humor me).
https://www.postgresql.org/docs/14/datatype-datetime.html says that a date is 4 bytes (4713 BC to 5874897 AD).
src/include/utils/date.h defines:
typedef int32 DateADT;
But obviously given the supported date range it's not a normal int. Something like this:
int32_t haha = be32toh(*((uint32_t *) PQgetvalue(res, 0, 17)));
Gives haha=1466004328 for 2022-10-25.
Which is clearly not a day count, and since it's not a multiple of 86,400 it's also not seconds since an epoch. The number is also too small to be microseconds.
How do I interpret the 4 bytes of postgresql 'date' data?
Added Later:
This question contains an error - PQgetvalue() references column 17 (a text value) instead of column 18 (a date value) - with that corrected haha=8332
Date is an integer day count from POSTGRES_EPOCH_JDATE (2000-01-01).
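As a rough illustration of that interpretation, here is a Python sketch that decodes the 4 big-endian bytes and adds them to the 2000-01-01 epoch. The helper name pg_binary_date_to_date is hypothetical, and Python's date type covers only years 1 to 9999, far less than the range PostgreSQL supports.

```python
import struct
from datetime import date, timedelta

# PostgreSQL counts 'date' values in days from 2000-01-01.
POSTGRES_EPOCH = date(2000, 1, 1)


def pg_binary_date_to_date(raw: bytes) -> date:
    """Hypothetical helper: decode the 4 bytes of a binary-mode 'date' value."""
    (days,) = struct.unpack("!i", raw)  # network byte order, signed 32-bit
    return POSTGRES_EPOCH + timedelta(days=days)


print(pg_binary_date_to_date(struct.pack("!i", 0)))    # 2000-01-01
print(pg_binary_date_to_date(struct.pack("!i", 366)))  # 2001-01-01
print(pg_binary_date_to_date(struct.pack("!i", -1)))   # 1999-12-31
```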

Formula to write milliseconds in hh:mm:ss.000 format gives wrong values

I'm trying to convert duration in one column which is written in milliseconds (Ex: 600,2101,1110....) to hh:mm:ss.000 format(Ex:00:00:00.600, 00:00:02.101...) using the below formula in google spreadsheets:
=CONCATENATE(TEXT(INT(A1/1000)/86400,"hh:mm:ss"),".",A1-(INT(A1/1000)*1000))
It gives correct values for almost all rows, except durations whose millisecond part starts with 0, i.e. whose second digit is '0' (e.g. 2010, 3056, 1011).
In those cases the leading 0 of the fractional part is dropped, so only two digits appear after the decimal point (rows 1 and 2 in the table below). For other durations the result is right (row 3).
I need a formula that works for all values, i.e. 1080 → 00:00:01.080 and not 00:00:01.80.
Can someone please help with this?
Duration in milliseconds | hh:mm:ss.000 format
1080 | 00:00:01.80 (wrong)
2010 | 00:00:02.10 (wrong)
1630 | 00:00:01.630 (correct)
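The formula appends the millisecond remainder as a plain number, so its leading zeros are lost (80 instead of 080). Let TEXT format the fraction as well; try: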
=INDEX(IF(A2:A="",,TEXT(A2:A/86400000, "hh:mm:ss.000")))
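The zero-padding point can also be seen in a short Python sketch. It is not a spreadsheet formula, just the same idea: the millisecond remainder has to be padded to three digits. The function name format_millis is made up here, and unlike the "hh" format it does not wrap the hours at 24.

```python
def format_millis(ms: int) -> str:
    """Hypothetical helper: render a millisecond duration as hh:mm:ss.000."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    # :03d keeps leading zeros, so 80 ms is written as 080.
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


for ms in (600, 1080, 2010, 1630):
    print(ms, format_millis(ms))
```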

What does second parameter in ClickHouse function toDateTime64 mean?

ClickHouse has function toDateTime64() to convert string into DateTime64 data type.
Example from official documentation:
SELECT * FROM dt WHERE timestamp = toDateTime64('2019-01-01 00:00:00', 3, 'Europe/Moscow')
It takes 3 parameters:
Date string
Integer
Timezone
But there is no info about the second parameter. What does it mean?
That's precision.
3 is milliseconds (2019-01-01 03:00:00.000),
6 is microseconds (2019-01-01 03:00:00.000000)
and so on.
You can find more info in DateTime64 datatype description https://clickhouse.tech/docs/en/sql-reference/data-types/datetime64/
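To make the effect of the precision argument concrete, here is a small Python sketch that prints a timestamp with a chosen number of fractional-second digits. It only mimics the idea and is not ClickHouse code; the name format_with_precision is hypothetical, and Python's datetime stops at microseconds, so this sketch accepts 0 to 6 only.

```python
from datetime import datetime


def format_with_precision(dt: datetime, precision: int) -> str:
    """Hypothetical helper: show dt with `precision` fractional-second digits."""
    if not 0 <= precision <= 6:
        raise ValueError("this sketch supports a precision of 0 to 6 only")
    base = dt.strftime("%Y-%m-%d %H:%M:%S")
    if precision == 0:
        return base
    micros = f"{dt.microsecond:06d}"
    # Keep only the leading digits: 3 -> milliseconds, 6 -> microseconds.
    return f"{base}.{micros[:precision]}"


ts = datetime(2019, 1, 1, 3, 0, 0)
print(format_with_precision(ts, 3))  # 2019-01-01 03:00:00.000
print(format_with_precision(ts, 6))  # 2019-01-01 03:00:00.000000
```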

How to convert 8 byte datetime from fn_dblog() details in [Log Content 0] into a C# DateTime object?

I've deleted a row of data that was inserted recently.
Rather than restore and roll forward a second copy of this huge DB to retrieve the inserted data, I'm trying to use the fn_dblog() "undocumented" system function to retrieve it.
Using a description (found here: https://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/)
of the contents of the [Log Content 0] column fn_dblog() returns, I am successfully retrieving my inserted (and later deleted) data from the log file. In the section of this binary data reserved for fixed width column data, I found that the SQL DateTime column values take 8 bytes. I'm processing the binary data in a .NET program, using BitConverter.ToInt64 or BitConverter.ToInt32 as appropriate for the Int or BigInt values
I've managed to retrieve all the inserted column values I need except for the datetime columns...
I'm unclear how to interpret the 8 bytes of a SQL DateTime column as a C# DateTime object. If it helps, below is an example hex and Int64 version of the datetime 8 bytes retrieved from the transaction log data for a particular datetime.
DateTime (around 7/31/2020) in binary: 0xF030660009AC0000 (Endian reversed: 0x0000AC09006630F0)
as an Int64: 189154661380804
Any suggestions? This is internal SQL Server representation of a date, I'm not sure where to find doc on it...
I finally found the answer: a SQL DateTime stored as VARBINARY (like the bytes I'm reading from the transaction log) contains two 4-byte integers. One is the date part, the number of days since 1/1/1900; it is negative for earlier dates.
The other is the time part, the number of milliseconds since midnight divided by 3.33333333, i.e. a count of 1/300-second ticks.
Because the value is stored little-endian, the first 4 bytes of the 8 bytes in the buffer are the time and the second 4 are the date.
So here is a code snippet I used to get the date. I'm running through the fixed length fields one at a time, keeping track of the current offset in the byte array...
the variable ba is the byte array of the bytes in the [Log Content 0] column.
int TimeInt;
int DateInt;
DateTime tmpDt;
// initialize the starting point for datetime - 1/1/1900
tmpDt = new DateTime(1900, 1, 1);
// get the time portion of the SQL DateTime (ticks of 1/300 second since midnight)
TimeInt = BitConverter.ToInt32(ba, currOffset);
currOffset += 4;
// get the date portion of the SQL DateTime (days since 1/1/1900)
DateInt = BitConverter.ToInt32(ba, currOffset);
currOffset += 4;
// Add the number of days since 1/1/1900
tmpDt = tmpDt.AddDays(DateInt);
// Add the time since midnight; each tick is 10/3 ms, and dividing exactly
// avoids the small drift that multiplying by 3.3333333 introduces
tmpDt = tmpDt.AddMilliseconds(TimeInt * 10.0 / 3.0);
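For cross-checking outside .NET, the same decoding can be sketched in Python, using the example bytes from the question. The helper name decode_sql_datetime is hypothetical, and the layout (little-endian, time ticks first, then days) is the one described above rather than anything taken from official documentation.

```python
import struct
from datetime import datetime, timedelta

# SQL Server datetime counts days from 1900-01-01.
SQL_DATETIME_EPOCH = datetime(1900, 1, 1)


def decode_sql_datetime(raw: bytes) -> datetime:
    """Hypothetical helper: decode the 8 raw bytes of a SQL Server datetime."""
    # Little-endian: 4 bytes of time ticks first, then 4 bytes of days.
    ticks, days = struct.unpack("<ii", raw)
    # One tick is 1/300 s = 10000/3 microseconds; integer math avoids float drift.
    return SQL_DATETIME_EPOCH + timedelta(days=days, microseconds=ticks * 10000 // 3)


print(decode_sql_datetime(bytes.fromhex("F030660009AC0000")))  # 2020-07-31 06:12:04
```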

SSRS FORMAT Function by Default Rounding of the Decimal Values to 2

When I use the Format function, the value (12.1234) is rounded to 2 decimal places by default (12.12).
Below is my expression:
=Format(Fields!FEEPERUNIT.Value, "C") & " Rate Per Member "
It gave me $12.12 Rate Per Member.
I expect my data to look like this:
My Data | Expected Data
12.1234 | $12.1234 Rate Per Member
45.6700 | $45.67 Rate Per Member
78.00 | $78 Rate Per Member
901.23 | $901.23 Rate Per Member
It's not SSRS's fault, it's the format code you're using.
If you want it always to be accurate to at least 2 decimal places, then use $0.00## as the format. This'll return the values below:
$12.1234
$45.67
$78.00
$901.23
$11725.50
If you must return an integer only for those that are integers, you'll need to use an expression instead. For example:
IIf(Fields!FEEPERUNIT.Value Mod 1 = 0, "$0", "$0.0###")
This'll return the values below:
$12.1234
$45.67
$78
$901.23
$11725.5
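The two behaviours (at least two decimals versus no forced decimals, both capped at four) can be mimicked in Python for comparison. This is a stand-in sketch, not SSRS or .NET formatting: the name format_fee and its parameters are made up, and Decimal's default rounding may differ from what Format does for values with more than four decimals.

```python
from decimal import Decimal


def format_fee(value: str, min_places: int = 2, max_places: int = 4) -> str:
    """Hypothetical helper: dollar amount with min_places..max_places decimals."""
    # Fix the value at max_places decimals, then trim the zeros that are optional.
    fixed = Decimal(value).quantize(Decimal(1).scaleb(-max_places))
    whole, frac = f"{fixed:f}".split(".")
    frac = frac.rstrip("0").ljust(min_places, "0")
    return f"${whole}.{frac}" if frac else f"${whole}"


for raw in ("12.1234", "45.6700", "78.00", "901.23", "11725.50"):
    print(format_fee(raw), format_fee(raw, min_places=0))
```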
Thanks @Larnu for responding. I've achieved it with the expression below: I removed the Format function and concatenated the $ symbol manually onto the existing amounts.
"$" & Fields!FEEPERUNIT.Value & " Rate Per Member "
