How to extract date from the source file name - snowflake-cloud-data-platform

to_timestamp_ntz(substring(SOURCE_OF_RECORD,REGEXP_INSTR(SOURCE_OF_RECORD, '.json')-14,14),'yyyymmddhh24miss') AS EXTRACTED_AT
File name: OEA_CustomerOnboarded_Czech_20230208085230.JSON

breaking this down:
select
column1 as SOURCE_OF_RECORD
,REGEXP_INSTR(SOURCE_OF_RECORD, '.json') as a
,a - 14 as b
,substring(SOURCE_OF_RECORD, b, 14) as c
,to_timestamp_ntz(c, 'yyyymmddhh24miss') as d
--to_timestamp_ntz(substring(SOURCE_OF_RECORD,REGEXP_INSTR(SOURCE_OF_RECORD, '.json')-14,14),'yyyymmddhh24miss') AS EXTRACTED_AT
from values
('OEA_CustomerOnboarded_Czech_20230208085230.JSON')
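gives the error:
Can't parse '208085230.JSON' as timestamp with format 'yyyymmddhh24miss'
Switching to the try_ version returns null for d, but the intermediate columns show that the date extract is not working as intended.
Firstly, REGEXP_INSTR is case sensitive by default, so '.json' never matches '.JSON': a comes back as 0, b becomes -14, and substring() with a negative start counts from the end of the string, which is where '208085230.JSON' comes from. Passing the 'i' parameter makes the match case-insensitive.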
select
column1 as SOURCE_OF_RECORD
,REGEXP_INSTR(SOURCE_OF_RECORD, '.json', 1, 1, 0, 'i') as a
,a - 14 as b
,substring(SOURCE_OF_RECORD, b, 14) as c
,try_to_timestamp_ntz(c, 'yyyymmddhh24miss') as d
from values
('OEA_CustomerOnboarded_Czech_20230208085230.JSON'),
('OEA_CustomerOnboarded_Czech_20230208085230.json')
;
Now this gives:
SOURCE_OF_RECORD                                 A   B   C               D
-----------------------------------------------  --  --  --------------  -----------------------
OEA_CustomerOnboarded_Czech_20230208085230.JSON  43  29  20230208085230  2023-02-08 08:52:30.000
OEA_CustomerOnboarded_Czech_20230208085230.json  43  29  20230208085230  2023-02-08 08:52:30.000
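If you always expect a .json extension (in either case), the stamp is the first 14 of the last 19 characters (14 digits plus the 5-character extension), so you could just use: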
left(right(SOURCE_OF_RECORD, 19), 14)
thus:
try_to_timestamp_ntz(left(right(SOURCE_OF_RECORD, 19), 14), 'yyyymmddhh24miss')

The other option is to divide and conquer using split_part()
select try_to_timestamp_ntz(split_part(split_part(col,'_',-1),'.',1), 'yyyymmddhh24miss')
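As a cross-check outside the database, the same divide-and-conquer idea can be sketched in Python. This is only an illustration: the function name extract_timestamp is hypothetical, and it assumes the stamp is always the 14 digits between the last underscore and the extension. Like the try_ variant, it returns None instead of raising when the text does not parse.

```python
from datetime import datetime
from typing import Optional


def extract_timestamp(file_name: str) -> Optional[datetime]:
    """Parse the 14-digit stamp between the last '_' and the following '.'."""
    last_segment = file_name.rsplit("_", 1)[-1]   # e.g. '20230208085230.JSON'
    digits = last_segment.split(".", 1)[0]        # e.g. '20230208085230'
    try:
        return datetime.strptime(digits, "%Y%m%d%H%M%S")
    except ValueError:
        # Mirrors try_to_timestamp_ntz: unparseable input yields no value.
        return None


print(extract_timestamp("OEA_CustomerOnboarded_Czech_20230208085230.JSON"))
print(extract_timestamp("OEA_CustomerOnboarded_Czech_20230208085230.json"))
```

Because the split never looks at the extension, the case of .JSON versus .json does not matter here.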

Related

Is there a way in SQL server to interpret the underlying varchar(4) bits as an INT?

I have data harvested from a binary file that has been encoded as a SQL column with type varchar(4). This is not changeable. The 4 bytes used to create this varchar need to be interpreted sometimes as an int value (big endian). It would be nice if we could do this entirely inside SQL.
Printing the values in this varchar(4) column is not helpful as most of the bytes get interpreted as unprintable control characters.
I can't figure out how CAST or CONVERT can help since they seem to be tailored to converting a varchar like "0054" to int 54. Instead, I need the underlying bits to be interpreted as an int (big endian)--not the varchar characters as an int.
For example, one record prints this column as no visible characters, but STRING_ESCAPE(@value,'json') will display
\u0000\u0000\u0000\u0007
This needs to be interpreted as the int 7.
Here are a few more examples of what STRING_ESCAPE returns and what the int value should be:
\u0000\u0000\u0000\b ==> 8
\u0000\u0000\u0000\t ==> 9
\u0000\u0000\u0000\n ==> 10
\u0000\u0000\u0000\u000b ==> 11
\u0000\u0000\u0000\f ==> 12
\u0000\u0000\u0000\r ==> 13
\u0000\u0000\u0000\u000e ==> 14
\u0000\u0000\u0000\u000f ==> 15
\u0000\u0000\u0000\u0010 ==> 16
Thanks for your brain!
So, here is a table of sample data. The first row represents your main example. But you don't have any examples where any one of the first 3 characters is not character 0. So I threw in another row where this is the case.
declare @values table (value char(4));
insert @values values
    (char(0) + char(0) + char(0) + char(7)),
    (char(13) + char(9) + char(14) + char(8));
In the query below, I isolate each character using substring. Then I call ascii to retrieve the character code. What is not clear, however, is how you would take those integer values and combine them. I give 3 possibilities. 'Option1' concatenates them. 'Option2' sums them together. 'Option3' concatenates them like option1, but pads them first so that there is a leading '0' if it is only one digit long.
select escapedVal = string_escape(value,'json'),
    ap.*,
    option1 = convert(int,concat(pos1, pos2, pos3, pos4)),
    option2 = pos1 + pos2 + pos3 + pos4,
    option3 = convert(int,
        right('00' + convert(varchar(2),pos1),2) +
        right('00' + convert(varchar(2),pos2),2) +
        right('00' + convert(varchar(2),pos3),2) +
        right('00' + convert(varchar(2),pos4),2)
    )
from @values v
cross apply (select
    pos1 = ascii(substring(value,1,1)),
    pos2 = ascii(substring(value,2,1)),
    pos3 = ascii(substring(value,3,1)),
    pos4 = ascii(substring(value,4,1))
) ap;
This produces:
escapedVal                pos1  pos2  pos3  pos4  option1  option2  option3
------------------------  ----  ----  ----  ----  -------  -------  --------
\u0000\u0000\u0000\u0007  0     0     0     7     7        7        7
\r\t\u000e\b              13    9     14    8     139148   44       13091408
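CAST(CAST(@value as BINARY(4)) as INT)
The part I was missing was specifying the size of the binary as 4. Without the size, CAST defaults to BINARY(30), the value is right-padded with zero bytes, and the conversion to INT keeps only the last 4 bytes, so it always casts to 0!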
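To see what a big-endian interpretation of those 4 bytes means, here is a small Python sketch. It is not SQL Server code; the helper name bytes_to_int_big_endian is made up for illustration, and the padding demonstration assumes the unsized cast behaves like a 30-byte, zero-padded buffer whose last 4 bytes are read.

```python
def bytes_to_int_big_endian(raw: bytes) -> int:
    """Interpret raw bytes as an unsigned big-endian integer."""
    return int.from_bytes(raw, byteorder="big")


first = bytes([0, 0, 0, 7])      # the \u0000\u0000\u0000\u0007 record
second = bytes([13, 9, 14, 8])   # the \r\t\u000e\b sample row

print(bytes_to_int_big_endian(first))    # 7
print(bytes_to_int_big_endian(second))   # 218697224

# Assumed model of the unsized cast: pad to 30 bytes, keep the last 4.
padded = first.ljust(30, b"\x00")
print(bytes_to_int_big_endian(padded[-4:]))  # 0
```

Note that the second row comes out as 13*256^3 + 9*256^2 + 14*256 + 8, which is none of the three concatenation or sum options above.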

Convert decimal to hex string without 0x in SQL Server

I want to convert
string
------
BB
C1
GB
to
hex
---
4242
4331
4742
using
SELECT CONVERT(BINARY(2), 'B1')
Result is '0x4231'
but I want to remove the 0x from the result, so I tried varbinary to string:
SELECT CONVERT([VARCHAR](MAX), CONVERT(BINARY(2), 'B1', 2))
result is '?'
Then I tried
SELECT SUBSTRING(CONVERT(BINARY(2), 'B1'), 2, 4)
result is '0x42'
How to convert 'B1' to '4231'?
Convert to hex using the system function master.dbo.fn_varbintohexstr, then remove the first two characters.
SELECT SUBSTRING(master.dbo.fn_varbintohexstr(convert(binary(2), 'B1')),3,999)
Output:
4231
I solved it myself: CONVERT with style 2 returns the hex digits without the 0x prefix.
SELECT convert(varchar(4), convert(binary(2), ('B1')), 2)
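For comparison, the expected mapping from the question can be reproduced outside SQL Server with a short Python sketch. The function name ascii_to_hex is hypothetical, and it assumes the input is plain ASCII, as in the examples.

```python
def ascii_to_hex(text: str) -> str:
    """Return the hex digits of each ASCII byte, upper case, with no 0x prefix."""
    return text.encode("ascii").hex().upper()


for value in ("BB", "C1", "GB", "B1"):
    print(value, "->", ascii_to_hex(value))
```

This is handy for checking the output of the CONVERT expression against a few known values.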

Issue while adding values in SQL Server

Please read again till end (description updated)
I want something like this.
ex :
if (7200 / 42) is float then
floor(7200/42) + [7200 - {(floor(7200/42)) * 42}] / 10 ^ length of [7200 - {(floor(7200/42)) * 42}]
STEP : 1 => 171 + ((7200 - (171*42))/10 ^ len(7200-7182))
STEP : 2 => 171 + ((7200 - 7182)/10 ^ len(18))
STEP : 3 => 171 + (18/10 ^ 2)
STEP : 4 => 171 + (18/100)
STEP : 5 => 171 + 0.18
STEP : 6 => 171.18
I have written the code in SQL and it works up to the last step, but the addition of 171 + 0.18 only gives 171.
If I can get "171/18" instead of "171.18" as a string, that would also be great. (/ is just used as a separator, not a division sign.)
Following is the code I have written.
Here,
(FAP.FQTY + FAP.QTY) = 7200,
PRD.CRT = 42
(values only for example)
select
case when PRD.CRT <> 0 then
case when (FAP.FQTY + FAP.QTY)/PRD.CRT <> FLOOR((FAP.FQTY + FAP.QTY)/PRD.CRT) then --DETERMINE WHETHER VALUE IS FLOAT OR NOT
(floor((FAP.FQTY + FAP.QTY)/PRD.CRT)) +
((FAP.FQTY + FAP.QTY) - floor((FAP.FQTY + FAP.QTY)/PRD.CRT) * PRD.CRT) /
POWER(10, len(floor((FAP.FQTY + FAP.QTY) - floor((FAP.FQTY + FAP.QTY)/PRD.CRT) * PRD.CRT)))
else
(FAP.FQTY + FAP.QTY)/PRD.CRT -- INTEGER
end
else
0
end
from FAP inner join PRD on FAP.Comp_Year = PRD.Comp_Year and
FAP.Comp_No = PRD.Comp_No and FAP.Prd_Code = PRD.Prd_Code
I get all the values correct up to 171 + 0.1800, but after that the addition only returns 171. I want exactly 171.18.
REASON FOR THIS CONFUSING CALCULATION
It's all about accounting.
Suppose a box (or carton) holds 42 items.
A person sends 7200 items. How many boxes does he have to send?
That would be (7200/42) = 171.4286.
But boxes cannot be cut (the count must be a whole number, i.e. 171),
so 171 * 42, i.e. 7182 items.
Remaining items = 7200 - 7182 = 18.
So the answer is 171 boxes and 18 items.
In short, 171.18 or "171/18".
Please help me with this.
Thank you in advance.
Recognise that you're not producing an actual numeric result; I'd describe it as unhealthy to try to keep it in such a datatype [1].
This produces the strings you're seeking, if I've understood your requirement:
;With StartingPoint as (
select 7200 as Dividend, 42 as Divisor
)
select
CONVERT(varchar(10),Quotient) +
CASE WHEN Remainder > 0 THEN '.' + CONVERT(varchar(10),Remainder)
ELSE '' END as FinalString
from
StartingPoint
cross apply
(select Dividend/Divisor as Quotient, Dividend % Divisor as Remainder) t
(Not tested for negative values. Some adjustments may be required. Technically % computes the modulus rather than the remainder, etc)
[1] Because someone might try to add two of these values together, and I doubt that produces a correct result, not even necessarily if the same Divisor was used to compute both.
Just another idea about how to calculate it.
Simply calculate the whole boxes,
and concatenate a separator with the remaining items (using a modulus).
Wrap it all up in a CASE WHEN (or IIF) to avoid the divide by zero.
Example snippet:
declare @TestTable table (FQTY numeric(18,2), QTY numeric(18,2), CRT numeric(18,0));
insert into @TestTable (FQTY,QTY,CRT) values
    (5000, 2200, 42),
    (5000, 2200, 0),
    ( 100, 200, 10);
select *,
    (CASE
        WHEN CRT>0
        THEN CONCAT(CAST(FLOOR((FQTY+QTY)/CRT) as INT),'/',CAST((FQTY+QTY)%CRT as INT))
        ELSE '0'
    END) AS Boxes
from @TestTable;
Result:
FQTY QTY CRT Boxes
------- ------- --- ------
5000.00 2200.00 42 171/18
5000.00 2200.00 0 0
100.00 200.00 10 30/0
The CONCAT returns a varchar, and so does the CASE WHEN.
But you could wrap that CASE WHEN in a CAST.
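The boxes-and-loose-items arithmetic is just integer division plus remainder, which the following Python sketch spells out. The function name boxes_and_items and its separator parameter are illustrative only; the zero-divisor branch copies the '0' fallback used in the CASE expression.

```python
def boxes_and_items(total_items: int, items_per_box: int, separator: str = "/") -> str:
    """Format whole boxes and leftover items as text, e.g. '171/18'."""
    if items_per_box == 0:
        # Same fallback as the ELSE '0' branch: nothing can be boxed.
        return "0"
    boxes, loose = divmod(total_items, items_per_box)
    return f"{boxes}{separator}{loose}"


print(boxes_and_items(5000 + 2200, 42))   # 171/18
print(boxes_and_items(5000 + 2200, 0))    # 0
print(boxes_and_items(100 + 200, 10))     # 30/0
```

One reason to keep the result as text: read as a number, a remainder of 18 and a remainder of 180 would both show up as .18.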
You're getting an automatic type conversion from int to decimal(10,0) which is probably not what you want.
https://learn.microsoft.com/en-us/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql?view=sql-server-2017
Check out the "Caution" box.
If you want a specific amount of precision, you'll need to explicitly cast() the values to the desired data type.
If I understand your logic correctly, you want the remainder of 7200 divided by 42, scaled down by 10 ^ (number of digits in the remainder):
declare
    @dividend int = 7200,
    @divisor int = 42;
select (@dividend / @divisor)
    + convert(decimal(10,4),
        (@dividend % @divisor) * 1.0 / power(10, len(@dividend % @divisor)))
EDIT: changed to handle the 10^len(remainder)

Left selection up until a certain character SQL

My select statement returns two columns. Column A is based on Column B, but with the initial 17 characters removed. What I would then like to do is take all the characters in Column A up until it hits a \ (backslash). Can anyone help me achieve this, please? Current code below:
SELECT distinct substring(Path,17,18) AS Detail, Path
FROM [DB].[dbo].[Projects]
Where [Path] like '\DATA%'
AND [Deleted] = '0'
Just to reiterate, as my example wasn't very clear in the comment below: I am trying to extract from the following result
\DATA\More Data\Even More Data\Data 1
To show
Even More Data
So I have removed the preceding 17 characters and now want everything up to the next \
For ColumnA, if you only want to take out the first 17 characters, you should use
RIGHT(Path, LEN(Path) - 17)
As your current solution will not work correctly if Path is longer than 35 characters.
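As for returning the string up to and including the first backslash (subtract 1 from the CHARINDEX result to drop the backslash itself, provided one is always present), use: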
SELECT LEFT(Detail, CHARINDEX('\', Detail)) FirstFolder, Detail, Path
FROM
(
SELECT distinct RIGHT(Path, LEN(Path) - 17) AS Detail, Path
FROM [DB].[dbo].[Projects]
Where [Path] like '\DATA%'
AND [Deleted] = '0'
) a
Or all in one:
SELECT DISTINCT SUBSTRING(Path, 18, CHARINDEX('\', Path, 18) - 18)
FROM [DB].[dbo].[Projects]
WHERE [Path] like '\DATA%'
AND [Deleted] = '0'
This says:
extract a substring from path
start at Character 18
The length of the string will be calculated by finding the position of the first backslash in path starting at Character 18, minus 18 (because we started the search on the 18th character, and we want it relative to the start of our search not the start of the original string)
Update:
As @etsa correctly points out, if you cannot guarantee that Path is at least 18 characters long and contains a backslash after character 18 for every row, you should use the following to return only the rows that do meet these criteria:
SELECT DISTINCT SUBSTRING(Path, 18, CHARINDEX('\', Path, 18) - 18)
FROM [DB].[dbo].[Projects]
WHERE [Path] like '\DATA%'
AND [Deleted] = '0'
AND CHARINDEX('\', Path, 18) > 0
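The skip-a-prefix-then-stop-at-the-next-backslash logic can be checked quickly in Python. This is a sketch with a hypothetical function name, folder_after_prefix. Counting the sample path from the question, the prefix \DATA\More Data\ is 16 characters long, so the sketch is called with 16 here; the real data may well need 17, as stated in the question.

```python
from typing import Optional


def folder_after_prefix(path: str, prefix_len: int) -> Optional[str]:
    """Skip prefix_len characters, then return the text up to the next backslash."""
    end = path.find("\\", prefix_len)   # like CHARINDEX('\', Path, start)
    if end == -1:
        # Same idea as the CHARINDEX(...) > 0 filter: no backslash, no result.
        return None
    return path[prefix_len:end]


sample = r"\DATA\More Data\Even More Data\Data 1"
print(folder_after_prefix(sample, 16))    # Even More Data
print(folder_after_prefix(r"\DATA", 16))  # None
```

Python indexes from 0 while SUBSTRING and CHARINDEX count from 1, which is why skipping n characters corresponds to a SQL start position of n + 1.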
Regardless of the length of your data, this gets the parent folder which seems to be what you want.
DECLARE @table TABLE ([Path] VARCHAR(256))
INSERT INTO @table VALUES
    ('\DATA\More Data\Even More Data\Data 1'),
    ('\\server\top folder\middle folder\bottom folder\file 1'),
    ('x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\this is what we want\x')
SELECT
    [Path]
    ,CHARINDEX('\',REVERSE([Path])) as [Reverse Position of last \]
    ,CHARINDEX('\',REVERSE([Path]),CHARINDEX('\',REVERSE([Path])) + 1) as [Reverse Postion of next to last \]
    ,REVERSE(
        SUBSTRING(
            REVERSE([Path]),
            CHARINDEX('\',REVERSE([Path]))+1,
            CHARINDEX('\',REVERSE([Path]),CHARINDEX('\',reverse([Path]))+1)-CHARINDEX('\',REVERSE([Path])) - 1)) as [Your Desired Results]
FROM
    @table
Returns:
+--------------------------------------------------------------+----------------------------+-----------------------------------+----------------------+
| Path | Reverse Position of last \ | Reverse Postion of next to last \ | Your Desired Results |
+--------------------------------------------------------------+----------------------------+-----------------------------------+----------------------+
| \DATA\More Data\Even More Data\Data 1 | 7 | 22 | Even More Data |
| \\server\top folder\middle folder\bottom folder\file 1 | 7 | 21 | bottom folder |
| x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\this is what we want\x | 2 | 23 | this is what we want |
+--------------------------------------------------------------+----------------------------+-----------------------------------+----------------------+
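The reverse-and-search trick amounts to picking the second-to-last path segment. A minimal Python sketch of that idea, with a hypothetical function name parent_folder, reproduces the three results in the table:

```python
from typing import Optional


def parent_folder(path: str) -> Optional[str]:
    """Return the segment between the last two backslashes, if there is one."""
    parts = path.split("\\")
    if len(parts) < 2:
        return None   # no backslash at all, so there is no parent segment
    return parts[-2]


paths = [
    r"\DATA\More Data\Even More Data\Data 1",
    r"\\server\top folder\middle folder\bottom folder\file 1",
    r"x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\this is what we want\x",
]
for p in paths:
    print(parent_folder(p))
```

Unlike the fixed-offset approach, this does not depend on how long the leading folders are.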

select integer before a certain character

Hi, I am trying to select the integer value before the character C in my SQL database table, which contains the information below.
240mm2 X 15C WIRING CABLE
150mm2 X 3C flex
10mm2 x 4C swa
So far I have used the query
select left ('C',CHARINDEX ('C',product_name)) from product
and I get 'C' in my results, which is correct. Now I am stuck. Does anyone know how I can modify the above select query to get a result which only lists the integers, e.g.
15
3
4
Two observations: the integer before "C" has a space before it and there is no space between the integer and "C".
If these are generally true, then in MySQL you can do what you want using substring_index():
select substring_index(substring_index(product_name, 'C', 1), ' ', -1) + 0 as thenumber
The + 0 simply converts the value to a number.
If you're doing this in SQL Server you could try the following:
Select Substring(product_name,
PATINDEX('% [0-9]%',product_name) + 1,
PATINDEX('%[0-9]C%',product_name) - PATINDEX('% [0-9]%',product_name)
) as num
from Product
This assumes that there is a space before the number and always a C after the number with no space.
It works out the starting point and then the length based on the start and end and performs a substring with the results.
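To sanity-check the expected output for the three sample rows, the same rule (digits immediately followed by an upper-case C) can be written as a regular expression in Python. This is only a sketch; the function name integer_before_c is made up, and it assumes the first such match in the product name is the one wanted.

```python
import re
from typing import Optional


def integer_before_c(product_name: str) -> Optional[int]:
    """Return the run of digits that sits directly before an upper-case 'C'."""
    match = re.search(r"(\d+)C", product_name)
    return int(match.group(1)) if match else None


for name in ("240mm2 X 15C WIRING CABLE", "150mm2 X 3C flex", "10mm2 x 4C swa"):
    print(integer_before_c(name))
```

Names without a digit-then-C pattern give None rather than an error, which is worth deciding on explicitly in the SQL version as well.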
You could use a combination of INSTR and SUBSTRING.
First get the position of the C.
Then take the substring up to the C.
It goes like this:
SELECT INSTR('foobarbar', 'bar');
= 4
And then you select the substring that ends just before that position (characters 1 to 3 here).

Resources