Snowflake BLANK_IF? - snowflake-cloud-data-platform

Is there a way to load a particular value as a blank (empty string) in Snowflake?
For example, I have the hex value x98 written in the file, which represents blank, and I want to load this value into the column as an empty string (i.e. '').

NULLIF() might be what you're after. It takes two values and returns NULL if they are equal. Other functions used below: HEX_DECODE_STRING() and HEX_ENCODE().
select
NULLIF('Snowflake',HEX_DECODE_STRING('536E6F77666C616B65'))
,HEX_ENCODE('Snowball')
Side note: there's no HEX-to-INT function in Snowflake, but a simple UDF can do the conversion (the original answer linked to one). As hex 0098 is decimal 152, you might be better off doing that conversion first.
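To make the NULLIF semantics and the hex arithmetic concrete, here is a rough sketch in Python rather than Snowflake SQL. The helper names `nullif` and `blank_if` are hypothetical, and `blank_if` assumes the marker should become an empty string (as the question asks) rather than NULL.

```python
from typing import Optional

def nullif(a: str, b: str) -> Optional[str]:
    """Mimic SQL NULLIF: None when the two values are equal, else the first."""
    return None if a == b else a

def blank_if(value: str, marker: str) -> str:
    """Hypothetical variant: return '' instead of None when value equals marker."""
    return "" if value == marker else value

# Hex 536E6F77666C616B65 decodes to 'Snowflake', so NULLIF yields NULL (None).
decoded = bytes.fromhex("536E6F77666C616B65").decode("ascii")
print(nullif("Snowflake", decoded))    # None

# Hex 0098 is decimal 152.
print(int("0098", 16))                 # 152

# A field holding only the marker character becomes an empty string.
marker = chr(0x98)
print(repr(blank_if(marker, marker)))  # ''
```

If NULL is acceptable in the column, the plain NULLIF approach is enough; the `blank_if` variant only matters when an actual empty string is required.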

Related

How to add leading zeros in ADF data flow from the expression builder

For example, I have a column with the numeric value “000001”, but it arrives as just 1 in the SQL DB. If I put the entire value in single quotes it comes through intact, but I need a dynamic implementation without hard-coding.
I agree with @Larnu's comments: even if we give 00001 to an int column, it will be stored as 1.
So we either have to pass the value in single quotes ('00001') or import the incoming data as a string instead of an int.
As you are using an ADF data flow, you can generate the 00001 form with a derived column transformation on the SQL source. How you do this depends on how the number of leading zeros varies in your data, so adapt it accordingly.
Sample demo:
concat('0000', toString(id))
Use that column as your requirement dictates; afterwards you can convert it back to the original input with toInteger(id).
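The caveat about how the leading zeros vary can be sketched as follows. This is Python, not an ADF expression, and it is only an illustration: `fixed_prefix` mirrors the concat approach above, while `pad_to_width` and its default width of 5 are assumptions made for the example.

```python
def fixed_prefix(n: int) -> str:
    """Same idea as concat('0000', toString(id)): always prepends four zeros."""
    return "0000" + str(n)

def pad_to_width(n: int, width: int = 5) -> str:
    """Pad with zeros on the left until the text is `width` characters long."""
    return str(n).zfill(width)

for n in (1, 42):
    print(fixed_prefix(n), pad_to_width(n))
# 00001 00001
# 000042 00042

# Converting the padded text back to a number drops the zeros again.
print(int(pad_to_width(42)))  # 42
```

A fixed prefix only gives a constant width when every id has the same number of digits; padding to a target width handles ids of different lengths.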

Converting binary data containing null (0x00) characters to ASCII in SQL Server

On SQL Server (2016+), I have data stored in a varbinary column, saved by some Java application, which contains a mixture of binary data and ASCII text. I want to search the column using a like operator or otherwise to look for certain ASCII strings, and then view the returned values as ASCII (so that I can read the surrounding text).
The data contains non-printable bytes such as "00" (0x00), and these seem to stop SQL Server from converting the string as might otherwise be possible according to the answers at "Hex to ASCII string conversion on the fly". In the example below, it can be seen that the byte "00" stops the parsing of the ASCII.
select convert(varchar(max),0x48454C4C004F205000455445,0) as v1 -- HELL
select convert(varchar(max),0x48454C4C4F205000455445,0) as v2 -- HELLO P
select convert(varchar(max),0x48454C4C4F2050455445,0) as v3 -- HELLO PETE
How can I have
select convert(varchar(max), 0x48454C4C004F205000455445, 0)
...return something like this?:
HELL?O P?ETE
(Or, less ideally, have an expression similar to
convert(varchar(max), 0x48454C4C004F205000455445, 0) like '%HE%ETE%'
...return the row?)
It works on the website https://www.rapidtables.com/convert/number/hex-to-ascii.html with 48454C4C004F205000455445 as input.
I'm not overly concerned about performance, but I want to stay within SQL Server, and ideally within the scope of T-SQL which can be copied and pasted easily.
I've tried using replace on "00", but this could cause problems with characters ending with 0, as in "5000" in the examples above. There may be bytes other than 0x00 which cause string conversion to stop as well.
To return the row (the more limited version of this question), a simple like operator on the value appears to work when run directly on the binary value, despite the intervening 0x00 values:
0x48454C4C004F205000455445 like 'HE%ETE%'
In other words, like can cope where convert can't.
To view the actual value, the best I've managed so far is this:
convert(varchar(max),
    convert(varbinary(max),
        REPLACE(
            convert(varchar(max), 0x48454C4C004F205000455445, 1),
            '00', ''
        ),
    1),
0)
This gives HELLO PETE, and works well enough on the actual data, getting to its end.
(It depends on the heuristic of not caring about converting e.g. 0x50 0x03 to 0x53 and similar, but I can live with that, as 0x0z, where z is 1 to f, represents control characters, which don't occur around the text I'm interested in).
(thanks to Panagiotis Kanavos for prodding me in a useful direction!)
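The difference between replacing bytes and replacing hex digits as text can be seen in a small sketch. It is written in Python purely to illustrate the byte-alignment issue, so it does not meet the "stay within T-SQL" requirement; the variable name `raw` is made up for the example.

```python
# The sample value from the question, as raw bytes.
raw = bytes.fromhex("48454C4C004F205000455445")

# Byte-aligned: swap each 0x00 byte for '?' before decoding.
print(raw.replace(b"\x00", b"?").decode("ascii"))  # HELL?O P?ETE

# Text-level replace on the hex digits is not byte-aligned:
# bytes 0x50 0x03 are written "5003", and removing "00" leaves "53".
print("5003".replace("00", ""))  # 53
```

This is the heuristic the workaround above accepts: a "00" that straddles two bytes is removed as well, which is harmless only when control characters do not sit next to the text of interest.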

Unable to return query Thai data

I have a table with NVARCHAR(255) columns that contain both Thai and English text.
In SSMS I can query the table and return all the rows easily enough. But if I then query specifically for one of the Thai values, it returns no rows.
SELECT TOP 1000 [Province]
,[District]
,[SubDistrict]
,[Branch ]
FROM [THDocuworldRego].[dbo].[allDistricsBranches]
Returns
Province District SubDistrict Branch
อุตรดิตถ์ ลับแล ศรีพนมมาศ Northern
Bangkok Khlong Toei Khlong Tan SSS1
But this query:
SELECT [Province]
,[District]
,[SubDistrict]
,[Branch ]
FROM [THDocuworldRego].[dbo].[allDistricsBranches]
where [Province] LIKE 'อุตรดิตถ์'
Returns no rows.
What do I need to do to get the expected results?
The collation set is Latin1_General_CI_AS.
The data is displayed and inserted with no errors; I just can't search it.
Two problems:
The string being passed into the LIKE clause is VARCHAR due to not being prefixed with a capital "N". For example:
SELECT 'อุตรดิตถ์' AS [VARCHAR], N'อุตรดิตถ์' AS [NVARCHAR]
-- ?????????    อุตรดิตถ์
What is happening here is that when SQL Server is parsing the query batch, it needs to determine the exact type and value of all literals / constants. So it figures out that 12 is an INT and 12.0 is a NUMERIC, etc. It knows that N'ดิ' is NVARCHAR, which is an all-inclusive character set, so it takes the value as is. BUT, as noted before, 'ดิ' is VARCHAR, which is an 8-bit encoding, which means that the character set is controlled by a Code Page. For string literals and variables / parameters, the Code Page used for VARCHAR data is the Database's default Collation. If there are characters in the string that are not available on the Code Page used by the Database's default Collation, they are either converted to a "best fit" mapping, if such a mapping exists, else they become the default replacement character: ?.
Technically speaking, since the Database's default Collation controls string literals (and variables), and since there is a Code Page for "Thai" (available in Windows Collations), then it would be possible to have a VARCHAR string containing Thai characters (meaning: 'ดิ', without the "N" prefix, would work). But that would require changing the Database's default Collation, and that is A LOT more work than simply prefixing the string literal with "N".
For an in-depth look at this behavior, please see my two-part series:
Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part A of 2: “Duck”)
Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part B of 2: “Rabbit”)
You need to add the wildcard characters to both ends:
N'%อุตรดิตถ์%'
The end result will look like:
WHERE [Province] LIKE N'%อุตรดิตถ์%'
EDIT:
I just edited the question to format the "results" to be more readable. It now appears that the following might also work (since no wildcards are being used in the LIKE predicate in the question):
WHERE [Province] = N'อุตรดิตถ์'
EDIT 2:
A string (i.e. something inside of single-quotes) is VARCHAR if there is no "N" prefixed to the string literal. It doesn't matter what the destination datatype is (e.g. an NVARCHAR(255) column). The issue here is the datatype of the source data, and that source is a string literal. And unlike a string in .NET, SQL Server handles 'string' as an 8-bit encoding (VARCHAR; ASCII values 0 - 127 same across all Code Pages, Extended ASCII values 128 - 255 determined by the Code Page, and potentially 2-byte sequences for Double-Byte Character Sets) and N'string' as UTF-16 Little Endian (NVARCHAR; Unicode character set, 2-byte sequences for BMP characters 0 - 65535, two 2-byte sequences for Code Points above 65535). Using 'string' is the same as passing in a VARCHAR variable. For example:
DECLARE @ASCII VARCHAR(20);
SET @ASCII = N'อุตรดิตถ์';
SELECT @ASCII AS [ImplicitlyConverted]
-- ?????????
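The code-page behaviour described in this answer can be approximated outside SQL Server. The following Python sketch assumes that code page 1252 stands in for Latin1_General_CI_AS and code page 874 for a Thai collation; it ignores SQL Server's "best fit" mappings, which Python's `errors="replace"` does not perform.

```python
# The province name from the question, written as code points:
# อ ุ ต ร ด ิ ต ถ ์
thai = "\u0e2d\u0e38\u0e15\u0e23\u0e14\u0e34\u0e15\u0e16\u0e4c"

# Code page 1252 has no Thai characters, so each one becomes '?'.
as_latin = thai.encode("cp1252", errors="replace").decode("cp1252")
print(as_latin)          # ?????????

# Code page 874 (Thai) can hold the same text in one byte per character.
as_thai = thai.encode("cp874").decode("cp874")
print(as_thai == thai)   # True

# The mangled literal can no longer match the stored value.
print(as_latin == thai)  # False
```

This is why the unprefixed literal finds nothing: by the time the comparison runs, the search string is already nine question marks.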
Could be a number of things!
First off, print out the value of the column and your query string in hex.
SELECT convert(varbinary(20), Province) AS stored, convert(varbinary(20), 'อุตรดิตถ์') AS query FROM allDistricsBranches;
This should give you some insight into the problem. I think the most likely cause is the ั, ิ characters being typed in the wrong sequence. They are displayed as part of the main letter but are stored internally as separate characters.
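The point about marks being stored as separate characters can be checked by listing code points. The snippet below is a Python sketch with a made-up three-character sequence (not taken from the asker's data); `code_points` is a hypothetical helper.

```python
def code_points(text: str) -> str:
    """Render each character of the string as a U+XXXX code point."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

typed_a = "\u0e15\u0e34\u0e4c"  # TO TAO, SARA I, THANTHAKHAT
typed_b = "\u0e15\u0e4c\u0e34"  # same base letter and marks, opposite order

print(code_points(typed_a))  # U+0E15 U+0E34 U+0E4C
print(code_points(typed_b))  # U+0E15 U+0E4C U+0E34
print(typed_a == typed_b)    # False
```

Two strings built from the same letter and marks compare unequal when the marks were entered in a different order, which is what the hex dump in the query above would reveal.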

Why does a number imported to SQL Server from Excel contain the letter e?

I have an Excel sheet that inserts data into SQL Server, but I noticed that for one particular field the data is being inserted with an "e" in it. The field is of type varchar, size 20.
Why is an e being inserted when the actual values for these fields are 54607677038, 77200818179 and 9920996?
Help me out
Thanks in anticipation.
You may think of '2007038971' as being just a string of numbers (some kind of article code, I guess). Excel just sees numbers and treats it as a numerical value. It probably is right aligned (default for numbers) and not left-aligned (default for strings).
When asked to store it as a string, it 'helpfully' formats that number into a string, thereby introducing that "e" notation (the value 2007038971 is about 2.00704 * 10^9).
You need to convince Excel that that code really is a string, maybe by adding a quote in front of it.
How about this: when you read the value from Excel, convert it with ToString() and insert it into the DB. You may need to change the relevant data type based on the data in your Excel sheet.
double doub = 2.00704e+009;
string val = doub.ToString();
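How a number picks up the "e" when it is turned into text can be shown with a general-purpose number format. This Python sketch is only an analogy for what happens during the import, not a reproduction of Excel's own formatting; the value is one of the numbers from the question.

```python
value = 54607677038.0

general = format(value, "g")   # general format, 6 significant digits
plain = format(value, ".0f")   # fixed-point, no decimals

print(general)  # 5.46077e+10
print(plain)    # 54607677038
```

The general format keeps only a few significant digits and switches to exponent notation for large values, so the original digits are lost; formatting without an exponent, or treating the value as text from the start, keeps them.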

Error in storing values in SQL database table

In my table there is a column called zipcode whose datatype is int. When I store a zipcode that starts with 0 (e.g. 08872), it gets stored as 8872.
Can anybody explain why this is happening?
An INT value is numeric - and numerically, 08872 and 8872 are identical - both represent the value 8872.
SQL Server will not store leading zeroes for numerical values. That's just the way it is.
Either store this as CHAR(5) instead, or handle the formatting (adding leading zeroes to your zip codes) on the frontend when you need to display it.
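The two options in this answer, keeping the code as text or re-adding zeros when displaying, look roughly like this in a Python sketch. The variable names are illustrative, and the width of 5 assumes five-digit zip codes as in the question.

```python
zip_text = "08872"

# Numeric storage: the leading zero is not part of the value.
as_int = int(zip_text)
print(as_int)           # 8872

# Option 1: re-pad to five digits when displaying.
print(f"{as_int:05d}")  # 08872

# Option 2: keep the code as text (the CHAR(5) approach), so nothing is lost.
print(zip_text)         # 08872
```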
