snowflake regex_replace not working as expected

I am trying to write a REGEXP_REPLACE that strips "['" or "']" from a string.
For example, if we have ['Customer Name'] it should become "Customer Name":
select regexp_replace("['Customers NY']","\\['|\\']","") as customername;
I am getting this error:
SQL compilation error: error line 1 at position 22 invalid identifier "['Customers NY']"

It's a typo.
ALL strings in SQL use single quotes; double quotes are for named objects like columns/tables.
Then you have to escape the quotes inside the quotes:
select regexp_replace('[\'Customers NY\']','\\[\'|\'\\]','') as customername;
gives:
CUSTOMERNAME
Customers NY

Dollar quoting with $$ makes escaping easier. Combine that with TRANSLATE and you could do:
select translate('[\'Customers NY\']',$$[']$$,'');
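If you prefer dollar quoting throughout, the REGEXP_REPLACE version needs no quote escaping or backslash doubling at all, since dollar-quoted strings are taken literally (a minimal sketch):
select regexp_replace($$['Customers NY']$$, $$\['|'\]$$, '') as customername;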

Related

How to load array data in string format into a Snowflake string-based column?

I have a Spark dataframe with a string-type column like ['Match', {'src_cnt': '541', 'tgt_cnt': '541'}].
When I tried to insert the above data surrounded by double quotes, I got an error.
The query I used was:
INSERT INTO tbl_nm VALUES ('test_case01','Pass',"['Match', {'src_cnt': '541', 'tgt_cnt': '541'}]",'2023-01-02') ;
Error:
SQL compilation error: error line 1 at position 92 invalid identifier '"['Match', {'src_cnt': '541', 'tgt_cnt': '541'}]"'
It works when I surround the string with single quotes (') and replace the inner ' with ", like:
INSERT INTO tbl_nm VALUES ('test_case01','Pass','["Match", {"src_cnt": "541", "tgt_cnt": "541"}]','2023-01-02') ;
But my data comes with ' inside the square brackets: ['Match', {'src_cnt': '541', 'tgt_cnt': '541'}].
Could anyone guide me on how to insert the data with ' inside the square brackets?
Thanks in advance.
You can surround your array with double dollar signs; that way neither single nor double quotes need escaping. Like this:
$$['Match', {'src_cnt': '541', 'tgt_cnt': '541'}]$$
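For example, the full INSERT from the question would become (a sketch, assuming the same table definition):
INSERT INTO tbl_nm VALUES ('test_case01','Pass',$$['Match', {'src_cnt': '541', 'tgt_cnt': '541'}]$$,'2023-01-02');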
You can escape single quotes by doubling them. This works on both Snowflake and SQLite3:
INSERT INTO tbl_nm VALUES ('test_case01','Pass','[''Match'', {''src_cnt'': ''541'', ''tgt_cnt'': ''541''}]','2023-01-02') ;

Snowflake JSON with foreign language to tabular format dynamically

I read through the Snowflake documentation and the web and found only one solution to my problem, by https://stackoverflow.com/users/12756381/greg-pavlik, which can be found here: Snowflake JSON to tabular.
It doesn't work on data with Russian attribute names and attribute values. What modifications can be made so it fits my case?
Here is an example:
create or replace table target_json_table(
v variant
);
INSERT INTO target_json_table SELECT parse_json('{
    "at": {
        "cf": "NV"
    },
    "pd": {
        "мо": "мо",
        "ä": "ä",
        "retailerName": "retailer",
        "productName": "product"
    }
}');
call create_view_over_json('target_json_table', 'V', 'MY_VIEW');
ERROR: Encountered an error while creating the view. SQL compilation error: syntax error line 7 at position 7 unexpected 'ä:'. syntax error line 8 at position 7 unexpected 'мо'.
There was a bug in the original SQL used as a basis for the creation of the stored procedure. I have corrected that. You can get an update on the Github page. The changed section is here:
sql =
`
SELECT DISTINCT
    '"' || array_to_string(split(f.path, '.'), '"."') || '"' AS path_name,  -- This generates paths with levels enclosed by double quotes (ex: "path"."to"."element"). It also strips any bracket-enclosed array element references (like "[0]")
    DECODE(substr(typeof(f.value),1,1),'A','ARRAY','B','BOOLEAN','I','FLOAT','D','FLOAT','STRING') AS attribute_type,  -- This generates column datatypes of ARRAY, BOOLEAN, FLOAT, and STRING only
    '"' || array_to_string(split(f.path, '.'), '.') || '"' AS alias_name  -- This generates column aliases based on the path
FROM
    #~TABLE_NAME~#,
    LATERAL FLATTEN(#~COL_NAME~#, RECURSIVE=>true) f
WHERE TYPEOF(f.value) != 'OBJECT'
    AND NOT contains(f.path, '[')  -- This prevents traversal down into arrays
LIMIT ${ROW_SAMPLE_SIZE}
`;
Previously this SQL simply replaced non-ASCII characters with underscores. The updated SQL wraps key names in double quotes so that non-ASCII key names are preserved.
Be sure that's what you want it to do. Also, the keys are nested. I decided that the best way to handle that is to create column names in the view with dot notation; for example, one column name is pd.ä. That requires wrapping the column name in double quotes, such as:
select * from MY_VIEW where "pd.ä" = 'ä';
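For reference, the view the procedure generates should have roughly this shape (a hypothetical sketch; the actual DDL is built dynamically by the stored procedure):
CREATE OR REPLACE VIEW MY_VIEW AS
SELECT v:"at"."cf"::STRING AS "at.cf",
       v:"pd"."мо"::STRING AS "pd.мо",
       v:"pd"."ä"::STRING AS "pd.ä",
       v:"pd"."retailerName"::STRING AS "pd.retailerName",
       v:"pd"."productName"::STRING AS "pd.productName"
FROM target_json_table;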
Final note: the name of your stored procedure is create_view_over_json; however, in the GitHub project the name is create_view_over_variant. When you update, be sure to call the right procedure.

T-SQL wildcard not operator ^ not working

T-SQL Not wildcard:
SELECT * FROM Customers
WHERE City LIKE 'A[^a]%';
It returns 'Aachen'.
So what is the meaning of the ^ operator here? The same result comes if I use
WHERE City LIKE 'A[a]%';
I know I can use 'A[!a]%' and it will work; my concern is then, why ^?
From here:
The Caret Wildcard Character [^]:
The Caret Wildcard Character is used to search for any single
character not within the specified range [^a-c] or set [^abc].
To find all employees with a 3 characters long first name that begins
with ‘Ja’ and the third character is not ‘n’:
SELECT FirstName, MiddleName, LastName
FROM Person.Person
WHERE FirstName LIKE 'Ja[^n]'
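You can verify the behavior without a sample database (a minimal sketch):
-- 'Jay' matches: the third character is not 'n'
SELECT CASE WHEN 'Jay' LIKE 'Ja[^n]' THEN 'match' ELSE 'no match' END;  -- match
-- 'Jan' does not match: the third character is 'n'
SELECT CASE WHEN 'Jan' LIKE 'Ja[^n]' THEN 'match' ELSE 'no match' END;  -- no match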

Building dynamic query for Sql Server 2008 when table name contains " ' "

I need to fetch a table's TOP_PK, IDENT_CURRENT, IDENT_INCR, and IDENT_SEED, for which I am building a dynamic query as below:
sGetSchemaCommand = String.Format("SELECT (SELECT TOP 1 [{0}] FROM [{1}]) AS TOP_PK, IDENT_CURRENT('[{1}]') AS CURRENT_IDENT, IDENT_INCR('[{1}]') AS IDENT_ICREMENT, IDENT_SEED('[{1}]') AS IDENT_SEED", pPrimaryKey, pTableName)
Here pPrimaryKey is the name of the table's primary key column and pTableName is the name of the table.
Now, I am facing a problem when the table name contains a ' character (for example, KIL'1).
With the above logic, the built query is:
SELECT (SELECT TOP 1 [ID] FROM [KIL'1]) AS TOP_PK, IDENT_CURRENT('[KIL'1]') AS CURRENT_IDENT, IDENT_INCR('[KIL'1]') AS IDENT_ICREMENT, IDENT_SEED('[KIL'1]') AS IDENT_SEED
Executing the above query gives the error below:
Incorrect syntax near '1'.
Unclosed quotation mark after the character string ') AS IDENT_SEED'.
So, can anyone please show me the best way to solve this problem?
Escape a single quote by doubling it: KIL'1 becomes KIL''1.
If a string already has adjacent single quotes, two becomes four, or four becomes eight... it can get a little hard to read, but it works :)
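To see how the doubling compounds when dynamic SQL nests one level deeper (a toy illustration):
SELECT 'KIL''1';               -- yields KIL'1
SELECT 'SELECT ''KIL''''1''';  -- yields SELECT 'KIL''1'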
Using string methods from .NET, your statement could be:
sGetSchemaCommand = String.Format("SELECT (SELECT TOP 1 [{0}] FROM [{1}]) AS TOP_PK, IDENT_CURRENT('[{2}]') AS CURRENT_IDENT, IDENT_INCR('[{2}]') AS IDENT_ICREMENT, IDENT_SEED('[{2}]') AS IDENT_SEED", pPrimaryKey, pTableName, pTableName.Replace("'","''"))
EDIT:
Note that the string replace is now only on a new, third substitution string. (I've taken out the string replace for pPrimaryKey and for the first occurrence of pTableName.) So now, single quotes are only doubled when they will appear within other single quotes.
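With that change, the generated SQL for a table named KIL'1 should look like this (a sketch of the expected output):
SELECT (SELECT TOP 1 [ID] FROM [KIL'1]) AS TOP_PK, IDENT_CURRENT('[KIL''1]') AS CURRENT_IDENT, IDENT_INCR('[KIL''1]') AS IDENT_ICREMENT, IDENT_SEED('[KIL''1]') AS IDENT_SEED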
You need to replace every single quote with two single quotes: http://beyondrelational.com/modules/2/blogs/70/posts/10827/understanding-single-quotes.aspx

H2 DB CSVWRITE Duplicate Double Quotes Inside a String

I was trying to export a table in H2 DB to CSV using the CSVWRITE function and found that if double quotes are included in a varchar column, they get duplicated.
E.g. 'hello"howareyou' becomes 'hello""howareyou' in the written CSV.
I tried saving this varchar column with escape characters and a few other combinations, but the result is the same.
Following are the test values I created for this column and the resulting CSV values I got:
My column          CSV written value
-----------------  -----------------
hello"how          hello""how
hello\"how         hello\""how
hello""how         hello""""how
hello\""how        hello\""""how
hello\\"how        hello\\""how
hello\\\\"how      hello\\\\""how
hello["]how        hello[""]how
hello&quote;how    hello&quote;how
Following is my CSVWrite command:
CALL CSVWRITE(
'#DELTA_CSV_DIR#/DELTA.csv',
'SELECT ccc from temptemp',
null, '|', '');
Am I doing this wrong? Or is there an option or workaround I can use to avoid this situation?
Thanks in advance.
You are currently using the built-in CSVWRITE function with the following options:
fileName = '#DELTA_CSV_DIR#/DELTA.csv'
query = 'SELECT ccc from temptemp'
characterSet = default (UTF-8)
fieldSeparator = '|'
fieldDelimiter = '' (empty string)
As documented, the default escape character is a double quote, so double quotes are escaped with a double quote (in the same way you escape a backslash within a Java string with a backslash). The escape character is needed to escape the field separator.
You can disable the escape character as follows:
CALL CSVWRITE(
'#DELTA_CSV_DIR#/DELTA.csv',
'SELECT ccc from temptemp',
'fieldSeparator=| fieldDelimiter= escape=');
This is also using the more readable new format for options.
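If you later read the file back with H2, pass the same options so the parser does not expect an escape character either (a sketch using H2's built-in CSVREAD, with the same placeholder path):
SELECT * FROM CSVREAD(
'#DELTA_CSV_DIR#/DELTA.csv',
null,
'fieldSeparator=| fieldDelimiter= escape=');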
