Illegal character sequence \p' in string literal - salesforce

When executing the following code:
string query = 'select id from ClientData__c';
string Point05 = '%\\file-01\projects\Internal Audit\Internal Audit Team\FY18\SOX\Testing%';
query += ' WHERE Point05__c LIKE \'' + Point05 + '\'';
List<ClientData__c> clientData = database.query(query);
I get the following error:
Line: 2, Column: 18
Illegal string literal: Invalid string literal '%\file-01\projects\Internal Audit\Internal Audit Team\FY18\SOX\Testing%'. Illegal character sequence \p' in string literal.

Backslash (\) is normally used as an escape character, just like you used it in the query line to escape the single quotes (').
Here, in order to put a literal backslash in your string, you need to double each backslash so that the backslash itself is escaped. Just change it to:
string Point05 = '%\\\\file-01\\projects\\Internal Audit\\Internal Audit Team\\FY18\\SOX\\Testing%';
You can find more about string escape sequences in the Apex documentation.

Related

How to remove commas from string in database with SQLite in C?

I have a problem removing all commas from string data in an SQLite database.
The program is written in C, so I use the SQLite C API, including sqlite3_mprintf().
I am trying to get rows that match an input string, and the match needs to be checked with the commas (,) removed.
SQLite's REPLACE() is used as REPLACE(data, ',', '').
Sample code is below.
sqlite3_stmt *main_stmt;
const char* comma = "','";
const char* removeComma = "''";
char *zSQL;
zSQL = sqlite3_mprintf("SELECT * FROM table WHERE (REPLACE(colA||colB, %w, %w) LIKE %%%q%%", comma, removeComma, input);
int result = sqlite3_prepare_v2(database, zSQL, -1, &main_stmt, 0);
I have referred to the SQLite reference:
https://www.sqlite.org/printf.html
Every substitution type in that reference strips the apostrophes from the input data, which makes the generated REPLACE() call different from what I intend.
What I expect, by passing ',' and '' as arguments to sqlite3_mprintf(), is
SELECT * FROM table WHERE (REPLACE(colA||colB, ',', '') LIKE %q
However, the result comes out as
SELECT * FROM table WHERE (REPLACE(colA||colB, ,, ) LIKE %q
so the comma is not removed from colA||colB, and the result is different from what I expect.
Is there any way to pass the comma wrapped in apostrophes (',') as the first argument of REPLACE() and the empty string wrapped in apostrophes ('') as the second?
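For reference, here is a sketch (plain SQLite SQL, with a hypothetical search term foo) of the statement the format string is meant to produce, with the comma and empty-string literals written directly into the SQL:

-- "table", colA and colB are the placeholder names from the question ("table" is quoted because TABLE is a keyword); foo is a made-up search term.
SELECT * FROM "table"
WHERE REPLACE(colA || colB, ',', '') LIKE '%foo%';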

Import data using special escape character

I'm trying to import data into Snowflake using the copy command.
I have a file format defined as follows:
CREATE FILE FORMAT mydb.schema1.myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\241'
ESCAPE_UNENCLOSED_FIELD = NONE
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
COMMENT = '¡ used as escape character';
There's nothing special about the file format, except it's using ¡ as an escape character.
When importing data with this file format, it seems Snowflake is not recognizing the escape character, and it's throwing an error saying "Found character 'XYZ' instead of field delimiter ','".
I tried creating a file with 1 line, like the following:
"ABC123","584382","2","01","01/22/2019","02/08/2019","02/08/2019","04/03/2019","04/03/2019","TEST","Unknown","Unknown","01-884400","Unknown","DACRON CONNECTIONS 15¡"1/2 DIA. X 11¡" LONG FOR EXHAUST DAMPER","","0.0","0.0","0.0","0.0","192.0","USD","2.0","2.0","0","0","96.00000","1","","","","","07882-0047","ASDF","ASDF","02/27/2019","04/06/2021","01/01/1970","0"
This file fails on line 1, char 167, which is right after the first escape character (before the 1 in the following text: CONNECTIONS 15¡"1/2)
Any idea why this is happening?
This is the code I'm running to do the copy:
copy into mydb.schema1.mytable from @mydb.schema1.mystage/file-path/2021-05-26/test.txt
file_format = mydb.schema1.myFileFormat
validation_mode = 'return_all_errors';
Short Answer
Looks like Snowflake only allows single-byte characters to be used as an escape character for a file format. The character you're using as the escape character uses two bytes and therefore isn't allowed as an escape character by the file format.
You can, however, use multi-byte characters for field and row delimiters, so it is not clear why Snowflake doesn't allow them as the escape character as well.
Longer Answer
The character you're trying to use as the escape character (¡) is two bytes long with a hex value of \xC2\xA1. This isn't allowed as you can see by the following error:
CREATE OR REPLACE FILE FORMAT myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\x22' -- Double quotes (")
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\xC2\xA1' -- Inverted exclamation point (¡)
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
invalid value ['\xC2\xA1'] for parameter 'ESCAPE'
On the other hand, if I use the last single-byte character I could possibly use (and is visible), the tilde (~), with a hex value of \x7E (you'd think it should be \xFF but utf-8 uses 7 bits before it goes into 2 bytes. Long story.) then it works fine. I tested this with a file and copy command and it works without issue.
CREATE OR REPLACE FILE FORMAT myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\x22' -- Double quotes (")
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\x7E' -- Tilde (~)
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
[2021-05-26 23:49:21] completed in 149 ms
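If you want to verify the byte counts yourself, a quick check along these lines should do it (LENGTH counts characters, OCTET_LENGTH counts UTF-8 bytes, HEX_ENCODE shows the raw byte values):

SELECT LENGTH('¡')       AS escape_char_length,  -- 1 character
       OCTET_LENGTH('¡') AS escape_char_bytes,   -- 2 bytes
       HEX_ENCODE('¡')   AS escape_char_hex,     -- C2A1
       OCTET_LENGTH('~') AS tilde_bytes,         -- 1 byte
       HEX_ENCODE('~')   AS tilde_hex;           -- 7E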

Snowflake REGEXP_REPLACE guidance

I'm looking for some assistance in debugging a REGEXP_REPLACE() statement in Snowflake.
I want to replace the | (pipe) characters that appear inside the double-quoted strings, and only those, with #.
Example:
"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"
Required Result:
"Foreign Corporate# Name# Registration"|"99999"|"Valuation Research"
I have tried (?!(([^"]"){2})[^"]*$)[|] with substitution \1# on regex101.com, where it works, but it doesn't work in Snowflake.
The regexp functions in Snowflake do not support lookahead and lookbehind. If you want to use regular expressions with lookahead and lookbehind, you can do so in a JavaScript UDF.
Note that the regular expression here finds all the pipes including those inside double quotes. I was able to find a regular expression that finds pipes outside double quotes, which is why this UDF splits by those findings and rejoins the string. If you can find a regular expression that finds the pipes inside rather than outside the double quotes, you can simplify the UDF. However, splitting it allows other possibilities such as removing wrapping quotes if you want to do that.
set my_string = '"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"';
create or replace function REPLACE_QUOTED_PIPES(STR string)
returns string
language javascript
as
$$
const search = `(?!\\B"[^"]*)\\|(?![^"]*"\\B)`;
const searchRegExp = new RegExp(search, 'g');
var splits = STR.split(searchRegExp);
var out = "";
var del = "|";
for (var i = 0; i < splits.length; i++) {
    if (i == splits.length - 1) del = "";
    out += splits[i].replace(/\|/g, '#') + del;
}
return out;
$$;
select REPLACE_QUOTED_PIPES($my_string);
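With $my_string set as above, this should return the required result from the question:
"Foreign Corporate# Name# Registration"|"99999"|"Valuation Research"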
Different approach, just using REPLACE:
1. Replace "|" with a string that will never appear in your data. I've used ### in my example.
2. Replace the remaining pipes with #.
3. Replace the dummy string, ###, back to the original value "|".
e.g.
replace(replace(replace(sample_text,'"|"','###'),'|','#'),'###','"|"')
SQL statement to show each step:
select
sample_text
,replace(sample_text,'"|"','###') r1
,replace(replace(sample_text,'"|"','###'),'|','#') r2
,replace(replace(replace(sample_text,'"|"','###'),'|','#'),'###','"|"') r3
from test_solution;
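Plugging the sample value from the question into the nested REPLACE as a literal (just to check the logic outside of a table) gives the required result:

select replace(replace(replace('"Foreign Corporate| Name| Registration"|"99999"|"Valuation Research"','"|"','###'),'|','#'),'###','"|"') as r3;
-- "Foreign Corporate# Name# Registration"|"99999"|"Valuation Research"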

escape double quotes in snowflake

I'm trying to load data using the COPY INTO command. A field contains the special character sequence \" in its value, but because of the escape setting in the file format, the \ escapes the enclosing quote and I get an error while loading:
Found character '0' instead of field delimiter ';'
DATA:
"TOL";"AANVR. 1E K ZIE RF.\";"011188"
Because the \" is treated as an escaped quote, the parser keeps reading past it and takes the delimiter as part of the second column value, giving AANVR. 1E K ZIE RF.\"; when it should actually be AANVR. 1E K ZIE RF.\.
File format
CREATE OR REPLACE FILE FORMAT TEST
FIELD_DELIMITER = ';'
SKIP_HEADER = 1
TIMESTAMP_FORMAT = 'MM/DD/YYYYHH24:MI:SS'
escape = "\\" '
TRIM_SPACE = TRUE
FIELD_OPTIONALLY_ENCLOSED_BY = '\"'
NULL_IF = ('')
ENCODING = "iso-8859-1"
;
If you need to replace double quotes in an existing table, you can use the '\"' syntax in the REPLACE function. An example is provided below.
select replace(column_name,'\"','') as column_name from table_name
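For instance, a quick standalone check of that syntax, using a made-up literal instead of a real column:

select replace('AB"CD"EF', '\"', '') as cleaned;
-- Returns ABCDEF; the '\"' literal is just an escaped double quote, so every " is removed.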
Rough example, but the below works for me. Let me know if you're looking for a different output.
CREATE OR REPLACE table DOUBLE_TEST_DATA (
string1 string
, varchar1 varchar
, string2 string
);
COPY INTO DOUBLE_TEST_DATA FROM @TEST/doublequotesforum.csv.gz
FILE_FORMAT = (
TYPE=CSV
, FIELD_DELIMITER = ';'
, FIELD_OPTIONALLY_ENCLOSED_BY='"'
);
select * from DOUBLE_TEST_DATA;
Output:

My CSV file with double quotes enclosed fields - numeric value ' "12131" ' not recognized

I staged a CSV file in which all fields are enclosed in double quotes (" ") and comma separated, and rows are separated by newline characters. The values inside the enclosed fields can also contain newline characters (\n).
I am using the default FILE FORMAT = CSV. When using COPY INTO I was seeing a column mismatch error in this case.
I solved this first error by specifying the FIELD_OPTIONALLY_ENCLOSED_BY attribute in the SQL below.
However, when I try to import NUMBER values from the CSV file it still doesn't work even though I already use FIELD_OPTIONALLY_ENCLOSED_BY='"'; I get a "Numeric value '"3922000"' is not recognized" error.
A sample of my .csv file looks like this:
"3922000","14733370","57256","2","3","2","2","2019-05-23
14:14:44",",00000000",",00000000",",00000000",",00000000","1000,00000000","1000,00000000","1317,50400000","1166,50000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000","","tcllVeEFPD"
My COPY INTO statement is below:
COPY INTO '..'
FROM '...'
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 1
error_on_column_count_mismatch=false
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "ABORT_STATEMENT";
I get the feeling that the NUMBER is being interpreted as a STRING.
Does anyone have a solution for this?
Try using a subquery in the FROM clause of the COPY command, where each column is listed out and the appropriate columns are cast.
Ex.
COPY INTO '...'
FROM (
    SELECT $1::INTEGER,
           $2::FLOAT,
           ...
    FROM '...'   -- the stage location goes inside the subquery
)
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 1
error_on_column_count_mismatch=false
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "ABORT_STATEMENT";
