How do you escape strings for SQLite table names in C?
I found a document, but it does not give the details: https://www.sqlite.org/lang_keywords.html
And this document says that the SQL text is terminated by '\x00': https://www.sqlite.org/c3ref/prepare.html
Here is the similar question in python: How do you escape strings for SQLite table/column names in Python?
Identifiers should be wrapped in double quotes if they need escaping, with all double quotes in them escaped by doubling up the quotes. bad"name" needs to become "bad""name" to be used in a SQL statement.
SQLite comes with custom versions of the *printf() functions that include formats for escaping SQL identifiers and strings (which use single quotes in SQL). The one that does the escaping of double quotes for identifiers is %w:
char *sanitized_ddl = sqlite3_mprintf("CREATE TABLE \"%w\"(\"%w\", \"%w\");",
                                      "bad\"name", "foo bar", "baz");
/* Use sanitized_ddl, then release it with sqlite3_free(sanitized_ddl). */
Ideally, though, you're not going to use table or column names that need escaping, but it's good practice to escape user-supplied names to help protect against SQL injection attacks and the like.
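For comparison, the same doubling rule can be sketched in Python with the standard sqlite3 module. This only illustrates the quoting rule and is not part of the C API; the helper names quote_identifier and create_and_list are made up for this example.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    if "\x00" in name:
        # The SQL text is NUL-terminated, so a NUL byte cannot be part of a name.
        raise ValueError("identifier must not contain a NUL character")
    return '"' + name.replace('"', '""') + '"'


def create_and_list(table_name: str) -> list:
    """Create a table with the given name in a throwaway in-memory database
    and return the table names recorded in sqlite_master."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE {quote_identifier(table_name)} (x)")
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Calling create_and_list('bad"name') should create a table whose stored name contains the double quote unchanged.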
Example:
example becomes 'example'
'a becomes '''a'
Detail:
Do not use the byte value 0 in the string. It returns an error like unrecognized token: "'" even if you pass the correct zSql length to sqlite3_prepare_v2().
Replace each ' with '' in the string, then add ' to the start and the end of the string. ' is the single quote (byte 39).
Using an invalid UTF-8 string is not recommended. According to the parser code in sqlite3 version 3.28.0, the string does not need to be valid UTF-8, and I have tested that an invalid UTF-8 string can be used as a table name, but the documentation of sqlite3_prepare_v2() says it expects "SQL statement, UTF-8 encoded".
I wrote a program to confirm that, in sqlite3 version 3.28.0, any byte sequence of length 1 or 2 that does not contain the byte value 0 can be used as a table name: the program can read values from that table and can list the table from the SQLITE_MASTER table.
Related
I am trying to filter items with a stored procedure using like. The column is a varchar(15). The items I am trying to filter have square brackets in the name.
For example: WC[R]S123456.
If I do a LIKE 'WC[R]S123456' it will not return anything.
I found some information on using the ESCAPE keyword with LIKE, but how can I use it to treat the square brackets as a regular string?
LIKE 'WC[[]R]S123456'
or
LIKE 'WC\[R]S123456' ESCAPE '\'
Should work.
Let's say you want to match the literal its[brac]et.
You don't need to escape the ] as it has special meaning only when it is paired with [.
Therefore escaping [ suffices to solve the problem. You can escape [ by replacing it with [[].
I needed to exclude names that started with an underscore from a query, so I ended up with this:
WHERE b.[name] not like '\_%' escape '\' -- use \ as the escape character
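The same ESCAPE clause can be tried without a SQL Server instance. The sketch below uses Python's sqlite3 module as a stand-in (SQLite also accepts LIKE ... ESCAPE); the table name items and the function name are invented for illustration.

```python
import sqlite3


def names_not_starting_with_underscore(names):
    """Return the names that do not begin with a literal underscore,
    filtered by the database using NOT LIKE ... ESCAPE."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (name TEXT)")
        conn.executemany(
            "INSERT INTO items (name) VALUES (?)", [(n,) for n in names]
        )
        # In the SQL text the pattern is '\_%' and the escape character is '\'.
        rows = conn.execute(
            "SELECT name FROM items "
            "WHERE name NOT LIKE '\\_%' ESCAPE '\\' "
            "ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Without the escape character, the pattern '_%' would match every non-empty name, because _ would act as a single-character wildcard.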
Here is what I actually used:
like 'WC![R]S123456' ESCAPE '!'
The ESCAPE keyword is used if you need to search for special characters like % and _, which are normally wildcards. If you specify ESCAPE, SQL will search literally for a % or _ that is preceded by the escape character.
Here's a good article with some more examples
SELECT columns FROM table WHERE
column LIKE '%[[]SQL Server Driver]%'
-- or
SELECT columns FROM table WHERE
column LIKE '%\[SQL Server Driver]%' ESCAPE '\'
According to documentation:
You can use the wildcard pattern matching characters as literal
characters. To use a wildcard character as a literal character,
enclose the wildcard character in brackets.
You need to escape these three characters %_[:
'5%' LIKE '5[%]' -- true
'5$' LIKE '5[%]' -- false
'foo_bar' LIKE 'foo[_]bar' -- true
'foo$bar' LIKE 'foo[_]bar' -- false
'foo[bar' LIKE 'foo[[]bar' -- true
'foo]bar' LIKE 'foo]bar' -- true
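If the search text comes from a program, the bracket rule above can be applied mechanically before the pattern is sent to SQL Server. A minimal sketch, assuming only the three characters listed above need wrapping; the function name escape_like_brackets is hypothetical.

```python
def escape_like_brackets(text: str) -> str:
    """Wrap each of the LIKE wildcard characters [, % and _ in square
    brackets so that it is matched literally. A closing bracket is left
    alone because it is only special when paired with an opening one."""
    escaped = []
    for ch in text:
        if ch in "[%_":
            escaped.append("[" + ch + "]")
        else:
            escaped.append(ch)
    return "".join(escaped)
```

For example, escape_like_brackets("WC[R]S123456") produces the pattern WC[[]R]S123456 used earlier in this thread.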
If you need to escape special characters like '_' (underscore), as I did, and you are unwilling or unable to define an ESCAPE clause, you can enclose the special character in square brackets '[' and ']'.
This explains the meaning of the "weird" string '[[]': it just encloses the '[' character in square brackets, effectively escaping it.
My use case was to specify the name of a stored procedure with underscores in it as a filter criterion for the Profiler. So I put the string '%name[_]of[_]a[_]stored[_]procedure%' in a TextData LIKE field and it gave me the trace results I wanted.
Here is a good example from the documentation:
LIKE (Transact-SQL) - Using Wildcard Characters As Literals
There is a problem in that while
LIKE 'WC[[]R]S123456'
and
LIKE 'WC\[R]S123456' ESCAPE '\'
both work for SQL Server, neither works for Oracle.
It seems that there isn't any ISO/IEC 9075 way to recognize a pattern involving a left bracket.
Instead of '\' or another character on the keyboard, you can also use special characters that aren't on the keyboard. Depending on your use case this might be necessary, if you don't want user input to accidentally be used as an escape character.
Use the following.
To search for user input exactly as typed, use ESCAPE, which requires the following replacement for all special characters (the list below covers all of those in SQL Server).
The single quote "'" is not included here because it does not affect the LIKE clause; it is a matter of string concatenation.
Replacing "-", "^" and "]" is not required because we are escaping "[".
String FormattedString = "UserString".Replace("ð","ðð").Replace("_", "ð_").Replace("%", "ð%").Replace("[", "ð[");
Then the SQL query should look like the following. (In a parameterised query, the pattern characters can be added to the string after the above replacement.)
To search for an exact string:
like 'FormattedString' ESCAPE 'ð'
To search for values that end with a string:
like '%FormattedString' ESCAPE 'ð'
To search for values that start with a string:
like 'FormattedString%' ESCAPE 'ð'
To search for values that contain a string:
like '%FormattedString%' ESCAPE 'ð'
And so on for other pattern matching. But direct user input needs to be formatted as mentioned above.
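The replacement chain in this answer can be written as a small helper in other languages too. The following Python sketch assumes the same four characters need escaping; the names escape_like and contains_pattern and the default escape character are illustrative choices, not a fixed API.

```python
def escape_like(text: str, escape: str = "ð") -> str:
    """Prefix the escape character itself, then _, % and [, with the
    escape character. The escape character must be handled first so that
    the prefixes added by the later replacements are not doubled."""
    for ch in (escape, "_", "%", "["):
        text = text.replace(ch, escape + ch)
    return text


def contains_pattern(text: str, escape: str = "ð") -> str:
    """Build a pattern that matches values containing the literal text.
    The wildcards are added only after the user input has been escaped."""
    return "%" + escape_like(text, escape) + "%"
```

The resulting pattern is meant to be passed as a query parameter together with an ESCAPE clause naming the same escape character.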
I am reading data from one table and writing it into another table in SQLite. Reading does not show any error, but the update fails with an error because the text I am writing contains "You're"; the error is reported at the 're part. Is there any way to update text containing this special character?
Insert another single quote after You' so that the text becomes "You''re"
http://sqlite.org/lang_expr.html
A string constant is formed by enclosing the string in single quotes ('). A single quote within the string can be encoded by putting two single quotes in a row - as in Pascal. C-style escapes using the backslash character are not supported because they are not standard SQL.
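Both ways of handling the apostrophe can be sketched with Python's sqlite3 module: doubling the quote when the SQL text is built by hand, or binding the value as a parameter so that no quoting is needed at all. The table name target and the helper names are invented for this example.

```python
import sqlite3


def quote_literal(text: str) -> str:
    """Form a SQL string constant: enclose in single quotes and double
    every single quote inside the text."""
    return "'" + text.replace("'", "''") + "'"


def store_text(value: str, use_parameter: bool = True) -> str:
    """Update a row with the given text and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("INSERT INTO target (id, body) VALUES (1, '')")
        if use_parameter:
            # The driver passes the value separately from the SQL text.
            conn.execute("UPDATE target SET body = ? WHERE id = 1", (value,))
        else:
            # Hand-built SQL: the apostrophe must be doubled.
            conn.execute(
                "UPDATE target SET body = " + quote_literal(value) + " WHERE id = 1"
            )
        return conn.execute("SELECT body FROM target WHERE id = 1").fetchone()[0]
    finally:
        conn.close()
```

Parameter binding is usually the simpler choice when copying values from one table to another, since the text never becomes part of the SQL statement.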
I'm trying to build a .NET regex to match SQL Server constant strings... but not Unicode strings.
Here's a bit of SQL:
select * from SomeTable where SomeKey = 'abc''def' and AnotherField = n'another''value'
Note that within a string, two single quotes escape a single quote.
The regex should match 'abc''def' but not n'another''value'.
I have a regex now that manages to locate a string, but it also matches the Unicode string (starting just after the N):
'('{2})*([^']*)('{2})*([^']*)('{2})*'
Thanks!
This pattern will do most of what you are looking to do:
(?<unicode>n)?'(?<value>(?:''|[^'])*)'
The upside is that it should accurately match any number of escaped quotes. (SomeKey = 'abc''''def''' will match abc''''def''.)
The downside is it also matches Unicode strings, although it captures the leading n to identify it as a Unicode string. When you process the regular expression, you can ignore matches where the match group "unicode" was successful.
The pattern creates the following groups for each match:
unicode: Success if the string is a Unicode string, fails to match if ASCII
value: the string value. escaped single quotes remain escaped
If you are using .NET regular expressions, you could add (?(unicode)(?<-value>)) to the end of the pattern to suppress matching the value, although the pattern as a whole would still match.
Edit
Having thought about it some more, the following pattern should do exactly what you wanted; it will not match Unicode strings at all. The above approach might still be more readable, however.
(?:n'(?:''|[^'])*'[^']*)*(?<!n)'(?<value>(?:''|[^'])*)'
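The first approach (match everything, then skip matches where the unicode group succeeded) is easy to try outside .NET. Below is a sketch using Python's re module, which spells named groups as (?P<name>...); the function name non_unicode_literals is made up, and the pattern is otherwise the first one from this answer.

```python
import re

# Optional leading n/N, then a quoted string in which '' is an escaped quote.
STRING_LITERAL = re.compile(
    r"(?P<unicode>n)?'(?P<value>(?:''|[^'])*)'", re.IGNORECASE
)


def non_unicode_literals(sql: str) -> list:
    """Return the contents of string constants that are not prefixed with N.
    Escaped quotes inside the value are left doubled."""
    return [
        match.group("value")
        for match in STRING_LITERAL.finditer(sql)
        if match.group("unicode") is None
    ]
```

Because the Unicode string is consumed as a whole match, its contents are never re-scanned as an ordinary string starting just after the N.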
There's very little documentation available about escaping characters in SQL Server BULK INSERT files.
The documentation for BULK INSERT says the statement only has two formatting options: FIELDTERMINATOR and ROWTERMINATOR, however it doesn't say how you're meant to escape those characters if they appear in a row's field value.
For example, if I have this table:
CREATE TABLE People ( name varchar(MAX), notes varchar(MAX) )
and this single row of data:
"Foo, \Bar", "he has a\r\nvery strange name\r\nlol"
...what would its corresponding bulk insert file look like? This wouldn't work, for obvious reasons:
Foo,\Bar,he has a
very strange name
lol
SQL Server says it supports \r and \n but doesn't say if backslashes escape themselves, nor does it mention field value delimiting (e.g. with double-quotes, or escaping double-quotes) so I'm a little perplexed in this area.
I worked around this issue by using \0 as a row separator and \t as a field separator, as neither character appeared in a field value and both are supported as separators by BULK INSERT.
I am surprised MSSQL doesn't offer more flexibility when it comes to import/export. It wouldn't take too much effort to build a first-class CSV/TSV parser.
For the next person to search:
I used "\0\t" as a field separator, and "\0\n" for the end-of-line separator on the last field. Use of "\0\r\n" would also be acceptable if you wish to pretend that the files have DOS EOL conventions.
For those unfamiliar with the \x notation, \0 is CHAR(0), \t is CHAR(9), \n is CHAR(10) and \r is CHAR(13). Replace the CHAR() function with whatever your language offers to convert a number to a nominated character.
With this combination, all instances of \t and \n (and \r) become acceptable characters in the data file. After all, the weakness of the bulk upload system is that tabs and newlines are often legitimate characters in text strings, whereas other low-ASCII characters like CHAR(0), CHAR(1) and CHAR(2) are not legal text - not even appearing in UTF-8.
The only character you cannot have in your data is \0 - UNLESS you can guarantee it will never be followed by \t or \n (or \r)
If your language suffers problems when you use \0 in strings (but depending on how you code, you may still be able to avoid that problem) - AND if you know that your data won't have CHAR(1) or CHAR(2) in it (ie no binary) then use those characters instead. Those low characters are only going to be found when you are trying to store arbitrary binary data in strings.
Note also that you will find bytes 0, 1, 2 in UTF-16, UCS-2 and UTF-32 (aka UCS-4) - BUT - the 2 or 4 byte wide representation of CHAR(0, 1 or 2) is still acceptable and distinct from any legal unicode text. Just make sure you select the correct codepage setting in the format file to suit your choice of a UTF or UCS variant.
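Producing a data file with these separators is straightforward. The Python sketch below builds the text for such a file using "\0\t" between fields and "\0\n" after the last field, and includes a decoder for checking the round trip. The function names are hypothetical, and for simplicity the sketch rejects any field containing \0, which is stricter than the rule described above.

```python
FIELD_SEP = "\0\t"  # CHAR(0) + CHAR(9)
ROW_SEP = "\0\n"    # CHAR(0) + CHAR(10)


def encode_rows(rows) -> str:
    """Join fields and rows with the NUL-prefixed separators.
    Tabs, carriage returns and newlines inside fields are kept as-is."""
    parts = []
    for row in rows:
        for field in row:
            if "\0" in field:
                # Simplification: refuse NUL entirely instead of checking
                # whether it is followed by a tab or newline.
                raise ValueError("field must not contain CHAR(0)")
        parts.append(FIELD_SEP.join(row) + ROW_SEP)
    return "".join(parts)


def decode_rows(data: str) -> list:
    """Split text produced by encode_rows back into rows of fields."""
    chunks = data.split(ROW_SEP)[:-1]  # drop the empty piece after the last row
    return [chunk.split(FIELD_SEP) for chunk in chunks]
```

With the example row from the question, the embedded comma, backslash and line breaks all survive unchanged because none of them is part of a separator.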
A bulk insert needs to have corresponding fields and the same field count for each row. Your example is a little rough, as it's not structured data. As for the characters, it will interpret them literally, not as escape characters (your string will be as seen in the file).
As for the double quotes enclosing each field, you will just have to use them as field and row terminators as well. So now you should have:
Fieldterminator = '","',
Rowterminator = '"\n'
Does that make sense? Then after the bulk insert you'll need to take out the prefix double quote with something like:
Update yourtable
set yourfirstcolumn = right(yourfirstcolumn, len(yourfirstcolumn) - 1)
Like:
insert into table (col) values (N'multilingual unicode strings')
I'm using SQL Server 2008 and I already use nVarChar as the column data type.
You need the N'' syntax only if the string contains characters which are not inside the default code page. "Best practice" is to have N'' whenever you insert into an nvarchar or ntext column.
Yes, you do if you have unicode characters in the strings.
From books online (http://msdn.microsoft.com/en-us/library/ms191313.aspx)...
"Unicode string constants that appear in code executed on the server, such as in stored procedures and triggers, must be preceded by the capital letter N. This is true even if the column being referenced is already defined as Unicode. Without the N prefix, the string is converted to the default code page of the database. This may not recognize certain characters. The requirement to use the N prefix applies to both string constants that originate on the server and those sent from the client."
It is preferable for compatibility's sake.
Best practice is to use parameterisation in which case you don't need the N prefix.