Is it ok to use unencoded reserved characters in url? - reactjs

I have a requirement to add multiple nested paths to the query string. To do this, I'm encoding the individual path names and combining them with / as the delimiter. If a path name contains a slash, that slash is encoded as %2F. However, we are not encoding the delimiter slash (which is used for splitting the path).
Example:
Input 1: a->b->c
Input 2: path_with_/_slash->d->e
Output: ?q=a/b/c+path_with_%2F_slash/d/e
Note: I'm creating the query string manually (not using URLSearchParams, as it encodes all the slashes, including the separator).
Is it ok to use unencoded slash (used as separator) in query string?
Will that create any problem in any of the browsers?
Is there a better way to handle this scenario?

If you're manually forming the query string, you must follow the procedure outlined in the URL Standard, section 5.2, "application/x-www-form-urlencoded serializing":
1. Let output be the empty string.
2. For each tuple of tuples:
   1. Let name be the result of running percent-encode after encoding with encoding, tuple’s name, the application/x-www-form-urlencoded percent-encode set, and true.
   2. Let value be the result of running percent-encode after encoding with encoding, tuple’s value, the application/x-www-form-urlencoded percent-encode set, and true.
   3. If output is not the empty string, then append U+0026 (&) to output.
   4. Append name, followed by U+003D (=), followed by value, to output.
3. Return output.
And unless you hand-rolled your own server, any functions/middleware/etc. for working with url queries during route handling will have automatically urldecoded those values for you.
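As a rough sketch of how the serialization steps could be combined with the per-segment encoding from the question: each segment is percent-encoded first so that an embedded slash becomes %2F, the segments are joined with /, and the whole value is then form-urlencoded. Python's urllib.parse is used purely for illustration; the helper names build_query and parse_query, and the choice of one repeated q parameter per path, are assumptions and not part of the question.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode


def build_query(paths):
    """Serialize a list of paths (each a list of segment names) as repeated q= values."""
    values = []
    for segments in paths:
        # Layer 1: encode each segment so a literal "/" inside a name becomes %2F.
        values.append("/".join(quote(seg, safe="") for seg in segments))
    # Layer 2: form-urlencode each whole value; "/" and "%" are escaped here as well.
    return urlencode([("q", value) for value in values])


def parse_query(query):
    """Reverse build_query: form-decode, split on the separator, decode each segment."""
    values = parse_qs(query, keep_blank_values=True).get("q", [])
    return [[unquote(seg) for seg in value.split("/")] for value in values]


paths = [["a", "b", "c"], ["path_with_/_slash", "d", "e"]]
query = build_query(paths)
print(query)
print(parse_query(query) == paths)
```

The cost of this design is that the receiver decodes twice: once as ordinary form decoding (which most server frameworks already do), and once per segment after splitting on the separator.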

Related

Error when adding a document to the index with special characters like * or # in the key

The request is invalid. Details: actions : 0: Invalid document key: 'TESTS123*14'. Keys can only contain letters, digits, underscore (_), dash (-), or equal sign (=). If the keys in your source data contain other characters, we recommend encoding them with a URL-safe version of Base64 before uploading them to your index. If that is not an option, you can add the 'allowUnsafeKeys' query string parameter to disable this check.
We use the .NET SDK. How do we set allowUnsafeKeys?
We tried the URL-safe version of Base64, but then the index stores only the encoded content, not the actual content.
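For reference, the URL-safe Base64 encoding recommended by the error message is reversible, and the message only asks for it on the key. The sketch below is one possible arrangement, not the SDK's API: the helper names and the document field names id and originalKey are hypothetical, and keeping the unencoded value in a second field is an assumption about how the index could be laid out.

```python
import base64
import re

# Characters the error message lists as valid in a document key.
ALLOWED_KEY = re.compile(r"^[A-Za-z0-9_\-=]+$")


def encode_key(raw_key):
    """Encode a raw key with the URL-safe Base64 alphabet (A-Z, a-z, 0-9, -, _, =)."""
    return base64.urlsafe_b64encode(raw_key.encode("utf-8")).decode("ascii")


def decode_key(encoded_key):
    """Recover the original key from its URL-safe Base64 form."""
    return base64.urlsafe_b64decode(encoded_key.encode("ascii")).decode("utf-8")


# Hypothetical document shape: encoded key plus the readable original value.
document = {"id": encode_key("TESTS123*14"), "originalKey": "TESTS123*14"}
print(document["id"], decode_key(document["id"]))
```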

Passing a pound sign as a value in the query string of a URL, using Angular

I have an input tag and on input it will filter a list of data in a table according to the input value. That value is passed via the query string in the request URL. Typically I get data returned and the table is updated appropriately. However, when searching for the pound sign (#), I am receiving a 500 internal server error. My question: is there a known issue with Angular when passing a pound sign in the query string?
To pass reserved characters in URLs, you need to use percent encoding. For #, it's %23.
The Wikipedia page for Percent Encoding has a nice lookup table.
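A small sketch of why the unencoded # goes wrong: a URL parser treats it as the start of the fragment, so the value never reaches the query string. Python's urllib.parse is used for illustration; the example.com URL and the filter parameter name are made up.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Unencoded: everything from "#" onward is parsed as the fragment, not the query.
raw = urlsplit("https://example.com/items?filter=#")
print(repr(raw.query), repr(raw.fragment))

# Encoded: "#" becomes %23 and stays inside the query string.
encoded_url = "https://example.com/items?" + urlencode({"filter": "#"})
parts = urlsplit(encoded_url)
print(parts.query, parse_qs(parts.query))

# quote() with safe="" percent-encodes a single value the same way.
print(quote("#", safe=""))
```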

PostgreSQL: unable to save special character (regional language) in blob

I am using PostgreSQL 9.0 and am trying to store a bytea file which contains certain special characters (regional language characters - UTF8 encoded). But I am not able to store the data as input by the user.
For example :
what I get in request while debugging:
<sp_first_name_gu name="sp_first_name_gu" value="ઍયેઍ"></sp_first_name_gu><sp_first_name name="sp_first_name" value="aaa"></sp_first_name>
This is what is stored in DB:
<sp_first_name_gu name="sp_first_name_gu" value="\340\252\215\340\252\257\340\253\207\340\252\215"></sp_first_name_gu><sp_first_name name="sp_first_name" value="aaa"></sp_first_name>
Note the difference in the value attribute. Because of this, I am not able to retrieve the text as entered by the user.
What do I need to do to fix this?
PS: My DB is UTF8 encoded.
The value is stored correctly, but is escaped into octal escape sequences upon retrieval.
To fix that, change the settings of the DB driver or choose a different encoding/escaping for bytea.
Or just use proper field types for the XML data - like varchar or XML.
Your string \340\252\215\340\252\257\340\253\207\340\252\215 is exactly ઍયેઍ in octal encoding, so PostgreSQL stores your data correctly. PostgreSQL escapes all non-printable characters; for more details see the PostgreSQL documentation, especially section 8.4.2.
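To check that claim, the octal escapes can be turned back into bytes and decoded as UTF-8. This is only a sketch: the helper name decode_octal_escapes is made up, and it assumes the input consists solely of \ooo escapes, as in the value shown above.

```python
import re


def decode_octal_escapes(escaped):
    """Turn bytea-style \\ooo octal escapes back into text, assuming UTF-8 bytes."""
    # Each \ooo group is one byte written as three octal digits.
    raw = bytes(int(octal, 8) for octal in re.findall(r"\\([0-7]{3})", escaped))
    return raw.decode("utf-8")


stored = r"\340\252\215\340\252\257\340\253\207\340\252\215"
print(decode_octal_escapes(stored))
```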

.NET Regex for SQL Server string... but not Unicode string?

I'm trying to build a .NET regex to match SQL Server constant strings... but not Unicode strings.
Here's a bit of SQL:
select * from SomeTable where SomeKey = 'abc''def' and AnotherField = n'another''value'
Note that within a string, two single quotes escape a single quote.
The regex should match 'abc''def' but not n'another''value'.
I have a regex now that manages to locate a string, but it also matches the Unicode string (starting just after the N):
'('{2})*([^']*)('{2})*([^']*)('{2})*'
Thanks!
This pattern will do most of what you are looking to do:
(?<unicode>n)?'(?<value>(?:''|[^'])*)'
The upside is that it should accurately match any number of escaped quotes. (SomeKey = 'abc''''def''' will match abc''''def''.)
The downside is it also matches Unicode strings, although it captures the leading n to identify it as a Unicode string. When you process the regular expression, you can ignore matches where the match group "unicode" was successful.
The pattern creates the following groups for each match:
unicode: Success if the string is a Unicode string, fails to match if ASCII
value: the string value. escaped single quotes remain escaped
If you are using .NET regular expressions, you could add (?(unicode)(?<-value>)) to the end of the pattern to suppress matching the value, although the pattern as a whole would still match.
Edit
Having thought about it some more, the following pattern should do exactly what you wanted; it will not match Unicode strings at all. The above approach might still be more readable, however.
(?:n'(?:''|[^'])*'[^']*)*(?<!n)'(?<value>(?:''|[^'])*)'
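The "match both, then ignore the Unicode ones" approach from the first pattern can be sketched as follows. Python's re module is used here instead of .NET, so the named groups are written (?P<name>...) rather than (?<name>...); the function name and the case-insensitive flag are assumptions added for the example.

```python
import re

# Python spelling of the first pattern: optional leading n, then a quoted string
# in which '' is an escaped quote.
SQL_STRING = re.compile(r"(?P<unicode>n)?'(?P<value>(?:''|[^'])*)'", re.IGNORECASE)


def non_unicode_strings(sql):
    """Return the values of string constants that are not prefixed with n/N."""
    return [
        match.group("value")
        for match in SQL_STRING.finditer(sql)
        if match.group("unicode") is None  # skip matches where the n prefix was captured
    ]


sql = "select * from SomeTable where SomeKey = 'abc''def' and AnotherField = n'another''value'"
print(non_unicode_strings(sql))
```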

Any python/django function to check whether a string only contains characters included in my database collation?

As expected, I get an error when entering some characters not included in my database collation:
(1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")
Is there any function I could use to make sure a string only contains characters existing in my database collation?
thanks
You can use a regular expression to only allow certain characters. The following allows only letters, numbers and _(underscore), but you can change to include whatever you want:
import re
exp = '^[A-Za-z0-9_]+$'
re.match(exp, my_string)
If a match object is returned, the string is valid; if None is returned, the string contains characters outside the allowed set.
I'd look at Python's unicode.translate() and codecs.encode() functions. Both of these would allow more elegant handling of non-legal input characters, and IIRC, translate() has been shown to be faster than a regexp for similar use-cases (should be easy to google the findings).
From Python's docs:
"For Unicode objects, the translate() method does not accept the optional deletechars argument. Instead, it returns a copy of the s where all characters have been mapped through the given translation table which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted. Note, a more flexible approach is to create a custom character mapping codec using the codecs module (see encodings.cp1251 for an example)."
http://docs.python.org/library/stdtypes.html
http://docs.python.org/library/codecs.html
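One way to apply the codec idea is to try encoding the string with the column's character set and treat an encoding error as "contains an unsupported character". This is a sketch in Python 3, where str plays the role of the unicode type mentioned above. The function name is made up, and the default of cp1252 is an assumption based on MySQL's latin1 corresponding to Windows cp1252; use whatever codec matches your column.

```python
def fits_charset(text, charset="cp1252"):
    """Return True if every character of text can be encoded in the given charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        # At least one character has no representation in this charset.
        return False
    return True


print(fits_charset("abc_123"))
print(fits_charset("caf\u00e9"))
print(fits_charset("\u0a8d\u0aaf"))
```

Compared with a hand-written character class, this leaves the definition of "allowed characters" to the codec itself.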
