Regex inside split() method unintended side-effect [duplicate] - arrays

$.validator.addMethod('AZ09_', function (value) {
return /^[a-zA-Z0-9.-_]+$/.test(value);
}, 'Only letters, numbers, and _-. are allowed');
When I use something like test-123, it still triggers as if the hyphen is invalid. I tried \- and --

Escaping using \- should be fine, but you can also try putting it at the beginning or the end of the character class. This should work for you:
/^[a-zA-Z0-9._-]+$/

Escaping the hyphen using \- is the correct way.
I have verified that the expression /^[a-zA-Z0-9.\-_]+$/ does allow hyphens. You can also use the \w class to shorten it to /^[\w.\-]+$/.
(Putting the hyphen last in the expression actually causes it to not require escaping, as it then can't be part of a range, however you might still want to get into the habit of always escaping it.)

The \- may not have worked because you passed the whole pattern from the server as a string. If that's the case, you should first escape the \ itself so that it survives the server-side string.
In a server side string: \\-
On the client side: \-
In regex (covers): -
Or you can simply put the hyphen at the end of the [] brackets.
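The same effect can be reproduced entirely in JavaScript when the pattern is built from a string literal. This is only a sketch: the variable names are made up for illustration, and it assumes the pattern reaches `new RegExp` as a string, which the question does not confirm.

```javascript
// In a JavaScript string literal, "\-" is just "-": the backslash is dropped
// before the regex engine ever sees it, so ".-_" becomes a range again.
const lost = new RegExp("^[a-zA-Z0-9.\-_]+$");

// Doubling the backslash keeps one "\" in the string, so the regex sees "\-".
const kept = new RegExp("^[a-zA-Z0-9.\\-_]+$");

console.log(lost.source);            // ^[a-zA-Z0-9.-_]+$
console.log(lost.test("test-123"));  // false
console.log(kept.test("test-123"));  // true
```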

Generally, with the hyphen (-) character in regex, it's important to note the difference between escaping (\-) and not escaping (-) it, because a hyphen, apart from being a character itself, is also parsed as a range operator inside a character class.
In the first case, with an escaped hyphen (\-), the regex will only match the hyphen itself, as in /^[+\-.]+$/
In the second case, without escaping, as in /^[+-.]+$/, the hyphen sits between plus and dot, so the class matches all characters with ASCII values from 43 (plus) to 46 (dot), and therefore also includes the comma (ASCII value 44) as a side effect.
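A short check of the two cases, plus the pattern from the question, may make the range behaviour concrete. The variable names are illustrative only.

```javascript
const unescaped = /^[+-.]+$/;   // "+-." is the range from "+" (43) to "." (46)
const escaped = /^[+\-.]+$/;    // "+", "-" and "." only

console.log(unescaped.test(","));  // true: "," (44) falls inside the range
console.log(escaped.test(","));    // false

// The pattern from the question: ".-_" is the range from "." (46) to "_" (95).
// The hyphen itself is 45, so it lies just outside that range.
const original = /^[a-zA-Z0-9.-_]+$/;
console.log(original.test("test-123"));  // false
console.log(original.test("a:b"));       // true: ":" (58) is inside the range
```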

\- should work to escape the - in the character range. Can you quote what you tested when it didn't seem to? Because it seems to work: http://jsbin.com/odita3

A more generic way of matching hyphens is by using the character class for hyphens and dashes ("\p{Pd}" without quotes). If you are dealing with text from various cultures and sources, you might find that there are more types of hyphens out there, not just one character. You can add that inside the [] expression
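In JavaScript, \p{Pd} is only recognised when the regex carries the u flag (ES2018 and later). A minimal sketch, assuming a modern engine and using a made-up variable name:

```javascript
// \p{Pd} is the Unicode "dash punctuation" category; it needs the u flag.
const anyDash = /^[\w.\p{Pd}]+$/u;

console.log(anyDash.test("test-123"));       // true: ASCII hyphen-minus
console.log(anyDash.test("test\u2013123"));  // true: en dash (U+2013)
console.log(anyDash.test("test 123"));       // false: space is not allowed
```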

Related

SQL Server SqlPackage variables: are quotes needed around string variable?

When running sqlpackage.exe for deployments, do string variables require quotes around the word? It seems to be running successfully both ways. What is the correct syntax?
Two options shown here:
/v:CompanyName=ABCD
/v:CompanyName="ABCD"
Resource: https://learn.microsoft.com/en-us/sql/tools/sqlpackage/sqlpackage?view=sql-server-ver15
@Jeroen Mostert is right. This is more about the command line in general than about SqlPackage specifically.
If the string variable contains spaces, equals signs, slashes, or anything else that would interfere with option syntax, the value must be surrounded in "quotes".
Here is the example blog: https://www.addictivetips.com/windows-tips/enter-file-or-folder-paths-with-spaces-in-command-prompt-on-windows-10/
If all of the following conditions are met, then quote characters on the command line are preserved:
No /S switch (Strip quotes)
Exactly two quote characters
No special characters between the two quote characters, where special is one of: & < > ( ) @ ^ |
There are one or more whitespace characters between the two quote characters
The string between the two quote characters is the name of an executable file.
Ref: https://ss64.com/nt/syntax-cmd.html
HTH.

ng-pattern for alphanumeric and all special symbol characters

In input, I need to allow alphanumeric and all special symbol character.
I am using the following pattern. Unfortunately, it is not working.
ng-pattern="/^[ A-Za-z0-9_#./#$=!%^)(]:*;?/\,}{'|<>[&+-]*$/"
You forgot to escape the characters that need to be escaped with \. The following may work:
ng-pattern="/^[A-Za-z0-9_#.#$=!%^)(\]:\*;\?\/\,}{'\|<>\[&\+-]*$/"
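Note that 0-9 can be shortened to \d. (Do not shorten A-Za-z to A-z: that range also covers the punctuation between Z and a, including \ and `.)
ng-pattern="/^[A-Za-z\d_#.#$=!%^)(\]:\*;\?\/\,}{'\|<>\[&\+-]*$/"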
Test it on regex101
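A class like this can also be checked outside Angular with a plain JavaScript regex literal. The sketch below is illustrative only (the names `allowed`, `wide` and `strict` are made up); it drops the optional escapes and shows why the range A-z is wider than A-Za-z.

```javascript
// Inside [...] only "\" and "]" need escaping, plus "^" when it comes first
// and "-" when it sits between two characters; "/" is escaped for the literal.
const allowed = /^[A-Za-z\d_#.$=!%^)(\]:*;?\/,}{'|<>\[&+-]*$/;

console.log(allowed.test("a[b]c/d?"));  // true
console.log(allowed.test('say "hi"'));  // false: space and " are not in the class

// A-z spans code points 65..122, which includes [ \ ] ^ _ and `.
const wide = /^[A-z]$/;
const strict = /^[A-Za-z]$/;
console.log(wide.test("\\"), strict.test("\\"));  // true false
```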

Can use escape character for Double Quote JSON

I have JSON without an escape character, which my code is unable to parse because there's no escape character. I can make it work by adding a \ before the double quotes. However, due to some constraints I am looking for a workaround and I want to know --
a. Is there any other way I can make this json work without an escape character and the content having double quotes is displayed on my application as is, or
b. do I necessarily need to have an escape character before all double quotes and there's no workaround?
"abc": {
"x1": {
"text1": "key1",
"text2": "Given "Example text" is wrong"
}
}
Thanks !!
Your example is invalid JSON, but I think you know that. :-)
do I necessarily need to have an escape character before all double quotes
Yes, the only way to have a " inside a JSON string is to use an escape of some kind. Unlike JavaScript, JSON doesn't have '-delimited strings or backtick-delimited templates that become strings (new in ES2015). There are a couple of different escape sequences you can use (\" and \u0022 for instance), but they're still escape sequences. After all, the " is how the JSON parser knows it's found the end of the string.
In the specific case of HTML, you could also use &quot; (a named character entity) if you're interpreting the string as HTML. But that doesn't change the fact you need to properly escape the string (since newlines and several other characters need escaping as well, not just ").
My experience is that the best way to produce JSON is to produce a structure in memory and then use the facility of your environment to convert that structure to valid JSON. In JavaScript, that's JSON.stringify; in PHP, it's json_encode; etc. Just about any language or environment you can find has a JSON library (built-in or not) for this.
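As a small illustration of that advice in JavaScript, the sketch below builds the structure from the question in memory and lets JSON.stringify add the escapes. The `tryParse` helper is hypothetical and exists only to show that the unescaped text is rejected.

```javascript
// Build the data as a normal object; no manual escaping is needed here.
const data = {
  abc: { x1: { text1: "key1", text2: 'Given "Example text" is wrong' } },
};

// JSON.stringify inserts the required \" escapes.
const json = JSON.stringify(data);
console.log(json);

// Parsing gives back the original text, quotes included.
console.log(JSON.parse(json).abc.x1.text2);

// Hypothetical helper: reports whether a piece of text parses as JSON.
function tryParse(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// The unescaped form from the question does not parse.
console.log(tryParse('{"text2": "Given "Example text" is wrong"}'));  // false
```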
You SHOULD add the escape character (\) in order to have valid JSON.
According to the specs, this is the list of special characters used in JSON:
\b Backspace (ascii code 08)
\f Form feed (ascii code 0C)
\n New line
\r Carriage return
\t Tab
\" Double quote
\\ Backslash character

Flex Regular Expression to Identify AWK Regular Expression

I am putting together the last pattern for my flex scanner for parsing AWK source code.
I cannot figure out how to match the regular expressions used in the AWK source code as seen below:
{if ($0 ~ /^\/\// ){ #Match for "//" (Comment)
or more simply:
else if ($0 ~ /^Department/){
where the AWK regular expression is encapsulated within "/ /".
All of the Flex patterns I have tried so far match my entire input file. I have tried changing the precedence of the regex pattern and have found no luck. Help would be greatly appreciated!!
regexing regexen must be a meme somewhere. Anyway, let's give it a try.
A gawk regex consists of:
/
any number of regex components
/
A regex component (simplified form -- Note 1) is one of the following:
any character other than /, [ or \
a \ followed by any single character (we won't get into linefeeds just now, though)
a character class (see below)
Up to here it's easy. Now for the fun part.
A character class is:
[ or [^ or [] or [^] (Note 2)
any number of character class components
]
A character class component is (theoretically, but see below for the gawk bug) one of the following:
any single character other than ] or \ (Note 3)
a \ followed by any single character
a character class of the [:name:] form (described next)
a collation class
A character class of the [:name:] form is: (Note 5)
[:
a valid class name, which afaik is always a sequence of alpha characters, but it's maybe safer not to make assumptions.
:]
A collation class is mostly unimplemented but partially parsed. You could probably ignore them, because it seems like gawk doesn't get them right yet (Note 4). But for what it's worth:
[.
some multicharacter collation character, like 'ij' in Dutch locale (I think).
.]
or an equivalence class:
[=
some character, or maybe also a multicharacter collation character
=]
An important point is that [/] does not terminate the regex. You don't need to write [\/]. (You don't need to do anything to implement that. I'm just mentioning it.)
Note 1:
Actually, the interpretation of \ and character classes, when we get to them, is a lot more complicated. I'm just describing enough of it for lexing. If you actually want to parse the regexen into their bits and pieces, it's a lot more irritating.
For example, you can specify an arbitrary octet with \ddd or \xHH (eg \203 or \x4F). However, we don't need to care, because nothing in the escape sequence is special, so for lexing purposes it doesn't matter; we'll get the right end of the lexeme. Similarly, I didn't bother describing character ranges and the peculiar rules for - inside a character class, nor do I worry about regex metacharacters (){}?*+. at all, since they don't enter into lexing. You do have to worry about [] because it can implicitly hide a / from terminating the regex. (I once wrote a regex parser which let you hide / inside parenthesized expressions, which I thought was cool -- it cuts down a lot on the kilroy-was-here noise (\/) -- but nobody else seems to think this is a good idea.)
Note 2:
Although gawk does \ wrong inside character classes (see Note 3 below), it doesn't require that you use them, so you can still use Posix behaviour. Posix behaviour is that the ] does not terminate the character class if it is the first character in the character class, possibly following the negating ^. The easiest way to deal with this is to let character classes start with any of the four possible sequences, which is summarized as:
\[^?]?
Note 3:
gawk differs from Posix ERE's (Extended Regular Expressions) in that it interprets \ inside a character class as an escape character. Posix mandates that \ loses its special meaning inside character classes. I find it annoying that gawk does this (and so do many other regex libraries, equally annoying.) It's particularly annoying that the gawk info manual says that Posix requires it to do this, when it actually requires the reverse. But that's just me. Anyway, in gawk:
/[\]/]/
is a regular expression which matches either ] or /. In Posix, stripping the enclosing /s out of the way, it would be a regular expression which matches a \ followed by a / followed by a ]. (Both gawk and Posix require that ] not be special when it's not being treated as a character class terminator.)
Note 4:
There's a bug in the version of gawk installed on my machine where the regex parser gets confused at the end of a collating class. So it thinks the regex is terminated by the second / in:
/[[.a.]/]/
although it gets this right:
/[[:alpha:]/]/
and, of course, putting the slash first always works:
/[/[:alpha:]]/
Note 5:
Character classes and collating classes and friends are a bit tricky to parse because they have two-character terminators. "Write a regex to recognize C /* */ comments" used to be a standard interview question, but I suppose it no longer is. Anyway, here's a solution (for [:...:], but just substitute : for the other punctuation if you want to):
[[]:([^:]|:*[^]:])*:+[]] // Yes, I know it's unreadable. Stare at it a while.
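For experimenting with the grammar above before translating it into Flex, it can be approximated as a single JavaScript regex. This is a rough sketch under stated assumptions: it follows the gawk reading of \ inside character classes (Note 3), handles only the [:name:] form and ignores collation and equivalence classes, does not tell a regex apart from a division operator, and all variable names are made up.

```javascript
// One component inside [...]: a [:name:] class, an escaped character,
// or any character other than "]", "\" and newline.
const component = String.raw`\[:[A-Za-z]+:\]|\\.|[^\]\\\n]`;

// "[" then an optional "^" and an optional leading "]" (Note 2),
// then any number of components, then the closing "]".
const bracket = String.raw`\[\^?\]?(?:${component})*\]`;

// "/", then plain characters, escapes or bracket expressions, then "/".
const awkRegex = new RegExp(String.raw`\/(?:[^\/\[\\\n]|\\.|${bracket})*\/`);

const line = String.raw`{if ($0 ~ /^\/\// ){ #Match for "//" (Comment)`;
console.log(awkRegex.exec(line)[0]);  // /^\/\//
```

Because [/] is consumed as part of a bracket expression, a slash inside a class does not end the match, which is the property the answer points out.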
A regex can also be written without "/.../", as a string; see the example:
print all two-digit numbers starting with 7 from 1-100:
kent$ seq 100|awk '{if($0~"7[0-9]")print}'
70
71
72
73
74
75
76
77
78
79
kent$ awk --version
GNU Awk 3.1.6

Regular expression for a string literal in flex/lex

I'm experimenting to learn flex and would like to match string literals. My code currently looks like:
"\""([^\n\"\\]*(\\[.\n])*)*"\"" {/*matches string-literal*/;}
I've been struggling with variations for an hour or so and can't get it working the way it should. I'm essentially hoping to match a string literal that can't contain a new-line (unless it's escaped) and supports escaped characters.
I am probably just writing a poor regular expression or one incompatible with flex. Please advise!
A string consists of a quote mark
"
followed by zero or more of either an escaped anything
\\.
or a non-quote character, non-backslash character
[^"\\]
and finally a terminating quote
"
Put it all together, and you've got
\"(\\.|[^"\\])*\"
The delimiting quotes are escaped because they are Flex meta-characters.
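The same pattern can be tried out in JavaScript, where the quotes need no escaping. This is only an illustration of the regex itself, not of Flex: the variable names and the sample input are made up, and, as in the Flex version, [^"\\] also matches a newline.

```javascript
// Opening quote, then any number of (escaped anything | non-quote,
// non-backslash character), then the closing quote.
const stringLiteral = /"(?:\\.|[^"\\])*"/;

const source = String.raw`print "a \"quoted\" word" + x`;
const match = stringLiteral.exec(source);
console.log(match[0]);  // "a \"quoted\" word"
```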
For a single line... you can use this:
\"([^\\\"]|\\.)*\" {/*matches string-literal on a single line*/;}
How about using a start state...
%{
int enter_dblquotes = 0;
%}
%x DBLQUOTES
%%
\" { BEGIN(DBLQUOTES); enter_dblquotes++; }
<DBLQUOTES>[^"]*\" { /* everything up to the closing quote; no escape handling */
    if (enter_dblquotes){
        handle_this_dblquotes(yytext);
        BEGIN(INITIAL); /* revert back to normal */
        enter_dblquotes--;
    }
}
...more rules follow...
It was something to that effect: flex uses %s or %x to declare the start states it expects. When the scanner sees a quote, it switches to another state and continues lexing until it reaches the next quote, at which point it reverts to the normal state.
Here is my code snippet for handling strings in flex; I hope it gives you some ideas.
Using a start condition to handle string literals is more scalable and clearer.
%x SINGLE_STRING
%%
\" BEGIN(SINGLE_STRING);
<SINGLE_STRING>{
\n yyerror("the string misses \" to terminate before newline");
<<EOF>> yyerror("the string misses \" to terminate before EOF");
([^\\\"\n]|\\.)* {/* do your work like save in here; \n is excluded so the newline rule above can fire */}
\" BEGIN(INITIAL);
. ;
}
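For readers who want to see the state switch without Flex, here is a hand-written JavaScript sketch of the same idea. It is not generated by Flex and not part of the answer's code: `scanString` and its return shape are hypothetical, and escape pairs are kept as written rather than decoded.

```javascript
// Scans one string literal; input[start] must be the opening quote.
function scanString(input, start) {
  let i = start + 1;   // entering the string state: skip the opening quote
  let text = "";
  while (i < input.length) {
    const ch = input[i];
    if (ch === '"') {
      return { text, end: i + 1 };   // closing quote: back to the normal state
    }
    if (ch === "\n") {
      throw new Error('string is missing a closing " before newline');
    }
    if (ch === "\\" && i + 1 < input.length && input[i + 1] !== "\n") {
      text += ch + input[i + 1];     // keep the escape pair as written
      i += 2;
      continue;
    }
    text += ch;
    i += 1;
  }
  throw new Error('string is missing a closing " before EOF');
}

const result = scanString(String.raw`x = "a\"b" + 1`, 4);
console.log(result.text, result.end);  // a\"b 10
```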
This is what we use in Zolang for single line string literals with embedded templates ${...}
\"(\$\{.*\}|\\.|[^\"\\])*\"
An answer that arrives late but which can be useful for the next one who will need it:
\"(([^\"]|\\\")*[^\\])?\"
