Flex string recognition "unrecognised rule" error - c

I'm trying to create a string recognition rule to run in flex. The string can consist of escape characters (\n, \t, \r, \, ", '), symbols (-, +, *, /, :, _, $, !, #, #, &, ~, ^, (, )) and a-zA-Z0-9 characters. I have tried many variations of the code below, but I keep getting the error mentioned above.
ESCAPECHAR [\n] | [\t] | [\r] | [\] | ['] | ["]
SYMBOLS [-+*/:_$!##&~^()]
CHARACTERS [0-9a-zA-Z]
STRING ("({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*") | ('({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*')

You would do well to read the Flex manual chapter on patterns syntax. It is not very long, and it gives a complete description of the syntax of Flex patterns.
Here are a few of the errors you have made:
Flex patterns cannot include unquoted whitespace (unless you put them inside of a subexpression marked with the x flag). So
[\n] | [\t] | [\r] | [\] | ['] | ["]
is invalid.
Also, the \ is used to indicate that:
the following letter is a code for a control character (so that \n is a newline character), or
the following punctuation symbol should not be given special significance.
So in [\], the \ indicates that the following ] should be treated as an ordinary character, instead of being the end of a character class, which means that the character class will continue up to the next ]. Space characters inside a character class are considered to be quoted, so the character class consists of the characters ], space, |, [ and '. (Flex lets you repeat characters inside a character class, so it won't complain about the fact that there are two space characters.) You probably meant [\\].
Anyway, you should write character classes in the same way you wrote the other character classes, as a series of characters or escaped codes inside [ and ]:
[\n\t\r\\ '"]
Flex lets you quote characters by surrounding them with quotation marks, so that "({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*" is treated as a single literal string, which must be matched literally in the text. You probably intended the quotation marks to be ordinary characters, so you should have escaped them or put them into a single-character character class:
["]({ESCAPECHAR}|{SYMBOLS}|{CHARACTERS})*["]
Again, it is necessary to remove the whitespace from the pattern.
I assume that your intention was to allow "escape characters" to appear in a string only if they are actually escaped. Your {ESCAPECHAR} macro expands to a collection of actual characters, so that it includes newline, tab and carriage return characters. It also includes quote and apostrophe, which really should be reserved for terminating the string literal. Probably, what you meant was to allow escape codes if they are preceded with a \ (as with C or, as mentioned above, flex itself). In that case, what you really need to write is
ESCAPECHAR \\[ntr'"]
(That is, a \\, followed by exactly one of the characters n, t, r, ', ".) Even that is not precise, though: it does not allow the use of \\ to indicate a single \, and it forces the user to write "Don\'t just copy code." and '\"', both of which would normally be written without the backslash escapes.
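As a rough illustration of the corrections above, here is a sketch using Python's re module rather than flex. That substitution is an assumption made only to get a runnable example, and the two regex dialects differ in other respects. The names BODY and is_string_literal are hypothetical. The sketch mirrors the corrected definitions: no whitespace inside the pattern, quotation marks treated as ordinary characters, and ESCAPECHAR as a backslash followed by one of n, t, r, ', ".

```python
import re

# Python analogue (not flex) of the corrected definitions discussed above.
ESCAPECHAR = r"""\\[ntr'"]"""        # a backslash followed by n, t, r, ' or "
SYMBOLS = r"[-+*/:_$!##&~^()]"       # copied from the question's SYMBOLS class
CHARACTERS = r"[0-9a-zA-Z]"

# Alternation written without any whitespace between the alternatives.
BODY = f"(?:{ESCAPECHAR}|{SYMBOLS}|{CHARACTERS})*"

# Either a double-quoted or a single-quoted literal.
STRING = re.compile(f'"{BODY}"|\'{BODY}\'')


def is_string_literal(text):
    """Return True if the whole text is one string literal (hypothetical helper)."""
    return STRING.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_string_literal('"abc"'))       # plain characters
    print(is_string_literal(r'"a\nb"'))     # backslash followed by n
    print(is_string_literal('"a\nb"'))      # a real newline inside the quotes
    print(is_string_literal(r'"a\\b"'))     # doubled backslash
```

The last call shows the limitation mentioned above: with this ESCAPECHAR, a doubled backslash is not accepted as an escape for a single backslash.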

Related

Codename One - String replace with empty character

I would like to normalize the phone numbers I get from the contacts in the local phone book. To do that, I want to remove any spaces, dashes, plus signs etc. from the number.
CN1 only offers the String.replace(oldchar, newchar) function rather than the full set of String operations. According to the post
"How to represent empty char in Java Character class", this should be the way to go:
primaryPhoneNumber = primaryPhoneNumber.replace(' ', Character.MIN_VALUE);
However, this approach has several implications.
The char in the console output looks like a space, but it's not; it's a string terminator.
+49 234-63446
0 234 63446
When this normalized string, which still contains Character.MIN_VALUE, is used in a database, the query involving it crashes:
Caused by: org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
How to properly remove spaces and other chars and replace them with a "nothing" character?
You can use:
String p = StringUtil.replaceAll(phone, " ", "");

Python joining a list by "\"

Let's say I have a list of elements.
l = ["xf3", "x03", "x8c"] etc.
Now I would like to join the elements inside my list with a "\". I tried r"\".join(l) but it didn't work.
\ is used to escape 'special' characters, hence a Python string can not terminate with a single \ because it escapes the closing quote.
You have to escape it by using a second \, i.e. '\\'.join(l)
l = ["xf3", "x03", "x8c"]
'\\'.join(l)
The important part is to escape the backslash as '\\', as in Python strings:
the backslash "\" is a special character, also called the "escape"
character. It is used in representing certain whitespace characters:
"\t" is a tab, "\n" is a newline, and "\r" is a carriage return. As well, "\"
can be used to escape itself: "\\" is the literal backslash character.
I'm assuming that what you actually want is to create a string containing those escaped characters. The easiest way I can think of is ast.literal_eval:
>>> import ast
>>> ast.literal_eval("'\\" + "\\".join(l) + "'")
'ó\x03\x8c'
This works by first creating a string of those strings joined by backslash characters (xf3\x03\x8c), surrounding those by quotes and adding the initial backslash ('\xf3\x03\x8c'), and finally, by evaluating it as a literal, to turn it from a length 12 string into a length 3 string.
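To make the difference between the two answers concrete, here is a minimal sketch. It assumes that every list item has the form "x" followed by two hex digits, as in the question. The names joined, prefixed and decoded are hypothetical, and the int(..., 16) route is an alternative to ast.literal_eval, not something taken from the answers above.

```python
import ast

l = ["xf3", "x03", "x8c"]

# Joining with an escaped backslash yields literal backslashes between items.
joined = "\\".join(l)
prefixed = "\\" + joined             # the 12 characters \xf3\x03\x8c

# Evaluating the quoted text as a literal turns each \xNN into one character.
evaluated = ast.literal_eval("'" + prefixed + "'")

# Alternative under the stated assumption: parse the hex digits directly.
decoded = "".join(chr(int(item[1:], 16)) for item in l)

if __name__ == "__main__":
    print(len(joined), len(prefixed), len(evaluated), len(decoded))
    print(evaluated == decoded)
```

Parsing the hex digits directly avoids evaluating a constructed literal, but it only works while the items really are two-digit hex codes.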

Regex to reject if all numbers and reject colon

I am looking for a regex to
reject if input is all numbers
accept alphanumeric
reject colon ':'
I tried
ng-pattern="/[^0-9]/" and
ng-pattern="/[^0-9] [^:]*$/"
For example,
"Block1 Grand-street USA" must be accepted
"111132322" must be rejected
"Block 1 grand : " must be rejected
You may use
ng-pattern="/^(?!\d+$)[^:]+$/"
See the regex demo.
To only forbid a : at the end of the string, use
ng-pattern="/^(?!\d+$)(?:.*[^:])?$/"
See another regex demo
The pattern matches
^ - start of string
(?!\d+$) - a negative lookahead that fails the match if the whole string consists only of 1+ digits
[^:]+ - one or more chars other than :
(?:.*[^:])? - an optional non-capturing group that matches 1 or 0 occurrences of
.* - any 0+ chars other than line break chars, as many as possible
[^:] - any char other than : (if you do not want to match an empty string, remove the (?: and )? so that the group is no longer optional)
$ - end of string.
According to comments, you want to match any character but colon.
This should do the job:
ng-pattern="/^(?!\d+$)[^:]+$/"
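The two patterns can be checked against the examples from the question with a short sketch. It uses Python's re module rather than the JavaScript engine behind ng-pattern, which is an assumption made for the sake of a runnable example. One difference is that Python's \d also matches non-ASCII digits in str patterns. The names NO_COLON, NO_TRAILING_COLON and accepts are hypothetical.

```python
import re

# Rejects all-digit input and any input containing a colon.
NO_COLON = re.compile(r"^(?!\d+$)[^:]+$")

# Rejects all-digit input and input that ends with a colon.
NO_TRAILING_COLON = re.compile(r"^(?!\d+$)(?:.*[^:])?$")


def accepts(pattern, text):
    """Return True if the pattern matches the text (hypothetical helper)."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["Block1 Grand-street USA", "111132322", "Block 1 grand : "]:
        print(repr(sample), accepts(NO_COLON, sample), accepts(NO_TRAILING_COLON, sample))
```

Note that the third sample ends with a space rather than a colon, so only the first pattern rejects it.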

Octave - Adding '\n' to String Array is Not Creating a New Line

I want to change ',' character to '\n' and save it to the text file
All files are in this format:
546,234,453,685,.....,234
I want to make it like:
546
234
453
685
...
234
My initial attempt at this problem looks like this:
fid=fopen(files{i});
strArr=fscanf(fid,'%s');
newstrArr=strrep(strArr,',','\n');
% Take each .txt input
for j=1:length(newstrArr)
Array=[Array newstrArr(j)];
endfor
Let me explain step by step:
1st I open the current text file
fid=fopen(files{i});
2nd I find the strings in text file
strArr=fscanf(fid,'%s');
Please Note that you can't replace %s with %d. (Correct me if I am wrong)
3rd I replace commas with newline character
newstrArr=strrep(strArr,',','\n');
4th I add each character to a new array with for loop
for j=1:length(newstrArr)
Array=[Array newstrArr(j)];
endfor
However, when I display it using
disp(Array);
I have this output
How can I properly replace the commas with newlines?
Regards
The issue is that you are inserting a literal '\n' (the characters \ and n) and not a newline character. This is because in Octave, a single-quote enclosed string ignores escape sequences. If you want Octave to respect escape sequences you could use a double-quoted string which will convert \n into a newline.
strrep(strArr, ',', "\n");
Or if you want your code to be MATLAB-compatible, you'll want to instead use char(10) (an actual new-line character). This is because MATLAB does not have double-quote enclosed strings.
output = strrep(strArr, ',', char(10));
Another option would be to split your input at the , and use sprintf to add the newlines (it'll treat \n as a newline)
values = strsplit(strArr, ',');
output = sprintf('%s\n', values{:});
If you just want to save each entry to a new line in a file, you can use fprintf instead.
values = strsplit(strArr, ',');
fout = fopen('output.txt', 'w');
fprintf(fout, '%s\n', values{:});
fclose(fout);
If you really just want to replace "," with newline simply do
in = fileread ("yourfile");
out = strrep (in, ",", "\n")
out = 546
234
453
685
234
Btw, see the difference between "\n" (in GNU Octave a newline) and '\n' (literally \n)
Another option is to use regexprep(), which has the advantage of being MATLAB compatible. Assuming that the newline convention you want is \n, then
regexprep('123,456,789',',','\n')
ans = 123
456
789
When output to a file via fprintf() the result looks like
123
456
789
provided the text editor understands the newline convention.

How to match all characters except right crotchet (close square bracket) with SQL's PatIndex?

In the below code example, all results should return 7.
Those with aliases beginning X however, do not.
select
--where matches
patindex('%-%' ,'111111-11') dash --not a special character, so works without escaping
,patindex('%[%' ,'111111[11') xLeftCrotchet --special character [ not escaped; works
,patindex('%[[]%','111111[11') leftCrotchetEscaped --special character [ escaped to [[]; doesn't work
,patindex('%]%' ,'111111]11') rightCrotchet --special character ] not escaped; doesn't work
,patindex('%[]]%','111111]11') xRightCrotchetEscaped --special character ] escaped to []]; also doesn't work
--where doesn't match
,patindex('%[^-]%' ,'------1--') dash --not a special character, so works without escaping
,patindex('%[^[]%' ,'[[[[[[1[[') leftCrotchet --special character [ not escaped; works
,patindex('%[^[[]]%','[[[[[[1[[') xLeftCrotchetEscaped --special character [ escaped to [[]; doesn't work
,patindex('%[^]]%' ,']]]]]]1]]') xRightCrotchet --special character ] not escaped; doesn't work
,patindex('%[^[]]]%',']]]]]]1]]') xRightCrotchetEscaped --special character ] escaped to []]; also doesn't work
In some cases it makes sense why this doesn't work; i.e. where a special character has not been correctly escaped.
However, for the left crotchet, whether it needs to be escaped or not depends on whether it follows a caret (i.e. whether we're matching on this character, or all characters but this character).
For the right crotchet, there seems to be no way to match all characters other than right crotchet; i.e. no simple way to escape this character.
NB: The post "escape square brackets in PATINDEX with SQL Server" states that square brackets don't need to be escaped, but that's not the case in (one scenario from) the above example.

Resources