Python joining a list by "\" - arrays

Let's say I have a list of elements:
l = ["xf3", "x03", "x8c"] etc.
Now I would like to join the elements of my list with a "\". I tried r"\".join(l), but it didn't work.

\ is used to escape 'special' characters, so a Python string literal cannot end with a single \ because that backslash escapes the closing quote. This holds even for raw strings, which is why r"\" fails.
You have to escape it with a second \, i.e. '\\'.join(l)

l = ["xf3", "x03", "x8c"]
'\\'.join(l)
The important part is to escape the backslash as '\\', because in Python strings:
the backslash "\" is a special character, also called the "escape"
character. It is used in representing certain whitespace characters:
"\t" is a tab, "\n" is a newline, and "\r" is a carriage return. As well, "\"
can be used to escape itself: "\\" is the literal backslash character.
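As a quick check of the point above, here is a minimal sketch (the variable names are only illustrative) showing that '\\' in source code is a single backslash character once the literal has been parsed:

```python
parts = ["xf3", "x03", "x8c"]

sep = "\\"               # two characters in the source, one character in the string
joined = sep.join(parts)

print(len(sep))          # 1
print(joined)            # xf3\x03\x8c
print(len(joined))       # 11
```

Printing the string shows the single backslashes; the repr (what the interactive prompt echoes) shows them doubled again.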

I'm assuming that what you actually want is to create a string containing those escaped characters. The easiest way I can think of is ast.literal_eval:
>>> import ast
>>> ast.literal_eval("'\\" + "\\".join(l) + "'")
'ó\x03\x8c'
This works by first creating a string of those strings joined by backslash characters (xf3\x03\x8c), surrounding those by quotes and adding the initial backslash ('\xf3\x03\x8c'), and finally, by evaluating it as a literal, to turn it from a length 12 string into a length 3 string.
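The steps described above can be traced one at a time. The sketch below uses illustrative variable names, and the last line shows an alternative that is not part of the answer: converting each hex code directly with int and chr, which avoids building a string literal at all. Both assume every element has the form "x" followed by two hex digits.

```python
import ast

parts = ["xf3", "x03", "x8c"]

joined = "\\".join(parts)            # xf3\x03\x8c    (11 characters)
escaped = "\\" + joined              # \xf3\x03\x8c   (12 characters)
literal = "'" + escaped + "'"        # the source text of a string literal
decoded = ast.literal_eval(literal)  # the escapes are now interpreted

print(len(escaped), len(decoded))    # 12 3

# Alternative (an assumption, not the answer's method): decode each code directly.
direct = "".join(chr(int(p[1:], 16)) for p in parts)
print(direct == decoded)             # True
```

The literal_eval route would break if an element contained a quote character; the direct conversion does not have that problem.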

Related

regex to limit string and trim/leading spaces

I have a regex to check the length of a string:
any character, between 2 and 40 chars.
const texte = RegExp(/^.{2,40}$/, 'g')
The problem is that if I enter two spaces, the regex matches (which is expected).
But I want my regex not to match leading and trailing spaces.
How can I do that, please?
Thanks
If you want the length limit to ignore leading/trailing spaces, try this regex:
/^\s*\b.{2,40}\b\s*$/
^ start of the string
\s* match any spaces
\b word boundary
.{2,40} any character length between 2 and 40
$ end of the string
You can use
/^\s*\S.{0,38}\S\s*$/
Details:
^ - string start
\s* - 0+ leading whitespaces
\S - a non-whitespace char
.{0,38} - zero to 38 chars other than line break chars
\S - a non-whitespace char
\s* - 0+ trailing whitespaces
$ - string end.
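To see how the pattern above behaves on edge cases, here is a small sketch in Python. The pattern is copied from the answer; the function name is hypothetical, and Python's re module is assumed to treat this particular syntax the same way as JavaScript does.

```python
import re

# Same pattern as above: optional outer whitespace around 2 to 40 characters
# that start and end with a non-whitespace character.
TRIMMED_2_TO_40 = re.compile(r"^\s*\S.{0,38}\S\s*$")

def is_valid(text: str) -> bool:
    """Return True if the trimmed text is 2 to 40 characters long."""
    return TRIMMED_2_TO_40.match(text) is not None

print(is_valid("  ab  "))     # True: the outer spaces are not counted
print(is_valid(" a "))        # False: only one non-space character
print(is_valid("  "))         # False: nothing but spaces
print(is_valid("x" * 40))     # True
print(is_valid("x" * 41))     # False
```

Note that spaces inside the text still count toward the limit; only the leading and trailing ones are ignored.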
Do you mean like this?
const texte = RegExp(/^[^ ]{2,40}$/, 'g')

Flex string recognition "unrecognised rule" error

I'm trying to create a string recognition rule to run in flex. The string can consist of escape characters (\n, \t, \r, \, ", '), symbols (-, +, *, /, :, _, $, !, #, #, &, ~, ^, (, )) and a-zA-Z0-9 characters. I have tried many variations of the code below, but I keep getting the same "unrecognised rule" error.
ESCAPECHAR [\n] | [\t] | [\r] | [\] | ['] | ["]
SYMBOLS [-+*/:_$!##&~^()]
CHARACTERS [0-9a-zA-Z]
STRING ("({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*") | ('({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*')
You would do well to read the Flex manual chapter on patterns syntax. It is not very long, and it gives a complete description of the syntax of Flex patterns.
Here are a few of the errors you have made:
Flex patterns cannot include unquoted whitespace (unless you put them inside of a subexpression marked with the x flag). So
[\n] | [\t] | [\r] | [\] | ['] | ["]
is invalid.
Also, the \ is used to indicate that:
the following letter is a code for a control character (so that \n is a newline character), or
the following punctuation symbol should not be given special significance.
So in [\], the \ indicates that the following ] should be treated as an ordinary character, instead of being the end of a character class, which means that the character class will continue up to the next ]. Space characters inside a character class are considered to be quoted, so the character class consists of the characters ], space, |, [ and '. (Flex lets you repeat characters inside a character class, so it won't complain about the fact that there are two space characters.) You probably meant [\\].
Anyway, you should write character classes in the same way you wrote the other character classes, as a series of characters or escaped codes inside [ and ]:
[\n\t\r\\ '"]
Flex lets you quote characters by surrounding them with quotation marks, so that "({ESCAPECHAR} | {SYMBOLS} | {CHARACTERS})*" is treated as a single literal string, which must be matched literally in the text. You probably intended the quotation marks to be ordinary characters, so you should have escaped them or put them into a single-character character class:
["]({ESCAPECHAR}|{SYMBOLS}|{CHARACTERS})*["]
Again, it is necessary to remove the whitespace from the pattern.
I assume that your intention was to allow "escape characters" to appear in a string only if they are actually escaped. Your {ESCAPECHAR} macro expands to a collection of actual characters, so that it includes newline, tab and carriage return characters. It also includes quote and apostrophe, which really should be reserved for terminating the string literal. Probably, what you meant was to allow escape codes if they are preceded with a \ (as with C or, as mentioned above, flex itself). In that case, what you really need to write is
ESCAPECHAR \\[ntr'"]
(That is, a \\, followed by exactly one of the characters n, t, r, ', ".) Even that is not precise, though: It does not allow the use of \\ to indicate a single \, and it forces the user to write "Don\'t just copy code." and '\"', both of which would normally be written without the backslash escapes.
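The corrected pattern can be tried out quickly outside flex. The sketch below is a Python analogue, not flex syntax: the names mirror the question's definitions, and the escape set additionally accepts \\ for a literal backslash, which is an extension beyond the ESCAPECHAR shown above.

```python
import re

# Python analogue of the flex definitions; written without any whitespace
# between alternatives, as flex requires.
ESCAPECHAR = r"\\[ntr'\"\\]"     # backslash, then n, t, r, ', " or a backslash
SYMBOLS = r"[-+*/:_$!#&~^()]"    # '-' is first and '^' is not, so both are literal
CHARACTERS = r"[0-9a-zA-Z]"
BODY = "(?:" + ESCAPECHAR + "|" + SYMBOLS + "|" + CHARACTERS + ")*"
STRING = re.compile('"' + BODY + '"' + "|" + "'" + BODY + "'")

def is_string_literal(text: str) -> bool:
    """Return True if the whole text is one quoted string literal."""
    return STRING.fullmatch(text) is not None

print(is_string_literal('"abc+123"'))       # True
print(is_string_literal(r'"tab\there"'))    # True: backslash followed by t
print(is_string_literal('"a\nb"'))          # False: a real newline character
print(is_string_literal('"a b"'))           # False: space is not in any class
```

As the answer notes, this definition still rejects an unescaped apostrophe inside a double-quoted string.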

Octave - Adding '\n' to String Array is Not Creating a New Line

I want to change ',' character to '\n' and save it to the text file
All files are in this format:
546,234,453,685,.....,234
I want to make it like:
546
234
453
685
...
234
My attempt at this problem looks like this:
fid=fopen(files{i});
strArr=fscanf(fid,'%s');
newstrArr=strrep(strArr,',','\n');
% Take each .txt input
for j=1:length(newstrArr)
Array=[Array newstrArr(j)];
endfor
Let me explain step by step:
1st I open the current text file
fid=fopen(files{i});
2nd I find the strings in text file
strArr=fscanf(fid,'%s');
Please Note that you can't replace %s with %d. (Correct me if I am wrong)
3rd I replace commas with newline character
newstrArr=strrep(strArr,',','\n');
4th I add each character to a new array with for loop
for j=1:length(newstrArr)
Array=[Array newstrArr(j)];
endfor
However, when I display it using:
disp(Array);
I have this output
How can I properly replace the commas with newlines?
Regards
The issue is that you are inserting a literal '\n' (the characters \ and n) and not a newline character. This is because in Octave, a single-quote enclosed string ignores escape sequences. If you want Octave to respect escape sequences you could use a double-quoted string which will convert \n into a newline.
strrep(strArr, ',', "\n");
Or if you want your code to be MATLAB-compatible, you'll want to instead use char(10) (an actual new-line character). This is because MATLAB does not have double-quote enclosed strings.
output = strrep(strArr, ',', char(10));
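The same distinction exists in other languages, which may make it easier to see. The sketch below is Python, used here purely as an analogy: a raw string plays the role of Octave's single-quoted string, and a normal string the role of the double-quoted one.

```python
data = "546,234,453"

# Raw string: the replacement is a backslash followed by the letter n.
literal = data.replace(",", r"\n")

# Normal string: the escape is processed, so the replacement is one newline.
newline = data.replace(",", "\n")

print(literal)    # 546\n234\n453
print(newline)    # three lines: 546, 234, 453

# Character code 10 is the newline, as in char(10) above.
print(chr(10) == "\n")    # True
```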
Another option would be to split your input at the , and use sprintf to add the newlines (it'll treat \n as a newline)
values = strsplit(strArr, ',');
output = sprintf('%s\n', values{:});
If you just want to save each entry to a new line in a file, you can use fprintf instead.
values = strsplit(strArr, ',');
fout = fopen('output.txt', 'w');
fprintf(fout, '%s\n', values{:});
fclose(fout);
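For comparison, the split-then-write approach above can be sketched in Python as follows. The function names and the file paths passed to them are hypothetical; the logic is the same: split on commas and write each value followed by a newline.

```python
from pathlib import Path

def commas_to_lines(text: str) -> str:
    """Turn '546,234,453' into one value per line, each ending in a newline."""
    values = [v.strip() for v in text.split(",") if v.strip()]
    return "".join(f"{v}\n" for v in values)

def convert_file(src: Path, dst: Path) -> None:
    """Read a comma-separated file and write it back out one value per line."""
    dst.write_text(commas_to_lines(src.read_text()))

print(commas_to_lines("546,234,453,685"), end="")
```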
If you really just want to replace "," with newline simply do
in = fileread ("yourfile");
out = strrep (in, ",", "\n")
out = 546
234
453
685
234
Btw, see the difference between "\n" (in GNU Octave a newline) and '\n' (literally \n)
Another option is to use regexprep(), this has the advantage of being MATLAB compatible. Assuming that the newline convention you want is \n, then
regexprep('123,456,789',',','\n')
ans = 123
456
789
When output to a file via fprintf() the result looks like
123
456
789
provided the text editor understands the newline convention.

How to match all characters except right crotchet (close square bracket) with SQL's PatIndex?

In the below code example, all results should return 7.
Those with aliases beginning X however, do not.
select
--where matches
patindex('%-%' ,'111111-11') dash --not a special character, so works without escaping
,patindex('%[%' ,'111111[11') xLeftCrotchet --special character [ not escaped; doesn't work
,patindex('%[[]%','111111[11') leftCrotchetEscaped --special character [ escaped to [[]; works
,patindex('%]%' ,'111111]11') rightCrotchet --special character ] not escaped; works
,patindex('%[]]%','111111]11') xRightCrotchetEscaped --special character ] escaped to []]; also doesn't work
--where doesn't match
,patindex('%[^-]%' ,'------1--') dash --not a special character, so works without escaping
,patindex('%[^[]%' ,'[[[[[[1[[') leftCrotchet --special character [ not escaped; works
,patindex('%[^[[]]%','[[[[[[1[[') xLeftCrotchetEscaped --special character [ escaped to [[]; doesn't work
,patindex('%[^]]%' ,']]]]]]1]]') xRightCrotchet --special character ] not escaped; doesn't work
,patindex('%[^[]]]%',']]]]]]1]]') xRightCrotchetEscaped --special character ] escaped to []]; also doesn't work
In some cases it makes sense why this doesn't work; i.e. where a special character has not been correctly escaped.
However, for the left crotchet, whether it needs to be escaped or not depends on whether it follows a caret (i.e. whether we're matching on this character, or all characters but this character).
For the right crotchet, there seems to be no way to match all characters other than right crotchet; i.e. no simple way to escape this character.
NB: The post "escape square brackets in PATINDEX with SQL Server" states that square brackets don't need to be escaped, but that's not the case in (one scenario from) the above example.

How do I set word delimiters?

The User's Guide, chapter 6.1.5 "The Word Chunk", says: "A word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes." Is it possible to have additional word delimiters?
I have the following code snippet taken from the User's Guide chapter 6.5.1 'When to use arrays', p. 184
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "sample text"
add 1 to tWordCount[tWord]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
It counts the number of occurrences of each word form in the field "Sample text".
I realize that with the default setting, full stops after words are counted as part of the word.
How do I change the settings so that a full stop (and/or a comma) is considered a word boundary?
Alternatively you could simply remove the offending characters before processing.
This can be done using either the replace command or the replaceText function.
The replaceText function can use a regular expression as its match string but is slower than the replace command, so here I am using the replace command.
on mouseUp
put field "sample" into twords
--remove all trailing punctuation and quotes
replace "." with "" in twords
replace "," with "" in twords
replace "?" with "" in twords
replace ";" with "" in twords
replace ":" with "" in twords
replace quote with "" in twords
--hyphenated words need to be separated?
replace "-" with " " in twords
repeat for each word tword in twords
add 1 to twordcount[tword]
end repeat
combine twordcount using return and comma
answer twordcount
end mouseUp
I think you are asking a question about delimiters. Some delimiters are built-in:
spaces for words,
commas for items,
return (CR) for lines.
The ability to create your own custom delimiter property (the itemDelimiter) is a powerful feature of the language, and pertains to "items". You can set this to any single character:
set the itemDelimiter to "C"
answer the number of items in "XXCXXCXX" --call this string "theText"
The result will be "3"
As others have pointed out, the method of replacing one string for another allows formidable control over custom parsing of text:
replace "C" with space in theText
yields "XX XX XX"
Craig Newman
As the User's Guide says in chapter 6.1.5 "The Word Chunk": "A word is a string of characters delimited by space, tab, or return characters or enclosed by double quotes."
There is an itemDelimiter but no wordDelimiter.
So punctuation has to be removed first, before adding the word to the word count array.
This may be done with a function effectiveWord.
function effectiveWord aWord
put last char of aWord into it
if it is "." then delete last char of aWord
if it is "," then delete last char of aWord
if it is ":" then delete last char of aWord
if it is ";" then delete last char of aWord
return aWord
end effectiveWord
on mouseUp
--cycle through each word adding each instance to an array
repeat for each word tWord in field "Sample text"
add 1 to tWordCount[effectiveWord(tWord)]
end repeat
-- combine the array into text
combine tWordCount using return and comma
answer tWordCount
end mouseUp
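For readers more familiar with other languages, the same idea (strip punctuation from each word, then count) can be sketched in Python. This is only an analogue of the handler above, not LiveCode; the function name, the punctuation set and the sample sentence are illustrative assumptions. Unlike effectiveWord, it strips punctuation from both ends of a word.

```python
from collections import Counter

# Characters removed from the start and end of each word (an assumed set).
PUNCTUATION = ".,:;?!\"'"

def count_words(text: str) -> Counter:
    """Count whitespace-delimited words after stripping outer punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.split())
    return Counter(w for w in words if w)

counts = count_words("The cat sat. The cat, again: sat;")

# One "word,count" pair per line, like combine ... using return and comma.
for word, n in sorted(counts.items()):
    print(f"{word},{n}")
```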
