There is a string:
"fdsfsfsfsfsdomnol$natureOrder(0123)jqnm"
I want to match the substring $natureOrder(0123). I tried something like this:
regcomp(&reg, "\$natureOrder\([0-9]{1,4}\)", cflags);
but it doesn't work. How should I write the regex pattern?
Apart from escaping the $, you need the parentheses in your regex, and those must be escaped too.
So the regular expression would be
\$natureOrder\([0-9]{1,4}\)
And in a C string literal, where \ starts an escape sequence, each backslash has to be doubled:
regcomp(&reg, "\\$natureOrder\\([0-9]{1,4}\\)", cflags);
In the regex below, \s denotes a space character. I imagine the regex parser goes through the string, sees \, and knows that the next character is special.
But this is not the case, as double escapes are required.
Why is this?
var res = new RegExp('(\\s|^)' + foo).test(moo);
Is there a concrete example of how a single escape could be mis-interpreted as something else?
You are constructing the regular expression by passing a string to the RegExp constructor.
\ is an escape character in string literals.
The \ is consumed by the string literal parsing…
const foo = "foo";
const string = '(\s|^)' + foo;
console.log(string);
… so the data you pass to the regex compiler is a plain s and not \s.
You need to escape the \ to express the \ as data instead of being an escape character itself.
Inside the code where you're creating a string, the backslash is a JavaScript escape character first, which means escape sequences like \t, \n, \", etc. are translated into the characters they stand for (tab, newline, quote, etc.), and those become part of the string. A double backslash represents a single backslash in the actual string, so if you want a backslash in the string, you escape it first.
So when you generate a string by saying var someString = '(\\s|^)', what you're really doing is creating an actual string with the value (\s|^).
The regex needs a string representation of \s, which in JavaScript can be produced using the literal "\\s".
Here's a live example to illustrate why "\s" is not enough:
alert("One backslash: \s\nDouble backslashes: \\s");
Note how an extra \ before \s changes the output.
As has been said, inside a string literal, a backslash indicates an escape sequence, rather than a literal backslash character, but the RegExp constructor often needs literal backslash characters in the string passed to it, so the code should have \\s to represent a literal backslash, in most cases.
A problem is that double-escaping metacharacters is tedious. There is one way to pass a string to new RegExp without having to double escape them: use the String.raw template tag, an ES6 feature, which allows you to write a string that will be parsed by the interpreter verbatim, without any parsing of escape sequences. For example:
console.log('\\'.length); // length 1: an escaped backslash
console.log(`\\`.length); // length 1: an escaped backslash
console.log(String.raw`\\`.length); // length 2: no escaping in String.raw!
So, if you wish to keep your code readable, and you have many backslashes, you may use String.raw to type only one backslash, when the pattern requires a backslash:
const sentence = 'foo bar baz';
const regex = new RegExp(String.raw`\bfoo\sbar\sbaz\b`);
console.log(regex.test(sentence));
But there's a better option. Generally, there's not much good reason to use new RegExp unless you need to dynamically create a regular expression from existing variables. Otherwise, you should use regex literals instead, which do not require double-escaping of metacharacters, and do not require writing out String.raw to keep the pattern readable:
const sentence = 'foo bar baz';
const regex = /\bfoo\sbar\sbaz\b/;
console.log(regex.test(sentence));
Best to only use new RegExp when the pattern must be created on-the-fly, like in the following snippet:
const sentence = 'foo bar baz';
const wordToFind = 'foo'; // from user input
const regex = new RegExp(String.raw`\b${wordToFind}\b`);
console.log(regex.test(sentence));
\ is used in Strings to escape special characters. If you want a backslash in your string (e.g. for the \ in \s) you have to escape it via a backslash. So \ becomes \\ .
EDIT: Even had to do it here, because \\ in my answer turned to \.
Can a string constant like "foo" "\x01" "bar" be written as a single string literal (while keeping the hexadecimal notation)? With "foo\x01bar" the escape sequence seems to be interpreted as \x01ba since I get the warning "hex escape sequence out of range."
"foo" "\x01" "bar" is a string literal.
The C standard states that a hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence. Without the explicit concatenation (which is the common workaround to this problem), the compiler parses \x01ba which is obviously out of range.
How about "foo\x01\142ar"? Is that cheating?
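Another solution is simply to write the escaped character in octal instead of hexadecimal:
"foo\1bar"
An octal escape takes at most three digits, so there is no ambiguity here (write \001 if the next character could itself be an octal digit).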
Is it possible to write something like this:
printf(#"
-
-
-
-
");
I can do it in C#, but not in C. It gives me an error in CodeBlocks. Is this allowed?
Error message: error: stray '#' in program.
No. That syntax doesn't exist in C.
If you want a multiple-line string, write it as multiple double-quoted strings with no other tokens in between them. They will be combined.
printf(
"some string"
"more of the string"
"even more of the string"
);
(You will, of course, need to add a \n at the end of each line if that's what you want.)
No, that's not a syntax that C understands; C doesn't have raw literals.
You can use \ as the last character to continue on the next line:
const char *str = "hello\n\
world";
Also, consecutive string literals will be concatenated. So you can do e.g.
const char *str = "Hello\n"
"world\n";
C#'s verbatim strings are not available in C. If you have characters to escape, such as " or \, escape them with \; there is no other option in this language.
If you want line breaks in the string, insert \n at the appropriate places; if you only want to spread the literal over several source lines, escape the newline with a backslash:
printf("Here's\
a multiline\
string literal");
Use line continuation, with \ at the end of each line:
printf("\
\
-\
-\
-\
-\
");
String literals in C may not contain newlines. You have two workarounds:
Use implicit string concatenation (done by the compiler).
printf("The quick brown"
" fox jumps over"
" the sleazy dog.");
Escape the newline by placing a backslash in front of it.
printf("The quick brown\
fox jumps over\
the sleazy dog.");
Personally, I prefer the first form since the second looks ugly (my opinion) and forces you to ruin your code indentation.
In either case, the string will simply not contain the newlines. So if you really meant for them to be there, you'll have to add them via \n.
I'm a bit confused about an explanation concerning macros in K&R 2nd Ed, p.90. Here is the paragraph:
Formal parameters are not replaced within quoted strings. If, however, a parameter name is preceded by a # in the replacement text, the combination will be expanded into a quoted string with the parameter replaced by the actual argument.
I'm not sure what that second sentence is saying. It goes on to explain a use for this with a "debugging print macro".
This can be combined with a string concatenation to make, for example, a debugging print macro:
#define dprint(expr) printf(#expr " = %g\n", expr);
Edit:
All the input was useful. Thank you guys.
If you define macro like this:
#define MAKE_STRING(X) #X
Then, you can do something like this:
puts(MAKE_STRING(a == b));
Which will expand into:
puts("a == b");
In the dprint() example, it is printing out a string form of the expression, as well as the expression value.
dprint(sin(x)/2);
Will expand into:
printf("sin(x)/2" " = %g\n", sin(x)/2);
String literal concatenation will treat the first parameter as a single string literal.
It is just a neat feature where you can convert a macro parameter into a string literal which mainly is useful for debugging purposes. So
dprint(x + y);
is expanded by the C preprocessor, once the adjacent string literals are joined, to this
printf("x + y = %g\n", x + y);
Notice how the value of the parameter expr appears both inside the string literal and also in the code generated by the macro. For this to happen you need to prefix expr with # to create a string literal.
One thing worth pointing out is that adjacent string literals are combined into a single string literal, e.g. "x + y" " = %g\n" are combined into "x + y = %g\n".
#expr is expanded into the argument's text as a string literal. Two string literals next to each other are automatically concatenated. We can see that invoking gcc -E for dprint(test) will give the following output:
printf("test" " = %g\n", test);
This site may help. It describes how stringification can be implemented.
In C it is not normally possible to use ' to delimit a string passed to printf. However, I have texts that are full of double quotes ("), and I need to escape all of them, as in
printf("This is \"test\" for another \"text\"");
Is it possible to call printf without escaping each "? I mean using another character to wrap the string.
Not recommended, but you can use a macro:
#include <stdio.h>
#define S(x) #x
int main() {
printf(S(This "is" a string (with nesting)!\n));
}
This prints
This "is" a string (with nesting)!
Now the delimiters are balanced () characters. However, to escape single ), ", or ' characters, you have to write something like S(right paren: ) ")" S(!\n), which is quite ugly. This technique is not recommended for writing maintainable code.
No, that is not possible in the C language. There is only one syntax for string literals, and that is that they are delimited by double quotes.
The only way to write unescaped quotation marks is as character literals inside character arrays, which is uglier and more difficult to write, so there's very little reason to do so in a case like this:
char array[] = {'T', 'h', 'i', 's', ' ', 'i', 's', ' ', '"', /* etc. */ '\0'}; // must end with '\0' to be printed with %s
printf("%s", array);
No, there is no other way; the draft C99 standard, in section 6.4.5 String literals, has the following grammar:
string-literal:
" s-char-sequenceopt "
L" s-char-sequenceopt "
No, it's not possible in standard C.
C11 6.4.5 String literals
The same considerations apply to each element of the sequence in a string literal as if it
were in an integer character constant (for a character or UTF−8 string literal) or a wide
character constant (for a wide string literal), except that the single-quote ' is representable either by itself or by the escape sequence \', but the double-quote " shall be represented by the escape sequence \".
First of all, separate a program's requirements from the solutions that meet them. Given the minimal information in this question, the requirement is to print, using C, a string that contains double quotes. There are several ways to do this in C.
For example, the following code fragment:
char string[] = "This string \" has one double quote.";
printf("This string %cprints%c with %cdouble%c quotes.\n", '"', '"', '"', '"');
printf("%s", string);
produces:
This string "prints" with "double" quotes.
This string " has one double quote.
Your application might have more requirements that you have not mentioned, but it should be possible to achieve what you want, just NOT the way you initially believe it should be done (welcome to the world of "Needs Analysis").
// R"delimiter(raw_characters)delimiter"
printf(R"(SOME/\STRING)");
A raw string terminates at the first )" it sees.
Therefore, if )" appears in the string, you have to add a delimiter ("a" is used below):
#include <cstdio>
int main() {
/* Print the dog without any escapes. */
printf(R"a(|\_/|
|q p| /}
( 0 )"""\
|"^"` |
||_/=\\__|)a");
}
It's a C++11 feature, and you can find more information in the documentation.
A similar question has been answered: escape R"()" in a raw string in C++