Make R use C notation when escaping terminals

Not sure I am using the right terminology here, but I need the print or deparse methods to use C notation (e.g. "\x05" instead of "\005") when escaping bytes outside the regular character set.
x <- "This is a \x05 symbol"
print(x)
[1] "This is a \005 symbol"
Is there a native way to accomplish this?
I need this for generating BSON: http://bsonspec.org/#/specification. All of the examples explicitly use \x05 notation.

Hacking into the internals of print seems a bad idea. Instead I think you should do the string escaping yourself, and eventually use cat to print the string without any extra escaping.
You can use encodeString to do the initial escaping, gregexpr to identify octal \0.. escapes, strtoi to convert strings representing octal numbers to those numbers, sprintf to print numbers in hexadecimal, and regmatches to operate on the matched parts. The whole process would look something like this:
inputString <- "This is a \005 symbol. \x13 is \\x13."
x <- encodeString(inputString)
m <- gregexpr("\\\\[0-3][0-7][0-7]", x)
charcodes <- strtoi(substring(regmatches(x, m)[[1]], 2, 4), 8)
regmatches(x, m) <- list(sprintf("\\x%02x", charcodes))
cat(x, "\n")
Note that this approach will convert octal escapes like \005 to hexadecimal escapes like \x05, but other escape sequences like \t or \a won't be affected by this. You might need more code to deal with those as well, but the above should contain all the ingredients you need.
Note that the BSON specification you refer to almost certainly meant raw bytes, so as long as your string contains a character with code 5, which you can write as "\x05" in your input, and you write that string to the desired output in binary mode, it shouldn't matter at all how R prints that string to you. After all, octal \005 and hexadecimal \x05 are just two representations of the same byte you'll write.
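For comparison, here is a minimal C sketch of the same idea: the byte is identical whether it is written in octal or hex, and only the printed representation differs. The helper name escape_hex is hypothetical, and it is assumed that only non-printable bytes need rewriting (backslashes and quotes are copied unchanged).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, writing every non-printable
   byte as a C-style \xNN escape. Returns the number of characters written
   (excluding the terminating NUL), or -1 if dst is too small. */
static int escape_hex(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    for (; *src != '\0'; ++src) {
        unsigned char byte = (unsigned char)*src;

        if (isprint(byte)) {
            if (used + 1 >= dst_size)   /* need room for the byte and a NUL */
                return -1;
            dst[used++] = (char)byte;
        } else {
            if (used + 4 >= dst_size)   /* need room for \xNN and a NUL */
                return -1;
            snprintf(dst + used, 5, "\\x%02x", byte);
            used += 4;
        }
    }
    if (used >= dst_size)
        return -1;
    dst[used] = '\0';
    return (int)used;
}
```

Called on "This is a \005 symbol", this should yield the text This is a \x05 symbol, while the input itself still contains the single byte 5.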

Does cat suit your needs? Note, you have to escape the backslash:
> x <- "This is a \\x05 symbol\n"
> cat(x)
This is a \x05 symbol

Related

Separating hexadecimal escape sequences in strings

Can a string constant like "foo" "\x01" "bar" be written as a single string literal (while keeping the hexadecimal notation)? With "foo\x01bar" the escape sequence seems to be interpreted as \x01ba since I get the warning "hex escape sequence out of range."
"foo" "\x01" "bar" is a string literal.
The C standard states that a hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence. Without the explicit concatenation (which is the common workaround to this problem), the compiler parses \x01ba which is obviously out of range.
How about "foo\x01\142ar"? Is that cheating?
Another solution is to simply write the escaped character in octal, instead of hexadecimal
"foo\1bar"
and no more ambiguity...
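A small sketch of the three workarounds mentioned above, assuming an ASCII execution character set (so that octal 142 is 'b'). The array and function names are made up for illustration.

```c
#include <string.h>

/* Three spellings of the same 7-character string: f o o 0x01 b a r. */
static const char by_concat[]  = "foo" "\x01" "bar"; /* adjacent literals are joined */
static const char by_octal_b[] = "foo\x01\142ar";    /* \142 is octal for 'b' in ASCII */
static const char by_octal[]   = "foo\1bar";         /* 'b' is not an octal digit */

/* Returns 1 when all three arrays hold identical bytes, 0 otherwise. */
static int all_spellings_match(void)
{
    return sizeof by_concat == sizeof by_octal_b
        && sizeof by_concat == sizeof by_octal
        && memcmp(by_concat, by_octal_b, sizeof by_concat) == 0
        && memcmp(by_concat, by_octal, sizeof by_concat) == 0;
}
```

The hex escape ends at the closing quote or at the next backslash, and the octal escape ends at the first character that is not an octal digit, so none of the three forms swallows the letters that follow.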

Bash builtin (not SED!) search and replace using octal values

I'm having problems getting my code to work:
for (( c=1; c<=$DirsArrCnt; c=c+$OneDirArrCnt )); do
# Replace every occurrence of "/" (ASCII d47 o057) in path with "^A" (ASCII 1)
Hold="${DirsArr[$c]}"
DirsArr[c]="${Hold//\057/\001}"
done
Originally I skipped the Hold variable and used the array element directly but took that out thinking it was the problem.
Am I specifying the octal value correctly? I believe 57 is the octal value for "/" right?
I think this is what you want:
DirsArr[c]="${Hold//$'\057'/$'\001'}"
The syntax you use interprets \0 as a literal 0 (i.e. does nothing different compared to not using the backslash). You need the C-style string to have your numeric code interpreted by the shell.

Unknown escape sequence

I am trying to printf a string that shows a temperature table
printf("TABLE 24A (20\°C)");
The degree sign is a constant I have defined as 0xDF, so the string looks like this: "TABLE 24A (20\xDF C)"
This works but looks incorrect because of the space between the \xDF and the C.
If I remove the space the compiler issues a warning hex escape sequence out of range.
If I modify the string to "TABLE 24A (20\xDF\C)" I get the correct result but the compiler issues warning unknown escape sequence: '\C'
Is there a way to get rid of the warnings but lose the space between the two characters?
You can take advantage of the fact that consecutive string literals are automatically concatenated:
printf("**TABLE 24A (20\xDF" "C)**");
This prevents the parser from consuming more characters for the escape sequence than you want.
You could also pass in the character as a parameter and use the %c format specifier to print it:
printf("**TABLE 24A (20%cC)**", '\xDF');
\x escape sequences consume as many adjacent hex digits as possible. The C is being parsed as a hex digit.
With \x, you could combine two adjacent string literals.
printf("**TABLE 24A (20\xDF""C)**");
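Or use a \unnnn universal character name, which is limited to four hex digits. Note that this names Unicode code point U+00DF, so the bytes that end up in the string depend on the execution character set.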
printf("**TABLE 24A (20\u00DFC)**");
Or octal \nnn:
printf("**TABLE 24A (20\337C)**");
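As a rough check that the %c, split-literal and octal variants produce the same bytes, here is a minimal sketch. The helper format_title is hypothetical, and it formats into a buffer with snprintf instead of printing so the result can be compared.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the table title into buf using the %c
   approach, with the degree-sign code passed in as a parameter.
   Returns the value reported by snprintf. */
static int format_title(char *buf, size_t size, unsigned char degree)
{
    return snprintf(buf, size, "TABLE 24A (20%cC)", degree);
}

/* Returns 1 if the %c result equals both literal spellings, 0 otherwise. */
static int title_variants_match(void)
{
    char buf[32];

    format_title(buf, sizeof buf, 0xDF);
    return strcmp(buf, "TABLE 24A (20\xDF" "C)") == 0  /* split literal */
        && strcmp(buf, "TABLE 24A (20\337C)") == 0;    /* octal 337 == 0xDF */
}
```

Passing the code as a parameter also keeps the display-specific value 0xDF out of the format string, which may be convenient if the constant is defined elsewhere.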

Hex value in C string

I need to prepare a constant array of ANSI C strings that contain bytes in the range 0x01 to 0x1a. I made a custom codepage, so those values represent different characters (e.g. 0x09 represents Š). I'd like to initialise the array this way:
static const char* brands[] = {
"Škoda",
//etc...
};
How can I put 0x09 instead of Š in "Škoda"?
Recommend not using "\x09koda", use octal or spaced strings.
The problem is that hexadecimal escape sequences are not limited in length. So if the next character is a hexadecimal digit, problems occur. Use octal, which is limited to 3 digits. Or use separated strings. The compiler will concatenate them, but the escape sequence will not accidentally run too far.
// problematic
"\x09Czech"
 ^^^^^--- The escape sequence is \x09C, but \x09 was hoped for
// recommend octal
"\0111234"
 ^^^^--- The escape sequence is \011
// recommend spaced strings
"\x09" "Czech"
Very simple
"Škoda" -> "\x09koda"
How can I put 0x09 instead of Š in "Škoda"?
"\x09koda"
Have a look at escape sequences, e.g. \x09.
For a hex escape you want to use \xnn, for octal just \nnn, and for Unicode \unnnn (or \Unnnnnnnn).
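Putting the recommendations together, the array from the question might be initialised as in the sketch below. The entries other than the first two are only there to illustrate where each escape stops, and brand_length is a hypothetical helper.

```c
#include <string.h>

/* 0x09 stands for the custom-codepage character from the question. */
static const char *brands[] = {
    "\011koda",      /* octal: at most three digits are consumed */
    "\x09" "koda",   /* hex, cut off by ending the literal */
    "\x09" "Czech",  /* unsplit, \x09C would be parsed as one escape */
    "\0111234",      /* octal \011 followed by the characters 1234 */
};

/* Returns the length in bytes of the entry at index i (no bounds check). */
static size_t brand_length(size_t i)
{
    return strlen(brands[i]);
}
```

Here "\x09koda" would also work, because k is not a hex digit, but the split or octal forms stay correct if the following letter ever changes to one of a to f.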

how to include hex value in string using sprintf

I want to include the value of i in hex format in a C string.
for(i=0;i<10;i++)
sprintf(s1, "DTLK\x%x\xFF\xFF\xFF\xFF\xFF\xFF",i);
but the above code produces an error: \x used with no following hex digits
Could anyone please suggest a proper way?
Supposing you don't want to literally have \x00..\x0A, but the corresponding byte, you need
sprintf(s1, "DTLK%c\xFF\xFF\xFF\xFF\xFF\xFF",i);
while inserting \x%x would be at the wrong abstraction level...
If, OTOH, you really want to literally have the hex characters instead of the bytes those hex characters represent, the other answers might be more helpful.
You need to escape the backslash in front of the x:
sprintf(s1, "DTLK\\x%x\xFF\xFF\xFF\xFF\xFF\xFF", i);
//               ^------- Here
Depending on what output you would like to achieve, you may need to escape the remaining backslashes as well.
Currently, the snippet produces a sequence of six characters with the code 0xFF. If this is what you want, your code fragment is complete. If you would like to see a sequence of \xFF literals, i.e. a string that looks like \x5\xFF\xFF\xFF\xFF\xFF\xFF when i == 5, you need to escape all backslashes in the string:
sprintf(s1, "DTLK\\x%x\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF", i);
//               ^    ^    ^    ^    ^    ^    ^
Finally, if you would like the value formatted as a two-digit hex code even when the value is less than sixteen, use %02x format code to tell sprintf that you want a leading zero.
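The two readings of the question (raw bytes versus readable escape text) can be sketched side by side. The function names are hypothetical, and snprintf is used in place of sprintf so the buffer size is respected.

```c
#include <stdio.h>

/* Writes "DTLK", the raw byte with value i, then six 0xFF bytes.
   Returns the value reported by snprintf (11 when the buffer is large enough). */
static int format_raw(char *buf, size_t size, int i)
{
    return snprintf(buf, size, "DTLK%c\xFF\xFF\xFF\xFF\xFF\xFF", i);
}

/* Writes the same data as readable text, e.g. DTLK\x05\xFF\xFF...
   Every backslash is doubled so it reaches the output literally,
   and %02x pads the value to two hex digits. */
static int format_text(char *buf, size_t size, int i)
{
    return snprintf(buf, size,
                    "DTLK\\x%02x\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF", i);
}
```

One thing to keep in mind with the raw variant: for i == 0 the inserted byte is a NUL, so string functions such as strlen will stop after "DTLK" even though the remaining bytes were written.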
\x expects a hex value like \xC9.
If you want to include \x in your output, you need to escape \ with \\:
sprintf(s1, "DTLK\\x%x\xFF\xFF\xFF\xFF\xFF\xFF", i);
