Octal representation inside a string in C - c

In the given program:
int main() {
char *p = "\0777";
printf("%d %d %d\n",p[0],p[1],p[2]);
printf("--%c-- --%c-- --%c--\n",p[0],p[1],p[2]);
return 0;
}
It is showing the output as:
63 55 0
--?-- --7-- ----
I can understand that it is converting the first two characters after \0 (\077) from octal to decimal, but can anyone explain why 2 characters, and not 1 or 3 or any other number?
Please explain the logic behind this.

char *p = "\07777";
Here a string literal is assigned to a pointer to char.
"\07777"
This string literal contains an octal escape sequence, so the first three digits after the backslash form one octal number. The rules for octal escape sequences are:
You can use only the digits 0 through 7 in an octal escape sequence. Octal escape sequences can never be longer than three digits and are terminated by the first character that is not an octal digit. Although you do not need to use all three digits, you must use at least one. For example, the octal representation is \10 for the ASCII backspace character and \101 for the letter A, as given in an ASCII chart.
So your string literal is stored in memory as:
1st byte: the octal number 077, which is 63 in decimal and '?' as a character
2nd and 3rd bytes: the characters '7' and '7'
and a terminating '\0' at the end.
So the output is as expected for the 1st, 2nd and 3rd bytes of the string literal.
For more explanation, see:
http://msdn.microsoft.com/en-us/library/edsza5ck.aspx

It's just the way the language defines octal escape sequences.
An octal escape sequence, which can be part of a character constant or string literal, consists of a \ followed by exactly 1, 2, or 3 octal digits ('0' .. '7').
In "\07777", the backslash is followed by 3 octal digits (0, 7, 7), which represents a character with the value 077 in octal, or 63 in decimal. In ASCII or an ASCII-derived encoding, that happens to be a question mark '?'.
So the literal represents a string with a length of 3, consisting of '?', '7', '7'.
But there must be a typo in your question. When I run your program, the output I get is:
63 55 55
--?-- --7-- --7--
If I change the declaration of p to
char *p = "\0777";
I get the output you describe. Note that the final ---- is really two hyphens, followed by a null character, followed by two hyphens. If you're on a Unix-like system, try piping the program's output through cat -v or cat -A.
When you post code, it's very important to copy-and-paste it, not retype it.
(And you're missing the #include <stdio.h> at the top.)

Related

The usage of a backslash character(\) in an char assignment expression

char C = '\1';
int I = -3;
printf("%d", I * C);
output:
-3
Hi, I just saw this weird syntax in my practice book, but it doesn't give me much detail about what it is or how it is used. Why is there a backslash next to the 1 inside the quotation marks? Is '\1' any different from '1'? If so, why is the result of I * C the same as -3 * 1? Thank you
In the initializer of the variable C
char C = '\1';
an octal escape sequence is used. That is, the digits after the backslash are treated as the octal representation of a number.
An octal escape sequence can contain at most 3 digits, and the allowed digits are 0 through 7.
For example this declaration
char C = '\11';
initializes the variable C with the value 9.
So the expression used in the call of printf
printf("%d", I * C);
is equivalent to
printf("%d", -3 * 1);
And the output will be -3.
Instead of the octal escape sequence you could use a hexadecimal escape sequence, like
char C = '\x1';
This declaration is equivalent to the previous declaration of the variable C:
char C = '\1';
If you initialize the variable like
char C = '\x11';
then the variable C will get the value 17.
As for a declaration like this
char C = '1';
then the variable C is initialized by the value of the internal representation of the character '1'. For example if the ASCII coding is used the variable C is initialized by the value 49. If the EBCDIC coding is used then the variable C is initialized by the value 241.
The '1' is the character “1”. Most platforms nowadays use ASCII to translate characters into bytes — '1' in ASCII is an integer 49 in decimal or 0x31 in hex.
From cppreference escape sequence:
\nnn arbitrary octal value byte nnn
Octal escape sequences have a limit of three octal digits, but terminate at the first character that is not a valid octal digit if encountered sooner.
The '\1' is an integer 0x1 in hex or 1 in decimal. In ASCII, it is a SOH character — start of heading.
The:
char C = '\1';
is equivalent to:
char C = 1;
'1' is internally a byte whose value is 49 (the ASCII code of symbol 1).
'\1' is a byte whose value is 1.

How this C code working. I'm getting 56 as output for '\08' [duplicate]

Output of this code is 56.
// Output : 56
#include <stdio.h>
int main() {
char c = '\08';
printf("%d",c);
return 0;
}
As @stark commented, your constant consists of 2 bytes, '\0' and '8'. These are truncated to '8' when stored in the variable c.
In an ASCII chart you can see that the character 8 has the code 56, and you get that number because you use the %d format specifier to print.
You could have seen that yourself if you had paid attention to your compiler warnings:
main.c:4:14: warning: multi-character character constant [-Wmultichar]
4 | char c = '\08';
|
Moreover, I suggest you try printing like this:
printf("%c", c);
and then the output would be:
8
which shows you what was really stored in the variable c.
Tip: If you use char c = '\01'; or char c = '\07'; or anything in between, you will see no warning, and the corresponding number is printed, because these are valid octal digits, as @Gerhardh's answer mentions.
In C string literals and character constants, you can write arbitrary values as octal or hexadecimal escape sequences.
A backslash followed by one to three octal digits is an octal escape, while '\x' introduces a hexadecimal escape.
This means '\0' is equal to the value 0, '\011' is equal to 9, etc.
In your case an '8' follows, which is not a valid octal digit. Therefore the escape sequence stops there, and your literal is the value 0 followed by the character '8'.
Now you have a character constant with more than one character. This is a multi-character character constant, and its value is implementation-defined.
In your case the final value of the char is the value of the last character, i.e. '8', which has the ASCII value 0x38, or 56.
As you print this as a decimal number, you get 56.
It's taking the value of '8' as a char. In ASCII that's 56. Basically, a char holds only 1 character.

Understanding output of printf containing backslash (\012)

Can you please help me to understand the output of this simple code:
const char str[10] = "55\01234";
printf("%s", str);
The output is:
55
34
The character sequence \012 inside the string is interpreted as an octal escape sequence. The value 012 interpreted as octal is 10 in decimal, which is the line feed (\n) character in ASCII.
From the Wikipedia page:
An octal escape sequence consists of \ followed by one, two, or three octal digits. The octal escape sequence ends when it either contains three octal digits already, or the next character is not an octal digit.
Since your sequence contains three valid octal digits, that's how it's going to be parsed. It doesn't continue with the 3 from 34, since that would be a fourth digit and only three digits are supported.
So you could write your string as "55\n34", which is more clearly what you're seeing and which would be more portable since it's no longer hard-coding the newline but instead letting the compiler generate something suitable.
\012 is an escape sequence which represents octal code of symbol:
012 = 10 = 0xa = LINE FEED (in ASCII)
So your string looks like 55[LINE FEED]34.
LINE FEED character is interpreted as newline sequence on many platforms. That is why you see two strings on a terminal.
\012 is the newline escape sequence here, as others have already stated.
(As chux correctly commented, that may differ if ASCII isn't the character set in use. Either way, in this notation it is an octal escape sequence.)
This is what the standard specifies; C99 (ISO/IEC 9899) says in:
6.4.4.4 Character constants
[...]
3 The single-quote ', the double-quote ", the question-mark ?, the backslash \, and
arbitrary integer values are representable according to the following table of escape
sequences:
single quote '          \'
double quote "          \"
question mark ?         \?
backslash \             \\
octal character         \octal digits
hexadecimal character   \x hexadecimal digits
And the range it gets bound to:
Constraints
9 The value of an octal or hexadecimal escape sequence shall be in the range of
representable values for the type unsigned char for an integer character constant, or
the unsigned type corresponding to wchar_t for a wide character constant.

How to escape from hex to decimal

I apologise if this is an obvious question. I've been searching online for an answer to this and cannot find one. This isn't relevant to my code per se, it's a curiosity on my part.
I am looking at testing my function to read start and end bytes of a buffer.
If I declare a char array as:
char *buffer;
buffer = "\x0212\x03";
meaning STX12ETX - switching between hex and decimal.
I get the expected error:
warning: hex escape sequence out of range [enabled by default]
I can test the code using all hex values:
"\x02\x31\x32\x03"
I am wanting to know, is there a way to escape the hex value to indicate that the following is a decimal value?
Will something like this work for you?
char *buffer;
buffer = "\x02" "12" "\x03";
according to standard:
§ 5.1.1.2 6. Adjacent string literal tokens are concatenated.
§ 6.4.4.4 3. and 7. Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.
the escape characters:
\' - single quote '
\" - double quote "
\? - question mark ?
\\ - backslash \
\octal digits
\xhexadecimal digits
So the way to do it with hex escapes is to split the string and rely on the compiler's concatenation of adjacent string literals (listing them one after another).
If you want to know more about how the compiler constructs literals, look at §6.4.4.4 and §6.4.5; they describe character constants and string literals respectively.
You can write
"\b12"
to represent a decimal value, although you need a space after the hex value for it to work:
buffer = "\x02 \b12\x03";
Or just 12:
buffer = "\x02 12\x03";
Basically, you add a blank character after the hex value to mark the start of a new value rather than a continuation of the same one. Note that the space (and \b, which is the backspace escape) is itself stored in the buffer as an extra character.
No, there's no way to end a hexadecimal escape except by having an invalid (for the hex value) character, but then that character is of course interpreted in its own right.
The C11 draft says (in 6.4.4.4 14):
[...] a hexadecimal escape sequence is terminated only by a non-hexadecimal character.
Octal escapes don't have this problem, they are limited to three octal digits.
You can always use the octal format. An octal escape is at most three digits long, so writing all three digits ends it unambiguously.
So to get the character '<-' you simply type \215

Size of escaped characters in C

Why does the following program output 5?
#include <stdio.h>
int main(void)
{
char str[]="S\065AB";
printf("\n%zu", sizeof(str)); /* sizeof yields a size_t, which needs %zu */
return 0;
}
Short answer: See David Heffernan's answer.
Long answer:
§ 6.4.4.4 of the C(99) standard specifies "character constants", which (among others) include simple escape sequences (e.g. '\n', '\\'), octal escape sequences (e.g. '\0'), hexadecimal escape sequences (e.g. '\x0f'), and universal character names (e.g. '\u0112').
The backslash in your example introduces such an escape / octal / hex / universal constant. The following octal digit ([0-7]) makes it an octal constant (hex would be '\x', universal would be '\u', escape sequence would be '\['"?\abfnrtv]').
That octal constant is terminated once three octal digits are consumed, or a non-octal-digit is encountered.
I.e., '\065' is equivalent to '\x35' or (decimal) 53, which is (coincidentally) '5' on the ASCII table - a single character, anyway.
It's the size of the array which has five elements: S, \065, A, B, \0

Resources