How to convert "\003" to int in C - c

How to convert "\003" to int in C? I have following code.
void* message="\003";
char* bptr;
printf("%ld",strtol((char*)message,&bptr,10));
It is showing 0 as output.

The string literal "\003" specifies a string consisting of two characters; the first has the value 3, and the second is the implicit '\0' null character terminator.
The character with the value 3 is typically not a printable character. In particular, it's not the digit '3'.
If you want to extract the int value 3 from that string, you can do it like this:
int result = ((char*)message)[0];
But that doesn't seem like a particularly useful thing to do.
I have to wonder (1) why you declared message as void* rather than as char*, and (2) why you're using a string containing the control character '\003' to represent the integer value 3.
The usual way to represent an integer value as a string represents the integer 3 as "3", not as '\003'. You can certainly represent (small) integer values as single bytes, but that would generally use a single char or unsigned char object, not a string or array. Finally, there are ways to represent arbitrary numbers as sequences of bytes, but if you're doing that you haven't told us the details of your representation; for example, what string would represent the integer value 1000?

Related

How do I compare single multibyte character constants cross-platform in C?

In my previous post I found a solution to do this using C++ strings, but I wonder if there would be a solution using char's in C as well.
My current solution uses str.compare() and size() of a character string as seen in my previous post.
Now, since I only use one (multibyte) character in the std::string, would it be possible to achieve the same using a char?
For example, if( str[i] == '¶' )? How do I achieve that using char's?
(edit: made a type on SO for comparison operator as pointed out in the comments)
How do I compare single multibyte character constants cross-platform in C?
You seem to mean an integer character constant expressed using a single multibyte character. The first thing to recognize, then, is that in C, integer character constants (examples: 'c', '¶') have type int, not char. The primary relevant section of C17 is paragraph 6.4.4.4/10:
An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g.,’ab’ ), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.
(Emphasis added.)
Note well that "implementation defined" implies limited portability from the get-go. Even if we rule out implementations defining perverse behavior, we still have alternatives such as
the implementation rejects integer character constants containing multibyte source characters; or
the implementation rejects integer character constants that do not map to a single-byte execution character; or
the implementation maps source multibyte characters via a bytewise identity mapping, regardless of the byte sequence's significance in the execution character set.
That is not an exhaustive list.
You can certainly compare integer character constants with each other, but if they map to multibyte execution characters then you cannot usefully compare them to individual chars.
Inasmuch as your intended application appears to be to locate individual mutlibyte characters in a C string, the most natural thing to do appears to be to implement a C analog of your C++ approach, using the standard strstr() function. Example:
char str[] = "Some string ¶ some text ¶ to see";
char char_to_compare[] = "¶";
int char_size = sizeof(char_to_compare) - 1; // don't count the string terminator
for (char *location = strstr(str, char_to_compare);
location;
location = strstr(location + char_size, char_to_compare)) {
puts("Found!");
}
That will do the right thing in many cases, but it still might be wrong for some characters in some execution character encodings, such as those encodings featuring multiple shift states.
If you want robust handling for characters outside the basic execution character set, then you would be well advised to take control of the in-memory encoding, and to perform appropriate convertions to, operations on, and conversions from that encoding. This is largely what ICU does, for example.
I believe you meant something like this:
char a = '¶';
char b = '¶';
if (a == b) /*do something*/;
The above may or may not work, if the value of '¶' is bigger than the char range, then it will overflow, causing a and b to store a different value than that of '¶'. Regardless of which value they hold, they may actually both have the same value.
Remember, the char type is simply a single-byte wide (8-bits) integer, so in order to work with multibyte characters and avoid overflow you just have to use a wider integer type (short, int, long...).
short a = '¶';
short b = '¶';
if (a == b) /*do something*/;
From personal experience, I've also noticed, that sometimes your environment may try to use a different character encoding than what you need. For example, trying to print the 'á' character will actually produce something else.
unsigned char x = 'á';
putchar(x); //actually prints character 'ß' in console.
putchar(160); //will print 'á'.
This happens because the console uses an Extended ASCII encoding, while my coding environment actually uses Unicode, parsing a value of 225 for 'á' instead of the value of 160 that I want.

directly cast a char *nib as int hex

in pure and portable c.
So I am having trouble casting what has to be a variable, from a variable. char *nib to int hex. The idea being I had a char *nib ="ab"; or "0xab" or anything that directly represents two characters as a char *. then casting it as a integer. and writing it to a file to get a one to one write. so I start with char *nib="0xab"; then I write it as a int presumably, and to a hexdump or edit and the result is just ab.
I've been able to do this as a constant directly declaring... but the nib is always static.
this has to be a one to one starting with a two char string or nib. Not converting anything, purly casting.
So can you write it directly to a file without converting it? three look up tables seems like a bit much for a value what has the name length
There is no way to cast 2 characters (2 bytes) into one byte because cast does not change binary representation of the value.
The closest you can get to casting string that looks like hex to some value that will show something similar is use 0-15 characters via escape sequence like char* nib = "\x0A\x0B" and cast ( *((short*)nib)) that to 2-byte value (0x0A0B in this case) and store that to a file (I'm not sure if there is portable integer type of 2 bytes - short often is 2 bytes wide but does not have to be 2 bytes). Unfortunately I don't think there is a portable way to store 2 byte integer value to a file as different architectures may have different byte order.
Writing string value character by character is likely safest approach. Or convert string to int a usual way and use your own read/write code for integers to ensure portability.

Using int to print character constants [duplicate]

This question already has answers here:
Multi-character constant warnings
(6 answers)
Print decimal value of a char
(5 answers)
Closed 5 years ago.
I wrote the following program,
#include<stdio.h>
int main(void)
{
int i='A';
printf("i=%c",i);
return 0;
}
and I got the result as,
i=A
So I tried another program,
#include<stdio.h>
int main(void)
{
int i='ABC';
printf("i=%c",i);
return 0;
}
According to me, since 32 bits are used to store an int value and each of 'A', 'B' and 'C' have 8 bit ASCII codes which totals to 24 bits therefore 24 bits were stored in a 32 bit unit. So I expected the output to be,
i=ABC
but the output instead was
i=C
and I can't understand why?
'ABC' in this case is a integer character constant as per section 6.4.4.4.10 of the standard.
An integer character constant has type int. The value of an integer
character constant containing a single character that maps to a
single-byte execution character is the numerical value of the
representation of the mapped character interpreted as an integer. The
value of an integer character constant containing more than one
character (e.g.,'ab'), or containing a character or escape sequence
that does not map to a single-byteexecution character, is
implementation-defined. If an integer character constant contains a
single character or escape sequence, its value is the one that results
when an object with type char whose value is that of the single
character or escape sequence is converted to type int.
In this case, 'A'==0x41, 'B'==0x42, 'C'==0x43, and your compiler then interprets i to be 0x414243. As said in the other answer, this value is implementation dependent.
When you try to access it using '%c', the overflown part will be cut and you are only left with 0x43, which is 'C'.
To get more insight to it, read the answers to this question as well.
The conversion specifier c used in this call
printf("i=%c",i);
in fact extracts one character from the integer argument. So using this specifier you in any case can not get three characters as the output.
From the C Standard (7.21.6.1 The fprintf function)
c If no l length modifier is present, the int argument is converted to
an unsigned char, and the resulting character is written
Take into account that the internal representation of a multi-byte character constant is implementation defined. From the C Standard (6.4.4.4 Character constants)
...The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape
sequence that does not map to a single-byte execution character, is
implementation-defined.
'ABC' is an integer character constant. Depending on code set (overwhelming it is ASCII), endian, int width (apparently 32 bits in OP's case), it may have the same value like below. It is implementation defined behavior.
'ABC'
0x41424300
0x434241
or others.
The "%c" directs printf() to take the int value, cast it to unsigned char and print the associated character. This is the main reason for apparent loss of information.
In OP's case, it appears that i took on the value of 0x434241.
int i='A';
printf("i=%c",i); --> 'A'
// same as
printf("i=%c",0x434241); --> 'A'
if you want i to contain 3 characters you need to init a array that contains 3 characters
char i[3];
i[0]= 'A';
i[1]= 'B';
i[2]='C';
the ' ' can contain only one char your code converts the integer i into a character or better you store in your 32 bit intiger a converted 8 bit character. But i think You want to seperate the 32 bits into 8 bit containers make a char array like char i[3]. and then you will see that
int j=i;
this will result in an error because you are unable to convert a char array into a integer.
In C, 'A' is an int constant that's guaranteed to fit into a char.
'ABC' is a multicharacter constant. It has an int type, but an implementation defined value. The behaviour on using %c to print that in printf is possibly undefined if the value cannot fit into a char.

uint64 to string in C

I have a uint64 value that I want to convert into a string because it has to be inserted as the payload of an HTTP POST request.
I've already tried many solutions (ltoa, this solution ) but my problem still remains.
My function is the following:
void check2(char* fingerprint, guint64 s_id) {
//stuff
char poststr[400] = "action=CheckFingerprint&sessionid=";
//convert s_id to, for example, char* myChar
strcat(poststr, myChar);
}
I want to convert s_id to char*. I've tried:
1) char ses[8]; ltoa(s_id,ses,10) but I have a segmentation fault;
2) char *buf; sprintf(buf, "%" PRIu64, s_id);
I'm working on a APIs, so I have seen that when this guint64 variable is printed, it has the following form:
JANUS_LOG(LOG_INFO, "Creating new session: %"SCNu64"\n", session_id);
sprintf is the right way to go with an unsigned 64 bit format specifier.
You'll need to allocate enough space for 16 hex digits and the null byte. Here I've allocated 20 bytes to accommodate a leading 0x as well and then I rounded it up to 20 for no good reason other than it feels better than 19.
char foo[20];
sprintf(foo, "0x%016" PRIx64, (uint64_t)numberToConvert);
will print the number in hex with leading 0x and leading zeros padded up to 16. You do not need the cast if numberToConvert is already a uint64_t
i have a uint64 value that i want to convert into char* because of it have to be inserted as payload of an HTTP POST request.
What you have is a fundamental misunderstanding.
To insert a text representation of your value into a document, you need to convert it to a sequence of characters, which is quite a different thing from a pointer to a character (char *). One of your options, which seems to be what you're really after, is to convert the value to a sequence of characters in the form of a C string -- that is, a null-terminated array of characters. You would then have or be able to obtain a pointer to the first character in the sequence.
That explains what's wrong with this attempted solution:
char *buf;
sprintf(buf, "%" PRIu64, s_id);
You are trying to write the string representation of your number into the array pointed-to by buf, but it doesn't point to one. Not having been initialized or assigned, its value is indeterminate.
Even if you your buf pointed to an array, it is essential that the array be long enough to accommodate all the digits of the value's decimal representation, plus a terminator. That's probably what's wrong with your other attempt:
char ses[8]; ltoa(s_id,ses,10)
An unsigned, 64-bit binary number may require up to 20 decimal digits, plus you need space for a terminator. The array you're providing is not nearly large enough, unless you can be confident that the actual values you're going to write will not exceed 9,999,999 (which is well within the range of a 32-bit integer).

C standard: L prefix and octal/hexadecimal escape sequences

I didn't find an explanation in the C standard how do aforementioned escape sequences in wide strings are processed.
For example:
wchar_t *txt1 = L"\x03A9";
wchar_t *txt2 = L"\xA9\x03";
Are these somehow processed (like prefixing each byte with \x00 byte) or stored in memory exactly the same way as they are declared here?
Also, how does L prefix operate according to the standard?
EDIT:
Let's consider txt2. How it would be stored in memory? \xA9\x00\x03\x00 or \xA9\x03 as it was written? Same goes to \x03A9. Would this be considered as a wide character or as 2 separate bytes which would be made into two wide characters?
EDIT2:
Standard says:
The hexadecimal digits that follow the backslash and the letter x in a hexadecimal escape
sequence are taken to be part of the construction of a single character for an integer
character constant or of a single wide character for a wide character constant. The
numerical value of the hexadecimal integer so formed specifies the value of the desired
character or wide character.
Now, we have a char literal:
wchar_t txt = L'\xFE\xFF';
It consists of 2 hex escape sequences, therefore it should be treated as two wide characters. If these are two wide characters they can't fit into one wchar_t space (yet it compiles in MSVC) and in my case this sequence is treated as the following:
wchar_t foo = L'\xFFFE';
which is the only hex escape sequence and therefore the only wide char.
EDIT3:
Conclusions: each oct/hex sequence is treated as a separate value ( wchar_t *txt2 = L"\xA9\x03"; consists of 3 elements). wchar_t txt = L'\xFE\xFF'; is not portable - implementation defined feature, one should use wchar_t txt = L'\xFFFE';
There's no processing. L"\x03A9" is simply an array wchar_t const[2] consisting of the two elements 0x3A9 and 0, and similarly L"\xA9\x03" is an array wchar_t const[3].
Note in particular C11 6.4.4.4/7:
Each octal or hexadecimal escape sequence is the longest sequence of characters that can
constitute the escape sequence.
And also C++11 2.14.3/4:
There is no limit to the number of digits in a hexadecimal sequence.
Note also that when you are using a hexadecimal sequence, it is your responsibility to ensure that your data type can hold the value. C11-6.4.4.4/9 actually spells this out as a requirement, whereas in C++ exceeding the type's range is merely "implementation-defined". (And a good compiler should warn you if you exceed the type's range.)
Your code doesn't make sense, though, because the left-hand sides are neither arrays nor pointers. It should be like this:
wchar_t const * p = L"\x03A9"; // pointer to the first element of a string
wchar_t arr1[] = L"\x03A9"; // an actual array
wchar_t arr2[2] = L"\x03A9"; // ditto, but explicitly typed
std::wstring s = L"\x03A9"; // C++ only
On a tangent: This question of mine elaborates a bit on string literals and escape sequences.

Resources