Why pointer not giving its Ascii Value? - c

This is my current code as folows:
#include<stdio.h>
int main() {
/* code */
char a[5] = {'a','b'};
int *p =a;
printf("%d\n", *p);
return 0;
}
When I execute my code it is showing 25185 instead of giving me an ASCII value.
Why is this happening?
Thank you

This is undefined behavior, so anything can happen. As for what you're observing in particular, here's the explanation:
If an array only has some of its values initialized at declaration, the remaining values are zero. So your array a is 'a', 'b', '\0', '\0', '\0'. When a pointer to the beginning of this array is interpreted as a 32-bit, little-endian int, this has the value 0x00006261, or 25185 in decimal.

(Disclaimer: the other answer shows you why you get 25185, this one shows how you can achieve your goal.)
If you want to output the ASCII value a[0] (which seems to be what you're trying to do by int *p=a;), tells to printf() you want to pass a byte, and use a char* (a pointer to a character, which is a byte in C) to point to it:
int main (int arg, char **argv)
{
char a[5] = {'a','b'};
char *p =a; // points to a char, ie a byte
printf("%hhx\n", *p); // tells to printf it's a byte type
return 0;
}

initializing a[5] with {'a', 'b'} is giving values to 2 bytes
ascii binary
a 97 0110 0001
b 98 0110 0010
so when you read it via integer pointer it reads 0110 0001 0110 0010 = 25185 (in decimal) and following little endianness upper 2 bytes contains 0
if you read via char pointer it reads one byte 0110 0001 = 97

Related

c and bit shifting in a char

I am new to C and having a hard time understanding why the code below prints out ffffffff when binary 1111111 should equal hex ff.
int i;
char num[8] = "11111111";
unsigned char result = 0;
for ( i = 0; i < 8; ++i )
result |= (num[i] == '1') << (7 - i);
}
printf("%X", bytedata);
You print bytedata which may be uninitialized.
Replace
printf("%X", bytedata);
with
printf("%X", result);
Your code then run's fine. code
Although it is legal in C, for good practice you should make
char num[8] = "11111111";
to
char num[9] = "11111111";
because in C the null character ('\0') always appended to the string literal. And also it would not compile as a C++ file with g++.
EDIT
To answer your question
If I use char the result is FFFFFFFF but if I use unsigned char the result is FF.
Answer:
Case 1:
In C size of char is 1byte(Most implementation). If it is unsigned we can
use 8bit and hold maximum 11111111 in binary and FF in hex(decimal 255). When you print it with printf("%X", result);, this value implicitly converted to unsigned int which becomes FF in hex.
Case 2: But when you use char(signed), then MSB bit use as sign bit, so you can use at most 7 bit for your number whose range -128 to 127 in decimal. When you assign it with FF(255 in decimal) then Integer Overflow occur which leads to Undefined behavior.

declaring string using pointer to int

I am trying to initialize a string using pointer to int
#include <stdio.h>
int main()
{
int *ptr = "AAAA";
printf("%d\n",ptr[0]);
return 0;
}
the result of this code is 1094795585
could any body explain this behavior and why the code gave this answers ?
I am trying to initialize a string using pointer to int
The string literal "AAAA" is of type char[5], that is array of five elements of type char.
When you assign:
int *ptr = "AAAA";
you actually must use explicit cast (as types don't match):
int *ptr = (int *) "AAAA";
But, still it's potentially invalid, as int and char objects may have different alignment requirements. In other words:
alignof(char) != alignof(int)
may hold. Also, in this line:
printf("%d\n", ptr[0]);
you are invoking undefined behavior (so it might print "Hello from Mars" if compiler likes so), as ptr[0] dereferences ptr, thus violating strict aliasing rule.
Note that it is valid to make transition int * ---> char * and read object as char *, but not the opposite.
the result of this code is 1094795585
The result makes sense, but for that, you need to rewrite your program in valid form. It might look as:
#include <stdio.h>
#include <string.h>
union StringInt {
char s[sizeof("AAAA")];
int n[1];
};
int main(void)
{
union StringInt si;
strcpy(si.s, "AAAA");
printf("%d\n", si.n[0]);
return 0;
}
To decipher it, you need to make some assumptions, depending on your implementation. For instance, if
int type takes four bytes (i.e. sizeof(int) == 4)
CPU has little-endian byte ordering (though it's not really matter, since every letter is the same)
default character set is ASCII (the letter 'A' is represented as 0x41, that is 65 in decimal)
implementation uses two's complement representation of signed integers
then, you may deduce, that si.n[0] holds in memory:
0x41 0x41 0x41 0x41
that is in binary:
01000001 ...
The sign (most-significant) bit is unset, hence it is just equal to:
65 * 2^24 + 65 * 2^16 + 65 * 2^8 + 65 =
65 * (2^24 + 2^16 + 2^8 + 1) = 65 * 16843009 = 1094795585
1094795585 is correct.
'A' has the ASCII value 65, i.e. 0x41 in hexadecimal.
Four of them makes 0x41414141 which is equal to 1094795585 in decimal.
You got the value 65656565 by doing 65*100^0 + 65*100^1 + 65*100^2 + 65*100^3 but that's wrong since a byte1 can contain 256 different values, not 100.
So the correct calculation would be 65*256^0 + 65*256^1 + 65*256^2 + 65*256^3, which gives 1094795585.
It's easier to think of memory in hexadecimal because one hexadecimal digit directly corresponds to half a byte1, so two hex digits is one full byte1 (cf. 0x41). Whereas in decimal, 255 fits in a single byte1, but 256 does not.
1 assuming CHAR_BIT == 8
65656565 this is a wrong representation of the value of "AAAA" you are seprately representing each character and "AAAA" is stored as array.Its converting into 1094795585 because %d identifier prints decimal value. Run this in gdb with following command:
x/8xb (pointer) //this will show you the memory hex value
x/d (pointer) //this will show you the converted decimal value
#zenith gave you the answer you expected, but your code invokes UB. Anyway, you could demonstrate the same in an almost correct way :
#include <stdio.h>
int main()
{
int i, val;
char *pt = (char *) &val; // cast a pointer to any to a pointer to char : valid
for (i=0; i<sizeof(int); i++) pt[i] = 'A'; // assigning bytes of int : UB in general case
printf("%d 0x%x\n",val, val);
return 0;
}
Assigning bytes of an int is UB in the general case because C standard says that [for] signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. And a remark adds Some combinations of padding bits might generate trap representations, for example, if one padding
bit is a parity bit.
But in common architectures, there are no padding bits and all bits values correspond to valid numbers, so the operation is valid (but implementation dependant) on all common systems. It is still implementation dependant because size of int is not fixed by standard, nor is endianness.
So : on a 32 bit system using no padding bits, above code will produce
1094795585 0x41414141
indepentantly of endianness.

unable to understand the output of union program in C

I know the basic properties of union in C but still couldn't understand the output, can somebody explain this?
#include <stdio.h>
int main()
{
union uni_t{
int i;
char ch[2];
};
union uni_t z ={512};
printf("%d%d",z.ch[0],z.ch[1]);
return 0;
}
The output when running this program is
02
union a
{
int i;
char ch[2];
}
This declares a type union a, the contents of which (i.e. the memory area of a variable of this type) could be accessed as either an integer (a.i) or a 2-element char array (a.ch).
union a z ={512};
This defines a variable z of type union a and initializes its first member (which happens to be a.i of type int) to the value of 512. (Cantfindname has the binary representation of that.)
printf( "%d%d", z.ch[0], z.ch[1] );
This takes the first character, then the second character from a.ch, and prints their numerical value. Again, Cantfindname talks about endianess and how it affects the results. Basically, you are taking apart an int byte-by-byte.
And the whole shebang is apparently assuming that sizeof( int ) == 2, which hasn't been true for desktop computers for... quite some time, so you might want to be looking at a more up-to-date tutorial. ;-)
What you get here is the result of endianess (http://en.wikipedia.org/wiki/Endianness).
512 is 0b0000 0010 0000 0000 in binary, which in little endian is stored in the memory as 0000 0000 0000 0010. Then ch[0] reads the last 8 bits (0b0000 0010 = 2 in decimal) and ch[1] reads the first 8 bits (0b0000 0000 = 0 in decimal).
Using int will not lead to this output in 32 bit machines as sizeof(int) = 4. This output will occur only if we use a 16 bit system or we use short int having memory size of 2 bytes.
A Union is a variable that may hold (at different times) objects of different types and sizes, with the compiler keeping track of size and alignment requirements.
union uni_t
{
short int i;
char ch[2];
};
This code snippet declares a union having two members- a integer and a character array.
The union can be used to hold different values at different times by simply allocating the values.
union uni_t z ={512};
This defines a variable z of type union uni_t and initializes the integer member ( i ) to the value of 512.
So the value stored in z becomes : 0b0000 0010 0000 0000
When this value is referenced using character array then ch[1] refers to first byte of data and ch[0] refers to second byte.
ch[1] = 0b00000010 = 2
ch[0] = ob00000000 = 0
So printf("%d%d",z.ch[0],z.ch[1]) results to
02

Confusion on printing strings as integers in C

I would like to know how the string is represented in integer, so I wrote the following program.
#include <stdio.h>
int main(int argc, char *argv[]){
char name[4] = {"#"};
printf("integer name %d\n", *(int*)name);
return 0;
}
The output is:
integer name 64
This is understandable because # is 64 in integer, i.e., 0x40 in hex.
Now I change the program into:
#include <stdio.h>
int main(int argc, char *argv[]){
char name[4] = {"##"};
printf("integer name %d\n", *(int*)name);
return 0;
}
The output is:
integer name 16448
I dont understand this. Since ## is 0x4040 in hex. So it should be 2^12+2^6 = 4160
If I count the '\0' at the end of the string, then it should be 2^16+2^10 = 66560
Could someone explain where 16448 comes from?
Your math is wrong: 0x4040 == 16448. The two fours are the 6th and 14th bits respectively.
Your code actually invokes undefined behavior because you must not alias a char * with an int *. This is known as the strict aliasing rule. To see just one reason why this should be disallowed, consider what would otherwise have to happen if the code is run on a little and a big endian machine.
If you want to see the hex pattern of the string, you should simply loop over its bytes and print out each byte.
void
print_string(const char * strp)
{
printf("0x");
do
printf("%02X", (unsigned char) *strp);
while (*strp++);
printf("\n");
}
Of course, instead of printing the bytes, you can shift them into an integer (that will very soon overflow) and only finally output that integer. While doing this, you'll be forced to take a stand on “your” endianness.
/* Interpreting as big endian. */
unsigned long
int_string(const char * strp)
{
unsigned long value = 0UL;
do
value = (value << 8) | (unsigned char) *strp;
while (*strp++);
return value;
}
This is how 16448 comes :
0x4040 can be written like this in binary :
4 0 4 0 -> Hex
0100 0000 0100 0000 -> Binary
2^14 2^6 = 16448
Because here 6th and 14th bit are set.
Hope you got it :)

Unsigned Char pointing to unsigned integer

I don't understand why the following code prints out 7 2 3 0 I expected it to print out 1 9 7 1. Can anyone explain why it is printing 7230?:
unsigned int e = 197127;
unsigned char *f = (char *) &e;
printf("%ld\n", sizeof(e));
printf("%d ", *f);
f++;
printf("%d ", *f);
f++;
printf("%d ", *f);
f++;
printf("%d\n", *f);
Computers work with binary, not decimal, so 197127 is stored as a binary number and not a series of single digits separately in decimal
19712710 = 0003020716 = 0011 0000 0010 0000 01112
Suppose your system uses little endian, 0x00030207 would be stored in memory as 0x07 0x02 0x03 0x00 which is printed out as (7 2 3 0) as expected when you print out each byte
Because with your method you print out the internal representation of the unsigned and not its decimal representation.
Integers or any other data are represented as bytes internally. unsigned char is just another term for "byte" in this context. If you would have represented your integer as decimal inside a string
char E[] = "197127";
and then done an anologous walk throught the bytes, you would have seen the representation of the characters as numbers.
Binary representation of "197127" is "00110000001000000111".
The bytes looks like "00000111" (is 7 decimal), "00000010" (is 2), "0011" (is 3). the rest is 0.
Why did you expect 1 9 7 1? The hex representation of 197127 is 0x00030207, so on a little-endian architecture, the first byte will be 0x07, the second 0x02, the third 0x03, and the fourth 0x00, which is exactly what you're getting.
The value of e as 197127 is not a string representation. It is stored as a 16/32 bit integer (depending on platform). So, in memory, e is allocated, say 4 bytes on the stack, and would be represented as 0x30207 (hex) at that memory location. In binary, it would look like 110000001000000111. Note that the "endian" would actually backwards. See this link account endianess. So, when you point f to &e, you are referencing the 1st byte of the numeric value, If you want to represent a number as a string, you should have
char *e = "197127"
This has to do with the way the integer is stored, more specifically byte ordering. Your system happens to have little-endian byte ordering, i.e. the first byte of a multi byte integer is least significant, while the last byte is most significant.
You can try this:
printf("%d\n", 7 + (2 << 8) + (3 << 16) + (0 << 24));
This will print 197127.
Read more about byte order endianness here.
The byte layout for the unsigned integer 197127 is [0x07, 0x02, 0x03, 0x00], and your code prints the four bytes.
If you want the decimal digits, then you need to break the number down into digits:
int digits[100];
int c = 0;
while(e > 0) { digits[c++] = e % 10; e /= 10; }
while(c > 0) { printf("%u\n", digits[--c]); }
You know the type of int often take place four bytes. That means 197127 is presented as 00000000 00000011 00000010 00000111 in memory. From the result, your memory's address are Little-Endian. Which means, the low-byte 0000111 is allocated at low address, then 00000010 and 00000011, finally 00000000. So when you output f first as int, through type cast you obtain a 7. By f++, f points to 00000010, the output is 2. The rest could be deduced by analogy.
The underlying representation of the number e is in binary and if we convert the value to hex we can see that the value would be(assuming 32 bit unsigned int):
0x00030207
so when you iterate over the contents you are reading byte by byte through the *unsigned char **. Each byte contains two 4 bit hex digits and the byte order endiannes of the number is little endian since the least significant byte(0x07) is first and so in memory the contents are like so:
0x07020300
^ ^ ^ ^- Fourth byte
| | |-Third byte
| |-Second byte
|-First byte
Note that sizeof returns size_t and the correct format specifier is %zu, otherwise you have undefined behavior.
You also need to fix this line:
unsigned char *f = (char *) &e;
to:
unsigned char *f = (unsigned char *) &e;
^^^^^^^^
Because e is an integer value (probably 4 bytes) and not a string (1 byte per character).
To have the result you expect, you should change the declaration and assignment of e for :
unsigned char *e = "197127";
unsigned char *f = e;
Or, convert the integer value to a string (using sprintf()) and have f point to that instead :
char s[1000];
sprintf(s,"%d",e);
unsigned char *f = s;
Or, use mathematical operation to get single digit from your integer and print those out.
Or, ...

Resources