pointer offset doesn't work in memset? - c

Plain C, on Windows 7 & HP machine.
int main(void) {
unsigned int a = 4294967295;
unsigned int *b = &a;
printf("before val: '%u'\n", *b); // expect 4294967295, got 4294967295
memset(b+2, 0, 1);
printf("after val: '%u'\n", *b);
// little endian 4th 3rd 2nd 1st
// expect 4278255615 - 11111111 00000000 11111111 11111111
// got 4294967295 - 11111111 11111111 11111111 11111111
return 0;
}
I want to set the third byte of the integer to 0x0, but it remains the same. Any ideas? Thank you.
On my machine, int is 32 bits.

Pointer addition and subtraction do not move by one byte; they move by the size of the type being pointed to.
That is to say (assuming 4-byte integers),
int *p = (int *)0x00004;
int *q = p+1;
assert(q == (int *)0x00008);
Basically, it's the same as if you used the index operator:
int *q = &p[1];
If you want to advance a pointer by one byte, cast it to an unsigned char *. The way you did it, you were overwriting memory that was not part of the variable a, possibly overwriting data belonging to something else.

The b+2 in fact means a displacement of two ints, not two bytes.
unsigned int *b = &a;
memset(b+2, 0, 1);
In fact you want to modify the third byte
unsigned int *b = &a;
memset( ((char*)b)+2, 0, 1);

int is only guaranteed to be at least 16 bits wide, so if you want to set the third byte you should use an unsigned long (or uint32_t). Anyway, I'd do it using the bitwise AND operator, which does not depend on byte order:
unsigned long a = 4294967295UL;
/* regular code */
a &= 0xFF00FFFFUL; /* clears bits 16-23, giving 4278255615 */

Related

Char pointer to integer array

int main()
{
int x[] = {1, 2, 3};
char *p = &x;
printf("%d", *(p+1));
return 0;
}
I ran the code in Code::Blocks and it gives 0 as output.
If I change p to an int pointer then it gives 2 as output.
int main()
{
int x[] = {1, 2, 3};
int *p = &x;
printf("%d", *(p+1));
return 0;
}
Why so?
When p is declared as a pointer to char, it points at data 1 byte in size, so (p + 1) advances p by 1 byte.
Since an int is typically 4 bytes long, (p + 1) points at the second byte of x[0], which holds the value 1. With either byte order that is one of the zero bytes, so you get 0.
If you wanted identical output (on a little-endian machine), you would do something like this:
printf("%d\n", *(p + sizeof(int)));
But it's best to avoid such code and compile with the -Wall flag, which should produce a warning for the char *p = &x; initialization.
Assume int is 16 bits wide. 2 in binary is 00000000 00000010.
char is 8 bits wide.
Little and big endian are two ways of storing multibyte data-types.
Consider the following code:
int i = 2;
char *c = (char *)&i;
if ((*c)==2)
printf("Little endian");
else //if *c is 0
printf("Big endian");
From this code you can conclude that big endian stores 2 as 00000000 00000010, while little endian stores it as 00000010 00000000. A zero read through the char pointer would mean the first 8 bits are zero, so the system is big endian. Had it been little endian, the answer would be 2, as a char pointer refers to 8 bits only.
Declaring the type of a pointer specifies how many bits it refers to and how many bits it jumps when incremented.
In this example, as p is a char pointer, *(p+1) will refer to 00000010 in big endian and 00000000 in little endian.
Your compiler may be using 32 bits for an integer, so I think in both cases *(p+1) will give 0 (as 2 => 00000000 00000000 00000000 00000010, and the 2nd byte from either side is 0).
Refer to this:
#include <stdio.h>
int main()
{
int x[] = {1, 2, 3};
char *p = (char *)x;
printf("%d\n", *p);
printf("%d\n", *(p+1));
printf("%d\n", *(p+2));
printf("%d\n", *(p+3));
printf("%d\n", *(p+4));
printf("%d\n", *(p+5));
printf("%d\n", *(p+6));
printf("%d\n", *(p+7));
printf("%d\n", *(p+8));
return 0;
}
Output (on a little-endian machine with 4-byte int):
1
0
0
0
2
0
0
0
3
To look at it from a slightly different angle, about the binary + operator, chapter 6.5.6, paragraph 8 of the C99 standard says:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist.
So, in your first case, p is of type char * and (p + 1) gives a pointer advanced by sizeof(char) [which is always 1 byte], so it points to the 2nd byte of the memory p refers to. Since that memory actually holds an array of int [let's say 4 bytes each, on a 32-bit system], and x[0] holds 1 [stored as 00000000 00000000 00000000 00000001], *(p+1) prints 0.
OTOH, in your second case, p is of type int * and (p + 1) gives a pointer advanced by sizeof(int), so it points to the 2nd element of the int array. Since the array is int x[] = {1, 2, 3};, *(p+1) prints 2.
When you increment a pointer, it moves by the size of the thing it is pointing to.
Let's say you have 16 bit integers. In binary, the number one is: 0000 0000 0000 0001
A char pointer can only point to 8 bits at a time: 0000 0000

Unsigned Char pointing to unsigned integer

I don't understand why the following code prints out 7 2 3 0 I expected it to print out 1 9 7 1. Can anyone explain why it is printing 7230?:
unsigned int e = 197127;
unsigned char *f = (char *) &e;
printf("%ld\n", sizeof(e));
printf("%d ", *f);
f++;
printf("%d ", *f);
f++;
printf("%d ", *f);
f++;
printf("%d\n", *f);
Computers work with binary, not decimal, so 197127 is stored as a binary number and not a series of single digits separately in decimal
197127 (decimal) = 00030207 (hex) = 0011 0000 0010 0000 0111 (binary)
Suppose your system uses little endian, 0x00030207 would be stored in memory as 0x07 0x02 0x03 0x00 which is printed out as (7 2 3 0) as expected when you print out each byte
Because with your method you print the internal representation of the unsigned value and not its decimal representation.
Integers, like any other data, are represented as bytes internally. unsigned char is just another term for "byte" in this context. If you had represented your integer as decimal digits inside a string
char E[] = "197127";
and then done an analogous walk through the bytes, you would have seen the character codes of the digits as numbers.
The binary representation of 197127 is 00110000001000000111.
The bytes look like 00000111 (7 decimal), 00000010 (2), and 00000011 (3). The rest is 0.
Why did you expect 1 9 7 1? The hex representation of 197127 is 0x00030207, so on a little-endian architecture, the first byte will be 0x07, the second 0x02, the third 0x03, and the fourth 0x00, which is exactly what you're getting.
The value of e, 197127, is not a string representation. It is stored as a binary integer (typically 32 bits, depending on platform). So, in memory, e occupies, say, 4 bytes on the stack, holding the value 0x30207 (hex). In binary, that is 110000001000000111. Note that on a little-endian machine the bytes are laid out in reverse order. So, when you point f to &e, you are referencing the 1st byte of the numeric value. If you want to represent a number as a string, you should have
char *e = "197127";
This has to do with the way the integer is stored, more specifically byte ordering. Your system happens to have little-endian byte ordering, i.e. the first byte of a multi byte integer is least significant, while the last byte is most significant.
You can try this:
printf("%d\n", 7 + (2 << 8) + (3 << 16) + (0 << 24));
This will print 197127.
Read more about byte order endianness here.
The byte layout for the unsigned integer 197127 is [0x07, 0x02, 0x03, 0x00], and your code prints the four bytes.
If you want the decimal digits, then you need to break the number down into digits:
int digits[100];
int c = 0;
while(e > 0) { digits[c++] = e % 10; e /= 10; } /* least significant digit first */
while(c > 0) { printf("%d\n", digits[--c]); } /* print in reverse, most significant first */
An int often occupies four bytes. That means 197127 is represented as 00000000 00000011 00000010 00000111. From the result, your machine is little-endian, which means the low byte 00000111 is stored at the lowest address, then 00000010 and 00000011, and finally 00000000. So when you first print *f as an int, you get 7. After f++, f points to 00000010 and the output is 2. The rest follows by analogy.
The underlying representation of the number e is binary, and if we convert the value to hex we can see that it would be (assuming a 32-bit unsigned int):
0x00030207
so when you iterate over the contents you are reading byte by byte through the unsigned char *. Each byte contains two 4-bit hex digits, and the byte order (endianness) of the number is little endian since the least significant byte (0x07) is first, so in memory the contents are laid out like so:
0x07020300
  ^ ^ ^ ^- Fourth byte
  | | |-Third byte
  | |-Second byte
  |-First byte
Note that sizeof returns size_t and the correct format specifier is %zu, otherwise you have undefined behavior.
You also need to fix this line:
unsigned char *f = (char *) &e;
to:
unsigned char *f = (unsigned char *) &e;
                    ^^^^^^^^
Because e is an integer value (probably 4 bytes) and not a string (1 byte per character).
To get the result you expect, you should change the declaration and assignment of e to:
const char *e = "197127";
const unsigned char *f = (const unsigned char *)e;
Or, convert the integer value to a string (using sprintf()) and have f point to that instead:
char s[1000];
sprintf(s, "%u", e);
unsigned char *f = (unsigned char *)s;
Or, use mathematical operation to get single digit from your integer and print those out.
Or, ...

how to create pointer to a bit in c-language

As we know, in the C language a char pointer traverses memory byte by byte, i.e. 1 byte each time,
and an integer pointer 4 bytes each time (with the gcc compiler) or 2 bytes each time (with the TC compiler).
for example:
char *cptr; // if this points to 0x100
cptr++; // now it points to 0x101
int *iptr; // if this points to 0x100
iptr++; // now it points to 0x104
My question is:
How can I create a bit pointer in C which, on incrementing, traverses memory bit by bit?
The char is the 'smallest addressable unit' in C. You can't point directly at something smaller than that (such as a bit).
You can't. Using pointers, it's not possible to manipulate bits directly. (Do you really expect poor hypothetical bit *p = 1; p++ to return 1.125?)
However, you can use bitwise operators, such as <<, >>, | and & to access a specific bit within a byte.
Conceptually, a "bit pointer" is not a single scalar, but an ordered pair consisting of a byte pointer and a bit index within that byte. You can represent this with a structure containing both, or with two separate objects. Performing arithmetic on them requires some modular reduction on your part; for example, if you want to access the bit 10 bits past a given bit, you have to add 10 to the bit index, then reduce it modulo 8, and increment the byte pointer part appropriately.
Incidentally, on historical systems that only had word-addressable memory, not byte-addressable, char * consisted of a word pointer and a byte index within the word. This is the exact same concept. The difference is that, while C provides char * even on machines without byte-addressable memory, it does not provide any built-in "bit pointer" type. You have to create it yourself if you want it.
No, but you can write a function to read the bits one by one:
int readBit(const unsigned char *byteData, int bitOffset)
{
const int wholeBytes = bitOffset / 8; /* which byte holds the bit */
const int remainingBits = bitOffset % 8; /* position inside that byte */
return (byteData[wholeBytes] >> remainingBits) & 1;
//or if you want most significant bit to be 0
//return (byteData[wholeBytes] >> (7-remainingBits)) & 1;
}
Usage:
unsigned char *data = /* any memory you like */;
int bitPointer=0;
int bit0 = readBit(data, bitPointer);
bitPointer++;
int bit1 = readBit(data, bitPointer);
bitPointer++;
int bit2 = readBit(data, bitPointer);
Of course if this kind of function had general value it would probably already exist. Operating bit-by-bit is just so inefficient compared to using bit masks, and shifts etc.
I don't think that is possible, since modern computers are byte addressable, which means that there is one address for each byte. So a bit has no address, and as such a pointer can't point to it. You could use a char * and bitwise operations to determine the value of individual bits.
If you really want it you could write a class that uses a char* to keep track of the address in memory, a char(or short/int however the value would never need to be higher than 0000 0111 so a char would reduce the memory footprint) to keep track of which bit in that byte you are at and then overload the operators so that it functions as you want it to.
I am not sure what you are asking is possible. You need to do some magic with bit shifting to traverse through all the bits of a byte pointed by the pointer.
You could always cast your pointer to an integer type that is at least 3 bits wider than the byte pointers used on the system. Then shift the cast value left by 3 bits and store the bit index in the least significant 3 bits.
This integer "bitpointer" can then be incremented with normal arithmetic.
Something like this:
#include <stdio.h>
#define bitptr unsigned long long
#define create_bitptr(pointer,bit) ((((bitptr)(pointer))<<3)|(bit))
#define get_bit(bptr) ((bptr)&7)
#define get_value(bptr) (*((char*)((bptr)>>3)))
#define set_bit(bptr) (get_value(bptr) |= 1<<get_bit(bptr))
#define clear_bit(bptr) (get_value(bptr) &= (~(1<<get_bit(bptr))))
int main(void)
{
char variable=0;
bitptr p ;
p=create_bitptr(&variable,0) ;
set_bit(p) ; p++ ; //1
clear_bit(p) ; p++ ; //0
set_bit(p) ; p++ ; //1
clear_bit(p) ; p++ ; //0
clear_bit(p) ; p++ ; //0
clear_bit(p) ; p++ ; //0
clear_bit(p) ; p++ ; //0
clear_bit(p) ; p++ ; //0
printf("%d\n",variable) ;
return 0;
}
With pointers it does not look possible. But to read or write any bit of the data you can try this one (note that the order of bit-fields within the byte is implementation-defined).
unsigned char data;
struct _p
{
unsigned char B0:1;
unsigned char B1:1;
unsigned char B2:1;
unsigned char B3:1;
unsigned char B4:1;
unsigned char B5:1;
unsigned char B6:1;
unsigned char B7:1;
};
int main()
{
data = 15;
struct _p * point = ( struct _p * ) & data;
//you can read and write any bit of the byte with point->BX, e.g. printf( "%d" , point->B0 ); point->B5 = 1;
}

C - unsigned int to unsigned char array conversion

I have an unsigned int number (2 byte) and I want to convert it to unsigned char type. From my search, I find that most people recommend to do the following:
unsigned int x;
...
unsigned char ch = (unsigned char)x;
Is this the right approach? I ask because unsigned char is 1 byte and we cast from 2-byte data to 1 byte.
To prevent any data loss, I want to create an array of unsigned char[] and save the individual bytes into the array. I am stuck at the following:
unsigned char ch[2];
unsigned int num = 272;
for(i=0; i<2; i++){
// how should the individual bytes from num be saved in ch[0] and ch[1] ??
}
Also, how would we convert the unsigned char[2] back to unsigned int.
Thanks a lot.
You can use memcpy in that case:
memcpy(ch, (char*)&num, 2); /* although sizeof(int) would be better */
Also, how would we convert the unsigned char[2] back to unsigned int.
The same way, just reverse the arguments of memcpy.
How about:
ch[0] = num & 0xFF;
ch[1] = (num >> 8) & 0xFF;
The converse operation is left as an exercise.
How about using a union?
union {
unsigned int num;
unsigned char ch[2];
} theValue;
theValue.num = 272;
printf("The two bytes: %d and %d\n", theValue.ch[0], theValue.ch[1]);
It really depends on your goal: why do you want to convert this to an unsigned char? Depending on the answer to that there are a few different ways to do this:
Truncate: This is what was recommended. If you are just trying to squeeze data into a function which requires an unsigned char, simply cast uchar ch = (uchar)x (but, of course, beware of what happens if your int is too big).
Specific endian: Use this when your destination requires a specific format. Usually networking code likes everything converted to big endian arrays of chars:
int n = sizeof x;
for(int y=0; n-->0; y++)
ch[y] = (x>>(n*8))&0xff;
does that.
Machine endian. Use this when there is no endianness requirement, and the data will only occur on one machine. The order of the array will change across different architectures. People usually take care of this with unions:
union {int x; char ch[sizeof (int)];} u;
u.x = 0xf00;
//use u.ch
with memcpy:
uchar ch[sizeof(int)];
memcpy(&ch, &x, sizeof x);
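or with simple casting (reading an object's bytes through an unsigned char * is allowed by C; it is the reverse cast, from a char buffer to an int *, that risks undefined behavior and crashes on systems with strict alignment):
unsigned char *ch = (unsigned char *)&x;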
Of course, an array of chars large enough to contain a larger value has to be at least as big as that value itself.
So you can simply pretend that this larger value already is an array of chars:
unsigned int x = 12345678;//well, it should be just 1234.
unsigned char* pChars;
pChars = (unsigned char*) &x;
pChars[0];//one byte is here
pChars[1];//another byte here
(Once you understand what's going on, it can be done without any variables, all just casting)
You just need to extract those bytes using the bitwise & operator. 0xFF is a hexadecimal mask to extract one byte. Please look at various bit operations here - http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/
An example program is as follows:
#include <stdio.h>
int main()
{
unsigned int i = 0x1122;
unsigned char c[2];
c[0] = i & 0xFF;
c[1] = (i>>8) & 0xFF;
printf("c[0] = %x \n", c[0]);
printf("c[1] = %x \n", c[1]);
printf("i = %x \n", i);
return 0;
}
Output:
$ gcc 1.c
$ ./a.out
c[0] = 22
c[1] = 11
i = 1122
$
Endorsing @abelenky's suggestion, using a union would be a more fail-proof way of doing this.
union unsigned_number {
unsigned int value; // An int is 4 bytes long
unsigned char index[4]; // A char is 1 byte long
};
A characteristic of this type is that the compiler allocates memory only for the biggest member of unsigned_number, which in this case is 4 bytes, since both members (value and index) have the same size. Had you defined it as a struct instead, 8 bytes would be allocated, since the compiler allocates space for every member of a struct.
Additionally, and here is where your problem is solved, the members of a union all share the same memory location, which means they all refer to the same data; think of it like a hard link on GNU/Linux systems.
So we would have:
union unsigned_number my_number;
// Assigning decimal value 202050300 to my_number
// which is represented as 0xC0B0AFC in hex format
my_number.value = 0xC0B0AFC; // Representation: Binary - Decimal
// Byte 3: 00001100 - 12
// Byte 2: 00001011 - 11
// Byte 1: 00001010 - 10
// Byte 0: 11111100 - 252
// Printing out my_number one byte at time
for (int i = 0; i < (sizeof(my_number.value)); i++)
{
printf("index[%d]: %u, 0x%x\n", \
i, my_number.index[i], my_number.index[i]);
}
// Printing out my_number as an unsigned integer
printf("my_number.value: %u, 0x%x", my_number.value, my_number.value);
And the output is going to be:
index[0]: 252, 0xfc
index[1]: 10, 0xa
index[2]: 11, 0xb
index[3]: 12, 0xc
my_number.value: 202050300, 0xc0b0afc
And as for your final question, we wouldn't have to convert from unsigned char back to unsigned int since the values are already there. You just have to choose which way you want to access them.
Note 1: I am using an integer of 4 bytes in order to ease the understanding of the concept. For the problem you presented you must use:
union unsigned_number {
unsigned short int value; // A short int is 2 bytes long
unsigned char index[2]; // A char is 1 byte long
};
Note 2: I have assigned 252 to byte 0 in order to point out the unsigned characteristic of our index field. Had it been declared as a signed char, index[0] would hold -4 instead of 252 (on a two's complement machine).

Casting int pointer to char pointer causes loss of data in C?

I have the following piece of code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int n = 260;
int *p = &n;
char *pp = (char*)p;
*pp = 0;
printf("n = %d\n", n);
system("PAUSE");
return 0;
}
The output of the program is n = 256.
I may understand why it is, but I am not really sure.
Can anyone give me a clear explanation, please?
Thanks a lot.
The int 260 (= 256 * 1 + 4) will look like this in memory - note that this depends on the endianness of the machine - also, this is for a 32-bit (4 byte) int:
0x04 0x01 0x00 0x00
By using a char pointer, you point to the first byte and change it to 0x00, which changes the int to 256 (= 256 * 1 + 0).
You're apparently working on a little-endian machine. What's happening is that you're starting with an int that takes up at least two bytes. The value 260 is 256+4. The 256 goes in the second byte, and the 4 in the first byte. When you write 0 to the first byte, you're left with only the 256 in the second byte.
In C a pointer references a block of bytes based on the type associated with the pointer. So in your case the integer pointer refers to a block 4 bytes in size, while a char is only one byte long. When you set the char to 0 it only changes the first byte of the integer value, and because of how numbers are stored in memory on little-endian machines (effectively in reverse order from how you would write them), you are overwriting the least significant byte (which was 4), so you are left with 256 as the value.
I understood what exactly happens by changing value:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
int n = 260;
int *p = &n;
char *pp = (char*)p;
*pp = 20;
printf("pp = %d\n", (int)*pp);
printf("n = %d\n", (int)n);
system("PAUSE");
return 0;
}
The output value are
20
and
276
So basically the problem is not data loss. The char pointer points only to the first byte of the int, so it changes only that byte; the other bytes are not changed, and that's why you get those weird values (on an Intel processor the first byte is the least significant, which is why you change the "smallest" part of the number).
Your problem is the assignment
*pp = 0;
You're dereferencing pp which points to n, and changing n.
However, pp is a char pointer so it doesn't change all of n
which is an int. This causes the binary complications in the other answers.
In terms of the C language, the description for what you are doing is modifying the representation of the int variable n. In C, all types have a "representation" as one or more bytes (unsigned char), and it's legal to access the underlying representation by casting a pointer to char * or unsigned char * - the latter is better for reasons that would just unnecessarily complicate things if I went into them here.
As schnaader answered, on a little endian, twos complement implementation with 32-bit int, the representation of 260 is:
0x04 0x01 0x00 0x00
and overwriting the first byte with 0 yields:
0x00 0x01 0x00 0x00
which is the representation for 256 on such an implementation.
C allows implementations which have padding bits and trap representations (which raise a signal/abort your program if they're accessed), so in general overwriting part but not all of an int in this way is not safe to do. Nonetheless, it does work on most real-world machines, and if you instead used the type uint32_t, it would be guaranteed to work (although the ordering of the bytes would still be implementation-dependent).
Considering 32 bit systems,
260 will be represented like this:
00000000 (Byte-3) 00000000 (Byte-2) 00000001(Byte-1) 00000100(Byte-0)
Now when p is typecast to a char pointer, the label on the pointer changes, but the memory contents don't. Earlier p could access 4 bytes, as it was an integer pointer, but now it can only access 1 byte as it is a char pointer. So only the least significant byte gets changed to zero, not all 4 bytes.
And it becomes
00000000 (Byte-3) 00000000 (Byte-2) 00000001(Byte-1) 00000000(Byte-0)
Hence, the o/p is 256.
