I have a generic struct declared and an array of these structs as given below:
struct A
{
int x,y,z;
char a,b,c;
};
struct A *str_arr[5];
From my understanding str_arr is a pointer to a block of memory which stores pointers to the 5 structs in sequential order and therefore these pointers can be accessed via pointer arithmetic or array indexing as:
struct A *str_a = str_arr[1]; // points to 2nd struct?
struct A *str_b = str_arr + 2*sizeof(struct A*); // points to 3rd struct?
However, these 5 structs might not be in sequential memory?
printf("%p\n", (void *)str_arr); // prints memory location of start of str_arr pointers?
printf("%p\n", (void *)str_arr[1]); // prints memory location of 2nd struct?
printf("%d\n", str_arr == &str_arr[0]); // prints 1?
I would just like clarification that my understanding is correct with all of the points I have raised.
All is correct except one:
struct A **str_b = str_arr + 2 /* *sizeof(struct A*) */;
/* ^^ ^^^^^^^^^^^^^^^^^^^^^^ */
/* No need to multiply by the size. Dereference with * if you want a struct A * */
or
struct A *str_b = *(str_arr + 2);
You give the offset in terms of number of elements and not the size in bytes.
str_arr + 2*sizeof(struct A*) is equivalent to &str_arr[2*sizeof(struct A*)], which is far past the end of the five-element array (index 16 if pointers are 8 bytes).
+0 +1 +2 +3 +4
+---+---+---+---+---+
| A | B | C | D | E |
+---+---+---+---+---+
str_arr ^^^^^^^^^^^^^^^^^^^^^
&str_arr[0]^^^^
str_arr[1] = B
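str_arr is the array itself; in most expressions it decays to the address of its first element.
str_arr[1] is the contents at offset +1, i.e. B, which is an address pointing to an object of type struct A.
str_arr == &str_arr[0] compares equal: after decay, both sides have type struct A ** and hold the same address. &str_arr is that same address with a different type, struct A *(*)[5].
As suggested by @Gopi, the last point can be demonstrated by printing the following:
sizeof str_arr vs. sizeof &str_arr[0]
The following addresses: &str_arr + 1 vs. str_arr + 1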
So I'm trying to learn some C with pointers, but I'm having trouble understanding the following code snippet. first and last point to the first and last items, so you have something like first, middle, last. But what is last - 1? Is it just the second-to-last element? So if we have only first and last with no middle, would this be true?
item *first, item *last
if (first == last -1)
return 0
To expound on @John Bode's answer, a pointer is essentially a number: an address in memory. Arrays are laid out contiguously in memory. The first element has the lowest address, and the last element the highest address.
So, if we have the address for the first element in the array, and we add 1, we get the memory address of the next element in the array. Your compiler should know how many bytes each element in the array takes up unless you're doing weird casting on the pointers. Knowing this, the program adds that many bytes to the address.
So, to get to the next address, we simply add 1. To get to the previous address, we can subtract 1.
Imagine you have an array of item:
item arr[10];
and then you set first and last to point to the first and last items of the array, respectively:
item *first = &arr[0];
item *last = &arr[9];
which gives us something like
item item *
+---+
arr: | | arr[0] <---- first
+---+
| | arr[1]
+---+
| | arr[2]
+---+
...
+---+
| | arr[8] <---- last - 1
+---+
| | arr[9] <---- last
+---+
The expression last - 1 gives you a pointer to the object immediately before the object pointed to by last, assuming both objects are members of the same array, or that last points to an object immediately following the end of the array.
C doesn't guarantee that individual variables are laid out in any specific order, so adding or subtracting an offset to a pointer only works (that is, gives you a useful result) if you're pointing to an element of an array. IOW, if you had something like
item a;
item b;
item *first = &a;
item *last = &b;
then the expression last - 1 isn't guaranteed to point to a (nor is first + 1 guaranteed to point to b).
Yet another pointer math illustration
Not knowing the type of item in your example code, I will choose to make it a struct:
typedef struct {
int a;
int b;
float c;
} item_s;//new type
Using the new type item_s, create the following array and a set of pointers to illustrate pointer arithmetic
item_s item[5] = {{1,2,3.0},{10,20,30.0},{15,25,35.0},{16,26,36.0},{17,27,37.0}};
item_s *first, *last, *index;
//set the `first` and `last` pointers to the first and last element of the array
first = &item[0];
last = &item[4];
//now illustrate type of pointer manipulation and its results
index = first + 1 ;// index now points to the area of memory occupied by item[1]
// in bytes: address of first + 1*sizeof(item_s)
printf("a: %d\nb: %d\nc:%f\n", index->a, index->b, index->c);
index = last - 1;//index points to the area of memory occupied by item[3]
// in bytes: address of last - 1*sizeof(item_s)
printf("a: %d\nb: %d\nc:%f\n", index->a, index->b, index->c);
index = last + 1;//one past the end: forming this pointer is allowed, but dereferencing it is undefined behaviour (possibly a seg-fault).
//printf("a: %d\nb: %d\nc:%f\n", index->a, index->b, index->c);//UB
So when you increment a pointer by 1, the new memory location pointed to is exactly
old memory location + 1 * sizeof(pointed-to type)
If you increment it by 3, then it is
old memory location + 3 * sizeof(pointed-to type)
Suppose I had an extremely basic C code that simply printed the memory address of an element inside the list such as
#include <stdio.h>
int main()
{
int data[5] = {1,2,3,4,5};
printf("%p\n", (void *)&data[2]);
return 0;
}
What is the order of operations for the &data[2] call?
I am finding it hard to visually see because data[2] returns a number, and getting the memory address of a number "3" doesn't really make sense.
When you create an array, the compiler will reserve memory for it, and you can store values in that memory.
What is returned by e.g. &data[2] is not a pointer to the integer 3 but a pointer to the array element where you have opted to store the integer value 3.
An array like yours looks like this in memory
+---------+---------+---------+---------+---------+
| data[0] | data[1] | data[2] | data[3] | data[4] |
+---------+---------+---------+---------+---------+
The exact values stored in each element are irrelevant if all you want is a pointer to an element.
In C, array[index] is syntactic sugar for *(array + index), so &array[index] is &*(array + index), which simplifies to (array + index). With &data[2] you aren't taking the address of the value 3; you're computing the address data + 2, which is the location of the third element.
Try this:
int data[5] = {1,2,3,4,5};
printf("%p\n", (void *)&data[2]);
printf("%p\n", (void *)(data + 2));
You will see the same address on both lines.
I have currently trouble understanding the following scenario:
I have a multidimensional array of strings and I want to address it using pointers only, but I always get a segmentation fault when using array notation on the pointer. This is just example code; I want to use the 3D array in a pthread, so I want to pass it in via a structure as a pointer, but it doesn't work and I would like to know why. I thought pointers and arrays were functionally equivalent? Here is the sample code:
#include <stdio.h>
#include <string.h>
void func(unsigned char ***ptr);
int main() {
// Image of dimension 10 times 10
unsigned char image[10][10][3];
unsigned char ***ptr = image;
memcpy(image[0][0], "\120\200\12", 3);
// This works as expected
printf("Test: %s", image[0][0]);
func(image);
return 0;
}
void func(unsigned char ***ptr) {
// But here I get a Segmentation Fault but why??
printf("Ptr: %s", ptr[0][0]);
}
Thanks in advance for your help :)
I think maybe strdup confuses the issue. Pointers and arrays are not always equivalent. Let me try to demonstrate. I always avoid actual multi-dimension arrays, so I may make a mistake here, but:
int main()
{
char d3Array[10][10][4]; //creates a 400-byte contiguous memory area
char ***d3Pointer; //a pointer to a pointer to a pointer to a char.
int i,j;
d3Pointer = malloc(sizeof(char**) * 10);
for (i = 0; i < 10; ++i)
{
d3Pointer[i] = malloc(sizeof(char*) * 10);
for (j = 0; j < 10; ++j)
{
d3Pointer[i][j] = malloc(sizeof(char) * 4);
}
}
//this
d3Pointer[2][3][1] = 'a';
//is equivalent to this
char **d2Pointer = d3Pointer[2];
char *d1Pointer = d2Pointer[3];
d1Pointer[1] = 'a';
d3Array[2][3][1] = 'a';
//is equivalent to
((char *)d3Array)[(2 * 10 * 4) + (3 * 4) + (1)] = 'a';
}
Generally, I use the layered approach. If I want contiguous memory, I handle the math myself, like so:
char *pseudo3dArray = malloc(sizeof(char) * 10 * 10 * 4);
pseudo3dArray[(2 * 10 * 4) + (3 * 4) + (1)] = 'a';
Better yet, I use a collection library like uthash.
Note that properly encapsulating your data makes the actual code incredibly easy to read:
typedef unsigned char byte_t;
typedef struct
{
byte_t r;
byte_t g;
byte_t b;
}pixel_t;
typedef struct
{
int width;
int height;
pixel_t * pixelArray;
}screen_t;
pixel_t *getxyPixel(screen_t *pScreen, int x, int y)
{
return pScreen->pixelArray + (y*pScreen->width) + x;
}
int main()
{
screen_t myScreen;
myScreen.width = 1024;
myScreen.height = 768;
myScreen.pixelArray = (pixel_t*)malloc(sizeof(pixel_t) * myScreen.height * myScreen.width);
getxyPixel(&myScreen, 150, 120)->r = 255;
}
In C, you should allocate space for your 2D array one row at a time. Your definition of test declares a 10 by 10 array of char pointers, so you don't need to call malloc for it. But to store a string you need to allocate space for the string. Your call to strcpy would crash. Use strdup instead. One way to write your code is as follows.
char ***test = NULL;
char *ptr = NULL;
test = malloc(10 * sizeof(char **));
for (int i = 0; i < 10; i++) {
test[i] = malloc(10 * sizeof(char *));
}
test[0][0] = strdup("abc");
ptr = test[0][0];
printf("%s\n", ptr);
test[4][5] = strdup("efg");
ptr = test[4][5];
printf("%s\n", ptr);
Alternatively, if you want to keep your 10 by 10 definition, you could code it like this:
char *test[10][10];
char *ptr = NULL;
test[0][0] = strdup("abc");
ptr = test[0][0];
printf("%s\n", ptr);
test[4][5] = strdup("efg");
ptr = test[4][5];
printf("%s\n", ptr);
Your problem is that a char[10][10][3] is something very different from a char***: the first is an array of arrays of arrays, the latter is a pointer to a pointer to a pointer. The confusion arises because both can be dereferenced with the same syntax. So, here is a bit of an explanation:
The syntax a[b] is nothing but a shorthand for *(a + b): First you perform pointer arithmetic, then you dereference the resulting pointer.
But, how come you can use a[b] when a is an array instead of a pointer? Well, because...
Arrays decay into pointers to their first element: If you have an array declared like int array[10], saying array + 3 results in array decaying to a pointer of type int*.
But, how does that help to evaluate a[b]? Well, because...
Pointer arithmetic takes the size of the target into account: The expression array + 3 triggers a calculation along the lines of (size_t)array + 3*sizeof(*array). In our case, the pointer that results from the array-pointer-decay points to an int, which has a size of, say, 4 bytes. So, the pointer is incremented by 3*4 bytes. The result is a pointer that points to the fourth int in the array; the first three elements are skipped by the pointer arithmetic.
Note, that this works for arrays of any element type. Arrays can contain bytes, or integers, or floats, or structs, or other arrays. The pointer arithmetic is the same.
But, how does that help us with multidimensional arrays? Well, because...
Multidimensional arrays are just 1D arrays that happen to contain arrays as elements: When you declare an array with char image[256][512]; you are declaring a 1D array of 256 elements. These 256 elements are all arrays of 512 characters, each. Since the sizeof(char) == 1, the size of an element of the outer array is 512*sizeof(char) = 512, and, since we have 256 such arrays, the total size of image is 256*512. Now, I can declare a 3D array with char animation[24][256][512];...
So, going back to your example that uses
char image[10][10][3]
what happens when you say image[1][2][1] is this: The expression is equivalent to this one:
*(*(*(image + 1) + 2) + 1)
1. image being of type char[10][10][3] decays into a pointer to its first element, which is of type char(*)[10][3]. The size of that element is 10*3*1 = 30 bytes.
2. image + 1: Pointer arithmetic is performed to add 1 to the resulting pointer, which increments it by 30 bytes.
3. *(image + 1): The pointer is dereferenced, we are now talking directly about the element, which is of type char[10][3].
4. This array again decays into a pointer to its first element, which is of type char(*)[3]. The size of the element is 3*1 = 3. This pointer points at the same byte in memory as the pointer that resulted from step 2. The only difference is that it has a different type!
5. *(image + 1) + 2: Pointer arithmetic is performed to add 2 to the resulting pointer, which increments it by 2*3 = 6 bytes. Together with the increment in step 2, we now have an offset of 36 bytes, total.
6. *(*(image + 1) + 2): The pointer is dereferenced, we are now talking directly about the element, which is of type char[3].
7. This array again decays into a pointer to its first element, which is of type char*. The size of the element is now just a single byte. Again, this pointer has the same value as the pointer resulting from step 5, but a different type.
8. *(*(image + 1) + 2) + 1: Pointer arithmetic again, adding 1*1 = 1 bytes to the total offset, which increases to 37 bytes.
9. *(*(*(image + 1) + 2) + 1): The pointer is dereferenced the last time, we are now talking about the char at an offset of 37 bytes into the image.
So, what's the difference to a char***? When you dereference a char***, you do not get any array-pointer-decay. When you try to evaluate the expression pointers[1][2][1] with a variable declared as
char*** pointers;
the expression is again equivalent to:
*(*(*(pointers + 1) + 2) + 1)
1. pointers is a pointer, so no decay happens. Its type is char***, and it points to a value of type char**, which likely has a size of 8 bytes (assuming a 64 bit system).
2. pointers + 1: Pointer arithmetic is performed to add 1 to the resulting pointer, which increments it by 1*8 = 8 bytes.
3. *(pointers + 1): The pointer is dereferenced, we are now talking about the pointer value that is found in memory at an offset of 8 bytes of where pointers points.
4. Further steps depending on what actually happened to be stored at pointers[1]. These steps do not involve any array-pointer-decay, and thus load pointers from memory instead.
You see, the difference between a char[10][10][3] and a char*** is profound. In the first case, the array-pointer-decay transforms the process into a pure offset computation into a multidimensional array. In the latter case, we repeatedly load pointers from memory when accessing elements; all we ever have are 1D arrays of pointers. And it's all down to the types of pointers!
I'm quite new in C language, so this "problem" is very confusing for me.
I wanted to create 2D array using array of int pointers (rows) which points to arrays of ints (columns) in one block of memory. I did it and it works but I'm not sure why after I checked something.
I've used malloc to allocate 48 bytes (2x4 array) in the heap (I'm on x86-64 machine):
int **a;
a = (int **)malloc(sizeof(int*) * 2 + sizeof(int) * 2 * 4);
Now lets assume that this is the whole 48 bytes in memory. I wanted 2 row's array so I needed 2 pointers to arrays of ints - a[0], a[1]:
----------------------------------------------------------------
| a[0] | a[1] | |
----------------------------------------------------------------
^
|
I assumed that all pointers are 8 bytes long and that address of a[2] (arrow) is the place where I can start storing my values (arrays of ints). So I did...
int *addr = (int*)&a[2];
a[0] = addr;
addr += 4;
a[1] = addr;
This is working perfectly fine, I can easily fill and print 2D array. Problem is that when I was writing int *addr = (int*)&a[2]; I was sure that this will be the address of a[0] plus 2 * 8 bytes, but it wasn't. I've checked it at another example with this simple code:
int *p;
int **k;
p = (int*) malloc(30);
k = (int**) malloc(30);
printf("&p = %p %p %p\n", &p[0], &p[1], &p[2]);
printf("&k = %p %p %p\n", &k[0], &k[1], &k[2]);
Output:
&p = 0x14d8010 0x14d8014 0x14d8018 <-- ok (int = 4 bytes)
&k = 0x14d8040 0x14d8048 0x14d8050 <-- something wrong in my opinion (ptrs = 8 bytes)
My question is: Why the third address of the pointer in array is 0x14d8050 not 0x14d8056. I think it might be because 0x14d8056 is not the best address for ints but why is that and why it happens only when dealing with array of pointers?? I've checked this on x86 machine and pointer has "normal" values
&p = 0x8322008 0x832200c 0x8322010
&k = 0x8322030 0x8322034 0x8322038
I know this might be an obvious or even stupid question for someone so please at least share some links with information about this behavior. Thank you.
Numbers prefixed by 0x are represented in hexadecimal.
Thus, 0x14d8048 + 8 == 0x14d8050 is expected.
As timrau said in his comment, 0x14d8048 + 8 is not 0x14d8056 but 0x14d8050, because the addresses are hexadecimal.
Concerning your 2D array, I'm not sure why it worked, but that's not the usual way to create one.
There are two ways to create a 2D array. The first and simpler one is static, and it goes like this: int a[2][4];
The second one, the one you tried, is dynamic. It is slightly more complicated and goes like this:
int **a;
int i;
a = malloc(2 * sizeof(int *));
for(i = 0 ; i < 2 ; i++)
a[i] = malloc(4 * sizeof(int));
For a course about the functioning of operating systems, we had to write a malloc/free implementation for a specific sized struct. Our idea was to store the overhead, like the start and end of the specified (static) memory block our code has to work in, in the first few addresses of that block.
However, something went wrong with the calculation of the last memory slot; We're adding the size of all usable memory slots to the address of the first usable memory slot, to determine what the last slot is. However, when adding the int sizeofslots to the address of currentslot, it actually adds sizeofslots 4 times to this address. Here's the relevant code:
/* Example memory block*/
/* |------------------------------------------------------------------------------|*/
/* | ovr | 1 t | 0 | 1 t | 1 t | 1 t | 0 | 0 | 0 | 0 | 0 | 0 | 0 |*/
/* |------------------------------------------------------------------------------|*/
/* ovr: overhead, the variables `currentslot`, `firstslot` and `lastslot`.
* 1/0: Whether or not the slot is taken.
* t: the struct
*/
/* Store the pointer to the last allocated slot at the first address */
currentslot = get_MEM_BLOCK_START();
*currentslot = currentslot + 3*sizeof(void *);
/* The first usable memory slot after the overhead */
firstslot = currentslot + sizeof(void *);
*firstslot = currentslot + 3*sizeof(void *);
/* The total size of all the effective memory slots */
int sizeofslots = SLOT_SIZE * numslots;
/* The last usable slot in our memory block */
lastslot = currentslot + 2*sizeof(void*);
*lastslot = firstslot + sizeofslots;
printf("%p + %i = %p, became %p\n", previous, sizeofslots, previous + (SLOT_SIZE*numslots), *lastslot);
We figured it had something to do with integers being 4 bytes, but we still don't get what is happening here; Can anyone explain it?
C's pointer arithmetic always works like this; addition and subtraction are always in terms of the item being pointed at, not in bytes.
Compare it to array indexing: as you might know, the expression a[i] is equivalent to *(a + i), for any pointer a and integer i. Thus, it must be the case that the addition happens in terms of the size of each element of a.
To work around it, cast the structure pointer down to (char *) before the add.
When you add an integer to a pointer, it increments by that many strides (i.e. myPointer + x will increment the address by x*sizeof(*myPointer)). If this didn't happen, it would be possible to have unaligned integers, which on many processor architectures is a fault and will cause some funky behaviour, to say the least.
Take the following as an example
char* foo = (char*)0x0; // Foo = 0
foo += 5; // foo = 5
short* bar = (short*)0x0; // Bar = 0; (we assume two byte shorts)
bar += 5; // Bar = 0xA (10)
int* foobar = (int*)0x0; // foobar = 0; (we assume four byte ints)
foobar += 2; // foobar = 8;
char (*myArr)[8] = (char (*)[8])0x0; // A pointer to an array of 8 chars, starting at 0
myArr += 2; // myArr = 0x10 (16). This is because sizeof(char[8]) = 8;
Example
#include <stdio.h>
const int MAX = 3;
int main ()
{
int var[] = {10, 100, 200};
int i, *ptr;
/* let us have array address in pointer */
ptr = var;
for ( i = 0; i < MAX; i++)
{
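printf("Address of var[%d] = %p\n", i, (void *)ptr );
printf("Value of var[%d] = %d\n", i, *ptr );
/* move to the next location */
ptr++;
}
return 0;
}
Output (addresses will differ between runs and systems, and the exact %p format is implementation-defined):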
Address of var[0] = bfb7fe3c
Value of var[0] = 10
Address of var[1] = bfb7fe40
Value of var[1] = 100
Address of var[2] = bfb7fe44
Value of var[2] = 200
You can deduce from the example that incrementing a pointer advances its address by a number of bytes equal to the size of the type it points to. Here that is sizeof(int), which is 4 in the output above. Similarly, a char pointer advances by 1 byte.