Passing two dimensional array with "constant" length - c

I am trying to pass a two-dimensional char array to a function. I have found how to do it with a one-dimensional array, but I am having a hard time extrapolating that to how to deal with my current situation. See example code:
void foo( char ** bar ) {
}
int main () {
char mlady[100][100];
foo(mlady);
}
The compiler hints that the argument should be of type char (*)[100], but I can't get that to compile. I also tried a lot of brute force without luck.
I am surprised I can't find this on google/SO, but I guess I am using the wrong keywords.

Change void foo( char ** bar ) to void foo( char bar[][100] ).
When you pass a multidimensional array to a function, the first array dimension does not have to be specified. However, the second (and any subsequent) dimensions must be provided to the function.
For more information check this C-Faq.
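As a minimal sketch of the change above, a function declared with char bar[][100] can index the array in the usual way. The helper name fill_diagonal and its rows and value parameters are illustrative and not part of the question.

```c
#include <stddef.h>

/* bar is adjusted to `char (*bar)[100]`: a pointer to rows of 100 chars.
   The number of rows is not part of the type, so it is passed separately
   (`rows` is a hypothetical extra parameter). */
void fill_diagonal(char bar[][100], size_t rows, char value)
{
    for (size_t i = 0; i < rows && i < 100; i++) {
        bar[i][i] = value;   /* ordinary 2D indexing works inside the function */
    }
}
```

With char mlady[100][100]; in main, the call would be fill_diagonal(mlady, 100, 'x');.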

When you define a multi-dimensional array, the bytes are laid out contiguously in memory.
For instance:
char a[3][5];
will give you 15 contiguous bytes of memory, laid out as so:
----------------------------------------------
| a + 0 | a + 1 | a + 2 | a + 3 | a + 4 |
----------------------------------------------
| a + 5 | a + 6 | a + 7 | a + 8 | a + 9 |
----------------------------------------------
| a + 10 | a + 11 | a + 12 | a + 13 | a + 14 |
----------------------------------------------
To reference a[x][y], the compiler will translate this to a + y + x * 5 (treating a as the address of the first byte), in the same way that it translates indexing of a one-dimensional array from a[x] to the address a + x.
You'll see that in order for this translation to work for a two-dimensional array, you must know the rightmost dimension. In other words, the 5 in the definition char a[3][5] appears in the calculation a + y + x * 5, and must appear for that calculation to be able to work.
You don't need to know the leftmost dimension, because it doesn't appear in the calculation.
This is generally true of any multi-dimensional array. In order to calculate the offset of a specified index, you must know all of the dimensions except for the leftmost one. It doesn't hurt to have the leftmost one, but you don't need it.
Another way of saying all this is that "multi-dimensional arrays" in C are really just convenient ways of working with one-dimensional arrays. There's nothing char a[3][5] gives you that you can't accomplish with char a[15] and replacing all instances of a[x][y] with *(a + y + x * 5); the two-dimensional form just makes the syntax slightly more convenient. Under the hood, the two are laid out identically.
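The offset calculation can be checked with a small sketch. The helper name flat_get is hypothetical; it reads an element of a 3 by 5 char array through a plain char pointer using the x * 5 + y offset.

```c
#include <stddef.h>

/* Reads element [x][y] of a 3x5 char array through a flat char pointer,
   using the offset y + x * 5. The caller must keep x < 3 and y < 5. */
char flat_get(char a[3][5], size_t x, size_t y)
{
    const char *base = (const char *)a;   /* address of the first byte */
    return *(base + y + x * 5);
}
```

For any valid x and y, flat_get(a, x, y) should return the same value as a[x][y].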
So, on the one hand, attempting to pass a two-dimensional array to a function as a char ** will not work, because you're not providing the rightmost dimension to the function, and the function needs it. void func(char (*bar)[5]) and void func(char bar[][5]) would be equivalent ways of doing that. In the first case, you need the parentheses around *bar because otherwise you'd have void func(char *bar[5]), where bar would be declared as a five element array of pointers to char, which is not what you have at all.
More fundamentally, passing it as a char ** won't work because a char ** is a pointer to a char pointer, and in a two-dimensional char array, you don't have any char pointers in there - just a bunch of plain chars. A pointer to the array will be passed to the function, but when you dereference it once, there are no more pointers to find, and to get from a char ** to a char you need to dereference twice.
If you were to do something like (error checking omitted for brevity):
char ** list_strings = malloc(3 * sizeof *list_strings);
list_strings[0] = malloc(5);
list_strings[1] = malloc(5);
list_strings[2] = malloc(5);
then you could pass list_strings as a char ** because it actually is one, and not a true multi-dimensional array - there are two levels of indirection in there, once to get to the individual strings, and then again to get to an individual character in one of those strings. What you'd be passing to your function would be a single-dimensional (dynamically allocated) array of char *. The fact that your function could then go on to dereference any of those members again does not make it a multi-dimensional array.
When you pass a char ** in this way, you technically don't need to know any of the dimensions, at least not to access any particular element. You do, of course, need to know the dimensions to avoid the function dropping off the end of any of them, unless you hardcode the dimensions of the array into your function and only ever pass it arrays of that size, which is generally not a good way to write programs.
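Here is a sketch of the char ** approach with the error checking put back in and the dimensions passed explicitly rather than hardcoded. The names make_strings and free_strings are hypothetical, and both assume count and len are greater than zero.

```c
#include <stdlib.h>

/* Allocates `count` zero-filled buffers of `len` bytes each.
   Returns NULL on failure, after releasing anything already allocated. */
char **make_strings(size_t count, size_t len)
{
    char **list = malloc(count * sizeof *list);
    if (list == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        list[i] = calloc(len, 1);
        if (list[i] == NULL) {
            while (i > 0) {
                free(list[--i]);   /* undo the rows allocated so far */
            }
            free(list);
            return NULL;
        }
    }
    return list;
}

/* Releases everything allocated by make_strings. */
void free_strings(char **list, size_t count)
{
    if (list == NULL) {
        return;
    }
    for (size_t i = 0; i < count; i++) {
        free(list[i]);
    }
    free(list);
}
```

A function receiving the result as char ** still needs count and len passed alongside it to stay within bounds.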

Related

How C arrays are structured [duplicate]

I have a question about how C arrays are stored in memory. But I'm having trouble formulating the question, so here's my best try to put it into words. I have trouble with English. Let's say we have a three dimensional array:
int foo[2][3][4];
Elements can be accessed using either array or pointer notation:
foo[i][j][k]
*(*(*(foo + i) + j) + k)
We could think of the array as a pointer to a pointer to a pointer to an int, or, for example, a pointer to a 2 dimensional array like (*a)[2][3].
The problem in my thinking is this: I would have thought that in order to 'extract' values in the array, we'd only have to dereference the top level of the array (i.e. [i]) once, the second level (i.e. [j]) twice, and the third level (i.e. [k]) three times. But actually we always have to dereference three times to get to any value. Why is this? Or is this really the case?
I try to imagine the array structure in memory.
Apologies for my poor way to express this.
Your array of arrays of arrays foo is arranged like this in memory:
+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+----------+
| foo[0][0][0] | foo[0][0][1] | foo[0][0][2] | foo[0][0][3] | foo[0][1][0] | foo[0][1][1] | foo[0][1][2] | foo[0][1][3] | ... etc. |
+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+----------+
foo[0][0][0] will be at the lowest memory location, foo[1][2][3] will be at the highest.
And an important note: An array is not a pointer. It can decay to a pointer to its first element, but please don't "think of" an array as a pointer.
Another important note: The pointers &foo, &foo[0], &foo[0][0] and &foo[0][0][0] are all pointing to the same location, but they are all different types which makes them semantically different:
&foo is of the type int (*)[2][3][4]
&foo[0] is of the type int (*)[3][4]
&foo[0][0] is of the type int (*)[4]
And &foo[0][0][0] is of the type int *
Lastly, a note about array-to-pointer decay: it only happens one level at a time. That means foo decays to &foo[0], foo[0] decays to &foo[0][0], and foo[0][0] decays to &foo[0][0][0].
An array is not just a storage location (memory address); it is an object with a type and a layout.
So in your example, foo is an array of 2 arrays of 3 arrays of 4 ints. *foo is an array of 3 arrays of 4 ints, and **foo is an array of 4 ints.
Even though each level of dereferencing gives the same memory address, they're different because they have different types, and thus the data at the same location should be interpreted differently.
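A rough sketch of the point about types, assuming int foo[2][3][4] as in the question. The function names are illustrative: the pointers share an address, but adding 1 to each moves by a different number of bytes.

```c
#include <stddef.h>

/* Byte distance covered by adding 1 to each kind of pointer into
   int foo[2][3][4]. */
size_t plane_stride(int (*p)[3][4]) { return (size_t)((char *)(p + 1) - (char *)p); }
size_t row_stride(int (*p)[4])      { return (size_t)((char *)(p + 1) - (char *)p); }
size_t elem_stride(int *p)          { return (size_t)((char *)(p + 1) - (char *)p); }

/* Returns 1 when foo, foo[0] and foo[0][0] all decay to the same address. */
int same_address(int foo[2][3][4])
{
    return (void *)foo == (void *)foo[0]
        && (void *)foo[0] == (void *)foo[0][0];
}
```

Calling plane_stride(foo), row_stride(foo[0]) and elem_stride(foo[0][0]) should give 12, 4 and 1 times sizeof(int) respectively.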

Memory allocation for pointer to a char array

would someone please explain the difference between
char *names[3]
and
char (*names)[3]
and how to read this operators?
If I want to allocate memory for them dynamically how to do so?
For the first case, I think it's just an array of char* of length 3, so memory allocation is not applicable. But in the second case, how do I allocate memory?
When faced with questions like this, you can usually turn to cdecl (online version here):
cdecl> explain char *names[3]
declare names as array 3 of pointer to char
cdecl> explain char (*names)[3]
declare names as pointer to array 3 of char
So the former creates array of three pointers-to-char:
+----------+
| names[0] | -> char
| [1] | -> char
| [2] | -> char
+----------+
And the latter creates a single pointer to a char array of size three.
+-------+
| names | -> char, char, char - no, not a dance step :-)
+-------+
The second line decodes as "declare names as pointer to array 3 of char".
I've been writing C for over 25 years, and I've never used such a variable.
Anyway, I guess this should work:
char data[3];
char (*names)[3] = &data;
Note that the variable name, names, is highly misleading since the variable points to only 3 single characters, as opposed to char *names[3] which is three pointers to characters and thus easily could be used to hold three strings.
Also note that the above code makes little sense, you could just use data directly if you had it.
The first is an array of three pointers to char.
The second is a pointer to an array of three chars.
(Read it as "*names is a char[3]").
You can create such a pointer by taking the address of an array of three chars:
char name[3];
char (*names)[3] = &name;
or dynamically in the normal way:
char (*names)[3] = malloc(sizeof(*names)); /* or sizeof(char[3]), if you're fond of bugs */
or through the regular array-to-pointer conversion:
char stuff[2][3] = {0};
char (*names)[3] = stuff; /* Same as &stuff[0], as normal. */
and how to read this operators?
Let's take the second one as it is the more complex of the two:
char (*names)[3];
When you are looking at a complex definition like this, the best way to attack it is to start in the middle and work your way out. “Starting in the middle” means starting at the variable name, which is names.
“Working your way out” means looking to the right for the nearest item (nothing in this case; the right parenthesis stops you short), then looking to the left (a pointer denoted by the asterisk), then looking to the right (an array of 3), then looking to the left (char).
This right-left-right motion works with most declarations.
This means names is a pointer to a char array of size 3.
This is a very strange declaration, but that is how it is read.
If I want to allocate memory for them dynamically how to do so?
Now that you know what the declaration means, memory allocation becomes easy:
char (*names)[3] = malloc(3 * sizeof(char));
char *names[3] is an array of 3 char pointers.
char (*names)[3] is an array pointer (pointer to array) to an array of 3 characters char[3].
So these two have fundamentally different meanings! Don't confuse them with each other.
If you wish to allocate an array of pointers, then you can do it as in either of these examples:
char** names = malloc(3 * sizeof(*names) );
char** names = malloc(sizeof(char*[3]));
char** names = calloc(3, sizeof(char*));
These are all equivalent (but calloc also sets all pointers to NULL). names will be a pointer to the first element in the array. It will be a pointer to a char*.
If you wish to allocate an array and point to the first element, simply do:
char* names = malloc(3 * sizeof(*names));
Alternatively, you can use the array pointer syntax and point to the array as whole:
char (*names)[3] = malloc(sizeof(*names));
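The array pointer form is most useful when several char[3] rows are allocated in one contiguous block. The following is a sketch under that assumption; alloc_rows is a hypothetical helper name.

```c
#include <stdlib.h>

/* Allocates `rows` contiguous, zero-filled arrays of 3 chars and returns a
   pointer to the first one (NULL on failure). The odd-looking declarator
   reads: alloc_rows is a function returning a pointer to array 3 of char.
   The whole block is released with a single free(). */
char (*alloc_rows(size_t rows))[3]
{
    char (*names)[3] = calloc(rows, sizeof *names);
    return names;
}
```

After char (*names)[3] = alloc_rows(4); the elements are reachable as names[i][j] for i below 4 and j below 3.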

Passing multi-dimensional array to function without rightmost size by pointer to incomplete array type

From what I remember arrays are always passed as pointers. For instance, the declaration:
void foo(int array[2][5]);
means exactly the same thing to the compiler as:
void foo(int (*array)[5]);
You can say that both forms are equivalent. Now I wonder why it is then allowed to declare it as:
void foo(int (*array)[]);
while not as:
void foo(int array[][]);
Take an example:
#include <stdio.h>
void foo(int (*p)[3]);
void bar(int (*p)[]);
int main(void)
{
int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
foo(a);
bar(a);
return 0;
}
// The same as int p[][3] or int p[N][3] where N is a constant expression
void foo(int (*p)[3])
{
}
// Would it be the same as int p[][] or int p[N][] (by analogy)?
void bar(int (*p)[])
{
}
It compiles fine and without warnings, but if I change bar's declaration to:
void bar(int p[][]);
then it's an error.
Why does C allow such an "obscure" way to pass an array?
Arrays are not pointers. Declaring int (*p)[] is fine because p is a single pointer, and a pointer may point to an incomplete type such as an array of unspecified size; the size of the pointer itself is known. int p[][] is different: it asks for an array whose elements are arrays of unknown size. The elements of an array must be stored contiguously, so the size of each element has to be known, and here it is not. That is the problem.
To make it clear: if you say int p[][], you don't know how far p[0] is from p[1]. With int (*p)[] you don't know that either, which is why such a pointer can be declared and passed around but cannot be indexed as p[1] until it is converted to a pointer to an array of known size.
Arrays are converted to pointers, but they are not pointers.
Arrays are usable as pointers, but they're more than pointers; they point to a number of things which have a size. That way, you can tell the compiler you're interested in the third or fifth element of your array, and it will calculate the location of that element by multiplying the size of one element by three or five to find an offset, and adding that offset to the address of the array.
C doesn't actually have multidimensional arrays. What it does have is arrays of arrays, which is pretty much the same thing. Say you have this:
int a[4][5];
In memory, this would look like this:
[0] |[1] |[2] |[3]
+-------------------|-------------------|-------------------|-------------------+
a | | | | | | | | | | | | | X | | | | | | | |
+-------------------|-------------------|-------------------|-------------------+
0 1 2 3 4 | 0 1 2 3 4 | 0 1 2 3 4 | 0 1 2 3 4 |
If you then try to access the element at index [2][2], the system needs to perform the following calculation (conceptually):
Take the size of an int, and multiply that by five to get the size of the inner array
Take the size of that inner array, and multiply by two to get the offset of the third element (index 2) in the outer array
Take the size of an int again, and multiply that by two to get the offset of the third element (index 2) in the inner array
Add the two offsets; that's the element's offset from the start of the array
In this case, the byte offset is (2 * 5 * sizeof(int)) + (2 * sizeof(int)), giving 12 * sizeof(int), which is indeed the correct offset from the start of the array. (In C's own pointer arithmetic, which already scales by the element size, this is simply *(*(a + 2) + 2).)
All that means is that the size of the inner array needs to be defined in order for the compiler to be able to do that calculation. If you leave out the size of any dimension other than the leftmost one, the inner arrays have no defined size, and the compiler will balk.
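The conceptual calculation can be written out as a short sketch. byte_offset and fetch are hypothetical helpers that assume the int a[4][5] array from above; the compiler does the equivalent work itself when it sees a[i][j].

```c
#include <stddef.h>

/* Byte offset of a[i][j] within int a[4][5]. */
size_t byte_offset(size_t i, size_t j)
{
    size_t inner_size = 5 * sizeof(int);        /* size of one inner array */
    return i * inner_size + j * sizeof(int);    /* outer offset + inner offset */
}

/* Fetches a[i][j] by applying that byte offset to the start of the array. */
int fetch(int a[4][5], size_t i, size_t j)
{
    return *(int *)((char *)a + byte_offset(i, j));
}
```

Note that byte_offset needs the 5 but never the 4, which matches the rule that only the leftmost dimension may be omitted.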
You're tripping over the C language support for arrays of 'unspecified' size, which you can declare, but can't generally use directly unless you give it a more specific size somewhere.
In your example, if you actually try to do anything with the array in bar, you'll get an error:
void bar(int (*p)[])
{
printf("%d\n", p[1][1]);
}
% gcc -Wall t.c
t.c: In function ‘bar’:
t.c:23:5: error: invalid use of array with unspecified bounds
printf("%d\n", p[1][1]);
^
The only thing you can do with p in bar is cast it to some type with an explicit size, or pass it to some other function that takes a pointer to an array (of specified or unspecified size). If you use the wrong explicit size when you try to access the array, you get undefined behavior with no warning.
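A minimal sketch of the cast described above. bar_get is a hypothetical function, and the bound of 3 is an assumption that every caller has to honor, since the compiler cannot check it.

```c
/* p points to an array of unspecified length, so p[1][1] would not compile.
   Converting p to a pointer with an explicit bound makes indexing possible. */
int bar_get(int (*p)[], int row, int col)
{
    int (*q)[3] = (int (*)[3])p;   /* assumes each row really has 3 ints */
    return q[row][col];
}
```

Passing an array whose rows are not 3 ints wide would compile silently and read the wrong elements.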

Is it possible to convert char[] to char* in C?

I'm doing an assignment where we have to read a series of strings from a file into an array. I have to call a cipher algorithm on the array (cipher transposes 2D arrays). So, at first I put all the information from the file into a 2D array, but I had a lot of trouble with conflicting types in the rest of my code (specifically trying to set char[] to char*). So, I decided to switch to an array of pointers, which made everything a lot easier in most of my code.
But now I need to convert char* to char[] and back again, but I can't figure it out. I haven't been able to find anything on google. I'm starting to wonder if it's even possible.
It sounds like you're confused between pointers and arrays. Pointers and arrays (in this case char * and char []) are not the same thing.
An array char a[SIZE] says that the value at the location of a is an array of length SIZE
A pointer char *a; says that the value at the location of a is a pointer to a char. This can be combined with pointer arithmetic to behave like an array (eg, a[10] is 10 entries past wherever a points)
In memory, it looks like this (example taken from the FAQ):
char a[] = "hello"; // array
+---+---+---+---+---+---+
a: | h | e | l | l | o |\0 |
+---+---+---+---+---+---+
char *p = "world"; // pointer
+-----+ +---+---+---+---+---+---+
p: | *======> | w | o | r | l | d |\0 |
+-----+ +---+---+---+---+---+---+
It's easy to be confused about the difference between pointers and arrays, because in many cases, an array reference "decays" to a pointer to its first element. This means that in many cases (such as when passed to a function call) arrays become pointers. If you'd like to know more, this section of the C FAQ describes the differences in detail.
One major practical difference is that the compiler knows how long an array is. Using the examples above:
char a[] = "hello";
char *p = "world";
sizeof(a); // 6 - one byte for each character in the string,
// one for the '\0' terminator
sizeof(p); // whatever the size of the pointer is
// probably 4 or 8 on most machines (depending on whether it's a
// 32 or 64 bit machine)
Without seeing your code, it's hard to recommend the best course of action, but I suspect changing to use pointers everywhere will solve the problems you're currently having. Take note that now:
You will need to initialise memory wherever the arrays used to be. Eg, char a[10]; will become char *a = malloc(10 * sizeof(char));, followed by a check that a != NULL. Note that you don't actually need to say sizeof(char) in this case, because sizeof(char) is defined to be 1. I left it in for completeness.
Anywhere you previously had sizeof(a) for array length will need to be replaced by the length of the memory you allocated (if you're using strings, you could use strlen(), which counts up to the '\0').
You will need to make a corresponding call to free() for each call to malloc(). This tells the computer you are done using the memory you asked for with malloc(). If your pointer is a, just write free(a); at a point in the code where you know you no longer need whatever a points to.
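Taken together, the three points above might look like the following sketch. copy_string is a hypothetical helper, not something from the question's code.

```c
#include <stdlib.h>
#include <string.h>

/* Copies `src` into freshly allocated memory. The caller owns the result
   and must free() it. Returns NULL if the allocation fails. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);     /* sizeof(src) would only give the pointer size */
    char *a = malloc(len + 1);    /* +1 for the '\0' terminator */
    if (a == NULL) {
        return NULL;
    }
    memcpy(a, src, len + 1);      /* copies the terminator as well */
    return a;
}
```

Each successful call is paired with exactly one free() on the returned pointer once the copy is no longer needed.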
As another answer pointed out, if you want to get the address of the start of an array, you can use:
char* p = &a[0]
You can read this as "char pointer p becomes the address of element [0] of a".
If you have
char c[SIZE];
then you can do
char *d = &c[0];
and access element c[1] by doing *(d+1), etc.
You don't need to declare them as arrays if you want to use them as pointers. You can simply reference pointers as if they were multi-dimensional arrays. Just create it as a pointer to a pointer and use malloc:
int i;
int M=30, N=25;
int ** buf;
buf = (int**) malloc(M * sizeof(int*));
for(i=0;i<M;i++)
buf[i] = (int*) malloc(N * sizeof(int));
and then you can reference buf[3][5] or whatever.
None of the above worked for me except strtok
#include <string.h>
Then use strtok
char some[] = "some string";
char *p = strtok(some, "");
strtok is used to split strings. But you can see that I split it on nothing ""
Now you have a pointer.
Well, I'm not sure to understand your question...
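In C, char[] and char * can be used the same way in most expressions, because an array decays to a pointer to its first element (they are not the same type, though).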
Edit : thanks for this interesting link.

How are multi-dimensional arrays formatted in memory?

In C, I know I can dynamically allocate a two-dimensional array on the heap using the following code:
int** someNumbers = malloc(arrayRows*sizeof(int*));
for (i = 0; i < arrayRows; i++) {
someNumbers[i] = malloc(arrayColumns*sizeof(int));
}
Clearly, this actually creates a one-dimensional array of pointers to a bunch of separate one-dimensional arrays of integers, and "The System" can figure out what I mean when I ask for:
someNumbers[4][2];
But when I statically declare a 2D array, as in the following line...:
int someNumbers[ARRAY_ROWS][ARRAY_COLUMNS];
...does a similar structure get created on the stack, or is it of another form completely? (i.e. is it a 1D array of pointers? If not, what is it, and how do references to it get figured out?)
Also, when I said, "The System," what is actually responsible for figuring that out? The kernel? Or does the C compiler sort it out while compiling?
A static two-dimensional array looks like an array of arrays - it's just laid out contiguously in memory. Arrays are not the same thing as pointers, but because you can often use them pretty much interchangeably it can get confusing sometimes. The compiler keeps track properly, though, which makes everything line up nicely. You do have to be careful with static 2D arrays like you mention, since if you try to pass one to a function taking an int ** parameter, bad things are going to happen. Here's a quick example:
int array1[3][2] = {{0, 1}, {2, 3}, {4, 5}};
In memory looks like this:
0 1 2 3 4 5
exactly the same as:
int array2[6] = { 0, 1, 2, 3, 4, 5 };
But if you try to pass array1 to this function:
void function1(int **a);
you'll get a warning (and the app will fail to access the array correctly):
warning: passing argument 1 of ‘function1’ from incompatible pointer type
Because a 2D array is not the same as int **. The automatic decaying of an array into a pointer only goes "one level deep" so to speak. You need to declare the function as:
void function2(int a[][2]);
or
void function2(int a[3][2]);
To make everything happy.
This same concept extends to n-dimensional arrays. Taking advantage of this kind of funny business in your application generally only makes it harder to understand, though. So be careful out there.
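As a sketch of a function declared like function2, the following walks an array whose rows are 2 ints wide. sum_pairs and its rows parameter are hypothetical; the row count has to be passed in because it is not part of the parameter's type.

```c
#include <stddef.h>

/* Adds up every element of an array with `rows` rows of 2 ints.
   The parameter is adjusted to `int (*a)[2]`. */
int sum_pairs(int a[][2], size_t rows)
{
    int total = 0;
    for (size_t i = 0; i < rows; i++) {
        total += a[i][0] + a[i][1];
    }
    return total;
}
```

With array1 from above, the call would be sum_pairs(array1, 3).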
The answer is based on the idea that C doesn't really have 2D arrays - it has arrays-of-arrays. When you declare this:
int someNumbers[4][2];
You are asking for someNumbers to be an array of 4 elements, where each element of that array is of type int [2] (which is itself an array of 2 ints).
The other part of the puzzle is that arrays are always laid out contiguously in memory. If you ask for:
sometype_t array[4];
then that will always look like this:
| sometype_t | sometype_t | sometype_t | sometype_t |
(4 sometype_t objects laid out next to each other, with no spaces in between). So in your someNumbers array-of-arrays, it'll look like this:
| int [2] | int [2] | int [2] | int [2] |
And each int [2] element is itself an array, that looks like this:
| int | int |
So overall, you get this:
| int | int | int | int | int | int | int | int |
unsigned char MultiArray[5][2]={{0,1},{2,3},{4,5},{6,7},{8,9}};
in memory is equal to:
unsigned char SingleArray[10]={0,1,2,3,4,5,6,7,8,9};
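One way to check this claim is to compare the two objects byte by byte. same_layout is a hypothetical helper written for this illustration.

```c
#include <string.h>

/* Returns 1 when a 5x2 array and a 10-element array hold identical bytes. */
int same_layout(unsigned char multi[5][2], unsigned char single[10])
{
    return memcmp(multi, single, 10) == 0;
}
```

For the MultiArray and SingleArray definitions above, same_layout(MultiArray, SingleArray) should return 1.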
In answer to your "also" question: both, though the compiler is doing most of the heavy lifting.
In the case of statically allocated arrays, "The System" will be the compiler. It will reserve the memory like it would for any stack variable.
In the case of the malloc'd array, "The System" will be the implementation of malloc (the C library, which usually obtains the memory from the kernel). All the compiler will allocate is the base pointer.
The compiler always handles each type as what it is declared to be, except in the example Carl gave, where it can figure out interchangeable usage. This is why, if you pass a [][] to a function, it must assume that it is a statically allocated flat array, whereas ** is assumed to be a pointer to a pointer.
Suppose, we have a1 and a2 defined and initialized like below (c99):
int a1[2][2] = {{142,143}, {144,145}};
int **a2 = (int* []){ (int []){242,243}, (int []){244,245} };
a1 is a homogeneous 2D array with a plain contiguous layout in memory, and the expression (int*)a1 evaluates to a pointer to its first element:
a1 --> 142 143 144 145
a2 is initialized from a heterogeneous 2D array and is a pointer to a value of type int*, i.e. the dereference expression *a2 evaluates to a value of type int*. The memory layout does not have to be contiguous:
a2 --> p1 p2
...
p1 --> 242 243
...
p2 --> 244 245
Despite totally different memory layout and access semantics, C-language grammar for array-access expressions looks exactly the same for both homogeneous and heterogeneous 2D array:
expression a1[1][0] will fetch value 144 out of a1 array
expression a2[1][0] will fetch value 244 out of a2 array
The compiler knows that the access expression for a1 operates on type int[2][2], while the access expression for a2 operates on type int**. The generated assembly code follows the homogeneous or heterogeneous access semantics accordingly.
The code usually crashes at run time when an array of type int[N][M] is cast and then accessed as type int**, for example:
((int**)a1)[1][0] //crash on dereference of a value of type 'int'
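A small sketch of the two access paths, with hypothetical function names. The source expression is identical in both functions, but the parameter type decides what the compiler generates.

```c
/* True 2D array: the address of a1[1][0] is computed from the start of the
   array, then one int is loaded. */
int get_from_array(int a1[2][2])
{
    return a1[1][0];
}

/* Pointer to pointer: the row pointer a2[1] is loaded from memory first,
   and the int is then loaded through it. */
int get_from_pointers(int **a2)
{
    return a2[1][0];
}
```

Passing a1 to get_from_pointers would need a cast and would treat the stored int values as addresses, which is the crash described above.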
To access a particular row of a 2D array, consider the memory map for the array declared in the code below:
       [0]  [1]
a[0]    0    1
a[1]    2    3
To access each element, it's sufficient to pass the row you are interested in as the argument to the function, then use the column offset to access each element individually.
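#include <stdio.h>

int a[2][2] = {{0, 1}, {2, 3}};

/* Prints the two elements of the row that ptr points to. */
void f1(int *ptr)
{
    int first = ptr[0];
    int second = ptr[1];
    printf("%d\n", first);
    printf("%d\n", second);
}

int main(void)
{
    f1(a[0]);   /* a[0] decays to &a[0][0] */
    f1(a[1]);   /* a[1] decays to &a[1][0] */
    return 0;
}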

Resources