I have a matrix struct:
typedef struct Matrix
{
float m[16];
} Matrix;
When I try to call this function:
memcpy(m->m, MultiplyMatrices(m, &translation).m, sizeof(m->m));
I get an error at compile time saying:
error: invalid use of non-lvalue array
MultiplyMatrices returns a Matrix.
I only get this error if I use gcc to compile the file into an object;
if I use g++ to compile the object I get no error.
I am not even sure what the error means, but I have a feeling it has to do with the array stored in the Matrix returned by MultiplyMatrices.
If you need to see more code let me know.
This code is from this tutorial: OpenGL Book Chapter 4
p.s. I would like to keep this code strict iso/ansi, if there is no other solution however, then I'll just have to deal with it.
EDIT: I ended up going with creating a temporary Matrix then copying the array.
Matrix tempMatrix;
...
tempMatrix = MultiplyMatrices(m, &translation);
memcpy(m->m, tempMatrix.m, sizeof(m->m));
The return value of MultiplyMatrices() is not an lvalue (like the return value of any function), which means that you can't take its address. Evaluating an array (including an array member of a structure) implicitly takes the address of the first element, so you can't do that.
You can, however, use simple assignment of the containing struct:
*m = MultiplyMatrices(m, &translation);
As long as your struct only contains the one element as you have shown, this is exactly the same.
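A minimal sketch of the struct-assignment approach, written to stay within strict ANSI C. The body of MultiplyMatrices here is a hypothetical stand-in (the tutorial's version may lay out its elements differently), and ApplyInPlace is a name introduced only for illustration.

```c
/* Hypothetical stand-in for the tutorial's type and function; the real
   MultiplyMatrices may order its elements differently. */
typedef struct Matrix {
    float m[16];
} Matrix;

/* Returns a * b for 4x4 matrices stored row by row (an assumption). */
Matrix MultiplyMatrices(const Matrix *a, const Matrix *b)
{
    Matrix r;
    int row, col, k;

    for (row = 0; row < 4; ++row) {
        for (col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (k = 0; k < 4; ++k)
                sum += a->m[row * 4 + k] * b->m[k * 4 + col];
            r.m[row * 4 + col] = sum;
        }
    }
    return r;
}

/* Replaces *m with (*m) * (*t) using plain struct assignment. */
void ApplyInPlace(Matrix *m, const Matrix *t)
{
    *m = MultiplyMatrices(m, t);  /* copies all 16 floats, no memcpy needed */
}
```

Because the whole struct is assigned, the array member of the returned temporary is never evaluated on its own, which is what triggered the error.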
If I have a multidimensional array like this:
int arr2d[2][3] = {{0,1,2}, {10,11,12}};
I can pass it to a function like this:
void foobar(int arg[][3])
This is not a call by value, this is call by reference, so just a pointer to the start address, but the compiler still knows it is a 2D array and I'm able to access it like one in the function.
Now how does the same work in a struct?
typedef struct {
int arr2d[][3];
} Foobar_t;
First this gives me: error: flexible array member in otherwise empty struct. I can fix this by doing so:
typedef struct {
int dummy;
int arr2d[][3];
} Foobar_t;
It will compile without errors or warnings. But when I try to use it like Foobar_t foobar = {1337, arr2d} I get some warnings:
missing braces around initializer
initialization makes integer from pointer without a cast
And when accessing it: subscripted value is neither array nor pointer nor vector.
One dimensional arrays can easily be treated as pointers. But for multi dimensional arrays the compiler needs to know the size of the different dimensions to calculate the offsets correctly. Is there a way without cast (int (*)[3]) and why does the syntax differ from the function parameter?
So this is the work-around I want to avoid:
#include <stdio.h>
static int testArr[2][3] = {{0,1,2},{10,11,12}};
typedef struct {
int *arr2d;
} Foobar_t;
int main( int argc, char** argv ) {
Foobar_t foobar = {(int*)testArr};
int (*arr2d)[3] = (int (*)[3]) foobar.arr2d;
printf("testStruct_0_0: %d\n", arr2d[0][0]);
printf("testStruct_1_0: %d\n", arr2d[1][0]);
return 1337;
}
Edit:
Some comments suggest that reference is not the correct word. Of course in the C language this is implemented by a pointer.
So the TLDR of this question is: what does the syntax for a pointer type to a multidimensional array look like?
The answer can already be seen in my work-around code. So that is all, move along, nothing to see here ;) Nevertheless thanks for the replies.
There is no "call by reference" in the C language. Function arguments are always passed by value. Arrays do appear special, since an array decays to a pointer to its first element in most expressions (including function calls). This means that when an array is used as an argument in a function call, a pointer to the first element is passed to the function instead of an array; but it is the value of this pointer which is passed.
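A small sketch of what "the pointer is passed by value" means in practice. The function name set_first is made up for this illustration.

```c
#include <stddef.h>

/* arr is a copy of the caller's pointer to the first element. */
void set_first(int arr[], int value)
{
    arr[0] = value;  /* writes through the pointer: visible to the caller */
    arr = NULL;      /* changes only the local copy of the pointer */
    (void)arr;       /* silence "set but not used" warnings */
}
```

After the call, the caller's array has a new first element, but the caller's array itself has not moved or become NULL, since only the copied pointer was reassigned.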
In function declarators, array types are adjusted to pointers to appropriate types. This is specific to the semantics of function declarators. Thus, a function declaration like:
void foobar(int arg[][3]);
is adjusted to take a pointer to an array of three ints as an argument:
void foobar(int (*arg)[3]);
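To illustrate the adjustment, the sketch below declares one function with both spellings; a compiler should accept the pair as compatible declarations of the same function. The name sum_rows and the rows parameter are assumptions added for the example.

```c
/* Both declarations name the same function type, so they are compatible. */
int sum_rows(int arg[][3], int rows);
int sum_rows(int (*arg)[3], int rows);

/* Adds up every element of a rows-by-3 array. */
int sum_rows(int (*arg)[3], int rows)
{
    int total = 0;
    int r, c;

    for (r = 0; r < rows; ++r)
        for (c = 0; c < 3; ++c)
            total += arg[r][c];
    return total;
}
```

Passing int arr2d[2][3] works without a cast because the array decays to a pointer to its first row, which has type int (*)[3].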
In general, a type expression such as int arg[][3] is an incomplete type, since it is impossible to know the size of the array arg[][] without more information.
Structures in C do not allow member types to be specified with incomplete types (with one exception), since there is no way to know the size of the struct without this information. Further, struct specifiers do not make the same adjustment to array types that function declarators do, since structs may actually include array members.
The exception to the incomplete type rule in structs is with flexible array members. The last member of a struct with at least two named members may have an incomplete array type.
The simple solution to the problem in the question is to change the specifier for the struct to use a pointer to an array. Note that here the member .arr2d is not an array, but a pointer to an array of three ints:
typedef struct {
int (*arr2d)[3];
} Foobar_t;
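A short sketch of using that struct with the array from the question, with no cast required. The helpers make_foobar and cell are hypothetical names used only here.

```c
typedef struct {
    int (*arr2d)[3];  /* pointer to an array of three ints (one row) */
} Foobar_t;

static int testArr[2][3] = {{0, 1, 2}, {10, 11, 12}};

/* testArr decays to int (*)[3], so the assignment needs no cast. */
Foobar_t make_foobar(void)
{
    Foobar_t f;
    f.arr2d = testArr;
    return f;
}

/* The row length is part of the pointer type, so [row][col] just works. */
int cell(const Foobar_t *f, int row, int col)
{
    return f->arr2d[row][col];
}
```

Note that the struct stores only a pointer; the number of rows is not recorded anywhere and has to be tracked separately.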
You could try using either int **arr2d, which lets you write arr2d[x][y] but requires a separate array of row pointers (an int[2][3] does not convert to int **), or simply flatten it into a one-dimensional array like int *arr2d = malloc(2*3*sizeof(int));.
That way, you access values like this: arr2d[x*m + y], where x and y are the same as in the previous example and m is the length of a row.
I'd also suggest storing both the row count and the column count of your 2-dimensional array in the struct.
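One possible shape for the flattened variant, keeping the dimensions next to the data. The type Grid and the grid_* functions are hypothetical names for this sketch.

```c
#include <stdlib.h>

typedef struct {
    size_t rows;
    size_t cols;
    int *data;   /* rows * cols ints, stored row by row */
} Grid;

/* Allocates a zero-filled grid. Returns 0 on success, -1 on failure. */
int grid_init(Grid *g, size_t rows, size_t cols)
{
    g->data = calloc(rows * cols, sizeof *g->data);
    if (g->data == NULL)
        return -1;
    g->rows = rows;
    g->cols = cols;
    return 0;
}

/* Address of element (x, y); the caller must keep x < rows and y < cols. */
int *grid_at(Grid *g, size_t x, size_t y)
{
    return &g->data[x * g->cols + y];
}

void grid_free(Grid *g)
{
    free(g->data);
    g->data = NULL;
}
```

Unlike the pointer-to-array member, this layout lets the row length vary at run time, at the cost of doing the index arithmetic by hand.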
Turns out just looking at the work-around showed me the solution:
typedef struct {
int (*arr2d)[3];
} Foobar_t;
This is the correct type for a pointer to the first row of a 2D array whose rows hold n elements. type (*name)[n] also works for function parameters.
So why is the other syntax type name[][n] still valid for function parameters? Probably the conflicts with the flexible array feature of structs keep it from working there.
I am currently studying for my C-Midterm and I encountered this declaration:
int **foo[][]()
When looking for the solution as to what this declaration means my tutors actually gave two different answers:
1) foo is an array of arrays of functions with return type pointer to pointer to an int
2) foo is an array of arrays of pointers to pointers to a function with return type int
I know the "start with the name of the variable, continue to the right until you reach the end or ')' then go back to your last starting point and continue to the left until you reach the start or '('" rule so I think 1) is the correct answer here but I am not entirely sure.
Thanks,
Ozelotl
It means nothing in particular: on the surface it looks like a C declaration, but it is not well-formed. It is illegal, and as such it has no meaning.
Firstly, it appears like a two-dimensional array declaration, but in C language an array declaration is required to specify all sizes except possibly for the very first one. Your declaration omits the second size as well, which makes it illegal.
Secondly, even if we ignore the missing sizes, it looks like a declaration for an array of functions. It is illegal to declare arrays of functions in C.
For example, this would make a legal C declaration
int (**foo[][42])()
but not what you have originally.
The syntax of this declaration is that foo[][] declares a 2-D array (or it would, if the second bound had a dimension specified - as it stands that's illegal); and then the rest of it is:
int **bar(); // with bar = foo[][]
which is a function taking unspecified arguments and returning int **. However, since bar is an array type here this attempts to declare an array of functions, which is illegal. (Not to be confused with an array of function pointers, which would be OK).
The grammar rules are that the ** bind to the int (not to the bar) unless you use parentheses to force them to bind to the bar; so they are part of the type int ** and they are not saying that bar is a pointer.
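For contrast, here is a sketch of a legal declaration in the same spirit: an array of arrays of pointers to functions returning int **. All names (table, get_value, call_entry) are invented for this example, and both array bounds are given explicitly.

```c
#include <stddef.h>

static int value = 7;
static int *value_ptr = &value;

/* A function returning pointer to pointer to int. */
static int **get_value(void)
{
    return &value_ptr;
}

/* table is an array of 2 arrays of 2 pointers to functions
   (taking no arguments) returning int **. */
static int **(*table[2][2])(void) = {
    { get_value, get_value },
    { get_value, get_value }
};

/* Calls the function stored at [i][j] and dereferences its result twice. */
int call_entry(size_t i, size_t j)
{
    return **table[i][j]();
}
```

The parentheses around *table[2][2] are what make the elements pointers to functions rather than functions; without them the ** would attach to the return type as described above.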
#include <stdio.h>
char A[];
int main()
{
printf("%c\n",A[1]);
return 0;
}
I can access any element using an index. It never gives an error. What is the index of the max element I can access on a 32-bit machine?
It has size 1. Accesses beyond index 0 (including your code, which accesses A[1]) have undefined behavior.
This is 6.9.2 in the C99 standard. char A[]; is a "tentative definition", which roughly speaking means that if the same translation unit contains a proper definition then it's just a declaration of A as an array of char of unknown size. If there's no proper definition then the object is defined anyway as if there were a definition at the end of the translation unit, with a default initializer.
When the declaration char A[]; appears at file scope, it declares an array. A definition of the array should appear somewhere else. If the definition does not appear in the same file (translation unit), then the behavior is as if a definition appeared with one initializer with value zero, as if you had written char A[] = { 0 };.
Code in which the declaration is visible may use the array. However, if the definition of the array is not visible, then the compiler does not know the size of the array. It is the responsibility of the author of the code to use only elements that are actually defined. They must know the size of the array by prior arrangement or by passing some information in the program.
If code uses an element of the array that does not exist, or even calculates an address of an element more than one beyond the end of the array, then the behavior is undefined.
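A sketch of the well-defined case, where the tentative definition is followed by a real definition in the same translation unit. The size 4 and the initializer are arbitrary choices for the example, and size_of_A is a hypothetical helper.

```c
#include <stddef.h>

char A[];            /* tentative definition: the size is not yet known */

char A[4] = "abc";   /* a later definition in the same file completes the type */

size_t size_of_A(void)
{
    return sizeof A; /* valid here because the type is now complete */
}
```

Here indexes 0 through 3 are valid. Without the second line, the array would behave as if it had a single zero element, and only A[0] could be accessed.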
A[1] is translated as *(A+1), a read from the address one element past the start of A, so printf prints whatever is at that memory location. I assume you can keep indexing as long as the address is mapped and you are permitted to read it (which gives you garbage), but the behavior is undefined.
Edit: GCC 4.6.3 gives warning: array ‘A’ assumed to have one element [enabled by default]
I can access any element using an index. It never gives an error. What is
the index of the max element I can access on a 32-bit machine?
There are a great many things that you can do in C which you nevertheless should not do. Accessing out of bounds elements of an array is usually one of those things.
When I compile your code using gcc, I get:
warning: array ‘A’ assumed to have one element
That should be enough to tell you that you should not access any element other than A[0].
It's just a pointer, there are no elements in the table. You should not try to index anything.
So, recently I had the unfortunate need to make a C extension for Ruby (because of performance). Since I was having problems understanding VALUE (and still do), I looked into the Ruby source and found: typedef unsigned long VALUE; (Link to Source, but you will notice that there are a few other 'ways' it's done, but I think it's essentially a long; correct me if I'm wrong). While investigating this further I found an interesting blog post, which says:
"...in some cases the VALUE object could BE the data instead of POINTING TO the data."
What confuses me is that, when I attempt to pass a string to C from Ruby, and use RSTRING_PTR(); on the VALUE (passed to the C-function from Ruby), and try to 'debug' it with strlen(); it returns 4. Always 4.
example code:
VALUE test(VALUE inp) {
unsigned char* c = RSTRING_PTR(inp);
//return rb_str_new2(c); //this returns some random gibberish
return INT2FIX(strlen(c));
}
This example always returns 1 as the string length:
VALUE test(VALUE inp) {
unsigned char* c = (unsigned char*) inp;
//return rb_str_new2(c); // Always "\x03" in Ruby.
return INT2FIX(strlen(c));
}
Sometimes in Ruby I see an exception saying "Can't convert Module to String" (or something along those lines; I was messing with the code so much trying to figure this out that I am unable to reproduce the error now). The error would happen when I tried StringValuePtr(); on inp [I'm a bit unclear what this does exactly. The documentation says it changes the passed parameter to char*]:
VALUE test(VALUE inp) {
StringValuePtr(inp);
return rb_str_new2((char*)inp); //Without the cast, I would get compiler warnings
}
So, the Ruby code in question is: MyMod::test("blahblablah")
EDIT: Fixed a few typos and updated the post a little.
The questions
What exactly does VALUE inp hold? A pointer to the object/value?
The value itself?
If it holds the value itself: when does it do that, and is there a way to check for it?
How do I actually access the value (since I seem to be accessing almost everything but
the value)?
P.S: My understanding of C isn't really the best, but it's a work in progress; also, read the comments in the code snippets for some additional description (if it helps).
Thanks!
Ruby Strings vs. C strings
Let's start with strings. Before trying to retrieve a string in C, it is a good habit to call StringValue(obj) on your VALUE. This ensures that you really are dealing with a Ruby string in the end, because if it is not already a string, it will be turned into one by coercing it with a call to that object's to_str method. This makes things safer and prevents the occasional segfault you might get otherwise.
The next thing to watch out for is that Ruby strings are not guaranteed to be \0-terminated, which your C code would need for things like strlen to work as expected. Ruby's strings carry their length information with them instead; that's why, in addition to RSTRING_PTR(str), there is also the RSTRING_LEN(str) macro to determine the actual length.
So what StringValuePtr does is return the char * without guaranteeing zero-termination. This is great for buffers where you have a separate length, but not what you want for e.g. strlen. Use StringValueCStr instead; it will modify the string to be zero-terminated so that it is safe to use with C functions that expect that. But try to avoid this wherever possible, because this modification is much less performant than retrieving the non-zero-terminated string, which does not have to be modified at all. If you keep an eye on this, it's surprising how rarely you actually need "real" C strings.
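The difference between a counted buffer and a C string can be shown without any Ruby headers. The sketch below uses a hypothetical counted_str type that only loosely mirrors what RSTRING_PTR and RSTRING_LEN expose; it is not Ruby's actual representation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: a pointer plus an explicit length. */
typedef struct {
    const char *ptr;
    size_t len;
} counted_str;

/* A 7-byte payload with an embedded zero byte. */
static const char payload[7] = {'a', 'b', 'c', '\0', 'd', 'e', 'f'};

counted_str make_payload(void)
{
    counted_str s;
    s.ptr = payload;
    s.len = sizeof payload;
    return s;
}

/* strlen stops at the first zero byte, so it can under-report the length. */
size_t c_length(const counted_str *s)
{
    return strlen(s->ptr);
}
```

For this payload the stored length is 7 while strlen sees only the first three bytes. With a buffer that contains no zero byte at all, strlen would read past the end, which is the situation the length macro avoids.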
self as an implicit VALUE argument
Another reason why your current code doesn't work as expected is that every C function to be called by Ruby gets passed self as an implicit VALUE.
No arguments in Ruby ( e.g. obj.doit ) translates to
VALUE doit(VALUE self)
Fixed number of arguments (>0, e.g. obj.doit(a, b)) translates to
VALUE doit(VALUE self, VALUE a, VALUE b)
Var args in Ruby ( e.g. obj.doit(a, b=nil)) translates to
VALUE doit(int argc, VALUE *argv, VALUE self)
So what you were working on in your example is not the string passed to you by Ruby but actually the current value of self, that is, the object that was the receiver when you called that function. A correct definition for your example would be
static VALUE test(VALUE self, VALUE input)
I made it static to point out another rule that you should follow in your C extensions. Make your C functions public only if you intend to share them among several source files. Since that's almost never the case for functions that you attach to a Ruby class, you should declare them as static by default and only make them public if there is a good reason to do so.
What is VALUE and where does it come from?
Now to the harder part. If you dig down deeply into Ruby internals, then you will find the function rb_newobj in gc.c. Here you can see that any newly created Ruby object becomes a VALUE by being cast as one from something called the freelist. It's defined as:
#define freelist objspace->heap.freelist
You can imagine the objspace as a huge map that stores each and every object that is currently alive at a given point in time in your code. This is also where the garbage collector fulfills its duty, and the heap struct in particular is the place where new objects are born. The "freelist" of the heap is again declared as being an RVALUE *. This is the C-internal representation of the Ruby built-in types. An RVALUE is actually defined as follows:
typedef struct RVALUE {
union {
struct {
VALUE flags; /* always 0 for freed obj */
struct RVALUE *next;
} free;
struct RBasic basic;
struct RObject object;
struct RClass klass;
struct RFloat flonum;
struct RString string;
struct RArray array;
struct RRegexp regexp;
struct RHash hash;
struct RData data;
struct RTypedData typeddata;
struct RStruct rstruct;
struct RBignum bignum;
struct RFile file;
struct RNode node;
struct RMatch match;
struct RRational rational;
struct RComplex complex;
} as;
#ifdef GC_DEBUG
const char *file;
int line;
#endif
} RVALUE;
That is, basically a union of core data types that Ruby knows about. Missing something? Yes, Fixnums, Symbols, nil and boolean values are not included there. It's because these kinds of objects are directly represented using the unsigned long that a VALUE boils down to in the end. I think the design decision there was (besides being a cool idea) that dereferencing a pointer might be slightly less performant than the bit shifts that are currently needed when transforming the VALUE to what it actually represents. Essentially
obj = (VALUE)freelist;
says: give me whatever freelist currently points to and treat it as an unsigned long. This is safe because freelist is a pointer to an RVALUE, and a pointer can also be safely interpreted as unsigned long. This implies that every VALUE except those carrying Fixnums, Symbols, nil or Booleans is essentially a pointer to an RVALUE; the others are directly represented within the VALUE.
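The idea of one machine word that is either a pointer or an immediate value can be sketched in plain C. Everything below (my_value, MY_FIXNUM_FLAG and the my_* functions) is a hypothetical simplification: it tags small integers with a set low bit and handles only non-negative numbers, and it is not Ruby's real header code.

```c
#include <stdint.h>

typedef uintptr_t my_value;          /* hypothetical stand-in for VALUE */

#define MY_FIXNUM_FLAG ((my_value)1)

/* Encode a small non-negative integer directly in the word. */
my_value my_int2fix(long i)
{
    return ((my_value)i << 1) | MY_FIXNUM_FLAG;
}

/* A set low bit marks an immediate integer rather than a pointer. */
int my_is_fixnum(my_value v)
{
    return (v & MY_FIXNUM_FLAG) != 0;
}

/* Undo the shift to recover the integer (non-negative values only). */
long my_fix2int(my_value v)
{
    return (long)(v >> 1);
}

/* Objects are aligned, so their addresses normally have a zero low bit. */
my_value my_from_ptr(void *p)
{
    return (my_value)p;
}
```

The check in my_is_fixnum is a single bit test, which is the kind of cheap operation the answer contrasts with dereferencing a pointer.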
Your last question, how can you check for what a VALUE stands for? You can use the TYPE(x) macro to check whether a VALUE's type would be one of the "primitive" ones.
VALUE test(VALUE inp)
The first issue is here: inp is self (so, in your case, the module). If you want to refer to the first argument, you need to add a self argument before it (which makes me add -Wno-unused-parameter to my cflags, as self is never used in the case of module functions):
VALUE test(VALUE self, VALUE inp)
Your first example uses a module as a string, which certainly won't result in anything good. RSTRING_PTR lacks type checks, which is a good reason not to use it.
A VALUE is a reference to the Ruby object, but not directly a pointer to what it may contain (like a char* in the case of a string). You need to get that pointer using some macros or functions depending on each object. For a string, you want StringValuePtr (or StringValueCStr to ensure that the string is null-terminated) which returns the pointer (it doesn't change the content of your VALUE in any way).
strlen(StringValuePtr(thing));
RSTRING_LEN(thing); /* I assume strlen was just an example ;) */
The actual content of the VALUE is, in MRI and YARV at least, the object_id of the object (or at least, it is after a bitshift).
For your own objects, the VALUE will most likely contain a pointer to a C object which you can get using Data_Get_Struct:
my_type *thing = NULL;
Data_Get_Struct(rb_thing, my_type, thing);
This compiles in gcc with no errors or warnings, even with the -Wall option,
which I take to mean that array bounds are checked at run-time and hence the compiler can't detect the error:
#include<stdio.h>
int main()
{
int a[2][3][4];
a[1][2][100] = 4 ;
return 0;
}
However,
#include<stdio.h>
int main()
{
int a[2][3];
a[1][2][100] = 4 ;
return 0;
}
this generates an error while compiling as :
$ gcc sample.c -Wall
sample.c: In function ‘main’:
sample.c:7: error: subscripted value is neither array nor pointer
Why is this so? In both pieces of code a[1][2][100] is invalid. So how can the compiler detect this in code2 and not in code1?
Especially since every compiler flattens all multidimensional arrays into corresponding single-dimension arrays, how can the compiler be selectively aware of this flaw in the code?
Explanation or mention of some book or article where the proper explanation resides will be gratefully accepted :)
The difference is the types. C does no bounds checking but it does do (static) type checking. The type of a in your first example is int[2][3][4] but the type in the second example is int[2][3], so a[1][2] is an array of 4 ints in the first case and a plain int in the second.
The "flattening" that you refer to happens in code generation, which is (conceptually, at least) after type checking.
First, array bounds are not checked at runtime or at compile time. Be careful out there.
Second, your second case gives an error because you have a mismatch in array dimension - you're using three subscript operators ([]) on a 2D array. Just because the array happens to be laid out in memory as an array of arrays doesn't mean there is any actual type changing going on with the variable.
Array subscripting is described in the C standard section 6.5.2.1 Array subscripting.
Given
int a[2][3];
the compiler will determine that a[1][2] is of type int. Therefore, accessing element [100] of this is equivalent to:
int x;
x[100] = 4;
This would give you the same error about the subscripted value.
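One way to see the types the compiler works with is to ask for the size of each subscripting level. In this sketch the helper names are made up, and each result is expressed as a count of ints so it does not depend on sizeof(int).

```c
#include <stddef.h>

int a[2][3][4];

/* Each subscript strips one array dimension from the type. */
size_t ints_in_a(void)    { return sizeof a          / sizeof(int); } /* int[2][3][4] */
size_t ints_in_a0(void)   { return sizeof a[0]       / sizeof(int); } /* int[3][4]    */
size_t ints_in_a00(void)  { return sizeof a[0][0]    / sizeof(int); } /* int[4]       */
size_t ints_in_a000(void) { return sizeof a[0][0][0] / sizeof(int); } /* int          */
```

After three subscripts only an int is left, so a fourth subscript would be a type error, just as a third subscript is for int a[2][3]. The index values themselves never enter into this check.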