My C program has a 3D array defined as int origF[6][6][4]. I also have a function void displayPM(int tile[]), to which I pass origF[i][j] as an argument, which logically makes sense. It works as it should (when displayPM reads tile[k] it gets the value origF[i][j][k]). However, the compiler (Turbo C++ in VirtualBox) issues the warning "Suspicious pointer conversion" with the explanation: The compiler encountered some conversion of a pointer that caused the pointer to point to a different type. You should use a cast to suppress this warning if the conversion is proper.
Realising that, just like a one-dimensional array, a multi-dimensional array is just a pointer to the beginning of the data, what type is origF[i][j] then? As it works correctly, it is still a pointer, and it points to origF[i][j][0], but has the wrong type? Or is it an issue with the compiler?
Realising that, just like a one-dimensional array, a multi-dimensional array is just a pointer to the beginning of the data
No. Not in the case of one- or multi-dimensional arrays. An array is a block of data. It is not a pointer. It does convert to a pointer to its first element when used in most expressions (even origF[i] is in fact *(origF + i)), but it is not a pointer itself.
origF is an array. When you index into it, it gets converted to a pointer to an array for that purpose. It becomes int (*)[6][4]. You can create such pointers too.
int (*p)[6][4] = origF; // Here origF is decaying to a pointer to its first element.
And when that is dereferenced, you obtain an expression of an array type, int[6][4]. That also happens "recursively" for as many dimensions as one needs.
So back to your example, you wanted to know what origF [i][j] is. It's an expression that has array type. The type is int[4]. When you pass it to a function, it is automatically converted to an int*.
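A minimal sketch of those types, reusing the names from the question (the array contents here are placeholders):

#include <stdio.h>

/* tile has type int*, which is exactly what an int[4] argument decays to */
void displayPM(int tile[])
{
    for (int k = 0; k < 4; k++)
        printf("%d ", tile[k]);   /* reads origF[i][j][k] */
    printf("\n");
}

int main(void)
{
    int origF[6][6][4] = {{{0}}};
    displayPM(origF[1][2]);       /* origF[1][2] has type int[4]; it decays to int* */
    return 0;
}

A conforming compiler accepts this without any diagnostic.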
That's at the language level, something the compiler authors were apparently not aware of at the time. There is no suspicious conversion going on. The only suspicion should be aimed at whoever programmed that warning: it actually suggests you add a cast (i.e. just silence it) and potentially break your program. Not that there is anything wrong with what you did, again, but that is very bad advice in general.
Anyway, since Turbo C is discontinued, you'd be far better off with a modern compiler. GCC and Clang are both free and open-source software, and they have a very high quality of implementation. You should give them a look.
Related
I was going through the answers to this question, Why is it allowed to omit the first dimension, but not the other dimensions, when declaring a multi-dimensional array?, and some other questions as well, and I understood that we can omit the first dimension but the other dimensions must be specified.
To my surprise, the following code executed perfectly fine:
#include <stdio.h>

void f1(int (*p)[]) {
    // SOMETHING
}

int main()
{
    int a[3][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    f1(a);
}
However, if I use the pointer p to print some value like
printf("%d",p[0][1]);
It gives an error message saying:
error: invalid use of array with unspecified bounds
Why does C allow such a declaration? If it is certain that using the pointer is going to throw an error, then why not throw the error at the time of declaration itself? Why wait for the pointer's use to throw an error?
Is there any specific reason for allowing such a declaration?
The expression p[0][1] contains p[0]. p[0] is defined to be equivalent to *(p+0). This uses the addition of a pointer and an integer.
There is a rule that addition with a pointer requires a pointer to a complete object type. This is because calculating how many bytes the address must be changed by usually requires knowing the size of the objects being pointed to. It does not when adding zero, but there is no exception in the rule for this.
The type of p is “pointer to array of unknown number of int”. Because the number of elements in the array is unknown, the type is incomplete. So the addition is not allowed.
Interestingly, (*p)[1] is allowed, even though it refers to the same thing p[0][1] would. It is allowed because calculating *p does not require pointer addition. p points to an array, so we know where that array starts even though we do not know its size.
The need to know object size for pointer arithmetic is why the elements of an array must have a complete type (so, if those elements are themselves arrays, their length must be given, to make them complete, and so all inner dimensions of arrays must be known). But pointers are allowed to point to incomplete types, so, when a pointer points to an array, it is not required that the array dimension be known.
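A small sketch of the contrast, using the array from the question:

#include <stdio.h>

void f1(int (*p)[])
{
    /* printf("%d\n", p[0][1]);   would not compile: p[0] requires pointer
       arithmetic on the incomplete type int[] */
    printf("%d\n", (*p)[1]);      /* OK: *p involves no pointer arithmetic */
}

int main(void)
{
    int a[3][3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    f1(a);                        /* prints 2 */
    return 0;
}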
I understood that we can omit the first dimension but the other dimensions must be specified.
Yes. When you declare an array, the element type must be a complete type. This is a formal constraint specified in paragraph 6.7.6.2/1 of the standard, so conforming compilers must diagnose violations.
But so what? You're asking about the declaration of parameter p in ...
void f1(int (*p)[]){
... which has a pointer type. Specifically, it is a pointer to an array of an unknown number of ints. That array type omits only the first dimension, which, as you know, is allowed, though it makes the type an incomplete one. Pointers to incomplete types are allowed (and are themselves complete types), with type void * being the poster child. Furthermore, type int(*)[] is compatible with the type of the argument you are passing, which is int(*)[3] (not int[3][3]).
Why does C allow such a declaration?
Why shouldn't it? I mean, incomplete types are a bit weird, but they serve a useful purpose. The parameter declaration you present is consistent with C's requirements.
If it is certain that using the pointer is going to throw an error, then why not throw the error at the time of declaration itself? Why wait for the pointer's use to throw an error?
Because only some uses of the pointer are erroneous. You can convert it to an integer or to another pointer type, for instance, or assign it to a variable of compatible type. In fact, although C does not define the behavior of dereferencing pointers to other varieties of incomplete types, it does allow you to dereference pointers to incomplete array types.
Is there any specific reason for allowing such a declaration?
Consistency? Usefulness? Your question seems predicated on the belief that allowing it is inconsistent, useless, or both, but it is neither. C has a concept of incomplete types. It allows pointers to incomplete types, which is very important to the language, and it does not discriminate among incomplete types in this regard. C also makes no special-case rule against pointers to incomplete types being the types of function parameters.
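As a sketch of one such legitimate use: since int[] and int[3] are compatible types, a pointer of type int(*)[] may simply be assigned to one of type int(*)[3], after which ordinary indexing works. The dimension 3 is an assumption the caller must uphold:

#include <stdio.h>

void f1(int (*p)[])
{
    int (*q)[3] = p;           /* int(*)[] and int(*)[3] are compatible */
    printf("%d\n", q[0][1]);   /* now legal: q points to a complete type */
}

int main(void)
{
    int a[3][3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    f1(a);                     /* prints 2 */
    return 0;
}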
I wonder why it is not possible to return an array in C?
After all, an array is just a pointer backed by size info (to make sizeof work). At first I thought this was done to prevent me from returning an array defined on my stack, but nothing prevents me from returning a pointer to something on my stack (gcc warns me, but the code compiles). I can also return a string literal, which is a statically stored array of chars. By the way, on Linux it is stored in .rodata, and a const array is stored there as well (check it with objdump), so I can return such an array (casting it to a pointer) and it works; but AFAIK this is just implementation-specific (another OS/compiler may store the const array on the stack).
I have two ideas for how array returning could be implemented: just copy it as a value (as is done for structures; I can even return an array by wrapping it in a structure!) and create a pointer to it automatically, or allow the user to return a const array under the contract that such an array has static storage duration (as is done for string literals). Both ideas are trivial!
So, my question is: why did K&R not implement something like that?
Technically, you can return an array; you just can't do it "directly", but have to wrap it in a struct:
struct foo {
    int array[5];
};

struct foo returns_array(void) {
    return (struct foo){ .array = {2, 4, 6, 8, 10} };
}
Why C doesn't allow you to do it directly even though it has the ability is still a good question, though. It is probably related to the fact that it doesn't support whole-array assignments either:
void bar(int input[5]) {
    int temp[5];
    temp = input;   // <-- doesn't compile: arrays are not assignable
}
What makes it even stranger, of course, is that whole-array copying via argument passing is supported once the array is wrapped in a struct. If someone knows how to find the ANSI committee's deliberations on the matter, that would be interesting reading.
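A short sketch of that struct-wrapped, by-value behaviour (takes_array_by_value is a hypothetical name):

#include <stdio.h>

struct foo {
    int array[5];
};

/* The whole struct, array included, is copied at the call site,
   so modifying f here leaves the caller's copy untouched. */
void takes_array_by_value(struct foo f)
{
    f.array[0] = 99;
}

int main(void)
{
    struct foo x = { .array = {2, 4, 6, 8, 10} };
    takes_array_by_value(x);
    printf("%d\n", x.array[0]);   /* still prints 2 */
    return 0;
}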
However,
After all, array is just a pointer backed by size info (to make sizeof work).
This is not correct. An array has no explicit pointer and no stored size. The array is stored as the raw values, packed together; the size is only known inside the compiler and is never made explicit as run-time data in the program. The array decays to a pointer when you try to use it as one.
An array is not "just a pointer backed by size info".
An array is a block of contiguous elements of a certain type. There is no pointer.
Since an array is an object, a pointer can be formed which points to the array, or to one of the array's elements. But such a pointer is not part of the array and is not stored with the array. It would make as much sense to say "an int is just a pointer backed by a size of 1 int".
The size of an array is known by the compiler in the same way that the size of any object is known. If we have double d; then it is known that sizeof d is sizeof(double) because the compiler remembers that d is an object of type double.
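A quick sketch of that point: sizeof reports the whole array for an array object, but only the pointer size once the array has decayed:

#include <stdio.h>

int main(void)
{
    double d;
    double arr[10];
    double *p = arr;                 /* arr decays to &arr[0] here */
    printf("%zu\n", sizeof d);       /* sizeof(double) */
    printf("%zu\n", sizeof arr);     /* 10 * sizeof(double): the whole block */
    printf("%zu\n", sizeof p);       /* just the size of a pointer */
    return 0;
}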
nothing prevents me from returning pointer to something on my stack
The C standard prevents you from doing this (and using the returned pointer). If you write code that violates the standard then you are on your own.
And I also can return string literal
A string literal is an array of char. When you use an array in a return statement, it is converted to a pointer to the first element.
To enable arrays to be returned (and assigned) by value, the rule regarding conversion of array to pointer (sometimes called "decay") would have to be changed. This would be possible, but K&R decided to make the decay almost ubiquitous when designing C.
In fact it would be possible to have a language like C but without having the decay at all. Maybe in hindsight that would have saved a lot of confusion. However they just chose to implement C in the way that they did.
In K&R C, it was not possible to return structures by value either. Any copy operation involving anything other than a primitive type had to be done with memcpy or an equivalent iterative copy. This seems like a reasonable design decision given the hardware resources of the 1970s.
ANSI C added the possibility of returning structures by value; however, by then it would have been too late to change the decay rule even if they had wanted to, because it would break a lot of existing code that relies on the decay rule.
Because if, suddenly, a revision of the language allowed a function to return a complete array, that revision would have to deal with these situations too:
Allow assignment between arrays (because if a function returns an array, it's because it is going to be assigned to an array variable in the caller function)
Allow passing a complete array as value parameter (because the name of an array is no longer a pointer to its first element, as this would conflict with the first situation)
If these constructions are allowed, existing programs that pass the name of an array as an argument to a function, expecting the function to modify that array, will cease to work.
Also, existing programs that use the array's name as pointer to assign it to a pointer variable will cease to work.
So, while it's technically feasible, making arrays work as complete entities that can be assigned, returned, and so on would break a lot of existing programs.
Note that structs could be "upgraded" because there were no prior semantics in K&R C relating the name of a structure variable to a pointer to itself. Any function that had to take structures as arguments or return them had to use pointers to them.
The "reason" is that arrays decay to pointers in most expressions and things would "as wrong" as if you would want to allow for assignment of arrays. If you'd return an array from a function, you wouldn't be able to distinguish it from a normal pointer. If f() would be returning double[5], say, the initialization
double* A=f();
would be valid. A would take the address of a temporary object, something that in C only lives up to the end of the full expression in which the call to f appeared. A would then be a dangling pointer: a pointer that points to an address that is no longer valid.
To summarize: the initial decision to have arrays behave similarly to pointers in most contexts imposes that arrays can be neither assigned nor returned from functions.
If I am given a void pointer to an array of elements, is there a way in 'C' to find out what type of elements (i.e. data-type of elements) are stored in the array?
What could possibly happen if I typecast this void pointer to a random data-type and try to traverse the array?
Short answer: No, undefined behaviour.
Long answer: You have to cast the pointer into something that's appropriate. There are ways to figure it out, but only if you pass, along with the void pointer itself, information about the width of each element in the array.
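A sketch of that idiom, in the style of qsort or memcpy, where the element width travels alongside the void pointer (print_bytes is a hypothetical helper):

#include <stdio.h>

/* Byte-wise access through unsigned char* is always permitted,
   whatever the element type actually is. */
static void print_bytes(const void *data, size_t elem_size, size_t count)
{
    const unsigned char *bytes = data;
    for (size_t i = 0; i < elem_size * count; i++)
        printf("%02x ", bytes[i]);
    printf("\n");
}

int main(void)
{
    int a[3] = {1, 2, 3};
    print_bytes(a, sizeof a[0], 3);   /* the caller supplies the width */
    return 0;
}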
In the best case you get a GPF (general protection fault); in the worst case you'll execute some random code before the GPF. In C, a cast does nothing but "consider" a pointer to be of a certain type; you alone are responsible for ensuring the cast is valid.
You cannot possibly know the types in the array, as the pointer is "just a number" containing an address. Casting a pointer to a different type and dereferencing it is undefined behavior and may cause alignment problems, which can raise a CPU exception depending on your architecture.
There is no general way of doing so, but if you know something about what types of data might be in the variable, and those data types are distinctive in some way, you can examine the first bytes behind the pointer to try to make an educated guess (you may first want to examine the pointer itself to determine whether it has any alignment restrictions that would rule out certain data types). Outside of debugging (i.e. you know that something clobbered your pointer, but you're not sure what), there's no good reason to do this.
I have come across this sizeof operator equivalent:
size_t size = (size_t)(1 + ((X*)0));
But I could not understand the meaning of (int*)0 or (int*)1000. What do they tell the compiler? And why is one added to them? Could you please elaborate?
(int *)0 means "treat 0 as the address of an integer". Adding one to this obtains the address of the "next" integer in memory. Converting the result back to an integer therefore gives you the size of an integer.
However, this relies on undefined behaviour, so you shouldn't use it.
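For concreteness, here is a sketch of the trick next to the portable operator (struct X is a placeholder type; the hack is shown only to illustrate the mechanism, not for actual use):

#include <stdio.h>

struct X { double d; int i; };

/* Pretend an X lives at address 0; the "next" X then sits at address
   sizeof(struct X). Relies on undefined behaviour, as noted above. */
#define SIZEOF_HACK(T) ((size_t)(1 + (T *)0))

int main(void)
{
    printf("%zu\n", SIZEOF_HACK(struct X));  /* usually equals sizeof(struct X) */
    printf("%zu\n", sizeof(struct X));       /* the portable way */
    return 0;
}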
This just creates a pointer that points to address 0. Adding 1 to it performs pointer arithmetic, which advances the address by the size of the pointed-to type. In the example, size will therefore contain the size of X: the pointer is advanced by sizeof(X), and the initial pointer value was zero.
could not understand the meaning of (int*)0 or (int*)1000
Those are just casts, exactly the same as the second line here:
int a = 25;
short b = (short)a;
Casting a to short allows you to assign the value to b, which is typed as a short. It's the same thing with (int*)0 -- you're just telling the compiler to treat 0 as a pointer to an int.
The thing about casting is that you're essentially telling the compiler: "Look, I know what I'm doing here, so do what I tell you and stop complaining about types not matching." And that means that you really do need to know what you're doing, and what the effect in the final code is. Any code that includes an expression like (int*)1000 is likely to be a) highly suspect and b) very dependent on the compiler and the particulars of the platform that the code was written for. It might possibly make sense in some sort of embedded system where you know for darn sure what's going to be at memory location 1000 because you control the entire system. In general, you should avoid code like that.
I have seen the following macro being used in OpenGL VBO implementations:
#define BUFFER_OFFSET(i) ((char *)NULL + (i))
//...
glNormalPointer(GL_FLOAT, 32, BUFFER_OFFSET(x));
Could you provide a little detail on how this macro works? Can it be replaced with a function?
More exactly, what is the result of incrementing a NULL pointer?
Let's take a trip back through the sordid history of OpenGL. Once upon a time, there was OpenGL 1.0. You used glBegin and glEnd to do drawing, and that was all. If you wanted fast drawing, you stuck things in a display list.
Then, somebody had the bright idea to be able to just take arrays of objects to render with. And thus was born OpenGL 1.1, which brought us such functions as glVertexPointer. You might notice that this function ends in the word "Pointer". That's because it takes pointers to actual memory, which will be accessed when one of the glDraw* suite of functions is called.
Fast-forward a few more years. Now, graphics cards have the ability to perform vertex T&L on their own (up until this point, fixed-function T&L was done by the CPU). The most efficient way to do that would be to put vertex data in GPU memory, but display lists are not ideal for that. Those are too hidden, and there's no way to know whether you'll get good performance with them. Enter buffer objects.
However, because the ARB had an absolute policy of making everything as backwards compatible as possible (no matter how silly it made the API look), they decided that the best way to implement this was to just use the same functions again. Only now, there's a global switch that changes glVertexPointer's behavior from "takes a pointer" to "takes a byte offset from a buffer object." That switch being whether or not a buffer object is bound to GL_ARRAY_BUFFER.
Of course, as far as C/C++ is concerned, the function still takes a pointer. And the rules of C/C++ do not allow you to pass an integer as a pointer. Not without a cast. Which is why macros like BUFFER_OFFSET exist: they are one way to convert your integer byte offset into a pointer.
The (char *)NULL part simply takes the NULL pointer (which is usually a void* in C and the literal 0 in C++) and turns it into a char*. The + i just does pointer arithmetic on the char*. Because the null pointer usually has a zero address, adding i to it will increment the byte offset by i, thus generating a pointer whose value is the byte offset you passed in.
Of course, the C++ specification lists the results of BUFFER_OFFSET as undefined behavior. By using it, you're really relying on the compiler to do something reasonable. After all, NULL does not have to be zero; all the specification says is that it is an implementation-defined null pointer constant. It doesn't have to have the value of zero at all. On most real systems, it will. But it doesn't have to.
That's why I just use a cast.
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, (void*)48);
It's not guaranteed behavior either way (int->ptr->int conversions are conditionally supported, not required). But it's also shorter than typing "BUFFER_OFFSET". GCC and Visual Studio seem to find it reasonable. And it doesn't rely on the value of the NULL macro.
Personally, if I were more C++ pedantic, I'd use a reinterpret_cast<void*> on it. But I'm not.
Or you can ditch the old API and use glVertexAttribFormat et al., which is better in every way.
#define BUFFER_OFFSET(i) ((char *)NULL + (i))
Technically the result of this operation is undefined, and the macro is actually wrong. Let me explain:
C defines (and C++ follows it) that pointers can be cast to integers, namely of type uintptr_t, and that an integer obtained that way, when cast back into the original pointer type it came from, yields the original pointer.
Then there's pointer arithmetic: if I have two pointers pointing to the same object, I can take their difference, resulting in an integer (of type ptrdiff_t), and that integer, added to or subtracted from either of the original pointers, yields the other. It is also defined that adding 1 to a pointer yields the pointer to the next element of an indexed object. Also, the difference of two uintptr_t values obtained from pointers into the same object, divided by sizeof(type pointed to), must equal the result of subtracting the pointers themselves. And last but not least, the uintptr_t values may be anything; they could be opaque handles as well. They're not required to be the addresses (though most implementations do it that way, because it makes sense).
Now we can look at the infamous null pointer. C defines the pointer obtained by casting the constant 0 to a pointer type as the invalid pointer. Note that this is always 0 in your source code. On the backend side, in the compiled program, the binary value used to actually represent it to the machine may be something entirely different! Usually it is not, but it may be. C++ is the same, but C++ doesn't allow for as much implicit casting as C, so one must cast 0 explicitly to void*. Also, because the null pointer does not refer to an object and therefore has no element size, pointer arithmetic on the null pointer is undefined. The null pointer referring to no object also means there is no definition for sensibly casting it to a typed pointer.
So if this is all undefined, why does this macro work after all? Because most implementations (meaning compilers) are extremely gullible, and compiler coders are lazy to the highest degree. In the majority of implementations the integer value of a pointer is just the value of the pointer itself on the backend side, so the null pointer is actually 0. And although pointer arithmetic on the null pointer is not checked for, most compilers will silently accept it if the pointer has some type assigned, even if it makes no sense. char is the "unit-sized" type of C, if you want to put it that way. So pointer arithmetic on the char* cast amounts to arithmetic on the raw addresses on the backend side.
To make a long story short: it simply makes no sense to try doing pointer magic with the intended result being an offset on the C language side; it just doesn't work that way.
Let's step back for a moment and remember what we're actually trying to do: the original problem was that the gl…Pointer functions take a pointer as their data parameter, but for vertex buffer objects we actually want to specify a byte-based offset into our data, which is a number. To the C compiler the function takes a pointer (an opaque thing, as we learned). The correct solution would have been the introduction of new functions especially for use with VBOs (say gl…Offset; I think I'm going to rally for their introduction). Instead, what OpenGL defined is an exploit of how compilers work. Pointers and their integer equivalents are implemented with the same binary representation by most compilers. So what we have to do is make the compiler call those gl…Pointer functions with our number instead of a pointer.
So technically the only thing we need to do is tell the compiler "yes, I know you think this variable a is an integer, and you are right, and that function glVertexPointer only takes a void* for its data parameter. But guess what: that integer was yielded from a void*", by casting it to (void*), and then hold our thumbs that the compiler is actually so stupid as to pass the integer value as it is to glVertexPointer.
So this all comes down to somehow circumventing the old function signature. Casting the pointer is, IMHO, the dirty method. I'd do it a bit differently: I'd mess with the function signature:
/* Retype glVertexPointer so that its final parameter is an integer offset
   rather than a pointer. */
typedef void (*TFPTR_VertexOffset)(GLint, GLenum, GLsizei, uintptr_t);
TFPTR_VertexOffset myglVertexOffset = (TFPTR_VertexOffset)glVertexPointer;
Now you can use myglVertexOffset without doing any silly casts, and the offset parameter will be passed to the function without any danger of the compiler messing with it. This is also the very method I use in my programs.
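A hypothetical call through the retyped pointer might then look like this (the parameter values are placeholders, and a VBO is assumed to be bound to GL_ARRAY_BUFFER):

/* 3 floats per vertex, tightly packed, starting 48 bytes into the bound VBO */
myglVertexOffset(3, GL_FLOAT, 0, (uintptr_t)48);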
OpenGL vertex attribute data is assigned through the same function (glVertexAttribPointer) as either pointers in memory or offsets located within a vertex buffer object, depending on context.
The BUFFER_OFFSET() macro appears to convert an integer byte offset into a pointer simply to allow the compiler to pass it as a pointer argument safely.
The "(char*)NULL+i" expresses this conversion through pointer arithmetic: since sizeof(char) == 1, adding i advances the address by exactly i bytes, giving a result with the intended bit pattern.
It would also be possible through simple re-casting, but the macro might make it stylistically clearer what is being passed; it would also be a convenient place to trap overflows for 32/64-bit safety and future-proofing.
#include <stdint.h>

typedef struct { float pos[3]; uint8_t color[4]; } MyVertex;

// general-purpose macro to find the byte offset of a structure member as an 'int'
// (the standard offsetof() from <stddef.h> does the same job)
#define OFFSET(TYPE, MEMBER) ((int)&((TYPE *)0)->MEMBER)

// assuming a VBO holding an array of 'MyVertex',
// specify that color data is held at an offset 12 bytes from the VBO start, for every vertex
glVertexAttribPointer(
    colorIndex, 4, GL_UNSIGNED_BYTE, GL_TRUE,
    sizeof(MyVertex),
    (GLvoid*) OFFSET(MyVertex, color)  // recast offset as pointer
);
That's not "NULL+int", that's a "NULL cast to the type 'pointer to char'", and then increments that pointer by i.
And yes, that could be replaced by a function - but if you don't know what it does, then why do you care about that? First understand what it does, then consider if it would be better as a function.