What is the result of NULL + int? - c

I have seen the following macro being used in OpenGL VBO implementations:
#define BUFFER_OFFSET(i) ((char *)NULL + (i))
//...
glNormalPointer(GL_FLOAT, 32, BUFFER_OFFSET(x));
Could you provide a little detail on how this macro works? Can it be replaced with a function?
More exactly, what is the result of incrementing a NULL pointer?

Let's take a trip back through the sordid history of OpenGL. Once upon a time, there was OpenGL 1.0. You used glBegin and glEnd to do drawing, and that was all. If you wanted fast drawing, you stuck things in a display list.
Then, somebody had the bright idea to be able to just take arrays of objects to render with. And thus was born OpenGL 1.1, which brought us such functions as glVertexPointer. You might notice that this function ends in the word "Pointer". That's because it takes pointers to actual memory, which will be accessed when one of the glDraw* suite of functions is called.
Fast-forward a few more years. Now, graphics cards have the ability to perform vertex T&L on their own (up until this point, fixed-function T&L was done by the CPU). The most efficient way to do that would be to put vertex data in GPU memory, but display lists are not ideal for that. Those are too hidden, and there's no way to know whether you'll get good performance with them. Enter buffer objects.
However, because the ARB had an absolute policy of making everything as backwards compatible as possible (no matter how silly it made the API look), they decided that the best way to implement this was to just use the same functions again. Only now, there's a global switch that changes glVertexPointer's behavior from "takes a pointer" to "takes a byte offset from a buffer object." That switch being whether or not a buffer object is bound to GL_ARRAY_BUFFER.
Of course, as far as C/C++ is concerned, the function still takes a pointer. And the rules of C/C++ do not allow you to pass an integer as a pointer. Not without a cast. Which is why macros like BUFFER_OFFSET exist. It's one way to convert your integer byte offset into a pointer.
The (char *)NULL part simply takes the NULL pointer (which is usually a void* in C and the literal 0 in C++) and turns it into a char*. The + i just does pointer arithmetic on the char*. Because the null pointer usually has a zero address, adding i to it yields a pointer whose value is the byte offset you passed in.
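For illustration, here is roughly what that looks like on a typical implementation (just a sketch; as explained below, the behavior is not actually guaranteed):
#include <stdio.h>
#define BUFFER_OFFSET(i) ((char *)NULL + (i))

int main(void)
{
    /* On a typical implementation the null pointer has address 0,
       so this prints a pointer whose value is the byte offset 48. */
    printf("%p\n", (void *)BUFFER_OFFSET(48));
    return 0;
}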
Of course, the C++ specification makes the result of BUFFER_OFFSET undefined behavior. By using it, you're really relying on the compiler to do something reasonable. After all, NULL does not have to be zero; all the specification says is that it is an implementation-defined null pointer constant. It doesn't have to have the value of zero at all. On most real systems, it will. But it doesn't have to.
That's why I just use a cast.
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, (void*)48);
It's not guaranteed behavior either way (int->ptr->int conversions are conditionally supported, not required). But it's also shorter than typing "BUFFER_OFFSET". GCC and Visual Studio seem to find it reasonable. And it doesn't rely on the value of the NULL macro.
Personally, if I were more C++ pedantic, I'd use a reinterpret_cast<void*> on it. But I'm not.
Or you can ditch the old API and use glVertexAttribFormat et al., which is better in every way.
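For reference, a sketch of that newer API (OpenGL 4.3 / ARB_vertex_attrib_binding); the buffer name vbo and the attribute/binding indices here are made up:
/* Byte offsets are passed as plain integers, so no pointer cast is ever needed. */
glBindVertexBuffer(0, vbo, 0, 32);                    /* binding 0: buffer object, start offset in bytes, stride */
glVertexAttribFormat(1, 4, GL_FLOAT, GL_FALSE, 12);   /* attribute 1: 4 floats, 12 bytes into each vertex */
glVertexAttribBinding(1, 0);                          /* attribute 1 sources its data from binding 0 */
glEnableVertexAttribArray(1);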

#define BUFFER_OFFSET(i) ((char *)NULL + (i))
Technically the result of this operation is undefined, and the macro is actually wrong. Let me explain:
C defines (and C++ follows it) that pointers can be cast to integers, namely of type uintptr_t, and that an integer obtained that way, cast back to the original pointer type it came from, yields the original pointer.
Then there's pointer arithmetic, which means that if I have two pointers pointing to the same object, I can take their difference, resulting in an integer (of type ptrdiff_t), and that this integer, added to or subtracted from either of the original pointers, yields the other. It is also defined that adding 1 to a pointer yields a pointer to the next element of an indexed object. Also, the difference of two uintptr_t values of pointers into the same object, divided by sizeof(type pointed to), must equal the difference of the pointers themselves. And last but not least, the uintptr_t values may be anything. They could be opaque handles as well. They're not required to be the addresses (though most implementations do it that way, because it makes sense).
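A small sketch of the round-trip and difference rules just described (variable names are illustrative):
#include <stdint.h>
#include <stddef.h>

void roundtrip_demo(void)
{
    int arr[8];
    int *p = &arr[2];
    uintptr_t n = (uintptr_t)p;   /* pointer converted to an integer */
    int *q = (int *)n;            /* converted back: q compares equal to p */
    ptrdiff_t d = &arr[5] - p;    /* difference of pointers into the same object: 3 */
    int *r = p + d;               /* adding the difference back yields &arr[5] */
    (void)q; (void)r;
}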
Now we can look at the infamous null pointer. C defines the pointer obtained by casting the uintptr_t value 0 as the invalid pointer. Note that this is always 0 in your source code. On the backend side, in the compiled program, the binary value actually used to represent it to the machine may be something entirely different! Usually it is not, but it may be. C++ is the same, but C++ doesn't allow for as much implicit casting as C, so one must cast 0 explicitly to void*. Also, because the null pointer does not refer to an object and therefore has no dereferenced size, pointer arithmetic is undefined for the null pointer. The null pointer referring to no object also means there is no definition for sensibly casting it to a typed pointer.
So if this is all undefined, why does this macro work after all? Because most implementations (i.e. compilers) are extremely gullible, and compiler writers are lazy to the highest degree. The integer value of a pointer in the majority of implementations is just the value of the pointer itself on the backend side. So the null pointer is actually 0. And although pointer arithmetic on the null pointer is not checked for, most compilers will silently accept it if the pointer has some type assigned, even if it makes no sense. char is the "unit sized" type of C, if you want to say so. So pointer arithmetic on char* is like arithmetic on the addresses on the backend side.
To make a long story short: it simply makes no sense to try doing pointer magic with the intended result being an offset on the C language side; it just doesn't work that way.
Let's step back for a moment and remember what we're actually trying to do: the original problem was that the gl…Pointer functions take a pointer as their data parameter, but for Vertex Buffer Objects we actually want to specify a byte-based offset into our data, which is a number. To the C compiler the function takes a pointer (an opaque thing, as we learned). The correct solution would have been the introduction of new functions especially for use with VBOs (say gl…Offset – I think I'm going to rally for their introduction). Instead, what was defined by OpenGL is an exploit of how compilers work. Pointers and their integer equivalent are implemented with the same binary representation by most compilers. So what we have to do is make the compiler call those gl…Pointer functions with our number instead of a pointer.
So technically the only thing we need to do is tell the compiler "yes, I know you think this variable a is an integer, and you are right, and that function glVertexPointer only takes a void* for its data parameter. But guess what: that integer was yielded from a void*", by casting it to (void*) and then holding thumbs that the compiler is actually so stupid as to pass the integer value as it is to glVertexPointer.
So this all comes down to somehow circumventing the old function signature. Casting the pointer is, IMHO, the dirty method. I'd do it a bit differently: I'd mess with the function signature:
typedef void (*TFPTR_VertexOffset)(GLint, GLenum, GLsizei, uintptr_t);     /* same layout as glVertexPointer, but taking an integer offset */
TFPTR_VertexOffset myglVertexOffset = (TFPTR_VertexOffset)glVertexPointer; /* reinterpret the function through the new signature */
Now you can use myglVertexOffset without doing any silly casts, and the offset parameter will be passed to the function without any danger that the compiler may mess with it. This is also the very method I use in my programs.
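A possible use of that function pointer might look like this (vbo and the vertex layout are made-up assumptions):
glBindBuffer(GL_ARRAY_BUFFER, vbo);    /* the switch: offsets now refer into this buffer object */
/* 3 floats per vertex, 32-byte stride, position data starting 0 bytes into the buffer */
myglVertexOffset(3, GL_FLOAT, 32, 0);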

OpenGL vertex attribute data is assigned through the same function (glVertexAttribPointer) as either pointers in memory or offsets located within a Vertex Buffer Object, depending on context.
The BUFFER_OFFSET() macro appears to convert an integer byte offset into a pointer simply to allow the compiler to pass it as a pointer argument safely.
The "(char*)NULL+i" expresses this conversion through pointer-arithmetic; the result should be the same bit-pattern assuming sizeof(char)==1, without which, this macro would fail.
It would also be possible through simple re-casting, but the macro might make it stylistically clearer what is being passed; it would also be a convenient place to trap overflows for 32/64-bit safety/future-proofing.
typedef unsigned char u8;   /* assuming the common 8-bit byte typedef */
struct MyVertex { float pos[3]; u8 color[4]; };
// general purpose Macro to find the byte offset of a structure member as an 'int'
#define OFFSET(TYPE, MEMBER) ( (int)&((TYPE*)0)->MEMBER)
// assuming a VBO holding an array of 'MyVertex',
// specify that color data is held at an offset 12 bytes from the VBO start, for every vertex.
glVertexAttribPointer(
colorIndex,4, GL_UNSIGNED_BYTE, GL_TRUE,
sizeof(MyVertex),
(GLvoid*) OFFSET(MyVertex, color) // recast offset as pointer
);

That's not "NULL+int", that's a "NULL cast to the type 'pointer to char'", and then increments that pointer by i.
And yes, that could be replaced by a function - but if you don't know what it does, then why do you care about that? First understand what it does, then consider if it would be better as a function.

Related

Pass a row of a multidimensional array as a parameter

My program in C has a 3D array defined as int origF[6][6][4]. I also have a function void displayPM(int tile[]), to which I pass origF[i][j] as an argument, which logically makes sense. It works as it should (when displayPM reads tile[k] it gets the value origF[i][j][k]). However, the compiler (it was Turbo C++ in VirtualBox) issues a warning "Suspicious pointer conversion" with the explanation: The compiler encountered some conversion of a pointer that caused the pointer to point to a different type. You should use a cast to suppress this warning if the conversion is proper.
Realising that, just like a one-dimensional array, a multidimensional array is just a pointer to the beginning of the data, what type is origF[i][j] then? As it's working correctly, it's still a pointer, and it points to origF[i][j][0], but with the wrong type? Or is it an issue with the compiler?
Realising that, just like a one-dimensional array, a multidimensional array is just a pointer to the beginning of the data
No. Not in the case of one- or multidimensional arrays. An array is a block of data. It is not a pointer. It does convert to a pointer to its first element when used in most other expressions (even origF[i] is in fact *(origF + i)), but it is not a pointer itself.
origF is an array. When you index into it, it gets converted to a pointer to an array for that purpose. It becomes int (*)[6][4]. You can create such pointers too.
int (*p)[6][4] = origF; // Here origF is decaying to a pointer to its first element.
And when that is dereferenced, you obtain an expression of an array type, int[6][4]. That also happens "recursively" for as many dimensions as one needs.
So, back to your example: you wanted to know what origF[i][j] is. It's an expression that has array type. The type is int[4]. When you pass it to a function, it is automatically converted to an int*.
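To make that concrete, a small sketch with illustrative names:
void displayPM(int tile[])     /* equivalent to: void displayPM(int *tile) */
{
    (void)tile;                /* tile[k] would read the k-th int of the row passed in */
}

int origF[6][6][4];

void caller(void)
{
    int (*p)[6][4] = origF;    /* origF decays to a pointer to its first element */
    displayPM(origF[1][2]);    /* origF[1][2] has type int[4] and decays to int*, matching the parameter */
    (void)p;
}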
That's how it works at the language level, something the compiler authors seem not to have been aware of at the time. There is no suspicious conversion going on. The only suspicion should be aimed at whoever programmed that warning. It actually suggests you add a cast (i.e. just silence it) and potentially break your program. Not that there is anything wrong with what you did, again, but that is very bad advice in general.
Anyway, since Turbo C++ is discontinued, you'd be far better off with a modern compiler. GCC and Clang are both free and open-source software, and they have a very high QoI (quality of implementation). You should give them a look.

How to properly store extra data in LSB pointer bits?

I have a GSList from GTK/glib 2, and those only store full pointers, and I really don't want extra allocations. How do I do bit twiddling hacks to store extra data in those pointers?
I figure I can't just take a pointer and do tagged_ptr = ptr | 1 (indeed, the compiler complains very loudly when I try). I'm not sure how to do this.
This would definitely be local to a single function, however, and the GSList (or the pointers) would not leak into the rest of the code.
To perform arithmetic on the numeric value of pointers (as opposed to pointer arithmetic, which is different and highly constrained), you need to cast back and forth to an appropriate integer type. If stdint.h defines UINTPTR_MAX, the appropriate type to use is uintptr_t. If not, there is no appropriate type, and your (nonportable) hack can't work on that particular implementation.
Note that you also have the problem that you're assuming pointers have low unused bits. If _Alignof(max_align_t) is greater than 1, this is probably a reasonable assumption, assuming the implementation follows the intent of the standard that the transformation to uintptr_t reflect the address model of the implementation (rather than being some arbitrary injection). But if not, you're out of luck.
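A minimal sketch of the technique under those assumptions (the pointed-to data must be aligned to at least 2 bytes so the low bit is free; names are illustrative):
#include <stdint.h>

void *tag_pointer(void *p, unsigned flag)                /* flag must be 0 or 1 */
{
    return (void *)((uintptr_t)p | (uintptr_t)flag);     /* stash the flag in the low bit */
}

void *untag_pointer(void *tagged)
{
    return (void *)((uintptr_t)tagged & ~(uintptr_t)1);  /* clear the low bit to recover the pointer */
}

unsigned get_tag(void *tagged)
{
    return (unsigned)((uintptr_t)tagged & 1);            /* read the stored bit */
}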

Can I assign any integer value to a pointer variable directly?

Since addresses are numbers and can be assigned to a pointer variable, can I assign any integer value to a pointer variable directly, like this:
int *pPtr = 60000;
You can, but unless you're developing for an embedded device with known memory addresses with a compiler that explicitly allows it, attempting to dereference such a pointer will invoke undefined behavior.
You should only assign the address of a variable or the result of a memory allocation function such as malloc, or NULL.
Yes you can.
You should only assign the address of a variable or the result of a memory allocation function such as malloc, or NULL.
According to the pointer conversion rules, e.g. as described in this online C standard draft, any integer may be converted to a pointer value:
6.3.2.3 Pointers
(5) An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
Allowing the conversion does not, however, mean that you are then allowed to dereference the pointer.
You can, but there are a lot of considerations.
1) What does that mean?
The only really useful abstraction where this actually gets used is when you need to access a specific memory location because something is mapped to a specific point, generally hardware control registers (less often: a specific area in flash or from the linker table); a sketch of that pattern follows at the end of this answer. The fact that you are assigning 60000 (a decimal number rather than a hexadecimal address or a symbolic mnemonic) makes me quite worried.
2) Do you have "odd" pointers?
Some microcontrollers have pointers with strange semantics (near vs far, tied to a specific memory page, etc.) You may have to do odd things to make the pointer make sense. In addition, some pointers can do strange things depending upon where they point. For example, the PIC32 series can point to the exact same data but with different upper bits that will retrieve a cached copy or an uncached copy.
3) Is that value the correct size for the pointer?
Different architectures need different sizes. The newer data types like intptr_t are meant to paper over this.
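For what it's worth, a hedged sketch of the memory-mapped-register pattern mentioned in point 1 (the address and the register bit are entirely made up; real values come from the device's datasheet):
#include <stdint.h>

#define STATUS_REG_ADDR 0x40021000u   /* hypothetical register address */

void wait_until_ready(void)
{
    volatile uint32_t *status_reg = (volatile uint32_t *)STATUS_REG_ADDR;
    while ((*status_reg & 1u) == 0) {
        /* busy-wait on a hypothetical "ready" bit */
    }
}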

What is the meaning of (data type*)0

I have gone through this sizeof operator equivalent:
size_t size = (size_t)(1 + ((X*)0));
But I was not able to understand the meaning of (int*)0 or (int*)1000.
What does it tell the compiler? And why is one added to them? Could you please elaborate?
(int *)0 means "treat 0 as the address of an integer". Adding one to this obtains the address of the "next" integer in memory. Converting the result back to an integer therefore gives you the size of an integer.
However, this relies on undefined behaviour, so you shouldn't use it.
This just creates a pointer that points to address 0.
Adding 1 to it performs pointer arithmetic on that pointer.
This has the effect of advancing the address by the size of the data type.
In the example, size will contain the size of class X, since the pointer is advanced by the size of class X and the initial pointer value is zero.
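A small demonstration of the idea, comparing the trick against the sizeof operator (formally this relies on undefined behavior, as the first answer notes; struct X here is just an example type):
#include <stdio.h>
#include <stddef.h>

struct X { double d; int i; };

int main(void)
{
    size_t size = (size_t)(1 + ((struct X *)0));   /* the trick from the question */
    printf("%zu %zu\n", size, sizeof(struct X));   /* typically prints the same value twice */
    return 0;
}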
understand what is the meaning of (int*)0 or (int*)1000.
Those are just casts, exactly the same as the second line here:
int a = 25;
short b = (short)a;
Casting a to short allows you to assign the value to b, which is typed as a short. It's the same thing with (int*)0 -- you're just telling the compiler to treat 0 as a pointer to an int.
The thing about casting is that you're essentially telling the compiler: "Look, I know what I'm doing here, so do what I tell you and stop complaining about types not matching." And that means that you really do need to know what you're doing, and what the effect in the final code is. Any code that includes an expression like (int*)1000 is likely to be a) highly suspect and b) very dependent on the compiler and the particulars of the platform that the code was written for. It might possibly make sense in some sort of embedded system where you know for darn sure what's going to be at memory location 1000 because you control the entire system. In general, you should avoid code like that.

Is casting an integer value to a void* an often used paradigm in callbacks?

Rather than sending an actual pointer to a value, the value is cast to a pointer. I found these examples in the GUI interface code of a GTK program.
g_signal_connect (pastebutton[pane],
"clicked",
G_CALLBACK(on_paste_button_pressed),
(void*)((long)pane<<4));
In the above example, I am referring to the last parameter of g_signal_connect. When on_paste_button_pressed is called by GTK2, on_paste_button_pressed casts the user_data void pointer back like so:
int pane = ((long)user_data) >> 4;
Actually, I added this particular example to the code, but I based it upon what was already there. I added the bit-shifting so as to avoid warnings about casting. The program itself has four panes containing quite a number of widgets, the copy and paste buttons allow you to copy all the values from one pane to another.
Is this way of casting a value to a pointer address often used, and are there reasons why this should not be used?
edit:
The cast from an integer to a void pointer can also be achieved like so:
void* void_ptr = some_int - NULL;
It is used. It is used quite commonly when it does what's required.
One reason not to use it is that theoretically pointer size might be smaller than the source integer size.
Another reason not to use it is that it allows you to pass only one piece of integer data to the callback. If in the future you'll need to add another piece of data (or switch to a non-integer), you'll have to locate and rewrite every single place where the callback makes access to the passed data. So, if there's a chance that you'd have to extend the data in the future, it is better to create a struct (even if it holds just a single int at this time) and pass a pointer to that struct.
But if you are sure that you'll never have to pass anything other than that single integer and that your integer fits into a void *, then this technique is not in any way broken.
P.S. Pedantically speaking, neither C nor C++ appears to have the round-trip guarantee for integer-to-void*-to-integer conversion, i.e. they don't guarantee that it will work and restore the original integral value.
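A minimal sketch of the struct-based alternative suggested above (type and function names are hypothetical; the struct must outlive the callback, e.g. be heap-allocated):
struct paste_ctx {
    int pane;    /* more fields can be added later without touching every call site */
};

/* the callback receives a pointer to the struct instead of a disguised integer */
static void handle_paste(void *button, void *user_data)
{
    struct paste_ctx *ctx = user_data;
    int pane = ctx->pane;
    (void)button; (void)pane;
}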
You should use macros GINT_TO_POINTER() and GPOINTER_TO_INT() to cast between pointers and integers.
glib: Type Conversion Macros
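For example, adapting the code from the question (GINT_TO_POINTER and GPOINTER_TO_INT are the documented GLib macros; the other names come from the question):
g_signal_connect(pastebutton[pane],
                 "clicked",
                 G_CALLBACK(on_paste_button_pressed),
                 GINT_TO_POINTER(pane));          /* int -> gpointer, no manual shifting */

/* ...and inside on_paste_button_pressed: */
int pane = GPOINTER_TO_INT(user_data);            /* gpointer -> int */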
Casting an integer to a pointer is used to pass a value by value. This is the preferred way if you do not need a by-reference parameter, because the compiler then does not need to dereference the pointer.
The bit-shifting is a bad idea because it can cause overflows.
For really portable code, you should use intptr_t as your integer type because it will fit nicely into a pointer.
It does see use in these kinds of cases, yes. It works on many platforms, but might fail because an arbitrary integer is not always a valid pointer value, though the shifting you do should get around that. It is also possible that a pointer cannot hold all the values that an integer can; this would be the case if you're using a 64-bit long on a platform where pointers are 32 bits, and of course since you are shifting the value, it could also fail even if pointers and integers are the same size. If you want to use this trick, you should probably check sizeof (void*) against the size of the integer type you use, and at runtime check against the actual value if the pointer isn't big enough. It's probably not worth it to make this fully portable so that you would use an actual pointer on platforms where that's needed, so either limit yourself to platforms where this trick works or abandon the trick altogether.
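If you do rely on the trick, a compile-time check along these lines (C11 _Static_assert; long is the integer type used in the question) catches the pointer-too-small case at build time:
/* fail the build if a long cannot be stored in a void* on this platform */
_Static_assert(sizeof(void *) >= sizeof(long),
               "pointers are too small to carry a long through a callback");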
I find it just fine. Don't have statistics about use frequency, but do you really care?
I'm not sure I understand how bit-shifting helps, though.
It is used but it's by no means portable so you have to be careful.
The C standard does not mandate that pointer types have at least as many bits as integer types, so you may not always be able to do it.
But I can't recall any platform in which pointers have actually been smaller than integers so you're probably safe (despite not being technically safe).
The only reason I can think of that the casting may be there is to remove the possibility of alignment warnings. Section 6.3.2.3 of the C1x draft states:
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
This is from the C-FAQ:
Q: How are integers converted to and from pointers? Can I temporarily stuff an integer into a pointer, or vice versa?
A: Once upon a time, it was guaranteed that a pointer could be converted to an integer (though one never knew whether an int or a long might be required), and that an integer could be converted to a pointer, and that a pointer remained unchanged when converted to a (large enough) integer and back again, and that the conversions (and any mapping) were intended to be "unsurprising to those who know the addressing structure of the machine." In other words, there is some precedent and support for integer/pointer conversions, but they have always been machine dependent, and hence nonportable. Explicit casts have always been required (though early compilers rarely complained if you left them out).
The ANSI/ISO C Standard, in order to ensure that C is widely implementable, has weakened those earlier guarantees. Pointer-to-integer and integer-to-pointer conversions are implementation-defined (see question 11.33), and there is no longer any guarantee that pointers can be converted to integers and back, without change.
Forcing pointers into integers, or integers into pointers, has never been good practice. When you need a generic slot that can hold either kind of data, a union is a much better idea.
See also questions 4.15, 5.18, and 19.25.
References: K&R1 Sec. A14.4 p. 210
K&R2 Sec. A6.6 p. 199
ISO Sec. 6.3.4
Rationale Sec. 3.3.4
H&S Sec. 6.2.3 p. 170, Sec. 6.2.7 pp. 171-2
