I've written the following bit of code that is producing a
warning: initialization makes pointer from integer without a cast
OR A
warning: cast to pointer from integer of different size
from gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
struct my_t {
unsigned int a : 1;
unsigned int b : 1;
};
struct my_t mine = {
.a = 1,
.b = 0
};
const void * bools[] = { "ItemA", mine->a, "ItemB", mine->b, 0, 0 };
int i;
for (i = 0; bools[i] != NULL; i += 2)
fprintf(stderr, "%s = %d\n", bools[i], (unsigned int) bools[i + 1] ? "true" : "false");
How do I get the warning to go away? No matter what I've tried casting, a warning always seems to appear.
Thanks,
Chenz
Hmm, why do you insist on using pointers as booleans? How about this alternative?
struct named_bool {
const char* name;
int val;
};
const struct named_bool bools[] = {{ "ItemA", 1 }, { "ItemB", 1 }, { 0, 0 }};
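A minimal sketch of how the struct-array version might be filled from the bit-fields in the question and then printed. The helper names print_named_bools and count_true are made up for illustration. The values are copied out of mine with the . operator, since a bit-field has no address to point at.

```c
#include <stdio.h>

struct my_t {
    unsigned int a : 1;
    unsigned int b : 1;
};

struct named_bool {
    const char *name;
    int val;
};

/* Hypothetical helper: print every entry up to the { NULL, 0 } sentinel. */
static void print_named_bools(FILE *out, const struct named_bool *items)
{
    for (; items->name != NULL; ++items)
        fprintf(out, "%s = %s\n", items->name, items->val ? "true" : "false");
}

/* Hypothetical helper: count the entries whose value is non-zero. */
static int count_true(const struct named_bool *items)
{
    int n = 0;
    for (; items->name != NULL; ++items)
        n += (items->val != 0);
    return n;
}
```

Inside a function, the array can be initialized straight from the bit-fields, for example { "ItemA", mine.a }, { "ItemB", mine.b }, { NULL, 0 }, and then passed to print_named_bools(stderr, bools).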
const void * bools[] = { "ItemA", mine->a, "ItemB", mine->b, 0, 0 };
There are several problems with this snippet:
mine isn't declared as a pointer type (at least not in the code you posted), so you shouldn't be using the -> component selection operator;
If you change that to use the . selection operator, you'd be attempting to store the boolean value in a or b as a pointer, which isn't what you want;
But that doesn't matter, since you cannot take the address of a bit-field (§ 6.5.3.2, paragraph 1).
If you're trying to associate a boolean value with another object, you'd be better off declaring a type like
struct checkedObject {void *objPtr; int check;};
and initialize an array (here given the placeholder name objs) as
struct checkedObject objs[] = {{"ItemA", 1}, {"ItemB", 0}, {NULL, 0}};
Bit-fields have their uses, but this isn't one of them. You're really not saving any space in this case, since at least one complete addressable unit of storage (byte, word, whatever) needs to be allocated to hold the two bitfields.
Two problems:
Not sure why you are trying to convert unsigned int a:1 to a void*. If you are trying to reference it, the syntax would be &mine->a rather than mine->a, but...
You can't create a pointer to a bit in C (at least as far as I know). If you're trying to create a pointer to a bit, you may want to consider one of the following options:
Create a pointer to the bitfield structure (i.e. struct my_t *), and (if necessary) use a separate number to indicate which bit to use. Example:
struct bit_ref {
struct my_t *bits;
unsigned int a_or_b; // 0 for bits->a, 1 for bits->b
};
Don't use a bit field. Use char for each flag, as it is the smallest data type that you can create a pointer to.
Do use a bit field, but implement it manually with boolean operations. Example:
typedef unsigned int my_t;
#define MY_T_A (1u << 0)
#define MY_T_B (1u << 1)
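struct bit_ref {
my_t *bits; // my_t is a typedef here, so no struct keyword
unsigned int shift;
};
int deref(const struct bit_ref bit_ref)
{
// dereference the pointer before masking out the selected bit
return !!(*bit_ref.bits & (1u << bit_ref.shift));
}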
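The third option can be sketched a little further with a setter next to the getter. This is only an illustration: bit_get, bit_set and the MY_T_*_SHIFT names are hypothetical and not part of the answer above.

```c
#include <stdbool.h>

typedef unsigned int my_t;
#define MY_T_A_SHIFT 0u   /* hypothetical names for the bit positions */
#define MY_T_B_SHIFT 1u

struct bit_ref {
    my_t *bits;           /* the word that holds the flags */
    unsigned int shift;   /* which bit inside that word */
};

/* Read the referenced bit. */
static bool bit_get(struct bit_ref r)
{
    return (*r.bits >> r.shift) & 1u;
}

/* Set or clear the referenced bit, leaving the other bits untouched. */
static void bit_set(struct bit_ref r, bool value)
{
    if (value)
        *r.bits |= (1u << r.shift);
    else
        *r.bits &= ~(1u << r.shift);
}
```

A struct bit_ref built as { &flags, MY_T_A_SHIFT } then behaves like a "pointer to a bit" that can be stored in an array or passed to a function.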
There are a few ways you could get rid of the warning, but still use a pointer value as a boolean. For example, you could do this:
const void * bools[] = { "ItemA", mine.a ? &bools : 0, "ItemB", mine.b ? &bools : 0, 0, 0 };
This uses a NULL pointer for false, and a non-null pointer (in this case, &bools, but a pointer to any object of the right storage duration would be fine) for true. You would also then remove the cast to unsigned int in the test, so that it is just:
fprintf(stderr, "%s = %s\n", (const char *) bools[i], bools[i + 1] ? "true" : "false");
(A null pointer always evaluates as false, and a non-null pointer as true).
However, I do agree that you are better off creating an array of structs instead.
Is there a way to cast a structure member during its initialization, for instance:
struct abc {
char k;
};
int main()
{
struct abc data[] = {.k= 'TI'};
}
The above wouldn't work since k is of type char. Is there a way to cast this member k (to int) when it is initialized with 'TI'?
You don't need a cast here.
struct abc {
char k;
};
int main()
{
struct abc data[] = {.k= 'TI'};
}
Your object data is an array of struct abc. The initializer is for a single object of type struct abc.
If you want data to be a 1-element array, you can do this:
struct abc data[] = {{.k= 'TI'}};
or, if you want to be more explicit:
struct abc data[] = {[0] = {.k = 'TI'}};
That's valid code, but it's likely to trigger a warning. 'TI' is a multi-character constant, an odd feature of C that in my experience is used by accident more often than it's used deliberately. Its value is implementation-defined, and it's of type int.
Using gcc on my system, its value is 21577, or 0x5449, which happens to be ('T' << 8) + 'I'. Since data[0].k is a single byte, it can't hold that value. There's an implicit conversion from int to char that determines the value that will be stored (in this case, on my system, 73, which happens to be 'I').
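The truncation described above can be sketched with a tiny helper. The name low_byte is hypothetical, and the sketch uses unsigned char on purpose: converting an out-of-range value to a signed char is implementation-defined, while conversion to an unsigned type is always reduction modulo UCHAR_MAX + 1. The literal 0x5449 stands in for 'TI', whose actual value depends on the compiler.

```c
/* Hypothetical helper: what an unsigned char keeps of an int.
   With 8-bit bytes the result is value modulo 256. */
static unsigned char low_byte(int value)
{
    return (unsigned char) value;
}
```

With 8-bit bytes, low_byte(0x5449) yields 0x49, the low-order byte, which matches the 73 reported in the answer.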
A cast (not a "type cast") converts a value from one type to another. It doesn't change the type of an object. k is of type char, and that's not going to change unless you modify its declaration. Maybe you want to have struct abc { int k; };?
I can't help more without knowing what you're trying to do. Why are you using a multi-character constant? Why is k of type char?
No matter what you do to the value, it doesn't change the fact that the field cannot store that much information. You will need to change the type of k.
'TI' is wrong; you probably mean the string "TI".
If by "typecast" you mean viewing the string "TI" as an integer, you need to use a union. This is called type punning.
#include <stdio.h>
typedef union
{
char k[2]; //I do not want to store null terminating character
short int x;
}union_t;
int main(void)
{
union_t u = {.k = "TI"};
printf("u.k[0]=%c (0x%x), u.k[1]=%c (0x%x) u.x = %hd (0x%hx)\n", u.k[0], u.k[0], u.k[1], u.k[1], u.x, u.x);
}
https://godbolt.org/z/EhbKG5qnW
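As an alternative to the union, the same reinterpretation is often written with memcpy. The sketch below is not the method from the answer above; the helper name two_chars_as_u16 is made up, and the resulting number depends on the byte order of the machine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret two characters as one 16-bit integer. */
static uint16_t two_chars_as_u16(const char chars[2])
{
    uint16_t x;
    memcpy(&x, chars, sizeof x);   /* copies exactly 2 bytes, no terminator */
    return x;
}
```

On a little-endian machine the characters 'T', 'I' come out with 'T' in the low byte; on a big-endian machine the two bytes are swapped.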
"...I was thinking if member K which of type char can be type caste to int ..."
As mentioned in the comments, a struct member's type is fixed by its declaration and cannot be changed with a cast.
But given your willingness to cast (even though that will not work), why not start with a type that naturally accommodates wide characters, i.e. wchar_t?
If this is acceptable for your work, then when working with wide characters in C, you can do the following:
#include <wchar.h>
struct abc {
wchar_t k[2]; // or k[3] if k will be used as a C string
};
...
//inside function somewhere:
//assignment can be done as follows:
struct abc buf = {.k[0] = L'T', .k[1] = L'I'};
// Or:
struct abc buf = {.k = L"TI"};//Note, no null terminator, so not C string
Note: regarding 2nd method, by definition, L"TI" contains three characters, T, I and \0. If k is to be used as a string, k must be defined with space for the null terminator: wchar_t k[3];.
Printing k[0] and k[1] as integers gives 84 and 73, the ASCII values for T and I respectively.
Note: compiled using GNU GCC, set to follow C99 rules
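A minimal sketch of the string variant with room for the terminator, so that the member can be handed to the wide-string functions from <wchar.h>. The helper make_ti is hypothetical and only exists to keep the example self-contained.

```c
#include <wchar.h>

struct abc {
    wchar_t k[3];   /* two characters plus the terminating L'\0' */
};

/* Hypothetical helper: build a struct abc holding the wide string L"TI". */
static struct abc make_ti(void)
{
    struct abc buf = { .k = L"TI" };
    return buf;
}
```

Because k has three elements here, wcslen(buf.k) is well-defined and returns 2.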
Consider the following code:
typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;
int f(void) {
blob_t a = {0};
*(byte_t *)a.bytes = (byte_t){10};
return a.bytes[0];
}
Does this give aliasing problems in the return statement? You do have that a.bytes dereferences a type that does not alias the assignment in patch, but on the other hand, the [0] part dereferences a type that does alias.
I can construct a slightly larger example where gcc -O1 -fstrict-aliasing does make the function return 0, and I'd like to know if this is a gcc bug, and if not, what I can do to avoid this problem (in my real-life example, the assignment happens in a separate function so that both functions look really innocent in isolation).
Here is a longer more complete example for testing:
#include <stdio.h>
typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;
static char *find(char *buf) {
for (int i = 0; i < 1; i++) { if (buf[0] == 0) { return buf; }}
return 0;
}
void patch(char *b) {
*(byte_t *) b = (byte_t) {10};
}
int main(void) {
blob_t a = {0};
char *b = find(a.bytes);
if (b) {
patch(b);
}
printf("%d\n", a.bytes[0]);
}
Building with gcc -O1 -fstrict-aliasing produces 0
The main issue here is that those two structs are not compatible types. And so there can be various problems with alignment and padding.
That issue aside, the standard 6.5/7 only allows for this (the "strict aliasing rule"):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
...
an aggregate or union type that includes one of the aforementioned types among its members
Looking at *(byte_t *)a.bytes, then a.bytes has the effective type char[10]. Each individual member of that array has in turn the effective type char. You de-reference that with byte_t, which is not a compatible struct type nor does it have a char[10] among its members. It does have char though.
The standard is not exactly clear how to treat an object whose effective type is an array. If you read the above part strictly, then your code does indeed violate strict aliasing, because you access a char[10] through a struct which doesn't have a char[10] member. I'd also be a bit concerned about the compiler padding either struct to meet alignment.
Generally, I'd simply advise against doing fishy things like this. If you need type punning, then use a union. And if you wish to use raw binary data, then use uint8_t instead of the potentially signed & non-portable char.
The error is in *(byte_t *)a.bytes = (byte_t){10};. The C spec has a special rule about character types (6.5§7), but that rule only applies when using character type to access any other type, not when using any type to access a character.
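One commonly suggested way to sidestep the question is to copy the bytes with memcpy instead of storing through a byte_t lvalue. The following is a sketch under that assumption, not a statement about whether the gcc behaviour is a bug. It reuses the names from the question and assumes byte_t fits into the buffer.

```c
#include <string.h>

typedef struct { char byte; } byte_t;
typedef struct { char bytes[10]; } blob_t;

_Static_assert(sizeof(byte_t) <= 10, "byte_t must fit in the buffer");

/* Copy the representation of a byte_t into the buffer; the buffer is
   only ever accessed as bytes, never through a byte_t lvalue. */
static void patch(char *b)
{
    const byte_t value = { 10 };
    memcpy(b, &value, sizeof value);
}

static int f(void)
{
    blob_t a = { { 0 } };
    patch(a.bytes);
    return a.bytes[0];
}
```

Compilers typically turn such a small fixed-size memcpy into a plain store, so the copy usually costs nothing at run time.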
According to the Standard, the syntax array[index] is shorthand for *((array)+(index)). Thus, p->array[index] is equivalent to *((p->array) + (index)), which uses the address of p to compute the address of p->array, and then without regard for p's type, adds index (scaled by the size of the array-element type), and then dereferences the resulting pointer to yield an lvalue of the array-element type. Nothing in the wording of the Standard would imply that an access via the resulting lvalue is an access to an lvalue of the underlying structure type. Thus, if the struct member is an array of character type, the constraints of N1570 6.5p7 would allow an lvalue of that form to access storage of any type.
The maintainers of some compilers such as gcc, however, appear to view the laxity of the Standard there as a defect. This can be demonstrated via the code:
struct s1 { char x[10]; };
struct s2 { char x[10]; };
union s1s2 { struct s1 v1; struct s2 v2; } u;
int read_s1_x(struct s1 *p, int i)
{
return p->x[i];
}
void set_s2_x(struct s2 *p, int i, int value)
{
p->x[i] = value;
}
__attribute__((noinline))
int test(void *p, int i)
{
if (read_s1_x(p, 0))
set_s2_x(p, i, 2);
return read_s1_x(p, 0);
}
#include <stdio.h>
int main(void)
{
u.v2.x[0] = 1;
int result = test(&u, 0);
printf("Result = %d / %d", result, u.v2.x[0]);
}
The code abides by the constraints in N1570 6.5p7 because all accesses to any portion of u are performed using lvalues of character type. Nonetheless, the code generated by gcc will not allow for the possibility that the storage accessed by ((struct s1 *)p)->x[0] might also be accessed by ((struct s2 *)p)->x[i], despite the fact that both accesses use lvalues of character type.
I have a list of variables char [][20] ls = {"var_1", "var_2", ... , ""}
which are the names of the fields of a struct struct {char var1[10], ...} my_struct;
The variables inside the struct are all char[] with changing lengths.
The list itself is const and should not change mid-run-time.
I want to access those variables in a loop in a somewhat generic way. Instead of calling myfunc(my_struct.var1); myfunc(my_struct.var2); and so on, I would much rather have:
for (char * p = ls[0]; *p; p += sizeof(ls[0]))
{
myfunc(my_struct.{some magic that would put var_1 / var_2 here});
}
But I guess this is impossible due to fact that the loop is executed in run-time, and the variable name needs to be available in compile-time.
Am I correct, or is there something that can be done here? (It doesn't have to be this way; I just want to know if I can pack this routine into a nice loop.)
Since all members are arrays of the same type, you can create an array of addresses to each member and loop through that:
char *my_struct_addrs[] = { my_struct.var1, my_struct.var2, ... };
int i;
for (i=0; i < sizeof(my_struct_addrs) / sizeof(my_struct_addrs[0]); i++) {
myfunc(my_struct_addrs[i]);
}
Since the size of each of these arrays is different however, you'll need to take care not to pass the bounds of each one. You can address this by keeping track of the size of each field and passing that to the function as well:
struct addr_list {
char *addr;
int len;
};
struct addr_list my_struct_addrs[] = {
{ my_struct.var1, sizeof(my_struct.var1) },
{ my_struct.var2, sizeof(my_struct.var2) },
...
};
int i;
for (i=0; i < sizeof(my_struct_addrs) / sizeof(my_struct_addrs[0]); i++) {
myfunc(my_struct_addrs[i].addr, my_struct_addrs[i].len);
}
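Filled in with a concrete struct, the address-and-length table could look like the sketch below. The struct settings, its field sizes, and the functions fill_field and fill_all are all hypothetical stand-ins for the question's my_struct and myfunc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct standing in for the one in the question. */
struct settings {
    char var1[10];
    char var2[20];
    char var3[5];
};

struct addr_list {
    char *addr;
    size_t len;
};

/* Hypothetical stand-in for myfunc: bounded, always-terminated copy. */
static void fill_field(char *dst, size_t len, const char *src)
{
    if (len == 0)
        return;
    size_t n = strlen(src);
    if (n > len - 1)
        n = len - 1;            /* truncate to what the field can hold */
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Visit every field through the table instead of naming each one. */
static void fill_all(struct settings *s, const char *src)
{
    const struct addr_list fields[] = {
        { s->var1, sizeof s->var1 },
        { s->var2, sizeof s->var2 },
        { s->var3, sizeof s->var3 },
    };
    for (size_t i = 0; i < sizeof fields / sizeof fields[0]; i++)
        fill_field(fields[i].addr, fields[i].len, src);
}
```

The table still has to list every member once, but after that any number of loops can reuse it.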
Assuming you have something like
const char* ls[] = {"var_1", "var_2", ""};
where this list is not tightly coupled to the struct data but is a separate item for whatever reason (if it is tightly coupled, you can use the answer by dbush),
then the slightly hacky, but well-defined version would be to use look-up tables. Create two lookup tables, one with strings, one with offsets:
#include <stddef.h>
typedef struct
{
int var_1;
int var_2;
} my_struct_t;
static const char* VAR_STRINGS[] =
{
"var_1",
"var_2",
""
};
static const size_t VAR_OFFSET[] =
{
offsetof(my_struct_t, var_1),
offsetof(my_struct_t, var_2),
};
Then do something like index = search_in_VAR_STRINGS_for(ls[i]); to get an index. (Loop through all items, or use binary search etc). The following code is then actually legal and well-defined:
unsigned char* ptr = (unsigned char*)&my_struct;
ptr += VAR_OFFSET[index];
int var_1 = *(int*)ptr;
This takes padding into account and the pointer arithmetic is guaranteed to be OK by C11 6.3.2.3/7:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
And since what's really stored at that address (effective type) is indeed an int, the variable access is guaranteed to be OK by C11 6.5/7 ("strict aliasing"):
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
But various error handling obviously needs to be in place to check that something doesn't go out of bounds.
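Putting the two tables and the lookup together, a sketch with a linear search might look like this. The function find_member is a hypothetical name, and the sketch assumes, as in the answer, that every listed member is an int.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int var_1;
    int var_2;
} my_struct_t;

static const char *const VAR_STRINGS[] = { "var_1", "var_2" };
static const size_t VAR_OFFSET[] = {
    offsetof(my_struct_t, var_1),
    offsetof(my_struct_t, var_2),
};

/* Hypothetical helper: return a pointer to the int member called name,
   or NULL when no such member is listed in the tables. */
static int *find_member(my_struct_t *s, const char *name)
{
    for (size_t i = 0; i < sizeof VAR_STRINGS / sizeof VAR_STRINGS[0]; i++) {
        if (strcmp(VAR_STRINGS[i], name) == 0)
            return (int *) ((unsigned char *) s + VAR_OFFSET[i]);
    }
    return NULL;
}
```

Returning NULL for an unknown name is one simple form of the error handling mentioned above; the caller has to check for it before dereferencing.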
Using the OpenJDK's hashCode, I tried to implement a generic hashing routine in C:
U32 hashObject(void *object_generic, U32 object_length) {
if (object_generic == NULL) return 0;
U8 *object = (U8*)object_generic;
U32 hash = 1;
for (U32 i = 0; i < object_length; ++i) {
// hash = 31 * hash + object[i]; // Original prime used in OpenJDK
hash = 92821 * hash + object[i]; // Better constant found here: https://stackoverflow.com/questions/1835976/what-is-a-sensible-prime-for-hashcode-calculation
}
return hash;
}
The idea is that I can pass a pointer to any C object (primitive type, struct, array, etc.) and the object will be uniquely hashed. However, since this is the first time I am doing something like this, I'd like to ask- Is this the right approach? Are there any pitfalls that I need to be aware of?
There are decidedly pitfalls. The below program using your function, for example, prints a different value for each equivalent object (and a different value every time it’s compiled) under gcc -O0:
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
struct foo {
char c;
int i;
};
static uint32_t hashObject(void const* object_generic, uint32_t object_length) {
if (object_generic == NULL) return 0;
uint8_t const* object = (uint8_t const*)object_generic;
uint32_t hash = 1;
for (uint32_t i = 0; i < object_length; ++i) {
hash = 92821 * hash + object[i];
}
return hash;
}
int main() {
struct foo a[2];
a[0].c = 'A';
a[0].i = 1;
a[1].c = 'A';
a[1].i = 1;
_Static_assert(
sizeof(struct foo) == offsetof(struct foo, i) + sizeof(int),
"struct has no end padding"
);
printf("%lu\n", (unsigned long) hashObject(&a[0], sizeof *a));
printf("%lu\n", (unsigned long) hashObject(&a[1], sizeof *a));
return EXIT_SUCCESS;
}
This happens because padding can contain anything.
In the comments you ask what would happen if you zero out the struct objects before using them.
It would not help. The hashes could still be different because padding bytes take unspecified values when a value is stored into a struct object or a member of a struct object (see the quotation below). The unspecified values may change on every store.
There is an additional problem with other types. Any scalar type (pointers, integers, and floating types) may have different representations of the same value. This is similar to the problem that struct types have with padding bytes, mentioned above. The bit representation of a scalar object may change even though its value did not, and the resulting hash will be different.
(Quoted from: ISO/IEC 9899:201x 6.2.6 Representation of types 6.2.6.1 General 6)
When a value is stored in an object of structure or union type, including in a member
object, the bytes of the object representation that correspond to any padding bytes take
unspecified values.
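One way around the padding problem is to hash each member separately, so that padding bytes never reach the hash. The sketch below does this for the struct foo used above; hash_bytes and hash_foo are hypothetical names. It still assumes that char and int have a single representation per value on the target platform, which the standard does not guarantee.

```c
#include <stddef.h>
#include <stdint.h>

struct foo {
    char c;
    int i;
};

/* Same mixing step as in the question, applied to a chosen byte range. */
static uint32_t hash_bytes(uint32_t hash, const void *data, size_t length)
{
    const unsigned char *bytes = data;
    for (size_t k = 0; k < length; ++k)
        hash = 92821u * hash + bytes[k];
    return hash;
}

/* Hypothetical per-type hash: feeds only the members, never the padding. */
static uint32_t hash_foo(const struct foo *f)
{
    uint32_t hash = 1;
    hash = hash_bytes(hash, &f->c, sizeof f->c);
    hash = hash_bytes(hash, &f->i, sizeof f->i);
    return hash;
}
```

The price is that the routine is no longer generic: every struct type needs its own small function that knows which members to feed in.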
No.
std::vector<int> v1 = {1, 2, 3, 4};
std::vector<int> v2 = {1, 2, 3, 4};
std::cout << "hash1=" << hashObject(&v1, sizeof(v1))
<< " hash2=" << hashObject(&v2, sizeof(v2)) << std::endl;
would report two different hash values, which is probably not the intended behaviour.
PS: the question is about C rather than C++, but a similar type (a struct that only holds a pointer to its data) can exist in C.
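A C analogue of the same pitfall, together with one possible remedy, can be sketched as follows. The struct int_list and the function hash_int_list are hypothetical. Hashing the struct's own bytes would hash the pointer value, so the sketch follows the pointer and hashes the element values instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of a vector: the struct holds only a pointer. */
struct int_list {
    size_t count;
    const int *items;
};

/* Hash the values the list refers to, not the bytes of the struct. */
static uint32_t hash_int_list(const struct int_list *list)
{
    uint32_t hash = 1;
    for (size_t i = 0; i < list->count; ++i)
        hash = 92821u * hash + (uint32_t) list->items[i];
    return hash;
}
```

Two lists with equal contents stored at different addresses then hash to the same value.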
I came across this simple program somewhere
#include<stdio.h>
#include<stdlib.h>
char buffer[2];
struct globals {
int value;
char type;
long tup;
};
#define G (*(struct globals*)&buffer)
int main ()
{
G.value = 233;
G.type = '*';
G.tup = 1234123;
printf("\nValue = %d\n",G.value);
printf("\ntype = %c\n",G.type);
printf("\ntup = %ld\n",G.tup);
return 0;
}
It's compiling (using gcc) and executing well and I get the following output:
Value = 233
type = *
tup = 1234123
I am not sure how the #define G statement works.
How is G defined as an object of type struct globals?
First, this code has undefined behavior, because it reinterprets a two-byte array as a much larger struct. Therefore, it writes past the end of the allocated space. You could fix the overflow by using the size of the struct to declare the buffer array (although a plain char array is still not guaranteed to be suitably aligned for the struct), like this:
struct globals {
int value;
char type;
long tup;
};
char buffer[sizeof(struct globals)];
The #define works in its usual way: by providing textual substitution of the token G, as if you ran a search-and-replace in your favorite text editor. The preprocessor, the first stage of the C compiler, finds every occurrence of G and replaces it with (*(struct globals*)&buffer).
Once the preprocessor is done, the compiler sees this code:
int main ()
{
(*(struct globals*)&buffer).value = 233;
(*(struct globals*)&buffer).type = '*';
(*(struct globals*)&buffer).tup = 1234123;
printf("\nValue = %d\n",(*(struct globals*)&buffer).value);
printf("\ntype = %c\n",(*(struct globals*)&buffer).type);
printf("\ntup = %ld\n",(*(struct globals*)&buffer).tup);
return 0;
}
The macro simply casts the address of the 2-character array buffer into a pointer to the appropriate structure type, then dereferences that to produce a struct-typed lvalue. That's why the dot (.) struct-access operator works on G.
No idea why anyone would do this. I would think it much cleaner to convert to/from the character array when that is needed (which is "never" in the example code, but presumably it's used somewhere in the larger original code base), or use a union to get rid of the macro.
union {
struct {
int value;
/* ... */
} s;
char c[2];
} G;
G.s.value = 233; /* and so on */
is both cleaner and clearer. Note that the char array is too small.
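A sketch of the union approach with the byte view sized from the struct, so that it can never be too small. The type name globals_storage and the helper globals_clear are hypothetical.

```c
#include <string.h>

struct globals {
    int value;
    char type;
    long tup;
};

/* The union is at least as large and as strictly aligned as its largest
   member, so the byte view always covers the whole struct. */
union globals_storage {
    struct globals s;
    unsigned char bytes[sizeof(struct globals)];
};

static union globals_storage G;

/* Hypothetical helper: zero every byte, padding included. */
static void globals_clear(union globals_storage *g)
{
    memset(g->bytes, 0, sizeof g->bytes);
}
```

Members are then reached as G.s.value, G.s.type and G.s.tup, and the raw bytes as G.bytes, without any cast or macro.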