I'm getting confused working with character string arrays. I'm trying to fill 2 arrays in a for loop. Within each array, all elements are the same.
To conserve memory, for array_person_name I attempt to simply copy the pointer to the string stored in person_name. For array_param, the string it points to is always 2 characters long (e.g. "bt") plus the null terminator, and here I also attempt to conserve memory by storing the pointer to "bt" in array_param.
Since the number of array elements, arraysize, is downloaded from a database when the program runs, I use malloc to allocate memory. Since my OS is 64 bit (Linux x86-64), I allocate 8 bytes for each of arraysize pointers. Although not shown, I free these two arrays at the end of the program.
int kk, arraysize;
char person_name[101] = "";
char * array_person_name;
char * array_param;
...
strncpy(person_name, "John Smith", 100);
arraysize = <this value is downloaded from database>;
...
array_person_name = malloc( 8 * arraysize ); /* 8 is for 64b OS */
array_param = malloc( 8 * arraysize );
for (kk = 0; kk < arraysize; kk++) {
array_person_name[kk] = &person_name;
array_param[kk] = &"bt";
}
/* test results by printing to screen */
printf("Here is array_person_name[0]: %s\n", array_person_name[0]);
printf("here is array_param[0]: %s\n", array_param[0]);
The compiler returns the warnings: warning: assignment makes integer from pointer without a cast on the two lines inside the for loop. Any idea what I'm doing wrong?
Since you want each item in array_person_name and array_param to be a pointer to person_name/"bt", you want a char **:
char **array_person_name;
array_person_name = malloc(arraysize * sizeof(*array_person_name));
for (int i=0; i<arraysize; i++)
array_person_name[i] = person_name;
You're assigning a pointer to the array person_name to array_person_name[kk], which is a single char. What you probably meant to do was declare array_person_name as a char **.
You shouldn't be assuming 8 bytes because it's 64 bit. You should leave that part to C and use sizeof() operator.
I have currently trouble understanding the following scenario:
I have a multidimensional array of strings and I want to address it using pointers only, but I always get a segmentation fault when using array notation on the pointer. This is just example code; I want to use the 3D array in a pthread, so I want to pass it in via a structure as a pointer, but it just doesn't work and I would like to know why. I thought pointers and arrays were functionally equivalent? Here is the sample code:
#include <stdio.h>
void func(unsigned char ***ptr);
int main() {
// Image of dimension 10 times 10
unsigned char image[10][10][3];
unsigned char ***ptr = image;
memcpy(image[0][0], "\120\200\12", 3);
// This works as expected
printf("Test: %s", image[0][0]);
func(image);
return 0;
}
void func(unsigned char ***ptr) {
// But here I get a Segmentation Fault but why??
printf("Ptr: %s", ptr[0][0]);
}
Thanks in advance for your help :)
I think maybe strdup confuses the issue. Pointers and arrays are not always equivalent. Let me try to demonstrate. I always avoid actual multi-dimension arrays, so I may make a mistake here, but:
int main()
{
char d3Array[10][10][4]; //creates a 400-byte contiguous memory area
char ***d3Pointer; //a pointer to a pointer to a pointer to a char.
int i,j;
d3Pointer = malloc(sizeof(char**) * 10);
for (i = 0; i < 10; ++i)
{
d3Pointer[i] = malloc(sizeof(char*) * 10);
for (j = 0; j < 10; ++j)
{
d3Pointer[i][j] = malloc(sizeof(char) * 4);
}
}
//this
d3Pointer[2][3][1] = 'a';
//is equivalent to this
char **d2Pointer = d3Pointer[2];
char *d1Pointer = d2Pointer[3];
d1Pointer[1] = 'a';
d3Array[2][3][1] = 'a';
//is equivalent to
((char *)d3Array)[(2 * 10 * 4) + (3 * 4) + (1)] = 'a';
}
Generally, I use the layered approach. If I want contiguous memory, I handle the math myself, like so:
char *psuedo3dArray = malloc(sizeof(char) * 10 * 10 * 4);
psuedo3dArray[(2 * 10 * 4) + (3 * 4) + (1)] = 'a';
Better yet, I use a collection library like uthash.
Note that properly encapsulating your data makes the actual code incredibly easy to read:
typedef unsigned char byte_t;
typedef struct
{
byte_t r;
byte_t g;
byte_t b;
}pixel_t;
typedef struct
{
int width;
int height;
pixel_t * pixelArray;
}screen_t;
pixel_t *getxyPixel(screen_t *pScreen, int x, int y)
{
return pScreen->pixelArray + (y*pScreen->width) + x;
}
int main()
{
screen_t myScreen;
myScreen.width = 1024;
myScreen.height = 768;
myScreen.pixelArray = (pixel_t*)malloc(sizeof(pixel_t) * myScreen.height * myScreen.width);
getxyPixel(&myScreen, 150, 120)->r = 255;
}
In C, you should allocate space for your 2D array one row at a time. Your definition of test declares a 10 by 10 array of char pointers, so you don't need to call malloc for it. But to store a string you need to allocate space for the string. Your call to strcpy would crash. Use strdup instead. One way to write your code is as follows.
char ***test = NULL;
char *ptr = NULL;
test = malloc(10 * sizeof(char **));
for (int i = 0; i < 10; i++) {
test[i] = malloc(10 * sizeof(char *));
}
test[0][0] = strdup("abc");
ptr = test[0][0];
printf("%s\n", ptr);
test[4][5] = strdup("efg");
ptr = test[4][5];
printf("%s\n", ptr);
Alternatively, if you want to keep your 10 by 10 definition, you could code it like this:
char *test[10][10];
char *ptr = NULL;
test[0][0] = strdup("abc");
ptr = test[0][0];
printf("%s\n", ptr);
test[4][5] = strdup("efg");
ptr = test[4][5];
printf("%s\n", ptr);
Your problem is that a char[10][10][3] is something very different from a char***: The first is an array of arrays of arrays, the latter is a pointer to a pointer to a pointer. The confusion arises because both can be dereferenced with the same syntax. So, here is a bit of an explanation:
The syntax a[b] is nothing but a shorthand for *(a + b): First you perform pointer arithmetic, then you dereference the resulting pointer.
But, how come you can use a[b] when a is an array instead of a pointer? Well, because...
Arrays decay into pointers to their first element: If you have an array declared like int array[10], saying array + 3 results in array decaying to a pointer of type int*.
But, how does that help to evaluate a[b]? Well, because...
Pointer arithmetic takes the size of the target into account: The expression array + 3 triggers a calculation along the lines of (size_t)array + 3*sizeof(*array). In our case, the pointer that results from the array-pointer-decay points to an int, which has a size of, say, 4 bytes. So, the pointer is incremented by 3*4 bytes. The result is a pointer that points to the fourth int in the array; the first three elements are skipped by the pointer arithmetic.
Note, that this works for arrays of any element type. Arrays can contain bytes, or integers, or floats, or structs, or other arrays. The pointer arithmetic is the same.
But, how does that help us with multidimensional arrays? Well, because...
Multidimensional arrays are just 1D arrays that happen to contain arrays as elements: When you declare an array with char image[256][512]; you are declaring a 1D array of 256 elements. These 256 elements are all arrays of 512 characters, each. Since the sizeof(char) == 1, the size of an element of the outer array is 512*sizeof(char) = 512, and, since we have 256 such arrays, the total size of image is 256*512. Now, I can declare a 3D array with char animation[24][256][512];...
So, going back to your example that uses
char image[10][10][3]
what happens when you say image[1][2][1] is this: The expression is equivalent to this one:
*(*(*(image + 1) + 2) + 1)
1. image, being of type char[10][10][3], decays into a pointer to its first element, which is of type char(*)[10][3]. The size of that element is 10*3*1 = 30 bytes.
2. image + 1: Pointer arithmetic is performed to add 1 to the resulting pointer, which increments it by 30 bytes.
3. *(image + 1): The pointer is dereferenced, we are now talking directly about the element, which is of type char[10][3].
4. This array again decays into a pointer to its first element, which is of type char(*)[3]. The size of the element is 3*1 = 3. This pointer points at the same byte in memory as the pointer that resulted from step 2. The only difference is that it has a different type!
5. *(image + 1) + 2: Pointer arithmetic is performed to add 2 to the resulting pointer, which increments it by 2*3 = 6 bytes. Together with the increment in step 2, we now have an offset of 36 bytes, total.
6. *(*(image + 1) + 2): The pointer is dereferenced, we are now talking directly about the element, which is of type char[3].
7. This array again decays into a pointer to its first element, which is of type char*. The size of the element is now just a single byte. Again, this pointer has the same value as the pointer resulting from step 5, but a different type.
8. *(*(image + 1) + 2) + 1: Pointer arithmetic again, adding 1*1 = 1 byte to the total offset, which increases to 37 bytes.
9. *(*(*(image + 1) + 2) + 1): The pointer is dereferenced the last time, we are now talking about the char at an offset of 37 bytes into the image.
So, what's the difference to a char***? When you dereference a char***, you do not get any array-pointer-decay. When you try to evaluate the expression pointers[1][2][1] with a variable declared as
char*** pointers;
the expression is again equivalent to:
*(*(*(pointers + 1) + 2) + 1)
1. pointers is a pointer, so no decay happens. Its type is char***, and it points to a value of type char**, which likely has a size of 8 bytes (assuming a 64 bit system).
2. pointers + 1: Pointer arithmetic is performed to add 1 to the pointer, which increments it by 1*8 = 8 bytes.
3. *(pointers + 1): The pointer is dereferenced, we are now talking about the pointer value that is found in memory at an offset of 8 bytes from where pointers points.
4. Further steps depend on what actually happens to be stored at pointers[1]. These steps do not involve any array-pointer-decay, and thus load pointers from memory instead.
You see, the difference between a char[10][10][3] and a char*** is profound. In the first case, the array-pointer-decay transforms the process into a pure offset computation into a multidimensional array. In the latter case, we repeatedly load pointers from memory when accessing elements; all we ever have are 1D arrays of pointers. And it's all down to the types of pointers!
I'm trying to create a global array to hold an arbitrary number of integers, in this case 2 ints. How is it possible that I can assign more numbers to it if I only allocated enough space for two integers?
int *globalarray;
int main(int argc, char *argv[]) {
int size = 2;
globalarray = malloc(size * sizeof(globalarray[0]));
// How is it possible to initialize this array pass
// the two location that I allocated.
for (size_t i = 0; i < 10; i++) {
globalarray[i] = i;
}
for (size_t i = 0; i < 10; i++) {
printf("%d ", globalarray[i]);
}
printf("%s\n", "");
int arrayLength = sizeof(*globalarray)/sizeof(globalarray[0]);
printf("Array Length: %d\n", arrayLength);
}
When I run this it gives me
0 1 2 3 4 5 6 7 8 9
Array Length: 1
So I wanted to know if someone could clarify this for me.
(1) Am I creating the global array correctly?
(2) Why is the array length 1? When I feel that it should be 2 since I malloced the pointer for 2.
And background info on why I want to know this is because I want to create a global array (shared array) so that threads can later access the array and change the values.
How is it possible to initialize this array pass the two location that I allocated.
Short answer: This is undefined behaviour and anything can happen, including the appearance that it worked.
Long answer: You can only initialize the memory you've allocated, it
doesn't matter that the variable is a global variable. C doesn't prevent you from
stepping out of bounds, but if you do, then you get undefined behaviour and anything can happen
(it can "work" but it also can crash immediately or it can crash later).
So if you know that you need 10 ints, then allocate memory for 10 int.
globalarray = malloc(10 * sizeof *globalarray);
if(globalarray == NULL)
{
// error handling
}
And if you later need more, let's say 15, then you can use realloc to increase
the memory allocation:
globalarray = malloc(10 * sizeof *globalarray);
if(globalarray == NULL)
{
// error handling
// do not continue
}
....
// needs more space
int *tmp = realloc(globalarray, 15 * sizeof *globalarray);
if(tmp == NULL)
{
// error handling
// globalarray still points to the previously allocated
// memory
// do not continue
}
globalarray = tmp;
Am I creating the global array correctly?
Yes and no. It is syntactically correct, but semantically it is not, because you are
allocating space for only 2 ints, but it's clear from the next lines that
you need 10 ints.
Why is the array length 1? When I feel that it should be 2 since I malloced the pointer for 2.
That's because
sizeof(*globalarray)/sizeof(globalarray[0]);
only works with arrays, not pointers. Note also that you are using it wrong in
two ways:
The correct formula is sizeof(globalarray) / sizeof(globalarray[0])
This only works for arrays, not pointers (see below)
We sometimes use the term array as a visual representation when we do stuff
like
int *arr = malloc(size * sizeof *arr)
but arr (and globalarray) are not arrays,
they are pointers. sizeof returns the amount in bytes that the
expression/variable needs. In your case *globalarray has type int and
globalarray[0] has also type int. So you are doing sizeof(int)/sizeof(int)
which is obviously 1.
Like I said, this only works for arrays, for example, this is correct
// note that arr here is an array, not a pointer
int arr[] = { 1, 2, 3, 4 };
size_t len = sizeof arr / sizeof arr[0]; // returns 4
but this is incorrect:
int *ptr = malloc(4 * sizeof *ptr);
size_t len = sizeof ptr / sizeof ptr[0]; // this is wrong
because sizeof ptr does not return the total number of allocated
bytes; it returns the number of bytes that a pointer needs to be stored in memory. When you are dealing with
pointers, you have to keep a separate variable that holds the size.
C does not prevent you from writing outside allocated memory. When coding in C it is of the utmost importance that you manage your memory properly.
For your second question, this is how you would want to allocate your buffer:
globalarray = malloc(sizeof(int) * size);
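The cast is not required in any standard version of C, because void * converts implicitly to any object pointer type. It is only needed if the code must also compile as C++: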
globalarray = (int*) malloc(sizeof(int) * size);
I ran into a problem with a program and I'm not sure how to solve it. I'm processing a file, and to do so I get its size with ftell and store it in M_size. After that I declare an array of N unsigned char pointers. The array is then used in two functions, a() and b().
...
unsigned long N = (M_size/ x);
int LstElemSize = M_size % x;
if(LstElemSize != 0){
N += 1;
}
unsigned char *ptr_M[N];
a(ptr_M);
b(ptr_M);
...
Function a() actually initializes each element of ptr_M in a for loop:
a(){
int i;
for(i = 0; i < N-1; i ++){
ptr_M[i] = malloc(sizeof(unsigned char) * x);
}
}
Function b() iterates then over each element and calculates stuff and at the end each element is freed.
My problem is that when I try to process a file of e.g. 1 GB, the array size will be around 4 000 000 and a segmentation fault occurs (on the line where I declare my array). If I calculated it correctly, that is 8 bytes (char pointer) times 4 000 000 = 32 MB. The server running the program has enough memory to hold the file, but I guess, as mentioned in Response 1, the stack space is not enough.
What can I do to solve my problem? Increase my stack space? Thanks!
The problem is that you're defining ptr_M in the stack, which has a small size limit. The heap does not have such a small size limit and is able to use more of your system's memory. You need to use malloc() to allocate ptr_M just like you allocate the subarrays. (Make sure to free it at some point too along with all those subarrays!)
unsigned char **ptr_M = malloc(sizeof(unsigned char*) * N);
Also, your a() has an off-by-one error. It ignores the last item of the array. Use this:
for(i = 0; i < N; i ++){
unsigned char *ptr_M[N] is a variable-length array of N pointers to unsigned char, which in your case lives on the stack. You should dynamically allocate the space for the array as well.
unsigned char **ptr_M = malloc(sizeof(unsigned char*) * N);
a(ptr_M);
b(ptr_M);
...
//After you free each element in ptr_M
free(ptr_M);
malloc allocates space from heap, not from stack. You may try increasing your heapsize looking at the compiler option. Check the upper limit of heapsize that is supported there.
I want to create an integer pointer p, allocate memory for a 10-element array, and then fill each element with the value of 5. Here's my code:
//Allocate memory for a 10-element integer array.
int array[10];
int *p = (int *)malloc( sizeof(array) );
//Fill each element with the value of 5.
int i = 0;
printf("Size of array: %d\n", sizeof(array));
while (i < sizeof(array)){
*p = 5;
printf("Current value of array: %p\n", *p);
*p += sizeof(int);
i += sizeof(int);
}
I've added some print statements around this code, but I'm not sure if it's actually filling each element with the value of 5.
So, is my code working correctly? Thanks for your time.
First:
*p += sizeof(int);
This takes the contents of what p points to and adds the size of an integer to it. That doesn't make much sense. What you probably want is just:
p++;
This makes p point to the next object.
But the problem is that p contains your only copy of the pointer to the first object. So if you change its value, you won't be able to access the memory anymore because you won't have a pointer to it. (So you should save a copy of the original value returned from malloc somewhere. If nothing else, you'll eventually need it to pass to free.)
while (i < sizeof(array)){
This doesn't make sense. You don't want to loop a number of times equal to the number of bytes the array occupies.
Lastly, you don't need the array for anything. Just remove it and use:
int *p = malloc(10 * sizeof(int));
For C, don't cast the return value of malloc. It's not needed and can mask other problems such as failing to include the correct headers. For the while loop, just keep track of the number of elements in a separate variable.
Here's a more idiomatic way of doing things:
/* Just allocate the array into your pointer */
int arraySize = 10;
int *p = malloc(sizeof(int) * arraySize);
printf("Size of array: %d\n", arraySize);
/* Use a for loop to iterate over the array */
int i;
for (i = 0; i < arraySize; ++i)
{
p[i] = 5;
printf("Value of index %d in the array: %d\n", i, p[i]);
}
Note that you need to keep track of your array size separately, either in a variable (as I have done) or a macro (#define statement) or just with the integer literal. Using the integer literal is error-prone, however, because if you need to change the array size later, you need to change more lines of code.
sizeof applied to an array returns the number of bytes the array occupies.
int *p = (int *)malloc( sizeof(array) );
If you call malloc, you must #include <stdlib.h>. Also, the cast is unnecessary and can hide dangerous bugs, especially when the malloc declaration is missing.
If you increment a pointer by one, you reach the next element of the pointer's type. Therefore, you should write the bottom part as:
for (int i = 0;i < sizeof(array) / sizeof(array[0]);i++){
*p = 5;
p++;
}
*p += sizeof(int);
should be
p += 1;
since the pointer is of type int *
also the array size should be calculated like this:
sizeof (array) / sizeof (array[0]);
and indeed, the array is not needed for your code.
Nope it isn't. The following code will however. You should read up on pointer arithmetic. p + 1 is the next integer (this is one of the reasons why pointers have types). Also remember if you change the value of p it will no longer point to the beginning of your memory.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define LEN 10
int main(void)
{
/* Allocate memory for a 10-element integer array. */
int array[LEN];
int i;
int *p;
int *tmp;
p = malloc(sizeof(array));
assert(p != NULL);
/* Fill each element with the value of 5. */
printf("Size of array: %d bytes\n", (int)sizeof(array));
for(i = 0, tmp = p; i < LEN; tmp++, i++) *tmp = 5;
for(i = 0, tmp = p; i < LEN; i++) printf("%d\n", tmp[i]);
free(p);
return EXIT_SUCCESS;
}
//Allocate memory for a 10-element integer array.
int array[10];
int *p = (int *)malloc( sizeof(array) );
At this point you have allocated twice as much memory -- space for ten integers in the array allocated on the stack, and space for ten integers allocated on the heap. In a "real" program that needed to allocate space for ten integers and stack allocation wasn't the right thing to do, the allocation would be done like this:
int *p = malloc(10 * sizeof(int));
Note that there is no need to cast the return value from malloc(3). I expect you forgot to include the <stdlib.h> header, which would have properly prototyped the function, and given you the correct output. (Without the prototype in the header, the C compiler assumes the function would return an int, and the cast makes it treat it as a pointer instead. The cast hasn't been necessary for twenty years.)
Furthermore, be very wary of learning the habit sizeof(array). This will work in code where the array is declared in the same scope as the sizeof operator, but it will fail when used like this:
int foo(char bar[]) {
int length = sizeof(bar); /* BUG */
}
It'll look correct, but sizeof() will in fact see a char * instead of the full array. C's new Variable Length Array support is keen, but not to be mistaken for the arrays that know their size available in many other languages.
//Fill each element with the value of 5.
int i = 0;
printf("Size of array: %d\n", sizeof(array));
while (i < sizeof(array)){
*p = 5;
*p += sizeof(int);
Aha! Someone else who has the same trouble with C pointers that I did! I presume you used to write mostly assembly code and had to increment your pointers yourself? :) The compiler knows the type of objects that p points to (int *p), so it'll properly move the pointer by the correct number of bytes if you just write p++. If you swap your code to using long or long long or float or double or long double or struct very_long_integers, the compiler will always do the right thing with p++.
i += sizeof(int);
}
While that's not wrong, it would certainly be more idiomatic to re-write the last loop a little:
for (i=0; i<array_length; i++)
p[i] = 5;
Of course, you'll have to store the array length into a variable or #define it, but it's easier to do this than rely on a sometimes-finicky calculation of the array length.
Update
After reading the other (excellent) answers, I realize I forgot to mention that since p is your only reference to the array, it'd be best to not update p without storing a copy of its value somewhere. My little 'idiomatic' rewrite side-steps the issue but doesn't point out why using subscripting is more idiomatic than incrementing the pointer -- and this is one reason why subscripting is preferred. I also prefer subscripting because it is often far easier to reason about code where the base of an array doesn't change. (It Depends.)
//allocate an array of 10 elements on the stack
int array[10];
//allocate an array of 10 elements on the heap. p points at them
int *p = (int *)malloc( sizeof(array) );
// i equals 0
int i = 0;
//while i is less than 40
while (i < sizeof(array)){
//the first element of the dynamic array is five
*p = 5;
// the first element of the dynamic array is nine!
*p += sizeof(int);
// increment i by 4
i += sizeof(int);
}
This sets the first element of the array to nine, 10 times. It looks like you want something more like:
//when you get something from malloc,
// make sure its type is "____ * const" so
// you don't accidentally lose it
int * const p = (int *)malloc( 10*sizeof(int) );
for (int i=0; i<10; ++i)
p[i] = 5;
A ___ * const prevents you from changing p, so that it will always point to the data that was allocated. This means free(p); will always work. If you change p, you can't release the memory, and you get a memory leak.
I'm having a bit of a problem with strcat and segmentation faults. The error is as follows:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x00007fff82049f1f in __strcat_chk ()
(gdb) where
#0 0x00007fff82049f1f in __strcat_chk ()
#1 0x0000000100000adf in bloom_operation (bloom=0x100100080, item=0x100000e11 "hello world", operation=1) at bloom_filter.c:81
#2 0x0000000100000c0e in bloom_insert (bloom=0x100100080, to_insert=0x100000e11 "hello world") at bloom_filter.c:99
#3 0x0000000100000ce5 in main () at test.c:6
bloom_operation is as follows:
int bloom_operation(bloom_filter_t *bloom, const char *item, int operation)
{
int i;
for(i = 0; i < bloom->number_of_hash_salts; i++)
{
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
strcat(temp, item);
strcat(temp, *bloom->hash_salts[i]);
switch(operation)
{
case BLOOM_INSERT:
bloom->data[hash(temp) % bloom->buckets] = 1;
break;
case BLOOM_EXISTS:
if(!bloom->data[hash(temp) % bloom->buckets]) return 0;
break;
}
}
return 1;
}
The line with trouble is the second strcat. The bloom->hash_salts are part of a struct defined as follows:
typedef unsigned const char *hash_function_salt[33];
typedef struct {
size_t buckets;
size_t number_of_hash_salts;
int bytes_per_bucket;
unsigned char *data;
hash_function_salt *hash_salts;
} bloom_filter_t;
And they are initialized here:
bloom_filter_t* bloom_filter_create(size_t buckets, size_t number_of_hash_salts, ...)
{
bloom_filter_t *bloom;
va_list args;
int i;
bloom = malloc(sizeof(bloom_filter_t));
if(bloom == NULL) return NULL;
// left out stuff here for brevity...
bloom->hash_salts = calloc(bloom->number_of_hash_salts, sizeof(hash_function_salt));
va_start(args, number_of_hash_salts);
for(i = 0; i < number_of_hash_salts; ++i)
bloom->hash_salts[i] = va_arg(args, hash_function_salt);
va_end(args);
// and here...
}
And bloom_filter_create is called as follows:
bloom_filter_create(100, 4, "3301cd0e145c34280951594b05a7f899", "0e7b1b108b3290906660cbcd0a3b3880", "8ad8664f1bb5d88711fd53471839d041", "7af95d27363c1b3bc8c4ccc5fcd20f32");
I'm doing something wrong but I'm really lost as to what. Thanks in advance,
Ben.
I see a couple of problems:
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
The sizeof(item) will only return 4 (or 8 on a 64-bit platform), because item is a pointer. You probably need to use strlen() for the actual length. Sizing a stack array with strlen() makes it a variable-length array, which requires C99 or later (gcc also accepts it as an extension in older modes).
The other problem is that the temp array is not initialized, so the first strcat may not write to the beginning of the array. It needs a NUL ('\0') in the first element before strcat is called.
It may already be in the code that was snipped out, but I didn't see that you initialized the number_of_hash_salts member in the structure.
You need to use strlen, not sizeof. item is passed in as a pointer, not an array.
The line:
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
will make temp 34 times the size of a pointer, plus 2 bytes. The size of item is the size of a pointer, and sizeof(bloom->hash_salts[i]) is currently 33 times the size of a pointer.
You need to use strlen for item, so you know the actual number of characters.
Second, bloom->hash_salts[i] is a hash_function_salt, which is an array of 33 pointers to char. It seems like hash_function_salt should instead be an array of 33 characters, since you want it to hold 33 characters, not 33 pointers. You should also remember that when you pass a string literal to bloom_filter_create, you're passing a pointer. That means that to initialize the hash_function_salt array we use memcpy or strcpy. memcpy is faster when we know the exact length (like here).
So we get:
typedef unsigned char hash_function_salt[33];
and in bloom_filter_create:
memcpy(bloom->hash_salts[i], va_arg(args, char*), sizeof(bloom->hash_salts[i]));
Going back to bloom_operation, we get:
char temp[strlen(item) + sizeof(bloom->hash_salts[i])];
strcpy(temp, item);
strcat(temp, bloom->hash_salts[i]);
We use strlen for item since it's a pointer, but sizeof for the hash_function_salt, which is a fixed size array of char. We don't need to add anything, because hash_function_salt already includes room for a NUL. We use strcpy first. strcat is for when you already have a NUL-terminated string (which we don't here). Note that we drop the *. That was a mistake following from your incorrect typedef.
Your array size calculation for temp uses sizeof(bloom->hash_salts[i]) (which is
just the size of the pointer), but then you dereference the pointer and try
to copy the entire string into temp.
First, as everyone has said, you've sized temp based on the sizes of two pointers, not the lengths of the strings. You've now fixed that, and report that the symptom has moved to the call to strlen().
This is showing a more subtle bug.
You've initialized the array bloom->hash_salts[] from pointers returned by va_arg(). Those pointers will have a limited lifetime. They may not even outlast the call to va_end(), but they almost certainly do not outlive the call to bloom_filter_create().
Later, in bloom_filter_operation(), they point to arbitrary places and you are doomed to some kind of interesting failure.
Edit: Resolving this requires that the pointers stored in the hash_salts array have sufficient lifetime. One way to deal with that is to allocate storage for them, copying them out of the varargs array, for example:
// fragment from bloom_filter_create()
bloom->hash_salts = calloc(bloom->number_of_hash_salts, sizeof(hash_function_salt));
va_start(args, number_of_hash_salts);
for(i = 0; i < number_of_hash_salts; ++i)
bloom->hash_salts[i] = strdup(va_arg(args, hash_function_salt));
va_end(args);
Later, you would need to loop over hash_salts and call free() on each element before freeing the array of pointers itself.
Another approach that would require more overhead to initialize, but less effort to free, would be to allocate the array of pointers along with enough space for all of the strings in a single allocation. Then copy the strings and fill in the pointers. It's a lot of code to get right for a very small advantage.
Are you sure that the hash_function_salt type is defined correctly? You may have too many *'s:
(gdb) ptype bloom
type = struct {
size_t buckets;
size_t number_of_hash_salts;
int bytes_per_bucket;
unsigned char *data;
hash_function_salt *hash_salts;
} *
(gdb) ptype bloom->hash_salts
type = const unsigned char *(*)[33]
(gdb) ptype bloom->hash_salts[0]
type = const unsigned char *[33]
(gdb) ptype *bloom->hash_salts[0]
type = const unsigned char *
(gdb)