Malloc Undefined Behavior - Losing data - c

So, I'm working with some memory-bound applications and I have:
1 - Two arrays of structs that simulate tables in a vertical database. One of them holds just keys (1.5M 32-bit integer keys) and the other holds integer keys and double payloads (150k tuples). Both of them are dynamically allocated
2 - An array of 2^15 64-bit unsigned integers
3 - An array of 2^10 32-bit unsigned integers
And I need to dynamically allocate an array of 32-bit integers whose size I will only know at runtime.
The problem is: I've been able to allocate this array using malloc, BUT when I initialize its values to zero, it overwrites the values of the 150k-tuple table. Which means I'm losing data. The worst thing that could happen to a database researcher.
Allocation of the "tables"
tamCustomer = countLines("customer.tbl");
c_customer = malloc(tamCustomer*sizeof(column_customer));
readCustomerColumn("customer.tbl", c_customer);
tamOrders = countLines("orders.tbl");
c_orders = malloc(tamOrders*sizeof(column_orders));
readOrdersColumn("orders.tbl", c_orders, sel);
Allocation of the problematic array
cht->tamHT = actualPopCounter;
cht->HT = malloc(sizeof(uint32_t)*cht->tamHT);
if (cht->HT == NULL)
printf("malloc failed\n");
for (int i=0; i<cht->tamHT; i++)
cht->HT[i] = 0;
So, after this point, half of the table c_customer gets lost, overwritten with zeros.
What can I do to avoid that?
EDIT: structs definitions:
/******** ARRAY OF STRUCTS COLUMN CUSTOMER *********/
typedef struct customer_c
{
unsigned int C_CUSTKEY;
float C_ACCTBAL;
} column_customer;
column_customer *c_customer;
/******** ARRAY OF STRUCTS COLUMN ORDERS ***********/
typedef struct orders_c
{
unsigned int O_CUSTKEY;
} column_orders;
column_orders *c_orders;
CHT definition:
typedef struct CHT
{
uint64_t bitmap[CHT_BMP_SIZE];
bucket OHT[CHT_OHT_SIZE];
bucket *HT;
uint32_t tamHT;
} CHT;
And that's pretty much the function where it occurs. This is not a small application and I've been so focused on this problem that I can't think properly right now (sorry).
inline void generateCHT(column_customer *c_customer, int tamCustomer, CHT * cht)
{
uint32_t ohtOcc=0;
uint32_t chtOcc=0;
uint32_t ohtOccBMP=0;
uint32_t chtOccBMP=0;
uint64_t actualPopCounter;
uint64_t oldPopCounter;
//Allocate CHT
cht->tamHT = 0;
//Initialize OHT and bitmap
for (int i=0; i<CHT_OHT_SIZE;i++)
{
cht->OHT[i]=0;
cht->bitmap[i]=0;
}
for (int i=0; i<tamCustomer; i++)
{
switch (chtInsertBitmap(c_customer[i].C_CUSTKEY, tamCustomer, cht))
{
case 0:
printf("ERROR: Something went wrong while inserting the key %u on the CHT\n", c_customer[i].C_CUSTKEY);
break;
case 1:
chtOccBMP++;
break;
case 2:
ohtOccBMP++;
break;
}
}
//count Population
actualPopCounter = 0;
for (int i=0; i<CHT_BMP_SIZE;i++)
{
oldPopCounter = popCount(cht->bitmap[i]>>32);
cht->bitmap[i] = cht->bitmap[i] | actualPopCounter;
actualPopCounter = actualPopCounter + oldPopCounter;
}
cht->tamHT = actualPopCounter;
cht->HT = malloc(sizeof(uint32_t)*cht->tamHT);
if (cht->HT == NULL)
printf("malloc failed\n");
for (int i=0; i<cht->tamHT; i++)
cht->HT[i] = 0;
for (int i=0; i<tamCustomer; i++)
{
if (chtInsertConciseTable(c_customer[i].C_CUSTKEY, cht, tamCustomer) == 0)
ohtOcc++;
else
chtOcc++;
}
printf("OHT has %d occupied buckets and %d on the bitmap \n", ohtOcc, ohtOccBMP);
printf("CHT has %d occupied buckets and %d on the bitmap \n", chtOcc, chtOccBMP);
}

You're possibly walking off the end of the cht->HT array you allocated. HT is declared as a pointer to bucket, but the allocation is sized with sizeof(uint32_t):
bucket *HT;
...
...
cht->HT = malloc(sizeof(uint32_t)*cht->tamHT);
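Try sizeof(bucket) instead. If bucket is wider than uint32_t, the buffer is too small for tamHT elements, and the zeroing loop writes past its end into whatever the heap placed next to it, which would explain the zeros showing up in c_customer.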

Related

How to cache part of the data in buffer/ array and have everything else stored in members of data structure in C

My pseudocode in C looks something like this. Part of the data is stored in a data structure, but I'm struggling to store another set of data (selected by an if condition) in a separate array that is not of fixed size. Any suggestion is appreciated.
typedef struct struct1 {
uint32 member1
} PACKED struct1_t
typedef struct struct2 {
struct1_t *member2
} PACKED struct2_t
uint32 curnt_cnt = 0;
for (i=0; i<some_number; i++){
if (cond) {
k = m;
struct2_t->member2[curnt_cnt].member1 = k; #I have no prob writing here
}
else {
k = n;
array[curnt_cnt] = k; ==> Is this even correct implementation?
# I want to store/ book-keep the values of k in an array throughout every iteration of for loop without overwriting the previous value
# Size of the array will not exceed "some_number (mentioned in for loop)" at any time
}
curnt_cnt++;
}
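Arrays in C have a fixed size, so when the size is only known at run time you must use a pointer and allocate the storage dynamically:
int *array = malloc(sizeof *array * some_number);
if (array == NULL) { /* handle the allocation failure */ }
and then in your code
else {
k = n;
array[curnt_cnt] = k;
}
will work. Call free(array) once you no longer need the values.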

Undefined behavior when deleting an element from dynamic array of structs

I have an n sized array of structs dynamically allocated, and each position of the array is an array too, with different sizes for each position (an array of arrays).
I created a function to delete a given array[index] but I'm facing some undefined behavior, for example:
If the array is of size 3 and I delete array[0], I can't access array[1]. This happens with other combinations of indexes too. The only way it works flawlessly is when I delete from end to start.
Here is the code I have:
Structures:
typedef struct point{
char id[5];
char type[5];
char color[10];
int x;
int y;
} Point;
typedef struct {
char lineID[5];
int nPoints;
Point *pt;
}railData;
typedef struct railway {
railData data;
}railway;
This is how the array was created:
headRail = (railway**)calloc(lineNum,sizeof(railway*));
And each Rail:
headRail[i] = (railway*)calloc(pointsNum,sizeof(railway));
These are the functions to delete a rail:
railway **delRail(railway **headRail, int j)
{
int nPts = 0;
if (!headRail)
{
puts(ERRORS[NULLPOINTER]);
return NULL;
}
// Number of rail points on jth rail
nPts = headRail[j]->data.nPoints;
// Free each rail point from jth rail
for (int i = 0; i < nPts; ++i)
{
free(headRail[j][i].data.pt);
}
// Free allocated memory for jth rail
free(headRail[j]);
return headRail;
}
And this is where I call the previous function:
railway **removeRail(railway **headRail)
{
char userID[20];
int index = 0;
// Quit if no rails
if (!headRail)
{
backToMenu("No rails available!");
return NULL;
}
// Get user input
getString("\nRail ID: ",userID,MINLEN,MAXLEN); // MINLEN = 2 MAXLEN = 4
// get index of the asked rail
getRailIndex(headRail,userID,&index);
if (index != NOTFOUND)
{
headRail = delRail(headRail, index);
// Update number of rails in the array (global var)
NUMOFRAILS--;
backToMenu("Rail deleted!\n");
}
else
backToMenu("Rail not found!");
return headRail;
}
So my question is: how can I modify my code so that when position i is eliminated, all other indexes are shifted left and the last position, which would be empty, is discarded (something like realloc, but for shrinking)?
Is what I'm asking doable without changing the array's structure?
When removing element i, memmove all the data from i+1 through the end of the array down to position i, and then realloc the array with its size decremented by 1.
Note that arrays in C do not track their size in any way, so you need to pass the size in separately.
Your data abstraction is strange. I would expect headRail[j][0].data.nPoints to store the number of points inside the headRail[j][0].data structure, yet you use it as the number of railway elements in row j, that is, as the bound for headRail[j][<index>]. I would advise rewriting the abstraction: have one "object" for the railway and another for handling two-dimensional arrays of railways with dynamic sizes in all directions.
Like:
railway **delRail(railway **headRail, int j)
{
...
// this is strange, it's equal to
// nPts = headRail[j][0].data.nPoints;
// dunno if you mean that,
// or if [j][0].data.nPoints refers to the size of
// headRail[j][0].data.pt or to the size of the whole array
size_t nPts = headRail[j]->data.nPoints;
for (size_t i = 0; i < nPts; ++i) {
free(headRail[j][i].data.pt);
}
free(headRail[j]);
// note that arrays in C do not know how many elements they hold
// so you typically pass that along the arguments, like
// railway **delRail(railway **headRail, size_t railcount, int j);
size_t headRailCount = lineNum; // some external knowledge of the size
memmove(&headRail[j], &headRail[j + 1], (headRailCount - j - 1) * sizeof(*headRail));
void *pnt = realloc(headRail, (headRailCount - 1) * sizeof(*headRail));
if (pnt == NULL) return NULL; // that would be strange
headRail = pnt; // note that the previous headRail is no longer valid
--lineNum; // decrement that object where you store the size of the array
return headRail;
}
What about some encapsulation and more structs instead of 2d array? 2d arrays are really a bit of pain for C, what about:
typedef struct {
// stores a single row of rail datas
struct railData_row_s {
// stores a pointer to an array of rail datas
railData *data;
// stores the count of how many datas of rails are stored here
size_t datacnt;
// stores a pointer to an array of rows of rail datas
} *raildatas;
// stores the size of the pointer of rows of rail datas
size_t raildatascnt;
} railway;
The count of mallocs will stay the same, but thinking about the data will get simpler, and each pointer that points to an array of data has its own size-tracking variable. An allocation might look like this:
railway *rail_new(size_t lineNum, size_t pointsNum) {
railway *r = calloc(1, sizeof(*r));
if (!r) { return NULL; }
// allocate the memory for rows of raildata
r->raildatascnt = lineNum;
r->raildatas = calloc(r->raildatascnt, sizeof(*r->raildatas));
if (!r->raildatas) { /* error handling */ free(r); abort(); }
// for each row of raildata
for (size_t i = 0; i < r->raildatascnt; ++i) {
struct railData_row_s * const row = &r->raildatas[i];
// allocate the memory for the column of raildata
// hah, looks similar to the above?
row->datacnt = pointsNum;
row->data = calloc(row->datacnt, sizeof(*row->data));
if (!row->data) { /* error handling */ abort(); }
}
return r;
}

Array of pointers to the struct

I am coding a Zoo in C for a school project, in which there are Areas with Animals inside them. We must use dynamic structures. I am trying to do the Areas and I am stuck. I am using a linked list.
Structure
typedef struct area Area, *pArea;
struct area{
char id[10];
int size, nadj;
pArea prox; //for linked list
pArea adj[3]; ///array of pointers to the struct area
};
Filling the list
void fill(pArea p){
printf("ID: ");
scanf(" %9[^\n]", p->id); // id holds 10 bytes: at most 9 characters plus the terminator
printf("Size: ");
scanf(" %d", &p->size);
printf("Nadj: ");
scanf(" %d", &p->nadj);
if(p->nadj == 0)
for(int i = 0; i < p->nadj; i++)
p->adj[i] = NULL;
else
//stuck here. HELP
}
p->prox = NULL;
}
AreaA 500 2 AreaB AreaC
Where AreaA is the id, 500 is the size variable, and 2 is the number of Areas (nadj) near AreaA, followed by those areas. Now, my teacher said that the areas near a given Area must be stored in an array of pointers to struct Area (pArea adj[3], so at most 3 Areas), but I don't know how to fill that array when all I have is the names of the areas, as in the example above, rather than the struct Area objects themselves.
You need to maintain some kind of map from area names to areas. Then
for (int i = 0; i < p->nadj; i++) {
name = read_name();
p->adj[i] = find_area_by_name(name);
}
where find_area_by_name, well, should do what its name suggests. Depending on the number of areas you need to handle (and the level of the class), you may implement it as simply as a linear lookup, or as fancily as an AVL tree.
BTW,
if(p->nadj == 0)
for(int i = 0; i < p->nadj; i++)
p->adj[i] = NULL;
is effectively a no-op. Since the loop is entered only when p->nadj == 0, it is equivalent to
for(int i = 0; i < 0; i++)

Remove some elements from array and re-size array in C

Regards
I want to remove some elements from my array and re-size it.
for example my array is:
char get_res[6] = {0x32,0x32,0x34,0x16,0x00,0x00};
Now I want to remove the elements after 0x16, so my desired array is:
get_res[] = {0x32,0x32,0x34,0x16};
What is the solution?
You cannot resize arrays in C (unlike Python, for example). For real resizing, at least from an API user's point of view, use malloc, calloc, realloc, and free (realloc specifically).
Anyway, "resizing" an array can be imitated using
a delimiter; for example, a delimiter like 0xff could mark the end of the valid data in the array
Example:
#define DELIMITER 0xff
void print_data(const unsigned char *data) { // unsigned, so a 0xff byte compares equal to DELIMITER
for (size_t i = 0; data[i] != DELIMITER; ++i)
printf("%x", data[i]);
}
a member counter; count the number of valid data from the beginning of the array onward
Example:
size_t counter = 5;
void print_data(const unsigned char *data) {
for (size_t i = 0; i < counter; ++i)
printf("%x", data[i]);
}
Notes:
Use unsigned char for binary data. Plain char may be signed on your platform, in which case a byte such as 0xff is typically stored as a negative value and a comparison against 0xff never succeeds.
There is no need to "remove" them. Just don't access them. Pretend like they don't exist. Same like in stacks, when you "pop" a value from the top of the stack, you just decrement the stack pointer.
Manipulating arrays in C isn't as easy as it is with vector in C++ or List in Java. There is no "remove element" operation in C; you have to do the job yourself, that is, create another array, copy only the elements you want into it, and, if the original was dynamically allocated, free the memory it occupied.
Can you do that? Do you want the code?
EDIT:
Try that. It's just a simple program that simulates the situation. Now, you have to see the example and adapt it to your code.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
char get_res[6] = {0x32,0x32,0x34,0x16,0x00,0x00};
char target = 0x16;
int pos = -1, i, length = 6; // or specify some way to get this number
for(i = 0; i < length; i++)
if(get_res[i] == target) {
pos = i;
break;
}
if(pos < 0) return 1; // target not found, nothing to truncate
pos = pos + 1; // number of elements to keep, including the target itself
char *new_arr = malloc(pos);
if(new_arr == NULL) return 1;
for(i = 0; i < pos; i++) // copy only the elements being kept
new_arr[i] = get_res[i];
for(i = 0; i < pos; i++)
printf("0x%02x ", (unsigned char)new_arr[i]);
free(new_arr);
return 0;
}

Arduino - Optimising existing method for iterating through an array

Is there a more efficient and cleaner way of doing what the following method is already doing?
void sendCode(prog_uint16_t inArray[], int nLimit) {
unsigned int arr[nLimit];
unsigned int c;
int index = 0;
while ((c = pgm_read_word(inArray++))) {
arr[index] = c;
index++;
}
for (int i = 0; i < nLimit; i=i+2) {
delayMicroseconds(arr[i]);
pulseIR(arr[i+1]);
}
}
This is in reference to an existing question I had answered.
Arduino - Iterate through C array efficiently
There should be no need for the local arr array variable. If you do away with that you should both save temporary stack space and speed up execution by removing the need to copy data.
void sendCode(const prog_uint16_t inArray[]) {
unsigned int c;
for (int i = 0; (c = pgm_read_word(inArray++)) != 0; i++) {
if (i % 2 == 0) { // Even array elements are delays
delayMicroseconds(c);
} else { // Odd array elements are pulse lengths
pulseIR(c);
}
}
}
This code assumes that the maximum integer stored in an int is greater than the maximum size of inArray (this seems reasonable as the original code essentially makes the same assumption by using an int for nLimit).
