My objective is to optimize memory usage... I've never seen it in any tutorial which leads me to think that this isn't the right way to do it
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct Player {
char* username;
int hp;
int mp;
};
int main(void) {
struct Player test, *p = &test;
p->username = (char*)malloc(50 * sizeof(char));
scanf("%s", p->username);
p->username = realloc(p->username, (strlen(p->username) + 1) * sizeof(char));
printf("%s", p->username);
return 0;
}
right way to optimize memory usage?
Temporary, re-used buffers can often be generous and fixed in size.
Allocating a right-sized amount of memory makes sense for the member .username, since the code could be handling millions of struct Player.
IOWs, use allocation for the variable size aspects of code. If struct Player was for 2-player chess, a char username[50] size makes sense. For a multi-player universe, char * makes sense.
Rather than call *alloc() twice consider a single right-sized call.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
// Reasonable upper bound
#define USERNAME_SIZEMAX 50
struct Player {
char* username;
int hp;
int mp;
};
int main(void) {
puts("Enter user name");
// Recommend 2x - useful for leading/trailing spaces & detecting excessive long inputs.
char buf[USERNAME_SIZEMAX * 2];
if (fgets(buf, sizeof buf, stdin) == NULL) {
puts("No input");
} else {
trim(buf); // TBD code to lop off leading/trailing spaces
if (!valid_name(buf)) { // TBD code to validate the `name`
printf("Bad input \"%s\"\n", buf);
} else {
struct Player test = { 0 }; // fully populate
test.username = malloc(strlen(buf) + 1);
// Maybe add NULL check here
strcpy(test.username, buf);
// Oh happy day!
printf("%s\n", test.username);
free(test.username);
return EXIT_SUCCESS;
}
}
return EXIT_FAILURE;
}
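The two helpers marked TBD in the code above could look something like the following. This is only a sketch: the names trim() and valid_name() come from the answer, but their behavior here (strip whitespace in place, accept 1 to USERNAME_SIZEMAX - 1 printable characters) is an assumed policy, not something the answer specifies.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define USERNAME_SIZEMAX 50 /* same bound as in the answer above */

/* Assumed behavior: remove leading and trailing whitespace in place.
   fgets() keeps the '\n', and isspace() treats it as whitespace. */
void trim(char *s)
{
    size_t start = 0;
    while (isspace((unsigned char)s[start])) {
        start++;
    }
    size_t len = strlen(s + start);
    while (len > 0 && isspace((unsigned char)s[start + len - 1])) {
        len--;
    }
    memmove(s, s + start, len);
    s[len] = '\0';
}

/* Assumed policy: 1 to USERNAME_SIZEMAX - 1 printable characters. */
bool valid_name(const char *s)
{
    size_t len = strlen(s);
    if (len == 0 || len >= USERNAME_SIZEMAX) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        if (!isprint((unsigned char)s[i])) {
            return false;
        }
    }
    return true;
}
```

Casting to unsigned char before calling isspace() or isprint() avoids undefined behavior when char is signed and the input contains bytes above 127.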
Some tips:
a) the example code is too small to matter
b) never use malloc() for something that you will always want one of. Instead, pre-allocate (e.g. as a global variable) or (if it's small enough) use a local variable to avoid the overhead of malloc(). E.g.:
int main(void) {
struct Player test, *p = &test;
char userName[50];
p->username = userName;
c) Don't spread data all over the place. You want all the data in the same place (in the least number of cache lines, with pieces of data that are used at the same time as close to each other as possible). One way to do that is to combine multiple items. E.g.:
struct Player {
char username[50];
int hp;
int mp;
};
int main(void) {
struct Player test, *p = &test;
d) If something takes (at most) 50 chars of memory; don't bother using realloc() to waste CPU time and potentially waste more memory. Don't forget that the internal code for malloc() and realloc() will add meta-data to each allocated piece of memory that is likely to cost an extra 16 bytes or more.
In general; for performance, malloc() and realloc() (and new() and ...) should be completely avoided (especially for larger programs). They spread data "randomly" everywhere and destroy any hope of getting good locality (which is important for minimising multiple very expensive things - cache misses, TLB misses, page faults, swap space usage, ...).
Note: scanf() and gets() should also be banned. They provide no way to prevent buffer overflows (e.g. the user providing more than 50 characters when there's only enough memory allocated for 50 characters, for the purpose of deliberately trashing/corrupting other data), which results in huge gaping security holes.
Related
So I'm supposed to do the sorting algorithm as a CS homework.
It should read arbitrary number of words each ending with '\n'. After it reads the '.', it should print the words in alphabetical order.
E.g.:
INPUT:
apple
dog
austria
Apple
OUTPUT:
Apple
apple
Austria
dog
I want to store the words into a struct. I think that in order to work it for arbitrary number of words I should make the array of structs.
So far I've tried to create a typedef struct with only one member (string) and I planned to make the array of structs from that, into which I would then store each of the words.
As for the "randomness" of the number of words, I wanted to set the struct type in main after finding out how many words had been written and then store each word into each element of the struct array.
My problem is:
1. I don't know how to find out the number of words. The only thing I tried was making a function which counts how many times the '\n' occurred, though it didn't work very well.
As for the data structure, I've come up with a struct having only one string member:
typedef struct{
char string[MAX];
}sort;
then in main function I firstly read a number of words to come (not the actual assignment but only for purposes of making the code work)
and after having the "len" I declared the variable of type sort:
int main(){
/*code, scanf("%d", &len) and stuff*/
sort sort_t[len];
for(i = 0; i < len; i++){
scanf("%s", sort_t[i].string);
}
Question: Is such a thing "legal", and am I using a good approach?
Q2: How do I get to know the number of words to store (for the array of structs) before I start storing them?
IMHO the idea of reserving the same maximal storage for each and every string is a bit wasteful. You are probably better off sticking to dynamic NUL-terminated strings as usually done in C code. This is what the C library supports best.
As for managing an unknown number of strings, you have a choice. Possibility 1 is to use a linked list as mentioned by Xavier. Probably the most elegant solution, but it could be time-consuming to debug, and ultimately you have to convert it to an array in order to use one of the common sort algorithms.
Possibility 2 is to use something akin to a C++ std::vector object. Say the task of allocating storage is delegated to some "bag" object. Code dealing with the "bag" has a monopoly on calling the realloc() function mentioned by Vlad. Your main function only calls bag_create() and bag_put(bag, string). This is less elegant but probably easier to get right.
As your focus is to be on your sorting algorithm, I would rather suggest using approach #2. You could use the code snippet below as a starting point.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
typedef struct {
size_t capacity;
size_t usedSlotCount;
char** storage;
} StringBag;
StringBag* bag_create()
{
size_t initialSize = 4; /* start small */
StringBag* myBag = malloc(sizeof(StringBag));
myBag->capacity = initialSize;
myBag->usedSlotCount = 0;
myBag->storage = (char**)malloc(initialSize*sizeof(char*));
return myBag;
}
void bag_put(StringBag* myBag, char* str)
{
if (myBag->capacity == myBag->usedSlotCount) {
/* have to grow storage */
size_t oldCapacity = myBag->capacity;
size_t newCapacity = 2 * oldCapacity;
myBag->storage = realloc(myBag->storage, newCapacity*sizeof(char*));
if (NULL == myBag->storage) {
fprintf(stderr, "Out of memory while reallocating\n");
exit(1);
}
fprintf(stderr, "Growing capacity to %lu\n", (unsigned long)newCapacity);
myBag->capacity = newCapacity;
}
/* save string to new allocated memory, as this */
/* allows the caller to always use the same static storage to house str */
char* str2 = malloc(1+strlen(str));
strcpy(str2, str);
myBag->storage[myBag->usedSlotCount] = str2;
myBag->usedSlotCount++;
}
static char inputLine[4096];
int main()
{
StringBag* myBag = bag_create();
/* read input data */
while (scanf("%4095s", inputLine) == 1) { /* width keeps scanf inside inputLine */
if (0 == strcmp(".", inputLine))
break;
bag_put(myBag, inputLine);
}
/* TODO: sort myBag->storage and print the sorted array */
}
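One possible way to fill in the TODO above is sketched below. It uses the standard qsort() as a stand-in; since the assignment is about writing the sorting algorithm yourself, the part worth reusing is probably the comparator. The names compare_words and sort_words are made up for this example, and the ordering rule (case-insensitive, with strcmp() as a tie-breaker so that "Apple" comes before "apple") is an assumption derived from the expected output in the question.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Compares two words case-insensitively; ties are broken by strcmp(),
   which puts "Apple" before "apple" in ASCII. */
static int compare_words(const void *lhs, const void *rhs)
{
    const char *a = *(const char *const *)lhs;
    const char *b = *(const char *const *)rhs;
    for (size_t i = 0; a[i] != '\0' || b[i] != '\0'; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb) {
            return (ca > cb) - (ca < cb);
        }
    }
    return strcmp(a, b);
}

/* Sorts an array of C strings, e.g. myBag->storage with
   myBag->usedSlotCount entries. */
void sort_words(char **words, size_t count)
{
    qsort(words, count, sizeof *words, compare_words);
}
```

With the bag from the answer, the call would be sort_words(myBag->storage, myBag->usedSlotCount), followed by a loop that prints each entry.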
we wrote a program that reads comma-separated integer-values into an array and tries processing them with a parallel structure.
By doing so, we found out that there is a fixed limitation for the maximum size of the dynamic array, which usually gets allocated dynamically by doubling the size. Yet for a dataset with more than 5000 values, we can't double it anymore.
I am a bit confused right now, since technically, we did everything the way other posts pointed out we should do (use realloc, don't use stack but heap instead).
Note that it works fine for any file with 5000 values or fewer.
We also tried working with realloc, but to the same result.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <pthread.h>
// compile with gcc filename -lpthread -lm -Wall -Wextra -o test
int reader(int ** array, char * name) {
FILE *fp;
int data,row,col,count,inc;
int capacity=10;
char ch;
fp=fopen(name,"r");
row=col=count=0;
while(EOF!=(inc=fscanf(fp,"%d%c", &data, &ch)) && inc == 2){
if(capacity==count)
// this is the alternative with realloc we tried. Still the same issue.
//*array=malloc(sizeof(int)*(capacity*=2));
*array = realloc(*array, sizeof(int)*(capacity*=2));
(*array)[count++] = data;
//printf("%d ", data);
if(ch == '\n'){
break;
} else if(ch != ','){
fprintf(stderr, "format error of different separator(%c) of Row at %d \n", ch, row);
break;
}
}
// close file stream
fclose(fp);
//*array=malloc( sizeof(int)*count);
*array = realloc(*array, sizeof(int)*count);
return count;
}
int main(){
int cores = 8;
pthread_t p[cores];
int *array;
int i = 0;
array=malloc(sizeof(int)*10);
// read the file
int length = reader(&array, "data_2.txt");
// clean up and exit
free(array);
return 0;
}
EDIT: I included the realloc-command we tried and changed the values back to our original testing values (starting at 10). This didn't impact the result though, or rather still does not work. Thanks anyways for pointing out the errors! I also reduced the included code to the relevant part.
I can't really get my head around the fact that it should work this way, but doesn't, so it might just be a minor mistake we overlooked.
Thanks in advance.
New answer after question has been updated
The use of realloc is wrong. Always do realloc into a new pointer and check for NULL before overwriting the old pointer.
Like:
int* tmp = realloc(....);
if (!tmp)
{
// No more memory
// do error handling
....
}
*array = tmp;
Original answer (not fully valid after question has been updated)
You have some serious problems with the current code.
In main you have:
array=malloc(sizeof(int)*10); // This only allocates memory for 10 int
int length = reader(&array, "data_1.txt");
and in reader you have:
int capacity=5001;
So you assume that the array capacity is 5001 even though you only reserved memory for 10 to start with. So you end up writing outside the reserved array (i.e. undefined behavior).
A better approach could be to handle all allocation in the function (i.e. don't do any allocation in main). If you do that you shall initialize capacity to 0 and rewrite the way capacity grows.
Further, in reader you have:
if(capacity==count)
*array=malloc(sizeof(int)*(capacity*=2));
It is wrong to use malloc here, as you lose all data already in the array and leak memory as well. Use realloc instead.
Finally, you have:
*array=malloc( sizeof(int)*count);
Again this is wrong for the same reason as above. If you want to resize to the exact size (aka count), use realloc.
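Putting the advice above together, a reader that owns all of the allocation might look roughly like this. It is a sketch under assumptions: the name read_ints, the starting capacity of 16, and the choice to return -1 on allocation failure are all made up for illustration, and it takes an already opened FILE * instead of a file name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads comma-separated integers from fp until the separator is no
   longer a comma (end of line, end of file, or a format error).
   On success stores a malloc'ed array in *out (caller frees) and
   returns the count; returns -1 if an allocation fails. */
long read_ints(FILE *fp, int **out)
{
    int *array = NULL;
    size_t capacity = 0, count = 0;
    int value;

    *out = NULL;
    while (fscanf(fp, "%d", &value) == 1) {
        if (count == capacity) {
            size_t new_capacity = capacity ? capacity * 2 : 16;
            /* realloc into a temporary so the old block is not lost */
            int *tmp = realloc(array, new_capacity * sizeof *array);
            if (tmp == NULL) {
                free(array);
                return -1;
            }
            array = tmp;
            capacity = new_capacity;
        }
        array[count++] = value;
        if (fgetc(fp) != ',') {
            break;
        }
    }
    *out = array;
    return (long)count;
}
```

Because capacity starts at 0 and realloc(NULL, n) behaves like malloc(n), main no longer needs its own malloc call, and the capacity can never disagree with the amount of memory actually reserved.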
I am trying to store a name and an age in a dynamic array.
When we have two different types of data, an int and a char string whose size we don't know at the start, how do we use a dynamic array to store both types?
typedef struct personne{
char nom ;
int age ;
}personne;
struct personne saisie_personne_suivante(struct personne* x){
scanf("%s",&x->nom);
scanf("%d",&x->age);
return *x;
}
int main(void){
personne *ali;
ali = malloc(sizeof(char*));
saisie_personne_suivante(ali);
printf("\n %d ",ali->age);
printf("\n %s",&ali->nom);
return 0;
}
Why doesn't this work?
I think we can't store two types of data at the same time in an array. If we do so, we need to allocate half of the memory to char and half to integers, provided you give some size for the array.
=> In your program, at this line [ali = malloc(sizeof(char*))], you are passing the size of a char pointer only, not of the variable. If you want to store both values, pass the size of both the int and the char.
ali is a pointer to a struct whose size is at least sizeof(char) + sizeof(int); the exact size may vary between architectures because the compiler may insert padding.
For the time being, let's assume it's 8 bytes (1 byte for the char, 3 bytes of padding, and 4 bytes for the int, which is typical on a PC).
What you're doing is allocating space equal to the size of a pointer to char (which is usually 4 or 8 bytes, depending on whether the program is built for 32 or 64 bits).
What you probably want to do is allocate space equal to the size of your struct, that is:
ali = malloc(sizeof(personne));
Note the lack of *, since you want actual memory for a struct and not a pointer pointing to such a location.
By the way, you wouldn't want to write malloc(sizeof(char)) either, since that would allocate just one byte, which is less than your struct needs.
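A common way to avoid this kind of size mismatch is to take the size from the pointer being assigned, as in the sketch below. The helper name personne_new is hypothetical; the struct is the one from the question.

```c
#include <stdlib.h>

typedef struct personne {
    char nom;
    int age;
} personne;

/* Allocates exactly one struct. sizeof *p is the size of the struct
   itself, including any padding, whatever the pointer size is. */
personne *personne_new(void)
{
    personne *p = malloc(sizeof *p);
    if (p != NULL) {
        p->nom = '\0';
        p->age = 0;
    }
    return p;
}
```

Writing malloc(sizeof *p) keeps the allocation correct even if the type of p changes later, because the size is never spelled out separately.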
I strongly advise you to get your hands on a book on C or a decent tutorial at least.
int main() {
personne *ali;
ali = (struct personne *)malloc(sizeof(personne));
saisie_personne_suivante(ali);
printf("\n %d ", ali->age);
printf("\n %c", ali->nom);
return 0;
}
There is not enough memory for struct personne, so you need to malloc sizeof(personne) bytes. nom is not a pointer, it's a char variable; when you print it, use printf("%c", ali->nom);
I can concur with the commenters who recommended a good book/tutorial to get started but nevertheless: here is your repaired code, with a bit of comment.
// printf(), fprintf(), puts(), and scanf()
#include <stdio.h>
// exit(), malloc(), and free()
#include <stdlib.h>
#define PERSONNE_ERROR 0
#define PERSONNE_OK 1
typedef struct personne {
// fixed width for 49 characters and the trailing NUL
char nom[50];
int age;
} personne;
int saisie_personne_suivante(struct personne *x)
{
// For the returns of the scanf()s.
// Because you always check the returns if available
// (well, actually: the returns of printf() et al. rarely get checked)
// preset it to a value meant to say "OK"
int res = PERSONNE_OK;
// UX: let the users know what they are supposed to do.
puts("Your name, please");
// we have a fixed maximum size of name and we can set it here within scanf()
// scanf() returns the number of elements it parsed, *not* the number of characters
// scanf() needs a pointer to the memory it is expected to put the value into.
// x->nom is an array that decays to a pointer to its first char, no need to use "&"
if ((res = scanf("%49s", x->nom)) != 1) {
// we can return immediately here.
// If we would need to cleanup (free memory, for example) we would
// set res to PERSONNE_ERROR and use a goto to jump at the place
// where all the cleanup happens. But that should be done if the clean-up
// is always the same (or could be sorted) and you need such cleanups
// more than just two or three times.
return PERSONNE_ERROR;
}
puts("Your age, too, if you don't mind.");
// x->age is not a pointer to an int, hence we need to prefix "&"
if ((res = scanf("%d", &x->age)) != 1) {
return PERSONNE_ERROR;
}
return res;
}
int main(void)
{
personne *ali;
int res;
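// reserve memory for the struct
ali = malloc(sizeof(personne));
// malloc() can fail, so check before using the pointer
if (ali == NULL) {
fprintf(stderr, "Out of memory\n");
exit(EXIT_FAILURE);
}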
// call function that fills the struct and check the return
if ((res = saisie_personne_suivante(ali)) != PERSONNE_OK) {
fprintf(stderr, "Something went wrong with saisie_personne_suivante()\n");
exit(EXIT_FAILURE);
}
// print the content of struct personne
// you can feed printf() directly, no need to find the pointer to the memory
// holding the int
printf("Age: %d\n", ali->age);
// To print strings it needs to know the start of the string, which needs to be
// a pointer, and ali->nom decays to a pointer to the start of the string
printf("Name: %s\n", ali->nom);
// free allocated memory (not really necessary at the end of the
// program but it's deemed good style and because it costs us nothing
// we cannot find a good reason to skip it)
free(ali);
// exit with a value that tells the OS that this program ended without an error
// It should be 0 (zero) which it almost always is.
// *Almost* always
exit(EXIT_SUCCESS);
}
But really: go and get a beginner's book or tutorial. I cannot give you a recommendation because I don't know of any good ones in your native language (sometimes the English version is good but the translation lacks a lot).
I am looking for a malloc alternative for c that will only ever be used as a stack. Something more like alloca but not limited in space by the stack size. It is for coding a math algorithm.
I will work with large amounts of memory (possibly hundreds of megabytes in use in the middle of the algorithm)
memory is accessed in a stack-like order. What I mean is that the next memory to be freed is always the memory that was most recently allocated.
would like to be able to run on a variety of systems (Windows and Unix-like)
as an extra, something that can be used with threading, where the stack-like allocate and free order applies just to individual threads. (ie ideally each thread has its own "pool" for memory allocation)
My question is, is there anything like this, or is this something that would be easy to implement?
This sounds like a perfect use for Obstack.
I've never used it myself since the API is really confusing, and I can't dig up an example right now. But it supports all the operations you want, and additionally supports streaming creation of the "current" object.
Edit: whipped up a quick example. The Obstack API shows signs of age, but the principle is sound at least.
You will probably want to look into tuning the align/block settings and likely use obstack_next_free and obstack_object_size if you do any fancy growing.
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strlen()
void *xmalloc(size_t size)
{
void *rv = malloc(size);
if (rv == NULL)
abort();
return rv;
}
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free
const char *cat(struct obstack *obstack_ptr, const char *dir, const char *file)
{
obstack_grow(obstack_ptr, dir, strlen(dir));
obstack_1grow(obstack_ptr, '/');
obstack_grow0(obstack_ptr, file, strlen(file));
return obstack_finish(obstack_ptr);
}
int main()
{
struct obstack main_stack;
obstack_init(&main_stack);
const char *cat1 = cat(&main_stack, "dir1", "file1");
const char *cat2 = cat(&main_stack, "dir1", "file2");
const char *cat3 = cat(&main_stack, "dir2", "file3");
puts(cat1);
puts(cat2);
puts(cat3);
obstack_free(&main_stack, (void *)cat2); // cast drops const for the void * parameter
// cat2 and cat3 both freed, cat1 still valid
}
As you already found out, as long as it works with malloc you should use it and only come back when you need to squeeze out the last bit of performance.
An idea that fits that case: you could use a list of blocks that you allocate when needed. Using a list makes it possible to eventually swap out data in case you hit the virtual memory limit.
struct block {
    size_t size;
    char * memory;            // char * so that pointer arithmetic is well defined
    struct block * next;
};
struct stacklike {
    struct block * top;
    char * next_alloc;        // first free byte in the top block
};
void * allocate (struct stacklike * a, size_t s) {
    // add null check for top
    if (a->top->size - (size_t)(a->next_alloc - a->top->memory) < s + sizeof(size_t)) {
        // not enough memory left in top block, allocate new one
        struct block * nb = malloc(sizeof(*nb));
        nb->next = a->top;
        a->top = nb;
        nb->memory = malloc(/* some size large enough to hold multiple data entities */);
        // also set nb size to that size
        a->next_alloc = nb->memory;
    }
    void * place = a->next_alloc;
    a->next_alloc += s;
    // store size to be able to free (assumes s keeps next_alloc aligned for size_t)
    *((size_t *) a->next_alloc) = s;
    a->next_alloc += sizeof (size_t);
    return place;
}
I hope this shows the general idea, for an actual implementation there's much more to consider.
To swap out stuff, you change that to a doubly linked list and keep track of the total allocated bytes. If you hit a limit, write the end to some file.
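For comparison, here is a much simpler variant of the same idea that can be compiled as it stands. It is a sketch under assumptions, not the design described above: it uses one preallocated block instead of a list of blocks, it frees by rewinding to a saved mark instead of storing sizes, and every name in it (stack_arena, arena_alloc, and so on) is made up for the example. It needs C11 for max_align_t and _Alignof.

```c
#include <stddef.h>
#include <stdlib.h>

/* Single-block stack allocator: memory is released in reverse order
   by rewinding to a previously saved mark. */
struct stack_arena {
    unsigned char *base;
    size_t capacity;
    size_t used;
};

/* Returns 1 on success, 0 if the backing block cannot be allocated. */
int arena_init(struct stack_arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Current top of the stack; pass it to arena_release() later. */
size_t arena_mark(const struct stack_arena *a)
{
    return a->used;
}

/* Returns size bytes aligned for any object type, or NULL if full. */
void *arena_alloc(struct stack_arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;
    if (start > a->capacity || size > a->capacity - start) {
        return NULL;
    }
    a->used = start + size;
    return a->base + start;
}

/* Frees everything allocated after the mark was taken. */
void arena_release(struct stack_arena *a, size_t mark)
{
    if (mark <= a->used) {
        a->used = mark;
    }
}

void arena_destroy(struct stack_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}
```

Giving each thread its own struct stack_arena would cover the per-thread pools mentioned in the question without any locking, since no arena is shared.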
I have seen a strategy used in an old FORTRAN program that might be what you are looking for. The strategy involves use of a global array that is passed down to each function from main.
char global_buffer[SOME_LARGE_SIZE];
void foo1(char* buffer, ...);
void foo2(char* buffer, ...);
void foo3(char* buffer, ...);
int main()
{
foo1(global_buffer, ....);
}
void foo1(char* buffer, ...)
{
// This function needs to use SIZE1 characters of buffer.
// It can let the functions that it calls use buffer+SIZE1
foo2(buffer+SIZE1, ...);
// When foo2 returns, everything from buffer+SIZE1 is assumed
// to be free for re-use.
}
void foo2(char* buffer, ...)
{
// This function needs to use SIZE2 characters of buffer.
// It can let the functions that it calls use buffer+SIZE2
foo3(buffer+SIZE2, ...);
}
void foo3(char* buffer, ...)
{
// This function needs to use SIZE3 characters of buffer.
// It can let the functions that it calls use buffer+SIZE3
bar1(buffer+SIZE3, ...);
}
I know it could be done using malloc, but I do not know how to use it yet.
For example, I wanted the user to input several numbers using an infinite loop with a sentinel to put a stop into it (i.e. -1), but since I do not know yet how many he/she will input, I have to declare an array with no initial size, but I'm also aware that it won't work like this int arr[]; at compile time since it has to have a definite number of elements.
Declaring it with an exaggerated size like int arr[1000]; would work but it feels dumb (and waste memory since it would allocate that 1000 integer bytes into the memory) and I would like to know a more elegant way to do this.
This can be done by using a pointer, and allocating memory on the heap using malloc.
Note that there is no way to later ask how big that memory block is. You have to keep track of the array size yourself.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
/* declare a pointer to an integer */
int *data;
/* we also have to keep track of how big our array is - I use 50 as an example*/
const int datacount = 50;
data = malloc(sizeof(int) * datacount); /* allocate memory for 50 int's */
if (!data) { /* If data == 0 after the call to malloc, allocation failed for some reason */
perror("Error allocating memory");
abort();
}
/* at this point, we know that data points to a valid block of memory.
Remember, however, that this memory is not initialized in any way -- it contains garbage.
Let's start by clearing it. */
memset(data, 0, sizeof(int)*datacount);
/* now our array contains all zeroes. */
data[0] = 1;
data[2] = 15;
data[49] = 66; /* the last element in our array, since we start counting from 0 */
/* Loop through the array, printing out the values (mostly zeroes, but even so) */
for(int i = 0; i < datacount; ++i) {
printf("Element %d: %d\n", i, data[i]);
}
free(data); /* release the block once we are done with it */
return 0;
}
That's it. What follows is a more involved explanation of why this works :)
I don't know how well you know C pointers, but array access in C (like array[2]) is actually a shorthand for accessing memory via a pointer. To access the memory pointed to by data, you write *data. This is known as dereferencing the pointer. Since data is of type int *, then *data is of type int. Now to an important piece of information: (data + 2) means "add the byte size of 2 ints to the address pointed to by data".
An array in C is just a sequence of values in adjacent memory. array[1] is just next to array[0]. So when we allocate a big block of memory and want to use it as an array, we need an easy way of getting the direct address of every element inside. Luckily, C lets us use the array notation on pointers as well. data[0] means the same thing as *(data+0), namely "access the memory pointed to by data". data[2] means *(data+2), and accesses the third int in the memory block.
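The equivalence described above can be checked with two tiny helpers. Both function names are invented for this illustration.

```c
#include <stddef.h>

/* Returns the element at index i using explicit pointer arithmetic;
   this is exactly what data[i] means. */
int element_at(const int *data, size_t i)
{
    return *(data + i);
}

/* Distance in bytes between data and data + i, which is
   i * sizeof(int). */
size_t byte_offset(const int *data, size_t i)
{
    return (size_t)((const char *)(data + i) - (const char *)data);
}
```

For any valid index i, element_at(data, i) equals data[i], and byte_offset(data, 2) equals 2 * sizeof(int), which is the "byte size of 2 ints" from the explanation.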
The way it's often done is as follows:
allocate an array of some initial (fairly small) size;
read into this array, keeping track of how many elements you've read;
once the array is full, reallocate it, doubling the size and preserving (i.e. copying) the contents;
repeat until done.
I find that this pattern comes up pretty frequently.
What's interesting about this method is that it allows one to insert N elements into an empty array one-by-one in amortized O(N) time without knowing N in advance.
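The steps above, applied to the question's sentinel loop, can be sketched as follows. The function name read_until_sentinel, the initial capacity of 4, and the decision to stop at the first non-numeric token are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads integers from fp until the sentinel -1, end of input, or a
   non-numeric token. Returns a malloc'ed array (caller frees) and
   stores the element count in *count; returns NULL only if an
   allocation fails. */
int *read_until_sentinel(FILE *fp, size_t *count)
{
    size_t capacity = 4, used = 0;
    int *arr = malloc(capacity * sizeof *arr);
    int value;

    *count = 0;
    if (arr == NULL) {
        return NULL;
    }
    while (fscanf(fp, "%d", &value) == 1 && value != -1) {
        if (used == capacity) {
            /* array is full: double it, keeping the old contents */
            int *tmp = realloc(arr, 2 * capacity * sizeof *arr);
            if (tmp == NULL) {
                free(arr);
                return NULL;
            }
            arr = tmp;
            capacity *= 2;
        }
        arr[used++] = value;
    }
    *count = used;
    return arr;
}
```

Calling read_until_sentinel(stdin, &n) gives the interactive behavior from the question; the caller keeps n around because the allocation itself does not remember how many elements are in use.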
Modern C, aka C99, has variable length arrays (VLAs). Unfortunately, not all compilers support them (C11 made them an optional feature), but if yours does, this would be an alternative.
Try to implement a dynamic data structure such as a linked list.
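A minimal singly linked list for this purpose might look like the sketch below. All names (struct node, list_append, list_length, list_free) are made up for the example, and keeping a tail pointer is just one possible design that makes each append constant time.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Appends value at the end of the list and updates *head and *tail.
   Returns 0 on success, -1 if malloc() fails. */
int list_append(struct node **head, struct node **tail, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->value = value;
    n->next = NULL;
    if (*tail != NULL) {
        (*tail)->next = n;
    } else {
        *head = n;
    }
    *tail = n;
    return 0;
}

/* Counts the nodes by walking the list. */
size_t list_length(const struct node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next) {
        len++;
    }
    return len;
}

/* Frees every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Compared with a growing array, each element costs an extra pointer and a separate allocation, but no existing element ever has to be copied.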
Here's a sample program that reads stdin into a memory buffer that grows as needed. It's simple enough that it should give some insight into how you might handle this kind of thing. One thing that would probably be done differently in a real program is how much the array grows in each allocation - I kept it small here to help keep things simpler if you wanted to step through in a debugger. A real program would probably use a much larger allocation increment (often, the allocation size is doubled, but if you're going to do that you should probably 'cap' the increment at some reasonable size - it might not make sense to double the allocation when you get into the hundreds of megabytes).
Also, I used indexed access to the buffer here as an example, but in a real program I probably wouldn't do that.
#include <stdlib.h>
#include <stdio.h>
void fatal_error(void);
int main( int argc, char** argv)
{
int buf_size = 0;
int buf_used = 0;
char* buf = NULL;
char* tmp = NULL;
int c; // int, not char, so that EOF can be told apart from every valid character
int i = 0;
while ((c = getchar()) != EOF) {
if (buf_used == buf_size) {
//need more space in the array
buf_size += 20;
tmp = realloc(buf, buf_size); // get a new larger array
if (!tmp) fatal_error();
buf = tmp;
}
buf[buf_used] = c; // pointer can be indexed like an array
++buf_used;
}
puts("\n\n*** Dump of stdin ***\n");
for (i = 0; i < buf_used; ++i) {
putchar(buf[i]);
}
free(buf);
return 0;
}
void fatal_error(void)
{
fputs("fatal error - out of memory\n", stderr);
exit(1);
}
This example combined with examples in other answers should give you an idea of how this kind of thing is handled at a low level.
One way I can imagine is to use a linked list, if you need all the numbers entered before the user enters something that indicates the loop termination. (I'm posting this as the first option because I have never done this for user input and it just seemed interesting. Wasteful but artistic.)
Another way is to do buffered input: allocate a buffer, fill it, and re-allocate if the loop continues (not elegant, but the most rational for the given use case).
I don't consider either approach elegant, though. Probably, I would change the use case (the most rational option).