How to fill a C array with an unknown number of objects - c

I need to fill a C array with a number of objects that I won't know until the filling process is complete.
It's an array of strings.
Another addition: I'm dividing a string into its words, so I know the size of the string. Could this be helpful to guess the right size?
I need to have something like a mutable array.
How can I achieve this?

Update
Given that you're chunking the string (dividing it into words), you could count the number of spaces, giving you an idea of how big an array you'll need:
char givenString[] = "The quick brown fox jumps over the lazy dog";
int i = 0;
for (const char *p = givenString; *p; ++p)
if (*p == ' ') ++i;// count the spaces
++i;// + 1 for the last word
Now i will tell you how many words there are in the given string. Then you can simply do:
char **words = malloc(i*sizeof(char *));
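And set about your business. If you copy each word into its own buffer, you'll still have to allocate each word pointer, and free it. Perhaps this is a decent use-case for strtok, BTW, which returns pointers into givenString itself, so no per-word allocation is needed: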
//assuming i still holds the word-count
words[0] = strtok(givenString, " ");//
for (int j=1;j<i;++j)
{
words[j] = strtok(NULL, " ");
}
You might want to look into strtok_r if you're going to be doing a lot of this string-splitting business, though.
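To tie the counting step and the strtok loop together, here is a minimal sketch. The helper name split_words and its signature are assumptions made for illustration, not part of the answer above. It treats "spaces + 1" as an upper bound rather than an exact count, so consecutive spaces do not cause an overrun.

```c
#include <stdlib.h>
#include <string.h>

/* Splits s in place on spaces and returns a malloc'd array of pointers
 * into s. The caller frees only the array, not the individual words.
 * split_words is a hypothetical helper name. */
char **split_words(char *s, size_t *count)
{
    size_t upper = 1;                 /* spaces + 1 is an upper bound */
    for (const char *p = s; *p; ++p)
        if (*p == ' ')
            ++upper;

    char **words = malloc(upper * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t n = 0;
    for (char *tok = strtok(s, " "); tok != NULL; tok = strtok(NULL, " "))
        words[n++] = tok;             /* tok points inside s */

    *count = n;
    return words;
}
```

Because strtok writes '\0' bytes into the string it splits, s has to be a writable array, not a string literal.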
You could use realloc for that. realloc changes the size of the block of memory that a given pointer points to. If the pointer is NULL, realloc behaves like malloc:
char **ptr_array = NULL;
for (int i=1;i<argc;++i)
{
ptr_array = realloc(ptr_array, i*sizeof(char *));
ptr_array[i-1] = calloc(strlen(argv[i]) + 1, sizeof(char));
strncpy(ptr_array[i-1], argv[i], strlen(argv[i]));
}
This code copies all arguments to heap memory, one by one, allocating exactly the memory each one requires. Don't forget the free calls, mind you!
Here's an example in full (including free-calls)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char **ptr_array = NULL;//pointer to pointers
int i, argc = 5;//fake argc, argv
char *argv[5] = {"me","foo","bar","zar", "car"};//dummy strings
for (i=1;i<argc;++i)
{
ptr_array = realloc(ptr_array, i*sizeof(char *));//(re-)allocate mem
ptr_array[i-1] = calloc(strlen(argv[i]) + 1, sizeof(char));//alloc str
strncpy(ptr_array[i-1], argv[i], strlen(argv[i]));//copy
}
--argc;
for(i=0;i<argc;++i)
{
printf("At index %d: %s\n", i, ptr_array[i]);//print
free(ptr_array[i]);//free string mem
}
free(ptr_array);//free pointer block
return 0;
}
I've tested this code, and the output was, as you'd expect:
At index 0: foo
At index 1: bar
At index 2: zar
At index 3: car

You could use realloc to allocate your array and change its size when needed.
myStrArray = realloc(myStrArray, MaxArray * sizeof(char*));
realloc often extends the block in place and returns the same address; when it cannot, it moves the contents to a new block and returns the new address, so always use the returned pointer.
Please note this is an array of pointers, so the strings themselves will need to be allocated or assigned to it. To allocate a 100-char string for the first element, for example:
myStrArray[0] = calloc(100, sizeof(char));
And always release memory obtained from realloc, calloc, or malloc with free.
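Calling realloc once per element works, but a common variation is to grow the pointer block in larger steps. The sketch below doubles the capacity whenever it runs out; the StrVec type, the function names, and the starting capacity of 4 are all assumptions for illustration, not something the answers above prescribe.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of strings; all names are illustrative. */
typedef struct {
    char **items;
    size_t len;
    size_t cap;
} StrVec;

/* Appends a heap copy of s. Returns 0 on success, -1 on allocation failure. */
int strvec_push(StrVec *v, const char *s)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        char **tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;                /* v->items is still valid */
        v->items = tmp;
        v->cap = new_cap;
    }
    size_t n = strlen(s) + 1;         /* + 1 for the terminator */
    char *copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, n);
    v->items[v->len++] = copy;
    return 0;
}

/* Frees every string and the pointer block, then resets the vector. */
void strvec_free(StrVec *v)
{
    for (size_t i = 0; i < v->len; ++i)
        free(v->items[i]);
    free(v->items);
    v->items = NULL;
    v->len = v->cap = 0;
}
```

A vector starts out as StrVec v = {0}; and is released with strvec_free(&v) once you are done with it.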

Since you don't know a priori the size of the array, you must use malloc() to dynamically allocate the memory for the array and for the strings contained in it.
You then must use free() to release this memory when it is no longer needed.
To have good locality, you may want to allocate a single chunk of memory for the strings in the array, considering a data structure like double-NUL terminated strings.
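As a rough illustration of the single-chunk idea, the sketch below packs the strings back to back and ends the block with one extra NUL. The names pack_strings and packed_count are hypothetical. One limitation of this layout is that an empty string cannot be stored, since it would look like the end marker.

```c
#include <stdlib.h>
#include <string.h>

/* Packs n strings back to back into one block: "a\0bc\0\0".
 * pack_strings is a hypothetical helper; returns NULL on failure. */
char *pack_strings(const char *const *strs, size_t n)
{
    size_t total = 1;                       /* the final extra NUL */
    for (size_t i = 0; i < n; ++i)
        total += strlen(strs[i]) + 1;

    char *block = malloc(total);
    if (block == NULL)
        return NULL;

    char *out = block;
    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(strs[i]) + 1;   /* include the terminator */
        memcpy(out, strs[i], len);
        out += len;
    }
    *out = '\0';                            /* empty string marks the end */
    return block;
}

/* Counts the packed strings by walking until the empty string. */
size_t packed_count(const char *block)
{
    size_t n = 0;
    for (const char *p = block; *p; p += strlen(p) + 1)
        ++n;
    return n;
}
```

The whole collection is released with a single free(block).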

Related

How do I modify the contents of a string literal without using brackets in C?

Disclaimer: this is for a homework assignment.
Say I have a string that was declared like this:
char *string1;
For part of my program, I need to set string1 equal to another string, string2. I can't use strcpy or use brackets.
This is my code so far:
int i;
for(i = 0; *(string2 + i) != '\0'; i++){
*(string1 + i) = *(string2 + i);
}
This causes a segmentation fault.
According to https://www.geeksforgeeks.org/storage-for-strings-in-c/ , this is because string1 was declared like this: char *string1 and a workaround to avoid segfaults is to use brackets. I can't use brackets, so is there any workaround that I can do?
EDIT: I am also prohibited from allocating more memory or declaring arrays. I can't use malloc(), falloc() etc.
The issue you are having is that string1, the destination of the copy, does not have memory allocated to it.
Your code is missing some details, but I'll assume it looks something like this:
#include <stdio.h>
int main()
{
char *originalStr = "Hello NewArsenic";
char *newStr;
// YMMV for this line: newStr is uninitialized, so passing it to printf is
// undefined behavior. It might print (null), print garbage, or crash.
printf("Original: %s\nNew: %s\n", originalStr, newStr);
int i;
for (i = 0; *(originalStr + i) != '\0'; i++)
{
*(newStr + i) = *(originalStr + i);
}
printf("Original: %s\nNew: %s\n", originalStr, newStr);
return 0;
}
TL;DR Your Issue
Your issue here is that you are attempting to store some values into newStr without having the memory to do so.
Solution
Use malloc.
#include <stdio.h>
#include <stdlib.h> // malloc(size_t) is in stdlib.h
#include <string.h> // strlen(const char *) is in string.h
int main()
{
char *originalStr = "Hello NewArsenic";
// Note here that size_t is preferable to int for length.
// Generally you want to be using size_t if you are working with size/length.
// More info at https://stackoverflow.com/questions/19732319/difference-between-size-t-and-unsigned-int
size_t originalLength = strlen(originalStr);
// This is malloc's typical usage, where we are asking from the system to
// give us originalLength + 1 many chars.
// The `char` here is redundant, actually, since sizeof(char) is defined to
// be one by the C spec, but you might find it useful to see the typical
// usage of `malloc`.
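// In C, the void * returned by malloc converts to char * implicitly,
// so the cast below is optional (it is only required in C++).
char *newStr = (char *)malloc((originalLength + 1) * sizeof(char));
// newStr is not printed here: its contents are indeterminate until the
// copy loop below has filled it in.
printf("Original: %s\n", originalStr);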
size_t i;
for (i = 0; *(originalStr + i) != '\0'; i++)
{
*(newStr + i) = *(originalStr + i);
}
// Don't forget to append a null character like I did before editing!
*(newStr + originalLength) = 0;
printf("Original: %s\nNew: %s\n", originalStr, newStr);
// Because `malloc` gives us memory on the heap, we need to tell the system
// that we are done with it before exiting.
free(newStr);
return 0;
}
The long answer
What is a C String?
In C, a string is merely an array of characters terminated by a null character. What this means is that for each character you want to have, you need to allocate memory.
Memory
In C, there are two types of memory allocation - stack- and heap-based.
Stack Memory
You're probably more familiar with stack-based memory than you think. Whenever you declare a local variable, you're defining it on the stack. Arrays declared inside a function with bracket notation, type array[size], are stack-based too. What's specific about stack-based memory allocation is that the memory only lasts for as long as the function in which it was declared, as you're probably familiar with. This means that you don't have to worry about your memory sticking around for longer than it should.
Heap Memory
Now heap-based memory allocation is different in the sense that it will persist until it is cleared. This is advantageous in one way:
You can keep values of which you don't know the size at compile time.
But, that comes at a cost:
The heap is slower
You have to manually clear your memory once you're done with it.
For more info, check out this thread.
We typically use the function (void *) malloc(size_t) and its sister (void *) calloc(size_t, size_t) for allocating heap memory. To free the memory that we asked for from the system, use free(void *).
Alternatives
You could've also used newStr = originalStr, but that would not actually copy the string, but only make newStr point to originalStr, which I'm sure you're aware of.
Other remarks
Generally, it's an anti-pattern to do:
char* string = "literal";
This is an anti-pattern because literals cannot be edited and shouldn't be. Do:
char const* string = "literal";
See this thread for more info.
Avoid using int in your loop. Use size_t. See this thread.
For part of my program, I need to set string1 equal to another string, string2. I can't use strcpy or use brackets.
Perhaps the solution is just as simple as
string2 = string1
Note that this assigns the string2 pointer to point directly to the same memory as string1. This is sometimes very helpful because you need to maintain the beginning of the string with string1 but also need another pointer to move inside the string with things like string2++.
One way or another, you have to point string2 at an address in memory that you have access to. There are two ways to do this:
Point at memory that you already have access to through another variable either with another pointer variable or with the address-of & operator.
Allocate memory with malloc() or related functions.
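For the first option, the copy itself can be written with pointer notation only. The sketch below assumes the destination already points at writable memory that is large enough (for example, a buffer handed in by the caller); copy_chars is a hypothetical name, not something from the question.

```c
/* Copies src into dst using only pointer notation (no [] and no strcpy).
 * dst must already point at writable memory large enough for src plus
 * the terminating '\0'. copy_chars is a hypothetical name. */
char *copy_chars(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;                       /* the terminator is copied too */
    return dst;
}
```

If dst is an uninitialized pointer, as in the question, this still crashes: the loop is fine, but there is nowhere to write to.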

Memory, pointers, and pointers to pointers

I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it to a function. Now, I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds a memory address to another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by default a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello"
Would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact size is implementation-defined, usually 4 bytes on 32-bit platforms or 8 bytes on 64-bit ones. You can check the exact value with printf( "%zu\n", sizeof(char *) ).
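If we were to have a char* str = "hello", would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
Yes, the string literal occupies 6 bytes. The pointer str itself is stored separately and has the size of a char *.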
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof(char) is by definition always 1.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.
Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
((*buffer) + i) is pointer arithmetic -- the pointer *buffer plus i means a pointer to the element at index i of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
with char **buffer, buffer stands for the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
#include <stdio.h>  // printf
#include <stdlib.h> // malloc, free
void generate0to9(char** buffer) {
*buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
for (int i=0;i<=9;i++) {
//*((*buffer)+i) = '0' + i;
(*buffer)[i] = '0' + i;
}
(*buffer)[10]='\0';
}
int main(void) {
char *b = NULL;
generate0to9(&b); // pass a pointer to the pointer b, such that the pointer`s value can be changed in the function
printf("b: %s\n", b);
free(b);
return 0;
}
Output:
b: 0123456789
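Applied to the question's file-reading case, the same out-parameter pattern might look like the sketch below. It takes an already opened FILE * instead of a filename, and the name read_stream and its return convention are assumptions for illustration. Note that the character is held in an int, so that EOF can be distinguished from a real character.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads up to size characters from fp into a newly allocated,
 * NUL-terminated buffer stored through the out-parameter.
 * Returns the number of characters read, or -1 on allocation failure.
 * read_stream is a hypothetical stand-in for the question's file_read. */
long read_stream(FILE *fp, size_t size, char **buffer)
{
    *buffer = malloc(size + 1);          /* writes to the caller's pointer */
    if (*buffer == NULL)
        return -1;

    size_t i = 0;
    int c;                               /* int, so EOF is distinguishable */
    while (i < size && (c = fgetc(fp)) != EOF)
        (*buffer)[i++] = (char)c;
    (*buffer)[i] = '\0';
    return (long)i;
}
```

The caller passes &str, uses str after the call, and is responsible for calling free(str).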

Dynamically allocating array of strings

I want to dynamically allocate array of strings, but I'm not sure how I can do this. So I thought of making a struct and dynamically allocate that struct. So I made the code below, but this code creates assertion failure.
#include <stdio.h>
#include <stdlib.h>
typedef struct {
char str1[20];
char str2[20];
} String;
int main(void)
{
String * list;
list = (String *)malloc(sizeof(String));
int i = 1;
for (; i < 6; i++) {
realloc(list, i * sizeof(String));
printf("Input String 1: ");
scanf("%s", list[i - 1].str1);
printf("Input String 2: ");
scanf("%s", list[i - 1].str2);
}
for (i = 0; i < 5; i++)
printf("%s\t%s\n", list[i].str1, list[i].str2);
free(list);
}
What have I done wrong and how can I fix this problem?
Thanks :)
The man page for realloc says:
The realloc() function returns a pointer to the newly allocated
memory, which is suitably aligned for any kind of variable and may be
different from ptr, or NULL if the request fails.
The new pointer can be different from the one you passed to realloc, so you need to collect and use the pointer returned by realloc.
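A minimal sketch of collecting the returned pointer safely, reusing the String struct from the question. The helper name resize_list is an assumption, and n is assumed to be greater than zero. Going through a temporary means the old block is not lost if realloc fails.

```c
#include <stdlib.h>

typedef struct {
    char str1[20];
    char str2[20];
} String;

/* Resizes *list to hold n elements (n > 0). On failure the old block is
 * left untouched and -1 is returned. resize_list is a hypothetical helper. */
int resize_list(String **list, size_t n)
{
    String *tmp = realloc(*list, n * sizeof **list);
    if (tmp == NULL)
        return -1;
    *list = tmp;          /* only overwrite the pointer on success */
    return 0;
}
```

In the question's loop, this would replace the bare realloc(list, i * sizeof(String)) call, whose result was being thrown away.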
A structure always has the same size, so with this implementation each element is stuck with exactly two strings of at most 19 characters each.
A way to declare an array of strings (which are themselves arrays of characters) is doing
char **string;
If you want an array of 20 strings then that'd be:
string = malloc(sizeof(char*)*20);
Structs must have constant size, so the struct itself cannot grow; only the array of structs can be resized with realloc.

Using Malloc for i endless C -String

I was wondering, is it possible to create one endless array which can store endlessly long strings?
So what I mean exactly is, I want to create a function which gets i strings of length n. I want to input infinitely many strings into the program, each of which can be infinitely many characters long!
void endless(int i){
//store user input on char array i times
}
To achieve that I need malloc, which I would normally use like this:
string = malloc(sizeof(char));
But how would that work for, let's say, 5 or 10 arrays or even an endless stream of arrays? Or is this not possible?
Edit:
I do know memory is not endless; what I mean is, if it were infinite, how would you try to achieve it? Or maybe just allocate memory until all memory is used?
Edit 2:
So I played around a little and this came out:
void endless (char* array[], int numbersOfArrays){
int j;
//allocate memory
for (j = 0; j < numbersOfArrays; j++){
array[j] = (char *) malloc(1024*1024*1024);
}
//scan strings
for (j = 0; j < numbersOfArrays; j++){
scanf("%s",array[j]);
array[j] = realloc(array[j],strlen(array[j]+1));
}
//print strings
for (j = 0; j < numbersOfArrays; j++){
printf("%s\n",array[j]);
}
}
However, this isn't working. Maybe I got the realloc part terribly wrong?
The memory is not infinite, thus you cannot.
I mean the physical memory in a computer has its limits.
malloc() will fail and allocate no memory when your program requests too much memory:
If the function failed to allocate the requested block of memory, a null pointer is returned.
Assuming that memory is infinite, then I would create an SxN 2D array, where S is the number of strings and N the longest length of the strings you got, but obviously there are many ways to do this! ;)
Another way would be to have a simple linked list (I have one in List (C) if you need one), where every node would have a char pointer and that pointer would eventually host a string.
You can define a maximum length that you assume no string will exceed. Otherwise, you could allocate a huge 1D char array to hold the new string, use strlen() to find its actual length, and then dynamically allocate an array of exactly the size needed: that length + 1 for the null terminator.
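One way to sketch the "big scratch buffer, then exact-size copy" idea is shown here. The MAX_LINE limit of 1024 and the helper name read_line_exact are assumptions for illustration; longer lines would simply be cut at the limit in this version.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1024   /* assumed upper bound on one line of input */

/* Reads one line from in into a fixed scratch buffer, then returns a heap
 * copy that is exactly strlen + 1 bytes. Returns NULL on EOF or failure.
 * read_line_exact is a hypothetical helper name. */
char *read_line_exact(FILE *in)
{
    char scratch[MAX_LINE];
    if (fgets(scratch, sizeof scratch, in) == NULL)
        return NULL;

    scratch[strcspn(scratch, "\n")] = '\0';   /* drop the trailing newline */

    size_t len = strlen(scratch);
    char *exact = malloc(len + 1);            /* + 1 for the terminator */
    if (exact != NULL)
        memcpy(exact, scratch, len + 1);
    return exact;
}
```

Calling read_line_exact(stdin) in a loop and storing each result in a pointer array gives one exactly sized allocation per string.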
Here is a toy example program that asks the user to enter some strings. Memory is allocated for the strings in the get_string() function, then pointers to the strings are added to an array in the add_string() function, which also allocates memory for array storage. You can add as many strings of arbitrary length as you want, until your computer runs out of memory, at which point you will probably segfault because there are no checks on whether the memory allocations are successful. But that would take an awful lot of typing.
I guess the important point here is that there are two allocation steps: one for the strings and one for the array that stores the pointers to the strings. If you add a string literal to the storage array, you don't need to allocate for it. But if you add a string that is unknown at compile time (like user input), then you have to dynamically allocate memory for it.
Edit:
If anyone tried to run the original code listed below, they might have encountered some bizarre behavior for long strings. Specifically, they could be truncated and terminated with a mystery character. This was a result of the fact that the original code did not handle the input of an empty line properly. I did test it for a very long string, and it seemed to work. I think that I just got "lucky." Also, there was a tiny (1 byte) memory leak. It turned out that I forgot to free the memory pointed to from newstring, which held a single '\0' character upon exit. Thanks, Valgrind!
This all could have been avoided from the start if I had passed a NULL back from the get_string() function instead of an empty string to indicate an empty line of input. Lesson learned? The source code below has been fixed, NULL now indicates an empty line of input, and all is well.
#include <stdio.h>
#include <stdlib.h>
char * get_string(void);
char ** add_string(char *str, char **arr, int num_strings);
int main(void)
{
char *newstring;
char **string_storage;
int i, num = 0;
string_storage = NULL;
puts("Enter some strings (empty line to quit):");
while ((newstring = get_string()) != NULL) {
string_storage = add_string(newstring, string_storage, num);
++num;
}
puts("You entered:");
for (i = 0; i < num; i++)
puts(string_storage[i]);
/* Free allocated memory */
for (i = 0; i < num; i++)
free(string_storage[i]);
free(string_storage);
return 0;
}
char * get_string(void)
{
int ch; /* int, not char, so EOF can be told apart from real characters */
int num = 0;
char *newstring;
newstring = NULL;
while ((ch = getchar()) != '\n' && ch != EOF) {
++num;
newstring = realloc(newstring, (num + 1) * sizeof(char));
newstring[num - 1] = ch;
}
if (num > 0)
newstring[num] = '\0';
return newstring;
}
char ** add_string(char *str, char **arr, int num_strings)
{
++num_strings;
arr = realloc(arr, num_strings * (sizeof(char *)));
arr[num_strings - 1] = str;
return arr;
}
I was wondering is it possible to create one endless array which can store endlessly long strings?
The memory can't be infinite. So, the answer is NO. Even if you have very large memory, you will need a processor that can address that huge memory space. There is a limit on the amount of dynamic memory that can be allocated by malloc and on the amount of static memory (allocated at compile time). malloc returns NULL if no suitable block of the requested size is available in heap memory.
Assuming that you have a very large memory space available relative to the space required by your input strings, so that you will never run out of memory, you can store your input strings using a 2-dimensional array.
C does not really have multi-dimensional arrays, but there are several ways to simulate them. You can use a (dynamically allocated) array of pointers to (dynamically allocated) arrays. This is used mostly when the array bounds are not known until runtime. OR
You can also allocate a global two dimensional array of sufficient length and width. The static allocation for storing random size input strings is not a good idea. Most of the memory space will be unused.
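The first option, an array of pointers to dynamically allocated arrays, could be sketched like this. The names alloc_grid and free_grid are hypothetical, and every row gets the same width here only to keep the example short; each row could just as well have its own length.

```c
#include <stdlib.h>

/* Allocates rows zero-filled char arrays of cols bytes each, reachable
 * through one array of pointers. Returns NULL on failure.
 * alloc_grid and free_grid are hypothetical helper names. */
char **alloc_grid(size_t rows, size_t cols)
{
    char **grid = malloc(rows * sizeof *grid);
    if (grid == NULL)
        return NULL;
    for (size_t r = 0; r < rows; ++r) {
        grid[r] = calloc(cols, sizeof **grid);
        if (grid[r] == NULL) {
            while (r > 0)
                free(grid[--r]);   /* undo the rows already allocated */
            free(grid);
            return NULL;
        }
    }
    return grid;
}

/* Releases every row, then the array of row pointers. */
void free_grid(char **grid, size_t rows)
{
    for (size_t r = 0; r < rows; ++r)
        free(grid[r]);
    free(grid);
}
```

Elements are then addressed as grid[r][c], just like with a real two-dimensional array.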
Also, C programming language doesn't have string data type. You can simulate a string using a null terminated array of characters. So, to dynamically allocate a character array in C, we should use malloc like shown below:
char *cstr = malloc((MAX_CHARACTERS + 1)*sizeof(char));
Here, MAX_CHARACTERS represents the maximum number of characters that can be stored in your cstr array. The +1 is added to allocate a space for null character if MAX_CHARACTERS are stored in your string.

Segmentation fault for malloc function of a 2D char array

Here is my code snippet to create the 2D array that holds char array. It would be great if someone could find out what could be the reason. I have tried using both malloc() and calloc() to allocate memory to the 2D array, yet no positive signs.
Code Snippet:
char** attrNames = (char **)malloc(3*sizeof(char*))
for (m = 0; m < 3; m++) {
attrNames[m] = (char *)malloc(2 * sizeof(char*));
strcpy(schema->attrNames[m], temp_buff2[m]);
}
I am trying to allocate the memory and then going on a loop and again allocating memory and copy the data from a variable called temp_buff2 (has character data) into the char array.
Try the code below. Even though memory allocation error in your project might be unlikely, now is a good time to develop a good error handling reflex - it will save your bacon when you move on to more serious projects.
Note that char* pointer needs a buffer that is equal to the length of the string plus one extra byte. sizeof(char*) is a small value, only 8 on a 64-bit architecture - it just stores the value of the memory address where the string starts. Note that we need +1 on top of strlen() because strcpy() will store one extra byte (\0) as a string terminator.
char** attrNames = (char **)malloc(3*sizeof(char*));
if (!attrNames)
{
// handle memory error
}
for (m = 0; m < 3; m++) {
attrNames[m] = (char *)malloc(strlen(temp_buff2[m])+1);
if (!attrNames[m])
{
// handle memory error
}
strcpy(schema->attrNames[m], temp_buff2[m]);
}
Memory error can be handled by returning an error code from your function or via a fatal exit like this:
fprintf(stderr, "Out of memory\n");
exit(1);
You will need to #include <stdlib.h> for the prototype of exit().
You need to reserve enough space for whatever you have inside "temp_buff2". For example:
char** attrNames = (char **)malloc(3*sizeof(char*));
for (m = 0; m < 3; m++) {
attrNames[m] = (char *)malloc( strlen(temp_buff2[m]) + 1 );
strcpy(schema->attrNames[m], temp_buff2[m]);
}
Notice that I am adding 1 to the strlen result, this is because we need to reserve an additional byte for the null char terminator.
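Putting the strlen + 1 sizing and the error checks together, a self-contained version might look like the following sketch. The helper copy_names and its parameters are assumptions (src stands in for temp_buff2, n for the hard-coded 3); on any failure it releases what it has allocated so far.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated array of n heap copies of src[0..n-1], or NULL
 * if any allocation fails (everything allocated so far is released).
 * copy_names is a hypothetical helper; src stands in for temp_buff2. */
char **copy_names(const char *const *src, size_t n)
{
    char **names = malloc(n * sizeof *names);
    if (names == NULL)
        return NULL;
    for (size_t m = 0; m < n; ++m) {
        size_t len = strlen(src[m]) + 1;     /* + 1 for '\0' */
        names[m] = malloc(len);
        if (names[m] == NULL) {
            while (m > 0)
                free(names[--m]);            /* undo earlier copies */
            free(names);
            return NULL;
        }
        memcpy(names[m], src[m], len);
    }
    return names;
}
```

The caller frees each names[m] and then names itself when the array is no longer needed.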
