Allocating an array of an unknown size - c

Context: What I'm trying to do is make a program that takes text as input and stores it in a character array. Then I would print each element of the array as a decimal. E.g. "Hello World" would be converted to 72, 101, etc. I would use this as a quick ASCII2DEC converter. I know there are online converters, but I'm trying to make this one on my own.
Problem: how can I allocate an array whose size is unknown at compile-time and make it the exact same size as the text I enter? So when I enter "Hello World" it would dynamically make an array with the exact size required to store just "Hello World". I have searched the web but couldn't find anything that I could make use of.

I see that you're using C. You could do something like this:
#define INC_SIZE 10
char *buf = (char*) malloc(INC_SIZE),*temp;
int size = INC_SIZE,len = 0;
int c; // int rather than char, so that EOF can be told apart from real characters
while ((c = getchar()) != '\n' && c != EOF) { // I assume you want to read a line of input
if (len == size) {
size += INC_SIZE;
temp = (char*) realloc(buf,size);
if (temp == NULL) {
// not enough memory probably, handle it yourself
}
buf = temp;
}
buf[len++] = c;
}
// done, note that the character array has no '\0' terminator and the length is represented by `len` variable
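
For reference, here is a minimal sketch of the same idea wrapped in a function, assuming you want a NUL-terminated result and want the allocation failure handled. The names read_line and print_codes are made up for this example, and reading from a FILE * instead of stdin is just a convenience:

```c
#include <stdio.h>
#include <stdlib.h>

#define INC_SIZE 10

/* Reads one line from `in` into a heap buffer that grows as needed.
 * Returns a NUL-terminated string the caller must free(), or NULL if
 * memory runs out. The length (without the terminator) goes to *out_len. */
char *read_line(FILE *in, size_t *out_len)
{
    size_t size = INC_SIZE, len = 0;
    char *buf = malloc(size);
    int c;                          /* int so EOF is distinguishable */

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != '\n' && c != EOF) {
        if (len + 1 == size) {      /* keep one byte free for '\0' */
            char *temp = realloc(buf, size + INC_SIZE);
            if (temp == NULL) {
                free(buf);          /* the old block is still ours to free */
                return NULL;
            }
            buf = temp;
            size += INC_SIZE;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    *out_len = len;
    return buf;
}

/* Prints each character of `s` as a decimal code, comma separated. */
void print_codes(const char *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        printf("%d%s", (unsigned char)s[i], i + 1 < len ? ", " : "\n");
}
```

You would call read_line(stdin, &len), pass the result to print_codes, and free() it afterwards.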

Typically, in environments like a PC where memory is not tightly constrained, I would just dynamically allocate a (language-dependent) array/string/whatever of, say, 64K and keep an index/pointer/whatever to the current end point plus one, i.e. the next index/location at which to place any new data.

If you use C++, you can store the input characters in a std::string and access each character with operator[], as in the following code:
std::string input;
std::getline(std::cin, input); // reads the whole line, spaces included

I'm going to guess you mean C, as that's one of the commonest compiled languages where you would have this problem.
Variables that you declare in a function are stored on the stack. This is nice and efficient, gets cleaned up when your function exits, etc. The only problem is that the size of the stack slot for each function is fixed and cannot change while the function is running.
The second place you can allocate memory is the heap. This is a free-for-all that you can allocate and deallocate memory from at runtime. You allocate with malloc(), and when finished, you call free() on it (this is important to avoid memory leaks).
With heap allocations you must know the size at allocation time, but it's better than having it stored in fixed stack space that you cannot grow if needed.
This is a simple and stupid function to decode a string to its ASCII codes using a dynamically-allocated buffer:
char* str_to_ascii_codes(char* str)
{
size_t i;
size_t str_length = strlen(str);
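char* ascii_codes = malloc(str_length*4+1);
ascii_codes[0] = '\0'; /* so that an empty input yields an empty string */
for(i = 0; i<str_length; i++)
snprintf(ascii_codes+i*4, 5, "%03d ", (unsigned char)str[i]); /* cast: plain char may be negative */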
return ascii_codes;
}
Edit: You mentioned in a comment wanting to get the buffer just right. I cut corners with the above example by making each entry in the string a known length, and not trimming the result's extra space character. This is a smarter version that fixes both of those issues:
char* str_to_ascii_codes(char* str)
{
size_t i;
int written;
size_t str_length = strlen(str), ascii_codes_length = 0;
char* ascii_codes = malloc(str_length*4+1);
for(i = 0; i<str_length; i++)
{
snprintf(ascii_codes+ascii_codes_length, 5, "%d %n", (unsigned char)str[i], &written); /* cast: plain char may be negative */
ascii_codes_length = ascii_codes_length + written;
}
if (ascii_codes_length == 0) /* empty input: leave room for just the terminator */
ascii_codes_length = 1;
/* This is intentionally one byte short, to trim the trailing space char */
ascii_codes = realloc(ascii_codes, ascii_codes_length);
/* Add new end-of-string marker */
ascii_codes[ascii_codes_length-1] = '\0';
return ascii_codes;
}
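
If you want the buffer exactly right without the realloc at the end, another option is to measure first and allocate afterwards. This is only a sketch of that alternative, not the code from the answer above; it relies on the C99 rule that snprintf(NULL, 0, ...) returns the number of characters it would have written, and the name str_to_codes_exact is made up:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string such as "72 105" for "Hi".
 * The caller must free() it. Returns NULL if allocation fails. */
char *str_to_codes_exact(const char *str)
{
    size_t n = strlen(str), total = 1;   /* 1 for the final '\0' */
    size_t i, pos = 0;
    char *out;

    /* Pass 1: measure how many bytes each "code + space" needs. */
    for (i = 0; i < n; i++)
        total += (size_t)snprintf(NULL, 0, "%d ", (unsigned char)str[i]);
    if (n > 0)
        total--;                         /* the last code has no trailing space */

    out = malloc(total);
    if (out == NULL)
        return NULL;

    /* Pass 2: write into the exactly sized buffer. */
    for (i = 0; i < n; i++)
        pos += (size_t)snprintf(out + pos, total - pos, "%d%s",
                                (unsigned char)str[i], i + 1 < n ? " " : "");
    out[pos] = '\0';
    return out;
}
```

The cost is formatting every number twice, which is unlikely to matter for a line of text.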

Related

Tokenizing string from dynamic array into multiple lines in static 2D char array

I have a dynamic array that holds a string containing '\n' characters, so this string is made up of multiple lines. I'm trying to extract the lines and put them all into a 2D char array and I'm getting segmentation errors.
Here's my code:
char *input_lines = malloc(MAX_LINE_LEN*sizeof(char));
input_lines = extractInput(MAX_LINE_LEN, input_file);
char inputLines_counted[lineCount_input][MAX_LINE_LEN];
char *t = strtok(input_lines, "\n");
for(i = 0; i < lineCount_input; i++) {
strcpy(inputLines_counted[i], t);
// printf("%s\n", inputLines_counted[i]);
t = strtok(NULL, "\n");
}
Upon creating the dynamic array, I use the extractInput(MAX_LINE_LEN, input_file) function to populate the input_lines array with a string containing multiple lines.
Here's the extract function:
char *extractInput(int len, FILE *file) {
char tmp[len];
char *pos;
char *input_lines = malloc(len*sizeof(char));
char *lines;
while(fgets(tmp, len, file)) {
// if((pos = strchr(tmp, '\n')) != NULL) {
// *pos = ' ';
// }
input_lines = realloc(input_lines, (strlen(input_lines) + len)*sizeof(char));
strcat(input_lines, tmp);
}
return input_lines;
}
Why am I getting segfaults here?
The function call
input_lines = realloc(input_lines, (strlen(input_lines) + len)*sizeof(char));
takes your currently allocated memory block and expands it, if it can. You should check the return value of realloc, since it may fail.
By the way, when you allocate memory for a string in C, you always need to leave space for the terminating \0.
See what happens with this file:
hello\n
world\n
The first fgets reads hello\n into tmp.
You then call realloc even though it is unnecessary, because input_lines already points to a buffer that could hold the string:
char *input_lines = malloc(MAX_LINE_LEN*sizeof(char));
Now with your realloc
input_lines = realloc(input_lines, (strlen(input_lines) + len)*sizeof(char));
you compute strlen(input_lines) + len, so the new size depends on whatever string input_lines currently holds.
But the important thing to notice is the following line:
strcat(input_lines, tmp);
You have not initialized the memory that input_lines points to. It can contain anything, including \0 bytes in arbitrary places, so both strlen and strcat may start from the wrong position in the buffer or run past its end, which can cause the error you describe.
Either do a memset or use calloc when you allocate the buffer.
If you use realloc, keep track of the total size you have allocated and how much of it you are using. Before you copy into the buffer, check whether there is enough room; if not, grow the buffer by a certain number of bytes.
I also noticed that you read the file line by line, concatenate the lines, and later use strtok to divide them again. It would be more efficient to return an array of lines.
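
To illustrate the bookkeeping described above, here is a minimal sketch that tracks both the allocated size and the used size. The name read_all, the 256-byte chunk and the doubling policy are assumptions of this example, not part of the original code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads the whole stream into one NUL-terminated heap string.
 * Returns NULL on allocation failure; the caller frees the result. */
char *read_all(FILE *file)
{
    char tmp[256];
    size_t cap = 256, used = 0;          /* bytes allocated / bytes in use */
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    buf[0] = '\0';                       /* start as a valid empty string */

    while (fgets(tmp, sizeof tmp, file)) {
        size_t n = strlen(tmp);
        if (used + n + 1 > cap) {        /* +1 for the terminating '\0' */
            size_t new_cap = cap;
            char *p;
            while (used + n + 1 > new_cap)
                new_cap *= 2;
            p = realloc(buf, new_cap);
            if (p == NULL) {             /* buf is still valid, so free it */
                free(buf);
                return NULL;
            }
            buf = p;
            cap = new_cap;
        }
        memcpy(buf + used, tmp, n + 1);  /* copies the '\0' as well */
        used += n;
    }
    return buf;
}
```

Because used is tracked explicitly, nothing here depends on the contents of uninitialized memory.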

Using Malloc for i endless C -String

I was wondering is it possible to create one endless array which can store endlessly long strings?
What I mean exactly is that I want to create a function which gets i strings of length n. I want to input infinitely many strings into the program, each of which can be infinitely many characters long!
void endless(int i){
//store user input on char array i times
}
To achieve that I need malloc, which I would normally use like this:
string = malloc(sizeof(char));
But how would that work for, let's say, 5 or 10 arrays, or even an endless stream of arrays? Or is this not possible?
Edit:
I do know memory is not endless. What I mean is: if it were infinite, how would you try to achieve it? Or maybe just allocate memory until all memory is used?
Edit 2:
So I played around a little and this came out:
void endless (char* array[], int numbersOfArrays){
int j;
//allocate memory
for (j = 0; j < numbersOfArrays; j++){
array[j] = (char *) malloc(1024*1024*1024);
}
//scan strings
for (j = 0; j < numbersOfArrays; j++){
scanf("%s",array[j]);
array[j] = realloc(array[j],strlen(array[j]+1));
}
//print stringd
for (j = 0; j < numbersOfArrays; j++){
printf("%s\n",array[j]);
}
}
However, this isn't working. Maybe I got the realloc part terribly wrong?
The memory is not infinite, thus you cannot.
I mean the physical memory in a computer has its limits.
malloc() will fail and allocate no memory when your program requests too much memory:
If the function failed to allocate the requested block of memory, a null pointer is returned.
Assuming that memory is infinite, then I would create an SxN 2D array, where S is the number of strings and N the longest length of the strings you got, but obviously there are many ways to do this! ;)
Another way would be to have a simple linked list (I have one in List (C) if you need one), where every node would have a char pointer and that pointer would eventually host a string.
You can define a maximum length that you assume no string will exceed. Otherwise, you could allocate a huge 1D char array to hold each new string, use strlen() to find its actual length, and then dynamically allocate an array of exactly the size needed: that length + 1 for the null terminator.
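
A rough sketch of that last idea, assuming lines never exceed a fixed scratch size; SCRATCH_SIZE and read_line_exact are names invented for this example:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 1024   /* assumed upper bound on one line */

/* Reads one line into a fixed scratch buffer, then returns a heap copy
 * sized to exactly strlen + 1. Returns NULL at end of input or on failure. */
char *read_line_exact(FILE *in)
{
    char scratch[SCRATCH_SIZE];
    size_t len;
    char *copy;

    if (fgets(scratch, sizeof scratch, in) == NULL)
        return NULL;
    scratch[strcspn(scratch, "\n")] = '\0';  /* drop the newline, if any */

    len = strlen(scratch);
    copy = malloc(len + 1);                  /* +1 for the null terminator */
    if (copy != NULL)
        memcpy(copy, scratch, len + 1);
    return copy;
}
```

Lines longer than the scratch buffer would be split across several calls, which is the price of assuming a maximum length.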
Here is a toy example program that asks the user to enter some strings. Memory is allocated for the strings in the get_string() function, then pointers to the strings are added to an array in the add_string() function, which also allocates memory for array storage. You can add as many strings of arbitrary length as you want, until your computer runs out of memory, at which point you will probably segfault because there are no checks on whether the memory allocations are successful. But that would take an awful lot of typing.
I guess the important point here is that there are two allocation steps: one for the strings and one for the array that stores the pointers to the strings. If you add a string literal to the storage array, you don't need to allocate for it. But if you add a string that is unknown at compile time (like user input), then you have to dynamically allocate memory for it.
Edit:
If anyone tried to run the original code listed below, they might have encountered some bizarre behavior for long strings. Specifically, they could be truncated and terminated with a mystery character. This was a result of the fact that the original code did not handle the input of an empty line properly. I did test it for a very long string, and it seemed to work. I think that I just got "lucky." Also, there was a tiny (1 byte) memory leak. It turned out that I forgot to free the memory pointed to from newstring, which held a single '\0' character upon exit. Thanks, Valgrind!
This all could have been avoided from the start if I had passed a NULL back from the get_string() function instead of an empty string to indicate an empty line of input. Lesson learned? The source code below has been fixed, NULL now indicates an empty line of input, and all is well.
#include <stdio.h>
#include <stdlib.h>
char * get_string(void);
char ** add_string(char *str, char **arr, int num_strings);
int main(void)
{
char *newstring;
char **string_storage;
int i, num = 0;
string_storage = NULL;
puts("Enter some strings (empty line to quit):");
while ((newstring = get_string()) != NULL) {
string_storage = add_string(newstring, string_storage, num);
++num;
}
puts("You entered:");
for (i = 0; i < num; i++)
puts(string_storage[i]);
/* Free allocated memory */
for (i = 0; i < num; i++)
free(string_storage[i]);
free(string_storage);
return 0;
}
char * get_string(void)
{
int ch; /* int rather than char, so that EOF can be detected */
int num = 0;
char *newstring;
newstring = NULL;
while ((ch = getchar()) != '\n' && ch != EOF) {
++num;
newstring = realloc(newstring, (num + 1) * sizeof(char));
newstring[num - 1] = ch;
}
if (num > 0)
newstring[num] = '\0';
return newstring;
}
char ** add_string(char *str, char **arr, int num_strings)
{
++num_strings;
arr = realloc(arr, num_strings * (sizeof(char *)));
arr[num_strings - 1] = str;
return arr;
}
I was wondering is it possible to create one endless array which can store endlessly long strings?
Memory can't be infinite, so the answer is no. Even if you had a very large memory, you would need a processor that could address that huge memory space. There is a limit on the amount of dynamic memory that malloc can allocate and on the amount of static memory (allocated at compile time). malloc returns NULL if the heap has no suitable block of the size you requested.
Assuming that the memory available to you is very large relative to the space your input strings require, so that you never run out, you can store your input strings in a two-dimensional array.
C does not really have multi-dimensional arrays, but there are several ways to simulate them. You can use a (dynamically allocated) array of pointers to (dynamically allocated) arrays. This is used mostly when the array bounds are not known until runtime.
Alternatively, you can allocate a global two-dimensional array of sufficient length and width. However, static allocation is not a good idea for storing input strings of arbitrary size, because most of the memory will go unused.
Also, the C programming language doesn't have a string data type. You can simulate a string using a null-terminated array of characters. So, to dynamically allocate a character array in C, use malloc as shown below:
char *cstr = malloc((MAX_CHARACTERS + 1)*sizeof(char));
Here, MAX_CHARACTERS represents the maximum number of characters that can be stored in your cstr array. The +1 is added to allocate a space for null character if MAX_CHARACTERS are stored in your string.

Constantly adjusting the size of an array with C

In my application, we take in char values one at a time and we need to be able to put them into a string. We are assembling these strings one by one by putting the char values into a char array, then clearing the array. However, the strings are each of a different length and we are unable to determine the size of a string in advance. How can we change the size of the array to add more space as we need it?
Also, how can we print out the array?
If the array was dynamically allocated with malloc, you can resize it with realloc:
int array_size = 1024;
char *array = (char *) malloc(array_size);
int n = 0;
int c; /* int, so that EOF can be told apart from a valid character */
while ((c = getchar()) != EOF) {
array[n++] = c;
if (n >= array_size) {
array_size += 1024;
array = (char *) realloc(array, array_size);
}
}
array[n] = '\0';
For printing out the contents of the array, you can simply pass it to printf or puts:
printf("%s\n", array);
puts(array);
If you don't know the size you are going to need and are adding one character at a time, you can consider using a linked list. It can grow as much as you need it to. The disadvantages are that lookup is rather slow and that, if you need to free or clear the memory, you have to do so for each element, one at a time.
You can also take the dynamic array approach: allocate a certain size which you consider large enough and when that is 80% full, allocate a new buffer, twice as large and copy the contents of the old one in the new, larger one.
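
Here is a minimal sketch of such a growable buffer. It doubles when the buffer is completely full rather than at 80%, and realloc does the copy into the larger block; struct charbuf and the charbuf_* names are invented for this example. The struct must start out zeroed, e.g. struct charbuf b = {0};

```c
#include <stdlib.h>

/* A growable character buffer; all names here are illustrative. */
struct charbuf {
    char *data;
    size_t len;   /* characters stored, not counting the '\0' */
    size_t cap;   /* bytes allocated */
};

/* Appends one character, doubling the capacity when full.
 * Returns 0 on success, -1 if memory runs out (buffer left unchanged). */
int charbuf_push(struct charbuf *b, char c)
{
    if (b->len + 1 >= b->cap) {              /* need room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';                  /* always printable with puts() */
    return 0;
}

/* Empties the buffer for the next string without releasing its memory. */
void charbuf_clear(struct charbuf *b)
{
    b->len = 0;
    if (b->data != NULL)
        b->data[0] = '\0';
}

/* Releases the memory and returns the buffer to its zeroed state. */
void charbuf_free(struct charbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final length, and charbuf_clear lets the same memory be reused for each new string.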

C: how to read in a variable amount of info from files and store it in array

I am not used to programming in C, so I am wondering how to have an array, read a variable number of values from a file, and store them in the array.
//how do I declare an array whose sizes varies
do {
char buffer[1000];
fscanf(file, "%[^\n]\n", buffer);
//how do i add buffer to array
}while(!feof(file));
int nlines = 0;
char **lines = NULL; /* Array of resulting lines */
int curline = 0;
char buffer[BUFSIZ]; /* Just allocate this once, not each time through the loop */
do {
if (fgets(buffer, sizeof buffer, file)) { /* fgets() is the easy way to read a line */
if (curline >= nlines) { /* Have we filled up the result array? */
nlines += 1000; /* Increase size by 1,000 */
lines = realloc(lines, nlines*sizeof(*lines)); /* And grow the array */
}
lines[curline] = strdup(buffer); /* Make a copy of the input line and add it to the array */
curline++;
}
}while(!feof(file));
Arrays are always fixed-size in C. You cannot change their size. What you can do is make an estimate of how much space you'll need beforehand and allocate that space dynamically (with malloc()). If you happen to run out of space, you reallocate. See the documentation for realloc() for that. Basically, you do:
buffer = realloc(buffer, size);
The new size can be larger or smaller than what you had before (meaning you can "grow" or "shrink" the array.) So if at first you want, say, space for 5000 characters, you do:
char* buffer = malloc(5000);
If later you run out of space and want an additional 2000 characters (so the new size will be 7000), you would do:
buffer = realloc(buffer, 7000);
The already existing contents of buffer are preserved. Note that realloc() might not be able to really grow the memory block, so it might allocate an entirely new block first, then copy the contents of the old memory to the new block, and then free the old memory. That means that if you stored a copy of the buffer pointer elsewhere, it will point to the old memory block which doesn't exist anymore. For example:
char* ptr = buffer;
buffer = realloc(buffer, 7000);
At that point, ptr is only valid if ptr == buffer, which is not guaranteed to be the case.
It appears that you are trying to read until you read a newline.
The easiest way to do this is via getline.
char *buffer = NULL;
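size_t buffer_len = 0;
ssize_t ret = getline(&buffer, &buffer_len, file);
...this will read one line of text from the file file (unless ret is -1, in which case there's an error or you're at the end of the file). Note that getline is a POSIX function rather than part of standard C, and the buffer it allocates should be released with free() when you are done.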
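
As a small sketch of how getline is typically used in a loop, assuming a POSIX system: the function name longest_line is made up, and it simply reports the longest line length to show that one buffer is reused and grown across calls.

```c
#define _POSIX_C_SOURCE 200809L  /* getline() is POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns the length of the longest line in `file`, excluding the newline. */
size_t longest_line(FILE *file)
{
    char *buffer = NULL;   /* getline allocates and grows this for us */
    size_t buffer_len = 0;
    ssize_t ret;
    size_t longest = 0;

    while ((ret = getline(&buffer, &buffer_len, file)) != -1) {
        size_t n = (size_t)ret;
        if (n > 0 && buffer[n - 1] == '\n')
            n--;           /* do not count the newline itself */
        if (n > longest)
            longest = n;
    }
    free(buffer);          /* one free, however many lines were read */
    return longest;
}
```

If you need to keep each line, copy it (for example with strdup) before the next call, because getline overwrites the same buffer.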
An array where the string data is in the array entry is usually a non-optimal choice. If the complete set of data will fit comfortably in memory and there's a reasonable upper bound on the number of entries, then a pointer-array is one choice.
But first, avoid scanf %s and %[] formats without explicit lengths. Using your example buffer size of 1000, the maximum string length that you can read is 999, so:
/* Some needed data */
int n;
struct ptrarray_t
{
char **strings;
int nalloc; /* number of string pointers allocated */
int nused; /* number of string pointers used */
} pa_hdr; /* presume this was initialized previously */
...
n = fscanf(file, "%999[^\n]", buffer);
if (n!=1 || getc(file)!='\n')
{
/* there's a problem */
}
/* Now add a string to the array */
if (pa_hdr.nused < pa_hdr.nalloc)
{
int len = strlen(buffer);
char *cp = malloc(len+1);
strcpy(cp, buffer);
pa_hdr.strings[pa_hdr.nused++] = cp;
}
A reference to any string hereafter is just pa_hdr.strings[i], and a decent design will use function calls or macros to manage the header, which in turn will be in a header file and not inline. When you're done with the array, you'll need a delete function that will free all of those malloc()ed pointers.
If there are a large number of small strings, malloc() can be costly, both in time and space overhead. You might manage pools of strings in larger blocks that will live nicely with the memory allocation and paging of the host OS. Using a set of functions to effectively make an object out of this string-array will help your development. You can pick a simple strategy, as above, and optimize the implementation later.
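
To make the "set of functions" idea concrete, here is a minimal sketch of an add function that also grows the pointer array, plus the matching delete function. The struct mirrors the one above; ptrarray_add, ptrarray_delete and the doubling policy are assumptions of this example, and the header is assumed to start out zeroed:

```c
#include <stdlib.h>
#include <string.h>

struct ptrarray_t {
    char **strings;
    int nalloc;   /* number of string pointers allocated */
    int nused;    /* number of string pointers used */
};

/* Stores a private copy of `s`. Returns 0 on success, -1 if out of memory. */
int ptrarray_add(struct ptrarray_t *pa, const char *s)
{
    char *cp;

    if (pa->nused == pa->nalloc) {           /* pointer array is full: grow it */
        int new_alloc = pa->nalloc ? pa->nalloc * 2 : 8;
        char **p = realloc(pa->strings, (size_t)new_alloc * sizeof *p);
        if (p == NULL)
            return -1;
        pa->strings = p;
        pa->nalloc = new_alloc;
    }
    cp = malloc(strlen(s) + 1);              /* +1 for the '\0' */
    if (cp == NULL)
        return -1;
    strcpy(cp, s);
    pa->strings[pa->nused++] = cp;
    return 0;
}

/* Frees every stored string and then the pointer array itself. */
void ptrarray_delete(struct ptrarray_t *pa)
{
    int i;
    for (i = 0; i < pa->nused; i++)
        free(pa->strings[i]);
    free(pa->strings);
    pa->strings = NULL;
    pa->nalloc = pa->nused = 0;
}
```

Keeping these two functions as the only code that touches the header makes it easier to swap in a pooled allocator later.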

How to create a string of variable length with character X in C

I am new to C and am having some troubles with strings. How do I create a string of variable length containing a specified character in C? This is what I have tried but I get a compiler error:
int cLen = 8 /* Specified Length */
char chr = 'a'; /* Specified Character */
char outStr[cLen];
int tmp = 0;
while (tmp < cLen-1)
outStr[tmp++] = chr;
outStr[cLen-1] = '\0';
/* outStr = "aaaaaaaa" */
You can try:
char *str = malloc(cLen + 1);
memset(str, 'a', cLen);
str[cLen] = 0;
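
Wrapped into a function with the allocation checked, that might look like the sketch below; make_repeated is a name invented for this example, and the caller is responsible for calling free() on the result:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated string of `len` copies of `chr`,
 * or NULL if the allocation fails. */
char *make_repeated(char chr, size_t len)
{
    char *str = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (str == NULL)
        return NULL;
    memset(str, chr, len);
    str[len] = '\0';
    return str;
}
```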
Strings in C might not be as flexible as you want, at first glance.
What you did with "char outStr[]" was to indicate you'd like a pointer to char, that can be iterated with array syntax... it creates no actual storage for the characters, because you never mentioned how many you would like to store.
In C you can have the storage decoupled from these special variables, called pointers. The example of wanting a variable length string is actually a good example of why would you want that: I want an entity that holds knowledge of where the storage is at; I want methods to allow me to change the storage size.
So, you prepare yourself to deal with dynamic memory allocation by including
#include <stdlib.h>
declare a pointer to chars by
char *cpString;
you ask for an allocation of "n" chars with
cpString=malloc(n*sizeof(char));
Now you can strcat, printf, whatever you want to do with a string that has n-1 characters (because it must be null terminated).
Specifically, you can now initialize your string with
memset(cpString,X,n-1);
cpString[n-1]=0;
which creates a XXXX...XXX\0 string, of n-1 characters.
When you want to change the storage size of cpString, here's the tricky part: you need to free the allocated memory before you request a new storage allocation
if (cpString !=0)
{
free(cpString);
cpString=0;
}
cpString=malloc(n*sizeof(char));
otherwise the dynamic memory storage area (called a "heap") is left with an un-reclaimable piece of the old n size.
There are better allocators, that don't need free(), but I better leave you studying and practicing with malloc() free() usage.
There's no need to use strncat(), strings are just character arrays so do the assignment directly character by character:
void repeated_string(char *out, size_t len, char v)
{
for(; len > 0; --len)
*out++ = v;
*out = '\0';
}
There are two issues with your code:
1) the length is (probably) not what you're expecting. You have:
int cLen = 8; /* Specified Length */
Presumably you want a string of length 8. Because you have to add a null terminator, you're only getting a string of length 7 right now. If you want 8 characters, make the array one element larger and update your comment to make that clear:
int cLen = 9; /* Specified Length (8) + 1 for the null terminator */
2) you're not assigning the char correctly:
char chr = "a";
is not right. Characters are specified with a single quote:
char chr = 'a';
After that your code should work.

Resources