I'm working on a minisql code in C and i having some issues to allocate array of strings. I made a function called "alocaString" to do this (bc i'm using that a lot), but i don't think is working.
When the code reaches the line "strncpy(lista[qtnPalavras], splitStr, 100);" in the function "listaPalavras" (that have the purpose of split a string in different types of characters) a file named "strcpy-avx2.S" is created, one of the arguments of that function (**lista) is allocated with "alocaString" so i think the problem is in that function.
I already try to use valgrind and shows "is used uninitialized in this function [-Werror=uninitialized]" to all arrays of strings that i tried to use on that function, but i'm initializing them inside of the function
int alocaString (char **string, int tamanho, int posicoes){
string = malloc (posicoes * sizeof(char*));
for (int i = 0; i < posicoes; i++){
string [i] = malloc (tamanho * sizeof(char));
if (string[i] == NULL){return 0;}
}
return **string;
}
void desalocaString (char **string, int posicoes){
for (int i = 0; i < (posicoes); i++){
free (string[i]);
}
free (string);
}
int listaPalavras(char *entrada, char **lista, char *separador){ // lista as palavras
char *splitStr;
int qtnPalavras = 0;
splitStr = strtok(entrada, separador);
while (splitStr != NULL){
strncpy(lista[qtnPalavras], splitStr, 100);
qtnPalavras++;
splitStr = strtok(NULL, separador);
}
return qtnPalavras;
}
I assume that you are using these functions like this:
alocaString(lista, tamanho, posicoes);
listaPalavras(some_string, lista, some_delimiters);
desalocaString(arr);
Even without looking at the code, it seems logically wrong to allocate an array of strings first and then populate it if you do not already know how many strings it will need to fit. If you happen to allocate an array of n strings, but your listaPalavras() functions splits the provided string into n+1 or more substrings, you're going to overflow your previously allocated array. Nonetheless, this can be done taking the appropriate precautions, like carrying around sizes and checking them to avoid overflow.
The only sane way to achieve what you want is therefore to either (A) count the number of delimiters in the string first to know in advantage how many pointers you will need or (B) dynamically allocate the needed amount in listaPalavras() while splitting. You seem to be going with something similar to option A, but your code is flawed.
The desalocaString() is the only function that seems correct.
A correct implementation of alocaString() would return the allocated array (or NULL in case of failure), but you are returning **string which is just the first character of the first string. Needless to say, this does not make much sense. You don't need to take a char ** parameter, just the sizes. Secondly, in case of failure of any of the calls to malloc() you should free the previously allocated ones before returning NULL.
char **alocaString (unsigned tamanho, unsigned posicoes) {
char **lista = malloc(posicoes * sizeof(char*));
if (lista == NULL)
return NULL;
for (unsigned i = 0; i < posicoes; i++) {
lista[i] = malloc(tamanho * sizeof(char));
if (lista[i] == NULL) {
for (unsigned j = 0; j < i; j++)
free(lista[j]);
free(lista);
return NULL;
}
}
return lista;
}
As per listaPalavras(), which has the job of splitting the given string into other strings and copying them into the previously allocated array, to avoid overflowing the given array of strings you will need to also provide its length as well as the length of the previously allocated strings as argument (let's call them posicoes and tamanho like for the above function). Moreover, strncpy() will not add a NUL-terminator (\0) to the destination string if it is not found in the source string within the first n characters (n being the third argument), so you will need to add it yourself to make sure your strings are correctly terminated.
unsigned listaPalavras(const char *entrada, char *separador, char **lista, unsigned posicoes, unsigned tamanho) {
char *splitStr;
unsigned qtnPalavras = 0;
splitStr = strtok(entrada, separador);
while (qtnPalavras < posicoes && splitStr != NULL){
strncpy(lista[qtnPalavras], splitStr, tamanho);
lista[qtnPalavras][tamanho - 1] = '\0';
qtnPalavras++;
splitStr = strtok(NULL, separador);
}
return qtnPalavras;
}
Finally the code of the caller should look something like this:
char **lista;
unsigned tamanho = 100;
unsigned posicoes = 10;
unsigned palavras;
lista = alocaString(tamanho, posicoes);
if (lista == NULL) {
// handle the error somehow
}
palavras = listaPalavras(YOUR_STRING, YOUR_DELIMITERS, lista, posicoes, tamanho);
desalocaString(lista);
This should work fine, however you are limited by the fact that:
You cannot know beforehand the number of substrings that strtok() will find.
You cannot know beforehand the length of any of those substrings.
Therefore, allocating the needed lista dynamically inside listaPalavras() would make more sense.
Finally, as a side note, the names of your functions are misleading: if you need to allocate an array of strings, you might want to choose a better name than alocaString() which seems to imply that you are allocating a single string. Maybe alocaLista() and dealocaLista() would be better choices.
Related
Recently I was pondering over this question: how to make an easier way to iterate over an array of pointer in C.
If I create an array of string in C, it should look like this right?
int size = 5;
char ** strArr = (char **) malloc(sizeof(char *) * size);
if (strArr == NULL) return;
But the problem is, when you want to iterate over this array for some reason (like printing all values inside it), you have to keep track of its current size, storing in another variable.
That's not a problem, but if you create lots of arrays, you have to keep track of every single one of their sizes inside the code. If you pass this array to another function, you must pass its size as well.
void PrintValues (char ** arr, int size) {
for (int i = 0; i < size; i++)
printf("%s\n", arr[i]);
}
But when iterating over a string, it's different. You have the '\0' character, which specifies the end of the string. So, you could iterate over a string like this, with not need to keep its size value:
char * str = (char *) malloc(sizeof(char) * 4);
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
str[3] = '\0';
for (int i = 0; str[i] != '\0'; i++)
printf("%c", str[i]);
printf("\n");
Now my question:
Is it ok or morally right to allocate +1 unit in an array of pointers to maintain its tail as NULL?
char ** strArr = (char **) malloc(sizeof(char *) * (5 +1);
if (strArr == NULL) return;
strArr[0] = PseudoFunc_NewString("Car");
strArr[1] = PseudoFunc_NewString("Car#1");
strArr[2] = PseudoFunc_NewString("Car#2");
strArr[3] = PseudoFunc_NewString("Tree");
strArr[4] = PseudoFunc_NewString("Tree#1");
strArr[5] = NULL; // Stop iteration here as next element is not allocated
Then I could use the NULL pointer to control the iterator:
void PrintValues (char ** arr) {
for (int i = 0; arr[i] != NULL; i++)
printf("%s\n", arr[i]);
}
This would help me to keep the code cleaner, though it would consume more memory as a pointer size is larger than a integer size.
Also, when programming with event-based libraries, like Gtk, the size values would be released from the stack at some point, so I would have to create a pointer to dynamically store the size value for example.
In cases like this, it ok to do this? Or is it considered something bad?
Is this technique only used with char pointers because char type has a size of only 1 byte?
I miss having a foreach iterator in C...
Now my question: Is it ok or morally right to allocate +1 unit in an array of pointers to maintain its tail as NULL?
This is ok, the final NULL is called a sentinel value and using one is somewhat common practice. This is most often used when you don't even know the size of the data for some reason.
It is however, not the best solution, because you have to iterate over all the data to find the size. Solutions that store the size separately are much faster. An arrays of structs for example, containing both size and data in the same place.
Now my question: Is it ok or morally right to allocate +1 unit in an array of pointers to maintain its tail as NULL?
In C this is quite a common pattern, and it has a name. You're simply using a sentinel value.
As long as your list can not contain null pointers normally this is fine. It is a bit error-prone in general however, then again, that's C for you.
It's ok, and is a commonly used pattern.
As an alternative you can use a struct, in there you can create a size variable where you can store the current size of the array, and pass the struct as argument. The advantage is that you don't need to iterate through the entire array to know its size.
Example:
Live demo
#include <stdlib.h>
#include <stdio.h>
typedef struct
{
char **strArr;
int size;
} MyStruct;
void PrintValues(MyStruct arr) //pass the struct as an argument
{
for (int i = 0; i < arr.size; i++) //use the size passed in the struct
printf("%s\n", arr.strArr[i]);
}
int main()
{
// using the variable to extract the size, to avoid silent errors
// also removed the cast for the same reason
char **strArr = malloc(sizeof *strArr * 5);
if (strArr == NULL) return EXIT_FAILURE;
strArr[0] = "Car";
strArr[1] = "Car#1";
strArr[2] = "Car#2";
strArr[3] = "Tree";
strArr[4] = "Tree#1";
MyStruct strt = { strArr, 5 }; // initialize the struct
PrintValues(strt); //voila
free(strArr); // don't forget to free the allacated memory
return EXIT_SUCCESS;
}
This allows for direct access to an index with error checking:
// here if the array index exists, it will be printed
// otherwise no, allows for O(1) access error free
if(arr.size > 6){
printf("%s\n", arr.strArr[6]);
}
I have a function
populateAvailableExtensions(const char** gAvailableExtensions[], int gCounter)
which take a pointer to an array of strings and the number of elements in the array as parameters.
I allocate initial memory to that array using malloc(0). Specs say that it will either return a null pointer or a unique pointer that can be passed to free().
int currentAvailableExtensionCount = gCounter;
This variable will store number of string in gAvailableExtensions.
Inside this for loop
for (int i = 0; i < availableExtensionCount; ++i)
I have this piece of code
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].name);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
&availableExtensionProperties[i].name,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
where
availableExtensionProperties[i].name
returns a string.
This is how that struct is defined
typedef struct Stuff {
char name[MAX_POSSIBLE_NAME];
...
...
} Stuff;
realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
should add memory of size sizeOfAvailableExtensionName to *gAvailableExtensions de-referenced array.
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
&availableExtensionProperties[i].name,
sizeOfAvailableExtensionName);
should copy the string (this sizeOfAvailableExtensionName much memory) from
&availableExtensionPropterties[i].name
address to
&(*gAvailableExtensions)[currentAvailableExtensionCount]
address.
But I don't think the code does what I think it should because I'm getting this error
realloc(): invalid next size
Aborted
(core dumped) ./Executable
EDIT: Full code
uint32_t populateAvailableExtensions(const char** gAvailableExtensions[], int gCounter) {
int currentAvailableExtensionCount = gCounter;
void* reallocStatus;
uint32_t availableExtensionCount = 0;
vkEnumerateInstanceExtensionProperties(
VK_NULL_HANDLE, &availableExtensionCount, VK_NULL_HANDLE);
VkExtensionProperties availableExtensionProperties[availableExtensionCount];
vkEnumerateInstanceExtensionProperties(
VK_NULL_HANDLE, &availableExtensionCount, availableExtensionProperties);
for (int i = 0; i < availableExtensionCount; ++i) {
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].extensionName);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
availableExtensionProperties[i].extensionName,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
}
return currentAvailableExtensionCount;
}
This is how an external function calls on that one,
uint32_t availableExtensionCount = 0;
availableExtensions = malloc(0);
availableExtensionCount = populateAvailableExtensions(&availableExtensions);
and
const char** availableExtensions;
is declared in header file.
EDIT 2: Updated the code, now gCounter holds the number of elements in gAvailableExtensions
This loop is totally messy:
for (int i = 0; i < availableExtensionCount; ++i) {
size_t sizeOfAvailableExtensionName =
sizeof(availableExtensionProperties[i].extensionName);
reallocStatus = realloc(*gAvailableExtensions, sizeOfAvailableExtensionName);
memcpy(&(*gAvailableExtensions)[currentAvailableExtensionCount],
availableExtensionProperties[i].extensionName,
sizeOfAvailableExtensionName);
++currentAvailableExtensionCount;
}
I assume the only lines that does what you expect them to do, are the lines for (int i = 0; i < availableExtensionCount; ++i) and ++currentAvailableExtensionCount;
First, the typical way to use realloc is like this:
foo *new_p = realloc(p, new_size);
if (!new_p)
handle_error();
else
p = new_p;
The point is that realloc will not update the value of p if a reallocation happens. It is your duty to update 'p'. In your case you never update *gAvailableExtensions. I also suspect that you don't calculate sizeOfAvailableExtensionCount correctly. The operator sizeof always return a compile time constant, so the realloc doesn't actuall make any sense.
The memcpy doesn't actally make any sense either, since you are copying the string into the memory of a pointer array (probably with an additional buffer overflow).
You said that *gAvailableExtensions is a pointer to an array of pointers to strings.
That means that you have to realloc the buffer to hold the correct number of pointers, and malloc memory for each string you want to store.
For this example, I assume that .extensionName is of type char * or char[XXX]:
// Calculate new size of pointer array
// TODO: Check for overflow
size_t new_array_size =
(currentAvailableExtensionCount + availableExtensionCount) * sizeof(*gAvailableExtensions);
char **tmp_ptr = realloc(*gAvailableExtensions, new_array_size);
if (!tmp_ptr)
{
//TODO: Handle error;
return currentAvailableExtensionCount;
}
*gAvailableExtensions = tmp_ptr;
// Add strings to array
for (int i = 0; i < availableExtensionCount; ++i)
{
size_t length = strlen(availableExtensionProperties[i].extensionName);
// Allocate space for new string
char *new_s = malloc(length + 1);
if (!new_s)
{
//TODO: Handle error;
return currentAvailableExtensionCount;
}
// Copy string
memcpy (new_s, availableExtensionProperties[i].extensionName, length + 1);
// Insert string in array
(*gAvailableExtensions)[currentAvailableExtensionCount] = new_s;
++currentAvailableExtensionCount;
}
If you can guarantee that the lifetime of availableExtensionProperties[i].extensionName is longer than *gAvailableExtensions, you can simplify this a little bit by dropping malloc and memcpy in the loop, and do:
char *new_s = availableExtensionProperties[i].extensionName;
(*gAvailableExtensions)[currentAvailableExtensionCount] = new_s;
Some harsh words at the end: It seems like you have the "Infinite number of Monkeys" approach to programming, just hitting the keyboard until it works.
Such programs will just only give the illusion of working. They will break in spectacular ways sooner or later.
Programming is not a guessing game. You have to understand every piece of code you write before you move to the next one.
int currentAvailableExtensionCount =
sizeof(*gAvailableExtensions) / sizeof(**gAvailableExtensions) - 1;
is just a obfuscated way of saying
int currentAvailableExtensionCount = 0;
I stopped reading after that, because i assume that is not what you intend to write.
Pointers in c doesn't know how many elements there are in the sequence they are pointing at. They only know the size of a single element.
In your case *gAvailableExtensions is of type of char ** and **gAvailableExtensions is of type char *. Both are pointers and have the same size on a typical desktop system. So on a 64 bit desktop system the expression turns into
8/8 - 1, which equals zero.
Unless you fix this bug, or clarify that you actually want the value to always be zero, the rest of the code does not make any sense.
I want to filter an strings array passed in, something like this:
char **
filter_vids(char **vids, size_t n) {
int i;
int count = 0;
char ** filted = malloc(n * sizeof(char *));
for(i = 0; i < n; i++){
filted[i] = (char*)malloc(50 * sizeof(char));
}
for(i = 0; i < n; i++) {
if(some_filter(vids[i])) {
strcpy(filted[count++], vids[i]);
printf("in filter:%s\n", vids[i]);
}
}
return filted;
}
But the caller may not known the length of return array, it's extractly the counter variable, so what's the best practice of returning an array while telling him the right length of array?
such as
char **
filter_vids(char **vids, size_t n, int *output_length)
It's the best practice of using output_length?
I edit this function to this, as your suggestions:
char **
filter_vids(char **vids, size_t n) {
int i;
int count = 0;
char ** filted = malloc((n + 1) * sizeof(char *));
for(i = 0; i < n; i++) {
if(vids[i][0] <= 'f') {
filted[count++] = strdup(vids[i]);
}
}
filted[count] = NULL;
return filted;
}
To pass a pointer to an integer length variable whose value is then set in the function is certainly a good way. As Malcolm said, it is also general and can be used for sets of values which do not have an "invalid" member.
In the case of pointers with their invalid null pointer value one can mark the end of valid entries with a null pointer. For example, the array of string pointers which the C run time uses to pass command line arguments to main is thus terminated.
Which method to choose depends a little on how the caller wants to use the resulting array. If it is processed sequentially, a (while *p){ ..; ++p; } feels idiomatic. If, on the other hand, you need random access and must perform the equivalent of a strlen before you can do anything with the array, then it is probably better to return the length via a pointed-to length variable right away.
Two remarks:
First, note the difference between
a valid pointer to an empty string (if somebody called, let's say, myProg par1 "" par2, argv[2] could be a valid pointer to a zero byte);
and a null pointer which is pointing nowhere; in the example, argv[4] would be the null pointer, indicating the end of the argument list.
Second, You malloc more memory than you need which is wasteful in the case of longer strings and/or strict filters. You could instead allocate the string on demand inside the if clause.
These are common options:
Receive the allowed size as parameter by pointer, overwrite it with the actual size, return the array as return value.
Receive the output array as parameter by pointer, update as required, return the actual size as return value.
Append a sentinel value to the output array (here a null pointer), as suggested in the other answer.
Use a more sophisticated data structure as a return value. You could use a struct, which stores the size alongside the array or a linked list.
Example (untested):
typedef char* mystring;
typedef mystring* mystringarray;
typedef struct { mystringarray *arr; size_t size } mysizedstringarray;
/* returns filtered array, size will be updated to reflect the valid size */
mystringarray* myfun1(mystringarray in, size_t* size);
/* out will be allocated and populated, actual size is returned */
size_t myfun2(mystringarray in, size_t size, mystringarray* out);
/* output array contains valid items until sentinel value (NULL) is reached */
mystringarray* myfun3(mystringarray in, size_t size);
/* returns filtered array with actual size */
mysizedstringarray myfun4(mystringarray in, size_t size);
I have tried everything and my code looks perfectly fine to me (it obviously isn't if it's not working).
I am trying to read from some text a list of words separated by a comma, and each word will be an element of an array of strings. I don't know how many elements there will be or how long it will be.
The for loop is grand, as I count how many characters there is before. The main problem is allocation memory, sometimes I get "Segmentation Fault: 11" when I run it (as it compiles grand), sometimes when I read the items it get something like:
P?? adios (null) heyya
When it should give me something like:
hola adios bye heyya
I think I am accessing memory I am not supposed to. Anyway, here the code:
// We allocate memory for one string
variables = (char**)calloc(1, sizeof(char*));
variables[0] = (char*)calloc(100, sizeof(char));
if (variables == NULL) {
return NULL;
}
// Now we start looking for the variables
for (int i = comma_pos+1; i < *(second_pos + pos); i++) {
deleteSpaces(string, &i);
// If the character is not a comma, we copy the character
if (*(string + i) != ',') {
*(variables[stringnum] + j) = *(string + i);
j++;
} else {
// If the character is a comma, we have to allocate more memory for a new string
*(variables[stringnum] + j) = '\0';
stringnum++;
j = 0;
char **temp = variables;
// We allocate more memory for a second array
variables = realloc(variables, sizeof(char*) * stringnum);
variables[stringnum] = (char*)calloc(100, sizeof(char));
// If we cannot allocate more memory then get out
if (variables == NULL) {
return temp;
}
} // end else
} // end for
*(variables[stringnum] + j) = '\0';
It's not immediately clear to me what is wrong with your code, but it's not at all how I would approach the problem.
I would start by determining how many substrings there are by counting delimiters in the source string and adding one. This does require a pre-scan of the string, but it's likely to be much cheaper than any alternative that requires performing multiple memory allocations.
As for space for the strings themselves, if you do not need to keep the comma-delimited form of the list, then you may be able to re-use that space. Use the strtok() function to tokenize it, and store the resulting pointers.
If you must preserve the original comma-delimited string, then I suggest making a copy of the whole thing, and then tokenizing as I suggested before (and you will know how long it is already from counting delimiters). You do not need more space overall for the individual strings than the original comma-delimited one occupies.
If you prefer to avoid strtok() then it's not hard to implement the same thing manually.
you have to alloc in both directions and maybe you are already.
you need to allocate the depth, an array of pointers, then for each pointer in that array need to allocate the width for that row.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main ( void )
{
unsigned int ra;
unsigned int rb;
char **x;
x=malloc(100*sizeof(char *));
printf("%p\n",x);
for(ra=0;ra<100;ra++)
{
x[ra]=malloc(ra*sizeof(char));
}
for(ra=0;ra<100;ra++)
{
printf("%p\n",x[ra]);
}
for(ra=0;ra<100;ra++)
{
for(rb=0;rb<ra;rb++) x[ra][rb]=rb;
}
for(ra=0;ra<100;ra++)
{
for(rb=0;rb<ra;rb++)
{
printf("%u ",x[ra][rb]);
}
printf("\n");
}
return(0);
}
I'm working on a basic shell (as in the console program that awaits commands and executes them in UNIX systems) replica in C, and need to be able to manipulate 2d arrays of char to store the environment variables.
I wrote a small function to create that 2d array and initialize each string to NULL before I fill it up elsewhere in my code.
Except that it crashes as soon as the program is launched, for some reason.
I have similar issues (namely occasional segfaults, probably due to me reading/writing in an inapropriate place) with two other functions, respectively to free those 2d arrays when needed, and to get the length of one of those 2d array.
If I don't use these two functions and malloc the 2d array within the rest of my code, without initializing anything except the last entry to NULL, but instead copy the env strings directly after the malloc, I have something that works. But it'd be better to be able to prevent the memory leaks, and to have that ft_tabnew function to work so that I could reuse it in future projects.
char **ft_tabnew(size_t size)
{
char **mem;
size_t i;
if (!(mem = (char **)malloc(size + 1)))
return (NULL);
i = 0;
while (i < size + 1)
{
mem[i] = NULL;
i++;
}
return (mem);
}
void ft_tabdel(char ***as)
{
int i;
int len;
if (as == NULL)
return ;
i = 0;
len = ft_tablen(*as);
while (i < len)
{
if (*as[i])
ft_strdel(&(*as[i]));
i++;
}
free(*as);
*as = NULL;
return ;
}
size_t ft_tablen(char **tab)
{
size_t i;
i = 0;
while (tab[i])
i++;
return (i);
}
NOTE : The ft_strdel function used in ft_tabdel is freeing a string that was dynamically allocated, and sets the pointer to NULL. I've been using it for a few months in several projects and it has not failed me yet.
Hopefully, you wonderful people will be able to tell me what misconception or misunderstanding I have about 2d arrays of chars, or what stupid error I'm making here.
Thank you.
You're not allocating enough space.
if (!(mem = (char **)malloc(size + 1)))
only allocates size+1 bytes. But you need to allocate space for size+1 pointers, and pointers are typically 4 bytes. You need to multiply the number of elements by the size of each element:
if (!(mem = malloc((size + 1) * sizeof(*mem))))
In the code
char **mem;
while (i < size + 1)
{
mem[i] = NULL;
i++;
}
mem is a "pointer to a pointer to a char" and hence its size is that of a pointer, not of a char. When you say mem[i] and incement i, you increment with the size of pointer, not of char, and so overwrite memory outside your allocated memory. Try:
if (!(mem = (char **)malloc((size + 1)*sizeof(void *))))