I'm attempting to fill a few strings using a function, but the strings don't seem to be getting filled properly. The print statement is just 4 empty lines. BUT if I un-comment the char ** pls line, it prints all three strings properly even if I never use the variable pls anywhere. It also runs properly in debug mode without the pls variable existing. I'm not entirely sure what I did that it isn't happy about.
char * dataFile = (char *) calloc(64, sizeof(char));
//char ** pls = &dataFile;
char * queryFile = (char *) calloc(64, sizeof(char));
char * outFile = (char *) calloc(64, sizeof(char));
for(i = 1; i <argc; ++i)
{
char command[3];
char * iterator = argv[i];
command[0] = *iterator;
++iterator;
command[1] = *iterator;
++iterator;
command[2] = *iterator;
if(strcmp(command, "df=") == 0)
determineFileString(iterator, &dataFile);
else if(strcmp(command, "if=") == 0)
determineFileString(iterator, &queryFile);
else if(strcmp(command, "of=") == 0)
determineFileString(iterator, &outFile);
}
printf("%s\n%s\n%s\n", dataFile, queryFile, outFile);
void determineFileString(char * iterator, char ** file)
{
char * p = *file;
++iterator;
while(*iterator != '\0')
{
*p = *iterator;
++p;
++iterator;
}
*p = '\0';
}
You are calling strcmp but the first operand does not point to a string. A string is defined as some characters followed by a null terminator.
Your code will also cause undefined behaviour if an argv[i] string is shorter than 2 characters, because you always copy 3 characters out of it.
To fix this, either make command bigger and put a null terminator on the end, or use memcmp instead of strcmp. (But be careful with memcmp as it also causes UB if both objects are not at least as big as the size).
Here is a possible fix:
for(i = 1; i <argc; ++i)
{
if ( strlen(argv[i]) < 3 )
continue;
if ( memcmp(argv[i], "df=", 3) == 0 )
determineFileString(argv[i] + 3, &dataFile);
else if // etc.
}
BTW, the determineFileString function does not do any buffer size checking (it could buffer overflow). I'd suggest redesigning this function; perhaps it could do a length check and call realloc inside the function.
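The redesign suggested above could look roughly like the following minimal sketch. The name set_file_string and its return convention are hypothetical (they are not in the original code), and the sketch assumes that *file is either NULL or a pointer previously returned by malloc, calloc or realloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for determineFileString: copies src into *file,
 * resizing the destination so that the text always fits.
 * Assumes *file is NULL or was obtained from malloc/calloc/realloc.
 * Returns 0 on success, -1 if the allocation fails (*file is then unchanged). */
int set_file_string(char **file, const char *src)
{
    assert(file != NULL && src != NULL);

    size_t len = strlen(src);
    char *p = realloc(*file, len + 1);   /* +1 for the null terminator */
    if (p == NULL)
        return -1;

    memcpy(p, src, len + 1);             /* copies the terminator as well */
    *file = p;
    return 0;
}
```

With the loop from the fix above, the call would then become something like set_file_string(&dataFile, argv[i] + 3), with the return value checked.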
The task was to read the first 20 lines from a specific file and format them to use only specific parts, the next step was to store those formatted strings in a dynamic array (char ** str | a pointer to a pointer), send it to a function and print it out with said function
Here is the main code:
int main(int argc, char* argv[]){
FILE* file = fopen("./passwd.txt", "r"); // open file
if (!file)
{
perror("Error opening file");
return 1;
}
char line [MAXCHARS];
int counter = 0;
char ** str;
str = malloc(20 * sizeof (char*));
while (fgets(line, MAXCHARS, file) && counter < 20) {
char * position;
if ((position = strchr(line,':'))){
char * end_char;
*position = 0; //setting here the value 0, will terminate the string line (first column)
if((position = strchr(++position,':')) && (end_char = strchr(++position,':'))){ //increment the position to skip the second column
*end_char = 0; //set the value 0 to end_char to terminate the string pointed by position
char * temp_str = "\0";
sprintf(temp_str, "{%d} - {%s}\n", atoi(position), line ); //concatenate line and userID into one string and store it into a temporary string
*(str + counter) = malloc(sizeof (char) * strlen(temp_str)); //set the memory space for the temp_string into the array
*(str + counter) = temp_str; //copy the string into the array
}
}
counter++;
}
printArray(str);
fclose(file);
if (line)
free(line);
return 0;
}
And here is the print function:
void printArray(char ** array){
for(int i = 0; i < 20; i++){
printf("%s",*(array+i));
free(*(array+i));
}
free(array);
}
I cannot find the error. The code compiles, but running it ends with:
Process finished with exit code -1073741819 (0xC0000005)
So at least it compiles, and I think is just a problem in my pointers handling skills, but I'm not able to find the error.
Can someone help me?
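There are 3 errors in your program:
1. temp_str is used as the destination of sprintf, but no writable memory has been allocated for it. It points to a string literal, which must not be modified:
char * temp_str = "\0";
sprintf(temp_str, "{%d} - {%s}\n", atoi(position), line );
2. The temp_str pointer itself is stored in *(str + counter) instead of the string being copied into the block that malloc just returned, so that block is leaked. printArray later prints and frees a pointer that did not come from malloc, which is undefined behavior.
3. line is an array, not a pointer returned by malloc, so it must not be passed to free:
if (line)
{
free(line);
}
I corrected these points here: https://godbolt.org/z/7KPfnTEMY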
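As a rough illustration of how the first two problems listed above can be avoided, here is a minimal sketch of the formatting step on its own. The helper name format_entry is hypothetical; it sizes the buffer with snprintf(NULL, 0, ...), which is standard since C99, and returns heap memory that can be stored in the array and freed later.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds "{uid} - {name}\n" in freshly allocated memory.
 * Returns NULL on failure; the caller owns (and must free) the result. */
char *format_entry(const char *name, int uid)
{
    assert(name != NULL);

    /* First pass: ask snprintf how many characters are needed. */
    int len = snprintf(NULL, 0, "{%d} - {%s}\n", uid, name);
    if (len < 0)
        return NULL;

    char *out = malloc((size_t)len + 1);   /* +1 for the null terminator */
    if (out == NULL)
        return NULL;

    /* Second pass: write into a buffer now known to be large enough. */
    snprintf(out, (size_t)len + 1, "{%d} - {%s}\n", uid, name);
    return out;
}
```

In the question's loop this would be used as str[counter] = format_entry(line, atoi(position)). It probably also makes sense to increment counter only when an entry was actually stored and to pass that count to printArray, so that no uninitialized pointer is printed or freed.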
My str_split function returns (or at least I think it does) a char** - so a list of strings essentially. It takes a string parameter, a char delimiter to split the string on, and a pointer to an int to place the number of strings detected.
The way I did it, which may be highly inefficient, is to make a buffer of x length (x = length of string), then copy element of string until we reach delimiter, or '\0' character. Then it copies the buffer to the char**, which is what we are returning (and has been malloced earlier, and can be freed from main()), then clears the buffer and repeats.
Although the algorithm may be iffy, the logic is definitely sound as my debug code (the _D) shows it's being copied correctly. The part I'm stuck on is when I make a char** in main, set it equal to my function. It doesn't return null, crash the program, or throw any errors, but it doesn't quite seem to work either. I'm assuming this is what is meant by the term Undefined Behavior.
Anyhow, after a lot of thinking (I'm new to all this) I tried something else, which you will see in the code, currently commented out. When I use malloc to copy the buffer to a new string, and pass that copy to aforementioned char**, it seems to work perfectly. HOWEVER, this creates an obvious memory leak as I can't free it later... so I'm lost.
When I did some research I found this post, which follows the idea of my code almost exactly and works, meaning there isn't an inherent problem with the format (return value, parameters, etc) of my str_split function. YET his only has 1 malloc, for the char**, and works just fine.
Below is my code. I've been trying to figure this out and it's scrambling my brain, so I'd really appreciate help!! Sorry in advance for the 'i', 'b', 'c' it's a bit convoluted I know.
Edit: should mention that with the following code,
ret[c] = buffer;
printf("Content of ret[%i] = \"%s\" \n", c, ret[c]);
it does indeed print correctly. It's only when I call the function from main that it gets weird. I'm guessing it's because it's out of scope ?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define DEBUG
#ifdef DEBUG
#define _D if (1)
#else
#define _D if (0)
#endif
char **str_split(char[], char, int*);
int count_char(char[], char);
int main(void) {
int num_strings = 0;
char **result = str_split("Helo_World_poopy_pants", '_', &num_strings);
if (result == NULL) {
printf("result is NULL\n");
return 0;
}
if (num_strings > 0) {
for (int i = 0; i < num_strings; i++) {
printf("\"%s\" \n", result[i]);
}
}
free(result);
return 0;
}
char **str_split(char string[], char delim, int *num_strings) {
int num_delim = count_char(string, delim);
*num_strings = num_delim + 1;
if (*num_strings < 2) {
return NULL;
}
//return value
char **ret = malloc((*num_strings) * sizeof(char*));
if (ret == NULL) {
_D printf("ret is null.\n");
return NULL;
}
int slen = strlen(string);
char buffer[slen];
/* b is the buffer index, c is the index for **ret */
int b = 0, c = 0;
for (int i = 0; i < slen + 1; i++) {
char cur = string[i];
if (cur == delim || cur == '\0') {
_D printf("Copying content of buffer to ret[%i]\n", c);
//char *tmp = malloc(sizeof(char) * slen + 1);
//strcpy(tmp, buffer);
//ret[c] = tmp;
ret[c] = buffer;
_D printf("Content of ret[%i] = \"%s\" \n", c, ret[c]);
//free(tmp);
c++;
b = 0;
continue;
}
//otherwise
_D printf("{%i} Copying char[%c] to index [%i] of buffer\n", c, cur, b);
buffer[b] = cur;
buffer[b+1] = '\0'; /* extend the null char */
b++;
_D printf("Buffer is now equal to: \"%s\"\n", buffer);
}
return ret;
}
int count_char(char base[], char c) {
int count = 0;
int i = 0;
while (base[i] != '\0') {
if (base[i++] == c) {
count++;
}
}
_D printf("Found %i occurence(s) of '%c'\n", count, c);
return count;
}
You are storing pointers to a buffer that exists on the stack. Using those pointers after returning from the function results in undefined behavior.
To get around this requires one of the following:
Allow the function to modify the input string (i.e. replace delimiters with null-terminator characters) and return pointers into it. The caller must be aware that this can happen. Note that a string literal, as passed in the question, must not be modified (doing so is undefined behavior in C), so you would instead need to do:
char my_string[] = "Helo_World_poopy_pants";
char **result = str_split(my_string, '_', &num_strings);
In this case, the function should also make it clear that a string literal is not acceptable input. Its first parameter has to stay a non-const char * (which is what char string[] already means); declaring it as const char *string only fits the copying approach described next.
Allow the function to make a copy of the string and then modify the copy. You have expressed concerns about leaking this memory, but that concern is mostly to do with your program's design rather than a necessity.
It's perfectly valid to duplicate each string individually and then clean them all up later. The main issue is that it's inconvenient, and also slightly pointless.
Let's address the second point. You have several options, but if you insist that the result be easily cleaned-up with a call to free, then try this strategy:
When you allocate the pointer array, also make it large enough to hold a copy of the string:
// Allocate storage for `num_strings` pointers, plus a copy of the original string,
// then copy the string into memory immediately following the pointer storage.
char **ret = malloc((*num_strings) * sizeof(char*) + strlen(string) + 1);
char *buffer = (char*)&ret[*num_strings];
strcpy(buffer, string);
Now, do all your string operations on buffer. For example:
// Extract all delimited substrings. Here, buffer will always point at the
// current substring, and p will search for the delimiter. Once found,
// the substring is terminated, its pointer appended to the substring array,
// and then buffer is pointed at the next substring, if any.
int c = 0;
for(char *p = buffer; *buffer; ++p)
{
if (*p == delim || !*p) {
char *next = p;
if (*p) {
*p = '\0';
++next;
}
ret[c++] = buffer;
buffer = next;
}
}
When you need to clean up, it's just a single call to free, because everything was stored together.
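Putting the two fragments above together, a self-contained sketch of the single-allocation strategy might look like this. The name split_in_one_block is hypothetical. Unlike the question's version, this sketch returns a one-element array when no delimiter is found, and it fills one slot per counted piece so that a trailing delimiter yields a final empty string rather than an unset pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: the pointer array and a private copy of the text live in one
 * malloc'd block, so a single free() releases everything.
 * Returns NULL only if the allocation fails. */
char **split_in_one_block(const char *string, char delim, int *num_strings)
{
    assert(string != NULL && num_strings != NULL);

    /* Count the pieces: one more than the number of delimiters. */
    int n = 1;
    for (const char *s = string; *s != '\0'; ++s)
        n += (*s == delim);

    size_t len = strlen(string);
    char **ret = malloc((size_t)n * sizeof *ret + len + 1);
    if (ret == NULL)
        return NULL;

    /* The copy of the text starts right after the last pointer slot. */
    char *buffer = (char *)&ret[n];
    memcpy(buffer, string, len + 1);

    int c = 0;
    ret[c++] = buffer;                 /* first piece starts at the beginning */
    for (char *p = buffer; *p != '\0'; ++p) {
        if (*p == delim) {
            *p = '\0';                 /* terminate the current piece */
            ret[c++] = p + 1;          /* next piece starts after the delimiter */
        }
    }

    *num_strings = n;
    return ret;
}
```

A caller would iterate over result[0] through result[num_strings - 1] and then call free(result) once.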
The string pointers you store into the ret array with ret[c] = buffer; all point to the same automatic array, which goes out of scope when the function returns. Using them after that has undefined behavior. You should allocate these strings with strdup().
Note also that it might not be appropriate to return NULL when the string does not contain a separator. Why not return an array with a single string?
Here is a simpler implementation:
#include <stdlib.h>
#include <string.h>
char **str_split(const char *string, char delim, int *num_strings) {
int i, n, from, to;
char **res;
for (n = 1, i = 0; string[i]; i++)
n += (string[i] == delim);
*num_strings = 0;
res = malloc(sizeof(*res) * n);
if (res == NULL)
return NULL;
for (i = from = to = 0;; from = to + 1) {
for (to = from; string[to] != delim && string[to] != '\0'; to++)
continue;
res[i] = malloc(to - from + 1);
if (res[i] == NULL) {
/* allocation failure: free memory allocated so far */
while (i > 0)
free(res[--i]);
free(res);
return NULL;
}
memcpy(res[i], string + from, to - from);
res[i][to - from] = '\0';
i++;
if (string[to] == '\0')
break;
}
*num_strings = n;
return res;
}
Let's say we have a string of words that are delimited by a comma.
I want to write a code in C to store these words in a variable.
Example
amazon, google, facebook, twitter, salesforce, sfb
We do not know how many words are present.
If I were to do this in C, I thought I need to do 2 iterations.
First iteration, I count how many words are present.
Then, in the next iteration, I store each words.
Step 1: 1st loop -- count number of words
....
....
//End 1st loop. num_words is set.
Step 2:
// Do malloc using num_words.
char **array = (char**)malloc(num_words* sizeof(char*));
Step 3: 2nd loop -- Store each word.
// First, walk until the delimiter and determine the length of the word
// Once len_word is determined, do malloc
*array= (char*)malloc(len_word * sizeof(char));
// And then store the word to it
// Do this for all words and then the 2nd loop terminates
Can this be done more efficiently?
I do not like having 2 loops. I think there must be a way to do it in 1 loop with just basic pointers.
The only restriction is that this needs to be done in C (due to constraints that are not in my control)
You don't need to do a separate pass to count the words. You can use realloc to enlarge the array on the fly as you read in the data on a single pass.
To parse an input line buffer, you can use strtok to tokenize the individual words.
When saving the parsed words into the word list array, you can use strdup to create a copy of the tokenized word. This is necessary for the word to persist. That is, whatever you were pointing to in the line buffer on the first line will get clobbered when you read the second line (and so on ...)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
char **words;
size_t wordmax;
size_t wordcount;
int
main(int argc,char **argv)
{
char *cp;
char *bp;
FILE *fi;
char buf[5000];
--argc;
++argv;
// get input file name
cp = *argv;
if (cp == NULL) {
printf("no file specified\n");
exit(1);
}
// open input file
fi = fopen(cp,"r");
if (fi == NULL) {
printf("unable to open file '%s' -- %s\n",cp,strerror(errno));
exit(1);
}
while (1) {
// read in next line -- bug out if EOF
cp = fgets(buf,sizeof(buf),fi);
if (cp == NULL)
break;
bp = buf;
while (1) {
// tokenize the word
cp = strtok(bp," \t,\n");
if (cp == NULL)
break;
bp = NULL;
// expand the space allocated for the word list [if necessary]
if (wordcount >= wordmax) {
// this is an expensive operation so don't do it too often
wordmax += 100;
words = realloc(words,(wordmax + 1) * sizeof(char *));
if (words == NULL) {
printf("out of memory\n");
exit(1);
}
}
// get a persistent copy of the word text
cp = strdup(cp);
if (cp == NULL) {
printf("out of memory\n");
exit(1);
}
// save the word into the word array
words[wordcount++] = cp;
}
}
// close the input file
fclose(fi);
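// trim the array to exactly what we need/used, plus one slot for the
// terminator [this also allocates the array if no words were read at all]
words = realloc(words,(wordcount + 1) * sizeof(char *));
if (words == NULL) {
printf("out of memory\n");
exit(1);
}
// add a null terminator
words[wordcount] = NULL;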
// NOTE: because we added the terminator, _either_ of these loops will
// print the word list
#if 1
for (size_t idx = 0; idx < wordcount; ++idx)
printf("%s\n",words[idx]);
#else
for (char **word = words; *word != NULL; ++word)
printf("%s\n",*word);
#endif
return 0;
}
What you're looking for is
http://manpagesfr.free.fr/man/man3/strtok.3.html
(From man page)
The strtok() function parses a string into a sequence of tokens. On the first call to strtok() the string to be parsed should be specified in str. In each subsequent call that should parse the same string, str should be NULL.
But this thread looks like a duplicate of Split string with delimiters in C
Unless you are forced to produce your own implementation ...
We do not know how many words are present.
We know num_words <= strlen(string) + 1. Only 1 "loop" needed. The cheat here is a quick run down s via strlen().
// *alloc() out-of-memory checking omitted for brevity
char **parse_csv(const char *s) {
size_t slen = strlen(s);
size_t num_words = 0;
char **words = malloc(sizeof *words * (slen + 1));
// find, allocate, copy the words
while (*s) {
size_t len = strcspn(s, ",");
words[num_words] = malloc(len + 1);
memcpy(words[num_words], s, len);
words[num_words][len] = '\0';
num_words++;
s += len; // skip word
if (*s) s++; // skip ,
}
// Only 1 realloc() needed. Its return value must be used: the block may move.
char **tmp = realloc(words, sizeof *words * (num_words ? num_words : 1)); // right-size words list
return tmp ? tmp : words;
}
It makes sense to NULL terminate the list, so
char **words = malloc(sizeof *words * (slen + 1 + 1));
...
words[num_words++] = NULL;
words = realloc(words, sizeof *words * num_words);
return words;
In considering the worst case for the initial char **words = malloc(...);, I assume a string like ",,," whose 3 ',' characters would make for 4 words: "", "", "", "". Adjust the code as needed for such pathological cases.
I'm using this function to read, char by char, a text file or a stdin input
void readLine(FILE *stream, char **string) {
char c;
int counter = 0;
do {
c = fgetc(stream);
string[0] = (char *) realloc (string[0], (counter+1) * sizeof(char));
string[0][counter++] = c;
} while(c != ENTER && !feof(stream));
string[counter-1] = '\0';
}
But when I call it, my program crashed and I really don't know why, because I don't forget the 0-terminator and I'm convinced that I stored correctly the char sequence. I've verified the string length, but it appears alright.
This is an error:
do {
c = fgetc(stream);
// What happens here?!?
} while(c != ENTER && !feof(stream));
"What happens here" is that you add c to string before you've checked for EOF, whoops.
This is very ungood:
string[0] = (char *) realloc (string[0], (counter+1) * sizeof(char));
in a loop. realloc is a potentially expensive call and you do it for every byte of input! It is also a silly and confusing interface to ask for a pointer parameter that has (apparently) not been allocated anything -- passing in the pointer usually indicates that is already done. What if string were a static array? Instead, allocate in chunks and return a pointer:
char *readLine (FILE *stream) {
// A whole 4 kB!
int chunksz = 4096;
int counter = 0;
char *buffer = malloc(chunksz);
char *test;
int c;
if (!buffer) return NULL;
while ((c = fgetc(stream)) != EOF && c != ENTER) {
buffer[counter++] = (char)c;
if (counter == chunksz) {
chunksz *= 2;
test = realloc(buffer, chunksz);
// Abort on out-of-memory.
if (!test) {
free(buffer);
return NULL;
} else buffer = test;
}
}
// Now null terminate and resize.
buffer[counter] = '\0';
test = realloc(buffer, counter + 1);
// If shrinking fails, the original (larger) buffer is still valid.
return test ? test : buffer;
}
That is a standard "power of 2" allocation scheme (it doubles). If you really want to submit a pointer, pre-allocate it and also submit a "max length" parameter:
void readLine (FILE *stream, char *buffer, int max) {
int counter = 0;
int c;
// Check the remaining space first so that no character is read and then dropped.
while (
counter < max - 1
&& (c = fgetc(stream)) != EOF
&& c != ENTER
) buffer[counter++] = (char)c;
// Now null terminate.
buffer[counter] = '\0';
}
There are a few issues in this code:
fgetc() returns int.
Don't cast the return value of malloc() and friends, in C.
Avoid using sizeof (char), it's just a very clumsy way of writing 1, so multiplication by it is very redundant.
Normally, buffers are grown more than 1 char at a time, realloc() can be expensive.
string[0] would be more clearly written as *string, since it's not an array but just a pointer to a pointer.
Your logic around end of file means it will store the truncated version of EOF, not very nice.
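The points above can be combined into a sketch like the following, which keeps an out-parameter interface similar to the question's. The name read_line and its return convention are hypothetical, and '\n' stands in for the question's ENTER constant, which is an assumption. The sketch assumes *string is NULL or a pointer obtained from malloc or realloc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: reads one line (without the newline) into a new buffer and stores
 * it in *string, freeing whatever *string pointed to before.
 * Returns the number of characters stored, or -1 on allocation failure,
 * in which case *string is left untouched. */
long read_line(FILE *stream, char **string)
{
    assert(stream != NULL && string != NULL);

    size_t cap = 64;                 /* grow in chunks, not one char at a time */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return -1;

    int c;                           /* int, so EOF stays distinguishable */
    while ((c = fgetc(stream)) != EOF && c != '\n') {
        if (len + 1 == cap) {        /* keep one slot free for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;        /* EOF is never stored */
    }
    buf[len] = '\0';

    free(*string);                   /* release the previous line, if any */
    *string = buf;
    return (long)len;
}
```

A caller would start with char *s = NULL;, call read_line(stdin, &s) repeatedly, and free(s) once at the end.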
Change this line
string[counter-1] = '\0';
to
string[0][counter-1] = '\0';
You want to terminate string stored at string[0].
How can I compare the first letter of the first element of a char**?
I have tried:
int main()
{
char** command = NULL;
while (true)
{
fgets(line, MAX_COMMAND_LEN, stdin);
parse_command(line, command);
exec_command(command);
}
}
void parse_command(char* line, char** command)
{
int n_args = 0, i = 0;
while (line[i] != '\n')
{
if (isspace(line[i++]))
n_args++;
}
for (i = 0; i < n_args+1; i++)
command = (char**) malloc (n_args * sizeof(char*));
i = 0;
line = strtok(line," \n");
while (line != NULL)
{
command[i++] = (char *) malloc ( (strlen(line)+1) * sizeof(char) );
strcpy(command[i++], line);
line = strtok(NULL, " \n");
}
command[i] = NULL;
}
void exec_command(char** command)
{
if (command[0][0] == '/')
// other stuff
}
but that gives a segmentation fault. What am I doing wrong?
Thanks.
Could you paste more code? Have you allocated memory both for your char* array and for the elements of your char* array?
The problem is, you do allocate a char* array inside parse_command, but the pointer to that array never gets out of the function. So exec_command gets a garbage pointer value. The reason is, by calling parse_command(line, command) you pass a copy of the current value of the pointer command, which is then overwritten inside the function - but the original value is not affected by this!
To fix that, either you need to pass a pointer to the pointer you want to update (a char***), or you need to return the pointer to the allocated array from parse_command. Apart from char*** looking ugly (at least to me), the latter approach is simpler and easier to read:
int main()
{
char** command = NULL;
while (true)
{
fgets(line, MAX_COMMAND_LEN, stdin);
command = parse_command(line);
exec_command(command);
}
}
char** parse_command(char* line)
{
char** command = NULL;
int n_args = 0, i = 0;
while (line[i] != '\n')
{
if (isspace(line[i++]))
n_args++;
}
// up to n_args + 1 argument pointers, plus one for the terminating NULL
command = (char**) malloc ((n_args + 2) * sizeof(char*));
i = 0;
line = strtok(line," \n");
while (line != NULL)
{
command[i] = (char *) malloc ( (strlen(line)+1) * sizeof(char) );
strcpy(command[i++], line);
line = strtok(NULL, " \n");
}
command[i] = NULL;
return command;
}
Notes:
in your original parse_command, you allocate memory to command in a loop, which is unnecessary and just creates memory leaks. It is enough to allocate memory once. Since n_args counts the separators, there can be up to n_args + 1 arguments, and the terminating NULL needs one more slot, so I allocate n_args + 2 pointers.
in the last while loop of parse_command, you increment i incorrectly twice, which also leads to undefined behaviour, i.e. possible segmentation fault. I fixed it here.
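For comparison, the other option mentioned in the answer (passing a pointer to the pointer) could be sketched as below. The name parse_command_into is hypothetical, and growing the array with realloc per token is a choice made for this sketch rather than something taken from the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the out-parameter variant: the caller passes &command, so the
 * function can store the new array where the caller will see it.
 * Modifies line (via strtok). Returns the number of arguments, or -1 on
 * allocation failure, in which case *command is left untouched. */
int parse_command_into(char *line, char ***command)
{
    assert(line != NULL && command != NULL);

    char **args = NULL;
    int n = 0;

    for (char *tok = strtok(line, " \n"); tok != NULL; tok = strtok(NULL, " \n")) {
        /* room for the n existing entries, the new one, and the final NULL */
        char **tmp = realloc(args, ((size_t)n + 2) * sizeof *args);
        if (tmp == NULL)
            goto fail;
        args = tmp;
        args[n] = malloc(strlen(tok) + 1);
        if (args[n] == NULL)
            goto fail;
        strcpy(args[n++], tok);
    }
    if (args == NULL) {                /* empty line: still return a valid list */
        args = malloc(sizeof *args);
        if (args == NULL)
            return -1;
    }
    args[n] = NULL;
    *command = args;                   /* this write is visible to the caller */
    return n;

fail:
    while (n > 0)
        free(args[--n]);
    free(args);
    return -1;
}
```

It would be called as parse_command_into(line, &command). Either way, exec_command should probably check that command[0] is not NULL before reading command[0][0].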