Segmentation fault when converting char * to char ** - c

I'm trying to split a sentence (char *) to an array of words (char **). The problem is that my function that does just that sometimes doesn't return a valid char **.
char **get_words(char *buffer, char delimiter)
{
char **words = malloc(sizeof(char *) * 4096);
for (int i = 0; i < 4096; i++)
words[i] = malloc(sizeof(char) * 4096);
int word_count = 0;
int l = 0;
for (int i = 0; buffer[i] != '\0' && buffer[i] != '\n'; i++, l++) {
if (buffer[i] == delimiter) {
words[word_count][l] = '\0';
word_count++;
l = -1;
}
else
words[word_count][l] = buffer[i];
}
words[word_count][l] = '\0';
return (words);
}
I first use it like this:
char *buffer = malloc(sizeof(char) * 50);
buffer = "/login test\n";
char **words = get_words(buffer, ' ');
printf("Words[0] = %s", words[0]);
And it works fine.
However when I do it the same way with this:
char **reply = get_words("502 Command doesn't exist.\n", ' ');
I can't even print reply[0][0] (see below) without having a segmentation fault.
Moreover, I tried to debug this using valgrind but when I use it the program doesn't crash and everything works so I can't find what's wrong.
printf("Reply[0][0] = %d\n", reply[0][0]);
printf("Reply[0][0] = %c\n", reply[0][0]);
EDIT:
Here is a reproducible example.
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
char **get_words(char *buffer, char delimiter)
{
printf("buffer = %s\n", buffer);
char **words = malloc(sizeof(char *) * 100);
if (words == NULL) {
printf("Malloc Error\n");
exit(84);
}
for (int i = 0; i < 100; i++) {
words[i] = malloc(sizeof(char) * 100);
if (words[i] == NULL) {
printf("Malloc Error\n");
exit(84);
}
}
int word_count = 0;
int l = 0;
for (int i = 0; buffer[i] != '\0' && buffer[i] != '\n'; i++, l++) {
if (buffer[i] == delimiter) {
words[word_count][l] = '\0';
word_count++;
l = -1;
}
else
words[word_count][l] = buffer[i];
}
words[word_count][l] = '\0';
return (words);
}
int main()
{
char *buffer = malloc(sizeof(char) * 100);
buffer = "hello world !\n";
char **words = get_words(buffer, ' ');
printf("words[0]= %s\n", words[0]);
free (buffer);
char **reply = get_words("Second call\n", ' ');
printf("reply[0] = %s\n", reply[0]);
}

If you need help in learning programming, you can try a static analyzer. This is a program that performs code reviews and finds suspicious code fragments. Static analyzers can't replace code reviews performed by a teammate. However, analyzers complement code reviews and help find many errors at the earliest stages.
Let's run the online version of the PVS-Studio analyzer on the code sample attached to the question. The first important warning is: V1031 The malloc function is not declared. Passing data to or from this function can be affected.
Without a declaration of malloc, the program may behave strangely. Under the older C rules (C89), a function called without a declaration is assumed to return int, but malloc actually returns a pointer. On platforms where a pointer is wider than an int, the returned address can be truncated. Let's fix this problem by adding #include <stdlib.h>.
Now the analyzer issues another warning — we get a more serious issue:
43:1: note: V773 The 'buffer' pointer was assigned values twice without releasing the memory. A memory leak is possible.
The issue is in the following code fragment:
char *buffer = malloc(sizeof(char) * 100);
buffer = "hello world !\n";
....
free (buffer);
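The pointer returned by malloc is overwritten with the address of a string literal. The allocated block is leaked, and free(buffer) is then called on a pointer that did not come from malloc, which is undefined behavior. To copy a string into the buffer, use a function such as strcpy. Let's fix this.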
Here's the fixed code.
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
#include <stdlib.h>
char **get_words(char *buffer, char delimiter)
{
printf("buffer = %s\n", buffer);
char **words = malloc(sizeof(char *) * 100);
if (words == NULL) {
printf("Malloc Error\n");
exit(84);
}
for (int i = 0; i < 100; i++) {
words[i] = malloc(sizeof(char) * 100);
if (words[i] == NULL) {
printf("Malloc Error\n");
exit(84);
}
}
int word_count = 0;
int l = 0;
for (int i = 0; buffer[i] != '\0' && buffer[i] != '\n'; i++, l++) {
if (buffer[i] == delimiter) {
words[word_count][l] = '\0';
word_count++;
l = -1;
}
else
words[word_count][l] = buffer[i];
}
words[word_count][l] = '\0';
return (words);
}
int main()
{
char *buffer = malloc(sizeof(char) * 100);
if (buffer == NULL)
exit(84);
strcpy(buffer, "hello world !\n");
char **words = get_words(buffer, ' ');
printf("words[0]= %s\n", words[0]);
free (buffer);
char **reply = get_words("Second call\n", ' ');
printf("reply[0] = %s\n", reply[0]);
}
I can't say that this code is perfect and secure, but it runs. So, using static analyzers to find errors, you can improve your learning process.
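The fixed version still allocates 100 words of 100 bytes each and never checks either limit, so longer input would write out of bounds. Below is a minimal sketch of a variant that sizes its allocations from the input instead. The names split_words and free_words and the NULL-terminated result are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `buffer` at `delimiter`, stopping at the first
   '\n' or at the end of the string. Returns a NULL-terminated array of
   heap-allocated words, or NULL if an allocation fails. */
char **split_words(const char *buffer, char delimiter)
{
    assert(buffer != NULL);

    /* The text to split ends at the first newline, if there is one. */
    size_t total = strcspn(buffer, "\n");

    /* n delimiters produce n + 1 words (possibly empty ones). */
    size_t count = 1;
    for (size_t i = 0; i < total; i++)
        if (buffer[i] == delimiter)
            count++;

    /* One extra slot for the terminating NULL pointer. */
    char **words = malloc((count + 1) * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t start = 0, w = 0;
    for (size_t i = 0; i <= total; i++) {
        if (i == total || buffer[i] == delimiter) {
            size_t len = i - start;
            words[w] = malloc(len + 1); /* +1 for the '\0' */
            if (words[w] == NULL) {
                /* Release everything allocated so far. */
                while (w > 0)
                    free(words[--w]);
                free(words);
                return NULL;
            }
            memcpy(words[w], buffer + start, len);
            words[w][len] = '\0';
            w++;
            start = i + 1;
        }
    }
    words[w] = NULL;
    return words;
}

/* Frees an array returned by split_words(). */
void free_words(char **words)
{
    if (words == NULL)
        return;
    for (size_t i = 0; words[i] != NULL; i++)
        free(words[i]);
    free(words);
}
```

Because the parameter is const char *, the function can be called with a string literal, and the caller releases the result with free_words().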

Related

Extracting the first two words in a sentence in C without pointers

I am getting used to writing eBPF code as of now and want to avoid using pointers in my BPF text due to how difficult it is to get a correct output out of it. Using strtok() seems to be out of the question due to all of the example codes requiring pointers. I also want to expand it to CSV files in the future since this is a means of practice for me. I was able to find another user's code here but it gives me an error with the BCC terminal due to the one pointer.
char str[256];
bpf_probe_read_user(&str, sizeof(str), (void *)PT_REGS_RC(ctx));
char token[] = strtok(str, ",");
char input[] ="first second third forth";
char delimiter[] = " ";
char firstWord, *secondWord, *remainder, *context;
int inputLength = strlen(input);
char *inputCopy = (char*) calloc(inputLength + 1, sizeof(char));
strncpy(inputCopy, input, inputLength);
str = strtok_r (inputCopy, delimiter, &context);
secondWord = strtok_r (NULL, delimiter, &context);
remainder = context;
getchar();
free(inputCopy);
Pointers are powerful, and you won't be able to avoid them for very long. The time you invest in learning them is definitely worth it.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/**
Extracts the word with the index "n" in the string "str".
Words are delimited by a blank space or the end of the string.
*/
char *getWord(char *str, int n)
{
int words = 0;
int length = 0;
int beginIndex = 0;
int endIndex = 0;
char currentchar;
while ((currentchar = str[endIndex++]) != '\0')
{
if (currentchar == ' ')
{
if (n == words)
break;
if (length > 0)
words++;
length = 0;
beginIndex = endIndex;
continue;
}
length++;
}
if (n == words)
{
char *result = malloc(sizeof(char) * length + 1);
if (result == NULL)
{
printf("Error while allocating memory!\n");
exit(1);
}
memcpy(result, str + beginIndex, length);
result[length] = '\0';
return result;
}else
return NULL;
}
You can easily use the function:
int main(int argc, char *argv[])
{
char string[] = "Pointers are cool!";
char *word = getWord(string, 2);
printf("The third word is: '%s'\n", word);
free(word); //Don't forget to de-allocate the memory!
return 0;
}

Invalid write of Size 8 when mallocing array of strings

I am trying to write my own shell in C. I wrote my own _strtok function that uses strtok but returns all the tokens as an array of strings. For testing I use the string "ls -laR" defined in the main function. I get the valgrind error "Invalid write of size 8" when trying to malloc the characters for the second pointer in the array of strings named doubl. Why is it doing this? I believe I am allocating the proper number of pointers to strings in the doubl array. Any insight or help would be appreciated.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
char **_strtok(char *str, char *delim)
{
char **doubl;
char *s = str;
char *string;
int i = 0;
while (*s)
{
if (*s == *delim)
i++;
s++;
}
doubl = malloc(sizeof(char *) * i + 1);
i = 0;
string = strtok(str, delim);
while (1)
{
doubl[i] = malloc(sizeof(char) * strlen(string) + 1);
strcpy(doubl[i], string);
i++;
if (string == NULL)
break;
string = strtok(NULL, delim);
}
return (doubl);
}
char *get_path(char **env)
{
char **check = env;
char *path = NULL;
char pth[] = "PATH";
int i, j, stop = 0;
for (i = 0; check[i] && stop == 0; i++)
{
for (j = 0; j < 4 && stop == 0; j++)
{
if (check[i][j] != pth[j])
break;
if (check[i][j] == pth[j] && j == 3)
{
path = malloc(strlen(check[i]));
strcpy(path, check[i]);
stop = 1;
}
}
}
return (path);
}
char **cmd_to_arg(char **cmd, char **env)
{
/* FREE PATH BEFORE END */
char *path = get_path(env);
char *slash = "/";
char **args = NULL, **check = _strtok(path, ":"), **checkStart = check, **cmdStart = cmd;
int status = -1, i = 0, j;
while (*checkStart)
{
strcat(*checkStart, slash);
strcat(*checkStart, cmd[0]);
status = access(*checkStart, F_OK | X_OK);
printf("%s\n", *checkStart);
if (status == 0)
break;
checkStart++;
}
for(;*cmdStart; i++, cmdStart++)
printf("%d\n", i);
args = malloc(sizeof(char *) * i);
args[0] = malloc(strlen(*checkStart));
strcpy(args[0], *checkStart);
puts(args[0]);
for (j = 1; j < i && cmd[j] != NULL; j++)
{
//printf("%d\n", j);
args[j] = malloc(strlen(cmd[j]) * sizeof(char));
strcpy(args[j], cmd[j]);
puts(args[j]);
}
return (args);
}
int main(int ac, char **av, char **env)
{
(void)ac, (void)av, (void)env;
char line[] = "ls laR";
//size_t size = 0;
char **cmd; //**cmdStart;
//int i = 0, j = 0;
cmd = _strtok(line, " ");
cmd = cmd_to_arg(cmd, env);
return (0);
}
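One likely source of the "Invalid write of size 8" is the expression sizeof(char *) * i + 1. Because of operator precedence it adds a single byte rather than room for one more pointer, and i counts delimiters, which is one less than the number of tokens. With one delimiter the block is 9 bytes, so storing the second pointer writes 8 bytes past the allocation. The loop also calls strlen(string) before checking string for NULL. Below is a minimal sketch of a corrected tokenizer, assuming the caller wants a NULL-terminated array. The name split_tokens is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for the question's _strtok(): splits `str` in
   place with strtok() and returns a NULL-terminated array of copies.
   Returns NULL if an allocation fails. */
char **split_tokens(char *str, const char *delim)
{
    assert(str != NULL && delim != NULL);

    /* Upper bound on the token count: each delimiter character can end at
       most one token, plus one for the final token. */
    size_t max_tokens = 1;
    for (const char *s = str; *s != '\0'; s++)
        if (strchr(delim, *s) != NULL)
            max_tokens++;

    /* Parenthesized so that the +1 adds a whole pointer (for the NULL
       sentinel) and not a single byte. */
    char **tokens = malloc(sizeof(char *) * (max_tokens + 1));
    if (tokens == NULL)
        return NULL;

    size_t n = 0;
    /* The token is tested for NULL before it is used. */
    for (char *tok = strtok(str, delim); tok != NULL; tok = strtok(NULL, delim)) {
        tokens[n] = malloc(strlen(tok) + 1); /* +1 for the '\0' */
        if (tokens[n] == NULL) {
            while (n > 0)
                free(tokens[--n]);
            free(tokens);
            return NULL;
        }
        strcpy(tokens[n], tok);
        n++;
    }
    tokens[n] = NULL;
    return tokens;
}
```

The same off-by-one pattern appears in get_path() and cmd_to_arg(), where malloc(strlen(...)) leaves no room for the terminating '\0', and the strcat calls in cmd_to_arg() append to buffers that were sized exactly for their original contents.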

Copying specific number of characters from a string to another

I have a variable-length string that I am trying to split at the plus signs so I can work on each part:
char string[] = "var1+vari2+varia3";
for (int i = 0; i != sizeof(string); i++) {
memcpy(buf, string[0], 4);
buf[9] = '\0';
}
Since the variables differ in length, I am trying to write a loop that walks the string and extracts each variable. Any suggestions? I am expecting a result such as:
var1
vari2
varia3
You can use strtok() to break the string at a delimiter. Note that strtok() modifies the string it is given:
char string[]="var1+vari2+varia3";
const char delim[] = "+";
char *token;
/* get the first token */
token = strtok(string, delim);
/* walk through other tokens */
while( token != NULL ) {
printf( " %s\n", token );
token = strtok(NULL, delim);
}
More info about the strtok() here: https://man7.org/linux/man-pages/man3/strtok.3.html
It seems to me that you don't just want to print the individual strings but also want to save them in some buffer.
Since you can't know the number of strings or the length of each individual string in advance, you should allocate memory dynamically, i.e. use functions like realloc, calloc and malloc.
It can be implemented in several ways. Below is one example. To keep the example simple, it's not optimized for performance in any way.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
char** split_string(const char* string, const char* token, int* num)
{
assert(string != NULL);
assert(token != NULL);
assert(num != NULL);
assert(strlen(token) != 0);
char** data = NULL;
int num_strings = 0;
while(*string)
{
// Allocate memory for one more string pointer
char** ptemp = realloc(data, (num_strings + 1) * sizeof *data);
if (ptemp == NULL) exit(1);
data = ptemp;
// Look for token
char* tmp = strstr(string, token);
if (tmp == NULL)
{
// Last string
// Allocate memory for one more string and copy it
int len = strlen(string);
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
++num_strings;
break;
}
// Allocate memory for one more string and copy it
int len = tmp - string;
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
// Prepare to search for next string
++num_strings;
string = tmp + strlen(token);
}
*num = num_strings;
return data;
}
int main()
{
char string[]="var1+vari2+varia3";
// Split the string into dynamic allocated memory
int num_strings;
char** data = split_string(string, "+", &num_strings);
// Now data can be used as an array-of-strings
// Example: Print the strings
printf("Found %d strings:\n", num_strings);
for(int i = 0; i < num_strings; ++i) printf("%s\n", data[i]);
// Free the memory
for(int i = 0; i < num_strings; ++i) free(data[i]);
free(data);
}
Output
Found 3 strings:
var1
vari2
varia3
You can use a simple loop scanning the string for + signs:
char string[] = "var1+vari2+varia3";
char buf[sizeof(string)];
int start = 0;
for (int i = 0;;) {
if (string[i] == '+' || string[i] == '\0') {
memcpy(buf, string + start, i - start);
buf[i - start] = '\0';
// buf contains the substring, use it as a C string
printf("%s\n", buf);
if (string[i] == '\0')
break;
start = ++i;
} else {
i++;
}
}
Your code does not make sense: memcpy is given string[0], a single character, where it expects a pointer.
I wrote the following function for you. Analyse it, as it is sometimes good to have some code as a base.
char *substr(const char *str, char *buff, const size_t start, const size_t len)
{
size_t srcLen;
char *result = buff;
if(str && buff)
{
if(*str)
{
srcLen = strlen(str);
if(srcLen < start + len)
{
if(start < srcLen) strcpy(buff, str + start);
else buff[0] = 0;
}
else
{
memcpy(buff, str + start, len);
buff[len] = 0;
}
}
else
{
buff[0] = 0;
}
}
return result;
}
https://godbolt.org/z/GjMEqx

Dynamically allocated unknown length string reading from file (it has to be protected from reading numbers from the file) in C

I need to read strings from a file. Example file:
Example 1 sentence
Example sentence number xd 595 xd 49 lol
However, I have to read only the string part, not the numbers. I guess I have to use fscanf() with %s for this, but let me know what you think.
My problem is how to read a string of unknown length using malloc() and realloc(). I tried it myself but failed (my attempt is at the bottom of this post).
Then I need to show the result on the screen.
P.S. I have to use malloc()/calloc(), realloc() <-- it has to be dynamically allocated string :) (char *)
Code I've tried:
int wordSize = 2;
char *word = (char *)malloc(wordSize*sizeof(char));
char ch;
FILE* InputWords = NULL;
InputWords = fopen(ListOfWords,"r"); /* variable ListOfWords contains name of the file */
if (InputWords == NULL)
{
printf("Error while opening the file.\n");
return 0;
}
int index = 0;
while((ch = fgetc(InputWords)) != -1)
{
if(ch == ' ')
{
printf("%s\n", word);
wordSize = 2;
index = 0;
free(word);
char* word = (char *)malloc(wordSize*sizeof(char));
}
else
{
wordSize++;
word = (char *)realloc(word, wordSize*sizeof(char));
strcpy(word,ch);
index++;
}
}
fclose(InputWords);
Your code has a few things to improve:
fgetc returns an int, not a char, so change char ch to int ch.
As pmg commented, compare against EOF (which may be any negative value) instead of -1.
strcpy(word,ch); tries to copy a single character (ch) as if it were a string. strcpy expects two char pointers; to store one character, assign it directly, for example word[index] = ch.
Do not cast the result of malloc or realloc (see "Do I cast the result of malloc?").
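The points above can be combined into a minimal sketch that reads one word at a time into a buffer that grows with realloc. The function name read_word, the initial capacity of 16 and the doubling growth policy are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the next whitespace-delimited word from `fp`
   into a freshly allocated string. Returns NULL at end of file or if an
   allocation fails. The caller frees the result. */
char *read_word(FILE *fp)
{
    assert(fp != NULL);

    int ch; /* int, not char, so that EOF can be told apart from real bytes */

    /* Skip leading whitespace. */
    while ((ch = fgetc(fp)) != EOF && isspace(ch))
        ;
    if (ch == EOF)
        return NULL;

    size_t cap = 16, len = 0;
    char *word = malloc(cap);
    if (word == NULL)
        return NULL;

    do {
        /* Keep room for this character and the terminating '\0'. */
        if (len + 2 > cap) {
            char *tmp = realloc(word, cap * 2);
            if (tmp == NULL) {
                free(word);
                return NULL;
            }
            word = tmp;
            cap *= 2;
        }
        word[len++] = (char)ch; /* store one character; strcpy is for strings */
    } while ((ch = fgetc(fp)) != EOF && !isspace(ch));

    word[len] = '\0';
    return word;
}
```

A caller can loop until read_word() returns NULL, skip any word that consists only of digits, and free each word after printing it.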
To solve your problem, I suggest using the strtok function to split each line at space characters and then testing whether each word is a number. If the word is not a number, you can use strcat to append it to the sentence built so far.
The complete code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
int is_number(char *str) {
if (strlen(str) == 0)
return -1;
for(size_t i = 0; (i < strlen(str)) && (str[i] != '\n') ; i++) {
if(!isdigit((unsigned char)str[i])) // isdigit needs a value in the unsigned char range
return -1;
}
return 1;
}
int main()
{
FILE *fp = fopen("input.txt", "r");
char line[256];
if(!fp) return -1;
char **sentence;
int i = 0;
sentence = malloc(sizeof(char *));
if(!sentence) return -1;
while(fgets(line, 256, fp)) {
char * token = strtok(line, " ");
size_t len = 0;
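sentence = realloc(sentence, sizeof(char *) * (i+1));
if(!sentence) return -1;
sentence[i] = NULL; // start from NULL so the first realloc below behaves like malloc
while(token != NULL) {
if (is_number(token) != 1) {
sentence[i] = realloc(sentence[i], len + 2 + strlen(token)); // +2 because 1 for null character and 1 for space character
if (!sentence[i]) {
printf("cannot realloc\n");
return -1;
}
if (len == 0)
sentence[i][0] = '\0'; // a fresh block must hold an empty string before strcat
strcat(strcat(sentence[i], " "), token);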
len = strlen(sentence[i]);
}
token = strtok(NULL, " ");
}
if(len > 0)
i++;
}
for(int j = 0; j < i; j++) {
printf("line[%d]: %s", j, sentence[j]);
}
for(int j = 0; j < i; j++) {
free(sentence[j]);
}
free(sentence);
fclose(fp);
return 0;
}
The input and output:
$cat input.txt
Example 1 sentence
Example sentence number xd 595 xd 49 lol
./test
line[0]: Example sentence
line[1]: Example sentence number xd xd lol

Erase last members of line from text file

I have a text file as data.txt and I want to delete the last members of each line:
Here's the text file:
2031,2,0,0,0,0,0,0,54,0,
2027,2,0,0,0,0,0,0,209,0,
2029,2,0,0,0,0,0,0,65,0,
2036,2,0,0,0,0,0,0,165,0,
I would like to delete so it becomes:
2031,2,0,0,0,0,0,0,
2027,2,0,0,0,0,0,0,
2029,2,0,0,0,0,0,0,
2036,2,0,0,0,0,0,0,
I'm working in C but as the numbers can have two or three digits, I'm not sure how to do this.
A couple of uses of strrchr() can do the job:
#include <string.h>
void zap_last_field(char *line)
{
char *last_comma = strrchr(line, ',');
if (last_comma != 0)
{
*last_comma = '\0';
last_comma = strrchr(line, ',');
if (last_comma != 0)
*(last_comma + 1) = '\0';
}
}
Compiled code that seems to work. Note that given a string containing a single comma, it will zap that comma. If you don't want that to happen, then you have to work a little harder.
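As a sketch of what "working a little harder" could look like, the variant below leaves the line untouched unless it contains at least two commas. The name zap_last_field_strict is hypothetical, and the backward scan is only one possible approach.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of zap_last_field(): removes the last field only when
   the line contains at least two commas, otherwise leaves it unchanged. */
void zap_last_field_strict(char *line)
{
    assert(line != NULL);

    char *last_comma = strrchr(line, ',');
    if (last_comma == NULL || last_comma == line)
        return; /* no comma, or the only comma is the first character */

    /* Search backwards for the previous comma without modifying the line. */
    char *prev = last_comma - 1;
    while (prev > line && *prev != ',')
        prev--;
    if (*prev != ',')
        return; /* only one comma: keep the line as it is */

    *(prev + 1) = '\0'; /* keep everything up to and including that comma */
}
```

For lines with two or more commas this is intended to produce the same result as zap_last_field(), while a line such as "2031," is returned unchanged.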
Test code for zap_last_field()
#include <string.h>
extern void zap_last_field(char *line);
void zap_last_field(char *line)
{
char *last_comma = strrchr(line, ',');
if (last_comma != 0)
{
*last_comma = '\0';
last_comma = strrchr(line, ',');
if (last_comma != 0)
*(last_comma + 1) = '\0';
}
}
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *line = malloc(4096);
if (line != 0)
{
while (fgets(line, 4096, stdin) != 0)
{
printf("Line: %s", line);
zap_last_field(line);
printf("Zap1: %s\n", line);
}
free(line);
}
return(0);
}
This has been vetted with valgrind and is OK on both the original data file and the mangled data file listed below. The dynamic memory allocation is there to give valgrind the maximum chance of spotting any problems.
I strongly suspect that the core dump reported in a comment happens because the alternative test code tried to pass a literal string to the function, which won't work because literal strings are not generally modifiable and this code modifies the string in situ.
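The difference between a string literal and modifiable storage can be sketched as follows. zap_last_field() is repeated from above so that the example is self-contained. The wrapper name zap_copy and the caller-supplied buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Same function as above: edits the string in place. */
void zap_last_field(char *line)
{
    char *last_comma = strrchr(line, ',');
    if (last_comma != 0)
    {
        *last_comma = '\0';
        last_comma = strrchr(line, ',');
        if (last_comma != 0)
            *(last_comma + 1) = '\0';
    }
}

/* Hypothetical wrapper: copies `src` (which may be a string literal) into
   the caller's writable buffer `dst` of `size` bytes, then edits the copy.
   Returns dst, or NULL if src does not fit. */
char *zap_copy(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len >= size)
        return NULL;

    memcpy(dst, src, len + 1); /* the copy includes the '\0' */
    zap_last_field(dst);       /* safe: dst is writable storage */
    return dst;
}
```

Similarly, char line[] = "2031,0,"; creates a writable array initialized from the literal, whereas char *line = "2031,0,"; points at the literal itself and must not be passed to zap_last_field().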
Test code for zap_last_n_fields()
If you want to zap the last couple of fields (a controlled number of fields), then you'll probably want to pass in a count of the number of fields to be zapped and add a loop. Note that this code uses a VLA so it requires a C99 compiler.
#include <string.h>
extern void zap_last_n_fields(char *line, size_t nfields);
void zap_last_n_fields(char *line, size_t nfields)
{
char *zapped[nfields+1];
for (size_t i = 0; i <= nfields; i++)
{
char *last_comma = strrchr(line, ',');
if (last_comma != 0)
{
zapped[i] = last_comma;
*last_comma = '\0';
}
else
{
/* Undo the damage wrought above */
for (size_t j = 0; j < i; j++)
*zapped[j] = ',';
return;
}
}
zapped[nfields][0] = ',';
zapped[nfields][1] = '\0';
}
#include <stdio.h>
int main(void)
{
char line1[4096];
while (fgets(line1, sizeof(line1), stdin) != 0)
{
printf("Line: %s", line1);
char line2[4096];
for (size_t i = 1; i <= 3; i++)
{
strcpy(line2, line1);
zap_last_n_fields(line2, i);
printf("Zap%zu: %s\n", i, line2);
}
}
return(0);
}
Example run — using your data.txt as input:
Line: 2031,2,0,0,0,0,0,0,54,0,
Zap1: 2031,2,0,0,0,0,0,0,54,
Zap2: 2031,2,0,0,0,0,0,0,
Zap3: 2031,2,0,0,0,0,0,
Line: 2027,2,0,0,0,0,0,0,209,0,
Zap1: 2027,2,0,0,0,0,0,0,209,
Zap2: 2027,2,0,0,0,0,0,0,
Zap3: 2027,2,0,0,0,0,0,
Line: 2029,2,0,0,0,0,0,0,65,0,
Zap1: 2029,2,0,0,0,0,0,0,65,
Zap2: 2029,2,0,0,0,0,0,0,
Zap3: 2029,2,0,0,0,0,0,
Line: 2036,2,0,0,0,0,0,0,165,0,
Zap1: 2036,2,0,0,0,0,0,0,165,
Zap2: 2036,2,0,0,0,0,0,0,
Zap3: 2036,2,0,0,0,0,0,
It also correctly handles a file such as:
2031,0,0,
2031,0,
2031,
2031
,
