In C, how to split a string on \n into lines

I want to split a string by \n and place lines which contain a specific token into an array.
I have this code:
char mydata[100] =
"mary likes apples\njim likes playing\nmark hates school\nanne likes mary";
char *token = "likes";
char ** res = NULL;
char * p = strtok (mydata, "\n");
int n_spaces = 0, i;
/* split string and append tokens to 'res' */
while (p) {
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1); /* memory allocation failed */
if (strstr(p, token))
res[n_spaces-1] = p;
p = strtok (NULL, "\n");
}
/* realloc one extra element for the last NULL */
res = realloc (res, sizeof (char*) * (n_spaces+1));
res[n_spaces] = '\0';
/* print the result */
for (i = 0; i < (n_spaces+1); ++i)
printf ("res[%d] = %s\n", i, res[i]);
/* free the memory allocated */
free (res);
But then I get a segmentation fault:
res[0] = mary likes apples
res[1] = jim likes playing
Segmentation fault
How can I split a string on \n correctly in C?

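strstr just returns a pointer to the first occurrence of its second argument within the first, or NULL if there is none.
Your code grows res for every line but assigns res[n_spaces-1] only when the line contains the token, so the slots for non-matching lines stay uninitialized, and printing them crashes.
Grow the array only for matching lines. You can also copy each line with strcpy, leaving room for the terminating null character: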
while (p) {
// Also you want string only if it contains "likes"
if (strstr(p, token))
{
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1);
res[n_spaces-1] = malloc(strlen(p) + 1); /* +1 for the terminating '\0' */
strcpy(res[n_spaces-1],p);
}
p = strtok (NULL, "\n");
}
Free res using:
for(i = 0; i < n_spaces; i++)
free(res[i]);
free(res);

Try this:
char mydata[100] = "mary likes apples\njim likes playing\nmark hates school\nanne likes mary";
char *token = "likes";
char **result = NULL;
int count = 0;
int i;
char *pch;
// split
pch = strtok (mydata,"\n");
while (pch != NULL)
{
if (strstr(pch, token) != NULL)
{
result = (char**)realloc(result, sizeof(char*)*(count+1));
result[count] = (char*)malloc(strlen(pch)+1);
strcpy(result[count], pch);
count++;
}
pch = strtok (NULL, "\n");
}
// show and free result
printf("%d results:\n",count);
for (i = 0; i < count; ++i)
{
printf ("result[%d] = %s\n", i, result[i]);
free(result[i]);
}
free(result);

Related

Function to split a string and return every word in the string as an array of strings

I am trying to create a function that will accept a string, and return an array of words in the string. Here is my attempt:
#include "main.h"
/**
* str_split - Splits a string
* #str: The string that will be split
*
* Return: On success, it returns the new array
* of strings. On failure, it returns NULL
*/
char **str_split(char *str)
{
char *piece, **str_arr = NULL, *str_cpy;
int number_of_words = 0, i;
if (str == NULL)
{
return (NULL);
}
str_cpy = str;
piece = strtok(str_cpy, " ");
while (piece != NULL)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
number_of_words++;
piece = strtok(NULL, " ");
}
str_arr = (char **)malloc(sizeof(char *) * number_of_words);
piece = strtok(str, " ");
for (i = 0; piece != NULL; i++)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
str_arr[i] = (char *)malloc(sizeof(char) * (strlen(piece) + 1));
strcpy(str_arr[i], piece);
piece = strtok(NULL, " ");
}
return (str_arr);
}
Once I compile my file, I should be getting:
Hello
World
But I am getting:
Hello
Why is this happening? I have tried to dynamically allocate memory for the new string array, by going through the copy of the original string and keeping track of the number of words. Is this happening because the space allocated for the array of strings is not enough?
The code seems fine overall, with just some issues:
You tried to copy str, as strtok modifies it while parsing.
This is the right approach. However, the following line is wrong:
str_cpy = str;
This is not a copy of strings, it is only copying the address of the string. You can use strdup function here.
Also, you need to return the number of words counted otherwise the caller will not know how many were parsed.
Finally, be careful when you define the string to be passed to this function. If you call it with:
char **arr = str_split ("Hello World", &nwords);
Or even with:
char *str = "Hello World";
char **arr = str_split (str, &nwords);
The program will likely crash, because str here points to a string literal, which is read-only, and strtok writes into the string it parses.
Taking care of these, the program should work with:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
/**
* str_split - Splits a string
* #str: The string that will be split
*
* Return: On success, it returns the new array
* of strings. On failure, it returns NULL
*/
char **str_split(char *str, int *number_of_words)
{
char *piece, **str_arr = NULL, *str_cpy = NULL;
int i = 0;
if (str == NULL)
{
return (NULL);
}
str_cpy = strdup (str);
piece = strtok(str_cpy, " ");
while (piece != NULL)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
(*number_of_words)++;
piece = strtok(NULL, " ");
}
str_arr = (char **)malloc(sizeof(char *) * (*number_of_words));
piece = strtok(str, " ");
for (i = 0; piece != NULL; )
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
str_arr[i] = (char *)malloc(sizeof(char) * (strlen(piece) + 1));
strcpy(str_arr[i], piece);
i++; /* advance only after storing a word, so skipped pieces leave no gap */
piece = strtok(NULL, " ");
}
if (str_cpy)
free (str_cpy);
return (str_arr);
}
int main ()
{
int nwords = 0;
char str[] = "Hello World";
char **arr = str_split (str, &nwords);
for (int i = 0; i < nwords; i++) {
printf ("word %d: %s\n", i, arr[i]);
}
// Needs to free allocated memory...
}
Testing:
$ gcc main.c && ./a.out
word 0: Hello
word 1: World

parse line from file to list in c c90

I'm working with C90 on Linux.
I have a strange bug when I try to terminate a list of strings.
Let idx be the index; when I get past the last word I want list[idx] to be NULL.
Example:
list[0] is actually "hello"
list[1] is actually "world\n"
list[2] is sometimes "" or NULL
So when I put NULL at the end of the list, it deletes one of the other words.
For: list[2] = NULL;
list[0] unexpectedly turns NULL, but list[1] is still "world\n" and list[2] is of course NULL.
I wrote this function:
void function()
{
char buffer[BUFF_LEN];
char** list = NULL;
int list_len = 0;
while (fgets(buffer, BUFF_LEN, fptr))
{
list = (char**)malloc((sizeof(char*)));
get_input(buffer, list, &list_len);
/*
some other code
*/
}
free_list(list, list_len); /*free the array of strings words*/
}
I also wrote get_input myself, because I work with C90:
void get_input(char* line, char** list, int *idx)
{
char * token;
*idx = 0;
token = strtok(line, " "); /*extract the first token*/
/* loop through the string to extract all other tokens */
while (token != NULL)
{
if (token && token[0] == '\t')
memmove(token, token + 1, strlen(token));
printf("%s\n", token);
list[*idx] = (char *)malloc(strlen(token)+1);
strncpy(list[*idx], token, strlen(token));
token = strtok(NULL, " "); /*get every token*/
(*idx)++;
}
if (*idx == 0)
list = NULL;
list[*idx - 1][strcspn(list[*idx - 1], "\n")] = 0; /* remove the "\n" */
list[*idx] = NULL; /* to know when the list ends */
}
the free function:
void free_list(char** list, int list_len)
{
int i;
for(i= list_len - 1; i >= 0; i--)
{
list[i] = NULL;
free(list[i]);
}
}
You have multiple issues.
void function()
{
char buffer[BUFF_LEN];
char** list = NULL;
int list_len = 0;
while (fgets(buffer, BUFF_LEN, fptr))
{
list = (char**)malloc((sizeof(char*)));
get_input(buffer, list, &list_len);
/*
some other code
*/
}
free_list(list, list_len); /*free the array of strings words*/
}
You only allocate memory for 1 pointer.
You only free the pointers in the last list.
You never free the memory for list itself.
You should not cast the return value of malloc and friends.
This should be changed like this:
void function()
{
char buffer[BUFF_LEN];
char** list = NULL;
int list_len = 0;
while (fgets(buffer, BUFF_LEN, fptr))
{
list = malloc((sizeof(char*)));
get_input(buffer, &list, &list_len);
/*
some other code
*/
free_list(list); /*free the array of strings words*/
free(list);
}
}
The freeing function is also broken:
void free_list(char** list, int list_len)
{
int i;
for( i= list_len - 1; i >= 0; i--)
{
list[i] = NULL;
free(list[i]);
}
}
You set the pointer within list to NULL before you free it. This causes a memory leak as the memory is not really freed.
You don't really need the length as you have added a sentinel. But that is not an error.
There is also no need to free the pointers backwards.
After cleanup the function could look like this:
void free_list(char** list)
{
int i = 0;
while (list[i])
{
free(list[i]);
i++;
}
}
Now the biggest part:
void get_input(char* line, char** list, int *idx)
{
char * token;
*idx = 0;
token = strtok(line, " "); /*extract the first token*/
/* loop through the string to extract all other tokens */
while (token != NULL)
{
if (token && token[0] == '\t')
memmove(token, token + 1, strlen(token));
printf("%s\n", token);
list[*idx] = (char *)malloc(strlen(token)+1);
strncpy(list[*idx], token, strlen(token));
token = strtok(NULL, " "); /*get every token*/
(*idx)++;
}
if (*idx == 0)
list = NULL;
list[*idx - 1][strcspn(list[*idx - 1], "\n")] = 0; /* remove the "\n" */
list[*idx] = NULL; /* to know when the list ends */
}
You do not care about memory for the pointers in your list. That means you store the pointers in memory that you are not allowed to touch. By doing this you invoke undefined behaviour.
You must realloc the memory and for that you must be able to modify the passed pointer.
You should not cast the return values of malloc and friends.
You access illegal index values if *idx==0
You call strncpy with the length of the string without space for the 0 byte. That will cause the copy to be not nul terminated. Also there is no need to use strncpy over strcpy as you have reserved enough memory.
void get_input(char* line, char*** list, int *idx)
{
char *token;
char **list_local = *list; /* Make things easier by avoiding one * within the function. */
*idx = 0;
token = strtok(line, " "); /*extract the first token*/
/* loop through the string to extract all other tokens */
while (token != NULL)
{
if (token[0] == '\t') /* No need to check for token again */
memmove(token, token + 1, strlen(token));
printf("%s\n", token);
list_local[*idx] = malloc(strlen(token)+1);
strcpy(list_local[*idx], token);
token = strtok(NULL, " "); /*get every token*/
(*idx)++;
/* Increase array size to hold 1 more entry. */
/* That new element already includes memory for the sentinel NULL */
{
char ** temp = realloc(list_local, sizeof(char*) * (*idx + 1));
if (temp != NULL)
list_local = temp;
/* TODO: error handling ... */
}
}
if (*idx != 0)
{
list_local[*idx - 1][strcspn(list_local[*idx - 1], "\n")] = 0; /* remove the "\n" */
}
list_local[*idx] = NULL; /* to know when the list ends */
*list = list_local;
}

Copying specific number of characters from a string to another

I have a variable-length string that I am trying to split at the plus signs and then work with:
char string[] = "var1+vari2+varia3";
for (int i = 0; i != sizeof(string); i++) {
memcpy(buf, string[0], 4);
buf[9] = '\0';
}
Since the variables differ in size, I am trying to write something that walks the string in a loop and extracts (splits out) the variables. Any suggestions? I am expecting a result such as:
var1
vari2
varia3
You can use strtok() to break the string at a delimiter:
char string[]="var1+vari2+varia3";
const char delim[] = "+";
char *token;
/* get the first token */
token = strtok(string, delim);
/* walk through other tokens */
while( token != NULL ) {
printf( " %s\n", token );
token = strtok(NULL, delim);
}
More info about the strtok() here: https://man7.org/linux/man-pages/man3/strtok.3.html
It seems to me that you don't just want to print the individual strings but want to save them in some buffer.
Since you can't know the number of strings or the length of each individual string, you should allocate memory dynamically, i.e. use functions like realloc, calloc and malloc.
It can be implemented in several ways. Below is one example. To keep the example simple, it is not performance-optimized in any way.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
char** split_string(const char* string, const char* token, int* num)
{
assert(string != NULL);
assert(token != NULL);
assert(num != NULL);
assert(strlen(token) != 0);
char** data = NULL;
int num_strings = 0;
while(*string)
{
// Allocate memory for one more string pointer
char** ptemp = realloc(data, (num_strings + 1) * sizeof *data);
if (ptemp == NULL) exit(1);
data = ptemp;
// Look for token
char* tmp = strstr(string, token);
if (tmp == NULL)
{
// Last string
// Allocate memory for one more string and copy it
int len = strlen(string);
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
++num_strings;
break;
}
// Allocate memory for one more string and copy it
int len = tmp - string;
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
// Prepare to search for next string
++num_strings;
string = tmp + strlen(token);
}
*num = num_strings;
return data;
}
int main()
{
char string[]="var1+vari2+varia3";
// Split the string into dynamic allocated memory
int num_strings;
char** data = split_string(string, "+", &num_strings);
// Now data can be used as an array-of-strings
// Example: Print the strings
printf("Found %d strings:\n", num_strings);
for(int i = 0; i < num_strings; ++i) printf("%s\n", data[i]);
// Free the memory
for(int i = 0; i < num_strings; ++i) free(data[i]);
free(data);
}
Output
Found 3 strings:
var1
vari2
varia3
You can use a simple loop scanning the string for + signs:
char string[] = "var1+vari2+varia3";
char buf[sizeof(string)];
int start = 0;
for (int i = 0;;) {
if (string[i] == '+' || string[i] == '\0') {
memcpy(buf, string + start, i - start);
buf[i - start] = '\0';
// buf contains the substring, use it as a C string
printf("%s\n", buf);
if (string[i] == '\0')
break;
start = ++i;
} else {
i++;
}
}
Your code does not make sense.
I wrote such a function for you. Analyze it, as it is sometimes good to have some code as a base:
char *substr(const char *str, char *buff, const size_t start, const size_t len)
{
size_t srcLen;
char *result = buff;
if(str && buff)
{
if(*str)
{
srcLen = strlen(str);
if(srcLen < start + len)
{
if(start < srcLen) strcpy(buff, str + start);
else buff[0] = 0;
}
else
{
memcpy(buff, str + start, len);
buff[len] = 0;
}
}
else
{
buff[0] = 0;
}
}
return result;
}
https://godbolt.org/z/GjMEqx

null terminate an array of strings

I am trying to figure out how to get my array of strings from get_arguments to NULL terminate, or if that isn't the issue to function in my execv call.
char ** get_arguments(const char * string) {
char * copy = strdup(string);
char * remove_newline = "";
for(;;) {
remove_newline = strpbrk(copy, "\n\t");
if (remove_newline) {
strcpy(remove_newline, "");
}
else {
break;
}
}
char (* temp)[16] = (char *) malloc(256 * sizeof(char));
char * token = strtok(copy, " ");
strcpy(temp[0], token);
int i = 1;
while (token && (token = strtok(NULL, " "))) {
strcpy(temp[i], token);
i++;
}
char * new_null;
//new_null = NULL;
//strcpy(temp[i], new_null);
if(!temp[i]) printf("yup\n");
int c = 0;
for ( ; c <= i; c++) {
printf("%s ", temp[c]);
}
return temp;
}
I am trying to read in a string, space separated, similar to find ./ -name *.h. I am trying to input them into execv.
char (* arguments)[16] = (char **) malloc(256 * sizeof(char));
//...numerous lines of unrelated code
pid = fork();
if (pid == 0) {
arguments = get_arguments(input_string);
char * para[] = {"find", "./","-name", "*.h", NULL};
execv("/usr/bin/find", (char * const *) arguments);
//printf("%s\n", arguments[0]);
printf("\nexec failed: %s\n", strerror(errno)); //ls -l -R
exit(-1);
}
When I swap arguments in the execv call for para it works as intended, but trying to call with arguments returns exec failed: Bad address. If I remove the NULL from para I get the same issue. I've tried strcpy(temp, (char *) NULL), the version you see commented out in get_arguments, and a number of other things that I can't recall in their entirety, and my program ranges from Segmentation fault to failure to compile from attempting to strcpy NULL.
Changing the declarations of arguments and temp to char ** arguments = (char *) malloc(256 * sizeof(char)); and char ** temp = (char *) malloc(256 * sizeof(char)); clears up warning: initialization from incompatible pointer type but causes a segfault on all calls to get_arguments.
You want this:
char **temp = malloc(256 * sizeof *temp); // room for 256 char*'s; heap storage stays valid after get_arguments returns
char * token = strtok(copy, " ");
temp[0] = strdup(token);
int i = 1;
while (token && (token = strtok(NULL, " "))) {
temp[i] = strdup(token);
i++;
}
temp[i] = NULL;

Parsing string to array in C?

I am trying to parse a string into an array using specific delimiters, but it's not working as expected. I have tried a lot to get this working but failed.
The code I am using is below.
CODE TO PARSE
char itemCode[] = "0xFF,0xAA,0xBB,0x00,0x01,0x04,0x90";
char itemCodeToSend[34] = {0};
char ** res = NULL;
char * p = strtok (itemCode, ",");
int n_spaces = 0, i;
/* split string and append tokens to 'res' and 'itemCodeToSend' */
while (p) {
res = realloc (res, sizeof (char*) * ++n_spaces);
if (res == NULL)
exit (-1); /* memory allocation failed */
res[n_spaces-1] = p; // Copying to char**
strcpy(&itemCodeToSend[0],p);// Copying to Array
p = strtok (NULL, ",");
}
/* realloc one extra element for the last NULL */
res = realloc (res, sizeof (char*) * (n_spaces+1));
res[n_spaces] = 0;
/* print the result */
for (i = 0; i < (n_spaces); ++i)
printf ("res[%d] = %s\n", i, res[i]);
for (i = 0; i < 34; ++i)
printf ("0x%02x",(unsigned)res[i]&0xffU);
/* free the memory allocated */
free (res);
I am getting the below output for char** but not for char[]
res[0] = "0xFF"; itemCodeToSend[0] = "0x30";
res[1] = "0xAA"; itemCodeToSend[1] = "0x30";
res[2] = "0xBB"; itemCodeToSend[2] = "0x30";
res[3] = "0x00"; itemCodeToSend[3] = "0x30";
res[4] = "0x01"; itemCodeToSend[4] = "0x30";
res[5] = "0x04"; itemCodeToSend[5] = "0x30";
res[6] = "0x90"; itemCodeToSend[6] = "0x30";
Am I using the right way to copy the extracted values to the array?
It looks like you want these lines to access itemCodeToSend:
for (i = 0; i < 34; ++i)
printf ("0x%02x",(unsigned)itemCodeToSend[i]&0xffU);
Before that, when you write to itemCodeToSend, it looks like you
want to write the integer values, not their textual representation, and not to overwrite
the same element on each iteration.
Perhaps you need something along the lines of
itemCodeToSend[n_spaces-1] = strtol(p, 0, 16);
or similar, instead of
strcpy(&itemCodeToSend[0],p);// Copying to Array

Resources