C build string char by char with known MAX length

I'm trying to add characters to a string one by one. I have something like this:
void doline(char *line, char *buffer, char** tokens){
}
and I am calling it like this:
char *line = malloc(1025 * sizeof(char *));
fgets(line, 1024, stdin);
int linelength = strlen(line);
if (line[linelength - 1] == '\n'){
line[linelength - 1] = '\0';
}
char ** tokens = (char **) malloc(strlen(line) * sizeof(char *));
char *emptybuffer = malloc(strlen(line) * sizeof(char *));
parseline(line, emptybuffer, tokens);
So doline will go through line, tokenize it based on various conditions, and place fragments of it into tokens. I am building the temporary string in the variable buffer. To do this, I need to go through line character by character.
I am currently doing:
buffer[strlen(buffer)] = line[i];
And then at the end of the loop:
*buffer++ = '\0';
But this is the result:
printf("Working on line: '%s' %d\n", line, strlen(line));
Outputs: Working on line: 'test' 4
But by the end of the function the buffer is:
*buffer++ = '\0';
printf("Buffer at the very end: '%s' %d\n", buffer, strlen(buffer));
Outputs: Buffer at the very end: 'test' 7
So the output is showing that the string is getting messed up. What's the best way to build this string character by character? Are my string manipulations correct?
Any help would be much appreciated!
Thanks!

There were some basic problems, so I rewrote the program.
#include <stdio.h>
#include <stdlib.h>
#define str_len 180
#define max_tokens 100
/* Split str on spaces; store a malloc'd copy of each word in tokens,
   followed by a terminating NULL entry. */
void tokenize(const char *str, char **tokens)
{
    int i = 0;      /* next free slot in tokens */
    int start = 0;  /* index where the current word begins */
    int pos = 0;    /* current position in str */
    int len, k;
    for (;;) {
        if (str[pos] == ' ' || str[pos] == '\0') {
            len = pos - start;
            /* skip empty words produced by consecutive spaces */
            if (len > 0 && i < max_tokens - 1) {
                tokens[i] = malloc(len + 1);  /* +1 for the terminating '\0' */
                if (tokens[i] == NULL)
                    break;
                for (k = 0; k < len; k++)
                    tokens[i][k] = str[start + k];
                tokens[i][len] = '\0';
                i++;
            }
            if (str[pos] == '\0')
                break;
            start = pos + 1;
        }
        pos++;
    }
    tokens[i] = NULL;  /* mark the end of the token list */
}
int main(void)
{
    char *str = malloc(str_len * sizeof(char));
    char **tokens = malloc(max_tokens * sizeof(char *));
    int i = 0;
    if (str == NULL || tokens == NULL)
        return 1;
    /* fgets is bounded by the buffer size, unlike gets */
    if (fgets(str, str_len, stdin) == NULL)
        return 1;
    /* strip the trailing newline, if any */
    while (str[i] != '\0' && str[i] != '\n')
        i++;
    str[i] = '\0';
    printf("input string: %s\n", str);
    tokenize(str, tokens);
    for (i = 0; tokens[i] != NULL; i++)
        printf("%d - %s \n", i, tokens[i]);
    for (i = 0; tokens[i] != NULL; i++)
        free(tokens[i]);
    free(tokens);
    free(str);
    return 0;
}
It is compiled and executed as follows:
$ gcc -ggdb -Wall prog.c
$ ./a.out
this is a test string... hello world!!
input string: this is a test string... hello world!!
0 - this
1 - is
2 - a
3 - test
4 - string...
5 - hello
6 - world!!
$
There were a few basic assumptions:
The length of the incoming string is assumed to be constant. It can also be handled dynamically; see How to read a line from the console in C?
The length of the tokens array is also assumed to be constant. This can be changed as well; I will leave it to you to find out how!
Hope this helps!

Related

Function to split a string and return every word in the string as an array of strings

I am trying to create a function that will accept a string, and return an array of words in the string. Here is my attempt:
#include "main.h"
/**
* str_split - Splits a string
* @str: The string that will be split
*
* Return: On success, it returns the new array
* of strings. On failure, it returns NULL
*/
char **str_split(char *str)
{
char *piece, **str_arr = NULL, *str_cpy;
int number_of_words = 0, i;
if (str == NULL)
{
return (NULL);
}
str_cpy = str;
piece = strtok(str_cpy, " ");
while (piece != NULL)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
number_of_words++;
piece = strtok(NULL, " ");
}
str_arr = (char **)malloc(sizeof(char *) * number_of_words);
piece = strtok(str, " ");
for (i = 0; piece != NULL; i++)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
str_arr[i] = (char *)malloc(sizeof(char) * (strlen(piece) + 1));
strcpy(str_arr[i], piece);
piece = strtok(NULL, " ");
}
return (str_arr);
}
Once I compile my file, I should be getting:
Hello
World
But I am getting:
Hello
Why is this happening? I have tried to dynamically allocate memory for the new string array, by going through the copy of the original string and keeping track of the number of words. Is this happening because the space allocated for the array of strings is not enough?
The code seems fine overall, with just some issues:
You tried to copy str, as strtok modifies it while parsing.
This is the right approach. However, the following line is wrong:
str_cpy = str;
This does not copy the string; it only copies the address of the string. You can use the strdup function here.
Also, you need to return the number of words counted, otherwise the caller will not know how many were parsed.
Finally, be careful when you define the string to be passed to this function. If you call it with:
char **arr = str_split ("Hello World", &nwords);
Or even with:
char *str = "Hello World";
char **arr = str_split (str, &nwords);
The program will crash, because str here refers to a read-only string literal and strtok tries to modify it.
Taking care of these, the program should work with:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
/**
* str_split - Splits a string
* @str: The string that will be split
*
* Return: On success, it returns the new array
* of strings. On failure, it returns NULL
*/
char **str_split(char *str, int *number_of_words)
{
char *piece, **str_arr = NULL, *str_cpy = NULL;
int i = 0;
if (str == NULL)
{
return (NULL);
}
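*number_of_words = 0; /* do not rely on the caller to initialize the count */
str_cpy = strdup (str);
if (str_cpy == NULL)
return (NULL);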
piece = strtok(str_cpy, " ");
while (piece != NULL)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
(*number_of_words)++;
piece = strtok(NULL, " ");
}
str_arr = malloc(sizeof(char *) * (*number_of_words));
if (str_arr == NULL)
{
free(str_cpy);
return (NULL);
}
piece = strtok(str, " ");
i = 0;
while (piece != NULL)
{
if ((*piece) == '\n')
{
piece = strtok(NULL, " ");
continue;
}
str_arr[i] = malloc(strlen(piece) + 1);
strcpy(str_arr[i], piece);
i++; /* advance only when a word was stored, so the array has no gaps */
piece = strtok(NULL, " ");
}
if (str_cpy)
free (str_cpy);
return (str_arr);
}
int main ()
{
int nwords = 0;
char str[] = "Hello World";
char **arr = str_split (str, &nwords);
for (int i = 0; i < nwords; i++) {
printf ("word %d: %s\n", i, arr[i]);
}
// Needs to free allocated memory...
}
Testing:
$ gcc main.c && ./a.out
word 0: Hello
word 1: World

Copying specific number of characters from a string to another

I have a variable-length string that I am trying to split at the plus signs and then process:
char string[] = "var1+vari2+varia3";
for (int i = 0; i != sizeof(string); i++) {
memcpy(buf, string[0], 4);
buf[9] = '\0';
}
Since the variables differ in length, I am trying to write a loop that walks through the string and extracts the individual variables. Any suggestions? I am expecting a result such as:
var1
vari2
varia3
You can use strtok() to break the string at a delimiter:
char string[]="var1+vari2+varia3";
const char delim[] = "+";
char *token;
/* get the first token */
token = strtok(string, delim);
/* walk through other tokens */
while( token != NULL ) {
printf( " %s\n", token );
token = strtok(NULL, delim);
}
More info about the strtok() here: https://man7.org/linux/man-pages/man3/strtok.3.html
It seems to me that you don't just want to print the individual strings but want to save them in some buffer.
Since you can't know the number of strings or the length of each individual string, you should allocate memory dynamically, i.e. use functions like realloc, calloc and malloc.
It can be implemented in several ways. Below is one example. To keep the example simple, it is not optimized for performance in any way.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
char** split_string(const char* string, const char* token, int* num)
{
assert(string != NULL);
assert(token != NULL);
assert(num != NULL);
assert(strlen(token) != 0);
char** data = NULL;
int num_strings = 0;
while(*string)
{
// Allocate memory for one more string pointer
char** ptemp = realloc(data, (num_strings + 1) * sizeof *data);
if (ptemp == NULL) exit(1);
data = ptemp;
// Look for token
char* tmp = strstr(string, token);
if (tmp == NULL)
{
// Last string
// Allocate memory for one more string and copy it
int len = strlen(string);
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
++num_strings;
break;
}
// Allocate memory for one more string and copy it
int len = tmp - string;
data[num_strings] = calloc(len + 1, 1);
if (data[num_strings] == NULL) exit(1);
memcpy(data[num_strings], string, len);
// Prepare to search for next string
++num_strings;
string = tmp + strlen(token);
}
*num = num_strings;
return data;
}
int main()
{
char string[]="var1+vari2+varia3";
// Split the string into dynamic allocated memory
int num_strings;
char** data = split_string(string, "+", &num_strings);
// Now data can be used as an array-of-strings
// Example: Print the strings
printf("Found %d strings:\n", num_strings);
for(int i = 0; i < num_strings; ++i) printf("%s\n", data[i]);
// Free the memory
for(int i = 0; i < num_strings; ++i) free(data[i]);
free(data);
}
Output
Found 3 strings:
var1
vari2
varia3
You can use a simple loop scanning the string for + signs:
char string[] = "var1+vari2+varia3";
char buf[sizeof(string)];
int start = 0;
for (int i = 0;;) {
if (string[i] == '+' || string[i] == '\0') {
memcpy(buf, string + start, i - start);
buf[i - start] = '\0';
// buf contains the substring, use it as a C string
printf("%s\n", buf);
if (string[i] == '\0')
break;
start = ++i;
} else {
i++;
}
}
Your code does not make sense.
I wrote a function for this. Analyse it, as it is sometimes good to have some code as a base:
char *substr(const char *str, char *buff, const size_t start, const size_t len)
{
size_t srcLen;
char *result = buff;
if(str && buff)
{
if(*str)
{
srcLen = strlen(str);
if(srcLen < start + len)
{
if(start < srcLen) strcpy(buff, str + start);
else buff[0] = 0;
}
else
{
memcpy(buff, str + start, len);
buff[len] = 0;
}
}
else
{
buff[0] = 0;
}
}
return result;
}
https://godbolt.org/z/GjMEqx

Dynamically allocated unknown length string reading from file (it has to be protected from reading numbers from the file) in C

My problem is that I need to read strings from a file. File example:
Example 1 sentence
Example sentence number xd 595 xd 49 lol
but I have to read only the string part, not the numbers. I guess I have to use fscanf() with %s for it, but let me know what you think.
My problem begins with how to read a string of unknown length using malloc() and realloc(). I tried it myself, but I failed (my attempt is at the bottom of my post).
Then I need to show the result on the screen.
P.S. I have to use malloc()/calloc(), realloc() <-- it has to be dynamically allocated string :) (char *)
Code I've tried:
int wordSize = 2;
char *word = (char *)malloc(wordSize*sizeof(char));
char ch;
FILE* InputWords = NULL;
InputWords = fopen(ListOfWords,"r"); /* variable ListOfWords contains name of the file */
if (InputWords == NULL)
{
printf("Error while opening the file.\n");
return 0;
}
int index = 0;
while((ch = fgetc(InputWords)) != -1)
{
if(ch == ' ')
{
printf("%s\n", word);
wordSize = 2;
index = 0;
free(word);
char* word = (char *)malloc(wordSize*sizeof(char));
}
else
{
wordSize++;
word = (char *)realloc(word, wordSize*sizeof(char));
strcpy(word,ch);
index++;
}
}
fclose(InputWords);
There are a few things to improve in your code:
fgetc returns int, not char, so change char ch to int ch.
As @pmg commented, compare against EOF (which may be any negative value) instead of -1.
strcpy(word,ch); tries to copy a single character (ch) into word, but strcpy expects a pointer to a null-terminated string as its second argument.
Do not cast the result of malloc or realloc: see Do I cast the result of malloc?
To solve your problem, I propose using the strtok function to split each line at the space characters and then testing whether each word is a number. If the word is not a number, you can use strcat to append it to the sentence built so far.
The complete code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
int is_number(char *str) {
if (strlen(str) == 0)
return -1;
for(size_t i = 0; (str[i] != '\0') && (str[i] != '\n'); i++) {
if(!isdigit((unsigned char)str[i])) // cast avoids undefined behavior for negative char values
return -1;
}
return 1;
}
int main()
{
FILE *fp = fopen("input.txt", "r");
char line[256];
if(!fp) return -1;
char **sentence;
int i = 0;
sentence = malloc(sizeof(char *));
if(!sentence) return -1;
while(fgets(line, 256, fp)) {
char * token = strtok(line, " ");
size_t len = 0;
sentence = realloc(sentence, sizeof(char *) * (i+1));
if(!sentence) return -1;
sentence[i] = NULL; // so the first realloc below behaves like malloc
while(token != NULL) {
if (is_number(token) != 1) {
sentence[i] = realloc(sentence[i], len + 2 + strlen(token)); // +2 because 1 for null character and 1 for space character
if (!sentence[i]) {
printf("cannot realloc\n");
return -1;
}
if (len == 0)
sentence[i][0] = '\0'; // start from an empty string before the first strcat
strcat(strcat(sentence[i], " "), token);
len = strlen(sentence[i]);
}
token = strtok(NULL, " ");
}
if(len > 0)
i++;
}
for(int j = 0; j < i; j++) {
printf("line[%d]: %s", j, sentence[j]);
}
for(int j = 0; j < i; j++) {
free(sentence[j]);
}
free(sentence);
fclose(fp);
return 0;
}
The input and output:
$ cat input.txt
Example 1 sentence
Example sentence number xd 595 xd 49 lol
$ ./test
line[0]: Example sentence
line[1]: Example sentence number xd xd lol

How to return a pointer to a string in C [duplicate]

I tried to write a function that takes a string, reverses its letters, and returns a pointer to the resulting string.
char *reverseStr(char s[])
{
printf("Initial string is: %s\n", s);
int cCounter = 0;
char *result = malloc(20);
while(*s != '\0')
{
cCounter++;
s++;
}
printf("String contains %d symbols\n", cCounter);
int begin = cCounter;
for(; cCounter >= 0; cCounter--)
{
result[begin - cCounter] = *s;
s--;
}
result[13] = '\0';
return result;
}
In the main function I call it and try to print the result like this:
int main()
{
char testStr[] = "Hello world!";
char *pTestStr;
puts("----------------------------------");
puts("Input a string:");
pTestStr = reverseStr(testStr);
printf("%s\n", pTestStr);
free(pTestStr);
return 0;
}
But the result is unexpected: the reversed string is not printed.
What is my mistake?
There are multiple mistakes in the shared code, primarily:
s++ moves the pointer all the way to the terminating '\0'. It should be moved back one position with s-- so that it points to the last character of the string. Otherwise the copy starts with '\0', which makes it an empty string.
The magic numbers 20 and 13. For malloc(), the length of s plus 1 is sufficient instead of 20. Instead of writing to index 13, write the '\0' one position past the last copied character.
Using string.h library functions would make this much easier, but I assume you are doing this as a learning exercise.
The corrected code, without string.h library functions, looks like this:
char *reverseStr(char s[])
{
printf("Initial string is: %s\n", s);
int cCounter = 0;
while(*s != '\0')
{
cCounter++;
s++;
}
s--; //move pointer back to point to the string's last character
printf("String contains %d symbols\n", cCounter);
char *result = malloc(cCounter + 1); /* +1 for the terminating '\0' */
if( result == NULL ) /*Check for failure. */
{
puts( "Can't allocate memory!" );
exit( EXIT_FAILURE );
}
char *tempResult = result;
for (int begin = 0; begin < cCounter; begin++)
{
*tempResult = *s;
s--; tempResult++;
}
*tempResult = '\0'; /* terminate the reversed string */
return result;
}
Calling from main
int main()
{
char testStr[] = "Hello world!";
char *pTestStr;
puts("----------------------------------");
puts("Input a string:");
pTestStr = reverseStr(testStr);
printf("%s\n", pTestStr);
free(pTestStr);
}
Output
----------------------------------
Input a string:
Initial string is: Hello world!
String contains 12 symbols
!dlrow olleH
As per WhozCraig's suggestion, using only pointer arithmetic:
char *reverseStr(const char s[])
{
const char *end = s;
while (*end)
++end;
char *result = malloc((end - s) + 1), *beg = result;
if (result == NULL)
{
perror("Failed to allocate string buffer");
exit(EXIT_FAILURE);
}
while (end != s)
*beg++ = *--end;
*beg = 0;
return result;
}
Your code can be simplified using strlen, a string library function found in string.h:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *reverseStr(char s[])
{
printf("Initial string is: %s\n", s);
int cCounter = strlen(s);
char *result = malloc(cCounter + 1);
printf("String contains %d symbols\n", cCounter);
int begin = cCounter;
for(; cCounter > 0; cCounter--)
{
result[begin - cCounter] = s[cCounter - 1];
}
result[begin] = '\0';
return result;
}
int main()
{
char testStr[] = "Hello world!";
char *pTestStr;
puts("----------------------------------");
puts("Input a string:");
pTestStr = reverseStr(testStr);
printf("%s\n", pTestStr);
free(pTestStr);
return 0;
}
Output:
----------------------------------
Input a string:
Initial string is: Hello world!
String contains 12 symbols
!dlrow olleH

Dynamic array of strings

I have to dynamically allocate an array of words. The words are stored in a file, separated by a variable number of whitespace characters. I don't know how many words are in the file, and they can have variable length.
I have this code:
void readWord(FILE* stream, char *word, char first_c) {
word[0] = first_c;
char val;
int wlen = 1;
// isWhitespac is my function - tests if char is blank or '\n'
while ((val = fgetc(stream)) != EOF && isWhitespace(val) == 0) {
wlen++;
word = realloc(word, (wlen+1) * sizeof (char));
word[wlen-1] = val;
}
word[wlen] = '\0';
}
int readList(const char *file) {
FILE* f;
char **arr;
char val;
int wcount = 0;
arr = malloc(sizeof (char*));
f = fopen(file, "r");
while (fscanf(f, " %c", &val) == 1) {
wcount++;
arr = realloc(arr, wcount * sizeof (char *));
arr[wcount - 1] = malloc(sizeof (char));
readWord(f, arr[wcount-1], val);
printf("%s\n", arr[wcount-1]);
}
for (int i = 0; i < wcount; ++i) {
free(arr[i]);
}
free(arr);
fclose(f);
return 0;
}
It appears to work fine: it reads and prints all the words. But when I run the program with Valgrind there are too many errors, which I can't track down. Could anyone help me? (I know I have to test whether malloc and the others succeeded; this is just a test function.)
The Valgrind log is quite long, should I post it too?
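One of the issues is that you call realloc inside readWord. If realloc moves the data to a new buffer instead of extending the current one, the old buffer is freed, but arr[wcount-1] in readList still holds the old address. Printing through that pointer reads freed memory, and the later free(arr[i]) frees it a second time; this is what Valgrind picks up. To fix this, I would rewrite the function so that it returns the (possibly new) pointer instead of void.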
char * readWord(FILE* stream, char *word, char first_c) {
word[0] = first_c;
int val; /* int rather than char, so that EOF can be told apart from real characters */
int wlen = 1;
// isWhitespace is my function - tests if char is blank or '\n'
while ((val = fgetc(stream)) != EOF && isWhitespace(val) == 0) {
wlen++;
word = realloc(word, (wlen+1) * sizeof (char));
word[wlen-1] = val;
}
word[wlen] = '\0';
return word;
}
And then change the loop in readList to this:
while (fscanf(f, " %c", &val) == 1) {
wcount++;
arr = realloc(arr, wcount * sizeof (char *));
arr[wcount-1]=malloc(sizeof(char));
arr[wcount - 1] = readWord(f, arr[wcount-1], val);
printf("%s\n", arr[wcount-1]);
}
