Questions regarding tokenisation in C

I am writing a tokenisation program. I want to get input from a file, then store it in an input pointer. I am using the strtok function but when I print my tokens[i] I get NULL.
int tokenise(char *input, int file_output)
{
int i = 0;
char *tokens[100];
for(i=0 ;i<=20;i++)
{
tokens[i]= (char*)malloc(sizeof(char*));
}
char delim[] = " ,.;#/";
printf("\n ------------- buffer data is %s",input);
tokens[i] = strtok(input , delim);
printf("tokens are %s",*tokens[0]);
int j=0;
while(NULL != tokens[i])
{
i++;
tokens[i] = strtok(NULL,delim);
}
for(j = i; j <= 0; j--)
{
write(file_output,tokens[i],strlen(tokens[i]));
}
for(i = 0; i <= 20; i++)
{
printf("%s \n",*tokens[i]);
}
return SUCCESS;
}

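You allocate memory for the first 21 elements of tokens[] and store the resulting pointers there, so at the end of that loop i is 21. You then parse the input string using strtok(), storing its results in the following array elements, starting at tokens[21]; tokens[0] to tokens[20] still point at the uninitialised blocks from malloc(). Note also that %s expects a char *, so pass tokens[j] to printf() rather than *tokens[j], which is a single char. So two of your loops need rewriting:
for(j=21; j<i; j++)
write(file_output,tokens[j],strlen(tokens[j]));
for(j=21; j<i; j++)
printf("%s \n",tokens[j]);
But it would be better to remove the first loop, which allocates unnecessary memory and leaves i at 21; set i to 0 before the first strtok() call so that the tokens start at tokens[0] and both loops above can start at j=0. strtok() returns pointers into the original string, which it breaks into pieces by inserting '\0' terminators, so you only need to store the pointers in the array tokens[].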
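As a rough illustration of the "store only the pointers" approach, here is a minimal sketch. The helper name split_tokens and its max parameter are made up for this example; the delimiter set is the one from the question.

```c
#include <string.h>

/* Hypothetical helper (not from the question): splits `input` in place and
   stores up to `max` token pointers in `tokens`. Returns the token count. */
static int split_tokens(char *input, char *tokens[], int max)
{
    const char delim[] = " ,.;#/";
    int count = 0;

    /* strtok() overwrites delimiters in `input` with '\0' and returns
       pointers into `input`, so no per-token allocation is needed. */
    for (char *tok = strtok(input, delim);
         tok != NULL && count < max;
         tok = strtok(NULL, delim)) {
        tokens[count++] = tok;
    }
    return count;
}
```

The buffer passed in must be writable (not a string literal) and must stay alive for as long as the token pointers are used, because they point into it.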

Related

Can't get first character in array of strings

I want to get the first character of each string. Here is an example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main() {
int size = 2;
char** text = (char**) malloc(sizeof(char*) * size);
for(int i = 0; i < size; ++i) {
char buf[80];
fgets(buf, 80, stdin);
text[i] = (char*)malloc(strlen(buf));
strcpy(text[i], buf);
}
for(int i = 0; i < strlen(text[i]); ++i) {
printf("%c ", text[i][0]);
}
}
In the last for loop, the program fails with a segmentation fault. I don't know why.
The strlen function returns the number of characters in the given string not including the terminal nul character; however, the strcpy function copies all characters including that terminating nul!
So, your allocation for text[i] is not quite big enough and, by writing beyond the buffer's bounds, you are getting undefined behaviour.
Add an extra character to the malloc call:
for(int i = 0; i < size; ++i) {
char buf[80];
fgets(buf, 80, stdin);
text[i] = malloc(strlen(buf) + 1); // Need space for the terminal nul!
strcpy(text[i], buf);
}
Or, more simply, use the strdup function, which achieves the same result as your malloc and strcpy in one fell swoop:
for(int i = 0; i < size; ++i) {
char buf[80];
fgets(buf, 80, stdin);
text[i] = strdup(buf);
}
Either way, don't forget to call free on all the buffers you allocate.
EDIT: You are also using the wrong 'limit' in your final output loop; this:
for(int i = 0; i < strlen(text[i]); ++i) { // strlen() is not the # strings
printf("%c ", text[i][0]);
}
Should be:
for(int i = 0; i < size; ++i) { // "size" is your number of strings!
printf("%c ", text[i][0]);
}
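Both fixes can be put together in a small sketch. The helper names copy_string and first_chars are invented for this illustration; copy_string does roughly what strdup does, written out so the + 1 is visible.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `s`, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);      /* + 1 for the terminating '\0' */
    if (copy != NULL) {
        memcpy(copy, s, len + 1);      /* copies the '\0' as well */
    }
    return copy;
}

/* Hypothetical helper: writes the first character of each of the `count`
   strings into `out`, which must have room for count + 1 bytes. */
static void first_chars(char *const text[], size_t count, char *out)
{
    for (size_t i = 0; i < count; ++i) {   /* bound is the number of strings */
        out[i] = text[i][0];
    }
    out[count] = '\0';
}
```

The loop bound in first_chars is the number of strings, not the length of any one of them, which is the same correction as in the final loop above.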

Reversing a string in C using loop [duplicate]

Beginner programmer here. I'm trying to take an input from user, reverse it and show the result. For some reason, it's printing blanks instead of the reversed string. I know that array[i] has the right information because if I use this loop on line for (int i=0; i<count; i++), it's printing the right characters. It's just not printing in reverse. What am I not getting here?
#include <stdio.h>
#include <cs50.h>
#include <string.h>
int main(void)
{
printf("Please enter a word: ");
char *word = get_string();
int count = strlen(word);
char array[count];
for (int i=0; i< count; i++)
{
array[i] = word[i];
}
for (int i=count-1; i==0; i--)
{
printf("%c ", array[i]);
}
printf("\n");
}
for (int i=0; i< count; i++)
{
array[i] = word[i];
}
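This loop goes over the string and copies it; it does not reverse it. The printing loop then never runs for a word longer than one character, because its condition i==0 is false on the first test; to count down to the first character it should be i>=0.
There is also a subtle bug in waiting in your declaration of array, since you do not leave space for the '\0' terminator. Passing your buffer to printf as a C string, as opposed to character by character, will have undefined behavior.
So to fix the copy and the missing terminator:
char array[count + 1];
array[count] = '\0';
for (int i = 0; i< count; i++)
{
array[i] = word[count - 1 - i]; // the last character is at index count - 1
}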
As a side note, it may not mean much to use a VLA for this small exercise, but for larger inputs it could very well overflow the call stack. Beware.
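One way to sidestep the VLA concern is to put the reversed copy on the heap. This is only a sketch; the function name reversed_copy is made up for the example, and the caller is assumed to free the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated reversed copy of `word`,
   or NULL if the allocation fails. The caller must free() the result. */
static char *reversed_copy(const char *word)
{
    size_t count = strlen(word);
    char *out = malloc(count + 1);          /* + 1 for the '\0' terminator */
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        out[i] = word[count - 1 - i];       /* last character goes first */
    }
    out[count] = '\0';
    return out;
}
```

Because the buffer comes from malloc, a very long input makes the allocation fail with NULL instead of overflowing the stack.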
// the header where strlen is
#include <string.h>
/**
* \brief reverse in place the string pointed to by str
**/
void reverseString(char* str) {
size_t len = strlen(str);
// nothing to swap; this also avoids forming the pointer str-1 for an empty string
if (len < 2) return;
// the pointers to the left and right characters
char* pl = str;
char* pr = str+len-1;
// iterate to the middle of the string from left and right (len>>1 == len/2)
for(size_t i = len>>1; i; --i, ++pl, --pr) {
// swap the left and right characters
char l = *pl;
*pl = *pr;
*pr = l;
}
}
And just call the function:
int main(void) {
printf("Please enter a word: ");
char *word = get_string();
// Note: the string is modified in place. If you want both the original and the reversed string, copy it into a buffer with strcpy before the call.
reverseString(word);
printf("%s\n", word);
}
And just change
char array[count];
for (int i=0; i< count; i++)
{
array[i] = word[i];
}
to
// add another byte for the null-terminating character!
char array[count+1];
strcpy(array, word);

Parsing CSV file by splitting with strsep

I'm getting some unwanted output when attempting to parse a comma-separated value file with strsep(). It seems to work for the part of the file where every number has a single digit (i.e. 0-9), but as soon as a number has several digits, for instance 512, it prints 512 12 2 512 12 2 and so on. I'm not sure whether this is due to the way I'm looping.
int main() {
char line[1024];
FILE *fp;
int data[10][10];
int i = 0;
int j = 0;
fp = fopen("file.csv", "r");
while(fgets(line, 1024, fp)) {
char* tmp = strdup(line);
char* token;
char* idx;
while((token = strsep(&tmp, ","))) {
for (idx=token; *idx; idx++) {
data[i][j] = atoi(idx);
j++;
}
}
i++;
j=0;
free(tmp);
}
for(i = 0; i < 10; i++) {
for(j = 0; j < 10; j++) {
printf("%d ", data[i][j]);
}
printf("\n");
}
fclose(fp);
}
This happens because the loop below calls atoi() once for every character position in the token returned by strsep(), so a token such as 512 produces the three elements 512, 12 and 2:
for (idx=token; *idx; idx++) {
data[i][j] = atoi(idx);
j++;
}
Instead, create exactly one element from each token:
while((token = strsep(&tmp, ","))) {
data[i][j] = atoi(token);
j++;
}
Also free(tmp); will do nothing because tmp will be set to NULL by strsep(). To free the buffer allocated via strdup(), keep the pointer in another variable and use it for freeing.

Scanf doesn't read string properly in C. What's wrong?

I'm writing a code that's supposed to find the spaces in the string and separate the parts before and after them in different string arrays. The first problem would be that the scanf doesn't even read my string properly, but also I haven't worked with strings in C before and am curious if it's correct (especially with the a[] array).
char expr[50];
char *a[50];
scanf("%s",expr);
int i=0;
int j=0;
while (strlen(expr)!=0){
if (expr[i]==' '){
strncpy(a[j],expr,i);
strcpy(expr,expr+i+1);
j++;
i=0;
}
else {
if (strlen(expr)==1){
strcpy(a[j],expr);
strcpy(expr,"");
j++;
i=0;
}
else i++;
}
}
i=0;
for (i=0; i<j; i++){
printf("%s\n",a[i]);
}
return 0;
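This code is wrong.
Firstly, do not use a[j] uninitialized.
Add
if((a[j]=calloc(strlen(expr)+1,sizeof(char)))==NULL)exit(1);
before strncpy(a[j],expr,i); and strcpy(a[j],expr); to allocate some memory.
Secondly, strcpy(expr,expr+i+1); is wrong because the behaviour of strcpy() is undefined when the source and destination overlap; memmove() is the function that permits overlap.
Finally, you should use scanf("%49s",expr); instead of scanf("%s",expr); to avoid a buffer overrun. Note that %s stops reading at the first whitespace character, so expr will never contain a space; to read a whole line, use fgets().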
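One way to shift a string left inside its own buffer without passing overlapping regions to strcpy() is memmove(), which is specified to handle overlap. The helper name drop_prefix below is made up for this sketch.

```c
#include <string.h>

/* Hypothetical helper: removes the first `n` characters of `s` in place.
   If `n` exceeds the length, the string becomes empty. */
static void drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);
    if (n > len) {
        n = len;
    }
    /* memmove() allows source and destination to overlap;
       the + 1 moves the terminating '\0' along with the text. */
    memmove(s, s + n, len - n + 1);
}
```

In the question's loop, the equivalent of strcpy(expr,expr+i+1) would be drop_prefix(expr, i + 1).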
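Don't use scanf for this; read the whole line with fgets(), which works on any FILE*, including stdin. Do not use gets(): it cannot limit how much it reads and was removed from the language in C11.
To break a string into elements separated by spaces just use strtok():
char expr[50];
if (fgets(expr, sizeof expr, stdin) == NULL)
{
return 1; // no input
}
char* a[50];
int i = 0;
// "\n" is a delimiter too, so the newline kept by fgets() is dropped
for (char* token = strtok(expr, " \n"); token != NULL && i < 50; token = strtok(NULL, " \n"))
{
a[i] = malloc(strlen(token) + 1); // room for the token and its '\0'
if (a[i] == NULL)
{
break;
}
strcpy(a[i++], token);
}
for (int j = 0; j < i; j++)
{
printf("%s\n", a[j]);
}
// don't forget to free each a[i] when done.
This sample uses strcpy, which performs no bounds checking; it is safe here only because each buffer is sized from strlen(token) + 1. Some toolchains offer strcpy_s as a checked alternative, but it belongs to the optional Annex K of the C standard and is not available everywhere.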

Using Pointer Strings gives segmentation fault

The task was mainly to use pointers to input a string and slice it at places where there is a '\' character and output them in separate lines, using pointers. The program runs fine when I use arrays instead of pointers. However using pointers to store strings give the message "Segmentation fault". The code is as follows :
#include <stdio.h>
#include <stdlib.h>
int main() {
char *name;
char *sep[100];
int i = 0, j = 0, k = 0;
scanf("%[^\n]s", name);
for(i = 0; (*(name+i)) != '\0'; i++) {
if((*(name+i)) == '\\') {
*((*(sep+k))+j) = '\0';
j = 0;
k++;
} else {
*((*(sep+k))+j) = *(name+i);
j++;
}
}
for(i = 0; i <= k; i++) {
printf("%s\n", *(sep+i));
}
return 0;
}
It would be awesome if you could point out what and where the problem is, instead of giving me an alternative solution. TIA.
Your pointers are uninitialized: they do not point at any storage. You invoke undefined behavior by using them without first pointing them at allocated memory. Allocate memory for them so that you can use them correctly and store the words separated by \. Also, you can use [] instead of *.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char name[256];
char *sep[100];
for( int n = 0 ; n < 100 ; n++ )
{
sep[n] = malloc(sizeof name); // a segment can never be longer than the whole input
}
int i = 0, j = 0, k = 0;
scanf(" %255[^\n]", name); // no trailing s: %[ is a complete conversion on its own
for(i = 0; name[i] != '\0'; i++)
{
if( name[i] == '\\')
{
sep[k][j] = '\0';
j = 0;
k++;
}
else
{
sep[k][j] = name[i];
j++;
}
}
sep[k][j] = '\0';
for(i = 0; i <= k ; i++)
{
printf("%s\n",sep[i]);
}
for( int n = 0 ; n < 100 ; n++ )
{
free(sep[n]);
}
return 0;
}
In your code,
scanf("%[^\n]s", name);
name is an uninitialized pointer. It does not point to any valid memory location. You need to allocate memory before you can use it.
The same goes for the elements of the sep array.
You can consider using an array for this purpose, or see the man page of malloc() if you want to stick to a pointer.
FWIW, using an uninitialized pointer leads to undefined behavior.
You must allocate space for your pointers to avoid undefined behaviour: you cannot use a pointer without initializing it.
int main() {
char *name = malloc(MAX_DIM_OF_NAME+1);
char *sep[100];
for (int i=0; i<100; i++)
sep[i] = malloc(MAX_DIM_OF_NAME+1);
....
You call scanf with an uninitialized name.
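Allocation is not the only way to give a pointer valid storage: each sep[k] can also point into a buffer the caller already owns. The sketch below takes that route, which is a different technique from the malloc-based answers above; the function name split_on_backslash and its max parameter are invented for the example.

```c
#include <string.h>

/* Hypothetical helper: splits `name` in place at each '\\' and stores up to
   `max` segment pointers in `sep`. Returns the number of segments stored.
   If `max` is reached, the last segment keeps any remaining backslashes. */
static int split_on_backslash(char *name, char *sep[], int max)
{
    int k = 0;
    if (max <= 0) {
        return 0;
    }

    sep[k++] = name;                      /* first segment starts the buffer */
    for (char *p = strchr(name, '\\');
         p != NULL && k < max;
         p = strchr(p + 1, '\\')) {
        *p = '\0';                        /* terminate the previous segment */
        sep[k++] = p + 1;                 /* next segment starts after it */
    }
    return k;
}
```

Every pointer stored in sep points inside name, so name must be a writable array (for example char name[256]) that outlives the segments, and nothing needs to be freed.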

Resources