c splitting a char* into an char** - c

I'm reading in a line from a file (char by char, using fgetc()), where all fields(firstname, lastname, ...) are seperated by an ;. What I now want to do is create a char**, add all the chars to that and replace the ; by \0 so that I effectively get a list of all the fields.
Is that actually possibly? And when I create a char**, e.g. char ** buf = malloc(80) can I treat it like a one dimensional array? If the memory returned by malloc contiguous?
EDIT
Sry, meant to replace ; by \0, bot \n.
EDIT 2
This code should demonstrate what I intend to do (may clarify things a little):
int length = 80; // initial length char *buf = malloc(length); int num_chars = 0;
// handle malloc returning NULL
// suppose we already have a FILE pointer fp for (int ch = fgetc(fp); ch != EOF && ferror(fp) == 0; ch = fgetc(fp)) {
if (length == size) { // expand array if necessary
length += 80;
buf = realloc(buf, length);
// handle realloc failure
}
if (ch == ';') {
buf[num_chars] = '\0'; // replace delimiter by null-terminator
} else {
buf[num_chars] = ch; // else put char in buf
} }
// resize the buffer to chars read buf
= realloc(buf, num_chars);
// now comes the part where I'm quite unsure if it works / is possible
char **list = (char **)buf; // and now I should be able to handle it like a list of strings?

Yes, it's possible. Yes, you can treat it as a one dimensional array. Yes, the memory is contiguous.
But you probably want:
char ** fields = malloc(sizeof(char *) * numFields);
Then you can do:
// assuming `field` is a `char *`
fields[i++] = field;

It's not exactly possible as you describe it, but something similar is possible.
More specifically, the last line of your example char **list = (char **)buf; won't do what you believe. char **list means items pointed by *list will be of the type char*, but the content of *buf are chars. Hence it won't be possible to change one into another.
You should understand that a char ** is just a pointer, it can hold nothing by itself. But the char ** can be the start address of an array of char *, each char * pointing to a field.
You will have to allocate space for the char * array using a first malloc (or you also can use a static array of char * instead of a char**. You also will have to find space for every individual field, probably performing a malloc for each field and copying it from the initial buffer. If the initial buffer is untouched after reading, it would also be possible to use it to keep the field names, but if you just replace the initial ; by a \n, you'll be missing the trailing 0 string terminator, you can't replace inplace one char by two char.
Below is a simplified example of what can be done (removed malloc part as it does not add much to the example, and makes it uselessly complex, that's another story):
#include <stdio.h>
#include <string.h>
main(){
// let's define a buffer with space enough for 100 chars
// no need to perform dynamic allocation for a small example like this
char buf[100];
const char * data = "one;two;three;four;";
// now we want an array of pointer to fields,
// here again we use a static buffer for simplicity,
// instead of a char** with a malloc
char * fields[10];
int nbfields = 0;
int i = 0;
char * start_of_field;
// let's initialize buf with some data,
// a semi-colon terminated list of strings
strcpy(buf, data);
start_of_field = buf;
// seek end of each field and
// copy addresses of field to fields (array of char*)
for (i = 0; buf[i] != 0; i++){
if (buf[i] == ';'){
buf[i] = 0;
fields[nbfields] = start_of_field;
nbfields++;
start_of_field = buf+i+1;
}
}
// Now I can address fields individually
printf("total number of fields = %d\n"
"third field is = \"%s\"\n",
nbfields, fields[2]);
}

I now want to do is create a char**, add all the chars to that
You would allocate an array of char (char*) to store the characters, not an array of char* (char**).
And when I create a char*, e.g. char * buf = malloc(80) can I treat it like a one dimensional array? If the memory returned by malloc contiguous?
Yes, malloc always returns continguous blocks. Yes, you can treat it as single-dimensional an array of char* (with 80/sizeof char* elements). But you'll need to seperately allocate memory for each string you store in this array, or create another block for storing the string data, then use your first array to store pointers into that block.
(or something else; lots of ways to skin this cat)

It may be that Helper Method wants each of the "fields" written to a nul-terminated string. I cannot tell. You can do that with strtok(). However strtok() has problems - it destroys the original string "fed" to it.
#define SPLIT_ARRSZ 20 /* max possible number of fields */
char **
split( char *result[], char *w, const char *delim)
{
int i=0;
char *p=NULL;
for(i=0, result[0]=NULL, p=strtok(w, delim); p!=NULL; p=strtok(NULL, delim), i++ )
{
result[i]=p;
result[i+1]=NULL;
}
return result;
}
void
usage_for_split(void)
{
char *result[SPLIT_ARRSZ]={NULL};
char str[]="1;2;3;4;5;6;7;8;9;This is the last field\n";
char *w=strdup(str);
int i=0;
split(result, w, ";\n");
for(i=0; result[i]!=NULL; i++)
printf("Field #%d = '%s'\n", i, result[i]);
free(w);
}

Related

How can I implement a function to concatenate to a char* and not char array?

How can I implement a function that will concatenate something to a char* (not char array)?
Example of what I want:
#include <stdio.h>
#include <string.h>
int main() {
char* current_line;
char temp[1];
sprintf(temp, "%c", 'A');
// This causes a seg fault. I of course don't want that, so how do I get this to work properly?
strcat(current_line, temp);
return 0;
}
How can I fix this to work properly (and please, tell me if I need to add anything to my question or point me in the right direction because I couldn't find anything)?
Edit: I made this but it seg faults
char* charpointercat(char* mystr, char* toconcat) {
char ret[strlen(mystr) + 1];
for(int i = 0; mystr[i] != '\0'; i++) {
ret[i] = mystr[i];
}
return ret;
}
You have 3 problems:
You do not allocate memory for current_line at all!
You do not allocate enough memory for temp.
You return a pointer to a local variable from charpointercat.
The first one should be obvious, and was explained in comments:
char *current_line only holds a pointer to some bytes, but you need to allocate actual bytes if you want to store something with a function like stracat.
For the secoond one, note that sprintf(temp, "%c", 'A'); needs at least char temp[2] as it will use one byte for the "A", and one byte for terminating null character.
Since sprintf does not know how big temp is, it writes beyond it and that is how you get the segfault.
As for your charpointercat once the function exits, ret no longer exists.
To be more precise:
An array in C is represented by a pointer (a memory address) of its first item (cell).
So, the line return ret; does not return a copy of all the bytes in ret but only a pointer to the first byte.
But that memory address is only valid inside charpointercat function.
Once you try to use it outside, it is "undefined behavior", so anything can happen, including segfault.
There are two ways to fix this:
Learn how to use malloc and allocate memory on the heap.
Pass in a third array to the function so it can store the result there (same way you do with sprintf).
From the first code you posted it seems like you want to concatenate a char to the end of a string... This code will return a new string that consists of the first one followed by the second, it wont change the parameter.
char* charpointercat(char* mystr, char toconcat) {
char *ret = (char*) malloc(sizeof(char)*(strlen(mystr) + 2));
int i;
for(i = 0; mystr[i] != '\0'; i++) {
ret[i] = mystr[i];
}
ret[i] = toconcat;
ret[i + 1] = '\0';
return ret;
}
This should work:
char* charpointercat(char* mystr, char* toconcat) {
size_t l1,l2;
//Get lengths of strings
l1=strlen(mystr);
l2=strlen(toconcat);
//Allocate enough memory for both
char * ret=malloc(l1+l2+1);
strcpy(ret,mystr);
strcat(ret,toconcat);
//Add null terminator
ret[l1+l2]='\0';
return ret;
}
int main(){
char * p=charpointercat("Hello","World");
printf("%s",p);
//Free the memory
free(p);
}

how do I assign individual string to an element in one array of pointer?

I am new to C and still trying to figure out pointer.
So here is a task I am stuck in: I would like to assign 10 fruit names to a pointer of array and print them out one by one. Below is my code;
#include <stdio.h>
#include <string.h>
int main(){
char *arr_of_ptrs[10];
char buffer[20];
int i;
for (i=0;i<10;i++){
printf("Please type in a fruit name:");
fgets(buffer,20,stdin);
arr_of_ptrs[i]= *buffer;
}
int j;
for (j=0;j<10;j++){
printf("%s",*(arr_of_ptrs+j));
}
}
However after execution this, it only shows me the last result for all 10 responses. I tried to consult similar questions others asked but no luck.
My understanding is that
1) pointer of array has been allocated memory with [10] so malloc() is not needed.
2) buffer stores the pointer to each individual answer therefore I dereference it and assign it to the arr_of_ptrs[i]
I am unsure if arr_of_ptrs[i] gets me a pointer or a value. I thought it is definitely a pointer but I deference it with * the code and assign it to *buffer, program would get stuck.
If someone could point out my problem that would be great.
Thanks in advance
Three erros, 1. You must allocate memory for elements of elements of arr_of_ptrs, now you only allocate the memory for elements of arr_of_ptrs on stack memory. 2. arr_of_ptrs[i]= *buffer; means all of the arr_of_ptrs elements are pointed to same memory address, which is the "buffer" pointer. So all the elements of arr_of_ptrs will be same to the last stdin input string. 3. subsequent fgets() call has potential problem, one of the explaination could be here
A quick fix could be,
#include <stdio.h>
#include <string.h>
int main(){
const int ARR_SIZE = 10, BUFFER_SIZE = 20;
char arr_of_ptrs[ARR_SIZE][BUFFER_SIZE];
char *pos;
int i, c;
for (i = 0; i < ARR_SIZE; ++i) {
printf ("Please type in a fruit name: ");
if (fgets (arr_of_ptrs[i], BUFFER_SIZE, stdin) == NULL) return -1;
if ((pos = strchr(arr_of_ptrs[i], '\n')))
*pos = 0;
else
while ((c = getchar()) != '\n' && c != EOF) {}
}
for (i = 0; i < ARR_SIZE; ++i)
printf("%s\n", arr_of_ptrs[i]);
return 0;
}
The misunderstanding is probably that "Dereferencing" an array of characters, unlike dereferencing a pointer to a primitive data type, does not create a copy of that array. Arrays cannot be copied using assignment operator =; There a separate functions for copying arrays (especially for 0-terminated arrays of char aka c-strings, and for allocating memory needed for the copy):
Compare a pointer to a primitive data type like int:
int x = 10;
int *ptr_x = &x;
int copy_of_x = *ptr_x; // dereferences a pointer to x, yielding the integer value 10
However:
char x[20] = "some text"; // array of characters, not a single character!
char *ptr_x = &x[0]; // pointer to the first element of x
char copy_of_first_char_of_x = *ptr_x; // copies the 's', not the entire string
Use:
char x[20] = "some text";
char *ptr_x = &x[0];
char *copy_of_x = malloc(strlen(ptr_x)+1); // allocate memory large enough to store the copy
strcpy(copy_of_x,ptr_x); // copy the string.
printf("%s",copy_of_x);
Output:
some text

Appending char to C array

I have a string declared as such:
char *mode_s = (char *)calloc(MODE_S_LEN, sizeof(char));
How can I add a char to the end of the array?
Lets assume " first available position " means at index 0.
char *mode_s = (char *)calloc(MODE_S_LEN, sizeof(char));
*mode_s='a';
To store a character at an arbitrary index n
*(mode_s+n)='b';
Use pointer algebra, as demonstrated above, which is equivalent to
mode_s[n]='b';
One sees that the first case simply means that n=0.
If you wish to eliminate incrementing the counter, as specified in the comment bellow, you can write a data structure and a supporting function that fits your needs. A simple one would be
typedef struct modeA{
int size;
int index;
char *mode_s;
}modeA;
The supporting function could be
int add(modeA* a, char toAdd){
if(a->size==a->index) return -1;
a->mode_s[index]=toAdd;
a->index++;
return 0;
}
It returns 0 when the add was successful, and -1 when one runs out of space.
Other functions you might need can be coded in a similar manner. Note that as C is not object oriented, the data structure has to be passed to the function as a parameter.
Finally you code code a function creating an instance
modeA genModeA(int size){
modeA tmp;
tmp.mode_s=(char *)calloc(size, sizeof(char));
tmp.size=size;
tmp.index=0;
return tmp;
}
Thus using it with no need to manually increment the counter
modeA tmp=genModeA(MODE_S_LEN);
add(&tmp,'c');
There is no standard function to concatenate a character to a string in C. You can easily define such a function:
#include <string.h>
char *strcatc(char *str, char c) {
size_t len = strlen(str);
str[len++] = c;
str[len] = '\0';
return str;
}
This function only works if str is allocated or defined with a larger size than its length + 1, ie if there is available space at its end. In your example, mode_s is allocated with a size of MODE_S_LEN, so you can put MODE_S_LEN-1 chars into it:
char *mode_s = calloc(MODE_S_LEN, sizeof(*mode_s));
for (int i = 0; i < MODE_S_LEN - 1; i++) {
strcatc(mode_s, 'X');
}
char newchar = 'a'; //or getch() from keyboard
//realloc memory:
char *mode_sNew = (char *)calloc(MODE_S_LEN + 1, sizeof(char));
//copy the str:
srncpy(mode_sNew, mode_s, MODE_S_LEN);
//put your char:
mode_sNew[MODE_S_LEN] = newchar;
//free old memory:
free(mode_s);
//reassign to the old string:
mode_s = mode_sNew;
//in a loop you can add as many characters as you want. You also can add more than one character at once, but assign only one in a new position

Parsing a string with multi-char delimiter and got junk data along with the result in the second call in c

I have writen a code to split the string with multiple char delimiter.
It is working fine for first time of calling to this function
but i calling it second time it retuns the correct word with some unwanted symbol.
I think this problem occurs because of not clearing the buffer.I have tried a lot but cant solve this. please help me to solve this problem.
char **split(char *phrase, char *delimiter) {
int i = 0;
char **arraylist= malloc(10 *sizeof(char *));
char *loc1=NULL;
char *loc=NULL;
loc1 = phrase;
while (loc1 != NULL) {
loc = strstr(loc1, delimiter);
if (loc == NULL) {
arraylist[i]=malloc(sizeof(loc1));
arraylist[i]=loc1;
break;
}
char *buf = malloc(sizeof(char) * 256); // memory for 256 char
int length = strlen(delimiter);
strncpy(buf, loc1, loc-loc1);
arraylist[i]=malloc(sizeof(buf));
arraylist[i]=buf;
i++;
loc = loc+length;
loc1 = loc;
}
return arraylist;
}
called this function first time
char **splitdetails = split("100000000<delimit>0<delimit>hellooo" , "<delimit>");
It gives
splitdetails[0]=100000000
splitdetails[1]=0
splitdetails[2]=hellooo
but i called this second time
char **splitdetails = split("20000000<delimit>10<delimit>testing" , "<delimit>");
splitdetails[0]=20000000��������������������������
splitdetails[1]=10����
splitdetails[2]=testing
Update:-
thanks to #fatelerror. i have change my code as
char** split(char *phrase, char *delimiter) {
int i = 0;
char **arraylist = malloc(10 *sizeof(char *));
char *loc1=NULL;
char *loc=NULL;
loc1 = phrase;
while (loc1 != NULL) {
loc = strstr(loc1, delimiter);
if (loc == NULL) {
arraylist[i]=malloc(strlen(loc1) + 1);
strcpy(arraylist[i], loc1);
break;
}
char *buf = malloc(sizeof(char) * 256); // memory for 256 char
int length = strlen(delimiter);
strncpy(buf, loc1, loc-loc1);
buf[loc - loc1] = '\0';
arraylist[i]=malloc(strlen(buf));
strcpy(arraylist[i], buf);
i++;
loc = loc+length;
loc1 = loc;
}
}
In the caller function, i used it as
char *id
char **splitdetails = split("20000000<delimit>10<delimit>testing" , "<delimit>");
id = splitdetails[0];
//some works done with id
//free the split details with this code.
for(int i=0;i<3;i++) {
free(domaindetails[i]);
}free(domaindetails);
domaindetails=NULL;
then i called the same for the second as,
char **splitdetails1= split("10000000<delimit>1000<delimit>testing1" , "<delimit>");
it makes error and i can't free the function.
thanks in advance.
Your problem boils down to three basic things:
sizeof is not strlen()
Assignment doesn't copy strings in C.
strncpy() doesn't always nul-terminate strings.
So, when you say something like:
arraylist[i]=malloc(sizeof(loc1));
arraylist[i]=loc1;
thisdoes not copy the string. The first one allocates the size of loc1, which is a char *. In other words, you allocated the size of a pointer. You want to allocate storage to store the string, i.e. using strlen():
arraylist[i]=malloc(strlen(loc1) + 1);
Note the + 1 as well, because you also need room for the nul-terminator. Then, to copy the string you want to use strcpy():
strcpy(arraylist[i], loc1);
The way you had it was just assigning a pointer to your old string (and in the process leaing the memory you had just allocated). It's also common to use strdup() which combines both of these steps, i.e.
arraylist[i] = strdup(loc1);
This is convenient but strdup() is not part of the official C library. You need to assess the portability needs of your code before you consider using it.
Additionally, with strncpy(), you should be aware that it does not always nul-terminate:
strncpy(buf, loc1, loc-loc1);
This copies less bytes than were in the original string and doesn't terminate buf. Thus, it's necessary to include a nul terminator yourself:
buf[loc - loc1] = '\0';
This is the root cause of what you are seeing with the garbage. Since you didn't nul terminate, C doesn't know where your string ends and so it keeps on reading whatever happens to be in memory.

strtok - how avoid new line to and put to array of strings?

if i dupe topic i really sorry, i searched for it with no result here.
I have code
void split(char* str, char* splitstr)
{
char* p;
char splitbuf[32];
int i=0;
p = strtok(str,",");
while(p!= NULL)
{
printf("%s", p);
sprintf(&splitstr[i],"%s",p);
i++;
p = strtok (NULL, ",");
}
}
How can i use proper sprintf to put the splited words by strtok to string array?
Can i somehow avoid breaklines created by strtok?
I am programming in ANSI C. I declared array splitstr and str the same way.
char* splitstr;//in main char splitstr[32];
Thanks for help.
edit:
i would like do something like this:
INPUT (it is a string) > "aa,bbb,ccc,ddd"
I declare: char tab[33];
OUTPUT (save all items to array of strings if it is even possible) >
tab[0] is "aa"
tab[1] is "bbb"
...
tab[3] is "ddd" but not "ddd(newline)"
edit2 [18:16]
I forgot add that the data string is from reading line of file. That's why i wrote about "ddd(newline)". I found after that the new line was also shown by strtok but as another item. By the way all answers are good to think over the problem. Few seconds ago my laptop has Broken (i dont know why the screen gone black) As soon as i take control over my pc i will check codes. :-)
Give this a shot:
#include <stdlib.h>
#include <string.h>
...
void split(char *str, char **splitstr)
{
char *p;
int i=0;
p = strtok(str,",");
while(p!= NULL)
{
printf("%s", p);
splitsr[i] = malloc(strlen(p) + 1);
if (splitstr[i])
strcpy(splitstr[i], p);
i++;
p = strtok (NULL, ",");
}
}
And then, in main:
#define MAX_LEN ... // max allowed length of input string, including 0 terminator
#define MAX_STR ... // max allowed number of substrings in input string
int main(void)
{
char input[MAX_LEN];
char *strs[MAX_STR] = {NULL};
...
split(input, strs);
...
}
Some explanations.
strs is defined in main as an array of pointer to char. Each array element will point to a string extracted from the input string. In main, all that's allocated is the array of pointers, with each element initially NULL; the memory for each element will be allocated within the split function using malloc, based on the length of the substring. Somewhere after you are finished with strs you will need to deallocate those pointers using free:
for (i = 0; i < MAX_STR; i++)
free(strs[i]);
Now, why is splitstr declared as char ** instead of char *[MAX_STR]? Except when it is the operand of the sizeof or & operators, or is a string literal being used to initialize another array in a declaration, an array expression will have its type implicitly converted from N-element array of T to pointer to T, and the value of the expression will be the location of the first element in the array.
When we call split:
split(input, strs);
the array expression input is implicitly converted from type char [MAX_LENGTH] to char * (T == char), and the array expression strs is implicitly converted from type char *[MAX_STRS] to char ** (T == char *). So splitstr receives pointer values for the two parameters, as opposed to array values.
If I understand correctly, you want to save the strings obtained by strtok. In that case you'll want to declare splitstr as char[MAX_LINES][32] and use strcpy, something like this:
strcpy(splitstr[i], p);
where i is the ith string read using strtok.
Note that I'm no ansi C expert, more C++ expert, but here's working code. It uses a fixed size array, which may or may not be an issue. To do anything else would require more complicated memory management:
/* Not sure about the declaration of splitstr here, and whether there's a better way.
char** isn't sufficient. */
int split(char* str, char splitstr[256][32])
{
char* p;
int i=0;
p = strtok(str,",");
while(p) {
strcpy(splitstr[i++], p);
p = strtok (NULL, ",");
}
return i;
}
int main(int argc, char* argv[])
{
char input[256];
char result[256][32];
strcpy(input, "aa,bbb,ccc,ddd");
int count = split(input, result);
for (int i=0; i<count; i++) {
printf("%s\n", result[i]);
}
printf("the end\n");
}
Note that I supply "aa,bbb,ccc,ddd" in and I get {"aa", "bbb", "ccc", "ddd" } out. No newlines on the result.

Resources