Making an array of tokens and comparing it with another string - c

I have an input string from a console. That string might look like: show name year xxx.. and I need the output to look like this:
name: Adi
year: 1994 (for example)..
I have been trying to achieve this using the strtok() function, but I also need to compare every token with the allowed keywords (name, year...). If a word is not allowed, then the token needs to be skipped (deleted). For example, in this case it would skip show and xxx.
Another problem is that I need those tokens in the form of an array in order to work with them and with structs.
There should be no limit to the number of words that can be entered in the input.
I hope you understood what I asked. So, how do I make tokens from a string using strtok or something else and store them as arrays or pointers, and how do I compare those tokens with another string (for example a constant: #define NAME "name") and skip (delete) any other inputs?
I would really appreciate it if you could help me with this.. Thanks..

I would avoid the array. It provides unnecessary overhead. What you're asking for could be accomplished with something like this:
#include <string.h> /* strtok; stricmp is a non-standard extension (POSIX has strcasecmp in <strings.h>) */
void parseString(char * string) {
    char * name = NULL;
    char * year = NULL;
    char * ptr = strtok(string, " ");
    while (ptr != NULL) {
        if (stricmp(ptr, "name") == 0) {
            ptr = strtok(NULL, " "); /* NULL means: keep scanning the same string */
            name = ptr;
            /* do whatever with name */
        } else if (stricmp(ptr, "year") == 0) {
            ptr = strtok(NULL, " ");
            year = ptr;
            /* do whatever with year */
        } /* else if ... */
        ptr = strtok(NULL, " ");
    }
}
This gives you a fair amount of flexibility. You check all the terms you need, you don't need to worry about how to allocate the array, and you can access values for settings if necessary.
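If the tokens are still wanted in an array, the keyword check can be driven by a small table. This is only a sketch: the names filter_keywords, is_allowed and allowed are made up for illustration, the keyword list is assumed from the question, and strcmp is used, so matching is case-sensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Allowed keywords; this list is an assumption based on the question. */
static const char *const allowed[] = { "name", "year" };

/* Returns 1 if word is one of the allowed keywords, 0 otherwise. */
static int is_allowed(const char *word)
{
    for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++) {
        if (strcmp(word, allowed[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Splits input on spaces (modifying it, as strtok does) and stores pointers
 * to the allowed tokens in out[0..max-1]. Returns how many were stored.
 */
size_t filter_keywords(char *input, char *out[], size_t max)
{
    assert(input != NULL && out != NULL);
    size_t n = 0;
    for (char *tok = strtok(input, " "); tok != NULL && n < max;
         tok = strtok(NULL, " ")) {
        if (is_allowed(tok))
            out[n++] = tok;
    }
    return n;
}
```

With char input[] = "show name year xxx"; and a char *out[8], filter_keywords(input, out, 8) returns 2, leaving out[0] pointing at "name" and out[1] at "year". The fixed max is a simplification; a growable array would be needed for truly unlimited input.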

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <stdbool.h>
char *tolowerstr(char *str){
char *p = str;
for(; *p; ++p) *p = (char)tolower((unsigned char)*p);//modifying and reading p in one expression is undefined behavior
return str;
}
int cmp(const void *a, const void *b){
return strcmp(*(const char **)a, *(const char **)b);
}
bool isBanWord(const char *word){
static const char *table[] =
{ "fuck", "show", "xxx" };//sorted
char **ret, *key;
key = tolowerstr(strdup(word));
ret=bsearch(&key, table, sizeof(table)/sizeof(*table), sizeof(*table), cmp);
free(key);
return !!ret;//ret != NULL ? true : false;
}
//Create and return as a dynamic array of pointer to the copy of the word from a string.
//String passed is destroyed.
char **strToWords(char *str, size_t *size){
const char *delimiters = " .";
size_t count=0;
char **array = malloc((strlen(str)/2 + 2)*sizeof(char*));//at most (length+1)/2 words, plus the NULL end mark
if(array){
char *token=strtok(str, delimiters);
for(; token ;token=strtok(NULL, delimiters)){
if(!isBanWord(token))//skip ban word
array[count++] = strdup(token);
}
array[count] = NULL;//End mark
array=realloc(array, (count + 1)*sizeof(*array));//include NULL
}
*size = count;
return array;
}
typedef struct words {
char **words;
size_t n; //number of words
} Words;
void clearWords(Words *w){
size_t i;
for(i=0;i < w->n;++i)
free(w->words[i]);
free(w->words);
w->words = NULL;
w->n = 0;
}
void printWords(Words *w){
size_t i=0;
while(i < w->n){
printf("%s", w->words[i++]);
if(w->words[i])
putchar(' ');
}
putchar('\n');
}
int main(){//DEMO
char sentence[] = "show name year xxx.";//input string. Will be destroyed.
Words w;
w.words = strToWords(sentence, &w.n);
printWords(&w);//name year
clearWords(&w);
return 0;
}

Related

Is there an easy way to remove specific chars from a char*?

char * deleteChars = "\"\'.“”‘’?:;-,—*($%)! \t\n\x0A\r"
I have this and i'm trying to remove any of these from a given char*. I'm not sure how I would go about comparing a char* to it.
For example if the char* is equal to "hello," how would I go about removing that comma with my deleteChars?
So far I have
void removeChar(char*p, char*delim){
char*holder = p;
while(*p){
if(!(*p==*delim++)){
*holder++=*p;
p++;
}
}
*holder = '\0';
A simple one-by-one approach:
You can use strchr to decide if the character is present in the deletion set. You then assign back into the buffer at the next unassigned position, only if not a filtered character.
It might be easier to understand this using two indices, instead of using pointer arithmetic.
#include <stdio.h>
#include <string.h>
void remove_characters(char *from, const char *set)
{
size_t i = 0, j = 0;
while (from[i]) {
if (!strchr(set, from[i]))
from[j++] = from[i];
i++;
}
from[j] = 0;
}
int main(void) {
const char *del = "\"\'.“”‘’?:;-,—*($%)! \t\n\x0A\r";
char buf[] = "hello, world!";
remove_characters(buf, del);
puts(buf);
}
stdout:
helloworld
If you've several delimiters/characters to ignore, it's better to use a look-up table.
void remove_chars (char* str, const char* delims)
{
if (!str || !delims) return;
char* ans = str;
int dlt[256] = {0};
while (*delims)
dlt[(unsigned char)*delims++] = 1;
while (*str) {
if (dlt[(unsigned char)*str])
++str; // skip it
else //if (str != ans)
*ans++ = *str++;
}
*ans = '\0';
}
You could do a double loop, but depending on what you want to treat, it might not be ideal. And since you are FOR SURE shrinking the string you don't need to malloc (provided it was already malloced). I'd initialize a table like this.
#include <string.h>
...
char del[256];
memset(del, 0, 256 * sizeof(char));
for (int i = 0; deleteChars[i]; i++) del[(unsigned char)deleteChars[i]] = 1; // cast: plain char may be negative
Then in a function:
void delChars(char *del, char *string) {
int i, offset;
for (i = 0, offset = 0; string[i]; i++) {
string[i - offset] = string[i];
if (del[(unsigned char)string[i]]) offset++;
}
string[i - offset] = 0;
}
This will not work on string literals (that you initialize with char* x = "") though because you'd end up writing in program memory, and probably segfault. I'm sure you can tweak it if that's your need. (Just do something like char *newString = malloc(strlen(string) + 1); newString[i - offset] = string[i])
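A rough sketch of that copying variant, for the case where the input must stay untouched (for example a string literal). The function name copy_without is hypothetical, and strchr is used instead of the table just to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of src with every character found in set
 * removed, or NULL if allocation fails. The caller frees the result.
 * src is not modified, so passing a string literal is fine.
 */
char *copy_without(const char *src, const char *set)
{
    assert(src != NULL && set != NULL);
    char *copy = malloc(strlen(src) + 1); /* result is never longer than src */
    if (copy == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        if (strchr(set, src[i]) == NULL)
            copy[j++] = src[i];
    }
    copy[j] = '\0';
    return copy;
}
```

For example, copy_without("hello, world!", ",!") yields a fresh buffer holding "hello world", which the caller releases with free.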
Apply strchr(delim, p[i]) to each element in p[].
Let us take advantage of the fact that strchr(delim, 0) always returns a non-NULL pointer to eliminate the null character test on every iteration.
void removeChar(char *p, char *delim) {
size_t out = 0;
for (size_t in = 0; /* empty */; in++) {
// p[in] in the delim set?
if (strchr(delim, p[in])) {
if (p[in] == '\0') {
break;
}
} else {
p[out++] = p[in];
}
}
p[out] = '\0';
}
Variation on @Oka's good answer.
A better way: return the string with the unwanted characters removed.
#include <string.h>
char * remove_chars(char * str, const char * delim) {
for ( char * p = strpbrk(str, delim); p; p = strpbrk(p, delim) )
memmove(p, p + 1, strlen(p));
return str;
}

C: Splitting a string into two strings, and returning a 2 - element array

I am trying to write a method that takes a string and splits it into two strings based on a delimiter string, similar to .split in Java:
char * split(char *tosplit, char *culprit) {
char *couple[2] = {"", ""};
int i = 0;
// Returns first token
char *token = strtok(tosplit, culprit);
while (token != NULL && i < 2) {
couple[i++] = token;
token = strtok(NULL, culprit);
}
return couple;
}
But I keep getting the Warnings:
In function ‘split’:
warning: return from incompatible pointer type [-Wincompatible-pointer-types]
return couple;
^~~~~~
warning: function returns address of local variable [-Wreturn-local-addr]
... and of course the method doesn't work as I hoped.
What am I doing wrong?
EDIT: I am also open to other ways of doing this besides using strtok().
A few things:
First, the function is declared to return a pointer to a (sequence of) character(s), i.e. a char *, but what you return is a pointer to a (sequence of) pointer(s) to char. Hence, the return type should be char **.
Second, you return the address of a local variable, which - once the function has finished - goes out of scope and must not be accessed afterwards.
Third, you define an array of 2 pointers, so the i < 2 check in your while-loop is essential; without it the loop could write beyond these bounds.
If you really want to split into two strings, the following method should work:
char ** split(char *tosplit, char *culprit) {
static char *couple[2];
if ((couple[0] = strtok(tosplit, culprit)) != NULL) {
couple[1] = strtok(NULL, culprit);
}
return couple;
}
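One caveat with the static array is that every call overwrites the same two slots, so a second call invalidates the earlier result. A possible alternative is to return the pair by value. The struct pair type and the split_pair name below are invented for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type: two pointers into the caller's buffer. */
struct pair {
    char *first;
    char *second;
};

/*
 * Tokenizes tosplit in place with strtok and returns the first two tokens.
 * Either member is NULL if there is no such token.
 */
struct pair split_pair(char *tosplit, const char *delims)
{
    assert(tosplit != NULL && delims != NULL);
    struct pair result = { NULL, NULL };
    result.first = strtok(tosplit, delims);
    if (result.first != NULL)
        result.second = strtok(NULL, delims);
    return result;
}
```

Given char s[] = "Hello World";, split_pair(s, " ") returns first pointing at "Hello" and second at "World". The pointers still refer to s, so s must outlive the result.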
I'd caution your use of strtok, it probably does not do what you want it to. If you think it does anything like a Java split, read the man page and then re-read it again seven times. It is literally tokenizing the string based on any of the values in delim.
I think you are looking for something like this:
#include <stdio.h>
#include <string.h>
char* split( char* s, char* delim ) {
char* needle = strstr(s, delim);
if (!needle)
return NULL;
needle[0] = 0;
return needle + strlen(delim);
}
int main() {
char s[] = "Fluffy furry Bunnies!";
char* res = split(s, "furry ");
printf("%s%s\n", s, res );
}
Which prints out "Fluffy Bunnies!".
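To make the strtok caveat concrete, here is a small sketch that counts pieces both ways. The helper names count_strtok_tokens and count_substring_pieces are hypothetical; the point is only that strtok treats its second argument as a set of single characters, while strstr matches the whole substring.

```c
#include <assert.h>
#include <string.h>

/* Counts the tokens strtok produces; modifies s in the process. */
size_t count_strtok_tokens(char *s, const char *delims)
{
    assert(s != NULL && delims != NULL);
    size_t n = 0;
    for (char *tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}

/* Counts the pieces obtained by splitting s on the whole substring sep. */
size_t count_substring_pieces(const char *s, const char *sep)
{
    assert(s != NULL && sep != NULL && sep[0] != '\0');
    size_t n = 1;
    for (const char *hit = strstr(s, sep); hit != NULL;
         hit = strstr(hit + strlen(sep), sep))
        n++;
    return n;
}
```

For the input "a, b,c" and the delimiter ", ", strtok splits on every comma and every space and finds 3 tokens, whereas splitting on the substring ", " gives only 2 pieces ("a" and "b,c").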
First of all, strtok modifies the memory of tosplit, so be certain that that's what you wish to do. If so, then consider this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* NOTE: unsafe (and leaky) implementation using strtok
*
* Works on a strdup'd copy of src, so src itself is not modified.
* Returns:
* allocated, NULL-terminated array of tokens that you must free yourself
* (the strdup'd copy the tokens point into is never freed, hence "leaky")
*/
char **__split(char *src, const char *delim)
{
size_t idx = 0;
char *next;
char **dest = NULL;
do {
dest = realloc(dest, (idx + 1)* sizeof(char *));
next = strtok(idx > 0 ? NULL:strdup(src), delim);
dest[idx++] = next;
} while(next);
return dest;
}
int main() {
int x = 0;
char **here = NULL;
here = __split("hello,there,how,,are,you?", ",");
while(here[x]) {
printf("here: %s\n", here[x]);
x++;
}
}
You can implement a much safer and non leaky version (note the strdup) of this but hopefully this is a good start.
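One possible shape for such a non-leaky version is to keep the copied buffer next to the token array so both can be released together. The struct tokens type and the tokens_split / tokens_free names are assumptions made for this sketch, not part of the answer above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the token pointers plus the buffer they point into. */
struct tokens {
    char *buffer;   /* private copy of the input, owned by this struct */
    char **items;   /* NULL-terminated array of pointers into buffer */
    size_t count;   /* number of tokens, not counting the NULL */
};

/* Frees everything tokens_split allocated and resets the struct. */
void tokens_free(struct tokens *t)
{
    free(t->items);
    free(t->buffer);
    t->items = NULL;
    t->buffer = NULL;
    t->count = 0;
}

/* Returns 0 on success, -1 on allocation failure (nothing left allocated). */
int tokens_split(struct tokens *t, const char *src, const char *delims)
{
    assert(t != NULL && src != NULL && delims != NULL);
    t->items = NULL;
    t->count = 0;
    t->buffer = malloc(strlen(src) + 1);
    if (t->buffer == NULL)
        return -1;
    strcpy(t->buffer, src);

    char *tok = strtok(t->buffer, delims);
    for (;;) {
        char **grown = realloc(t->items, (t->count + 1) * sizeof *t->items);
        if (grown == NULL) {
            tokens_free(t);
            return -1;
        }
        t->items = grown;
        t->items[t->count] = tok;   /* the final entry stored is NULL */
        if (tok == NULL)
            return 0;
        t->count++;
        tok = strtok(NULL, delims);
    }
}
```

Splitting "hello,there,how,,are,you?" on "," this way gives 5 tokens (strtok skips the empty field), and a single tokens_free call releases both the array and the copied string.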
The type of couple is char** but you have defined the function return type as char*. Furthermore you are returning the pointer to a local variable. You need to pass the pointer array into the function from the caller. For example:
#include <stdio.h>
#include <string.h>
char** split( char** couple, char* tosplit, char* culprit )
{
// Returns first token
char *token = strtok( tosplit, culprit);
for( int i = 0; token != NULL && i < 2; i++ )
{
couple[i] = token;
token = strtok(NULL, culprit);
}
return couple;
}
int main()
{
char* couple[2] = {"", ""};
char tosplit[] = "Hello World" ;
char** strings = split( couple, tosplit, " " ) ;
printf( "%s, %s", strings[0], strings[1] ) ;
return 0;
}

C: Take parts from a string without a delimiter (using strstr)

I have a string, for example: "Error_*_code_break_*_505_*_7.8"
I need to split the string with a loop by the delimiter "_*_" using the strstr function and input all parts into a new array, let's call it -
char *elements[4] = {"Error", "code_break", "505", "7.8"}
but strstr only gives me a pointer to a char, any help?
Note: the second string "code_break" should still contain "_", or in any other case.
This will get you half-way there. This program prints the split pieces of the string to the standard output; it does not make an array, but maybe you can add that yourself.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void split(const char * str, const char * delimiter)
{
char * writable_str = strdup(str);
if (writable_str == NULL) { return; }
char * remaining = writable_str;
while (1)
{
char * ending = strstr(remaining, delimiter);
if (ending != NULL) { *ending = 0; }
printf("%s\n", remaining);
if (ending == NULL) { break; }
remaining = ending + strlen(delimiter);
}
free(writable_str);
}
int main(void) {
const char * str = "Error_*_code_break_*_505_*_7.8";
const char * delimiter = "_*_";
split(str, delimiter);
return 0;
}
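For the missing half, one way to collect the pieces into an array without modifying the input is to copy each piece as it is found. This is a sketch under assumptions: split_copy and free_pieces are hypothetical names, the array grows with realloc, and the result is NULL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Frees an array returned by split_copy. */
void free_pieces(char **pieces)
{
    if (pieces == NULL)
        return;
    for (size_t i = 0; pieces[i] != NULL; i++)
        free(pieces[i]);
    free(pieces);
}

/*
 * Splits str on the whole substring delimiter without modifying str.
 * Returns a NULL-terminated array of malloc'd copies, or NULL on failure.
 */
char **split_copy(const char *str, const char *delimiter)
{
    assert(str != NULL && delimiter != NULL && delimiter[0] != '\0');
    size_t dlen = strlen(delimiter), count = 0;
    char **pieces = malloc(sizeof *pieces);
    if (pieces == NULL)
        return NULL;
    pieces[0] = NULL;

    for (;;) {
        const char *end = strstr(str, delimiter);
        size_t len = end ? (size_t)(end - str) : strlen(str);

        /* Room for the new piece plus the terminating NULL. */
        char **grown = realloc(pieces, (count + 2) * sizeof *pieces);
        char *copy = malloc(len + 1);
        if (grown != NULL)
            pieces = grown;
        if (grown == NULL || copy == NULL) {
            free(copy);
            free_pieces(pieces);
            return NULL;
        }
        memcpy(copy, str, len);
        copy[len] = '\0';
        pieces[count++] = copy;
        pieces[count] = NULL;

        if (end == NULL)
            return pieces;
        str = end + dlen;
    }
}
```

Called as split_copy("Error_*_code_break_*_505_*_7.8", "_*_"), it returns the four strings "Error", "code_break", "505" and "7.8" followed by NULL; free_pieces releases them.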
Here is a function that splits a string into an array. You have to pass the size of the array so that the function won't overfill it. It returns the number of things it put into the array. What it puts into the array is a pointer into the string that was passed. It modifies the string by inserting null characters to end the pieces - just like strtok does.
#include<string.h>
#include<stdio.h>
int split(char *string, char *delimiter, char* array[], int size)
{
int count=0;
char *current=string;
char *next;
while(current && *current!='\0')
{
next=strstr(current,delimiter);
if(!next)break;
*next='\0';
if(count<size) array[count++]=current;
current=next+strlen(delimiter);
}
if(count<size) array[count++]=current;
return count;
}
int main()
{
char string[100]="Error_*_code_break_*_505_*_7.8";
char *array[10];
int size=split(string,"_*_",array,10);
for(int i=0;i<size;i++) puts(array[i]);
return size;
}

How do I create a function in C that allows me to split a string based on a delimiter into an array?

I want to create a function in C, so that I can pass the function a string, and a delimiter, and it will return to me an array with the parts of the string split up based on the delimiter. Commonly used to separate a sentence into words.
e.g.: "hello world foo" -> ["hello", "world", "foo"]
However, I'm new to C and a lot of the pointer things are confusing me. I got an answer mostly from this question, but it does it inline, so when I try to separate it into a function the logistics of the pointers are confusing me:
This is what I have so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void split_string(char string[], char *delimiter, char ***result) {
char *p = strtok(string, delimiter);
int i, num_spaces = 0;
while (p != NULL) {
num_spaces++;
&result = realloc(&result, sizeof(char *) * num_spaces);
if (&result == NULL) {
printf("Memory reallocation failed.");
exit(-1);
}
&result[num_spaces - 1] = p;
p = strtok(NULL, " ");
}
// Add the null pointer to the end of our array
&result = realloc(split_array, sizeof(char *) * num_spaces + 1);
&result[num_spaces] = 0;
for (i = 0; i < num_spaces; i++) {
printf("%s\n", &result[i]);
}
free(&result);
}
int main(int argc, char *argv[]) {
char str[] = "hello world 1 foo";
char **split_array = NULL;
split_string(str, " ", &split_array);
return 0;
}
The gist of it being that I have a function that accepts a string, accepts a delimiter and accepts a pointer to where to save the result. Then it constructs the result. The variable for the result starts out as NULL and without memory, but I gradually reallocate memory for it as needed.
But I'm really confused as to the pointers, like I said. I know my result is of type char **, as a string is of type char * and there are many of them, so you need pointers to each. But then I'm supposed to pass the location of that char ** to the new function, right, so it becomes a char ***? When I try to access it with &, though, it doesn't seem to like it.
I feel like I'm missing something fundamental here, I'd really appreciate insight into what is going wrong with the code.
You are confusing dereferencing with addressing (which is the complete opposite). Btw, I couldn't find split_array anywhere in the function, as it is declared down in main. Even if you had the dereferencing and addressing correct, this would still have other issues.
I'm fairly sure you're trying to do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void split_string(char string[], const char *delimiter, char ***result)
{
char *p = strtok(string, delimiter);
void *tmp = NULL;
int count=0;
*result = NULL;
while (p != NULL)
{
tmp = realloc(*result, (count+1)*sizeof **result);
if (tmp)
{
*result = tmp;
(*result)[count++] = p;
}
else
{ // failed to expand
perror("Failed to expand result array");
exit(EXIT_FAILURE);
}
p = strtok(NULL, delimiter);
}
// add null pointer
tmp = realloc(*result, (count+1)*sizeof(**result));
if (tmp)
{
*result = tmp;
(*result)[count] = NULL;
}
else
{
perror("Failed to expand result array");
exit(EXIT_FAILURE);
}
}
int main()
{
char str[] = "hello world 1 foo", **toks = NULL;
char **it;
split_string(str, " ", &toks);
for (it = toks; *it; ++it)
printf("%s\n", *it);
free(toks);
}
Output
hello
world
1
foo
Honestly this would be cleaner if the function result were utilized rather than an in/out parameter, but you chose the latter, so there you go.
Best of luck.
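For comparison, a sketch of the return-value variant mentioned above. The name split_to_array is made up here, and returning NULL on allocation failure (instead of exiting) is a design choice of this sketch rather than of the answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Splits string in place with strtok and returns a malloc'd, NULL-terminated
 * array of pointers into it, or NULL on allocation failure. The caller frees
 * the array only; the tokens live inside string.
 */
char **split_to_array(char *string, const char *delimiter)
{
    assert(string != NULL && delimiter != NULL);
    char **result = NULL;
    size_t count = 0;

    for (char *p = strtok(string, delimiter); ; p = strtok(NULL, delimiter)) {
        char **tmp = realloc(result, (count + 1) * sizeof *result);
        if (tmp == NULL) {
            free(result);
            return NULL;
        }
        result = tmp;
        result[count++] = p;   /* stores the terminating NULL on the last pass */
        if (p == NULL)
            break;
    }
    return result;
}
```

The caller then writes char **toks = split_to_array(str, " "); and needs only two levels of indirection, with no & or *** in sight.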

Null terminating char pointer

I am a complete newbie in C.
I am trying to write a simple C function that will split a string (char array).
The following code doesn't work properly because I don't know how to terminate a char array within the array. There are two char pointers passed to the function. One contains the original constant char array to be split, and the other is a multidimensional array that will store each split part in a separate char array.
While writing the function I obviously ran into a lot of hassle, mainly due to my lack of C experience.
I think what I cannot achieve in this function is terminating each individual array with '\0'.
Here is the code:
void splitNameCode(char *code, char *output);
void splitNameCode(char *code, char *output){
int OS = 0; //output string number
int loop;
size_t s = 1;
for (loop = 0; code[loop]; loop++){
if (code[loop] == ':'){
output[OS] = '\0'; // I want to terminate each array in the array
OS ++;
}else {
if (!output[OS]) {
strncpy(&output[OS], &code[loop], s);
}else {
strncat(&output[OS], &code[loop], s);
}
}
}
}
int main (int argc, const char * argv[]) {
char output[3][15];
char str[] = "andy:james:john:amy";
splitNameCode(str, *output);
for (int loop = 0; loop<4; loop++) {
printf("%s\n", output[loop]);
}
return 0;
}
Here is a working program for you. Let me know if you need any explanation.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
void splitNameCode(char *code, char **output) {
int i = 0;
char* token = strtok(code, ":");
while (token != NULL) {
output[i++] = token;
token = strtok(NULL, ":");
}
}
int main (int argc, const char *argv[]) {
char* output[4];
char input[] = "andy:james:john:amy";
splitNameCode(input, output);
for (int i = 0; i < 4; i++) {
printf("%s\n", output[i]);
}
return 0;
}
If I understand your intent correctly, you are trying to take a string like andy:james:john:amy and arrive at andy\0james\0john\0amy. If this is the case, then your code can be simplified significantly:
void splitNameCode(char *code, char *output){
int loop;
strncpy(output, code, strlen(code) + 1); // destination first; + 1 copies the terminating '\0'
for (loop = 0; output[loop]; loop++){
if (output[loop] == ':'){
output[loop] = '\0'; // I want to terminate each array in the array
}
}
}
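Once the buffer has the andy\0james\0john\0amy layout, the individual pieces still have to be located. A possible sketch, assuming output is large enough to hold a copy of code; the names split_in_buffer and piece_at are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Copies code into output (which must hold at least strlen(code) + 1 bytes),
 * replaces every ':' with '\0', and returns the number of pieces.
 */
size_t split_in_buffer(const char *code, char *output)
{
    assert(code != NULL && output != NULL);
    size_t pieces = 1;
    strcpy(output, code);
    for (size_t i = 0; output[i] != '\0'; i++) {
        if (output[i] == ':') {
            output[i] = '\0';
            pieces++;
        }
    }
    return pieces;
}

/* Returns the index-th piece (0-based) of a buffer filled by split_in_buffer. */
const char *piece_at(const char *output, size_t index)
{
    while (index-- > 0)
        output += strlen(output) + 1;   /* step over one piece and its '\0' */
    return output;
}
```

For "andy:james:john:amy" this reports 4 pieces, and piece_at(output, 2) points at "john". The caller must not ask for an index at or beyond the returned count, since nothing in the buffer marks where the pieces end.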
