Parsing string into parameters and then concatenate some of them - c

I have a task to build a family binary tree. For this task I need to implement some commands (e.g. add, draw, list). First of all I need to tokenize standard input (for example "add Walburga [f] mother Sirius Black [m]") to find which commands, names, surnames, relationships and genders were entered.
I have done this with the strtok function, and now I have an array that contains every separate string from standard input as a parameter. But now I need to concatenate the name and surname (if it exists) into one parameter.

If the input will always have the format:
command fullname1 [gender] relation fullname2 [gender]
I suggest an approach that uses strchr() to find the first space and calls it command_end. Then find the first [ and call the position just before it (position of [ minus 1) fullname1_end, and so on.
Then you find the length of each token, for example with command_end - command_start, and strncpy() that many characters into an array, and so on.
So you would get something like this (I've used very verbose names to avoid comments):
int main(void)
{
char input[] = "add Walburga Granger [f] mother Sirius Black [m]";
char command[30];
char fullname1[30];
char gender1[4];
char rel[30];
char fullname2[30];
char gender2[4];
char* command_start = input;
char* command_end = strchr(input, ' '); // find the first whitespace of input
char* fullname1_start = command_end + 1;
char* gender1_start = strchr(input, '['); // find the first '[' of input
char* fullname1_end = gender1_start - 1;
char* gender1_end = strchr(gender1_start, ' '); // find the first space after gender1 and so on...
char* rel_start = gender1_end + 1;
char* rel_end = strchr(rel_start, ' ');
char* fullname2_start = rel_end + 1;
char* gender2_start = strchr(fullname2_start, '[');
char* gender2_end = strchr(gender2_start, '\0');
char* fullname2_end = gender2_start - 1;
int command_length = command_end - command_start;
strncpy(command, command_start, command_length);
command[command_length] = '\0';
int fullname1_length = fullname1_end - fullname1_start;
strncpy(fullname1, fullname1_start, fullname1_length);
fullname1[fullname1_length] = '\0';
printf("command: %s\n", command);
printf("fullname1: %s\n", fullname1);
}
Output:
command: add
fullname1: Walburga Granger
Another approach would be to iterate over the input, looking for those key characters along the way, like:
int main(void)
{
char input[] = "add Walburga Granger [f] mother Sirius Black [m]";
char command[30];
char name1[30];
char gender1[4];
char rel[30];
char name2[30];
char gender2[4];
int i = 0;
int j = 0;
// extract command
for (j = 0; i < strlen(input); i++, j++)
{
if (input[i] == ' ')
break;
command[j] = input[i];
}
command[j] = '\0';
i++;
// extract first fullname1
for (j = 0; i < strlen(input); i++, j++)
{
if (input[i] == '[')
break;
name1[j] = input[i];
}
name1[j - 1] = '\0';
// extract gender1
for (j = 0; i < strlen(input); i++, j++)
{
if (input[i] == ' ')
break;
gender1[j] = input[i];
}
gender1[j] = '\0';
and so on....
A third approach salvages the majority of your code: insert this snippet after you get your command tokens.
char fullname1[100];
char fullname2[100];
// build fullname1
strcpy(fullname1, commands[1]);
i = 2;
while (commands[i][0] != '[')
{
strcat(fullname1, " ");
strcat(fullname1, commands[i]);
i++;
}
// build fullname2
i += 2;
strcpy(fullname2, commands[i]);
i++;
while (commands[i][0] != '[')
{
strcat(fullname2, " ");
strcat(fullname2, commands[i]);
i++;
}
printf("%s\n", fullname1);
printf("%s\n", fullname2);

I think sprintf is what you want.
char fullname[strlen(commands[4]) + strlen(commands[5]) + 2]; // plus 2 for ' ' and '\0'
sprintf(fullname, "%s %s", commands[4], commands[5]);
/* fullname = "Sirius Black" */
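If a name can span more than two tokens, the same snprintf idea can be wrapped in a small bounds-checked helper. This is only a sketch: join_name is a hypothetical helper, and it assumes the tokens sit in a NULL-terminated array (like the commands array from the question) in which every gender token starts with '['.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins tokens[start], tokens[start + 1], ... with single
 * spaces until a token beginning with '[' (or the NULL end marker) is reached.
 * Writes at most size bytes into out and keeps it null-terminated (size > 0).
 * Returns the index of the token that stopped the scan. */
size_t join_name(char *out, size_t size, char *const tokens[], size_t start)
{
    size_t used = 0;
    size_t i = start;

    if (size > 0)
        out[0] = '\0';
    for (; tokens[i] != NULL && tokens[i][0] != '['; i++) {
        if (used + 1 >= size)
            continue;               /* buffer full: keep scanning, stop copying */
        int n = snprintf(out + used, size - used, "%s%s",
                         used > 0 ? " " : "", tokens[i]);
        /* snprintf reports the untruncated length; clamp it to what fit */
        if (n > 0)
            used += ((size_t)n < size - used) ? (size_t)n : size - used - 1;
    }
    return i;
}
```

With the tokens from the question, join_name(fullname, sizeof fullname, commands, 4) would leave "Sirius Black" in fullname and return the index of "[m]", so the caller knows where the gender token is.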

Related

Eject excess space from string in C

I need to write a function which will eject excess space from string in C.
Example:
char s[]=" abcde abcde ";
OUTPUT:
"abcde abcde"
Code:
#include <stdio.h>
#include <ctype.h>
char *eject(char *str) {
int i, x;
for (i = x = 0; str[i]; ++i)
if (!isspace(str[i]) || (i > 0 && !isspace(str[i - 1])))
str[x++] = str[i];
if(x > 0 && str[x-1] == ' ') str[x-1] = '\0';
return str;
}
int main() {
char s[] = " abcde abcde ";
printf("\"%s\"", eject(s));
return 0;
}
This code doesn't work for string " "
If this string is found program should print:
""
How to fix this?
Basically, you need to collapse each run of consecutive space characters between the words of the input string into a single space, and remove all leading and trailing space characters completely.
You can do it in just one iteration. There is no need to write separate functions for removing the leading and trailing spaces of the input string, as shown in the other post.
You can do:
#include <stdio.h>
#include <ctype.h>
char * eject (char *str) {
if (str == NULL) {
printf ("Invalid input..\n");
return NULL;
}
/* Pointer to keep track of the position where the next character will be written
*/
char * p = str;
for (unsigned int i = 0; str[i] ; ++i) {
if ((isspace (str[i])) && ((p == str) || (str[i + 1] == '\0') || (str[i] == (str[i + 1])))) {
continue;
}
*p++ = str[i];
}
/* Add the null terminating character.
*/
*p = '\0';
return str;
}
int main (void) {
char s[] = " abcde abcde ";
printf("\"%s\"\n", eject(s));
char s1[] = " ";
printf("\"%s\"\n", eject(s1));
char s2[] = "ab yz ";
printf("\"%s\"\n", eject(s2));
char s3[] = " ddd xx jj m";
printf("\"%s\"\n", eject(s3));
char s4[] = "";
printf("\"%s\"\n", eject(s4));
return 0;
}
Output:
# ./a.out
"abcde abcde"
""
"ab yz"
"ddd xx jj m"
""
You could write two functions which trim leading and trailing whitespace characters.
void trim_front(char *src) {
size_t i = 0, j = 0;
while (isspace(src[i])) i++;
while (i < strlen(src)) src[j++] = src[i++];
src[j] = '\0';
}
void trim_back(char *src) {
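size_t len = strlen(src);
/* Checking len > 0 first keeps this safe for empty and all-whitespace strings */
while (len > 0 && isspace((unsigned char)src[len - 1])) src[--len] = '\0';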
}
If you know you don't have to deal with trailing or leading spaces, your task becomes much simpler.
void reduce_spaces(char *src) {
size_t i = 0, j = 0;
for (; i < strlen(src); ++i) {
if (i == strlen(src) - 1 ||
(isspace(src[i]) && !isspace(src[i + 1])) ||
!isspace(src[i])) {
src[j++] = src[i];
}
}
src[j] = '\0';
}
And testing this:
int main(void) {
char s[] = " hello world ";
trim_front(s);
trim_back(s);
reduce_spaces(s);
printf(">%s<\n", s);
return 0;
}
% gcc test.c
% ./a.out
>hello world<
%
Of course, if you really want to, you can transplant the code from those functions into reduce_spaces, but decomposing a problem into multiple smaller problems can make things much easier.
A slightly more advanced answer, just for reference: suppose you were tasked with writing a professional library for use in real-world programs. Then one would first list all the requirements that make sense:
It's good practice to treat strings as "immutable" - that is, build up a new string based on the old one rather than in-place replacement.
Take the destination string as parameter but also return a pointer to it (similar to strcpy etc functions).
In case of empty strings, set the destination string empty too.
Remove all "white space" not just the ' ' character.
Instead of always inserting a space character after each word, why not insert a variable delimiter? Might as well be something like , or ;.
No delimiter should be inserted after the last word.
The algorithm should only traverse the data once for performance reasons. That is, internal calls like strlen etc are unacceptable.
Byte by byte iteration is fine - we need not care about alignment.
Then we might come up with something like this:
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
char* trim_delimit (char* restrict dst, const char* restrict src, char delim)
{
char* start = dst;
*dst = '\0';
bool remove_spaces = true;
char* insert_delim_pos = NULL;
for(; *src != '\0'; src++)
{
if(remove_spaces)
{
if(!isspace(*src))
{
remove_spaces = false;
if(insert_delim_pos != NULL)
{
// we only get here if more words were found, not yet at the end of the string
*insert_delim_pos = delim;
insert_delim_pos = NULL;
}
}
}
if(!remove_spaces)
{
if(isspace(*src))
{
remove_spaces = true;
insert_delim_pos = dst; // remember where to insert delimiter for later
}
else
{
*dst = *src;
}
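dst++;
*dst = '\0'; // keep the output terminated; a pending delimiter slot stays '\0' until the next word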
}
}
return start;
}
Test cases:
int main (void)
{
char s[]=" abcde abcde ";
char trimmed[100];
puts(trim_delimit(trimmed, s, ' '));
puts(trim_delimit(trimmed, "", ' '));
puts(trim_delimit(trimmed, s, ';'));
}
Output:
abcde abcde
abcde;abcde

Split a char[] and store value in different arrays C

I'm new to C programming and have a problem:
I have a string:
char input[] = "1000 10 30: 1 2 3";
I want to split input and store value in different arrays, "1000 10 30" in one array and "1 2 3" in different array.
I've tried to use strtok(), but I can't find a way to do it.
Does somebody know how to do it?
Thanks!
Edit: Thanks, here is rest of the code:
int a1[3];
int a2[3];
char input[] = "1000 10 30:400 23 123";
char*c = strtok(input, ":");
while (c != 0)
{
char* sep = strchr(c, ' ');
if (sep != 0)
{
*sep = 0;
a1[0] = atoi(c);
++sep;
*sep = strtok(sep, " ");
a1[1] = atoi(sep);
++sep;
a2[2] = atoi(sep);
}
c = strtok(0, ":");
I used an example I found here and tried to change it to add more elements to an array, but could not make it work. The third element is for some reason 0, and I don't understand why. I'm a beginner++ at programming, but mostly in C#, and I haven't used pointers before.
It is unclear to me what you try to do with the pointer sep. And this code
*sep = strtok(sep, " ");
should give you compiler warnings as strtok returns a char pointer and you are trying to store it into a char (aka *sep).
You don't need more than strtok as you can give it multiple delimiters, i.e. you can give it both ' ' and ':' by passing it " :".
So the code could look like this:
int main() {
char input[] = "1000 10 30: 1 2 3";
int a1[3];
int a2[3];
int i = 0;
char* p = strtok(input, " :");
while(p)
{
if (i < 3)
{
a1[i] = atoi(p);
++i;
}
else if (i < 6)
{
a2[i-3] = atoi(p);
++i;
}
p = strtok(NULL, " :");
}
// Print the values
for (int j = 0; j <i; ++j)
{
if (j < 3)
{
printf("a1[%d] = %d\n", j, a1[j]);
}
else if (j < 6)
{
printf("a2[%d] = %d\n", j-3, a2[j-3]);
}
}
}
Output:
a1[0] = 1000
a1[1] = 10
a1[2] = 30
a2[0] = 1
a2[1] = 2
a2[2] = 3
Tip: The above code solves the task but I recommend you take a look at sscanf as it will allow you to read the values with a single line of code.
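To make the sscanf tip concrete, here is a minimal sketch. It assumes the input always has exactly three integers on each side of the colon; parse_two_triples is a made-up name, not something from the question.

```c
#include <stdio.h>

/* Parses "n n n: n n n" into two int arrays of three elements each.
 * Whitespace in the format matches any amount of whitespace (including none),
 * so both "30: 1" and "30:400" are accepted.
 * Returns 1 on success, 0 if fewer than six numbers were converted. */
int parse_two_triples(const char *input, int a1[3], int a2[3])
{
    int converted = sscanf(input, "%d %d %d : %d %d %d",
                           &a1[0], &a1[1], &a1[2],
                           &a2[0], &a2[1], &a2[2]);
    return converted == 6;
}
```

Checking the return value matters here: sscanf returns the number of successful conversions, so anything other than 6 means the arrays are only partly filled.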

Store every string that start and end with a special words into an array in C

I have a long string and I want to store every substring that starts and ends with a special word into an array, and then remove duplicate strings. In my long string, there is no space, comma, or any other separator between words, so I cannot use strtok. The start marker is start and the end marker is end. This is the code I have so far (but it doesn't work because it is using strtok()).
char buf[] = "start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end";
char *array[5];
char *x;
int i = 0, j = 0;
array[i] = strtok(buf, "start");
while (array[i] != NULL) {
array[++i] = strtok(NULL, "start");
}
//removeDuplicate(array[i]);
for (i = 0; i < 5; i++)
for (j = 0; j < 5; j++)
if (strcmp(array[i], array[j]) == 0)
x[i++] = array[i];
printf("%s", x[i]);
Example input:
start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end
Output equivalent to:
char *array[]= { "start-12-3.end", "start-12-4.end", "start-13-3.end" };
The second start-12-4.end string has been eliminated in the output.
I've also used strstr, but it has some issues:
int main(int argc, char **argv)
{
char string[] = "This-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
while (counter < 4)
{
char *result1 = strstr(string, "this");
int start = result1 - string;
char *result = strstr(string, "test");
int end = result - string;
end += 4;
printf("\n%s\n", result);
memmove(result, result1, end += 4);
counter++;
}
}
To put the strings into an array and remove duplicate strings, I've tried the following code, but it has issues:
int main(void)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
const char *b_token = "this";
const char *e_token = "test";
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
char *a[50];
int i=0, j;
char *s;
while ((b_mark = strstr(buffer, b_token)) != 0 && (e_mark =strstr(b_mark, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
s = (char *) malloc(length);
strncpy(s, b_mark, length);
a[i]=s;
i++;
buffer = e_mark + e_len;
}
for (i=0; i<strlen(s); i++)
printf ("%s",a[i]);
free(s);
/*
//remove duplicate string
for (i=0; i<4; i++)
for (j=0; j<4; j++)
{
if (a[i] == NULL || a[j] == NULL || i == j)
continue;
if (strcmp (a[i], a[j]) == 0) {
free(a[i]);
a[i] = NULL;
}
printf("%s\n", a[i]);
*/
return 0;
}
Works with provided example of yours and tested in Valgrind for mem leaks, but might require further testing.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
unsigned tokens_find_amount( char const* const string, char const* const delim )
{
unsigned counter = 0;
char const* pos = string;
while( pos != NULL )
{
if( ( pos = strstr( pos, delim ) ) != NULL )
{
pos++;
counter++;
}
}
return counter;
}
void tokens_remove_duplicate( char** const tokens, unsigned tokens_num )
{
for( unsigned i = 0; i < tokens_num; i++ )
{
for( unsigned j = 0; j < tokens_num; j++ )
{
if( tokens[i] == NULL || tokens[j] == NULL || i == j )
continue;
if( strcmp( tokens[i], tokens[j] ) == 0 )
{
free( tokens[i] );
tokens[i] = NULL;
}
}
}
}
void tokens_split( char const* const string, char const* const delim, char** tokens )
{
unsigned counter = 0;
char const* pos, *lastpos;
lastpos = string;
pos = string + 1;
while( pos != NULL )
{
if( ( pos = strstr( pos, delim ) ) != NULL )
{
*(tokens++) = strndup( lastpos, (unsigned long )( pos - lastpos ));
lastpos = pos;
pos++;
counter++;
continue;
}
*(tokens++) = strdup( lastpos );
}
}
void tokens_free( char** tokens, unsigned tokens_number )
{
for( unsigned i = 0; i < tokens_number; ++i )
{
free( tokens[ i ] );
}
}
void tokens_print( char** tokens, unsigned tokens_number )
{
for( unsigned i = 0; i < tokens_number; ++i )
{
if( tokens[i] == NULL )
continue;
printf( "%s ", tokens[i] );
}
}
int main(void)
{
char const* buf = "start-12-3.endstart-12-4.endstart-13-3.endstart-12-4.end";
char const* const delim = "start";
unsigned tokens_number = tokens_find_amount( buf, delim );
char** tokens = malloc( tokens_number * sizeof( char* ) );
tokens_split( buf, delim, tokens );
tokens_remove_duplicate( tokens, tokens_number );
tokens_print( tokens, tokens_number );
tokens_free( tokens, tokens_number );
free( tokens );
return 0;
}
Basic splitting — identifying the strings
In a comment, I suggested:
Use strstr() to locate occurrences of your start and end markers. Then use memmove() (or memcpy()) to copy parts of the strings around. Note that since your start and end markers are adjacent in the original string, you can't simply insert extra characters into it — which is also why you can't use strtok(). So, you'll have to make a copy of the original string.
Another problem with strtok() is that it looks for any one of the delimiter characters — it does not look for the characters in sequence. But strtok() modifies its input string, zapping the delimiter it finds, which is clearly not what you need. Generally, IMO, strtok() is only a source of headaches and seldom an answer to a problem. If you must use something like strtok(), use POSIX strtok_r() or Microsoft's strtok_s(). Microsoft's function is essentially the same as strtok_r() except for the spelling of the function name. (The Standard C Annex K version of strtok_s() is different from both POSIX and Microsoft — see Do you use the TR 24731 'safe' functions?)
In another comment, I noted:
Use strstr() again, starting from where the start portion ends, to find the next end marker. Then, knowing the start of the whole section, and the start of the end and the length of the end, you can arrange to copy precisely the correct number of characters into the new string, and then null terminate if that's appropriate, or comma terminate. Something like:
if ((start = strstr(source, "start")) != 0 && (end = strstr(start, "end")) != 0)
then the data is between start and end + 2 (inclusive) in your source string. Repeat starting from the character after the end of 'end'.
You then said:
I've tried following code but it doesn't work fine; would u please tell me what's wrong with it?
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char string[] = "This-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
while (counter < 4)
{
char *result1 = strstr(string, "This");
int start = result1 - string;
char *result = strstr(string, "test");
int end = result - string;
end += 4;
printf("\n%s\n", result);
memmove(result, result1, end += 4);
counter++;
}
}
I observed:
The main problem appears to be searching for This with a capital T but the string only contains a single capital T. You should also look at Is there a way to specify how many characters of a string to print out using printf()?
Even assuming you fix the This vs this glitch, there are other issues.
You print the entire string.
You don't change the starting point for the search.
Your moving code adds 4 to end a second time.
You don't use start.
The code should print from result1, not result.
With those fixed, the code runs but produces:
testthis-two.testthis-three.testthis-two.test
testtestthis-three.testthis-two.test
testtthis-two.test
test?
and a core dump (segmentation fault).
Code identifying the strings
This is what I created, based on a mix of your code and my commentary:
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
int counter = 0;
const char *b_token = "this";
const char *e_token = "test";
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
while ((b_mark = strstr(buffer, b_token)) != 0 &&
(e_mark = strstr(b_mark, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
printf("%d: %.*s\n", ++counter, length, b_mark);
buffer = e_mark + e_len;
}
return 0;
}
Clearly, this code does no moving of data, but being able to isolate the data to be moved is a key first step to completing that part of the exercise. Extending it to make copies of the strings so that they can be compared is fairly easy. If it is available to you, the strndup() function will be useful:
char *strndup(const char *s1, size_t n);
The strndup() function copies at most n characters from the string s1 always NUL terminating the copied string.
If you don't have it available, it is pretty straight-forward to implement, though it is more straight-forward if you have strnlen() available:
size_t strnlen(const char *s, size_t maxlen);
The strnlen() function attempts to compute the length of s, but never scans beyond the first maxlen bytes of s.
Neither of these is a standard C library function, but they're defined as part of POSIX (strnlen() and strndup()) and are available on BSD and Mac OS X; Linux has them, and probably other versions of Unix do too. The specifications shown are quotes from the Mac OS X man pages.
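If strndup() is not available, a fallback along these lines should do. It is a sketch rather than the library's implementation, and the name my_strndup is made up here so that it cannot clash with a library-provided strndup(); memchr() stands in for strnlen().

```c
#include <stdlib.h>
#include <string.h>

/* Copies at most n characters of s into freshly allocated memory and always
 * null-terminates the copy. Returns NULL if malloc() fails. The caller must
 * free() the result. */
char *my_strndup(const char *s, size_t n)
{
    /* Find the length without looking past the first n bytes,
     * which is what strnlen() would do. */
    const char *end = memchr(s, '\0', n);
    size_t len = (end != NULL) ? (size_t)(end - s) : n;

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

In the loop above, my_strndup(b_mark, length) would give a null-terminated copy of each found string, ready to be compared with strcmp().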
Example output:
I called the program stst (for start-stop).
$ ./stst
1: this-one.test
2: this-two.test
3: this-three.test
4: this-two.test
$
There are multiple features to observe:
Since main() ignores its arguments, I removed the arguments (my default compiler options won't allow unused arguments).
I case-corrected the string.
I set up constant strings b_token and e_token for the beginning and end markers. The names are symmetric deliberately. This could readily be transplanted into a function where the tokens are arguments to the function, for example.
Similarly I created the b_mark and e_mark variables for the positions of the begin and end markers.
The name buffer is a pointer to where to start searching.
The loop uses the test I outlined in the comments, adapted to the chosen names.
The printing code determines how long the found string is and prints only that data. It prints the counter value.
The reinitialization code skips all the previously printed material.
Command line options for generality
You could generalize the code a bit by accepting command line arguments and processing each of those in turn if any are provided; you'd use the string you provide as a default when no string is provided. A next level beyond that would allow you to specify something like:
./stst -b beg -e end 'kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road'
and you'd get output such as:
1: beg-waffles-end
2: beg-for-mercy-end
Here's code that implements that, using the POSIX getopt().
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char **argv)
{
char string[] = "this-one.testthis-two.testthis-three.testthis-two.test";
const char *b_token = "this";
const char *e_token = "test";
int opt;
int b_len;
int e_len;
while ((opt = getopt(argc, argv, "b:e:")) != -1)
{
switch (opt)
{
case 'b':
b_token = optarg;
break;
case 'e':
e_token = optarg;
break;
default:
fprintf(stderr, "Usage: %s [-b begin][-e end] ['beginning-to-end...' ...]\n", argv[0]);
return 1;
}
}
/* Use string if no argument supplied */
if (optind == argc)
{
argv[argc-1] = string;
optind = argc - 1;
}
b_len = strlen(b_token);
e_len = strlen(e_token);
printf("Begin: (%d) [%s]\n", b_len, b_token);
printf("End: (%d) [%s]\n", e_len, e_token);
for (int i = optind; i < argc; i++)
{
char *buffer = argv[i];
int counter = 0;
char *b_mark;
char *e_mark;
printf("Analyzing: [%s]\n", buffer);
while ((b_mark = strstr(buffer, b_token)) != 0 &&
(e_mark = strstr(b_mark + b_len, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
printf("%d: %.*s\n", ++counter, length, b_mark);
buffer = e_mark + e_len;
}
}
return 0;
}
Note how this program documents what it is doing, printing out the control information. That can be very important during debugging — it helps ensure that the program is working on the data you expect it to be working on. The searching is better too; it works correctly with the same string as the start and end marker (or where the end marker is a part of the start marker), which the previous version did not (because this version uses b_len, the length of b_token, in the second strstr() call). Both versions are quite happy with adjacent end and start tokens, but they're equally happy to skip material between an end token and the next start token.
Example runs:
$ ./stst -b beg -e end 'kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road'
Begin: (3) [beg]
End: (3) [end]
Analyzing: [kalamazoo-beg-waffles-end-tripe-beg-for-mercy-end-of-the-road]
1: beg-waffles-end
2: beg-for-mercy-end
$ ./stst -b th -e th
Begin: (2) [th]
End: (2) [th]
Analyzing: [this-one.testthis-two.testthis-three.testthis-two.test]
1: this-one.testth
2: this-th
$ ./stst -b th -e te
Begin: (2) [th]
End: (2) [te]
Analyzing: [this-one.testthis-two.testthis-three.testthis-two.test]
1: this-one.te
2: this-two.te
3: this-three.te
4: this-two.te
$
After update to question
You have to account for the trailing null byte by allocating enough space for length + 1 bytes. Using strncpy() is fine but in this context guarantees that the string is not null terminated; you must null terminate it.
Your duplicate elimination code, commented out, was not particularly good — too many null checks when none should be necessary. I've created a print function; the tag argument allows it to identify which set of data it is printing. I should have put the 'free' loop into a function. The duplicate elimination code could (should) be in a function; the string extraction code could (should) be in a function — as in the answer by pikkewyn. I extended the test data (string concatenation is wonderful in contexts like this).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void dump_strings(const char *tag, char **strings, int num_str)
{
printf("%s (%d):\n", tag, num_str);
for (int i = 0; i < num_str; i++)
printf("%d: %s\n", i, strings[i]);
putchar('\n');
}
int main(void)
{
char string[] =
"this-one.testthis-two.testthis-three.testthis-two.testthis-one.test"
"this-1-testthis-1-testthis-2-testthis-1-test"
"this-1-testthis-1-testthis-1-testthis-1-test"
;
const char *b_token = "this";
const char *e_token = "test";
int b_len = strlen(b_token);
int e_len = strlen(e_token);
char *buffer = string;
char *b_mark;
char *e_mark;
char *a[50];
int num_str = 0;
while ((b_mark = strstr(buffer, b_token)) != 0 && (e_mark = strstr(b_mark + b_len, e_token)) != 0)
{
int length = e_mark + e_len - b_mark;
char *s = (char *) malloc(length + 1); // Allow for null
strncpy(s, b_mark, length);
s[length] = '\0'; // Null terminate the string
a[num_str++] = s;
buffer = e_mark + e_len;
}
dump_strings("After splitting", a, num_str);
//remove duplicate strings
for (int i = 0; i < num_str; i++)
{
for (int j = i + 1; j < num_str; j++)
{
if (strcmp(a[i], a[j]) == 0)
{
free(a[j]); // Free the higher-indexed duplicate
a[j] = a[--num_str]; // Move the last element here
j--; // Examine the new string next time
}
}
}
dump_strings("After duplicate elimination", a, num_str);
for (int i = 0; i < num_str; i++)
free(a[i]);
return 0;
}
Testing with valgrind gives this a clean bill of health: no memory faults, no leaked data.
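As noted, the duplicate elimination really belongs in a function. A possible shape for it is sketched below; the name remove_duplicates and its signature are my own choice, and it assumes every entry was allocated with malloc(), as in the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Removes duplicate strings from an array of malloc()ed strings and frees each
 * duplicate. The gap is filled with the last entry, so the order of the
 * remaining strings can change. Returns the new number of strings. */
int remove_duplicates(char **strings, int num_str)
{
    for (int i = 0; i < num_str; i++) {
        for (int j = i + 1; j < num_str; j++) {
            if (strcmp(strings[i], strings[j]) == 0) {
                char *dup = strings[j];
                strings[j] = strings[--num_str]; /* move the last entry into the gap */
                free(dup);                       /* release the higher-indexed duplicate */
                j--;                             /* re-check the entry just moved in */
            }
        }
    }
    return num_str;
}
```

The call site then shrinks to num_str = remove_duplicates(a, num_str); and the final free loop only has to walk the entries that are left.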
Sample output:
After splitting (13):
0: this-one.test
1: this-two.test
2: this-three.test
3: this-two.test
4: this-one.test
5: this-1-test
6: this-1-test
7: this-2-test
8: this-1-test
9: this-1-test
10: this-1-test
11: this-1-test
12: this-1-test
After duplicate elimination (5):
0: this-one.test
1: this-two.test
2: this-three.test
3: this-1-test
4: this-2-test

code accounting for multiple delimiters isn't working

I have a program I wrote to take a string of words and, based on the delimiter that appears, separate each word and add it to an array.
I've adjusted it to account for either a ' ', ',' or '.'. Now the goal is to adjust for multiple delimiters appearing together (as in "the dog,,,was walking") and still only add the word. While my program works, and it doesn't print out extra delimiters, every time it encounters additional delimiters, it includes a space in the output instead of ignoring them.
int main(int argc, const char * argv[]) {
char *givenString = "USA,Canada,Mexico,Bermuda,Grenada,Belize";
int stringCharCount;
//get length of string to allocate enough memory for array
for (int i = 0; i < 1000; i++) {
if (givenString[i] == '\0') {
break;
}
else {
stringCharCount++;
}
}
// counting # of commas in the original string
int commaCount = 1;
for (int i = 0; i < stringCharCount; i++) {
if (givenString[i] == ',' || givenString[i] == '.' || givenString[i] == ' ') {
commaCount++;
}
}
//declare blank Array that is the length of commas (which is the number of elements in the original string)
//char *finalArray[commaCount];
int z = 0;
char *finalArray[commaCount] ;
char *wordFiller = malloc(stringCharCount);
int j = 0;
char current = ' ';
for (int i = 0; i <= stringCharCount; i++) {
if (((givenString[i] == ',' || givenString[i] == '\0' || givenString[i] == ',' || givenString[i] == ' ') && (current != (' ' | '.' | ',')))) {
finalArray[z] = wordFiller;
wordFiller = malloc(stringCharCount);
j=0;
z++;
current = givenString[i];
}
else {
wordFiller[j++] = givenString[i];
}
}
for (int i = 0; i < commaCount; i++) {
printf("%s\n", finalArray[i]);
}
return 0;
}
This program took me hours and hours to get together (with help from more experienced developers) and I can't help but get frustrated. I'm using the debugger to my best ability but definitely need more experience with it.
/////////
I went back to pad and paper and kind of rewrote my code. Now I'm trying to store delimiters in an array and compare the elements of that array to the current string value. If they are equal, then we have come across a new word and we add it to the final string array. I'm struggling to figure out the placement and content of the "for" loop that I would use for this.
char * original = "USA,Canada,Mexico,Bermuda,Grenada,Belize";
//creating two intialized variables to count the number of characters and elements to add to the array (so we can allocate enough mmemory)
int stringCharCount = 0;
//by setting elementCount to 1, we can account for the last word that comes after the last comma
int elementCount = 1;
//calculate value of stringCharCount and elementCount to allocate enough memory for temporary word storage and for final array
for (int i = 0; i < 1000; i++) {
if (original[i] == '\0') {
break;
}
else {
stringCharCount++;
if (original[i] == ',') {
elementCount++;
}
}
}
//account for the final element
elementCount = elementCount;
char *tempWord = malloc(stringCharCount);
char *finalArray[elementCount];
int a = 0;
int b = 0;
//int c = 0;
//char *delimiters[4] = {".", ",", " ", "\0"};
for (int i = 0; i <= stringCharCount; i++) {
if (original[i] == ',' || original[i] == '\0') {
finalArray[a] = tempWord;
tempWord = malloc(stringCharCount);
tempWord[b] = '\0';
b = 0;
a++;
}
else {
tempWord[b++] = original[i];
}
}
for (int i = 0; i < elementCount; i++) {
printf("%s\n", finalArray[i]);
}
return 0;
}
Many issues. I suggest dividing the code into small pieces and debugging those first.
--
Uninitialized data.
// int stringCharCount;
int stringCharCount = 0;
...
stringCharCount++;
Or
int stringCharCount = strlen(givenString);
Other problems too: the strings stored in finalArray[] are never given a terminating null character, yet printf("%s\n", finalArray[i]); is used.
Unclear use of char *
char *wordFiller = malloc(stringCharCount);
wordFiller = malloc(stringCharCount);
There are more bugs than lines in your code.
I'd suggest you start with something much simpler.
Work through a basic programming book with exercises.
Edit
Or, if this is about learning to program, try another, simpler programming language:
In C# your task looks rather simple:
string givenString = "USA,Canada Mexico,Bermuda.Grenada,Belize";
string[] words = givenString.Split(new char[] {' ', ',', '.'});
foreach (string word in words)
Console.WriteLine(word);
As you see, there are far fewer issues to worry about:
No memory management (alloc/free); this is handled by the garbage collector
no pointers, so nothing can go wrong with them
powerful builtin string capabilities like Split()
foreach makes loops much simpler
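For comparison, staying in C, the splitting itself does not have to be hand-written either: strtok() accepts several delimiter characters and treats a run of them as a single separator, which is exactly the "the dog,,,was walking" case. The sketch below makes two assumptions: split_words is a made-up helper name, and the text lives in a modifiable char array (not a string literal as in the question), because strtok() writes into it.

```c
#include <string.h>

/* Splits text in place on any of the characters in delims and stores up to
 * max_words pointers to the words in words[]. Runs of delimiters produce no
 * empty words. Returns the number of words stored. */
size_t split_words(char *text, const char *delims, char *words[], size_t max_words)
{
    size_t count = 0;
    for (char *w = strtok(text, delims);
         w != NULL && count < max_words;
         w = strtok(NULL, delims)) {
        words[count++] = w;   /* points into text; no extra allocation needed */
    }
    return count;
}
```

Because the stored pointers refer to the original buffer, there is nothing to malloc() or free() per word; the price is that the input string is modified.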

How to reverse a sentence in C and Perl

If the sentence is
"My name is Jack"
then the output should be
"Jack is name My".
I did a program using strtok() to separate the words and then push them onto a stack,
popping them and concatenating.
Is there any other, more efficient way than this?
Is it easier to do in Perl?
Whether it is more efficient or not will be something you can test but in Perl you could do something along the lines of:
my $reversed = join( " ", reverse( split( / /, $string ) ) );
Perl makes this kind of text manipulation very easy, you can even test this easily on the shell:
echo "run as fast as you can" | perl -lne 'print join $",reverse split /\W+/'
or:
echo "all your bases are belong to us" | perl -lne '@a=reverse /\w+/g;print "@a"'
The strategy for C could be this:
1) Reverse the characters of the string. This results in the words being in the right general position, albeit backward.
2) Reverse the characters of each word in the string.
We will need one function to reverse characters in a buffer:
/*
* Reverse characters in a buffer.
*
* If provided "My name is Jack", modifies the input to become
* "kcaJ si eman yM".
*/
void reverse_chars(char * buf, int cch_len)
{
char * front = buf, *back = buf + cch_len - 1;
while (front < back)
{
char tmp = *front;
*front = *back;
*back = tmp;
front ++;
back --;
}
}
To break the input buffer into words, we need a function which returns the number of non-space characters at the start of the buffer, that is, the length of the first word. (strtok() modifies the buffer and is harder to use in-place.)
int word_len(char *input)
{
char * p = input;
while (*p && !isspace(*p))
p++;
return p - input;
}
Finally, we will need a function which uses those two helpers to achieve the strategy described in the first paragraph.
/*
* Reverse words in a buffer.
*
* Given the input "My name is Jack", modifies the input to become
* "Jack is name My"
*/
void reverse_words(char *input)
{
int cch_len = strlen(input);
/* Part 1: Reverse the string characters. */
reverse_chars(input, cch_len);
char * p = input;
/* Part 2: Loop over one word at a time. */
while (*p)
{
/* Skip leading spaces */
while (*p && isspace(*p))
p++;
if (*p)
{
/* Advance one complete word. */
int cch_word = word_len(p);
reverse_chars(p, cch_word);
p += cch_word;
}
}
}
You've gotten a couple of versions in C, but they strike me as a bit more verbose than is probably really necessary. Absent a reason to do otherwise, I'd consider something like this:
#define MAX 32
char *words[MAX];
char word[256];
int pos = 0;
for (pos=0; pos<MAX && scanf("%255s", word) == 1; pos++) /* scanf returns EOF, which is nonzero, at end of input */
words[pos] = strdup(word);
while (--pos >= 0)
printf("%s ", words[pos]);
One possible "intermediate" level between C and Perl would be C++:
std::istringstream input("My name is Jack");
std::vector<std::string> words((std::istream_iterator<std::string>(input)),
std::istream_iterator<std::string>());
std::copy(words.rbegin(), words.rend(),
std::ostream_iterator<std::string>(std::cout, " "));
Here is a C idea that uses a little recursion to do the stacking for you:
void rev(char * x){
char * p;
if(p = strchr(x, ' ')){
rev(p+1);
printf("%.*s ", (int)(p - x), x); /* the precision argument for %.*s must be an int */
}
else{
printf("%s ", x);
}
}
Some fun with a little help from regexp and perl special variables :)
$_ = "My name is Jack";
unshift @_, "$1 " while /(\w+)/g;
print @_;
EDIT
And a killer (by now):
$,=' ';print reverse /\w+/g;
A little explanation: $, is the special variable for the print output separator. Of course you can do it more briefly without this special variable:
print reverse /\w+ ?/g;
but the result might not be as satisfactory as the example above.
Using reverse:
my @words = split / /, $sentence;
my $newSentence = join(' ', reverse @words);
It's probably a lot easier to do in Perl, but...
char *strrtok(char *str, const char *delim)
{
int i, j;
for (i = strlen(str) - 1; i > 0; i--)
{
// Sets the furthest set of contiguous delimiters to null characters
if (strchr(delim, str[i]))
{
j = i + 1;
while (i >= 0 && strchr(delim, str[i])) // test the index first so str[-1] is never read
{
str[i] = '\0';
i--;
}
return &(str[j]);
}
}
return str;
}
This should work similarly to strtok() in reverse, but you continue to pass the pointer to the original string location rather than passing NULL after the first call. Also, you should get empty strings for start and end cases.
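Since the calling convention differs from strtok(), a usage sketch may help. The block repeats strrtok() (with the index test placed before the strchr() call) so that it compiles on its own; print_reversed is a made-up wrapper, and it assumes words are separated by single spaces with none at the start or end.

```c
#include <stdio.h>
#include <string.h>

/* Returns the last token of str and cuts it off (together with the delimiters
 * in front of it) by overwriting those delimiters with '\0'. */
char *strrtok(char *str, const char *delim)
{
    int i, j;
    for (i = (int)strlen(str) - 1; i > 0; i--) {
        if (strchr(delim, str[i])) {
            j = i + 1;
            while (i >= 0 && strchr(delim, str[i])) {
                str[i] = '\0';
                i--;
            }
            return &str[j];
        }
    }
    return str; /* no delimiter left: what remains is the first word */
}

/* Prints the words of sentence in reverse order. The same pointer is passed
 * on every call; the loop ends when strrtok() hands back the start of the
 * buffer, which means the first word has been reached. sentence is modified. */
void print_reversed(char *sentence)
{
    char *tok;
    do {
        tok = strrtok(sentence, " ");
        printf("%s%s", tok, tok == sentence ? "\n" : " ");
    } while (tok != sentence);
}
```

Called on a char array holding "My name is Jack", print_reversed prints "Jack is name My" followed by a newline.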
C version:
#include <stdio.h>
#include <string.h>
int main()
{
char s[] = "My name is Jack";
char t[100];
int i = 0, j = 0, k = 0;
for(i = strlen(s) - 1 ; i >= 0 ;i--)
{
if(s[i] == ' ' || i == 0)
{
j = i == 0 ? i : i + 1;
for(j = j; s[j] != '\0'; j++) t[k++] = s[j];
t[k++] = ' ';
s[i] = '\0';
}
}
t[k] = '\0';
printf("%s\n", t);
return 0;
}
C example
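/* Returns a newly allocated copy of str with its characters reversed
 * (characters, not words); the caller must free() it.
 * Needs <stdlib.h> and <string.h>. */
char * srtrev (const char * str) {
size_t l = strlen(str);
char * rev = malloc(l + 1);
if (rev == NULL)
return NULL;
for (size_t i = 0; i < l; i++)
rev[i] = str[l - 1 - i];
rev[l] = '\0';
return rev;
}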

Resources