I need to define a “word” in this question to be any sequence of characters that doesn’t contain a space or null character.
For example, the string “Hello World” would contain 2 words. However, it is actually possible for a word to be empty,
i.e., zero chars.
A sentence would be a series of words that are separated by 1 space character. So “Hello World” would be a
sentence of two words. The goal of ReverseSentence would be to reverse the order of the words in the sentence.
Right now, I am getting an error: the program calls the function and prints out a1 to a5, but upon reaching a5 it appears to abort with a core dump. If I replace the blank with a space, it reads in the previous input and replaces it according to the number of spaces.
Where am I going wrong?
ReverseSentence.c
#include <stdlib.h> /* malloc */
#include <string.h> /* strcat, strcpy */
void ReverseSentence(char *str)
{
char *newSentence;
int i, j, start, len;
/* contains the string length of the input */
len = strlen(str);
/* position or index in the array */
start = strlen(str);
/* malloc */
newSentence = malloc(len + 1);
/* loop checks from the right of the sentences */
for (i = len; i >= 0; i--) {
/* if index reach the array with a space or zero */
if (str[i] == ' ' || i == 0) {
/* allocates memory */
char *word = malloc((start - i) + 1);
int c = 0;
if (i == 0)
/* index remains same */
j = i;
else
j = i + 1;
/* j smaller or equal than the start position */
for (; j <= start; j++) {
/*do a incremental*/
word[c++] = str[j];
}
/* hits a null char */
word[c] = '\0';
/* string concatenate */
strcat(newSentence, word);
/* if index hits a space */
if (str[i] == ' ')
strcat(newSentence, " "); /* concatenate space to newSentence */
else
strcat(newSentence, "\0");
start = i - 1;
/* free memory */
free(word);
}
}
newSentence[len] = '\0';
/* string copy */
/* str is destination, newSentence is the source */
/* copy new string to original string */
strcpy(str, newSentence);
/* free memory */
free(newSentence);
}
main.c
#include <stdio.h>
#include "ReverseSentence.h"
int main()
{
char a1[] = "Hello World ";
char a2[] = "abcdefghi ";
char a3[] = " ";
char a4[] = "C programming is a dangerous activity";
char a5[] = "a "; /* a sentence with only empty words */
ReverseSentence(a1);
printf("Test case 1:\"%s\"\n", a1); /* prints "World Hello" */
ReverseSentence(a2);
printf("Test case 2:\"%s\"\n", a2); /* prints "abcdefghi" */
ReverseSentence(a3);
printf("Test case 3:\"%s\"\n", a3); /* prints "" */
ReverseSentence(a4);
printf("Test case 4:\"%s\"\n", a4); /* prints "activity dangerous a is programming C" */
ReverseSentence(a5);
printf("Test case 5:\"%s\"\n", a5); /* prints " " */
return 0;
}
EDIT: new version
void ReverseSentence(char *str)
{
/* holder */
/* pointer to char */
char *newSentence;
int i, start, len, lastindex, size;
/* contains the string length of the input */
len = strlen(str);
lastindex = strlen(str);
/* starting position */
start = 0;
i = 0;
/* malloc */
newSentence = malloc(sizeof(char) * strlen(str));
while (i >= 0) {
for (i = len - 1; str[i] != '\0' && str[i] != ' '; i--) {
lastindex--;
}
/* number of chars in string size */
size = len - lastindex;
/* Copy word into newStr at startMarker */
strncpy(&newSentence[start], &str[lastindex], size);
/* pointer move to right */
start = start + size;
/* Space placed into memory slot */
newSentence[start] = ' ';
/* start position moves by 1 towards the right */
start = start + 1;
/* pointer at len moves to left */
lastindex = lastindex - 1;
/* lastindex moves to where len is */
len = lastindex;
}
/* Copy new string into old string */
for (i = 0; str[i] != '\0'; i++) {
str[i] = newSentence[i];
}
/* free memory */
free(newSentence);
}
In addition to Matthias' answer: you don't allocate enough memory. As a wild guess, I added 1 to the arguments passed to malloc:
newSentence = malloc(len + 2); // +2 instead of +1
and
char *word = malloc((start - i) + 2); // +2 instead of +1
And now it doesn't crash anymore. So there is definitely a buffer overflow here.
I don't pretend the program is perfectly correct now. You should have a look into this.
Your code is not safe. You never initialize newSentence: malloc() only allocates memory, it does not initialize it (in contrast to calloc()). Thus, you start with a garbage sentence to which you append something new (strcat()). Depending on the garbage, there may be no 0 anywhere in the allocated memory, and strcat() then reads and writes beyond the allocated area.
Your method is too complicated. It has several issues:
you do not initialize newSentence: since malloc memory is uninitialized, you invoke undefined behavior when you copy the words at its end with strcat. You can fix that with *newSentence = '\0';
when you copy the word into the allocated word buffer, you iterate up to and including start, then you add a '\0' at the end. You effectively write one byte too many for the last word (case i == 0). This invokes undefined behavior.
strcat(newSentence, "\0"); does nothing.
allocating a buffer for each word found is wasteful, you could just copy the word with memcpy or with a simple for loop.
You could simplify with the following steps:
allocate a buffer and copy the string to it.
for each word in the string, copy it at the end of the destination and copy the separator before it if not at the end.
free the buffer.
Here is the code:
char *ReverseSentence(char *str) {
int len = strlen(str); /* length of the string */
char *tmp = strdup(str); /* copy of the string */
int i; /* index into the copy */
int j = len; /* index into the string */
int n; /* length of a word */
for (i = 0; i < len; ) {
n = strcspn(tmp + i, " "); /* n is the length of the word */
j -= n; /* adjust destination offset */
memcpy(str + j, tmp + i, n); /* copy the word */
i += n; /* skip the word */
if (tmp[i] != '\0') { /* unless we are at the end */
j--;
str[j] = tmp[i]; /* copy the separator */
i++;
}
}
free(tmp); /* free the copy */
return str;
}
There are at least two issues in the new version of your program:
You don't allocate enough memory, you don't account for the zero terminator. You should allocate one more byte.
In your first for loop you allow i to become -1. The loop should stop once i drops below zero; modify your for statement like this: for(i=len-1; i >= 0 && str[i] != ' '; i--). You incorrectly assume that the byte just before the str buffer is zero, so the str[i]!='\0' test is wrong. BTW, accessing one byte before the str buffer results in undefined behaviour.
There are probably other issues.
Related
I have been running into issues with the strcpy() function in C. In this function I take a string in buffer that contains something along the lines of '(213);' and I am trying to remove the brackets so the output would be something like 213;.
for (i = 0; i < bufferlen; i++) {
// check for '(' followed by a naked number followed by ')'
// remove ')' by shifting the tail end of the expression
// remove '(' by shifting the beginning of the expression
if((buffer[i] == '(') && (isdigit(buffer[i+1]))){
int numberLen = 0;
int test =0;
i++;
while((isdigit(buffer[i]))){
i++;
numberLen++;
}
if(buffer[i] == ')'){
int numberStart = i - numberLen-1;
strcpy(&buffer[i], &buffer[i+1]);
strcpy(&buffer[numberStart], &buffer[numberStart+1]);
printf("buffer = %s\n", buffer);
}
}
}
However, the output is as follows
buffer before strcpy(&buffer[i], &buffer[i+1]); = (213);
buffer after strcpy(&buffer[i], &buffer[i+1]); = (213;
buffer after strcpy(&buffer[numberStart], &buffer[numberStart+1]); = 23;;
For some reason the second strcpy call removes the second character of the string. I have also tried
strcpy(&buffer[0], &buffer[1]); and still end up with the same results. Any insight as to why this is occurring would be greatly appreciated.
Continuing from the comment, strcpy(&buffer[i], &buffer[i+1]); where source and dest overlap results in Undefined Behavior. Use memmove, or simply use a couple of pointers instead.
The prohibition on using strings that overlap (i.e. are the same string) is found in C11 Standard - 7.24.2.3 The strcpy function
If I understand your question and you simply want to turn "'(213)'" into "213", you don't need any of the string.h functions at all. You can simply use a couple of pointers and walk down the source-string until you find a digit. Start copying digits to dest at that point by simple assignment. When the first non-digit is encountered, break your copy loop. Keeping a flag to indicate when you are "in" a number copying digits will allow you to break on the 1st non-digit to limit your copy to the first sequence of digits found (e.g. so from the string "'(213)' (423)", only 213 is returned instead of 213423). You could do something like:
char *extractdigits (char *dest, const char *src)
{
/* you can check src != NULL here */
char *p = dest; /* pointer to dest (to preserve dest for return) */
int in = 0; /* simple flag to break loop when non-digit found */
while (*src) { /* loop over each char in src */
if (isdigit(*src)) { /* if it is a digit */
*p++ = *src; /* copy to dest */
in = 1; /* set in-number flag */
}
else if (in) /* if in-number, break on non-digit */
break;
src++; /* increment src pointer */
}
*p = 0; /* nul-terminate dest */
return dest; /* return pointer to dest (for convenience) */
}
A short example would be:
#include <stdio.h>
#include <ctype.h>
#define MAXC 32
char *extractdigits (char *dest, const char *src)
{
/* you can check src != NULL here */
char *p = dest; /* pointer to dest (to preserve dest for return) */
int in = 0; /* simple flag to break loop when non-digit found */
while (*src) { /* loop over each char in src */
if (isdigit(*src)) { /* if it is a digit */
*p++ = *src; /* copy to dest */
in = 1; /* set in-number flag */
}
else if (in) /* if in-number, break on non-digit */
break;
src++; /* increment src pointer */
}
*p = 0; /* nul-terminate dest */
return dest; /* return pointer to dest (for convenience) */
}
int main (void) {
char digits[MAXC] = "";
const char *string = "'(213}'";
printf ("in : %s\nout: %s\n", string, extractdigits (digits, string));
}
Example Use/Output
$ ./bin/extractdigits
in : '(213}'
out: 213
Look things over and let me know if you have further questions.
I am very new to C coding. I have written code to find the longest word in a string. My code does not show any errors, but it prints a word with strange characters that is not in the string. Can you tell me what is wrong with my code?
Thank you
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char LongestWord (char GivenString[100]);
int main()
{
char input[100];
char DesiredWord[20];
printf("please give a string:\n");
gets(input);
DesiredWord[20]=LongestWord(input);
printf("longest Word is:%s\n",DesiredWord);
return 0;
}
char LongestWord (char GivenString[100]){
//It is a predefined function, by using this function we can clear the data from console (Monitor).
//clrscr()
int position1=0;
int position2=0;
int longest=0;
int word=0;
int Lenght=strlen(GivenString);
char Solution[20];
int p=0;
for (int i=1; i<=Lenght; i++){
if (GivenString[i-1]!=' '){
word=word++;
}
if(GivenString[i-1]=' '){
if (word>longest){
//longest stores the length of longer word
longest=word;
position2=i-1;
position1=i-longest;
word=0;
}
}
}
for (int j=position1; j<=position2; j++){
Solution[p]=GivenString[j];
p=p++;
}
return (Solution[20]);
}
This should work:
#include <stdio.h>
#include <string.h>
void LongestWord(char string[100])
{
char word[100], max[100] = ""; /* sized like the input; max starts empty */
int i = 0, j = 0, flag = 0;
for (i = 0; i < strlen(string); i++)
{
while (i < strlen(string) && string[i]!=32 && string[i]!=0)
{
word[j++] = string[i++];
}
if (j != 0)
{
word[j] = '\0';
if (!flag)
{
flag = !flag;
strcpy(max, word);
}
if (strlen(word) > strlen(max))
{
strcpy(max, word);
}
j = 0;
}
}
printf("The largest word is '%s' .\n", max);
}
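int main(void)
{
char string[100];
printf("Enter string: ");
if (fgets(string, sizeof string, stdin) == NULL) /* gets() was removed in C11 */
return 1;
string[strcspn(string, "\n")] = '\0'; /* strip the trailing newline */
LongestWord(string);
return 0;
}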
Aside from invoking Undefined Behavior by returning a pointer to a locally declared array in LongestWord, using gets (which is so dangerous it should never be used), and writing beyond the end of the Solution array, you are missing the logic for identifying the longest word.
To identify the longest word, you must obtain the length of each word as you work your way down the string. You must keep track of the longest string seen, and only if the current string is longer than the longest seen so far do you copy to valid memory that will survive the function return (and nul-terminate).
There are a number of ways to do this. You can use strtok to tokenize all words in the string, you can use a combination of strcspn and strspn to bracket the words, you can use sscanf and an offset to the beginning of each word, or what I find easiest is just to use a pair of pointers sp (start-pointer) and ep (end-pointer) to work down the string.
There you just move sp to the first character in each word and keep moving ep until you find a space (or end of string). The word length is ep - sp and then if it is the longest, you can simply use memcpy to copy length characters to your longest word buffer and nul-terminate, (repeat until you run out of characters)
To create valid storage, you have two-choices, either pass an array of sufficient size (see comment), or declare a valid block of memory within your function using malloc (or calloc or realloc) and return a pointer to that block of memory.
An example passing an array of sufficient size to hold the longest word could be:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#define MAXW 256 /* longest word buffer size */
#define MAXC 1024 /* input string buffer size */
size_t longestword (char *longest, const char *str)
{
int in = 0; /* flag reading (in/out) of word */
size_t max = 0; /* word max length */
const char *sp = str, /* start-pointer for bracketing words */
*ep = str; /* end-pointer for bracketing words */
*longest = 0; /* initialize longest as empty-string */
for (;;) { /* loop over each char in str */
if (isspace (*ep) || !*ep) { /* is it a space or end? */
if (in) { /* are we in a word? */
size_t len = ep - sp; /* if so, get word length */
if (len > max) { /* is it longest? */
max = len; /* if so, set max to len */
memcpy (longest, sp, len); /* copy len chars to longest */
longest[len] = 0; /* nul-terminate longest */
}
in = 0; /* it's a space, no longer in word */
}
if (!*ep) /* if end of string - done */
break;
}
else { /* not a space! */
if (!in) { /* if we are not in a word */
sp = ep; /* set start-pointer to current */
in = 1; /* set in flag */
}
}
ep++; /* increment end-pointer to next char */
}
return max; /* return max length */
}
int main (void) {
char str[MAXC] = "", /* storage for input string */
word[MAXW] = ""; /* storage for longest word */
size_t max = 0; /* longest word length */
fputs ("enter string: ", stdout); /* prompt */
if (!fgets (str, MAXC, stdin)) { /* validate input */
fputs ("(user canceled input)\n", stderr);
return 1;
}
if ((max = longestword (word, str))) /* get length and longest word */
printf ("longest word: %s (%zu-chars)\n", word, max);
}
(note: by using this method you ignore all leading, trailing and intervening whitespace, so strings like " my little dog has 1 flea . " do not present problems.)
Example Use/Output
$ ./bin/longest_word
enter string: my dog has fleas
longest word: fleas (5-chars)
$ ./bin/longest_word
enter string: my little dog has 1 flea .
longest word: little (6-chars)
There are many, many ways to do this. This is one of the most basic, using pointers. You could do the same thing using indexes, e.g. string[i], etc.. That just requires you maintain an offset to the start of each word and then do the subtraction to get the length. strtok is convenient, but modifies the string being tokenized so it cannot be used with string literals or other constant strings.
Best way to learn is work the problem 3-different ways, and pick the one that you find the most intuitive. Let me know if you have further questions.
Please declare a proper main entry point: int main( int argc, const char* argv[] )
Use fgets instead of gets, as gets does not check the bounds of your string (what happens when you enter a 120-character line?)
Pass the length of the expected string to LongestWord.
If available, prefer strnlen to plain strlen; there may be scenarios where your string is not properly terminated.
Better yet, use the suggested length parameter to limit your loop and break when a terminating char is encountered.
Your Solution is a stack-allocated array. Returning a pointer to it is undefined behavior once the function returns, so you are better off returning a heap-allocated array (using malloc) or a pointer into the input string.
Suggested changes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* getLongestWord(char* input, size_t input_length, size_t *result_length);
int main( int argc, const char* argv[] )
{
const size_t max_length = 100;
char input[max_length]; // consider using LINE_MAX from limits.h
printf("please give a string:\n");
if ( fgets( input, max_length, stdin ) == NULL ) return EXIT_FAILURE; // some failure happened with fgets.
size_t longestWord_length = 0;
char* longestWord = getLongestWord(input, max_length , &longestWord_length);
printf("longest Word is %.*s\n", (int)longestWord_length, longestWord ? longestWord : ""); // %.*s expects an int precision
return EXIT_SUCCESS;
}
char* getLongestWord(char* input, size_t input_length, size_t *result_length) {
char* result = NULL;
size_t length = 0;
size_t word_start = 0;
for(size_t i = 0; i < input_length; ++i) {
char ch = input[i];
// fgets keeps the newline, so treat it as a separator too
if( (ch == ' ') || (ch == '\n') || (ch == 0) ) {
size_t word_length = i - word_start; // chars in [word_start, i)
if( word_length > length ) {
// new max length
length = word_length;
result = &input[word_start];
}
word_start = i + 1; // next word start
if( ch == 0 ) break; // end of string: never read past the terminator
}
}
*result_length = length;
return result;
}
Hello, this brand new noob on C is trying to take a string input of 3 words and store it in 3 different arrays, not 2D nor 3D arrays. For this problem I'm not allowed to use any of the string library functions. Basically I'm trying to implement the sscanf function. I created a function that breaks the input into the three words and stores them in their indicated arrays. My problem is that when I try to print each of the arrays, the second array doesn't print the word I tried to store in it. I'm probably not storing anything, but I can't find my mistake. Here is what I have...
#include<stdio.h>
#include<string.h>
void breakUp(char *, char *, char *, char *, int* );
int main()
{
char line[80], comm[10], p1[10], p2[10];
int len, n=0;
printf("Please Enter a command: ");
fgets(line, 80, stdin);
/* get rid of trailing newline character */
len = strlen(line) - 1;
if (line[len] == '\n')
line[len] = '\0';
/* Break up the line */
breakUp(line, comm, p1, p2, &n);
printf ("%d things on this line\n", n);
printf ("command: %s\n", comm);
printf ("parameter 1: %s\n", p1);
printf ("parameter 2: %s\n", p2);
return 0;
}
/*
This function takes a line and breaks it into words.
The orginal line is in the char array str, the first word
will go into the char array c, the second into p1, and the
the third into p2. If there are no words, the corresponding
char arrays are empty. At the end, n contains the number of
words read.
*/
void breakUp(char *str, char *c, char *p1, char *p2, int* n)
{
c[0] = p1[0] = p2[0] = '\0';
p1[0] = '\0';
int j = 0; // str array index
int i = 0; // index of rest of the arrays
n[0] = 0;
// stores first word in array c
while(str[j]!= ' '|| str[j] == '\0')
{
c[i]= str[j];
i++;
j++;
}
// increases n count, moves j into next element
// and sets i back to index 0
if (str[j] == ' '|| str[j] == '\0')
{
c[i] = '\0';
n[0]++;
j++;
i =0;
if( str[j] == '\0')
return;
}
// stores second word in array p1
while(str[j]!= ' '|| str[j] == '\0')
{
p1[i]= str[j];
i++;
j++;
}
// increases n count, moves j into next element
// and sets i back to index 0
if (str[j] == ' '|| str[j] == '\0')
{
p1[i] = '\0';
n[0]++;
j++;
i =0;
if( str[j] == '\0')
return;
}
// stores 3rd word in array p2
while(str[j] != ' ' || str[j] == '\0')
{
p2[i] = str[j];
i++;
j++;
}
// increases n count, moves j into next element
// and sets i back to index 0
if(str[j] == ' ' || str[j] == '\0')
{
p2[i] = '\0';
n[0]++;
if( str[j] == '\0')
return;
}
}
Thanks in advance for any help.
while(str[j]!= ' '|| str[j] == '\0')
should be
while(str[j]!= ' '&& str[j]!= '\0')
Same for if statements that check if end of the line was reached. Thanks.
In addition to the comments and answers, there are a few areas where you may be making things harder on yourself than they need to be. First, let's look at the return of breakup. Why are you using void? If you need to be able to gauge success/failure of a function, pick a meaningful return type that will provide that information. Here a simple return of int will do (and it also avoids having to pass a pointer to n to update, since you can simply return n). You are free to pass a pointer to n; just note that it cannot be used to provide success/failure information at the time of the call to breakup. A return of n can.
(note: breakup is all lowercase -- leave camelCase variables to java and C++, C traditionally uses all lowercase variable and function names reserving all uppercase for constants and macros)
Second, avoid the use of magic numbers in your code (e.g. 80, 10, etc.). If you need constants, then #define them at the top, or use an enum to define global constants. That way you have a single convenient location at the top of your file to make adjustments and you don't have to go picking through each character array declaration changing magic numbers if you need to change the array sizes, etc. For example, to define the number of words, the command array lengths and the line buffer length, you can simply do:
#define NWRD 3 /* number of words */
#define CLEN 16 /* command length */
#define LLEN 80 /* line buffer len */
or using an enum:
enum { NWRD = 3, CLEN = 16, LLEN = 80 }; /* constants */
Now for the content of breakup, you are free to duplicate the code blocks for reading each word 3 separate times, but you can get a little smarter about it and realize you are already passing pointers to 3 character arrays to breakup, why not just use an array of pointers initialized to { c, p1, p2 } and then just write a single block of code to pick out each word from str? It's just a shorter way of skinning-the-cat so to speak. That way, instead of having to reference c, p1, p2, you can simply loop over arr[0], arr[1], arr[2], where arr[0] = c, arr[1] = p1, arr[2] = p2;.
For example, with your new declaration for breakup being:
int breakup (char *str, char *c, char *p1, char *p2);
within breakup you could assign the array of pointers as follows:
char *arr[] = { c, p1, p2 };
Then you can simply add all characters in the first word to arr[0], all chars in the second to arr[1], and so on which will in-turn fill c, p1, p2. That looks like something that could be handled in a nice single loop...
All you need do now is figure out a way of adding each character to each word, checking for spaces (or multiple spaces and tabs), making sure you nul-terminate each word, make sure you only fill three words -- and no more, and finally return the number of words you filled.
Without belaboring each point, you could shorten breakup to something like the following:
int breakup (char *str, char *c, char *p1, char *p2)
{
char *arr[] = { c, p1, p2 }; /* assign c, p1, p2 to arr */
int cidx = 0, n = 0; /* character index & n */
for (; n < NWRD; str++) { /* loop each char while n < NWRD */
if (*str == ' ' || *str == '\t') { /* have space or tab? */
if (*arr[n]) { /* have chars in arr[n]? */
arr[n++][cidx] = 0; /* nul-terminate arr[n] */
cidx = 0; /* zero char index */
}
}
else if (*str == '\n' || !*str) { /* reached '\n' or end of str? */
arr[n][cidx] = 0; /* nul-terminate arr[n] */
if (cidx) /* count the word only if it has chars */
n++;
break; /* bail */
}
else
arr[n][cidx++] = *str; /* assign char to arr[n] */
}
return n;
}
note: breakup tests the first character in each word to determine if you have started filling the word (to allow you to skip over multiple spaces or tabs), so you either need to ensure you initialize all strings in main, or you could do it at the top of breakup as well. In main, you can simply initialize your character arrays when you declare them, e.g.
char line[LLEN] = "", comm[CLEN] = "", p1[CLEN] = "", p2[CLEN] = "";
(note: it is good practice to initialize all variables when you declare them).
You may also want to add a few additional validations when you are reading line to ensure it isn't empty before passing it to breakup. You can do that by simply adding validations at the time line is read by fgets, for example:
printf ("please enter a command: "); /* prompt & get line */
if (!fgets (line, LLEN, stdin) || !*line || *line == '\n') {
fprintf (stderr, "error: input empty or user canceled.\n");
return 1;
}
which checks that the first character in line is not the nul-character (empty-string) or that the first character is '\n' (empty-line).
Putting it all together in a short example, you could do something like the following:
#include <stdio.h>
enum { NWRD = 3, CLEN = 16, LLEN = 80 }; /* constants */
int breakup (char *str, char *c, char *p1, char *p2)
{
char *arr[] = { c, p1, p2 }; /* assign c, p1, p2 to arr */
int cidx = 0, n = 0; /* character index & n */
for (; n < NWRD; str++) { /* loop each char while n < NWRD */
if (*str == ' ' || *str == '\t') { /* have space or tab? */
if (*arr[n]) { /* have chars in arr[n]? */
arr[n++][cidx] = 0; /* nul-terminate arr[n] */
cidx = 0; /* zero char index */
}
}
else if (*str == '\n' || !*str) { /* reached '\n' or end of str? */
arr[n][cidx] = 0; /* nul-terminate arr[n] */
if (cidx) /* count the word only if it has chars */
n++;
break; /* bail */
}
else
arr[n][cidx++] = *str; /* assign char to arr[n] */
}
return n;
}
int main (void) {
char line[LLEN] = "", comm[CLEN] = "", p1[CLEN] = "", p2[CLEN] = "";
int n = 0;
printf ("please enter a command: "); /* prompt & get line */
if (!fgets (line, LLEN, stdin) || !*line || *line == '\n') {
fprintf (stderr, "error: input empty or user canceled.\n");
return 1;
}
if ((n = breakup (line, comm, p1, p2)) != NWRD) { /* call breakup */
fprintf (stderr, "error: %d words found\n", n);
return 1;
}
printf ("command : %s\nparameter 1: %s\nparameter 2: %s\n",
comm, p1, p2);
return 0;
}
Example Use/Output
$ ./bin/breakup
please enter a command: dogs have fleas
command : dogs
parameter 1: have
parameter 2: fleas
$ ./bin/breakup
please enter a command: dogs have
error: 2 words found
$ ./bin/breakup
please enter a command: dogs have lots of fleas
command : dogs
parameter 1: have
parameter 2: lots
$ ./bin/breakup
please enter a command: dogs have fleas for sale
command : dogs
parameter 1: have
parameter 2: fleas
Look things over and make sure you understand what is happening. Being new, the array of pointers char *arr[] = { c, p1, p2 }; may have you scratching your head, but it is doing nothing more than creating 3 pointers arr[0], arr[1], arr[2] and assigning arr[0] = c; arr[1] = p1; arr[2] = p2; (just as you would use any pointer, say char *p = c, only you are using arr[0] instead of p to allow yourself to loop over each pointer).
Beyond that it is all fairly basic. But the time is now to ask questions to make sure you understand it going forward.
I was wondering how we would go about splitting strings into tokens, or whether there are other efficient ways of doing this.
i.e. I have...
char string1[] = "hello\tfriend\n";
How would I get "hello" and "friend" in their own separate variables?
Here is a very simple example splitting your string into parts saved in an array of character arrays using a start and end pointer. The MAXL and MAXW defines simply are a convenient way to define constants that are used to limit the individual word length to 32 (31 chars + null terminator) and a maximum of 3 words (parts) of the original string:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXL 32
#define MAXW 3
int main (void) {
char string1[] = "hello\tfriend\n";
char *sp = string1; /* start pointer */
char *ep = string1; /* end pointer */
unsigned c = 0; /* temp character */
unsigned idx = 0; /* index for part */
char strings[MAXW][MAXL] = {{0}}; /* array to hold parts */
while (*ep) /* for each char in string1 */
{
if (*ep == '\t' || *ep == '\n') /* test if \t or \n */
{
c = *ep; /* save character */
*ep = 0; /* replace with null-terminator */
strcpy (strings[idx], sp); /* copy part to strings array */
*ep = c; /* replace w/original character */
idx++; /* increment index */
sp = ep + 1; /* set start pointer */
}
ep++; /* advance to next char */
}
printf ("\nOriginal string1 : %s\n", string1);
unsigned i = 0;
for (i = 0; i < idx; i++)
printf (" strings[%u] : %s\n", i, strings[i]);
return 0;
}
Output
$ ./bin/split_hello
Original string1 : hello friend
strings[0] : hello
strings[1] : friend
Using strtok simply replaces the manual pointer logic with the function call to split the string.
Updated Line-end Handling Example
As you have found, when stepping through the string you can create as simple an example as you need to fit the current string, but with a little extra effort you can expand your code to handle a broader range of situations. In your comment you noted that the above code does not handle the situation where there is no newline at the end of the string. Rather than changing the code to handle just that situation, with a bit of thought, you can improve the code so it handles both situations. One approach would be:
while (*ep) /* for each char in string1 */
{
if (*ep == '\t' || *ep == '\n') /* test if \t or \n */
{
c = *ep; /* save character */
*ep = 0; /* replace with null-terminator */
strcpy (strings[idx], sp); /* copy part to strings array */
*ep = c; /* replace w/original character */
idx++; /* increment index */
sp = ep + 1; /* set start pointer */
}
else if (!*(ep + 1)) { /* check if next is ending */
strcpy (strings[idx], sp); /* handle no ending '\n' */
idx++;
}
ep++; /* advance to next char */
}
Break on Any Format/Non-Print Character
Continuing to broaden characters that can be used to separate the strings, rather than using discrete values to identify which characters divide the words, you can use a range of ASCII values to identify all non-printing or format characters as separators. A slightly different approach can be used:
char string1[] = "\n\nhello\t\tmy\tfriend\tagain\n\n";
char *p = string1; /* pointer to char */
unsigned idx = 0; /* index for part */
unsigned i = 0; /* generic counter */
char strings[MAXW][MAXL] = {{0}}; /* array to hold parts */
while (*p) /* for each char in string1 */
{
if (idx == MAXW) { /* test MAXW not exceeded */
fprintf (stderr, "error: MAXW (%d) words in string exceeded.\n", MAXW);
break;
}
/* skip each non-print/format char */
while (*p && (*p < ' ' || *p > '~'))
p++;
if (!*p) break; /* if end of s, break */
while (*p >= ' ' && *p <= '~') /* for each printable char */
{
strings[idx][i] = *p++; /* copy to strings array */
i++; /* advance to next position */
}
strings[idx][i] = 0; /* null-terminate strings */
idx++; /* next index in strings */
i = 0; /* start at beginning char */
}
This will handle your test string regardless of line ending and regardless of the number of tabs or newlines included. Take a look at ASCII Table and Description as a reference for the character ranges used.
I am doing an exercise from a book: changing the words in a sentence into pig latin. The code works fine on Windows 7, but when I compile and run it on a Mac, an error occurs.
After some testing, I found that the error comes from the code below. I don't understand the reason for this problem. I am using dynamic memory for all the pointers and I have also added a check for null pointers.
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
Full source code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#define inputSize 81
void getSentence(char sentence [], int size);
int countWord(char sentence[]);
char ***parseSentence(char sentence[], int *count);
char *translate(char *world);
char *translateSentence(char ***words, int count);
int main(void){
/* Local definition*/
char sentence[inputSize];
int wordsCnt;
char ***head;
char *result;
getSentence(sentence, inputSize);
head = parseSentence(sentence, &wordsCnt);
result = translateSentence(head, wordsCnt);
printf("\nFinish the translation: \n");
printf("%s", result);
return 0;
}
void getSentence(char sentence [81], int size){
char *input = (char *)malloc(size);
int length;
printf("Input the sentence to big latin : ");
fflush(stdout);
fgets(input, size, stdin);
// do not copy the newline character at index length - 1
// add back the terminator
length = strlen(input);
strncpy(sentence, input, length-1);
sentence[length-1]='\0';
free(input);
}
int countWord(char sentence[]){
int count=0;
/*Copy string for counting */
int length = strlen(sentence);
char *temp = (char *)malloc(length+1);
strcpy(temp, sentence);
/* Counting */
char *pToken = strtok(temp, " ");
char *last = NULL;
assert(pToken == temp);
while (pToken){
count++;
pToken = strtok(NULL, " ");
}
free(temp);
return count;
}
char ***parseSentence(char sentence[], int *count){
// parse the sentence into string tokens
// save string tokens as a array
// and assign the first one element to the head
char *pToken;
char ***words;
char *pW;
int noWords = countWord(sentence);
*count = noWords;
/* Initialize array */
int i;
words = (char ***)calloc(noWords+1, sizeof(char **));
for (i = 0; i< noWords; i++){
words[i] = (char **)malloc(sizeof(char *));
}
/* Parse string */
// first element
pToken = strtok(sentence, " ");
if (pToken){
pW = (char *)malloc(strlen(pToken)+1);
strcpy(pW, pToken);
**words = pW;
/***words = pToken;*/
// other elements
for (i=1; i<noWords; i++){
pToken = strtok(NULL, " ");
pW = (char *)malloc(strlen(pToken)+1);
strcpy(pW, pToken);
**(words + i) = pW;
/***(words + i) = pToken;*/
}
}
/* Loop control */
words[noWords] = NULL;
return words;
}
/* Translate a word into pig latin */
char *translate(char *word){
int length = strlen(word);
char *bigLatin = (char *)malloc(length+3);
/* translate the word into pig latin */
static char *vowel = "AEIOUaeiou";
char *matchLetter;
matchLetter = strchr(vowel, *word);
// consonant
if (matchLetter == NULL){
// copy the letters except the head
// length = length of string without terminator
// cat the head and add ay
// this will copy the terminator,
strncpy(bigLatin, word+1, length);
strncat(bigLatin, word, 1);
strcat(bigLatin, "ay");
}
// vowel
else {
// just append "ay"
strcpy(bigLatin, word);
strcat(bigLatin, "ay");
}
return bigLatin;
}
char *translateSentence(char ***words, int count){
char *bigLatinSentence;
int length = 0;
char *bigLatinWord;
/* calculate the sum of the length of the words */
char ***walker = words;
while (*walker){
length += strlen(**walker);
walker++;
}
/* allocate space for return string */
// one space between 2 words
// numbers of space required =
// length of words
// + (no. of words * of a spaces (1) -1 )
// + terminator
// + (no. of words * ay (2) )
int lengthOfResult = length + count + (count * 2);
bigLatinSentence = (char *)malloc(lengthOfResult);
// trick to initialize the first memory
strcpy(bigLatinSentence, "");
/* Translate each word */
int i;
char *w;
for (i=0; i<count; i++){
w = translate(**(words + i));
strcat(bigLatinSentence, w);
strcat(bigLatinSentence, " ");
assert(w != **(words + i));
free(w);
}
/* free memory of big latin words */
walker = words;
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
return bigLatinSentence;
}
Your code is unnecessarily complicated, because you have set things up such that:
n: the number of words
words: points to allocated memory that can hold n+1 char ** values in sequence
words[i] (0 <= i && i < n): points to allocated memory that can hold one char * in sequence
words[n]: NULL
words[i][0]: points to allocated memory for a word (as before, 0 <= i < n)
Since each words[i] points to stuff-in-sequence, there is a words[i][j] for some valid integer j ... but the allowed value for j is always 0, as there is only one char * malloc()ed there. So you could eliminate this level of indirection entirely, and just have char **words.
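For illustration, here is a minimal sketch of what the flattened char **words layout might look like. The names parse_words, free_words and skip_spaces are hypothetical (they are not in the original code), and the sketch scans with strspn()/strcspn() instead of strtok() so that the input sentence is left unmodified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Advance past any leading spaces. */
static const char *skip_spaces(const char *p)
{
    return p + strspn(p, " ");
}

/* Free a NULL-terminated array built by parse_words(). */
void free_words(char **words)
{
    size_t i;

    if (words == NULL)
        return;
    for (i = 0; words[i] != NULL; i++)
        free(words[i]);     /* each word */
    free(words);            /* then the array itself, exactly once */
}

/* Hypothetical replacement for parseSentence(): returns a NULL-terminated
 * array of heap-allocated word copies, or NULL if an allocation fails. */
char **parse_words(const char *sentence, int *count)
{
    const char *p;
    char **words;
    int n = 0, i;

    assert(sentence != NULL && count != NULL);

    /* first pass: count the words */
    for (p = skip_spaces(sentence); *p != '\0';
         p = skip_spaces(p + strcspn(p, " ")))
        n++;

    words = malloc(((size_t)n + 1) * sizeof *words);
    if (words == NULL)
        return NULL;
    for (i = 0; i <= n; i++)
        words[i] = NULL;    /* keeps the array NULL-terminated throughout */

    /* second pass: copy each word */
    p = skip_spaces(sentence);
    for (i = 0; i < n; i++) {
        size_t len = strcspn(p, " ");

        words[i] = malloc(len + 1);
        if (words[i] == NULL) {
            free_words(words);
            return NULL;
        }
        memcpy(words[i], p, len);
        words[i][len] = '\0';
        p = skip_spaces(p + len);
    }

    *count = n;
    return words;
}
```

With this layout each word is simply words[i], and cleanup is one free() per word followed by a single free() of the array.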
That's not the problem, though. The freeing loop starts with walker identical to words, so it first frees words[0][0] (which is fine and works), then frees words[0] (which is fine and works), then frees words itself (which also works, but means you can no longer access any other words[i] for any value of i, i.e., a "storage leak"). Then it increments walker, making it more or less equivalent to &words[1]; but words has already been free()d, so the next test of *walker reads freed memory, and the next free(walker) is handed a pointer that malloc() never returned. Both are undefined behavior, which is why the code can appear to work on one system and crash on another.
Instead of using walker here, I'd use a loop with some integer i:
for (i = 0; words[i] != NULL; i++) {
free(words[i][0]);
free(words[i]);
}
free(words);
I'd also recommend removing all the casts on malloc() and calloc() return values. If you get compiler warnings after doing this, they usually mean one of two things:
you've forgotten to #include <stdlib.h>, or
you're invoking a C++ compiler on your C code.
The latter sometimes works but is a recipe for misery: good C code is bad C++ code and good C++ code is not C code. :-)
Edit: PS: I missed the off-by-one lengthOfResult that @David RF caught.
int lengthOfResult = length + count + (count * 2);
must be
int lengthOfResult = length + count + (count * 2) + 1; /* + 1 for final '\0' */
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
/* free(walker); Don't do this, you still need walker */
walker++;
}
free(words); /* Now */
And you have a leak:
int main(void)
{
...
free(result); /* You have to free the return of translateSentence() */
return 0;
}
In this code:
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
Checking that **walker is not NULL before freeing it does no harm, but it is not required: free(NULL) is defined to do nothing.
Also, when you compute the amount of memory needed for the returned string, you are one byte short: you copy each word PLUS A SPACE (including a space after the last word) PLUS THE TERMINATING \0. In other words, when you build the result in bigLatinSentence, you write one byte past the end of the allocation. Sometimes you get away with that, and sometimes you don't...
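To make the arithmetic concrete, here is a small sketch that adds up the bytes the result needs. The function name pig_latin_buffer_size is made up for this example, and it assumes a plain char ** array of words rather than the char *** used in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to hold every translated word,
 * each followed by one space, plus the terminating '\0'. */
size_t pig_latin_buffer_size(char *const *words, int count)
{
    size_t total = 1;   /* the terminating '\0' */
    int i;

    assert(words != NULL || count == 0);

    for (i = 0; i < count; i++)
        total += strlen(words[i]) + 2 + 1;  /* word + "ay" + one space */
    return total;
}
```

For the two words "hello" and "world" the output is "ellohay orldway " (16 characters), so the buffer needs 17 bytes, while the original formula length + count + (count * 2) gives only 16.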
Wow, so I was intrigued by this, and it took me a while to figure out.
Now that I figured it out, I feel dumb.
What I noticed from running under gdb is that the thing failed on the second run through the loop on the line
free(walker);
Now why would that be so? This is where I feel dumb for not seeing it right away. The first time you run that line, you free the whole array of char ** pointers at words (which is what walker points to on the first pass). On the second pass, when you run that line again, you're passing free() a pointer into memory that has already been freed.
So it should be:
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
walker++;
}
free(words);
Edit:
I also want to note that you don't have to cast from void * in C.
So when you call malloc, you don't need the (char *) in there.
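As a quick illustration of the cast-free style, here is a tiny sketch. The helper name copy_string is hypothetical and is not part of the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of s, or NULL if malloc fails.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);

    len = strlen(s);
    copy = malloc(len + 1);         /* void * converts implicitly: no cast */
    if (copy != NULL)
        memcpy(copy, s, len + 1);   /* includes the terminating '\0' */
    return copy;
}
```

If this compiles without a cast only after you add #include <stdlib.h>, that is exactly the kind of mistake the cast would have hidden.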