Read a text file into an array in plain C - c

Is there a way to read a text file into a one dimensional array in plain C? Here's what I tried (I am writing hangman):
int main() {
printf("Welcome to hangman!");
char buffer[81];
FILE *dictionary;
int random_num;
int i;
char word_array[80368];
srand ( time(NULL) );
random_num = rand() % 80368 + 1;
dictionary = fopen("dictionary.txt", "r");
while (fgets(buffer, 80, dictionary) != NULL){
printf(buffer); //just to make sure the code worked;
for (i = 1; i < 80368; i++) {
word_array[i] = *buffer;
}
}
printf("%s, \n", word_array[random_num]);
return 0;
}
What's wrong here?

Try changing a couple of things;
First; you're storing a single char. word_array[i] = *buffer; means to copy a single character (the first one on the line/in the buffer) into each (and every) single-char slot in word_array.
Secondly, your array will hold 80K characters, not 80K words. Assuming that that's the length of your dictionary file, you can't fit it all in there using that loop.
I'm assuming you have 80,368 words in your dictionary file. That's about 400,000 words less than /usr/share/dict/words on my workstation, though, but sounds like a reasonable size for hangman…
If you want a one-dimensional array intentionally, for some reason, you'll have to do one of three things:
pretend you're on a mainframe, and use 80 chars for every word:
char word_array[80368 * 80];
memcpy (&(word_array[80 * i]), buffer, 80);
create a parallel array with indices to the start of each line in a huge buffer
int last_char = 0;
char* word_start[80368];
char word_array[80368 * 80];
for ( … i++ ) {
memcpy (&word_array[last_char], buffer, strlen(buffer));
word_start[i] = last_char;
last_char += strlen(buffer);
}
switch to using an array of pointers to char, one word per slot.
char* word_array[80368];
for (int i = 0; i < 80368, i++) {
fgets (buffer, 80, dictionary);
word_array[i] = strdup (buffer);
}
I'd recommend the latter, as otherwise you have to guess at the max size or waste a lot of RAM while reading. (If your average word length is around 4-5 chars, as in English, you're on average wasting 75 bytes per word.)
I'd also recommend dynamically allocating the word_array:
int max_word = 80368;
char** word_array = malloc (max_word * sizeof (char*));
… which can lead you to a safer read, if your dictionary size ever were to change:
int i = 0;
while (1) {
/* If we've exceeded the preset word list size, increase it. */
if ( i > max_word ) {
max_word *= 1.2; /* tunable arbitrary value */
word_array = realloc (word_array, max_word * sizeof(char*));
}
/* Try to read a line, and… */
char* e = fgets (buffer, 80, dictionary);
if (NULL == e) { /* end of file */
/* free any unused space */
word_array = realloc (word_array, i * sizeof(char*));
/* exit the otherwise-infinite loop */
break;
} else {
/* remove any \r and/or \n end-of-line chars */
for (char *s = &(buffer[0]); s < &(buffer[80]); ++s) {
if ('\r' == *s || '\n' == *s || '\0' == *s) {
*s = '\0'; break;
}
}
/* store a copy of the word, only, and increment the counter.
* Note that `strdup` will only copy up to the end-of-string \0,
* so you will only allocate enough memory for actual word
* lengths, terminal \0's, and the array of pointers itself. */
*(word_array + i++) = strdup (buffer);
}
}
/* when we reach here, word_array is guaranteed to be the right size */
random = rand () % max_word;
printf ("random word #%d: %s\n", random, *(word_array + random));
Sorry, this is posted in an hurry, so I haven't tested the above. Caveat emptor.

This part is wrong:
while (fgets(buffer, 80, dictionary) != NULL){
printf(buffer); //just to make sure the code worked;
for (i = 1; i < 80368; i++) {
word_array[i] = *buffer;
}
}
You are copying 80368 chars from buffer which has size 81. Change it to:
i = 0;
while (fgets(buffer, 80, dictionary) != NULL){
printf(buffer); //just to make sure the code worked;
for (j = 0; j < 80; j++) {
word_array[i++] = buffer[j];
}
}

Related

Reading the words of a file into a dynamic 2D array

I am trying to read a file and store every word into a dynamically allocated 2D array. The size of the input file is unknown.
I am totally lost and don't know how I could "fix/finish" the program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char filename[25];
printf("Input the filename");
scanf("%s", filename);
fileConverter(filename);
}
int fileConverter(char filename[25]) {
//int maxLines = 50000;
//int maxWordSize = 128;
//char words[maxLines][maxWordSize];
//char **words;
char **arr = (char**) calloc(num_elements, sizeof(char*));
for ( i = 0; i < num_elements; i++ ) {
arr[i] = (char*) calloc(num_elements_sub, sizeof(char));
}
FILE *file = NULL;
int amountOfWords = 0;
file = fopen(filename, "r");
if(file == NULL) {
exit(0);
}
while(fgets(words[amountOfWords], 10000, file)) {
words[amountOfWords][strlen(words[amountOfWords]) - 1] = "\0";
amountOfWords++;
}
for(int i = 0; i < amountOfWords; i++) {
printf("a[%d] = ", i);
printf("%s\n", words[i]);
}
printf("The file contains %d words and the same amount of lines.\n", amountOfWords);
return amountOfWords;
The main challenges for this kind of problem are
reallocating the array of strings as the program reads new words, and
handling words that are larger than the buffer used by fgets.
The general approach for these kind of parsing problems, is to design a state machine. The state machine here has two states:
The current character is whitespace. Action: Continue reading whitespace until we reach the end of the buffer, or until we land on a non-whitespace character, in which case we switch to state 2.
The current character is non-whitespace (i.e. a word). Action: Continue reading non-whitespace until we reach the end of the buffer, or until we land on a whitespace character, in which case we copy the word we just read to the array of strings and switch to state 1.
Particularly difficult is the case in which we are in state 2 and reach the end of the buffer. This means that this word spans multiple buffers. To accommodate for this, we deviate slightly from a direct state machine implementation. State 2 is slightly different, depending on if we are reading a new word or continuing one that was started in a previous buffer.
We now keep track of wordSize. If we start reading from the start of a buffer, but wordSize is not 0, then we know we are continuing a previous word and we know what size it was for the realloc we need.
Below is one possible implementation. All the work is done in the wordArrayRead function. Walking through it from the top of the function:
First we declare the variables that we need across lineBuffer reads: an index for the word itself and the length of the word we are currently reading, followed by the declaration of the buffer itself. The outside loop repeatedly reads using fgets until we have exhausted the input.
We start reading at index 0 and stop at the null-terminator. The first if-statement checks if we should be in state 2: either the current character is the start of a word or we were already reading a word.
State 2
The index wordStartIdx stays at the first character of the word (segment) and we walk the wordEndIdx to the end of the word (segment) or to the end of the buffer.
We then check if we need to increase the size of the array of strings. Here we increase it to 2 times + 1 the previous size to avoid frequent reallocations.
We set a boolean value, indication whether we have reached the end of a word. If we have, we need to allocate for and write the null-terminator at the end of the string.
If wordLength == 0 it means we are reading a new word and have to allocate memory for it for the first time. If wordLength != 0, we have to reallocate to append to an existing word.
We copy the word (segment) currently in the lineBuffer to the array of strings.
Now, we do some bookkeeping. If we reached the end of a word, we write the null-terminator, increment the index to point to the next word location and reset wordLength. If this wasn't the case, we only increment the wordLength with the length of the segment we just read. Finally, we update wordStartIdx, which still points to the start of the word, to point to the end of the word, so we can continue iterating over the buffer.
State 1
Having finishing the State 2 processing, we go into State 1 which has only two lines. It simply advances the index until we land at non-whitespace. Note that the null-terminator of the lineBuffer ('\0') does not count as whitespace, so this loop will not continue past the end of the buffer.
After all input has been processed, we shrink the array of strings to the actual size of its data. This "corrects" the allocation policy of increasing the size by 2n+1 each time it wasn't large enough.
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// BUFFER_SIZE must be >1U
#define BUFFER_SIZE 1024U
struct WordArray
{
char **words;
size_t numberOfWords;
};
static struct WordArray wordArrayConstruct(void);
static void wordArrayResize(struct WordArray *wordArray, size_t const newSize);
static void wordArrayDestruct(struct WordArray *wordArray);
static void wordArrayRead(FILE *restrict stream, struct WordArray *wordArray);
static char *reallocStringWrapper(char *restrict str, size_t const newSize);
static void wordArrayPrint(struct WordArray const *wordArray);
int main(void)
{
struct WordArray wordArray = wordArrayConstruct();
wordArrayRead(stdin, &wordArray);
wordArrayPrint(&wordArray);
wordArrayDestruct(&wordArray);
}
static void wordArrayRead(FILE *restrict stream, struct WordArray *wordArray)
{
size_t wordArrayIdx = 0U;
size_t wordLength = 0U;
char lineBuffer[BUFFER_SIZE];
while (fgets(lineBuffer, sizeof lineBuffer, stream) != NULL)
{
size_t wordStartIdx = 0U;
while (lineBuffer[wordStartIdx] != '\0')
{
if (!isspace(lineBuffer[wordStartIdx]) || wordLength != 0U)
{
size_t wordEndIdx = wordStartIdx;
while (!isspace(lineBuffer[wordEndIdx]) && wordEndIdx != BUFFER_SIZE - 1U)
++wordEndIdx;
if (wordArrayIdx >= wordArray->numberOfWords)
wordArrayResize(wordArray, wordArray->numberOfWords * 2U + 1U);
size_t wordSegmentLength = wordEndIdx - wordStartIdx;
size_t foundWordEnd = wordEndIdx != BUFFER_SIZE - 1U; // 0 or 1 bool
// Allocate for a new word, or reallocate for an existing word
// If a word end was found, add 1 to the size for the '\0' character
char *dest = wordLength == 0U ? NULL : wordArray->words[wordArrayIdx];
size_t allocSize = wordLength + wordSegmentLength + foundWordEnd;
wordArray->words[wordArrayIdx] = reallocStringWrapper(dest, allocSize);
memcpy(&(wordArray->words[wordArrayIdx][wordLength]),
&lineBuffer[wordStartIdx], wordSegmentLength);
if (foundWordEnd)
{
wordArray->words[wordArrayIdx][wordLength + wordSegmentLength] = '\0';
++wordArrayIdx;
wordLength = 0U;
}
else
{
wordLength += wordSegmentLength;
}
wordStartIdx = wordEndIdx;
}
while (isspace(lineBuffer[wordStartIdx]))
++wordStartIdx;
}
}
// All done. Shrink the words array to the size of the actual data
if (wordArray->numberOfWords != 0U)
wordArrayResize(wordArray, wordArrayIdx);
}
static struct WordArray wordArrayConstruct(void)
{
return (struct WordArray) {.words = NULL, .numberOfWords = 0U};
}
static void wordArrayResize(struct WordArray *wordArray, size_t const newSize)
{
assert(newSize > 0U);
char **tmp = (char**) realloc(wordArray->words, newSize * sizeof *wordArray->words);
if (tmp == NULL)
{
wordArrayDestruct(wordArray);
fprintf(stderr, "WordArray allocation error\n");
exit(EXIT_FAILURE);
}
wordArray->words = tmp;
wordArray->numberOfWords = newSize;
}
static void wordArrayDestruct(struct WordArray *wordArray)
{
for (size_t wordStartIdx = 0U; wordStartIdx < wordArray->numberOfWords; ++wordStartIdx)
{
free(wordArray->words[wordStartIdx]);
wordArray->words[wordStartIdx] = NULL;
}
free(wordArray->words);
}
static char *reallocStringWrapper(char *restrict str, size_t const newSize)
{
char *tmp = (char*) realloc(str, newSize);
if (tmp == NULL)
{
free(str);
fprintf(stderr, "Realloc string allocation error\n");
exit(EXIT_FAILURE);
}
return tmp;
}
static void wordArrayPrint(struct WordArray const *wordArray)
{
for (size_t wordStartIdx = 0U; wordStartIdx < wordArray->numberOfWords; ++wordStartIdx)
printf("%zu: %s\n", wordStartIdx, wordArray->words[wordStartIdx]);
}
Note: This program reads input from stdin, as Unix/Linux utilities typically do. Use input redirection to read from a file, or provide a file descriptor to the readWordArray function.
to allocate dynamic 2D array you need:
void allocChar2Darray(size_t rows, size_t columns, char (**array)[columns])
{
*array = malloc(rows * sizeof(**array));
}

C loop to read lines of input

I want to create a program in C that takes an arbitrary number of lines of arbitrary length as input and then prints to console the last line that was inputted. For example:
input:
hi
my name is
david
output: david
I figured the best way to do this would be to have a loop that takes each line as input and stores it in a char array, so at the end of the loop the last line ends up being what is stored in the char array and we can just print that.
I have only had one lecture in C so far so I think I just keep setting things up wrong with my Java/C++ mindset since I have more experience in those languages.
Here is what I have so far but I know that it's nowhere near correct:
#include <stdio.h>
int main()
{
printf("Enter some lines of strings: \n");
char line[50];
for(int i = 0; i < 10; i++){
line = getline(); //I know this is inproper syntax but I want to do something like this
}
printf("%s",line);
}
I also have i < 10 in the loop because I don't know how to find the total number of lines in the input which, would be the proper amount of times to loop this. Also, the input is being put in all at once from the
./program < test.txt
command in Unix shell, where test.txt has the input.
Use fgets():
while (fgets(line, sizeof line, stdin)) {
// don't need to do anything here
}
printf("%s", line);
You don't need a limit on the number of iterations. At the end of the file, fgets() returns NULL and doesn't modify the buffer, so line will still hold the last line that was read.
I'm assuming you know the maximum length of the input line.
This one here will surely do the job for you
static char *getLine( char * const b , size_t bsz ) {
return fgets(b, bsz, stdin) );
}
But remember fgets also puts a '\n' character at the end of buffer so perhaps something like this
static char *getLine( char * const b , size_t bsz ) {
if( fgets(b, bsz, stdin) ){
/* Optional code to strip NextLine */
size_t size = strlen(b);
if( size > 0 && b[size-1] == '\n' ) {
b[--size] = '\0';
}
/* End of Optional Code */
return b;
}
return NULL;
}
and your code needs to be altered a bit while calling the getline
#define BUF_SIZE 256
char line[BUF_SIZE];
for(int i = 0; i < 10; i++){
if( getLine(line, BUF_SIZE ) ) {
fprintf(stdout, "line : '%s'\n", line);
}
}
Now it is how ever quite possible to create function like
char *getLine();
but then one needs to define the behavior of that function for instance if the function getLine() allocates memory dynamically then you probably need use a free to de-allocate the pointer returned by getLine()
in which case the function may look like
char *getLine( size_t bsz ) {
char *b = malloc( bsz );
if( b && fgets(b, bsz, stdin) ){
return b;
}
return NULL;
}
depending on how small your function is you can entertain thoughts about making it inline perhaps that's a little off topic for now.
In order to have dynamic number of input of dynamic length, you have to keep on reallocating your buffer when the input is of greater length. In order to store the last line, you have to take another pointer to keep track of it and to stop the input from the terminal you have to press EOF key(ctrl+k). This should do your job.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *get_last_line(FILE* fp, size_t size){
//The size is extended by the input with the value of the provisional
char *str, *last_str = NULL;
int ch;
size_t len = 0, last_len = 0;
str = realloc(NULL, sizeof(char)*size);//size is start size
if(!str)return str;
while(ch=fgetc(fp)){
if(ch == EOF){
break;
}
if(ch == '\n'){
str[len]='\0';
last_len = len;
last_str = realloc(last_str,sizeof(char)*last_len);
last_str[last_len]='\0';
//storing the last line
memcpy(last_str,str,sizeof(char)*last_len);
str = realloc(NULL, sizeof(char)*size);//size is start size
len = 0;
}
else {
str[len++]=ch;
if(len==size){
str = realloc(str, sizeof(char)*(size+=16));
if(!str)return str;
}
}
}
free(str);
return last_str;
}
int main(void){
char *m;
printf("input strings : ");
m = get_last_line(stdin, 10);
printf("last string :");
printf("%s\n", m);
free(m);
return 0;
}

How many elements in string array?

I want to find how many names in names array. I know sizeof(names)/sizeof(names[0]) gives the right answer. But the problem is I can't just declare char *names[];. Because compiler gives me an error like this "Storage of names is unknown". To avoid this error, I must declare like this char *names[] = {"somename", "somename2"};. But the thing is I cannot assign the strings right after deceleration. I assign strings after some conditions and my problem is how many strings i have after that conditions.
My example.
char *names[];
char word[10];
int i = 0;
while (fscanf(word, sizeof(word), fp)>0) {
// Think hello increase every time loop returns.
// such as "hello1", and the 2nd time "hello2"
if(strcmp(word, "hello1") == 0)
names[i] = word;
}
printf("size: %d\n", sizeof(names)/sizeof(names[0]));
An array size is fixed once the array is created. It cannot change.
If fp can be read twice, read the file once for the word count.
size_t word_count = 0;
int word_length_max = 0;
long pos = ftell(fp); // remember file location
int n = 0;
while (fscanf(fp, "%*s%n", &n) != EOF && n > 0) { // Use %n to record character count
word_count++;
if (n > word_length_max) {
word_length_max = n;
}
n = 0;
}
Now code knows the word[] array size needed and the maximum length.
char *words[word_count];
char word[word_length_max + 1u]; // buffer size needed to read in the words
fseek(fp, pos, SEEK_SET); // go back
for (size_t i=0; i<word_count; i++) {
if (fscanf(fp, "%s", word) != 1) {
Handle_UnexpectedError(); // 2nd pass should have same acceptable results
}
words[i] = strdup(word); // allocate a duplicate
}
When done with words[], be sure to free the allocated memory.
....
for (size_t i=0; i<word_count; i++) {
free(words[i]);
}
Better code would also check the return value of ftell(), fseek(), malloc() for errors and limit fscanf(fp, "%s", word).

Scanf unknown number of string arguments C

I wanted to know if there was a way to use scanf so I can take in an unknown number of string arguments and put them into a char* array. I have seen it being done with int values, but can't find a way for it to be done with char arrays. Also the arguments are entered on the same line separated by spaces.
Example:
user enters hello goodbye yes, hello gets stored in array[0], goodbye in array[1] and yes in array[2]. Or the user could just enter hello and then the only thing in the array would be hello.
I do not really have any code to post, as I have no real idea how to do this.
You can do something like, read until the "\n" :
scanf("%[^\n]",buffer);
you need to allocate before hand a big enough buffer.
Now go through the buffer count the number of words, and allocate the necessary space char **array = ....(dynamic string allocation), go to the buffer and copy string by string into the array.
An example:
int words = 1;
char buffer[128];
int result = scanf("%127[^\n]",buffer);
if(result > 0)
{
char **array;
for(int i = 0; buffer[i]!='\0'; i++)
{
if(buffer[i]==' ' || buffer[i]=='\n' || buffer[i]=='\t')
{
words++;
}
}
array = malloc(words * sizeof(char*));
// Using RoadRunner suggestion
array[0] = strtok (buffer," ");
for(int w = 1; w < words; w++)
{
array[w] = strtok (NULL," ");
}
}
As mention in the comments you should use (if you can) fgets instead fgets(buffer,128,stdin);.
More about strtok
If you have an upper bound to the number of strings you may receive from the user, and to the number of characters in each string, and all strings are entered on a single line, you can do this with the following steps:
read the full line with fgets(),
parse the line with sscanf() with a format string with the maximum number of %s conversion specifiers.
Here is an example for up to 10 strings, each up to 32 characters:
char buf[400];
char s[10][32 + 1];
int n = 0;
if (fgets(buf, sizeof buf, sdtin)) {
n = sscanf("%32s%32s%32s%32s%32s%32s%32s%32s%32s%32s",
s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], s[8], s[9]));
}
// `n` contains the number of strings
// s[0], s[1]... contain the strings
If the maximum number is not known of if the maximum length of a single string is not fixed, or if the strings can be input on successive lines, you will need to iterate with a simple loop:
char buf[200];
char **s = NULL;
int n;
while (scanf("%199s", buf) == 1) {
char **s1 = realloc(s, (n + 1) * sizeof(*s));
if (s1 == NULL || (s1[n] = strdup(buf)) == NULL) {
printf("allocation error");
exit(1);
}
s = s1;
n++;
}
// `n` contains the number of strings
// s[0], s[1]... contain pointers to the strings
Aside from the error handling, this loop is comparable to the hard-coded example above but it still has a maximum length for each string. Unless you can use a scanf() extension to allocate the strings automatically (%as on GNU systems), the code will be more complicated to handle any number of strings with any possible length.
You can use:
fgets to read input from user. You have an easier time using this instead of scanf.
malloc to allocate memory for pointers on the heap. You can use a starting size, like in this example:
size_t currsize = 10
char **strings = malloc(currsize * sizeof(*strings)); /* always check
return value */
and when space is exceeded, then realloc more space as needed:
currsize *= 2;
strings = realloc(strings, currsize * sizeof(*strings)); /* always check
return value */
When finished using the requested memory from malloc() and realloc(), it's always to good to free the pointers at the end.
strtok to parse the input at every space. When copying over the char * pointer from strtok(), you must also allocate space for strings[i], using malloc() or strdup.
Here is an example I wrote a while ago which does something very similar to what you want:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define INITSIZE 10
#define BUFFSIZE 100
int
main(void) {
char **strings;
size_t currsize = INITSIZE, str_count = 0, slen;
char buffer[BUFFSIZE];
char *word;
const char *delim = " ";
int i;
/* Allocate initial space for array */
strings = malloc(currsize * sizeof(*strings));
if(!strings) {
printf("Issue allocating memory for array of strings.\n");
exit(EXIT_FAILURE);
}
printf("Enter some words(Press enter again to end): ");
while (fgets(buffer, BUFFSIZE, stdin) != NULL && strlen(buffer) > 1) {
/* grow array as needed */
if (currsize == str_count) {
currsize *= 2;
strings = realloc(strings, currsize * sizeof(*strings));
if(!strings) {
printf("Issue reallocating memory for array of strings.\n");
exit(EXIT_FAILURE);
}
}
/* Remove newline from fgets(), and check for buffer overflow */
slen = strlen(buffer);
if (slen > 0) {
if (buffer[slen-1] == '\n') {
buffer[slen-1] = '\0';
} else {
printf("Exceeded buffer length of %d.\n", BUFFSIZE);
exit(EXIT_FAILURE);
}
}
/* Parsing of words from stdin */
word = strtok(buffer, delim);
while (word != NULL) {
/* allocate space for one word, including nullbyte */
strings[str_count] = malloc(strlen(word)+1);
if (!strings[str_count]) {
printf("Issue allocating space for word.\n");
exit(EXIT_FAILURE);
}
/* copy strings into array */
strcpy(strings[str_count], word);
str_count++;
word = strtok(NULL, delim);
}
}
/* print and free strings */
printf("Your array of strings:\n");
for (i = 0; i < str_count; i++) {
printf("strings[%d] = %s\n", i, strings[i]);
free(strings[i]);
strings[i] = NULL;
}
free(strings);
strings = NULL;
return 0;
}

"Pointer being freed was not allocated" happen on mac but not on window7

I am doing an exercise on a book, changing the words in a sentence into pig latin. The code works fine in window 7, but when I compiled it in mac, the error comes out.
After some testings, the error comes from there. I don't understand the reason of this problem. I am using dynamic memories for all the pointers and I have also added the checking of null pointer.
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
Full source code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#define inputSize 81
void getSentence(char sentence [], int size);
int countWord(char sentence[]);
char ***parseSentence(char sentence[], int *count);
char *translate(char *world);
char *translateSentence(char ***words, int count);
int main(void){
/* Local definition*/
char sentence[inputSize];
int wordsCnt;
char ***head;
char *result;
getSentence(sentence, inputSize);
head = parseSentence(sentence, &wordsCnt);
result = translateSentence(head, wordsCnt);
printf("\nFinish the translation: \n");
printf("%s", result);
return 0;
}
void getSentence(char sentence [81], int size){
char *input = (char *)malloc(size);
int length;
printf("Input the sentence to big latin : ");
fflush(stdout);
fgets(input, size, stdin);
// do not copy the return character at inedx of length - 1
// add back delimater
length = strlen(input);
strncpy(sentence, input, length-1);
sentence[length-1]='\0';
free(input);
}
int countWord(char sentence[]){
int count=0;
/*Copy string for counting */
int length = strlen(sentence);
char *temp = (char *)malloc(length+1);
strcpy(temp, sentence);
/* Counting */
char *pToken = strtok(temp, " ");
char *last = NULL;
assert(pToken == temp);
while (pToken){
count++;
pToken = strtok(NULL, " ");
}
free(temp);
return count;
}
char ***parseSentence(char sentence[], int *count){
// parse the sentence into string tokens
// save string tokens as a array
// and assign the first one element to the head
char *pToken;
char ***words;
char *pW;
int noWords = countWord(sentence);
*count = noWords;
/* Initiaze array */
int i;
words = (char ***)calloc(noWords+1, sizeof(char **));
for (i = 0; i< noWords; i++){
words[i] = (char **)malloc(sizeof(char *));
}
/* Parse string */
// first element
pToken = strtok(sentence, " ");
if (pToken){
pW = (char *)malloc(strlen(pToken)+1);
strcpy(pW, pToken);
**words = pW;
/***words = pToken;*/
// other elements
for (i=1; i<noWords; i++){
pToken = strtok(NULL, " ");
pW = (char *)malloc(strlen(pToken)+1);
strcpy(pW, pToken);
**(words + i) = pW;
/***(words + i) = pToken;*/
}
}
/* Loop control */
words[noWords] = NULL;
return words;
}
/* Translate a world into big latin */
char *translate(char *word){
int length = strlen(word);
char *bigLatin = (char *)malloc(length+3);
/* translate the word into pig latin */
static char *vowel = "AEIOUaeiou";
char *matchLetter;
matchLetter = strchr(vowel, *word);
// consonant
if (matchLetter == NULL){
// copy the letter except the head
// length = lenght of string without delimiter
// cat the head and add ay
// this will copy the delimater,
strncpy(bigLatin, word+1, length);
strncat(bigLatin, word, 1);
strcat(bigLatin, "ay");
}
// vowel
else {
// just append "ay"
strcpy(bigLatin, word);
strcat(bigLatin, "ay");
}
return bigLatin;
}
char *translateSentence(char ***words, int count){
char *bigLatinSentence;
int length = 0;
char *bigLatinWord;
/* calculate the sum of the length of the words */
char ***walker = words;
while (*walker){
length += strlen(**walker);
walker++;
}
/* allocate space for return string */
// one space between 2 words
// numbers of space required =
// length of words
// + (no. of words * of a spaces (1) -1 )
// + delimater
// + (no. of words * ay (2) )
int lengthOfResult = length + count + (count * 2);
bigLatinSentence = (char *)malloc(lengthOfResult);
// trick to initialize the first memory
strcpy(bigLatinSentence, "");
/* Translate each word */
int i;
char *w;
for (i=0; i<count; i++){
w = translate(**(words + i));
strcat(bigLatinSentence, w);
strcat(bigLatinSentence, " ");
assert(w != **(words + i));
free(w);
}
/* free memory of big latin words */
walker = words;
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
return bigLatinSentence;
}
Your code is unnecessarily complicated, because you have set things up such that:
n: the number of words
words: points to allocated memory that can hold n+1 char ** values in sequence
words[i] (0 <= i && i < n): points to allocated memory that can hold one char * in sequence
words[n]: NULL
words[i][0]: points to allocated memory for a word (as before, 0 <= i < n)
Since each words[i] points to stuff-in-sequence, there is a words[i][j] for some valid integer j ... but the allowed value for j is always 0, as there is only one char * malloc()ed there. So you could eliminate this level of indirection entirely, and just have char **words.
That's not the problem, though. The freeing loop starts with walker identical to words, so it first attempts to free words[0][0] (which is fine and works), then attempts to free words[0] (which is fine and works), then attempts to free words (which is fine and works but means you can no longer access any other words[i] for any value of i—i.e., a "storage leak"). Then it increments walker, making it more or less equivalent to &words[1]; but words has already been free()d.
Instead of using walker here, I'd use a loop with some integer i:
for (i = 0; words[i] != NULL; i++) {
free(words[i][0]);
free(words[i]);
}
free(words);
I'd also recommending removing all the casts on malloc() and calloc() return values. If you get compiler warnings after doing this, they usually mean one of two things:
you've forgotten to #include <stdlib.h>, or
you're invoking a C++ compiler on your C code.
The latter sometimes works but is a recipe for misery: good C code is bad C++ code and good C++ code is not C code. :-)
Edit: PS: I missed the off-by-one lengthOfResult that #David RF caught.
int lengthOfResult = length + count + (count * 2);
must be
int lengthOfResult = length + count + (count * 2) + 1; /* + 1 for final '\0' */
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
/* free(walker); Don't do this, you still need walker */
walker++;
}
free(words); /* Now */
And you have a leak:
int main(void)
{
...
free(result); /* You have to free the return of translateSentence() */
return 0;
}
In this code:
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
free(walker);
walker++;
}
You need to check that **walker is not NULL before freeing it.
Also - when you compute the length of memory you need to return the string, you are one byte short because you copy each word PLUS A SPACE (including a space after the last word) PLUS THE TERMINATING \0. In other words, when you copy your result into the bigLatinSentence, you will overwrite some memory that isn't yours. Sometimes you get away with that, and sometimes you don't...
Wow, so I was intrigued by this, and it took me a while to figure out.
Now that I figured it out, I feel dumb.
What I noticed from running under gdb is that the thing failed on the second run through the loop on the line
free(walker);
Now why would that be so. This is where I feel dumb for not seeing it right away. When you run that line, the first time, the whole array of char*** pointers at words (aka walker on the first run through) on the second run through, when your run that line, you're trying to free already freed memory.
So it should be:
while (walker != NULL && *walker != NULL){
free(**walker);
free(*walker);
walker++;
}
free(words);
Edit:
I also want to note that you don't have to cast from void * in C.
So when you call malloc, you don't need the (char *) in there.

Resources