I am trying to take input with fgets(). I know how many lines I will get but it changes and I store the number of lines in the variable var. I also have another variable named part; it is the length of the line I get, but since there are white spaces between the values I multiplied it by 2 (I couldn't find another solution; I could use some advice).
Anyway, I tried to get the input as in the code below, but when I entered the first line it automatically broke out of the for loop and printed random things. I think it has to do with the fgets() in the loop; I don't know whether fgets() can be used like this.
char inp[var][(2*part)];
int k,l;
for(k=0;k<=var;k++);
fgets(inp[k],(2*part),stdin);
printf("%c\n",inp[0]);
printf("%c\n",inp[1]);
printf("%c\n",inp[2]);
printf("%c\n",inp[3]);
…since there are white spaces between the values I multiplied it with 2…
If you aren't required to store everything on the stack, you can instead store the strings in dynamically allocated memory. For example:
char* inp[var];
char buf[400]; // just needs to be long
for (k = 0; k < var; k++) {
fgets(buf, 400, stdin);
inp[k] = malloc(sizeof(char) * (strlen(buf) + 1));
strcpy(inp[k], buf);
}
Although strdup was not part of ISO C before C23 (it comes from POSIX), it is widely available and makes this easier as well.
As far as the actual issue, as BLUEPIXY said in the comments above, you have a few typos.
After the for loop, the semicolon makes it act unexpectedly.
for(k=0;k<=var;k++);
fgets(inp[k],(2*part),stdin);
is actually the same as
for(k=0;k<=var;k++) {
; // do nothing
}
fgets(...);
Remove that semicolon after the for loop statement. As it is, you're not actually reading correctly, which is why you see garbage.
To print an entire string, the printf family needs the %s conversion specifier; %c prints only a single character.
With your bounds on k, there will actually be var + 1 iterations of the loop. If var were 3, then k = 0,1,2,3 -> terminate when k checked at 4.
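Putting the three fixes together (no stray semicolon, k < var, and %s for printing), the loop could look roughly like the sketch below. The function names read_lines and print_lines, and passing a FILE * instead of using stdin directly, are assumptions made here for illustration; var comes from the question and width stands in for the row size.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `var` lines from `in` into `inp`.
 * Each row holds at most `width - 1` characters plus the '\0'.
 * Returns the number of lines actually read. */
int read_lines(FILE *in, int var, int width, char inp[var][width])
{
    int k;
    for (k = 0; k < var; k++) {            /* k < var: exactly var iterations */
        if (fgets(inp[k], width, in) == NULL) {
            break;                         /* end of input or read error */
        }
    }
    return k;
}

/* Prints the stored lines; %s prints a whole string, %c only one char. */
void print_lines(int n, int width, char inp[n][width])
{
    for (int k = 0; k < n; k++) {
        printf("%s", inp[k]);              /* fgets keeps the '\n' */
    }
}
```

Note that fgets stores at most width - 1 characters. If every value is a single character, a line of part values separated by spaces takes 2 * part characters including the newline, so the row would need at least 2 * part + 1 bytes to hold the line and its '\0'.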
Typically, the safest and easiest way to use fgets is to allocate a single, large-enough line buffer. Use that to read the line, then copy it into correctly sized buffers.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
// Allocate just the space for the list, not the strings themselves.
int num_input = 5;
char *input[num_input];
// Allocate our reusable line buffer.
char line[1024];
for( int i = 0; i < num_input; i++ ) {
// Read into the line buffer.
fgets(line, 1024,stdin);
// Copy from the line buffer into correctly sized memory.
input[i] = strdup(line);
}
for( int i = 0; i < num_input; i++ ) {
printf("%s\n",input[i]);
}
}
Note that strdup() is not an ISO C function, but POSIX. It's common and standard enough. It's too handy not to use. Write your own if necessary.
That takes care of not knowing the line length.
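If strdup() is not available, the "write your own" suggestion could be sketched like this. The name dup_string is made up here to avoid clashing with a system strdup().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc'ed copy of s,
 * or NULL if the allocation fails. The caller must free() the result. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, s, size);     /* copies the '\0' as well */
    }
    return copy;
}
```

It would be used exactly like strdup in the loop above, for instance input[i] = dup_string(line);, ideally followed by a NULL check.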
If you don't know the number of lines you're storing, you'll have to grow the array. Typically this is done with realloc to reallocate the existing memory. Start with a small list size, then grow it as needed. Doubling is a good rough approximation that's a pretty efficient balance between speed (reallocating can be slow) and memory efficiency.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
int main(void) {
// How big the input list is.
size_t input_max = 64;
// How many elements are in it.
size_t input_size = 0;
// Allocate initial memory for the input list.
// Again, not for the strings, just for the list.
char **input = malloc( sizeof(char*) * input_max );
char line[1024];
while( fgets(line, 1024,stdin) != NULL ) {
// Check if we need to make the input list bigger.
if( input_size >= input_max ) {
// Double the max length.
input_max *= 2;
// Reallocate.
// Note: this is only safe because we're
// going to exit on error, otherwise we'd leak
// input's memory.
input = realloc( input, sizeof(char*) * input_max );
// Check for error.
if( input == NULL ) {
fprintf(stderr, "Could not reallocate input list to %zu: %s\n", input_max, strerror(errno) );
exit(1);
}
}
input[input_size] = strdup(line);
input_size++;
}
for( size_t i = 0; i < input_size; i++ ) {
printf("%s\n",input[i]);
}
}
As you can see, this gets a bit complicated. Now you need to keep track of the array, its maximum size, and its current size. Anyone using the array must remember to check its size and grow it, and remember to error check it. Your next impulse will be to create a struct to collect all that together, and functions to manage the list.
This is a good exercise in dynamic memory management, and I encourage you to do it. But for production code, use a pre-existing library. GLib is a good choice. It contains all sorts of handy data structures and functions that are missing from C, including pointer arrays that automatically grow. Use them, or something like it, in production code.
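For the exercise, a struct plus a few functions might look roughly like the following. This is only a sketch: the names strlist, strlist_init, strlist_push and strlist_free are invented here, and the starting capacity of 4 is arbitrary.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of strings; all names are illustrative. */
struct strlist {
    char **items;     /* the array of string pointers */
    size_t size;      /* how many elements are in use */
    size_t max;       /* how many elements are allocated */
};

/* Start with an empty list; storage is allocated on the first push. */
void strlist_init(struct strlist *list)
{
    list->items = NULL;
    list->size = 0;
    list->max = 0;
}

/* Appends a copy of str. Returns 0 on success, -1 on allocation
 * failure (no element is added in that case). */
int strlist_push(struct strlist *list, const char *str)
{
    if (list->size >= list->max) {
        size_t new_max = list->max ? list->max * 2 : 4;
        /* Use a temporary so the old array is not lost if realloc fails. */
        char **bigger = realloc(list->items, new_max * sizeof *bigger);
        if (bigger == NULL) {
            return -1;
        }
        list->items = bigger;
        list->max = new_max;
    }
    size_t len = strlen(str) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, str, len);
    list->items[list->size++] = copy;
    return 0;
}

/* Frees every string and the array itself. */
void strlist_free(struct strlist *list)
{
    for (size_t i = 0; i < list->size; i++) {
        free(list->items[i]);
    }
    free(list->items);
    strlist_init(list);
}
```

The reading loop then shrinks to calling strlist_push(&list, line) for each line, and the size and capacity bookkeeping lives in one place.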
Related
I am new to C, coming from Python. I want to read a .xyz file into a dynamically sized array, to use for various calculations later on in the program. The file is formatted as follows:
Title
Comment
Symbol 0.000 0.000 0.000
Symbol 0.000 0.000 0.000
....
The two first lines are not needed, and should just be skipped. The "Symbol" part of the file are chemical symbols--e.g. H, Au, C, Mn--as the .xyz file format is used for storing 3D coordinates of atoms. They need to be ignored as well. I'm interested in the space separated decimal numbers. I therefore want to:
Skip the first two lines, or just ignore them in some way.
Skip the first part of each line until the first space.
Store the three columns of numbers (coordinates) in an array.
So far I have been able to open a file for reading, and then I've attempted to check how long the file is, in order to have the size of the array change depending on how many coordinate sets needs to be stored.
// Variable declaration
FILE *fp;
long file_size;
// Open file and error checking
fp = fopen ("file_name" , "r");
if(!fp) perror("file_name"), exit(1);
// Check file size
fseek(fp, 0, SEEK_END);
file_size = ftell(fp);
rewind(fp);
// Close file
fclose(fp);
I've been able to skip the first two lines using fscanf(fp, "%*[^\n]"), to skip to the end of the line. But, I haven't been able to figure out how to loop through the rest of the file, while storing only the decimal numbers in an array.
If I understand correctly, I need to allocate memory for the array, using something like malloc() in combination with my file_size and then copy the data into the array using fread().
Here is an example of the contents of an actual .xyz file:
10 atom system
Energy: -914941.6614699
Ag 0.96834 1.51757 0.02281
Ag 0.96758 -1.51824 -0.02206
Ag -1.80329 2.27401 0.03179
Ag -3.58033 0.00046 0.00126
Ag -1.80447 -2.27338 -0.03537
Ag -0.96581 0.02246 -1.51755
Ag -0.96929 -0.02231 1.51463
Ag 1.80613 0.03321 -2.27213
Ag 3.58027 0.00028 0.00206
Ag 1.80086 -0.03407 2.27455
Here is a general approach in C for reading a file into an array of cstrings (pointers to cstrings, so the rough equivalent of a Python list of strings).
int count = 0; // line counter;
int char_count = 0; // char counter;
int max_len = 0; // for storing the longest line length
int c; // for measuring each line length
char **str_ptr_arr; // array of pointers to c-string
//extract characters from the file, looking for endlines; note that
//the EOF check has to come AFTER the getc(fp) to work properly
for (c = getc(fp); c != EOF; c = getc(fp)) { //edit see comments
char_count += 1;
if (c == '\n') { //safe comparison see comments
count += 1;
if (max_len < char_count) {
max_len = char_count; //gets longest line
}
char_count = 0;
}
}
//should probably do an feof check here
rewind(fp);
So now you have the number of lines and the length of the longest line, (You can try using the above loop to exclude lines if you want but it might just be easier to read the whole thing into an array of c-strings, then process that into an array of doubles). Now allocate the memory for the array of pointers to c-strings and for the c-strings themselves:
//allocate enough memory to hold all the strings in the file, by first
//allocating the arr of ptrs then a slot for each c-string pointed to:
str_ptr_arr = malloc(count * sizeof(char*)); //size of pointer
for (int i = 0; i < count; i++) {
str_ptr_arr[i] = malloc ((max_len + 1) * sizeof(char)); // +1 for '\0' terminate
}
rewind(fp); //rewind again;
Now, we have a problem, which is how to populate these cstrings (Python is so much easier!). This works, I'm not sure if it's the expert approach, but here we read into a temporary buffer then use strcpy to move the contents of the buffer into our allocated array slots:
for (int i = 0; i < count; i++) {
char buff[max_len + 1]; //local temporary buffer that can store any line in file
//fgets reads a whole line; fscanf with "%s" would stop at the first whitespace
if (fgets(buff, sizeof buff, fp) == NULL) break;
strcpy(str_ptr_arr[i], buff);
}
Note: this is a decent point at which to start excluding lines or removing various substrings from lines, as you can make strcpy conditional on the contents of the buffer by using other cstring functions. I'm fairly new at this myself (learning to write C functions for use in Python programs), but this seems to be the correct approach.
It might also be possible to go directly to a dynamically allocated array of doubles for storing your numerical data without bothering with the cstring array; that could be done in the last loop above. You could split the strings at whitespace, exclude the alphabetical parts, and use the standard function atof (or strtod, which also lets you detect errors) to convert to the double datatype.
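As a rough sketch of that last idea, one line can be converted directly with sscanf instead of splitting it by hand and calling atof. The function name parse_xyz_line is an assumption for illustration, and the sketch relies on every data line having the "Symbol x y z" layout from the question.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the three coordinates from one line of
 * the form "Symbol x y z". Returns 1 on success, 0 if the line does not
 * match (for example the title or comment line of the sample file). */
int parse_xyz_line(const char *line, double xyz[3])
{
    /* %*s reads and discards the symbol; %lf stores a double. */
    return sscanf(line, "%*s %lf %lf %lf", &xyz[0], &xyz[1], &xyz[2]) == 3;
}
```

Called once per line inside the copy loop, it lets you skip lines that return 0 and store the rest in an array of doubles. A title line that happens to look like a data line would be accepted too, so skipping the first two lines explicitly is still the safer option.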
Edit: I should mention all these memory allocations must be manually freed when you are done with them, and this is the approach:
for(int i = 0; i < count; i++) { // free each allocated cstring space
free(str_ptr_arr[i]);
}
free(str_ptr_arr); // free the cstring pointer space
str_ptr_arr = NULL;
Given, for example:
#define STORAGE_INCREMENT 128
typedef struct
{
double x, y, z ;
} sXYZ ;
Then:
int atom_count = 0 ;
int atom_capacity = STORAGE_INCREMENT ;
sXYZ* atoms = malloc( atom_capacity * sizeof(*atoms) ) ;
// While valid triplet, discard symbol, get x,y,z
while( fscanf( fp, "%*s%lf%lf%lf", &atoms[atom_count].x,
&atoms[atom_count].y,
&atoms[atom_count].z ) == 3 )
{
// Increment count
atom_count++ ;
// If capacity exhausted, expand allocation
if( atom_count == atom_capacity )
{
atom_capacity += STORAGE_INCREMENT ;
sXYZ* bigger = realloc( atoms, atom_capacity * sizeof(*atoms) ) ;
if( bigger == NULL )
{
break ;
}
atoms = bigger ;
}
}
This allocates enough space for 128 atoms initially, and if the space is exhausted, it is expanded by a further 128 atoms - indefinitely. A smaller value can be used if the files typically have fewer atoms to be a little more memory efficient. This approach saves you having to first count the number of triplets in the file.
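The loop above assumes the stream is already positioned after the title and comment lines. A possible way to wrap both steps in one function is sketched below; read_xyz and skip_line are names invented here, and the sketch doubles the capacity rather than adding a fixed increment, which is a swapped-in growth policy and not the one used above.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct { double x, y, z; } sXYZ;   /* same layout as above */

/* Discards characters up to and including the next '\n' (or EOF). */
static void skip_line(FILE *fp)
{
    int c;
    while ((c = getc(fp)) != EOF && c != '\n') {
        /* nothing to do */
    }
}

/* Hypothetical wrapper: skips the title and comment lines, then reads
 * all "Symbol x y z" records. Returns a malloc'ed array (caller frees)
 * and stores the number of atoms in *count; returns NULL on failure. */
sXYZ *read_xyz(FILE *fp, size_t *count)
{
    size_t n = 0, capacity = 128;
    sXYZ *atoms = malloc(capacity * sizeof *atoms);
    if (atoms == NULL) {
        return NULL;
    }
    skip_line(fp);   /* title */
    skip_line(fp);   /* comment */
    for (;;) {
        sXYZ a;
        if (fscanf(fp, "%*s%lf%lf%lf", &a.x, &a.y, &a.z) != 3) {
            break;   /* end of file or malformed record */
        }
        if (n == capacity) {
            sXYZ *bigger = realloc(atoms, 2 * capacity * sizeof *atoms);
            if (bigger == NULL) {
                break;   /* keep what has been stored so far */
            }
            atoms = bigger;
            capacity *= 2;
        }
        atoms[n++] = a;
    }
    *count = n;
    return atoms;
}
```

A caller would open the file, call read_xyz(fp, &n), use atoms[0] through atoms[n - 1], and finally free(atoms) and fclose(fp).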
I have a for loop which should run 4 times but is running 6 times.
Could you please explain the behaviour?
This is strange because stringarr1 is not changed.
Edit: I want to remove all '!' from my first string and want to save the letters in a second string.
#include <stdio.h>
#include <math.h>
#include <string.h>
int main(){
char stringarr1[] = "a!bc";
char stringarr2[] = "";
printf("%d\n", strlen(stringarr1)); // lenght --> 4
for (size_t i = 0; i < strlen(stringarr1); i++)
{
printf("i: %d\n", i);
if (stringarr1[i] != '!') {
stringarr2[strlen(stringarr2)] = stringarr1[i];
printf("info: != '!'\n");
}
}
}
You are overrunning the buffer for stringarr2 (length 1), which is in this case corrupting the memory-adjacent stringarr1, causing the string length to change by overwriting its nul terminator.
Then because you are reevaluating the string length on each iteration, the loop will run for a non-deterministic number of iterations - in your case just 6, but it could be worse; the behaviour you have observed is just one of several possibilities - it is undefined.
Apart from correcting the buffer length for stringarr2, it is best practice to evaluate loop-invariants once (although in this case the string length is not invariant due to a bug). So the following:
const size_t length = strlen( stringarr1 ) ;
for( size_t i = 0; i < length; i++ )
{
...
will run for 4 iterations regardless of the buffer overrun bug because the length is not reevaluated following the corruption. Re-evaluating loop-invariants can lead to very slow code execution.
Your code can run any number of times. You write beyond the end of stringarr2 so you may be smashing the stack and overwriting local variables. What you meant to do is probably something like this:
#include <stdio.h>
#include <string.h>
int main(){
char stringarr1[] = "a!bc";
char stringarr2[10];
size_t len = strlen(stringarr1);
size_t j = 0; // next free position in stringarr2
printf("%zu\n", len); // length --> 4
for (size_t i = 0; i < len; i++)
{
printf("i: %zu\n", i);
if (stringarr1[i] != '!') {
stringarr2[j++] = stringarr1[i];
printf("info: != '!'\n");
}
}
stringarr2[j] = '\0'; // terminate the result
}
Like others said, it is not really clear what you are trying to accomplish here. But in C, a declaration like char s[] = "string" only allocates enough memory to store whatever is on the right hand side of the assignment. If that is an empty string like in your case, only a single byte is allocated, to store the end of string 'null' character. You need to either explicitly specify, like I did, the number of bytes to allocate as the array size, or use dynamic memory allocation.
The problem is that you're writing past the end of stringarr2. This triggers undefined behaviour.
To fix this, you need to allocate sufficient memory for stringarr2.
First, we must allocate the string to be long enough.
char stringarr1[] = "a!bc";
//save this in a variable beforehand because strlen loops over the string every time it is called
size_t len = strlen(stringarr1);
char stringarr2[1024] = { 0 };
{ 0 } initializes all characters in the string to 0, which means the last one will always be a null terminator after we add characters. This tells C string functions where the string ends.
Now we can put stuff in there. It seems like you're trying to append, so keep a separate iterator for the 2nd string. This is more efficient than calling strlen every loop.
for(size_t i = 0, j = 0; i < len; i++){
printf("i: %zu\n", i);
if (stringarr1[i] != '!') {
stringarr2[j++] = stringarr1[i];
printf("info: != '!'\n");
}
}
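The same two-index idea can be packaged as a small function that also guards against overrunning the destination. This is a sketch only; the name remove_char and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: copies src into dst, leaving out every occurrence
 * of `unwanted`. At most dst_size - 1 characters are written, followed by
 * a '\0'. Returns the length of the resulting string. */
size_t remove_char(const char *src, char *dst, size_t dst_size, char unwanted)
{
    size_t j = 0;
    if (dst_size == 0) {
        return 0;               /* no room even for the terminator */
    }
    for (size_t i = 0; src[i] != '\0' && j + 1 < dst_size; i++) {
        if (src[i] != unwanted) {
            dst[j++] = src[i];
        }
    }
    dst[j] = '\0';
    return j;
}
```

With char stringarr2[10];, the call remove_char(stringarr1, stringarr2, sizeof stringarr2, '!') leaves "abc" in stringarr2 and can never write past its end.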
I am writing a method in C in which I have a list of words from a file that I am redirecting from stdin. However, when I attempt to read in the words into the array, my code will only output the first character. I understand that this is because of a casting issue with char and char *.
While I am challenging myself to not use any of the functions from string.h, I have tried iterating through and am thinking of writing my own strcpy function, but I am confused because my input is coming from a file that I am redirecting from standard input. The variable numwords is inputted by the user in the main method (not shown).
I am trying to debug this issue via dumpwptrs to show me what the output is. I am not sure what in the code is causing me to get the wrong output - whether it is how I read in words to the chunk array, or if I am pointing to it incorrectly with wptrs?
//A huge chunk of memory that stores the null-terminated words contiguously
char chunk[MEMSIZE];
//Points to words that reside inside of chunk
char *wptrs[MAX_WORDS];
/** Total number of words in the dictionary */
int numwords;
.
.
.
void readwords()
{
//Read in words and store them in chunk array
for (int i = 0; i < numwords; i++) {
//When you use scanf with '%s', it will read until it hits
//a whitespace
scanf("%s", &chunk[i]);
//Each entry in wptrs array should point to the next word
//stored in chunk
wptrs[i] = &chunk[i]; //Assign address of entry
}
}
Do not re-use the part of char chunk[MEMSIZE]; that is already occupied by prior words.
Instead use the next unused memory.
char chunk[MEMSIZE];
char *pool = chunk; // location of unassigned memory pool
// scanf("%s", &chunk[i]);
// wptrs[i] = &chunk[i];
scanf("%s", pool);
wptrs[i] = pool;
pool += strlen(pool) + 1; // Beginning of next unassigned memory
Robust code would check the return value of scanf() and ensure that i and pool do not exceed their limits.
I'd go for a fgets() solution as long as words are entered a line at a time.
char chunk[MEMSIZE];
char *pool = chunk;
// return word count
int readwords2() {
int word_count;
// limit words to MAX_WORDS
for (word_count = 0; word_count < MAX_WORDS; word_count++) {
ptrdiff_t remaining = &chunk[MEMSIZE] - pool; // pointer difference; ptrdiff_t is in <stddef.h>
if (remaining < 2) {
break; // out of useful pool memory
}
if (fgets(pool, remaining, stdin) == NULL) {
break; // end-of-file/error
}
pool[strcspn(pool, "\n")] = '\0'; // lop off potential \n
wptrs[word_count] = pool;
pool += strlen(pool) + 1;
}
return word_count;
}
While I am challenging myself to not use any of the functions from string.h, ...
The best way to challenge yourself to not use any of the functions from string.h is to write them yourself and then use them.
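As a starting point, hand-written versions of two of those functions might look like this. The my_ prefix is just a naming assumption to avoid clashing with the real library functions.

```c
#include <stddef.h>

/* Hand-written counterpart of strlen(): counts characters before '\0'. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hand-written counterpart of strcpy(): copies src, including its '\0',
 * into dst and returns dst. dst must be large enough. */
char *my_strcpy(char *dst, const char *src)
{
    size_t i = 0;
    while ((dst[i] = src[i]) != '\0') {
        i++;
    }
    return dst;
}
```

In the pool code above, pool += my_strlen(pool) + 1; would then replace the strlen call.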
Your program reads each word into chunk starting at position i, so word i begins just one byte after the start of word i-1. Every read therefore overwrites all but the first character of the previous word (as long as i stays below the size of chunk), including its null terminator. You then make the pointers in wptrs point at these positions, which makes it impossible to tell where one string ends and the next begins. The first string will consist of the first letters of all your words except the last, followed by the complete last word. The second string will be the same text starting at its second character, the third starting at its third, and so on.
Build your own version of strdup(3) and use chunk to store temporarily the string... then make a dynamically allocated copy of the string with your version of strdup(3) and make the pointer to point to it.... etc.
Finally, when you are finished, just free all the allocated strings and voilà!!
Also, this is very important: read How to create a Minimal, Complete, and Verifiable example. Very often the code posted in a question no longer contains the error, because the faulty part was removed when the code was trimmed for posting (you normally don't know where the error is, or you would have corrected it and there would be no question here, right?).
So I need to create a word search program that will read a data file containing letters and then the words that need to be found at the end
for example:
f a q e g g e e e f
o e q e r t e w j o
t e e w q e r t y u
government
free
and the list of letters and words are longer but anyway I need to save the letters into an array and i'm having a difficult time because it never stores the correct data. here's what I have so far
#include <stdio.h>
int main()
{
int value;
char letters[500];
while(!feof(stdin))
{
value = fgets(stdin);
for(int i =0; i < value; i++)
{
scanf("%1s", &letters[i]);
}
for(int i=0; i<1; i++)
{
printf("%1c", letters[i]);
}
}
}
I also don't know how I am gonna store the words into a separate array after I get the chars into an array.
You said you want to read from a data file. If so, you should open the file.
FILE *fin=fopen("filename.txt", "r");
if(fin==NULL)
{
perror("filename.txt not opened.");
}
In your input file, the first few lines consist of single letters, each separated by a space.
If you want to store each of these letters into the letters character array, you could load each line with the following loop.
char c;
int i=0;
while(fscanf(fin, "%c", &c)==1 && c!='\n')
{
if(c!=' ')
{
letters[i++]=c;
}
}
This will only store the letters and is not a string as there is no \0 character.
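If a proper string is wanted, the same loop can be wrapped in a function that checks the buffer size and appends the \0. The following is a sketch under that assumption; read_letter_row is a hypothetical name, and getc is used in place of fscanf with %c.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fin, stores its non-space
 * characters in letters and terminates the result with '\0'.
 * Returns the number of letters stored, or -1 if nothing could be read. */
int read_letter_row(FILE *fin, char *letters, size_t size)
{
    size_t i = 0;
    int c;
    if (size == 0) {
        return -1;                        /* no room for the '\0' */
    }
    c = getc(fin);
    if (c == EOF) {
        return -1;                        /* end of file or read error */
    }
    while (c != EOF && c != '\n') {
        if (c != ' ' && i + 1 < size) {   /* keep room for the '\0' */
            letters[i++] = (char)c;
        }
        c = getc(fin);
    }
    letters[i] = '\0';
    return (int)i;
}
```

Calling it repeatedly reads one row per call, so each row of the grid can go into its own buffer.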
Reading the words at the bottom may be done with fgets().
Your usage of the fgets() function is wrong.
Its prototype is
char *fgets(char *str, int n, FILE *stream);
so the call should look like
fgets(letters, sizeof(letters), fin);
Use stdin instead of fin here when you want to accept input from the keyboard and store it into letters.
Note that fgets() will store the trailing newline (\n) into letters as well. You might want to remove it like
letters[strcspn(letters, "\n")]='\0';
This replaces the \n with a \0. Unlike letters[strlen(letters)-1]='\0';, it is also safe when the line has no trailing newline. strcspn() is declared in <string.h>.
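Just saying, letters[i] will be a character and not a string.
scanf("%1s", &letters[i]);
should be
scanf(" %c", &letters[i]);
The leading space in the format makes scanf() skip whitespace first; a plain "%c" would also store the spaces and newlines between the letters.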
One way to store the lines with characters or words is to keep them in an array of pointers, one pointer per line:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLET 500
#define MAXLINES 1000
int main()
{
char *lptr;
// Array with letters from a given line
char letters[MAXLET];
// Array with pointers to lines with letters
char *lineptr[MAXLINES];
// Length of current array
unsigned len = 0;
// Total number of lines
int nlines = 0;
// Read lines from stdin and store them in
// an array of pointers
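while(nlines < MAXLINES && fgets(letters,MAXLET,stdin)!=NULL)
{
// Strip the trailing newline, if there is one
letters[strcspn(letters, "\n")] = '\0';
len = strlen(letters);
// +1 for the terminating '\0'
lptr = malloc(len + 1);
if (lptr == NULL)
break;
strcpy(lptr, letters);
lineptr[nlines++]=lptr;
}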
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
// Free allocated memory
for (int i = 0; i < nlines; i++)
free(lineptr[i]);
}
In the program above, a pointer to every line from stdin is stored in lineptr. Once stored, you can access and manipulate each of the lines. In this simple case I only print them one by one, but examples of simple manipulation are shown later on. At the end, the program frees the previously allocated memory. It is good practice to free allocated memory once it is no longer in use.
The process of storing a line consists of getting each line from stdin, determining its length with strlen and stripping its newline character by replacing it with \0 (optional), allocating memory for it with malloc, and finally storing the pointer to that memory location in lineptr. During this process the program also counts the number of input lines.
You can implement this sequence for both of your inputs, chars and words. It will result in clean, ready-to-use input. You can also consider moving the line collection into a function, which may require making lineptr-type arrays global. Let me know if you have any questions.
One thing to remember is that MAXLET and especially MAXLINES may have to be increased for a given dataset (MAXLINES 1000 literally assumes you won't have more than 1000 lines).
Also, while on Unix and Mac this program allows you to read from a file as it is by using $ prog_name < in_file it can be readily modified to read directly from files.
Here are some usage examples - lineptr stores pointers to each line (array) hence the program first retrieves the pointer to a line and then it proceeds as with any array:
// Print 3rd character of each line
// then substitute 2nd with 'a'
char *p;
for (int i = 0; i < nlines; i++){
p = lineptr[i];
printf("%c\n", p[2]);
p[1] = 'a';
}
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
// Swap first and second element
// of each line
char tmp;
for (int i = 0; i < nlines; i++){
p = lineptr[i];
tmp = p[0];
p[0] = p[1];
p[1] = tmp;
}
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
Note that these examples are just a demonstration and assume that each line has at least 3 characters. Also, in your original input the characters are separated by a space - that is not necessary, in fact it's easier without it.
The code in your post does not appear to match your stated goals, and indicates you have not yet grasped the proper application of the functions you are using.
You have expressed an idea describing what you want to do, but the steps you have taken (at least those shown) will not get you there. Not even close.
It is always good to have a map in hand to plan your steps. An algorithm is a kind of software map. Before you can plan your steps though, you need to know where you are going.
Your stated goals:
1) Open and read a file into lines.
2) Store the lines, somehow. (using fgets(,,)?)
3) Use some lines as content to search through.
4) Use other lines as objects to search for
Some questions to answer:
a) How is the search content distinguished from the strings to search for?
b) How is the search content to be stored?
c) How are the search words to be stored?
d) How will the comparison between content and search word be done?
e) How many lines in the file? (example)
f) Length of longest line? (discussion and example) (e & f used to create storage)
g) How is fgets() used. (maybe a google search: How to use fgets)
h) Are there things to be aware of when using feof()? (discussion and example feof)
i) Why is my input not right after the second call to scanf? (answer)
Finish identifying and crystallizing the list of items in your goals, then answer these (and maybe other) questions. At that point you will be ready to start identifying the steps to get there.
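For question a), one possible rule, based only on the sample input in the question, is that grid rows contain spaces between their letters while the words to search for do not. A minimal sketch of that assumption (is_grid_line is a hypothetical name):

```c
#include <string.h>

/* Assumed rule, taken only from the sample input in the question:
 * a grid row contains spaces between its letters, a search word does not.
 * Returns 1 for a grid row, 0 otherwise. */
int is_grid_line(const char *line)
{
    return strchr(line, ' ') != NULL;
}
```

If the real data files mark the boundary differently (a blank line, a count, a fixed number of rows), the rule has to change accordingly.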
value = fgets(stdin); is not a valid call! It does not respect the signature of the fgets function at all. My man page says
char *
fgets(char * restrict str, int size, FILE * restrict stream);
So here, as you pass only the stream, the call does not match the prototype, and with <stdio.h> included the compiler should refuse it. Even if the call went through and fgets returned NULL, that would be converted to the int value 0, and the next loop would just be a no-op.
The correct way to read a line with fgets is:
if (NULL == fgets(letters, sizeof(letters), stdin)) {
// indication of end of file or error
...
}
// Ok letters contains the line...
I am trying to deconstruct a document into its respective paragraphs, and input each paragraphs, as a string, into an array. However, each time a new value is added, it overwrites all previous values in the array. The last "paragraph" read (as denoted by newline) is the value of each non-null value of the array.
Here is the code:
char buffer[MAX_SIZE];
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
int pp = 0;
int i;
FILE *doc;
doc = fopen(argv[1], "r+");
assert(doc);
while((i = fgets(buffer, sizeof(buffer), doc) != NULL)) {
if(strncmp(buffer, "\n", sizeof(buffer))) {
paragraphs[pp++] = (char*)buffer;
}
}
printf("pp: %d\n", pp);
for(i = 0; i < MAX_SIZE && paragraphs[i] != NULL; i++) {
printf("paragraphs[%d]: %s", i, paragraphs[i]);
}
The output I receive is:
pp: 4
paragraphs[0]: paragraph four
paragraphs[1]: paragraph four
paragraphs[2]: paragraph four
paragraphs[3]: paragraph four
when the program is run as follows: ./prog.out doc.txt, where doc.txt is:
paragraph one
paragraph two
paragraph three
paragraph four
The behavior of the program is otherwise desired. The paragraph count works properly, ignoring the line that contains ONLY the newline character (line 4).
I assume the problem occurs in the while loop, however am unsure how to remedy the problem.
Your solution is pretty sound. Your paragraphs array is supposed to hold each paragraph, and since each element is just a small pointer (typically 4 or 8 bytes) you can afford to define a reasonable max number of them. However, since this max number is a constant, there is little point in allocating the array dynamically.
The only meaningful use of dynamic allocation would be to read the whole text once to count the actual number of paragraphs, allocate the array accordingly and re-read the whole file a second time, but I doubt this is worth the effort.
The downside of using fixed-size paragraph array is that you must stop filling it once you reach the maximal number of elements.
You can then re-allocate a bigger array if you absolutely want to be able to process the whole Bible, but for an educational exercise I think it's reasonable to just stop recording paragraphs (thus producing a code that can store and count paragraphs up to a maximal number).
The real trouble with your code is, you don't store the paragraph contents anywhere. When you read the actual lines, it's always inside the same buffer, so each paragraph will point to the same string, which will contain the last paragraph read.
The solution is to make a unique copy of the buffer and have the current paragraph point to that.
C being already messy enough as it is, I suggest using the strdup() function, which duplicates a string (basically computing string length, allocating sufficient memory, copying the string into it and returning the new block of memory holding the new copy). You just need to remember to free this new copy once you're done using it (in your case at the end of your program).
This is not the most time-efficient solution, since each string will require a strlen and a malloc performed internally by strdup, while you could have pre-allocated a big buffer for all paragraphs. But it is certainly simpler and probably more memory-efficient (only the minimal amount of memory will be allocated for each string, though each malloc consumes a few extra bytes for internal allocator housekeeping).
The bloody awkward fgets also stores the trailing \n at the end of the line, so you'll probably want to get rid of that.
Your last display loop would be simpler, more robust and more efficient if you simply used pp as a limit, instead of checking uninitialized paragraphs.
Lastly, you'd better define two different constants for max line size and max number of paragraphs. Using the same value for both makes little sense, unless you're processing perfectly square texts :).
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE_SIZE 82 // max nr of characters in a line (including trailing \n and \0)
#define MAX_PARAGRAPHS 100 // max number of paragraphs in a file
int main (int argc, char *argv[])
{
char buffer[MAX_LINE_SIZE];
char * paragraphs[MAX_PARAGRAPHS];
int pp = 0;
int i;
FILE *doc;
assert(argc > 1); // the file name is expected as first argument
doc = fopen(argv[1], "r"); // read-only access is enough
assert(doc != NULL);
while((fgets(buffer, sizeof(buffer), doc) != NULL)) {
if (pp != MAX_PARAGRAPHS // make sure we don't overflow our paragraphs array
&& strcmp(buffer, "\n")) {
// fgets awkwardly collects the ending \n, so get rid of it
buffer[strcspn(buffer, "\n")] = '\0';
// current paragraph references a unique copy of the actual text
paragraphs[pp++] = strdup (buffer);
}
}
fclose(doc);
printf("pp: %d\n", pp);
for(i = 0; i != pp; i++) {
// the \n was stripped above, so print one explicitly
printf("paragraphs[%d]: %s\n", i, paragraphs[i]);
free(paragraphs[i]); // release memory allocated by strdup
}
return 0;
}
What is the proper way to allocate the necessary memory? Is the malloc on line 2 not enough?
No. That line allocates only the array of pointers; you also need to allocate memory for each string those pointers refer to. On its own, the following will not work as coded.
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
If you have: (for a simple explanation)
char **array = NULL; //array of C strings, before memory is allocated
Then you can create memory for it like this:
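#include <stdlib.h>
#define MAX_SIZE 100 // MAX_SIZE comes from the question; this value is only a placeholder
char ** allocMemory(char ** a, int numStrings, int maxStrLen);
void freeMemory(char ** a, int numStrings);
int main(void)
{
int numStrings = 10;// for example, change as necessary
int maxLen = MAX_SIZE; //for example, change as necessary
char **array = NULL;
array = allocMemory(array, numStrings, maxLen);
if(array == NULL) return 1;
//use the array, then free it
freeMemory(array, numStrings);
return 0;
}
char ** allocMemory(char ** a, int numStrings, int maxStrLen)
{
int i;
// calloc(count, size): numStrings pointers plus one extra that stays zeroed
a = calloc(numStrings+1, sizeof(char*));
if(a == NULL) return NULL;
for(i=0;i<numStrings; i++)
{
// room for maxStrLen characters plus the terminating '\0'
a[i] = calloc(maxStrLen + 1, sizeof(char));
}
return a;
}
void freeMemory(char ** a, int numStrings)
{
int i;
for(i=0;i<numStrings; i++)
if(a[i]) free(a[i]);
free(a);
}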
Note: you can determine the number of lines in a file in several ways. One way, for example, is to open it with FILE *fp = fopen(filepath, "r");, then call ret = fgets(lineBuf, lineLen, fp) in a loop until ret == NULL (fgets() returns NULL, not EOF, at end of file), counting the iterations. Then call fclose() (which you did not do either). This necessary step is not included in the code example above, but you can add it if that is the approach you want to use.
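A sketch of such a counting pass is shown below. The name count_lines and the 256-byte buffer are assumptions, and the function rewinds the stream instead of closing it so the file can be read a second time.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in fp using fgets() and then
 * rewinds the stream so it can be read again. A line longer than the
 * buffer arrives in several pieces, so only pieces that end a line
 * are counted. */
long count_lines(FILE *fp)
{
    char buf[256];
    long lines = 0;
    int in_line = 0;   /* 1 while in the middle of an unfinished line */
    while (fgets(buf, sizeof buf, fp) != NULL) {   /* NULL, not EOF */
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;
            in_line = 0;
        } else {
            in_line = 1;
        }
    }
    if (in_line) {
        lines++;       /* last line had no trailing newline */
    }
    rewind(fp);
    return lines;
}
```

The result can be passed as numStrings to allocMemory() before the file is read for real.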
Once you have memory allocated, change the following in your code:
paragraphs[pp++] = (char*)buffer;
To:
strcpy(paragraphs[pp++], buffer);//no need to cast buffer, it is already char *
Also, do not forget to call fclose() when you are finished with the open file.