I am new to C, coming from Python. I want to read a .xyz file into a dynamically sized array, to use for various calculations later on in the program. The file is formatted as follows:
Title
Comment
Symbol 0.000 0.000 0.000
Symbol 0.000 0.000 0.000
....
The first two lines are not needed and should just be skipped. The "Symbol" column holds chemical symbols (e.g. H, Au, C, Mn), as the .xyz file format is used for storing 3D coordinates of atoms. They need to be ignored as well. I'm interested in the space-separated decimal numbers. I therefore want to:
Skip the first two lines, or just ignore them in some way.
Skip the first part of each line until the first space.
Store the three columns of numbers (coordinates) in an array.
So far I have been able to open a file for reading, and then I've attempted to check how long the file is, so that the size of the array can change depending on how many coordinate sets need to be stored.
// Variable declaration
FILE *fp;
long file_size;
// Open file and error checking
fp = fopen ("file_name" , "r");
if(!fp) perror("file_name"), exit(1);
// Check file size
fseek(fp, 0, SEEK_END);
file_size = ftell(fp);
rewind(fp);
// Close file
fclose(fp);
I've been able to skip the first two lines using fscanf(fp, "%*[^\n]"), to skip to the end of the line. But, I haven't been able to figure out how to loop through the rest of the file, while storing only the decimal numbers in an array.
If I understand correctly, I need to allocate memory for the array, using something like malloc() in combination with my file_size and then copy the data into the array using fread().
Here is an example of the contents of an actual .xyz file:
10 atom system
Energy: -914941.6614699
Ag 0.96834 1.51757 0.02281
Ag 0.96758 -1.51824 -0.02206
Ag -1.80329 2.27401 0.03179
Ag -3.58033 0.00046 0.00126
Ag -1.80447 -2.27338 -0.03537
Ag -0.96581 0.02246 -1.51755
Ag -0.96929 -0.02231 1.51463
Ag 1.80613 0.03321 -2.27213
Ag 3.58027 0.00028 0.00206
Ag 1.80086 -0.03407 2.27455
Here is a general approach in C for reading a file into an array of cstrings (pointers to cstrings, so the rough equivalent of a Python list of strings).
int count = 0; // line counter;
int char_count = 0; // char counter;
int max_len = 0; // for storing the longest line length
int c; // for measuring each line length
char **str_ptr_arr; // array of pointers to c-string
//extract characters from the file, looking for endlines; note that
//the EOF check has to come AFTER the getc(fp) to work properly
for (c = getc(fp); c != EOF; c = getc(fp)) { //c is an int so that it can hold EOF
char_count += 1;
if (c == '\n') {
count += 1;
if (max_len < char_count) {
max_len = char_count; //gets longest line
}
char_count = 0;
}
}
//should probably do an feof check here
rewind(fp);
So now you have the number of lines and the length of the longest line, (You can try using the above loop to exclude lines if you want but it might just be easier to read the whole thing into an array of c-strings, then process that into an array of doubles). Now allocate the memory for the array of pointers to c-strings and for the c-strings themselves:
//allocate enough memory to hold all the strings in the file, by first
//allocating the arr of ptrs then a slot for each c-string pointed to:
str_ptr_arr = malloc(count * sizeof(char*)); //size of pointer
for (int i = 0; i < count; i++) {
str_ptr_arr[i] = malloc ((max_len + 1) * sizeof(char)); // +1 for '\0' terminate
}
rewind(fp); //rewind again;
Now we have a problem: how to populate these cstrings (Python is so much easier!). This works, though I'm not sure it's the expert approach: we read each line into a temporary buffer, then use strcpy to copy the contents of the buffer into our allocated array slots:
char buff[max_len + 1]; //local temporary buffer that can store any line in file
for (int i = 0; i < count; i++) {
    //read one whole line; fscanf with "%s" would stop at the first space
    if (fgets(buff, max_len + 1, fp) == NULL) {
        break;
    }
    strcpy(str_ptr_arr[i], buff);
}
Note: this is a decent point at which to start excluding lines or removing various substrings from lines, as you can make strcpy conditional on the contents of the buffer by using other cstring functions. I'm fairly new at this myself (learning to write C functions for use in Python programs), but this seems to be the correct approach.
It might also be possible to go directly to a dynamically allocated array of doubles for storing your numerical data without bothering with the cstring array; that could be done in the last loop above. You could split the strings at whitespace, exclude the alphabetical parts, and use the cstring function atof (which returns a double) to convert each number.
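As a minimal sketch of that last idea, one line of the file can be parsed with sscanf instead of splitting it by hand. The function name parse_xyz_line is made up for illustration, and the sketch assumes every data line has the shape "Symbol x y z":

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse one "Symbol x y z" line into out[0..2].
   Returns 1 on success, 0 if the line does not have that shape. */
int parse_xyz_line(const char *line, double out[3])
{
    assert(line != NULL && out != NULL);
    /* %*s reads and discards the chemical symbol; %lf reads a double. */
    return sscanf(line, "%*s %lf %lf %lf", &out[0], &out[1], &out[2]) == 3;
}
```

Because the two header lines do not match this shape, the return value can also be used to skip them.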
Edit: I should mention all these memory allocations must be manually freed when you are done with them, and this is the approach:
for(int i = 0; i < count; i++) { // free each allocated cstring space
free(str_ptr_arr[i]);
}
free(str_ptr_arr); // free the cstring pointer space
str_ptr_arr = NULL;
Given, for example:
#define STORAGE_INCREMENT 128
typedef struct
{
double x, y, z ;
} sXYZ ;
Then:
int atom_count = 0 ;
int atom_capacity = STORAGE_INCREMENT ;
sXYZ* atoms = malloc( atom_capacity * sizeof(*atoms) ) ;
// While valid triplet, discard symbol, get x,y,z
while( fscanf( fp, "%*s%lf%lf%lf", &atoms[atom_count].x,
&atoms[atom_count].y,
&atoms[atom_count].z ) == 3 )
{
// Increment count
atom_count++ ;
// If capacity exhausted, expand allocation
if( atom_count == atom_capacity )
{
atom_capacity += STORAGE_INCREMENT ;
sXYZ* bigger = realloc( atoms, atom_capacity * sizeof(*atoms) ) ;
if( bigger == NULL )
{
break ;
}
atoms = bigger ;
}
}
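This allocates enough space for 128 atoms initially, and whenever the space is exhausted, it is expanded by a further 128 atoms, indefinitely. A smaller value can be used if the files typically have fewer atoms, to be a little more memory efficient. This approach saves you having to first count the number of triplets in the file. Note that the loop expects the file position to be past the two header lines already; started at the top of the file, the first fscanf call would most likely fail to match on the title line and the loop would read nothing.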
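Putting the pieces together, a complete reader might look like the sketch below. The names Atom, skip_line and read_xyz are assumptions made for this example, and it assumes the file always starts with exactly two header lines:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define STORAGE_INCREMENT 128

typedef struct { double x, y, z; } Atom;   /* hypothetical name */

/* Discard the rest of the current line, including its '\n'. */
static void skip_line(FILE *fp)
{
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
        ;
}

/* Reads every "Symbol x y z" record that follows the two header lines.
   Returns a malloc'd array (the caller frees it) and stores its length
   in *n; returns NULL if an allocation fails. */
Atom *read_xyz(FILE *fp, size_t *n)
{
    assert(fp != NULL && n != NULL);
    size_t cap = STORAGE_INCREMENT;
    Atom *atoms = malloc(cap * sizeof *atoms);
    *n = 0;
    if (atoms == NULL)
        return NULL;

    skip_line(fp);   /* title */
    skip_line(fp);   /* comment */

    Atom a;
    while (fscanf(fp, "%*s%lf%lf%lf", &a.x, &a.y, &a.z) == 3) {
        if (*n == cap) {   /* capacity exhausted: grow before storing */
            Atom *bigger = realloc(atoms, (cap + STORAGE_INCREMENT) * sizeof *atoms);
            if (bigger == NULL) {
                free(atoms);
                *n = 0;
                return NULL;
            }
            atoms = bigger;
            cap += STORAGE_INCREMENT;
        }
        atoms[(*n)++] = a;
    }
    return atoms;
}
```

A caller would open the file with fopen, pass the stream to read_xyz, and free the returned array when done.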
So I need to create a word search program that will read a data file containing letters and then the words that need to be found at the end
for example:
f a q e g g e e e f
o e q e r t e w j o
t e e w q e r t y u
government
free
The list of letters and words is longer, but anyway, I need to save the letters into an array and I'm having a difficult time because it never stores the correct data. Here's what I have so far:
#include <stdio.h>
int main()
{
int value;
char letters[500];
while(!feof(stdin))
{
value = fgets(stdin);
for(int i =0; i < value; i++)
{
scanf("%1s", &letters[i]);
}
for(int i=0; i<1; i++)
{
printf("%1c", letters[i]);
}
}
}
I also don't know how I am gonna store the words into a separate array after I get the chars into an array.
You said you want to read from a data file. If so, you should open the file.
FILE *fin=fopen("filename.txt", "r");
if(fin==NULL)
{
perror("filename.txt not opened.");
}
In your input file, the first few lines consist of single letters, each separated by a space.
If you want to store each of these letters in the letters character array, you could load each line with the following loop.
char c;
int i=0;
while(fscanf(fin, "%c", &c)==1 && c!='\n')
{
if(c!=' ')
{
letters[i++]=c;
}
}
This will only store the letters and is not a string as there is no \0 character.
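The same filtering can be sketched for a row that has already been read into memory (for example with fgets). The helper name copy_letters and its cap parameter are made up for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the non-space characters of one grid row
   into out (at most cap of them) and return how many were stored.
   The result is a plain char array, not a NUL-terminated string. */
size_t copy_letters(const char *row, char *out, size_t cap)
{
    assert(row != NULL && out != NULL);
    size_t n = 0;
    for (; *row != '\0' && *row != '\n' && n < cap; row++) {
        if (*row != ' ')
            out[n++] = *row;
    }
    return n;
}
```

The returned count tells you how many grid columns the row has, which is useful later when searching.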
Reading the words which are at the bottom may be done with fgets().
Your usage of the fgets() function is wrong.
Its prototype is
char *fgets(char *str, int n, FILE *stream);
Note that fgets() stores the trailing newline (\n) in the string as well. You might want to remove it with
str[strcspn(str, "\n")] = '\0';
which, unlike str[strlen(str)-1]='\0', is also safe when the line has no trailing newline (for example the last line of a file).
Use fgets() to read the words at the bottom into a character array, replacing the \n with a \0, like this:
fgets(letters, sizeof(letters), fin);
letters[strcspn(letters, "\n")] = '\0';
Use stdin instead of fin here if you want to accept input from the keyboard.
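Also, letters[i] is a single character, not a string, so
scanf("%1s", &letters[i]);
should be
scanf(" %c", &letters[i]);
The leading space in the format skips whitespace, which %c does not do on its own; %1s also writes a terminating \0 into the next element.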
One way to store the lines of characters or words is in an array of pointers, one pointer per line:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXLET 500
#define MAXLINES 1000
int main()
{
char *lptr;
// Array with letters from a given line
char letters[MAXLET];
// Array with pointers to lines with letters
char *lineptr[MAXLINES];
// Length of current array
unsigned len = 0;
// Total number of lines
int nlines = 0;
// Read lines from stdin and store them in
// an array of pointers
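while(nlines < MAXLINES && fgets(letters,MAXLET,stdin)!=NULL)
{
len = strlen(letters);
// Strip the newline only if one was actually read
if (len > 0 && letters[len-1] == '\n')
letters[--len] = '\0';
lptr = malloc(len + 1); // +1 for the terminating '\0'
if (lptr == NULL)
break;
strcpy(lptr, letters);
lineptr[nlines++]=lptr;
}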
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
// Free allocated memory
for (int i = 0; i < nlines; i++)
free(lineptr[i]);
}
In the code above, a pointer to every line from stdin is stored in lineptr. Once stored, you can access and manipulate each of the lines. In this simple case I only print them one by one, but examples of simple manipulation are shown later on. At the end, the program frees the previously allocated memory. It is good practice to free allocated memory once it is no longer in use.
The process of storing a line consists of getting each line from stdin, getting its length with strlen, stripping its newline character by replacing it with \0 (optional), allocating memory for it with malloc, copying the line there, and finally storing the pointer to that memory location in lineptr. During this process the program also counts the number of input lines.
You can implement this sequence for both of your inputs, chars and words. It will result in clean, ready-to-use input. You can also consider moving the line collection into a function; that may require making lineptr-type arrays global or passing them as arguments. Let me know if you have any questions.
One thing to remember is that MAXLET and especially MAXLINES may have to be increased for a given dataset (MAXLINES 1000 literally assumes you won't have more than 1000 lines).
Also, while on Unix and Mac this program can read from a file as it is by using $ prog_name < in_file, it can be readily modified to read directly from files.
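A minimal sketch of moving the line collection into a function that reads from any open stream, so that lineptr does not have to be global. The name read_lines and its parameters are assumptions made for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLET 500

/* Hypothetical helper: read up to maxlines lines from fp, store a heap
   copy of each (without its newline) in lineptr, and return the number
   stored. The caller frees every lineptr[i]. */
int read_lines(FILE *fp, char *lineptr[], int maxlines)
{
    assert(fp != NULL && lineptr != NULL);
    char buf[MAXLET];
    int nlines = 0;

    while (nlines < maxlines && fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';        /* drop the newline, if any */
        char *copy = malloc(strlen(buf) + 1);  /* +1 for the terminator */
        if (copy == NULL)
            break;
        strcpy(copy, buf);
        lineptr[nlines++] = copy;
    }
    return nlines;
}
```

To read from a file rather than stdin, open it with fopen and pass the resulting FILE * instead; passing stdin keeps the original behaviour.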
Here are some usage examples - lineptr stores pointers to each line (array) hence the program first retrieves the pointer to a line and then it proceeds as with any array:
// Print 3rd character of each line
// then substitute 2nd with 'a'
char *p;
for (int i = 0; i < nlines; i++){
p = lineptr[i];
printf("%c\n", p[2]);
p[1] = 'a';
}
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
// Swap first and second element
// of each line
char tmp;
for (int i = 0; i < nlines; i++){
p = lineptr[i];
tmp = p[0];
p[0] = p[1];
p[1] = tmp;
}
// Print lines
for (int i = 0; i < nlines; i++)
printf("%s\n", lineptr[i]);
Note that these examples are just a demonstration and assume that each line has at least 3 characters. Also, in your original input the characters are separated by a space - that is not necessary, in fact it's easier without it.
The code in your post does not appear to match your stated goals, and indicates you have not yet grasped the proper application of the functions you are using.
You have expressed an idea describing what you want to do, but the steps you have taken (at least those shown) will not get you there. Not even close.
It is always good to have a map in hand to plan your steps. An algorithm is a kind of software map. Before you can plan your steps though, you need to know where you are going.
Your stated goals:
1) Open and read a file into lines.
2) Store the lines, somehow. (using fgets(,,)?)
3) Use some lines as content to search through.
4) Use other lines as objects to search for
Some questions to answer:
a) How is the search content distinguished from the strings to search for?
b) How is the search content to be stored?
c) How are the search words to be stored?
d) How will the comparison between content and search word be done?
e) How many lines are in the file?
f) What is the length of the longest line? (e and f are used to create storage)
g) How is fgets() used?
h) Are there things to be aware of when using feof()?
i) Why is my input not right after the second call to scanf?
Finish identifying and crystallizing the list of items in your goals, then answer these (and maybe other) questions. At that point you will be ready to start identifying the steps to get there.
value = fgets(stdin); is not a valid call: it does not match the signature of fgets at all. My man page says
char *
fgets(char * restrict str, int size, FILE * restrict stream);
So fgets needs a buffer, a size, and a stream, and it returns a pointer to the buffer (or NULL), not a character count. With <stdio.h> included, a call with a single argument should be rejected by the compiler; and even if it were accepted, a NULL result converted to int gives 0, so the loop that follows would do nothing.
The correct way to read a line with fgets is:
if (NULL == fgets(letters, sizeof(letters), stdin)) {
// indication of end of file or error
...
}
// Ok letters contains the line...
I am trying to take input with fgets(). I know how many lines I will get but it changes and I store the number of lines in the variable var. I also have another variable named part; it is the length of the line I get, but since there are white spaces between the values I multiplied it by 2 (I couldn't find another solution; I could use some advice).
Anyway, I tried to get the input as in the code below, but when I entered the first line it automatically breaks out the for loop and prints random things. I think it is to do with the fgets() in the loop; I don't know if there is a use of fgets() like this.
char inp[var][(2*part)];
int k,l;
for(k=0;k<=var;k++);
fgets(inp[k],(2*part),stdin);
printf("%c\n",inp[0]);
printf("%c\n",inp[1]);
printf("%c\n",inp[2]);
printf("%c\n",inp[3]);
…since there are white spaces between the values I multiplied it with 2…
If you aren't required to store everything on the stack, you can instead store the strings in dynamically allocated memory. For example:
char* inp[var];
char buf[400]; // just needs to be long
for (k = 0; k < var; k++) {
if (fgets(buf, sizeof buf, stdin) == NULL)
break; // end of input or read error
inp[k] = malloc(strlen(buf) + 1); // +1 for the terminating '\0'
strcpy(inp[k], buf);
}
Although strdup is POSIX rather than ISO C (it only joined the C standard in C23), it is widely available and makes this easier as well.
As far as the actual issue, as BLUEPIXY said in the comments above, you have a few typos.
After the for loop, the semicolon makes it act unexpectedly.
for(k=0;k<=var;k++);
fgets(inp[k],(2*part),stdin);
is actually the same as
for(k=0;k<=var;k++) {
; // do nothing
}
fgets(...);
Remove that semicolon after the for loop statement. As it is, you're not actually reading correctly, which is why you see garbage.
To print an entire string, the printf family needs the %s conversion specifier; %c prints a single character.
With your bounds on k, there will actually be var + 1 iterations of the loop. If var were 3, then k = 0,1,2,3 -> terminate when k checked at 4.
Typically, the safest and easiest way to use fgets is to allocate a single, large-enough line buffer. Use that to read the line, then copy it into correctly sized buffers.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
// Allocate just the space for the list, not the strings themselves.
int num_input = 5;
char *input[num_input];
// Allocate our reusable line buffer.
char line[1024];
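int count = 0;
for( int i = 0; i < num_input; i++ ) {
// Read into the line buffer; stop at end of input or on error.
if( fgets(line, sizeof(line), stdin) == NULL ) {
break;
}
// Copy from the line buffer into correctly sized memory.
input[count] = strdup(line);
if( input[count] == NULL ) {
break;
}
count++;
}
for( int i = 0; i < count; i++ ) {
// Each line still ends with its '\n', so none is added here.
printf("%s",input[i]);
free(input[i]);
}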
}
Note that strdup() comes from POSIX; it only became part of ISO C with C23. It's common and standard enough. It's too handy not to use. Write your own if necessary.
That takes care of not knowing the line length.
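If strdup() is not available, a replacement could be sketched as follows. The name my_strdup is hypothetical, chosen to avoid clashing with a library version:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc'd copy of s,
   or NULL if allocation fails. The caller frees the result. */
char *my_strdup(const char *s)
{
    assert(s != NULL);
    size_t size = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```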
If you don't know the number of lines you're storing, you'll have to grow the array. Typically this is done with realloc to reallocate the existing memory. Start with a small list size, then grow it as needed. Doubling is a good rough approximation that's a pretty efficient balance between speed (reallocating can be slow) and memory efficiency.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
int main(void) {
// How big the input list is.
size_t input_max = 64;
// How many elements are in it.
size_t input_size = 0;
// Allocate initial memory for the input list.
// Again, not for the strings, just for the list.
char **input = malloc( sizeof(char*) * input_max );
char line[1024];
while( fgets(line, 1024,stdin) != NULL ) {
// Check if we need to make the input list bigger.
if( input_size >= input_max ) {
// Double the max length.
input_max *= 2;
// Reallocate.
// Note: this is only safe because we're
// going to exit on error, otherwise we'd leak
// input's memory.
input = realloc( input, sizeof(char*) * input_max );
// Check for error.
if( input == NULL ) {
fprintf(stderr, "Could not reallocate input list to %zu: %s\n", input_max, strerror(errno) );
exit(1);
}
}
input[input_size] = strdup(line);
input_size++;
}
for( size_t i = 0; i < input_size; i++ ) {
// Each line still ends with its '\n', so none is added here.
printf("%s",input[i]);
free(input[i]);
}
free(input);
}
As you can see, this gets a bit complicated. Now you need to keep track of the array, its maximum size, and its current size. Anyone using the array must remember to check its size and grow it, and remember to error check it. Your next impulse will be to create a struct to collect all that together, and functions to manage the list.
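One possible shape for such a struct and its functions is sketched below. The names LineList, linelist_push and linelist_free are invented for this example, and the starting capacity of 4 is arbitrary:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of strings. */
typedef struct {
    char **items;   /* heap array of heap-allocated strings */
    size_t size;    /* elements in use */
    size_t max;     /* elements allocated */
} LineList;

/* Append a copy of line. Returns 1 on success, 0 on allocation
   failure (in which case no element is added). */
int linelist_push(LineList *list, const char *line)
{
    assert(list != NULL && line != NULL);
    if (list->size == list->max) {
        size_t new_max = list->max ? list->max * 2 : 4;
        /* Keep the old pointer until realloc is known to have worked. */
        char **grown = realloc(list->items, new_max * sizeof *grown);
        if (grown == NULL)
            return 0;
        list->items = grown;
        list->max = new_max;
    }
    char *copy = malloc(strlen(line) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, line);
    list->items[list->size++] = copy;
    return 1;
}

/* Release every string and the array itself. */
void linelist_free(LineList *list)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->size; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->size = list->max = 0;
}
```

Starting from LineList list = {0};, the read loop reduces to calling linelist_push(&list, line) for each line returned by fgets.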
This is a good exercise in dynamic memory management, and I encourage you to do it. But for production code, use a pre-existing library. GLib is a good choice. It contains all sorts of handy data structures and functions that are missing from C, including pointer arrays that automatically grow. Use them, or something like it, in production code.
I'm trying to get my text to be read back to front and printed in reverse order in that file, but my for loop doesn't seem to be working. Also, my while loop is counting 999 characters even though it should be 800 and something (I can't remember exactly). I think it might be because there is an empty line between the two paragraphs, but then again there are no characters there.
Here is my code for the two loops -:
/*Reversing the file*/
char please;
char work[800];
int r, count, characters3;
characters3 = 0;
count = 0;
r = 0;
fgets(work, 800, outputfile);
while (work[count] != NULL)
{
characters3++;
count++;
}
printf("The number of characters to be copied is-: %d", characters3);
for (characters3; characters3 >= 0; characters3--)
{
please = work[characters3];
work[r] = please;
r++;
}
fprintf(outputfile, "%s", work);
/*Closing all the file streams*/
fclose(firstfile);
fclose(secondfile);
fclose(outputfile);
/*Message to direct the user to where the files are*/
printf("\n Merged the first and second files into the output file
and reversed it! \n Check the outputfile text inside the Debug folder!");
There are a couple of huge conceptual flaws in your code.
The very first one is that you state that it "doesn't seem to [be] working" without saying why you think so. Just running your code reveals what the problem is: you do not get any output at all.
Here is why. You reverse your string, and so the terminating zero comes at the start of the new string. You then print that string – and it ends immediately at the first character.
Fix this by decreasing the start of the loop in characters3.
Next, why not print a few intermediate results? That way you can see what's happening.
string: [This is a test.
]
The number of characters to be copied is-: 15
result: [
.tset aa test.
]
Hey look, there seems to be a problem with the newline (it ends up at the start of the line), which is exactly what should happen, since it is part of the string, but is most likely not what you intend.
Apart from that, you can clearly see that the reversing itself is not correct!
The problem now is that you are reading and writing from the same string:
please = work[characters3];
work[r] = please;
You write the character at the end into position #0, decrease the end and increase the start, and repeat until done. So, the second half of reading/writing starts copying the end characters back from the start into the end half again!
Two possible fixes: 1. read from one string and write to a new one, or 2. adjust the loop so it stops copying after 'half' is done (since you are doing two swaps per iteration, you only need to loop half the number of characters).
You also need to think more about what swapping means. As it is, your code overwrites a character in the string. To correctly swap two characters, you need to save one first in a temporary variable.
void reverse (FILE *f)
{
char please, why;
char work[800];
int r, count, characters3;
characters3 = 0;
count = 0;
r = 0;
fgets(work, 800, f);
printf ("string: [%s]\n", work);
while (work[count] != 0)
{
characters3++;
count++;
}
characters3--; /* do not count last zero */
characters3--; /* do not count the return */
printf("The number of characters to be copied is-: %d\n", characters3);
for (characters3; characters3 >= (count>>1); characters3--)
{
please = work[characters3];
why = work[r];
work[r] = please;
work[characters3] = why;
r++;
}
printf ("result: [%s]\n", work);
}
As a final note: you do not need to 'manually' count the number of characters, there is a function for that. All that's needed instead of the count loop is this;
characters3 = strlen(work);
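For comparison, here is a compact sketch of the same in-place swap using strlen and two indices that meet in the middle. The function name reverse_line is made up, and it assumes the trailing newline should stay at the end:

```c
#include <assert.h>
#include <string.h>

/* Reverse s in place, leaving a trailing '\n' (if any) at the end. */
void reverse_line(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len > 0 && s[len - 1] == '\n')
        len--;                       /* keep the newline where it is */
    if (len < 2)
        return;                      /* nothing to swap */

    /* Swap from both ends and stop when the indices meet. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```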
Here's a complete and heavily commented function that will take in a filename to an existing file, open it, then reverse the file character-by-character. Several improvements/extensions could include:
Add an argument to adjust the maximum buffer size allowed.
Dynamically increase the buffer size as the input file exceeds the original memory.
Add a strategy for recovering the original contents if something goes wrong when writing the reversed characters back to the file.
#include <stdio.h>
#include <stdint.h>
// naming convention of l_ for local variable and p_ for pointers
// Returns 1 on success and 0 on failure
int reverse_file(char *filename) {
FILE *p_file = NULL;
// r+ enables read & write, preserves contents, starts pointer p_file at beginning of file, and will not create a
// new file if one doesn't exist. Consider a nested fopen(filename, "w+") if creation of a new file is desired.
p_file = fopen(filename, "r+");
// Exit with failure value if file was not opened successfully
if(p_file == NULL) {
perror("reverse_file() failed to open file.");
// Nothing to close here: passing NULL to fclose() is undefined behavior
return 0;
}
// Assumes entire file contents can be held in memory using a buffer of size l_buffer_size * sizeof(char)
uint32_t l_buffer_size = 1024;
char l_buffer[l_buffer_size]; // fgetc() returns an int; each value other than EOF is narrowed to char when stored
// Cursor for moving within the l_buffer
int64_t l_buffer_cursor = 0;
// Temporary storage for current char from file
// fgetc() returns the character read as an unsigned char cast to an int or EOF on end of file or error.
int l_temp;
for (l_buffer_cursor = 0; (l_temp = fgetc(p_file)) != EOF; ++l_buffer_cursor) {
// Verify our assumption that the file fits in l_buffer BEFORE storing,
// so that we never write past the end of the buffer. Return an error otherwise.
if (l_buffer_cursor >= l_buffer_size) {
fprintf(stderr, "reverse_file() in memory buffer size of %u char exceeded. %s is too large.\n",
(unsigned)l_buffer_size, filename);
fclose(p_file);
return 0;
}
// Store the current char into our buffer in the original order from the file
l_buffer[l_buffer_cursor] = (char)l_temp; // explicitly convert l_temp back down to char
}
// At the conclusion of the for loop, l_buffer contains a copy of the file in memory and l_buffer_cursor holds
// the index 1 past the final char read in from the file. Decrement l_buffer_cursor by 1 so that it refers to the
// last char before proceeding. (No terminating '\0' is written: it would append a stray byte to the file.)
--l_buffer_cursor;
// To reverse the file contents, reset the p_file cursor to the beginning of the file then write data to the file by
// reading from l_buffer in reverse order by decrementing l_buffer_cursor.
// NOTE: A less verbose/safe alternative to fseek is: rewind(p_file);
if ( fseek(p_file, 0, SEEK_SET) != 0 ) {
fclose(p_file);
return 0;
}
for (l_temp = 0; l_buffer_cursor >= 0; --l_buffer_cursor) {
l_temp = fputc(l_buffer[l_buffer_cursor], p_file); // write buffered char to the file, advancing the file position
if (l_temp == EOF) {
fprintf(stderr, "reverse_file() failed to write %c at index %lld back to the file %s.\n",
l_buffer[l_buffer_cursor], (long long)l_buffer_cursor, filename);
fclose(p_file);
return 0;
}
}
fclose(p_file);
return 1;
}
I am trying to write a program that has to store an ASCII picture (every line has a different length) in a 2D array and then print it out again. Either I have to cut the array at "\n" or I have to create an array with dynamic size. Here is what I have so far. It prints out the right way, but every line has 255 chars.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#define MAX_LENGTH 255
int main(void) {
FILE *iFile, *oFile;
char elements[MAX_LENGTH][MAX_LENGTH];
memset(elements, 0, MAX_LENGTH);
int zeile = 0, position = 0, size = 0;
char c;
bool fileend = false;
//open files
iFile = fopen("image.txt", "r");
oFile = fopen("imagerotated.txt", "w+");
if ((iFile == NULL) || (oFile == NULL))
{
perror("Error: File does not exist.");
exit(EXIT_FAILURE);
}
//read File to a 2D-Array
while (1) {
if ((c = fgetc(iFile)) != EOF) {
if (c == '\n') {
zeile++;
position = 0;
}
elements[zeile][position] = c;
position++;
}
else {
fileend = true;
}
if (fileend == true) {
break;
}
}
//Write 2D-Array into the output file
fwrite(elements, MAX_LENGTH, MAX_LENGTH, oFile);
fclose(iFile);
fclose(oFile);
return EXIT_SUCCESS;
}
So my question is: what is the best solution to print out an array and cut each line at "\n"? (Or to create an array with dynamic length/size.)
I thought about creating an int countRows and getting the number from 'zeile' when fileend becomes true, but how do I get countColumns?
The main goal is to rotate the ASCII image in 90 degree steps, but I'm stuck at the output. I have to use a 2D array so I can swap chars easily.
Thanks for your help.
Hi,
You can try to write a function with a prototype like this:
char **myFileToArray(char *contentOfFile);
The purpose of this function is to create an array that will contain each line of your picture.
In this function you count the number of lines (= the number of '\n'), then create a dynamic array of that size. For each element of the array, you dynamically allocate the length of the line plus one for the '\0'. You copy the line into the element and put the '\0' at the end of it.
After this you can return your array.
For example, if your file is :
1. /\_/\
2. =( °w° )=
3. ) ( //
4. (__ __)//
you will have an array of size 5 composed like this:
array[0] = " /\_/\"
array[1] = "=( °w° )="
array[2] = " ) ( //"
array[3] = " (__ __)//"
array[4] = "\0"
For the printing you can create another function that writes the array and replaces each '\0' with a '\n'.
This solution is a bit more complex, but it's still a solution, and a clean one I think.
I hope my explanation was clear. Sorry for my bad English, I'm trying to improve!
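A sketch of how such a function could look. It is named split_lines here rather than myFileToArray because it also reports the line count through an extra parameter, and it ends the array with a NULL pointer instead of an empty string; both choices are assumptions of this example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split content at each '\n' into a NULL-terminated array of
   heap-allocated strings. Stores the number of lines in *count and
   returns NULL on allocation failure. The caller frees each line and
   then the array. */
char **split_lines(const char *content, size_t *count)
{
    assert(content != NULL && count != NULL);

    /* Count the lines: one per '\n', plus a final unterminated line. */
    size_t total = strlen(content);
    size_t n = 0;
    for (size_t i = 0; i < total; i++)
        if (content[i] == '\n')
            n++;
    if (total > 0 && content[total - 1] != '\n')
        n++;

    char **lines = calloc(n + 1, sizeof *lines);  /* +1 for the NULL end marker */
    if (lines == NULL)
        return NULL;

    const char *start = content;
    for (size_t i = 0; i < n; i++) {
        size_t len = strcspn(start, "\n");        /* length up to the newline */
        lines[i] = malloc(len + 1);
        if (lines[i] == NULL) {
            while (i > 0)
                free(lines[--i]);
            free(lines);
            return NULL;
        }
        memcpy(lines[i], start, len);
        lines[i][len] = '\0';
        start += len;
        if (*start == '\n')
            start++;                              /* step over the newline */
    }
    *count = n;
    return lines;
}
```

Since every row keeps its own length, the widest row (needed for rotating the picture) can be found afterwards with strlen.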
To write your array data to a binary file with fwrite, you need to know the number of characters you are writing. All you care about is the number of used characters (or used lines, in your case), not the uninitialized/empty part of your array. You will probably also want to write the number of lines and the size of each array element as the first two values in the file, so you know how many rows of what size to read back. (When handling strings, you can also write the null-terminating char to the file if you choose.) If you have the number of lines numlines as an unsigned int, then something simple like the following works for type char, where sizeof (char) = 1:
unsigned maxl = MAX_LENGTH;
fwrite (&numlines, sizeof numlines, 1, oFile);
fwrite (&maxl, sizeof maxl, 1, oFile);
fwrite (elements, MAX_LENGTH, numlines, oFile);
Then, to read it back, you first read the two unsigned int values, which give the number of rows and the size of each row to read back from the file.
If elements contains something other than char where sizeof type > 1, then you would change the fwrite for maxl and elements to (example for int):
unsigned maxl = MAX_LENGTH * sizeof (int);
...
fwrite (&maxl, sizeof maxl, 1, oFile);
fwrite (elements, MAX_LENGTH * sizeof (int), numlines, oFile);
or the last line would normally be seen as just:
fwrite (elements, maxl, numlines, oFile);
The goal is to write the file (or record) in a way that you can easily read it back in. Knowing the number of elements and the size and saving those as the first two numbers written to the file allows you to query the file for the size/type needed to hold the values and then to read them back into your program and validate your read.
With not much additional effort, you can write a jagged array out as well: just include the number of characters (or bytes) to be read before each line in your output. That will allow you to write a number of unevenly sized groups of elements to your file, saving some file size in the process.
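The jagged layout could be sketched as a pair of helpers, one writing a length-prefixed record and one reading it back. The names write_record and read_record are hypothetical, and the file is assumed to be read back on the same kind of machine that wrote it (same size and byte order for unsigned):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write one record: the byte count as an unsigned, then the bytes.
   Returns 1 on success, 0 on a write error. */
int write_record(FILE *out, const char *line)
{
    assert(out != NULL && line != NULL);
    unsigned len = (unsigned)strlen(line);
    return fwrite(&len, sizeof len, 1, out) == 1
        && fwrite(line, 1, len, out) == len;
}

/* Read one record into a freshly malloc'd, NUL-terminated string.
   Returns NULL at end of file, on a read error, or if malloc fails. */
char *read_record(FILE *in)
{
    assert(in != NULL);
    unsigned len;
    if (fread(&len, sizeof len, 1, in) != 1)
        return NULL;
    char *line = malloc((size_t)len + 1);
    if (line == NULL)
        return NULL;
    if (fread(line, 1, len, in) != len) {
        free(line);
        return NULL;
    }
    line[len] = '\0';
    return line;
}
```

Calling read_record in a loop until it returns NULL recovers the lines one by one without knowing their lengths in advance.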
I am trying to deconstruct a document into its respective paragraphs and store each paragraph, as a string, in an array. However, each time a new value is added, it overwrites all previous values in the array. The last "paragraph" read (as delimited by a newline) is the value of every non-null element of the array.
Here is the code:
char buffer[MAX_SIZE];
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
int pp = 0;
int i;
FILE *doc;
doc = fopen(argv[1], "r+");
assert(doc);
while((i = fgets(buffer, sizeof(buffer), doc) != NULL)) {
if(strncmp(buffer, "\n", sizeof(buffer))) {
paragraphs[pp++] = (char*)buffer;
}
}
printf("pp: %d\n", pp);
for(i = 0; i < MAX_SIZE && paragraphs[i] != NULL; i++) {
printf("paragraphs[%d]: %s", i, paragraphs[i]);
}
The output I receive is:
pp: 4
paragraphs[0]: paragraph four
paragraphs[1]: paragraph four
paragraphs[2]: paragraph four
paragraphs[3]: paragraph four
when the program is run as follows: ./prog.out doc.txt, where doc.txt is:
paragraph one
paragraph two
paragraph three
paragraph four
The behavior of the program is otherwise desired. The paragraph count works properly, ignoring the line that contains ONLY the newline character (line 4).
I assume the problem occurs in the while loop, however am unsure how to remedy the problem.
Your solution is pretty sound. Your paragraphs array is supposed to hold each paragraph, and since each element is just a small pointer (typically 4 or 8 bytes), you can afford to define a reasonable max number of them. However, since this max number is a constant, it is of little use to allocate the array dynamically.
The only meaningful use of dynamic allocation would be to read the whole text once to count the actual number of paragraphs, allocate the array accordingly and re-read the whole file a second time, but I doubt this is worth the effort.
The downside of using fixed-size paragraph array is that you must stop filling it once you reach the maximal number of elements.
You can then re-allocate a bigger array if you absolutely want to be able to process the whole Bible, but for an educational exercise I think it's reasonable to just stop recording paragraphs (thus producing a code that can store and count paragraphs up to a maximal number).
The real trouble with your code is, you don't store the paragraph contents anywhere. When you read the actual lines, it's always inside the same buffer, so each paragraph will point to the same string, which will contain the last paragraph read.
The solution is to make a unique copy of the buffer and have the current paragraph point to that.
C being already messy enough as it is, I suggest using the strdup() function, which duplicates a string (basically computing string length, allocating sufficient memory, copying the string into it and returning the new block of memory holding the new copy). You just need to remember to free this new copy once you're done using it (in your case at the end of your program).
This is not the most time-efficient solution, since each string will require a strlen and a malloc performed internally by strdup, while you could have pre-allocated a big buffer for all paragraphs, but it is certainly simpler and probably more memory-efficient (only the minimal amount of memory will be allocated for each string, though each malloc consumes a few extra bytes for internal allocator housekeeping).
The bloody awkward fgets also stores the trailing \n at the end of the line, so you'll probably want to get rid of that.
Your last display loop would be simpler, more robust and more efficient if you simply used pp as a limit, instead of checking uninitialized paragraphs.
Lastly, you'd better define two different constants for max line size and max number of paragraphs. Using the same value for both makes little sense, unless you're processing perfectly square texts :).
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE_SIZE 82 // max nr of characters in a line (including trailing \n and \0)
#define MAX_PARAGRAPHS 100 // max number of paragraphs in a file
int main (int argc, char *argv[])
{
    char buffer[MAX_LINE_SIZE];
    char * paragraphs[MAX_PARAGRAPHS];
    int pp = 0;
    int i;
    FILE *doc;
    assert(argc > 1); // argv[1] must name the document
    doc = fopen(argv[1], "r");
    assert(doc != NULL);
    while((fgets(buffer, sizeof(buffer), doc) != NULL)) {
        if (pp != MAX_PARAGRAPHS // make sure we don't overflow our paragraphs array
            && strcmp(buffer, "\n")) {
            // fgets awkwardly collects the ending \n, so get rid of it
            buffer[strcspn(buffer, "\n")] = '\0';
            // current paragraph references a unique copy of the actual text
            paragraphs[pp++] = strdup (buffer);
        }
    }
    fclose(doc);
    printf("pp: %d\n", pp);
    for(i = 0; i != pp; i++) {
        printf("paragraphs[%d]: %s\n", i, paragraphs[i]);
        free(paragraphs[i]); // release memory allocated by strdup
    }
    return 0;
}
What is the proper way to allocate the necessary memory? Is the malloc on line 2 not enough?
No. That malloc only allocates the array of pointers; each pointer still needs memory of its own for the string it will hold. The following is therefore not enough by itself:
char **paragraphs = (char**)malloc(MAX_SIZE * sizeof(char*));
If you have: (for a simple explanation)
char **array = {0}; //array of C strings, before memory is allocated
Then you can create memory for it like this:
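#include <stdlib.h>
char ** allocMemory(char ** a, int numStrings, int maxStrLen);
void freeMemory(char ** a, int numStrings);
int main(void)
{
    int numStrings = 10;// for example, change as necessary
    int maxLen = MAX_SIZE; //for example, change as necessary
    char **array = NULL;
    array = allocMemory(array, numStrings, maxLen);
    if(array == NULL) return 1;
    //use the array, then free it
    freeMemory(array, numStrings);
    return 0;
}
char ** allocMemory(char ** a, int numStrings, int maxStrLen)
{
    int i;
    // calloc takes (element count, element size); +1 leaves a NULL pointer at the end
    a = calloc(numStrings+1, sizeof(char*));
    if(a == NULL) return NULL;
    for(i=0;i<numStrings; i++)
    {
        a[i] = calloc(maxStrLen + 1, sizeof(char)); // +1 for the terminating '\0'
        if(a[i] == NULL)
        {
            freeMemory(a, i); // release what was allocated so far
            return NULL;
        }
    }
    return a;
}
void freeMemory(char ** a, int numStrings)
{
    int i;
    for(i=0;i<numStrings; i++)
        free(a[i]); // free(NULL) is a harmless no-op
    free(a);
}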
Note: you can determine the number of lines in a file in several ways. One way, for example, is to open it with FILE *fp = fopen(filepath, "r");, then call ret = fgets(lineBuf, lineLen, fp) in a loop until ret == NULL (fgets signals end of file with NULL, not EOF), incrementing a counter on each iteration. Then call fclose() (which your code did not do either). This necessary step is not included in the code example above, but you can add it if that is the approach you want to use.
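That counting step might be sketched like this. The name count_lines and the 256-byte chunk size are assumptions of this example; the function rewinds the stream instead of closing it so the same FILE * can be read a second time:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the lines in fp, then rewind so the file can be read again.
   A final line without a trailing '\n' still counts as a line. */
size_t count_lines(FILE *fp)
{
    assert(fp != NULL);
    char buf[256];
    size_t lines = 0;
    int at_line_start = 1;   /* the next chunk read begins a new line */

    while (fgets(buf, sizeof buf, fp) != NULL) {  /* NULL, not EOF, marks the end */
        if (at_line_start)
            lines++;
        /* A chunk without '\n' means the line was longer than buf. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }
    rewind(fp);
    return lines;
}
```

The result can then be passed as numStrings when allocating the array.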
Once you have memory allocated, change the following in your code:
paragraphs[pp++] = (char*)buffer;
To:
strcpy(paragraphs[pp++], buffer);//no need to cast buffer, it is already char *
Also, do not forget to call fclose() when you are finished with the open file.