I'm trying to get my text to be read back to front and printed in reverse order in that file, but my for loop doesn't seem to be working. Also, my while loop is counting 999 characters even though it should be 800 and something (I can't remember exactly). I think it might be because there is an empty line between the two paragraphs, but then again there are no characters there.
Here is my code for the two loops:
/*Reversing the file*/
char please;
char work[800];
int r, count, characters3;
characters3 = 0;
count = 0;
r = 0;
fgets(work, 800, outputfile);
while (work[count] != NULL)
{
characters3++;
count++;
}
printf("The number of characters to be copied is-: %d", characters3);
for (characters3; characters3 >= 0; characters3--)
{
please = work[characters3];
work[r] = please;
r++;
}
fprintf(outputfile, "%s", work);
/*Closing all the file streams*/
fclose(firstfile);
fclose(secondfile);
fclose(outputfile);
/*Message to direct the user to where the files are*/
printf("\n Merged the first and second files into the output file and reversed it! \n Check the outputfile text inside the Debug folder!");
There are a couple of huge conceptual flaws in your code.
The very first one is that you state that it "doesn't seem to [be] working" without saying why you think so. Just running your code reveals what the problem is: you do not get any output at all.
Here is why. You reverse your string, and so the terminating zero comes at the start of the new string. You then print that string – and it ends immediately at the first character.
Fix this by decrementing characters3 before the loop starts, so that copying begins at the last real character rather than at the terminating zero.
Next, why not print a few intermediate results? That way you can see what's happening.
string: [This is a test.
]
The number of characters to be copied is-: 15
result: [
.tset aa test.
]
Hey look, there seems to be a problem with the newline (it ends up at the start of the line). That is exactly what should happen, since it is part of the string, but it is probably not what you intend.
Apart from that, you can clearly see that the reversing itself is not correct!
The problem now is that you are reading and writing from the same string:
please = work[characters3];
work[r] = please;
You write the character at the end into position #0, decrease the end and increase the start, and repeat until done. So, during the second half of the loop you are reading characters from the first half that you have already overwritten with characters from the end!
Two possible fixes: 1. read from one string and write to a new one, or 2. adjust the loop so it stops after 'half' is done (since each swap puts two characters in their final place, you only need to loop over half the number of characters).
You also need to think more about what swapping means. As it is, your code overwrites a character in the string. To correctly swap two characters, you need to save one first in a temporary variable.
void reverse (FILE *f)
{
char please, why;
char work[800];
int r, count, characters3;
characters3 = 0;
count = 0;
r = 0;
fgets(work, 800, f);
printf ("string: [%s]\n", work);
while (work[count] != 0)
{
characters3++;
count++;
}
characters3--; /* do not count last zero */
characters3--; /* do not count the return */
printf("The number of characters to be copied is-: %d\n", characters3);
for (characters3; characters3 >= (count>>1); characters3--)
{
please = work[characters3];
why = work[r];
work[r] = please;
work[characters3] = why;
r++;
}
printf ("result: [%s]\n", work);
}
As a final note: you do not need to 'manually' count the number of characters; there is a function for that (declared in string.h). All that's needed instead of the count loop is this:
characters3 = strlen(work);
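Putting the points above together, here is a minimal sketch of an in-place reversal. The function name reverse_line is my own, and it assumes you want the trailing newline from fgets to stay at the end rather than be moved to the front:

```c
#include <string.h>

/* Reverse the characters of s in place, leaving a trailing newline (if any)
   where it is. Returns the number of characters that were reversed. */
size_t reverse_line(char *s)
{
    size_t len = strcspn(s, "\n");   /* length up to, not including, the newline */
    if (len < 2)
        return len;                  /* nothing to swap */

    size_t left = 0, right = len - 1;
    while (left < right) {
        char tmp = s[left];          /* save one character before overwriting it */
        s[left++] = s[right];
        s[right--] = tmp;
    }
    return len;
}
```

The loop stops as soon as the two indexes meet, which is the "only half the characters" rule, and tmp is the temporary variable a real swap needs.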
Here's a complete and heavily commented function that will take in a filename to an existing file, open it, then reverse the file character-by-character. Several improvements/extensions could include:
Add an argument to adjust the maximum buffer size allowed.
Dynamically increase the buffer size as the input file exceeds the original memory.
Add a strategy for recovering the original contents if something goes wrong when writing the reversed characters back to the file.
// naming convention of l_ for local variable and p_ for pointers
// Returns 1 on success and 0 on failure
int reverse_file(char *filename) {
FILE *p_file = NULL;
// r+ enables read & write, preserves contents, starts pointer p_file at beginning of file, and will not create a
// new file if one doesn't exist. Consider a nested fopen(filename, "w+") if creation of a new file is desired.
p_file = fopen(filename, "r+");
// Exit with failure value if file was not opened successfully
if(p_file == NULL) {
perror("reverse_file() failed to open file");
// p_file is NULL here, so there is nothing to close
return 0;
}
// Assumes entire file contents can be held in volatile memory using a buffer of size l_buffer_size * sizeof(char)
uint32_t l_buffer_size = 1024;
char l_buffer[l_buffer_size]; // fgetc() returns an int, which is narrowed to char when stored here
// Cursor for moving within the l_buffer
int64_t l_buffer_cursor = 0;
// Temporary storage for current char from file
// fgetc() returns the character read as an unsigned char cast to an int or EOF on end of file or error.
int l_temp;
for (l_buffer_cursor = 0; (l_temp = fgetc(p_file)) != EOF; ++l_buffer_cursor) {
// Verify, before storing anything, that our assumption that the file fits in l_buffer_size * sizeof(char)
// is still valid. Return an error otherwise.
if (l_buffer_cursor >= l_buffer_size) {
fprintf(stderr, "reverse_file() in memory buffer size of %u char exceeded. %s is too large.\n",
l_buffer_size, filename);
fclose(p_file);
return 0;
}
// Store the current char into our buffer in the original order from the file
l_buffer[l_buffer_cursor] = (char)l_temp; // explicitly narrow the int from fgetc() back down to char
}
// At the conclusion of the for loop, l_buffer contains a copy of the file in memory and l_buffer_cursor points
// to the index 1 past the final char read in from the file. Decrement l_buffer_cursor by 1 so that it indexes
// the final char before proceeding. No terminator is written to the file, since that would append an extra byte.
--l_buffer_cursor;
// To reverse the file contents, reset the p_file cursor to the beginning of the file then write data to the file by
// reading from l_buffer in reverse order by decrementing l_buffer_cursor.
// NOTE: A less verbose/safe alternative to fseek is: rewind(p_file);
if ( fseek(p_file, 0, SEEK_SET) != 0 ) {
perror("reverse_file() failed to seek to the start of the file");
fclose(p_file);
return 0;
}
for (l_temp = 0; l_buffer_cursor >= 0; --l_buffer_cursor) {
l_temp = fputc(l_buffer[l_buffer_cursor], p_file); // write buffered char to the file, advance f_open pointer
if (l_temp == EOF) {
fprintf(stderr, "reverse_file() failed to write %c at index %lld back to the file %s.\n",
l_buffer[l_buffer_cursor], (long long)l_buffer_cursor, filename);
}
}
fclose(p_file);
return 1;
}
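For the "dynamically increase the buffer size" improvement listed above, here is a rough sketch of one way to do it. The names read_all and reverse_bytes are hypothetical helpers, not part of the function above, and the starting capacity of 16 bytes is arbitrary:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything remaining in fp into a heap buffer that doubles in size
   whenever it fills up. On success returns the buffer (caller frees it) and
   stores the byte count in *out_len; on failure returns NULL. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (len == cap) {                       /* buffer full: grow it first */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (ferror(fp)) {                           /* EOF was really a read error */
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}

/* Reverse len bytes of buf in place. */
void reverse_bytes(char *buf, size_t len)
{
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = buf[i];
        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = tmp;
    }
}
```

After reversing in memory you would seek back to the start and write the buffer out with fwrite. Keeping the original bytes around until that write succeeds is one way to approach the recovery idea as well.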
I'm trying to understand how memory allocation and pointers work, since I find a problem set of CS50 (pset5) too overwhelming.
I made a simple program that reads characters from an array and writes them both into a new text file and to the terminal.
The program works, but it is leaking memory.
Specifically, for each \n encountered in the string, valgrind reports memory lost in one more block. And for each character in the string (of char *c), it reports one more byte leaked.
What am I doing wrong?
image link of the terminal: https://i.stack.imgur.com/ANtAs.png
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main (void)
{
FILE *fp;
char *c = "One\nTwo\n";
// Open file for writing (reading and writing works too, we can use 'w+' for that).
fp = fopen("file.txt", "w");
// Write data to the file.
fwrite(c, strlen(c), 1, fp);
// Seek to the beginning of the file
fseek(fp, 0, SEEK_SET);
// close file of the file pointer (the text file).
fclose(fp);
// initialize a counter for the amount of characters in the current word that is being read out of the file.
int char_count = 0;
// initialize an address for the first character in a string.
char *buffer_temp_word = NULL;
// Read and display data, using iterations over each character.
// Open the file in read mode.
fp = fopen("file.txt", "r");
// initiate a for loop.
// condition 1: getting a character from the fp stream does not equal reaching the end of the file
// condition 2: the amount of iterations is not above 60 (failsafe against endless loops).
for (int i = 0; fgetc(fp) != EOF && i <= 60 ; i++)
{
//add a counter to the amount of characters currently read.
char_count++;
// seek the pointer 1 place back (the 'IF' function moves the pointer forward 1 place forward for each character).
fseek(fp , -1L, SEEK_CUR);
// get the character value of the current spot that the pointer of the read file points to.
char x = fgetc(fp);
buffer_temp_word = realloc(buffer_temp_word, (sizeof(char)) * char_count);
//the string stores the character on the correct place
//(the first character starts at memory location 0, hence the amount of characters -1)
buffer_temp_word[char_count - 1] = x;
// check for the end of the line (which is the end of the word).
if(x == '\n')
{
//printf("(end of line reached)");
printf("\nusing memory:");
// iterate trough characters in the memory using the pointer + while loop, option 2.
while(*buffer_temp_word != '\n')
{
printf("%c", *buffer_temp_word);
buffer_temp_word++;
}
printf("\nword printed succesfully");
// reset the pointer to the beginning of the buffer_temp_word string (which is an array actually).
buffer_temp_word = NULL;
free(buffer_temp_word);
// reset the amount of characters (for the next word that will be read).
char_count = 0;
}
printf("%c", x);
}
fclose(fp);
free(buffer_temp_word);
return(0);
}
You set buffer_temp_word to NULL before freeing it:
// reset the pointer to the beginning of the buffer_temp_word string (which is an array actually).
buffer_temp_word = NULL;
free(buffer_temp_word);
If you use clang's static analyzer, it can walk you through a path in your code to show your memory leak.
Also, setting a pointer to NULL does not reset it to the starting position of the array it points to, it sets it to, well, NULL. Consider using a for-loop instead of your while loop and use the counter to index your array:
for(int j = 0; buffer_temp_word[j] != '\n'; ++j)
{
printf("%c", buffer_temp_word[j]);
}
And then don't set buffer_temp_word to NULL and don't free it immediately after this loop. The program is already set to realloc it or free it later.
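To make the ownership rule concrete, here is a small sketch of the same read loop with the pointer never moved and exactly one free at the end. The function echo_lines and its in/out parameters are my own invention for illustration, not something from your program:

```c
#include <stdio.h>
#include <stdlib.h>

/* Copy each newline-terminated line from in to out, building every line in a
   single heap buffer. Returns the number of lines written, or -1 if realloc fails. */
int echo_lines(FILE *in, FILE *out)
{
    char *word = NULL;      /* always points at the start of the allocation */
    size_t count = 0;       /* characters stored for the current line */
    int lines = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        char *grown = realloc(word, count + 1);
        if (grown == NULL) {
            free(word);
            return -1;
        }
        word = grown;
        word[count++] = (char)c;

        if (c == '\n') {
            for (size_t j = 0; j < count; j++)   /* index; never move word itself */
                fputc(word[j], out);
            count = 0;      /* reuse the same allocation for the next line */
            lines++;
        }
    }
    free(word);             /* one free for the one live allocation */
    return lines;
}
```

Because word always holds the address realloc returned, the final free releases everything and valgrind should have nothing left to report.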
I found this piece of code at Reading a file character by character in C; it compiles and is what I wish to use. My problem is that I cannot get the call to it working properly. The code is as follows:
char *readFile(char *fileName)
{
FILE *file = fopen(fileName, "r");
char *code;
size_t n = 0;
int c;
if (file == NULL)
return NULL; //could not open file
code = malloc(1500);
while ((c = fgetc(file)) != EOF)
{
code[n++] = (char) c;
}
code[n] = '\0';
return code;
}
I am not sure of how to call it. Currently I am using the following code to call it:
.....
char * rly1f[1500];
char * RLY1F; // This is the Input File Name
rly1f[0] = readFile(RLY1F);
if (rly1f[0] == NULL) {
printf ("NULL array); exit;
}
int n = 0;
while (n++ < 1000) {
printf ("%c", rly1f[n]);
}
.....
How do I call the readFile function such that I have an array (rly1f) which is not NULL? The file RLY1F exists and has data in it. I have successfully opened it previously using 'in line code' not a function.
Thanks
The error you're experiencing is that you forgot to pass a valid filename. So either the program crashes, or fopen tries to open a garbage name and returns NULL:
char * RLY1F; // This is not initialized!
RLY1F = "my_file.txt"; // initialize it!
The next problem you'll have will be in your loop to print the characters.
You have defined an array of pointers char * rly1f[1500];
You read 1 file and store it in the first pointer of the array rly1f[0]
But when you display it you display the pointer values as characters which is not what you want. You should just do:
while (n < 1000) {
printf ("%c", rly1f[0][n]);
n++;
}
note: that would not crash but would print trash if the file read is shorter than 1000.
(BLUEPIXY suggested the post-incrementation fix for n BTW or first character is skipped)
So do it more simply since your string is nul-terminated, pass the array to puts:
puts(rly1f[0]);
EDIT: you have a problem when reading your file too. You malloc 1500 bytes, but you read the file fully. If the file is bigger than 1500 bytes, you get buffer overflow.
You have to compute the length of the file before allocating the memory. For instance like this (using stat would be a better alternative maybe):
char *readFile(char *fileName, unsigned int *size) {
...
fseek(file,0,SEEK_END); // set pos to end of file
*size = ftell(file); // get pos, i.e. size
rewind(file); // set pos to 0
code = malloc(*size+1); // allocate the proper size plus one
Notice the extra parameter, which allows you to return the size as well as the file data.
Note: on Windows systems, text files use \r\n (CRLF) to delimit lines, so the allocated size will be higher than the number of characters read if you use text mode (each \r\n is converted to \n, so there are fewer chars in your buffer; you could consider a realloc once you know the exact size to shave off the unused allocated space).
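For reference, a complete version of that idea might look like the sketch below. It is not the original function: the name read_file_sized is made up, it opens the file in binary mode so that ftell counts bytes, and it assumes your platform supports seeking to the end of a binary stream (the usual desktop systems do):

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a whole file into a NUL-terminated heap buffer sized from the file itself.
   Stores the number of bytes read in *size. Returns NULL on any failure. */
char *read_file_sized(const char *file_name, size_t *size)
{
    FILE *file = fopen(file_name, "rb");   /* binary mode so ftell counts bytes */
    if (file == NULL)
        return NULL;

    char *code = NULL;
    if (fseek(file, 0, SEEK_END) == 0) {
        long end = ftell(file);            /* position at the end, i.e. the size */
        if (end >= 0) {
            rewind(file);
            code = malloc((size_t)end + 1);    /* plus one for the terminator */
            if (code != NULL) {
                *size = fread(code, 1, (size_t)end, file);
                code[*size] = '\0';
            }
        }
    }
    fclose(file);
    return code;
}
```

The caller owns the returned buffer and must free it. A NULL return covers both "could not open" and "could not allocate".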
I am trying to read a file of unknown size line by line including single or multiple newline characters.
for example if my sample.txt file looks like this
abc cd er dj
text

more text


zxc cnvx
I want my strings to look something like this
string1 = "abc cd er dj\n";
string2 = "text\n\n";
string3 = "more text\n\n\n";
string4 = "zxc convex";
I can't seem to come up with solution that works properly. I have tried following code to get the length of each line including newline characters but it gives me incorrect length
while((temp = fgetc(input)) != EOF) {
if (temp != '\n') {
length++;
}
else {
if (temp == '\n') {
while ((temp = fgetc(input)) == '\n') {
length++;
}
}
length = 0;
}
}
I was thinking, if I can get length of each line including newline character(s) and then I can malloc string of that length and then read that size of string using fread but I am not sure if that would work because I will have to move the file pointer to get the next string.
I also don't want to use buffer because I don't know the length of each line. Any sort of help will be appreciated.
If the lines are just short and there aren't many of them, you could use realloc to reallocate memory as needed. Or you can use smaller (or larger) chunks and reallocate. It's a little more wasteful but hopefully it should average out in the end.
If you want to use just one allocation, then find the start of the next non-empty line and save the file position (use ftell). Then get the difference between the current position and the previous start position and you know how much memory to allocate. For the reading, yes, you have to seek back and forth, but if the line is not too big all the data will already be in the stream's buffer, so seeking just modifies some pointers. After reading, seek to the saved position and make it the next start position.
Then there is of course the possibility of memory-mapping the file. This puts the file contents into your address space as if it had all been allocated. On a 64-bit system the address space is big enough that you should be able to map multi-gigabyte files. Then you don't need to seek or allocate memory; all you do is manipulate pointers. Reading is just a simple memory copy (but since the file is already "in" memory you don't really need the copy, just save the pointers instead).
For a very simple example on fseek and ftell, that is somewhat related to your problem, I put together this little program for you. It doesn't really do anything special but it shows how to use the functions in a way that could be used for a prototype of the second method I discussed above.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *file = fopen("some_text_file.txt", "r");
// The position after a successful open call is always zero
long start_of_line = 0;
int ch;
// Read characters until we reach the end of the file or there is an error
while ((ch = fgetc(file)) != EOF)
{
// Hit the *first* newline (which differs from your problem)
if (ch == '\n')
{
// Found the first newline, get the current position
// Note that the current position is the position *after* the newly read newline
long current_position = ftell(file);
// Allocate enough memory for the whole line, including newline
size_t bytes_in_line = current_position - start_of_line;
char *current_line = malloc(bytes_in_line + 1); // +1 for the string terminator
// Now seek back to the start of the line
fseek(file, start_of_line, SEEK_SET); // SEEK_SET means the offset is from the beginning of the file
// And read the line into the buffer we just allocated
fread(current_line, 1, bytes_in_line, file);
// Terminate the string
current_line[bytes_in_line] = '\0';
// At this point, if everything went well, the file position is
// back at current_position, because the fread call advanced the position
// This position is the start of the next line, so we use it
start_of_line = current_position;
// Then do something with the line...
printf("Read a line: %s", current_line);
// Finally free the memory we allocated
free(current_line);
}
// Continue loop reading character, to read the next line
}
// Did we hit end of the file, or an error?
if (feof(file))
{
// End of the file it is
// Now here's the tricky bit. Because files don't have to be terminated
// with a newline, at this point we could actually have some data we
// haven't read. That means we have to do the whole thing above with
// the allocation, seeking and reading *again*
// This is a good reason to extract that code into its own function so
// you don't have to repeat it
// I will not repeat the code myself. Creating a function containing it
// and calling it is left as an exercise
}
fclose(file);
return 0;
}
Please note that for brevity's sake the program doesn't contain any error handling. It should also be noted that I haven't actually tried the program, not even tried to compile it. It's all written ad hoc for this answer.
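As a possible starting point for that exercise, the allocate/seek/read/terminate steps could be pulled out into something like the following. The name read_span is hypothetical, and the sketch assumes file offsets and byte counts agree (true on POSIX systems or for streams opened in binary mode):

```c
#include <stdio.h>
#include <stdlib.h>

/* Return a heap copy of the bytes in [start, end) of file as a string, leaving
   the file position at end. Returns NULL on failure; the caller frees. */
char *read_span(FILE *file, long start, long end)
{
    if (end < start || fseek(file, start, SEEK_SET) != 0)
        return NULL;

    size_t bytes = (size_t)(end - start);
    char *line = malloc(bytes + 1);          /* +1 for the string terminator */
    if (line == NULL)
        return NULL;

    if (fread(line, 1, bytes, file) != bytes) {
        free(line);
        return NULL;
    }
    line[bytes] = '\0';
    return line;
}
```

Inside the loop you would call read_span(file, start_of_line, current_position), and after the loop call it once more with ftell(file) as the end to pick up a final line that has no newline.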
Unless you are trying to write your own implementation, you can use the standard POSIX getline() function:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("/etc/motd", "r");
if (fp == NULL)
exit(1);
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zd :\n", read);
printf("%s", line);
}
if (ferror(fp)) {
/* handle error */
}
free(line);
fclose(fp);
return 0;
}
You get the wrong length. The reason is that before you enter the loop:
while ((temp = fgetc(input)) == '\n')
you forgot to increment length as it has just read a \n character. So those lines must become:
else {
length++; // add the \n just read
if (temp == '\n') { // this is a redundant check
while ((temp = fgetc(input)) == '\n') {
length++;
}
ungetc(temp, input);
}
EDIT
After having read the first non \n, you now have read the first character of the next line, so you must unget it:
ungetc(temp, input);
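If what you are after in the end is the strings themselves rather than the lengths, the same fgetc/ungetc idea can be combined with a growing buffer. This is only a sketch under my own assumptions: the function name read_line_with_blanks and the initial capacity of 32 are arbitrary choices:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line plus every newline that directly follows it, so that
   "text\n\nmore" yields "text\n\n" first. Returns a heap string the caller
   frees, or NULL at end of file or on allocation failure. */
char *read_line_with_blanks(FILE *input)
{
    size_t cap = 32, len = 0;
    char *s = malloc(cap);
    if (s == NULL)
        return NULL;

    int seen_newline = 0;
    int c;
    while ((c = fgetc(input)) != EOF) {
        if (seen_newline && c != '\n') {
            ungetc(c, input);        /* first character of the next line */
            break;
        }
        if (len + 1 == cap) {        /* keep one byte free for the terminator */
            char *bigger = realloc(s, cap * 2);
            if (bigger == NULL) {
                free(s);
                return NULL;
            }
            s = bigger;
            cap *= 2;
        }
        s[len++] = (char)c;
        if (c == '\n')
            seen_newline = 1;
    }
    if (len == 0) {                  /* nothing left to read */
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

Calling it repeatedly until it returns NULL gives one string per group, with no need to know any line length in advance or to seek backwards.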
So I'm supposed to write a block of code that opens a file called "words" and writes the last word in the file to a file called "lastword". This is what I have so far:
FILE *f;
FILE *fp;
char string1[100];
f = fopen("words","w");
fp=fopen("lastword", "w");
fscanf(f,
fclose(fp)
fclose(f);
The problem here is that I don't know how to read in the last word of the text file. How would I know which word is the last word?
This is similar to what the tail tool does: you seek to a certain offset from the end of the file and read the block there, then search backwards. Once you meet a space or a newline, the word that starts just after it is the last word. The basic code looks like this:
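char string[1025]; /* room for 1024 bytes plus a terminator */
char *last = string; /* default: the whole block is one word */
f = fopen("words","r");
/* Seek to 1024 bytes before the end; if the file is shorter, the call fails and we stay at the start */
fseek(f, -1024L, SEEK_END);
size_t nread = fread(string, 1, sizeof string - 1, f);
/* Drop trailing whitespace such as the final newline (isspace needs ctype.h) */
while (nread > 0 && isspace((unsigned char)string[nread - 1]))
nread--;
string[nread] = '\0';
/* Search backwards for the whitespace that precedes the last word */
for (size_t i = nread; i > 0; i--) {
if (isspace((unsigned char)string[i - 1])) {
last = &string[i];
break;
}
}
fprintf(fp, "%s", last);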
If the word boundary is not found in the first block, continue by reading the second-to-last block and searching in it, then the third, until you find it; then print all the characters after that position.
There are plenty of ways to do this.
Easy way
One easy approach would be to loop on reading words:
f = fopen("words.txt","r"); // attention !! open in "r" mode !!
...
int rc;
do {
rc=fscanf(f, "%99s", string1); // attempt to read
} while (rc==1 && !feof(f)); // while it's successful.
... // here string1 contains the last successful string read
However, this treats a word as any combination of characters separated by whitespace. Note the use of the width field in the scanf() format to make sure that there will be no buffer overflow.
More exact way
Building on previous attempt, if you want a stricter definition of words, you can just replace the call to scanf() with a function of your own:
rc=read_word(f, string1, 100);
The function would be something like:
int read_word(FILE *fp, char *s, int szmax) {
int started=0, c;
while ((c=fgetc(fp))!=EOF && szmax>1) {
if (isalpha(c)) { // copy only alphabetic chars to string
started=1;
*s++=c;
szmax--;
}
else if (started) // first char after the alphabetics
break; // will end the word.
}
if (started)
*s=0; // if we have found a word, we end it.
return started;
}
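For completeness, here is how the easy approach could look end to end, from reading to writing the result. This is a sketch with a made-up function name (copy_last_word); it keeps a separate copy of the last successful read so it does not depend on what a failed fscanf leaves in the buffer:

```c
#include <stdio.h>
#include <string.h>

/* Copy the last whitespace-separated word of in to out.
   Returns 1 if a word was found and written, 0 otherwise. */
int copy_last_word(FILE *in, FILE *out)
{
    char word[100];
    char last[100] = "";

    while (fscanf(in, "%99s", word) == 1)
        strcpy(last, word);          /* remember the most recent successful read */

    if (last[0] == '\0')
        return 0;                    /* the input contained no words at all */
    return fputs(last, out) != EOF;
}
```

In the original problem you would call it with the stream opened on "words" in "r" mode as in and the stream opened on "lastword" in "w" mode as out.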
I am really struggling to understand how character arrays work in C. This seems like something that should be really simple, but I do not know what function to use, or how to use it.
I want the user to enter a string, and I want to iterate through a text file, comparing this string to the first word of each line in the file.
By "word" here, I mean substring that consists of characters that aren't blanks.
Help is greatly appreciated!
Edit:
To be more clear, I want to take a single input and search for it in a database in the form of a text file. I know that if it is in the database, it will be the first word of a line, since that is how the database is formatted. I suppose I COULD iterate through every single word of the database, but this seems less efficient.
After finding the input in the database, I need to access the two words that follow it (on the same line) to achieve the program's ultimate goal (which is computational in nature)
Here is some code that will do what you are asking. I think it will help you understand how string functions work a little better. Note - I did not make many assumptions about how well conditioned the input and text file are, so there is a fair bit of code for removing whitespace from the input, and for checking that the match is truly "the first word", and not "the first part of the first word". So this code will not match the input "hello" to the line "helloworld 123 234" but it will match to "hello world 123 234". Note also that it is currently case sensitive.
#include <stdio.h>
#include <string.h>
#include <ctype.h> // for isspace()
int main(void) {
char buf[100]; // declare space for the input string
FILE *fp; // pointer to the text file
char fileBuf[256]; // space to keep a line from the file
int ii, ll;
printf("give a word to check:\n");
fgets(buf, 100, stdin); // fgets prevents you reading in a string longer than buffer
printf("you entered: %s\n", buf); // check we read correctly
// see (for debug) if there are any odd characters:
printf("In hex, that is ");
ll = strlen(buf);
for(ii = 0; ii < ll; ii++) printf("%2X ", buf[ii]);
printf("\n");
// probably see a newline at the end (plus a carriage return on some systems). Get rid of it!
// note I could have used the result that ii is strlen(buf) but
// that makes the code harder to understand
for(ii = strlen(buf) - 1; ii >=0; ii--) {
if (isspace((unsigned char)buf[ii])) buf[ii]='\0';
}
// open the file:
if((fp=fopen("myFile.txt", "r"))==NULL) {
printf("cannot open file!\n");
return 0;
}
while( fgets(fileBuf, 256, fp) ) { // read in one line at a time until eof
printf("line read: %s", fileBuf); // show we read it correctly
// find whitespace: we need to keep only the first word.
ii = 0;
while(fileBuf[ii] != '\0' && !isspace((unsigned char)fileBuf[ii])) ii++; // also stop at the terminator
// now compare input string with first word from input file:
if (strlen(buf)==ii && strstr(fileBuf, buf) == fileBuf) {
printf("found a matching line: %s\n", fileBuf);
break;
}
}
// when you get here, fileBuf will contain the line you are interested in
// the second and third word of the line are what you are really after.
}
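To get at the second and third words once a line matches, one option is sscanf with width limits. The helper below is a sketch only: match_line is a name I made up, and the 63-character limit per word is an assumption about your data:

```c
#include <stdio.h>
#include <string.h>

/* If the first word of line equals key, return 1 with the next two words in
   second and third (each buffer at least 64 bytes); otherwise return 0.
   The output buffers may be overwritten even when there is no match. */
int match_line(const char *line, const char *key, char *second, char *third)
{
    char first[64];
    if (sscanf(line, "%63s %63s %63s", first, second, third) != 3)
        return 0;                    /* fewer than three words on this line */
    return strcmp(first, key) == 0;  /* whole-word, case-sensitive comparison */
}
```

Because %s stops at whitespace, "hello" will not match a line starting with "helloworld", which is the same whole-word behaviour as the longer code above. If the two following words are numbers, you could convert them afterwards with strtol or strtod.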
Your recent update states that the file is really a database, in which you are looking for a word. This is very important.
If you have enough memory to hold the whole database, you should do just that (read the whole database and arrange it for efficient searching), so you should probably not ask about searching in a file.
Good database designs involve data structures like trie and hash table. But for a start, you could use the most basic improvement of the database - holding the words in alphabetical order (use the somewhat tricky qsort function to achieve that).
struct Database
{
size_t count;
struct Entry // not sure about C syntax here; I usually code in C++; sorry
{
char *word;
char *explanation;
} *entries;
};
char *find_explanation_of_word(struct Database* db, char *word)
{
for (size_t i = 0; i < db->count; i++)
{
int result = strcmp(db->entries[i].word, word);
if (result == 0)
return db->entries[i].explanation;
else if (result > 0)
break; // if the database is sorted, this means word is not found
}
return NULL; // not found
}
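Since qsort can be a bit tricky the first time, here is a small sketch of sorting the entries and then looking one up with the standard bsearch instead of a linear scan. It uses a simplified stand-alone struct Entry and hypothetical helper names (compare_entries, sort_entries, lookup), so adapt it to the Database struct above as needed:

```c
#include <stdlib.h>
#include <string.h>

struct Entry {
    const char *word;
    const char *explanation;
};

/* Comparison callback shared by qsort and bsearch: order entries by word. */
static int compare_entries(const void *a, const void *b)
{
    const struct Entry *ea = a;
    const struct Entry *eb = b;
    return strcmp(ea->word, eb->word);
}

/* Sort the entries alphabetically by word. */
void sort_entries(struct Entry *entries, size_t count)
{
    qsort(entries, count, sizeof entries[0], compare_entries);
}

/* Binary search in a sorted array; returns the explanation or NULL. */
const char *lookup(const struct Entry *entries, size_t count, const char *word)
{
    struct Entry key = { word, NULL };
    const struct Entry *hit =
        bsearch(&key, entries, count, sizeof entries[0], compare_entries);
    return hit ? hit->explanation : NULL;
}
```

The callback receives pointers to the array elements, not the elements themselves, which is the part people usually trip over. Using the same callback for both calls guarantees the search order matches the sort order.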
If your database is too big to hold in memory, you should use a trie that holds just the beginnings of the words in the database; for each beginning of a word, have a file offset at which to start scanning the file.
char* find_explanation_in_file(FILE *f, long offset, char *word)
{
fseek(f, offset, SEEK_SET);
// 100 should be greater than max line in file; static so the returned pointer stays valid after return
static char line[100];
while (fgets(line, sizeof(line), f))
{
char *word_in_file = strtok(line, " ");
char *explanation = strtok(NULL, "");
if (word_in_file == NULL)
continue; // empty or all-space line
int result = strcmp(word_in_file, word);
if (result == 0)
return explanation;
else if (result > 0)
break;
}
return NULL; // not found
}
I think what you need is fseek().
1) Pre-process the database file as follows. Find the positions of all the '\n' (newline) characters and store the offset that follows each one in an array, say a, so that you know the ith line starts at the a[i]th character from the beginning of the file.
2) fseek() is a library function declared in stdio.h. So, when you need to process an input string, just start from the beginning of the file and check the first word only at the positions stored in the array a. To do that:
fseek(inFile , a[i] , SEEK_SET);
and then
fscanf(inFile, "%s %s %s", yourFirstWordHere, secondWord, thirdWord);
for checking the ith line.
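Or you could seek relative to the current position instead. Note that after the fscanf() call the position is no longer a[i-1], so the offset has to be computed from ftell():
fseek(inFile, a[i] - ftell(inFile), SEEK_CUR);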
Explanation: What fseek() does is, it sets the read/write position indicator associated with the file at the desired position. So, if you know at which point you need to read or write, you can just go there and read directly or write directly. This way, you won't need to read whole lines just to get first three words.
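A rough sketch of both steps follows. The function names index_lines and first_word_of_line are invented for illustration, and the 63-character word limit is an assumption:

```c
#include <stdio.h>

/* Record the offset at which each line starts. Fills at most max entries of
   offsets and returns how many line starts were stored. */
size_t index_lines(FILE *fp, long *offsets, size_t max)
{
    size_t n = 0;
    int c, at_line_start = 1;

    rewind(fp);
    for (;;) {
        long pos = ftell(fp);            /* position of the character about to be read */
        if ((c = fgetc(fp)) == EOF)
            break;
        if (at_line_start && n < max)
            offsets[n++] = pos;
        at_line_start = (c == '\n');     /* the next character begins a new line */
    }
    return n;
}

/* Jump to line i using the index and read its first word (up to 63 chars).
   Returns 1 on success, 0 otherwise. */
int first_word_of_line(FILE *fp, const long *offsets, size_t i, char word[64])
{
    if (fseek(fp, offsets[i], SEEK_SET) != 0)
        return 0;
    return fscanf(fp, "%63s", word) == 1;
}
```

Calling ftell for every character is not the fastest way to build the index, but it keeps the offsets valid for fseek even in text mode. One caveat: on an empty line, %s skips ahead and returns the first word of the next non-empty line.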