does the read call in linux add a newline at EOF? - c

why does read() on a file in linux add a newline character at EOF even if the file really does not have a newline character ?
my file data is :
1hello2hello3hello4hello5hello6hello7hello8hello9hello10hello11hello12hello13hello14hello15hello
my read() call on this file should hit EOF after reading the last 'o' in "15hello". I use the below :
while( (n = read(fd2, src, read_size-1)) != 0) // read_size = 21
{
//... some code
printf("%s",src);
//... some code
}
where fd2 is the file's descriptor. On the last iteration, n was 17 and I had src[16] = '\n'. So, does the read call in Linux add a newline at EOF?

does the read call in linux add a newline at EOF?
No.
Your input file likely has a terminating newline in it - most well-formatted text files do, so multiple files can be concatenated without lines running together.
You could also be running into a stray newline character that was already in your buffer, because read() does not terminate the data read with a NUL character to create an actual C-style string. And I'd guess your code doesn't either, else you would have posted it. Which means your
printf("%s",src);
is quite likely undefined behavior.
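A minimal sketch of one way to avoid that undefined behavior, assuming a hypothetical helper named read_chunk (it is not part of the code in the question). It asks read() for one byte less than the buffer holds and stores the terminating NUL itself, so the buffer can be printed with %s:

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap - 1 bytes from fd into buf and
 * NUL-terminate the result so it can be passed to printf("%s", ...).
 * Returns what read() returned: bytes read, 0 at EOF, -1 on error. */
ssize_t read_chunk(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    ssize_t n = read(fd, buf, cap - 1);

    /* read() never writes a terminator itself, so add one here. */
    buf[n > 0 ? n : 0] = '\0';
    return n;
}
```

The loop in the question would then become while ((n = read_chunk(fd2, src, read_size)) > 0) printf("%s", src);. Testing for > 0 rather than != 0 also ends the loop when read() fails with -1. If the data may contain NUL bytes, fwrite(src, 1, n, stdout) is the safer way to print it.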

Why does read() on a file in Linux add a newline character at EOF even if the file really does not have one?
No, the read() system call doesn't add a newline at the end of the file.
You are probably seeing this behavior because you created the text file with vi; by default, vi appends a newline to the last line of any file it writes.
You can check this on your system by creating a short text file with vi and then running wc -c on it: the byte count includes the trailing newline.
You can also read the file's data with a single read() call if you know the file size (get it with the stat() system call), which avoids the while loop. Note that read() may still return fewer bytes than requested, so check its return value.
This
while( (n = read(fd2, src, read_size-1)) != 0) {
/* some code */
}
Change to
struct stat var;
stat(filename, &var); /* check the return value of stat(); var now holds the file's metadata */
off_t size = var.st_size;
Now that you have the file size, allocate a heap or stack buffer of that size (plus one byte if you want to NUL-terminate it) and read the data into it.
char *ptr = malloc(size + 1);
Now read all the data at once:
read(fd,ptr,size); /* ptr now holds the file contents; check the return value */
Finally, once the work is done, don't forget to release the buffer by calling free(ptr).
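The stat-then-read approach can be sketched as follows, assuming a hypothetical helper read_whole_file (the name and signature are illustrative, not from the answer). It calls fstat() on the already open descriptor instead of stat() on the name, and it loops because a single read() is allowed to return fewer bytes than requested:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: load a whole regular file into a malloc'd,
 * NUL-terminated buffer. Returns NULL on failure; the caller frees it. */
char *read_whole_file(const char *path, size_t *out_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    char *buf = malloc(size + 1);        /* +1 for the terminating NUL */
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    size_t got = 0;
    while (got < size) {
        ssize_t n = read(fd, buf + got, size - got);
        if (n < 0) {                     /* I/O error */
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0)                      /* file is shorter than st_size said */
            break;
        got += (size_t)n;
    }
    close(fd);

    buf[got] = '\0';                     /* read() does not terminate the data */
    if (out_len != NULL)
        *out_len = got;
    return buf;
}
```

The bytes returned are exactly the bytes in the file: if the last byte is '\n', the file itself ends with a newline.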

Related

fopen failing on variable filepath

This function is passed the path of a text file(mapper_path) which contains paths to other text files on each line. I am supposed to open the mapper_path.txt file, then open and evaluate each of the paths within it (example in output).
fopen succeeds on the mapper_path file but fails on the paths which it contains.
In the failure condition, it prints the EXACT path I'm trying to open.
I'm working in C on windows and running commands on Ubuntu subsystem.
How can I properly read and store the sub-path into a variable to open it?
SOLVED with Rici's suggestion!
int processText(char * mapper_path, tuple * letters[])
{
char line[LINE_SIZE];
char txt_path[MAX_PATH];
FILE * mapper_fp = fopen(mapper_path, "r");
if(!mapper_fp)
{
printf("Failed to open mapper path: %s \n", mapper_path);
return -1;
}
//!!! PROBLEM IS HERE !!!
while(fgets(txt_path, MAX_PATH, mapper_fp))
{
//remove newline character from end
txt_path[strlen(txt_path)-1] = 0;
//open each txt file path, return -1 if it fails
FILE* fp = fopen(txt_path, "r");
if(!fp)
{
printf("Failed to open file path:%s\n", txt_path);
return -1;
}
//...more unimportant code
prints:
Failed to open filepath:
/mnt/c/users/adam/documents/csci_4061/projects/blackbeards/testtext.txt
This is the exact path of the file i am trying to open.
I suspect that the problem is related to this:
I'm working in C on windows and running commands on Ubuntu subsystem.
Presumably, you created the mapper.txt file using Windows tools, so it has Windows line endings. However, I think the Ubuntu subsystem does not know about Windows line endings, and so even though you open the file in mode 'r', it does not translate CR-LF into a single \n. When you then remove the \n at the end of the input, you still leave the \r.
That \r won't be visible when you print out the line, since all it does is move the cursor to the beginning of the line and the next character output is a \n. It's usually a good idea to surround strings with other text when you print debugging messages, since that can give you a clue about this sort of problem. If you'd used:
printf("Failed to open file path: '%s'\n", txt_path);
you might have seen the error:
'ailed to open filepath: '/mnt/c/users/adam/documents/csci_4061/projects/blackbeards/testtext.txt
Here, the hint that there is a \r at the end of the string is the overwriting of the first character of the message with the trailing apostrophe.
It's not quite accurate to say that fgets "adds a \n character to the end [of the line read]." It's more accurate to say that it doesn't remove that character, if it is present. It is quite possible that there isn't a newline at the end of the line. The line may be the last line in a text file which doesn't end with a newline character, for example. Or the fgets might have been terminated by reaching the character limit you supplied, rather than by finding a newline character.
So you are certainly better off using the getline interface, which has two advantages: (a) it allocates storage for the line itself, so you don't need to guess a maximum length in advance, and (b) it tells you exactly how many characters it read, so you don't have to count them.
Using that information, you can then remove a \n which happens to be at the end of the line, if there is one, and then remove the preceding \r, if there is one:
char* line = NULL;
size_t n_line = 0;
for (;;) {
    ssize_t n_read = getline(&line, &n_line, mapper_fp);
    if (n_read < 0) break; /* EOF or some kind of read error */
    /* Drop a trailing \n, then a trailing \r, shrinking the length each time */
    if (n_read > 0 && line[n_read - 1] == '\n')
        line[--n_read] = 0;
    if (n_read > 0 && line[n_read - 1] == '\r')
        line[--n_read] = 0;
    if (n_read == 0) continue; /* blank line */
    /* Handle the line read */
}
if (ferror(mapper_fp))
    perror("Error reading mapper file");
free(line);

Reading \n as really Feed Line character from text file in C

I'm trying to read a text file in C. The text file is a simple language file used on an embedded device, and EACH LINE of the file has a corresponding ENUM value on the code side. Here is a small part of my file:
SAMPLE FROM TEXT FILE :
OPERATION SUCCESS!
OPERATION FAILED!\nRETRY COUNT : %d
ENUM :
typedef enum
{
...
MESSAGE_VALID_OP,
MESSAGE_INVALID_OP_WITH_RETRY_COUNT
...
}
Load Strings :
typedef struct
{
char *str;
} Message;
int iTotalMessageCount = 1012;
void vLoadLanguageStrings()
{
FILE *xStringList;
char * tmp_line_message[256];
size_t len = 0;
ssize_t read;
int message_index = 0;
xStringList = fopen("/home/change/strings.bin", "r");
if (xStringList == NULL)
exit(EXIT_FAILURE);
mMessages = (Message *) malloc(iTotalMessageCount * sizeof(Message));
while ((read = fgets(tmp_line_message, 256, xStringList)) != -1 && message_index < iTotalMessageCount)
{
mMessages[message_index].str = (char *) malloc(strlen(tmp_line_message));
memcpy(mMessages[message_index].str, tmp_line_message, strlen(tmp_line_message) -1);
message_index++;
}
fclose(xStringList);
}
As you see in the sample from the text file, I have to use the \n line feed character on some of my lines. After all, I read the file successfully. But if I try to display a text which contains \n, the line feed character is just printed on the device screen as the \ and n characters.
I already tried the getline(...) method. How can I handle the \n character without raising the complexity, while still reading the file line by line?
As you see in the sample from the text file, I have to use the \n line feed
character on some of my lines.
No, I don't see that. Or at least, I don't see you doing that. The two-character sequence \n is significant primarily to the C compiler; it has no inherent special significance in data files, whether those files are consumed by a C program or not.
Indeed, if the system recognizes line feeds as line terminators, then by definition, it is impossible to embed a literal line feed in a physical line. What it looks like you are trying to do is to encode line feeds as the "\n" character sequence. That's fine, but it's quite a different thing from embedding a line feed character itself.
But after all, I read the file successfully.
But if I try to display a text which contains \n, the line feed
character is just printed on the device screen as the \ and n characters.
Of course. Those are the characters you read in (not a line feed), so if you write them back out then you reproduce them. If you are encoding line feeds via that character sequence, then your program must decode that sequence if you want it to output literal line feeds in its place.
I already tried the getline(...) method. How can I handle the \n character
without raising the complexity, while still reading the file line by line?
You need to process each line read to decode the \n sequences in it. I would write a function for that. Either way, however, your program will become more complex, because the current version simply doesn't do all the things it needs to do.
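Such a decoding function might look like the sketch below. The name decode_escapes and the choice to also treat a doubled backslash as one literal backslash are assumptions made for illustration; the question only requires the \n case. The string is rewritten in place, which is safe because the output never grows longer than the input:

```c
#include <stddef.h>

/* Hypothetical helper: rewrite s in place, turning the two-character
 * sequence backslash + 'n' into a single line feed and backslash +
 * backslash into one backslash. Anything else is copied unchanged.
 * Returns the new length of the string. */
size_t decode_escapes(char *s)
{
    char *out = s;

    for (const char *in = s; *in != '\0'; in++) {
        if (in[0] == '\\' && in[1] == 'n') {
            *out++ = '\n';               /* encoded line feed */
            in++;                        /* skip the 'n' as well */
        } else if (in[0] == '\\' && in[1] == '\\') {
            *out++ = '\\';               /* escaped backslash */
            in++;
        } else {
            *out++ = *in;                /* ordinary character */
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Calling it on each line right after the line is read, and before the line is copied into mMessages, keeps the rest of the loading code unchanged.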

C Program unable to create output text file

A friend of mine needs to use MATLAB for one of his classes, so he called me up (a Computer Science Major) and asked if I could teach him C. I am familiar with C++, so I am also familiar with the general syntax, but had to read up on the IO library for C.
I was creating some simple IO programs to show my friend, but my third program is causing me trouble. When I run the program on my machine using Eclipse (with the CDT) Eclipse's console produces a glitchy output where instead of prompting me for the data, it gets the input and then prints it all at once with FAILURE.
The program is supposed to get a filename from user, create the file, and write to it until the user enters a blank line.
When I compile/run it on my machine via console (g++ files2.c) I am prompted for the data properly, but FAILURE shows up, and there is no output file.
I think the error lies with how I am using the char arrays, since using scanf to get the filename will create a functional file (probably since it ignores whitespace), but not enter the while loop.
#include <stdio.h>
#define name_length 20
#define line_size 80
int main() {
FILE * write_file; // pointer to file you will write to
char filename[name_length]; // variable to hold the name of file
char string_buffer[line_size]; // buffer to hold your text
printf("Filename: "); // prompt for filename
fgets(filename, name_length, stdin); // get filename from user
if (filename[name_length-1] == '\n') // if last char in stream is newline,
{filename[name_length-1] = '\0';} // remove it
write_file = fopen(filename, "w"); // create/overwrite file user named
if (!write_file) {printf("FAILURE");} // failed to create FILE *
// inform user how to exit
printf("To exit, enter a blank line (no spaces)\n");
// while getting input, print to file
while (fgets(string_buffer, line_size, stdin) != NULL) {
fputs(string_buffer, write_file);
if (string_buffer[0] == '\n') {break;}
}
fclose(write_file);
return 0;
}
How should I go about fixing the program? I have found next to nothing on user-terminated input being written to file.
Now if you will excuse me, I have a couple of files to delete off of my University's UNIX server, and I cannot specify them by name since they were created with convoluted filenames...
EDIT------
Like I said, I was able to use
scanf("%s", filename);
to get a working filename (without the newline char). But regardless of if I use scanf or fgets for my while loop, if I use them in conjunction with scanf for the filename, I am not able to write anything to file, as it does not enter the while loop.
How should I restructure my writing to file and my while loop?
Your check for the newline is wrong; you're looking at the last element of the filename array, but the newline sits earlier than that whenever the user enters a filename shorter than the maximum. You're then trying to open a file that has a newline in its name.
These lines seem to be incorrect:
if (filename[name_length-1] == '\n') // if last char in stream is newline,
{filename[name_length-1] = '\0';} // remove it
You check the character at index name_length - 1, which is 19 in your case, without any regard for the length of the name that was actually entered. fgets() stores the newline immediately after the last character typed, so for any name shorter than the buffer the '\n' is never at index 19 and is never replaced. The file name you pass to fopen() then still ends in '\n'.
You need to get the length of your file name first, for example with strlen().
if (filename[strlen(filename) - 1] == '\n')
{
filename[strlen(filename) - 1] = '\0';
}
(Don't forget to include the string.h header)
I hope I was able to help with my weak english.
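As an alternative sketch, strcspn() can find the line ending without any index arithmetic. The helper name strip_line_ending is hypothetical. Unlike indexing with strlen(filename) - 1, this is also safe when the string is empty, and it removes a Windows-style "\r\n" as well:

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n'.
 * strcspn() returns the length of the prefix that contains neither
 * character, so when there is no line ending the write lands on the
 * existing terminator and changes nothing. */
void strip_line_ending(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

In the program from the question, calling strip_line_ending(filename) right after the fgets() call would replace the two-line check.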

C, unix and overwriting a char with write(), open() and lseek()

I need to replace a character in a text file with '?'. It's not working as expected.
The file has contents 'abc' (without quotes) and I've got to use the unix system calls: lseek(), open() and write(). I can't use the standard C file I/O functions.
The plan is to eventually expand this into a more generalised "find and replace" utility.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
int main(){
int file = open("data", O_RDWR); //open file with contents 'abc'
lseek(file,0,0); //positions at first char at beginning of file.
char buffer;
read(file,&buffer, sizeof(buffer));
printf("%c\n", buffer); // text file containing 'abc', it prints 'a'.
if (buffer == 'a'){
char copy = '?';
write(file,&copy,1); //text file containing 'abc' puts '?' where 'b' is.
}
close(file);
}
The file "data" contains abc, i want to replace a with ? and make it ?bc but i'm getting a?c
read() is reading the right char, but write() is writing to the next char.
Why is this?
Been searching google for hours.
Thanks
The answer is actually embedded in your own code, in a way.
The lseek call you do right after open is not required because when you first open a file the current seek offset is zero.
After each successful read or write operation, the seek offset moves forward by the number of bytes read/written. (If you add O_APPEND to your open the seek offset also moves just before each write, to the current-end-of-file, but that's not relevant at this point.)
Since you successfully read one byte, your seek offset moves from 0 to 1. If you want to put it back to 0, you must do that manually.
(You should also check that each operation actually succeeds, of course, but I assume you left that out for brevity here.)
Your call to read() moves the file pointer forward one byte - i.e. from 0 to 1. Since you're using the same file descriptor ("int file = ...") for reading and writing, the position is the same for reading and writing.
To write over the byte that was just read, you need to lseek() back one byte after
(buffer == 'a')
evaluates to true.
lseek is in the wrong place. Once 'a' has been found it writes '?' in the next available spot (which happens to overwrite 'b'). To fix, you need to change the current position using lseek BEFORE you write.
if (buffer == 'a'){
char copy = '?';
lseek(file,0,SEEK_SET); //reposition at the first char, at the beginning of the file.
write(file,&copy,1); //the file containing 'abc' now becomes '?bc'.
}
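For the more general find-and-replace mentioned in the question, seeking back relative to the current position works at any offset, not just the start of the file. The sketch below assumes a hypothetical helper replace_bytes_in_file; it reads one byte at a time for clarity, which is slow on large files:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite every byte equal to `from` with `to`.
 * Returns the number of replacements, or -1 on any error. */
long replace_bytes_in_file(const char *path, char from, char to)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    long count = 0;
    char c;
    ssize_t n;

    while ((n = read(fd, &c, 1)) == 1) {
        if (c != from)
            continue;
        /* read() moved the offset past the match, so step back one byte. */
        if (lseek(fd, -1, SEEK_CUR) == (off_t)-1 ||
            write(fd, &to, 1) != 1) {    /* write() advances the offset again */
            close(fd);
            return -1;
        }
        count++;
    }

    close(fd);
    return n < 0 ? -1 : count;
}
```

With the file "data" containing abc, replace_bytes_in_file("data", 'a', '?') leaves ?bc in the file and returns 1.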

Reading data from stdin in C

I'm trying to read from stdin and output the data, things work, EXCEPT that it's not outputting the new incoming data. I'm not quite sure where is the issue. I'm guessing it has something to do when determining the stdin size. Any help would be greatly appreciated! Thanks
tail -f file | my_prog
Updated
#include <stdio.h>
#include <sys/stat.h>
long size(FILE *st_in) {
struct stat st;
if (fstat(fileno(st_in), &st) == 0)
return st.st_size;
return -1;
}
int main (){
FILE *file = stdin;
char line [ 128 ];
while ( fgets ( line, sizeof line, file ) != NULL )
fputs ( line, stdout ); /* write the line */
long s1, s2;
s1 = size(file);
for (;;) {
s2 = size (file);
if (s2 != s1) {
if (!fseek (file, s1, SEEK_SET)) {
while ( fgets ( line, sizeof line, file ) != NULL ) {
fputs ( line, stdout ); /* write the line */
}
}
s1 = s2;
usleep(300000);
}
}
return 0;
}
Edit: Fixed!
After a FILE * has reached EOF, it stays in a state where it will read no more data until you clear the 'EOF' bit either with clearerr() or with fseek(). However, if standard input is connected to a terminal, then that is not a seekable device, so instead of clearing the error, it might not do anything useful:
POSIX says:
The behavior of fseek() on devices which are incapable of seeking is implementation-defined.
Your loop entry condition is suspect; you need to sleep before starting it, and you need to sleep on each iteration. Indeed, normally you write tail -f without worrying about the file size; you sleep, try to read until the next 'EOF', reset the file EOF indicator, and repeat. Note, too, that the size of a pipe or terminal is not defined.
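The sleep, read until EOF, reset, repeat cycle can be sketched as below. The helper names drain_stream and follow_stream, and the max_idle parameter that lets the loop stop, are assumptions added for illustration. Polling like this matters when the stream is a regular file that keeps growing; on a pipe, fgets() simply blocks until more data arrives or the writer closes it.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copy everything currently readable from `in` to
 * `out`, then clear the EOF indicator so a later call can pick up data
 * that arrives afterwards. Returns the number of bytes copied. */
size_t drain_stream(FILE *in, FILE *out)
{
    char line[128];
    size_t copied = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        copied += strlen(line);
    }
    fflush(out);
    clearerr(in);                 /* reset EOF (and error) for the next poll */
    return copied;
}

/* Hypothetical polling loop in the style of tail -f. It gives up after
 * max_idle consecutive empty polls; a negative max_idle polls forever. */
void follow_stream(FILE *in, FILE *out, int max_idle)
{
    int idle = 0;

    while (max_idle < 0 || idle < max_idle) {
        if (drain_stream(in, out) > 0) {
            idle = 0;             /* new data arrived, start counting again */
        } else {
            idle++;
            usleep(300000);       /* same delay as the code in the question */
        }
    }
}
```

No file size is consulted anywhere: the only state is the stream's own position, which is why this also works where fstat() reports no meaningful size.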
Separately, it is unconventional to call a FILE * argument to a function filename; it has completely the wrong connotations. A filename is a string.
This is not really standard C:
size(file);
Call stat() to get file information - organization type of a file, file size and permissions.
What your code does is to eventually set the file pointer to the end of the file, as it tries to read through it. Consider stat() (or fstat() on an open file) instead.
rewind() resets the file pointer to the start of the file, fseek() will place it anywhere you need.
tail -f repeatedly retries the read at the EOF point, with a short sleep between tries. It does not consider EOF to be an error. It remembers the current file offset at EOF, then calls fseek() with SEEK_END followed by ftell(), and compares the offsets. If they differ, it calls fseek() to return to the last known end point and reads the new data.
This description is from old unix source. I'm sure it has been tweaked since then.
