C reading formatted text from file

I have a question about a C program that I'm writing. The assignment begins like this:
"Process P takes as its argument the path of a file in which every line must be 16 characters long (including the end of line), and every line must start with "WAIT" or "NOWAIT" followed by a command."
An example input file is:
WAIT ls
NOWAIT who
WAIT date
This is my code so far:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#define MIN_SIZE 5
#define ROW_LEN 17
int main (int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Program usage: %s file_path.\n", argv[0]);
        exit(1);
    }
    int fd = open(argv[1], O_RDONLY);
    struct stat fd_info;
    if(fd < 0) {
        perror("Error opening file");
        exit(2);
    }
    fstat(fd, &fd_info);
    if(fd_info.st_size <= MIN_SIZE) {
        printf("Size of file '%s' is less or equal than 5 bytes.\n", argv[1]);
        exit(3);
    }
    char buf[ROW_LEN];
    buf[ROW_LEN - 1] = '\0';
    while ((read(fd, buf, ROW_LEN - 1)) > 0) {
        char type[ROW_LEN], cmd[ROW_LEN];
        sscanf(buf, "%s %s", type, cmd);
        printf("type=%s; command=%s;\n", type, cmd);
    }
    return 0;
}
This reads correctly only if, in file.txt, I pad every row with spaces until each line is 15 characters long (otherwise it starts reading into the next line too). But the file the professor gave us has no padding spaces. So my question is: how can I read the file correctly? I don't understand what "every line must have 16 characters included end of line" means.
Thanks to all, I hope I explained the question well!

First, treat that sentence as a constraint to check on every line: the input can come from anyone, so anything can appear in it and you have to handle the errors that result.
You are off to a good start.
Read the whole file line by line, and after reading each line check whether it is valid.
You can use getline to read the file easily, and strlen and strcmp to check that each line conforms.
Finally, "included the end of line" means that the 16 characters include the line terminator ('\n' in the file), so the visible part of each line can be at most 15 characters long.
For example, if the maximum length is 3 including the end of line:
"abc" : incorrect, because the line is really {'a', 'b', 'c', '\n'};
"ab" : correct, because the line is really {'a', 'b', '\n'};
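The line-by-line check described above could be sketched roughly as follows. This is only an illustration under some assumptions: it uses fgets rather than getline so that it needs nothing beyond standard C, the helper names parse_row and process_rows are made up for this example, and it treats the command as a single word, as in the sample file.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ROW 16  /* maximum line length, newline included (from the assignment) */

/* Hypothetical helper: splits one line into its type ("WAIT"/"NOWAIT") and
 * its command. Returns 1 if the line is valid, 0 otherwise.
 * `type` and `cmd` must each have room for MAX_ROW bytes. */
int parse_row(const char *line, char *type, char *cmd)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n') {
        if (len > MAX_ROW)          /* newline counts towards the 16 */
            return 0;
    } else if (len > MAX_ROW - 1) { /* unterminated line: 15 visible chars at most */
        return 0;
    }

    /* %15s limits each field to 15 characters plus the terminating NUL */
    if (sscanf(line, "%15s %15s", type, cmd) != 2)
        return 0;
    return strcmp(type, "WAIT") == 0 || strcmp(type, "NOWAIT") == 0;
}

/* Reads `fp` line by line, prints every valid row and returns the number
 * of invalid lines. */
int process_rows(FILE *fp)
{
    char buf[MAX_ROW + 2];  /* one spare char to detect overlong lines, plus NUL */
    char type[MAX_ROW], cmd[MAX_ROW];
    int bad = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (parse_row(buf, type, cmd))
            printf("type=%s; command=%s;\n", type, cmd);
        else
            bad++;

        /* If the line did not fit, discard the rest of it so the leftover
         * is not mistaken for a new line. */
        if (strchr(buf, '\n') == NULL) {
            int ch;
            while ((ch = fgetc(fp)) != EOF && ch != '\n')
                ;
        }
    }
    return bad;
}
```

In the program from the question, the read loop could then be replaced by opening the file with fopen(argv[1], "r") and calling process_rows on the resulting stream.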

Related

how to compare number of rows in redirected text file C

#include <stdio.h>
#include <stdlib.h>
#define BUFFERSIZE 10
int main(int argc, char *argv[])
{
    char address[BUFFERSIZE];
    //checking text file on stdin which does not work
    if (fgets(address, BUFFERSIZE, stdin) < 42)
    {
        fprintf(stderr, "The program needs at least 42 addresses for proper functionality.");
    }
    //while reads the redirected file line by line and print the content line by line
    while(fgets(address, BUFFERSIZE, stdin) != NULL)
    {
        printf("%s", address);
    }
    return 0;
}
Hi, this is my code. It does not work. The problem is that I have an external file, adresy.txt, redirected into stdin, and I need to check whether the file has the required number of rows.
The minimum number of rows the file must have is 42. If it has 42 or more rows the program can continue; if not, it reports the error with fprintf(stderr, "The program needs at least 42 addresses for proper functionality.");
I tried it this way: if (fgets(address, BUFFERSIZE, stdin) < 42), but the compiler tells me that I cannot compare a pointer and an integer,
like so: warning: comparison between pointer and integer
Later in the code I will compare the user's arguments with what is in adresy.txt, which is why I need argc and *argv[], but for now I need to solve this.
Any advice on how to fix it? Thanks for any help.
There are several problems in your code:
#define BUFFERSIZE 10 is odd, as your lines must be at least 42 characters long.
You compare the pointer returned by fgets to 42, which is nonsense; BTW, your compiler warned you about it.
The fgets call in the if statement consumes the first chunk of input, so the loop that follows never prints it.
You probably want this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFERSIZE 200 // maximum length of one line
int main(int argc, char *argv[])
{
    char address[BUFFERSIZE];
    while(fgets(address, BUFFERSIZE, stdin) != NULL)
    {
        // here the line has been read
        if (strlen(address) < 42)
        {
            // if the length of the string read is < 42, inform user and stop
            fprintf(stderr, "The program needs at least 42 addresses for proper functionality.");
            exit(1);
        }
        // otherwise print line
        printf("%s", address);
    }
    return 0;
}
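Note that the question text asks for at least 42 rows, whereas the code above checks that each line is at least 42 characters long. If the row count is the real requirement, one possible approach is to count the lines first. The following is a minimal sketch, assuming stdin is redirected from a regular file so that rewind() works (it would not work on a pipe or a terminal); count_lines and has_enough_rows are hypothetical helper names.

```c
#include <stdio.h>

#define MIN_ROWS 42

/* Hypothetical helper: counts the lines in `fp` from the current position
 * to end of file. A final line without a trailing newline still counts. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int ch, last = '\n';

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            lines++;
        last = ch;
    }
    if (last != '\n')   /* unterminated last line */
        lines++;
    return lines;
}

/* Returns 1 if `fp` holds at least MIN_ROWS lines, 0 otherwise, and rewinds
 * the stream so the caller can read the lines afterwards. */
int has_enough_rows(FILE *fp)
{
    long n = count_lines(fp);
    rewind(fp);
    return n >= MIN_ROWS;
}
```

In main, the check would go before the fgets loop: if has_enough_rows(stdin) returns 0, print the error message and exit.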

C: Parse a file to obtain number of columns while reading by mmap

I have a file such as the following:
1-3-5 2 1
2 3-4-1 2
4-1 2-41-2 3-4
I want to return the number of columns in this file. I am reading the file with mmap in C. I have been trying to do this with strtok(), but without success so far. This is just a test file; my original file is in the GB range.
pmap = mmap(0,mystat.st_size,PROT_READ|PROT_WRITE,MAP_PRIVATE,fd,0);
char *start = pmap;
char *token;
token = strtok(start, "\t");
while (token != NULL){
    printf("%s \n",token);
    token = strtok(NULL, "\t");
    col_len++;
}
I have been trying something on these lines, but, obviously there is a logical error. I am getting the following output:
number of cols = 1
Although, the # of cols should be 3.
It'd be great if you guys can help with any idea on how to parse this kind of a file using mmap.
I am using mmap because of faster execution for a single pass over the file.
It is hard to provide a definitive answer without a definitive question; as written, the question does not contain complete code, does not show the precise input, and does not show the debugging output.
But it is possible to provide some suggestions based on the non-applicability of strtok to this problem.
(strtok modifies its first argument, so it is really not a good idea to use it with an mmaped resource. However, that is not directly relevant to the problem you are having.)
You should ensure that the columns in the file are really separated by tabs. It seems to me most likely that the file contains spaces, not tabs, which is why the program reports that the entire file contains one column. If this were the only problem, you could call strtok with the second argument " \t" rather than "\t". But remember that strtok combines successive delimiters into a single separator so if the file is tab-separated and there are empty fields, strtok will not report the empty fields.
Related to the phrase "entire file" above, you do not tell strtok to recognize a newline character as terminating a token. So the strtok loop will try to analyze the entire file, counting the last field of each line as part of the same token as the first field of the next line. That is surely not what you want.
However, strtok overwrites the column delimiter that it finds, so if you did fix the strtok calls to include \n as a delimiter character, you would no longer be able to tell where the lines ended. That is probably important to your code, and it is a key reason why strtok is not an appropriate tool in this case. The GNU strtok manpage (man strtok, emphasis added) provides a warning about this very issue (in the BUGS section at the end):
Be cautious when using these functions. If you do use them, note that:
These functions modify their first argument.
These functions cannot be used on constant strings.
The identity of the delimiting byte is lost.
There is no guarantee that a file ends with a NUL character. In fact, the file is very unlikely to contain a NUL character, and it is undefined behaviour to reference bytes in the mmap'ed region which are not in the file, but in practice most OSs will mmap an integral number of pages, zero-filling the last page. So 4095 times out of 4096, you will not notice this problem, and the 4096th time when the file is precisely an integral number of pages, your program will crash and burn, along with whatever sensitive equipment it is controlling. This is another reason strtok should never be used on mmaped files.
My comment was actually not correct: as you use MAP_PRIVATE, you don't risk destroying your file. But still, if you modify the memory area, the touched pages are copied, and you probably don't want this overhead; otherwise you could just copy the file to RAM from the beginning. So I'd still say: don't use strtok() here.
A solution with your own loop based on the functions in <ctype.h> is quite simple, though. As I wanted to try it myself, here is a working program to demonstrate it (the relevant part is the countCols() function):
#define _POSIX_C_SOURCE 200112L
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int countCols(const char *line, size_t maxlen)
{
    int cols = 0;   /* columns found so far */
    int incol = 0;  /* nonzero while inside a column */
    const char *c = line;
    /* Scan the first line only: stop at the first whitespace character
       that is not a blank (e.g. '\n'), or after maxlen bytes. The casts
       avoid undefined behaviour for negative char values. */
    while (maxlen && (!isspace((unsigned char)*c) || isblank((unsigned char)*c)))
    {
        if (isblank((unsigned char)*c))
        {
            incol = 0;
        }
        else
        {
            if (!incol)
            {
                incol = 1;
                ++cols;
            }
        }
        ++c;
        --maxlen;
    }
    return cols;
}
int main(int argc, char **argv)
{
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s [file]\n", argv[0]);
        return EXIT_FAILURE;
    }
    struct stat st;
    if (stat(argv[1], &st) < 0)
    {
        fprintf(stderr, "Could not stat `%s': %s\n", argv[1],
                strerror(errno));
        return EXIT_FAILURE;
    }
    int dataFd = open(argv[1], O_RDONLY);
    if (dataFd < 0)
    {
        fprintf(stderr, "Could not open `%s': %s\n", argv[1],
                strerror(errno));
        return EXIT_FAILURE;
    }
    char *data = mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, dataFd, 0);
    if (data == MAP_FAILED)
    {
        /* report first, so that close() cannot clobber errno */
        fprintf(stderr, "Could not mmap `%s': %s\n", argv[1],
                strerror(errno));
        close(dataFd);
        return EXIT_FAILURE;
    }
    close(dataFd); /* the mapping stays valid after the descriptor is closed */
    int cols = countCols(data, st.st_size);
    printf("found %d columns.\n", cols);
    munmap(data, st.st_size);
    return EXIT_SUCCESS;
}
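The program above only looks at the first line. If every line of the mapping needs to be checked, the line boundaries can be found with memchr, which takes an explicit length and therefore needs no terminating NUL and never writes to the mapping. The following is a rough sketch of that idea, not part of the original answer; count_cols and uniform_cols are hypothetical names, and "column" is assumed to mean a run of non-whitespace characters.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Counts whitespace-separated columns in the `len` bytes at `line`. */
int count_cols(const char *line, size_t len)
{
    int cols = 0, incol = 0;

    for (size_t i = 0; i < len; i++) {
        if (isspace((unsigned char)line[i])) {
            incol = 0;
        } else if (!incol) {
            incol = 1;
            ++cols;
        }
    }
    return cols;
}

/* Hypothetical helper: walks every line of a read-only buffer (for example
 * an mmap'ed file) without modifying it. Returns the column count if all
 * non-blank lines agree, -1 if they differ, and 0 if there are no columns. */
int uniform_cols(const char *data, size_t len)
{
    int expected = -1;
    const char *p = data;
    const char *end = data + len;

    while (p < end) {
        /* memchr is bounded by the remaining length, so no NUL is needed */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t line_len = nl ? (size_t)(nl - p) : (size_t)(end - p);
        int cols = count_cols(p, line_len);

        if (cols > 0) {                 /* blank lines are skipped */
            if (expected < 0)
                expected = cols;
            else if (cols != expected)
                return -1;
        }
        p += line_len + (nl ? 1 : 0);   /* step over the newline, if any */
    }
    return expected < 0 ? 0 : expected;
}
```

With the mapping from the program above, the call would be uniform_cols(data, st.st_size).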

read system call doesn't detect end of file

I'm trying to create a function that reads an entire file using a specific read size that can change at any time, but the read system call doesn't seem to store the characters properly in the buffer. So far I'm only trying to print until the end of the file, like this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
# define READ_SIZE (42)
int main(int argc, char **argv)
{
    int fd;
    int rd;
    char *buffer;
    buffer = malloc(READ_SIZE);
    fd = open(argv[1], O_RDONLY);
    while ((rd = read(fd, buffer, READ_SIZE)) > 0)
    {
        printf("%s", buffer);
    }
    return (0);
}
This is the file that I'm trying to read:
test1234
test123
test1
test2
test3
test4
test
This is the output of my program:
test123
test12
test1
test2
test3
test4
testest123
test12
test1
test2
test3
test4
tes
I can only use malloc and read to handle this (open is only for testing), and I don't understand why it does this. Usually read returns the number of bytes read from the file, and 0 when it reaches the end of the file, so it's a bit weird to see this.
The character array being printed lacks a terminating null character. This is undefined behaviour (UB) with "%s".
printf("%s", buffer); // bad
To limit printing of a character array that lacks a null character, use a precision modifier. This will print the character array up to that many characters or up to a null character, whichever comes first.
// printf("%s", buffer);
printf("%.*s", rd, buffer);
Debug tip: Print text with sentinels to clearly indicate the result of each print.
printf("<%.*s>\n", rd, buffer);
Besides the very elegant solution provided in chux's answer, you could just as well terminate the buffer explicitly before printing (which is what makes it a C "string"):
while ((rd = read(fd, buffer, READ_SIZE-1)) > 0) /* read one less, to have a spare
                                                    char available for the `0`-terminator. */
{
    buffer[rd] = '\0';
    printf("'%s'", buffer);
}
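A third option, shown here only as a sketch, is to skip string handling altogether and write out exactly the rd bytes that read() reported, so that no terminator is needed. The helper name copy_fd is made up for this example, and the use of fwrite is an assumption: if only read and malloc are allowed, write(1, buffer, rd) could take its place.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define READ_SIZE 42

/* Hypothetical helper: copies everything readable from `fd` to `out`,
 * READ_SIZE bytes at a time. Returns the total number of bytes copied,
 * or -1 on allocation or read failure. */
long copy_fd(int fd, FILE *out)
{
    char *buffer = malloc(READ_SIZE);
    long total = 0;
    ssize_t rd;

    if (buffer == NULL)
        return -1;

    /* read() returns how many bytes it stored; only those bytes are valid */
    while ((rd = read(fd, buffer, READ_SIZE)) > 0) {
        fwrite(buffer, 1, (size_t)rd, out);
        total += rd;
    }
    free(buffer);
    return rd < 0 ? -1 : total;
}
```

In the program from the question, the loop in main would become a single call: copy_fd(fd, stdout).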

C - Replacing words

My goal here is to read text from a file redirected from stdin, then replace certain argv passed words with the word "Replaced".
For example, if I run:
$ ./a.exe line < input.txt
where input.txt is "Test line one", at the end I should print "Test Replaced one."
I'm not quite sure where my code is going wrong, sometimes I get segmentation fault, and I'm also not sure how I would go about printing the newOut string, or if I even need one.
As a side note, if I were reading with fgets, what if the buffer ended with "li" at the 59th character and the next read then started at index 0 with "ne"? Wouldn't strstr fail to see that as one string?
Any help is appreciated, thanks
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
    char fileRead[60];
    char newOut[];
    while (!feof(stdin)){
        fgets(fileRead,60,stdin); //read file 60 characters at a time
        if (strstr(fileRead,argv[1])){ // if argumentv[1] is contained in fileRead
            strncpy(newOut, fileRead, strlen(argv[1])); // replace
        }
    }
    return (0);
}
As I observed in the comments to your previous question, C — A better method for replacing:
An obvious suggestion is to read whole lines with fgets() and then search those (maybe with strstr()) to find the word to be replaced, and then print the material before the word and the replacement text before resuming the search from after the matched word in the line (so [given "test" as argv[1]] a line containing "testing, 1, 2, 3, tested!" ends up as "Replaced!ing, 1, 2, 3, Replaced!ed!").
This is a rather straight-forward implementation of the described algorithm.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
    assert(argc > 1);
    char fileRead[4096]; /* Show me a desktop computer where this causes trouble! */
    char replace[] = "Replaced!";
    size_t word_len = strlen(argv[1]);
    while (fgets(fileRead, sizeof(fileRead), stdin) != 0)
    {
        char *start = fileRead;
        char *word_at;
        while ((word_at = strstr(start, argv[1])) != 0)
        {
            printf("%.*s%s", (int)(word_at - start), start, replace);
            start = word_at + word_len;
        }
        printf("%s", start);
    }
    return (0);
}
Note that the position of the assert() makes this C99 code; place it after the definition of word_len and it becomes C89 code.
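The question also asks whether a newOut string is needed at all. The code above avoids one by printing as it goes. If the modified line has to exist in memory instead (for example to process it further), the same search loop can fill a caller-supplied buffer. The following is only a sketch of that variant; replace_word and its parameters are hypothetical names, not something from the original answer.

```c
#include <string.h>

/* Hypothetical helper: copies `line` into `out`, substituting every
 * occurrence of `word` with `repl`. Returns 0 on success, or -1 if `word`
 * is empty or the result does not fit in `out_size` bytes. */
int replace_word(const char *line, const char *word, const char *repl,
                 char *out, size_t out_size)
{
    size_t word_len = strlen(word);
    size_t repl_len = strlen(repl);
    size_t used = 0;
    const char *start = line;
    const char *hit;

    if (word_len == 0 || out_size == 0)
        return -1;

    while ((hit = strstr(start, word)) != NULL) {
        size_t before = (size_t)(hit - start);

        if (used + before + repl_len >= out_size)
            return -1;
        memcpy(out + used, start, before);            /* text before the match */
        memcpy(out + used + before, repl, repl_len);  /* the replacement */
        used += before + repl_len;
        start = hit + word_len;                       /* resume after the match */
    }

    size_t rest = strlen(start);
    if (used + rest >= out_size)
        return -1;
    memcpy(out + used, start, rest + 1);              /* tail plus the NUL */
    return 0;
}
```

Inside the fgets loop, a call such as replace_word(fileRead, argv[1], replace, newOut, sizeof newOut) followed by printf("%s", newOut) would give the same output as the printing version, provided newOut is declared with a real size.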

C: lseek() related question

I want to write some bogus text to a file (the text "helloworld" to a file called helloworld), but not starting from the beginning. I was thinking of using the lseek() function.
If I use the following code (edited):
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#define fname "helloworld"
#define buf_size 16
int main(){
    char buffer[buf_size];
    int fildes,
        nbytes;
    off_t ret;
    fildes = open(fname, O_CREAT | O_TRUNC | O_WRONLY, S_IRUSR | S_IWUSR);
    if(fildes < 0){
        printf("\nCannot create file + trunc file.\n");
    }
    //modify offset
    if((ret = lseek(fildes, (off_t) 10, SEEK_END)) < (off_t) 0){
        fprintf(stdout, "\nCannot modify offset.\n");
    }
    printf("ret = %d\n", (int)ret);
    if(write(fildes, fname, 10) < 0){
        fprintf(stdout, "\nWrite failed.\n");
    }
    close(fildes);
    return (0);
}
it compiles fine and runs without any apparent errors.
Still, if I run:
cat helloworld
the output is not what I expected, but:
helloworld
Can
Where is "Can" coming from, and where are my empty spaces?
Should I expect zeros instead of spaces? If I try to open helloworld with gedit, an error occurs, complaining that the file's character encoding is unknown.
LATER EDIT:
After I edited my program to use the right buffer for writing, then compiled and ran it again, the "helloworld" file still cannot be opened with gedit.
LATER EDIT
I understand the issue now. I've added to the code the following:
fildes = open(fname, O_RDONLY);
if(fildes < 0){
    printf("\nCannot open file.\n");
}
while((nbytes = read(fildes, c, 1)) == 1){
    printf("%d ", (int)*c);
}
And now the output is:
0 0 0 0 0 0 0 0 0 0 104 101 108 108 111 119 111 114 108 100
My problem was that i was expecting spaces (32) instead of zeros (0).
In this function call, write(fildes, fname, buf_size), fname has 10 characters (plus a trailing '\0' character), but you're telling the function to write out 16 bytes. Who knows what is in the memory locations after the fname string.
Also, I'm not sure what you mean by "where are my empty spaces?".
Apart from expecting zeros to equal spaces, the original problem was indeed writing more than the length of the "helloworld" string. To avoid such a problem, I suggest letting the compiler calculate the length of your constant strings for you:
write(fildes, fname, sizeof(fname) - 1)
The - 1 is due to the NUL character (zero, \0) that is used to terminate C-style strings, and sizeof simply returning the size of the array that holds the string. Due to this you cannot use sizeof to calculate the actual length of a string at runtime, but it works fine for compile-time constants.
The "Can" you saw in your original test was almost certainly the beginning of one of the "\nCannot" strings in your code; after writing the 11 bytes in "helloworld\0" you continued to write the remaining bytes from whatever was following it in memory, which turned out to be the next string constant. (The question has now been amended to write 10 bytes, but the originally posted version wrote 16.)
The presence of NUL characters (= zero, '\0') in a text file may indeed cause certain (but not all) text editors to consider the file binary data instead of text, and possibly refuse to open it. A text file should contain just text, not control characters.
Your buf_size doesn't match the length of fname. It's reading past the buffer, and therefore getting more or less random bytes that just happened to sit after the string in memory.
