I am trying to read standard input line by line.
This is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define BUFFER_SIZE 1204
char* readLine(char* buffer){
int i = 0;
for(i; i< BUFFER_SIZE; i++){
printf("%c",buffer[i]);
if( '\n' == buffer[i]){
char* line[124];
memcpy( line, &buffer[0], i-1 );
return *line;
}
}
free(buffer);
}
int doStuffWithLine(char* line){
return 1;
}
int main(int argc, char *argv[])
{
ssize_t aux1;
char *buffer = malloc(sizeof(char)*BUFFER_SIZE);
char *line = malloc(sizeof(char)*BUFFER_SIZE);
while((read(STDIN_FILENO, buffer, BUFFER_SIZE))>0){
line = readLine(buffer);
doStuffWithLine(line);
printf("%s", line);
}
return 0;
}
This is the input file content:
lol1
lol2
lol3
And this is the output of my program:
lol1
Segmentation fault (core dumped)
I want to know how to read lines 2 and 3 and how to fix this. A short explanation of what I am doing wrong would help, because I do not understand the problem.
Thank you in advance.
The read function reads raw bytes and does not terminate your buffer with the string terminator '\0'. Passing that data to printf("%s", ...), which expects a null-terminated C string, is undefined behaviour (for example, a crash).
I'd suggest using fgets instead.
First of all, thank you to everyone who helped me and spent time on this.
After spending some hours learning and breaking my brain I have found a solution. In conclusion I am ***** and noob.
In case someone else has the same problem, here is my code. Easy peasy:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <limits.h>
char* doStuff(char* line){
return line;
}
int main(int argc, char *argv[])
{
char *line = malloc(sizeof(char)*LINE_MAX);
while(fgets(line, LINE_MAX, stdin)!= NULL)
{
line = doStuff(line);
printf("%s", line);
}
return 0;
}
Related
I'm trying to write a function named read_line() to get data from a text file, line by line. After calling the function, the line should be stored through the str pointer and the function should return the length of the line. Unfortunately, I end up getting null every time.
/* readline.c*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "readline.h"
int read_line(char *str)
{
/* Open the file for reading */
size_t line_buf_size = 0;
ssize_t line_size;
FILE *fp = fopen("0.txt", "r");
if (!fp)
{
fprintf(stderr, "Error opening file '%s'\n", "0.txt");
return EXIT_FAILURE;
}
/* Get the first line of the file. */
line_size = getline(&str, &line_buf_size, fp);
printf(str);
return line_size - 2;
}
/* main.c*/
#include "readline.h"
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
char *str = NULL;
int num;
num = read_line(str);
printf("%s", str);
printf("%d", num);
return 0;
}
Expected: get the content of the first line of the text file.
Actual: (null)12
In main, you have an object named str of type char*. read_line receives a copy of the pointer's value as its parameter, so it cannot modify the pointer in main. This means that no matter what happens inside read_line, str will still be NULL when it reaches the printf calls in main.
One approach is to pass a pointer to your pointer object instead of its value.
int read_line(char **str)
{
...
/* Get the first line of the file. */
line_size = getline(str, &line_buf_size, fp);
printf("%s", *str);
...
}
This way, the str in main will be modified by getline.
The reason why I would want to do this is because I want to read from a file line-by-line, and for each line check whether it matches a regex. I am using the getline() function, which puts the line into a char * type variable. I am trying to use regexec() to check for a regex match, but this function wants you to provide the string to match as a const char *.
So my question is, can I create a const char * from a char *? Or perhaps is there a better way to approach the problem I'm trying to solve here?
EDIT: I was asked to provide an example, which I should have included in the first place; apologies for that. I did read the answer by @chqrlie before writing this. The following code gives a segmentation fault.
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <stdbool.h>
#include <regex.h>
int main() {
FILE * file = fopen("myfile", "r");
char * line = NULL;
size_t len = 0;
ssize_t read;
regex_t regex;
const char * regexStr = "a+b*";
if (regcomp(&regex, regexStr, 0)) {
fprintf(stderr, "Could not compile regex \"%s\"\n", regexStr);
exit(1);
}
while ((read = getline(&line, &len, file)) != -1) {
int match = regexec(&regex, line, 0, NULL, 0);
if (match == 0) {
printf("%s matches\n", line);
}
}
fclose(file);
return 0;
}
char * can be converted to const char * without any special syntax. The const in this type means that the data pointed to will not be modified through this pointer.
char array[] = "abcd"; // modifiable array of 5 bytes
char *p = array; // array can be modified via p
const char *q = p; // array cannot be modified via q
Here are some examples:
int strcmp(const char *s1, const char *s2);
size_t strlen(const char *s);
char *strcpy(char *dest, const char *src);
As you can see, strcmp does not modify the strings it receives pointers to, but you can of course pass regular char * pointers to it.
Similarly, strlen does not modify the string, and strcpy modifies the destination string but not the source string.
EDIT: Your problem has nothing to do with constness conversion:
You do not check the return value of fopen(). The program produces a segmentation fault on my system because myfile does not exist.
You must pass REG_EXTENDED to compile a regex that uses the extended syntax, such as a+b*.
Here is a corrected version:
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <regex.h>
int main() {
FILE *file = fopen("myfile", "r");
char *line = NULL;
size_t len = 0;
ssize_t read;
regex_t regex;
const char *regexStr = "a+b*";
if (file == NULL) {
printf("cannot open myfile, using stdin\n");
file = stdin;
}
if (regcomp(&regex, regexStr, REG_EXTENDED)) {
fprintf(stderr, "Could not compile regex \"%s\"\n", regexStr);
exit(1);
}
while ((read = getline(&line, &len, file)) != -1) {
int match = regexec(&regex, line, 0, NULL, 0);
if (match == 0) {
printf("%s matches\n", line);
}
}
fclose(file);
return 0;
}
I have a problem in a program that sets the locale and reads from stdin with the fgetws function.
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
static const int N = 2;
int main(void) {
setlocale(LC_ALL, "");
wchar_t data[N];
fgetws(data, N, stdin);
printf("%ls\n", data);
/* fclose(stdin); */
return 0;
}
When the input is long enough (5 or more characters), I get a segfault if I don't close stdin before returning. Why is that? What is wrong with this program?
I suspect that fgetws(data, 2, stdin) is broken.
With such a small buffer, fgetws() should read at most 1 wchar_t from stdin and append a terminating (wchar_t) '\0'.
As usual, when code fails mysteriously, it is best to check the return values of the functions to see whether they are as expected.
#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#include <assert.h>
static const int N = 2;
int main(void) {
char *p = setlocale(LC_ALL, "");
assert(p);
wchar_t data[N];
wchar_t *s = fgetws(data, N, stdin);
assert(s);
int i = printf("%ls\n", data);
assert(i == 2);
i = fclose(stdin);
assert(i == 0);
return 0;
}
I've got a problem reading a couple of lines from a read-only FIFO. In particular, I have to read two lines (a number n followed by a \n, then a string str), and my C program should write str to a write-only FIFO n times. This is my attempt.
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <ctype.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
char *readline(int fd);
int main(int argc, char** argv) {
int in = open(argv[1], O_RDONLY);
mkfifo(argv[2], 0666);
int out = open(argv[2] ,O_WRONLY);
char *line = (char *) malloc(50);
int n;
while (1) {
sscanf(readline(in), "%d", &n);
strcpy(line, readline(in));
int i;
for (i = 0; i < n; i++) {
write(out, line, strlen(line));
write(out, "\n", 1);
}
}
close(in);
close(out);
return 0;
}
char *readline(int fd) {
char *c = (char *) malloc(1);
char line[50];
while (read(fd, c, 1) != 0) {
if (strcmp(c, "\n") == 0) {
break;
}
strcat(line, c);
}
return line;
}
The code is working properly, but it puts a random number of newlines after the last string repetition. Also, this number changes at each execution.
Could someone please give me any help?
Besides the fact that reading one character at a time and comparing two characters with a string comparison are both far from efficient, readline() returns a pointer to memory that is local to readline(), namely line[50]. That memory is released as soon as readline() returns, so accessing it afterwards invokes undefined behaviour.
One way to fix this is to declare the line buffer outside readline() and pass a pointer to it, like so:
char * readline(int fd, char * line, size_t size)
{
if ((NULL != line) && (0 < size))
{
char c = 0;
size_t i = 0;
while (read(fd, &c, 1) >0)
{
if (('\n' == c) || (size <= i)) {
break;
}
line[i] = c;
++i;
}
line [i] = 0;
}
return line;
}
And then call it like this:
char * readline(int fd, char * line, size_t size);
int main(void)
{
...
char line[50] = "";
...
... readline(in, line, sizeof(line) - 1) ...
I have not tried running your code, but your readline function does not terminate the line with a null ('\0') character. Once you hit the '\n' character, you just break out of the while loop and return line. Try adding a '\0' character before returning from readline.
Your code did not work on my machine, and I'd say you're lucky to get any meaningful results at all.
Here are some problems to consider:
readline returns a pointer to a local char array (line). The array ceases to exist when the function returns, and the memory it occupied is free to be overwritten by other operations.
line is never initialized, so strcat treats whatever garbage it contains as an existing string and may write past the end of the array.
You allocate a 1-byte buffer (c), I suspect, just because you need a char* in read. This is unnecessary (see the code below). What's worse, you do not deallocate it before readline exits, and so it leaks memory.
The while(1) loop would re-read the file and re-print it to the output fifo until the end of time.
You're using some "heavy artillery" - namely, strcat and memory allocation - where there are simpler approaches.
Last, older C standards (C89/C90) require all variables to be declared at the start of a block, before any statements.
And here's how I modified your code. Note that, if the second line is longer than 50 characters, this code may also not behave well. There are techniques around the buffer limit, but I don't use any in this example:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <ctype.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
char *readline(int fd, char * buffer);
int main(int argc, char** argv) {
int in = open(argv[1], O_RDONLY);
int out;
int n;
int i;
char line[50];
memset(line, 0, 50);
mkfifo(argv[2], 0666);
out = open(argv[2] ,O_WRONLY);
sscanf(readline(in, line), "%d", &n);
readline(in, line); /* fills line directly; strcpy(line, line) would copy overlapping objects, which is undefined */
for (i = 0; i < n; i++) {
write(out, line, strlen(line));
write(out, "\n", 1);
}
close(in);
close(out);
return 0;
}
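char *readline(int fd, char * buffer) {
char c;
int counter = 0;
/* Assumes buffer holds at least 50 bytes, as line does in main. */
while (counter < 49 && read(fd, &c, 1) > 0) {
if (c == '\n') {
break;
}
buffer[counter++] = c;
}
buffer[counter] = '\0'; /* terminate, so a shorter line keeps no leftovers of the previous one */
return buffer;
}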
This works on my box as you described. Compiled with GCC 4.8.2.
snprintf in a loop does not work on Linux, but it works properly on Windows.
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char **argv) {
char buffer[255] ={0};
for ( int i = 0; i < 10; i++) {
snprintf(buffer, 255, "%s:%x\0",buffer, i );
}
printf ( "BUFFER = %s\n", buffer );
return 0;
}
This code does not append to the existing buffer; the buffer ends up holding only the value from the last iteration.
You can avoid the undefined behavior of using the buffer both as the target string and as an argument like this:
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char **argv) {
char buffer[255] ={0};
int offset = 0;
for ( int i = 0; i < 10; i++) {
offset += snprintf(buffer + offset, 255 - offset, ":%x\0", i);
}
printf ( "BUFFER = %s\n", buffer );
return 0;
}
sprintf()'ing the result array to itself is undefined behaviour.
EDIT: if you want some code that works, here you are: use strcat() (or the safer strncat, etc. insert usual security discussion about buffer overflow here):
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char **argv) {
char buffer[255] = { 0 };
char fmtbuf[64];
int i;
for (i = 0; i < 10; i++) {
snprintf(fmtbuf, sizeof fmtbuf, ":%x", i);
strcat(buffer, fmtbuf);
}
printf ("BUFFER = %s\n", buffer);
return 0;
}
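Also note that format strings passed to the printf() family don't need a manually written terminating zero: a string literal already ends with one, and snprintf() terminates its output automatically when the size is nonzero.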
snprintf does work as specified on Linux, but your code does not append. Read the Note in the linked documentation!
You should not pass the destination buffer as one of the arguments after the format string.
If you want it to append, either ensure that you don't overflow your fixed buffer, or reallocate that buffer when it gets too small.
You cannot write 'buffer' into itself with 'snprintf'.
The test code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main( int argc, char **argv) {
char buffer[255] ={0};
for ( int i = 0; i < 10; i++) {
char tmp[255] = {0};
strcpy(tmp, buffer);
snprintf(buffer, 255, "%s:%x\0",tmp, i );
printf ( "BUFFER = %s\n", buffer );
}
printf ( "BUFFER = %s\n", buffer );
return 0;
}
The standard specifically states that this code is not expected to work. Firstly, the initial buffer argument is declared restrict, which means that it cannot alias another argument. Secondly, the standard has the following clause just for emphasis:
c99
7.19.6.5 The snprintf function
Description
2 - [...] If copying takes place between objects that overlap, the behavior is undefined.