unexpected undefined characters fread - c

I am reading a file using a small buffer and printing it. But every time, after fread and printf, some unrecognized characters appear. I do not know why.
I think it has something to do with printf and not fread.
This is the small code:
f = fopen(host_filename2, "r");
char chunks[4];
int size;
do {
memset(chunks,0,sizeof chunks);
size = fread(chunks, 1, sizeof chunks, f);
if (size <= 0) break;
printf("%s", chunks);
} while (size == sizeof chunks);
fclose(f);

printf("%s", chunks); expects chunks[] to be a string. Strings have a null character at the end, and fread(chunks, 1, sizeof chunks, f) certainly did not append a '\0' to form a string: a full read fills all sizeof chunks bytes, leaving no terminator.
Write what was read (Best)
// printf("%s", chunks);
fwrite(chunks, 1, size, stdout);
Write what was read up to a '\0'
"%.*s" writes a limited amount from a character array, stopping at the size or when a '\0' is detected.
// printf("%s", chunks);
printf("%.*s", size, chunks);
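To see how this differs from plain "%s", here is a minimal sketch. format_chunk is a made-up helper, not part of the code above, and it assumes size never exceeds the number of bytes actually stored in the array.

```c
#include <stdio.h>

/* Hypothetical helper: format at most `size` bytes of `chunk`,
   which need not be null-terminated, into `out`.
   Returns what snprintf() returns: the number of characters produced
   (excluding the '\0'), assuming `out` is large enough. */
int format_chunk(char *out, size_t out_size, const char *chunk, int size) {
    return snprintf(out, out_size, "%.*s", size, chunk);
}
```

With the four bytes 'a', 'b', 'c', 'd' and size 4 the result is "abcd". If the chunk holds a '\0' earlier, output stops there, which is why fwrite() is the safer choice for arbitrary file data.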
Append your own '\0'
This will perform like printf("%.*s", size, chunks).
char chunks[4 + 1]; // One bigger
int size;
do {
// memset(chunks,0,sizeof chunks - 1); // Not needed
size = fread(chunks, 1, sizeof chunks - 1, f);
if (size <= 0) break;
chunks[size] = '\0';
printf("%s", chunks);
} while (size == sizeof chunks - 1);
Avoid naked magic numbers
Use size_t for array sizing.
#define CHUNK_SIZE 4
char chunks[CHUNK_SIZE];
size_t size;
size_t n = sizeof chunks/sizeof *chunks;
do {
size = fread(chunks, sizeof *chunks, n, f);
if (size == 0) break; // size_t is unsigned, so test for 0 rather than <= 0
fwrite(chunks, sizeof *chunks, size, stdout);
} while (size == n); // compare element counts, not bytes
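Putting the pieces together, the chunked copy could be wrapped in a function like the sketch below. copy_stream is a made-up name, and returning -1 on error is just one possible convention, not something the answer prescribes.

```c
#include <stdio.h>

#define CHUNK_SIZE 4  /* deliberately tiny, as in the question */

/* Hypothetical helper: copy everything from `in` to `out` in
   CHUNK_SIZE-byte pieces. Returns the number of bytes copied,
   or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out) {
    char chunk[CHUNK_SIZE];
    long total = 0;
    size_t n;

    /* fread() reports how many bytes it really stored; write exactly
       that many, so no terminator is ever needed. */
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, n, out) != n) {
            return -1;
        }
        total += (long) n;
    }
    return ferror(in) ? -1 : total;
}
```

Calling copy_stream(f, stdout) after a successful fopen() would print the file without stray characters, since only the bytes actually read are written.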

Related

Can't compare Lines of a file in C

I got this piece of code:
void scanLinesforArray(FILE* file, char search[], int* lineNr){
char line[1024];
int line_count = 0;
while(fgets(line, sizeof(line),file) !=NULL){
++line_count;
printf("%d",line_count);
printf(line);
char *temp = malloc(strlen(line));
// strncpy(temp,line,sizeof(line));
// printf("%s\n",temp);
free(temp);
continue;
}
}
This will print all lines of the file, but as soon as I uncomment the strncpy(), the program just stops without error.
Same happens as soon as I use strstr() to compare the line to my search variable.
I tried the continue statement and other redundant things, but nothing helps.
Many problems:
Do not print a general string as a format
Code risks undefined behavior should the string contain a %.
// printf(line); // BAD
printf("%s", line);
// or
fputs(line, stdout);
Bad size
strncpy(temp,line,sizeof(line)); is like strncpy(temp,line, 1024);, yet temp points to less than 1024 allocated bytes. Code attempts to write outside allocated memory. Undefined behavior (UB).
Rarely should code use strncpy().
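One reason for that advice: strncpy() leaves the destination without a '\0' whenever the source has n or more characters. A minimal sketch of a bounded copy that always terminates is below. copy_string is a hypothetical helper, and reporting truncation through the return value is my own choice.

```c
#include <string.h>

/* Hypothetical helper: copy `src` into `dst` (capacity `dst_size`),
   always null-terminating the result.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_string(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return 0;                       /* no room even for the '\0' */
    }
    size_t len = strlen(src);
    size_t n = len < dst_size - 1 ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n == len;
}
```

The size passed must be the size of the destination, for example strlen(line) + 1 bytes obtained from malloc(), never sizeof line of the source array.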
Bad specifier
%s expects a matching string. temp does not point to a string as it lacks a null character. Instead, allocate room for the '\0' as well.
// printf("%s\n", temp);
char *temp = malloc(strlen(line) + 1); // + 1
strcpy(temp,line);
printf("<%s>", temp);
free(temp);
No compare
"Can't compare Lines of a file in C" is curious as there is no compare code.
Recall fgets() typically retains a '\n' in line[].
Perhaps
long scanLinesforArray(FILE* file, const char search[]){
char line[1024*4]; // Suggest wider buffer - should be at least as wide as the search string.
long line_count = 0; // Suggest wider type
while(fgets(line, sizeof line, file)) {
line_count++;
line[strcspn(line, "\n")] = 0; // Lop off potential \n
if (strcmp(line, search) == 0) {
return line_count;
}
}
return 0; // No match
}
Advanced: sample code with better performance.
long scanLinesforArray(FILE *file, const char search[]) {
size_t len = strlen(search);
size_t sz = len + 1;
if (sz < BUFSIZ) sz = BUFSIZ;
if (sz > INT_MAX) {
return -2; // Too big for fgets()
}
char *line = malloc(sz);
if (line == NULL) {
return -1;
}
long line_count = 0;
while (fgets(line, (int) sz, file)) {
line_count++;
if (memcmp(line, search, len) == 0) {
if (line[len] == '\n' || line[len] == 0) {
free(line);
return line_count;
}
}
}
free(line);
return 0; // No match
}
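The question also mentions strstr(). If a substring match is wanted rather than whole-line equality, a sketch along the same lines could look like this. find_line_containing is a made-up name, and the 4096-byte buffer is an arbitrary assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the 1-based number of the first line
   that contains `needle`, or 0 if no line does. Lines longer than
   the buffer are seen as several lines, as with any fixed-size
   fgets() loop. */
long find_line_containing(FILE *file, const char *needle) {
    char line[4096];
    long line_count = 0;

    while (fgets(line, sizeof line, file)) {
        line_count++;
        line[strcspn(line, "\n")] = '\0';   /* lop off potential \n */
        if (strstr(line, needle) != NULL) {
            return line_count;
        }
    }
    return 0;   /* no match */
}
```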

Printing all phrases in a file with C program

I need to print all phrases from a file (phrases can end in '.', '?' or '!')
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char* read_file(char *name) {
FILE *file;
char *text;
long num_bytes;
file = fopen(name, "r");
if(!file) {
printf("File could not be opened!");
exit(EXIT_FAILURE);
}
fseek(file, 0, SEEK_END);
num_bytes = ftell(file);
fseek(file, 0, SEEK_SET);
text = (char*) malloc(num_bytes * sizeof(char));
fread(text, 1, num_bytes, file);
fclose(file);
return text;
}
I have this piece of code that kind of works, but if my file has the following text: "My name is Maria. I'm 19." the second phrase is printed with a ' ' at the beginning.
Can someone please help finding a way to ignore those spaces? Thank you
To start, you have several problems that will invoke Undefined Behaviour. In
char *line = (char*) malloc(sizeof(text));
sizeof (text) is the size of a pointer (char *), not the length of the buffer it points to.
sizeof (char *) depends on your system, but is very likely to be 8 (go ahead and test this: printf("%zu\n", sizeof (char *));, if you are curious), which means line can hold a string of length 7 (plus the null-terminating byte).
Long sentences will easily overflow this buffer, leading to UB.
(Aside: do not cast the return of malloc in C.)
Additionally, strlen(text) may not work properly as text may not include the null-terminating byte ('\0'). fread works with raw bytes, and does not understand the concept of a null-terminated string - files do not have to be null-terminated, and fread will not null-terminate buffers for you.
You should allocate one additional byte in the read_file function
text = malloc(num_bytes + 1);
text[num_bytes] = 0;
and place the null-terminating byte there.
(Aside: sizeof (char) is guaranteed to be 1.)
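Note that using ftell to determine the length of a file should not be relied upon. For a text stream the value need not be a byte count, and a binary stream need not meaningfully support SEEK_END.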
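If you would rather not depend on ftell at all, one option is to grow the buffer while reading. This is only a sketch: read_all is a made-up name, and the 4096-byte starting capacity and the doubling strategy are arbitrary assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the rest of `fp` into a malloc'd,
   null-terminated buffer. Stores the byte count in *out_len when
   out_len is not NULL. Returns NULL on allocation or read failure;
   the caller frees the result. */
char *read_all(FILE *fp, size_t *out_len) {
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        /* always keep one byte free for the terminator */
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0) {
            break;                      /* end of file or error */
        }
        if (len == cap - 1) {           /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (out_len != NULL) {
        *out_len = len;
    }
    return buf;
}
```

This also works for streams that cannot seek, such as stdin or a pipe.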
isspace from <ctype.h> can be used to determine if the current character is whitespace. Its argument should be cast to unsigned char. Note this will include characters such as '\t' and '\n'. Use simple comparison if you only care about spaces (text[i + 1] == ' ').
A loop can be used to consume the trailing whitespace after matching a delimiter.
Make sure to null-terminate line before printing it, as %s expects a string.
Use %u to print an unsigned int.
Do not forget to free your dynamically allocated memory when you are done with it. Additionally, strongly consider checking that any library function that can fail has not done so.
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void pdie(const char *msg) {
perror(msg);
exit(EXIT_FAILURE);
}
char *read_file(char *name) {
FILE *file = fopen(name, "r");
if (!file)
pdie(name);
fseek(file, 0, SEEK_END);
long num_bytes = ftell(file);
if (-1 == num_bytes)
pdie(name);
fseek(file, 0, SEEK_SET);
char *text = malloc(num_bytes + 1);
if (!text)
pdie("malloc");
text[num_bytes] = 0;
if (fread(text, 1, num_bytes, file) != num_bytes)
pdie(name);
fclose(file);
return text;
}
int main(int argc, char **argv) {
if (argc < 2) {
fprintf(stderr, "usage: %s TEXT_FILE\n", argv[0]);
return EXIT_FAILURE;
}
char *text = read_file(argv[1]);
unsigned int count = 0;
size_t length = strlen(text);
size_t index = 0;
char *line = malloc(length + 1);
if (!line)
pdie("malloc");
for (size_t i = 0; i < length; i++) {
line[index++] = text[i];
if (text[i] == '.' || text[i] == '?' || text[i] == '!') {
line[index] = '\0';
index = 0;
printf("[%u] <<%s>>\n", ++count, line);
while (isspace((unsigned char) text[i + 1]))
i++;
}
}
free(text);
free(line);
return EXIT_SUCCESS;
}
Input file:
My name is Maria. I'm 19. Hello world! How are you?
stdout:
[1] <<My name is Maria.>>
[2] <<I'm 19.>>
[3] <<Hello world!>>
[4] <<How are you?>>
You can test for a whitespace character by comparing the char in question to ' '.
if(text[i] == ' ')
// text[i] is whitespace
One possible solution: advance to the next non-whitespace character when you find the end of the sentence. You also need to make sure you've malloc'd enough memory for the current phrase:
#include <ctype.h> // for isspace
...
size_t textLength = strlen(text);
// malloc based on the text length here, plus 1 for the NUL terminator.
// sizeof(text) gives you the size of the pointer, not the size of the
// memory block it points to.
char *line = malloc(textLength+1);
for(size_t i = 0; i < textLength; i++) {
line[index] = text[i];
index++;
if(text[i] == '.' || text[i] == '?' || text[i] == '!') {
count++;
line[index] = '\0'; // terminate the phrase before printing it
printf("[%d] %s\n", count, line);
index = 0;
// skip any whitespace that follows the delimiter; the for loop's
// i++ then lands on the first character of the next phrase
while (i+1 < textLength && isspace((unsigned char) text[i+1]))
i++;
}
}
Output:
[1] My name is Maria.
[2] I'm 19.
There's also no need to cast the return value of malloc.

strncpy only copying 16 bytes from one string to another

So I have this function that, for now, takes a string from an input.txt and passes it to a string str by reference using strncpy(). But when I try using the string outside of the function, I don't get anything at all if it tries to copy more than 16 bytes (15 plus the '\0' at the end).
this is my function:
int openFile(char *str){
FILE *arquivo;
arquivo = fopen("input.txt", "r");
if (arquivo == NULL){
return 1; // couldnt open input.txt
}
fseek(arquivo, 0, SEEK_END);
int size = ftell(arquivo);
rewind(arquivo);
char *a = malloc(sizeof(char) * size); // enough for 'size' chars
size_t buffer_len = sizeof(char) * size; // sizeof(a) returns sizeof(a as pointer), so uh.. dont
printf("%d %d %d %d\n", sizeof(char), sizeof(a), size, buffer_len); // just to test what it's getting
fgets(a, size + 1, arquivo);
printf( "%s, %d" , a, size);
realloc(str, sizeof(char) * size); // just in case there isnt enough space in str
strncpy(str, a, 16); // error in THIS LINE doesnt copy more than 16 bytes
// memmove(str, a, 16); //not a problem with strncpy, memmove gives the same problem
// str[buffer_len - 1] = '\0';
fclose(arquivo);
free(a);
return 0;
}
my main is really simple too
int main(){
char *str;
if(openFile(str)){ // if openFile return 1, error opening file
printf("error opening file.\n");
return 0;
}
printf("\n%s", str);
free(str);
return 0;
}
and finally, the input/output (input = MEEUMOCSHMSC1T*AGU0A***L2****T*****A):
1 4 36 36
MEEUMOCSHMSC1T*AGU0A***L2****T*****A, 36
MEEUMOCSHMSC1T*A­
}
It's a cipher, that's why the input is so jumbled;
that "}" at the end is part of the string I guess, it changes every time AND it disappears when I substitute str[15] with '\0';
strncpy(str, a, 16);
strncpy(dest, src, length);
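length is the maximum number of chars copied, so it will not copy more than 16 chars in your case. Note that strncpy() does not append a '\0' when the source has length or more characters, which is why stray characters follow the 16 copied ones.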
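Beyond the 16-byte limit, the posted code has other problems that are likely the real cause. str in main() is an uninitialized pointer passed by value, so realloc(str, ...) acts on garbage and its result is discarded. a is one byte too small for fgets(a, size + 1, ...). Printing a size_t with %d is also wrong. Below is a minimal sketch of one way to restructure it, assuming the caller passes the address of its pointer. read_first_line and the path parameter are my names, not from the question.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical rewrite of openFile(): read the first line of `path`
   into a newly allocated string stored in *out.
   Returns 0 on success, 1 on failure. The caller frees *out. */
int read_first_line(const char *path, char **out) {
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return 1;                       /* couldn't open the file */
    }

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0) {
        size = ftell(fp);
    }
    if (size < 0 || size >= INT_MAX) {  /* fgets() takes an int */
        fclose(fp);
        return 1;
    }
    rewind(fp);

    char *buf = malloc((size_t) size + 1);  /* + 1 for the '\0' */
    if (buf == NULL) {
        fclose(fp);
        return 1;
    }
    if (fgets(buf, (int) size + 1, fp) == NULL) {
        buf[0] = '\0';                  /* empty file: empty string */
    }
    fclose(fp);

    *out = buf;     /* hand the buffer to the caller, no second copy */
    return 0;
}
```

In main() that would become char *str = NULL; followed by read_first_line("input.txt", &str), then free(str) once the string is no longer needed.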

dynamic buffer size for reading input

I am trying to create a program that will read line by line from stdin, search that line for the start and end of a given word and output all the matching words. Here is the code:
int main()
{
char buffer[100];
char **words = NULL;
int word_count = 0;
while (fgets(buffer, sizeof(buffer), stdin) != NULL) {
int length = strlen(buffer);
if (buffer[length - 1] == '\n') {
word_count = count_words(buffer, FIRSTCHAR);
if (word_count > 0) {
words = get_words(buffer, FIRSTCHAR, LASTCHAR);
for (int i = 0; i < word_count; ++i) {
printf("%s\n", words[i]);
free(words[i]);
}
free(words);
}
}
}
return 0;
}
I got the basic functionality working, but I am relying on fgets() with a fixed buffer size.
What I would like is to dynamically allocate a memory buffer with a size based on the length of each line.
I can only see one way of going about solving it, which is to iterate over input with fgetc and increment a counter until end of line and use that counter in place of sizeof(buffer), but I don't know how I would get fgetc to read the correct relevant line.
Is there any smart way of solving this?
but I am relying on fgets() with a fixed buffer size. What I would like is to dynamically allocate a memory buffer with a size based on the length of each line
I wrote a version of fgets for another SO answer that reads the whole line and returns a
malloc-allocated pointer with the contents of the whole line. This is the
code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *fgets_long(FILE *fp)
{
size_t size = 0, currlen = 0;
char line[1024];
char *ret = NULL, *tmp;
while(fgets(line, sizeof line, fp))
{
int wholeline = 0;
size_t len = strlen(line);
if(len > 0 && line[len - 1] == '\n')
{
line[len-- - 1] = 0;
wholeline = 1;
}
if(currlen + len >= size)
{
// we need more space in the buffer
size += (sizeof line) - (size ? 1 : 0);
tmp = realloc(ret, size);
if(tmp == NULL)
break; // return all we've got so far
ret = tmp;
}
memcpy(ret + currlen, line, len + 1);
currlen += len;
if(wholeline)
break;
}
if(ret)
{
tmp = realloc(ret, currlen + 1);
if(tmp)
ret = tmp;
}
return ret;
}
The trick is to check if the newline was read. If it was read, then you can
return the buffer, otherwise it reallocates the buffer with sizeof line more
bytes and appends it to the buffer. You could use this function if you like.
Alternatively, if you are using a POSIX system and/or compiling with GNU GCC, you
can use getline as well.
void foo(FILE *fp)
{
char *line = NULL;
size_t len = 0;
if(getline(&line, &len, fp) < 0)
{
free(line); // man page says even on failure you should free
fprintf(stderr, "could not read whole line\n");
return;
}
printf("The whole line is: '%s'\n", line);
free(line);
return;
}
The function getline() does just what you want. Its prototype:
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
The function is exposed in the stdio.h header file and usually requires something like #define _POSIX_C_SOURCE 200809L or #define _GNU_SOURCE as the first line in the file that calls getline().
I strongly suggest reading and understanding the man page for getline() for all the grubby details.
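As a rough illustration of how getline() is typically used in a loop, here is a sketch that reuses one buffer for every line, however long. longest_line is a made-up function, and the code assumes a POSIX system where getline() is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical example: return the length of the longest line in
   `fp`, not counting the trailing newline. getline() grows `line`
   as needed, so no fixed buffer size is involved. */
size_t longest_line(FILE *fp) {
    char *line = NULL;      /* getline() allocates this for us */
    size_t cap = 0;         /* current capacity of `line` */
    size_t longest = 0;
    ssize_t n;

    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n') {
            line[--n] = '\0';           /* strip the newline */
        }
        if ((size_t) n > longest) {
            longest = (size_t) n;
        }
    }
    free(line);             /* one free() for the reused buffer */
    return longest;
}
```

The same loop shape fits the question: replace the length bookkeeping with the calls to count_words() and get_words() on line.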

why is : fp_len = 400 , size_t len = 1

determine content-length and append '\0'
fseek(fp, 0, SEEK_END);
long fp_len;
fp_len = ftell(fp);
fseek(fp, 0, SEEK_SET);
char *text = malloc(sizeof(*text) * fp_len + 1);
size_t len = fread(text, fp_len, 1, fp);
text[fp_len] = '\0';
fp_len prints : 400, while len prints : 1
printf("%d", fp_len);
printf("%d", len);
my understanding is this is wrong:
text[fp_len] = '\0';
and this is correct :
text[len] = '\0';
but if "len" is printing 1..
wouldn't '\0' be added to the 2nd spot in the array ?
The call fread(text, fp_len, 1, fp) asks to read one element of size fp_len, so after successful execution the result is 1 (the number of elements read), or 0 if reading fails.
If you want to count the number of bytes (characters) read from the file, you can swap the arguments, like
fread(text, 1, fp_len, fp)
For more information, refer to the fread reference; its prototype is:
size_t fread(void * restrict ptr, size_t size, size_t nmemb, FILE * restrict stream);
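To make the difference concrete, here is a small sketch contrasting the two argument orders. read_one_block and read_bytes are made-up wrappers for illustration only.

```c
#include <stdio.h>

/* Hypothetical wrappers contrasting the two argument orders. */

/* One element of `len` bytes: returns 1 if all `len` bytes were
   read, otherwise 0, even when some bytes were read. */
size_t read_one_block(char *buf, size_t len, FILE *fp) {
    return fread(buf, len, 1, fp);
}

/* `len` elements of 1 byte each: returns how many bytes were
   actually read. */
size_t read_bytes(char *buf, size_t len, FILE *fp) {
    return fread(buf, 1, len, fp);
}
```

On a 400-byte file, read_one_block(buf, 400, fp) returns 1 while read_bytes(buf, 400, fp) returns 400. Asking for 500 bytes gives 0 and 400 respectively, so only the second form tells you where to put the '\0'.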
As others have said, fread returned 1: the number of elements read, each of size fp_len, or 400 bytes.
Put the arguments in the correct order.
// size_t len = fread(text, fp_len, 1, fp);
size_t len = fread(text, 1, fp_len, fp);
Better to avoid magic numbers like 1 here. Instead, use the size of the text[] element.
size_t len = fread(text, sizeof *text, fp_len, fp);
Further, code lacks error checking and printf() specifier correctness.
if (fp == NULL) Handle_Error("fopen");
if (fseek(fp, 0, SEEK_END)) Handle_Error("fseek");
long fp_len = ftell(fp);
if (fp_len == -1) Handle_Error("ftell");
if (fp_len < 0 || fp_len >= SIZE_MAX) Handle_Error("long to size_t");
if (fseek(fp, 0, SEEK_SET)) Handle_Error("fseek"); // rewind before reading
char *text = malloc(sizeof *text * (fp_len + 1));
if (text == NULL) Handle_Error("malloc");
size_t len = fread(text, 1, fp_len, fp);
if (len == 0 && fp_len > 0) Handle_Error("fread");
text[len] = '\0';
printf("%ld", fp_len); // note specifiers
printf("%zu", len);
According to the manual page you need to write
fread(text, 1, fp_len, fp);

Resources