C memory allocation with File input - c

Hello, I have a problem with memory allocation. My program should:
1. open a file
2. get the length of the text inside
3. make a buffer the size of that length (array[]? malloc?)
4. perform operations on the text in the buffer
5. close the file
It terminates when the text is any longer than 1xx characters, and I have no idea what's going on.
P.S. I'm still learning, so the quality of this code may be poor.
#include <stdio.h>
#include <stdlib.h>
void copy_to_buffer(FILE *fp, int length, char *buffer){
for(int i = 0; i < length; i++){
char c = fgetc(fp);
buffer[i] = c;
}
}
int length_of_text(FILE *fp) {
fseek(fp, 0L, SEEK_END);
int size = ftell(fp);
rewind(fp);
return size;
}
void char_counter(int length, char *buffer, int *charBuffer) {
int counts[128] = { 0 };
for (int i = 0; i < length; i++) {
counts[(int)(buffer[i])]++;
charBuffer[i] = counts[i];
}
for (int i = 0; i < 128; i++) {
charBuffer[i] = counts[i];
if(counts[i] != 0)
printf("%d.(%c) counted: %d times.\n", i,i, counts[i]);
}
}
/***********************************MAIN***********************************/
int main(int argc, char** argv) {
FILE *fp = fopen("tekst.txt" , "r");
int length = length_of_text(fp); //lenght of text
char *buffer = malloc(sizeof(char)*length); //buffer for text from file
if(buffer == NULL)
printf("error");
else
printf("alocated at = %p\n", &buffer);
int charBuffer[128] = {0}; // charcount buffer
buffer[length] = '\0'; // '\0' after last sign
copy_to_buffer(fp, length, buffer);
char_counter(length, buffer, charBuffer);
free(buffer);
fclose(fp);
return 0;
}

In this line
charBuffer[i] = counts[i];
you will overflow charBuffer[128] (and read past the end of counts[128]) when the file is longer than 128 characters, since i runs up to the length of the file.

In your char_counter function you do
charBuffer[i] = counts[i];
in the first for loop, but charBuffer is only defined to be 128 ints. If the text is longer than 128 characters, this will cause a buffer overflow, which can lead to a segmentation fault.
Remove that line and let the 2nd for loop do it.
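A minimal sketch of the counting step with the stray line removed. The names count_chars and NUM_COUNTS are made up for this illustration and are not from the question. It also skips bytes outside 0..127, which is an extra precaution the answers above do not mention.

```c
#include <stddef.h>

#define NUM_COUNTS 128  /* one slot per 7-bit ASCII code, as in the question */

/* Count how often each ASCII character occurs in text[0..length).
   counts must have room for NUM_COUNTS ints. Bytes outside 0..127
   are skipped so they can never index past the end of counts. */
void count_chars(const char *text, size_t length, int counts[NUM_COUNTS])
{
    for (size_t i = 0; i < NUM_COUNTS; i++)
        counts[i] = 0;

    /* The only index used on counts is the character code itself,
       never the position i in the text. */
    for (size_t i = 0; i < length; i++) {
        unsigned char c = (unsigned char)text[i];
        if (c < NUM_COUNTS)
            counts[c]++;
    }
}
```

Separately from the overflow described above, the question's main writes buffer[length] = '\0' into a block of only length bytes. If the terminator is wanted, the allocation would need to be malloc(length + 1).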

Related

Palindrome is not copied to next file but printed on output screen

I have a file named fp1 containing different names, some of which are palindromes. I have to read all the names from fp1 and check whether each name is a palindrome. If it is a palindrome, then I need to print the name to the screen and copy it to another file named fp.
Here's my program:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
void main() {
FILE *fp, *fp1;
char m, y[100];
int k = 0, i = 0, t = 1, p = 0;
fp = fopen("C:\\Users\\HP\\Desktop\\New folder\\file 2.txt", "w");
fp1 = fopen("C:\\Users\\HP\\Desktop\\New folder\\file4.txt", "r");
if (fp == NULL) {
printf("error ");
exit(1);
}
if (fp1 == NULL) {
printf("error");
exit(1);
}
k = 0;
m = fgetc(fp1);
while (m != EOF) {
k = 0;
i = 0;
t = 1;
p = 0;
while (m != ' ') {
y[k] = m;
k = k + 1;
m = fgetc(fp1);
}
p = k - 1;
for (i = 0; i <= k - 1; i++) {
if (y[i] != y[p]) t = 0;
p = p - 1;
}
if (t == 1) {
fputs(y, fp);
printf("%s is a pallindrome\n", y);
}
m = fgetc(fp1);
}
fclose(fp);
fclose(fp1);
}
You are not null terminating your buffer before attempting to use the contents as a string. After placing the last valid character read by fgetc into the buffer, you must place a null terminating character (\0).
A character buffer without a null terminating byte is not a string. Passing such a buffer to fputs, or the printf specifier %s without a length bound, will invoke Undefined Behaviour.
fgetc returns an int, not a char. On systems where char is unsigned, you will not be able to reliably test against the negative value of EOF.
The inner while loop is not checking for EOF. When the file is exhausted, it will repeatedly assign EOF to the buffer, until the buffer overflows.
To that end, in general, the inner while loop does nothing to prevent a buffer overflow for longer inputs.
In a hosted environment, void main() is never the correct signature for main. Use int main(void) or int main(int argc, char **argv).
Note that fputs does not print a trailing newline. As is, you would fill the output file full of strings with no delineation.
The nested while loops are fairly clumsy, and I would suggest moving your palindrome logic to its own function.
Here is a refactored version of your program. This program discards the tails of overly long words ... but the buffer is reasonably large.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#define BUFFER_SIZE 1024
FILE *open_file_or_die(const char *path, const char *mode)
{
FILE *file = fopen(path, mode);
if (!file) { /* check the result of fopen, not the path argument */
perror(path);
exit(EXIT_FAILURE);
}
return file;
}
int is_palindrome(const char *word, size_t len)
{
for (size_t i = 0; i < len / 2; i++)
if (word[i] != word[len - i - 1])
return 0;
return 1;
}
int main(void)
{
/*
FILE *input = open_file_or_die("C:\\Users\\HP\\Desktop\\New folder\\file4.txt", "r");
FILE *output = open_file_or_die("C:\\Users\\HP\\Desktop\\New folder\\file 2.txt", "w");
*/
FILE *input = stdin;
FILE *output = stdout;
char buffer[BUFFER_SIZE];
size_t length = 0;
int ch = 0;
while (EOF != ch) {
ch = fgetc(input);
if (isspace(ch) || EOF == ch) {
buffer[length] = '\0';
if (length && is_palindrome(buffer, length)) {
fputs(buffer, output);
fputc('\n', output);
printf("<%s> is a palindrome.\n", buffer);
}
length = 0;
} else if (length < BUFFER_SIZE - 1)
buffer[length++] = ch;
}
/*
fclose(input);
fclose(output);
*/
}
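The remark above about %s "without a length bound" can be illustrated with a precision argument. This is only a sketch: format_word is a hypothetical helper, not part of the answer's program. With "%.*s", the printf family reads at most the given number of bytes, so the source buffer does not need a terminating '\0'.

```c
#include <stdio.h>

/* Write "<word>" into out, reading exactly len bytes from buf.
   buf does not have to be null-terminated: the precision passed
   through "%.*s" stops the conversion after len bytes.
   Returns what snprintf returns (the length the full text needs). */
int format_word(char *out, size_t out_size, const char *buf, size_t len)
{
    /* The precision argument must be an int, hence the cast. */
    return snprintf(out, out_size, "<%.*s>", (int)len, buf);
}
```

In the refactored program the same idea would look like printf("<%.*s> is a palindrome.\n", (int)length, buffer);, although terminating the buffer, as the answer does, is still required for fputs.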

Unexpected results with fgets

int main(int argc, char **argv)
{
char *buf = (char *)malloc(31);
FILE *fp = fopen("td.txt", "r");
char* temps[4];
for (int i = 0; i < 4; i++)
{
fgets(buf, 3, fp);
temps[i] = buf;
}
fclose(fp);
}
I tried to read from a text like:
a
b
c
d
So I think the result of temps should be:
temp[0] = 'a\n'
...
temp[3] = 'd\n'
But the actual result is:
temp[0] = 'd\n'
...
temp[3] = 'd\n'
While debugging, I found that every time fgets runs, all the entries of temps change for no apparent reason.
How did this happen? How should I correct my code?
buf points to a single allocation whose contents change with each fgets().
temps[i] = buf; assigns the pointer buf to temps[i]. After 4 iterations, temps[0], temps[1], temps[2], temps[3] all have the same pointer value. They all point to same place as buf.
How should I correct my code?
To save unique copies of user input, use a large buffer to read user input. Then allocate right-size buffers for a copy of input.
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strdup() (POSIX; standard C as of C23)
#define N 4
#define BUF_SZ 100
int main(void) {
FILE *fp = fopen("td.txt", "r");
if (fp) {
char buf[BUF_SZ];
char* temps[N];
for (int i = 0; i < N; i++) {
if (fgets(buf, sizeof buf, fp)) {
temps[i] = strdup(buf); // each entry gets its own copy
} else {
temps[i] = NULL;
}
}
// Use temps[] somehow
// cleanup
for (int i = 0; i < N; i++) {
free(temps[i]);
}
fclose(fp);
}
}
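strdup() is a POSIX function and only became part of standard C with C23. Where it is not available, the "allocate a right-size buffer for a copy" step can be written by hand. A minimal sketch follows; dup_string is a hypothetical name chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of the string s, or NULL if
   allocation fails. The caller owns the result and must free() it. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;      /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);        /* copies the '\0' as well */
    return copy;
}
```

Used as temps[i] = dup_string(buf);, each temps[i] points to its own block, so a later fgets() into buf no longer changes earlier entries.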

How to implement MPI in my C program to read file & remove space from it

I am new to C. After four days, I finally managed to write a program that reads a file and removes the spaces from it. I also need to make it parallel using MPI in some way. I tried various solutions, but MPI does not seem straightforward; it is complex. Can someone please help me move forward?
Here is my code. It first reads a text file, and then removes space and new line characters.
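#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strlen()
#include <mpi.h>
int removespace(char aj[500], char mj[500]); // declared before its use in main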
FILE* pInputFile;
int chr = 0;
int main()
{
FILE* fptr;
char c;
char filename[] = "Lorem.txt";
char* str, * strblank;
int i = 0;
errno_t err;
if ((err = fopen_s(&pInputFile, filename, "r")) == 0)
{
/*count the number of characters in file for file initialization*/
size_t pos = ftell(pInputFile); // Current position
fseek(pInputFile, 0, SEEK_END); // Go to end
size_t length = ftell(pInputFile); // read the position which is the size
fseek(pInputFile, pos, SEEK_SET); // restore original position
//creating dynamic array of file size
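// one extra byte in each array for the terminating '\0'
str = malloc(length + 1);
strblank = malloc(length + 1);
if (str == NULL || strblank == NULL)
{
fprintf(stderr, "Out of memory\n");
fclose(pInputFile);
return 1;
}
while ((size_t)i < length && (chr = getc(pInputFile)) != EOF)
{
str[i] = chr;
i++;
}
str[i] = '\0'; // terminate so that printf("%s") and strlen() are safe
i = 0;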
printf("%s", str);
removespace(str, strblank);
printf("%s", strblank);
fclose(pInputFile);
}
else
{
fprintf(stderr, "Cannot open file, error %d\n", err);
}
return 0;
}
int removespace(char aj[500], char mj[500])
{
int i = 0, j = 0, len;
len = strlen(aj); // len stores the length of the input string
while (aj[i] != '\0') // till string doesn't terminate
{
if (aj[i] != ' ' && aj[i] != '\n') // if the char is not a white space
{
/*
incrementing index j only when
the char is not space
*/
mj[j++] = aj[i];
}
/*
i is the index of the actual string and
is incremented irrespective of the spaces
*/
i++;
}
mj[j] = '\0';
printf("\n\nThe string after removing all the spaces is: ");
return 0;
}

Parsing sentences from a txt file to a multidimensional array in C

this is driving me crazy. I'm trying to parse from a txt file every sentence (that is all characters between dots) and insert each sentence into an array. The end goal is to have a multi dimensional array with each sentence as single array.
I managed to reach a point where I think it should work but I'm getting a segmentation fault (core dumped) error from the line numOfRow++
void parseRows(FILE* file){
int c;
int numOfRow = 0;
int numOfChar = 0;
int numOfRows = countNumOfRows(file);
fseek(file, 0, SEEK_SET); // Reset file pointer position to the beginning
char **rows = malloc(numOfRows*sizeof(char*));
for (int i=0; i < numOfRows; i++) rows[i] = malloc(1000*sizeof(char));
while ((c=fgetc(file))!= EOF) {
if (c != '.') {
rows[numOfRow][numOfChar] = c;
numOfChar++;
} else {
rows[numOfRow][numOfChar] = '\0';
numOfRow++; // This is throwing the error
numOfChar = 0;
}
}
printOutput(rows, numOfRows);
}
If I comment out that line the program overwrites every line on the first array and I get only the last sentence as result so I know it's working.
What am I missing?
Complete code here:
#include <stdio.h>
#include <stdlib.h>
#define USAGE "USAGE: ./huffman <textFile.txt>\n"
FILE* openFile(char[]);
void parseRows(FILE*);
int countNumOfRows(FILE*);
void printOutput(char**, int);
int main(int argc, char** argv){
FILE* fd;
if (argc != 2) printf("%s", USAGE);
fd = openFile(argv[1]);
parseRows(fd);
}
FILE* openFile(char* file){
FILE* stream;
stream = fopen(file, "r");
return stream;
}
int countNumOfRows(FILE* file){
int i = 0;
char c;
while ((c=fgetc(file))!= EOF) {
if (c == '.') i++;
}
printf("numero di righe %d\n", i);
return i;
}
void parseRows(FILE* file){
int c;
int numOfRow = 0;
int numOfChar = 0;
int numOfRows = countNumOfRows(file);
fseek(file, 0, SEEK_SET); // Reset file pointer position to the beginning
char **rows = malloc(numOfRows*sizeof(char*));
for (int i=0; i < numOfRows; i++) rows[i] = malloc(1000*sizeof(char));
while ((c=fgetc(file))!= EOF) {
if (c != '.') {
rows[numOfRow][numOfChar] = (char)c;
numOfChar++;
} else {
rows[numOfRow][numOfChar] = '\0';
numOfRow += 1;
numOfChar = 0;
}
}
printOutput(rows, numOfRows);
}
void printOutput(char** matrix, int rows){
for (int i=0; i<rows; i++){
printf("%s", matrix[i]);
}
}
Example of input file textFile.txt:
Any text that contains more than one sentence.
This Should get parsed and return a 2 dimension array with every sentence as single array.
Your countNumOfRows() function counts the dots in a file, and you use that number to malloc space for your array. However, there are likely more characters beyond the last dot and before EOF (e.g. a CR or LF or CRLF), so you can easily write past the end of your malloc'd memory.
Try:
return (i + 1)
at the end of countNumOfRows() and see if that eliminates the segfault.
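A different route from the i + 1 suggestion is to check both indices before every write, so that anything after the last dot is dropped instead of being stored. The sketch below works on an in-memory string rather than a FILE* to keep it short. That choice, along with the names split_sentences and ROW_SIZE, is an assumption made for this example.

```c
#include <stddef.h>

#define ROW_SIZE 1000  /* bytes per row, matching malloc(1000) in the question */

/* Split text at '.' into rows[0..max_rows). Every write is checked
   against max_rows and ROW_SIZE, so characters after the last dot
   (a trailing newline, for instance) are dropped rather than written
   past the allocation. Returns the number of sentences stored. */
size_t split_sentences(const char *text, char **rows, size_t max_rows)
{
    size_t row = 0, col = 0;

    for (size_t i = 0; text[i] != '\0' && row < max_rows; i++) {
        if (text[i] == '.') {
            rows[row][col] = '\0';     /* finish the current sentence */
            row++;
            col = 0;
        } else if (col < ROW_SIZE - 1) {
            rows[row][col++] = text[i];
        }                               /* overlong sentences are truncated */
    }
    return row;
}
```

With max_rows equal to the number of dots, the loop stops as soon as the last sentence is terminated, so rows[max_rows] is never touched.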

struct pointers to same memory address producing different data?

I have this simple code to read the lines of a file and store them in a struct:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct filedata {
char **items;
int lines;
};
struct filedata *read_file(char *filename) {
FILE* file = fopen(filename, "r");
if (file == NULL) {
printf("Can't read %s \n", filename);
exit(1);
}
char rbuff;
int nlines = 0; // amount of lines
int chr = 0; // character count
int maxlen = 0; // max line length (to create optimal buffer)
int minlen = 2; // min line length (ignores empty lines with just \n, etc)
while ((rbuff = fgetc(file) - 0) != EOF) {
if (rbuff == '\n') {
if (chr > maxlen) {
maxlen = chr + 1;
}
if (chr > minlen) {
nlines++;
}
chr = 0;
}
else {
chr++;
}
}
struct filedata *rdata = malloc(sizeof(struct filedata));
rdata->lines = nlines;
printf("lines: %d\nmax string len: %d\n\n", nlines, maxlen);
rewind(file);
char *list[nlines];
int buffsize = maxlen * sizeof(char);
char buff[buffsize];
int i = 0;
while (fgets(buff, buffsize, file)) {
if (strlen(buff) > minlen) {
list[i] = malloc(strlen(buff) * sizeof(char) + 1);
strcpy(list[i], buff);
i++;
}
}
rdata->items = (char **)list;
fclose(file);
int c = 0;
for (c; c < rdata->lines; c++) {
printf("line %d: %s\n", c + 1, rdata->items[c]);
}
printf("\n");
return rdata;
}
int main(void) {
char fname[] = "test.txt";
struct filedata *ptr = read_file(fname);
int c = 0;
for (c; c < ptr->lines; c++) {
printf("line %d: %s\n", c + 1, ptr->items[c]);
}
return 0;
}
This is the output when I run it:
lines: 2
max string len: 6
line 1: hello
line 2: world
line 1: hello
line 2: H��
For some reason when it reaches the second index in ptr->items, it prints gibberish output. But yet, if I throw some printf()'s in there to show the pointer addresses, they're exactly the same.
Valgrind also prints this when iterating over the char array the second time:
==3777== Invalid read of size 8
==3777== at 0x400AB3: main (test.c:81)
==3777== Address 0xfff000540 is on thread 1's stack
==3777== 240 bytes below stack pointer
But that really doesn't give me any clues in this case.
I'm using gcc 4.9.4 with glibc-2.24 if that matters.
list is a non-static local variable, and using it after exiting its scope (returning from read_file in this case) will invoke undefined behavior because it ceases to exist when its scope ends. Allocate it dynamically (typically on the heap) like
char **list = malloc(sizeof(char*) * nlines);
Adding code to check if malloc()s are successful will make your code better.
The variable list is local to read_file, but you store a pointer to list in rdata->items. When read_file returns, rdata->items is a dangling pointer, and accessing it is undefined behavior.
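A minimal sketch of the heap-allocated variant, including the malloc() checks recommended above. It builds the struct from an array of strings instead of reading a file. That simplification and the helper names make_filedata and free_filedata are assumptions for this example, not code from the question.

```c
#include <stdlib.h>
#include <string.h>

struct filedata {
    char **items;
    int lines;
};

/* Release everything owned by rdata. Safe to call with NULL. */
void free_filedata(struct filedata *rdata)
{
    if (rdata == NULL)
        return;
    for (int i = 0; i < rdata->lines; i++)
        free(rdata->items[i]);
    free(rdata->items);
    free(rdata);
}

/* Build a filedata whose items array lives on the heap, so it stays
   valid after this function returns. Each line is copied.
   Returns NULL if any allocation fails. */
struct filedata *make_filedata(const char *lines[], int nlines)
{
    struct filedata *rdata = malloc(sizeof *rdata);
    if (rdata == NULL)
        return NULL;

    /* Heap allocation replaces the local array "char *list[nlines]". */
    rdata->items = malloc(sizeof(char *) * (nlines > 0 ? nlines : 1));
    if (rdata->items == NULL) {
        free(rdata);
        return NULL;
    }
    rdata->lines = 0;

    for (int i = 0; i < nlines; i++) {
        size_t size = strlen(lines[i]) + 1;
        rdata->items[i] = malloc(size);
        if (rdata->items[i] == NULL) {
            free_filedata(rdata);      /* frees the lines copied so far */
            return NULL;
        }
        memcpy(rdata->items[i], lines[i], size);
        rdata->lines++;
    }
    return rdata;
}
```

Because items now points to heap memory, the caller can keep using ptr->items after the function returns, and should call free_filedata(ptr) when done.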

Resources