c programming - Extracting string of data from file - c

I have the following task: I have a file (card) with 5 strings:
U98_25984nhdrwedb \n
U98_5647BGFREdand \n
U98_30984bgtjfYTs \n
U77_76498375nnnnn \n
U98_83645bscdrTRF \n
I need to extract to another file image.txt those strings starting with "U9".
Without the memory allocation (malloc, calloc), the code below prints the codes correctly to the screen, but it does not write the correct data to image.txt, where I only get "98_25984nhdrwedb#".
I think I am applying the memory allocation incorrectly, but when I use malloc or calloc (before the while loop) it gets worse and prints garbage, and I cannot figure out how to set this up correctly.
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>
#include <stdint.h>
typedef uint8_t BYTE;
int main()
{
FILE *input_card = fopen("card","r"); //open the file for reding
BYTE data[18];
int i, n = 5;
FILE* output = fopen("image.txt","w"); //open the output file for writing
output = malloc(sizeof(data)*18); //assign memory
while (!feof(input_card))
{
for (i = 1; i <= n; i++)
{
fread(data,sizeof(BYTE),18,input_card);
if(data[i] != 0)
{
if (data[0] == 'U' && data[1] == '9')
{
printf("data: %s",data);
fwrite(&data[i],sizeof(BYTE),18,output);
}
fclose(output);
}
}
}
fclose(input_card);
free(output);
return 0;
}

In the following 2 lines from your code, the second line is incorrect:
FILE* output = fopen("image.txt","w"); //open the output file for writing
output = malloc(sizeof(data)*18); //assign memory <= This is WRONG
The variable output is a FILE pointer. You should not allocate it using malloc. You should only use it if fopen returns it successfully, which means it has already been allocated by fopen.
This means you don't need this:
free(output); // This is also WRONG
because fclose() has already released everything associated with the stream:
fclose(output);

fopen() returns a FILE pointer so when you try to allocate memory (using malloc, that also returns a pointer) you're replacing the pointer to the FILE with something that points to memory instead of the file. Removing
output = malloc(sizeof(data)*18); //assign memory
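is the first fix; free(output) has to go as well. Note that the code also calls fclose(output) inside the loop and writes from &data[i] instead of data, which is why the file starts at "98_..." rather than "U98_...", so those two lines need correcting too.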
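Putting the points above together, here is a minimal sketch of the filtering step, not a drop-in replacement for the original program. It assumes the records are newline-terminated text lines shorter than the buffer, so it swaps the fixed 18-byte fread for fgets. The function name copy_matching_lines and the 256-byte buffer are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Copies every line of `in` that starts with `prefix` to `out` and
   returns the number of lines copied. Assumes each line fits in the
   buffer; a longer line would be handled in pieces. */
int copy_matching_lines(FILE *in, FILE *out, const char *prefix)
{
    char line[256];                 /* assumed upper bound on line length */
    size_t prefix_len = strlen(prefix);
    int copied = 0;

    /* fgets returns NULL at end of file, so no feof() test is needed */
    while (fgets(line, sizeof line, in) != NULL) {
        if (strncmp(line, prefix, prefix_len) == 0) {
            fputs(line, out);       /* write the whole line, from its start */
            copied++;
        }
    }
    return copied;
}
```

In main() you would open "card" and "image.txt" with fopen, check both results for NULL, call copy_matching_lines(input_card, output, "U9"), and then fclose each stream exactly once after the call returns. No malloc or free is involved.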

Related

Trying to fscanf with 2d array that uses calloc()

I'm trying to read text from a file and print it in the terminal while using dynamic memory(?), but as soon as I use calloc the code crashes. I'm new to C so I don't know what's wrong
#include <stdio.h>
#include <stdlib.h>
void filecheck(FILE*);
int main(void){
int i=0;
char** text=(char**)calloc(50,sizeof(char*));
for(i=0;i<50;i++) text[i]=(char*)calloc(50,sizeof(char));
FILE *file = fopen("F1.txt","r");
filecheck(file);
while(fscanf(file,"%s", text[i])!=EOF){
printf("%s\n",text[i]);
i++;
}
free(text);
return 0;
}
void filecheck(FILE*file){
if(file==NULL){
printf("Problem");
exit(0);
}
}
The problem is that you don't set i back to 0 before you use it in the while loop. After the for loop i is 50, so text[i] is accessed out of bounds, which causes the segmentation fault. I addressed that issue below by using the same type of for loop that you used to initialize the array in the first place.
Bonus items:
Reformatted code for readability (to me) with spaces and moved * next to variable instead of next to type.
Introduced a couple of defines to replace your magic 50 numbers.
Moved filecheck() before main() so you don't need the declaration.
filecheck() now returns a status code. This allows main() to free memory on failure, which was technically a memory leak (even if the OS does this for you).
Check return value of calloc.
Use a status variable to hold exit code. This allows for clean-up to be shared in both normal and failure case.
Used variable instead of type as argument to sizeof.
Declare the variable as part of each for loop instead of reusing a variable. Reuse is not wrong, btw, but I think it's a good practice even if you use the same variable name.
fgets() instead of fscanf(). fscanf() with a bare %s is subject to buffer overflow when reading strings. Note: fscanf() reads a sequence of non-white-space characters, while fgets() reads a line including the '\n'. Removed the '\n' in the subsequent printf().
Only read at most ARR_LEN strings.
fclose() the file stream (even if the OS would do this for you).
Free the memory you allocate for text[i]. It is technically a memory leak if you don't (even if the OS frees it for you).
#include <stdio.h>
#include <stdlib.h>
#define ARR_LEN 50
#define STR_LEN 50
int filecheck(FILE *file) {
    if(!file) {
        printf("Problem");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
int main(void) {
    int status = EXIT_FAILURE;
    FILE *file = NULL; // declared before the first goto so cleanup can test it
    char **text = calloc(ARR_LEN, sizeof(*text));
    if(!text)
        goto out;
    for(int i=0; i < ARR_LEN; i++) {
        text[i] = calloc(STR_LEN, sizeof(**text));
        if(!text[i])
            goto out;
    }
    file = fopen("F1.txt","r");
    if(filecheck(file) != EXIT_SUCCESS)
        goto out;
    // read at most ARR_LEN lines of at most STR_LEN-1 characters each
    for(int i=0; (i < ARR_LEN) && fgets(text[i], STR_LEN, file); i++)
        printf("%s",text[i]);
    status = EXIT_SUCCESS;
out:
    if(file) fclose(file);
    if(text) { // text is NULL if the first calloc failed
        for(int i=0; i<ARR_LEN; i++)
            free(text[i]); // free(NULL) is a no-op for unallocated rows
        free(text);
    }
    return status;
}
Another error (besides i not being reset to 0) is that fscanf returns the number of successfully scanned arguments, so comparing against EOF alone misses a failed match. You should use:
while (fscanf(file,"%s", text[i])==1) {
...
}
Moreover, the individual text[i] are never freed and leak.
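As a rough illustration of the return-value check, here is a small sketch that also bounds both the number of words and the length of each word. The helper name read_words and the two macros are invented for this example, and a fixed 2D array stands in for the calloc'd rows to keep it short.

```c
#include <stdio.h>

#define MAX_WORDS 50
#define WORD_LEN  50

/* Reads up to MAX_WORDS whitespace-separated words from `file` into
   `words` and returns how many were stored. */
int read_words(FILE *file, char words[MAX_WORDS][WORD_LEN])
{
    int count = 0;

    /* %49s leaves room for the terminating '\0' in a 50-byte row.
       fscanf returns 1 only when a word was actually stored, and
       the bounds check comes first so words[count] is always valid. */
    while (count < MAX_WORDS && fscanf(file, "%49s", words[count]) == 1)
        count++;

    return count;
}
```

The width in the format string has to be kept in sync with WORD_LEN by hand, which is one reason fgets is often preferred for line-oriented input.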

I need to split a file (for now text file) into multiple buffer C

I'm trying to read a file and split it into multiple buffers.
This is what i came up with:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define PKT_SIZE 2048;
#define PATH "directory of some kind"
int main() {
char filepath[200] = PATH;
FILE *packet;
int size = PKT_SIZE;
char *buffer[size];
int i=0;
//OPEN FILE
if((packet = fopen(filepath, "r")) == NULL){ //I'm trying with a txt file, then i'll change it to 'rb'
printf("Error Opening File\n");
return -1;
}
//READ FILE
while(*fgets((char *) *buffer[i], (int) strlen(buffer[i]), packet) != NULL) { //read the file and cycling insert the fgets into the buffer i
printf("Create %d buffer\n", i);
i++;
}
fclose(packet);
return 0;
}
Now, when I run this program, I get a SIGSEGV error. I managed to work out that the error definitely comes from:
*fgets((char *) *buffer[i], (int) strlen(buffer[i]), packet) != NULL
Do you have any suggestions?
*fgets((char *) *buffer[i], (int) strlen(buffer[i]), packet)
This line has several problems.
buffer[i] is an uninitialized pointer that points nowhere.
*buffer[i] is of type char, but you need to pass a char*.
strlen does not return the size of the buffer. Calling it on an uninitialized pointer value is undefined behavior.
Dereferencing whatever fgets returns is also bad: when fgets returns NULL, the dereference invokes undefined behavior.
There are many solutions to this, ranging from dynamic memory allocation to using
char buffer[size][MAXSIZE];. If you go with the array, you can get input this way:
#define MAXSIZE 100
...
char buffer[size][MAXSIZE];
while(fgets(buffer[i], sizeof(buffer[i]), packet)!=NULL){...
char* buffer[size] is an array of N char* pointers which are uninitialized. You must allocate memory to these before using them or your program will explode in a ball of fire.
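The fix is to allocate (after removing the stray semicolon from #define PKT_SIZE 2048; because with it malloc(PKT_SIZE) does not compile):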
for (size_t i = 0; i < size; ++i) {
buffer[i] = malloc(PKT_SIZE);
}
You're going to be responsible for that memory going forward, too, so don't forget to free later.
Allocating an arbitrary number of buffers is pretty wasteful. It's usually better to use some kind of simple linked-list type structure and append chunks as necessary. This avoids pointless over-allocation of memory.
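One way the linked-list idea could look is sketched below. It reads with fread rather than fgets, on the assumption that the file will eventually be opened in binary mode as the question hints. The names struct chunk, read_chunks, free_chunks and CHUNK_SIZE are illustrative only.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 2048

/* One node per chunk; `len` is the number of valid bytes in `data`. */
struct chunk {
    struct chunk *next;
    size_t len;
    unsigned char data[CHUNK_SIZE];
};

/* Reads the rest of the stream into a list of chunks and returns the
   head (NULL for an empty stream). If an allocation fails, the chunks
   read so far are returned. */
struct chunk *read_chunks(FILE *fp)
{
    struct chunk *head = NULL;
    struct chunk **tail = &head;    /* where the next node gets linked */

    for (;;) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            break;
        c->next = NULL;
        c->len = fread(c->data, 1, CHUNK_SIZE, fp);
        if (c->len == 0) {          /* end of file or read error */
            free(c);
            break;
        }
        *tail = c;
        tail = &c->next;
    }
    return head;
}

/* Releases every node of a list returned by read_chunks. */
void free_chunks(struct chunk *head)
{
    while (head != NULL) {
        struct chunk *next = head->next;
        free(head);
        head = next;
    }
}
```

Only as many nodes are allocated as the file needs, and the last node records how many of its bytes are valid, so nothing depends on strlen or on the data being text.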

Segmentation Fault on fputs

I am pretty new to C and memory allocation in general. Basically what I am trying to do is copy the contents of an input file of unknown size and reverse its contents using recursion. I feel that I am very close, but I keep getting a segmentation fault when I try to write out what I presume to be the reversed contents of the file (I presume because I think I am doing it right....)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int recursive_back(char **lines, int lineNumber, FILE *input) {
char *input_line = malloc(sizeof(char) * 1000);
lines = realloc(lines, (lineNumber) * 1000 * sizeof(char));
if(fgets(input_line, 201, input) == NULL) {
*(lines + lineNumber) = input_line;
return 1;
}
else {
printf("%d\n", lineNumber);
return (1+recursive_back(lines, ++lineNumber, input));
}
}
void backward (FILE *input, FILE *output, int debugflag ) {
int i;
char **lines; //store lines in here
lines = malloc(1000 * sizeof(char *) ); //1000 lines
if(lines == NULL) { //if malloc failed
fprintf(stderr, "malloc of lines failed\n");
exit(1);
}
int finalLineCount, lineCount;
finalLineCount = recursive_back(lines, 0, input);
printf("test %d\n", finalLineCount);
for(i = finalLineCount; i > 0; i--) {
fputs(*(lines+i), output); //segfault here
}
}
I am using a simple input file to test the code. My input file is 6 lines long that says "This is a test input file". The actual input files are being opened in another function and passed over to the backward function. I have verified that the other functions in my program work since I have been playing around with different options. These two functions are the only functions that I am having trouble with. What am I doing wrong?
Your problem is here:
lines = realloc(lines, (lineNumber) * 1000 * sizeof(char));
exactly as @ooga said. There are at least three separate things wrong with it:
You are reallocating the memory block pointed to by recursive_back()'s local variable lines, and storing the new address (supposing that the reallocation succeeds) back into that local variable. The new location is not necessarily the same as the old, but the only pointer to it is a local variable that goes out of scope at the end of recursive_back(). The caller's corresponding variable is not changed (including when the caller is recursive_back() itself), and therefore can no longer be relied upon to be a valid pointer after recursive_back() returns.
You allocate space using the wrong type. lines has type char **, so the object it points to has type char *, but you are reserving space based on the size of char instead.
You are not reserving enough space, at least on the first call, when lineNumber is zero. On that call the space requested is exactly zero bytes; what realloc() does with a zero size is implementation-defined (and undefined as of C23), and on many implementations the effect is to free the memory pointed to by lines. On subsequent calls, the space allocated is always one line's worth less than you think you are allocating.
It looks like the realloc() is altogether unnecessary if you can rely on the input to have at most 1000 lines, so you should consider just removing it. If you genuinely do need to be able to reallocate in a way that the caller will see, then the caller needs to pass a pointer to its variable, so that recursive_back() can modify it via that pointer.
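To illustrate the last point, here is a sketch in which the helper receives a pointer to the caller's variable, so a successful realloc is visible to the caller. It deliberately swaps the recursion for a plain loop and stores every line that fgets reads; the question's code only stores a buffer when fgets returns NULL, which looks inverted. The names append_line and write_reversed, the starting capacity of 8 and the 1000-byte line buffer are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Appends a copy of `line` to the array *lines, growing it as needed.
   Returns 0 on success, -1 on allocation failure (in which case the
   existing array is left intact). */
int append_line(char ***lines, size_t *count, size_t *capacity,
                const char *line)
{
    if (*count == *capacity) {
        size_t new_cap = (*capacity == 0) ? 8 : *capacity * 2;
        /* sizeof **lines is sizeof(char *), the array's element type */
        char **grown = realloc(*lines, new_cap * sizeof **lines);
        if (grown == NULL)
            return -1;
        *lines = grown;             /* updates the caller's pointer */
        *capacity = new_cap;
    }
    char *copy = malloc(strlen(line) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, line);
    (*lines)[(*count)++] = copy;
    return 0;
}

/* Writes the lines of `input` to `output` in reverse order. Returns the
   number of lines written, or -1 if an allocation failed. */
long write_reversed(FILE *input, FILE *output)
{
    char **lines = NULL;
    size_t count = 0, capacity = 0;
    char buf[1000];
    long result = 0;

    while (fgets(buf, sizeof buf, input) != NULL) {
        if (append_line(&lines, &count, &capacity, buf) != 0) {
            result = -1;
            break;
        }
    }
    /* valid indices are count-1 down to 0 */
    for (size_t i = count; i > 0; i--) {
        if (result >= 0) {
            fputs(lines[i - 1], output);
            result++;
        }
        free(lines[i - 1]);
    }
    free(lines);
    return result;
}
```

The loop bounds are worth comparing with the original for loop, which starts at finalLineCount and so reads one element past the last stored line.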

Seemingly automatic pointer freeing

I have a function which takes a file, reads it line by line, puts every line in a char *[], puts this two-dimensional array in a struct, and returns this struct:
wordlist.h:
#ifndef H_WORDLIST
#define H_WORDLIST
typedef struct {
char **chWordsList;
int listlen;
}Wordlist;
Wordlist getWordlistFromFile(char *chFilename);
char *getRandomWord();
#endif
The function (plus headers):
#include "wordlist.h"
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#define WORDSIZE 100
Wordlist getWordlistFromFile(char *chFilename){
FILE *file = fopen(chFilename,"r");
if (file == NULL){
printf("Unable to open file %s. Check if the file exists and can be read by this user.\n",chFilename);
exit(1);
}
char chWord[WORDSIZE];
int intFileSize = 0;
//First: coundt the amount of lines in the file
while((fgets(chWord,WORDSIZE,file) != NULL)){
++intFileSize;
}
rewind(file);
char *chWordList[intFileSize];
for (int count = 0; (fgets(chWord,WORDSIZE,file) != NULL); ++count){
chWordList[count] = malloc( strlen(chWord +1));
strcpy(chWordList[count],chWord);
chWordList[count][strlen(chWord) -1] = 0;
}
fclose(file);
Wordlist wordlist;
wordlist.chWordsList = chWordList;
wordlist.listlen = intFileSize;
for (int i = 0; i < wordlist.listlen; ++i){
printf("%s\n", wordlist.chWordsList[i]);
}
return wordlist;
}
So far this works great. The last for loop prints exactly every line of the given file, all fully expected behaviour, works perfect. Now, I actually want to use the function. So: in my main.c:
Wordlist list = getWordlistFromFile(strFilePath);
for (int i = 0; i < list.listlen; ++i){
printf("%s\n", list.chWordsList[i]);
}
This gives me the weirdest output:
abacus
wordlist
(null)
(null)
��Ⳏ
E����H�E
gasses
While the output should be:
abacus
amused
amours
arabic
cocain
cursor
gasses
It seems to me almost like some pointers get freed or something, while others stay intact. What is going on? Why is wordlist perfect before the return and broken after?
char *chWordList[intFileSize]
This array of pointers is allocated on the stack, since it is declared as a local variable of getWordlistFromFile. When the function returns, its stack frame is released and the array is no longer valid, even though the strings it pointed to are still on the heap.
You should use the same approach you used for the individual strings: allocate it on the heap.
char **chWordList = malloc(intFileSize*sizeof(char*));
This way the array outlives the function call, and you will be able to use it after the function returns.
Because you are returning pointers to objects whose lifetime has expired. In particular, chWordsList inside the return value points to an object whose lifetime ends when the function returns. When you dereference that pointer you get undefined behavior (UB); therefore any result would not be surprising.
What you need to do is malloc memory for the chWordList instead of declaring it as a local array:
char **chWordList = malloc(intFileSize * sizeof(char*));
Change
char *chWordList[intFileSize];
to
char **chWordList = malloc(sizeof(char *) * intFileSize);
i.e. allocate chWordList on the heap and store that pointer in the Wordlist.
Your code is returning array variable chWordList allocated on stack, so it will not be valid once the function getWordlistFromFile() completes and returns to main().
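A sketch of how the heap-based version might be organised is shown below, including a matching cleanup helper so the caller can release the list. It takes an already opened FILE * instead of a filename and grows the array with realloc instead of counting lines first; both are choices made for this example, as are the names readWordlist and freeWordlist. It also places the +1 outside strlen(), since malloc(strlen(chWord +1)) in the question allocates two bytes too few.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WORDSIZE 100

typedef struct {
    char **chWordsList;
    int listlen;
} Wordlist;

/* Reads one word per line from an open stream into heap memory, so the
   result stays valid after the function returns. If an allocation
   fails, the words read so far are returned. */
Wordlist readWordlist(FILE *file)
{
    Wordlist list = { NULL, 0 };
    char chWord[WORDSIZE];
    int capacity = 0;

    while (fgets(chWord, WORDSIZE, file) != NULL) {
        chWord[strcspn(chWord, "\n")] = '\0';   /* strip the newline, if any */
        if (list.listlen == capacity) {
            int newCap = capacity ? capacity * 2 : 16;
            char **grown = realloc(list.chWordsList,
                                   newCap * sizeof(char *));
            if (grown == NULL)
                break;
            list.chWordsList = grown;
            capacity = newCap;
        }
        char *copy = malloc(strlen(chWord) + 1); /* +1 for the '\0' */
        if (copy == NULL)
            break;
        strcpy(copy, chWord);
        list.chWordsList[list.listlen++] = copy;
    }
    return list;
}

/* The caller owns the list and releases it with this helper. */
void freeWordlist(Wordlist *list)
{
    for (int i = 0; i < list->listlen; ++i)
        free(list->chWordsList[i]);
    free(list->chWordsList);
    list->chWordsList = NULL;
    list->listlen = 0;
}
```

Returning the struct by value is fine here: only the pointer and the count are copied, and both now refer to heap memory that lives until freeWordlist is called.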

Error reading char* type from .DAT file with C

So, for some reason, I need to make an external file (.DAT) to store data by appending each new record to the end of the old data.
#include <stdio.h>
#include <stdlib.h>
int main () {
typedef struct {
char *Name;
int Index;
} DataFile;
static FILE *file;
size_t result;
DataFile *DataTable;
file = fopen("database.DAT","ab");
DataTable = (DataFile *) malloc (sizeof(DataFile));
DataTable[0].Name = "somefile.txt";
DataTable[0].Index = 7;
printf("%s %d \n",DataTable[0].Name,DataTable[0].Index);
result = fwrite(DataTable,sizeof(DataFile),1,file);
fclose(file);
free(DataTable);
return 0;
}
After running code above, I then check if the data stored correctly. So, I make this code below.
#include <stdio.h>
#include <stdlib.h>
int main () {
typedef struct {
char *Name;
int Index;
} DataFile;
static FILE *file;
size_t result;
long size;
int i;
DataFile *DataTable;
file = fopen("database.DAT","rb");
if (file == NULL) printf("Error1");
// Determine the size of file
fseek(file,0,SEEK_END);
size = ftell(file);
rewind(file);
DataTable = (DataFile *) malloc ((size/sizeof(DataFile)) * sizeof(DataFile));
if (DataTable == NULL) printf("Error2");
result = fread(DataTable,sizeof(DataFile),size/sizeof(DataFile),file);
fclose(file);
for (i=0; i<result; i++) {
printf("%s %d \n",DataTable[i].Name,DataTable[i].Index);
}
free(DataTable);
return 0;
}
However, it gives output
somefile.txt 7
from the first code block and
Error1 7
from the second code block.
I notice that the problem is not because the failure either when opening .DAT file or when allocating memory for DataTable. Also, it works for int type (Index) but not for char* type (Name) when reading from .DAT file. I have no idea what to do to solve this char*-type-reading problem (and where 'error1' comes from). (not even google gives me answer.)
Your structure DataFile stores one pointer and one integer. When you write it to the file, you write some program specific pointer to a string, and an integer.
When reading from it, you just refill your structure with the pointer and the integer, which means that DataFile.Name will be a pointer to memory that probably holds something else in the new process. Since you wrote the address of the first hard-coded string ("somefile.txt"), some undefined but understandable behaviour happens: in the second program that address happens to hold its first hard-coded string (which in your case is Error1).
What you really want to do is write the actual string to your file.
A simple solution, if you want to keep writing the whole structure in one go, is to use an array instead of a pointer:
typedef struct {
char Name[512];
int Index;
} DataFile;
then initialize your data with
strncpy(DataTable[0].Name, "somefile.txt", sizeof(DataTable[0].Name) - 1); // just to make sure you dont overflow your array size
DataTable[0].Name[sizeof(DataTable[0].Name) - 1] = '\0';
and retrieve your data the way you did (strncpy needs #include <string.h>).
A char* is only a pointer, i.e. the address of the character array containing your strings. You don't write the strings themselves to the file. After reading the file, as the same strings aren't in your memory at the same addresses any more, the application will fail.
You'll have to come up with a way to save the strings themselves to file as well. Probably by first writing their length, and then writing their content. Upon reading, you can use the length information to allocate memory dynamically, then read into that memory.
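The length-then-content idea could be sketched roughly as follows. The record layout (a 4-byte length, the name's bytes, then a 4-byte index) and the function names write_record and read_record are assumptions chosen for illustration; they are not part of the original code.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes one record as: 4-byte length, the name's bytes, 4-byte index.
   Returns 0 on success, -1 on a write error. The integers are written
   in the host's byte order, so the file is not portable across
   machines with a different endianness. */
int write_record(FILE *fp, const char *name, int32_t index)
{
    uint32_t len = (uint32_t)strlen(name);

    if (fwrite(&len, sizeof len, 1, fp) != 1) return -1;
    if (len > 0 && fwrite(name, 1, len, fp) != len) return -1;
    if (fwrite(&index, sizeof index, 1, fp) != 1) return -1;
    return 0;
}

/* Reads one record. On success returns 0 and stores a malloc'd,
   NUL-terminated name in *name (the caller frees it). Returns -1 at
   end of file or on error, leaving *name untouched. */
int read_record(FILE *fp, char **name, int32_t *index)
{
    uint32_t len;

    if (fread(&len, sizeof len, 1, fp) != 1) return -1;

    char *buf = malloc((size_t)len + 1);     /* +1 for the terminator */
    if (buf == NULL) return -1;

    if ((len > 0 && fread(buf, 1, len, fp) != len) ||
        fread(index, sizeof *index, 1, fp) != 1) {
        free(buf);
        return -1;
    }
    buf[len] = '\0';
    *name = buf;
    return 0;
}
```

With this layout, appending with "ab" keeps working, and the reader simply calls read_record in a loop until it returns -1. A production version would also sanity-check len before allocating, since a corrupted file could request a huge buffer.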
In your writing code you haven't allocated storage for char *Name. When you perform the DataTable[0].Name = "somefile.txt" instruction you're not copying "somefile.txt" into memory pointed to by Name; you're assigning Name the address of a string literal. That pointer is valid while the writing program runs, but fwrite then stores only the address, which means nothing to the program that later reads the file. The same goes for your file reading code.
You need to:
Allocate storage for your Name.
Copy the string using memcpy or similar into the allocated storage, and write those characters, not the pointer, to the file.

Resources