Question about offsets and reading a file line by line

Question about offsets and reading a file line by line - c

#include "offsetFinder.h"
/** Reads a GIS record file (as described in the corresponding project
* specification), and determines, for each GIS record contained in that
* file, the offset at which that record begins. The offsets are stored
* into an array supplied by the caller.
*
* Pre: gisFile is open on a GIS record file
* offsets[] is an array large enough to hold the offsets
* Post: offsets[] contains the GIS record offsets, in the order
* the records occur in the file
* Returns: the number of offsets that were stored in offsets[]
*/
uint32_t findOffsets(FILE* gisFile, uint32_t offsets[]) {
FILE *op;
/*** Complete the implementation of this function ***/
int count = 0;
char offsets[1000];
char *reader;
op = fopen(gisFile, "r");
if (!op) {
perror("Failed to open file!\n");
exit(1);
}
else {
reader = offsets;
while (*reader != '\n' && fgets(offsets, sizeof(offsets), op)) {
count++;
}
}
return count;
}
Hello all, I have a question about this assignment. Is this set up alright?
For the GISData.txt, I am supposed to read through the file and I have to return the number of offsets that were stored in offsets[].
FEATURE_ID|FEATURE_NAME|FEATURE_CLASS|STATE_ALPHA|STATE_NUMERIC|COUNTY_NAME|COUNTY_NUMERIC|PRIMARY_LAT_DMS|PRIM_LONG_DMS|PRIM_LAT_DEC|PRIM_LONG_DEC|SOURCE_LAT_DMS|SOURCE_LONG_DMS|SOURCE_LAT_DEC|SOURCE_LONG_DEC|ELEV_IN_M|ELEV_IN_FT|MAP_NAME|DATE_CREATED|DATE_EDITED
885513|Siegrest Draw|Valley|NM|35|Eddy|015|323815N|1043256W|32.6376116|-104.5488549|323859N|1043732W|32.6498321|-104.6255227|1095|3592|Parish Ranch|11/13/1980|
885526|AAA Tank|Reservoir|NM|35|Eddy|015|321043N|1041456W|32.1786543|-104.2489615|||||1006|3300|Bond Draw|11/13/1980|06/23/2011
885566|Adobe Draw|Valley|NM|35|Eddy|015|322820N|1042141W|32.4723375|-104.361345|322704N|1042129W|32.4511111|-104.3580556|1007|3304|Carlsbad West|11/13/1980|
885567|Adobe Flat|Flat|NM|35|Eddy|015|322849N|1042119W|32.4803932|-104.3552339|||||1006|3300|Carlsbad West|11/13/1980|
885607|Alacran Hills|Range|NM|35|Eddy|015|322812N|1041055W|32.4701183|-104.1818931|||||1009|3310|Carlsbad East|11/13/1980|
885684|Alkali Lake|Lake|NM|35|Eddy|015|323039N|1041133W|32.5109371|-104.1924802|||||966|3169|Angel Draw|11/13/1980|06/23/2011
885697|Allen Well|Well|NM|35|Eddy|015|322309N|1042120W|32.3859489|-104.3555084|||||1038|3405|Carlsbad West|11/13/1980|
This is a snippet of the GISData.txt and each region data (a line) is considered a GIS record.
"The offsets referred to in the assignment are the positions at which the GIS records begin in the GIS data file.
Since each GIS record occupies a whole line, the offset of a GIS record is simply the offset of the first byte in the GIS record.
And, of course, the first line in the GIS data file does not contain a GIS record, so there is no GIS record at offset 0."
Can someone look over my code and revise it if I'm completely wrong? Thank you!!

"For the GISData.txt, I am supposed to read through the file and I have to return the number of offsets that were stored in
offsets[]....Can someone look over my code and revise it if I'm
completely wrong?"
First it appears that there is some confusion about what offset means in this question. And, after Googling "gid offset", I can understand why. Here is a GIS specific definition:
_" offset
[cartography] In cartography, the displacement or movement of features so that they do not overlap when displayed at a given scale.
For example, a road can be offset from a river if the symbols are wide
enough that they overlap.
[symbology] In symbology, the shift of the origin or insertion point of a symbol in an x and/or y direction.
[ESRI software] In ArcGIS, a change in or the act of changing the z-value for a surface or features in a scene by a constant amount or
by using an expression. Offsets may be applied to make features draw
just above a surface."_
And this "GIS System in C" definition:
"The file can be thought of as a sequence of bytes, each at a unique offset from the beginning of the file, just like the cells of an
array. So, each GIS record begins at a unique offset from the
beginning of the file"
These two definitions, although both deriving from searches on gis offset, are so different as to not offer any clarification on what the terms mean in this question. For purposes of this answer then I am taking my queues from your responses in the comments, and will address how to parse the first field in each record of the file. (excluding the header record in line 1.)
Here are some suggested steps to consider that could be used to implement this.
Steps to consider:
Prototype design
As described in comments, the prototype to the findOffsets() function should provide following: filespec, size of array, array. Not mentioned in comments, but also useful might be the length of the longest record that will be read. eg:
uint32_t findOffsets(const char *fileSpec, size_t longestElement, size_t numElements, uint32_t offsets[numElements]);
From calling function
read file once to determine number of records. eg: numRecords. See
int count_names(const char *filename, size_t *count){...}
example here for example of how to read number of records, (and when needed to get longest record.) in file. close file when done:
use number of records from previous step to size the array.
Example:
uint32_t offsets[numRecords-1]; //-1 skipping header line
memset(records, 0, sizeof records);
call findOffsets()
Example:
size_t numOffsets = sizeof records/sizeof *records
uint32_t count = findOffsets("c:\\gis\\data.gis", longestRecord, numOffsets, offsets);
if(count > 0)
{
//do something with records
}
Inside findOffsets()
Open file for second read of process
Read each line of file (skipping header line)
Parse first '|' delimited token from each line
convert parsed token from string to integer
close file
return count of lines processed.
A code example (with very limited safeties/error checking) is below showing how this could be done. It was tested using your sample file contents, and borrows from code linked above, adapted to this purpose:
const char *fileSpec = "C:\\some_directory\\gisData.gis";
uint32_t findOffsets(const char *fileSpec, size_t longestElement, size_t numElements, uint32_t offsets[numElements]);
int count_lines_in_file(const char *filename, size_t *count);
size_t filesize(const char *fn);
int main(void)
{
size_t numRecords = 0;
int longestRecord = count_lines_in_file(fileSpec, &numRecords)+1;//+1 room for null terminator
uint32_t offsets[numRecords -1];//-1 - skipping header line
memset(offsets, 0, sizeof offsets);//initialize VLA offsets
int recordsProcessed = findOffsets(fileSpec, longestRecord, numRecords -1, offsets);//do the work
return 0;
}
uint32_t findOffsets(const char *fileSpec, size_t longestElement, size_t numElements, uint32_t offsets[numElements])
{
char *delim = "|";
char *tok = NULL;
char line[longestElement+1]; //+1 - room for null terminator during read.
memset(line, 0, sizeof line);//initialize VLA line to all zeros
int inx = 0;
FILE *fp = fopen(fileSpec, "r");
if(fp)
{
while(fgets(line, sizeof line, fp))//loop to read all lines in file
{
if(!strstr(line, "FEATURE_ID"))//skip header line, process all other lines
{
tok = strtok(line, delim);//extract first field
if(tok)
{
offsets[inx] = atoi(tok);//convert token and store number
inx++;
}
}
}
fclose(fp);
}
return inx;
}
//passes back count of lines in file, and return longest line
int count_lines_in_file(const char *filename, size_t *count)
{
int len=0, lenKeep = 0;
FILE *fp = fopen(filename, "r");
if(fp)
{
char *tok = NULL;
char *delim = "\n";
int cnt = 0;
size_t fSize = filesize(filename);
char *buf = calloc(fSize, 1);
while(fgets(buf, fSize, fp)) //goes to newline for each get
{
tok = strtok(buf, delim);
while(tok)
{
cnt++;
len = strlen(tok);
if(lenKeep < len) lenKeep = len;
tok = strtok(NULL, delim);
}
}
*count = cnt;
fclose(fp);
free(buf);
}
return lenKeep;
}
//return file size in bytes (binary read)
size_t filesize(const char *fn)
{
size_t size = 0;
FILE*fp = fopen(fn, "rb");
if(fp)
{
fseek(fp, 0, SEEK_END);
size = ftell(fp);
fseek(fp, 0, SEEK_SET);
fclose(fp);
}
return size;
}

Related

read write binary into a string c

So, I looked around the internet and a couple questions here and I couldn't find anything that could fix my problem here. I have an assignment for C programming, to write a program that allows user to enter words into a string, add more words, put all words in the string to a text file, delete all words in string, and when they exit it saves the words in a binary, which is loaded upon starting up the program again. I've gotten everything to work except where the binary is concerned.
I made two functions, one that loads the bin file when the program starts, one that saves the bin file when it ends. I don't know in which, or if in both, the problem starts. But basically I know it's not working right because I get garbage in my text file if I save it in a text file after the program loads the bin file into the string. I know for sure that the text file saver is working properly.
Thank you to anyone who takes the time to help me out, it's been an all-day process! lol
Here are the two snippets of my functions, everything else in my code seems to work so I don't want to blot up this post with the entire program, but if need be I'll put it up to solve this.
SIZE is a constant of 10000 to meet program specs of a 1000 words. But I couldn't get this to run even asking for only 10 elements or 1, just to clear that up
void loadBin(FILE *myBin, char *stringAll) {
myBin = fopen("myBin.bin", "rb");
if (myBin == NULL) {
saveBin(&myBin, stringAll);
}//if no bin file exists yet
fread(stringAll, sizeof(char), SIZE + 1, myBin);
fclose(myBin); }
/
void saveBin(FILE *myBin, char *stringAll) {
int stringLength = 0;
myBin = fopen("myBin.bin", "wb");
if (myBin == NULL) {
printf("Problem writing file!\n");
exit(-1);
stringLength = strlen(stringAll);
fwrite(&stringAll, sizeof(char), (stringLength + 1), myBin);
fclose(myBin); }

You are leaving bad values in your myBin FILE*, and passing the & (address) of a pointer.
Pass the filename, and you can (re) use the functions for other purposes, other files, et al.
char* filename = "myBin.bin";
Pass the filename, buffer pointer, and max size to read. You should consider using stat/fstat to discover file size
size_t loadBin(char *fn, char *stringAll, size_t size)
{
//since you never use myBin, keep this FILE* local
FILE* myBin=NULL;
if( NULL == (myBin = fopen(fn, "rb")) ) {
//create missing file
saveBin(fn, stringAll, 0);
}//if no bin file exists yet
size_t howmany = fread(stringAll, sizeof(char), size, myBin);
if( howmany < size ) printf("read fewer\n");
if(myBin) fclose(myBin);
return howmany;
}
Pass the file name, buffer pointer, and size to save
size_t saveBin(char *fn, char *stringAll, size_t size)
{
int stringLength = 0;
//again, why carry around FILE* pointer only used locally?
FILE* myBin=NULL;
if( NULL == (myBin = fopen("myBin.bin", "wb")) ) {
printf("Problem writing file!\n");
exit(-1);
}
//binary data may have embedded '\0' bytes, cannot use strlen,
//stringLength = strlen(stringAll);
size_t howmany = fwrite(stringAll, sizeof(char), size, myBin);
if( howmany < size ) printf("short write\n");
if(myBin) fclose(myBin);
return howmany;
}
Call these; you are not guaranteed to write & read the same sizes...
size_t buffer_size = SIZE;
char buffer[SIZE]; //fill this with interesting bytes
saveBin(filename, buffer, buffer_size);
size_t readcount = loadBin(filename, buffer, buffer_size);

How to parse files that cannot fit entirely in memory RAM

I have created a framework to parse text files of reasonable size that can fit in memory RAM, and for now, things are going well. I have no complaints, however what if I encountered a situation where I have to deal with large files, say, greater than 8GB(which is the size of mine)?
What would be an efficient approach to deal with such large files?
My framework:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int Parse(const char *filename,
const char *outputfile);
int main(void)
{
clock_t t1 = clock();
/* ............................................................................................................................. */
Parse("file.txt", NULL);
/* ............................................................................................................................. */
clock_t t2 = clock();
fprintf(stderr, "time elapsed: %.4f\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
fprintf(stderr, "Press any key to continue . . . ");
getchar();
return 0;
}
long GetFileSize(FILE * fp)
{
long f_size;
fseek(fp, 0L, SEEK_END);
f_size = ftell(fp);
fseek(fp, 0L, SEEK_SET);
return f_size;
}
char *dump_file_to_array(FILE *fp,
size_t f_size)
{
char *buf = (char *)calloc(f_size + 1, 1);
if (buf) {
size_t n = 0;
while (fgets(buf + n, INT_MAX, fp)) {
n += strlen(buf + n);
}
}
return buf;
}
int Parse(const char *filename,
const char *outputfile)
{
/* open file for reading in text mode */
FILE *fp = fopen(filename, "r");
if (!fp) {
perror(filename);
return 1;
}
/* store file in dynamic memory and close file */
size_t f_size = GetFileSize(fp);
char *buf = dump_file_to_array(fp, f_size);
fclose(fp);
if (!buf) {
fputs("error: memory allocation failed.\n", stderr);
return 2;
}
/* state machine variables */
// ........
/* array index variables */
size_t x = 0;
size_t y = 0;
/* main loop */
while (buf[x]) {
switch (buf[x]) {
/* ... */
}
x++;
}
/* NUL-terminate array at y */
buf[y] = '\0';
/* write buffer to file and clean up */
outputfile ? fp = fopen(outputfile, "w") :
fp = fopen(filename, "w");
if (!fp) {
outputfile ? perror(outputfile) :
perror(filename);
}
else {
fputs(buf, fp);
fclose(fp);
}
free(buf);
return 0;
}
Pattern deletion function based on the framework:
int delete_pattern_in_file(const char *filename,
const char *pattern, const char *outputfile)
{
/* open file for reading in text mode */
FILE *fp = fopen(filename, "r");
if (!fp) {
perror(filename);
return 1;
}
/* copy file contents to buffer and close file */
size_t f_size = GetFileSize(fp);
char *buf = dump_file_to_array(fp, f_size);
fclose(fp);
if (!buf) {
fputs("error - memory allocation failed", stderr);
return 2;
}
/* delete first match */
size_t n = 0, pattern_len = strlen(pattern);
char *tmp, *ptr = strstr(buf, pattern);
if (!ptr) {
fputs("No match found.\n", stderr);
free(buf);
return -1;
}
else {
n = ptr - buf;
ptr += pattern_len;
tmp = ptr;
}
/* delete the rest */
while (ptr = strstr(ptr, pattern)) {
while (tmp < ptr) {
buf[n++] = *tmp++;
}
ptr += pattern_len;
tmp = ptr;
}
/* copy the rest of the buffer */
strcpy(buf + n, tmp);
/* open file for writing and print the processed buffer to it */
outputfile ? fp = fopen(outputfile, "w") :
fp = fopen(filename, "w");
if (!fp) {
outputfile ? perror(outputfile) :
perror(filename);
}
else {
fputs(buf, fp);
fclose(fp);
}
free(buf);
return 0;
}

If you wish to stick with your current design, an option might be to mmap() the file instead of reading it into a memory buffer.
You could change the function dump_file_to_array to the following (linux-specific):
char *dump_file_to_array(FILE *fp, size_t f_size) {
buf = mmap(NULL, f_size, PROT_READ, MAP_SHARED, fileno(fp), 0);
if (buf == MAP_FAILED)
return NULL;
return buf;
}
Now you can read over the file, the memory manager will take automatically care to only hold the relevant potions of the file in memory.
For Windows, similar mechanisms exist.

Chances you are parsing the file line-by line. So read in a large block (4k or 16k) and parse all the lines in that. Copy the small remainder to the beginning of the 4k or 16k buffer and read in the rest of the buffer. Rinse and repeat.
For JSON or XML you will need an event based parser that can accept multiple blocks or input.

There are multiple issues with your approach.
The concept of maximum and available memory are not so evident: technically, you are not limited by the RAM size, but by the quantity of memory your environment will let you allocate and use for your program. This depends on various factors:
What ABI you compile for: the maximum memory size accessible to your program is limited to less than 4 GB if you compile for 32-bit code, even if your system has more RAM than that.
What quota the system is configured to let your program use. This may be less than available memory.
What strategy the system uses when more memory is requested than is physically available: most modern systems use virtual memory and share physical memory between processes and system tasks (such as the disk cache) using very advanced algorithms that cannot be describe in a few lines. It is possible on some systems for your program to allocate and use more memory than is physically installed on the motherboard, swapping memory pages to disk as more memory is accessed, at a huge cost in lag time.
There are further issues in your code:
The type long might be too small to hold the size of the file: on Windows systems, long is 32-bit even on 64-bit versions where memory can be allocated in chunks larger than 2GB. You must use different API to request the file size from the system.
You read the file with an series of calls to fgets(). This is inefficient, a single call to fread() would suffice. Furthermore, if the file contains embedded null bytes ('\0' characters), chunks from the file will be missing in memory. However you could not deal with embedded null bytes if you use string functions such as strstr() and strcpy() to handle your string deletion task.
the condition in while (ptr = strstr(ptr, pattern)) is an assignment. While not strictly incorrect, it is poor style as it confuses readers of your code and prevents life saving warnings by the compiler where such assignment-conditions are coding errors. You might think that could never happen, but anyone can make a typo and a missing = in a test is difficult to spot and has dire consequences.
you short-hand use of the ternary operator in place of if statements is quite confusing too: outputfile ? fp = fopen(outputfile, "w") : fp = fopen(filename, "w");
rewriting the input file in place is risky too: if anything goes wrong, the input file will be lost.
Note that you can implement the filtering on the fly, without a buffer, albeit inefficiently:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
if (argc < 2) {
fprintf(stderr, "usage: delpat PATTERN < inputfile > outputfile\n");
return 1;
}
unsigned char *pattern = (unsigned char*)argv[1];
size_t i, j, n = strlen(argv[1]);
size_t skip[n + 1];
int c;
skip[0] = 0;
for (i = j = 1; i < n; i++) {
while (memcmp(pattern, pattern + j, i - j)) {
j++;
}
skip[i] = j;
}
i = 0;
while ((c = getchar()) != EOF) {
for (;;) {
if (i < n && c == pattern[i]) {
if (++i == n) {
i = 0; /* match found, consumed */
}
break;
}
if (i == 0) {
putchar(c);
break;
}
for (j = 0; j < skip[i]; j++) {
putchar(pattern[j]);
}
i -= skip[i];
}
}
for (j = 0; j < i; j++) {
putchar(pattern[j]);
}
return 0;
}

First of all I wouldn't suggest holding such big files in RAM but instead using streams. This because buffering is usually done by the library as well as by the kernel.
If you are accessing the file sequentially, which seems to be the case, then you probably know that all modern systems implement read-ahead algorithms so just reading the whole file ahead of time IN RAM may in most cases just waste time.
You didn't specify the use-cases you have to cover so I'm going to have to assume that using streams like
std::ifstream
and doing the parsing on the fly will suit your needs. As a side note, also make sure your operations on files that are expected to be large are done in separate threads.

An alternative solution: If you're on linux systems, and you have a decent amount of swap space, just open the whole bad boy up. It will consume your ram and also consume harddrive space (swap). Thus you can have the entire thing open at once, just not all of it will be on the ram.
Pros
If an unexpected shut down occurred, the memory on the swap space is recoverable.
RAM is expensive, HDDs are cheap, so the application would put less strain on your expensive equipment
Virus could not harm your computer because there would be no room in RAM for them to run
You'll be taking full advantage of the Linux operating system by using the swap space. Normally the swap space module is not used and all it does is clog up precious ram.
The additional energy that is needed to utilize the entirety of the ram can warm the immediate area. Useful during winter time
You can add "Complex and Special Memory Allocation Engineering" to your resume.
Cons
None

Consider treating the file as an external array of lines.
Code can use an array of line indexes. This index array can be kept in memory at a fraction of the size of the large file. Access to any line is accomplished quickly via this lookup, a seek with fsetpos() and an fread()/fgets(). As the lines are edited, the new lines can be saved, in any order, in temporary text file. Saving of the file reads both the original file and temp one in sequence to form and write the new file.
typedef struct {
int attributes; // not_yet_read, line_offset/length_determined,
// line_changed/in_other_file, deleted, etc.
fpos_t line_offset; // use with fgetpos() fsetpos()
unsigned line_length; // optional field as code could re-compute as needed.
} line_index;
size_t line_count;
// read some lines
line_index *index = malloc(sizeof *index * line_count);
// read more lines
index = realloc(index, sizeof *index * line_count);
// edit lines, save changes to appended temporary file.
// ...
// Save file -weave the contents of the source file and temp file to the new output file.
Additionally, with enormous files, the array line_index[] itself can be realized in disk memory too. Access to is easily computed. In an extreme sense, only 1 line of the file needs to in memory at any time.

You mentioned state-machine. Every finite-state-automata can be optimized to have minimal (or no) lookahead.
Is it possible to do this in Lex? It will generate output c file which you can compile.
If you don't want to use Lex, you can always do following:
Read n chars into (ring?) buffer where n is size of pattern.
Try to match buffer with pattern
If match goto 1
Print buffer[0], read char, goto 2
Also for very long patterns and degenerate inputs strstr can be slow. In that case you might want to look into more advanced sting matching aglorithms.

mmap() is a pretty good way of working on files with large sizes.
It provides you with lot of flexibility but you need to be cautious with page size. Here is a good article which talks about more specifics.

Infinitely reading input when inputting a file from stdin

I want to parse information from a CSV file and display it to my screen. My issue is that I keep receiving Segmentation fault core dumped from my while loop, which is supposed to be storing the CSV file info into different arrays to represent the different columns in the CSV. A sample line in my CSV would be rice,10,20,$2.00 which represents the food, stock, reorder limit, and cost per unit.
char buffer[1024];
char *line=buffer;
FILE *file=fopen("inventory.csv", "rw");
int x;
int i=0;
int j=0;
char *food [100];
int stock [100];
double cost [100];
int reorder [100];
while (fgets(line, 2048, file) != NULL){
food[i] = (strtok(line, ","));
stock [i] = (atoi(strtok(NULL, ",")));
reorder[i] = (atoi(strtok(NULL, ",")));
cost[i] = (atof(strtok(NULL, ",")));
line=line+25;
i++; }

First, your code has a basic problem: you want to read an infinite amount of data and store it into finite space.
In more details:
You keep advancing line by 25 after each line, it points to buffer which has a size of 1024, so after 40 lines you will be writing after the end of it.
Your arrays have a size of 100, so after 100 lines you will be writing after the end of them.
Do you need to store everything you read forever?
Now if this is not the problem, it could also be that strtok fails to find the "," and returns NULL, you should always check the return of strtok before using the result. It may for example fail if your file finishes with some empty lines, or just that you don't have a "," after the last element of the line that you read.

My bet is that you're trying to access a position in your arrays that doesn't exist. Try looking the last i value before your program crashes.

Please take a look at the following class. You may call it with
main.cpp
Stock line_stock;
file >> line_stock; //Use it in a loop cause it only reads a line
Stock.h
#include <iostream>
#include <string>
class Stock {
public:
private:
std::string food;
unsigned int stock;
double cost;
unsigned int reorder;
friend std::istream& operator >> (std::istream& is, Stock& s){
char coma;
is >> s.food >> coma>>
s.stock >> coma >>
s.cost >> coma >>
s.reorder;
return is;
}
};
//double doesn't work well on money, use unsigned int instead. Divide by 100 to get the dollar and % 100 to get the cents.
Using that class reads a line, then you process it and you're up to the next line.

When trying to read a file with an unknown number of data lines 2 approaches come to mind:
1) Read line by line and adjust memory allocation as needed.
2) Parse the file twice. First to find needed length.
A robust program typically uses method #1. But #2 is simpler for learners. #2 follows.
FILE *file = fopen("inventory.csv", "rw");
// Test for file == NULL
char buffer[100];
size_t n = 0;
// Find number of lines
while (fgets(buffer, sizeof buffer, file) != NULL)
n++;
char *food = malloc(n * sizeof *food);
int stock = malloc(n * sizeof *stock);
double *cost = malloc(n * sizeof *cost);
int *reorder = malloc(n * sizeof *reorder);
// TBD: Add checks for failed malloc
rewind(file);
size_t i = 0;
for (i = 0; i<n; i++) {
char name[sizeof buffer];
if (fgets(buffer, sizeof buffer, file) != NULL) Handle_UnexpectedError();
int cnt = sscanf(buffer, "%s ,%d ,%d , $%lf",
name, &stock[i], &reorder[i], &cost[i]);
if (cnt != 4) Handle_FormatError();
food[i] = strdup(name);
}
fclose(file);
// Use food, stock, reorder, cost
// When done
for (i = 0; i<n; i++) {
free(food[i]);
}
free(food);
free(stock);
free(reorder);
free(cost);

Reading a csv file using C and then doing string manipulation

First I open a CSV file (Cars.csv) using my C file cars.c. My CSV file would look like:
toyota, 100, 20, 150.50
nissan, 200, 50, 100.75
BMW, 400, 80, 323.00
The 1st column is names, 2nd column is stock, 3rd column is order, 4th column is price.
Then I have to do two works(I will call the program in the commandline with the work name and any other argument I need to pass):
List all the items in the CSV file while the commas cannot be a part of the output. I also have to print headers for each column. (./cars list)
I would send an argument when running the program from the command line. The argument would be one the name of the items in my csv file. Then I would have to subtract 1 from the stock and update the CSVfile. (./cars reduce nissan)
This is how I opened the CSV file.
void main(int argc, char *argv[]) {
FILE *in = fopen ("Cars.csv" ,"rt");
if (in == NULL) {
fclose (in); return 0;
}
//I need to store the data from the CSV file in an array of pointers. I know how to do that, but Im having a problem in storing them without the commas and then doing what I need to do in part 1.
//This is the only code I could write for the 2nd part.
int p =0;
for(p =0; p<; p = p + 4) {
if(strcmp(argsv[2],save[p]) == 0) { //save is the array of pointers where I
int r = int(save[p +1]); //store the data from the csv file
int s = r - 1;
char str[15];
sprintf(str, "%d", s);
save[p +1] = str; // not sure if this line actually makes sense
}
}

There are a couple of possible solutions.
Here's one:
Read the lines into an array of pointers. Allocate array with malloc and reallocate as needed with realloc. Read lines with fgets.
For each line, use strtok to split the line, and do whatever you need with each field.
Another solution:
Use your favorite search engine to search for an existing CSV-reading library (there are many out there, which can handle very advanced CSV files).

Think how the data should be represented and define a structure.
typedef struct {
char *name;
int stock;
int order;
double price;
} car_type;
Open the file: See OP's code
Find how many lines and allocate memory
char buf[100];
size_t n = 0;
while (fgets(buf, sizeof buf, in) != NULL) n++;
rewind(in);
car_type *car = malloc(n * sizeof *car);
if (car == NULL) OutOfMemory();
Read the cars
for (size_t i=0; i<n; i++) {
if (fgets(buf, sizeof buf, in) == NULL) Handle_UnexpectedEOF();
char car_name[100];
if (4 != sscanf(buf, "%s ,%d ,%d, %lf", car_name,
&car[i].stock, &car[i].order, &car[i].price) Handle_FormatError();
car[i].name = strdup(car_name);
}
// Use `car` ...
When done
fclose(in);
for (size_t i=0; i<n; i++) {
free(car[i].name);
}
free(car);

How to read contents of a bin file until there is no data left

I have to read a binary (.bin) file. This file has video data that is the RGBA data. Each component is compose of 4096 bytes and are of type unsigned char. Hence i open the file and read the file as shown below code snippet:
FILE *fp=fopen(path,"rb");
//Allocating memory to copy RGBA colour components
unsigned char *r=(unsigned char*)malloc(sizeof(unsigned char)*4096);
unsigned char *g=(unsigned char*)malloc(sizeof(unsigned char)*4096);
unsigned char *b=(unsigned char*)malloc(sizeof(unsigned char)*4096);
unsigned char *b=(unsigned char*)malloc(sizeof(unsigned char)*4096);
//copying file contents
fread(r,sizeof(unsigned char),4096,fp);
fread(g,sizeof(unsigned char),4096,fp);
fread(b,sizeof(unsigned char),4096,fp);
fread(a,sizeof(unsigned char),4096,fp);
Once the data is copied in r,g,b,a they are sent for suitable function for displaying.
The above code works fine for copying one set of RGBA data. However i should keep copying and keep sending the data for display.
I searched and could only find example of displaying contents of a file, however it is suitable only for text files i.e. the EOF technique.
Hence i kindly request the users to provide suitable suggestions for inserting the above code snippet into a loop(the loop condition).

fread has a return value. You want to check it in case of an error.
fread() and fwrite() return the number of items successfully read or written (i.e., not the number of characters).
If an error occurs, or the end-of-file is reached, the return value is a short item count (or zero).
So, you want to try to do an fread, but be sure to look for an error.
while (!feof(fp) {
int res = fread(r,sizeof(unsigned char),4096,fp);
if (res != 4096) {
int file_error = ferror(fp)
if (0 != file_error) {
clearerr(fp)
//alert/log/do something with file_error?
break;
}
}
//putting the above into a function that takes a fp
// and a pointer to r,g,b, or a would clean this up.
// you'll want to wrap each read
fread(g,sizeof(unsigned char),4096,fp);
fread(b,sizeof(unsigned char),4096,fp);
fread(a,sizeof(unsigned char),4096,fp);
}

I think that an EOF technique would do just fine here, something like:
while (!feof(fp))
fread(buffer, sizeof(unsigned char), N, fp);
in your file do you have (R,G,B,A) values?
int next_r = 0;
int next_g = 0;
int next_b = 0;
int next_a = 0;
#define N 4096
while (!feof(fp))
{
int howmany = fread(buffer, sizeof(unsigned char), N, fp);
for (int i=0; i<howmany; i+=4)
{
r[next_r++] = buffer[i];
g[next_g++] = buffer[i+1];
b[next_b++] = buffer[i+2];
a[next_a++] = buffer[i+3];
}
}

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Question about offsets and reading a file line by line - c

Related

read write binary into a string c

How to parse files that cannot fit entirely in memory RAM

Infinitely reading input when inputting a file from stdin

Reading a csv file using C and then doing string manipulation

How to read contents of a bin file until there is no data left

Categories

Resources