Reading line by line in Arduino IDE

I'm trying to read a text file of numeric values line by line.
I'm using SPIFFS with this function:
void readFile(fs::FS &fs, const char * path){
Serial.printf("Reading file: %s\r\n", path);
File file = fs.open(path);
if(!file || file.isDirectory()){
Serial.println("- failed to open file for reading");
return;
}
int count = 0;
Serial.println(" read from file:");
while(file.available()){
if (count < 100)
Serial.write(file.read());
}
}
Is there an alternative to file.read(), something like "readline"? I need to read the file in batches: lines 1 to 100, then 101 to 200, and so on.

You can use readStringUntil. It's not the most efficient approach, but here is how it's done.
#include <vector>
#include <SPIFFS.h>
std::vector<double> readLine(String path, uint16_t from, uint16_t to) {
std::vector<double> list;
if(from > to) return list;
File file = SPIFFS.open(path.c_str());
if(!file) return list;
uint16_t counter = 0;
while(file.available() && counter < to) { // lines are numbered from 1; stop once line 'to' has been read
counter++;
if(counter < from) {
file.readStringUntil('\n');
} else {
String data = file.readStringUntil('\n');
list.push_back(data.toDouble());
}
}
file.close();
return list;
}
void setup() {
Serial.begin(115200);
SPIFFS.begin(true);
std::vector<double> numbers = readLine("/file.txt", 1, 100); // SPIFFS paths start with '/'
Serial.println("Data from 1 to 100:");
uint16_t counter = 0;
for (auto& n : numbers) {
counter++;
Serial.print(String(n) + "\t");
if(counter % 10 == 0) {
Serial.println();
}
}
numbers = readLine("/file.txt", 101, 200);
Serial.println("\nData from 101 to 200:");
counter = 0;
for (auto& n : numbers) {
counter++;
Serial.print(String(n) + "\t");
if(counter % 10 == 0) {
Serial.println();
}
}
}
void loop() {
// nothing to do: everything runs once in setup()
}
UPDATE
Suppose you have 1050 values and want to process them 100 at a time:
int i = 0;
while(i < 1050) {
int start = i + 1;
int end = (i + 100) > 1050 ? 1050 : i + 100;
std::vector<double> numbers = readLine("/file.txt", start, end);
Serial.println("Data from " + String(start) + " to " + String(end) + ":");
uint16_t counter = 0;
for (auto& n : numbers) {
counter++;
Serial.print(String(n) + "\t");
if(counter % 10 == 0) {
Serial.println();
}
}
i += 100;
}
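
Note that each readLine call above reopens the file and skips everything before the requested range, so the total work grows with the square of the file length. A minimal host-side sketch of the alternative, assuming plain C stdio instead of the Arduino File class: keep the stream open and read consecutive batches from the current position. The helper name read_batch and the 63-character line limit are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read up to `max` numeric lines from an already-open
   stream into `out`, continuing from the current file position.
   Returns the number of values stored (0 at end of file). */
size_t read_batch(FILE *f, double *out, size_t max)
{
    char line[64];   /* assumes each line fits in 63 characters */
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, f) != NULL) {
        /* strtod stops at the first character that is not part of the number */
        out[n++] = strtod(line, NULL);
    }
    return n;
}
```

Calling read_batch repeatedly with max = 100 yields lines 1 to 100, then 101 to 200, and so on, with a shorter final batch.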

To read lines from Serial, a network Client, a File, or any other object implementing the Arduino Stream class, you can use the readBytesUntil function.
char lineBuffer[64]; // char rather than uint8_t, so the buffer can be passed to Serial.println
while (file.available()) {
int length = file.readBytesUntil('\n', lineBuffer, sizeof(lineBuffer) - 1);
if (length > 0 && lineBuffer[length - 1] == '\r') {
length--; // to remove \r if it is there
}
lineBuffer[length] = 0; // terminate the string
Serial.println(lineBuffer);
}
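
The line-ending cleanup in the loop above can also be isolated into a small helper. This is a minimal sketch in plain C; strip_eol is a hypothetical name, and it assumes the buffer is already null-terminated.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n'.
   Returns the length of what remains. */
size_t strip_eol(char *s)
{
    size_t len = strcspn(s, "\r\n");   /* index of first CR/LF, or strlen(s) */
    s[len] = '\0';
    return len;
}
```

This handles both "\n" and "\r\n" endings, which matters when the file was created on Windows.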

The File is a Stream and has all of the methods that a Stream does, including readBytesUntil and readStringUntil.

Related

Reading values from an array

I am trying to read an RFID tag and compare its number with an array of known tags. The comparison in my code is not working properly: it always prints both 'Found' and 'Not Found'. Can anyone help me with this?
The two RFID numbers that I want to read are 37376B34 and 7AA29B1A. These are only for testing; I'm going to store about 20 RFID numbers in the array and check against them.
My code:
#include <SPI.h>
#include <MFRC522.h>
#define SS_PIN 10
#define RST_PIN 9
MFRC522 mfrc522(SS_PIN, RST_PIN); // Create MFRC522 instance.
int readsuccess;
byte readcard[4];
char str[32] = "";
String StrUID;
char* myTags[] = {"9FF4375C","37376B34","7AA29B1A","1B7D5223","9FF4375C"};
void setup() {
Serial.begin(9600); // Initialize serial communications with the PC
SPI.begin(); // Init SPI bus
mfrc522.PCD_Init(); // Init MFRC522 card
}
// --------------------------------------------------------------------
void loop() {
readsuccess = getid();
if (readsuccess) {
Serial.println(StrUID);
delay(1000);
}
for (int i = 0; i < 5; i++) {
if (StrUID == myTags[i]) {
Serial.println("Found");
return;
} else {
Serial.println("Not Found");
}
}
}
// --------------------------------------------------------------------
int getid() {
if (!mfrc522.PICC_IsNewCardPresent()) {
return 0;
}
if (!mfrc522.PICC_ReadCardSerial()) {
return 0;
}
for (int i=0; i<4; i++) {
readcard[i] = mfrc522.uid.uidByte[i]; //storing the UID of the tag in readcard
array_to_string(readcard, 4, str);
StrUID = str;
}
mfrc522.PICC_HaltA();
return 1;
}
// --------------------------------------------------------------------
void array_to_string(byte array[], unsigned int len, char buffer[]) {
for (unsigned int i = 0; i < len; i++) {
byte nib1 = (array[i] >> 4) & 0x0F;
byte nib2 = (array[i] >> 0) & 0x0F;
buffer[i * 2 + 0] = nib1 < 0xA ? '0' + nib1 : 'A' + nib1 - 0xA;
buffer[i * 2 + 1] = nib2 < 0xA ? '0' + nib2 : 'A' + nib2 - 0xA;
}
buffer[len * 2] = '\0';
}

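The Arduino byte type is just unsigned char in C++. The reader already delivers the UID as bytes, so you can compare it directly against the known tags with memcmp(), without a String object or the array-to-string conversion. For that to work the tags must be stored as bytes too: memcmp() compares raw memory, and the eight ASCII characters of "9FF4375C" never equal the four bytes 0x9F 0xF4 0x37 0x5C. Also print "Not Found" only after the loop, once every entry has been checked.
const byte myTags[][4] = {
{0x9F, 0xF4, 0x37, 0x5C},
{0x37, 0x37, 0x6B, 0x34},
{0x7A, 0xA2, 0x9B, 0x1A},
{0x1B, 0x7D, 0x52, 0x23},
{0x9F, 0xF4, 0x37, 0x5C}
};
byte myReading[4] = {0x9f,0xf4,0x37,0x5c}; // in the sketch, use mfrc522.uid.uidByte
for (int i = 0; i < 5; i++) {
if (memcmp(myTags[i], myReading, sizeof(myReading)) == 0) {
Serial.println("Found");
return;
}
}
Serial.println("Not Found"); // reached only if no entry matched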
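
If the tag list is easier to maintain as hex text, each entry has to be converted to raw bytes before memcmp() can compare it with the UID. The following is a minimal sketch in plain C, not part of the MFRC522 library; parse_uid and hex_value are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: convert exactly 2 * len hex digits into len bytes.
   Returns 0 on success, -1 if the text is too short, too long, or
   contains a non-hex character. */
int parse_uid(const char *hex, uint8_t *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int hi = hex_value(hex[2 * i]);
        if (hi < 0) return -1;          /* also catches an early '\0' */
        int lo = hex_value(hex[2 * i + 1]);
        if (lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return hex[2 * len] == '\0' ? 0 : -1;
}
```

After a successful parse_uid("7AA29B1A", tag, 4), the result can be compared with the four UID bytes using memcmp(tag, uid, 4) == 0.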

C struct value resetting to NULL

I'm working on a school project where I have to store PPM data into structs. I have an issue with an array of strings inside the struct.
typedef struct {
char **comments;
} PPM;
I have 3 functions that use this struct:
PPM *getPPM() is used to get all the PPM data from file and store it into the struct
void showPPM(PPM * img) to show the image data in terminal
PPM *encode(PPM *img) which is used to change the LSB of the RGB values of the image
The problem is that getPPM works as intended and gets all the comments into the comments array in getPPM. It displays them fine if I do it like this:
PPM *p = getPPM(fin);
showPPM(p);
But if I try to call it with the encode function like this:
PPM *p = getPPM(fin);
PPM *g = encode(p);
showPPM(g);
the debugger shows that as soon as the program enters the encode function, the comments value resets to NULL even though this function doesn't even touch comments. Is the way I am calling these functions wrong or is there something wrong with my code? I will try to provide the minimal code if the problem is not the way the functions are being called, as the code is big and everything is dependent on one another.
I'm very new to C language. I tried understanding the problem for hours but can't find a solution anywhere. Any help would be greatly appreciated.
EDIT: This is as small as I could make it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Structures
typedef struct {
int r, g, b;
} pixels;
typedef struct {
char format[3];
char **comments;
int width, height, maxColors, commentCounter;
pixels **pixelValues;
} PPM;
// Functions declarations
PPM *getPPM(FILE * f);
PPM *encode(PPM *im, char *message, unsigned int mSize, unsigned int secret);
void showPPM(PPM * im);
static int *decimalToBinary(const char *message, unsigned int length);
// Main method
int main(int argc, char **argv) {
FILE * fin = fopen(argv[1], "r");
if(fin == NULL) {
perror("Cannot open file");
exit(1);
}
PPM *p = getPPM(fin);
PPM *g = encode(p, "test", 5, 1);
showPPM(g);
return 0;
}
/*
* This function is used to get the image data from a file and populate
* our strutures with it.
*/
PPM *getPPM(FILE * f) {
// Allocate the memory for structure and check if it was successful
PPM *pic = (PPM *) malloc(sizeof(PPM));
if(!pic) {
perror("Unable to allocate memory for structure");
exit(1);
}
char line[100]; // Expecting no more than 100 characters per line.
pic->commentCounter = 0; // This is the counter to keep size if there are more than one comments
int pixelsCounter = 0; // Counter for pixels' array
pic->comments = malloc(sizeof(char *));
pic->pixelValues = malloc(sizeof(PPM));
int lineCounter = 0;
if((pic->comments) == NULL) {
perror("Unable to allocate memory for pixels");
exit(1);
}
while(fgets(line, sizeof(line), f)) {
// Reference: https://stackoverflow.com/questions/2693776/removing-trailing-newline-character-from-fgets-input
size_t length = strlen(line);
if(length > 0 && line[length-1] == '\n') {
line[--length] = '\0';
}
// Assigning the file format
if(line[0] == 'P') {
pic->format[0] = line[0];
pic->format[1] = line[1];
pic->format[2] = '\0';
}
//Populate comments into struct PPM
if(line[0] == '#') {
// Reallocate/allocate the array size each time a new line of comment is found
if(pic->commentCounter != 0) {
pic->comments = realloc(pic->comments, (pic->commentCounter+1) * sizeof(char *));
}
// Allocate the memory for the string
pic->comments[pic->commentCounter] = malloc(100 * sizeof(char));
// Write the at commentCounter position of the array; character by character
int i = 0;
while(line[i] != '\0') {
pic->comments[pic->commentCounter][i] = line[i];
i++;
}
pic->comments[pic->commentCounter][i] = '\0';
pic->commentCounter++;
}
/*
* Loading the max color property of the file which is gonna be 3 letters (Usually 255)
* and checking if we previously got maxColors in our construct or not. If we didn't
* then we load this value into the consturct and the condition will never validate
* throughout the iterations
*/
if(strlen(line) == 3 && pic->maxColors == 0 && line[0] != '#') {
pic->maxColors = atoi(line);
continue;
}
/*
* Check if the length of string > 3, which means it is going to be a
* number, potentially RGB value or a comment. But since width & height
* comes before RGB values, our condition will fail once we have found
* the width/height for the next iteration. That's why this condition
* only checks if it is a comment or a numbered value of length > 3
*/
if((strlen(line) > 3) && (pic->width == 0) && (line[0] != '#')) {
char *width = strtok(line, " ");
char *height = strtok(NULL, " ");
pic->width = atoi(width);
pic->height = atoi(height);
continue;
}
/*
* If the width/height and maxColors have been found, that means every
* other line is either going to be the RGB values or a comment.
*/
if((pic->width != 0) && (pic->maxColors != 0) && (line[0] != '#')) {
// length(line) > 3 means all the RGB values are in same line
if(strlen(line) > 3) {
char *val1 = strtok(line, " ");
char *val2 = strtok(NULL, " ");
char *val3 = strtok(NULL, " ");
// pixelsCounter = 0 means it's the first element.
if(pixelsCounter != 0) {
// Reallocate memory each time a new R G B value line is found
pic->pixelValues = realloc(pic->pixelValues, (pixelsCounter + 1) * sizeof(PPM));
}
pic->pixelValues[pixelsCounter] = malloc(12 * sizeof(pixels));
pic->pixelValues[pixelsCounter]->r = atoi(val1);
pic->pixelValues[pixelsCounter]->g = atoi(val2);
pic->pixelValues[pixelsCounter]->b = atoi(val3);
pixelsCounter++;
} else if(strlen(line) <= 3) {
/*
* If each individual RGB values are in a separete lines, we will
* use a switch case and a line counter to keep track of where the
* values were inserted and when to know when we got RGB values for
* one pixel
*/
if(pixelsCounter != 0 && lineCounter == 0) {
// Reallocate memory each time a new R G B value line is found
pic->pixelValues = realloc(pic->pixelValues, (pixelsCounter + 1) * sizeof(PPM));
}
switch(lineCounter) {
case 0 :
pic->pixelValues[pixelsCounter] = malloc(12 * sizeof(pixels));
pic->pixelValues[pixelsCounter]->r = atoi(line);
lineCounter++;
continue;
case 1 :
pic->pixelValues[pixelsCounter]->g = atoi(line);
lineCounter++;
continue;
case 2 :
pic->pixelValues[pixelsCounter]->b = atoi(line);
lineCounter=0;
pixelsCounter++;
continue;
default:
continue;
}
}
}
}
pic->pixelValues[pixelsCounter] = NULL;
fclose(f);
return pic;
}
void showPPM(PPM * im) {
printf("%s\n",im->format);
int k = 0;
while(k < im->commentCounter) {
printf("%s\n", im->comments[k]);
k++;
}
printf("%d %d\n", im->width, im->height);
printf("%d\n",im->maxColors);
int j = 0;
while(im->pixelValues[j] != NULL) {
printf("%d %d %d\n", im->pixelValues[j]->r, im->pixelValues[j]->g, im->pixelValues[j]->b);
j++;
}
}
PPM *encode(PPM *im, char *message, unsigned int mSize, unsigned int secret) {
int *binaryMessage = decimalToBinary(message, mSize);
int i, j = 0, lineCounter = 0;
for(i = 0; i < 40; i++) {
switch(lineCounter) {
case 0 :
im->pixelValues[j]->r |= binaryMessage[i] << 0;
lineCounter++;
continue;
case 1 :
im->pixelValues[j]->g |= binaryMessage[i] << 0;
lineCounter++;
continue;
case 2 :
im->pixelValues[j]->b |= binaryMessage[i] << 0;
lineCounter=0;
j++;
continue;
default:
continue;
}
}
return im;
}
/*
* Converts a string into binary to be used in encode function. It
* first converts each letter of the string into ascii code. Then
* finds and stores each of the 8 bits of that int (ascii code of
* the letter) sequentially in an array.
*/
static int *decimalToBinary(const char *message, unsigned int length) {
/*
* malloc is used here instead of [] notation to allocate memory,
* because defining the variable with [] will make its scope
* limited to this function only. Since we want to access this
* array later on, we use malloc to assign space in the memory
* for it so we can access it using a pointer later on.
*/
int k=0, i, j;
unsigned int c;
unsigned int *binary = malloc(8 * length);
for(i = 0; i < length; i++) {
c = message[i];
for(j = 7; j >= 0; j--,k++) {
/*
* We check here if the jth bit of the number is 1 or 0
* using the bit operator &. If it is 1, it will return
* 1 because 1 & 1 will be true. Otherwise 0.
*/
if((c >> j) & 1)
binary[k] = 1;
else
binary[k] = 0;
}
}
return binary;
}
PPM file:
P3
# CREATOR: GIMP PNM Filter Version 1.1
# Amazing comment 2
# Yet another amazing comment
400 530
255
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0

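The bug is in decimalToBinary. This line
unsigned int *binary = malloc(8 * length);
allocates 8 * length bytes, but the loop stores 8 * length ints, so it writes past the end of the block. That overflow can corrupt neighbouring heap data such as the comments array. The line must be
unsigned int *binary = malloc(8 * length * sizeof(int));
The version below also sets width, height and maxColors to 0 before they are tested in getPPM, and frees binaryMessage in encode. The new code is: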
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Structures
typedef struct {
int r, g, b;
} pixels;
typedef struct {
char format[3];
char **comments;
int width, height, maxColors, commentCounter;
pixels **pixelValues;
} PPM;
// Functions declarations
PPM *getPPM(FILE * f);
PPM *encode(PPM *im, char *message, unsigned int mSize, unsigned int secret);
void showPPM(PPM * im);
static int *decimalToBinary(const char *message, unsigned int length);
// Main method
int main(int argc, char **argv) {
FILE * fin = fopen(argv[1], "r");
if(fin == NULL) {
perror("Cannot open file");
exit(1);
}
PPM *p = getPPM(fin);
PPM *g = encode(p, "test", 5, 1);
showPPM(g);
free(p->comments);
free(p);
return 0;
}
/*
* This function is used to get the image data from a file and populate
* our strutures with it.
*/
PPM *getPPM(FILE * f) {
// Allocate the memory for structure and check if it was successful
PPM *pic = (PPM *) malloc(sizeof(PPM));
if(!pic) {
perror("Unable to allocate memory for structure");
exit(1);
}
char line[100]; // Expecting no more than 100 characters per line.
pic->commentCounter = 0; // This is the counter to keep size if there are more than one comments
int pixelsCounter = 0; // Counter for pixels' array
pic->comments = malloc(sizeof(char *));
pic->pixelValues = malloc(sizeof(PPM));
int lineCounter = 0;
if((pic->comments) == NULL) {
perror("Unable to allocate memory for pixels");
exit(1);
}
pic->width = 0;
pic->height = 0;
pic->maxColors = 0;
while(fgets(line, sizeof(line), f)) {
// Reference: https://stackoverflow.com/questions/2693776/removing-trailing-newline-character-from-fgets-input
size_t length = strlen(line);
if(length > 0 && line[length-1] == '\n') {
line[--length] = '\0';
}
// Assigning the file format
if(line[0] == 'P') {
pic->format[0] = line[0];
pic->format[1] = line[1];
pic->format[2] = '\0';
}
//Populate comments into struct PPM
if(line[0] == '#') {
// Reallocate/allocate the array size each time a new line of comment is found
if(pic->commentCounter != 0) {
pic->comments = realloc(pic->comments, (pic->commentCounter+1) * sizeof(char *));
}
// Allocate the memory for the string
pic->comments[pic->commentCounter] = malloc(100 * sizeof(char));
// Write the at commentCounter position of the array; character by character
int i = 0;
while(line[i] != '\0') {
pic->comments[pic->commentCounter][i] = line[i];
i++;
}
pic->comments[pic->commentCounter][i] = '\0';
pic->commentCounter++;
}
/*
* Loading the max color property of the file which is gonna be 3 letters (Usually 255)
* and checking if we previously got maxColors in our construct or not. If we didn't
* then we load this value into the consturct and the condition will never validate
* throughout the iterations
*/
if(strlen(line) == 3 && pic->maxColors == 0 && line[0] != '#') {
pic->maxColors = atoi(line);
continue;
}
/*
* Check if the length of string > 3, which means it is going to be a
* number, potentially RGB value or a comment. But since width & height
* comes before RGB values, our condition will fail once we have found
* the width/height for the next iteration. That's why this condition
* only checks if it is a comment or a numbered value of length > 3
*/
if((strlen(line) > 3) && (pic->width == 0) && (line[0] != '#')) {
char *width = strtok(line, " ");
char *height = strtok(NULL, " ");
pic->width = atoi(width);
pic->height = atoi(height);
continue;
}
/*
* If the width/height and maxColors have been found, that means every
* other line is either going to be the RGB values or a comment.
*/
if((pic->width != 0) && (pic->maxColors != 0) && (line[0] != '#')) {
// length(line) > 3 means all the RGB values are in same line
if(strlen(line) > 3) {
char *val1 = strtok(line, " ");
char *val2 = strtok(NULL, " ");
char *val3 = strtok(NULL, " ");
// pixelsCounter = 0 means it's the first element.
if(pixelsCounter != 0) {
// Reallocate memory each time a new R G B value line is found
pic->pixelValues = realloc(pic->pixelValues, (pixelsCounter + 1) * sizeof(PPM));
}
pic->pixelValues[pixelsCounter] = malloc(12 * sizeof(pixels));
pic->pixelValues[pixelsCounter]->r = atoi(val1);
pic->pixelValues[pixelsCounter]->g = atoi(val2);
pic->pixelValues[pixelsCounter]->b = atoi(val3);
pixelsCounter++;
} else if(strlen(line) <= 3) {
/*
* If each individual RGB values are in a separete lines, we will
* use a switch case and a line counter to keep track of where the
* values were inserted and when to know when we got RGB values for
* one pixel
*/
if(pixelsCounter != 0 && lineCounter == 0) {
// Reallocate memory each time a new R G B value line is found
pic->pixelValues = realloc(pic->pixelValues, (pixelsCounter + 1) * sizeof(PPM));
}
switch(lineCounter) {
case 0 :
pic->pixelValues[pixelsCounter] = malloc(12 * sizeof(pixels));
pic->pixelValues[pixelsCounter]->r = atoi(line);
lineCounter++;
continue;
case 1 :
pic->pixelValues[pixelsCounter]->g = atoi(line);
lineCounter++;
continue;
case 2 :
pic->pixelValues[pixelsCounter]->b = atoi(line);
lineCounter=0;
pixelsCounter++;
continue;
default:
continue;
}
}
}
}
pic->pixelValues[pixelsCounter] = NULL;
fclose(f);
return pic;
}
void showPPM(PPM * im) {
printf("%s\n",im->format);
int k = 0;
while(k < im->commentCounter) {
printf("%s\n", im->comments[k]);
k++;
}
printf("%d %d\n", im->width, im->height);
printf("%d\n",im->maxColors);
int j = 0;
while(im->pixelValues[j] != NULL) {
printf("%d %d %d\n", im->pixelValues[j]->r, im->pixelValues[j]->g, im->pixelValues[j]->b);
j++;
}
}
PPM *encode(PPM *im, char *message, unsigned int mSize, unsigned int secret) {
int *binaryMessage = decimalToBinary(message, mSize);
int i, j = 0, lineCounter = 0;
for(i = 0; i < 40; i++) {
switch(lineCounter) {
case 0 :
im->pixelValues[j]->r |= binaryMessage[i] << 0;
lineCounter++;
continue;
case 1 :
im->pixelValues[j]->g |= binaryMessage[i] << 0;
lineCounter++;
continue;
case 2 :
im->pixelValues[j]->b |= binaryMessage[i] << 0;
lineCounter=0;
j++;
continue;
default:
continue;
}
}
free(binaryMessage);
return im;
}
/*
* Converts a string into binary to be used in encode function. It
* first converts each letter of the string into ascii code. Then
* finds and stores each of the 8 bits of that int (ascii code of
* the letter) sequentially in an array.
*/
static int *decimalToBinary(const char *message, unsigned int length) {
/*
* malloc is used here instead of [] notation to allocate memory,
* because defining the variable with [] will make its scope
* limited to this function only. Since we want to access this
* array later on, we use malloc to assign space in the memory
* for it so we can access it using a pointer later on.
*/
int k = 0, i, j;
unsigned int c;
unsigned int *binary = malloc(8 * length * sizeof(int));
for(i = 0; i < length; i++) {
c = message[i];
for(j = 7; j >= 0; j--,k++) {
/*
* We check here if the jth bit of the number is 1 or 0
* using the bit operator &. If it is 1, it will return
* 1 because 1 & 1 will be true. Otherwise 0.
*/
if((c >> j) & 1)
binary[k] = 1;
else
binary[k] = 0;
}
}
return binary;
}
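
One way to avoid this class of sizing mistake is to let the pointer's own element type determine the byte count. A minimal sketch of that idiom follows; bytes_to_bits is a hypothetical stand-in for decimalToBinary, not the code from the question.

```c
#include <stdlib.h>

/* Hypothetical helper: expand `length` bytes into 8 * length ints
   (each 0 or 1), most significant bit first.
   Returns NULL on allocation failure; the caller frees the result. */
int *bytes_to_bits(const char *message, size_t length)
{
    /* sizeof *bits follows the element type, so the byte count stays
       correct even if the type of `bits` changes later. */
    int *bits = malloc(8 * length * sizeof *bits);
    if (bits == NULL)
        return NULL;

    size_t k = 0;
    for (size_t i = 0; i < length; i++) {
        unsigned char c = (unsigned char)message[i];
        for (int j = 7; j >= 0; j--)
            bits[k++] = (c >> j) & 1;   /* bit j of the character */
    }
    return bits;
}
```

Tools such as Valgrind or a compiler's address sanitizer usually report this kind of overflow at the offending write, which is quicker than watching unrelated fields change in a debugger.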

The fastest way to save graph to file in C

I have the following problem: I need to save a large graph with more than a million edges to a txt file. Each edge is represented by a structure containing 3 integers: from, to, cost. My task is to write a program that quickly saves the whole graph to a txt file in the format "from to cost\n".
I am interested in the method for doing that.
My idea is to create a huge buffer of chars and add each digit to it without having to reverse the digits afterwards: first I get the number of digits of each integer, then I add each digit to the buffer, then a space or newline character, and so on until the last number is added.
Then I save the whole buffer to the file with fwrite().
Although this method is relatively fast, I have seen programs that do it faster. Do you know a more efficient way to implement this?
The program must be in C.
typedef struct {
int edge_start;
int edge_count;
int parent;
int cost;
} node_t;
typedef struct {
graph_t *graph;
node_t *nodes;
int num_nodes;
int start_node;
} dijkstra_t;
The function to get the number of digits:
int getNumberOfDigitsBig(int x) {
if (x >= 10000) {
if (x >= 10000000) {
if (x >= 100000000) {
if (x >= 1000000000)
return 9;
return 8;
}
return 7;
}
if (x >= 100000) {
if (x >= 1000000)
return 6;
return 5;
}
return 4;
}
if (x >= 100) {
if (x >= 1000)
return 3;
return 2;
}
if (x >= 10)
return 1;
return 0;
}
Save function:
const dijkstra_t *const dij = (dijkstra_t*)dijkstra;
if (dij) {
FILE *f = fopen(filename, "w");
if (f) {
int numberOfNodes = dij->num_nodes;
long bufferLength = numberOfNodes * (9 * 3 + 3);
buffer = (char *)malloc(bufferLength + 1);
long bufferCounter = 0;
int number;
// printf("i = %d\n", number);
int counter;
int digits;
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
buffer[bufferCounter++] = '\n';
for(int i = 1; i < numberOfNodes; i++) {
const node_t *const node = &(dij->nodes[i]);
number = i;
digits = getNumberOfDigits(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
--digits;
++bufferCounter;
} while(number /= 10);
buffer[bufferCounter++] = ' ';
number = node->cost;
if(number != -1) {
digits = getNumberOfDigitsBig(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
digits = digits - 1;
bufferCounter = bufferCounter + 1;
} while(number /= 10);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = ' ';
number = node->parent;
if(number != -1) {
digits = getNumberOfDigitsBig(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
--digits;
++bufferCounter;
} while(number /= 10);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = '\n';
}
fwrite(buffer, 1, bufferCounter, f);
ret = fclose(f) == 0;
free(buffer);
}
}
Thanks for attention.

I assume that what you need is an optimized version of printf that only handles non-negative integers. I did not benchmark it, but I would try to do as few comparisons and operations as possible, so I ended up with this function:
int printint(FILE *fd, int n) {
char buffer[32]; // an uint64_t uses max 20 chars in base 10
int i = sizeof(buffer);
do {
buffer[--i] = '0' + n%10; // write digits from the right of buffer
n /= 10;
} while(n > 0);
return fwrite(buffer + i, 1, sizeof(buffer) - i, fd);
}
Then I would not use a huge buffer, but just rely on the default buffering of a FILE *.
The saving code could then become (more or less adapted from the example in the question):
const dijkstra_t *const dij = (dijkstra_t*)dijkstra;
if (dij) {
FILE *f = fopen(filename, "w");
if (f) {
int numberOfNodes = dij->num_nodes;
int number;
fputs("0 0 -1\n", f);
for(int i = 1; i < numberOfNodes; i++) {
const node_t *const node = &(dij->nodes[i]);
printint(f, i); // node index
fputc(' ', f);
number = node->cost;
if(number != -1) {
printint(f, number);
} else {
fwrite("-1", 1, 2, f);
}
fputc(' ', f);
number = node->parent;
//printf("parent = %d\n", number);
if(number != -1) {
printint(f, number); // stream first, matching printint(FILE *, int)
} else {
fwrite("-1", 1, 2, f);
}
fputc('\n', f);
}
ret = fclose(f) == 0; // nothing to free: stdio owns the buffer
}
}
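
If the default stdio buffer turns out to be small on a given platform, it can be enlarged with the standard setvbuf() call while still letting the FILE * do the buffering. This is a minimal sketch; use_big_buffer is a hypothetical wrapper, and the best size is something to measure rather than assume.

```c
#include <stdio.h>

/* Hypothetical wrapper: ask stdio for a fully buffered stream with a
   buffer of `size` bytes allocated by the library.
   Must be called after fopen() and before any other operation on the
   stream. Returns 0 on success, nonzero on failure. */
int use_big_buffer(FILE *f, size_t size)
{
    return setvbuf(f, NULL, _IOFBF, size);
}
```

A typical call would be use_big_buffer(f, 1 << 16) right after the fopen() in the saving code.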

You can improve a little by using this "itoa":
void gwf_i2a(char *d, int i, int l) {
char *e = d + l;
while (l > 0) {
e--;
l--;
e[0] = '0' + (i % 10);
i /= 10;
}
}
ORIGINAL TIME: 76 clicks (7.6e-05 seconds).
NEW TIME: 39 clicks (3.9e-05 seconds).
source:
#include <cstdio>   // fopen, fwrite, fclose
#include <cstdlib>  // malloc, free
#include <ctime>
#include <iostream>
#include <random>
#include <vector>
#define ZERO '0'
void gwf_i2a(char *d, int i, int l) {
char *e = d + l;
while (l > 0) {
e--;
l--;
e[0] = '0' + (i % 10);
i /= 10;
}
}
typedef struct {
int x, y, z;
} graph_t;
typedef struct {
int edge_start;
int edge_count;
int parent;
int cost;
} node_t;
typedef struct {
graph_t *graph;
node_t *nodes;
int num_nodes;
int start_node;
} dijkstra_t;
graph_t graph = {111, 222, 3456789};
node_t nodes[] = {{1, 1, 1, 9999}, {2, 2, 2, 8999}, {2, 2, 2, 1234567890}};
dijkstra_t data[] = {{&graph, nodes, 3, 0}}; // num_nodes must match the 3 entries in nodes
int getNumberOfDigits(int x) {
if (x >= 100) {
if (x >= 1000) return 3;
return 2;
}
if (x >= 10) return 1;
return 0;
}
int getNumberOfDigitsBig(int x) {
if (x >= 10000) {
if (x >= 10000000) {
if (x >= 100000000) {
if (x >= 1000000000) return 9;
return 8;
}
return 7;
}
if (x >= 100000) {
if (x >= 1000000) return 6;
return 5;
}
return 4;
}
if (x >= 100) {
if (x >= 1000) return 3;
return 2;
}
if (x >= 10) return 1;
return 0;
}
void save(const char *filename, const dijkstra_t *dijkstra) {
int ret;
const dijkstra_t *const dij = (dijkstra_t *)dijkstra;
char *buffer;
if (dij) {
FILE *f = fopen(filename, "w");
if (f) {
int numberOfNodes = dij->num_nodes;
long bufferLength = numberOfNodes * (9 * 3 + 3);
buffer = (char *)malloc(bufferLength + 1);
long bufferCounter = 0;
int number;
// printf("i = %d\n", number);
int counter;
int digits;
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
buffer[bufferCounter++] = '\n';
for (int i = 1; i < numberOfNodes; i++) {
const node_t *const node = &(dij->nodes[i]);
number = i;
digits = getNumberOfDigits(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
--digits;
++bufferCounter;
} while (number /= 10);
buffer[bufferCounter++] = ' ';
number = node->cost;
if (number != -1) {
digits = getNumberOfDigitsBig(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
digits = digits - 1;
bufferCounter = bufferCounter + 1;
} while (number /= 10);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = ' ';
number = node->parent;
if (number != -1) {
digits = getNumberOfDigitsBig(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
--digits;
++bufferCounter;
} while (number /= 10);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = '\n';
}
fwrite(buffer, 1, bufferCounter, f);
ret = fclose(f) == 0;
free(buffer);
}
}
}
void new_save(const char *filename, const dijkstra_t *dijkstra) {
int ret;
const dijkstra_t *const dij = (dijkstra_t *)dijkstra;
char *buffer;
if (dij) {
FILE *f = fopen(filename, "w");
if (f) {
int numberOfNodes = dij->num_nodes;
long bufferLength = numberOfNodes * (9 * 3 + 3);
buffer = (char *)malloc(bufferLength + 1);
long bufferCounter = 0;
int number;
int counter;
int digits;
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
buffer[bufferCounter++] = '\n';
for (int i = 1; i < numberOfNodes; i++) {
const node_t *const node = &(dij->nodes[i]);
int len = getNumberOfDigits(i) + 1;
gwf_i2a((char *)&buffer[bufferCounter], i, len);
bufferCounter += len;
buffer[bufferCounter++] = ' ';
number = node->cost;
if (number != -1) {
len = getNumberOfDigitsBig(number) + 1;
gwf_i2a((char *)&buffer[bufferCounter], number, len);
bufferCounter += len;
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = ' ';
number = node->parent;
if (number != -1) {
digits = getNumberOfDigitsBig(number);
counter = bufferCounter;
do {
buffer[counter + digits] = ZERO + number % 10;
--digits;
++bufferCounter;
} while (number /= 10);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = '\n';
}
fwrite(buffer, 1, bufferCounter, f);
ret = fclose(f) == 0;
free(buffer);
}
}
}
void original() {
clock_t t;
t = clock();
save("bogus.txt", data);
t = clock() - t;
std::cout << "original: " << t << " clicks (" << ((float)t) / CLOCKS_PER_SEC
<< " seconds)." << std::endl;
}
void new_test() {
clock_t t;
t = clock();
new_save("new_bogus.txt", data);
t = clock() - t;
std::cout << "NEW: " << t << " clicks (" << ((float)t) / CLOCKS_PER_SEC
<< " seconds)." << std::endl;
}
int main(int argc, char **argv) {
original();
new_test();
return 0;
}
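
In the listing above, getNumberOfDigits and getNumberOfDigitsBig return the digit count minus one, which is why the call sites add 1 before calling gwf_i2a. A minimal sketch of the same fixed-width technique with a helper that returns the actual count; count_digits, put_digits and append_int are hypothetical names, and only non-negative values are assumed.

```c
#include <stddef.h>

/* Number of decimal digits in a non-negative int (1 for 0..9). */
int count_digits(int x)
{
    int n = 1;
    while (x >= 10) {
        x /= 10;
        n++;
    }
    return n;
}

/* Write exactly `len` decimal digits of `value` into dst[0..len-1],
   filling from the right. No terminator is written. */
void put_digits(char *dst, int value, int len)
{
    char *end = dst + len;
    while (len-- > 0) {
        *--end = (char)('0' + value % 10);
        value /= 10;
    }
}

/* Append `value` to buf at offset `pos`; returns the new offset. */
size_t append_int(char *buf, size_t pos, int value)
{
    int len = count_digits(value);
    put_digits(buf + pos, value, len);
    return pos + (size_t)len;
}
```

The loop in count_digits is simpler than the nested comparisons but does a division per digit, so whether it is faster is something to benchmark on a realistically large graph.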

[Rewritten on 2018-01-13.]
Standard I/O (printf() etc.) is indeed comparatively slow in converting numerical data to text form. Here, the problem is to output lines of form
<node> <cost> <parent>
where all three are either unsigned (32-bit) integers in decimal notation, or -1. For simplicity, let's reserve the value UINT32_MAX (4294967295) for -1.
I suggest a two-fold approach:
Construct each record from right to left. This avoids the need of checking how many digits there are in a number.
Buffer a number of records at once. This reduces the number of fwrite() calls, at the cost of a modest dynamically allocated buffer.
Note that this means that the records in each chunk must be processed last-to-first, in order to retain the correct order.
Consider the following code. Note that I've reduced the definitions of node_t and dijkstra_t to the fields that are actually used, so that the following example can be compiled as-is. Also note that instead of -1 for parent or cost, one must use UINT32_MAX, as their types are now uint32_t.
#include <stdlib.h>
#include <stdint.h>
#include <limits.h>
#include <stdio.h>
typedef struct {
uint32_t parent; /* Use UINT32_MAX for -1 */
uint32_t cost; /* Use UINT32_MAX for -1 */
} node_t;
typedef struct {
node_t *nodes;
uint32_t num_nodes;
} dijkstra_t;
/* This function will store an unsigned 32-bit value
in decimal form, ending at 'end'.
UINT32_MAX will be written as "-1", however.
Returns a pointer to the start of the value.
*/
static inline char *prepend_value(char *end, uint32_t value)
{
if (value == UINT32_MAX) {
*(--end) = '1';
*(--end) = '-';
} else {
do {
*(--end) = '0' + (value % 10u);
value /= 10u;
} while (value);
}
return end;
}
/* Each record consists of three unsigned 32-bit integers,
each at most 10 characters, with spaces in between
and a newline at end. Thus, at most 33 characters. */
#define RECORD_MAXLEN 33
/* We process records in chunks of 16384.
Maximum number of records (nodes) is 2**32 - 2 - RECORD_CHUNK,
or 4,294,950,910 in this case. */
#define RECORD_CHUNK 16384
/* Each chunk of record is up to CHUNK_CHARS long.
(Roughly half a megabyte in this case.) */
#define CHUNK_CHARS (RECORD_MAXLEN * RECORD_CHUNK)
/* Save the edges in a graph to a stream.
Returns 0 if success, -1 if an error occurs.
*/
int save_edges(dijkstra_t *dij, FILE *out)
{
if (dij && out && !ferror(out)) {
const uint32_t nodes = dij->num_nodes;
const node_t *node = dij->nodes;
const uint32_t root_parent = dij->nodes->parent;
const uint32_t root_cost = dij->nodes->cost;
char *buf, *end, *ptr;
uint32_t o;
/* Allocate memory for the chunk buffer. */
buf = malloc(CHUNK_CHARS);
if (!buf)
return -1;
end = buf + CHUNK_CHARS;
/* Temporarily, we reset the root node parent
to UINT32_MAX and cost to 0, so that the
very first record in the output is "0 0 -1". */
dij->nodes->cost = 0;
dij->nodes->parent = UINT32_MAX;
for (o = 0; o < nodes; o += RECORD_CHUNK) {
uint32_t i = (o + RECORD_CHUNK < nodes) ? o + RECORD_CHUNK : nodes;
/* Fill buffer back-to-front. */
ptr = end;
while (i-->o) {
const node_t *curr = node + i;
/* Format: <i> ' ' <cost> ' ' <parent> '\n' */
/* We construct the record from right to left. */
*(--ptr) = '\n';
ptr = prepend_value(ptr, curr->parent);
*(--ptr) = ' ';
ptr = prepend_value(ptr, curr->cost);
*(--ptr) = ' ';
ptr = prepend_value(ptr, i);
}
/* Write the chunk buffer out. */
if (fwrite(ptr, 1, (size_t)(end - ptr), out) != (size_t)(end - ptr)) {
dij->nodes->cost = root_cost;
dij->nodes->parent = root_parent;
free(buf);
return -1;
}
}
/* Reset root node, and free the buffer. */
dij->nodes->cost = root_cost;
dij->nodes->parent = root_parent;
free(buf);
/* Check for write errors. */
if (fflush(out))
return -1;
if (ferror(out))
return -1;
/* Success. */
return 0;
}
return -1;
}
Additional speedup is possible if we can use POSIX low-level I/O (open() from <fcntl.h>, close() and write() from <unistd.h>, and fstat() from <sys/stat.h>). When the destination is a pipe or device, we can just directly write the data; when the destination is a file, we should write in chunks of multiples of st_blksize, to avoid read-modify-write cycles. Unlike standard I/O, with low-level I/O we can do that with just one "overflow" buffer of st_blksize, without having to copy the entire chunk buffer around in memory. However, since the question is not tagged posix, I shall refrain from further discussion along those lines.
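For reference, a minimal sketch of the basic building block such a POSIX version would need: a loop around write() that copes with partial writes and signal interruptions. The name write_all() is hypothetical, and the sketch deliberately leaves out the st_blksize alignment described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all 'len' bytes of 'data' to descriptor 'fd'.
   Returns 0 on success, or -1 with errno set on failure. */
static int write_all(int fd, const void *data, size_t len)
{
    const char *ptr = data;
    const char *const end = ptr + len;

    assert(data != NULL || len == 0);

    while (ptr < end) {
        ssize_t n = write(fd, ptr, (size_t)(end - ptr));
        if (n > 0) {
            ptr += n;                   /* partial write: continue with the rest */
        } else if (n == -1 && errno == EINTR) {
            continue;                   /* interrupted by a signal: retry */
        } else {
            if (n != -1)
                errno = EIO;            /* nothing written, no error reported */
            return -1;
        }
    }
    return 0;
}
```

A call such as write_all(STDOUT_FILENO, ptr, (size_t)(end - ptr)) could then stand in for the fwrite() call in the chunk loop, assuming nothing else writes to the same stream through standard I/O.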
OP stated their own version is still faster. I found that difficult to believe, because it does much more work than my version above. When I checked, on my machine a large dataset (of, say, 100,000,000 nodes) cannot be written in a single fwrite() call, as it only does a partial write; a loop is required to actually write the entire dataset. Therefore, in my opinion, the benchmark OP uses to compare the different versions is very suspect.
Consider the following microbenchmark instead. It generates a singly linked list, and uses an externally compiled save_graph() function to output it (to standard output). There are three versions implemented: null, which does not save anything at all; antonkretov, for OP's implementation (adapted to work here); and nominalanimal, for mine.
Makefile:
CC := gcc
CFLAGS := -std=c99 -O2 -Wall
LDFLAGS :=
BINS := test-null test-antonkretov test-nominalanimal
NODES := 100000000
.PHONY: all clean run
all: clean $(BINS)
clean:
    rm -f $(BINS) *.o
%.o: %.c
    $(CC) $(CFLAGS) -c $^
test-null: main.o data-null.o
    $(CC) $(CFLAGS) $^ -o $@
test-antonkretov: main.o data-antonkretov.o
    $(CC) $(CFLAGS) $^ -o $@
test-nominalanimal: main.o data-nominalanimal.o
    $(CC) $(CFLAGS) $^ -o $@
run: $(BINS)
    @echo "Testing $(NODES) nodes."
    @./test-null $(NODES) > /dev/null
    @echo "Overhead (nothing saved):"
    @bash -c 'time ./test-null $(NODES) > /dev/null'
    @echo ""
    @echo "Anton Kretov:"
    @bash -c 'time ./test-antonkretov $(NODES) > /dev/null'
    @echo ""
    @echo "Nominal Animal:"
    @bash -c 'time ./test-nominalanimal $(NODES) > /dev/null'
    @echo ""
Note that this forum converts Tabs to spaces, and Makefile format requires recipe lines to be indented with Tabs, so if you copy and paste the above to a file, you need to run e.g. sed -e 's|^  *|\t|' -i Makefile to fix it.
data.h:
#ifndef DATA_H
#define DATA_H
#include <stdint.h>
#include <limits.h>
#include <stdio.h>
#define INVALID_COST UINT32_MAX
#define INVALID_PARENT UINT32_MAX
typedef struct {
uint32_t parent; /* Use INVALID_PARENT for -1 */
uint32_t cost; /* Use INVALID_COST for -1 */
} node_t;
typedef struct {
node_t *nodes;
uint32_t num_nodes;
} dijkstra_t;
int save_graph(dijkstra_t *, FILE *);
#endif /* DATA_H */
data-null.c, for measuring runtime overhead:
#include "data.h"
int save_graph(dijkstra_t *dij, FILE *out)
{
/* Does not do anything */
return 0;
}
data-antonkretov.c, version of OP's save routine, for comparison:
#include <stdlib.h>
#include "data.h"
int getNumberOfDigits(uint32_t x)
{
if (x >= 10000) {
if (x >= 10000000) {
if (x >= 100000000) {
if (x >= 1000000000)
return 9;
return 8;
}
return 7;
}
if (x >= 100000) {
if (x >= 1000000)
return 6;
return 5;
}
return 4;
}
if (x >= 100) {
if (x >= 1000)
return 3;
return 2;
}
if (x >= 10)
return 1;
return 0;
}
int save_graph(dijkstra_t *dij, FILE *out)
{
uint32_t numberOfNodes = dij->num_nodes;
size_t bufferLength = numberOfNodes * (size_t)33;
size_t bufferCounter = 0, counter;
size_t bytes;
uint32_t number, digits, i;
char *buffer;
if ((size_t)(bufferLength / 33) != numberOfNodes)
return -1;
buffer = malloc(bufferLength);
if (!buffer)
return -1;
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '0';
buffer[bufferCounter++] = ' ';
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
buffer[bufferCounter++] = '\n';
for (i = 1; i < numberOfNodes; i++) {
const node_t *const node = dij->nodes + i;
number = i;
digits = getNumberOfDigits(number);
counter = bufferCounter;
do {
buffer[counter + digits] = '0' + (number % 10u);
--digits;
++bufferCounter;
} while (number /= 10u);
buffer[bufferCounter++] = ' ';
number = node->cost;
if (number != UINT32_MAX) {
digits = getNumberOfDigits(number);
counter = bufferCounter;
do {
buffer[counter + digits] = '0' + (number % 10u);
--digits;
++bufferCounter;
} while (number /= 10u);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = ' ';
number = node->parent;
if (number != UINT32_MAX) {
digits = getNumberOfDigits(number);
counter = bufferCounter;
do {
buffer[counter + digits] = '0' + (number % 10u);
--digits;
++bufferCounter;
} while (number /= 10u);
} else {
buffer[bufferCounter++] = '-';
buffer[bufferCounter++] = '1';
}
buffer[bufferCounter++] = '\n';
}
counter = 0;
while (counter < bufferCounter) {
bytes = fwrite(buffer + counter, 1, bufferCounter - counter, out);
if (!bytes) {
free(buffer);
return -1;
}
counter += bytes;
}
free(buffer);
return 0;
}
data-nominalanimal.c, my chunked back-to-front version of the save routine:
#include <stdlib.h>
#include "data.h"
/* This function will store an unsigned 32-bit value
in decimal form, ending at 'end'.
UINT32_MAX will be written as "-1", however.
Returns a pointer to the start of the value.
*/
static inline char *prepend_value(char *end, uint32_t value)
{
if (value == UINT32_MAX) {
*(--end) = '1';
*(--end) = '-';
} else {
do {
*(--end) = '0' + (value % 10u);
value /= 10u;
} while (value);
}
return end;
}
/* Each record consists of three unsigned 32-bit integers,
each at most 10 characters, with spaces in between
and a newline at end. Thus, at most 33 characters. */
#define RECORD_MAXLEN 33
/* We process records in chunks of 16384.
Maximum number of records (nodes) is 2**32 - 2 - RECORD_CHUNK,
or 4,294,950,910 in this case. */
#define RECORD_CHUNK 16384
/* Each chunk of record is up to CHUNK_CHARS long.
(Roughly half a megabyte in this case.) */
#define CHUNK_CHARS (RECORD_MAXLEN * RECORD_CHUNK)
/* Save the edges in a graph to a stream.
Returns 0 if success, -1 if an error occurs.
*/
int save_graph(dijkstra_t *dij, FILE *out)
{
if (dij && out && !ferror(out)) {
const uint32_t nodes = dij->num_nodes;
const node_t *node = dij->nodes;
const uint32_t root_parent = dij->nodes->parent;
const uint32_t root_cost = dij->nodes->cost;
char *buf, *end, *ptr;
uint32_t o;
/* Allocate memory for the chunk buffer. */
buf = malloc(CHUNK_CHARS);
if (!buf)
return -1;
end = buf + CHUNK_CHARS;
/* Temporarily, we reset the root node parent
to UINT32_MAX and cost to 0, so that the
very first record in the output is "0 0 -1". */
dij->nodes->cost = 0;
dij->nodes->parent = UINT32_MAX;
for (o = 0; o < nodes; o += RECORD_CHUNK) {
uint32_t i = (o + RECORD_CHUNK < nodes) ? o + RECORD_CHUNK : nodes;
/* Fill buffer back-to-front. */
ptr = end;
while (i-->o) {
const node_t *curr = node + i;
/* Format: <i> ' ' <cost> ' ' <parent> '\n' */
/* We construct the record from right to left. */
*(--ptr) = '\n';
ptr = prepend_value(ptr, curr->parent);
*(--ptr) = ' ';
ptr = prepend_value(ptr, curr->cost);
*(--ptr) = ' ';
ptr = prepend_value(ptr, i);
}
/* Write buffer. */
if (fwrite(ptr, 1, (size_t)(end - ptr), out) != (size_t)(end - ptr)) {
dij->nodes->cost = root_cost;
dij->nodes->parent = root_parent;
free(buf);
return -1;
}
}
/* Reset root node, and free the buffer. */
dij->nodes->cost = root_cost;
dij->nodes->parent = root_parent;
free(buf);
if (fflush(out))
return -1;
if (ferror(out))
return -1;
return 0;
}
return -1;
}
and finally the main program itself, main.c, that generates the data and calls the save_graph() functions:
#include <stdlib.h>
#include <inttypes.h>
#include <limits.h>
#include <string.h>
#include "data.h"
#define EDGES_MAX 4294901759
int main(int argc, char *argv[])
{
dijkstra_t graph;
size_t bytes;
uint32_t edges, i;
char dummy;
if (argc != 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\nUsage: %s EDGES\n\n", argv[0]);
return EXIT_SUCCESS;
}
if (sscanf(argv[1], " %" SCNu32 " %c", &edges, &dummy) != 1 || edges < 1 || edges > EDGES_MAX) {
fprintf(stderr, "%s: Invalid number of edges.\n", argv[1]);
return EXIT_FAILURE;
}
bytes = (1 + (size_t)edges) * sizeof graph.nodes[0];
if ((size_t)(bytes / (1 + (size_t)edges)) != sizeof graph.nodes[0]) {
fprintf(stderr, "%s: Too many edges.\n", argv[1]);
return EXIT_FAILURE;
}
graph.num_nodes = edges + 1;
graph.nodes = malloc(bytes);
if (!graph.nodes) {
fprintf(stderr, "%s: Too many edges: out of memory.\n", argv[1]);
return EXIT_FAILURE;
}
/* Generate a graph; no randomness, to keep timing steady. */
graph.nodes[0].parent = INVALID_PARENT;
graph.nodes[0].cost = 0;
for (i = 1; i <= edges; i++) {
graph.nodes[i].parent = i - 1;
graph.nodes[i].cost = 1 + (i % 10);
}
/* Print graph. */
if (save_graph(&graph, stdout)) {
fprintf(stderr, "Write error!\n");
return EXIT_FAILURE;
}
/* Done. */
return EXIT_SUCCESS;
}
Running make clean run (or make NODES=100000000 clean run) recompiles the benchmarks, and measures their run time, for a graph with 100,000,000 nodes. On my machine, the output is
Testing 100000000 nodes.
Overhead (nothing saved):
real    0m0.514s
user    0m0.297s
sys     0m0.217s
Anton Kretov:
real    0m4.059s
user    0m3.379s
sys     0m0.680s
Nominal Animal:
real    0m3.336s
user    0m3.151s
sys     0m0.184s
which shows that mine is significantly faster. If we ignore the overhead (of generating the graph), mine took about 2.8 seconds of real time to save the data to /dev/null, whereas OP's took about 3.5 seconds. In other words, mine shows a 20% speed improvement.
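The 20% figure follows from the wall-clock times above once the graph-generation overhead is subtracted from both runs. A minimal sketch of that arithmetic, with relative_improvement() being a hypothetical helper name:

```c
#include <assert.h>

/* Hypothetical helper: fraction of run time saved by 'candidate'
   relative to 'baseline', after subtracting the shared 'overhead'
   (all in seconds). */
static double relative_improvement(double baseline, double candidate,
                                   double overhead)
{
    const double base = baseline - overhead;   /* 4.059 - 0.514 = 3.545 */
    const double cand = candidate - overhead;  /* 3.336 - 0.514 = 2.822 */

    assert(base > 0.0);

    return (base - cand) / base;               /* 0.723 / 3.545, about 0.204 */
}
```

Calling relative_improvement(4.059, 3.336, 0.514) reproduces the roughly 20% improvement quoted above.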
It is important to note that both tests do produce the exact same output. For example, both ./test-nominalanimal 100000000 | sha256sum - and ./test-antonkretov 100000000 | sha256sum - show the exact same SHA256 checksums, 7504a1c97167701297c03c4aab8b0f20c5cac82a50128074d6e09c474353d0f8.
(You can also save the output to a file, and compare them; both are exactly 1,987,777,795 bytes long, and contain the exact same data. I did check.)
If you want to run a benchmark that stores the data to storage, for the comparisons to be fair, you need to start with cold caches. Otherwise the order in which you run the benchmarks will heavily impact their timings.

C login program that appends to csv file

I'm trying to make a C login program that verifies whether the user input matches values in a CSV file. If the user input matches values in the file, then the program will allow the user to log in; otherwise, the user is redirected to an error HTML page.
The CSV file has a format like: Full Name,username,password. I'm having trouble trying to extract only the username and password to match and ignore the full name. Can anyone give me some hints on how to get started?

The CSV file format allows values to be placed in double quotes as follows:
REFERENCE: http://en.wikipedia.org/wiki/Comma-separated_values
Adjacent fields must be separated by a single comma. However, "CSV" formats
vary greatly in this choice of separator character. In particular, in locales
where the comma is used as a decimal separator, semicolon, TAB, or other
characters are used instead.
1997,Ford,E350
Any field may be quoted (that is, enclosed within double-quote characters).
Some fields must be quoted, as specified in following rules.
"1997","Ford","E350"
Fields with embedded commas or double-quote characters must be quoted.
1997,Ford,E350,"Super, luxurious truck"
Each of the embedded double-quote characters must be represented by a pair
of double-quote characters.
1997,Ford,E350,"Super, ""luxurious"" truck"
Fields with embedded line breaks MUST be quoted (however, many CSV
implementations simply do not support this).
1997,Ford,E350,"Go get one now
they are going fast"
I wrote and used the following code to parse a CSV file with quotes. I wrote and used csv_getFieldN to retrieve the ZERO-BASED NTH field on a line.
However, this code will NOT parse/ support a file with a field that breaks into the next line, even though that is a legal part of the CSV format defined on the WIKIPEDIA site.
#include <string.h> /* strlen(), NULL */
#define QUOTE_DBL_CHAR '"'
#define COMMA_CHAR ','
#define NULL_CHAR 0x00
/******************************************************************************
* csv_countDQ()
*
*****************************************************************************/
int csv_countDQ(char* data)
{
unsigned int i;
int returnValue;
if (data == NULL) {
return 0;
}
returnValue = 0;
for (i = 0; i < strlen(data); i++) {
if (data[i] == QUOTE_DBL_CHAR) {
returnValue = returnValue + 1;
}
else {
break;
}
}
return returnValue;
}
/******************************************************************************
* csv_processQuotes()
*
*****************************************************************************/
int csv_processQuotes(char* line, int* out_insideQuote, int start_insideQuote,
char* out_quoteData, int* out_quoteDataLen)
{
int countDQ;
int i;
if ((line == NULL) || (out_insideQuote == NULL) ||
(out_quoteData == NULL) || (out_quoteDataLen == NULL)) {
return -1;
}
if ((start_insideQuote != 0) && (start_insideQuote != 1)) {
return -1;
}
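countDQ = csv_countDQ(line);
/* An even run of quotes leaves the state unchanged; without this default,
   callers would read an uninitialized value in that case. */
(*out_insideQuote) = start_insideQuote;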
if ((countDQ % 2) == 1) {
if (start_insideQuote == 0) {
(*out_insideQuote)=1;
}
else {
(*out_insideQuote)=0;
}
}
(*out_quoteDataLen) = 0;
for (i = 0; i < (countDQ / 2); i++) {
out_quoteData[(*out_quoteDataLen)] = QUOTE_DBL_CHAR;
(*out_quoteDataLen) = (*out_quoteDataLen) + 1;
}
out_quoteData[(*out_quoteDataLen)] = NULL_CHAR;
return 0;
}
/******************************************************************************
* csv_getNumFields()
*
*****************************************************************************/
int csv_getNumFields(char* line)
{
int currentIdxField;
unsigned int i;
int insideQuote;
int NEW_insideQuote;
int consecutiveDQCount;
int result;
char quoteData[1024];
int quoteDataLen;
/**************************************************************************
*
*************************************************************************/
if (line == NULL) {
return -1;
}
currentIdxField = 0;
insideQuote = 0;
for (i = 0; i < strlen(line); i++) {
/**********************************************************************
*
*********************************************************************/
if (line[i] == QUOTE_DBL_CHAR) {
consecutiveDQCount = csv_countDQ(&line[i]);
result = csv_processQuotes(&line[i], &NEW_insideQuote, insideQuote,
quoteData, &quoteDataLen);
insideQuote = NEW_insideQuote;
if (consecutiveDQCount >= 1) {
i = i + (consecutiveDQCount-1);
}
continue;
}
/**********************************************************************
*
*********************************************************************/
if ((line[i] == COMMA_CHAR) && (insideQuote == 0)) {
currentIdxField = currentIdxField+1;
}
}
return currentIdxField+1;
}
/******************************************************************************
* csv_getFieldN()
*
*****************************************************************************/
int csv_getFieldN(char* line, char* outField, int idxField)
{
int currentIdxField;
unsigned int i;
int insideQuote;
int NEW_insideQuote;
int charIDX_outField;
int consecutiveDQCount;
int result;
char quoteData[1024];
int quoteDataLen;
int j;
/**************************************************************************
*
*************************************************************************/
if (line == NULL) {
return -1;
}
currentIdxField = 0;
insideQuote = 0;
charIDX_outField = 0;
for (i = 0; i < strlen(line); i++) {
/**********************************************************************
*
*********************************************************************/
if (line[i] == QUOTE_DBL_CHAR) {
consecutiveDQCount = csv_countDQ(&line[i]);
result = csv_processQuotes(&line[i], &NEW_insideQuote, insideQuote,
quoteData, &quoteDataLen);
insideQuote = NEW_insideQuote;
if (currentIdxField == idxField) {
for (j = 0; j < quoteDataLen; j++) {
outField[charIDX_outField] = quoteData[j];
charIDX_outField = charIDX_outField + 1;
}
}
if (consecutiveDQCount >= 1) {
i = i + (consecutiveDQCount-1);
}
continue;
}
/**********************************************************************
*
*********************************************************************/
if ((line[i] == COMMA_CHAR) && (insideQuote == 0)) {
currentIdxField = currentIdxField+1;
}
else {
if (currentIdxField == idxField) {
outField[charIDX_outField] = line[i];
charIDX_outField = charIDX_outField + 1;
}
}
}
outField[charIDX_outField] = NULL_CHAR;
return 0;
}
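If the file in the question never contains quoted fields or embedded commas, a much simpler approach may be enough. The following is a minimal sketch under that assumption; line_matches() is a hypothetical helper name, and the line is expected in the form Full Name,username,password with an optional trailing newline.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: check one "Full Name,username,password" line
   against the supplied credentials. Assumes no quoted fields and no
   embedded commas. Returns 1 on a match, 0 otherwise. */
static int line_matches(const char *line, const char *user, const char *pass)
{
    const char *c1, *c2;
    size_t ulen, plen;

    assert(line && user && pass);

    c1 = strchr(line, ',');             /* comma after the full name */
    if (!c1)
        return 0;
    c2 = strchr(c1 + 1, ',');           /* comma after the username */
    if (!c2)
        return 0;

    ulen = (size_t)(c2 - (c1 + 1));     /* username length */
    plen = strcspn(c2 + 1, "\r\n");     /* password runs to end of line */

    return ulen == strlen(user) && memcmp(c1 + 1, user, ulen) == 0
        && plen == strlen(pass) && memcmp(c2 + 1, pass, plen) == 0;
}
```

Each line read with fgets() can be passed to this function; the full name is skipped simply by starting the comparison after the first comma.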

Reading a fixed length line of integers from a file (bu

I'm trying to read a file of integers line by line (each line is 16 digits long, but the number of lines is unknown), storing each line into an integer array before adding it to a linked list. This is what I have started off with, but I'm struggling to extend this to read multiple lines (as I don't know how many there will be). Is there a better way to achieve what I want than the current method?
int deck[16];
while (fgetc(f) != EOF) {
for (int i = 0; i < DECK_SIZE; i++) {
c = fgetc(f);
tempNum = c - '0';
if(tempNum < MIN_CARD || tempNum > MAX_CARD) {
failed = 1;
}
deck[i] = (int)tempNum;
}
if(fgetc(f) != '\n') { // file too long
failed = 1;
}
//CREATE A NODE, ADD IT TO THE LINKED LIST
}
Example contents of a file (1): (each line holds 16 numbers)
1234567891234564
9876543211234233
Example contents of a file (2): (each line holds 16 numbers)
1234567891234562
9876543211234233
2354365457658674
3634645756858665

#include <stdio.h>
enum {
DECK_SIZE = 16
};
int main()
{
char deck[DECK_SIZE + 2];
while (fgets(deck, sizeof(deck), stdin)) {
char* ch;
for (ch = deck; *ch; ++ch) {
*ch -= '0';
}
if (ch - deck!= DECK_SIZE + 1) {
return 1;
}
// CREATE A NODE, ADD IT TO THE LINKED LIST
}
return 0;
}

int deck[DECK_SIZE];
int failed = 0;
int c, tempNum, i=0;
while((c = fgetc(f)) != EOF){
if(i < DECK_SIZE){
tempNum = c - '0';
if(tempNum < MIN_CARD || tempNum > MAX_CARD){
failed = 1;
fprintf(stderr, "out of range\n");
} else {
deck[i++] = tempNum;
}
} else {
if(c != '\n'){
failed = 1;
fprintf(stderr, "invalid format\n");
} else {
i = 0;
//CREATE A NODE, ADD IT TO THE LINKED LIST
}
}
if(failed){
fprintf(stderr, "exit\n");
exit(EXIT_FAILURE);
}
}
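To cover the "add it to the linked list" step that the snippets above leave as a comment, here is a minimal sketch that validates one line and appends it to a singly linked list. The names deck_node, parse_deck() and append_deck() are hypothetical, and the values MIN_CARD = 1 and MAX_CARD = 9 are an assumption, since the question does not show them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* MIN_CARD and MAX_CARD are assumed values; the question does not give them. */
enum { DECK_SIZE = 16, MIN_CARD = 1, MAX_CARD = 9 };

typedef struct deck_node {
    int cards[DECK_SIZE];
    struct deck_node *next;
} deck_node;

/* Parse one line of exactly DECK_SIZE digits, with an optional
   trailing newline. Returns 0 on success, -1 if the line is invalid. */
static int parse_deck(const char *line, int cards[DECK_SIZE])
{
    int i;

    assert(line && cards);

    if (strcspn(line, "\r\n") != DECK_SIZE)
        return -1;                      /* too short or too long */
    for (i = 0; i < DECK_SIZE; i++) {
        const int value = line[i] - '0';
        if (value < MIN_CARD || value > MAX_CARD)
            return -1;                  /* not a valid card digit */
        cards[i] = value;
    }
    return 0;
}

/* Append a copy of 'cards' to the list. 'tail' points to the pointer
   that should receive the new node (initially the head pointer).
   Returns the new tail position, or NULL if allocation fails. */
static deck_node **append_deck(deck_node **tail, const int cards[DECK_SIZE])
{
    deck_node *node = malloc(sizeof *node);
    if (!node)
        return NULL;
    memcpy(node->cards, cards, sizeof node->cards);
    node->next = NULL;
    *tail = node;
    return &node->next;
}
```

In the reading loop, each line obtained with fgets() into a char buffer of DECK_SIZE + 2 bytes would go through parse_deck() and then append_deck(), so the number of lines does not need to be known in advance.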

Reading line by line in Arduino IDE - file

The File is a Stream and has all of the methods that a Stream does, including readBytesUntil and readStringUntil.
