C Program to Convert a Text File into a CSV File

I need to convert a text file into a CSV file in C. The input text file is formatted as follows:
JACK Maria Stephan Nora
20 34 45 28
London NewYork Toronto Berlin
The output CSV file should look like:
Jack,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
The following code is what I tried so far:
void load_and_convert(const char* filename){
FILE *fp1, *fp2;
char ch;
fp1=fopen(filename,"r");
fp2=fopen("output.csv","w");
for(int i=0;i<1000;i++){
ch=fgetc(fp1);
fprintf(fp2,"%c",ch);
if(ch==' '|| ch=='\n')
fprintf(fp2,"%c,\n",ch);
}
fclose(fp1);
fclose(fp2);
}
The output from my code looks like:
Jack,
Maria,
Stephan,
Nora,
20,
34,
45,
28,
London,
NewYork,
Toronto,
Berlin,
How should I modify my code to make it work correctly?
What is the general approach to this kind of problem?

Since I have some time, here is a working solution (I tried to keep it as clean as I could):
#define _POSIX_C_SOURCE 200809L /* exposes getline and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_STRING_LENGTH 50
#define MAX_NUMBER_OF_PEOPLE 50
typedef struct
{
char name[MAX_STRING_LENGTH];
int age;
char city[MAX_STRING_LENGTH];
} Person;
void getName(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
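while(ptr != NULL && i < MAX_NUMBER_OF_PEOPLE)
{
/* strncpy does not nul-terminate on truncation, so terminate explicitly */
strncpy(people[i].name, ptr, MAX_STRING_LENGTH - 1);
people[i].name[MAX_STRING_LENGTH - 1] = '\0';
ptr = strtok(NULL, delim);
i++;
}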
}
void getAge(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
while(ptr != NULL && i < MAX_NUMBER_OF_PEOPLE)
{
people[i].age = atoi(ptr);
i++;
ptr = strtok(NULL, delim);
}
}
void getCity(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
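while(ptr != NULL && i < MAX_NUMBER_OF_PEOPLE)
{
/* strncpy does not nul-terminate on truncation, so terminate explicitly */
strncpy(people[i].city, ptr, MAX_STRING_LENGTH - 1);
people[i].city[MAX_STRING_LENGTH - 1] = '\0';
i++;
ptr = strtok(NULL, delim);
}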
}
int main(void)
{
Person somebody[MAX_NUMBER_OF_PEOPLE] = {0}; /* zeroed so the age == 0 end marker below is reliable */
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
int ln = 0;
fp = fopen("./test.txt", "r");
if (fp == NULL)
return -1;
// Read every line; assume the first line holds names, the second ages, the third cities
while ((read = getline(&line, &len, fp)) != -1) {
// remove the trailing newline character without reassigning the getline buffer
line[strcspn(line, "\n")] = '\0';
if (ln == 0) {
getName(line, " ", somebody);
} else if (ln == 1) {
getAge(line, " ", somebody);
} else {
getCity(line, " ", somebody);
}
ln++;
}
for (int j = 0; j < MAX_NUMBER_OF_PEOPLE; j++) {
if (somebody[j].age == 0)
break;
// no spaces after the commas, to match the requested CSV output
printf("%s,%d,%s\n", somebody[j].name, somebody[j].age, somebody[j].city);
}
fclose(fp);
if (line)
free(line);
return 0;
}

What you need to do is non-trivial if you want to hold all values in memory while you transform 3 rows of 4 fields each into 4 rows of 3 fields each. So when your data file contains:
Example Input File
$ cat dat/col2csv3x4.txt
JACK Maria Stephan Nora
20 34 45 28
London NewYork Toronto Berlin
You want to read each of the three lines and then transpose the columns into rows for .csv output. Meaning you will then end up with 4-rows of 3-csv fields each, e.g.
Expected Program Output
$ ./bin/transpose2csv < dat/col2csv3x4.txt
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
There is nothing difficult in doing it, but it takes meticulous attention to handling the memory storage of object and allocating/reallocating to handle the transformation between 3-rows with 4-pieces of data to 4-rows with 3-pieces of data.
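For the fixed 3-line case in the question, the transposition step on its own can be sketched without any dynamic allocation. The helpers nth_field and csv_row are hypothetical names introduced only for illustration, and the sketch assumes whitespace-separated fields that fit in the buffers shown:

```c
#include <stdio.h>
#include <string.h>

/* Copy the k-th (0-based) whitespace-separated field of line into out.
 * Returns 1 on success, 0 if the line has fewer than k+1 fields or the
 * field does not fit in outsz bytes. */
int nth_field (const char *line, size_t k, char *out, size_t outsz)
{
    const char *ws = " \t\r\n";

    for (;;) {
        line += strspn (line, ws);          /* skip leading whitespace */
        if (!*line)
            return 0;                       /* ran out of fields */
        size_t len = strcspn (line, ws);    /* length of this field */
        if (k == 0) {
            if (len + 1 > outsz)
                return 0;                   /* field too long for out */
            memcpy (out, line, len);
            out[len] = '\0';
            return 1;
        }
        line += len;                        /* advance past this field */
        k--;
    }
}

/* Build CSV row number col by taking field col from each of the nlines
 * input lines. Returns 1 on success, 0 when any line lacks that field
 * or the row does not fit in outsz bytes. */
int csv_row (const char *const lines[], size_t nlines, size_t col,
             char *out, size_t outsz)
{
    size_t used = 0;

    if (!outsz)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < nlines; i++) {
        char field[128];                    /* assumed max field size */
        if (!nth_field (lines[i], col, field, sizeof field))
            return 0;
        int n = snprintf (out + used, outsz - used, "%s%s",
                          i ? "," : "", field);
        if (n < 0 || (size_t)n >= outsz - used)
            return 0;                       /* row would be truncated */
        used += (size_t)n;
    }
    return 1;
}
```

Calling csv_row with col = 0, 1, 2, ... until it returns 0 yields one output row per column of the input. It rescans each line for every column, which is fine for small inputs but is one reason a general solution tracks read offsets instead.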
One approach is to read all original lines into a typical pointer-to-pointer-to-char setup, then transpose the columns into rows. Since there could conceivably be 100 rows with 500 fields next time, you will want to use indexes and counters to track your allocation and reallocation requirements, so the finished code can transpose any number of lines and fields into a fields-number of lines with as many values per row as there were original lines.
You can design your code to provide the transformation in two basic functions: the first to read and store the lines (say getlines), and the second to transpose those lines into a new pointer-to-pointer-to-char so it can be output as comma-separated values.
One way to approach these two functions would be similar to the following, which takes the filename to read as the first argument (or reads from stdin by default if no argument is given). The code isn't trivial, but it isn't difficult either. Just keep track of all your allocations, preserving a pointer to the beginning of each, so the memory may be freed when no longer needed, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NPTR 2
#define NWRD 128
#define MAXC 1024
/** getlines allocates all storage required to read all lines from file.
* the pointers are doubled each time reallocation is needed and then
* realloc'ed a final time to exactly size to the number of lines. all
* lines are stored with the exact memory required.
*/
char **getlines (size_t *n, FILE *fp)
{
size_t nptr = NPTR; /* tracks number of allocated pointers */
char buf[MAXC]; /* tmp buffer sufficient to hold each line */
char **lines = calloc (nptr, sizeof *lines);
if (!lines) { /* validate EVERY allocation */
perror ("calloc-lines");
return NULL;
}
*n = 0; /* pointer tracks no. of lines read */
rewind (fp); /* clears stream error state if set */
while (fgets (buf, MAXC, fp)) { /* read each line of input */
size_t len;
if (*n == nptr) { /* check/realloc ptrs if required */
void *tmp = realloc (lines, 2 * nptr * sizeof *lines);
if (!tmp) { /* validate reallocation */
perror ("realloc-tmp");
break;
}
lines = tmp; /* assign new block, (opt, zero new mem below) */
memset (lines + nptr, 0, nptr * sizeof *lines);
nptr *= 2; /* increment allocated pointer count */
}
buf[(len = strcspn(buf, "\r\n"))] = 0; /* get line, remove '\n' */
lines[*n] = malloc (len + 1); /* allocate for line */
if (!lines[*n]) { /* validate */
perror ("malloc-lines[*n]");
break;
}
memcpy (lines[(*n)++], buf, len + 1); /* copy to line[*n] */
}
if (!*n) { /* if no lines read */
free (lines); /* free pointers */
return NULL;
}
/* optional final realloc to free unused pointers */
void *tmp = realloc (lines, *n * sizeof *lines);
if (!tmp) {
perror ("final-realloc");
return lines;
}
return (lines = tmp); /* return ptr to exact no. of required ptrs */
}
/** free all pointers and n allocated arrays */
void freep2p (void *p2p, size_t n)
{
for (size_t i = 0; i < n; i++)
free (((char **)p2p)[i]);
free (p2p);
}
/** transpose a file of n rows and a varying number of fields to an
* allocated pointer-to-pointer to char structure with a fields number
* of rows and n csv values per row.
*/
char **transpose2csv (size_t *n, FILE *fp)
{
char **l = NULL, **t = NULL;
size_t csvl = 0, /* csv line count */
ncsv = 0, /* number of csv lines allocated */
nchr = MAXC, /* initial chars alloc for csv line */
*offset, /* array tracking read offsets in lines */
*used; /* array tracking write offset to csv lines */
if (!(l = getlines (n, fp))) { /* read all lines to l */
fputs ("error: getlines failed.\n", stderr);
return NULL;
}
ncsv = *n;
#ifdef DEBUG
for (size_t i = 0; i < *n; i++)
puts (l[i]);
#endif
if (!(t = malloc (ncsv * sizeof *t))) { /* alloc ncsv ptrs for csv */
perror ("malloc-t");
freep2p (l, *n); /* free everything else on failure */
return NULL;
}
for (size_t i = 0; i < ncsv; i++) /* alloc MAXC chars to csv ptrs */
if (!(t[i] = malloc (nchr * sizeof *t[i]))) {
perror ("malloc-t[i]");
while (i--) /* free everything else on failure */
free (t[i]);
free (t);
freep2p (l, *n);
return NULL;
}
if (!(offset = calloc (*n, sizeof *offset))) { /* alloc offsets array */
perror ("calloc-offsets");
freep2p (t, ncsv); /* free the csv rows too, not just the pointers */
freep2p (l, *n);
return NULL;
}
if (!(used = calloc (ncsv, sizeof *used))) { /* alloc used array */
perror ("calloc-used");
freep2p (t, ncsv); /* free the csv rows too, not just the pointers */
free (offset);
freep2p (l, *n);
return NULL;
}
for (;;) { /* loop continually transposing cols to csv rows */
for (size_t i = 0; i < *n; i++) { /* read next word from each line */
char word[NWRD]; /* tmp buffer for word */
int off; /* number of characters consumed in read */
/* field width 127 = NWRD - 1 keeps sscanf from overrunning word */
if (sscanf (l[i] + offset[i], "%127s%n", word, &off) != 1)
goto readdone; /* break nested loops on read failure */
size_t len = strlen (word); /* get word length */
offset[i] += off; /* increment read offset */
if (csvl == ncsv) { /* check/realloc new csv row as required */
size_t newsz = ncsv + 1; /* allocate +1 row over *n */
void *tmp = realloc (t, newsz * sizeof *t); /* realloc ptrs */
if (!tmp) {
perror ("realloc-t");
freep2p (t, ncsv);
t = NULL; /* don't return a dangling pointer */
csvl = 0;
goto readdone;
}
t = tmp;
t[ncsv] = NULL; /* set new pointer NULL */
/* allocate nchr chars to new pointer */
if (!(t[ncsv] = malloc (nchr * sizeof *t[ncsv]))) {
perror ("malloc-t[i]");
freep2p (t, ncsv); /* free rows and pointers on failure */
t = NULL;
csvl = 0;
goto readdone;
}
tmp = realloc (used, newsz * sizeof *used); /* realloc used */
if (!tmp) {
perror ("realloc-used");
freep2p (t, newsz); /* includes the row just added */
t = NULL;
csvl = 0;
goto readdone;
}
used = tmp;
used[ncsv] = 0;
ncsv++;
}
if (nchr - used[csvl] - 2 < len) { /* check word fits in line */
/* realloc t[i] if required (left for you) */
fputs ("realloc t[i] required.\n", stderr);
}
/* write word to csv line at end */
sprintf (t[csvl] + used[csvl], used[csvl] ? ",%s" : "%s", word);
t[csvl][used[csvl] ? used[csvl] + len + 1 : len] = 0;
used[csvl] += used[csvl] ? len + 1 : len;
}
csvl++;
}
readdone:;
freep2p (l, *n);
free (offset);
free (used);
*n = csvl;
return t;
}
int main (int argc, char **argv) {
char **t;
size_t n = 0;
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
if (!(t = transpose2csv (&n, fp))) {
fputs ("error: transpose2csv failed.\n", stderr);
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < n; i++)
if (t[i])
puts (t[i]);
freep2p (t, n);
return 0;
}
Example Use/Output
$ ./bin/transpose2csv < dat/col2csv3x4.txt
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.
$ valgrind ./bin/transpose2csv < dat/col2csv3x4.txt
==18604== Memcheck, a memory error detector
==18604== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==18604== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==18604== Command: ./bin/transpose2csv
==18604==
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
==18604==
==18604== HEAP SUMMARY:
==18604== in use at exit: 0 bytes in 0 blocks
==18604== total heap usage: 15 allocs, 15 frees, 4,371 bytes allocated
==18604==
==18604== All heap blocks were freed -- no leaks are possible
==18604==
==18604== For counts of detected and suppressed errors, rerun with: -v
==18604== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

Related

Valgrind: REALLOC Uninitialised value was created by a heap allocation

I have read and tried the solutions I found on Stack Overflow, but the problem is still not solved.
I'm trying to read a file line by line and dynamically realloc an array to hold the lines.
Valgrind reports the following errors:
Conditional jump or move depends on uninitialised value(s) (reported twice)
Uninitialised value was created by a heap allocation
The error is reported on this line: result = realloc(result, currLen * sizeof(char *));
void readFile(char *fileName) {
FILE *fp = NULL;
size_t len = 0;
int currLen = 2;
char **result = calloc(currLen, sizeof(char *));
fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
if (result == NULL)
exit(EXIT_FAILURE);
int i = 0;
while (getline(&(*(result + i)), &len, fp) != -1) {
if (i >= currLen - 1) {
currLen *= 2;
result = realloc(result, currLen * sizeof(char *));
}
++i;
}
fclose(fp);
for (int j = 0; j < currLen; ++j) {
free(*(result + j));
}
free(result);
result = NULL;
}
int main() {
readFile("");
exit(EXIT_SUCCESS);
}
In the original posted code, result undergoes an initial allocation via calloc, which zero-initializes the content therein, and being pointers, null-initializes. Later on, when expanding the sequence via realloc, no such affordance is taken. In effect if the original array looked like this:
[ NULL, NULL ]
and after adding two elements, looks like this:
[ addr1, addr2 ]
the realloc kicks in and gives you this :
[ addr1, addr2, ????, ???? ]
Adding salt to the wound, getline also requires the length argument to reflect the allocation size of the buffer the line pointer refers to. But you're carrying over the length from the prior loop iteration, so not only is the pointer wrong after the first expansion, the length is never correct after the first invocation of getline (which leads to your actual crash; the remaining problems just haven't surfaced yet).
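One way to avoid the uninitialised tail is to null the new slots right after a successful realloc. This is only a minimal sketch: grow_ptrs is a hypothetical helper name, and it assumes the caller always asks for more slots than it already has:

```c
#include <stdlib.h>

/* Grow an array of char pointers from old_n to new_n slots (assumes
 * new_n > old_n) and set the new slots to NULL. On failure the original
 * block is left untouched and NULL is returned, so the caller can still
 * free it. */
char **grow_ptrs (char **ptrs, size_t old_n, size_t new_n)
{
    char **tmp = realloc (ptrs, new_n * sizeof *tmp);

    if (!tmp)
        return NULL;                        /* ptrs is still valid */

    for (size_t i = old_n; i < new_n; i++)
        tmp[i] = NULL;                      /* realloc does not zero */
    return tmp;
}
```

With the new slots set to NULL, a later getline call on one of them starts from a well-defined pointer, provided its length argument is reset to 0 as well.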
Solving all of this
Use a separate pointer and length for each iteration,
Ensure they're properly initialized to null,0 before the getline call
If you read the line successfully, then expand the line pointer buffer.
Store the pointer, discard the length, and reset both to null,0 before the next iteration.
In practice, it looks like this:
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
char **readFile(const char *fileName, size_t *lines)
{
FILE *fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
// initially empty, no size or capacity
char **result = NULL;
size_t size = 0;
size_t capacity = 0;
size_t len = 0;
char *line = NULL;
while (getline(&line, &len, fp) != -1)
{
if (size == capacity)
{
size_t new_capacity = (capacity ? 2 * capacity : 1);
void *tmp = realloc(result, new_capacity * sizeof *result);
if (tmp == NULL)
{
perror("Failed to expand lines buffer");
exit(EXIT_FAILURE);
}
// recoup the expanded buffer and capacity
result = tmp;
capacity = new_capacity;
}
result[size++] = line;
// reset these to NULL,0. they trigger the fresh allocation
// and size storage on the next iteration.
line = NULL;
len = 0;
}
// if getline allocated a buffer on the failure case
// get rid of it (didn't see that coming).
if (line)
free(line);
fclose(fp);
*lines = size;
return result;
}
int main()
{
size_t count = 0;
char **lines = readFile("/usr/share/dict/words", &count);
if (lines)
{
for (size_t i = 0; i < count; ++i)
{
fputs(lines[i], stdout);
free(lines[i]);
}
free(lines);
}
return 0;
}
On a stock Linux/Mac system, /usr/share/dict/words contains about a quarter-million English words. On my stock Mac, it's 235886 (yours will vary). The caller gets the line pointers and the count, and is responsible for freeing the content therein.
Output
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
.... a ton of lines omitted ....
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton
Valgrind Summary
==17709==
==17709== HEAP SUMMARY:
==17709== in use at exit: 0 bytes in 0 blocks
==17709== total heap usage: 235,909 allocs, 235,909 frees, 32,506,328 bytes allocated
==17709==
==17709== All heap blocks were freed -- no leaks are possible
==17709==
Alternative: Let getline reuse its buffer
There is no guarantee the buffer getline allocates matches the line length efficiently. In fact, the only guarantee is, on successful execution, the function returns the number of chars including the delimiter (but not the terminator), and the memory holds that data. The actual allocation size could be considerably more than that, and that space is effectively wasted.
To demonstrate this, consider the following. The same code, but we do NOT reset the buffer on each loop, and rather than store its pointer directly, we store a strdup of the line. Note this only works if the line does not contain embedded null chars. This allows getline to reuse its buffer, and only expand if needed, for each read. We're responsible for making the actual copy of the line data (and we do using POSIX strdup). When executed there are still no leaks, but note the valgrind summary, specifically the number of bytes allocated in comparison to the number of bytes from the previous version above.
char **readFile(const char *fileName, size_t *lines)
{
FILE *fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
// initially empty, no size or capacity
char **result = NULL;
size_t size = 0;
size_t capacity = 0;
size_t len = 0;
char *line = NULL;
while (getline(&line, &len, fp) != -1)
{
if (size == capacity)
{
size_t new_capacity = (capacity ? 2 * capacity : 1);
void *tmp = realloc(result, new_capacity * sizeof *result);
if (tmp == NULL)
{
perror("Failed to expand lines buffer");
exit(EXIT_FAILURE);
}
// recoup the expanded buffer and capacity
result = tmp;
capacity = new_capacity;
}
// make copy here. let getline reuse 'line'
result[size++] = strdup(line);
}
// free whatever was left
if (line)
free(line);
fclose(fp);
*lines = size;
return result;
}
Valgrind Summary
==17898==
==17898== HEAP SUMMARY:
==17898== in use at exit: 0 bytes in 0 blocks
==17898== total heap usage: 235,909 allocs, 235,909 frees, 6,929,003 bytes allocated
==17898==
==17898== All heap blocks were freed -- no leaks are possible
==17898==
The number of allocations is the same (which tells us getline allocated a large enough buffer up front to never need expansion), but the actual total allocated space is considerably more efficient, as now we are storing strings in buffers allocated to match their length; not whatever getline stood up as a read buffer.
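A third option, which is not part of the two versions above and is offered only as a sketch, is to keep the buffer getline allocated but trim it to the line's actual length with realloc. shrink_line is a hypothetical helper, and used is assumed to be the character count getline returned:

```c
#include <stdlib.h>

/* Trim a heap buffer holding a string of `used` characters (plus the
 * terminating nul) down to exactly used + 1 bytes. If realloc fails,
 * the original, larger buffer is still valid, so it is returned as is. */
char *shrink_line (char *buf, size_t used)
{
    char *tmp = realloc (buf, used + 1);

    return tmp ? tmp : buf;
}
```

In the first version of readFile this would replace the plain pointer store, for example result[size++] = shrink_line(line, (size_t)nread), with nread holding getline's return value. Whether shrinking actually hands memory back to the allocator is implementation dependent, so the savings are not guaranteed to match the strdup version.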

Dynamic allocation isn't working in a loop

I'm trying to break a string str containing the bar symbol | into an array of strings(output), with the delimiter being |, which will not be included in the array of strings. There will be 20 elements in output. This code belongs to a function that will return output pointer, which is why I need dynamic allocation.
I'm trying to do this without using the sscanf() function.
Example: if str is "abc|def||jk" then this is what output should look like at the end (less than 20 elements for demonstration purpose):
output[0]=="abc"
output[1]=="def"
output[2]==""
output[3]=="jk"
However, I always get an error exit code, something like:
Process finished with exit code -1073740940 (0xC0000374)
When debugging I found out that the first string element is parsed correctly into output, but the second element is produced correctly sometimes, and other times I ran into trouble.
Code below:
char **output = (char**) calloc(20, 20*sizeof(char));
int begin = 0;
int end = 1;
// index that counts output element
int arrayIndex = 0;
int i;
for(i = 0; i < strlen(str); i++) {
end = i;
bool endOfString = false;
// there is no '|' symbol at the end of the string
if (*(str+i) == '|' || (endOfString = (i+1)==strlen(str))) {
end = endOfString ? (end+1):end;
// problem here. Assembly code poped up when debugging (see image below)
char *target = (char*) calloc(end-begin+1, sizeof(char));
// give proper value to target
strcpy(target, str+begin);
*(target+end-begin) = '\0';
*(output+arrayIndex) = target;
// increase proper indexes
begin = i + 1;
arrayIndex++;
}
}
Worst of all, I cannot debug it, because a window with assembly code pops up the instant I step over the calloc function when debugging.
I used gdb too, but it didn't work:
56 char *target = (char*) calloc(length, sizeof(char));
(gdb) n
warning: Critical error detected c0000374
Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffded8191f3 in ?? ()
(gdb) bt
#0 0x00007ffded8191f3 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) continue
Continuing.
gdb: unknown target exception 0xc0000374 at 0x7ffded819269
(edit) indexing was OK. The heap corruption itself (0xC0000374) comes from strcpy(target, str+begin): strcpy copies everything up to the terminating nul character, not just end-begin characters, so for every field except the last it writes past the end of target. A couple of other notes: sizeof(char) is always 1 -- just use 1. Further, there is no need to cast the return of malloc (or calloc); it is unnecessary. See: Do I cast the result of malloc?. Additionally, how many times do you call strlen(str)? (Hopefully optimization reduces the loop condition to a single call, but you could potentially be calling strlen(str) on every iteration.) The same goes for endOfString = (i+1)==strlen(str). Save the length of the string before entering the loop and then compare against the saved value.
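If you keep the index-based loop, copy only end-begin characters instead of calling strcpy on the rest of the string. A minimal sketch, where copy_field is a hypothetical helper and begin/end play the same role as in the question's loop (end is one past the last character of the field):

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of s[begin..end), or NULL on failure.
 * Exactly end - begin characters are copied, so the write never goes
 * past the end - begin + 1 bytes that were allocated. */
char *copy_field (const char *s, size_t begin, size_t end)
{
    size_t len = end - begin;
    char *target = malloc (len + 1);

    if (!target)
        return NULL;
    memcpy (target, s + begin, len);        /* copy the field only */
    target[len] = '\0';                     /* nul-terminate it */
    return target;
}
```

Inside the loop this would be used as output[arrayIndex] = copy_field(str, begin, end), followed by a NULL check.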
While you can count indexes as you have and loop character-by-character looking for delimiters, it is far more efficient to let strchr (pointer, '|') advance to the next delimiter (or return NULL indicating no more delimiters remain). Then rather than worrying about indexes, simply keep a pointer p and end-pointer ep to advance down the string, e.g.
#define NFIELDS 20
...
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
...
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
...
Adding appropriate error checking and returning the pointer to the allocated strings, you could do:
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
char **output = calloc (NFIELDS, sizeof *output); /* pointer to output */
if (!output) {
perror ("calloc-output");
return NULL;
}
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
perror ("malloc-output[n]");
while ((*n)--) /* free prior allocations on failure */
free (output[*n]);
free(output);
return NULL;
}
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
return output; /* return pointer to allocated strings */
}
In your case a complete short example would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NFIELDS 20
#define MAXC 1024
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
char **output = calloc (NFIELDS, sizeof *output); /* pointer to output */
if (!output) {
perror ("calloc-output");
return NULL;
}
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
perror ("malloc-output[n]");
while ((*n)--) /* free prior allocations on failure */
free (output[*n]);
free(output);
return NULL;
}
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
return output; /* return pointer to allocated strings */
}
int main (void) {
char buf[MAXC], /* buffer for input */
**output = NULL; /* pointer to split/allocated strings */
size_t n = 0; /* number of strings filled */
if (!fgets (buf, MAXC, stdin)) { /* validate input */
fputs ("error: invalid input.\n", stderr);
return 1;
}
buf[strcspn (buf, "\n")] = 0; /* trim newline from buf */
/* split buf into separate strings on '|' */
if (!(output = splitstr (buf, '|', &n))) {
fputs ("error: splitstr() failed.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting each & free */
printf ("output[%2zu]: %s\n", i, output[i]);
free (output[i]); /* free strings */
}
free (output); /* free pointers */
}
Example Use/Output
$ echo "abc|def||jk" | ./bin/splitstr
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.
$ echo "abc|def||jk" | valgrind ./bin/splitstr
==32024== Memcheck, a memory error detector
==32024== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32024== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==32024== Command: ./bin/splitstr
==32024==
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk
==32024==
==32024== HEAP SUMMARY:
==32024== in use at exit: 0 bytes in 0 blocks
==32024== total heap usage: 5 allocs, 5 frees, 172 bytes allocated
==32024==
==32024== All heap blocks were freed -- no leaks are possible
==32024==
==32024== For counts of detected and suppressed errors, rerun with: -v
==32024== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

String is truncated after allocating 2D array (Edited)

I am dynamically allocating the 2D array like this:
char ** inputs;
inputs = (char **) malloc(4 * sizeof(char));
After doing this I started having the problem with the string. I printed the string before and after allocating the 2D-array:
printf("%s\n", str);
char ** inputs;
inputs = (char **) malloc(4 * sizeof(char));
printf("%s\n", str);
But I get strange output:
before: input aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa with len 34
after: input aaaaaaaaaaaaaaaaaaaaaaaaaaaa with len 29
Why did the length change? I've searched through Stack Overflow and other websites but couldn't find a reasonable answer.
Here is all of my code:
int main(int argc, char const *argv[])
{
/* code */
mainProcess();
printf("\nEnd of the program\n");
return 0;
}
// Reading the input from the user
char * getInput(){
printf("Inside of the getInput\n");
char * result;
char * st;
char c;
result = malloc(4 * sizeof(char));
st = malloc(4 * sizeof(char));
// code goes here
printf("$ ");
while(3){
c = fgetc(stdin);
if(c == 10){
break;
}
printf("%c", c);
result[length] = c;
length++;
}
result[length] = '\0';
return result;
}
void mainProcess(){
char * input;
printf("Inside of Main process\n");
input = getInput();
printf("\nthis is input %s with len %d\n", input, strlen(input));
splitInput(input);
printf("\nthis is input %s with len %d\n", input, strlen(input));
}
char ** splitInput(const char * str){
char ** inputs;
inputs = NULL;
printf("inside split\n");
printf("%s\n", str);
inputs = (char **) malloc( sizeof(char));
// free(inputs);
printf("------\n"); // for testing
printf("%s\n", str);
if(!inputs){
printf("Error in initializing the 2D array!\n");
exit(EXIT_FAILURE);
}
return NULL;
}
It is not entirely clear what you are trying to accomplish, but it appears you are attempting to read a line of text with getInput and then separate the input into individual words in splitInput, but are not sure how to go about it. The process of separating a line of text into words is called tokenizing a string. The standard library provides strtok (aptly named), and many systems also provide strsep (a BSD/glibc extension rather than standard C, primarily useful if you have the possibility of an empty delimited field).
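The choice between the two matters for input with empty fields, because strtok treats a run of adjacent delimiters as a single separator. A small sketch of that behaviour follows; count_tokens is a hypothetical helper, and it works on a private copy because strtok writes into the string it parses:

```c
#include <stdlib.h>
#include <string.h>

/* Count the tokens strtok finds in str when splitting on delim.
 * Returns (size_t)-1 if the working copy cannot be allocated. */
size_t count_tokens (const char *str, const char *delim)
{
    size_t len = strlen (str), n = 0;
    char *cpy = malloc (len + 1);           /* strtok modifies its input */

    if (!cpy)
        return (size_t)-1;
    memcpy (cpy, str, len + 1);

    for (char *p = strtok (cpy, delim); p; p = strtok (NULL, delim))
        n++;

    free (cpy);
    return n;
}
```

For the string "a|b||c" split on "|", this counts 3 tokens rather than 4: the empty field between the two adjacent bars is skipped. For words separated by whitespace that is exactly what you want.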
I have explained the difference between a 2D array and your use of a pointer-to-pointer-to-char in the comments above.
To begin, look at getInput. One issue that will give you no end of grief is that c must be type int or you cannot detect EOF. In addition, you can simply pass a pointer to size_t as a parameter to keep count of the characters in result, avoiding the need for strlen to get the length of the returned string. You MUST use a counter anyway to ensure you do not write beyond the end of result, so you may as well make the count available back in the calling function, e.g.
char *getInput (size_t *n)
{
printf ("Inside of the getInput\n");
char *result = NULL;
int c = 0; /* c must be type 'int' or you cannot detect EOF */
/* validate ALL allocations */
if ((result = malloc (MAXC * sizeof *result)) == NULL) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return result;
}
printf ("$ ");
fflush (stdout); /* output is buffered, flush buffer to show prompt */
while (*n + 1 < MAXC && (c = fgetc (stdin)) != '\n' && c != EOF) {
printf ("%c", c);
result[(*n)++] = c;
}
putchar ('\n'); /* tidy up with newline */
result[*n] = 0;
return result;
}
Next, as indicated above, it appears you want to take the line of text in result and use splitInput to fill a pointer-to-pointer-to-char with the individual words (which you are confusing to be a 2D array). To do that, you must keep in mind that strtok will modify the string it operates on so you must make a copy of str which you pass as const char * to avoid attempting to modify a constant string (and the segfault).
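A short sketch of that copy-then-tokenize step, under the assumption that only the first word is wanted. first_word is a hypothetical helper, and malloc plus memcpy stands in for strdup simply to stay within standard C:

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of the first token of str, or NULL if
 * there is no token or allocation fails. str itself is never modified:
 * strtok only ever sees the private copy. */
char *first_word (const char *str, const char *delim)
{
    size_t len = strlen (str);
    char *cpy = malloc (len + 1), *word = NULL;

    if (!cpy)
        return NULL;
    memcpy (cpy, str, len + 1);             /* private, writable copy */

    char *tok = strtok (cpy, delim);        /* writes '\0' into cpy only */
    if (tok) {
        size_t tlen = strlen (tok);
        if ((word = malloc (tlen + 1)))
            memcpy (word, tok, tlen + 1);
    }
    free (cpy);                             /* the copy is no longer needed */
    return word;
}
```

The caller owns the returned string and must free it. The original input can still be printed unchanged afterwards, which is the property the const char * parameter promises.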
You are confused in how to allocate the pointer-to-pointer-to-char object. First you must allocate space for a sufficient number of pointers, e.g. (with #define MAXW 32) you would need something like:
/* allocate MAXW pointers */
if ((inputs = malloc (MAXW * sizeof *inputs)) == NULL) {
fprintf (stderr, "error: memory exhausted - inputs.\n");
return inputs;
}
Then as you tokenize the input string, you must allocate for each individual word (each themselves an individual string), e.g.
if ((inputs[*n] = malloc ((len + 1) * sizeof *inputs[*n])) == NULL) {
fprintf (stderr, "error: memory exhausted - word %zu.\n", *n);
break;
}
strcpy (inputs[*n], p);
(*n)++;
note: 'n' is a pointer to size_t to make the word count available back in the caller.
To tokenize the input string you can wrap the allocation above in:
for (char *p = strtok (cpy, delim); p; p = strtok (NULL, delim))
{
size_t len = strlen (p);
...
if (*n == MAXW) /* check if limit reached */
break;
}
Throughout your code you should also validate all memory allocations and provide effective returns for each function that allocates to allow the caller to validate whether the called function succeeded or failed.
Putting all the pieces together, you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 256 /* constant for maximum characters of user input */
#define MAXW 32 /* constant for maximum words in line */
void mainProcess();
int main (void)
{
mainProcess();
printf ("End of the program\n");
return 0;
}
char *getInput (size_t *n)
{
printf ("Inside of the getInput\n");
char *result = NULL;
int c = 0; /* c must be type 'int' or you cannot detect EOF */
/* validate ALL allocations */
if ((result = malloc (MAXC * sizeof *result)) == NULL) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return result;
}
printf ("$ ");
fflush (stdout); /* output is buffered, flush buffer to show prompt */
while (*n + 1 < MAXC && (c = fgetc (stdin)) != '\n' && c != EOF) {
printf ("%c", c);
result[(*n)++] = c;
}
putchar ('\n'); /* tidy up with newline */
result[*n] = 0;
return result;
}
/* split str into tokens, return pointer to array of char *
* update pointer 'n' to contain number of words
*/
char **splitInput (const char *str, size_t *n)
{
char **inputs = NULL,
*delim = " \t\n", /* split on 'space', 'tab' or 'newline' */
*cpy = strdup (str);
printf ("inside split\n");
printf ("%s\n", str);
/* allocate MAXW pointers */
if ((inputs = malloc (MAXW * sizeof *inputs)) == NULL) {
fprintf (stderr, "error: memory exhausted - inputs.\n");
return inputs;
}
/* split cpy into tokens (words) max of MAXW words allowed */
for (char *p = strtok (cpy, delim); p; p = strtok (NULL, delim))
{
size_t len = strlen (p);
if ((inputs[*n] = malloc ((len + 1) * sizeof *inputs[*n])) == NULL) {
fprintf (stderr, "error: memory exhausted - word %zu.\n", *n);
break;
}
strcpy (inputs[*n], p);
(*n)++;
if (*n == MAXW) /* check if limit reached */
break;
}
free (cpy); /* free copy */
return inputs;
}
void mainProcess()
{
char *input = NULL,
**words = NULL;
size_t len = 0, nwords = 0;
printf ("Inside of Main process\n\n");
input = getInput (&len);
if (!input || !*input) {
fprintf (stderr, "error: input is empty or NULL.\n");
return;
}
printf ("this is input '%s' with len: %zu (before split)\n", input, len);
words = splitInput (input, &nwords);
printf ("this is input '%s' with len: %zu (after split)\n", input, len);
free (input); /* done with input, free it! */
printf ("the words in input are:\n");
for (size_t i = 0; i < nwords; i++) {
printf (" word[%2zu]: '%s'\n", i, words[i]);
free (words[i]); /* free each word */
}
free (words); /* free pointers */
putchar ('\n'); /* tidy up with newline */
}
Example Use/Output
$ ./bin/mainprocess
Inside of Main process
Inside of the getInput
$ my dog has fleas
my dog has fleas
this is input 'my dog has fleas' with len: 16 (before split)
inside split
my dog has fleas
this is input 'my dog has fleas' with len: 16 (after split)
the words in input are:
word[ 0]: 'my'
word[ 1]: 'dog'
word[ 2]: 'has'
word[ 3]: 'fleas'
End of the program
Memory Error Check
In any code you write that dynamically allocates memory, you need to run your code through a memory/error checking program. On Linux, valgrind is the normal choice. Simply run your code through it, e.g.
$ valgrind ./bin/mainprocess
==15900== Memcheck, a memory error detector
==15900== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==15900== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==15900== Command: ./bin/mainprocess
==15900==
Inside of Main process
Inside of the getInput
$ my dog has fleas
my dog has fleas
this is input 'my dog has fleas' with len: 16 (before split)
inside split
my dog has fleas
this is input 'my dog has fleas' with len: 16 (after split)
the words in input are:
word[ 0]: 'my'
word[ 1]: 'dog'
word[ 2]: 'has'
word[ 3]: 'fleas'
End of the program
==15900==
==15900== HEAP SUMMARY:
==15900== in use at exit: 0 bytes in 0 blocks
==15900== total heap usage: 7 allocs, 7 frees, 546 bytes allocated
==15900==
==15900== All heap blocks were freed -- no leaks are possible
==15900==
==15900== For counts of detected and suppressed errors, rerun with: -v
==15900== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always verify you have freed any memory you allocate, and that there are no memory errors.
Look things over and let me know if you have any questions. If I guess wrong about what you intended, well that's where an MCVE helps :)
This code compiles (gcc -Wall) without warnings and does not change the size.
It also tries to stress the need for allocating enough space and/or not to write beyond allocated memory.
Note for example the
malloc((MaxInputLength+1) * sizeof(char))
while(length<MaxInputLength)
inputs[i]=malloc((MaxLengthOfSplitString+1) * sizeof(char));
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// the length which was used in your MCVE, probably accidentally
#define MaxInputLength 3 // you will probably want to increase this
#define MaxLengthOfSplitString 1 // and this
#define MaxNumberOfSplitStrings 3 // and this
// Reading the input from the user
char * getInput(){
printf("Inside of the getInput\n");
char * result;
int c; /* int, not char, so EOF can be detected */
int length=0;
result = malloc((MaxInputLength+1) * sizeof(char));
// code goes here
printf("$ ");
while(length<MaxInputLength){
c = fgetc(stdin);
if(c == '\n' || c == EOF){
break;
}
printf("%c", c);
result[length] = c;
length++;
}
result[length] = '\0';
return result;
}
char ** splitInput(const char * str){
char ** inputs;
inputs = NULL;
printf("inside split\n");
printf("%s\n", str);
inputs = (char **) malloc(MaxNumberOfSplitStrings * sizeof(char*));
{
int i;
for (i=0; i< MaxNumberOfSplitStrings; i++)
{
inputs[i]=malloc((MaxLengthOfSplitString+1) * sizeof(char));
}
// Now you have an array of MaxNumberOfSplitStrings char*.
// Each of them points to a buffer which can hold a zero-terminated string
// with at most MaxLengthOfSplitString chars, not counting the '\0'.
}
// free(inputs);
printf("------\n"); // for testing
printf("%s\n", str);
if(!inputs){
printf("Error in initializing the 2D array!\n");
exit(EXIT_FAILURE);
}
return NULL;
}
void mainProcess(){
char * input;
printf("Inside of Main process\n");
input = getInput();
printf("\nthis is input %s with len %zu\n", input, strlen(input)); /* strlen returns size_t: use %zu */
splitInput(input);
printf("\nthis is input %s with len %zu\n", input, strlen(input));
}
int main(int argc, char const *argv[])
{
/* code */
mainProcess();
printf("\nEnd of the program\n");
return 0;
}

reading large lists through stdin in C

If my program is going to have large lists of numbers passed in through stdin, what would be the most efficient way of reading this in?
The input I'm going to be passing into the program is going to be of the following format:
3,5;6,7;8,9;11,4;;
I need to process the input so that I can use the numbers between the semicolons (i.e. I want to be able to use 3 and 5, 6 and 7, etc.). The ;; indicates that it is the end of the line.
I was thinking of using a buffered reader to read entire lines and then using parseInt.
Would this be the most efficient way of doing it?
This is a working solution
One way to do this is to use strtok() and store the values in an array. Ideally, dynamically allocated.
int main(int argc, char *argv[])
{
int lst_size=100;
int line_size=255;
int lst[lst_size];
int count=0;
char buff[line_size];
char * token=NULL;
fgets (buff, line_size, stdin); //Get input
Use strtok with ',' and ';' as the delimiters.
token=strtok(buff, ";,");
while(token != NULL && count < lst_size){ // stop at end of input or when lst is full
lst[count++]=atoi(token);
token=strtok(NULL, ";,");
}
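Finally, reduce the count by 1. strtok treats consecutive delimiters as one, so the double ";;" does not produce a token by itself; the extra entry comes from the trailing newline that fgets leaves in the buffer, for which atoi(token) returns 0 and which is stored at the last index. You don't want that entry. (Adding '\n' to the delimiter string would avoid the extra token, in which case this adjustment must be dropped.)
count--;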
}
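The strtok() approach above can be sketched as a single function. This is a minimal sketch, assuming the newline is added to the delimiter set so that no count adjustment is needed; the name parse_numbers and the cap parameter are illustrative and not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Split buf on ',', ';' and newline and store up to cap integers in out.
 * Returns the number of values stored. Note: strtok modifies buf. */
size_t parse_numbers(char *buf, int *out, size_t cap)
{
    size_t count = 0;
    for (char *tok = strtok(buf, ",;\n"); tok != NULL && count < cap;
         tok = strtok(NULL, ",;\n"))
        out[count++] = atoi(tok);   /* atoi reports no errors, as in the answer */
    return count;
}
```

Called on a writable buffer holding "3,5;6,7;8,9;11,4;;" followed by a newline, it stores the eight numbers and returns 8.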
One other fairly elegant way to handle this is to allow strtol to parse the input by advancing the string to be read to endptr as returned by strtol. Combined with an array allocated/reallocated as needed, you should be able to handle lines of any length (up to memory exhaustion). The example below uses a single array for the data. If you want to store multiple lines, each as a separate array, you can use the same approach, but start with a pointer to array of pointers to int. (i.e. int **numbers and allocate the pointers and then each array). Let me know if you have questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#define NMAX 256
int main () {
char *ln = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* max chars to read (0 - no limit) */
ssize_t nchr = 0; /* number of chars actually read */
int *numbers = NULL; /* array to hold numbers */
size_t nmax = NMAX; /* check for reallocation */
size_t idx = 0; /* numbers array index */
if (!(numbers = calloc (NMAX, sizeof *numbers))) {
fprintf (stderr, "error: memory allocation failed.");
return 1;
}
/* read each line from stdin - dynamicallly allocated */
while ((nchr = getline (&ln, &n, stdin)) != -1)
{
char *p = ln; /* pointer for use with strtol */
char *ep = NULL;
errno = 0;
while (errno == 0)
{
/* parse/convert each number on stdin */
numbers[idx] = strtol (p, &ep, 10);
/* note: overflow/underflow checks omitted */
/* if valid conversion to number */
if (errno == 0 && p != ep)
{
idx++; /* increment index */
if (!ep) break; /* check for end of str */
}
/* reallocate numbers if idx = nmax (done before any break below,
so the next write to numbers[idx] is always in bounds) */
if (idx == nmax)
{
int *tmp = realloc (numbers, 2 * nmax * sizeof *numbers);
if (!tmp) {
fprintf (stderr, "Error: struct reallocation failure.\n");
exit (EXIT_FAILURE);
}
numbers = tmp;
memset (numbers + nmax, 0, nmax * sizeof *numbers);
nmax *= 2;
}
/* skip delimiters/move pointer to next digit
(strict '<' and '>' so that '0' and '9' count as digits) */
while (*ep && (*ep < '0' || *ep > '9')) ep++;
if (*ep)
p = ep;
else
break;
}
}
/* free mem allocated by getline */
if (ln) free (ln);
/* show values stored in array */
size_t i = 0;
for (i = 0; i < idx; i++)
printf (" numbers[%2zu] %d\n", i, numbers[i]);
/* free mem allocate to numbers */
if (numbers) free (numbers);
return 0;
}
Output
$ echo "3,5;6,7;8,9;11,4;;" | ./bin/prsistdin
numbers[ 0] 3
numbers[ 1] 5
numbers[ 2] 6
numbers[ 3] 7
numbers[ 4] 8
numbers[ 5] 9
numbers[ 6] 11
numbers[ 7] 4
Also works where the string is stored in a file as:
$ cat dat/numsemic.csv | ./bin/prsistdin
or
$ ./bin/prsistdin < dat/numsemic.csv
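The strtol/endptr walk described above can also be sketched with the range checks that the answer's code leaves out. This is only a sketch under stated assumptions: the function name parse_ints, the cap parameter, and the choice to return (size_t)-1 on an out-of-range value are hypothetical, not taken from the answer.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Walk s with strtol, storing up to cap ints in out.
 * Any character strtol cannot start a number with is skipped as a delimiter.
 * Returns the number of values stored, or (size_t)-1 if a value
 * does not fit in an int. */
size_t parse_ints(const char *s, int *out, size_t cap)
{
    size_t count = 0;
    while (*s != '\0' && count < cap) {
        char *end;
        errno = 0;
        long v = strtol(s, &end, 10);
        if (end == s) {          /* no conversion here: skip one character */
            s++;
            continue;
        }
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
            return (size_t)-1;   /* out of range for int */
        out[count++] = (int)v;
        s = end;                 /* resume right after the number */
    }
    return count;
}
```

Because strtol itself accepts a leading sign, this variant also keeps negative values such as -12 intact instead of treating the minus sign as a delimiter.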
Using fgets and without size_t
It took a little reworking to come up with a revision I was happy with that eliminated getline and substituted fgets. getline is far more flexible, handling the allocation of space for you, with fgets it is up to you. (not to mention getline returning the actual number of chars read without having to call strlen).
My goal here was to preserve the ability to read any length line to meet your requirement. That either meant initially allocating some huge line buffer (wasteful) or coming up with a scheme that would reallocate the input line buffer as needed in the event it was longer than the space initially allocated to ln (this is what getline does so well). I'm reasonably happy with the results. Note: I put the reallocation code in functions to keep main reasonably clean. [footnote 2]
Take a look at the following code. Note, I have left the DEBUG preprocessor directives in the code allowing you to compile with the -DDEBUG flag if you want to have it spit out each time it allocates. [footnote 1] You can compile the code with:
gcc -Wall -Wextra -o yourexename yourfilename.c
or if you want the debugging output (e.g. set LMAX to 2 or something less than the line length), use the following:
gcc -Wall -Wextra -o yourexename yourfilename.c -DDEBUG
Let me know if you have questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#define NMAX 256
#define LMAX 1024
char *realloc_char (char *sp, unsigned int *n); /* reallocate char array */
int *realloc_int (int *sp, unsigned int *n); /* reallocate int array */
char *fixshortread (FILE *fp, char **s, unsigned int *n); /* read all stdin */
int main () {
char *ln = NULL; /* dynamically allocated for fgets */
int *numbers = NULL; /* array to hold numbers */
unsigned int nmax = NMAX; /* numbers check for reallocation */
unsigned int lmax = LMAX; /* ln check for reallocation */
unsigned int idx = 0; /* numbers array index */
unsigned int i = 0; /* simple counter variable */
char *nl = NULL;
/* initial allocation for numbers */
if (!(numbers = calloc (NMAX, sizeof *numbers))) {
fprintf (stderr, "error: memory allocation failed (numbers).");
return 1;
}
/* initial allocation for ln */
if (!(ln = calloc (LMAX, sizeof *ln))) {
fprintf (stderr, "error: memory allocation failed (ln).");
return 1;
}
/* read each line from stdin - dynamicallly allocated */
while (fgets (ln, lmax, stdin) != NULL)
{
/* provide a fallback to read remainder of line
if the line length exceeds lmax */
if (!(nl = strchr (ln, '\n')))
fixshortread (stdin, &ln, &lmax);
else
*nl = 0;
char *p = ln; /* pointer for use with strtol */
char *ep = NULL;
errno = 0;
while (errno == 0)
{
/* parse/convert each number on stdin */
numbers[idx] = strtol (p, &ep, 10);
/* note: overflow/underflow checks omitted */
/* if valid conversion to number */
if (errno == 0 && p != ep)
{
idx++; /* increment index */
if (!ep) break; /* check for end of str */
}
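/* reallocate numbers if idx = nmax; keep the returned pointer,
since realloc may move the block */
if (idx == nmax)
numbers = realloc_int (numbers, &nmax);
/* skip delimiters/move pointer to next digit
(strict '<' and '>' so that '0' and '9' count as digits) */
while (*ep && (*ep < '0' || *ep > '9')) ep++;
if (*ep)
p = ep;
else
break;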
}
}
/* free mem allocated for ln */
if (ln) free (ln);
/* show values stored in array */
for (i = 0; i < idx; i++)
printf (" numbers[%2u] %d\n", (unsigned int)i, numbers[i]);
/* free mem allocate to numbers */
if (numbers) free (numbers);
return 0;
}
/* reallocate character pointer memory */
char *realloc_char (char *sp, unsigned int *n)
{
char *tmp = realloc (sp, 2 * *n * sizeof *sp);
#ifdef DEBUG
printf ("\n reallocating %u to %u\n", *n, *n * 2);
#endif
if (!tmp) {
fprintf (stderr, "Error: char pointer reallocation failure.\n");
exit (EXIT_FAILURE);
}
sp = tmp;
memset (sp + *n, 0, *n * sizeof *sp); /* memset new ptrs 0 */
*n *= 2;
return sp;
}
/* reallocate integer pointer memory */
int *realloc_int (int *sp, unsigned int *n)
{
int *tmp = realloc (sp, 2 * *n * sizeof *sp);
#ifdef DEBUG
printf ("\n reallocating %u to %u\n", *n, *n * 2);
#endif
if (!tmp) {
fprintf (stderr, "Error: int pointer reallocation failure.\n");
exit (EXIT_FAILURE);
}
sp = tmp;
memset (sp + *n, 0, *n * sizeof *sp); /* memset new ptrs 0 */
*n *= 2;
return sp;
}
/* if fgets fails to read entire line, fix short read */
char *fixshortread (FILE *fp, char **s, unsigned int *n)
{
unsigned int i = 0;
int c = 0;
i = *n - 1;
*s = realloc_char (*s, n); /* keep the returned pointer: realloc may move the block */
do
{
c = fgetc (fp);
(*s)[i] = c;
i++;
if (i == *n)
*s = realloc_char (*s, n);
} while (c != '\n' && c != EOF);
(*s)[i-1] = 0;
return *s;
}
footnote 1
nothing special about the choice of the word DEBUG (it could have been DOG, etc..), the point to take away is if you want to conditionally include/exclude code, you can simply use preprocessor flags to do that. You just add -Dflagname to pass flagname to the compiler.
footnote 2
you can combine the reallocation functions into a single void* function that accepts a void pointer as its argument along with the size of the type to be reallocated and returns a void pointer to the reallocated space -- but we will leave that for a later date.
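As a rough illustration of footnote 2, the two reallocation functions could be folded into one along these lines. This is a sketch, not the answerer's code: the name realloc_double is hypothetical, the count is a size_t rather than the unsigned int used above, *n is assumed to be greater than zero, and overflow of the byte count is not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Double an array of *n elements of elem_size bytes each and zero the
 * newly added half. On success, updates *n and returns the new block.
 * On failure, returns NULL and leaves the old block and *n untouched. */
void *realloc_double(void *ptr, size_t elem_size, size_t *n)
{
    size_t old_bytes = *n * elem_size;
    void *tmp = realloc(ptr, 2 * old_bytes);
    if (tmp == NULL)
        return NULL;
    memset((char *)tmp + old_bytes, 0, old_bytes);  /* zero the new half */
    *n *= 2;
    return tmp;
}
```

A caller would assign the result to a temporary first and only overwrite its own pointer when the result is not NULL, so the original block can still be freed on failure.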
What you could do is read in from stdin using fgets or fgetc. You could also use getline() since you're reading in from stdin.
Once you read in the line you can use strtok() with ";" as the delimiter to split the string into pieces at the semicolons. You can loop until strtok() returns NULL. Also in C you can use atoi() (or strtol(), which lets you detect errors) to convert strings to integers.
For Example:
size_t length = 256; /* getline expects a size_t * for the buffer size */
char* str = malloc(length);
ssize_t nread = getline(&str, &length, stdin); /* -1 on error or end of input */
I would read in the command args, then parse using the strtok() library method
http://man7.org/linux/man-pages/man3/strtok.3.html
(The web page referenced by the URL above even has a code sample of how to use it.)
I'm a little rusty at C, but could this work for you?
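char line[1000];
int first, second;
FILE *fp = fopen("C:\\file.txt", "r");
if (fp == NULL) { perror("fopen"); return 1; }
while (fgets(line, sizeof line, fp) != NULL) { // Get a line.
const char *p = line;
int used = 0;
// %n stores how many characters the "first,second;" pair consumed
while (sscanf(p, "%d,%d;%n", &first, &second, &used) == 2 && used > 0) {
// place first and second into a struct or something
p += used;
used = 0;
}
}
fclose(fp);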
getchar_unlocked() is what you are looking for.
Here is the code:
#include <stdio.h>
static inline int fastRead_int(int * x) /* static: a plain C99 'inline' definition alone may fail to link */
{
register int c = getchar_unlocked();
*x = 0;
// clean stuff in front of + look for EOF
for(; ((c<48 || c>57) && c != EOF); c = getchar_unlocked());
if(c == EOF)
return 0;
// build int
for(; c>47 && c<58 ; c = getchar_unlocked()) {
*x = (*x<<1) + (*x<<3) + c - 48;
}
return 1;
}
int main()
{
int x;
while(fastRead_int(&x))
printf("%d ",x);
return 0;
}
For input 1;2;2;;3;;4;;;;;54;;;; the code above produces 1 2 2 3 4 54.
This should be considerably faster than the other solutions presented in this topic. It not only uses getchar_unlocked() (a POSIX function, not part of standard C), but also register, inline, and a shift-based way of multiplying by 10: (*x<<1) + (*x<<3).
I wish you good luck in finding a better solution.
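Where getchar_unlocked() is not available, the same character-by-character parse can be sketched with the standard getc(). This is a portability sketch only: the name read_int and the FILE * parameter are assumptions, it handles non-negative numbers without overflow checks just like the code above, and how its speed compares depends on the platform.

```c
#include <stdio.h>

/* Read the next non-negative decimal integer from fp into *x,
 * skipping any non-digit characters in front of it.
 * Returns 1 on success, 0 if EOF is reached before any digit. */
int read_int(FILE *fp, int *x)
{
    int c = getc(fp);
    while (c != EOF && (c < '0' || c > '9'))   /* skip delimiters */
        c = getc(fp);
    if (c == EOF)
        return 0;
    *x = 0;
    for (; c >= '0' && c <= '9'; c = getc(fp))
        *x = *x * 10 + (c - '0');              /* overflow is not checked */
    return 1;
}
```

Writing the multiplication as a plain `* 10` leaves the choice of shifts or other instructions to the compiler.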

Read unknown number of lines from stdin, C

I have a problem with reading stdin of unknown size. It is actually a table in a .txt file, which I redirect to stdin with '<' table.txt. My code should look like this:
#include <stdio.h>
#include <string.h>
int main(int argc,char *argv[])
{
char words[10][1024];
int i=0;
while(feof(stdin)==0)
{
fgets(words[i],100,stdin);
printf("%s", words[i]);
i++;
}
return 0;
}
But the problem is that I don't know the number of lines, which in this case is 10 (we do know the maximum number of characters in a line: 1024).
It would be great if someone knows the solution. Thanks in advance.
You have hit on one of the issues that plagues all new C programmers: how do I dynamically allocate all the memory I need, to free myself from static limits, while still keeping track of my collection of 'stuff' in memory? This problem usually presents itself when you need to read an unknown number of 'things' from an input. The initial options are (1) declare some limit big enough to work (defeating the purpose), or (2) dynamically allocate pointers as needed.
Obviously, the goal is (2). However, you then run into the problem of "How do I keep track of what I've allocated?" This in itself is an issue that dogs beginners. The problem being, if I dynamically allocate using a bunch of pointers, how do I iterate over the list to get my 'stuff' back out? Also, you have to allocate some initial number of pointers (unless using an advanced data structure like a linked-list), so the next question is "what do I do when I run out?"
The usual solution is to allocate an initial set of pointers, then when the limit is reached, reallocate to twice as many as original, and keep going. (as Grayson indicated in his answer).
However, there is one more trick to iterate over the list to get your 'stuff' back out that is worth understanding. Yes, you can allocate with malloc and keep track of the number of pointers used, but you can free yourself from tying a counter to your list of pointers by initially allocating with calloc. That not only allocates space, but also sets the allocated pointers to NULL (or 0). This allows you to iterate over your list with a simple while (pointer != NULL). This provides many benefits when it comes to passing your collection of pointers to functions, etc.. The downside (a minimal one) is that you get to write a reallocation scheme that uses calloc to allocate new space when needed. (bummer, I get to get smarter -- but I have to work to do it...)
You can evaluate whether to use malloc/realloc off-the-shelf, or whether to reallocate using calloc and a custom reallocate function depending on what your requirements are. Regardless, understanding both, just adds more tools to your programming toolbox.
OK, enough jabber, where is the example in all this blather?
Both of the following examples simply read all lines from any text file and print the lines (with pointer index numbers) back to stdout. Both expect that you will provide the filename to read as the first argument on the command line. The only difference between the two is that the second has the reallocation with calloc done in a custom reallocation function. They both allocate 255 pointers initially and double the number of pointers each time the limit is hit. (For fun, you can set MAXLINES to something small like 10 and force repeated reallocations to test.)
first example with reallocation in main()
# include <stdio.h>
# include <stdlib.h>
# include <string.h>
#define MAXLINES 255
void free_buffer (char **buffer)
{
register int i = 0;
while (buffer[i])
{
free (buffer[i]);
i++;
}
free (buffer);
}
int main (int argc, char **argv) {
if (argc < 2) {
fprintf (stderr, "Error: insufficient input. Usage: %s input_file\n", argv[0]);
return 1;
}
char *line = NULL; /* forces getline to allocate space for buf */
ssize_t read = 0; /* number of characters read by getline */
size_t n = 0; /* limit number of chars to 'n', 0 no limit */
char **filebuf = NULL;
char **rtmp = NULL;
int linecnt = 0;
size_t limit = MAXLINES;
size_t newlim = 0;
FILE *ifp = fopen(argv[1],"r");
if (!ifp)
{
fprintf(stderr, "\nerror: failed to open file: '%s'\n\n", argv[1]);
return 1;
}
filebuf = calloc (MAXLINES, sizeof (*filebuf)); /* allocate MAXLINES pointers */
while ((read = getline (&line, &n, ifp)) != -1) /* read each line in file with getline */
{
if (line[read - 1] == 0xa) { line[read - 1] = 0; read--; } /* strip newline */
if (linecnt >= (limit - 1)) /* test if linecnt at limit, reallocate */
{
newlim = limit * 2; /* set new number of pointers to 2X old */
if ((rtmp = calloc (newlim, sizeof (*filebuf)))) /* calloc to set to NULL */
{
/* copy original filebuf to newly allocated rtmp */
if (memcpy (rtmp, filebuf, linecnt * sizeof (*filebuf)) == rtmp)
{
free (filebuf); /* free original filebuf */
filebuf = rtmp; /* set filebuf equal to new rtmp */
}
else
{
fprintf (stderr, "error: memcpy failed, exiting\n");
return 1;
}
}
else
{
fprintf (stderr, "error: rtmp allocation failed, exiting\n");
return 1;
}
limit = newlim; /* update limit to new limit */
}
filebuf[linecnt] = strdup (line); /* copy line (strdup allocates) */
linecnt++; /* increment linecnt */
}
fclose(ifp);
if (line) free (line); /* free memory allocated to line */
linecnt = 0; /* reset linecnt to iterate filebuf */
printf ("\nLines read in filebuf buffer:\n\n"); /* output all lines read */
while (filebuf[linecnt])
{
printf (" line[%d]: %s\n", linecnt, filebuf[linecnt]);
linecnt++;
}
printf ("\n");
free_buffer (filebuf); /* free memory allocated to filebuf */
return 0;
}
second example with reallocation in custom function
# include <stdio.h>
# include <stdlib.h>
# include <string.h>
#define MAXLINES 255
/* function to free allocated memory */
void free_buffer (char **buffer)
{
register int i = 0;
while (buffer[i])
{
free (buffer[i]);
i++;
}
free (buffer);
}
/* custom realloc using calloc/memcpy */
char **recalloc (size_t *lim, char **buf)
{
size_t newlim = *lim * 2; /* size_t to match *lim and calloc */
char **tmp = NULL;
if ((tmp = calloc (newlim, sizeof (*buf))))
{
if (memcpy (tmp, buf, *lim * sizeof (*buf)) == tmp)
{
free (buf);
buf = tmp;
}
else
{
fprintf (stderr, "%s(): error, memcpy failed, exiting\n", __func__);
return NULL;
}
}
else
{
fprintf (stderr, "%s(): error, tmp allocation failed, exiting\n", __func__);
return NULL;
}
*lim = newlim;
return tmp;
}
int main (int argc, char **argv) {
if (argc < 2) {
fprintf (stderr, "Error: insufficient input. Usage: %s input_file\n", argv[0]);
return 1;
}
char *line = NULL; /* forces getline to allocate space for buf */
ssize_t read = 0; /* number of characters read by getline */
size_t n = 0; /* limit number of chars to 'n', 0 no limit */
char **filebuf = NULL;
int linecnt = 0;
size_t limit = MAXLINES;
FILE *ifp = fopen(argv[1],"r");
if (!ifp)
{
fprintf(stderr, "\nerror: failed to open file: '%s'\n\n", argv[1]);
return 1;
}
filebuf = calloc (MAXLINES, sizeof (*filebuf)); /* allocate MAXLINES pointers */
while ((read = getline (&line, &n, ifp)) != -1) /* read each line in file with getline */
{
if (line[read - 1] == 0xa) { line[read - 1] = 0; read--; } /* strip newline */
if (linecnt >= (limit - 1)) /* test if linecnt at limit, reallocate */
{
filebuf = recalloc (&limit, filebuf); /* reallocate filebuf to 2X size */
if (!filebuf)
{
fprintf (stderr, "error: recalloc failed, exiting.\n");
return 1;
}
}
filebuf[linecnt] = strdup (line); /* copy line (strdup allocates) */
linecnt++; /* increment linecnt */
}
fclose(ifp);
if (line) free (line); /* free memory allocated to line */
linecnt = 0; /* reset linecnt to iterate filebuf */
printf ("\nLines read in filebuf buffer:\n\n"); /* output all lines read */
while (filebuf[linecnt])
{
printf (" line[%d]: %s\n", linecnt, filebuf[linecnt]);
linecnt++;
}
printf ("\n");
free_buffer (filebuf); /* free memory allocated to filebuf */
return 0;
}
Take a look at both examples. Know that there are many, many ways to do this. These examples just give one approach, showing a few more tricks than you will normally find. Give them a try. Drop a comment if you need more help.
I suggest that you use malloc and realloc to manage your memory. Keep track of how big your array is or how many entries it has, and call realloc to double its size whenever the array is not big enough.
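The malloc/realloc suggestion above might look roughly like this. It is a minimal sketch under stated assumptions: the name read_lines, the starting capacity of 8, and the fixed line buffer (sized for the 1024-character lines mentioned in the question) are illustrative choices, and longer lines would simply be split across entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read all lines from fp into an array of strings that is doubled with
 * realloc whenever it fills up. Stores the line count in *count.
 * Returns the array (caller frees each entry, then the array),
 * or NULL on allocation failure. */
char **read_lines(FILE *fp, size_t *count)
{
    size_t cap = 8, n = 0;
    char buf[1024 + 2];                   /* 1024 chars, newline, NUL */
    char **lines = malloc(cap * sizeof *lines);
    if (lines == NULL)
        return NULL;
    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (n == cap) {                   /* array full: double it */
            char **tmp = realloc(lines, 2 * cap * sizeof *lines);
            if (tmp == NULL)
                goto fail;
            lines = tmp;
            cap *= 2;
        }
        buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
        lines[n] = malloc(strlen(buf) + 1);
        if (lines[n] == NULL)
            goto fail;
        strcpy(lines[n], buf);
        n++;
    }
    *count = n;
    return lines;
fail:
    while (n > 0)
        free(lines[--n]);
    free(lines);
    return NULL;
}
```

Here the entry count is tracked explicitly in *count, so the array does not need a NULL terminator.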
Op appears to need to store the data somewhere
#include <stdio.h>
#include <string.h>
#define N 100000u
char BABuffer[N];
int main(int argc, char *argv[]) {
size_t lcount = 0;
size_t ccount = 0;
char words[1024 + 2];
while(fgets(words, sizeof words, stdin) != NULL) {
size_t len = strlen(words);
if (ccount + len >= N - 1) {
fputs("Too much!\n", stderr);
break;
}
memcpy(&BABuffer[ccount], words, len);
ccount += len;
lcount++;
}
BABuffer[ccount] = '\0';
printf("Read %zu lines.\n", lcount);
printf("Read %zu char.\n", ccount);
fputs(BABuffer, stdout);
return 0;
}
Note: ccount includes the end-of-line character(s).

Resources