Formatting a file of text based on input - c

I am working on a project for school and I have run into a little bit of trouble. The gist of the project is to write a program that reads in a file of text and formats that file so that it fits in a specific width. To format this file, the user specifies the input file, the length of an output line, and the justification for the output text. An example would be this:
$ ./format test.dat 15 right
The quick brown
 fox jumps over
   the lazy old
           dog.
$ ./format test.dat 15 left
The quick brown
fox jumps over
the lazy old
dog.
$ ./format test.dat 15 center
The quick brown
fox jumps over
 the lazy old
     dog.
Anyway, I am basically stuck on how to go about outputting the file based on this. Attached below is my code for reading in the file, and what little I have done on outputting the file. I am mainly looking for tips or suggestions on how to go about doing it. I know I need to use printf with a width and such, but I am confused on how to move to the next line.
char **inputFile(FILE *fp, int size) {
int i = 0;
char *token;
char **str;
str = malloc(sizeof(char *) * size);
token = readToken(fp);
while(!feof(fp)) {
if(i+1 == size) {
realloc(str, size * 2);
}
str[i] = token;
token = readToken(fp);
i++;
}
return str;
}
void toPrint(char **string, int width, int indent) {
int i;
int curLineLength;
if(indent == 0) {
for(i = 0; I < strlen(string); i++
char *token = string[i];
if(curLineLength + strlen(*string) > width) {
if(curLineLength > 0) {
printf("\n");
curLineLength = 0;
}
}
printf("%s ", token);
curLineLength += strlen(*string);
}
/*
if(indent == 1) {
}
if(indent == 2) {
}
*/
}

Following on from the comment, your task of justifying the lines is more a logic challenge for structuring your output function than it is a difficult one. You are getting close. There are a number of ways to do it, and you probably need to add error checking to make sure width isn't less than the longest line.
Here is a quick example you can draw from. Make sure you understand why each line was written the way it was, and pay close attention to the variable length array definition (you will need to compile as c99 - or if using gcc, rely on the gcc extension (default)). Let me know if you have questions:
/* format 'n' lines of output of 't' justified as specified in 'just'
   (left 'l', right 'r' or centered 'c') within a width 'w'.
   NOTE: 'w' must be at least as large as the longest line in 't'.
*/
void formatted (char **t, char just, size_t w, size_t n)
{
    if (!t || !*t || !n) return;

    size_t i = 0;
    size_t lmax = 0;
    size_t len[n];      /* variable length array (C99) */

    /* calculate the length of each line, set lmax */
    for (i = 0; i < n; i++) {
        len[i] = strlen (t[i]);
        if (len[i] > lmax) lmax = len[i];
    }

    /* handle w < lmax reformat or error here */
    if (w < lmax) {
        fprintf (stderr, "%s() error: invalid width < lmax (%zu).\n",
                 __func__, lmax);
        return;
    }

    /* left justified output */
    if (just == 'l') {
        for (i = 0; i < n; i++) {
            printf ("%s\n", t[i]);
        }
        return;
    }

    /* center or right justified output; the empty string "" padded to a
       field width of 'spaces' prints exactly that many spaces (a " "
       argument would print one space even when 'spaces' is 0) */
    for (i = 0; i < n; i++) {
        int spaces = (int)(w - len[i]);
        if (just == 'c')
            printf ("%*s%s\n", spaces/2, "", t[i]);
        else if (just == 'r')
            printf ("%*s%s\n", spaces, "", t[i]);
    }
}
Note: if your compiler does not support __func__ (older versions of the Microsoft compiler on Windows, for example), change __func__ to the function name in each of the error statements.
Logic of Function - Long Version
Let's look a little closer at what the function is doing and why it does it the way it does. First, let's look at the parameters it takes:
void formatted (char **t, char just, size_t w, size_t n)
char **t is your 'string', or more precisely your array of pointers to char (each element has type char *). When you pass the array of pointers to your function, and this may be where your confusion is, you have only 2 ways to iterate over the array and print each of the lines of text: (1) pass the number of valid pointers that point to strings containing text, or (2) provide a sentinel (usually just NULL) in the array element that follows the last pointer to a valid line of text. The sentinel serves as the pointer in the array that tells you ("Hey dummy, stop trying to print lines -- you already printed the last one...")
This takes a little more explanation. Consider your array of pointers in memory. Normally you will allocate some reasonably anticipated number of pointers to fill, and when you fill the last pointer, you will realloc the array to contain more space. What this means is you will always have at least 1 pointer at the end of your array that is not filled. If you initialize your array to contain NULL pointers at the very beginning (by allocating with calloc instead of malloc, and setting the new pointers to NULL after each realloc, since realloc does not zero the added memory) -- you automatically provide a sentinel. Alternatively, you can always explicitly set the next pointer to NULL as you fill your array. This will leave your array of pointers to char looking similar to the following:
  Pointer      The pointers in the array of pointers to char*, char **t;
  Address      point to the first character in each associated string.
+-----------+
| 0x192a460 | --> The quick brown
+-----------+
| 0x192a480 | --> fox jumps over
+-----------+
| 0x192a4a0 | --> a lazy
+-----------+
| 0x192a4c0 | --> dog.
+-----------+
|   NULL    |
+-----------+
|   NULL    |
+-----------+
     ...
Understand: your string (my t) is an array of pointers to char, not the strings themselves. Your string is the left column of pointers above. string is an array of the starting addresses of each real string, and your string[i] is the start of the actual string itself at that address. i will range from 0-3 (4 total) where string[0] = "The quick brown", string[1] = "fox jumps over", etc. To help keep this straight, change the name of string in your code to array or str_array.
Your 2 options for iterating over the array to print each string become (1) (with size 'n' passed to function):
for (i = 0; i < n; i++)
    printf ("%s\n", t[i]);
or (2) (relying on a sentinel, with i starting at 0):
while (t[i])
    printf ("%s\n", t[i++]);
(while not readily apparent here, this provides significant benefits as your code becomes more complex and you need to pass the array between many different functions)
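To make the sentinel idea concrete, here is a minimal sketch. The helper names alloc_string_array and count_strings are made up for this illustration and are not part of the code in this answer. It assumes, as holds on mainstream platforms, that the zero-filled memory returned by calloc reads back as NULL pointers.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate room for 'n' string pointers plus one
 * extra slot for the sentinel. calloc zero-fills the block, so every
 * slot (including the last) starts out as NULL on mainstream platforms. */
char **alloc_string_array (size_t n)
{
    return calloc (n + 1, sizeof (char *));
}

/* Hypothetical helper: count the strings in a NULL-terminated array by
 * walking it until the sentinel is reached. */
size_t count_strings (char **t)
{
    size_t n = 0;

    if (!t) return 0;
    while (t[n])        /* stop at the first NULL pointer */
        n++;

    return n;
}
```

A function that receives only the char ** can recover the count this way, which is why the sentinel saves you from passing n around once several functions share the array.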
Now that you know how you can access each of your strings, start with the case where you will just print the string left justified. You don't care about the length and you don't care about the width (as long as your string will fit within the width). All you need to do is print each of the strings. Since we pass the number of strings 'n' to the function, we can simply use the for loop (method 1) to print each string. (The length is computed prior to printing to ensure each string will fit in width. Since we will use it later, we store each length in the variable length array len so we don't have to make redundant calls to strlen.)
The more interesting cases are the center and right justified cases. In addition to 'n' you need to know which justification the user wants. You can pass any type of flag you want to carry that information to the function. A char is simply a convenient choice: the first character of the command-line argument can be used directly and does not require conversion to a number when read as input to the program. That is essentially why I chose to pass just as a char instead of an int or a short, which would need a conversion on input.
Let's first look at how we right-justify the output. For discussion, let's consider an output width of 20 characters you want to right-justify your strings in. Your first string is 15 (printable) characters long (it's actually 16 chars in memory due to the null-terminating char at the end). Let's visualize what our string of printable characters would look like in a 20-character buffer (which you would use to save a right-justified copy of the string in memory, rather than printing)
|<-------- 20 character width --------->|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | |T|h|e| |q|u|i|c|k| |b|r|o|w|n|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |<--- 15 character length --->|
This makes it easy to see how many spaces are needed before we start printing our string. The task here is to turn this into a generalized piece of code to handle any length string. Not too difficult:
spaces = width - length;
Then, to print the string right-justified, we make use of the minimum field width in the printf format string. Of particular use here is the ability to give the field width as a variable in the printf argument list by writing * in the conversion specification. Constructing the format string and argument list to accomplish our goal, we have:
printf ("%*s%s\n", spaces, "", t[i]);
This says: print the empty string "" in a field at least spaces characters wide (which prints exactly spaces number of spaces; passing " " instead would print one space even when spaces is 0), followed by the actual string in our array of pointers, t[i].
Looking at the diagram again, what would we have to do to shift the string to the center of the 20-character width? Instead of shifting the whole width - length number of spaces, we could shift it only 1/2 that much and it would end up where we want it. (I can hear the gears in your head grinding, and I can smell the smoke -- "but, wait... 5 is an odd number and we are using integers!" -- it doesn't matter, integer division will take care of it, and if we shift by 2 instead of 2.5, it's just fine, you can't print 1/2 a character...) So putting it all together, to handle centered or right justified text, all you need is:
for (i = 0; i < n; i++) {
    int spaces = (int)(w - len[i]);
    if (just == 'c')
        printf ("%*s%s\n", spaces/2, "", t[i]);
    else if (just == 'r')
        printf ("%*s%s\n", spaces, "", t[i]);
}
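If you would rather keep a justified copy of the line in memory than print it, the same field-width trick works with snprintf. The following is only a sketch: justify_line and its parameters are names invented for this illustration, and it assumes the caller supplies a destination buffer of at least width + 1 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy 'src' into 'dst', padded on the left so it is
 * centered ('c') or right ('r') justified within 'width' columns. Any other
 * value of 'just' leaves the text left-justified (no padding).
 * Returns 0 on success, -1 if 'src' is wider than 'width' or 'dst' is
 * too small to hold the result. */
int justify_line (char *dst, size_t dstsz, const char *src,
                  size_t width, char just)
{
    size_t len = strlen (src);

    if (len > width) return -1;

    size_t pad = width - len;           /* spaces needed to right-justify */
    if (just == 'c') pad /= 2;          /* integer division: 5 / 2 is 2 */
    else if (just != 'r') pad = 0;

    if (pad + len + 1 > dstsz) return -1;

    /* "%*s" applied to the empty string prints exactly 'pad' spaces */
    snprintf (dst, dstsz, "%*s%s", (int)pad, "", src);

    return 0;
}
```

With the 15-character string from the diagram and a width of 20, the right-justified copy starts with 5 spaces and the centered copy with 2.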
Putting It All Together With The Rest
Sometimes seeing how the whole thing fits together helps. Same rules. Go through it function-by-function, line-by-line, and ask questions when you get stuck. C is a low-level language, meaning you have to understand where things are in memory. When it comes down to it, programming is really about how to manipulate what you load into memory. Other languages try to hide that from you. C doesn't. That's its strength, and also where you have to concentrate a good part of your learning. Enough babble, here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 256
#define MAXL 64
char **inputfile (char ***t, FILE *fp, size_t *sz);
void formatted (char **t, char just, size_t w, size_t n);
void *xcalloc (size_t n, size_t s);
void freecdp (char **p, size_t n);
unsigned chr2lower (unsigned c);
int main (int argc, char **argv) {

    if (argc < 3) { /* validate required arguments on command line */
        fprintf (stderr, "error: insufficient input. "
                         "usage: %s just width [filename (stdin)]\n",
                 argv[0]);
        return 1;
    }

    char **text = NULL;     /* initialize all variables for main */
    size_t lines = 0;
    char j = 0;
    char just = *argv[1];   /* first character of the justification */
    size_t width = (size_t)strtol (argv[2], NULL, 10);
    /* ternary operator: read from stdin if no filename is given */
    FILE *fp = argc > 3 ? fopen (argv[3], "r") : stdin;

    /* read input from file */
    if (!(inputfile (&text, fp, &lines))) { /* check if return is NULL */
        fprintf (stderr, "error: file read failed '%s'.\n",
                 argc > 3 ? argv[3] : "stdin");
        return 1;
    }

    j = chr2lower (just);   /* force user input to lower-case */
    if (j != 'l' && j != 'r' && j != 'c') {
        fprintf (stderr, "error: invalid justification '%c' "
                         "(defaulting to 'left').\n", just);
        j = 'l';            /* apply the default announced above */
    }

    /* print output in requested justification */
    formatted (text, j, width, lines);

    /* free all memory allocated in program */
    freecdp (text, lines);

    return 0;
}
char **inputfile (char ***t, FILE *fp, size_t *sz)
{
    if (!t || !sz) {    /* validate parameters are not NULL */
        fprintf (stderr, "%s() error: invalid parameters.\n", __func__);
        return NULL;
    }

    if (!fp) {          /* check that file pointer is valid */
        fprintf (stderr, "%s() error: file open failed.\n", __func__);
        return NULL;
    }

    size_t idx = 0;     /* declare/initialize function variables */
    size_t maxl = MAXL;
    char ln[MAXC] = {0};

    /* allocate MAXL number of pointers, all NULL (sentinel) */
    *t = xcalloc (MAXL, sizeof **t);

    while (fgets (ln, MAXC, fp)) {  /* read each line in file */

        size_t len = strlen (ln);   /* calculate length */

        /* remove trailing newline (or carriage return) */
        while (len && (ln[len-1] == '\n' || ln[len-1] == '\r'))
            ln[--len] = 0;

        /* allocate & copy ln saving pointer in (*t)[idx] */
        if (!((*t)[idx] = strdup (ln))) {   /* strdup allocates & copies */
            fprintf (stderr, "%s() virtual memory exhausted.\n", __func__);
            return NULL;
        }
        idx++;

        if (idx == maxl) {  /* check if you reached limit, realloc if needed */
            void *tmp = realloc (*t, maxl * sizeof **t * 2);
            if (!tmp) {
                fprintf (stderr, "%s() virtual memory exhausted.\n", __func__);
                return NULL;
            }
            *t = tmp;       /* set new pointers NULL below (sentinel) */
            memset (*t + maxl, 0, maxl * sizeof **t);
            maxl *= 2;
        }
    }
    *sz = idx;  /* update value at address of sz so it is available in main */

    if (fp != stdin) fclose (fp);

    return *t;
}
/* format 'n' lines of output of 't' justified as specified in 'just'
   (left 'l', right 'r' or centered 'c') within a width 'w'.
   NOTE: 'w' must be at least as large as the longest line in 't'.
*/
void formatted (char **t, char just, size_t w, size_t n)
{
    if (!t || !*t || !n) return;

    size_t i = 0;
    size_t lmax = 0;
    size_t len[n];      /* variable length array (C99) */

    /* calculate the length of each line, set lmax */
    for (i = 0; i < n; i++) {
        len[i] = strlen (t[i]);
        if (len[i] > lmax) lmax = len[i];
    }

    /* handle w < lmax reformat or error here */
    if (w < lmax) {
        fprintf (stderr, "%s() error: invalid width < lmax (%zu).\n",
                 __func__, lmax);
        return;
    }

    /* left justified output */
    if (just == 'l') {
        for (i = 0; i < n; i++) {
            printf ("%s\n", t[i]);
        }
        return;
    }

    /* center or right justified output; the empty string "" padded to a
       field width of 'spaces' prints exactly that many spaces (a " "
       argument would print one space even when 'spaces' is 0) */
    for (i = 0; i < n; i++) {
        int spaces = (int)(w - len[i]);
        if (just == 'c')
            printf ("%*s%s\n", spaces/2, "", t[i]);
        else if (just == 'r')
            printf ("%*s%s\n", spaces, "", t[i]);
    }
}
/* helper functions below for calloc, free, and to lower-case */
void *xcalloc (size_t n, size_t s)
{
    register void *memptr = calloc (n, s);
    if (memptr == 0) {
        fprintf (stderr, "%s() error: virtual memory exhausted.\n",
                 __func__);
        exit (EXIT_FAILURE);
    }

    return memptr;
}

void freecdp (char **p, size_t n)
{
    if (!p) return;
    size_t i;

    for (i = 0; i < n; i++)
        free (p[i]);
    free (p);
}

unsigned chr2lower (unsigned c)
{   return ('A' <= c && c <= 'Z') ? c | 32 : c; }
Example Use/Output
$ ./bin/str_justify l 15 dat/fox_4lines.txt
The quick brown
fox jumps over
a lazy
dog.
$ ./bin/str_justify c 15 dat/fox_4lines.txt
The quick brown
fox jumps over
    a lazy
     dog.
$ ./bin/str_justify r 15 dat/fox_4lines.txt
The quick brown
 fox jumps over
         a lazy
           dog.
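The code above justifies lines that already fit within the width. The question also asks how to "move to the next line" when the input arrives as individual tokens. One common approach, shown here only as a sketch, is greedy filling: keep appending words to the current line while they fit, and start a new line when the next word would exceed the width. The function name wrap_words and its buffer-based interface are assumptions made for this illustration; they are not from the question or from the code above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pack 'n' words into lines of at most 'width'
 * columns using greedy filling. The lines are written to 'dst', each one
 * terminated by '\n'. Returns the number of lines produced, or -1 if a
 * single word is longer than 'width' or 'dst' is too small. */
int wrap_words (char *dst, size_t dstsz, char **words, size_t n, size_t width)
{
    size_t used = 0;    /* bytes written to dst so far */
    size_t col = 0;     /* length of the line currently being built */
    int lines = 0;

    if (!dst || dstsz == 0) return -1;
    dst[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen (words[i]);
        const char *sep = "";           /* nothing before the first word */

        if (len > width) return -1;     /* this word can never fit */

        if (col > 0) {
            if (col + 1 + len > width) {    /* word does not fit: new line */
                sep = "\n";
                col = 0;
                lines++;
            }
            else {                          /* word fits after one space */
                sep = " ";
                col++;
            }
        }

        /* keep room for separator, word, the final '\n' and the nul */
        if (used + 1 + len + 2 > dstsz) return -1;

        used += (size_t) sprintf (dst + used, "%s%s", sep, words[i]);
        col += len;
    }

    if (col > 0) {                      /* terminate the last line */
        strcpy (dst + used, "\n");
        lines++;
    }

    return lines;
}
```

Each resulting line can then be split off at the '\n' characters and handed to the justification code, since every line is guaranteed to be no wider than width.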

Related

How to fscanf word by word in a file?

I have a file with a series of words separated by a white space. For example file.txt contains this: "this is the file". How can I use fscanf to take word by word and put each word in an array of strings?
Then I did this but I don't know if it's correct:
char *words[100];
int i=0;
while(!feof(file)){
fscanf(file, "%s", words[i]);
i++;
fscanf(file, " ");
}
When reading repeated input, you control the input loop with the input function itself (fscanf in your case). While you can also loop continually (e.g. for (;;) { ... }) and check independently whether the return is EOF, whether a matching failure occurred, or whether the return matches the number of conversion specifiers (success), in your case simply checking that the return matches the single "%s" conversion specifier is fine (e.g. that the return is 1).
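For reference, the continual-loop form mentioned above might look like the following sketch. The name read_words and the constant WLEN are assumptions made for this illustration. With a lone "%s" a matching failure cannot really occur, so the middle branch is there only to show where that check would go for other conversion specifiers.

```c
#include <stdio.h>

#define WLEN 32     /* hypothetical per-word buffer size for this sketch */

/* Hypothetical helper: read up to 'max' whitespace-separated words from
 * 'fp' into 'words', checking each possible fscanf outcome separately.
 * Returns the number of words stored. */
size_t read_words (FILE *fp, char words[][WLEN], size_t max)
{
    size_t n = 0;

    for (;;) {
        if (n == max)       /* array full, stop before writing past it */
            break;

        /* "%31s" limits the read to WLEN - 1 characters plus the nul */
        int rtn = fscanf (fp, "%31s", words[n]);

        if (rtn == EOF)     /* end of input (or read error) */
            break;
        if (rtn != 1)       /* matching failure: nothing was converted */
            break;

        n++;                /* success: one word stored */
    }

    return n;
}
```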
Storing each word in an array, you have several options. The most simple is using a 2D array of char with automatic storage. Since the longest non-medical word in the Unabridged Dictionary is 29-characters (requiring a total of 30-characters with the nul-terminating character), a 2D array with a fixed number of rows and fixed number of columns of at least 30 is fine. (dynamically allocating allows you to read and allocate memory for as many words as may be required -- but that is left for later.)
So to set up storage for 128 words, you could do something similar to the following:
#include <stdio.h>
#define MAXW 32 /* if you need a constant, #define one (or more) */
#define MAXA 128
int main (int argc, char **argv) {
char array[MAXA][MAXW] = {{""}}; /* array to store up to 128 words */
size_t n = 0; /* word index */
Now simply open your filename provided as the first argument to the program (or read from stdin by default if no argument is given), and then validate that your file is open for reading, e.g.
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
Now to the crux of your read-loop. Simply loop checking the return of fscanf to determine success/failure of the read, adding words to your array and incrementing your index on each successful read. You must also include in your loop-control a check of your index against your array bounds to ensure you do not attempt to write more words to your array than it can hold, e.g.
    /* "%31s" limits each read to MAXW - 1 characters plus the nul */
    while (n < MAXA && fscanf (fp, "%31s", array[n]) == 1)
        n++;
That's it, now just close the file and use your words stored in your array as needed. For example just printing the stored words you could do:
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < n; i++)
printf ("array[%3zu] : %s\n", i, array[i]);
return 0;
}
Now just compile it with warnings enabled (e.g. -Wall -Wextra -pedantic for gcc/clang, or /W3 for VS, cl.exe) and then test on your file. The full code is:
#include <stdio.h>

#define MAXW  32    /* if you need a constant, #define one (or more) */
#define MAXA 128

int main (int argc, char **argv) {

    char array[MAXA][MAXW] = {{""}};    /* array to store up to 128 words */
    size_t n = 0;                       /* word index */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    /* "%31s" limits each read to MAXW - 1 characters plus the nul */
    while (n < MAXA && fscanf (fp, "%31s", array[n]) == 1)
        n++;

    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (size_t i = 0; i < n; i++)
        printf ("array[%3zu] : %s\n", i, array[i]);

    return 0;
}
Example Input File
$ cat dat/thefile.txt
this is the file
Example Use/Output
$ ./bin/fscanfsimple dat/thefile.txt
array[ 0] : this
array[ 1] : is
array[ 2] : the
array[ 3] : file
Look things over and let me know if you have further questions.
strtok() might be a function that can help you here.
If you know that the words will be separated by whitespace, then calling strtok will return the char pointer to the start of the next word.
Sample code adapted from https://www.systutorials.com/docs/linux/man/3p-strtok/ (line is declared as an array here because strtok writes to the string it is given, so it must not be a string literal):
#include <string.h>
...
char *token;
char line[] = "LINE TO BE SEPARATED";
const char *search = " ";
/* Token will point to "LINE". */
token = strtok(line, search);
/* Token will point to "TO". */
token = strtok(NULL, search);
In your case, the space character would also act as a delimiter in the line.
Note that strtok modifies the string passed in (it overwrites each delimiter it finds with '\0'), so if you need the original line afterwards, make a copy of it first (for example with malloc and strcpy).
It might also be easier to use fread() to read a block from a file
As mentioned in comments, using feof() does not work as would be expected. And, as described in this answer unless the content of the file is formatted with very predictable content, using any of the scanf family to parse out the words is overly complicated. I do not recommend using it for that purpose.
There are many other, better ways to read content of a file, word by word. My preference is to read each line into a buffer, then parse the buffer to extract the words. This requires determining those characters that may be in the file, but would not be considered part of a word. Characters such as \n,\t, (space), -, etc. should be considered delimiters, and can be used to extract the words. The following is a recipe for extracting words from a file: (example code for a few of the items is included below these steps.)
1. Read file to count words, and get the length of the longest word.
2. Use count, and longest values from 1st step to allocate memory for words.
3. Rewind the file.
4. Read file line by line into a line buffer using while(fgets(line, size, fp))
5. Parse each new line into words using delimiters and store each word into arrays of step 2.
6. Use resulting array of words as necessary.
7. free all memory allocated when finished with arrays
Some example of code to do some of these tasks:
// Get count of words, and longest word in file
// (needs <stdio.h> and <ctype.h>)
int longestWord(char *file, int *nWords)
{
    FILE *fp = fopen(file, "r");
    int cnt = 0, longest = 0, numWords = 0;
    int c;

    if (!fp) return -1;

    while ((c = fgetc(fp)) != EOF)
    {
        if (isalnum(c)) cnt++;      // still inside a word
        else if (cnt > 0)           // a delimiter just ended a word
        {
            if (cnt > longest) longest = cnt;
            cnt = 0;
            numWords++;
        }
    }
    if (cnt > 0)                    // last word had no trailing delimiter
    {
        if (cnt > longest) longest = cnt;
        numWords++;
    }
    *nWords = numWords;
    fclose(fp);
    return longest;
}
// Create indexable memory for word arrays (needs <stdlib.h>)
char ** Create2DStr(size_t numStrings, size_t maxStrLen)
{
    char **a = calloc(numStrings, sizeof(char *));
    if (!a) return NULL;
    for (size_t i = 0; i < numStrings; i++)
    {
        a[i] = calloc(maxStrLen + 1, 1);    // +1 is room for null terminator
        if (!a[i])                          // on failure, release what was allocated
        {
            while (i > 0) free(a[--i]);
            free(a);
            return NULL;
        }
    }
    return a;
}
Usage: For a file with 25 words, the longest being 80 bytes:
char **strArray = Create2DStr(25, 80);//creates 25 array locations
//each 80+1 characters long
//(+1 is room for null terminator, added inside Create2DStr.)
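Steps 4 and 5 of the recipe (read a line, then split it on delimiters) are not covered by the two functions above. A minimal sketch of the splitting step follows. The name split_line and the particular delimiter set are assumptions made for this illustration; adjust the delimiters to whatever should not count as part of a word in your file.

```c
#include <string.h>

/* Hypothetical helper: split 'line' in place into words separated by any
 * of the delimiter characters, storing a pointer to each word in 'out'.
 * At most 'max' words are stored. Returns the number of words found.
 * Note: strtok overwrites delimiters in 'line' with '\0', so 'line' must
 * be a writable buffer (for example the one filled by fgets). */
size_t split_line (char *line, char **out, size_t max)
{
    const char *delims = " \t\r\n-,.";
    size_t n = 0;

    for (char *tok = strtok (line, delims);
         tok && n < max;
         tok = strtok (NULL, delims))
        out[n++] = tok;

    return n;
}
```

The pointers stored in out point into line, so each word should be copied (for example with strcpy into the rows from Create2DStr) before the buffer is reused for the next fgets call.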
int i=0;
char words[50][50];
/* stop when the array is full; "%49s" keeps each word within its row */
while(i < 50 && fscanf(file, "%49s", words[i]) == 1)
    i++;
I wouldn't entirely recommend doing it this way, because of the unknown number of words in the file and the unknown length of a "word". Either can exceed 50, in which case extra words are ignored and longer words are split across rows. Just do it dynamically, instead. Still, this should show you how it works.
How can I use fscanf to take word by word and put each word in an array of strings?
Read each word twice: first to find length via "%n". 2nd time, save it. (Inefficient yet simple)
Re-size strings as you go. Again inefficient, yet simple.
// Rough untested sample code - still need to add error checking.
size_t string_count = 0;
char **strings = NULL;
for (;;) {
    long pos = ftell(file);
    int n = 0;
    fscanf(file, "%*s%n", &n);    // record where scanning a "word" stopped
    if (n == 0) break;
    fseek(file, pos, SEEK_SET);   // go back
    strings = realloc(strings, sizeof *strings * (string_count+1)); // increase array size
    strings[string_count] = malloc(n + 1u);       // get enough memory for the word
    fscanf(file, "%s ", strings[string_count]);   // read/save word
    string_count++;               // one more word stored
}
// use strings[], string_count
// When done, free each strings[] and then strings

How to use fgets() in 2d-arrays (multiple dimension arrays)?

#include <stdio.h>
#include <stdlib.h>
void *salloc(int x){
char **pointer;
int i;
pointer = malloc(sizeof(char)*x);
if(pointer == NULL){
exit(-1);
}
for(i=0; i<x; i++){
pointer[i] = malloc(sizeof(char) * 20);
if(pointer[i] == NULL){
exit(-1);
}
}
return pointer;
}
void Input(int value, char **array){
for(i = 0; i < value; i++){
printf("%d ----\n", i);
fgets(array[i], 20, stdin);
printf("%d ----\n", i);
}
}
int main(int argc, char *argv[]){
char **array;
int value = 2;
array = salloc(value);
Input(value, array);
return 0;
}
That is the general idea; I may have missed some syntax.
So I want to read in a string with spaces. If I run this for the value 2, it will print:
0 ----
0 ----
1 ----
"some string"
and it crashes after I press enter.
If I do this with value 1:
it immediately crashes.
However if I replace fgets() with:
scanf("%s", array[i]);
it works (except for the spaces).
So how does fgets() work in 2d-arrays?
Because I get it to work in 1d-arrays. And for some reason I can print 1d-arrays from row 2 when the array only has 2 rows, so it should only be able to print from rows 0 and 1 right?
Here is a demonstrative program that shows how fgets can be used with a 2D array.
#include <stdio.h>

#define N   5
#define M   10

int main( void )
{
    char lines[N][M];
    size_t n = 0;

    while( n < N && fgets( lines[n], sizeof( *lines ), stdin ) != NULL ) ++n;

    /* fgets keeps the '\n', so fputs (which adds none) is used to print */
    for ( size_t i = 0; i < n; i++ ) fputs( lines[i], stdout );

    return 0;
}
If you enter, for example,
One
Two
Three
Four
Five
then the program output will be the same
One
Two
Three
Four
Five
When you do
pointer = malloc(sizeof(char)*x);
you only allocate x characters (i.e. bytes), not pointers to characters. Change to
pointer = malloc(sizeof(char*)*x);
Without the change, you might go out of bounds and have undefined behavior. And this is exactly what happens in your code, you allocate only two bytes to store two pointers, and a single pointer is either four or eight bytes, so you don't allocate enough memory for even a single pointer.
Undefined behavior is a common cause of crashes, but sometimes it might also seem to work.
Note also when taking input with fgets (or any of the line-oriented input functions), fgets will read up to, and include, the '\n' at the end of each line read. You should perform 2 additional tests/operations. (1) You should test that the last character read by fgets is in fact the '\n' character. If it is not, and the buffer is full, your input was truncated by fgets at the length you specified in the second parameter to fgets, and additional characters remain unread for that line. Without this check and some way to handle lines that exceed the specified width, your next call to fgets will read the remaining characters of the current line as your next line of input.
(2) you should remove the newline included by fgets to prevent your strings from containing embedded '\n' characters at the end. (if you are simply parsing numbers from the line and not storing it as a string, this can be handled in several different manners). But, for the general case, you use strlen to locate the end of the string, and then overwrite the '\n' with a nul-terminating character.
In addition to the above, it may make more sense to allocate memory for each individual pointer in array only after a line of data has been read to prevent over-allocating space in your code. Since you are specifying that the allocation size for each string will be 20, a simple character buffer of that size can be used to take input, and after confirming input, you can allocate storage for that line in array. A short example of your function with the checks included and with allocation included in input would be:
#define MAXC 20     /* max chars per read */
...
void Input (int value, char **array)
{
    int i = 0;
    size_t len = 0;
    char buf[MAXC] = {0};

    while (i < value && fgets (buf, MAXC, stdin)) {
        printf ("%d ----\n", i);
        len = strlen (buf);                 /* get length */
        if (len && buf[len-1] == '\n')      /* complete line read */
            buf[--len] = 0;                 /* strip '\n' */
        else if (len + 1 == MAXC)           /* buffer full, no '\n' */
            fprintf (stderr, "warning: chars exceed MAXC, line[%d]\n", i);
        printf ("%d ----\n", i);
        array[i++] = strdup (buf);          /* allocate/copy */
    }
}
Lastly, why choose void as the function type? Why not return the number of values read into array if values are successfully read into array, or 0 otherwise. This will at least allow some indication of success or failure of your read and provide a way of returning the number of lines allocated back to the calling function.
An example of your code incorporating the adjusted allocation, necessary checks, and useful return types would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 20

void *salloc (int x)
{
    char **pointer;

    pointer = malloc (sizeof *pointer * x);
    if (pointer == NULL)
        exit(-1);

    return pointer;
}

int input (int value, char **array)
{
    int i = 0;
    size_t len = 0;
    char buf[MAXC] = {0};

    while (i < value && fgets (buf, MAXC, stdin)) {
        printf ("%d ----\n", i);
        len = strlen (buf);                 /* get length */
        if (len && buf[len-1] == '\n')      /* complete line read */
            buf[--len] = 0;                 /* strip '\n' */
        else if (len + 1 == MAXC)           /* buffer full, no '\n' */
            fprintf (stderr, "warning: chars exceed MAXC, line[%d]\n", i);
        printf ("%d ----\n", i);
        if (!(array[i] = strdup (buf)))     /* allocate/copy, validate */
            break;
        i++;
    }
    return i;
}

int main (int argc, char *argv[]) {

    int i, nlines, value = argc > 1 ? atoi (argv[1]) : 2;
    char **array;

    array = salloc (value);

    if (!(nlines = input (value, array)))   /* validate input */
        return 1;

    for (i = 0; i < nlines; i++)            /* print input */
        printf (" array[%2d] : %s\n", i, array[i]);

    for (i = 0; i < nlines; i++)            /* free memory */
        free (array[i]);
    free (array);

    return 0;
}
Test Input File
The following is a test input file where line 1 (the 2nd line) exceeds 20 characters:
$ cat dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Example Output
$ ./bin/readarray 10 <../dat/captnjack.txt
0 ----
0 ----
1 ----
warning: chars exceed MAXC, line[1]
1 ----
2 ----
2 ----
3 ----
3 ----
4 ----
4 ----
array[ 0] : This is a tale
array[ 1] : Of Captain Jack Spa
array[ 2] : rrow
array[ 3] : A Pirate So Brave
array[ 4] : On the Seven Seas.
Don't forget to use a memory error check program like valgrind to validate your use of the memory you allocate and to ensure you have freed it when it is no longer needed. Let me know if you have any additional questions.

C : How to sort words from variable number of files with frequency # and in alphabetical order

I'm new to C and I'm having trouble writing a C program that takes a variable number of files via command line arguments, sorts the words in (ASCII) alphabetical order, and prints only unique words, but includes the frequencies. I managed to get as far as sorting words from user input in alphabetical order, but I don't know how to properly write the code to take file input, and I also have no clue how to print each unique word only once with its frequency.
here's what I got so far, which takes stdin rather than file and lacks frequency count:
#include <stdio.h>
#include <string.h>
int main(void) {
char a[2048][2048];
int i = 0,
j = 0,
k = 0,
n;
while(i < 2048 && fgets(a[i], 2048, stdin) != NULL)
{
n = strlen(a[i]);
if(n > 0 && a[i][n-1] == '\n')
a[i][n -1] = '\0';
i++;
}
for(j = 0; j < i; j++)
{
char max[2048];
strcpy (max,a[j]);
for(k = j + 1; k < i; k++)
{
if(strcmp(a[k], max) < 0)
{
char temp[2048];
strcpy(temp, a[k]);
strcpy(a[k], max);
strcpy(max, temp);
}
}
strcpy(a[j],max);
}
for( j = 0; j < i; j++){
printf("%s\n", a[j]);
}
return 0;
}
Reading words from a file into an array that holds only unique words, while keeping track of the number of times each word is seen, can be done in a couple of ways. An easy and straightforward approach is to keep 2 separate arrays: the first, a 2D character array of sufficient size to hold the number of words anticipated, and the second, a numeric array (unsigned int or size_t) that holds the number of times each word is seen, at the same index where the word is stored in the character array.
The only challenge while reading words from the file is to determine if a word has been seen before, if not, the new word is added to the seen character array at a given index and the frequency array freq is then updated at that index to reflect the word has been seen 1 time (e.g. freq[index]++;).
If while checking against your list of words in seen, you find the current word already appears at index X, then you skip adding the word to seen and simply update freq[X]++;.
Below is a short example that does just that. Give it a try and let me know if you have any questions:
#include <stdio.h>
#include <string.h>

#define MAXW 100
#define MAXC 32

int main (int argc, char **argv) {

    /* initialize variables & open file or stdin for reading */
    char seen[MAXW][MAXC] = {{ 0 }};
    char word[MAXC] = {0};
    size_t freq[MAXW] = {0};
    size_t i, idx = 0;
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* read 1st word into 'seen' array, update index 'idx'
       ("%31[...]" leaves room for the nul in a MAXC-byte buffer) */
    if (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) {
        strcpy (seen[idx], word);
        freq[idx]++;
        idx++;
    }
    else {
        fprintf (stderr, "error: file read error.\n");
        return 1;
    }

    /* read each word in file */
    while (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) {
        /* check against all words in seen */
        for (i = 0; i < idx; i++) {
            /* if word already in 'seen', update 'freq' count */
            if (strcmp (seen[i], word) == 0) {
                freq[i]++;
                goto skipdup;   /* skip adding word to 'seen' */
            }
        }   /* add word to 'seen', update freq & 'idx' */
        strcpy (seen[idx], word);
        freq[idx]++;
        idx++;

    skipdup:

        if (idx == MAXW) {  /* check 'idx' against MAXW */
            fprintf (stderr, "warning: MAXW words exceeded.\n");
            break;
        }
    }
    if (fp != stdin) fclose (fp);

    printf ("\nthe occurrence of words are:\n\n");
    for (i = 0; i < idx; i++)
        printf (" %-28s : %zu\n", seen[i], freq[i]);

    return 0;
}
Compile
gcc -Wall -Wextra -O3 -o bin/file_words_occur file_words_occur.c
Input
$ cat dat/words.txt
the quick brown fox jumps over the lazy dog. the fox jumps over the dog to avoid the squirrel.
Output
$ ./bin/file_words_occur <dat/words.txt
the occurrence of words are:
the : 8
quick : 1
brown : 1
fox : 2
jumps : 2
over : 2
lazy : 1
dog : 2
to : 1
avoid : 1
squirrel : 2
was : 1
in : 1
path : 1
of : 1
captain : 1
jack : 1
sparrow : 1
a : 1
pirate : 1
so : 1
brave : 1
on : 1
seven : 1
seas : 1
Note: the longest word in the abridged dictionaries is 28 chars long (Antidisestablishmentarianism). It requires space for the nul-terminating character for a total of 29 chars. The choice of MAXC of 32 should accommodate all normal words.
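A related detail: the field width given to a scanf %[ conversion counts only the characters stored and does not include the terminating nul, so for a buffer of MAXC bytes the largest safe width is MAXC - 1 (31 here). A minimal sketch of that, where read_word is a hypothetical helper name that is not part of the code above:

```c
#include <stdio.h>

#define MAXC 32

/* read_word is a hypothetical helper (not part of the answer's code):
 * parse the first word in 'src' into 'word', which must hold MAXC bytes.
 * The width 31 is MAXC - 1, which leaves room for the terminating nul.
 * Returns 1 if a word was stored, 0 otherwise. */
int read_word (const char *src, char *word)
{
    return sscanf (src, " %31[^ ,.\t\n]", word) == 1;
}
```

A word longer than 31 characters is simply cut at 31, rather than written past the end of the buffer.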
Handle Multiple Files + Sorting Words/Occurrences Alphabetically
As noted in the comments, handling multiple files can be done with the existing code, simply by utilizing the code's ability to read from stdin. All you need to do is cat file1 file2 file3 | ./prog_name. Updating the code to handle multiple files as arguments is not difficult either. (You could just wrap the existing body with a for (j = 1; j < argc; j++) loop and open/close each filename provided; some other slight tweaks to the fp declaration are also needed.)
But what's the fun in that? Whenever you think about doing the same thing more than once in your program, the "I should make that a function" lightbulb should wink on. That is the proper way to think about handling repetitive processes in your code. (arguably, since there is just one thing we are doing more than once, and since we could simply wrap that in a for loop, we could get by without a function in this case -- but where is the learning in that?)
OK, so we know we are going to move the file-read/frequency-count code to a function, but what about the sort requirement? That's where we need to change the data handling from 2-arrays to an array of struct. Why go from 2-arrays to handling the data in a struct?
When you sort the words alphabetically, you must maintain the relationship between the seen array and the freq array so after the sort, you have the right number of occurrences with the right word. You cannot independently sort the arrays and keep that relationship. However, if we put both the word and the occurrences of that word in a struct, then we could sort an array of structs by the word and the right number of occurrences remains associated with the right word. e.g. something like the following would work:
typedef struct {
char seen[MAXC];
size_t freq;
} wfstruct;
(wfstruct is just a semi-descriptive name for word-frequency struct, it can be anything that makes sense to you)
Which in your program you will declare as an array of wfstruct with something like:
wfstruct words[MAXW];
(you will actually want to initialize each member to zero -- that is done in the actual code below)
How to sort an array of that? qsort is your friend. qsort will sort a collection of anything so long as you can pass qsort (1) the array, (2) how many elements to sort, (3) the size of the elements, and (4) a compare function that takes a const void pointer to the elements it will compare. This always gives new C programmers fits because you have to figure out (a) how to pass the element of your array-of-whatever as a pointer, and (b) then how to handle getting the data you need back out of the pointer in the function to compare.
The declaration for a comparison function for qsort is:
int compare (const void *a, const void *b);
To write the compare function, all you need to ask yourself is "What do I need to compare to sort my collection the way I want it sorted?" In this case you know you want to sort the array of structs by the word seen in each element of the array of wfstruct. You know seen will be a simple character string, so you can sort using strcmp.
Then the final thing you need to ask yourself is "How in the heck do I get my seen string out of const void *a (and *b) so I can feed it to strcmp?" Here you know that const void *a must represent the basic element of what you will be sorting, which is a wfstruct. So you know that const void *a is really a pointer to wfstruct. Since it is a pointer, you must use the -> operator to access the seen member of the struct (e.g. the seen member is accessed as mystruct->seen).
But "what is the rule regarding dereferencing a void pointer?" (Answer: "you can't dereference a void pointer".) How do you handle this? Simple, you just declare a pointer of type wfstruct in your compare function and typecast a to (wfstruct *). Example:
wfstruct *ap = (wfstruct *)a;
Now you have a good-ole pointer to wfstruct (the struct has no tag, only the typedef name, so the type is written simply as wfstruct). You do the same thing for b and now you can pass ap->seen and bp->seen to strcmp and sort your array of struct:
int compare (const void *a, const void *b)
{
wfstruct *ap = (wfstruct *)a;
wfstruct *bp = (wfstruct *)b;
return (strcmp (ap->seen, bp->seen));
}
The call to qsort in your program is nothing more than:
/* sort words alphabetically */
qsort (words, idx, sizeof *words, compare);
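If you would rather list the most frequent words first, only the compare function needs to change. The following is a minimal sketch, assuming the wfstruct shown above; compare_freq is a hypothetical name, and the alphabetical tie-break for equal counts is my own choice rather than a requirement:

```c
#include <stdlib.h>
#include <string.h>

#define MAXC 32

typedef struct {
    char seen[MAXC];
    size_t freq;
} wfstruct;

/* hypothetical alternative comparator: descending by freq,
 * then ascending by word when the counts are equal */
int compare_freq (const void *a, const void *b)
{
    const wfstruct *ap = a;
    const wfstruct *bp = b;

    if (ap->freq != bp->freq)   /* larger count sorts first */
        return (ap->freq < bp->freq) - (ap->freq > bp->freq);
    return strcmp (ap->seen, bp->seen);
}
```

The call is the same as before with the other function name: qsort (words, idx, sizeof *words, compare_freq);. Comparing with < and > rather than subtracting avoids wrap-around, since freq is unsigned.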
With the basics out of the way, you can now move the needed code to a function to allow you to read multiple files as arguments, keep a total of the number of words seen between files (as well as their frequency) and then sort the resulting array of structs alphabetically.
note: to keep track of the total number of words between multiple files (calls to your function), you can either return the number of words gathered for each file as the return from your read function, and keep a total that way, or you can simply pass a pointer to your total to the read function and have it updated directly in the function. We will take the second approach below.
Putting the pieces together, you get:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXW 100
#define MAXC 32
typedef struct {
char seen[MAXC];
size_t freq;
} wfstruct;
int get_word_freq (wfstruct *words, size_t *idx, FILE *fp);
int compare (const void *a, const void *b);
int main (int argc, char **argv) {
/* initialize variables & open file or stdin for reading */
wfstruct words[MAXW] = {{{ 0 }, 0}};
size_t i, idx = 0;
FILE *fp = NULL;
if (argc < 2) { /* read from stdin */
get_word_freq (words, &idx, stdin);
}
else {
/* read each file given on command line */
for (i = 1; i < (size_t)argc; i++)
{ /* open file for reading */
if (!(fp = fopen (argv[i], "r"))) {
fprintf (stderr, "error: file open failed '%s'.\n",
argv[i]);
continue;
}
/* check 'idx' against MAXW */
if (idx == MAXW) { fclose (fp); break; } /* close the file just opened */
get_word_freq (words, &idx, fp);
}
}
/* sort words alphabetically */
qsort (words, idx, sizeof *words, compare);
printf ("\nthe occurrence of words are:\n\n");
for (i = 0; i < idx; i++)
printf (" %-28s : %zu\n", words[i].seen, words[i].freq);
return 0;
}
int get_word_freq (wfstruct *words, size_t *idx, FILE *fp)
{
char word[MAXC] = {0};
size_t i;
/* read 1st word into array, update index 'idx' */
if (*idx == 0) {
if (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) { /* width is MAXC - 1 */
strcpy (words[*idx].seen, word);
words[*idx].freq++;
(*idx)++;
}
else {
fprintf (stderr, "error: file read error.\n");
fclose (fp); /* close before the early return */
return 1;
}
}
/* read each word in file */
while (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) {
/* check against all words in struct */
for (i = 0; i < *idx; i++) {
/* if word already 'seen', update 'words[i]. freq' count */
if (strcmp (words[i].seen, word) == 0) {
words[i].freq++;
goto skipdup; /* skip adding word to 'words[i].seen' */
}
} /* add to 'words[*idx].seen', update words[*idx].freq & '*idx' */
strcpy (words[*idx].seen, word);
words[*idx].freq++;
(*idx)++;
skipdup:
if (*idx == MAXW) { /* check 'idx' against MAXW */
fprintf (stderr, "warning: MAXW words exceeded.\n");
break;
}
}
fclose (fp);
return 0;
}
/* qsort compare function */
int compare (const void *a, const void *b)
{
wfstruct *ap = (wfstruct *)a;
wfstruct *bp = (wfstruct *)b;
return (strcmp (ap->seen, bp->seen));
}
Output
$ ./bin/file_words_occur_multi dat/words.txt dat/words.txt
the occurrence of words are:
a : 2
avoid : 2
brave : 2
brown : 2
captain : 2
dog : 4
fox : 4
in : 2
jack : 2
jumps : 4
lazy : 2
of : 2
on : 2
over : 4
path : 2
pirate : 2
quick : 2
seas : 2
seven : 2
so : 2
sparrow : 2
squirrel : 4
the : 16
to : 2
was : 2
Passing Index (idx) as Non-Pointer
As mentioned above, there are two ways to keep track of the number of unique words seen across multiple files: (1) pass the index and keep the total in main, or (2) pass a pointer to the index and update its value directly in the function. The example above passes a pointer. Since the additional syntax required to dereference and properly use the pointer value can be challenging for those new to C, here is an example of passing idx as a simple variable and keeping track of the total in main.
(note: you are required to pass the index either way, it's your choice whether you pass idx as a regular variable and work with a copy of the variable in the function, or whether you pass idx as a pointer and operate on the value directly in the function)
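Stripped of the word-counting details, the difference between the two approaches can be sketched as follows. Both function names are hypothetical and exist only to illustrate the calling convention:

```c
#include <stddef.h>

/* by pointer: the caller's total is updated in place */
void count_by_pointer (size_t *idx, size_t nfound)
{
    *idx += nfound;
}

/* by value: the function works on a copy and returns the new total,
 * which the caller must assign back */
size_t count_by_value (size_t idx, size_t nfound)
{
    return idx + nfound;
}
```

With the pointer version the call is count_by_pointer (&idx, n); with the value version it is idx = count_by_value (idx, n);. Forgetting the assignment in the second form silently discards the update.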
Here are the simple changes to get_word_freq; the changes required in main follow (note: size_t is chosen as the return type rather than int because the array index can never be negative):
size_t get_word_freq (wfstruct *words, size_t idx, FILE *fp)
{
char word[MAXC] = {0};
size_t i;
/* read 1st word into array, update index 'idx' */
if (idx == 0) {
if (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) { /* width is MAXC - 1 */
strcpy (words[idx].seen, word);
words[idx].freq++;
idx++;
}
else {
fprintf (stderr, "error: file read error.\n");
fclose (fp); /* close before the early return */
return idx;
}
}
/* read each word in file */
while (fscanf (fp, " %31[^ ,.\t\n]%*c", word) == 1) {
/* check against all words in struct */
for (i = 0; i < idx; i++) {
/* if word already 'seen', update 'words[i]. freq' count */
if (strcmp (words[i].seen, word) == 0) {
words[i].freq++;
goto skipdup; /* skip adding word to 'words[i].seen' */
}
} /* add to 'words[idx].seen', update words[idx].freq & 'idx' */
strcpy (words[idx].seen, word);
words[idx].freq++;
idx++;
skipdup:
if (idx == MAXW) { /* check 'idx' against MAXW */
fprintf (stderr, "warning: MAXW words exceeded.\n");
break;
}
}
fclose (fp);
return idx;
}
The changes required in main:
...
if (argc < 2) { /* read from stdin */
idx = get_word_freq (words, idx, stdin);
}
else {
/* read each file given on command line */
for (i = 1; i < (size_t)argc; i++)
{ /* open file for reading */
...
/* check 'idx' against MAXW */
if ((idx = get_word_freq (words, idx, fp)) == MAXW)
break;
}
}
...
Let me know if you have further questions.
There are still many things to add to your program!
Loop over input files given on command line. A simple C way could be:
int main(int argc, char *argv[]) {
FILE *fd;
...
while (*(++argv) != NULL) { /* pre-increment: skip argv[0], stop at the NULL */
if (strcmp(*argv, "-") == 0) { /* allow - to stand for stdin */
fd = stdin;
}
else {
fd = fopen(*argv, "r");
if (fd == NULL) {
/* process error condition */
...
}
}
/* process file */
...
if (fd != stdin) fclose(fd); /* don't forget to close */
}
return 0;
}
Split the files into words
char word[64];
int cr;
while ((cr = fscanf(fd, "%63s", word)) == 1) {
filter(word); /* optionally convert to lower case, remove punctuation... */
/* process word */
...
}
Store the words in a container and count their occurrences. At the simplest level, you can use an array with linear search, but a tree would be much better.
unsigned int maxWord = 2048, totWord = 0, nWord = 0;
typedef struct {
char *word;
int count;
} stat;
stat * st = calloc(maxWord, sizeof(stat));
and later
void add(stat *st, const char * word) {
unsigned int i;
totWord += 1;
for (i=0; i<nWord; i++) {
if (strcmp(word, st[i].word) == 0) {
st[i].count += 1;
return;
}
}
if (nWord < maxWord) {
st[nWord].word = strdup(word);
st[nWord].count += 1;
nWord += 1;
}
}
You now have to glue the pieces above together and sort the st array (with qsort); the frequency of each word is then ((float) st[i].count) / totWord.
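That last step could look roughly like the sketch below. It rests on a few assumptions: wordstat stands in for the stat struct above (renamed here only to avoid any clash with the POSIX stat function), and by_count and rel_freq are hypothetical names.

```c
#include <stdlib.h>

typedef struct {
    char *word;
    int count;
} wordstat;   /* stand-in for the 'stat' typedef above */

/* qsort comparator: most frequent word first */
int by_count (const void *a, const void *b)
{
    const wordstat *sa = a, *sb = b;
    return (sa->count < sb->count) - (sa->count > sb->count);
}

/* relative frequency of one entry; 0 if no words were counted */
double rel_freq (const wordstat *s, unsigned int totWord)
{
    return totWord ? (double) s->count / totWord : 0.0;
}
```

After the last file has been read, qsort (st, nWord, sizeof *st, by_count); orders the array, and a loop over i < nWord can print st[i].word next to rel_freq (&st[i], totWord).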

File Handling + character manipulation

This is my code.
The input numbers are
1234567890
The output of this code should be
(123)456-7890
but the output is different. Any advice or error fixes in my code?
#include <stdio.h>
#include <ctype.h>
int main()
{
char ch;
int a[100], s[100], str, k, i;
FILE *fp;
fp = fopen("number.c", "r");
while ( ( ch = fgetc(fp) ) != EOF )
{
k = 0;
a[k] = '(';
a[k+4] = ')';
a[k+8] = '-';
for (i = 0; s[i] != '\0'; i++)
{
if (isdigit(s[i]))
{
a[k++] = s[i];
if (k == 3)
{
k++;
}
}
printf("%s", a);
}
fclose(fp);
return 0;
}
}
This looks like an assignment from a first year course in CS. If so, I would say find a TA during office hours and discuss.
There are several issues with the code:
Your outer loop is intending to read a line at a time from a file and populate the s array. It is instead reading a character at a time and populating the ch variable.
As mentioned in the comments, you are not accounting for the "-" when putting characters into the a array.
You are not terminating your string in the a array.
There may be different schools of thought on this in C, but I would make s and a char[] instead of int[].
My advice would be to get out a piece of paper and make spaces for each of your variables. Then read your code line by line and manipulate your variables the way you expect the computer to execute what is written. If you can read what is written, rather than what you expect the code to do, then the issues will become apparent.
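Applying the points above, one possible shape for the fix is sketched below. It assumes each line holds exactly 10 digits (any other characters are ignored) and uses the "(123)456-7890" layout from the question; format_line is a hypothetical name, not something from the original code.

```c
#include <ctype.h>
#include <stdio.h>

/* format_line (hypothetical helper): collect the digits in 'line' and
 * write "(ddd)ddd-dddd" to 'out', which must hold at least 14 bytes.
 * Returns 1 on success, 0 unless exactly 10 digits were found. */
int format_line (const char *line, char *out)
{
    char d[11];                     /* char, not int, so it is a string */
    int k = 0;

    for (; *line; line++) {
        if (isdigit ((unsigned char) *line)) {
            if (k == 10)
                return 0;           /* too many digits */
            d[k++] = *line;
        }
    }
    if (k != 10)
        return 0;                   /* too few digits */
    d[k] = '\0';                    /* terminate the string */

    /* the punctuation comes from the format, so no index bookkeeping */
    sprintf (out, "(%.3s)%.3s-%.4s", d, d + 3, d + 6);
    return 1;
}
```

In main, each line would be read into a char buffer with fgets and passed to format_line, printing out only when it returns 1.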
/* ugly: The old phone #
nice: The formatted phone #
*/
#include <stdio.h>
#include <string.h>
void fmtpn(const char *ugly, char *nice)
{
int i, j;
/* add one to allocate space for null terminator */
char first3[3 + 1], next3[3 + 1], last4[4 + 1];
/* require exactly 10 characters, all of them digits */
if (strlen(ugly) != 10 || strspn(ugly, "0123456789") != 10) {
strcpy(nice, "Invalid pn!");
return;
}
for (i = 0; i < 3; ++i)
first3[i] = ugly[i];
first3[i] = 0; /* null terminate the string */
for (j = 0; j < 3; ++i, ++j)
next3[j] = ugly[i];
next3[j] = 0; /* null terminate */
for (j = 0; j < 4; ++i, ++j)
last4[j] = ugly[i];
last4[j] = 0; /* null terminate */
sprintf(nice, "(%s) %s-%s", first3, next3, last4);
}
To read from the file:
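FILE *fp;
char ugly[32], good[32];
if ((fp = fopen("file", "r")) != NULL) {
if (fgets(ugly, sizeof ugly, fp)) {
ugly[strcspn(ugly, "\n")] = 0; /* drop the newline fgets keeps */
fmtpn(ugly, good);
puts(good);
}
fclose(fp);
}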
No love for sscanf?
#include <stdio.h>
int prettyprint(char *input, char *output)
{
int n[10], ret;
ret = sscanf(input, "%1d%1d%1d%1d%1d%1d%1d%1d%1d%1d", &(n[0]), &(n[1]),
&(n[2]), &(n[3]), &(n[4]), &(n[5]), &(n[6]),
&(n[7]), &(n[8]), &(n[9]));
if (ret != 10) {
fprintf(stderr, "invalid input\n");
return -1; /* don't format uninitialized digits */
}
sprintf(output, "(%1d%1d%1d) %1d%1d%1d-%1d%1d%1d%1d",
n[0], n[1], n[2],
n[3],n[4], n[5],
n[6], n[7], n[8], n[9]);
return 0;
}
int main(int argc, char **argv)
{
char digits[] = "0123456789";
char output[256];
prettyprint(digits, output);
printf("%s\n", output);
}
You have other options aside from looping through your string to build the phone number. Sometimes, when dealing with fixed strings or known quantities, a straightforward packing of the characters into a fixed format is a lot simpler than picking the characters out in loops.
For example, here you know you are dealing with a 10 char string of digits. In your code you can read/parse each line into a string of 10 digits. Then your only task is to format those 10 digits into the phone number. Using a pointer for each string and then strncpy is about as easy as anything else:
#include <stdio.h>
#include <string.h>
int main (void) {
char *digits = "1234567890";
char *p = digits;
char phonenum[15] = {0};
char *pf = phonenum;
/* build formatted phone number */
*pf++ = '(';
strncpy (pf, p, 3);
pf += 3, p += 3;
*pf++ = ')';
*pf++ = ' '; /* note: included space, remove line if unwanted */
strncpy (pf, p, 3);
pf += 3, p += 3;
*pf++ = '-';
strncpy (pf, p, 4);
pf += 4;
*pf = 0;
printf ("\n digits : %s\n phone : %s\n\n", digits, phonenum);
return 0;
}
Output
$ ./bin/phnumbld
digits : 1234567890
phone : (123) 456-7890
You can easily turn the code above into a simple function that creates a formatted phone number given any 10-digit string. Breaking your code down into functional pieces not only makes your code easier to read and write, but it also builds flexibility and ease of maintenance into your code. If you were dealing with an actual dial string that included the international dialing prefix and country code, you could easily format the last 10 digits of the longer string by using a pointer to the appropriate beginning character.
With File Handling
Writing anything in C is no different. You simply break the problem down into discrete operations and then write small bits of code to handle each part of the problem. As you get more experience, you will build a collection of routines to handle most situations.
The code below declares three constants: ACPN (area code phone number length), MAXC (maximum digits in dial string including country code and international dialing prefix), and MAXS (maximum number of chars in a line to read from the file).
Your options for reading lines of data in C are broken into two broad categories, character oriented input and line oriented input. When reading lines from a file, in most cases line oriented input is the proper choice. You read a line of data at a time into a buffer, then you parse the information you need from the buffer. Your primary choices for line oriented input in C are fgets and getline. We use the standard fgets below.
Below, the code reads a line of data, then calls get_n_digits to extract up to MAXC - 1 digits from the line into a separate buffer (numstr). The number string is then passed to fmt_phone, which takes the last 10 digits in the string (discarding any initial country code or int'l dialing prefix) and formats those digits as a telephone number. You can adjust any part as needed to suit your input file:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define ACPN 10
#define MAXC 16
#define MAXS 256
size_t strip_newline (char *s);
char *get_n_digits (char *numstr, char *s, size_t n);
char *fmt_phone (char *fmts, char *s, size_t n);
int main (int argc, char **argv) {
/* open file or read from stdin */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
fprintf (stderr, "error: failed to open file for reading\n");
return 1;
}
char numstr[MAXC] = {0}; /* string of digits (max MAXC - 1) */
char fmtstr[MAXC] = {0}; /* formatted phone number string */
char line[MAXS] = {0}; /* line buffer holding full line */
/* read each line from fp (up to MAXS chars) */
while (fgets (line, MAXS, fp))
{
size_t len = strip_newline (line); /* strip trailing newline */
get_n_digits (numstr, line, MAXC); /* get MAXC digits from line */
printf ("\n read : %s (%zu chars), taking last 10 of : %s\n",
line, len, numstr);
/* format last 10 digits into phone number */
fmt_phone (fmtstr, numstr, ACPN);
printf (" phone : %s\n", fmtstr);
}
if (fp != stdin) fclose (fp);
return 0;
}
size_t strip_newline (char *s)
{
size_t len = strlen (s);
if (len && s[len - 1] == '\n') /* strip only if a newline is present */
s[--len] = 0;
return len;
}
/* extract up to n - 1 digits from string s, copy to numstr */
char *get_n_digits (char *numstr, char *s, size_t n)
{
char *p = s;
size_t idx = 0;
while (*p && idx < n - 1) {
if (*p >= '0' && *p <= '9')
numstr[idx++] = *p;
p++;
}
numstr[idx] = 0;
return numstr;
}
/* format last n (10) digits in s into a formatted
telephone number: (xxx) yyy-zzzz, copy to fmts.
'last 10' accounts for country code and international
dialing prefix at beginning of dial string.
*/
char *fmt_phone (char *fmts, char *s, size_t n)
{
/* validate strings */
if (!fmts || !s) {
fprintf (stderr, "%s() error: invalid string parameter.\n", __func__);
if (fmts) *fmts = 0; /* fmts itself may be the NULL argument */
return fmts;
}
/* validate length of n */
if (n < ACPN) {
fprintf (stderr, "%s() error: insufficient size 'n' for format.\n", __func__);
*fmts = 0;
return fmts;
}
/* validate length of s */
size_t len = strlen (s);
if (len < n) {
fprintf (stderr, "%s() error: insufficient digits in string.\n", __func__);
*fmts = 0;
return fmts;
}
/* set start pointer to last 10 digits */
char *p = len > n ? s + len - n : s;
char *pf = fmts;
/* build formatted phone number */
*pf++ = '(';
strncpy (pf, p, 3);
pf += 3, p += 3;
*pf++ = ')';
*pf++ = ' ';
strncpy (pf, p, 3);
pf += 3, p += 3;
*pf++ = '-';
strncpy (pf, p, 4);
pf += 4;
*pf = 0;
return fmts;
}
Compile with gcc -Wall -Wextra -o progname sourcename.c
Example Input
$ cat dat/pnumtest.txt
123456789012345
12345678901234
1234567890123
123456789012
12345678901
1234567890
123456789
Example Output
$ ./bin/phnum dat/pnumtest.txt
read : 123456789012345 (15 chars), taking last 10 of : 123456789012345
phone : (678) 901-2345
read : 12345678901234 (14 chars), taking last 10 of : 12345678901234
phone : (567) 890-1234
read : 1234567890123 (13 chars), taking last 10 of : 1234567890123
phone : (456) 789-0123
read : 123456789012 (12 chars), taking last 10 of : 123456789012
phone : (345) 678-9012
read : 12345678901 (11 chars), taking last 10 of : 12345678901
phone : (234) 567-8901
read : 1234567890 (10 chars), taking last 10 of : 1234567890
phone : (123) 456-7890
read : 123456789 (9 chars), taking last 10 of : 123456789
fmt_phone() error: insufficient digits in string.
phone :
Note: there are many, many different ways to approach this problem, this is but one.
Note2: while not required for this code, I included a function showing how to strip the trailing newline ('\n') from the input read by fgets. It is never a good idea to leave newlines dangling from strings in your code. While here they would not have caused a problem, in most cases they will bite you if you are not aware of them. So get in the practice of handling/removing the trailing newlines when using fgets or getline to read from a file. (note: getline provides the number of characters actually read as its return, so you can avoid calling strlen and simply use the return of getline to remove the newline in that case.)
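For the getline case mentioned in the note, a minimal sketch might look like this. It assumes a POSIX system (getline is not part of standard C), and read_line is a hypothetical wrapper name:

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline */
#include <stdio.h>
#include <stdlib.h>               /* free, for the caller */
#include <sys/types.h>            /* ssize_t */

/* read_line (hypothetical wrapper): read one line with POSIX getline and
 * strip the trailing newline using getline's return value instead of
 * calling strlen. Returns the stripped length, or -1 on EOF or error. */
ssize_t read_line (char **line, size_t *cap, FILE *fp)
{
    ssize_t nread = getline (line, cap, fp);

    if (nread > 0 && (*line)[nread - 1] == '\n')
        (*line)[--nread] = '\0';
    return nread;
}
```

The caller starts with char *line = NULL; size_t cap = 0;, calls read_line (&line, &cap, fp) in a loop until it returns -1, and frees line once at the end.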

Searching and Reading a text file

This is my first time asking a question on here so I'll try to do my best. I'm not that great at C; I'm only in Intermediate C programming.
I'm trying to write a program that reads a file, which I got working. But I have to search for a word and then save the word after it into an array. What I have going right now is
for(x=0;x<=256;x++){
fscanf(file,"input %s",insouts[x][0]);
}
In the file there are lines that say "input A0;" and I want it to save "A0" to insouts[x][0]. 256 is just a number I picked because I don't know how many inputs it might have in the text file.
I have insouts declared as:
char * insouts[256][2];
Use fgets() & sscanf(). Separate I/O from format scanning.
#define N (256)
char insouts[N][2+1]; // note: no * and 2nd dimension is 3
for(size_t x = 0; x < N; x++){
char buf[100];
if (fgets(buf, sizeof buf, stdin) == NULL) {
break; // I/O error or EOF
}
int n = 0;
// 2 this is the max length of characters for insouts[x]. A \0 is appended.
// [A-Za-z0-9] this is the set of legitimate characters for insouts
// %n record the offset of the scanning up to that point.
int result = sscanf(buf, "input %2[A-Za-z0-9]; %n", insouts[x], &n);
if ((result != 1) || (buf[n] != '\0')) {
; // format error
}
}
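To see how the %n check behaves on good and bad lines, the scan can be pulled into a small function. This is only a sketch of the same idea; parse_input is a hypothetical name, and the extra n > 0 test is my own way of making the "scan stopped early" case explicit.

```c
#include <stdio.h>

/* parse_input (hypothetical helper): extract the 1-2 character name from a
 * line of the form "input A0;". 'name' must hold at least 3 bytes.
 * Returns 1 if the whole line matched, 0 otherwise. */
int parse_input (const char *buf, char *name)
{
    int n = 0;

    /* %n stores how many characters were consumed; it stays 0 if the
     * scan stops before reaching it (e.g. the ';' is missing) */
    if (sscanf (buf, "input %2[A-Za-z0-9]; %n", name, &n) != 1)
        return 0;
    return n > 0 && buf[n] == '\0';
}
```

A line such as "input A0;" followed by a newline is accepted, while "input A0" without the semicolon, "output A0;", and "input A0; junk" are all rejected.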
You want to pass the address of the x'th element of the array and not the value stored there. You can use the address-of operator & to do this.
I think
for(x = 0;x < 256; x++){
fscanf(file,"input %s", &insouts[x][0]);
// you could use insouts[x], which should be equivalent to &insouts[x][0]
}
would do the trick :)
Also, insouts is declared as an array of pointers, which reserves no storage for the characters themselves. Declare it as an array of char instead, and keep in mind that strings need to be terminated by a null character, so a 2-character name needs 3 bytes:
char insouts[256][3];
However, I'm pretty sure the %s will match A0; and not just A0, so you need to account for this as well. You can use %c together with a width to read a given number of characters. However, you have to add the null byte yourself. This should work (not tested):
char insouts[256][3];
for(x = 0; x < 256; x++) {
if (fscanf(file, " input %2c;", insouts[x]) != 1) /* leading space skips the previous newline */
break; /* no more matching lines */
insouts[x][2] = '\0';
}
Rather than trying to use fscanf why don't you use "getdelim" with ';' as the delimiter.
According to the man page
" getdelim() works like getline(), except that a line delimiter other than newline can be specified as the delimiter argument. As with getline(), a delimiter character is not added if one was not present in the input before end of file was reached."
So you can do something like (untested and uncompiled code)
char *line = NULL;
size_t n = 0;
ssize_t read; /* getdelim returns ssize_t, -1 on EOF or error */
size_t alloc = 100, lc = 0, i;
char **buff = calloc(alloc, sizeof *buff); // since you don't know the file size, start with 100 slots and realloc if you need more
FILE *fp = fopen("FILE TO BE READ ", "r");
while ((read = getdelim(&line, &n, ';', fp)) != -1) {
printf("%s", line); // This should have "input A0;"
// you can use either sscanf or strtok here and get A0 out
char out[64];
// the leading space skips the newline left over from the previous record
if (sscanf(line, " input %63[^;]", out) != 1)
continue;
if (lc == alloc) {
alloc = alloc + 50;
buff = realloc(buff, sizeof *buff * alloc);
}
buff[lc++] = strdup(out); // keep a copy; 'out' is reused on each pass
}
free(line);
for (i = 0; i < lc; i++)
printf ("%s\n", buff[i]);

Resources