Searching and Reading a text file - c

This is my first time asking a question here, so I'll do my best. I'm not that great at C; I'm only in Intermediate C programming.
I'm trying to write a program that reads a file, which I got working. But I have to search for a word and then save the word after it into an array. What I have right now is:
for(x=0;x<=256;x++){
fscanf(file,"input %s",insouts[x][0]);
}
In the file there are lines that say "input A0;" and I want it to save "A0" to insouts[x][0]. 256 is just a number I picked because I don't know how many inputs it might have in the text file.
I have insouts declared as:
char * insouts[256][2];

Use fgets() & sscanf(). Separate I/O from format scanning.
#define N (256)
char insouts[N][2+1]; // note: no * and 2nd dimension is 3
for(size_t x = 0; x < N; x++){
char buf[100];
if (fgets(buf, sizeof buf, stdin) == NULL) {
break; // I/O error or EOF
}
int n = 0;
// 2 this is the max length of characters for insouts[x]. A \0 is appended.
// [A-Za-z0-9] this is the set of legitimate characters for insouts
// %n record the offset of the scanning up to that point.
int result = sscanf(buf, "input %2[A-Za-z0-9]; %n", insouts[x], &n);
if ((result != 1) || (buf[n] != '\0')) {
; // format error
}
}
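To make the split between reading and scanning concrete, here is a small sketch of the scanning half on its own. The helper name parse_input_line and its return convention are assumptions made for this illustration, not part of the question's code; the idea is to hand it each line that fgets() delivers.

```c
#include <stdio.h>

/* Hypothetical helper: extract the 1-2 character name from a line such as
   "input A0;\n". Returns 1 on success, 0 on a format error. */
int parse_input_line(const char *buf, char name[3])
{
    int n = 0;
    /* %2[A-Za-z0-9] stores at most 2 characters plus '\0';
       %n records how far scanning got, so trailing junk can be detected. */
    int result = sscanf(buf, "input %2[A-Za-z0-9]; %n", name, &n);

    /* n stays 0 when the ';' was never matched */
    return result == 1 && n > 0 && buf[n] == '\0';
}
```

Because the helper never touches the file, it can be exercised with plain string literals, and the fgets() loop only has to decide what to do when it returns 0.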

You want to pass the address of the x'th element of the array and not the value stored there. You can use the address-of operator & to do this.
I think
for(x = 0;x < 256; x++){
fscanf(file,"input %s", &insouts[x][0]);
// you could use insouts[x], which should be equivalent to &insouts[x][0]
}
would do the trick :)
Also, you are only allocating 2 elements for every string, and the array holds pointers rather than characters. Keep in mind that strings need to be terminated by a null character, so you should change the array declaration to
char insouts[256][3];
However, I'm pretty sure the %s will match A0; and not just A0, so you might need to account for this as well. You can use %c together with a width to read a given number of characters. However, you have to add the null byte yourself. This should work (not tested):
char insouts[256][3];
for(x = 0; x < 256; x++) {
    // the leading space skips the newline left over from the previous line
    if (fscanf(file, " input %2c;", insouts[x]) != 1)
        break; // no more matching lines
    insouts[x][2] = '\0';
}

Rather than trying to use fscanf why don't you use "getdelim" with ';' as the delimiter.
According to the man page
" getdelim() works like getline(), except that a line delimiter other than newline can be specified as the delimiter argument. As with getline(), a delimiter character is not added if one was not present in the input before end of file was reached."
So you can do something like (untested and uncompiled code)
char *line = NULL;
size_t n = 0;
ssize_t nread; // getdelim() returns ssize_t, -1 on EOF or error
size_t alloc = 100;
size_t lc = 0;
char **buff = calloc(alloc, sizeof *buff); // since you don't know the file size have 100 slots and realloc if you need more
FILE *fp = fopen("FILE TO BE READ ", "r");
int deli = ';';
while ((nread = getdelim(&line, &n, deli, fp)) != -1) {
    printf("%s", line); // This should have "input A0;"
    // you can use either sscanf or strtok here and get A0 out
    char out[32];
    if (sscanf(line, " input %31[^;]", out) != 1)
        continue; // not an "input" record (e.g. the trailing newline)
    if (lc >= alloc) {
        alloc = alloc + 50;
        buff = realloc(buff, sizeof *buff * alloc);
    }
    buff[lc++] = strdup(out); // keep a copy; line is reused on the next call
}
free(line);
for (size_t i = 0; i < lc; i++)
    printf("%s\n", buff[i]);

Related

How can I create a 2D array to store a collection of words scanned from a .txt file in C?

I am working on a program where I want to scan a .txt file that contains a poem. After scanning the poem, I want to be able to store each individual word as a single string and store those strings in a 2D array. For example, if my .txt file contains the following:
Haikus are easy.
But sometimes they don't make sense.
Refrigerator.
I want to be able to store each word as the following in a single array:
H a i k u s \0
a r e \0
e a s y . \0
B u t \0
s o m e t i m e s \0
t h e y \0
d o n ' t \0
m a k e \0
s e n s e . \0
R e f r i g e r a t o r . \0
So far, this is the code I have. I am having difficulties understanding 2D arrays, so if someone could explain that to me as well in context to this problem, that would be great. I am still learning the C language, so it takes time for me to understand some things. I have been scratching my head at this for a few hours now and am using this as help after trying everything I could think of!
The following is my function for getting the words and storing them in to arrays (it also returns the number of words there are, which is used separately for a different part of the program):
int getWords(int maxSize, FILE* inFile, char strings[][COL_SIZE]){
int numWords;
for(int i = 0; i < maxSize; i++){
fscanf(inFile, "%s", strings[i]);
while(fscanf(inFile, "%s", strings[i] == 10){
numWords++;
}
}
return numWords;
}
Here's the code I have where I call the function in the main function (I am not sure what numbers to set the COL_SIZE and MAX_LENGTH to, like I said, I am new to this and am trying my best to understand 2D arrays and how they work):
#define COL_SIZE 10
#define MAX_LENGTH 500
int main(){
FILE* fp;
char strArray[MAX_LENGTH][COL_SIZE];
fp = fopen(FILE_NAME, "r");
if(fp == NULL){
printf("File could not be found!");
}
else{
getWords(MAX_LENGTH, fp, strArray);
fclose(fp);
}
return 0;
}
What you are not understanding is that COL_SIZE must be large enough to store the longest word, +1 for the nul-terminating character. Take:
R e f r i g e r a t o r . \0
----------------------------
1 2 3 4 5 6 7 8 9 0 1 2 3 4 - > 14 characters of storage required
You declare a 500 x 10 2D array of char:
char strArray[500][10]
"Refrigerator." cannot fit in one row of strArray, so what happens is "Refrigerat" fills one row and then "or.\0" spills over into the first 4 characters of the next.
There are a number of ways to handle the input, but if you want to use fscanf, then you need (1) to include a field-width modifier with the string conversion to limit the number of characters stored to the amount of storage available, and (2) validate the next character after those you have read is a whitespace character, e.g.
#include <ctype.h>
int getWords(int maxSize, FILE* inFile, char strings[][COL_SIZE])
{
char c;
int n = 0;
while (n < maxSize) {
int rtn = fscanf (inFile, "%9s%c", strings[n], &c);
if (rtn == 2 && isspace(c))
n++;
else if (rtn == 1) {
n++;
break;
}
else
break;
}
return n;
}
Note the format string contains a field-width modifier of one less than the total number of characters available, and then the character conversion stores the next character so you can validate that it is whitespace (if it isn't, you have a word that is too long to fit in your array).
With any user-input function, you cannot use it correctly unless you check the return. Above, the return from fscanf() is saved in rtn. If you have a successful conversion of both your string (limited to COL_SIZE - 1 by your field-width modifier) and c, and c is whitespace, you have a successful read of the word and you are not yet at EOF. If the return is 1, you have a successful read of the word and you have reached EOF (non-POSIX line end on last line). Otherwise, you will either reach the limit of MAX_LENGTH and exit the loop, or you will reach EOF and fscanf() will return EOF, forcing an exit of the loop through the else clause.
Lastly, don't skimp on buffer size. The longest word in the non-medical unabridged dictionary is 29 characters, requiring a total of 30 characters of storage, so #define COL_SIZE 32 makes more sense than 10 (the field width in the format string then becomes 31).
Look things over and let me know if you have more questions.
stdio.h Only
If you are limited to stdio.h, then you can manually confirm that c contains a whitespace character:
if (rtn == 2 && (c == ' ' || c == '\t' || c == '\n'))
n++;
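One maintenance hazard in the fscanf() loop of this answer is that the 9 in "%9s" has to be kept in sync with COL_SIZE by hand. A possible way to tie the two together, sketched under the assumption that you are free to introduce a second macro (WORD_MAX below is a made-up name), is to build the format string with preprocessor stringification:

```c
#include <ctype.h>
#include <stdio.h>

#define WORD_MAX 31               /* assumed: longest word stored, without '\0' */
#define COL_SIZE (WORD_MAX + 1)   /* row size, including the '\0' */

#define STR_(x) #x
#define STR(x)  STR_(x)           /* expands WORD_MAX before stringifying it */

int getWords(int maxSize, FILE *inFile, char strings[][COL_SIZE])
{
    int n = 0;
    while (n < maxSize) {
        char c;
        /* the format is "%31s%c": the width always matches the array */
        int rtn = fscanf(inFile, "%" STR(WORD_MAX) "s%c", strings[n], &c);
        if (rtn == 2 && isspace((unsigned char)c))
            n++;                  /* word fit and was followed by whitespace */
        else if (rtn == 1) {
            n++;                  /* last word, EOF directly after it */
            break;
        }
        else
            break;                /* EOF, or a word too long for one row */
    }
    return n;
}
```

The stringified macro must expand to a plain number, which is why the sketch defines WORD_MAX directly and derives COL_SIZE from it rather than the other way round.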
You probably don't want a traditional 2D array. Those are usually rectangular, which is not well suited to storing variable length words. Instead, you would want an array of pointers to buffers, sort of like argv is. Since the goal is to load from a file, I suggest using a contiguous buffer rather than allocating a separate one for each word.
The general idea is this:
First pass: get total file size and read in the whole thing (+1 byte for trailing NUL).
Second pass: count the words and split them with NULs.
Third pass: allocate a buffer for the word pointers and fill it in
Here's how to load the entire file:
#include <sys/stat.h>
#include <stdlib.h>
#include <stdio.h>
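char *load_file(const char *fname, int *n)
{
    struct stat st;
    if(stat(fname, &st) == -1 || st.st_size == 0) return NULL;
    char *buffer = malloc(st.st_size + 1);
    if(buffer == NULL) return NULL;
    FILE *file = fopen(fname, "r");
    /* fread returns the number of items read; anything short of the full size is a failure */
    if(file == NULL || fread(buffer, 1, st.st_size, file) != (size_t)st.st_size) {
        free(buffer);
        buffer = NULL;
    } else {
        buffer[st.st_size] = '\0'; /* the +1 byte: trailing NUL */
        *n = st.st_size;
    }
    if(file != NULL) fclose(file);
    return buffer;
}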
You can count the words by just stepping through the file contents and marking the end of each word.
#include <ctype.h>
char *skip_nonword(char *text, char *end)
{
while(text != end && !isalpha(*text)) text++;
return text;
}
char *skip_word(char *text, char *end)
{
while(text != end && isalpha(*text)) text++;
return text;
}
int count_words(char *text, int n)
{
char *end = text + n;
int count = 0;
while(text < end) {
text = skip_nonword(text, end);
if(text < end) {
count++;
text = skip_word(text, end);
*text = '\0';
}
}
return count;
}
Now you are in position to allocate the word buffer and fill it in:
char **list_words(char *text, int n, int count)
{
char *end = text + n;
char **words = malloc(count * sizeof(char *));
if(words == NULL) return NULL;
for(int i = 0; i < count; i++) {
words[i] = skip_nonword(text, end);
text = skip_word(words[i], end);
}
return words;
}
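To see the second and third passes work end to end without touching the filesystem, here is a condensed sketch that runs them over an in-memory buffer. The names split_words and skip_while are invented for this illustration and fold the helpers above into one function; as with load_file(), the buffer must be writable and must have one valid byte past the n characters of text.

```c
#include <ctype.h>
#include <stdlib.h>

/* Advance p while characters are (want_alpha = 1) or are not (0) letters. */
static char *skip_while(char *p, char *end, int want_alpha)
{
    while (p != end && (isalpha((unsigned char)*p) != 0) == want_alpha) p++;
    return p;
}

/* Hypothetical wrapper: NUL-terminates every word in text[0..n) in place and
   returns a malloc'ed array of pointers to them; *count gets the word count.
   text must be writable and text[n] must be a valid byte. */
char **split_words(char *text, int n, int *count)
{
    char *end = text + n;
    int found = 0;

    /* pass 1: count the words and terminate each one */
    for (char *p = skip_while(text, end, 0); p < end; p = skip_while(p, end, 0)) {
        p = skip_while(p, end, 1);
        *p = '\0';
        found++;
    }

    /* pass 2: record where each word starts */
    char **words = malloc((size_t)(found > 0 ? found : 1) * sizeof *words);
    if (words == NULL) return NULL;
    char *p = text;
    for (int i = 0; i < found; i++) {
        words[i] = skip_while(p, end, 0);
        p = skip_while(words[i], end, 1);
    }
    *count = found;
    return words;
}
```

Only the pointer array is allocated separately; the words themselves live inside the original buffer, so freeing that buffer invalidates every pointer. Because words are delimited with isalpha(), punctuation is dropped ("easy." is stored as "easy") and "don't" would be split at the apostrophe.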

How to fscanf word by word in a file?

I have a file with a series of words separated by a white space. For example file.txt contains this: "this is the file". How can I use fscanf to take word by word and put each word in an array of strings?
Then I did this but I don't know if it's correct:
char *words[100];
int i=0;
while(!feof(file)){
fscanf(file, "%s", words[i]);
i++;
fscanf(file, " ");
}
When reading repeated input, you control the input loop with the input function itself (fscanf in your case). While you can also loop continually (e.g. for (;;) { ... }) and check independently whether the return is EOF, whether a matching failure occurred, or whether the return matches the number of conversion specifiers (success), in your case simply checking that the return matches the single "%s" conversion specifier is fine (e.g. that the return is 1).
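As a sketch of the fuller check described above, the loop below keeps the three outcomes apart. It reads integers rather than words because "%s" practically never produces a matching failure, so "%d" makes the distinction visible; the function name sum_ints is an invention for this example.

```c
#include <stdio.h>

/* Sums the integers in fp. Returns 0 if input ended cleanly at EOF,
   -1 if scanning stopped early on something that is not an integer. */
int sum_ints(FILE *fp, long *sum)
{
    *sum = 0;
    for (;;) {
        int value;
        int rtn = fscanf(fp, "%d", &value);
        if (rtn == 1)           /* success: one conversion, as requested */
            *sum += value;
        else if (rtn == EOF)    /* end of input (or read error) before any conversion */
            return 0;
        else                    /* rtn == 0: matching failure, bad token left in the stream */
            return -1;
    }
}
```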
Storing each word in an array, you have several options. The most simple is using a 2D array of char with automatic storage. Since the longest non-medical word in the Unabridged Dictionary is 29-characters (requiring a total of 30-characters with the nul-terminating character), a 2D array with a fixed number of rows and fixed number of columns of at least 30 is fine. (dynamically allocating allows you to read and allocate memory for as many words as may be required -- but that is left for later.)
So to set up storage for 128 words, you could do something similar to the following:
#include <stdio.h>
#define MAXW 32 /* if you need a constant, #define one (or more) */
#define MAXA 128
int main (int argc, char **argv) {
char array[MAXA][MAXW] = {{""}}; /* array to store up to 128 words */
size_t n = 0; /* word index */
Now simply open your filename provided as the first argument to the program (or read from stdin by default if no argument is given), and then validate that your file is open for reading, e.g.
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
Now to the crux of your read-loop. Simply loop checking the return of fscanf to determine success/failure of the read, adding words to your array and incrementing your index on each successful read. You must also include in your loop-control a check of your index against your array bounds to ensure you do not attempt to write more words to your array than it can hold, e.g.
while (n < MAXA && fscanf (fp, "%31s", array[n]) == 1) /* 31 = MAXW - 1 */
    n++;
That's it, now just close the file and use your words stored in your array as needed. For example just printing the stored words you could do:
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < n; i++)
printf ("array[%3zu] : %s\n", i, array[i]);
return 0;
}
Now just compile it with warnings enabled (e.g. -Wall -Wextra -pedantic for gcc/clang, or /W3 for VS, cl.exe) and then test on your file. The full code is:
#include <stdio.h>
#define MAXW 32 /* if you need a constant, #define one (or more) */
#define MAXA 128
int main (int argc, char **argv) {
char array[MAXA][MAXW] = {{""}}; /* array to store up to 128 words */
size_t n = 0; /* word index */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (n < MAXA && fscanf (fp, "%31s", array[n]) == 1) /* 31 = MAXW - 1 */
    n++;
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < n; i++)
printf ("array[%3zu] : %s\n", i, array[i]);
return 0;
}
Example Input File
$ cat dat/thefile.txt
this is the file
Example Use/Output
$ ./bin/fscanfsimple dat/thefile.txt
array[ 0] : this
array[ 1] : is
array[ 2] : the
array[ 3] : file
Look things over and let me know if you have further questions.
strtok() might be a function that can help you here.
If you know that the words will be separated by whitespace, then calling strtok will return the char pointer to the start of the next word.
Sample code from https://www.systutorials.com/docs/linux/man/3p-strtok/
#include <string.h>
...
char *token;
char line[] = "LINE TO BE SEPARATED"; /* a modifiable array: strtok must not be given a string literal */
char *search = " ";
/* Token will point to "LINE". */
token = strtok(line, search);
/* Token will point to "TO". */
token = strtok(NULL, search);
In your case, the space character would also act as a delimiter in the line.
Note that strtok modifies the string passed in (it overwrites each delimiter it finds with '\0'), so if you still need the original you should make a copy first, for example with malloc and strcpy.
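A minimal sketch of that advice: tokenize a private copy so the caller's string stays intact. The function name count_tokens is hypothetical, and the copy is made with malloc() and memcpy() rather than strdup() to stay within standard C.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the number of tokens in line, or -1 if the copy cannot be
   allocated. The original string is never modified. */
int count_tokens(const char *line, const char *delims)
{
    size_t len = strlen(line) + 1;       /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL) return -1;
    memcpy(copy, line, len);

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;                         /* tok points into copy, not into line */

    free(copy);
    return count;
}
```

Any token pointer you want to keep must be used or copied before free(copy), since every token lives inside the copy.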
It might also be easier to use fread() to read a block from the file.
As mentioned in comments, using feof() does not work as would be expected. And, as described in this answer unless the content of the file is formatted with very predictable content, using any of the scanf family to parse out the words is overly complicated. I do not recommend using it for that purpose.
There are many other, better ways to read content of a file, word by word. My preference is to read each line into a buffer, then parse the buffer to extract the words. This requires determining those characters that may be in the file, but would not be considered part of a word. Characters such as \n,\t, (space), -, etc. should be considered delimiters, and can be used to extract the words. The following is a recipe for extracting words from a file: (example code for a few of the items is included below these steps.)
1. Read file to count words, and get the length of the longest word.
2. Use count and longest values from step 1 to allocate memory for words.
3. Rewind the file.
4. Read file line by line into a line buffer using while(fgets(line, size, fp))
5. Parse each new line into words using delimiters and store each word into the arrays of step 2.
6. Use resulting array of words as necessary.
7. Free all memory allocated when finished with arrays.
Some example of code to do some of these tasks:
// Get count of words, and longest word in file (needs <stdio.h> and <ctype.h>)
int longestWord(char *file, int *nWords)
{
    FILE *fp=0;
    int cnt=0, longest=0, numWords=0;
    int c;
    fp = fopen(file, "r");
    if(fp)
    {
        while ( (c = fgetc(fp) ) != EOF )
        {
            if ( isalnum (c) ) cnt++;
            else if ( ( ispunct (c) ) || ( isspace(c) ) || (c == '\0' ))
            {
                if (cnt > 0) numWords++;          // count a word only when a delimiter ends one
                if (cnt > longest) longest = cnt;
                cnt = 0;
            }
        }
        if (cnt > 0)                              // last word ended at EOF without a delimiter
        {
            numWords++;
            if (cnt > longest) longest = cnt;
        }
        *nWords = numWords;
        fclose(fp);
    }
    else return -1;
    return longest;
}
// Create indexable memory for word arrays
char ** Create2DStr(ssize_t numStrings, ssize_t maxStrLen)
{
int i;
char **a = {0};
a = calloc(numStrings, sizeof(char *));
for(i=0;i<numStrings; i++)
{
a[i] = calloc(maxStrLen + 1, 1);
}
return a;
}
Usage: For a file with 25 words, the longest being 80 bytes:
char **strArray = Create2DStr(25, 80);//creates 25 array locations
//each 80+1 characters long
//(+1 is room for null terminator.)
int i=0;
char words[50][50];
while(i < 50 && fscanf(file, "%49s", words[i]) == 1) // stay inside both array dimensions
    i++;
I wouldn't entirely recommend doing it this way, because of the unknown amount of words in the file, and the unknown length of a "word". Either can be over the size of '50'. Just do it dynamically, instead. Still, this should show you how it works.
How can I use fscanf to take word by word and put each word in an array of strings?
Read each word twice: first to find length via "%n". 2nd time, save it. (Inefficient yet simple)
Re-size strings as you go. Again inefficient, yet simple.
// Rough untested sample code - still need to add error checking.
size_t string_count = 0;
char **strings = NULL;
for (;;) {
long pos = ftell(file);
int n = 0;
fscanf(file, "%*s%n", &n); // record where scanning a "word" stopped
if (n == 0) break;
fseek(file, pos, SEEK_SET); // go back;
strings = realloc(strings, sizeof *strings * (string_count+1));// increase array size
strings[string_count] = malloc(n + 1u); // Get enough memory for the word
fscanf(file, "%s ", strings[string_count++]); // read/save word, then move on to the next slot
}
// use strings[], string_count
// When done, free each strings[] and then strings

How to restore string after using strtok()

I have a project in which I need to sort multiple lines of text based on the second, third, etc word in each line, not the first word. For example,
this line is first
but this line is second
finally there is this line
and you choose to sort by the second word, it would turn into
this line is first
finally there is this line
but this line is second
(since line is before there is before this)
I have a pointer to a char array that contains each line. So far what I've done is use strtok() to split each line up to the second word, but that changes the entire string to just that word and stores it in my array. My code for the tokenize bit looks like this:
for (i = 0; i < numLines; i++) {
char* token = strtok(labels[i], " ");
token = strtok(NULL, " ");
labels[i] = token;
}
This would give me the second word in each line, since I called strtok twice. Then I sort those words (line, this, there). However, I need to put the string back together in its original form. I'm aware that strtok turns the delimiters into '\0', but I've yet to find a way to get the original string back.
I'm sure the answer lies in using pointers, but I'm confused what exactly I need to do next.
I should mention I'm reading in the lines from an input file as shown:
for (i = 0; i < numLines && fgets(buffer, sizeof(buffer), fp) != 0; i++) {
labels[i] = strdup(buffer);
Edit: my find_offset method
size_t find_offset(const char *s, int n) {
size_t len;
while (n > 0) {
len = strspn(s, " ");
s += len;
}
return len;
}
Edit 2: The relevant code used to sort
//Getting the line and offset
for (i = 0; i < numLines && fgets(buffer, sizeof(buffer), fp) != 0; i++) {
labels[i].line = strdup(buffer);
labels[i].offset = find_offset(labels[i].line, nth);
}
int n = sizeof(labels) / sizeof(labels[0]);
qsort(labels, n, sizeof(*labels), myCompare);
for (i = 0; i < numLines; i++)
printf("%d: %s", i, labels[i].line); //Print the sorted lines
int myCompare(const void* a, const void* b) { //Compare function
xline *xlineA = (xline *)a;
xline *xlineB = (xline *)b;
return strcmp(xlineA->line + xlineA->offset, xlineB->line + xlineB->offset);
}
Perhaps rather than mess with strtok(), use strspn(), strcspn() to parse the string for tokens. Then the original string can even be const.
#include <stdio.h>
#include <string.h>
int main(void) {
const char str[] = "this line is first";
const char *s = str;
while (*(s += strspn(s, " ")) != '\0') {
size_t len = strcspn(s, " ");
// Instead of printing, use the nth parsed token for key sorting
printf("<%.*s>\n", (int) len, s);
s += len;
}
}
Output
<this>
<line>
<is>
<first>
Or
Do not sort lines.
Sort structures
typedef struct {
char *line;
size_t offset;
} xline;
Pseudo code
int fcmp(a, b) {
return strcmp(a->line + a->offset, b->line + b->offset);
}
size_t find_offset_of_nth_word(const char *s, n) {
while (n > 0) {
use strspn(), strcspn() like above
}
}
main() {
int nth = ...;
xline labels[numLines];
for (i = 0; i < numLines && fgets(buffer, sizeof(buffer), fp) != 0; i++) {
labels[i].line = strdup(buffer);
labels[i].offset = find_offset_of_nth_word(labels[i].line, nth);
}
qsort(labels, i, sizeof *labels, fcmp);
}
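The pseudo code above can be fleshed out roughly as follows. This is a sketch under a few assumptions: words are separated by spaces only, nth is 1-based, a line with fewer than nth words sorts with an empty key, and the wrapper name sort_by_nth_word is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *line;      /* the untouched line */
    size_t offset;   /* where the sort key (the nth word) starts */
} xline;

/* Offset of the nth word (1-based) in s; if the line has fewer words,
   the offset of its end, which gives an empty key. */
size_t find_offset_of_nth_word(const char *s, int n)
{
    const char *p = s + strspn(s, " ");      /* skip leading blanks */
    while (--n > 0 && *p != '\0') {
        p += strcspn(p, " ");                /* skip the current word */
        p += strspn(p, " ");                 /* skip the blanks after it */
    }
    return (size_t)(p - s);
}

static int fcmp(const void *a, const void *b)
{
    const xline *xa = a, *xb = b;
    return strcmp(xa->line + xa->offset, xb->line + xb->offset);
}

/* Sorts labels[0..count) by the text starting at each line's nth word. */
void sort_by_nth_word(xline *labels, size_t count, int nth)
{
    for (size_t i = 0; i < count; i++)
        labels[i].offset = find_offset_of_nth_word(labels[i].line, nth);
    qsort(labels, count, sizeof *labels, fcmp);
}
```

Since only offsets are stored, every line keeps its original text and its original pointer, so nothing has to be restored after sorting. Pass qsort() the number of lines actually read, not the capacity of the array.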
Or
After reading each line, find the nth token with strspn(), strcspn() and then re-form the line from "aaa bbb ccc ddd \n" to "ccc ddd \naaa bbb ", sort, and then later re-order the line.
In all cases, do not use strtok() - too much information is lost.
I need to put the string back together in its original form. I'm aware that strtok turns the delimiters into '\0', but I've yet to find a way to get the original string back.
Far better would be to avoid damaging the original strings in the first place if you want to keep them, and especially to avoid losing the pointers to them. Provided that it is safe to assume that there are at least three words in each line and that the second is separated from the first and third by exactly one space on each side, you could undo strtok()'s replacement of delimiters with string terminators. However, there is no safe or reliable way to recover the start of the overall string once you lose it.
I suggest creating an auxiliary array in which you record information about the second word of each sentence -- obtained without damaging the original sentences -- and then co-sorting the auxiliary array and sentence array. The information to be recorded in the aux array could be a copy of the second word of the sentence, their offsets and lengths, or something similar.

Analyzing Strings with sscanf

I need to analyze a string previously read with fgets.
Each row has this form:
name age steps\n
mario 10 1 2 3 4\n
joe 15 3 5\n
max 20 9 3 2 4 5\n
There is a variable number of steps in each row.
I can read name and age with
sscanf(mystring, "%s %d", name, &age);
After this I have a loop to read all the steps:
int step[20];
int index=0;
while(sscanf(mystring,"%d", &step[index++])>0);
But this loop never ends, filling the whole array with the age column.
The reason this never ends is because you are constantly providing the same string to scan.
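sscanf provides the %n conversion, which stores the number of characters consumed so far into an int argument; this lets you move forward in your input string by that many characters before rescanning. Here mystring is assumed to point just past the name and age columns (their own sscanf call can record that position with %n as well).
This'll work:
int step[20];
int index=0;
int readLen;
while(index < 20 && sscanf(mystring,"%d%n", &step[index], &readLen) == 1) {
    mystring += readLen; // continue after the number just read
    index++;             // count only successful conversions
}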
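For completeness, here is a sketch that handles a whole row, including the name and age columns that come before the steps. The function name parse_row, the 32-byte name buffer, and the return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Parses "name age step step ..." from line.
   Returns the number of steps stored (at most max_steps),
   or -1 if the name or age column is missing.
   name must have room for 32 bytes. */
int parse_row(const char *line, char name[32], int *age, int steps[], int max_steps)
{
    int used = 0;
    if (sscanf(line, "%31s %d%n", name, age, &used) != 2)
        return -1;
    line += used;                       /* continue after the age column */

    int count = 0;
    int value;
    while (count < max_steps && sscanf(line, "%d%n", &value, &used) == 1) {
        steps[count++] = value;
        line += used;                   /* advance past the number just read */
    }
    return count;
}
```

The header row ("name age steps") fails the first sscanf() because "age" is not a number, so the function returns -1 for it and the caller can simply skip that line.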
A working solution is given in the answer from sokkyoku.
Another possibility to read variable length lines is to use strtok like in the following code snippet:
int getlines (FILE *fin)
{
int nlines = 0;
int count = 0;
char line[BUFFSIZE]={0};
char *p;
if(NULL == fgets(line, BUFFSIZE, fin)) // read and discard the header row
return -1;
while(fgets(line, BUFFSIZE, fin) != NULL) {
//Remove the '\n' or '\r' character
line[strcspn(line, "\r\n")] = 0;
count = 0;
printf("line[%d] = %s\n", nlines, line);
for(p = line; (p = strtok(p, " \t")) != NULL; p = NULL) {
printf("%s ", p);
++count;
}
printf("\n\n");
++nlines;
}
return nlines;
}
Explanation of the above function getlines:
Each line in the file fin is read using fgets and stored in the variable line.
Then each substring in line (separated by a white space or \t character) is extracted and the pointer to that substring stored in p, by means of the function strtok in the for loop (see for example this post for further example on strtok).
The function then just prints p, but you can do anything you like with the substring here.
I also count (++count) the number of items found in each line. At the end, the function getlines returns the number of lines read.

Formatting a file of text based on input

I am working on a project for school and I have run into a little bit of trouble. The gist of the project is to write a program that reads in a file of text and formats that file so that it fits in a specific width. To format this file, the user specifies the input file, the length of an output line, and the justification for the output text. An example would be this:
$ ./format test.dat 15 right
The quick brown
 fox jumps over
   the lazy old
           dog.
$ ./format test.dat 15 left
The quick brown
fox jumps over
the lazy old
dog.
$ ./format test.dat 15 center
The quick brown
fox jumps over
 the lazy old
     dog.
Anyway, I am basically stuck on how to go about outputting the file based on this. Attached below is my code for reading in the file, and what little I have done on outputting the file. I am mainly looking for tips or suggestions on how to go about doing it. I know I need to use printf with a width and such, but I am confused on how to move to the next line.
char **inputFile(FILE *fp, int size) {
int i = 0;
char *token;
char **str;
str = malloc(sizeof(char *) * size);
token = readToken(fp);
while(!feof(fp)) {
if(i+1 == size) {
realloc(str, size * 2);
}
str[i] = token;
token = readToken(fp);
i++;
}
return str;
}
void toPrint(char **string, int width, int indent) {
int i;
int curLineLength;
if(indent == 0) {
for(i = 0; I < strlen(string); i++
char *token = string[i];
if(curLineLength + strlen(*string) > width) {
if(curLineLength > 0) {
printf("\n");
curLineLength = 0;
}
}
printf("%s ", token);
curLineLength += strlen(*string);
}
/*
if(indent == 1) {
}
if(indent == 2) {
}
*/
}
Following on from the comment, your task of justifying the lines is more a logic challenge for structuring your output function than it is a difficult one. You are getting close. There are a number of ways to do it, and you probably need to add error checking to make sure width isn't less than the longest line.
Here is a quick example you can draw from. Make sure you understand why each line was written the way it was, and pay close attention to the variable length array definition (you will need to compile as c99 - or if using gcc, rely on the gcc extension (default)). Let me know if you have questions:
/* format 'n' lines of output of 't' justified as specified in 'just'
(left 'l', right 'r' or centered 'c') within a width 'w'.
NOTE: 'w' must be larger than the longest line in 't'.
*/
void formatted (char **t, char just, size_t w, size_t n)
{
if (!t || !*t) return;
size_t i = 0;
size_t lmax = 0;
size_t len[n];
/* calculate the length of each line, set lmax */
for (i = 0; i < n; i++) {
len[i] = strlen (t[i]);
if (len[i] > lmax) lmax = len[i];
}
/* handle w < lmax reformat or error here */
if (w < lmax) {
fprintf (stderr, "%s() error: invalid width < lmax (%zu).\n",
__func__, lmax);
return;
}
/* left justified output */
if (just == 'l') {
for (i = 0; i < n; i++) {
printf ("%s\n", t[i]);
}
return;
}
/* center or right justified output */
for (i = 0; i < n; i++) {
int spaces = w - len[i];
if (just == 'c')
printf ("%*s%s\n", spaces/2, " ", t[i]);
else if (just == 'r')
printf ("%*s%s\n", spaces, " ", t[i]);
}
}
Note: if you are on windows, change __func__ to the function name in each of the error statements.
Logic of Function - Long Version
Let's look a little closer at what the function is doing and why it does it the way it does. First, let's look at the parameters it takes:
void formatted (char **t, char just, size_t w, size_t n)
char **t, your 'string', well... actually your array of pointers to type char*. When you pass the array of pointers to your function, and this may be where your confusion is, you only have 2 ways to iterate over the array and print each of the lines of text: (1) pass the number of valid pointers that point to strings containing text, or (2) provide a sentinel within the array, stored in the slot following the last pointer that points to a valid line of text (usually just NULL). The sentinel serves as the pointer in the array that tells you ("Hey dummy, stop trying to print lines -- you already printed the last one...").
This takes a little more explanation. Consider your array of pointers in memory. Normally you will always allocate some reasonably anticipated number of pointers to fill, and when you fill the last pointer, you will realloc the array to contain more space. What this means is you will always have at least 1 pointer at the end of your array that is not filled. If you initialize your array to contain NULL pointers at the very beginning (by allocating with calloc instead of malloc), you automatically provide a sentinel; or, you can always explicitly set the next pointer to NULL as you fill your array. This will leave your array of pointers to char* looking similar to the following:
Pointer The pointers in the array of pointers to char*, char **t;
Address point to the first character in each associated string.
+-----------+
| 0x192a460 | --> The quick brown
+-----------+
| 0x192a480 | --> fox jumps over
+-----------+
| 0x192a4a0 | --> a lazy
+-----------+
| 0x192a4c0 | --> dog.
+-----------+
| NULL |
+-----------+
| NULL |
+-----------+
...
Understand: your string, my t is an array of pointers to type char* not the strings themselves. Your string is the left column of pointers above. string is an array of the starting addresses of each real string and your string[i] will be the start of actual string itself -- at that address. i will range from 0-3 (4 total) where string[0] = "The quick brown", string[1] = "fox jumps over", etc.. To help, change the name in your code of string to array or str_array to help keep this straight.
Your 2 options for iterating over the array to print each string become (1) (with size 'n' passed to function):
for (i = 0; i < n; i++)
printf ("%s\n", t[i]);
or (2) (relying on a sentinel):
while (t[i])
printf ("%s\n", t[i++]);
(while not readily apparent here, this provides significant benefits as your code becomes more complex and you need to pass the array between many different functions)
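The two styles can also be bridged. As a small sketch (count_lines is a hypothetical helper, not part of the code in this answer), a function that walks to the sentinel recovers the count 'n' whenever another function wants it:

```c
#include <stddef.h>

/* Counts the strings in a NULL-terminated array of pointers. */
size_t count_lines(char **t)
{
    size_t i = 0;
    while (t[i] != NULL)    /* the first NULL pointer is the sentinel */
        i++;
    return i;
}
```

With that in place, a call such as formatted (t, 'r', w, count_lines (t)) works without tracking the count separately, provided the array really does end in a NULL pointer.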
Now that you know how you can access each of your strings, start with the case where you will just print the string left justified. You don't care about the length and you don't care about the width (as long as your string will fit within the width). All you need to do is print each of the strings. Since we pass the number of strings 'n' to the function, we can simply use the for loop (method 1) to print each string. (The length is computed prior to printing to ensure each string will fit in width; since we will use it later, we store each length in the variable length array len so we don't have to make redundant calls to strlen.)
The more interesting cases are the center and right justified cases. In addition to 'n' you need to know which justification the user wants. You can pass any type flag you want to pass that information to the function. A char (1-byte) is simply the most efficient and does not require conversion to a number when read as input to the program. That is essentially why I chose to pass just as a char instead of an int (4-bytes + conversion on input) (or short (2-bytes), etc..)
Let's first look at how we right-justify the output. For discussion, let's consider an output width of 20 characters you want to right-justify your strings in. Your first string is 15 (printable) characters long (it's actually 16 chars in memory due to the null-terminating char at the end). Let's visualize what our string of printable characters would look like in a 20-character buffer (which you would use to save a right-justified copy of the string in memory, rather than printing)
|<------- 20 character width -------->|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | |T|h|e| |q|u|i|c|k| |b|r|o|w|n|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|<-- 15 character length -->|
This makes it easy to see how many spaces are needed before we start printing our string. The task here is to turn this into a generalized piece of code to handle any length string. Not too difficult:
spaces = width - length;
Then to utilize this to print the string right-justified, we make use of the minimum field width directive in the printf format string. Of particular use here, is the ability to specify the field width directive as a variable in the printf argument list. Constructing the format string and argument list to accomplish our goal, we have:
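printf ("%*s%s\n", spaces, "", t[i]);
Which says: print the empty string "" in a minimum field width of spaces (which prints exactly spaces number of spaces, and none at all when spaces is 0 -- poor choice of names), followed by the actual string in our array of pointers t[i]. Padding the empty string rather than " " matters when the string exactly fills the width: " " would still emit one space even with a field width of 0.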
Looking at the diagram again, what would we have to do to shift the string to the center of the 20-character width? Instead of shifting the whole width - length number of spaces, we could shift it only 1/2 that much and it would end up where we want it. (I can hear the gears in your head grinding, and I can smell the smoke -- "but, wait... 5 is an odd number and we are using integers!" -- it doesn't matter, integer division will take care of it, and if we shift by 2 instead of 2.5, it's just fine, you can't print 1/2 a character...) So putting it all together, to handle centered or right-justified text, all you need is:
for (i = 0; i < n; i++) {
int spaces = w - len[i];
if (just == 'c')
printf ("%*s%s\n", spaces/2, "", t[i]); /* pad "" so a width of 0 prints nothing */
else if (just == 'r')
printf ("%*s%s\n", spaces, "", t[i]);
}
Putting It All Together With The Rest
Sometimes seeing how the whole thing fits together helps. Same rules. Go through it function-by-function, line-by-line, and ask questions when you get stuck. C is a low-level language, meaning you have to understand where things are in memory. When it comes down to it, programming is really about how to manipulate what you load into memory. Other languages try to hide that from you. C doesn't. That's its strength, and also where you have to concentrate a good part of your learning. Enough babble, here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 256
#define MAXL 64
char **inputfile (char ***t, FILE *fp, size_t *sz);
void formatted (char **t, char just, size_t w, size_t n);
void *xcalloc (size_t n, size_t s);
void freecdp (char **p, size_t n);
unsigned chr2lower (unsigned c);
int main (int argc, char **argv) {
if (argc < 3) { /* just and width are required; filename defaults to stdin */
fprintf (stderr, "error: insufficient input. "
"usage: %s just width [filename (stdin)]\n",
argv[0]);
return 1;
}
char **text = NULL; /* initialize all variables for main */
size_t lines = 0;
char j = 0; /* the following are ternary operators */
char just = argc > 1 ? *argv[1] : 'l'; /* sets defaults if no input */
size_t width = argc > 2 ? (size_t)strtol (argv[2], NULL, 10) : 0;
FILE *fp = argc > 3 ? fopen (argv[3], "r") : stdin;
/* read input from file */
if (!(inputfile (&text, fp, &lines))) { /* check if return is NULL */
fprintf (stderr, "error: file read failed '%s'.\n",
argc > 3 ? argv[3] : "stdin");
return 1;
}
j = chr2lower (just); /* force user input to lower-case */
if (j != 'l' && j != 'r' && j != 'c') {
fprintf (stderr, "error: invalid justification '%c' "
"(defaulting to 'left').\n", just);
j = 'l'; /* actually apply the default announced above */
}
/* print output in requested justification */
formatted (text, j, width, lines);
/* free all memory allocated in program */
freecdp (text, lines);
return 0;
}
char **inputfile (char ***t, FILE *fp, size_t *sz)
{
if (!t || !sz) { /* validate parameters are not NULL */
fprintf (stderr, "%s() error: invalid parameters.\n", __func__);
return NULL;
}
if (!fp) { /* check that file pointer is valid */
fprintf (stderr, "%s() error: file open failed.\n", __func__);
return NULL;
}
size_t idx = 0; /* declare/initialize function variables */
size_t maxl = MAXL;
char ln[MAXC] = {0};
/* allocate MAXL number of pointers */
*t = xcalloc (MAXL, sizeof **t);
while (fgets (ln, MAXC, fp)) { /* read each line in file */
size_t len = strlen (ln); /* calculate length */
/* remove trailing newline (or carriage return) */
while (len && (ln[len-1] == '\n' || ln[len-1] == '\r'))
ln[--len] = 0;
/* allocate & copy ln saving pointer in t[i], increment i by 1 */
(*t)[idx++] = strdup (ln); /* strdup allocates & copies */
if (idx == maxl) { /* check if you reached limit, realloc if needed */
void *tmp = realloc (*t, maxl * sizeof **t * 2);
if (!tmp) {
fprintf (stderr, "%s() virtual memory exhausted.\n", __func__);
return NULL;
}
*t = tmp; /* set new pointers NULL below (sentinel) */
memset (*t + maxl, 0, maxl * sizeof **t);
maxl *= 2;
}
}
*sz = idx; /* update value at address of sz so it is available in main */
if (fp != stdin) fclose (fp);
return *t;
}
/* format 'n' lines of output of 't' justified as specified in 'just'
(left 'l', right 'r' or centered 'c') within a width 'w'.
NOTE: 'w' must be at least as large as the longest line in 't'.
*/
void formatted (char **t, char just, size_t w, size_t n)
{
if (!t || !*t) return;
size_t i = 0;
size_t lmax = 0;
size_t len[n];
/* calculate the length of each line, set lmax */
for (i = 0; i < n; i++) {
len[i] = strlen (t[i]);
if (len[i] > lmax) lmax = len[i];
}
/* handle w < lmax reformat or error here */
if (w < lmax) {
fprintf (stderr, "%s() error: invalid width < lmax (%zu).\n",
__func__, lmax);
return;
}
/* left justified output */
if (just == 'l') {
for (i = 0; i < n; i++) {
printf ("%s\n", t[i]);
}
return;
}
/* center or right justified output */
for (i = 0; i < n; i++) {
int spaces = (int)(w - len[i]); /* w >= len[i] was checked above */
if (just == 'c')
printf ("%*s%s\n", spaces/2, "", t[i]); /* pad "" so a width of 0 prints nothing */
else if (just == 'r')
printf ("%*s%s\n", spaces, "", t[i]);
}
}
/* helper functions below for calloc, free, and to lower-case */
void *xcalloc (size_t n, size_t s)
{
register void *memptr = calloc (n, s);
if (memptr == 0) {
fprintf (stderr, "%s() error: virtual memory exhausted.\n",
__func__);
exit (EXIT_FAILURE);
}
return memptr;
}
void freecdp (char **p, size_t n)
{
if (!p) return;
size_t i;
for (i = 0; i < n; i++)
free (p[i]);
free (p);
}
unsigned chr2lower (unsigned c)
{ return ('A' <= c && c <= 'Z') ? c | 32 : c; }
Example Use/Output
$ ./bin/str_justify l 15 dat/fox_4lines.txt
The quick brown
fox jumps over
a lazy
dog.
$ ./bin/str_justify c 15 dat/fox_4lines.txt
The quick brown
fox jumps over
a lazy
dog.
$ ./bin/str_justify r 15 dat/fox_4lines.txt
The quick brown
fox jumps over
a lazy
dog.

Resources