Is this a valid use of fgetc? - c

My input stream is from a text file with a list of words separated by the \n character.
The function stringcompare is a function that will compare the equivalence of two strings, case insensitive.
I have two string arrays, word[50] and dict[50]. word is a string that would be given by the user.
Basically what I want to do is pass word[] and each word in the text file as arguments of the stringcompare function.
I've compiled and run this code but it is wrong. Very wrong. What am I doing wrong? Can I even use fgetc() like this? would dict[] even be a string array after the inner loop is done?
char c, r;
while((c = fgetc(in)) != EOF){
while((r = fgetc(in)) != '\n'){
dict[n] = r;
n++;
}
dict[n+1] = '\0'; //is this necessary?
stringcompare(word, dict);
}

It is wrong.
The return value of fgetc() should be stored in an int, not a char, especially when it will be compared with EOF.
You might have forgotten to initialize n.
You will miss the first character of each line, which is stored to c.
Use dict[n] = '\0'; instead of dict[n+1] = '\0'; because n is already incremented in the loop.
Possible fix:
int c, r;
while((c = fgetc(in)) != EOF){
ungetc(c, in); // push the read character back to the stream for reading by fgetc later
n = 0;
// add check for EOF and buffer overrun for safety
while((r = fgetc(in)) != '\n' && r != EOF && n + 1 < sizeof(dict) / sizeof(dict[0])){
dict[n] = r;
n++;
}
dict[n] = '\0'; //this is necessary
stringcompare(word, dict);
}
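An alternative that avoids the ungetc() round-trip is to move the line reading into a helper that reports whether it read anything. The following is only a sketch: read_word(), equals_ignore_case() and dict_contains() are hypothetical names, and equals_ignore_case() merely stands in for the question's stringcompare(), whose exact signature is not shown.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads one line (without the '\n') into buf.
   Returns 1 if a line was read, 0 if the stream was already at EOF.
   Characters that do not fit are read but discarded. Requires size >= 1. */
int read_word(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int c = fgetc(in);              /* int, so EOF stays distinguishable */

    if (c == EOF)
        return 0;
    while (c != EOF && c != '\n') {
        if (n + 1 < size)           /* keep one byte for the terminator */
            buf[n++] = (char)c;
        c = fgetc(in);
    }
    buf[n] = '\0';
    return 1;
}

/* Stand-in for the question's stringcompare(): 1 if equal ignoring case. */
int equals_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;                /* both must end at the same time */
}

/* Returns 1 if word matches any line of the stream, 0 otherwise. */
int dict_contains(FILE *in, const char *word)
{
    char dict[50];

    while (read_word(in, dict, sizeof dict)) {
        if (equals_ignore_case(word, dict))
            return 1;
    }
    return 0;
}
```

With this shape the outer loop no longer consumes a character of its own, so the first letter of each line cannot be lost.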

Related

how to stop my program from skipping characters before saving them

I am making a simple program to read from a file character by character, puts them into tmp and then puts tmp in input[i]. However, the program saves a character in tmp and then saves the next character in input[i]. How do I make it not skip that first character?
I've tried to read into input[i] right away but then I wasn't able to check for EOF flag.
FILE * file = fopen("input.txt", "r");
char tmp;
char input[5];
tmp= getc(file);
input[0]= tmp;
int i=0;
while((tmp != ' ') && (tmp != '\n') && (tmp != EOF)){
tmp= getc(file);
input[i]=tmp;
length++;
i++;
}
printf("%s",input);
It's supposed to print "ADD $02", but instead it prints "DD 02".
You are doing things in the wrong order: as your code is structured, the first char is read and stored before the loop, and the first iteration of the loop then overwrites it because i is still 0. If you keep that structure, start with i = 1.
Perhaps you want to read the first character anyway, but I guess you want to read everything up to the first space, which might be the first character. Then do this:
#include <stdio.h>
int main(void)
{
char input[80];
int i = 0;
int c = getchar();
while (c != ' ' && c != '\n' && c != EOF) {
if (i + 1 < sizeof(input)) { // store char if there is room
input[i++] = c;
}
c = getchar();
}
input[i] = '\0'; // null-terminate input
puts(input);
return 0;
}
Things to note:
The first character is read before the loop. The loop condition and the code that stores the char then use that char. Just before the end of the loop body, the next char is read, which will then be processed in the next iteration.
Your code does not guard against writing past the end of the char buffer input. This is dangerous, especially since your buffer is tiny.
When you construct strings char by char, you should null-terminate it by placing an explicit '\0' at the end. You have to make sure that there is space for that terminator. Nearly all system functions like puts or printf("%s", ...) expect the string to be null-terminated.
Make the result of getchar an int, so that you can distinguish between all valid character codes and the special value EOF.
The code above is useful if the first and subsequent calls to get the next item are different, for example when tokenizing a string with strtok. Here, you can also choose another approach:
while (1) { // "infinite loop"
int c = getchar(); // read a char first thing in a loop
if (c == ' ' || c == '\n' || c == EOF) break;
// explicit break when done
if (i + 1 < sizeof(input)) {
input[i++] = c;
}
}
This approach has the logic of processing the chars in the loop body only, but you must wrap it in an infinite loop and then use the explicit break.
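Either loop shape can be wrapped in a small function so the bounds check and the terminator cannot be forgotten. This is a minimal sketch, assuming a hypothetical helper named read_token() that reads from any FILE * rather than only stdin:

```c
#include <stdio.h>

/* Hypothetical helper: copies characters from in into buf until a space,
   a newline, or EOF. Returns the number of characters stored; input that
   does not fit is read but dropped. Requires size >= 1. */
size_t read_token(FILE *in, char *buf, size_t size)
{
    size_t i = 0;
    int c;                          /* int, not char, so EOF is detectable */

    while ((c = getc(in)) != EOF && c != ' ' && c != '\n') {
        if (i + 1 < size)           /* keep one byte for the terminator */
            buf[i++] = (char)c;
    }
    buf[i] = '\0';
    return i;
}
```

Called twice on a stream containing the line ADD $02, it would store ADD on the first call and $02 on the second, since the separating space ends the first token.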

puts() output is appended "time" string

I get very unexpected output from quite simple code
char ch = getchar(), word[100], *p = word;
while (ch != '\n') {
*(p++) = ch;
ch = getchar();
}
puts(word);
output of any 17 character input is appended by "time" like
12345678901234567time
if exceeds "time" is overwritten like
1234567890123456789me
Am I doing something wrong?
puts expects a pointer to string. And a string needs to have a terminating null character - \0 - to signify where the string ends.
But in your case, you did not write the \0 at the end to signify that the string ends there.
You need to do:
char ch = getchar(), word[100], *p = word;
/* Also check that you are not writing more than 100 chars */
int i = 1;
while(ch != '\n' && i++ < 100){
*(p++) = ch;
ch = getchar();
}
*p = '\0'; /* write the terminating null character */
puts(word);
Before, when you were not writing the terminating null character you could not expect anything determinate to print. It could also have been 12345678901234567AnyOtherWord or something.
There are multiple issues in your code:
You do not null terminate the string you pass to puts(), invoking undefined behavior... in your case, whatever characters happen to be present in word after the last one read from stdin are printed after these and until (hopefully) a '\0' byte is finally found in memory.
You read a byte from stdin into a char variable: this does not allow you to check for EOF, and indeed you do not.
If you read a long line, you will write bytes beyond the end of the word array, invoking undefined behavior. If the end of file is encountered before a '\n' is read from stdin, you will definitely write beyond the end of the buffer... Try for example giving an empty file as input for your program.
Here is a corrected version:
char word[100];
char *p = word;
int ch;
while ((ch = getchar()) != EOF && ch != '\n') {
/* check for long line: in this case, we truncate the line */
if (p < word + sizeof(word) - 1) {
*p++ = ch;
}
}
*p = '\0';
puts(word);
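To illustrate why the variable holding the result of getchar() or getc() needs to be an int, here is a small sketch with a hypothetical function line_length(). EOF is a negative int, while data bytes are returned as values from 0 to 255, so an int can tell them apart. If the result were stored in a char, a data byte of 0xFF could compare equal to EOF where char is signed, and EOF could never be matched where char is unsigned.

```c
#include <stdio.h>

/* Counts the bytes before the first '\n' or the end of the file.
   Because c is an int, a data byte of 0xFF (255) is not confused with
   EOF, which is a negative value. */
size_t line_length(FILE *in)
{
    size_t len = 0;
    int c;

    while ((c = getc(in)) != EOF && c != '\n')
        len++;
    return len;
}
```

An empty input simply yields 0 here, because the very first getc() returns EOF and the loop body never runs.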

Using fgetc to pass only part of a text file to a buffer

I have the following text file:
13.69 (s, 1H), 11.09 (s, 1H).
So far I can quite happily use either fgets or fgetc to pass all text to a buffer as follows:
char* data;
data = malloc(sizeof(char) * 100);
int c;
int n = 0;
FILE* inptr = NULL;
inptr = fopen("NMR", "r");
if(NULL == fopen("NMR", "r"))
{
printf("Error: could not open file\n");
return 1;
}
for (c = fgetc(inptr); c != EOF && c != '\n'; c = fgetc(inptr))
{
data[n++] = c;
}
for (int i = 0, n = 100; i < n; i++)
{
printf ("%c", data[i]);
}
printf("\n");
and then print the buffer to the screen afterwards. However, I am only looking to pass part of the textfile to the buffer, namely:
13.69 (s, 1H),
So this means I want fgetc to stop after ','. However, this means that the text will stop at 13.69 (s, and not 13.69 (s, 1H),
Is there a way around this? I have also experimented with fgets and then using strstr as follows:
char needle[4] = ")";
char* ret;
ret = strstr(data, needle);
printf("The substring is: %s\n", ret);
However, the output from this is:
), 11.09 (s, 1H)
thus giving me the rest of the string which I do not want. It's an interesting one and if anyone has any tips it would be much appreciated!
If you know that the closing parenthesis is the last character you want, you can use that as your stopping point in the fgetc() loop:
char data[100]; //No need to dynamically allocate if we know the size at compile time
int c;
int n = 0;
FILE* inptr = NULL;
inptr = fopen("NMR", "r");
if(inptr == NULL) //We want to check the value of the file we just opened
{ //and plan to use
printf("Error: could not open file\n");
return 1;
}
//We'll keep the original value guards (EOF and '\n') below and add two more
//to make sure we break from the loop
//We use n<98 below to make sure we can always create a null-terminated string,
//If we used 99, the 100th character might be a ')', then we have no room for a
//terminating null-char
for (c = fgetc(inptr); c != ')' && n < 98 && c != EOF && c != '\n'; c = fgetc(inptr))
{
data[n++] = c;
}
if(c != ')') //We hit EOF, \n, or ran out of space in data[]
{
printf("Error: no matching sequence found\n");
return 2;
}
data[n]=')'; //Could also write data[n]=c here, since we know it's a ')'
data[n+1]='\0'; //Add the terminating null character
printf("%s\n",data); //Since it's a properly formatted string, we can use %s
(Note that this example will handle null input characters differently from yours. If you expect null characters to be in the input stream (NMR file) then change the printf("%s",...) line back to the for loop you originally had.)
Well with only one example of the format you are trying to parse it's not totally possible to give an answer, however if your input is always like this I would simply have a counter and break after the second comma.
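int comma = 0;
// stop once two commas have been stored; n < 99 leaves room for a '\0' in data[100]
for (c = fgetc(inptr); c != EOF && c != '\n' && comma < 2 && n < 99; c = fgetc(inptr))
{
if (c == ',') // compare the character just read (==, not =)
comma++;
data[n++] = c;
}
data[n] = '\0';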
In case the characters inside the parenthesis can be more complex I would simply maintain a boolean state to know if I am actually inside or outside a parenthesis and break when I read a comma outside of it.
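The inside/outside state described above could be sketched as follows. The function name read_until_outer_comma() is hypothetical, and the sketch assumes parentheses are never nested:

```c
#include <stdio.h>

/* Copies characters from in into buf up to and including the first comma
   that is not inside parentheses. Stops early at '\n', EOF, or when the
   buffer is full. Returns 1 if such a comma was found, 0 otherwise.
   Requires size >= 1. Nested parentheses are not handled. */
int read_until_outer_comma(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int inside = 0;                 /* 1 while between '(' and ')' */
    int found = 0;
    int c;

    while (n + 1 < size && (c = fgetc(in)) != EOF && c != '\n') {
        buf[n++] = (char)c;
        if (c == '(') {
            inside = 1;
        } else if (c == ')') {
            inside = 0;
        } else if (c == ',' && !inside) {
            found = 1;              /* comma outside parentheses: done */
            break;
        }
    }
    buf[n] = '\0';
    return found;
}
```

For the sample line in the question, the comma inside (s, 1H) is skipped and the copy stops after the comma that follows the closing parenthesis.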
Simply read using fgets and store desired string in char * using sscanf-
char *new_data;
new_data=malloc(100); // allocate memory
...
fgets(data,100,inptr); // read from file but check its return
sscanf(data,"%98[^)]",new_data); // store string until ')' in new_data from data (check its return; width 98 leaves room for ")" and '\0')
strcat(new_data,")"); // concatenating new_data and ")"
printf("%s",new_data); // print new_data
...
free(new_data); // remember to free memory
Also, you should check the return value of malloc (not done in my example) and close the file you opened.
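If the line is already in a buffer (for instance after fgets), another option is to locate the ')' with strchr and cut the string right after it, which needs no second buffer. This is a sketch only; keep_through_paren() is a hypothetical name:

```c
#include <string.h>

/* Truncates line just after its first ')'.
   Returns 1 on success, 0 if line contains no ')'. */
int keep_through_paren(char *line)
{
    char *end = strchr(line, ')');  /* pointer to the ')' or NULL */

    if (end == NULL)
        return 0;
    end[1] = '\0';  /* safe: end[1] is at worst the existing terminator */
    return 1;
}
```

Unlike the strstr attempt in the question, which prints from the match onward, this keeps the part before the match and discards the rest.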

Dynamically created C string

I'm trying to get an expression from the user and put it in a dynamically created string. Here's the code:
char *get_exp() {
char *exp, *tmp = NULL;
size_t size = 0;
char c;
scanf("%c", &c);
while (c != EOF && c != '\n') {
tmp = realloc(exp, ++size * sizeof char);
if (tmp == NULL)
return NULL;
exp = tmp;
exp[size-1] = c;
scanf("%c", &c);
}
tmp = realloc(exp, size+1 * sizeof char);
size++;
exp = tmp;
exp[size] = '\0';
return exp;
}
However, the first character read is a newline char every time for some reason, so the while loop exits. I'm using XCode, may that be the cause of the problem?
No, XCode is not part of your problem (it is a poor workman who blames his tools).
You've not initialized exp, which is going to cause problems.
Your code to detect EOF is completely broken; you must test the return value of scanf() to detect EOF. You'd do better using getchar() with int c:
int c;
while ((c = getchar()) != EOF && c != '\n')
{
...
}
If you feel you must use scanf(), then you need to test each call to scanf():
char c;
while (scanf("%c", &c) == 1 && c != '\n')
{
...
}
You do check the result of realloc() in the loop; that's good. You don't check the result of realloc() after the loop (and you aren't shrinking your allocation); please check every time.
You should consider using a mechanism that allocates many bytes at a time, rather than one realloc() per character read; that is expensive.
Of course, if the goal is simply to read a line, then it would be simplest to use POSIX getline(), which handles all the allocation for you. Alternatively, you can use fgets() to read the line. You might use a fixed buffer to collect the data, and then copy that to an appropriately sized dynamically allocated buffer. You would also allow for the possibility that the line is very long, so you'd check that you'd actually got the newline.
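The advice about allocating many bytes at a time could look like the sketch below, which doubles the buffer whenever it fills up. The name read_line_dyn() and the starting capacity of 16 are arbitrary choices for illustration, not part of the question's code:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line from in into a heap buffer that doubles when full.
   Returns a null-terminated string without the '\n' (caller frees it),
   or NULL on allocation failure or if EOF is hit before any character. */
char *read_line_dyn(FILE *in)
{
    size_t cap = 16;                /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {       /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {     /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Doubling keeps the number of realloc() calls proportional to the logarithm of the line length instead of one call per character. An empty line returns an empty string, while end of input returns NULL, so the caller can tell the two apart.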
Here on Windows XP/cc, like Michael said, it works if exp is initialized to NULL.
Here's fixed code, with comments explaining what is different from your code in the question:
char *get_exp()
{
// keep variables with narrowest scope possible
char *exp = NULL;
size_t size = 0;
// use a "forever" loop with break in the middle, to avoid code duplication
for(;;) {
// removed sizeof char, because that is defined to be 1 in C standard
char *tmp = realloc(exp, ++size);
if (tmp == NULL) {
// in your code, you did not free already reserved memory here
free(exp); // free(NULL) is allowed (does nothing)
return NULL;
}
exp = tmp;
// Using getchar instead of scanf to get EOF,
// type int required to have both all byte values, and EOF value.
// If you do use scanf, you should also check its return value (read doc).
int ch = getchar();
if (ch == EOF) break; // eof (or error, use feof(stdin)/ferror(stdin) to check)
if (ch == '\n') break; // end of line
exp[size - 1] = ch; // implicit cast to char
}
if (exp) {
// If we got here, for loop above did break after reallocing buffer,
// but before storing anything to the new byte.
// Your code put the terminating '\0' to 1 byte beyond end of allocation.
exp[size-1] = '\0';
}
// else exp = strdup(""); // uncomment if you want to return empty string for empty line
return exp;
}

How can I check if a specific char exists in a char array

How can I search for a specific character in a char array ?
Follow my code, but I think it's not correct in the function strchr:
while((c = getc(fp)) != EOF) {
for (i = 0; i < 1; i++) {
c2[i] = c;
int test = strchr(";", c2[i]);
}
printf("%c", c);
}
I have a structure that has int index, int data, and a pointer to the next register. I fill an array (c2[100]) with some data that comes from my .csv file. In the first register of my array I get something like this: 800;lucas. I need to find the character ';' in this array and split it, and then the number 800 will be the structure->index and the name 'lucas' will be the structure->data.
For each character that is read, you are storing it into the first slot of your c2[] array (ignoring the rest of the array), and then calling strchr() to check if the read character is a ; or not. Using strchr() for that is overkill. The following would be much simpler:
while((c = getc(fp)) != EOF)
{
if (c == ';')
{
...
}
printf("%c", c);
}
If you are actually trying to search your array instead, then you are using strchr() the wrong way. It should be more like this instead, assuming c2[] already contains the null-terminated string data you want to search in:
while((c = getc(fp)) != EOF)
{
char *test = strchr(c2, c); // strchr returns a pointer to the match, or NULL if c is not found
...
printf("%c", c);
}
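For the splitting step the question describes (800;lucas into a number and a name), one possible approach is to find the ';' with strchr and convert the part before it with strtol. This is a sketch under assumptions: struct record and parse_record() are hypothetical, since the question's real structure is not shown, and the name is stored as text rather than as an int.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; the question's real structure is not shown. */
struct record {
    int index;
    char name[32];
};

/* Splits a line such as "800;lucas" at the first ';'.
   Returns 1 on success, 0 if there is no ';' or no number before it.
   Values outside the range of int are not checked in this sketch. */
int parse_record(const char *line, struct record *out)
{
    const char *sep = strchr(line, ';');    /* pointer to ';' or NULL */
    char *end;
    long value;

    if (sep == NULL)
        return 0;
    value = strtol(line, &end, 10);
    if (end == line || end != sep)          /* digits must run up to ';' */
        return 0;
    out->index = (int)value;
    /* snprintf truncates safely if the name is longer than the field */
    snprintf(out->name, sizeof out->name, "%s", sep + 1);
    return 1;
}
```

If the line came from fgets, a trailing '\n' would end up in the name and would need to be stripped first.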

Resources