After successfully running an entabulator, my detabulator won't pick up on a character comparison that should exit a while loop. With "0(tab)8(enter)(ctrl+D)" as input, the tab is written correctly as spaces, but after rp is incremented to point to the 8, the while loop that should read the 8 won't exit and I get a seg fault. Here's the code:
#include <string.h>
#include <stdio.h>
#define MAXLINE 100
char doc[9001];
main(int argc, char *argv[])
{
int max = 0;
char *rp = doc;
char *wp = rp;
char *tf = wp;
char *lp = doc;
while ((*(rp++) = getchar()) != EOF);
*--rp = '\0';
rp = doc;
j = 0;
while ( (*rp != '\0') && (argc == 1)) {
if (*rp == '\n') {
lp = rp + 1;
*wp++ = *rp++;
}
while( (*rp != '\t') && (*rp != '\0') && (*rp != '\n') ) { /*this loops after a tab*/
*wp++ = *rp++;
}
if (*rp == '\t') {
rp++;
tf = lp + ((((wp - lp) / 8) + 1) * 8);
while ((tf - wp) != 0)
*wp++ = 's';
}
}
if (*rp == '\0')
*wp = '\0';
printf("%s\n", doc);
}
There are some as yet unexplored problems with the initial input loop.
You should never risk overflowing a buffer, even if you allocate 9001 bytes for it. That's how viruses and things break into programs. Also, you have a problem because you are comparing a character with EOF. getchar() returns an int, not a char: it has to, because it returns any valid character value as a non-negative value, and EOF as a negative value (usually -1, but nothing guarantees that value).
So, you might write that loop more safely, and clearly, as:
char *end = doc + sizeof(doc) - 1;
int c;
while (rp < end && (c = getchar()) != EOF)
*rp++ = c;
*rp = '\0';
With your loop as written, one of two undesirable things happens:
if char is an unsigned type, then you will never detect EOF.
if char is a signed type, then you will detect EOF when you read a valid character (often ÿ, y-umlaut, LATIN SMALL LETTER Y WITH DIAERESIS, U+00FF).
Neither is good. The code above avoids both problems without needing to know whether plain char is signed or unsigned.
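As a self-contained sketch of the same loop, here it is wrapped in a function that reads from any FILE * (the name read_all and its parameters are illustrative, not part of the original program). Reading from a stream other than stdin makes it easy to check the behaviour with a 0xFF byte in the input:

```c
#include <stdio.h>

/* Hypothetical helper: copy at most size-1 bytes from fp into buf and
 * null-terminate the result. Returns the number of bytes stored. */
size_t read_all(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;                      /* int, so EOF is distinct from every byte value */

    if (size == 0)
        return 0;               /* no room even for the terminator */
    while (n < size - 1 && (c = getc(fp)) != EOF)
        buf[n++] = (char)c;
    buf[n] = '\0';
    return n;
}
```

Calling read_all(stdin, doc, sizeof doc) would replace the original input loop; a 0xFF byte is stored like any other character and only a real end of file stops the loop.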
Conventionally, if you have an empty loop body, you emphasize this by placing the semicolon on a line on its own. Many an infinite loop has been caused by a stray semicolon after a while condition; by placing the semicolon on the next line, you emphasize that it is intentional, not accidental. So instead of the first form below, write the second:
while ((*(rp++) = getchar()) != EOF);
while ((*(rp++) = getchar()) != EOF)
;
I think the loop below is going into an infinite loop.
while( (*rp != '\t') && (*rp != '\0') && (*rp != '\n') ) { /*this loops after a tab*/
*wp++ = *rp++;
This is because you are checking for *rp != '\t' and so on, but here
if (*rp == '\t')
{
rp++;
tf = lp + ((((wp - lp) / 8) + 1) * 8);
while ((tf - wp) != 0)
*wp++ = 's';
}
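you are filling the doc array with 's' characters through wp, which overwrites the '\t' and, when more than one fill character is needed, the characters after it that rp has not read yet (for your input, the 8, the newline and the terminating '\0'). The loop above then never sees a character that stops it, so it runs until it goes past the end of the array.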
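One way to avoid the overwrite is to expand into a separate output buffer instead of writing back into doc. The following is only a sketch under that assumption: the function name detab, the TABSTOP macro and the dstsize parameter are illustrative, and it pads with real spaces rather than the 's' marker used in the question.

```c
#include <stddef.h>

#define TABSTOP 8

/* Expand tabs from src into dst (capacity dstsize). The buffers are
 * separate, so writing padding can never clobber unread input.
 * Returns the number of characters written, not counting the '\0'. */
size_t detab(const char *src, char *dst, size_t dstsize)
{
    size_t out = 0;   /* index into dst */
    size_t col = 0;   /* column on the current output line */

    if (dstsize == 0)
        return 0;
    for (; *src != '\0' && out + 1 < dstsize; src++) {
        if (*src == '\t') {
            /* pad to the next multiple of TABSTOP (always at least one space) */
            size_t stop = (col / TABSTOP + 1) * TABSTOP;
            while (col < stop && out + 1 < dstsize) {
                dst[out++] = ' ';
                col++;
            }
        } else {
            dst[out++] = *src;
            col = (*src == '\n') ? 0 : col + 1;   /* a newline restarts the column count */
        }
    }
    dst[out] = '\0';
    return out;
}
```

Tracking the column directly replaces the lp/tf pointer arithmetic, and the out + 1 < dstsize checks keep the output inside its buffer even when the expanded text is longer than expected.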
I need to read a PPM file, but I'm limited to using only getchar() and I'm running into trouble ignoring whitespace.
I'm using num=num*10+(ch-48); to read the height and width but don't know how to read them all at once while ignoring spaces and '\n' or comments.
I use this to read the magic number:
int magic;
while(magic==0){
if (getchar()=='P') //MAGIC NUMBER
magic=getchar()-48;
}
printf("%d\\n",magic);
I used this function to read the height and width, which works only when the data in the header is separated only by '\n':
int getinteger(int base)
{ char ch;
int val = 0;
while ((ch = getchar()) != '\\n' && (ch = getchar()) != '\\t' && (ch = getchar()) != ' ')
if (ch \>= '0' && ch \<= '0'+base-1)
val = base\*val + (ch-'0');
else
return ERROR;
return val;
}
this is the part in main()
height=getinteger(10);
while(height==-1){
height=getinteger(10);
}
Comparing "magic" with 0 is undefined behaviour since it's not initialized yet (so it's basically just a chunk of memory):
int magic; // WARNING: We don't know exact value, may be not 0
while(magic==0){
if (getchar()=='P') //MAGIC NUMBER
magic=getchar()-48;
}
Consider initializing the variable before comparing:
int magic = 0; // We know that magic will be defined as 0
while (magic == 0) {
if (getchar() == 'P') // MAGIC NUMBER
magic = getchar() - 48;
}
In this function:
int getinteger(int base)
{ char ch;
int val = 0;
while ((ch = getchar()) != '\\n' && (ch = getchar()) != '\\t' && (ch = getchar()) != ' ')
if (ch \>= '0' && ch \<= '0'+base-1)
val = base\*val + (ch-'0');
else
return ERROR;
return val;
}
(I'm assuming that ERROR = -1, is that correct?) Your condition can call getchar() up to three times per check, not once, because each comparison reads a new character into ch. Rewrite it so that getchar() is called only once and the result is saved in ch. Another problem occurs when the first character is whitespace rather than a digit. In that case the while loop is skipped, val remains 0, and the function returns 0. To avoid this, you can check val and return ERROR when it has not changed (note that this also reports a genuine value of 0 as an error):
int getinteger(int base) {
int ch = getchar(); // get only one char and save for later use (int, so EOF is not confused with a character)
int val = 0; // 0 means not changed
while (ch != '\n' && ch != '\t' && ch != ' ') {
if (ch >= '0' && ch <= '0' + base - 1)
val = base * val + (ch - '0');
else
return ERROR;
ch = getchar(); // get a new char for next loop iteration and checking if it is digit
}
if (val == 0) // val was not changed
return ERROR; // loop in "main" will be continued
else
return val; // val was changed, return it
}
UPD: We can also use this fact to simplify our function a lot:
int getinteger(int base) {
int ch = getchar(); // get only one char and save for later use (int, so EOF is not confused with a character)
int val = 0; // 0 means not changed
while (ch >= '0' && ch <= '0' + base - 1) { // if "ch" is not a number, this loop will be skipped
val = base * val + (ch - '0');
ch = getchar(); // get a new char for next loop iteration and checking if it is digit
}
if (val == 0) // val was not changed (ch was not a number)
return ERROR; // loop in "main" will be continued
else
return val; // val was changed, return it
}
And last but not least, remove the extra '\' before symbols (\\n -> \n, \>= -> >=, etc.) if they are present in your code.
Combining everything above results in something like this:
#include <stdio.h>
#define ERROR -1
int getinteger(int base) {
int ch = getchar(); // get only one char and save for later use (int, so EOF is not confused with a character)
int val = 0; // 0 means not changed
while (ch >= '0' && ch <= '0' + base - 1) { // if "ch" is not a number, this loop will be skipped
val = base * val + (ch - '0');
ch = getchar(); // get a new char for next loop iteration and checking if it is digit
}
if (val == 0) // val was not changed (ch was not a number)
return ERROR; // loop in "main" will be continued
else
return val; // val was changed, return it
}
int main() {
// dunno what's before
int magic = 0; // We know that magic will be defined as 0
while (magic == 0) {
if (getchar() == 'P') // MAGIC NUMBER
magic = getchar() - 48;
}
printf("magic = %d\n", magic);
int height = getinteger(10);
while (height == -1)
height = getinteger(10);
printf("height = %d\n", height);
// dunno what's after
}
Result:
$ echo " \n P3 #blabla \n 34 \t " | ./a.out
magic = 3
height = 34
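To read the whole header in one go while ignoring spaces, newlines and '#' comments, the skipping can be moved into a helper of its own. This is a rough sketch rather than a complete PPM reader: the function names are made up for illustration, it reads through a FILE * (with stdin, getc(fp) does the same job as getchar()), and it assumes that ungetc() is allowed for pushing back the delimiter after a number.

```c
#include <stdio.h>

/* Skip blanks, newlines and '#' comments; return the first other
 * character, or EOF. */
static int next_significant(FILE *fp)
{
    int c = getc(fp);
    while (c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '#') {
        if (c == '#') {
            /* a comment runs to the end of the line */
            while (c != '\n' && c != EOF)
                c = getc(fp);
        } else {
            c = getc(fp);
        }
    }
    return c;
}

/* Read one unsigned decimal field. Returns 1 and stores the value on
 * success, 0 if no digit was found. Overflow is not checked. */
int read_field(FILE *fp, int *out)
{
    int c = next_significant(fp);
    int val = 0;

    if (c < '0' || c > '9')
        return 0;
    while (c >= '0' && c <= '9') {
        val = 10 * val + (c - '0');
        c = getc(fp);
    }
    if (c != EOF)
        ungetc(c, fp);          /* leave the delimiter for the next reader */
    *out = val;
    return 1;
}

/* Read "P<n> width height maxval". Returns 1 on success, 0 otherwise. */
int read_ppm_header(FILE *fp, int *magic, int *width, int *height, int *maxval)
{
    if (next_significant(fp) != 'P')
        return 0;
    return read_field(fp, magic) && read_field(fp, width)
        && read_field(fp, height) && read_field(fp, maxval);
}
```

Because a value of 0 is reported through the return code rather than through val itself, a field that is really 0 is not mistaken for an error.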
This code contains 3 file-handling functions which read from a file named "mno". But only the 1st function called in main() works. If the 1st function is commented out, then only the 2nd function works and the third doesn't. The same goes for the 3rd one.
#include <stdio.h>
#include <ctype.h>
#include <unistd.h>
void countVowel(char fin[])
{
FILE *fl;
char ch;
int count = 0;
fl = fopen(fin, "r");
while (ch != EOF)
{
ch = tolower(fgetc(fl));
count += (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u') ? 1 : 0;
}
fclose(fl);
printf("Number of Vowels in the file \" %s \"-> \t %d \n", fin, count);
}
void countConsonant(char fin[])
{
FILE *fl;
char ch;
int count = 0;
fl = fopen(fin, "r");
while (ch != EOF)
{
ch = tolower(fgetc(fl));
count += (!(ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u') && (ch >= 'a' && ch <= 'z')) ? 1 : 0;
}
fclose(fl);
printf("Number of Consonant in the file \" %s \"-> \t %d \n", fin, count);
}
void countAlphabet(char fin[])
{
FILE *fl;
char ch;
int count = 0;
fl = fopen(fin, "r");
while (ch != EOF)
{
ch = tolower(fgetc(fl));
count += (ch >= 'a' && ch <= 'z') ? 1 : 0;
}
fclose(fl);
printf("Number of Alphabets in the file \" %s \"-> \t %d \n", fin, count);
}
int main()
{
countVowel("mno"); // output -> 10
countConsonant("mno"); // output -> 0
countAlphabet("mno"); // output -> 0
return 0;
}
Here are the contents of "mno" file ->
qwertyuiopasdfghjklzxcvbnm, QWERTYUIOPASDFGHJKLZXCVBNM, 1234567890
As others have mentioned, your handling of EOF was incorrect:
ch was uninitialized on the first loop iteration
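Doing tolower(fgetc(fl)) before checking for EOF means the end-of-file value goes through the character handling like any other input instead of ending the loop right away.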
Using char ch; instead of int ch; would allow a [legitimate] 0xFF to be seen as an EOF.
But it seems wasteful to have three separate functions to create the three different counts, because most of the time is spent in the I/O rather than in determining what type of character we're looking at. This is particularly true when the counts are so interrelated.
We can keep track of multiple types of counts easily using a struct.
Here's a refactored version that calculates all three counts in a single pass through the file:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <ctype.h>
struct counts {
int vowels;
int consonants;
int alpha;
};
void
countAll(const char *fin,struct counts *count)
{
FILE *fl;
int ch;
int vowel;
count->vowels = 0;
count->consonants = 0;
count->alpha = 0;
fl = fopen(fin, "r");
if (fl == NULL) {
perror(fin);
exit(1);
}
while (1) {
ch = fgetc(fl);
// stop on EOF
if (ch == EOF)
break;
// we only care about alphabetic chars
if (! isalpha(ch))
continue;
// got one more ...
count->alpha += 1;
ch = tolower(ch);
// is current character a vowel?
vowel = (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u');
// since we know it's alphabetic, it _must_ be either a vowel or a
// consonant
if (vowel)
count->vowels += 1;
else
count->consonants += 1;
}
fclose(fl);
printf("In the file: \"%s\"\n",fin);
printf(" Number of Vowels: %d\n",count->vowels);
printf(" Number of Consonants: %d\n",count->consonants);
printf(" Number of Alphabetics: %d\n",count->alpha);
}
int
main(void)
{
struct counts count;
countAll("mno",&count);
return 0;
}
For your given input file, the program output is:
In the file: "mno"
Number of Vowels: 10
Number of Consonants: 42
Number of Alphabetics: 52
You are using ch uninitialized at while (ch != EOF). In practice, every function call after the first starts with ch equal to -1 (EOF), because ch is never initialized and its memory still holds the -1 left there by the previous call. You can fix it by replacing the loops like this:
int ch;
...
while ((ch = fgetc(fl)) != EOF)
{
ch = tolower(ch);
count += ...;
}
Here ch is getting initialized before you check it and later converted to lowercase.
EDIT:
Note that this only works if ch is an int, so it can handle the value of -1 (EOF) and the byte 255 is not truncated to -1.
EDIT:
At first I said ch was 0 all the time. It was -1. I am so sorry, I swapped it with the null terminator, which is usually the reason for such behavior.
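For reference, here is the loop above filled in as a complete function. It is a sketch that assumes the caller has already opened the stream; the name count_vowels and the FILE * parameter are illustrative and differ from the question's file-name interface.

```c
#include <ctype.h>
#include <stdio.h>

/* Count the vowels in an already opened stream. */
int count_vowels(FILE *fp)
{
    int ch;          /* int, not char, so EOF stays distinguishable from byte 255 */
    int count = 0;

    /* ch is assigned before it is tested, so it is never read uninitialized */
    while ((ch = fgetc(fp)) != EOF) {
        ch = tolower(ch);
        if (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u')
            count++;
    }
    return count;
}
```

Since nothing depends on leftover stack contents any more, calling the function repeatedly on a rewound stream gives the same count every time.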
This is my program (school exercise, should be receiving a string from the user, change it and return the original and new string in a certain format):
#include <stdio.h>
#define MAX_STRING_LENGTH 50
char switchChar(char c) {
if ((c >= 'A') && (c <= 'Z')) {
c = c + 32;
} else
if ((c >= 'a') && (c <= 'z')) {
c = c - 32;
}
if ((c > '5') && (c <= '9')) {
c = 56;
}
if ((c >= '0') && (c < '5')) {
c = 48;
}
return c;
}
int main(void) {
char temp;
int i = 0;
char stringInput[MAX_STRING_LENGTH + 1];
printf("Please enter a valid string\n");
fgets(stringInput, 50, stdin);
char newString[MAX_STRING_LENGTH + 1];
while ((i != MAX_STRING_LENGTH + 1) && (stringInput[i] != '\0')) {
temp = switchChar(stringInput[i]);
newString[i] = temp;
i++;
}
printf( "\"%s\"", stringInput);
printf("->");
printf( "\"%s\"", newString);
return 0;
}
When running, the output goes down a line after the string and before the last " character, although it should all be printed in the same line.
I would appreciate any directions.
There are several issues in your code:
fgets() reads and leaves the newline character at the end of the destination array if present and if enough space is available. For consistency with your algorithm, you should strip this newline. You can do this safely with stringInput[strcspn(stringInput, "\n")] = '\0'; or use a little more code if you cannot use <string.h>. The presence of this newline character explains the observed undesirable behavior.
You read a line with fgets(), but you pass a buffer size that might be incorrect: it is hard-coded to 50 when the array size is MAX_STRING_LENGTH + 1. With MAX_STRING_LENGTH defined as 50, it is not a problem, but if you later change the definition of the macro, you might forget to update the size argument to fgets(). Use sizeof stringInput for consistency.
You forget to set the null terminator in newString. Testing the boundary value for i is not necessary as stringInput is null terminated within the array boundaries.
In switchChar(), you should not hardcode character values from the ASCII charset: it reduces portability and, most importantly, readability.
Here is a corrected and simplified version:
#include <stdio.h>
#define MAX_STRING_LENGTH 50
char switchChar(char c) {
if ((c >= 'A') && (c <= 'Z')) {
c = c + ('a' - 'A');
} else
if ((c >= 'a') && (c <= 'z')) {
c = c - ('a' - 'A');
} else
if ((c > '5') && (c <= '9')) {
c = '8';
} else
if ((c >= '0') && (c < '5')) {
c = '0';
}
return c;
}
int main(void) {
char stringInput[MAX_STRING_LENGTH + 1];
char newString[MAX_STRING_LENGTH + 1];
int i;
printf("Please enter a valid string\n");
if (fgets(stringInput, sizeof stringInput, stdin) != NULL) {
// strip the newline character if present
//stringInput[strcspn(stringInput, "\n")] = '\0';
char *p;
for (p = stringInput; *p != '\0' && *p != '\n'; p++)
continue;
*p = '\0';
for (i = 0; stringInput[i] != '\0'; i++) {
newString[i] = switchChar(stringInput[i]);
}
newString[i] = '\0';
printf("\"%s\"", stringInput);
printf("->");
printf("\"%s\"", newString);
printf("\n");
}
return 0;
}
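If <ctype.h> is allowed in the exercise, the letter tests can be written without relying on the letters being contiguous in the character set. This is an alternative sketch, not the version above; the name switch_char_ctype is made up for illustration. The digit comparisons are kept as they are because C guarantees that '0' to '9' are consecutive.

```c
#include <ctype.h>

/* Swap letter case; map '6'..'9' to '8' and '0'..'4' to '0'. */
char switch_char_ctype(char c)
{
    /* the ctype functions require a value representable as unsigned char */
    unsigned char u = (unsigned char)c;

    if (isupper(u))
        return (char)tolower(u);
    if (islower(u))
        return (char)toupper(u);
    if (c > '5' && c <= '9')
        return '8';
    if (c >= '0' && c < '5')
        return '0';
    return c;   /* '5' and every other character are left unchanged */
}
```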
It's because fgets() reads in the newline character as well if there's room in the buffer and it's stored in your newString.
You can remove it with:
fgets(stringInput,50,stdin);
stringInput[strcspn(stringInput, "\n")] = 0; /* removes the trailing newline if any */
From fgets():
fgets() reads in at most one less than size characters from stream
and stores them into the buffer pointed to by s. Reading stops after
an EOF or a newline. If a newline is read, it is stored into the
buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
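The read and the newline removal can be combined in one small helper. This is a sketch; read_line is a hypothetical name, and it takes the stream as a parameter so that it is not tied to stdin.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf and drop the trailing newline, if any.
 * Returns 1 on success, 0 on end of file or read error. */
int read_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    /* strcspn() returns the index of the first '\n', or the string length
     * if there is none, so this is safe for a last line without a newline */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

In the program above it would be called as read_line(stdin, stringInput, sizeof stringInput), and checking the return value also covers the case where no input is available at all.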
Your requirements contain:
get only one string
no special processing for blank characters
In that case, scanf is probably better suited than fgets, because it skips any leading whitespace (space or tab) and stops before the first trailing whitespace character (space, tab, CR or newline). Remark: as scanf stops before the first whitespace character, the string cannot contain spaces or tabs. If that is a problem, use fgets.
Just replace the line:
fgets(stringInput, 50, stdin);
with:
i = scanf("%50s", stringInput);
if (i != 1) { /* always control input function return code */
perror("Could not get input string");
return 1;
}
If you prefer to use fgets for any reason, you should remove the (optional) trailing newline:
if (NULL == fgets(stringInput, 50, stdin)) { /* control input */
perror("Could not get input string");
return 1;
}
int l = strlen(stringInput);
if ((l > 0) && (stringInput[l - 1] == '\n')) { /* test for a trailing newline */
stringInput[l - 1] = '\0'; /* remove it if found */
}
For the last three days I have had a problem.
I have a file containing sentences.
When I'm reading the file with
int maxSize = 256;
int currSize = 0;
int i = 0;
char *sentence = (char*)malloc(maxSize);
char c;
currSize = maxSize;
while ((c = fgetc(input)) != EOF)
{
sentence[i++] = c;
while((c = fgetc(input)) != '\n')
{
sentence[i++] = c;
if((c == '.') || (c == '?') || (c == '!'))
sentence[i++] = '\n';
if(i == currSize)
{
currSize = i + maxSize;
sentence = (char*)realloc(sentence,currSize);
}
}
}
sentence[i] = '\0';
addSentence(sentence);
When the function addSentence adds sentences to the linked list, there is a problem: it adds only one sentence, made up of everything that is in the file.
I'm a beginner in C. Thank you.
Your problem is that you only call addSentence() at the EOF, so it doesn't magically get to see anything before you have read the whole file. Presumably, you need to call it when you detect the end of a sentence (with the test for '.', '?' or '!' — you'll also need to null terminate the string before calling addSentence and reset the memory with a new allocation and the correct size) as well as at EOF. It's not clear why you have two loops; you could miss some newlines as end of sentence. Rework with just one loop.
It's not entirely clear if newlines mark the ends of sentences. This revision assumes that they do:
int maxSize = 256;
int currSize = maxSize;
int i = 0;
int c;
char *sentence = (char*)malloc(maxSize);
assert(sentence != 0); // Not a production-ready error check
while ((c = fgetc(input)) != EOF)
{
sentence[i++] = c;
if ((c == '\n') || (c == '.') || (c == '?') || (c == '!'))
{
if (c != '\n')
sentence[i++] = '\n';
sentence[i] = '\0';
addSentence(sentence);
sentence = malloc(maxSize);
assert(sentence != 0); // Not a production-ready error check
currSize = maxSize;
i = 0;
}
if (i == currSize)
{
currSize = i + maxSize;
sentence = (char*)realloc(sentence, currSize);
assert(sentence != 0); // Not a production-ready error check
}
}
sentence[i] = '\0';
addSentence(sentence);
Note that the error checking for failed memory allocation is not production quality; there should be some proper, unconditional error checking. There is a small risk of buffer overflow if the end of sentence punctuation falls in exactly the wrong place. Production code should avoid that, too, but it would be fiddlier. I'd use a string data type and a function to do the adding. I'd probably also take a guess that most sentences are shorter than 256 characters (especially if newlines mark the end), and would use maxSize of 64. It would lead to less unused memory being allocated.
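As a rough illustration of the "string data type and a function to do the adding" idea, here is a minimal sketch. The struct strbuf type and both function names are hypothetical, and it assumes that addSentence() keeps the pointer it is given, as the revision above does when it allocates a fresh buffer after each call.

```c
#include <stdlib.h>

/* Hypothetical growable string; not part of the original code. */
struct strbuf {
    char  *data;
    size_t len;   /* characters stored, excluding the '\0' */
    size_t cap;   /* bytes allocated */
};

/* Append one character, growing the allocation first if needed so that
 * there is always room for the character and a terminating '\0'.
 * Returns 1 on success, 0 if memory could not be allocated. */
int strbuf_add(struct strbuf *sb, char c)
{
    if (sb->len + 2 > sb->cap) {
        size_t newcap = (sb->cap == 0) ? 64 : sb->cap * 2;
        char *p = realloc(sb->data, newcap);
        if (p == NULL)
            return 0;           /* the old buffer is still valid */
        sb->data = p;
        sb->cap = newcap;
    }
    sb->data[sb->len++] = c;
    sb->data[sb->len] = '\0';
    return 1;
}

/* Hand the finished string to the caller and reset the buffer.
 * Returns NULL if nothing was ever added. */
char *strbuf_take(struct strbuf *sb)
{
    char *s = sb->data;
    sb->data = NULL;
    sb->len = 0;
    sb->cap = 0;
    return s;
}
```

With a buffer declared as struct strbuf sb = {0}, the reading loop would call strbuf_add(&sb, c) for every character (and once more for the extra '\n'), then pass strbuf_take(&sb) to addSentence() at each end of sentence. Because the size check happens before every write, the punctuation can no longer land one byte past the end of the allocation.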
I've created this function to read a word. I got segmentation fault and I can't find the problem. Here's what I've done.
void LeeCaracter(FILE * fp, char * s)
{
char c;
int i = 0;
c = fgetc(fp);
while(c==' ' || c=='\t' || c=='\n')
c = fgetc(fp);
while(c!=' ' && c!='\n')
{
s[i] = c;
i++;
c = fgetc(fp);
}
s[i] = '\0';
}
s is a pointer parameter, as I have to use it later. Is it correct to write it just with one *? Thanks for your help!
And what if I wanted to know the character that follows the word (' ' or '\n')? I added this after the while loop:
printf("%c",c);
but it doesn't print anything. Any ideas?
Consider:
while(c==' ' || c=='\t' || c=='\n')
c = fgetc(fp);
So, at this point, c is neither ' ' nor '\n'. Then:
while(c!=' ' && c!='\n')
{
s[i] = c;
i++;
}
Since the value of c does not change in the loop, the while condition is always true. Meaning that pretty quickly, s[i] will go out of bounds. You need to check against the length of s, probably by getting that passed in as a parameter (not to mention, rethink your algorithm a bit -- probably you want to fgetc more inside the loop).
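A bounded version might look like the sketch below. The name read_word and the size parameter are assumptions made for illustration; the function also returns the character that ended the word, which is one way to find out whether a ' ' or a '\n' followed it.

```c
#include <stdio.h>

/* Read the next whitespace-delimited word from fp into s (capacity size).
 * Returns the character that ended the word (' ', '\t' or '\n'), or EOF.
 * s is always null-terminated; an over-long word is truncated. */
int read_word(FILE *fp, char *s, size_t size)
{
    size_t i = 0;
    int c;                      /* int, so EOF can be detected */

    if (size == 0)
        return EOF;             /* no room even for the terminator */

    /* skip leading whitespace */
    do {
        c = fgetc(fp);
    } while (c == ' ' || c == '\t' || c == '\n');

    /* copy until whitespace or end of file, never past the buffer */
    while (c != EOF && c != ' ' && c != '\t' && c != '\n') {
        if (i + 1 < size)
            s[i++] = (char)c;
        c = fgetc(fp);          /* c changes on every iteration */
    }
    s[i] = '\0';
    return c;
}
```

A caller can keep reading until the function returns EOF with an empty string in s.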
You have to make sure that s has enough space to hold the longest word in the input file. You also need to check for end of file. Here is a working version. I hope it works for you as well.
#include <stdio.h>
void LeeCaracter(FILE * fp, char * s)
{
int c; /* int rather than char, so the EOF returned by fgetc() is not mangled */
int i = 0;
c = fgetc(fp);
if (feof(fp)) return;
while (c == ' ' || c == '\t' || c == '\n')
c = fgetc(fp);
while (!feof(fp) && (c != ' ' && c != '\n')) {
s[i++] = c;
c = fgetc(fp);
}
s[i] = '\0';
printf("%s\n", s);
}
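int main(void)
{
char s[128]; /* assuming no word is larger than this size */
FILE *fp = fopen("/usr/share/dict/words", "r");
if (fp == NULL) { /* fopen() fails if the file is missing or unreadable */
perror("/usr/share/dict/words");
return 1;
}
while (!feof(fp)) {
LeeCaracter(fp, s);
}
fclose(fp);
return 0;
}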