One line is being skipped in file with isalpha() - c

I am using isalpha() to determine whether a line contains alphabetic characters, and it reports that the line "90 1 0" contains some.
Here is the relevant code
bool digitTest(char *test, int arysize)
{
int i;
for (i =0; i<arysize; i++)
{
if ((isalpha(test[i])) != 0){
return 0;
}
if (i==arysize)
return 1;
i++;
}
return 1;
}
It is called here
char buffer[4096] = {};
while(NULL!=fgets(buffer, sizeof(buffer), fp)){
if (digitTest(buffer, 4096) == 0){
printf ("contains alpha\n");
continue;
/*printing code if there is no alphabetic characters below, i do not believe it is relevant*/
and this is output
1
1
1
contains alpha
contains alpha
contains alpha
25123.90
54321.23
6
and input
1
1
1
89 23.5 not a number
90 1 0
-5.25 not a number 10000
25123.90 54321.23 6

The code has a few problems:
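You shouldn't check the whole buffer. fgets produces a C-style string, so scan only up to the terminating \0; anything after it is left over from earlier, longer lines.
The second problem is the extra i++ inside the loop body of digitTest, which makes the loop skip every other character. Remove it.
Once you stop at \0, you no longer need arysize.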
Use this digitTest function instead
int digitTest(char *test)
{
int i;
for (i=0; test[i] && test[i] != '\n'; i++)
{
if (isalpha((unsigned char)test[i]) != 0)
return 0;
}
return 1;
}
(maybe has minor bugs, I didn't test it)
And call it like this:
while( fgets(buffer, sizeof(buffer), fp) ) {
if (!digitTest(buffer)) {
printf ("contains alpha\n");
continue;
}
}
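As a quick check of the advice above, here is a minimal sketch of the same test written as a stand-alone function. The name no_alpha is made up for illustration, and the cast to unsigned char is an addition: passing a negative char value to isalpha() is undefined behavior.

```c
#include <ctype.h>

/* Returns 1 if the string contains no alphabetic character, 0 otherwise.
   Scanning stops at the terminating '\0', so bytes left in the buffer
   by earlier reads are never examined. */
int no_alpha(const char *s)
{
    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

Called as no_alpha(buffer) right after a successful fgets(), it should accept "90 1 0" and reject "89 23.5 not a number".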

It looks like you are scanning memory past the end of the current line, where characters from earlier lines may remain. Change your code to
char buffer[4096] = {};
memset(buffer, 0, 4096);
while(NULL!=fgets(buffer, sizeof(buffer), fp)){
if (digitTest(buffer, strlen(buffer)) == 0){ //get actual length of buffer
printf ("contains alpha\n");
continue;
EDIT
In order to make your code not respond to either \n or \r do
bool digitTest(char *test, int arysize)
{
int i;
for (i =0; i<arysize; i++)
{
if( test[i] == '\r' || test[i] == '\n')
return 1;
if ((isalpha(test[i])) != 0){
return 0;
}
if (i==arysize)
return 1;
}
return 1;
}
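To see why scanning the full array misfires on "90 1 0", the following sketch imitates two consecutive reads into the same buffer. It is only an illustration: the function names and the 64-byte buffer are made up, and strcpy() stands in for the two fgets() calls.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Scans exactly n bytes, like the original digitTest(buffer, 4096). */
static int alpha_in_first_n(const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (isalpha((unsigned char)buf[i]))
            return 1;
    }
    return 0;
}

/* Returns 1 if letters are reported for the line "90 1 0".
   scan_whole_buffer != 0 scans the full array; 0 scans strlen() bytes. */
int stale_letters_seen(int scan_whole_buffer)
{
    char buffer[64] = {0};
    strcpy(buffer, "89 23.5 not a number\n"); /* first, longer line  */
    strcpy(buffer, "90 1 0\n");               /* second, shorter line */
    /* buffer now holds "90 1 0\n\0" followed by "not a number\n" */
    size_t n = scan_whole_buffer ? sizeof buffer : strlen(buffer);
    return alpha_in_first_n(buffer, n);
}
```

The second copy overwrites only the first eight bytes, so the tail of the previous line is still in the array and a whole-buffer scan finds its letters.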

Related

Possible stack smashing?

I was looking at the following example in a book, and it seems to me that it will cause stack smashing:
int read_line(char str[], int n)
{
int ch, i = 0;
while ((ch = getchar()) != '\n')
if (i < n)
str[i++] = ch;
str[i] = '\0';
return i;
}
If I pass it an array with room for 10 chars, and n = 10, the if-statement will be true up to and including 9, and i will be incremented to 10.
Then, it will write the '\0' character at str[10] which would be just past the end of the array?
It works just fine, though (tried building with gcc on Linux, clang on Mac and VS on Windows).
VS on Windows is the only one showing an error when running the program, even though I have tried setting -fstack-protector in e.g. clang.
Your assessment is correct, the code has undefined behavior if the user types n or more bytes before the newline. There is also a problem if the end of file is encountered before the end of the line: the function will then run an infinite loop.
Here is a corrected version:
#include <stdio.h>
int read_line(char str[], int n) {
int ch, i = 0;
while ((ch = getchar()) != EOF && ch != '\n') {
if (i + 1 < n)
str[i] = ch;
i++;
}
if (n > 0) {
str[i < n ? i : n - 1] = '\0';
}
if (i == 0 && ch == EOF) {
/* end of file: no input */
return -1;
} else {
/* return the complete line length, excluding the newline */
return i;
}
}
int main() {
char buf[50];
int count = read_line(buf, sizeof buf);
if (count < 0) {
printf("Empty file\n");
} else
if (count >= sizeof buf) {
printf("Line was truncated: %s\n", buf);
} else {
printf("Read %d bytes: %s\n", count, buf);
}
return 0;
}

Counting number of words inside a text file and printing results in a different text file in c/c++

the code:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
char filename[] = "11.txt";
char filename1[] = "2.txt";
FILE *ptr, *resultptr;
char string[100];
char words[100][100];
int len = sizeof(filename) / sizeof(char);
int i = 0, j = 0, k, length, count;
int main()
{
fopen_s(&ptr, filename, "r");
fopen_s(&resultptr, filename1, "w");
if ((ptr == nullptr) || (resultptr == nullptr)) {
printf("Files were not opened!");
return -1;
}
while (fgets(string, sizeof string, ptr)) {
for (k = 0; string[k] != '\0'; k++) {
if (string[k] != ' ' && string[k] != '\n') {
words[i][j++] = tolower(string[k]);
} else {
words[i][j] = '\0';
i++;
j = 0;
}
}
length = i + !!j;
fputs("Occurrences of each word:\n", resultptr); //prints this sentence into file
for (i = 0; i < length; i++) {
if (strcmp(words[i], "0") == 0)
continue;
count = 1;
char *ch = words[i];
for (j = i + 1; j < length; j++) {
if (strcmp(words[i], words[j]) == 0 && (strcmp(words[j], "0") != 0)) {
count++;
strcpy_s(words[j], "0");
}
}
fputs("The word ", resultptr);
if (string[i] != ' ' && string[i] != '\n') {
fprintf(resultptr, "%s", ch);
}
fputs(" occurred ", resultptr);
fprintf(resultptr, "%d", count);
fputs(" times\n", resultptr);
}
fclose(ptr);
fclose(resultptr);
return 0;
}
}
The counting part is working perfectly fine, but the problem is when I try to print results, for the sentence "to be or not: to be that is the question ..." it prints this:
Occurrences of each word:
The word to occurred 2 times
The word be occurred 2 times
The word occurred 1 times
The word not: occurred 1 times
The word that occurred 1 times
The word is occurred 1 times
The word occurred 1 times
Occurrences of each word:
The word to occurred 1 times
The word be occurred 1 times
The word or occurred 1 times
The word occurred 1 times
The word that occurred 1 times
The word is occurred 1 times
The word occurred 2 times
The word question occurred 1 times
The word ... occurred 1 times
What's messing? like I'm not professional, but can someone guide me on what's wrong here? I changed a bit from the original one but still a lot of mistakes
There are multiple problems in the code:
the global variables should be moved inside the body of the main() function.
fopen_s() is not portable, use fopen() instead.
strcpy_s() is not portable, use strcpy() instead or just set the first byte of the string to '\0' to make it an empty string.
i and j should be reset to 0 after each fgets().
you should test for letters with isalpha() instead of only testing for space and newline.
you should clear the duplicated words by setting them to the empty string.
you should use a simple fprintf() call for the output line.
you should not close the files inside the while(fgets(...)) loop.
If you want to count all words in the file, this approach is limited to a rather small number of words. A more general solution would construct a dictionary of words found as you read the file contents and increment the count for each word found.
Here is a modified version:
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#ifdef _MSC_VER
#pragma warning(disable:4996) // disable Microsoft obnoxious warning
#endif
#define WORDS 2000
#define CHARS 40
int main() {
char filename[] = "11.txt";
char filename1[] = "2.txt";
FILE *ptr, *resultptr;
char string[100];
char words[WORDS][CHARS];
int i, j, k, length, count;
ptr = fopen(filename, "r");
if (ptr == NULL) {
fprintf(stderr, "cannot open %s: %s\n", filename, strerror(errno));
return 1;
}
resultptr = fopen(filename1, "w");
if (resultptr == NULL) {
fprintf(stderr, "cannot open %s: %s\n", filename1, strerror(errno));
return 1;
}
i = j = 0;
while (i < WORDS && fgets(string, sizeof string, ptr)) {
for (k = 0; string[k] != '\0'; k++) {
unsigned char c = string[k];
if (isalpha(c)) {
if (j < CHARS - 1)
words[i][j++] = tolower(c);
} else {
words[i][j] = '\0';
if (j > 0) {
j = 0;
i++;
if (i == WORDS)
break;
}
}
}
if (j > 0) {
// include the last word if the file does not end with a newline
words[i][j] = '\0';
i++;
}
}
length = i;
fprintf(resultptr, "Occurrences of each word:\n");
for (i = 0; i < length; i++) {
if (words[i][0] == '\0')
continue;
count = 1;
for (j = i + 1; j < length; j++) {
if (strcmp(words[i], words[j]) == 0) {
count++;
words[j][0] = '\0';
}
}
fprintf(resultptr, "The word %s occurred %d times\n", words[i], count);
}
fclose(ptr);
fclose(resultptr);
return 0;
}
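The dictionary approach mentioned above could be sketched as follows. This is an assumption-laden illustration, not part of the program above: struct word_count, add_word and count_words are hypothetical names, the lookup is a plain linear search, and a word that straddles two fgets() chunks would be counted as two words.

```c
#include <ctype.h>
#include <string.h>

#define MAX_WORDS 2000
#define MAX_CHARS 40

struct word_count {
    char word[MAX_CHARS];
    int count;
};

/* Increment the entry for `word`, or append it if it is new.
   Returns the updated number of entries. */
static int add_word(struct word_count *dict, int n, const char *word)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(dict[i].word, word) == 0) {
            dict[i].count++;
            return n;
        }
    }
    if (n < MAX_WORDS) {          /* silently drop words once full */
        strcpy(dict[n].word, word);
        dict[n].count = 1;
        n++;
    }
    return n;
}

/* Tally the lowercase alphabetic words of `text` into dict, which already
   holds n entries. Returns the updated number of entries. */
int count_words(const char *text, struct word_count *dict, int n)
{
    char word[MAX_CHARS];
    int len = 0;
    for (;; text++) {
        unsigned char c = (unsigned char)*text;
        if (isalpha(c)) {
            if (len < MAX_CHARS - 1)      /* truncate overlong words */
                word[len++] = (char)tolower(c);
        } else {
            if (len > 0) {                /* a word just ended */
                word[len] = '\0';
                n = add_word(dict, n, word);
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return n;
}
```

Calling n = count_words(string, dict, n) once per fgets() line, starting from n = 0, accumulates the counts for the whole file, and a final loop over dict[0..n) prints them.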
Note: Meanwhile, OP has applied the mentioned fixes to the question, thereby invalidating this answer. This answer applies to revision 4 of the question.
What's messing?
The word 0 occurred 1 times - You chose to replace word duplicates with the string "0". In order to not count those replacements as words, insert
if (strcmp(words[i], "0") == 0) continue;
at the very beginning of the printing for loop body. It seems you intended if (string[i] != ' ' && string[i] != '\0' && string[i]!='0' ) to do this, but that doesn't work - remove this code.
Besides, the empty string would be a better choice, allowing the word 0.
The word
occurred 1 times - The '\n' at the end was counted as a word. In order to not count this and in addition skip punctuation as well as avoid empty words due to consecutive non-word characters, replace
if (string[k] != ' ' && string[k] != '\0') {
words[i][j++] = tolower(string[k]);
}
else
with
if (isalnum(string[k]))
words[i][j++] = tolower(string[k]);
else if (j)
The word occurred 1 times - An empty word at the end of file was counted. In order to not count that, add 1 to i only if inside a word at EOF, i.e. change
length = i + 1;
to
length = i + !!j;

Odd output of string in C

I received an assignment to write a code that would erase the instances of a string in another string, and although my code does that successfully, the symbol ╠ appears many times at the end of the result string.
Example:
For input string 1 - A string is a string, and an input string 2 - str
The result should be A ing is a ing.
But I receive A ing is a ing╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
I was hoping to get some help with this issue, because no matter what I tried I wasn't able to fix it.
#include <stdio.h>
#define STRING_SIZE 100
int StrippingFunc(char input_str1[STRING_SIZE], char input_str2[STRING_SIZE], char
result_string[STRING_SIZE])
{
if (input_str2[0] == '\n' || input_str2[0] == '\0')
{
return 0;
}
for (int k1 = 0; k1 < STRING_SIZE; k1++)
{
if (input_str1[k1] == '\n')
{
input_str1[k1] = '\0';
}
}
for (int k2 = 0; k2 < STRING_SIZE; k2++)
{
if (input_str2[k2] == '\n')
{
input_str2[k2] = '\0';
}
}
int Length;
int length2 = 0;
int index2 = 0;
while (input_str2[index2] != '\0') // Loop used to determine input_string2's length.
{
length2++;
index2++;
}
int InString = 0;
int i = 0;
int j;
int resultindex = 0;
while (input_str1[i] != '\0')
{
Length = length2;
int l = i;
j = 0;
int proceed = 1;
if (input_str1[l] == input_str2[j])
{
while ((input_str2[j] != '\0') && (proceed != 0))
{
while (Length >= 0)
{
if (Length == 0)
{
InString = 1;
i += (l-i-1);
proceed = 0;
Length = -1;
}
if (input_str1[l] == input_str2[j])
{
Length--;
j++;
l++;
}
else if ((input_str1[l-1] == input_str2[j-1]) && (input_str2[j] == '\0'))
{
proceed = 0;
Length = -1;
}
else
{
proceed = 0;
Length = -1;
result_string[resultindex] = input_str1[l - 1];
resultindex++;
}
}
}
}
else
{
result_string[resultindex] = input_str1[i];
resultindex++;
}
i++;
}
return InString;
}
int main()
{
char result_string[STRING_SIZE];
char input_string1[STRING_SIZE];
char input_string2[STRING_SIZE];
printf("Please enter the main string..\n");
// Your function call here..
fgets(input_string1, STRING_SIZE + 1, stdin);
printf("Please enter the pattern string to find..\n");
// Your function call here..
fgets(input_string2, STRING_SIZE + 1, stdin);
int is_stripped = StrippingFunc(input_string1, input_string2, result_string);; // Your function call here..
// Store the result in the result_string if it exists
printf("> ");
printf(is_stripped ? result_string : "Cannot find the pattern in the string!");
return 0;
}
But I receive A ing is a ing╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠╠
After you fill result_string, you never add the final null character. Because of that, printf runs on into uninitialized characters, which is undefined behavior and produces the unexpected output. After
while (input_str1[i] != '\0')
{
Length = length2;
...
}
add
result_string[resultindex] = 0;
Note that there is room for it, because result_string and input_str1 have the same size.
Having
char input_string1[STRING_SIZE];
char input_string2[STRING_SIZE];
these two lines can have undefined behavior:
fgets(input_string1, STRING_SIZE + 1, stdin);
fgets(input_string2, STRING_SIZE + 1, stdin);
because fgets may write past the end of the arrays. Either remove the + 1 or make the arrays one element larger.
In
for (int k1 = 0; k1 < STRING_SIZE; k1++)
{
if (input_str1[k1] == '\n')
{
input_str1[k1] = '\0';
}
}
for (int k2 = 0; k2 < STRING_SIZE; k2++)
{
if (input_str2[k2] == '\n')
{
input_str2[k2] = '\0';
}
}
Unless fgets filled the arrays completely, these loops have undefined behavior: they read uninitialized characters because they do not stop at the newline or the null character.
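One way to remove the newline without ever looking past the terminator is sketched below. It swaps the hand-written loop for strcspn() from <string.h>, which may or may not be allowed in the assignment; trim_newline is a made-up name.

```c
#include <string.h>

/* Cut the string at the first '\n', if there is one.
   strcspn() stops at the terminating '\0', so no uninitialized
   bytes after the end of the string are read. */
void trim_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```

If the string has no newline, strcspn() returns its length and the assignment just rewrites the existing terminator.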
In
int length2 = 0;
int index2 = 0;
while (input_str2[index2] != '\0') // Loop used to determine input_string2's length.
{
length2++;
index2++;
}
length2 and index2 always have exactly the same value, so there is no point in having two variables. In fact this loop is unnecessary: the previous loop, once it terminates correctly, already gives you the length.
In
printf(is_stripped ? result_string : "Cannot find the pattern in the string!");
I encourage you to replace printf with puts. It adds a final newline, which flushes the output and makes it clearer when you run the program from a shell. More importantly, if the input string contains for instance %d, it is not removed, and is_stripped is true, then printf will try to fetch an argument that does not exist, which is undefined behavior.
If you make all of these corrections, your code prints > A ing is a ing for your inputs, without undefined behavior.
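For comparison, here is a compact sketch of the whole stripping step that always terminates its result. It is not the code from the question: strip_all is a hypothetical name, and it relies on strlen() and strncmp(), which the assignment may not permit.

```c
#include <string.h>

/* Copy src to dst, leaving out every occurrence of pat.
   dst always ends up null-terminated (if dst_size > 0).
   Returns the number of occurrences removed. */
int strip_all(const char *src, const char *pat, char *dst, size_t dst_size)
{
    size_t pat_len = strlen(pat);
    size_t out = 0;
    int removed = 0;

    if (dst_size == 0)
        return 0;
    /* keep one byte in reserve for the terminator */
    while (*src != '\0' && out + 1 < dst_size) {
        if (pat_len > 0 && strncmp(src, pat, pat_len) == 0) {
            src += pat_len;        /* skip the match */
            removed++;
        } else {
            dst[out++] = *src++;   /* keep this character */
        }
    }
    dst[out] = '\0';               /* the step missing in the question */
    return removed;
}
```

With the inputs from the question, strip_all("A string is a string", "str", result, sizeof result) should leave "A ing is a ing" in result.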

Replacing three 'a' with a single '*' in a string

So my program should get input from a user and store it in an array. After that, if the input string includes three 'a's in a row, they should be replaced with a single '*'. However, I can't seem to get it right. It only replaces the first a with a *. I tried to replace the following two a's with a blank, but the output looks funny.
For this exercise I have to use putchar() and getchar().
Thank you in advance.
#include <stdio.h>
char c;
char buffer[256];
int counter= 0;
int i;
int main()
{
while ((c = getchar()) != '\n') {
buffer[counter] =c;
counter++;
if (counter >255) {
break;
}
}
for(i=0; i<256; i++) {
if(buffer[i]== 'a'&&buffer[i+1]=='a'&&buffer[i+2]=='a')
{
buffer[i]= '*';
buffer[i+1]=' ';
buffer[i+2]=' ';
}
putchar(buffer[i]);
}
putchar('\n');
return 0;
}
So my program should get input from a user and store it in an array.
After that if the input string includes three 'a's in a row it should
be replaced with a single '*'. However I can't seem to get it right.
You almost got it! Just advance the index by 2 and continue.
#include <stdio.h>
char c;
char buffer[256];
int counter= 0;
int i;
int main(void)
{
while ((c = getchar()) != '\n') {
buffer[counter] =c;
counter++;
if (counter >= 255) {
break;
}
}
buffer[counter] ='\0';
for(i=0; i<counter; i++) {
if(buffer[i]== 'a'&&buffer[i+1]=='a'&&buffer[i+2]=='a')
{
buffer[i]= '*';
putchar(buffer[i]);
i = i + 2;
continue;
}
putchar(buffer[i]);
}
putchar('\n');
return 0;
}
Test:
123aaa456aaa78
123*456*78
A string must end with a terminator, the null character, written '\0' or simply 0. Correct your code like this:
while ((c = getchar()) != '\n') {
buffer[counter] =c;
counter++;
if (counter >=255) {
break;
}
}
buffer[counter] ='\0';// or buffer[counter] =0;
To avoid such side effects, you can also set every element of the array to 0 first:
char buffer[256];
memset(buffer, 0, sizeof(buffer));
If you want to change the number of characters, you will need to create a different buffer to copy the output to.
If you really just want to output to the console, you could just write every character until you hit your matching string.
#include <stdio.h>
char c;
char buffer[256];
char output[256];
int counter= 0;
int i, j;
int main()
{
while ((c = getchar()) != '\n') {
buffer[counter] = c;
counter++;
if (counter >= 255) {
break;
}
}
buffer[counter] = 0;
for(i=0, j=0; i<counter; i++, j++) {
if(buffer[i] == 'a' && buffer[i+1] == 'a'&& buffer[i+2] == 'a')
{
output[j]= '*';
i += 2;
}
else
output[j] = buffer[i];
putchar(output[j]);
}
putchar('\n');
return 0;
}
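The console-only idea mentioned above, writing characters as they arrive, could be sketched without any buffer at all. This is an illustrative variant, not the code above: collapse_aaa is a made-up name, and it takes FILE pointers so it can be exercised on any stream. Called with stdin and stdout, getc() and putc() behave like getchar() and putchar().

```c
#include <stdio.h>

/* Copy one line from `in` to `out`, writing '*' for every run of
   three consecutive 'a' characters. */
void collapse_aaa(FILE *in, FILE *out)
{
    int c;        /* int, so that EOF can be told apart from a char */
    int run = 0;  /* 'a's read but not yet written (0, 1 or 2) */

    while ((c = getc(in)) != EOF && c != '\n') {
        if (c == 'a') {
            if (++run == 3) {      /* third 'a' in a row */
                putc('*', out);
                run = 0;
            }
        } else {
            for (; run > 0; run--) /* the run was too short: emit it */
                putc('a', out);
            putc(c, out);
        }
    }
    for (; run > 0; run--)         /* pending 'a's at end of line */
        putc('a', out);
    putc('\n', out);
}
```

Holding back at most two 'a's is enough, because a third one decides whether the run becomes a '*' or is written out unchanged.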
There are multiple problems in your code:
there is no reason to make all these variables global. Declare them locally in the body of the main function.
use int for the type of c as the return value of getchar() does not fit in a char.
you do not check for EOF.
your test for buffer overflow is off by one.
you do not null terminate the string in buffer. You probably make buffer global so it is initialized to all bits 0, but a better solution is to set the null terminator explicitly after the reading loop.
to replace a sequence of 3 characters with a single one, you need to copy the rest of the string.
You can use a simple method often called the two-finger approach: you use two different index variables into the same array, one for reading and one for writing.
Here is how it works:
#include <stdio.h>
int main() {
char buffer[256];
int c;
size_t i, j, counter;
for (counter = 0; counter < sizeof(buffer) - 1; counter++) {
if ((c = getchar()) == EOF || c == '\n')
break;
buffer[counter] = c;
}
buffer[counter] = '\0';
for (i = j = 0; i < counter; i++, j++) {
if (buffer[i] == 'a' && buffer[i + 1] == 'a' && buffer[i + 2] == 'a') {
buffer[j] = '*';
i += 2;
} else {
buffer[j] = buffer[i];
}
}
buffer[j] = '\0'; /* set the null terminator, the string may be shorter */
printf("modified string: %s\n", buffer);
return 0;
}

Can't break a while loop. Reading undefined array size

I have this sequence of letters and numbers, in which the letters are always these four: s, S, m, M. The numbers can have any value. Since the length of the sequence is not given, I can't use a for loop, so I decided to use a while loop, but I'm having trouble breaking out of it.
Some input examples are:
12 s 80 s 3 m 12 M 240 S 8 m 30 s 240 s 1440 S 8 m 18 s 60 M
5 m 120 s 30 s 360 S 6 M 5 s 42 S 36 M 8 m 66 M 3240 S 14 m
Here is my code so far:
#include <stdio.h>
int main () {
int n[100], i = 0;
char x[100];
while(x[i] != '\n')
{
scanf(" %d %c", &n[i], &x[i]);
printf("%d %c ", n[i], x[i]);
i++;
}
return 0;
}
Any thoughts on how to break the loop and have all these values saved correctly in the arrays?
Like this:
#include <stdio.h>
int main(void){
int n[100], i, j;
char x[100];
do {
for(i = 0; i < 100; ++i){
int ch;
while((ch = getchar()) != EOF && ch != '\n'){//check newline
if(ch == '-' || '0' <= ch && ch <= '9'){
ungetc(ch, stdin);//back to stream a character
break;
}
}
if(ch == EOF || ch == '\n')
break;
if(2 != scanf("%d %c", &n[i], &x[i])){
fprintf(stderr, "invalid format.\n");
i = 0;//End the outer do-while loop
break;
}
}
//print
for(j = 0; j < i; ++j){
printf("(%d, %c)", n[j], x[j]);
}
printf("\n");
} while(i != 0);//End with empty line
}
#include <stdio.h>
#define DATA_MAX_LEN 100
int main(void){
int n[DATA_MAX_LEN], i, len, loop_end;
char x[DATA_MAX_LEN], newline[2], ch;
while(scanf("%1[\n]", newline) != 1){//End with empty line(only newline), Need EOF check
for(loop_end = len = i = 0; i < DATA_MAX_LEN && !loop_end; ++i){
//format: integer character[space|newline]
if(scanf("%d %c%c", &n[i], &x[i], &ch) != 3)
loop_end = printf("invalid format.\n");
else if(ch == '\n')
loop_end = len = ++i;
else if(ch != ' ')
loop_end = printf("invalid format.\n");
}
for(i = 0; i < len; ++i){
printf("%d %c ", n[i], x[i]);
}
printf("\n");
}
}
scanf and fscanf have many problems so it's best to avoid them.
The general pattern for dealing with input is to create a large input buffer, and then process that into smaller chunks.
char line[4096];
fgets( line, sizeof(line), stdin );
Since line is reused it's ok to make it large enough to hold any reasonable input.
Now that you've read a line into memory, it's a string of known size to process as you like. sscanf (scanf on a string) doesn't have most of the problems of scanf, but it's also not suited to moving through a string. One approach is to split the string into tokens on whitespace with strtok, and process them alternately as letters and numbers.
const char sep[] = " \t\n";
bool expect_letter = false;
for(
char *token = strtok( line, sep );
token != NULL;
token = strtok( NULL, sep )
) {
if( expect_letter ) {
printf("Letter %s\n", token);
expect_letter = false;
}
else {
printf("Number %s\n", token);
expect_letter = true;
}
}
If you want to store them in an array, it's bad practice to allocate what you hope is enough memory. You'll have to use an array that grows as needed. C does not have these built in. You can write your own, and it's a good exercise, but it's easy to get wrong. For production, use one from a library such as GLib.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <glib.h>
int main() {
// Read a single line of input into a buffer.
char line[4096];
fgets( line, sizeof(line), stdin );
// Create arrays to hold letters and numbers, these will grow as needed.
GArray *numbers = g_array_new( FALSE, TRUE, sizeof(int) );
GArray *letters = g_array_new( FALSE, TRUE, sizeof(char) );
// Split the string on whitespace into tokens.
const char sep[] = " \t\n";
gboolean expect_letter = FALSE;
for(
char *token = strtok( line, sep );
token != NULL;
token = strtok( NULL, sep )
) {
if( expect_letter ) {
// Need an error check to ensure that `token` is a single character.
g_array_append_val( letters, token[0] );
expect_letter = FALSE;
}
else {
// strtol is a better choice, it has error checking
int num = atoi(token);
g_array_append_val( numbers, num );
expect_letter = TRUE;
}
}
// Print the numbers and letters.
for( guint i = 0; i < letters->len; i++ ) {
printf(
"%d%c\n",
g_array_index( numbers, int, i ),
g_array_index( letters, char, i )
);
}
}
Note that GLib provides its own boolean, so I switched to that instead of stdbool to keep things consistent.
As noted in the comments, this does not include checks that the token is what you expect. It's also possible to have a number with no letter, so checking that letters and numbers are the same size would be good. Or you can make a struct to hold the letter/number pairs and have a single list of those structs.
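For the write-your-own route, here is a rough sketch that combines the suggestions above: a single growable list of number/letter structs, strtol() with error checking, and a check that each letter token is one of s, S, m, M. All names (struct pair, struct pair_list, pair_list_push, parse_pairs) are invented for illustration, and the doubling growth policy is just one common choice.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct pair { long number; char letter; };

/* Growable array; initialize with {0} and free(items) when done. */
struct pair_list { struct pair *items; size_t len, cap; };

/* Append p, doubling the capacity when full. Returns 0 or -1. */
static int pair_list_push(struct pair_list *list, struct pair p)
{
    if (list->len == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 8;
        struct pair *tmp = realloc(list->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;             /* old block is still valid */
        list->items = tmp;
        list->cap = new_cap;
    }
    list->items[list->len++] = p;
    return 0;
}

/* Parse "number letter number letter ..." from line, which strtok()
   modifies. Returns 0 on success, -1 on malformed input or allocation
   failure; pairs parsed before the error stay in the list. */
int parse_pairs(char *line, struct pair_list *list)
{
    const char sep[] = " \t\r\n";
    char *token = strtok(line, sep);

    while (token != NULL) {
        struct pair p;
        char *end;

        errno = 0;
        p.number = strtol(token, &end, 10);
        if (end == token || *end != '\0' || errno == ERANGE)
            return -1;             /* not a clean in-range integer */

        token = strtok(NULL, sep);
        if (token == NULL || strlen(token) != 1 ||
            strchr("sSmM", token[0]) == NULL)
            return -1;             /* missing or unexpected letter */
        p.letter = token[0];

        if (pair_list_push(list, p) != 0)
            return -1;
        token = strtok(NULL, sep);
    }
    return 0;
}
```

Keeping each number next to its letter in one struct means a number without a letter is detected during parsing rather than by comparing two array lengths afterwards.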

Resources