Beginner C: Reading Age - c

I'm studying elementary programming in C and I'm doing a challenge to determine the reading age of various sentences. This is achieved by determining the number of sentences in a string, etc. I have some code that does my first, very basic step, but it's not quite working as expected.
I'm thinking this is because my knowledge of the strlen function etc isn't sufficient.
I don't want to cheat for the answer as I like the sense of achievement from the problem solving. But would it be possible to get a gentle push in the right direction if at all possible?
char sentence[] = "One fish. Two fish. Red fish. Blue fish.";
int main(void)
{
int sentence_count = 0;
int word_count = 0;
int i;
int length = strlen(sentence);
for (i = 0; i == strlen(sentence); i++) //need to somehow go through a string one char at a time until the end.
{
if (sentence[i] == '.' || sentence[i] == '!' || sentence[i] == ';' || sentence[i] == '?')
{
return sentence_count ++;
}
if (sentence[i] == '\0')
{
return word_count ++;
}
}
printf("There are %i in %i sentences.\n", word_count, sentence_count);
printf("%i\n", length);
}

First Problem -
for (i = 0; i == strlen(sentence); i++)
This statement should be -
for (i = 0; i < strlen(sentence); i++)
In your case, the loop terminates before the first iteration, because i == strlen(sentence) is false when i is 0. You need to keep looping until i reaches the index strlen(sentence).
Recall that a for loop has the syntax for(initialisation; condition; increment/decrement) and keeps running only while the condition evaluates to true. So you need a condition that stays true until you have traversed the whole string, which is what the second line of code above does.
A better alternative approach would be -
for (i = 0; sentence[i] != '\0'; i++)
which means the loop will run until you encounter the terminating null character.
Second Problem -
return sentence_count ++;
.
.
return word_count ++;
Here, you don't need to add the return keyword before the above two statements. The return will directly exit from your program. Simply writing sentence_count++ and word_count++ would be correct.
Third Problem -
sentence[i] == '\0'
This statement doesn't quite fit with the logic that we are trying to achieve. The statement instead must check if the character is a space and then increment the word count -
if (sentence[i] == ' ')
{
word_count++;
}

Your for loop condition is the main problem; for (i = 0; i == strlen(sentence); i++) reads as "on entry, set i to 0, enter the body each time i is equal to the length of sentence, increment i at the end of each loop". But this means the loop never runs unless sentence is the empty string (has strlen of 0). You want to test i < strlen(sentence) (or to avoid potentially recomputing the length over and over, use the length you already calculated, i < length).
You also need to remove your returns; the function is supposed to count, and as written, it will return 0 the instant it finds any of the target characters, without using the incremented values in any way. Put a return 0; at the end of main to indicate exiting successfully (optionally, stdlib.h can be included so you can return EXIT_SUCCESS; to avoid magic numbers, but it's the same behavior).
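Putting both fixes together, the counting loop might look like the minimal sketch below. The function name count_terminators is made up for illustration and is not part of the original code; it only covers the sentence-counting half and uses the same four punctuation characters as the question.

```c
#include <string.h>

/* Hypothetical helper: counts '.', '!', ';' and '?' in s, one char at a time. */
int count_terminators(const char *s)
{
    int count = 0;
    size_t length = strlen(s);              /* computed once, outside the loop */

    for (size_t i = 0; i < length; i++)     /* '<' keeps looping; '==' would not */
    {
        if (s[i] == '.' || s[i] == '!' || s[i] == ';' || s[i] == '?')
        {
            count++;                        /* increment only, no return here */
        }
    }
    return count;                           /* return once, after the whole scan */
}
```

Called with the string from the question, count_terminators(sentence) gives 4.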

Since the other answers already cover the main issues, here are a couple of other suggestions for your consideration.
Remove function calls, such as strlen() from within the for(;;) loop. You've already obtained the length with:
int length = strlen(sentence);
Now, just use it in your for loop:
for(i = 0; i < length ; i++)//includes replacement of == with <
Encapsulate the working part of your code, in this case the counting of words and sentences. The following uses a different approach, but it's the same idea:
//includes delimiters for common end-of-sentence punctuation:
int count_sentences(const char *buf)
{
const char *delim = {".?!"};//add additional 'end-of-sentence' punctuation as needed.
char *tok = NULL;
int count = 0;
char *dup = strdup(buf);//preserve original input buffer
if(dup)
{
tok = strtok(dup, delim);
while(tok)
{
count++;
tok = strtok(NULL, delim);
}
free(dup);
}
return count;
}
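Note that strdup() comes from POSIX and only became part of ISO C with C23, so it may not be available everywhere. As a hedged alternative, the sketch below counts the same tokens without copying or modifying the buffer, using strspn() and strcspn() from string.h. The name count_sentences_nocopy is an assumption made for this example.

```c
#include <string.h>

/* Hypothetical variant: counts the tokens strtok() would find,
   but leaves the input buffer untouched. */
int count_sentences_nocopy(const char *buf)
{
    const char *delim = ".?!";
    const char *p = buf;
    int count = 0;

    p += strspn(p, delim);          /* skip any leading terminators */
    while (*p != '\0')
    {
        count++;                    /* p is at the start of a token */
        p += strcspn(p, delim);     /* advance to the end of the token */
        p += strspn(p, delim);      /* skip the run of terminators after it */
    }
    return count;
}
```

As with strtok(), a run such as "..." or "?!" counts as a single separator, and any text left after the last terminator (even a lone space) counts as one more token.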
One additional idea, which is out of scope for your original code but very useful in practice, is to remove any parts of the buffer that are not part of the sentences, i.e. leading or trailing whitespace. In your example, the test case for your sentences is tightly defined within a string literal:
char sentence[] = "One fish. Two fish. Red fish. Blue fish.";
This is not incorrect, but what would happen if at some point your code were expected to work with string buffers not so neatly packaged, i.e. with leading or trailing whitespace characters in the buffer to be processed?
"\n\n\tOne fish. Two fish. Red fish. Blue fish.\f"
Removing unknown and unwanted content from the buffer prior to processing simplifies the code doing the work, in this case counting sentences. The following is a simple example of how this can be done.
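//usage:
//char *new = clear_leading_trailing_whitespace(sentence);
//note: buf is modified (a terminator is written into it), so it is not const
char * clear_leading_trailing_whitespace(char *buf)
{
char *new = buf;
//clear leading whitespace
while (isspace((unsigned char)*new))
{
new++;
}
//clear trailing whitespace
int len = strlen(new);
while (len > 0 && isspace((unsigned char)*(new + len-1)))
{
len--;
}
*(new + len) = 0;
return new;//points at the first non-whitespace character
}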
Next, the following code segment is intending to count words:
if (sentence[i] == '\0')
{
return word_count ++;
}
But after being initialized to 0, word_count would be incremented at most once, on seeing the null terminator \0, and a loop bounded by the string length never even reaches it. A word count is generally a count of the clusters of non-whitespace characters in a buffer. The following is a way to track this:
void countwords(const char *text, int *count)
{
bool reading_word = false; // Flag
int words = 0;
for(int i=0; i<strlen(text); i++)
{
if(isspace(text[i])) {
reading_word = false;
}
else if(isalpha(text[i])) {
if(!reading_word) {
reading_word = true;
words++;
}
}
}
*count = words;
}
Functions like this can be used to greatly simplify contents of the main function:
char sentence[] = "One fish. Two fish. Red fish. Blue fish.";
int main(void)
{
int sentence_count = 0;
int word_count = 0;
char *new = clear_leading_trailing_whitespace(sentence);
countwords(new, &word_count);
sentence_count = count_sentences(new);
...
printf("There are %d words in %d sentences.\n", word_count, sentence_count);
}

Related

Program to find the longest word in a string

I wrote a program to find the longest word in a string and print the number of letters in the longest word. But the code is not printing. I analyzed the program many times but I could not find the solution.
#include <stdio.h>
#include <string.h>
int main() {
char string[100] = "Hello Kurnool";
int i = 0, letters = 0, longest = 0;
start:
for (; string[i] !=' '; i++) {
letters++;
}
if (letters >= longest)
longest = letters;
if (string[i] == ' ') {
letters = 0;
i++;
goto start;
}
printf("%d", longest);
return 0;
}
Using goto is highly discouraged. You should convert your code to use a loop.
The main problem in your code is you do not stop the scan when you reach the end of the string.
Here is a modified version:
#include <stdio.h>
int main() {
char string[100] = "Hello Kurnool";
int i, letters, longest = 0, longest_pos = 0;
for (i = 0; string[i] != '\0'; i++) {
for (letters = 0; string[i] != '\0' && string[i] != ' '; i++) {
letters++;
}
if (letters > longest) {
longest = letters;
longest_pos = i - longest;
}
}
printf("longest word: %d letters, '%.*s'\n",
longest, longest, string + longest_pos);
return 0;
}
Note that the implementation can be simplified into a single loop:
#include <stdio.h>
int main() {
char string[100] = "Hello Kurnool";
int i, start = 0, longest = 0, longest_pos = 0;
for (i = 0; string[i] != '\0'; i++) {
if (string[i] == ' ') {
start = i + 1;
} else {
if (i - start > longest) {
longest = i - start;
longest_pos = start;
}
}
}
printf("longest word: %d letters, '%.*s'\n",
longest, longest, string + longest_pos);
return 0;
}
Below is my approach. You should use C's string manipulation functions. This is the correct way to deal with strings in C.
In the code below, I first allocate enough bytes on the heap to store the input string. Then I use strtok to split the string into tokens based on a delimiter and get the length of each substring. Finally, I free the space that I allocated with malloc.
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#define phrase "Hello Kurnool"
int main()
{
char* string = malloc(strlen(phrase)+1);
strcpy(string,phrase);
int longest=0;
char *token;
char delimeter[2] = " ";
/* get the first token */
token = strtok(string, delimeter);
/* walk through other tokens */
while( token != NULL ) {
printf( " %s\n", token );
if(longest < strlen(token)){
longest = strlen(token);
}
token = strtok(NULL, delimeter);
}
printf("%d",longest);
free(string);
return 0;
}
People say "don't use goto", but there is nothing inherently wrong with goto. The only issue is that if goto is not used judiciously, it makes code more difficult to understand and maintain, as in the way you have used it in your program (a loop is a perfect fit in such cases). Check these:
To use goto or not?
What is wrong with using goto?
Coming to your code, the for loop condition does not have check for terminating null character
for (; string[i] !=' '; i++) {
Hence it will not stop at the end of string.
To find the number of letters in longest word of string, you can do:
#include <stdio.h>
#include <string.h>
int main() {
char string[100] = "Hello Kurnool";
int i, letters = 0, longest = 0;
for (i = 0; string[i] != '\0'; i++) {
if (string[i] != ' ') {
letters++;
if (letters > longest) {
longest = letters;
}
} else {
letters = 0;
}
}
printf("longest : %d\n", longest);
return 0;
}
First of all, please avoid using goto; it is not good practice here.
Secondly, your loop runs past the end of the string on its second pass because:
for(;string[i]!=' ';i++) // Here string[i] will never be equal to ' ' as there is no white space after your last word.
With goto statements it is hard to predict what might be going wrong with your program, so it is rarely advisable to use them. Since your loop never finds a terminating condition, here is a solution to your problem:
#include<stdio.h>
#include<string.h>
int main(void)
{
char s[1000];
if(fgets(s,sizeof s,stdin)==NULL) // read a whole line, spaces included
{return 1;}
s[strcspn(s,"\n")]='\0'; // strip the trailing newline, if any
int i=0;
int letters=0;
int longest=0;
while(s[i]!='\0')
{
if(s[i]==' ')
{
if(letters>longest)
{longest=letters;}
letters=0;
}
else
{letters++;}
i++;
}
if(letters>longest) // the last word has no space after it
{longest=letters;}
printf("%d\n",longest);
return 0;
}
So, what I have done is read a line of input from the user into the string s. You iterate through s until you encounter the terminating null character. Since you are searching for the length of the longest word, you keep a variable letters that counts the number of letters in the current word. When the loop encounters a space, indicating the end of a word, you check whether letters is greater than longest and update longest if so. Then you reset letters to 0 so that it can start counting the next word from 0 again. The last word is not followed by a space, so the same check is repeated once after the loop. After that, the required output is stored in the variable longest.
So, this will print the number of letters in the longest word.

Counting the number of times a character appears in a string in c programming?

The for loop below continues until the end of the string, while the if branch checks to see how many times the character 'u' appears in the string, "yuzuf Oztuk", which is 3 times. Meanwhile, the variable count counts the number of u's in the string. When i compile the code, I get 15 for the number of times u appears in the string, which is wrong.
int numTimesAppears(char* mystring, char ch)
{
int i;
int count;
for(i = 0; mystring[i] != '\0' ; ++i)
{
if (mystring[i] == ch)
{
count++;
}
}
return count;
}
I get 15 for the number of times u appears in the string, which is wrong.
Key issue: Code needs to initialize the value of count. @BLUEPIXY
// int count;
int count = 0;
Corner case: As the null character is in the string, a result of 1 "for number of times a character appears in a string" would be expected for any numTimesAppears(some_string, '\0'). A do loop fixes that. A similar standard library function is strchr(), which looks for the first match and considers the null character part of the search-able string: "... terminating null character is considered to be part of the string." As with all corner cases, various results could be inferred - best to document the coding goal in this case.
i = 0;
do {
if (mystring[i] == ch) {
count++;
}
} while (mystring[i++]);
As the function does not modify the inspected string, making it const increases the function's applicability and perhaps performance. @Vlad from Moscow
Array indexing best uses size_t rather than int. int may be too narrow.
size_t numTimesAppears(const char* mystring, char ch) {
size_t count = 0;
size_t i = 0;
do {
if (mystring[i] == ch) {
count++;
}
} while (mystring[i++]);
return count;
}
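Since strchr() was mentioned above, here is a hedged sketch of the same count built on repeated strchr() calls. The name count_char_strchr is invented for this example, and treating the null character as appearing exactly once is a choice made to match the corner case described above.

```c
#include <string.h>

/* Hypothetical variant: counts occurrences of ch by hopping from
   match to match with strchr(). */
size_t count_char_strchr(const char *s, char ch)
{
    size_t count = 0;

    if (ch == '\0')
    {
        return 1;   /* the terminator is part of the string, exactly once */
    }
    for (const char *p = strchr(s, ch); p != NULL; p = strchr(p + 1, ch))
    {
        count++;    /* p points at a match; resume the search just after it */
    }
    return count;
}
```

For the string from the question, count_char_strchr("yuzuf Oztuk", 'u') gives 3.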

Comparing c strings to char using strcmp

I'm trying to count spaces using c strings. I can't use std strings. And on the line comparing the two chars I get the error 'invalid conversion from 'char' to 'const char*'.
I understand that I need to compare two const chars* but I'm not sure which one is which. I believe sentence[] is the char and char space[] is const char* is that right? And I need to use some sort of casting to convert the second but I'm not understanding the syntax I guess. Thanks for the help <3
int wordCount(char sentence[])
{
int tally = 0;
char space[2] = " ";
for(int i = 0; i > 256; i++)
{
if (strcmp(sentence[i], space) == 0)
{
tally++
}
}
return tally;
}
If you really want to count space characters, I think the following method would be better, since it checks where char array ends. A string terminator(\0) signals the end of the char array. I don't know why you hard-coded 256.
int countSpaceCharacters(char* sentence)
{
int count = 0;
int i = 0;
while (sentence[i] != '\0')
{
if (sentence[i] == ' ')
{
count++;
}
i++;
}
return count;
}
However, if you want to count words, as the original method name suggests, you need to think of a better way, because this algorithm would fail in imperfect situations such as consecutive space characters or punctuation marks that have no space character around them.
strcmp is used to compare two character strings, not a single character with a character string.
There is no function strcmp(char c, char*); it would not make sense.
If you want to search for a single character in a character string, just compare this character with each element using iteration:
int wordCount(char* sentence)
{
int tally = 0;
char space[2] = " ";
for(int i = 0; i < strlen(sentence); i++)
{
if (sentence[i] == space[0])
{
tally++;
}
}
return tally;
}
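To make the distinction concrete, the sketch below puts the two kinds of comparison side by side. Both function names are made up for illustration.

```c
#include <string.h>

/* A single char is compared with ==, against a character constant. */
int is_space_char(char c)
{
    return c == ' ';
}

/* A whole string is compared with strcmp(), against a string literal.
   This is true only when s is exactly one space followed by '\0'. */
int is_space_string(const char *s)
{
    return strcmp(s, " ") == 0;
}
```

Here sentence[i] is a char, so it belongs with the first form; passing it to strcmp() is what triggers the conversion error.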
Might I suggest this:
for(int i = 0; i < 256 && sentence[i] != '\0'; i++) { // stop at the end of the string
if (isspace((unsigned char)sentence[i])) // or sentence[i] == ' '
{
tally++;
}
}
What you are trying to do now is compare a char (sentence[i]) with a C string (space), which won't work.
Note that your implementation won't do what you expect for sentences like
"a big          space be in here."
So you need to think about what to do for multiple spaces.
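One possible way to handle runs of spaces is to count the places where a word starts rather than the spaces themselves. This is only a sketch: the name count_words and the in_word flag are assumptions, and punctuation is simply treated as part of a word.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts clusters of non-whitespace characters. */
int count_words(const char *sentence)
{
    int tally = 0;
    int in_word = 0;    /* 1 while we are inside a word */

    for (size_t i = 0; sentence[i] != '\0'; i++)
    {
        if (isspace((unsigned char)sentence[i]))
        {
            in_word = 0;        /* any amount of whitespace ends the word */
        }
        else if (!in_word)
        {
            in_word = 1;        /* first character of a new word */
            tally++;
        }
    }
    return tally;
}
```

With this approach, several consecutive spaces, or leading and trailing spaces, do not change the result.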

Attach a String to another String in C WITHOUT any spaces

This is my first post in this forum, so please be patient.
I need to write a short program where the user can enter 2 strings, which should then be concatenated.
I already have the code below (I am not allowed to use other "includes").
What I need to know is: how can I remove any spaces that the user enters?
Example: 1. String "Hello " | 2. String "World" Result should be "HelloWorld" instead of "Hello World".
#include <stdio.h>
void main()
{
char eingabe1[100];
char eingabe2[100];
int i = 0;
int j = 0;
printf("Gib zwei Wörter ein, die aneinander angehängt werden sollen\n");
printf("1. Zeichenkette: ");
gets(eingabe1);
printf("\n");
printf("2. Zeichenkette: ");
gets(eingabe2);
printf("\n");
while (eingabe1[i] != '\0')
{
i++;
}
while (eingabe2[j] != '\0')
{
eingabe1[i++] = eingabe2[j++];
}
eingabe1[i] = '\0';
printf("Nach Verketten: ");
puts(eingabe1);
}
You have to filter out the spaces as you copy your strings.
You have two string indices, i for the first string and j for the second string. You could make better use of these indices if you used i for the reading position (of both strings in turn; you can "reuse" loop counters in independent loops) and j for the writing position.
Here's how. Note that the code attempts to prevent buffer overflow by only adding characters if there is space in the string. This check needs only to be done when copying the second string, because j <= i when you process the first string.
#include <stdio.h>
int main()
{
char str1[100] = "The quick brown fox jumps over ";
char str2[100] = "my big sphinx of quartz";
int i = 0;
int j = 0;
while (str1[i] != '\0') {
if (str1[i] != ' ') str1[j++] = str1[i];
i++;
}
i = 0;
while (str2[i] != '\0') {
if (str2[i] != ' ' && j + 1 < sizeof(str1)) str1[j++] = str2[i];
i++;
}
str1[j] = '\0';
printf("'%s'\n", str1);
return 0;
}
In addition to avoiding spaces between your two words, you also have to avoid the newline ('\n') character placed in the input buffer when the user presses Enter. You can do that with a simple test after you have read the line with fgets(), NOT gets(). gets() was removed from the C standard library in C11 and should not be used, because it cannot prevent buffer overflow. In addition, fgets() provides simple control over the number of characters a user may enter at any time.
Below, you run into trouble when you read eingabe1. After reading with fgets(), eingabe1 contains a '\n' character at its end (as it would with any of the line-oriented input functions, e.g. getline(), fgets(), etc.). To handle the newline, after you loop over the string to find the nul character, you can simply check the character at index i - 1, e.g.:
if (eingabe1[i-1] == '\n') i--; /* remove trailing '\n', update i */
Reducing the index i this way (and doing the same for trailing spaces) guarantees that the concatenation with eingabe2 will not have any spaces or newline characters between the words.
Putting the pieces together, using fgets in place of the insecure gets, and defining a constant with #define MAX 100 to avoid hardcoding your array sizes, you could come up with something similar to:
#include <stdio.h>
#define MAX 100
int main (void)
{
char eingabe1[MAX] = {0};
char eingabe2[MAX] = {0};
int i = 0;
int j = 0;
printf("Gib zwei Wörter ein, die aneinander angehängt werden sollen\n");
printf("1. Zeichenkette: ");
/* do NOT use gets - it is no longer part of the C library */
fgets(eingabe1, MAX, stdin);
putchar ('\n');
printf("2. Zeichenkette: ");
/* do NOT use gets - it is no longer part of the C library */
fgets(eingabe2, MAX, stdin);
putchar ('\n');
while (eingabe1[i]) i++; /* set i (index) to terminating nul */
if (i > 0) {
if (eingabe1[i-1] == '\n') i--; /* remove trailing '\n' */
while (i && eingabe1[i-1] == ' ') /* remove trailing ' ' */
i--;
}
while (eingabe2[j]) { /* concatenate string - no spaces */
eingabe1[i++] = eingabe2[j++];
}
eingabe1[i] = 0; /* nul-terminate eingabe1 */
printf("Nach Verketten: %s\n", eingabe1);
return 0;
}
Output
$ ./bin/strcatsimple
Gib zwei Wörter ein, die aneinander angehängt werden sollen
1. Zeichenkette: Lars
2. Zeichenkette: Kenitsche
Nach Verketten: LarsKenitsche
Let me know if you have any further questions. I have highlighted the changes with comments above.
/**
return: the new len of the string;
*/
int removeChar(char* string, char c) {
int i, j;
int len = strlen(string)+1; // +1 to include '\0'
for(i = 0, j = 0 ; i < len ; i++){
if( string[i] == c )
continue; // avoid incrementing j and copying c
string[ j ] = string[ i ]; // shift characters
j++;
}
return j-1; // do not count '\0';
}
int main(){
char str1[] = "sky is flat ";
char str2[100] = "earth is small ";
strcat( str2, str1 );
printf("with spaces:\n\t'%s'\n", str2) ;
removeChar(str2, ' ');
printf("without spaces:\n\t'%s'\n", str2 );
}
/**
BONUS: this will remove many characters at once, eg "\n \r\t"
return: the new len of the string;
*/
int removeChars(char* string, char *chars) {
int i, j;
int len = strlen(string);
for(i = 0, j = 0 ; i < len ; i++){
if( strchr(chars,string[i]) )
continue; // avoid incrementing j and copying c
string[ j ] = string[ i ]; // shift characters
j++;
}
string[ j ]=0;
return j;
}
Thank you everyone for all the answers.
I have the solution now.
I read your advice and will try to remember it for the future.
See the code below:
(Excuse the strange variable names; I use German words.)
A few notes:
I am not allowed to use library functions
I am not allowed to use fgets, for some reason, as a trainee
#include <stdio.h>
void main()
{
char eingabe1[100];
char eingabe2[100];
int i = 0;
int j = 0;
printf("gib zwei wörter ein, die aneinander angehängt werden sollen\n");
printf("1. zeichenkette: ");
gets(eingabe1);
printf("\n");
printf("2. zeichenkette: ");
gets(eingabe2);
printf("\n");
//Attach Strings
while (eingabe1[i] != '\0')
{
i++;
}
while (eingabe2[j] != '\0')
{
eingabe1[i++] = eingabe2[j++];
}
//Remove Space
eingabe1[i] = '\0';
i = 0;
j = 0;
while (eingabe1[i] != '\0')
{
if (eingabe1[i] != 32)
{
eingabe2[j++] = eingabe1[i];
}
i++;
}
eingabe2[j] = '\0';
printf("Nach verketten: ");
puts(eingabe2);
}
Sounds like homework to me.
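I just wanted to mention that you should be careful with sizeof() on strings: it gives the size of the array in bytes (or the size of a pointer, if that is what you apply it to), not the length of the text stored in it. Use strlen() for the length. sizeof() is appropriate when you need the capacity of a buffer, for example when checking bounds or when you're going to malloc() a certain number of bytes to store it.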
I write little loops fairly often to do low level text stuff one character at a time, just be aware that strings in C usually have a 0 byte at the end. You have to expect to encounter one and be sure you put one on the output. Space is 0x20 or decimal 32 or ' ', it's just another character.
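As an illustration of such a loop, here is a sketch that joins two strings while skipping spaces and respecting the size of the output buffer. The function name concat_no_spaces and its parameters are invented for this example; it uses no library functions, only size_t from stddef.h.

```c
#include <stddef.h>

/* Hypothetical helper: writes a followed by b into out, skipping spaces,
   never using more than out_size bytes. Returns the length written. */
size_t concat_no_spaces(char *out, size_t out_size, const char *a, const char *b)
{
    const char *parts[2] = { a, b };
    size_t j = 0;                       /* writing position in out */

    if (out_size == 0)
    {
        return 0;                       /* no room even for the 0 byte */
    }
    for (int p = 0; p < 2; p++)
    {
        for (size_t i = 0; parts[p][i] != '\0'; i++)
        {
            /* copy non-space characters while one byte stays free for '\0' */
            if (parts[p][i] != ' ' && j + 1 < out_size)
            {
                out[j++] = parts[p][i];
            }
        }
    }
    out[j] = '\0';                      /* always put the 0 byte on the output */
    return j;
}
```

With a buffer declared as char out[32], the call concat_no_spaces(out, sizeof out, "Hello ", "World") leaves "HelloWorld" in out. Passing sizeof out works here because out is an array, so sizeof gives its capacity.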

Program runs too slowly with large input - C

The goal for this program is for it to count the number of instances that two consecutive letters are identical and print this number for every test case. The input can be up to 1,000,000 characters long (thus the size of the char array to hold the input). The website which has the coding challenge on it, however, states that the program times out at a 2s run-time. My question is, how can this program be optimized to process the data faster? Does the issue stem from the large char array?
Also: I get a compiler warning "assignment makes integer from pointer without a cast" for the line str[1000000] = "" What does this mean and how should it be handled instead?
Input:
number of test cases
strings of capital A's and B's
Output:
Number of duplicate letters next to each other for each test case, each on a new line.
Code:
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
int n, c, a, results[10] = {};
char str[1000000];
scanf("%d", &n);
for (c = 0; c < n; c++) {
str[1000000] = "";
scanf("%s", str);
for (a = 0; a < (strlen(str)-1); a++) {
if (str[a] == str[a+1]) { results[c] += 1; }
}
}
for (c = 0; c < n; c++) {
printf("%d\n", results[c]);
}
return 0;
}
You don't need the line
str[1000000] = "";
scanf() adds a null terminator when it parses the input and writes it to str. This line is also writing beyond the end of the array, since the last element of the array is str[999999].
The reason you're getting the warning is that the type of str[1000000] is char, but a string literal is an array of char that decays to char* in this assignment.
To speed up the program, take the call to strlen() out of the loop.
size_t len = strlen(str); /* computed once per test case */
for (a = 0; a + 1 < len; a++) { /* a + 1 < len avoids len - 1 wrapping around when len is 0 */
...
}
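Another way to avoid calling strlen() at all is to stop at the terminating null character. When strlen() sits in the loop condition, the string may be rescanned on every iteration, which is quadratic work for a 1,000,000 character input. A minimal sketch, with count_adjacent_pairs as a made-up name:

```c
#include <stddef.h>

/* Hypothetical helper: counts positions where a character equals the
   one right before it, in a single pass over the string. */
size_t count_adjacent_pairs(const char *s)
{
    size_t pairs = 0;

    if (s[0] == '\0')
    {
        return 0;                   /* empty string: nothing to compare */
    }
    for (size_t i = 1; s[i] != '\0'; i++)
    {
        if (s[i] == s[i - 1])
        {
            pairs++;
        }
    }
    return pairs;
}
```

Note that this counts "AAAA" as 3 pairs.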
str[1000000] = "";
This does not do what you think it does, and it overflows the buffer, which results in undefined behaviour. Valid indexes range from 0 to sizeof(str) - 1. So you either add one to the 1000000 in the array declaration or use index 999999 instead. To get rid of the compiler warning and produce cleaner code, assign a character rather than a string:
str[1000000] = '\0';
Or
str[999999] = '\0';
Depending on what you did to fix it.
As to optimizing, you should look at the assembly and go from there.
count the number of instances that two consecutive letters are identical and print this number for every test case
For efficiency, code needs a new approach as suggested by @john bollinger & @molbdnilo
void ReportPairs(const char *str, size_t n) {
int previous = EOF;
unsigned long repeat = 0;
for (size_t i=0; i<n; i++) {
int ch = (unsigned char) str[i];
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
}
char *testcase1 = "test1122a33";
ReportPairs(testcase1, strlen(testcase1));
or directly from input and "each test case, each on a new line."
int ReportPairs2(FILE *inf) {
int previous = EOF;
unsigned long repeat = 0;
int ch;
while ((ch = fgetc(inf)) != '\n') {
if (ch == EOF) return ch;
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
return ch;
}
while (ReportPairs2(stdin) != EOF);
Unclear how OP wants to count "AAAA" as 2 or 3. This code counts it as 3.
One way to dramatically improve the run-time of your code is to limit the number of times you read from stdin (basically, process input in bigger chunks). You can do this in a number of ways, but probably one of the most efficient is with fread. Even reading in 8-byte chunks can provide a big improvement over reading a character at a time. One example of such an implementation, considering capital letters [A-Z] only, would be:
#include <stdio.h>
#define RSIZE 8
int main (void) {
char qword[RSIZE] = {0};
char last = 0;
size_t i = 0;
size_t nchr = 0;
size_t dcount = 0;
/* read up to 8-bytes at a time */
while ((nchr = fread (qword, sizeof *qword, RSIZE, stdin)))
{ /* compare each byte to byte before */
for (i = 1; i < nchr && qword[i] && qword[i] != '\n'; i++)
{ /* if not [A-Z] continue, else compare */
if (qword[i-1] < 'A' || qword[i-1] > 'Z') continue;
if (i == 1 && last == qword[i-1]) dcount++;
if (qword[i-1] == qword[i]) dcount++;
}
last = qword[i-1]; /* save last for comparison w/next */
}
printf ("\n sequential duplicated characters [A-Z] : %zu\n\n",
dcount);
return 0;
}
Output/Time with 868789 chars
$ time ./bin/find_dup_digits <dat/d434839c-d-input-d4340a6.txt
sequential duplicated characters [A-Z] : 434893
real 0m0.024s
user 0m0.017s
sys 0m0.005s
Note: the string was actually a string of '0's and '1's run with a modified test of if (qword[i-1] < '0' || qword[i-1] > '9') continue; rather than the test for [A-Z]...continue, but your results with 'A's and 'B's should be virtually identical. 1000000 would still be significantly under .1 seconds. You can play with the RSIZE value to see if there is any benefit to reading a larger (suggested 'power of 2') size of characters. (note: this counts AAAA as 3) Hope this helps.

Resources