Read text letter by letter without strings

What would be the best way to read text from a user and then count the letters from the text one by one?
For example, the user enters
Hello World
The program would record in an array that
{0,0,0,1,1,0,0,1,0,0,0,3,0,0,2,0,0,1,0,0,0,0,1,0,0,0}
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
As the title says NO STRINGS!
In my attempts, I am trying to use the ascii table for a more efficient method instead of comparing each of the user inputs to every letter of the alphabet.
EDIT: How do I run a loop over all the input characters without using a string?

You don't have to compare each user input to every letter of the alphabet.
All you need to do is create an array of size 26 for the 26 English letters (assuming you only use uppercase characters). Initialize all the array elements to 0. Run a loop over all the input characters and subtract 65 (the ASCII value of 'A') from the ASCII value of each character; this gives you the position of that character in the array, and you increment the value there by 1.

You can have an array of 26 integers and increment the element whose index is derived from each character's ASCII value.
Example:
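int counter[26] = {0}; // the counters must start at zero
char buffer[256];
if (fgets(buffer, sizeof buffer, stdin) != NULL)
{
    // walk the buffer up to the terminating '\0'
    for (size_t i = 0; buffer[i] != '\0'; i++)
    {
        if (buffer[i] >= 'A' && buffer[i] <= 'Z')
            counter[buffer[i] - 'A']++;
        else if (buffer[i] >= 'a' && buffer[i] <= 'z')
            counter[buffer[i] - 'a']++;
    }
}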

The basic idea is to use a frequency table, which stores how often each character appears in the input. This happens in the first part of the below code.
In the second half, the code prints how often each of the interesting characters appears. This part does NOT assume that the letters appear in a single block in the character set. Therefore it also works on EBCDIC computers. It calculates the sum of the uppercase and lowercase frequencies and outputs that.
#include <stdio.h>
int main(void) {
    int freq[256] = {0}; // initializes the whole array to 0; only works with 0
    int ch;
    // fgetc returns the character as an unsigned char value, or EOF
    while ((ch = fgetc(stdin)) != EOF) {
        freq[ch]++;
    }
    const char *upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *lower = "abcdefghijklmnopqrstuvwxyz";
    for (int i = 0; upper[i] != '\0'; i++) {
        printf("character %c appears %5d times\n",
               upper[i],
               freq[(unsigned char)upper[i]] + freq[(unsigned char)lower[i]]);
    }
}

Related

Do char's in C have pre-assigned zero indexed values?

Sorry if my title is a little misleading, I am still new to a lot of this but:
I recently worked on a small cipher project where the user can give the file an argument at the command line, but it must be alphabetical. (Ex: ./file abc)
This argument will then be used in a formula to encipher a message of plain text you provide. I got the code to work, thanks to my friend for helping, but I'm not 100% sure about a specific part of this formula.
#include <stdio.h>
#include <cs50.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
int main (int argc, string argv[])
{ //Clarify that the argument count is not larger than 2
if (argc != 2)
{
printf("Please Submit a Valid Argument.\n");
return 1;
}
//Store the given argument (our key) inside a string var 'k' and check if it is alpha
string k = (argv[1]);
//Store how long the key is
int kLen = strlen(k);
//Tell the user we are checking their key
printf("Checking key validation...\n");
//Pause the program for 2 seconds
sleep(2);
//Check to make sure the key submitted is alphabetical
for (int h = 0, strlk = strlen(k); h < strlk; h++)
{
if isalpha(k[h])
{
printf("Character %c is valid\n", k[h]);
sleep(1);
}
else
{ //Telling the user the key is invalid and returning them to the console
printf("Key is not alphabetical, please try again!\n");
return 0;
}
}
//Store the users soon to be enciphered text in a string var 'pt'
string pt = get_string("Please enter the text to be enciphered: ");
//A prompt that the encrypted text will display on
printf("Printing encrypted text: ");
sleep(2);
//Encipher Function
for(int i = 0, j = 0, strl = strlen(pt); i < strl; i++)
{
//Get the letter 'key'
int lk = tolower(k[j % kLen]) - 'a';
//If the char is uppercase, run the V formula and increment j by 1
if isupper(pt[i])
{
printf("%c", 'A' + (pt[i] - 'A' + lk) % 26);
j++;
}
//If the char is lowercase, run the V formula and increment j by 1
else if islower(pt[i])
{
printf("%c", 'a' + (pt[i] - 'a' + lk) % 26);
j++;
}
//If the char is a symbol just print said symbol
else
{
printf("%c", pt[i]);
}
}
printf("\n");
printf("Closing Script...\n");
return 0;
}
The Encipher Function:
Uses 'A' as a char for the placeholder but does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)

In C, character literals like 'A' are of type int, and represent whatever integer value encodes the character A on your system. On the 99.999...% of systems that use ASCII character encoding, that's the number 65. If you have an old IBM mainframe from the 1970s using EBCDIC, it might be something else. You'll notice that the code is subtracting 'A' to make 0-based values.
This does make the assumption that the letters A-Z occupy 26 consecutive codes. This is true of ASCII (A=65, B=66, etc.), but not of all codes, and not guaranteed by the language.

does 'A' hold a zero indexed value automatically? (B = 1, C = 2, ...)
No. Strictly conforming C code cannot depend on any property of the character encoding other than the digits 0-9 being represented consecutively, even though the common ASCII character set does also represent the letters consecutively.
The only guarantee regarding character sets is per 5.2.1 Character sets, paragraph 3 of the C standard:
... the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous...
Character sets such as EBCDIC don't represent letters consecutively.

char is a numeric type that happens to also often be used to represent visible characters (or special non-visible pseudo-characters). 'A' is a value (with actual type int) that can be converted to a char without overflow or underflow. That is, it's really some number, but you usually don't need to know what number, since you generally use a particular char value either as just a number or as just a character, not both.
But this program is using char values in both ways, so it somewhat does matter what the numeric values corresponding to visible characters are. One way it's very often done, but not always, is using the ASCII values which are numbered 0 to 127, or some other scheme which uses those values plus more values outside that range. So for example, if the computer uses one of those schemes, then 'A'==65, and 'A'+1==66, which is 'B'.
This program is assuming that all the lowercase Latin-alphabet letters have numeric values in consecutive order from 'a' to 'z', and all the uppercase Latin-alphabet letters have numeric values in consecutive order from 'A' to 'Z', without caring exactly what those values are. This is true of ASCII, so it will work on many kinds of machines. But there's no guarantee it will always be true!
C does guarantee the ten digit characters from '0' to '9' are in consecutive order, which means that if n is a digit number from zero to nine inclusive, then n + '0' is the character for displaying that digit, and if c is such a digit character, then c - '0' is the number from zero to nine it represents. But that's the only guarantee the C language makes about the values of characters.
For one counter-example, see EBCDIC, which is not in much use now, but was used on some older computers, and C supports it. Its alphabetic characters are arranged in clumps of consecutive letters, but not with all 26 letters of each case all together. So the program would give incorrect results running on such a computer.

Sequentiality is only one aspect of concern.
Proper use of isalpha(ch) is another, not quite implemented properly in OP's code.
isalpha(ch) expects a ch in the range of unsigned char or EOF. With k[h], a char, that value could be negative. Ensure a non-negative value with:
// if isalpha(k[h])
if (isalpha((unsigned char) k[h]))

How to split string (character) and variable in 1 line on C?

How can I split character and variable in 1 line?
Example
INPUT
car1900food2900ram800
OUTPUT
car 1900
food 2900
ram 800
Code
char namax[25];
int hargax;
scanf ("%s%s",&namax,&hargax);
printf ("%s %s",namax,hargax);
If I use code like that, I have to press Enter twice or type a space to get the output. How can I split the input without that?

You should be able to use code like this to read one name and number:
if (scanf("%24[a-zA-Z]%d", namax, &hargax) == 2)
…got name and number OK…
else
…some sort of problem to be reported and handled…
You would need to wrap that in a loop of some sort in order to get three pairs of values. Note that using &namax as an argument to scanf() is technically wrong. The %s, %c and %[…] (scan set) notations all expect a char * argument, but you are passing a char (*)[25] which is quite different. A fortuitous coincidence means you usually get away with the abuse, but it is still not correct and omitting the & is easy (and correct).
You can find details about scan sets etc in the POSIX specification of scanf().
You should consider reading a whole line of input with fgets() or POSIX getline(), and then processing the resulting string with sscanf(). This makes error reporting and error recovery easier. See also How to use sscanf() in loops.

Since you are asking this question, which is actually easy, I presume you are somewhat of a beginner in C programming. So instead of trying to split the input while reading it, which seems a bit too complicated for someone who is new to C programming, I would suggest something simpler (though not efficient when you take memory into account).
Just accept the entire input as a string. Then scan the string for digits and letters; I have used their ASCII values for the checks. If you find a digit followed by a letter (or by the end of the string), print the part of the string from the last such occurrence up to the current point. While printing that part, do the same kind of check with a slight tweak: instead of checking for a digit followed by a letter, check for a letter followed by a digit, and at that point print a tab as the separator.
just so that you know:
ASCII value of digits (0-9) => 48 to 57
ASCII value of uppercase alphabet (A-Z) => 65 to 90
ASCII value of lowercase alphabets (a-z) => 97 to 122
Here is the code:
#include<stdio.h>
#include<string.h>
int main() {
char s[100];
int i, len, j, k = 0, x;
printf("\nenter the string:");
scanf("%99s",s); /* the width keeps the input inside s[100] */
len = strlen(s);
for(i = 0; i < len; i++){
if(((int)s[i]>=48)&&((int)s[i]<=57)) {
if((((int)s[i+1]>=65)&&((int)s[i+1]<=90))||(((int)s[i+1]>=97)&&((int)s[i+1]<=122))||(i==len-1)) {
for(j = k; j < i+1; j++) {
if(((int)s[j]>=48)&&((int)s[j]<=57)) {
if((j>0)&&((((int)s[j-1]>=65)&&((int)s[j-1]<=90))||(((int)s[j-1]>=97)&&((int)s[j-1]<=122)))) { /* j>0 avoids reading s[-1] */
printf("\t");
}
}
printf("%c",s[j]);
}
printf("\n");
k = i + 1;
}
}
}
return(0);
}
the output:
enter the string: car1900food2900ram800
car 1900
food 2900
ram 800

In addition to using a character class to include the characters to read as a string, you can also use the character class to exclude digits which would allow you to scan forward in the string until the next digit is found, taking all characters as your name and then reading the digits as an integer. You can then determine the number of characters consumed so far using the "%n" format specifier and use the resulting number of characters to offset your next read within the line, e.g.
char namax[MAXNM],
*p = buf;
int hargax,
off = 0;
while (sscanf (p, "%24[^0-9]%d%n", namax, &hargax, &off) == 2) {
printf ("%-24s %d\n", namax, hargax);
p += off;
}
Note how the sscanf format string will read up to 24 characters that are not digits as namax and then the integer that follows as hargax, storing the number of characters consumed in off, which is then applied to the pointer p to advance within the buffer in preparation for your next parse with sscanf.
Putting it all together in a short example, you could do:
#include <stdio.h>
#define MAXNM 25
#define MAXC 1024
int main (void) {
char buf[MAXC] = "";
while (fgets (buf, MAXC, stdin)) {
char namax[MAXNM],
*p = buf;
int hargax,
off = 0;
while (sscanf (p, "%24[^0-9]%d%n", namax, &hargax, &off) == 2) {
printf ("%-24s %d\n", namax, hargax);
p += off;
}
}
}
Example Use/Output
$ echo "car1900food2900ram800" | ./bin/fgetssscanf
car 1900
food 2900
ram 800

Word count program - stdin

For below question,
Write a program to read English text to end-of-data (type control-D to indicate end of data at a terminal, see below for detecting it), and print a count of word lengths, i.e. the total number of words of length 1 which occurred, the number of length 2, and so on.
Define a word to be a sequence of alphabetic characters. You should allow for word lengths up to 25 letters.
Typical output should be like this:
length 1 : 10 occurrences
length 2 : 19 occurrences
length 3 : 127 occurrences
length 4 : 0 occurrences
length 5 : 18 occurrences
....
To read characters to end of data see above question.
Here is my working solution,
#include<stdio.h>
int main(void){
char ch;
short wordCount[20] = {0};
int count = 0;
while(ch = getchar(), ch >= 0){
if(ch == ' ' || ch == ',' || ch == ';'|| ch == ':'|| ch == '.'|| ch == '/'){
wordCount[count]++;
count=0;
}else{
count++;
}
}
wordCount[count]++; // Incrementing here looks weird to me
for(short i=1; i< sizeof(wordCount)/sizeof(short); i++){
printf("\nlength %d : %d occurences",i, wordCount[i]);
}
}
Question:
1)
From code elegance aspect, Can I avoid incrementing(++) wordCount outside while loop?
2)
Can I make wordCount array size more dynamic based on word size, rather than constant size 20?
Note: Learnt about struct but am yet to learn dynamic structures like Linkedlist

For the dynamic allocations you can start with space for 20 shorts (although the problem statement appears to ask for you to allow for words up to 25 characters):
short maxWord = 20;
short *wordCount = calloc(maxWord, sizeof(*wordCount)); // calloc zeroes the counters
Then, when you increment count you can allocate more space if the current word is longer than can be counted in your dynamic array:
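} else {
    count++;
    if (count >= maxWord) {
        maxWord++;
        // realloc takes the old pointer first, then the new size in bytes
        wordCount = realloc(wordCount, sizeof(*wordCount) * maxWord);
        wordCount[maxWord - 1] = 0; // the new counter starts at zero
    }
}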
Don't forget to free(wordCount) when you are done.
Since you don't need to count zero-length words, you might consider modifying your code so that wordCount[0] stores the number of words of length 1, and so on.

To 1):
maybe scan from one delimiting character to the next until you increment wordCount. Make EOF a delimiting character as well.
To 2)
you can scan the file twice and then decide how much memory you need. Or you dynamically realloc whenever more memory is needed. This is something the C++ std::vector class does internally, for example.
Also you should think about what happens if there are two delimiting characters after one another. Right now you would count this as a word (of length 0).

Character frequency histogram in C

I read this program, but I'm not able to understand it. Please explain what exactly is happening in the length[] array. How can it be used to store different types of characters, i.e. both digits and letters? Following is the code:
#include <stdio.h>
#define EOL '\n'
#define ARYLEN 256
main()
{
int c, i, x;
int length[ARYLEN];
for(x = 0; x < ARYLEN;x++)
length[x] = 0;
while( (c = getchar() ) != EOL)
{
length[c]++;
if (c == EOL)
break;
}
for(x = 0; x < ARYLEN; x++)
{
if( length[x] > 0){
printf("%c | ", x);
for(i = 1; i <= length[x]; ++i){
printf("*");
}
printf("\n");
}
}
}

The array doesn't store any characters (at least conceptually). It stores the number of times the program has encountered a character with the numerical value c in the array position of index c.
Basically, in the C programming language, a char is a data type that occupies at least 8 bits (exactly 8 on most systems). With 8 bits, it can hold values in the range 0 to 255 for an unsigned char or -128 to 127 for a signed char.
The program then defines an array large enough to hold as many different values as it is possible to represent using a char, one array position for each unique value.
Then it counts the number of occurrences using the appropriate array position, length[c], as a counter for that specific value. As it loops over the array to print out the data, it can tell which character the data belongs to just by looking at the current index inside the loop, so x in printf("%c | ", x); is the character while length[x] is the data we're after.

In your code the integer array length[] is not used to store characters. It is only used to store the count of each character being typed. The characters are read one by one into the variable c in while( (c = getchar() ) != EOL).
But the tricky part is length[c]++;. The count of each character is kept at the location equal to its ASCII value in the array length[].
For example in a system using ASCII codes, length[65] contains the count of A, because 65 is the ASCII code for A.
length[66] contains the count of B, because 66 is the ASCII code for B.
length[97] contains the count of a, because 97 is the ASCII code for a.
length[48] contains the count of 0, because 48 is the ASCII code for 0.

Converting Character Array to Integer Array in C for ISBN Validation

I really hope someone can give a well explained example. I've been searching everywhere but can't find a proper solution.
I am taking an introduction to C Programming class, and our last assignment is to write a program which validates a 10 digit ISBN with dashes. The ISBN is inputted as a string in a CHAR array. From there I need to separate each digit and convert them into an integer, so I can calculate the validity of the ISBN. On top of that, the dashes need to be ignored.
My thought process was to create an INT array and then use a loop to store each character into the array, and pass it through the atoi() function. I also tried using an IF statement to check each part of the CHAR array to see if it found a dash. If it did find one, it would skip to the next spot in the array. It looked something like this:
int num[12], i = 0, j = 0, count = 0;
char isbn[12];
printf ("Enter an ISBN to validate: ");
scanf ("%13[0-9Xx-]%*c", &isbn);
do {
if (isbn[i] == '-') {
i++;
j++;
}
else {
num[i]= atoi(isbn[j]);
i++;
j++;
}
count++;
} while (count != 10);
But that creates a segmentation fault, so I can't even tell if my IF statement has actually filtered the dashes....
If someone could try and solve this I'd really appreciate that. The Assignment was due Dec 4th, however I got an extension until Dec 7th, so I'm pressed for time.
Please write out the code in your explanation. I'm a visual learner, and need to see step by step.
There's obviously a lot more that needs to be coded, but I can't move ahead until I get over this obstacle.
Thanks in advance!

First of all, your definition of isbn is not sufficient to hold 13 characters; it should therefore be 14 chars long (to also store the terminating '\0').
Second, your loop is overly complicated; three loop variables that maintain the same value is redundant.
Third, the loop is not safe, because a string might be as short as one character, but your code happily loops 10 times.
Lastly, a char that holds the ASCII value of a digit can be converted to the digit's numeric value by simply subtracting '0' from it.
This is the code after above improvements have been made.
#include <stdio.h>
int main(void)
{
int num[14], i;
char isbn[14], *p;
printf("Enter an ISBN to validate: ");
scanf("%13[0-9Xx-]%*c", isbn); // isbn already decays to char *, so no & is needed
// p iterates over each character of isbn
// *p evaluates the value of each character
// the loop stops when the end-of-string is reached, i.e. '\0'
for (p = isbn, i = 0; *p; ++p) {
if (*p == '-' || *p == 'X' || *p == 'x') {
continue;
}
// it's definitely a digit now
num[i++] = *p - '0';
}
// post: i holds number of digits in num
// post: num[x] is the digit value, for 0 <= x < i
return 0;
}
