Count Different Character Types In String - c

I wrote a program that counts and prints the number of occurrences of each character in a string, but it prints a garbage value when I use fgets(). With gets() it does not.
Here is my code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main() {
char c[1005];
fgets(c, 1005, stdin);
int cnt[26] = {0};
for (int i = 0; i < strlen(c); i++) {
cnt[c[i] - 'a']++;
}
for (int i = 0; i < strlen(c); i++) {
if(cnt[c[i]-'a'] != 0) {
printf("%c %d\n", c[i], cnt[c[i] - 'a']);
cnt[c[i] - 'a'] = 0;
}
}
return 0;
}
This is what I get when I use fgets():
baaaabca
b 2
a 5
c 1
32767
--------------------------------
Process exited after 8.61 seconds with return value 0
Press any key to continue . . . _
I fixed it by using gets() and got the correct result, but I still don't understand why fgets() gives the wrong result.

Hurray! So, the most important reason your code is failing is that your code does not observe the following inviolable advice:
Always sanitize your inputs
What this means is that if you let the user input anything then he/she/it can break your code. This is a major, common source of problems in all areas of computer science. It is so well known that a NASA engineer has given us the tale of Little Bobby Tables:
Exploits of a Mom #xkcd.com
It is always worth reading the explanation even if you get it already #explainxkcd.com
medium.com wrote an article about “How Little Bobby Tables Ruined the Internet”
Heck, Bobby’s even got his own website — bobby-tables.com
Okay, so, all that stuff is about SQL injection, but the point is, validate your input before blithely using it. There are many, many examples of C programs that fail because they do not carefully manage input. One of the most recent and widely known is the Heartbleed Bug.
For more fun side reading, here is a superlatively-titled list of “The 10 Worst Programming Mistakes In History” #makeuseof.com — a good number of which were caused by failure to process bad input!
Academia, methinks, often fails students by not having an entire course on just input processing. Instead we tend to pretend that the issue will be later understood and handled — code in academia, science, online competition forums, etc, often assumes valid input!
Where your code went wrong
Using gets() is dangerous because it does not stop reading and storing input as long as the user is supplying it. It has created so many software vulnerabilities that the C Standard has (at long last) officially removed it from C. SO actually has an excellent post on it: Why is the gets function so dangerous that it should not be used?
But it does remove the Enter key from the end of the user’s input!
fgets(), in contrast, stops reading input at some point! However, it also lets you know whether you actually got an entire line of text by not removing that Enter key.
Hence, assuming the user types: b a n a n a Enter
gets() returns the string "banana"
fgets() returns the string "banana\n"
That newline character '\n' (what you get when the user presses the Enter key) messes up your code because your code only accepts (or works correctly given) minuscule alphabet letters!
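If all you want is the gets() behaviour back, one common idiom is to cut the string at the newline yourself. A minimal sketch (chomp is a made-up helper name, not a standard function):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut the string at the first '\n', if there is one.
   chomp is a hypothetical helper name used only for this sketch. */
static void chomp(char *s)
{
    assert(s != NULL);
    /* strcspn returns the index of the first '\n', or the string
       length when there is none, so this write is always in bounds. */
    s[strcspn(s, "\n")] = '\0';
}
```

Calling chomp(c) right after fgets(c, 1005, stdin) turns "banana\n" into "banana". That only deals with the newline; any other character outside 'a'..'z' would still index outside cnt[].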
The Fix
The fix is to reject anything that your algorithm does not like. The easiest way to recognize “good” input is to have a list of it:
// Here is a complete list of VALID INPUTS that we can histogram
//
const char letters[] = "abcdefghijklmnopqrstuvwxyz";
Now we want to create a mapping from each letter in letters[] to an array of integers (its name doesn’t matter, but we’re calling it count[]). Let’s wrap that up in a little function:
// Here is our mapping of letters[] ←→ integers[]
// • supply a valid input → get an integer unique to that specific input
// • supply an invalid input → get an integer shared with ALL invalid input
//
int * histogram(char c) {
static int fooey; // number of invalid inputs
static int count[sizeof(letters)] = {0}; // numbers of each valid input 'a'..'z'
const char * p = strchr(letters, c); // find the valid input, else NULL
if (p) {
int index = p - letters; // 'a'=0, 'b'=1, ... (same order as in letters[])
return &count[index]; // VALID INPUT → the corresponding integer in count[]
}
else return &fooey; // INVALID INPUT → returns a dummy integer
}
For the more astute among you, this is rather verbose: we can totally get rid of those fooey and index variables.
“Okay, okay, that’s some pretty fancy stuff there, mister. I’m a bloomin’ beginner. What about me, huh?”
Easy. Just check that your character is in range:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
return &fooey;
}
“But EBCDIC...!”
Fine. The following will work with both EBCDIC and ASCII:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'i')) return &count[ 0 + c - 'a'];
if (('j' <= c) && (c <= 'r')) return &count[ 9 + c - 'j'];
if (('s' <= c) && (c <= 'z')) return &count[18 + c - 's'];
return &fooey;
}
You will honestly never have to worry about any other character encoding for the Latin minuscules 'a'..'z'. Prove me wrong.
Back to main()
Before we forget, stick the required magic at the top of your program:
#include <stdio.h>
#include <string.h>
Now we can put our fancy-pants histogram mapping to use, without the possibility of undefined behavior due to bad input.
int main() {
// Ask for and get user input
char s[1005];
printf("s? ");
fgets(s, 1005, stdin);
// Histogram the input
for (int i = 0; i < strlen(s); i++) {
*histogram(s[i]) += 1;
}
// Print out the histogram, not printing zeros
for (int i = 0; i < strlen(letters); i++) {
if (*histogram(letters[i])) {
printf("%c %d\n", letters[i], *histogram(letters[i]));
}
}
return 0;
}
We make sure to read and store no more than 1004 characters (plus the terminating nul), and we prevent unwanted input from indexing outside of our histogram’s count[] array! Win-win!
s? a - ba na na !
a 4
b 1
n 2
But wait, there’s more!
We can totally reuse our histogram. Check out this little function:
// Reset the histogram to all zeros
//
void clear_histogram(void) {
for (const char * p = letters; *p; p++)
*histogram(*p) = 0;
}
All this stuff is not obvious. User input is hard. But you will find that it doesn’t have to be impossibly difficult genius-level stuff. It should be entertaining!
Another way to handle input is to transform it into acceptable values. For example, you can use tolower() to convert any majuscule letters to your histogram’s input set.
s? ba na NA!
a 3
b 1
n 2
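A rough sketch of that tolower() idea, written as a standalone function rather than on top of histogram(). The name count_letters_nocase and the counts[26] layout are assumptions made for illustration, and the range check assumes 'a'..'z' are contiguous, as in ASCII:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count letters case-insensitively into counts[0..25] ('a' -> 0, ...).
   Anything that is not a letter after lowering is skipped.
   Assumes 'a'..'z' are contiguous, as in ASCII. */
static void count_letters_nocase(const char *s, int counts[26])
{
    assert(s != NULL && counts != NULL);
    for (; *s != '\0'; s++) {
        /* The cast avoids undefined behaviour for negative char values. */
        int lower = tolower((unsigned char)*s);
        if (lower >= 'a' && lower <= 'z')
            counts[lower - 'a']++;
    }
}
```

The caller is expected to zero counts[] first. The cast to unsigned char matters: passing a negative char to tolower() is undefined behaviour.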
But I digress again...
Hang in there!

Related

Function to Split a String into Letters and Digits in C

I'm pretty new to C, and I'm trying to write a function that takes a user input RAM size in B, kB, mB, or gB, and determines the address length. My test program is as follows:
int bitLength(char input[6]) {
char nums[4];
char letters[2];
for(int i = 0; i < (strlen(input)-1); i++){
if(isdigit(input[i])){
memmove(&nums[i], &input[i], 1);
} else {
//memmove(&letters[i], &input[i], 1);
}
}
int numsInt = atoi(nums);
int numExponent = log10(numsInt)/log10(2);
printf("%s\n", nums);
printf("%s\n", letters);
printf("%d", numExponent);
return numExponent;
}
This works correctly as it is, but only because I have that one line commented out. When I try to alter the 'letters' character array with that line, it changes the 'nums' character array to '5m2'
My string input is '512mB'
I need the letters to be able to tell if the user input is in B, kB, mB, or gB.
I am confused as to why the commented out line alters the 'nums' array.
Thank you.
In your input 512mB, the characters "mB" are not digits, so they are supposed to be handled by the commented-out code. When those characters are handled, i is 3 and 4. But letters has only 2 elements, so memmove(&letters[i], &input[i], 1); writes to letters[i] outside the bounds of the array. That is undefined behaviour; in this case it happens to write into the memory of the nums array.
To fix it, keep a separate index for letters. Better still, keep one for both nums and letters, since i is the index into input.
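The separate-index idea could look roughly like this. It is a sketch, not the asker's final function: splitInput and its size parameters are names invented here, and it stops at a newline on the assumption that the input came from fgets():

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy the digits of input into nums and every other character into
   letters, each with its own write index. Returns 0 on success, -1 if
   an output buffer is too small. Both outputs stay zero-terminated.
   splitInput and its parameters are illustrative names. */
static int splitInput(const char *input,
                      char *nums, size_t numsSize,
                      char *letters, size_t lettersSize)
{
    size_t n = 0, l = 0;

    assert(input != NULL && nums != NULL && letters != NULL);
    if (numsSize == 0 || lettersSize == 0)
        return -1;
    nums[0] = '\0';
    letters[0] = '\0';

    for (size_t i = 0; input[i] != '\0' && input[i] != '\n'; i++) {
        if (isdigit((unsigned char)input[i])) {
            if (n + 1 >= numsSize)
                return -1;             /* no room for char + terminator */
            nums[n++] = input[i];
            nums[n] = '\0';
        } else {
            if (l + 1 >= lettersSize)
                return -1;
            letters[l++] = input[i];
            letters[l] = '\0';
        }
    }
    return 0;
}
```

With char nums[5] and char letters[3], the input "512mB" ends up as "512" and "mB", and an over-long input is rejected instead of overwriting a neighbouring array.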
There are several problems in your code. #MarkSolus has already pointed out that you access letters out of bounds because you are using i as the index, and i can be more than 1 when you do the memmove.
In this answer I'll address some of the other problems.
string size and termination
Strings in C need zero termination. Therefore an array must be 1 larger than the longest string you expect to store in it. So
char nums[4]; // Can only hold a 3 char string
char letters[2]; // Can only hold a 1 char string
Most likely you want to increase both arrays by 1.
Further, your code never adds the zero-termination. So your strings are invalid.
You need code like:
nums[some_index] = '\0'; // Add zero-termination
Alternatively you can start by initializing the whole array to zero. Like:
char nums[5] = {0};
char letters[3] = {0};
Missing bounds checks
Your loop is a for-loop using strlen as stop-condition. Now what would happen if I gave the input "123456789BBBBBBBB" ? Well, the loop would go on and i would increment to values ..., 5, 6, 7, ... Then you would index the arrays with a value bigger than the array size, i.e. out-of-bounds access (which is real bad).
You need to make sure you never access the array out-of-bounds.
No format check
Now what if I gave an input without any digits, e.g. "HelloWorld" ? In this case nothing would be written to nums so it will be uninitialized when used in atoi(nums). Again - real bad.
Further, there should be a check to make sure that the non-digit input is one of B, kB, mB, or gB.
Performance
This is not that important but... using memmove for copy of a single character is slow. Just assign directly.
memmove(&nums[i], &input[i], 1); ---> nums[i] = input[i];
How to fix
There are many, many different ways to fix the code. Below is a simple solution. It's not the best way but it's done like this to keep the code simple:
#include <ctype.h>  // isdigit
#include <stdlib.h> // exit
#include <string.h> // strcmp, strcpy
#define DIGIT_LEN 4
#define FORMAT_LEN 2
int bitLength(char *input)
{
char nums[DIGIT_LEN + 1] = {0}; // Max allowed number is 9999
char letters[FORMAT_LEN + 1] = {0}; // Allow at max two non-digit chars
if (input == NULL) exit(1); // error - illegal input
if (!isdigit(input[0])) exit(1); // error - input must start with a digit
// parse digits (at max 4 digits)
int i = 0;
while(i < DIGIT_LEN && isdigit(input[i]))
{
nums[i] = input[i];
++i;
}
// parse memory format, i.e. rest of string must be one of B, kB, mB, gB
if ((strcmp(&input[i], "B") != 0) &&
(strcmp(&input[i], "kB") != 0) &&
(strcmp(&input[i], "mB") != 0) &&
(strcmp(&input[i], "gB") != 0))
{
// error - illegal input
exit(1);
}
strcpy(letters, &input[i]);
// Now nums and letter are ready for further processing
...
...
}
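For the step after parsing, here is one possible way to turn the number and the unit into an address length with integer arithmetic instead of log10(). This is only a sketch: addressBits is an invented name, and it assumes binary multiples (k = 2^10, m = 2^20, g = 2^30), which the question does not actually state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Address bits needed for `value` units of `unit`, where unit is one of
   "B", "kB", "mB", "gB". Binary multiples are an assumption of this
   sketch. Returns -1 for an unknown unit or a size of zero. */
static int addressBits(unsigned long value, const char *unit)
{
    int shift;

    assert(unit != NULL);
    if      (strcmp(unit, "B")  == 0) shift = 0;
    else if (strcmp(unit, "kB") == 0) shift = 10;
    else if (strcmp(unit, "mB") == 0) shift = 20;
    else if (strcmp(unit, "gB") == 0) shift = 30;
    else return -1;
    if (value == 0)
        return -1;

    /* ceil(log2(value)) is the number of significant bits in value - 1. */
    int bits = 0;
    for (unsigned long v = value - 1; v != 0; v >>= 1)
        bits++;
    return bits + shift;
}
```

Under those assumptions addressBits(512, "mB") gives 29, since 512 * 2^20 = 2^29. Sizes that are not powers of two are rounded up to the next whole bit.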

How to find the total number of a certain element in an array(C)

I'm trying to create a complete C program to read ten alphabets and display them on the screen. I shall also have to find the number of a certain element and print it on the screen.
#include <stdio.h>
#include <conio.h>
void listAlpha( char ch)
{
printf(" %c", ch);
}
int readAlpha(){
char arr[10];
int count = 1, iterator = 0;
for(int iterator=0; iterator<10; iterator++){
printf("\nAlphabet %d:", count);
scanf(" %c", &arr[iterator]);
count++;
}
printf("-----------------------------------------");
printf("List of alphabets: ");
for (int x=0; x<10; x++)
{
/* I’m passing each element one by one using subscript*/
listAlpha(arr[x]);
}
printf("%c",arr);
return 0;
}
int findTotal(){
}
int main(){
readAlpha();
}
The code should be added in the findTotal() element. The output is expected as below.
Output:
List of alphabets : C C C A B C B A C C //I've worked out this part.
Total alphabet A: 2
Total alphabet B: 2
Total alphabet C: 6
Alphabet with highest hit is C
I use an array to count the occurrences of each character.
I wrote this code, but the count for each character is printed again every time that character shows up in the loop:
int main()
{
char arr[100];
printf("Give a text :");
gets(arr);
int k=strlen(arr);
for(int iterator=0; iterator<k; iterator++)
{
printf("[%c]",arr[iterator]);
}
int T[k];
for(int i=0;i<k;i++)
{
T[i]=arr[i];
}
int cpt1=0;
char d;
for(int i=0;i<k;i++)
{int cpt=0;
for(int j=0;j<k;j++)
{
if(T[i]==T[j])
{
cpt++;
}
}
if(cpt>cpt1)
{
cpt1=cpt;
d=T[i];
}
printf("\nTotal alphabet %c : %d \n",T[i],cpt);
}
printf("\nAlphabet with highest hit is : %c\n",d,cpt1);
}
There is no way to get the number of elements you have written into an array.
An array in C is just a block of memory.
C does not know which elements hold actual data.
But there are common ways to solve this problem in C:
As mentioned above, create an array with one extra element and fill the element after the last actual element with zero ('\0'). Zero marks the end of the actual data. This works as long as '\0' is not among the characters to be processed. It is similar to null-terminated strings in C.
Add a variable to store the number of elements in the array. It is similar to Pascal strings.
#include <stdio.h>
#include <string.h>
#define ARRAY_SIZE 10
char array[ARRAY_SIZE + 1];
int array_len(char * inp_arr) {
int ret_val = 0;
while (inp_arr[ret_val] != '\0')
++ret_val;
return ret_val;
}
float array_with_level[ARRAY_SIZE];
int array_with_level_level;
int main() {
array[0] = '\0';
memcpy(array, "hello!\0", 7); // 7'th element is 0
printf("array with 0 at the end\n");
printf("%s, length is %d\n", array, array_len(array));
array_with_level_level = 0;
const int fill_level = 5;
int iter;
for (iter = 0; iter < fill_level; ++iter) {
array_with_level[iter] = iter*iter/2.0;
}
array_with_level_level = iter;
printf("array with length in the dedicated variable\n");
for (int i1 = 0; i1 < array_with_level_level; ++i1)
printf("%02d:%02.2f ", i1, array_with_level[i1]);
printf(", length is %d", array_with_level_level);
return 0;
}
<conio.h> is a non-standard header. I assume you're using Turbo C/C++ because it's part of your course. Turbo C/C++ is a terrible implementation (in 2020) and the only known reason to use it is because your lecturer made you!
However everything you actually use here is standard. I believe you can remove it.
printf("%c",arr); doesn't make sense. arr will be passed as a pointer (to the first character in the array) but %c expects a character value. I'm not sure what you want that line to do but it doesn't look useful - you've listed the array in the for-loop.
I suggest you remove it. If you do, don't worry about a \0. You only need that if you want to treat arr as a string, but in the code you're handling it quite validly as an array of 10 characters without calling any functions that expect a string. Only those functions need it to contain a 0 terminator.
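Also add return 0; to the end of main(). It means 'execution successful'. C99 and later treat reaching the end of main() as return 0;, but older C89 compilers such as Turbo C do not, so it is safest to write it out.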
With those 3 changes an input of ABCDEFGHIJ produces:
Alphabet 1:
Alphabet 2:
Alphabet 3:
Alphabet 4:
Alphabet 5:
Alphabet 6:
Alphabet 7:
Alphabet 8:
Alphabet 9:
Alphabet 10:-----------------------------------------List of alphabets: A B C D E F G H I J
It's not pretty but that's what you asked for and it at least shows you've successfully read in the letters. You may want to tidy it up...
Remove printf("\nAlphabet %d:", count); and insert printf("\nAlphabet %d: %c", count,arr[iterator]); after scanf(" %c", &arr[iterator]);.
Put a newline before and after the line of minus signs (printf("\n-----------------------------------------\n");) and it looks better to me.
But that's just cosmetics. It's up to you.
There's a number of ways to find the most frequent character. But at this level I recommend a simple nested loop.
Here's a function that finds the most common character (rather than the count of the most common character) and if there's a tie (two characters with the same count) it returns the one that appears first.
char findCommonest(const char* arr){
char commonest='#'; //Arbitrary Bad value!
int high_count=0;
for(int ch=0;ch<10;++ch){
const char counting=arr[ch];
int count=0;
for(int c=0;c<10;++c){
if(arr[c]==counting){
++count;
}
}
if(count>high_count){
high_count=count;
commonest=counting;
}
}
return commonest;
}
It's not very efficient and you might like to put some printfs in to see why!
But I think it's at your level of expertise to understand. Eventually.
Here's a version that unit-tests that function. Never write code without a unit test battery of some kind. It might look like a chore but it'll help debug your code.
https://ideone.com/DVy7Cn
Footnote: I've made minimal changes to your code. There's comments with some good advice that you shouldn't hardcode the array size as 10 and certainly not litter the code with that value (e.g. #define ALPHABET_LIST_SIZE (10) at the top).
I have used const but that may be something you haven't yet met. If you don't understand it and don't want to learn it, remove it.
The terms of your course will forbid plagiarism. You may not cut and paste my code into yours. You are obliged to understand the ideas and implement it yourself. My code is very inefficient. You might want to do something about that!
The only run-time problem I see in your code is this statement:
printf("%c",arr);
This is wrong. At this point in your program, arr is an array of char, not a single char as expected by the format specifier %c. For this to work, the printf() needs to be expanded to:
printf("%c%c%c%c%c%c%c%c%c%c\n",
arr[0],arr[1],arr[2],arr[3],arr[4],
arr[5],arr[6],arr[7],arr[8],arr[9]);
Or: treat arr as a string rather than just a char array. Declare arr as
char arr[11] = {0}; //extra space for null termination
printf("%s\n", arr); //to print the string
Regarding this part of your stated objective:
"I shall also have to find the number of a certain element and print it on the screen. I'm new to this. Please help me out."
The steps below are offered to modify the following work
int findTotal(){
}
Change prototype to:
int findTotal(char *arr);
count each occurrence of unique element in array (How to reference)
Adapt above reference to use printf and formatting to match your stated output. (How to reference)
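The steps above could be sketched along these lines. This is an illustration under assumptions, not the required solution: the extra len parameter, the char return value, and the contiguous 'A'..'Z' range (true for ASCII) are all choices made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Count how often each of 'A'..'Z' occurs in arr[0..len-1], print the
   non-zero totals, and return the most frequent letter (the
   alphabetically first one on a tie), or '\0' if there were none.
   The len parameter and the return type are assumptions of this sketch. */
static char findTotal(const char *arr, int len)
{
    int counts[26] = {0};
    char best = '\0';
    int bestCount = 0;

    assert(arr != NULL || len == 0);

    /* One pass over the input: each letter bumps its own counter. */
    for (int i = 0; i < len; i++)
        if (arr[i] >= 'A' && arr[i] <= 'Z')
            counts[arr[i] - 'A']++;

    /* One pass over the 26 counters: print and track the maximum. */
    for (int k = 0; k < 26; k++) {
        if (counts[k] == 0)
            continue;
        printf("Total alphabet %c: %d\n", 'A' + k, counts[k]);
        if (counts[k] > bestCount) {
            bestCount = counts[k];
            best = (char)('A' + k);
        }
    }
    return best;
}
```

Each total is printed once because the printing loop runs over the 26 counters rather than over the input. The caller can print the returned letter as the "Alphabet with highest hit".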

Boyer-Moore Algorithm

I'm trying to implement the Boyer-Moore algorithm in C to search for a particular word in a .pcap file. I have taken the code from http://ideone.com/FhJok5 and I'm using it as it is.
I just pass the packet as a string, together with the keyword I'm searching for, to its search() function. When I run my code it gives different values every time. Sometimes it gives the correct value too, but most of the time it fails to identify some matches.
I have also obtained results from a naive algorithm implementation. Those results are always perfect.
I am using Ubuntu 12.0.4 over VMware 10.0.1. lang: C
My question is: it has to give the same result every time, right, whether right or wrong? The output keeps changing every time I run the program on the same inputs, and on some runs it gives the correct answer too. Mostly the value varies between 3 or 4 values.
For debugging, so far I have:
Passed plain strings instead of packets: it works perfectly and gives the same, correct value every time.
Checked the pcap part: I can see all packets are being passed to the function (I checked by printing the packet frame number).
Sent the same packets to the naive algorithm code: it gives perfect results.
Please give me some idea of what the issue could be. I suspect something is wrong with memory management, but how do I find out what?
Thanks in advance.
# include <limits.h>
# include <string.h>
# include <stdio.h>
# define NO_OF_CHARS 256
// A utility function to get maximum of two integers
int max (int a, int b) { return (a > b)? a: b; }
// The preprocessing function for Boyer Moore's bad character heuristic
void badCharHeuristic( char *str, int size, int badchar[NO_OF_CHARS])
{
int i;
// Initialize all occurrences as -1
for (i = 0; i < NO_OF_CHARS; i++)
badchar[i] = -1;
// Fill the actual value of last occurrence of a character
for (i = 0; i < size; i++)
badchar[(int) str[i]] = i;
}
/* A pattern searching function that uses Bad Character Heuristic of
Boyer Moore Algorithm */
void search( char *txt, char *pat)
{
int m = strlen(pat);
int n = strlen(txt);
int badchar[NO_OF_CHARS];
/* Fill the bad character array by calling the preprocessing
function badCharHeuristic() for given pattern */
badCharHeuristic(pat, m, badchar);
int s = 0; // s is shift of the pattern with respect to text
while(s <= (n - m))
{
int j = m-1;
/* Keep reducing index j of pattern while characters of
pattern and text are matching at this shift s */
while(j >= 0 && pat[j] == txt[s+j])
j--;
/* If the pattern is present at current shift, then index j
will become -1 after the above loop */
if (j < 0)
{
printf("\n pattern occurs at shift = %d", s);
/* Shift the pattern so that the next character in text
aligns with the last occurrence of it in pattern.
The condition s+m < n is necessary for the case when
pattern occurs at the end of text */
s += (s+m < n)? m-badchar[txt[s+m]] : 1;
}
else
/* Shift the pattern so that the bad character in text
aligns with the last occurrence of it in pattern. The
max function is used to make sure that we get a positive
shift. We may get a negative shift if the last occurrence
of bad character in pattern is on the right side of the
current character. */
s += max(1, j - badchar[txt[s+j]]);
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "ABAAAABAACD";
char pat[] = "AA";
search(txt, pat);
return 0;
}
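One likely suspect, offered as a guess rather than a confirmed diagnosis: packet data is binary. Where plain char is signed, badchar[(int) str[i]] and badchar[txt[s+j]] use a negative index for any byte of 0x80 or above, which is undefined behaviour, and strlen() stops at the first zero byte in the packet. Plain text strings trigger neither problem, which would match the debugging results. Below is a sketch of the same bad-character search over unsigned bytes with explicit lengths. The name searchBytes is made up, and it assumes the caller knows the packet length, for example from the capture header.

```c
#include <assert.h>
#include <stddef.h>

#define NO_OF_CHARS 256

/* Bad-character search over raw bytes. Lengths are passed in, so zero
   bytes inside the data are fine, and unsigned char keeps every table
   index within 0..255. Returns the number of matches found.
   searchBytes is a hypothetical name, not part of the ideone code. */
static int searchBytes(const unsigned char *txt, int n,
                       const unsigned char *pat, int m)
{
    int badchar[NO_OF_CHARS];
    int matches = 0;

    assert(txt != NULL && pat != NULL);
    if (m <= 0 || n < m)
        return 0;

    for (int i = 0; i < NO_OF_CHARS; i++)
        badchar[i] = -1;
    for (int i = 0; i < m; i++)
        badchar[pat[i]] = i;          /* last occurrence of each byte */

    int s = 0;                        /* shift of pattern against text */
    while (s <= n - m) {
        int j = m - 1;
        while (j >= 0 && pat[j] == txt[s + j])
            j--;
        if (j < 0) {
            matches++;
            s += (s + m < n) ? m - badchar[txt[s + m]] : 1;
        } else {
            int shift = j - badchar[txt[s + j]];
            s += (shift > 1) ? shift : 1;
        }
    }
    return matches;
}
```

The shifting logic is unchanged from the code above; only the element type and the way lengths are obtained differ.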

Segmentation fault when indexing into char array via pointer

My code is causing a segmentation fault when accessing an array element even though that element was already accessed without a problem.
int charToInt(char a)
{
int b;
if(isdigit(a))
{
b = a - '0' - 1;
}
if(isalpha(a))
{
b = a - 65;
}
return b;
}
int validPosition(char **array, int r, int c, char* position, int slots)
{
int i,k;
if(strlen(position) == 5)
{
if(!isalpha(position[0]) || !isdigit(position[1]) || position[2]!=' ' || (position[3]!='N' && position[3]!='E' && position[3]!='W' && position[3]!='S')) // wrong letter
{
printf("\n%s", "Invalid answear.This is an example of a valid answear: A5 N");
return 2;
}
if( charToInt(position[0]) > r - 1 || charToInt(position[1]) > c - 1 ) // if it goes outside the limits
{
printf("\n%s", "The position you choosed is out of the bountries...");
return 2;
}
printf("\n%s%c%s","position[3] is: ",position[3], " but it doesn't work >_<"); // position[3] is N
if(position[3] == 'N') //the problem is here <~~~~~~~~~~~~~~~~~~~<
{
printf("\n%s", "come on");
if(charToInt(position[0]) + slots < r)
{
for(i=charToInt(position[0])-1; i<charToInt(position[0])+slots; i++)
{
if(array[i-1][charToInt(position[1])-1] != '.')
{
printf("\n%s", "The position you choosed is not valid because there is oneother ship there");
return 2;
}
}
}
else
{
printf("\n%s", "The ship is going out of the bountries...");
return 2;
}
}
}
}
When position holds the string "A9 N", the printf correctly outputs 'N' for position[3]. For some reason when it tries to do if(position[3] == 'N'), however, a segmentation fault occurs.
Example program run:
Example of positioning: G3 E
Aircraft carrier (5 places), Give location and direction: A9 N
1
position[3] is: N but it doesn't work >_<
Well, based on your updates, it seems you have a variety of problems. For future reference, actually adding in the (possibly simplified) code showing how you were calling the function in question is better than trying to describe it using prose in a comment. There will be less guesswork for the people trying to help you.
If I'm reading your comment correctly, the code that calls validPosition looks something like this:
// "r and c are 9 and 9 in the specific example(rows columns)."
int rows = 9;
int columns = 9;
// "slots=5."
int slots = 5;
// "array is a 2d array and it contains characters(created with malloc)."
char **array = malloc(rows * columns * sizeof(char));
// "i created char position[10] in the function that called this function"
char position[10];
// "and with fgets(position, 10, stdin); i putted A9 N inside it."
fgets(position, 10, stdin);
validPosition(array, rows, columns, position, slots);
The first problem is your description of the allocation of array (I apologize if I misunderstood your comment and this isn't actually what you are doing). It should look similar to the code below for a dynamically sized two-dimensional array used with two subscripting operations (array[index1][index2], as it is in validPosition). Pointers-to-pointers (char **array) act differently than fixed sized multi-dimensional arrays (array[SIZE1][SIZE2]) when you access them that way.
// each entry in array should be a pointer to an array of char
char **array = malloc(rows * sizeof(char*));
for(i = 0; i < rows; i++)
array[i] = malloc(columns * sizeof(char));
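The same allocation with failure checks and a matching cleanup routine might look like this. It is a sketch: allocGrid and freeGrid are invented names, and filling the cells with '.' is an assumption based on the '.' test in validPosition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free a grid allocated by allocGrid; safe to call with NULL. */
static void freeGrid(char **grid, int rows)
{
    assert(rows >= 0);
    if (grid == NULL)
        return;
    for (int i = 0; i < rows; i++)
        free(grid[i]);                /* free(NULL) is a no-op */
    free(grid);
}

/* Allocate rows x columns chars with every cell set to '.'.
   Returns NULL if the sizes are invalid or any allocation fails. */
static char **allocGrid(int rows, int columns)
{
    if (rows <= 0 || columns <= 0)
        return NULL;

    char **grid = malloc((size_t)rows * sizeof *grid);
    if (grid == NULL)
        return NULL;
    for (int i = 0; i < rows; i++)
        grid[i] = NULL;               /* so a partial grid can be freed */

    for (int i = 0; i < rows; i++) {
        grid[i] = malloc((size_t)columns);
        if (grid[i] == NULL) {
            freeGrid(grid, rows);
            return NULL;
        }
        memset(grid[i], '.', (size_t)columns);
    }
    return grid;
}
```

Setting every row pointer to NULL first means freeGrid can release a half-built grid without touching uninitialized pointers.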
You also need to be careful about using position after the fgets call. You should check the return value to make sure it isn't NULL (indicating an EOF or error condition). The string may not be \0-terminated in this case. In fact, all the elements may still be uninitialized (assuming you didn't initialize them before the call). This can lead to undefined behavior.
The next issue is that validPosition does not return a value on every code path. One example is if strlen(position) != 5. The other is if you enter the for loop and array[i-1][charToInt(position[1])-1] != '.' is never true (that is, the ship placement is deemed valid).
As strange as it is for an English speaker to say this to a Greek author, let's ignore internationalization and focus only on the default C locale. The checks on position[0] should therefore be sufficient, though you might consider allowing your users to use lowercase letters as well. When converting position[1] from 1-based to 0-based, however, you do not account for the case when it is '0', which will result in charToInt returning -1. Furthermore, you're erroneously doing the subtraction again in the second array subscript of array[i-1][charToInt(position[1])-1].
Similarly, as pointed out by Jite and BLUEPIXY, you are doing two extra subtractions on the result of charToInt(position[0]): one in the for loop initializer (i=charToInt(position[0])-1) and one in the first array subscript of array[i-1][charToInt(position[1])-1].
Once you fix that, you might find that you are sometimes incorrectly telling the user that their selection is invalid. This is because you are checking charToInt(position[0]) + slots < r instead of <= r.
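One way to keep the off-by-one adjustments in a single place is to convert each character to a 0-based index exactly once and reject anything out of range there. A sketch with invented names (rowIndex, colIndex, fitsInRows), assuming rows are lettered from 'A', columns are numbered from '1', and the letters are contiguous as in ASCII:

```c
#include <assert.h>
#include <ctype.h>

/* Map a row letter ('A' or 'a' is row 0) to a 0-based index, or -1 if
   it is not a letter or lies outside the board. */
static int rowIndex(char ch, int rows)
{
    assert(rows > 0 && rows <= 26);
    int up = toupper((unsigned char)ch);
    if (up < 'A' || up > 'Z')
        return -1;
    int idx = up - 'A';
    return (idx < rows) ? idx : -1;
}

/* Map a column digit ('1' is column 0) to a 0-based index, or -1.
   '0' is rejected instead of silently becoming -1 later on. */
static int colIndex(char ch, int columns)
{
    assert(columns > 0);
    if (ch < '1' || ch > '9')
        return -1;
    int idx = ch - '1';
    return (idx < columns) ? idx : -1;
}

/* Does a ship of `slots` cells fit when it starts at `row` and runs
   towards higher row indices? Note the <=, not <. */
static int fitsInRows(int row, int slots, int rows)
{
    return row >= 0 && slots > 0 && row + slots <= rows;
}
```

The loop can then run from row to row + slots - 1 and index array[i][col] directly, with no further -1 corrections.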
As I mentioned in my comment, one of the accesses to array is very probably the culprit behind your segmentation violation, not position[3] == 'N'. The reason you don't see the output of printf("\n%s", "come on"); is that your stdout appears to be line-buffered and there's no end of line to flush it. It is generally automatically flushed on normal program termination, however you're seg-faulting so that doesn't happen.
Finally, these are only the semantic errors I noticed. Stylistically, the code could also stand to be improved. For instance, it seems you're going to be implementing else if(position[3] == 'E', else if(position[3] == 'W', and else if(position[3] == 'S' clauses with similar logic to your if(position[3] == 'N' clause. This increases the likelihood you'll introduce an error by incorrectly copying-and-pasting and also increases your work later when you need to make a change in four places instead of one.
Given the term 'Segmentation Fault', I believe you are on a Linux machine.
Use gdb to find the cause of error. Here are the steps.
Compile with additional -g flag (ex. gcc -g my_prog.c)
Run debugger: gdb a.out
Use 'list' command to find the line for break point (eg. first line of your function)
Set breakpoint on that line with: b 25 (if 25 is that line)
Run program with 'run' command
Use command 'next' to execute next line of code
Now the execution will pause on that line, and you can examine memory, print variable contents and so on. Generally you want to determine on which line the execution fails and what was in which variable.
With a little playing with memory, you will easily find where the problem is. Personally, my code won't work with gdb support.
Perhaps the segmentation fault is at array[i-1][charToInt(position[1])-1]
i = charToInt(position[0])-1 = charToInt('A')-1 = -1, so array[i-1] is an out-of-bounds array access.

Caesar Cipher Program - Absurd Number in Array Output

I'm actually writing about the same program as before, but I feel like I've made significant progress since the last time. I have a new question however; I have a function designed to store the frequencies of letters contained within the message inside an array so I can do some comparison checks later. When I ran a test segment through the function by outputting all of my array entries to see what their values are, it seems to be storing some absurd numbers. Here's the function of issue:
void calcFreq ( float found[] )
{
char infname[15], alpha[27];
char ch;
float count = 0;
FILE *fin;
int i = 0;
while (i < 26) {
alpha[i] = 'A' + i++;
}
printf("Please input the name of the file you wish to scan:\n");
scanf("%s", infname);
fin = fopen ( infname, "r");
while ( !feof(fin) ) {
fscanf(fin, "%c", &ch);
if ( isalpha(ch) ) {
count += 1;
i = 0;
if ( islower(ch) ) { ch = toupper(ch); }
while ( i < 26 ) {
if ( ch == alpha[i] ) {
found[i]++;
i = 30;
}
i++;
}
}
}
fclose(fin);
i = 0;
while ( i < 26 ) {
found[i] = found[i] / count;
printf("%f\n", found[i]);
i++;
}
}
At like... found[5], I get this hugely absurd number stored in there. Is there anything you can see that I'm just overlooking? Also, some array values are 0 and I'm pretty certain that every character of the alphabet is being used at least once in the text files I'm using.
I feel like a moron - this program should be easy, but I keep overlooking simple mistakes that cost me a lot of time >.> Thank you so much for your help.
EDIT So... I set the entries to 0 of the frequency array and it seems to turn out okay - in a Linux environment. When I try to use an IDE from a Windows environment, the program does nothing and Windows crashes. What the heck?
Here are a few pointers besides the most important one of initializing found[], which was mentioned in other comments.
the alpha[] array complicates things, and you don't need it. See below for a modified file-read-loop that doesn't need the alpha[] array to count the letters in the file.
And strictly speaking, the expression you're using to initialize the alpha[] array:
alpha[i] = 'A' + i++;
has undefined behavior because you modify i as well as use it as an index in two different parts of the expression. The good news is that since you don't need alpha[] you can get rid of its initialization entirely.
The way you're checking for EOF is incorrect - it'll result in you acting on the last character in the file twice (since the fscanf() call that results in an EOF will not change the value of ch). feof() won't return true until after the read that occurs at the end of the file. Change your ch variable to an int type, and modify the loop that reads the file to something like:
// assumes that `ch` is declared as `int`
while ( (ch = fgetc(fin)) != EOF ) {
if ( isalpha(ch) ) {
count += 1;
ch = toupper(ch);
// the following line is technically non-portable,
// but works for ASCII targets.
// I assume this will work for you because the way you
// initialized the `alpha[]` array assumed that `A`..`Z`
// were consecutive.
int index = ch - 'A';
found[index] += 1;
}
}
alpha[i] = 'A' + i++;
This is undefined behavior in C. Anything can happen when you do this, including crashes. Read this link.
Generally I would advise you to replace your while loops with for loops, when the maximum number of iterations is already known. This makes the code easier to read and possibly faster as well.
Is there a reason you are using float for counter variables? That doesn't make sense.
'i = 30;' What is this supposed to mean? If your intention was to end the loop, use a break statement instead of some mysterious magic number. If your intention was something else, then your code isn't doing what you think it does.
You should include some error handling if the file was not found. fin = fopen(..) and then if(fin == NULL) handle errors. I would say this is the most likely cause of the crash.
Check the definition of found[] in the caller function. You're probably running out of bounds.
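Pulling several of these points together (fopen check, int for the character, no reliance on the caller's initialization, integer counters), a sketch might look like the following. The names countLetters and calcFreqFromFile are invented for illustration, and the range test assumes 'A'..'Z' are contiguous, as in ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Count the letters read from fin into counts[0..25], ignoring case.
   Returns the total number of letters seen. */
static long countLetters(FILE *fin, long counts[26])
{
    long total = 0;
    int ch;                          /* int, so EOF is distinguishable */

    assert(fin != NULL && counts != NULL);
    for (int i = 0; i < 26; i++)
        counts[i] = 0;               /* do not rely on the caller */

    while ((ch = fgetc(fin)) != EOF) {
        int up = toupper(ch);
        if (up >= 'A' && up <= 'Z') {
            counts[up - 'A']++;
            total++;
        }
    }
    return total;
}

/* Fill freq[0..25] with the relative letter frequencies of the file at
   path. Returns 0 on success, -1 if the file cannot be opened. */
static int calcFreqFromFile(const char *path, double freq[26])
{
    long counts[26];

    assert(path != NULL && freq != NULL);
    FILE *fin = fopen(path, "r");
    if (fin == NULL)
        return -1;                   /* caller decides how to report it */

    long total = countLetters(fin, counts);
    fclose(fin);

    for (int i = 0; i < 26; i++)
        freq[i] = (total > 0) ? (double)counts[i] / (double)total : 0.0;
    return 0;
}
```

A file name that cannot be opened now produces a return value of -1 instead of a call to feof() on a NULL pointer, and an empty file no longer divides by zero.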

Resources