Unsure as to why toupper() is cutting off last letter in C - arrays

The goal of this program is to take a 26-letter 'key' from the terminal (through argv[]) and use its indices as a substitution guide. There are two inputs: one passed in argv[] and one read with get_string(). The argv[] input looks like this: ./s YTNSHKVEFXRBAUQZCLWDMIPGJO, where s is the file name. The get_string() input looks like this: plaintext: HELLO (the input is HELLO). The program then loops through all the letters in the plaintext and substitutes each one by using its alphabetical index as an index into the argv[] key. For example, H has an alphabetical index of 7 (where a = 0 and z = 25), so we look at index 7 in the key YTNSHKV(E)FXRBAUQZCLWDMIPGJO, which in this case is E. It does this for each letter in the input, and we end up with the output ciphertext: EHBBQ. This is what it should look like in the terminal:
./s YTNSHKVEFXRBAUQZCLWDMIPGJO
plaintext: HELLO
ciphertext: EHBBQ
But my output is EHBB, since it cuts off the last letter for some reason when I use toupper().
Also, the case of the output follows the plaintext input: if the plaintext input was hello, world and the argv[] key was VCHPRZGJNTLSKFBDQWAXEUYMOI, the output would be jrssb, ybwsp, and if the input was HellO, world with the same key, the output would be JrssB, ybwsp.
I'm basically done with the problem: my program substitutes the given plaintext into the correct ciphertext based on the key passed on the command line. However, if the plaintext input was HELLO and the key was vchprzgjntlskfbdqwaxeuymoi (all lowercase), then it should return an uppercase ciphertext (JRSSB) and not a lowercase one (jrssb). My program puts all the letters of the command-line key into an array of length 26, then loops through the plaintext letters and matches each one's ASCII value (minus a certain number to get it into the 0-25 index range) with the index in the key. E has an alphabetical index of 4, so in this case my program would get lowercase r, but I need it to be R, so that's why I'm using toupper().
When I used tolower(), everything worked fine, but once I started using toupper(), the last letter of the ciphertext was cut off for some reason. Here is my output before using toupper():
ciphertext: EHBBQ
And here is my output after I use toupper():
ciphertext: EHBB
Here is my code:
int main(int argc, string argv[]) {
string plaintext = get_string("plaintext: ");
// Putting all the argv letters into an array called key
char key[26]; // change 4 to 26
for (int i = 0; i < 26; i++) // change 4 to 26
{
key[i] = argv[1][i];
}
// Assigning array called ciphertext, the length of the inputted text, to hold cipertext chars
char ciphertext[strlen(plaintext)];
// Looping through the inputted text, checking for upper and lower case letters
for (int i = 0; i < strlen(plaintext); i++)
{
// The letter is lower case
if (islower(plaintext[i]) != 0)
{
int asciiVal = plaintext[i] - 97; // Converting from ascii to decimal value and getting it into alphabetical index (0-25)
char l = tolower(key[asciiVal]); // tolower() works properly
//printf("%c", l);
strncat(ciphertext, &l, 1); // Using strncat() to append the converted plaintext char to ciphertext
}
// The letter is uppercase
else if (isupper(plaintext[i]) != 0)
{
int asciiVal = plaintext[i] - 65; // Converting from ascii to decimal value and getting it into alphabetical index (0-25)
char u = toupper(key[asciiVal]); // For some reason having this cuts off the last letter
strncat(ciphertext, &u, 1); // Using strncat() to append the converted plaintext char to ciphertext
}
// If its a space, comma, apostrophe, etc...
else
{
strncat(ciphertext, &plaintext[i], 1);
}
}
// prints out ciphertext output
printf("ciphertext: ");
for (int i = 0; i < strlen(plaintext); i++)
{
printf("%c", ciphertext[i]);
}
printf("\n");
printf("%c\n", ciphertext[1]);
printf("%c\n", ciphertext[4]);
//printf("%s\n", ciphertext);
return 0;
}

The strncat function expects its first argument to be a null-terminated string that it appends to. You're calling it with ciphertext while it is uninitialized. This means that you're reading uninitialized memory, possibly reading past the end of the array, triggering undefined behavior.
You need to make ciphertext an empty string before you call strncat on it. Also, you need to add 1 to the size of this array to account for the terminating null byte on the completed string to prevent writing off the end of it.
char ciphertext[strlen(plaintext)+1];
ciphertext[0] = 0;
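As a minimal sketch of why the empty-string initialization matters, the hypothetical helper below (copy_by_appending is not part of the original program) builds a string one character at a time with strncat(). It only works because dst is turned into an empty string first, and it assumes dst has room for strlen(src) + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copies src into dst by
   appending one character at a time with strncat().
   Assumption: dst has room for strlen(src) + 1 bytes. */
void copy_by_appending(char *dst, const char *src)
{
    dst[0] = '\0'; /* make dst a valid empty string before appending */
    for (size_t i = 0; src[i] != '\0'; i++)
    {
        /* appends exactly one character and re-terminates dst */
        strncat(dst, &src[i], 1);
    }
}
```

Called as char buf[strlen(plaintext) + 1]; copy_by_appending(buf, plaintext); this leaves buf holding a null-terminated copy that can safely be printed with %s.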

There are multiple problems in the code:
you do not test the command line argument presence and length
the array should be allocated with 1 extra byte for the null terminator and initialized as an empty string for strncat() to work properly.
instead of hard coding ASCII values such as 97 and 65, use character constants such as 'a' and 'A'
strncat() is overkill for your purpose. You could just write ciphertext[i] = l; instead of strncat(ciphertext, &l, 1)
islower() and isupper() are only defined for values representable as unsigned char and the special negative value EOF. You should cast char arguments as (unsigned char)c to avoid undefined behavior on non-ASCII bytes on platforms where char happens to be a signed type.
avoid redundant tests such as islower(xxx) != 0. It is more idiomatic to just write if (islower(xxx))
Here is a modified version:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <cs50.h>
int main(int argc, string argv[]) {
// Testing the argument
if (argc < 2 || strlen(argv[1]) != 26) {
printf("invalid or missing argument\n");
return 1;
}
// Putting all the argv letters into an array called key
char key[26];
memcpy(key, argv[1], 26);
string plaintext = get_string("plaintext: ");
int len = strlen(plaintext);
// Define an array called ciphertext, the length of the inputted text, to hold ciphertext chars and a null terminator
char ciphertext[len + 1];
// Looping through the inputted text, checking for upper and lower case letters
for (int i = 0; i < len; i++) {
unsigned char c = plaintext[i];
if (islower(c)) { // The letter is lower case
int index = c - 'a'; // Converting from ascii to decimal value and getting it into alphabetical index (0-25)
ciphertext[i] = tolower((unsigned char)key[index]);
} else
if (isupper(c)) {
// The letter is uppercase
int index = c - 'A'; // Converting from ascii to decimal value and getting it into alphabetical index (0-25)
ciphertext[i] = toupper((unsigned char)key[index]);
} else {
// other characters are unchanged
ciphertext[i] = c;
}
}
ciphertext[len] = '\0'; // set the null terminator
printf("ciphertext: %s\n", ciphertext);
return 0;
}
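The modified version only checks that the argument is present and 26 characters long. If you also want to reject keys that contain non-letters or repeated letters, a check along these lines could be added. This is only a sketch: is_valid_key is a hypothetical helper name, and the "each letter exactly once" rule is an assumption about what the assignment requires.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if key is exactly 26 letters and each
   letter of the alphabet appears once (case-insensitively).
   Assumption: an ASCII-compatible character set. */
bool is_valid_key(const char *key)
{
    bool seen[26] = { false };

    if (key == NULL || strlen(key) != 26)
    {
        return false;
    }
    for (int i = 0; i < 26; i++)
    {
        unsigned char c = (unsigned char)key[i];
        if (!isalpha(c))
        {
            return false; /* not a letter */
        }
        int index = tolower(c) - 'a';
        if (index < 0 || index > 25 || seen[index])
        {
            return false; /* outside a-z, or a repeated letter */
        }
        seen[index] = true;
    }
    return true;
}
```

In main, the test could then become if (argc < 2 || !is_valid_key(argv[1])) before the key is used.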

Related

CS50 / BEGINNER - Segmentation fault in nested for loop in C

I'm trying to write code that will take each digit from a plaintext string input and, if it is a letter, output a different letter, as defined by a substitution key (26-letter key).
In other words, if the alphabet was "abcd" and provided key was "hjkl", an input of "bad" would output "jhl".
// Regular alphabet is to be used as comparison base for key indexes //
string alphabet = "abcdefghijklmnopqrstuvwxyz";
// Prompt user for input and assign it to plaintext variable //
string plaintext = get_string("plaintext: ");
Non-letters should be printed as-is.
My idea was to loop the input digit through every index in the alphabet looking for the corresponding letter and, if found, print the same index character from the string. (confusing, I think)
This loop, however, returns a segfault when I run it, but not when debugging:
// Loop will iterate through every ith digit in plaintext and operate the cipher //
for (int i = 0; plaintext[i] != '\0'; i++) {
// Storing plaintext digit in n and converting char to string //
char n[2] = "";
n[0] = plaintext[i];
n[1] = '\0';
// If digit is alphabetic, operate cipher case-sensitive; if not, print as-is //
if (isalpha(n) != 0) {
for (int k = 0; alphabet[k] != '\0'; k++) {
char j[2] = "";
j[0] = alphabet[k];
j[1] = '\0';
if (n[0] == j[0] || n[0] == toupper(j[0])) {
if (islower(n) != 0) {
printf("%c", key[k]);
break;
} else {
printf("%c", key[k] + 32);
break;
}
}
}
} else {
printf("%c", (char) n);
}
}
What's going wrong? I've looked for help online but most sources are not very beginner-friendly.
Your code seems to be working except for one error. The program crashes at
isalpha(n)
Because you declared
char n[2]
the argument there is a pointer of type char*. But isalpha only accepts an int parameter, so just write it as
isalpha(n[0])
The same goes for islower(n), and for printf("%c", (char) n) in the else branch, which should print n[0].
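To illustrate working on a single char rather than on a char array, here is a small sketch. substitute_char is a hypothetical helper, not code from the question, and it assumes key points to 26 letters and that the character set is ASCII-compatible. It indexes the key directly instead of searching the alphabet string.

```c
#include <ctype.h>

/* Hypothetical helper: returns the substitute for c using a 26-letter key,
   preserving the case of c. Non-letters are returned unchanged.
   Assumptions: key holds 26 letters; ASCII-compatible character set. */
char substitute_char(char c, const char *key)
{
    if (c >= 'a' && c <= 'z')
    {
        /* lowercase input: take key[index] and force it to lowercase */
        return (char)tolower((unsigned char)key[c - 'a']);
    }
    if (c >= 'A' && c <= 'Z')
    {
        /* uppercase input: take key[index] and force it to uppercase */
        return (char)toupper((unsigned char)key[c - 'A']);
    }
    return c; /* digits, spaces, punctuation, etc. */
}
```

The loop over the plaintext then reduces to printf("%c", substitute_char(plaintext[i], key)); for each i.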

How to read multiple characters from an argument

I am trying to read multiple characters from an argument in C. So when the person runs the file like "./amazing_program qwertyyuiopasdfghjklzxcvbnm", it would read the qwerty characters and store them in an array as numbers (ASCII), like:
array[0] = 'q';
array[1] = 'w';
array[2] = 'e';
array[3] = 'r';
array[4] = 't';
array[5] = 'y';
and so on...
My goal: Is to separate the argument into each individual character and store each individual character into a different place in the array (like shown above).
I tried this way, but it didn't work.
int user_sub = 0;
int argument = 1;
while (argument < argc) {
user_sub = atoi(argv[argument]);
argument = argument + 1;
}
From reading your comments, I've come to understand you just want to be able to get to the characters so you can do a shift. Well, that's not so hard to do, so I've tried to show you how you can do it here without having to complete the Caesar logic for you.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SHIFT 13
int main (int argc, const char *argv[]) {
// Verify they gave exactly one input string.
if (argc != 2) {
fprintf(stderr, "Usage: %s <word>\n", argv[0]);
exit(EXIT_FAILURE);
}
// A string IS already an array of characters. So shift them and output.
int n = strlen(argv[1]);
for (int i = 0; i < n; i++) {
char c = argv[1][i];
// Shift logic here: putchar(...);
printf("%d: %c\n", i, c);
}
return EXIT_SUCCESS;
}
The key takeaway is that a string is already an array. You don't need to make a new array and stick all the characters in it. You already have one. What this program does is simply "extract" and print them for you so you can see this. It currently only writes the current argument string to output, and does no shifting. That's for you to do. It also doesn't take into account non-alphabetical characters. You'll have to think about them yourself.
You are missing some fundamentals:
1)
A string in C is an ARRAY of type char. We know where the end of the array is thanks to a special value: '\0'.
Now, you have to deeply understand that each element of the array contains a NUMBER: since the type of the element is char, it will be a number in the range [-128, 127] (yeah, I know that char is special and can be signed or not, but let's keep it simple for the time being).
So if you access each element of the array and print it, you will get a number between -128 and 127. So how does the program know to print a letter instead of a number? And how does it know which letter goes with which number?
Thanks to an internal table used for this unique purpose. The most common is the ASCII table. So if an element of the array is 65, what will be printed is 'A'.
2) How can I go through each element of a string (which is an array of char terminated by '\0')?
Simply with a for loop.
char str[] = "test example";
for (size_t i = 0; str[i] != '\0'; ++i) {
    printf("The %zu letter is '%c'\n", i, str[i]); // %zu is the format for size_t
}
Again, since it's a number in str[i], how does the program know to print a letter? Thanks to the "%c" in printf, meaning "print the letter using the table (probably ASCII)". If you use "%s", it's the same thing, but you have to give the array itself instead of one element of the array.
So, what if I want to print the number instead of the letter? Just use "%d" in printf.
char str[] = "test example";
for (size_t i = 0; str[i] != '\0'; ++i) {
    printf("The %zu letter is '%c' and its real value is %d\n", i, str[i], str[i]);
}
Now, what if we increment the value in each element of the string?
char str[] = "test example";
for (size_t i = 0; str[i] != '\0'; ++i) {
    str[i] = str[i] + 1; // Or ++str[i];
}
for (size_t i = 0; str[i] != '\0'; ++i) {
    printf("The %zu letter is '%c' and its real value is %d\n", i, str[i], str[i]);
}
We have changed the string "test example" into "uftu!fybnqmf" (the space, value 32, has become '!', value 33).
Now, for your problem, you have to work toward the solution step by step:
First, make a function that alters (ciphers) a string given as an argument by adding a shift.
void CesarCypherString(char *string);
Beware of "overflow"! If I want to have a shift of 5, then 'a' will become 'f', but what happens for 'z'? It should be 'e'.
But if you look at the ASCII table, 'a' = 97 and 'f' = 102 (which makes sense, since 'a' + 5 = 'f', 97 + 5 = 102), but 'z' is 122 and 'e' is 101. So you cannot directly do 'z' + 5 = 'e', since 122 + 5 = 127, not 101.
Hint: use the modulo operator (%).
Next, when you have finished the function CesarCypherString, write the function CesarDecypherString that will decipher a string.
When you have finished, then you can concentrate on how to read/duplicate a string from argv.
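The wrap-around described above can be sketched as follows. This is one possible approach, not the only one: caesar_shift_string is a hypothetical name, the shift parameter is an addition to the signature shown earlier, and the code assumes an ASCII-compatible character set where a-z and A-Z are contiguous.

```c
#include <stddef.h>

/* Hypothetical helper: shifts every letter of s by `shift` positions,
   wrapping around the alphabet with the modulo operator.
   Non-letters are left unchanged. Assumes ASCII-style contiguous letters. */
void caesar_shift_string(char *s, int shift)
{
    /* normalize the shift into 0..25 so negative shifts also work */
    shift = ((shift % 26) + 26) % 26;

    for (size_t i = 0; s[i] != '\0'; i++)
    {
        if (s[i] >= 'a' && s[i] <= 'z')
        {
            /* move into 0..25, add the shift, wrap, move back to 'a'..'z' */
            s[i] = (char)('a' + (s[i] - 'a' + shift) % 26);
        }
        else if (s[i] >= 'A' && s[i] <= 'Z')
        {
            s[i] = (char)('A' + (s[i] - 'A' + shift) % 26);
        }
    }
}
```

With a shift of 5, 'z' becomes 'e' as required. Deciphering is then just a call with the negated shift, e.g. caesar_shift_string(s, -5).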

Substitution Cipher Alphabet to QWERTY

I'm working on a program that uses ciphers. The Cipher I need to use is the alphabet to qwerty. So...
abcdefghijklmnopqrstuvwxyz
qwertyuiopasdfghjklzxcvbnm
the program needs to take the encoding key
qwertyuiopasdfghjklzxcvbnm
and produce the decoding key.
How would I go about doing this? I've only done a Caesar Cipher in the past.
Here is the code in C to convert a string input to qwerty cipher, assuming you're working with only lowercase letters, and using a buffer size of 500 for strings:
#include <stdio.h>
#include <string.h>
int main(void) {
    const char *ciphertext = "qwertyuiopasdfghjklzxcvbnm"; // cipher lookup
    char input[500]; // input buffer
    printf("Enter text: ");
    if (fgets(input, sizeof(input), stdin) == NULL) { // safe input from user
        return 1; // no input available
    }
    input[strcspn(input, "\n")] = 0; // remove the \n (newline), if present
    int count = strlen(input); // get the string length
    char output[count + 1]; // output string, with room for the null terminator
    for (int i = 0; i < count; i++) { // loop through characters in input
        int index = input[i] - 'a'; // get the index in the cipher by subtracting 'a' (97) from the current character
        if (index < 0 || index > 25) {
            output[i] = ' '; // not a lowercase letter: put a space to account for spaces
        }
        else {
            output[i] = ciphertext[index]; // else, assign the output[i] to the ciphertext[index]
        }
    }
    output[count] = 0; // null-terminate the string
    printf("output: %s\n", output); // output the result
}
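The code above only encodes. For the decoding key the question asks about, one way (a sketch, assuming the encoding key is a permutation of the 26 lowercase letters) is to invert the mapping: if letter number i encodes to enc[i], then enc[i] must decode back to letter number i. make_decoding_key is a hypothetical helper name.

```c
/* Hypothetical helper: builds the decoding key for a substitution cipher.
   Assumptions: enc is a permutation of the lowercase letters a-z in an
   ASCII-compatible character set; dec has room for 27 bytes. */
void make_decoding_key(const char *enc, char *dec)
{
    for (int i = 0; i < 26; i++)
    {
        /* enc maps letter number i to enc[i], so the inverse mapping
           sends enc[i] back to letter number i */
        dec[enc[i] - 'a'] = (char)('a' + i);
    }
    dec[26] = '\0'; /* null-terminate so dec can be used as a string */
}
```

For the key qwertyuiopasdfghjklzxcvbnm this produces kxvmcnophqrszyijadlegwbuft, which can then be used as the lookup table in the same encoding loop to decode.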

Testing to see if a string only contains alphabetic numbers

I am studying C but am stuck on a program I've been trying to create. Essentially I'm testing to see if a character string only contains alphabetic characters a-z or A-Z.
What I have done:
defined a function called strisalpha to do this
called the function in my "test bench", which asks the user to enter a string
What goes wrong in the gcc compiler:
testBench1.c:21:28: warning: implicit declaration of function 'atoi' [-Wimplicit-function-declaration]
integerCharValue = atoi( string[loopPointer1] );
This is my definition of strisalpha:
int strisalpha(char *string)
{
int stringLength = 0;
int loopPointer1 = 0;
int integerCharValue = 0;
int dummyArgument = 0;
/* Get length of string */
stringLength = strlen (string);
printf("\nString length is: %d", stringLength);
/* ASCII Codes In Decimal
A (65Decimal) to Z(90Decimal) and
a (97Decimal) to z (122Decimal)
Set up a loop and query if ASCII alphabetic character
*/
for (loopPointer1 = 1; loopPointer1 > stringLength; loopPointer1++ )
{
/* Convert character to integer */
integerCharValue = atoi( string[loopPointer1] );
printf ("%d \n", integerCharValue);
if (integerCharValue >= 65)
if (integerCharValue <= 90)
return 1; /* Upper case alphabetic character, so OK */
else if (integerCharValue >= 97)
if (integerCharValue <= 122)
return 1; /* Lower case alphabetic character, so OK */
else
The result always says I entered an ASCII character, even if I didn't. Please could someone shed some light on what I'm doing wrong? Thanks
The main problem is your for loop
for (loopPointer1 = 1; loopPointer1 > stringLength; loopPointer1++ )
Arrays in C start at 0, and the middle expression defines whether the loop should continue, not when it should finish. So what you want is:
for (loopPointer1 = 0; loopPointer1 < stringLength; loopPointer1++ )
And then, for checking each character, you don't need to convert it, as you can compare characters directly, for example:
if (string[loopPointer1] >= 'A')
atoi isn't what you want to use here. chars are already stored as numeric ASCII values. You can just set integerCharValue = string[loopPointer1].
Your loopPointer1 is starting at 1, so you will skip the first character in the string. In C, the index starts at 0.
Also, you don't want to return immediately if you find a letter. Calling return will exit the function and stop your loop. What you probably want to do is look for characters that are not letters, and return 0 if you find one. Then, if you make it to the end of the loop, you can return 1 because you know you didn't find any characters that weren't letters.
Here is a one-liner function that evaluates to 1 if the string only contains alphabetic characters, and 0 otherwise:
#include <string.h>
int strisalpha(const char *str) {
return str[strspn(str, "abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ")] == '\0';
}
And here is a more classic approach with isalpha():
#include <ctype.h>
int strisalpha(const char *str) {
while (*str) {
if (!isalpha((unsigned char)*str++))
return 0;
}
return 1;
}

C: Output with symbols in Caesar’s cipher encrypts, WHY? pset2 cs50

This is the Caesar cipher encryption problem in pset2 of the CS50x course on edx.org.
I already solved this problem with another algorithm, but this was my first try and I'm still curious why all these symbols appear at the right side of the Caesar text.
ie. I enter the text "Testing" and the output is "Fqefuz�����w����l��B��" but the answer is correct without the symbols.
Can anyone explain that to me?
int main(int argc, string argv[])
{
bool keyOk = false;
int k = 0;
do
{
if(argc != 2) // Checking if the key was correctly entered.
{
printf("You should enter the key in one argument from"
" the prompt(i.e. './caesar <key>').\n");
return 1;
}
else
{
k = atoi(argv[1]); // Converting string to int.
keyOk = true; // Approving key.
}
}
while(keyOk == false);
string msg = GetString(); // Reading user input.
char caesarMsg[strlen(msg)];
for(int i=0, n = strlen(msg); i < n; i++)
{
if( (msg[i] >= 'a') && (msg[i] <= 'z') )
// Processing lower case characters
{
caesarMsg[i] = ((((msg[i] - 97) + k) % 26) + 97);
}
else if( (msg[i] >= 'A') && (msg[i] <= 'Z') )
// Processing upper case characters
{
caesarMsg[i] = ((((msg[i] - 65) + k) % 26) + 65);
}
else
{
caesarMsg[i] = msg[i];
}
}
printf("%s", caesarMsg);
printf("\n");
}
The root problem is C does not have a full, proper, or first-class "string" datatype. In C strings are in fact character arrays that are terminated with the NUL ('\0') (*) character.
Look at
string msg = GetString(); // Reading user input.
char caesarMsg[strlen(msg)];
This is equivalent to
char* msg = GetString(); /* User or library function defined elsewhere */
/* calculates the length of the string s, excluding the terminating null
byte ('\0') */
size_t len = strlen(msg);
char caesarMsg[len]; /* Create a character (byte) array of size `len` */
Hopefully this makes it clearer, why this section fails to work correctly. The variable len that I've added, is the length of the sequence of non-NUL characters in the string msg. So when you create the character array caesarMsg of length len, there is no room for the NUL character to be stored.
The for loop correctly executes, but the printf("%s", caesarMsg); will continue to print characters until it finds a NUL or crashes.
BTW you can reduce the two printf statements at the end into a single printf statement easily.
printf("%s\n", caesarMsg);
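As a sketch of the fix implied above, the version below writes into a buffer that has one extra byte and sets the terminator explicitly. caesar_encrypt is a hypothetical helper name; it assumes out has room for strlen(msg) + 1 bytes and that the character set is ASCII-compatible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the Caesar encryption of msg into out and
   null-terminates it.
   Assumption: out has room for strlen(msg) + 1 bytes. */
void caesar_encrypt(const char *msg, int k, char *out)
{
    size_t n = strlen(msg);

    k = ((k % 26) + 26) % 26; /* keep the key in 0..25 */
    for (size_t i = 0; i < n; i++)
    {
        if (msg[i] >= 'a' && msg[i] <= 'z')
        {
            out[i] = (char)((msg[i] - 'a' + k) % 26 + 'a');
        }
        else if (msg[i] >= 'A' && msg[i] <= 'Z')
        {
            out[i] = (char)((msg[i] - 'A' + k) % 26 + 'A');
        }
        else
        {
            out[i] = msg[i]; /* non-letters are copied unchanged */
        }
    }
    out[n] = '\0'; /* the terminator the original array had no room for */
}
```

In the original program this would be used as char caesarMsg[strlen(msg) + 1]; caesar_encrypt(msg, k, caesarMsg); after which printing with %s stops at the end of the text.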
Strings and character arrays are a frequent source of confusion to anyone new to C, and to some not-so-new to C. Some additional references:
I really recommend bookmarking the comp.lang.c FAQ.
I also strongly recommend that you either get your own copy of, or ensure you have access to, Kernighan and Ritchie's The C Programming Language, Second Edition (1988).
Rant: And whoever created the string typedef is evil / making a grave error, by misleading students into thinking C's strings are a "real" (or first-class) data type.
(*) NUL is different from NULL, because NULL (the null pointer) is a pointer value and so is the same size as other pointers, whereas NUL is the null character (and either the size of a char or int).

Resources