Implementation of a stream cipher in C

I want to implement the following function:
I am using the Mersenne-Twister-Algorithm (from Wikipedia) as my pseudorandom number generator.
It is a stream cipher
The pseudocode is: encrypted text = CLEARTEXT XOR STREAM; the 'stream' is defined as PSEUDORANDOM_NUMBER XOR KEY
I wrote the following function:
int encrypt(char clear[1000], char key[100], int lk/*length of the cleatext*/, int ls /*length of key*/) {
int a, i;
unsigned char result[1000];
char string[1000];
for (i = 0; i <= lk; i++) {
if (i+1-ls >= 0) { /*if the key is too short*/
a = mersenne_twister();
string[i]=key[i+1-ls]^a; /*XOR */
} else {
a=mersenne_twister();
string[i] = key[i]^a; /*XOR */
}
result[i] = clear[i]^string[i];
putchar(result[i]);
}
return 1;
}
But the function does not work properly; it returns (the putchar part) something that is not readable.
Where is my mistake? Or is the whole code wrong?

If you are trying to print a char xored with something else, you will really often get strange characters. For instance, xoring 'M' with 'P' will result in '\GS' (Group Separator character), which is not printable.

Don't print the result as a character. It's not a character: you've XOR'd a character with some pseudo-random byte. The result is a byte. It may, by chance, be printable, but, then again, it may not be. You should treat the result for what it is, a byte, and print it as such:
/* format a byte as 2 hex digits */
printf("%02X", result[i]);
I will also add a cautionary note: do not use this "cipher". The Mersenne twister is not a cryptographically secure random number generator and the resulting cipher won't be secure either.
If you want to study stream ciphers, start with the simple Vernam cipher then read about RC4. Both are easy to understand and simple to implement. As a result they are good for getting your feet wet so to speak.
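To make the XOR structure from the question concrete, here is a minimal sketch. It assumes a stand-in generator (`toy_prng_next`, a plain xorshift32) in place of the asker's `mersenne_twister()`, and both that helper and the `xor_stream` name are hypothetical. The key is cycled with `i % key_len`, which is one way to handle a key shorter than the text. Like the original, this is not secure.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in generator (xorshift32), used only so the sketch compiles without
   a Mersenne Twister implementation. NOT cryptographically secure.
   The state must be nonzero. */
static uint32_t toy_prng_next(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* XORs every byte of buf with (low byte of the generator) XOR (key byte).
   The key is reused cyclically via i % key_len, so key_len must be > 0.
   Calling the function a second time with the same seed and key restores
   the original bytes, because XOR is its own inverse. */
void xor_stream(unsigned char *buf, size_t len,
                const unsigned char *key, size_t key_len, uint32_t seed)
{
    uint32_t state = seed;
    for (size_t i = 0; i < len; i++) {
        unsigned char random_byte = (unsigned char)(toy_prng_next(&state) & 0xFFu);
        unsigned char stream = (unsigned char)(random_byte ^ key[i % key_len]);
        buf[i] ^= stream;
    }
}
```

The buffer holds raw bytes after the call, so it should be printed with something like `printf("%02X", buf[i])` rather than `putchar`.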

Related

+'0' wont give char value of int

I was trying to write this int-to-char program. The +'0' in the do-while loop won't convert the int value to ASCII, whereas the +'0' in main does. I have tried many statements, but it won't work in convert().
#include<stdio.h>
#include<string.h>
void convert(int input,char s[]);
void reverse(char s[]);
int main()
{
int input;
char string[5];
//prcharf("enter int\n");
printf("enter int\n");
scanf("%d",&input);
convert(input,string);
printf("Converted Input is : %s\n",string);
int i=54;
printf("%c\n",(i+'0')); //This give ascii char value of int
printf("out\n");
}
void convert(int input,char s[])
{
int sign,i=0;
char d;
if((sign=input)<0)
input=-input;
do
{
s[i++]='0'+input%10;//but this gives int only
} while((input/=10)>0);
if(sign<0)
s[i++]='-';
s[i]=EOF;
reverse(s);
}
void reverse(char s[])
{
int i,j;
char temp;
for(i=0,j=strlen(s)-1;i<j;i++,j--)
{
temp=s[i];
s[i]=s[j];
s[j]=temp;
}
}
Output screenshot
Code screenshot
The +'0' in the do while loop wont convert the int value to ascii
Your own screenshot shows otherwise (assuming an ASCII-based terminal).
Your code printed 56, so it printed the bytes 0x35 and 0x36, so string[0] and string[1] contain 0x35 and 0x36 respectively, and 0x35 and 0x36 are the ASCII encodings of 5 and 6 respectively.
You can also verify this by printing the elements of string individually.
for (int i=0; string[i]; ++i)
    printf("%02X ", (unsigned char)string[i]); /* cast avoids sign extension of negative chars */
printf("\n");
I tried your program and it is working for the most part. I get some goofy output because of this line:
s[i]=EOF;
EOF is a negative integer macro that represents "End Of File." Its actual value is implementation defined. It appears what you actually want is a null terminator:
s[i]='\0';
That will remove any goofy characters in the output.
I would also make that string in main a little bigger. No reason we couldn't use something like
char string[12];
I would use a bare minimum of 12 which will cover you to a 32 bit INT_MAX with sign.
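Putting the two fixes together (a null terminator and a 12-byte buffer), a conversion routine could look roughly like the sketch below. The name `int_to_str` is hypothetical, and the unsigned magnitude is an addition of mine so that negating the most negative int does not overflow.

```c
/* Hypothetical helper: writes the decimal form of value into out.
   out must hold at least 12 bytes: sign + 10 digits + '\0' for a 32-bit int. */
void int_to_str(int value, char out[12])
{
    /* Work on the magnitude as unsigned so that INT_MIN does not overflow. */
    unsigned int magnitude =
        (value < 0) ? 0u - (unsigned int)value : (unsigned int)value;
    int i = 0;

    /* Produce the digits in reverse order, least significant first. */
    do {
        out[i++] = (char)('0' + magnitude % 10);
    } while ((magnitude /= 10) > 0);

    if (value < 0)
        out[i++] = '-';
    out[i] = '\0'; /* null terminator, not EOF */

    /* Reverse in place so the most significant digit comes first. */
    for (int lo = 0, hi = i - 1; lo < hi; lo++, hi--) {
        char tmp = out[lo];
        out[lo] = out[hi];
        out[hi] = tmp;
    }
}
```

With a `char buf[12];`, calling `int_to_str(56, buf)` leaves the two characters '5' and '6' followed by the terminator in `buf`.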
EDIT
It appears (based on all the comments) you may be actually trying to make a program that simply outputs characters using numeric ascii values. What the convert function actually does is converts an integer to a string representation of that integer. For example:
int num = 123; /* Integer input */
char str_num[12] = "123"; /* char array output */
convert is basically a manual implementation of itoa.
If you are trying to simply output characters given ascii codes, this is a much simpler program. First, you should understand that this code here is a mis-interpretation of what convert was trying to do:
int i=54;
printf("%c\n",(i+'0'));
The point of adding '0' previously was to convert single-digit integers to their ASCII codes. For reference, consult an ASCII table. For example, if you wanted to convert the integer 4 to the character '4', you would add 4 to '0' (ASCII code 48) to get 52, which is the ASCII code for the character '4'. To print the character that is represented by an ASCII code, the solution is much more straightforward. As others have stated in the comments, char is essentially a numeric type already. This will give you the desired behavior:
int i = 102; /* The actual ascii value of 'f' */
printf("%c\n", i);
That works as written: the %c conversion takes an int argument, which printf converts to unsigned char before printing, so passing i directly is well defined. A cast to char is harmless but redundant, because the default argument promotions convert it straight back to int. If you want to make the intent explicit, you can still write:
printf("%c\n", (char) i);
So you can write the entire program in main since there is no need for the convert function:
#include <stdio.h>

int main(void)
{
    /* Make initialization a habit */
    int input = 0;
    /* Loop until we get a value between 0-127 */
    do {
        printf("enter int\n");
        if (scanf("%d", &input) != 1)
            return 1; /* non-numeric input or end of file: give up */
    } while (input < 0 || input > 127);
    printf("Converted Input is : %c\n", (char)input);
    return 0;
}
We don't want anything outside of 0-127. A char is one byte (8 bits on typical platforms), so it has 256 possible values; when plain char is signed, it typically spans -128 to 127. If you wanted a literal interpretation of higher values, you could use unsigned char (0-255). This is undesirable on a Linux terminal, which is likely expecting UTF-8. Values above 127 will represent portions of multi-byte characters. If you wanted to support this, you would need a char[] and the code would become a lot more complex.

C: How to add char to chars, and when the max char is reached have it loop back to 'a'?

I am creating a simple encryption program.
I am adding chars to chars to create a new char.
As of now the new 'char' is often represented by a '?'.
My assumption was that the char variable has a max sum and once it was passed it looped back to 0.
assumed logic:
if char a == 1 && char z == 255
then 256 should == a.
This does not appear to be the case.
This snippet adds a char to a char.
It often prints out something like:
for (int i = 0; i < half; ++i) {
halfM1[i] = halfM1[i] + halfP1[i];
halfM2[i] = halfM2[i] + halfP2[(half + i)];
}
printf("\n%s\n", halfM1 );
printf("%s\n", halfM2);
Returns:
a???
3d??
This snippet removes the added char and the strings go back to normal.
for (int i = 0; i < half; ++i) {
halfM1[i] = halfM1[i] - halfP1[i];
halfM2[i] = halfM2[i] - halfP2[(half + i)];
}
printf("\n%s\n", halfM1 );
printf("%s\n", halfM2);
returns:
messagepart1
messagepart2
The code technically works, but I would like the encryption to be in chars.
In case you are wondering why 'half' is everywhere:
the message and key are split in half so that the first and second halves of the message are encrypted separately.
First of all, there is no guaranteed "wraparound" for plain char. On x86, plain char is a signed type. Both operands are promoted to int before the addition, and converting an out-of-range sum back to a signed char gives an implementation-defined result rather than a guaranteed wrap to 0. Additionally, the range of char can be -128 ... 127, or even something else.
For cryptographic purposes you'd want to use unsigned chars, or even better, raw octets with uint8_t (<stdint.h>).
The second problem is that you're printing with %s. One of the 256 possible resulting characters is \0. If this gets into the resulting string, it will terminate the string prematurely. Instead of using %s, you should output it with fwrite(halfM1, buffer_size, 1, stdout). Of course the problem is that the output is still some binary garbage. For this purpose many Unix encryption programs write to a file, or have an option to output an ASCII-armoured file. A simple ASCII armouring would be to output as hex instead of binary.
The third is that there is an operation that is much better than addition/subtraction for cryptographic purposes: XOR, or halfM1[i] = halfM1[i] ^ halfP1[i]; - the beauty of which is that it is its own inverse!
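The three points above can be combined into a short sketch: bytes are held as `uint8_t`, combined with XOR, and rendered as hex for display. The function names `xor_bytes` and `hex_encode` are hypothetical and only illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* XORs len bytes of data with the corresponding bytes of pad, in place.
   Applying the same call twice restores the original data. */
void xor_bytes(uint8_t *data, const uint8_t *pad, size_t len)
{
    for (size_t i = 0; i < len; i++)
        data[i] ^= pad[i];
}

/* Writes two lowercase hex digits per byte into out, followed by '\0'.
   out must have room for 2 * len + 1 characters. */
void hex_encode(const uint8_t *data, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned int)data[i]);
    out[2 * len] = '\0';
}
```

Because the hex string contains only printable characters, it is safe to print with %s even when the underlying bytes include 0.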

Assembly Vigenère cipher program

I'm not really sure how to approach this problem:
For better frequency characteristics the keyword should not have any repeated letters. Also, if it contains the letter A the encrypted letter will be the same as the plaintext, although this is not necessarily a bad thing.
To implement this algorithm with a pencil and paper, many descriptions ask you to build a Vigenère Square. However this is not really necessary when you are using a computer to do the encoding and decoding.
Essentially the keyword is written repeatedly over and over above the plaintext.
Suppose the keyword is CRYPTOGRAM.
CRYPTOGRAMCRYPTOGRAMCRYPTOGRAMCRYPTOGRAMCRYPTOGRAMCRYPTOGRAMCRYPTOGR
WEHAVEBEENBETRAYEDALLISDISCOVEREDFLYATONCEMEETUSBYTHEOLDTREEATNINEPM
Consider that the letters are numbered 0 to 25. The letter on the top determines which Caesar-cypher to use for the letter below. Thus C means shift the alphabet by 2, A means shift by 0, and so on. In mathematical terms, we are adding the two letters together modulo 26. (The square was used because the concept of modular arithmetic was not generally understood by soldiers in 1553.)
To decrypt the message, the same operation is performed in reverse. That is, the value of the keyword letter is subtracted rather than added.
Step 3. What your code should do
Your code should use STDIN and STDOUT for input and output. (This is the default.) Use redirection on the command line to read from a file and write to a file.
Your code should open a file, read it character by character and save it into an array.
When you get to the end of the file you should encode the contents of the array with a Vigenère cipher using the keyword CRYPTOGRAM, then print it out.
Maintain the distinction between upper-case and lower-case letters, and do not modify non-alphabetic characters. This is not very good for the security of your message, but the result will look neater.
This program should use glibc functions. In addition to printf(), you may need getchar() and putchar().
Assume that the input file contains just ASCII text. Don't worry about what happens with non-text files.
Once the encoder is working, build a decoder by duplicating the code and changing the addition to a subtraction.
If you use printf() to output the array, remember that a null termination is required on a string.
Start by breaking the problem down in smaller parts like "read input from stdin", "encrypt a string", "print output to stdout".
You need to be familiar with the modulus operator, because you will need to use it more than once in your program.
If you are having a hard time, here is one way to break down the problem
(there are other ways that are just as good):
/* For printf, getchar etc: */
#include <stdio.h>
/* For isalpha, isupper, islower etc: */
#include <ctype.h>
/* For strlen: */
#include <string.h>
char encryptChar(char ch, char cypher) {
int shiftBy = cypher - 'A';
char encryptedLetter;
/* There are 3 cases: uppercase, lowercase, other char */
if (isupper(ch)) {
/* add code to encrypt uppercase char */
} else if (islower(ch)) {
/* add code to encrypt lowercase char */
} else {
/* Other characters stay as they are */
encryptedLetter = ch;
}
return encryptedLetter;
}
char *cypherString = "CRYPTOGRAM";
int main(int argc, char **argv) {
int ch;
int cypherStringLength = strlen(cypherString);
int counter = 0;
char cypher;
while ((ch = getchar()) != EOF) {
cypher = cypherString[counter%cypherStringLength];
ch = encryptChar(ch, cypher);
/* Add code to print the character */
counter++;
}
return 0;
}
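For the two branches left open in the skeleton, the modular arithmetic could be sketched as follows. This is one possible shape rather than the expected solution: the function name `vigenere_shift` and its `direction` parameter are hypothetical, and the code assumes ASCII, where the letters of each case are contiguous.

```c
#include <ctype.h>

/* Shifts one character by the amount given by key_letter ('A'..'Z').
   direction is +1 to encrypt and -1 to decrypt.
   Case is preserved and non-alphabetic characters are returned unchanged. */
char vigenere_shift(char ch, char key_letter, int direction)
{
    int shift = key_letter - 'A';          /* 'A' -> 0, 'C' -> 2, ... */
    unsigned char uch = (unsigned char)ch; /* ctype functions need a non-negative value */

    /* Adding 26 before the modulus keeps the result non-negative when decrypting. */
    if (isupper(uch))
        return (char)('A' + (uch - 'A' + direction * shift + 26) % 26);
    if (islower(uch))
        return (char)('a' + (uch - 'a' + direction * shift + 26) % 26);
    return ch;
}
```

With the keyword CRYPTOGRAM, the first plaintext letter W is paired with C, so `vigenere_shift('W', 'C', 1)` yields 'Y', and `vigenere_shift('Y', 'C', -1)` recovers 'W'.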

RLE algorithm decoding - escape character

I have to write an RLE algorithm (using an escape character) that is able to encode and decode any file.
I did the first part (encoding), and before even beginning the decoding part I can already see some problems. Example:
If I have a file that contains: AAAAABBBBBBCCCCCDDD
The encode function that I wrote gives output like this: QA5QB6QC5DDD
But I have to work with real files, which contain not just letters but also numbers and symbols.
So, after encoding, what should I do if the encoded file contains something like QA55?
Should the output be AAAAA5 or fifty-five A's?
Another example: if I read QA5,
what is the final output? AAAAA or just QA5?
I mean that I don't know how to recognize whether the block of characters that I'm reading is encoded or not.
This is my encode function:
void encode (FILE *source, FILE *destination) {
char currentChar;
char seqChar = 'Z'; //could be any character
int count = 0;
while(1) {
int endFile = (fread(&currentChar, sizeof(char),1, source) == 0);
if(endFile || seqChar!=currentChar) {
if(count>3) {
char escape = 'Q';
int k = count;
char str[100];
int digits = sprintf(str,"%d",count);
fwrite(&escape, sizeof(escape), 1, destination);
fwrite(&seqChar, sizeof(escape),1, destination);
fwrite(&str, sizeof(char), digits, destination);
}
else {
for(int i=0;i<count;i++)
fwrite(&seqChar,sizeof(char),1,destination);
}
seqChar = currentChar;
count =1;
}
else count++;
if(endFile)
break;
}
fclose(source);
fclose(destination);
}
I hope you know what I mean.
I am fairly sure that I have to invent some convention in order to solve this problem, but I cannot figure out which one.
How do you place a literal backslash in a C string? How do you write a percent sign with printf? You have to find an escape sequence that represents the escape character itself.
Your escape character is Q (strange choice, by the way). Then Q + character + count could mean: that character, count times. And QQ could mean the escape character itself.
You'll see that you cannot compress sequences of Q's that way, because Q already means "Q". There are two possibilities to fix this: Get rid of the QQ special meaning and always encode "Q" as a sequence of one "Q", ie. QQ1. Or place the count in front of the character to encode and have Q not be a valid count.
(By the way, that's not so much a C question, it's more about the design of your compression algorithm. You might want to re-tag it and remove the code.)
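As an illustration of the first option (always encode the escape character as a run), here is an in-memory sketch. It makes one assumption that goes beyond the answer: the count is stored as a single raw byte (1 to 255) instead of decimal text, which also removes the QA55 ambiguity because the count has a fixed width. The names `rle_encode`, `rle_decode` and `RLE_ESCAPE` are hypothetical.

```c
#include <stddef.h>

#define RLE_ESCAPE 'Q'

/* Encodes n bytes from in to out and returns the number of bytes written.
   out needs room for 3 * n bytes in the worst case (isolated escape bytes). */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        unsigned char c = in[i];
        size_t run = 1;
        /* Measure the run, capped at 255 so the count fits in one byte. */
        while (i + run < n && in[i + run] == c && run < 255)
            run++;
        if (run > 3 || c == RLE_ESCAPE) {
            /* Escaped triple: escape byte, character, count byte. */
            out[o++] = RLE_ESCAPE;
            out[o++] = c;
            out[o++] = (unsigned char)run;
        } else {
            /* Short runs of ordinary characters are copied verbatim. */
            for (size_t k = 0; k < run; k++)
                out[o++] = c;
        }
        i += run;
    }
    return o;
}

/* Decodes n bytes from in to out and returns the number of bytes written,
   or (size_t)-1 if the input ends in the middle of an escaped triple.
   out must be large enough for the decoded data. */
size_t rle_decode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (in[i] == RLE_ESCAPE) {
            if (n - i < 3)
                return (size_t)-1;
            /* An escape byte is always followed by a character and a count. */
            for (unsigned char k = 0; k < in[i + 2]; k++)
                out[o++] = in[i + 1];
            i += 3;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;
}
```

Since every Q in the encoded stream starts a triple and a literal Q in the input is itself written as a triple, the decoder never has to guess whether a Q is data or an escape.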

Best way to do binary arithmetic in C?

I am learning C and writing a simple program that will take 2 string values assumed to each be binary numbers and perform an arithmetic operation according to user selection:
Add the two values,
Subtract input 2 from input 1, or
Multiply the two values.
My implementation assumes each character in the string is a binary bit, e.g. char bin5[] = "0101";, but it seems too naive an approach to parse through the string a character at a time. Ideally, I would want to work with the binary values directly.
What is the most efficient way to do this in C? Is there a better way to treat the input as binary values rather than scanf() and get each bit from the string?
I did some research but I didn't find any approach that was obviously better from the perspective of a beginner. Any suggestions would be appreciated!
Advice:
There's not much that's obviously better than marching through the string a character at a time and making sure the user entered only ones and zeros. Keep in mind that even though you could write a really fast assembly routine if you assume everything is 1 or 0, you don't really want to do that. The user could enter anything, and you'd like to be able to tell them if they screwed up or not.
It's true that this seems mind-bogglingly slow compared to the couple cycles it probably takes to add the actual numbers, but does it really matter if you get your answer in a nanosecond or a millisecond? Humans can only detect 30 milliseconds of latency anyway.
Finally, it already takes far longer to get input from the user and write output to the screen than it does to parse the string or add the numbers, so your algorithm is hardly the bottleneck here. Save your fancy optimizations for things that are actually computationally intensive :-).
What you should focus on here is making the task less manpower-intensive. And, it turns out someone already did that for you.
Solution:
Take a look at the strtol() manpage:
long strtol(const char *nptr, char **endptr, int base);
This will let you convert a string (nptr) in any base to a long. It checks errors, too. Sample usage for converting a binary string:
#include <stdlib.h>
char buf[MAX_BUF];
get_some_input(buf);
char *err;
long number = strtol(buf, &err, 2);
if (err == buf || *err != '\0') { /* nothing parsed, or trailing non-binary characters */
// bad input: try again?
} else {
// number is now a long converted from a valid binary string.
}
Supplying base 2 tells strtol to convert binary literals.
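A slightly fuller version of the same idea, wrapped in a function, might look like the sketch below. The name `parse_binary` is hypothetical; the `strtol`, `errno` and `ERANGE` usage is standard C. Note that `strtol` itself accepts leading whitespace and an optional sign, so "-11" parses as -3.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses text as a base-2 number.
   Returns 1 and stores the value in *out on success, 0 on any error. */
int parse_binary(const char *text, long *out)
{
    char *end;
    errno = 0;
    long value = strtol(text, &end, 2);

    if (end == text)      /* no digits were consumed (e.g. empty string) */
        return 0;
    if (*end != '\0')     /* trailing characters that are not binary digits */
        return 0;
    if (errno == ERANGE)  /* value does not fit in a long */
        return 0;

    *out = value;
    return 1;
}
```

Once both inputs are parsed this way, the add, subtract and multiply operations are ordinary integer arithmetic on the two long values.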
First off, I do recommend that you use something like strtol, as recommended by tgamblin;
it's better to use what the library gives you than to reinvent the wheel over and over again.
But since you are learning C, I wrote a little version without strtol.
It's neither fast nor safe, but I played a little with bit manipulation as an example.
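#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int data = 0;
    const char str[] = "1001";
    size_t len = strlen(str);
    /* Walk from the last character (least significant bit) to the first */
    for (size_t i = 0; i < len; i++) {
        char c = str[len - 1 - i];
        if (c != '0' && c != '1')
            break; /* stop at the first character that is not a bit */
        data += (unsigned int)(c - '0') << i;
    }
    printf("data %u\n", data);
    return 0;
}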
In order to get the best performance, you need to distinguish between trusted and untrusted input to your functions.
For example, a function like getBinNum() which accepts input from the user should be checked for valid characters and compressed to remove leading zeroes. First, we'll show a general purpose in-place compression function:
// General purpose compression removes leading zeroes.
void compBinNum (char *num) {
char *src, *dst;
// Find first non-'0' and move chars if there are leading '0' chars.
for (src = dst = num; *src == '0'; src++);
if (src != dst) {
while (*src != '\0')
*dst++ = *src++;
*dst = '\0';
}
// Make zero if we removed the last zero.
if (*num == '\0')
strcpy (num, "0");
}
Then provide a checker function that returns either the passed in value, or NULL if it was invalid:
// Check untested number, return NULL if bad.
char *checkBinNum (char *num) {
char *ptr;
// Check for valid number.
for (ptr = num; *ptr != '\0'; ptr++)
if ((*ptr != '1') && (*ptr != '0'))
return NULL;
return num;
}
Then the input function itself:
#define MAXBIN 256
// Get number from (untrusted) user, return NULL if bad.
char *getBinNum (char *prompt) {
char *num, *ptr;
// Allocate space for the number.
if ((num = malloc (MAXBIN)) == NULL)
return NULL;
// Get the number from the user.
printf ("%s: ", prompt);
if (fgets (num, MAXBIN, stdin) == NULL) {
free (num);
return NULL;
}
// Remove newline if there.
if (num[strlen (num) - 1] == '\n')
num[strlen (num) - 1] = '\0';
// Check for valid number then compress.
if (checkBinNum (num) == NULL) {
free (num);
return NULL;
}
compBinNum (num);
return num;
}
Other functions to add or multiply should be written to assume the input is already valid since it will have been created by one of the functions in this library. I won't provide the code for them since it's not relevant to the question:
char *addBinNum (char *num1, char *num2) {...}
char *mulBinNum (char *num1, char *num2) {...}
If the user chooses to source their data from somewhere other than getBinNum(), you could allow them to call checkBinNum() to validate it.
If you were really paranoid, you could check every number passed in to your routines and act accordingly (return NULL), but that would require relatively expensive checks that aren't necessary.
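For illustration only, one way the omitted addBinNum could be filled in is sketched below. The answer does not show its implementation, so the name `add_bin_str` and every detail here are assumptions. It trusts its inputs to be valid, non-empty binary strings, as the design above intends.

```c
#include <stdlib.h>
#include <string.h>

/* Adds two validated binary strings digit by digit, from the right.
   Returns a malloc'd string that the caller must free, or NULL if
   allocation fails. */
char *add_bin_str(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t len = (la > lb ? la : lb) + 1; /* one extra digit for a final carry */
    char *sum = malloc(len + 1);
    if (sum == NULL)
        return NULL;
    sum[len] = '\0';

    int carry = 0;
    for (size_t k = 0; k < len; k++) {
        /* Take digits from the right; treat missing digits as 0. */
        int da = (k < la) ? a[la - 1 - k] - '0' : 0;
        int db = (k < lb) ? b[lb - 1 - k] - '0' : 0;
        int total = da + db + carry;
        sum[len - 1 - k] = (char)('0' + (total & 1));
        carry = total >> 1;
    }

    /* Drop the leading zero if no carry reached the extra digit. */
    if (sum[0] == '0' && len > 1)
        memmove(sum, sum + 1, len); /* remaining digits plus the '\0' */
    return sum;
}
```

Working on strings this way places no limit on the number of bits, which is the main reason to prefer it over converting to a long first.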
Wouldn't it be easier to parse the strings into integers, and then perform your maths on the integers?
I'm assuming this is a school assignment, but I'm upvoting you because you appear to be giving it a good effort.
Assuming that a string is a binary number simply because it consists only of digits from the set {0,1} is dangerous. For example, when your input is "11", the user may have meant eleven in decimal, not three in binary. It is this kind of carelessness that gives rise to horrible bugs. Your input is ambiguously incomplete and you should really request that the user specifies the base too.

Resources