What is the correct way to check whether a char* array is AES-128/192/256 ciphertext or unencrypted text?
Please don't use OpenSSL.
tl;dr: If you want a solution that works 100% of the time, it's completely impossible.
Long version:
First, stop thinking "binary vs text". That's not how it works.
AES ciphertext is certainly binary data in the computer, but so is "text".
If you want to distinguish AES ciphertext from other, non-AES data, it's impossible because:
AES ciphertext can be unreadable garbage, but it can be a poem by Goethe too.
Every possible piece of data (of a suitable length) is an AES ciphertext of some plaintext under some key.
Non-AES data can be just as much unreadable garbage as AES data. (Pseudo-)random bytes are one example: AES with suitable input is an excellent pseudorandom byte generator.
The other way round: if you want to distinguish proper, sane human-readable text from other things, it's impossible because there is no rule that defines what "text" is on your computer.
If you want to search for English letters, consider the following points:
As said above, readable words can be AES ciphertext too.
English letters? What about German, Japanese, ancient Greek, Russian...?
How are letters mapped to bytes? ISO-8859-1, UTF-16LE with BOM, EBCDIC, custom mappings...?
What about file formats like MS Word *.doc? They contain text too, yet the file is binary "garbage" data. Or compression algorithms: Gzip, Rar etc. don't make the text any less sane.
If you finally manage to extract proper letters, how do you know that it isn't something like "miodsjoiusdJf"? Recognizing words and their meaning is a very big topic on its own, and nearly everything in it is guesswork.
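To illustrate why any such check is only guesswork, here is a minimal Python sketch of a common heuristic: measuring byte entropy. The function name `byte_entropy` and the sample data are my own assumptions for illustration, and `os.urandom` merely stands in for ciphertext. A high score only says "looks random"; compressed files score just as high, and a short or specially chosen ciphertext can score low.

```python
import math
import os
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Sample "text": plain ASCII, repeated so the estimate is stable.
text = ("Ueber allen Gipfeln ist Ruh, in allen Wipfeln spuerest du "
        "kaum einen Hauch. " * 200).encode("ascii")

# Assumption: random bytes are used here as a stand-in for AES ciphertext.
random_like = os.urandom(len(text))

print("text entropy:       ", round(byte_entropy(text), 2))
print("random-like entropy:", round(byte_entropy(random_like), 2))
```

The heuristic separates these two samples, but it cannot tell AES output from gzip output or from any other random-looking bytes, which is exactly the point made above.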
Related
I need to send encrypted data through a socket in C. So I build my payload like this:
message_type|flags|message_1|message_2
I want to encrypt the message_x with AES.
For security reasons, the ciphertext must have the same length as message_x. That's why I am using OpenSSL's AES in CTR mode.
But there's the problem. The output is completely awful, like �3Y������ȳ�_�[��. And there are characters like [, _, etc. Every character could appear in the output. So the receiver won't be able to unserialize the payload because, perhaps, the delimiter (| in the example) has been generated by AES...
I saw base64 conversion could do the trick. But the conversion changes the length of the ciphertext (like explained in this article).
Does anyone have any ideas?
I'm trying to implement this algorithm; you can find a good description here:
LINK - Chapter 11.6 - Playfair Cipher
I have some doubts about the decryption phase.
After I follow the instructions for encrypting the text, I get:
35VRX2NZDCR25885
Then, to decrypt, I follow the instructions in the opposite direction, but I'm stuck at the point where I get the decrypted message as follows:
LETUSMEETATNOON
How could I pass from "LETUSMEETATNOON" to "LET US MEET AT NOON"?
Should I treat spaces in a different way?
Spaces are not allowed in either the plaintext or the ciphertext of your cryptosystem as it is defined in the attached document.
You could use a larger definition matrix and add all the useful symbols you need, probably . , ; - ? ! : ' \ / and maybe a newline.
There is one more approach. It is a general issue: How to add spaces to a text to get human readable (English) sentences? How to recognize a valid English sentence? This is a very hard problem of mathematical linguistics, which has not been solved yet.
In your case you could omit syntactical analysis and check the validity of words only. You could easily check all possible splits and check if all resulting words are valid English words. All you need is a good English dictionary (list of all English words), which can be found for example on Linux in folders /usr/share/dict/ or /var/lib/dict/ and many others can be downloaded from the Internet.
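The "check all possible splits" idea can be sketched in Python as follows. The function name `segmentations` and the tiny `WORDS` set are assumptions made for illustration; in practice you would load a real word list (for example from a file under /usr/share/dict/, if your system has one).

```python
def segmentations(text, words):
    """Return every way to split text into a sequence of dictionary words."""
    if not text:
        return [[]]  # exactly one way to split the empty string
    results = []
    for i in range(1, len(text) + 1):
        head = text[:i]
        if head in words:
            # head is a valid word; try to split the remainder recursively
            for rest in segmentations(text[i:], words):
                results.append([head] + rest)
    return results

# Hypothetical mini-dictionary; a real one would hold many thousands of words.
WORDS = {"LET", "US", "MEET", "ME", "AT", "A", "NO", "ON", "NOON", "TAT"}

for split in segmentations("LETUSMEETATNOON", WORDS):
    print(" ".join(split))
```

Note that the result is a list: a dictionary check alone often yields more than one valid split (here both "NO ON" and "NOON" are accepted), so choosing the intended sentence still needs a human or some additional heuristic. For long inputs the plain recursion should be memoized.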
On the Playfair algorithm:
Do not use it if you need any real security: it can be broken quite easily using frequency analysis of letter pairs (digrams).
As we know, Hill cipher is a classic cipher in cryptography and is mostly used for encrypting text. I need to encrypt a file (such as .doc, .ppt, .jpeg, etc), and not just the contents of file. I already searched on the Internet, but I didn't find much research that focuses on file encryption.
What I found: encrypting the text content of a .txt file doesn't encrypt the .txt file itself.
Using Java, .NET or Python (pick one or more), how can I implement the Hill cipher to encrypt files as I explained above?
As a note, this question is not for my homework or assignment. I am just confused and curious about how one can implement the Hill Cipher to encrypt a file. Thank you.
The Hill cipher, like most classical ciphers from the pre-computer era, was traditionally used to only encrypt letters: that is, the valid inputs would consist only of the 26 letters from A to Z (and, in some variants, possibly a few extra symbols to make the alphabet size a prime number).
That said, there's no reason why you couldn't use a variant of the Hill cipher with, say, an alphabet size of 256, allowing you to directly encrypt input consisting of arbitrary bytes.
For a key, you'd need a random matrix that is invertible modulo 256, that is, a matrix consisting of random values from 0 to 255 chosen such that its determinant is an odd number. An easy way to generate such matrices is to just pick the matrix elements uniformly at random, calculate the determinant, and start over if it happens to be even. On average, you'll need about three tries to succeed, since a uniformly random matrix has an odd determinant with probability between roughly 0.29 and 0.38 for sizes of 2×2 and up. Of course, for decryption, you'll also need to actually calculate the inverse matrix.
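As a rough Python sketch of this byte-oriented variant (a learning exercise only, not something to protect real files with): the names `random_key`, `inverse_key` and `transform`, the block size `N = 3`, and the requirement that the caller pads the input to a multiple of the block size are all my own assumptions. The inverse is computed via the adjugate matrix, which is fine for small matrices.

```python
import secrets

MOD = 256
N = 3  # block size; an arbitrary choice for this sketch

def minor(m, row, col):
    """Matrix m with the given row and column removed."""
    return [[v for j, v in enumerate(r) if j != col]
            for i, r in enumerate(m) if i != row]

def det(m):
    """Integer determinant by cofactor expansion (only for small matrices)."""
    if not m:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def random_key(n=N):
    """Draw random matrices until the determinant is odd (invertible mod 256)."""
    while True:
        key = [[secrets.randbelow(MOD) for _ in range(n)] for _ in range(n)]
        if det(key) % 2 == 1:
            return key

def inverse_key(key):
    """Inverse mod 256 via the adjugate: K^-1 = det(K)^-1 * adj(K)."""
    n = len(key)
    det_inv = pow(det(key) % MOD, -1, MOD)  # modular inverse, Python 3.8+
    # adj(K)[i][j] is the (j, i) cofactor, hence the swapped indices below.
    return [[(det_inv * (-1) ** (i + j) * det(minor(key, j, i))) % MOD
             for j in range(n)] for i in range(n)]

def transform(matrix, data):
    """Multiply each n-byte block of data by the matrix, mod 256."""
    n = len(matrix)
    if len(data) % n != 0:
        raise ValueError("data length must be a multiple of the block size")
    out = bytearray()
    for k in range(0, len(data), n):
        block = data[k:k + n]
        out.extend(sum(matrix[i][j] * block[j] for j in range(n)) % MOD
                   for i in range(n))
    return bytes(out)

key = random_key()
plain = b"any bytes"                      # 9 bytes, a multiple of N
cipher = transform(key, plain)            # encrypt
recovered = transform(inverse_key(key), cipher)  # decrypt
print(recovered == plain)
```

Because the cipher works on raw bytes, the same functions apply to the contents of a .doc or .jpeg file read in binary mode; some padding convention is needed for files whose size is not a multiple of the block size.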
All that said, it's worth noting that the Hill cipher, on its own, is very easy to break. Indeed, even its inventor Lester S. Hill realized this, and only recommended it for use in combination with a substitution cipher, in what we might today consider a primitive substitution-permutation network.
Of course, nowadays we have access to much more efficient and secure ciphers, such as, say, AES. For any practical encryption tasks (as opposed to just learning exercises), you should use those rather than trying to develop your own.
Good Afternoon all,
I am working on RSA encryption and decryption. For more security I am also using padding in the ciphertext. For different inputs (e.g. amit), I am getting output of different lengths, like:
plain text - amit
cipher text - 10001123A234A987A765A
My problem is: for a long plaintext, my algorithm generates a very large ciphertext, and I think
it is a waste of resources to keep such a long string in the database.
Is there any way to store the ciphertext in a compact form and convert it back to the real ciphertext when I need it?
In order for the algorithm to be encryption and not just hashing, it must be reversible. To be reversible, the output must contain as much information as the input, and so is unlikely to be significantly shorter.
You may compress the data before encryption. There's not a lot else you can do unless you're willing to give up the ability to recover your original text from the ciphertext.
There are a couple of possibilities:
Change your encryption scheme: there are schemes where the ciphertext is the same size as the input.
Compress your data before you encrypt. This will be effective only if you have a large block of text to encrypt, and there is the additional overhead of decompressing after you decrypt.
This doesn't apply to RSA specifically, but: any secure cipher will give output that is close to indistinguishable from a random bit pattern. A random bit pattern has, by definition, maximum information-theoretic entropy, since for each bit, both 0 and 1 are equally likely.
Now, you want a lossless compression scheme, since you need to be able to decompress to the exact data you originally compressed. An optimal compression scheme will maximize the entropy of its output. However, we know that the output of our cipher already has maximum entropy, so we can't possibly increase the entropy.
And thus, trying to compress encrypted data is useless.
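A small Python experiment makes the argument tangible. This is only a sketch: `os.urandom` is used as a stand-in for the output of a secure cipher (an assumption, since no cipher is actually run here), and the sample plaintext is made up.

```python
import os
import zlib

# Highly redundant sample plaintext.
plaintext = ("The quick brown fox jumps over the lazy dog. " * 100).encode("ascii")

# Assumption: random bytes model what a secure cipher would output.
ciphertext_like = os.urandom(len(plaintext))

compressed_plain = zlib.compress(plaintext, 9)
compressed_cipher = zlib.compress(ciphertext_like, 9)

print("plaintext:      ", len(plaintext), "->", len(compressed_plain))
print("ciphertext-like:", len(ciphertext_like), "->", len(compressed_cipher))
```

The redundant plaintext shrinks dramatically, while the random-looking data does not shrink at all and typically grows by a few bytes of container overhead. That is why compression, if used, belongs before encryption.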
Note: Depending on your encryption method, compression might be possible, for example, when using a block cipher in ECB mode. RSA is a completely different beast altogether though, and, well, compressing won't do anything (except quite possibly make your final output bigger).
[Edit] Also, the length of your RSA ciphertext will be on the order of log n bits, where n is your public modulus. This is the reason that, especially for small plaintexts, public key crypto is extremely 'wasteful'. You normally employ RSA to set up a (smaller, e.g. 128-bit) symmetric key between two parties and then encrypt your data with a symmetric key algorithm such as AES. AES has a block size of 128 bits, so if you do straightforward block-mode encryption of your data, the maximum 'overhead' you incur is the padding, which is at most one block (128 bits).
Erm... you wrote in a comment here that you apply RSA encryption to each single character:
i am using rsa- it perform over
numbers to convert amit in cipher text
first i do a->97 m->109 i->105..and
then apply rsa over 97 ,109 ... then i
get different integers for 109, 105 or
... i joined that all as a string...
A piece of good advice: don't do that, since you will lose the security of RSA.
If you use RSA in this way, your scheme becomes a substitution cipher (with only one substitution alphabet). Given a reasonably long ciphertext or a reasonable number of ciphertexts, this scheme can be broken by analyzing the frequency of the ciphertext characters.
See RSAES-OAEP for a padding scheme to apply to your plaintext before encryption.
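To see why per-character RSA degenerates into a substitution cipher, here is a toy Python sketch. The tiny parameters (p = 61, q = 53, e = 17) and the function names are assumptions chosen purely for illustration; they are the usual textbook values and offer no security at all.

```python
from collections import Counter

# Toy textbook-RSA parameters (illustrative only).
P, Q = 61, 53
N = P * Q                           # modulus, 3233
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, Python 3.8+

def encrypt_per_char(text):
    """Encrypt every character separately, as described in the comment."""
    return [pow(ord(ch), E, N) for ch in text]

def decrypt_per_char(numbers):
    return "".join(chr(pow(c, D, N)) for c in numbers)

message = "attack at dawn at that gate"
cipher = encrypt_per_char(message)

# The same letter always maps to the same number, so the frequency
# profile of the ciphertext mirrors that of the plaintext.
print(Counter(message).most_common(3))
print(Counter(cipher).most_common(3))
```

An attacker does not need the private key here: counting the most frequent numbers and matching them against typical letter frequencies is enough. Randomized padding such as OAEP, applied to whole messages rather than single characters, removes this one-to-one mapping.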
I am creating an API in PHP, and one of the parameters that it will accept is an md5 encrypted value. I don't have much knowledge of different programming languages or of MD5. So my basic question is: if I am accepting md5 encrypted values, will the value remain the same when it is generated from any programming language like .NET, Java, Perl, Ruby... etc.?
Or there would be some limitation or validations for it.
Yes, a correct implementation of md5 will produce the same result, otherwise md5 would not be useful as a checksum. Differences may come from encoding and byte order. You must be sure that the text is encoded to exactly the same sequence of bytes.
It will, but there's a but.
It will because it's spec'd to reliably produce the same result given the same series of bytes. The point is that we can then compare those results to check that the bytes haven't changed, or perhaps digitally sign only the MD5 result rather than signing the entire source.
The but is that a common source of bugs is making assumptions about how strings are encoded. MD5 works on bytes, not characters, so if we're hashing a string, we're really hashing a particular encoding of that string. Some languages (and more so, some runtimes) favour particular encodings, and some programmers are used to making assumptions about that encoding. Worse yet, some specs make assumptions about encodings. This can be a cause of bugs where two different implementations will produce different MD5 hashes for the same string. This is especially so in cases where characters are outside of the range U+0020 to U+007F (and since U+007F is a control character, that one has its own issues).
All this applies to other cryptographic hashes, such as the SHA- family of hashes.
Yes. MD5 isn't an encryption function, it's a hash function that uses a specific algorithm.
Yes, md5 hashes will always be the same regardless of their origin - as long as the underlying algorithm is correctly implemented.
A vital point of cryptographic hash functions, such as MD5, is that they always produce the same value for the same input.
However, it does require you to encode the input data into a sequence of bytes (or bits) the same way. For instance, there are many ways to encode a string.
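The encoding pitfall is easy to reproduce. The following Python sketch (the helper name `md5_hex` is my own) hashes the same strings under different encodings; any correct MD5 implementation in another language should agree as long as it is fed the same bytes.

```python
import hashlib

def md5_hex(text, encoding):
    """MD5 of the bytes obtained by encoding text with the given encoding."""
    return hashlib.md5(text.encode(encoding)).hexdigest()

# Pure ASCII: UTF-8 and Latin-1 produce identical bytes, so identical hashes.
print(md5_hex("hello", "utf-8"))
print(md5_hex("hello", "latin-1"))

# A non-ASCII character is encoded differently, so the hashes differ.
print(md5_hex("héllo", "utf-8"))
print(md5_hex("héllo", "latin-1"))

# UTF-16 changes the bytes even for plain ASCII text.
print(md5_hex("hello", "utf-16-le"))
```

For an API, one reasonable design choice is to state the expected encoding explicitly (for example UTF-8) in the documentation, so that clients in every language hash the same byte sequence.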