How to manage scatterlists for Linux crypto API use in C?

I need to encrypt or decrypt some data at a time. Extra padding bytes may have to be added to the target data bytes at the beginning and at the end. The built-in crypto API works on struct scatterlist objects, as you can see from the definition of the encrypt method of a block cipher:
int (*encrypt)(struct blkcipher_desc *desc, struct scatterlist *dst,
               struct scatterlist *src, unsigned int nbytes);
Now, here is the procedure I am following for the ciphering operation:
Get a data buffer buf (length L)
Compute the left and right padding bytes (lpad and rpad)
Cipher the whole thing (lpad + buf + rpad)
Get rid of the padding bytes in the result
The simplest and most inefficient solution would be to allocate L + lpad + rpad bytes and copy the buffer's contents into this new area appropriately. But since the API uses those scatterlist objects, I was wondering if there is a way to avoid this pure waste of resources.
I read a couple of articles on LWN about scatterlist chaining, but a quick glance at the header file worries me: it looks like I have to set the whole thing up manually, which seems like pretty bad practice...
Any clue on how to use the scatterlist API properly? Ideally, I would like to do the following:
Allocate buffers for the padding bytes, for both input and output
Allocate a "payload" buffer that will only store the "useful" ciphered bytes
Create the scatterlist objects that include the padding buffers and the target buffer
Cipher the whole thing and store the result in the output padding buffers + the output "payload" buffer
Discard the input and output padding buffers
Return the ciphered "payload" buffer to the user

First, sorry for my poor English; I am not a native English speaker. I think you are looking for the kernel API blkcipher_walk_virt. You can find an example of its usage in crypto_ecb_crypt in ecb.c, and you can also look at padlock_aes.c.

After having investigated the code, I found a suitable solution. It follows the procedure I listed in my question quite closely, though there are some subtle differences.
As suggested by JohnsonDiao, I dived into the scatterwalk.c file to see how the Crypto API makes use of scatterlist objects.
The problem that arises is the "boundary" between two subsequent scatterlists. Say I have two chained scatterlists: the first one describes a 12-byte buffer, the second a 20-byte buffer. I want to encrypt the two buffers as a whole using AES128-CTR.
In this particular case, the API will :
Encrypt the 12 bytes of the buffer referenced by the first scatterlist.
Increment the counter
Encrypt the first 16 bytes of the buffer referenced by the second scatterlist
Increment the counter
Encrypt the remaining 4 bytes
The behaviour I would have expected was:
Encrypt the 12 bytes of the first buffer + the first 4 bytes of the second buffer
Increment the counter
Encrypt the last 16 bytes of the second buffer
Thus, to enforce this, one must allocate a padding buffer laid out as npad padding bytes followed by the first bytes of the input data. Let npad be the number of padding bytes needed for the requested encryption. Then we have:

lbuf = 16 * ceil(npad / 16)

where lbuf is the total length of the padding buffer, rounded up to a multiple of the 16-byte block size. Now, the last lbuf - npad bytes must be filled with the first input data bytes; if the input is too short to fill them completely, that is not a problem.
Therefore we copy the first lcpy = min(lbuf - npad, ldata) bytes of the input (ldata being the input length) to offset npad in the padding buffer. For example, with npad = 5 we get lbuf = 16 and lcpy = min(11, ldata).
In short, here is the procedure:
Allocate the appropriate padding buffer with length lbuf
Copy the first lcpy bytes of the payload buffer at offset npad in the padding buffer
Reference the padding buffer in a scatterlist
Reference the payload buffer in another scatterlist (shifted by lcpy)
Ask for the ciphering
Extract the payload bytes present in the padding buffer
Discard the padding buffer
I tested this and it seemed to work perfectly.
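For illustration, here is a minimal sketch of that procedure. This is my own code, not from the kernel: it uses the old blkcipher interface, encrypts in place with dst == src, uses an arbitrary zero padding value, and trims error handling.

#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
#include <linux/crypto.h>

/* Sketch of the padded-ciphering procedure above; illustrative only. */
static int cipher_padded(struct blkcipher_desc *desc, u8 *payload,
                         unsigned int ldata, unsigned int npad)
{
    unsigned int lbuf = ALIGN(npad, 16);          /* round up to the block size */
    unsigned int lcpy = min(lbuf - npad, ldata);  /* payload bytes moved into pad */
    struct scatterlist sg[2];
    u8 *pad;
    int err;

    pad = kmalloc(lbuf, GFP_KERNEL);
    if (!pad)
        return -ENOMEM;

    memset(pad, 0, npad);                /* padding bytes (value arbitrary here) */
    memcpy(pad + npad, payload, lcpy);   /* first lcpy payload bytes */

    sg_init_table(sg, 2);
    sg_set_buf(&sg[0], pad, lbuf);
    sg_set_buf(&sg[1], payload + lcpy, ldata - lcpy);  /* shifted by lcpy */

    /* in-place encryption: dst == src */
    err = crypto_blkcipher_encrypt(desc, sg, sg, lbuf + ldata - lcpy);

    if (!err)
        memcpy(payload, pad + npad, lcpy);  /* extract ciphered payload bytes */
    kfree(pad);
    return err;
}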

I am also learning this part, and this is my analysis:
If your encryption device needs to cipher 16 bytes at once, you should set the alignmask to (16 - 1), just like padlock_aes.c does (see ecb_aes_alg.cra_alignmask). The kernel handles this in blkcipher_next_copy and blkcipher_next_slow.
But I am puzzled: in aes_generic.c the alignmask is 3, so how does the kernel handle this without blkcipher_next_copy?

Related

Attempting to parse a WAV file, memcpy results are unexpected

Assume that I have a small WAV file I've opened and dumped as an array of char for processing.
Right now, I am attempting to memcpy the fmt chunk ID into a 4-byte buffer.
char fmt[4];
memcpy(fmt, raw_file + 12, sizeof(char) * 4);
From my understanding of memcpy, this will copy the 4 bytes starting at offset 12 into fmt. However, when I go to debug the program I get some very strange output:
It seems to copy the fmt section correctly, but for some reason I have a bunch of garbage after it. Interestingly, this garbage is data that appears before fmt in the file, at offset 0 (RIFF) and offset 8 (WAVE). It is a little-endian file (RIFF).
I can't for the life of me figure out why I'm getting data from the beginning of the buffer at the end of this, given that I only copied 4 bytes' worth of data (which should exactly fit the four characters f, m, t and space).
What is going on here? The output seems to indicate to me I'm somehow over-reading memory somewhere - but if that was the case I'd expect garbage rather than the previous offset's data.
EDIT:
If it matters, the data type of raw_file is const char* const.
The debugger is showing you an area of memory that was allocated on the stack.
What is in all probability happening is that you read data from the file, and even if you asked to read, say, 50 bytes, the underlying system might have decided to read more (1024, 2048, or 4096 bytes usually). Those bytes were passed around in memory, likely some on the stack, and that stack space is now being reused by your function. If you asked to read more than those four bytes, this is even more likely to happen.
Then the debugger sees that you are pointing to a string, but in C strings run until they get terminated by a zero (ASCIIZ). So what you're shown is the first four bytes and everything else that followed, up to the first 0x00 byte.
If that's important to you, just
char fmt[5];
fmt[4] = 0;
// read four bytes into fmt.
Now the debugger will only show you the first four bytes.
But now you see why you should always scrub and overwrite sensitive information from a memory area before free()ing it -- the data might remain there and even be reused or dumped by accident.
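For completeness, a self-contained sketch of the fix (the header bytes here are fabricated for the example):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* fake 16-byte WAV header prefix: "RIFF" + size + "WAVE" + "fmt " */
    const char raw_file[] = "RIFF\x24\x08\x00\x00WAVEfmt ";
    char fmt[5];               /* 4 bytes + room for the terminator */

    memcpy(fmt, raw_file + 12, 4);
    fmt[4] = '\0';             /* stop the debugger/printf right here */

    printf("chunk id: '%s'\n", fmt);   /* prints: chunk id: 'fmt ' */
    return 0;
}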

How to read a binary into an array

Say I have a 90 megabyte file. It's not encrypted, but it is binary.
I want to store this file into a table as an array of byte values so I can process the file byte by byte.
I can spare up to 2 GB of RAM, so something like keeping track of which bytes have been processed, which bytes have yet to be processed, and the processed bytes themselves would all be fine. I don't particularly care about how long the processing takes.
How should I approach this?
Note I've expanded and rewritten this answer due to Egor's comment.
You first need the file open in binary mode. The distinction is important on Windows, where the default text mode will change line endings from CR+LF into C newlines. You do this by specifying a mode argument to io.open of "rb".
Although you can read a file one byte at a time, in practice you will want to work through the file in buffers. Those buffers can be fairly large, but unless you know you are handling only small files in a one-off script, you should avoid reading the entire file into a buffer with file:read"*a" since that will cause various problems with very large files.
Once you have a file open in binary mode, you read a chunk of it using buffer = file:read(n), where n is an integer count of bytes in the chunk. Using a moderately sized power of two will likely be the most efficient. The return value will either be nil, or a string of up to n bytes. If it is less than n bytes long, it was the last buffer in the file. (If reading from a socket, pipe, or terminal, however, a read of fewer than n bytes may only indicate that no data has arrived yet, depending on lots of other factors too complex to explain here.)
The string in buffer can be processed any number of ways. As long as #buffer is not too big, then {buffer:byte(1,-1)} will return an array of integer byte values for each byte in the buffer. Too big partly depends on how your copy of Lua was configured when it was built, and may depend on other factors such as available memory as well. #buffer > 1E6 is certainly too big. In the example that follows, I used buffer:byte(i) to access each byte one at a time. That works for any size of buffer, at least as long as i remains an integer.
Finally, don't forget to close the file.
Here's a complete example, lightly tested. It reads a file a buffer at a time, and accumulates the total size and the sum of all bytes. It then prints the size, sum, and average byte value.
-- sum all bytes in a file
local name = ...
assert(name, "Usage: "..arg[0].." filename")
local file = assert(io.open(name, "rb"))
local sum, len = 0, 0
repeat
  local buffer = file:read(1024)
  if buffer then
    len = len + #buffer
    for i = 1, #buffer do
      sum = sum + buffer:byte(i)
    end
  end
until not buffer
file:close()
print("length:", len)
print("sum:", sum)
print("mean:", sum / len)
Run with Lua 5.1.4 on my Windows box using the example as its input, it reports:
length: 402
sum: 30374
mean: 75.557213930348
To split the contents of a string s into an array of bytes use {s:byte(1,-1)}.

Sending † character instead of Space character in Char array

I've migrated my project from XE5 to 10 Seattle. I'm still using ANSI codes to communicate with devices. With my new build, the Seattle IDE is sending the † character instead of the space character (#32 in ANSI) in a Char array. I need to send space-character data to a text file but I can't.
I tried #32 (as I used before), #032 and #127, but it doesn't work. Any ideas?
Here is how I use:
FillChar(X, 50, #32);
The method signature is FillChar(var X; Count: Integer; Value: Ordinal).
Despite its name, FillChar() fills bytes, not characters.
Char is an alias for WideChar (2 bytes) in Delphi 2009+, in prior versions it is an alias for AnsiChar (1 byte) instead.
So, if you have a 50-element array of WideChar elements, the array is 100 bytes in size. When you call FillChar(X, 50, #32), it fills the first 50 bytes with the value $20 each. Thus the first 25 WideChar elements will each have the value $2020 (aka Unicode codepoint U+2020 DAGGER, †), and the second 25 elements will not have any meaningful value.
This issue is explained in the FillChar() documentation:
Fills contiguous bytes with a specified value.
In Delphi, FillChar fills Count contiguous bytes (referenced by X) with the value specified by Value (Value can be of type Byte or AnsiChar)
Note that if X is a UnicodeString, this may not work as expected, because FillChar expects a byte count, which is not the same as the character count.
In addition, the filling character is a single-byte character. Therefore, when Buf is a UnicodeString, the code FillChar(Buf, Length(Buf), #9); fills Buf with the code point $0909, not $09. In such cases, you should use the StringOfChar routine.
This is also explained in Embarcadero's Unicode Migration Resources white papers, for instance on page 28 of Delphi Unicode Migration for Mere Mortals: Stories and Advice from the Front Lines by Cary Jensen:
Actually, the complexity of this type of code is not related to pointers and buffers per se. The problem is due to Chars being used as pointers. So, now that the size of Strings and Chars in bytes has changed, one of the fundamental assumptions that much of this code embraces is no longer valid: That individual Chars are one byte in length.
Since this type of code is so problematic for Unicode conversion (and maintenance in general), and will require detailed examination, a good argument can be made for refactoring this code where possible. In short, remove the Char types from these operations, and switch to another, more appropriate data type. For example, Olaf Monien wrote, "I wouldn't recommend using byte oriented operations on Char (or String) types. If you need a byte-buffer, then use ‘Byte’ as [the] data type: buffer: array[0..255] of Byte;."
For example, in the past you might have done something like this:
var
Buffer: array[0..255] of AnsiChar;
begin
FillChar(Buffer, Length(Buffer), 0);
If you merely want to convert to Unicode, you might make the following changes:
var
Buffer: array[0..255] of Char;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Char), 0);
On the other hand, a good argument could be made for dropping the use of an array of Char as your buffer, and switch to an array of Byte, as Olaf suggests. This may look like this (which is similar to the first segment, but not identical to the second, due to the size of the buffer):
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer), 0);
Better yet, compute the second argument to FillChar in a way that works regardless of the data type of the array:
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Buffer[0]), 0);
The advantage of these last two examples is that you have what you really wanted in the first place, a buffer that can hold byte-sized values. (And Delphi will not try to apply any form of implicit string conversion since it's working with bytes and not code units.) And, if you want to do pointer math, you can use PByte. PByte is a pointer to a Byte.
The one place where changes like these may not be possible is when you are interfacing with an external library that expects a pointer to a character or character array. In those cases, they really are asking for a buffer of characters, and these are normally AnsiChar types.
So, to address your issue, since you are interacting with an external device that expects Ansi data, you need to declare your array as using AnsiChar or Byte elements instead of (Wide)Char elements. Then your original FillChar() call will work correctly again.
If you want to use ANSI for communication with devices, you would define the array as
x: array[1..50] of AnsiChar;
In this case to fill it with space characters you use
FillChar(x, 50, #32);
Using an array of AnsiChar as a communication buffer may become troublesome in a Unicode environment, so I would suggest using a byte array as the communication buffer instead:
x: array[1..50] of byte;
and initialize it with
FillChar(x, 50, 32);

Appending an Int to a char * in C

So I am looking to append the length of a ciphertext onto the end of the char array that I am storing the cipher in. I am not a C native, and below is a test snippet of what I have devised that I think works.
...
int cipherTextLength = 0;
unsigned char *cipherText = NULL;
...
EVP_EncryptFinal_ex(&encryptCtx, cipherText + cipherTextLength, &finalBlockLength);
cipherTextLength += finalBlockLength;
EVP_CIPHER_CTX_cleanup(&encryptCtx);
// Append the length of the cipher text onto the end of the cipher text
// Note, the length stored will never be anywhere near 4294967295
char cipherLengthChar[1];
sprintf(cipherLengthChar, "%d", cipherTextLength);
strcat(cipherText, cipherLengthChar);
printf("ENC - cipherTextLength: %d\n", cipherTextLength);
...
The problem is I don't think using strcat when dealing with binary data is going to be trouble free. Could anyone suggest a better way to do this?
Thanks!
EDIT
Ok, so I'll add a little context as to why I was looking to append the length. In my encrypt function, EVP_EncryptUpdate requires the length of the plainText being encrypted. As this is much easier to obtain, this part isn't a problem. However, similarly, using EVP_DecryptFinal_ex in my decrypt function requires the length of the cipherText being decrypted, so I need to store it somewhere.
In the application where I am implementing this, all I am doing is replacing some poor hashing with proper encryption. To add further hassle, the application works such that I first need to decrypt information read in from XML, do something with it, then encrypt it and rewrite it to XML again, so I need to have this cipher length stored with the cipher somehow. I also don't have scope to redesign this.
Instead of what you are doing now, it may be smarter to encode the ciphertext size at a location before the ciphertext itself. Once you start decrypting, a size at the end is not very useful: you need to know the end to get the size to find the end, which is not very helpful.
Furthermore, the ciphertext is binary, so you don't need to convert anything to a string. You do want to encode the size in a fixed number of bytes (otherwise you don't know the size of the size :P). So create a bigger buffer (4 bytes more than you require for the ciphertext), start encrypting at offset 4, and then copy the size of the ciphertext into the start of the buffer.
If you don't know how to encode an integer, take a look at, for instance, this question/answer. Note that this will only encode 32 bits, for a maximum ciphertext size of 2^32 bytes, about 4 GiB. Furthermore, the linked answer uses big-endian encoding. You should use either big-endian (preferred for crypto code) or little-endian encoding - but don't mix the two.
Neither the ciphertext nor the encoded size should be used as a character string. If you need a character string, my suggestion is to base 64 encode the buffer up to the end of the ciphertext.
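A sketch of that layout (the helper names here are mine, not from OpenSSL): reserve 4 bytes, encrypt into buffer + 4, then store a big-endian 32-bit size up front.

#include <stdint.h>

/* Write v as 4 big-endian bytes at out[0..3]. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char) v;
}

/* Read the 4 big-endian bytes back into an integer. */
static uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

Before decrypting, get_u32_be(buffer) recovers the ciphertext length, and the ciphertext itself starts at buffer + 4.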
I hope you have arrays big enough for both cipherText and cipherLengthChar to store the required text. Hence instead of
unsigned char *cipherText = NULL;
You can have
unsigned char cipherText[MAX_TEXT];
similarly for
char cipherLengthChar[MAX_INT];
Or you can have them dynamically allocated.
where MAX_TEXT and MAX_INT are the maximum buffer sizes to store the text and the integer. Also, after the first call to EVP_EncryptFinal_ex, null-terminate cipherText so that your strcat works.
The problem is I don't think using strcat when dealing with binary data is going to be trouble free.
Correct! That's not your only problem, though:
// Note, the length stored will never be anywhere near 4294967295
char cipherLengthChar[1];
sprintf(cipherLengthChar, "%d", cipherTextLength);
Even if cipherTextLength is 0 here, you've gone out of bounds, since sprintf will add a null terminator, making a total of two chars -- but cipherLengthChar only has room for one.
If you consider, e.g. 4294967295, as a string, that's 10 chars + '\0' = 11 chars.
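For instance, a correctly sized version of that snippet (hypothetical, with a fixed worst-case value):

#include <stdio.h>

int main(void)
{
    unsigned int cipherTextLength = 4294967295u;  /* worst case for 32 bits */
    char cipherLengthChar[11];                    /* 10 digits + '\0' */

    snprintf(cipherLengthChar, sizeof cipherLengthChar, "%u", cipherTextLength);
    printf("%s\n", cipherLengthChar);             /* prints 4294967295 */
    return 0;
}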
It would appear that finalBlockLength is the length of the data put into cipherText. However, the EVP_EncryptFinal_ex() call will probably fail in one way or another, or at least not do what you want, since cipherText == NULL. You then add 0 to that (== 0, aka still NULL) and submit it as a parameter. Either you need a pointer to a pointer there (if EVP_EncryptFinal_ex were going to allocate space for you), or else you have to make sure there is enough room in cipherText to start with.
With regard to tacking text (or whatever) onto the end, you can just use sprintf directly:
sprintf(cipherText + finalBlockLength, "%d", cipherTextLength);
Presuming that cipherText is non-NULL and has enough extra room in it (see first couple of paragraphs).
However, I'm very dubious that doing that will be useful later on, but since I don't have any further context, I can't say more.

zlib: how to dimension avail_out

I would like to deflate a small block of memory (<= 16 KiB) using zlib. The output is stored in a block of memory as well. No disk or database access here.
According to the documentation, I should call deflate() repeatedly until the whole input is deflated. In between, I have to increase the size of the memory block where the output goes.
However, that seems unnecessarily complicated and perhaps even inefficient. As I know the size of the input, can't I predetermine the maximum size needed for the output, and then do everything with just one call to deflate()?
If so, what is the maximum output size? I assume something like: size of input + some bytes overhead
zlib has a function to calculate the maximum size a buffer will deflate to. Your assumption is correct - the returned value is the size of the input buffer + header sizes. After deflation you can realloc the buffer to reclaim the 'wasted' memory.
From zlib.h:
ZEXTERN uLong ZEXPORT deflateBound OF((z_streamp strm, uLong sourceLen));
/*
deflateBound() returns an upper bound on the compressed size after
deflation of sourceLen bytes. It must be called after deflateInit() or
deflateInit2(), and after deflateSetHeader(), if used. This would be used
to allocate an output buffer for deflation in a single pass, and so would be
called before deflate().
*/
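Putting that together, a single-pass compression sketch (my own code, with abbreviated error handling; the function name is illustrative):

#include <stdlib.h>
#include <zlib.h>

/* Compress in[0..in_len) in one deflate() call; returns a malloc'd
 * buffer and stores its length in *out_len, or NULL on error. */
static unsigned char *deflate_block(const unsigned char *in, uLong in_len,
                                    uLong *out_len)
{
    z_stream strm = {0};
    unsigned char *out;
    uLong bound;

    if (deflateInit(&strm, Z_DEFAULT_COMPRESSION) != Z_OK)
        return NULL;

    bound = deflateBound(&strm, in_len);   /* worst-case output size */
    out = malloc(bound);
    if (!out) {
        deflateEnd(&strm);
        return NULL;
    }

    strm.next_in = (Bytef *)in;
    strm.avail_in = (uInt)in_len;
    strm.next_out = out;
    strm.avail_out = (uInt)bound;

    /* with avail_out >= deflateBound(), Z_FINISH completes in one call */
    if (deflate(&strm, Z_FINISH) != Z_STREAM_END) {
        free(out);
        deflateEnd(&strm);
        return NULL;
    }

    *out_len = strm.total_out;
    deflateEnd(&strm);
    return out;   /* optionally realloc(out, *out_len) to trim the tail */
}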
