I have a device which generates some noise that I want to add to the entropy pool for the /dev/random device in an embedded Linux system.
I'm reading the man page on /dev/random and I don't really understand the structure that you pass into the RNDADDENTROPY ioctl call.
RNDADDENTROPY
Add some additional entropy to the input pool, incrementing
the entropy count. This differs from writing to /dev/random
or /dev/urandom, which only adds some data but does not
increment the entropy count. The following structure is used:
struct rand_pool_info {
        int     entropy_count;
        int     buf_size;
        __u32   buf[0];
};
Here entropy_count is the value added to (or subtracted from)
the entropy count, and buf is the buffer of size buf_size
which gets added to the entropy pool.
Is entropy_count in this structure the number of bits that I am adding? Why wouldn't this just always be buf_size * 8 (assuming that buf_size is in terms of bytes)?
Additionally why is buf a zero size array? How am I supposed to assign a value to it?
Thanks for any help here!
I am using a hardware RNG to stock my entropy pool. My struct is a static size
and looks like this (my kernel has a slightly different random.h; just copy what
you find in yours and increase the array size to whatever you want):
#define BUFSIZE 256
/* WARNING - this struct must match random.h's struct rand_pool_info */
typedef struct {
    int bit_count;              /* number of bits of entropy in data */
    int byte_count;             /* number of bytes of data in array */
    unsigned char buf[BUFSIZE];
} entropy_t;
Whatever you pass in buf will be hashed and will stir the entropy pool.
If you are using /dev/urandom, it does not matter what you pass for bit_count,
because /dev/urandom ignores the entropy count and just keeps on going.
What bit_count does is push the point out at which /dev/random will block
and wait for something to add more entropy from a physical RNG source.
Thus, it's okay to guesstimate on bit_count. If you guess low, the worst
that will happen is that /dev/random will block sooner than it otherwise
would have. If you guess high, /dev/random will operate like /dev/urandom
for a little bit longer than it otherwise would have before it blocks.
You can guesstimate based on the "quality" of your entropy source.
If it's low, like characters typed by humans, you can set it to 1 or 2 bits
per byte. If it's high, like values read from a dedicated hardware RNG,
you can set it to 8 bits per byte.
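For reference, here is a minimal sketch of how the struct above might be handed to the kernel. The add_entropy() helper name is mine, error handling is pared down, and the ioctl needs root (CAP_SYS_ADMIN):

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/random.h>   /* RNDADDENTROPY */

/* Feed len bytes of RNG output into the pool, claiming bit_count bits
   of entropy. Returns the ioctl result, or -1 on error. */
int add_entropy(const unsigned char *data, int len, int bit_count)
{
    entropy_t ent;   /* the struct defined above */
    int fd, rc;

    if (len > (int)sizeof(ent.buf))
        return -1;
    ent.bit_count  = bit_count;
    ent.byte_count = len;
    memcpy(ent.buf, data, len);

    fd = open("/dev/random", O_WRONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, RNDADDENTROPY, &ent);
    close(fd);
    return rc;
}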
If your data is perfectly random, then I believe it would be appropriate for entropy_count to be the number of bits in the buffer you provide. However, many (most?) sources of randomness aren't perfect, and so it makes sense for the buffer size and amount of entropy to be kept as separate parameters.
buf being declared to be size zero is a standard C idiom. The deal is that when you actually allocate a rand_pool_info, you do malloc(sizeof(struct rand_pool_info) + size_of_desired_buf), and then you refer to the buffer using the buf member. Note: In C99 and later you can declare buf[] instead of buf[0] (a flexible array member) to be explicit that in reality buf is "stretchy".
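A minimal sketch of that idiom (nbytes, my_random_bytes, and fd are placeholders for your own buffer size, data, and open /dev/random descriptor):

struct rand_pool_info *info;
size_t nbytes = 64;   /* example: 64 bytes of device output */

info = malloc(sizeof(*info) + nbytes);
if (info != NULL) {
    info->entropy_count = nbytes * 8;   /* only if the data is fully random */
    info->buf_size      = nbytes;
    memcpy(info->buf, my_random_bytes, nbytes);
    ioctl(fd, RNDADDENTROPY, info);
    free(info);
}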
The number of bytes in the buffer correlates with the entropy of the data, but the entropy cannot be calculated from the data or its length alone.
Sure, if the data came from a good, unpredictable, uniformly distributed hardware random noise generator, the entropy (in bits) is 8 * the size of the buffer (in bytes).
But if the bits are not uniformly distributed, or are somehow predictable, the entropy is lower.
See https://en.wikipedia.org/wiki/Entropy_(information_theory)
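If you want a rough check of your source, a frequency-based estimate is easy to compute. Note that this only measures the observed byte distribution, not true unpredictability, so treat it as a sanity check rather than a real entropy measurement:

#include <math.h>
#include <stddef.h>

/* Shannon-entropy estimate, in bits per byte, from a byte histogram. */
double entropy_bits_per_byte(const unsigned char *buf, size_t len)
{
    unsigned long counts[256] = {0};
    double h = 0.0;
    size_t i;
    int v;

    for (i = 0; i < len; i++)
        counts[buf[i]]++;
    for (v = 0; v < 256; v++) {
        if (counts[v]) {
            double p = (double)counts[v] / (double)len;
            h -= p * log2(p);
        }
    }
    return h;   /* 8.0 for perfectly uniform data, lower otherwise */
}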
I hope that helps.
My understanding is that size_t is a type large enough to represent (or address) any memory position in a given architecture.
For instance, on a 32 bit machine size_t should be able to represent at least 2^32 values. This means that sizeof(size_t) must be >= 4 in 32 bit architectures, right?
So what should sizeof(size_t) be in code that's meant to run on a GPU?
Since many GPUs have more than 4 GB of memory, sizeof(size_t) must be at least 5. But I imagine it's 8, for alignment purposes.
Roughly speaking, size_t should be able to represent the size of any single allocated object. This might be smaller than the total address space though.
For example, in a 16-bit MS-DOS program, one memory model had a 16-bit size_t even though many megabytes of memory were available and pointers were 32-bit. But you could not allocate any particular chunk of memory larger than 64K.
It would be up to the compiler writer for the GPU to make size_t have some size that is large enough for the largest possible allocation on that GPU. As you say, this is likely to be a power of 2 (but not guaranteed).
The type used to represent any memory position is void *.
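You can check what your toolchain actually uses with a one-liner (compile with the GPU compiler, e.g. nvcc for CUDA, to see the device-side answer):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    printf("sizeof(size_t) = %zu\n", sizeof(size_t));
    printf("SIZE_MAX       = %zu\n", (size_t)SIZE_MAX);
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return 0;
}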
I am and have been working on software for the Pebble. It is the first time I have worked with C, and I am struggling to get my head around how to manage information/data within the program.
I am used to being able to have multi-dimensional arrays with thousands of entries. With the Pebble we are very limited.
I can talk to the requirements for my program, but happy to see any sort of discussion on the topic.
The application I am building needs to store a running feed of data with every button press. Ideally I would like to store one binary value and two small integer values with each press. I would like to take advantage of the local storage on the Pebble which is limited to 256 bytes per array which presents a challenge.
I have thought about using a custom struct - and having multiple arrays of those, making sure to check that each array doesn't exceed the 256 byte mark. It just seems really messy and complicated to manage... am I missing something fundamentally simple, or does it need to be this complicated?
At the moment my program only stores the binary value and I haven't bothered with the small integer values at all.
Perhaps you could define structures as follows:
#pragma pack(1)
typedef struct STREAM_RECORD_S
{
    unsigned short uint16;      // The uint16 field will store a number from 0-65535
    unsigned short uint15 : 15; // The uint15 field will store a number from 0-32767
    unsigned short binary : 1;  // The binary field will store a number from 0-1
} STREAM_RECORD_T;

typedef struct STREAM_BLOCK_S
{
    struct STREAM_BLOCK_S *nextBlock; // Store a pointer to the next block.
    STREAM_RECORD_T records[1];       // Array of records for this block.
} STREAM_BLOCK_T;
#pragma pack()
The actual number of records in the array would depend on the size of the nextBlock pointer: 2 bytes with 16-bit addressing, 4 bytes with 32-bit addressing, or 8 bytes with 64-bit addressing. (The ARM Cortex-M3 in the Pebble is a 32-bit core, so nextBlock would be 4 bytes there.)
So, recordsPerArray = (256 - sizeof(nextBlock)) / sizeof(STREAM_RECORD_T);
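To make that concrete: with 1-byte packing, sizeof(STREAM_RECORD_T) is 4 (2 + 2 bytes), and with a 4-byte pointer you get (256 - 4) / 4 = 63 records per block. You can let the compiler do the arithmetic:

#define BLOCK_BYTES 256   /* Pebble persistent-storage limit per entry */

enum {
    RECORDS_PER_BLOCK =
        (BLOCK_BYTES - sizeof(void *)) / sizeof(STREAM_RECORD_T)   /* 63 on 32-bit */
};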
Suppose I'm given a function with the following signature:
void SendBytesAsync(unsigned char* data, T length)
and I need a buffer large enough to hold a byte array of the maximum length that can be specified by type T. How do I declare that buffer? I can't just use sizeof as it will return the size (in bytes) of type T and not the maximum value that the type could contain. I don't want to use limits.h as the underlying type could change and my buffer be too small. I can't use pow from math.h because I need a constant expression. So how do I get a constant expression for the maximum size of a type at compile time in C?
Edit
The type will be unsigned. Since everyone seems to be appalled at the idea of a statically allocated buffer determined at compile time, I'll provide a little background. This is for an embedded application (on a microcontroller) where reliability and speed are the priorities. As such, I'm perfectly OK with wasting statically assigned memory for the sake of run-time integrity (no malloc issues) and performance (no overhead for memory allocation each time I need the buffer).

I understand the risk that if the max size of T is too large my linker will not be able to allocate a buffer that big, but that will be a compile-time failure, which can be accommodated, rather than a run-time failure, which cannot be tolerated. If, for example, I use size_t for the size of the payload and allocate the memory dynamically, there is a very real possibility that the system will not have that much memory available. I would much rather know this at compile time than at run time, where it will result in packet loss, data corruption, etc.

Looking at the function signature I provided, it is ridiculous to provide a type as a size parameter for a dynamically allocated buffer and not expect the possibility that a caller will use the max value of the type. So I'm not sure why there seems to be so much consternation about allocating that memory once, for good. I can see this being a huge problem in the Windows world, where multiple processes are fighting for the same memory resources, but in the embedded world there's only one task to be done, and if you can't do that effectively, then it doesn't matter how much memory you saved.
Use _Generic:
#define MAX_SIZE(X) _Generic((X), \
    long: LONG_MAX,               \
    unsigned long: ULONG_MAX,     \
    /* ... */)
Prior to C11 there isn't a portable way to find an exact maximum value of an object of type T (all calculations with CHAR_BIT, for example, may yield overestimates due to padding bits).
Edit: Do note that under certain conditions (think segmented memory in real-life situations) you might not be able to allocate a buffer large enough to equal the maximum value of any given type T.
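As a usage sketch, here is the macro filled in for the unsigned types (since the asker says T will be unsigned); the stand-in variable name length is mine:

#include <limits.h>
#include <stdio.h>

#define MAX_SIZE(X) _Generic((X),          \
    unsigned char:      UCHAR_MAX,         \
    unsigned short:     USHRT_MAX,         \
    unsigned int:       UINT_MAX,          \
    unsigned long:      ULONG_MAX,         \
    unsigned long long: ULLONG_MAX)

int main(void)
{
    unsigned short length = 0;   /* stand-in for the real T */
    printf("largest possible length: %llu\n",
           (unsigned long long)MAX_SIZE(length));
    return 0;
}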
if T is unsigned, then would ((T) -1) work?
(This is probably really bad, and if so, please let me know why :-) )
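It does work for unsigned types: converting -1 to an unsigned type is defined to wrap to the largest representable value, and a cast of a constant is itself a constant expression, so it can even size a static buffer. A small sketch (macro name mine):

#define UNSIGNED_MAX(T) ((T)-1)   /* valid for unsigned T only */

static unsigned char buffer[UNSIGNED_MAX(unsigned short)];   /* 65535 bytes */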
Is there a reason why you are allocating the maximum possible buffer size instead of a buffer that is only as large as you need? Why not have the caller simply specify the amount of memory needed?
Recall that the malloc() function takes an argument of type size_t. That means that (size_t)(-1) (which is SIZE_MAX in C99 and later) will represent the largest value that can be passed to malloc. If you are using malloc as your allocator, then this will be your absolute upper limit.
Maybe try using a bit shift?
let's see:
unsigned long long max_size = (1ULL << (8 * sizeof(T))) - 1;  /* careful: undefined if sizeof(T) == 8 */
sizeof(T) gives you the number of bytes T occupies in memory. (For a struct type this includes any padding bytes the compiler inserts for alignment, so it can be larger than the sum of the members.)
Breaking it down:
8 * sizeof(T) gives you the number of bits that size represents
1 << x is the same as saying 2 to the x power. Because every time you shift to the left, you're multiplying by two. Just as every time you shift to the left in base 10, you are multiplying by 10.
- 1: an 8-bit number can hold 256 values (0..255), so subtracting one from 2^8 gives the largest value it can represent.
Interesting question. I would start by looking in the 'limits' header for the max value of a numeric type T. I have not tried it, but I would do something that uses std::numeric_limits<T>::max() (note that this is C++; plain C has no equivalent).
I am using the MPI 2.2 standard to write a parallel program in C. I have a 64-bit machine.
/* MPI offset is long long */
MPI_Offset my_offset;
printf("%3d: my offset = %lld\n", my_rank, my_offset);

int count;
MPI_Get_count(&status, MPI_BYTE, &count);
printf("%3d: read = %d\n", my_rank, count);
I am reading a file of very large size byte by byte. To read the file in parallel, I set the offset for each process using the offset variable. I am confused about the data type of MPI_Offset: is it a signed or an unsigned long?
My second question is about the limitation of the range of the count variable used in the MPI_Get_count() function. Since this function is used here to read all the elements from each process's buffer, I think it should also be of type long long to cope with such a very large file.
MPI_Offset's size isn't defined by the standard - it is, roughly, as large as possible. ROMIO, a widely-used underlying implementation of MPI-IO, uses 8-byte integers on systems which support them. You can probably find out for sure by looking in your system's mpi.h.
MPI_Offset is very definitely signed; there are functions like MPI_File_seek where it is perfectly reasonable to have values of type MPI_Offset take negative values.
MPI_Get_count returns an integer, of normal integer size, and this can certainly cause problems for some large file IO strategies.
Generally, it's better for a number of reasons not to use small low-level units of IO like bytes when doing MPI-IO; it's better in terms of performance and code readability to express the IO in units of your underlying data types. In doing so, these size limitations become less of an issue. If your underlying data type really is bytes, though, there aren't many options.
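If bytes really are your unit, one workable pattern is to loop in chunks small enough for an int count. A sketch, assuming fh is an open MPI_File, buf can hold one chunk, and my_offset/my_size are this rank's byte range:

MPI_Offset remaining = my_size;    /* this rank's byte count */
MPI_Offset pos       = my_offset;  /* this rank's starting offset */

while (remaining > 0) {
    int chunk = (remaining > (1 << 30)) ? (1 << 30) : (int)remaining;
    MPI_Status status;
    int count;

    MPI_File_read_at(fh, pos, buf, chunk, MPI_BYTE, &status);
    MPI_Get_count(&status, MPI_BYTE, &count);   /* now always fits in an int */
    pos       += count;
    remaining -= count;
}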
Did you try to interleave MPI_File_read with something like MPI_File_seek(mpiFile, mpiOffset, MPI_SEEK_CUR)? This way you may be able to avoid MPI_Offset overflow.
How do I calculate the size of an AES-256 encrypted file/buffer?
I malloc (n + AES_BLOCK_SIZE - 1) bytes (where n is the unencrypted buffer size).
But will the encrypted buffer always be that size? Can it also be smaller?
Any idea how I can pre-calculate the exact size?
Thanks
It depends on the padding you are using. The most common padding scheme (as it is reversible and contains a slight integrity check) is PKCS#5 padding: this appends a number of bytes such that the final size is a multiple of the block size, and at least one byte is appended. (Each appended byte then has the same value as the number of bytes appended.)
I.e. at most one full block (16 bytes for AES) will be appended.
n + AES_BLOCK_SIZE is always enough (and in some cases just enough), but you can calculate it more precisely as n + AES_BLOCK_SIZE - (n % AES_BLOCK_SIZE).
Note that there are some modes of operation which don't need padding at all, like CTR, CFB and OFB mode. Also note that you often want to transmit the initialization vector (another full block), too.
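For PKCS#5/PKCS#7 padding, the exact output size is therefore computable up front; a small sketch:

#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* Ciphertext size with PKCS#5/PKCS#7 padding: at least one byte is
   always appended, so the result is the next multiple of 16 above n. */
size_t padded_size(size_t n)
{
    return n + AES_BLOCK_SIZE - (n % AES_BLOCK_SIZE);
}

/* padded_size(15) == 16, padded_size(16) == 32, padded_size(17) == 32 */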
AES is "block" encryption. It has a 128-bit block size. This means that it always takes as input a 128-bit block (16 bytes) and always outputs a same-sized block.
If your input is not a multiple of 16 bytes, you should append some data (perhaps bytes containing the value zero) to round it out.
If your data is more than 16 bytes, you will be encrypting multiple blocks, and will need to call your AES encryption function as many times as you have input blocks.
If you are only allocating space for the output, malloc(AES_BLOCK_SIZE); would be the allocation you seek. Don't add the input size or subtract one byte.
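To illustrate the block-by-block point, a sketch using a hypothetical one-block primitive aes256_encrypt_block(in, out, key). Note that encrypting independent blocks like this is ECB mode, which leaks patterns and is generally not what you want in practice:

/* in and out each hold padded_n bytes, padded_n a multiple of 16 */
size_t off;
for (off = 0; off < padded_n; off += AES_BLOCK_SIZE)
    aes256_encrypt_block(in + off, out + off, key);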