Is there a function for inflate (zlib/miniz) that returns an upper bound on the inflated/decompressed size?

I know that zlib/miniz provides compressBound, which returns an upper bound on the deflated/compressed size for a given plain-text size. That's convenient.
Is there a corresponding function for inflate (zlib/miniz) that returns an upper bound on the inflated/decompressed size?
Or is there a simple formula that determines it? For example:
decompressed size = compressed size * factor

Yes, but I don't think you will find it very useful. The upper limit is 1032 times the size of the input data.
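As a rough sketch of what that bound looks like in code, the helper below multiplies the compressed size by the factor of 1032 quoted above and guards against overflow. The name inflate_bound_sketch is hypothetical and is not a zlib or miniz function; it only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of zlib or miniz: worst-case inflated size
 * for `compressed_len` bytes of deflate data, using the 1032:1 limit
 * quoted above. Returns SIZE_MAX if the product would overflow size_t. */
size_t inflate_bound_sketch(size_t compressed_len)
{
    const size_t factor = 1032;

    if (compressed_len > SIZE_MAX / factor) {
        return SIZE_MAX;
    }
    return compressed_len * factor;
}
```

For example, inflate_bound_sketch(10) evaluates to 10320. Because the bound is so loose, allocating it up front is rarely practical, which is presumably why the answer calls it not very useful.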

C (OSDev) - How could I shift the contents of a 32-bit framebuffer upwards efficiently?

I'm working on writing a hobby operating system. Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
It's simply a 32-bit linear framebuffer.
I have access to a few tools that could be helpful:
some of the mem* functions from libc: memset, memcpy, memmove, and memcmp
direct access to the framebuffer
the width, height, and size in bytes, of said framebuffer
a previous attempt that managed to scroll it up a few lines, albeit EXTREMELY slowly: it took roughly 25 seconds to scroll the framebuffer up by 5 pixels
speaking of which, my previous attempt:
for (uint64_t i = 0; i != atoi(numLines); i++) {
    for (uint64_t j = 0; j != bootboot.fb_width; j++) {
        for (uint64_t k = 1; k != bootboot.fb_size; k++) {
            ((uint32_t *)&fb)[k - 1] = ((uint32_t *)&fb)[k];
        }
    }
}
A few things to note about the above:
numLines is a variable passed into the function, it's a char * that contains the number of lines to scroll up by, in a string. I eventually want this to be the number of actual text lines to scroll up by, but for now treating this as how many pixels to scroll up by is sufficient.
the bootboot struct is provided by the bootloader I use, it contains a few variables that could be of use: fb_width (the width of the framebuffer), fb_height (the height of the framebuffer), and fb_size (the size, in bytes, of the framebuffer)
the fb variable that I'm using the address of is also provided by the bootloader I use, it is a single byte that is placed at the first byte of the framebuffer, hence the need to cast it into a uint32_t * before using it.
Any and all help would be appreciated.

If I read the code correctly, what's happening with the triple nested loops is:
For every line to scroll,
  for every pixel that the framebuffer is wide,
    for every pixel in the entire framebuffer,
      move that pixel backwards by one.
Essentially you're moving every pixel one position at a time, so it's no wonder scrolling the framebuffer takes so long. The total number of pixel moves is numLines * fb_width * fb_size, so for a 1024x768 framebuffer scrolled by 5 lines that's 5 * 1024 * (1024 * 768) = 4,026,531,840 moves, roughly 5,000 times the work required. (Since fb_size is in bytes rather than pixels, the inner loop actually runs four times longer than that and indexes past the end of the framebuffer.)
Instead, you'll want to loop over the framebuffer only once, calculate that pixel's start and its end pointer, and only do the move once. Or you can calculate the source, destination, and size of the move once and then use memmove. Here's my attempt at this (with excessive comments):
// Convert string to integer
uint32_t numLinesInt = atoi(numLines);
// The destination of the move is just the top of the framebuffer
uint32_t* destination = (uint32_t*)&fb;
// Start the move from the top of the framebuffer plus however
// many lines we want to scroll.
uint32_t* source = (uint32_t*)&fb +
                   (numLinesInt * bootboot.fb_width);
// The total number of pixels to move is the size of the
// framebuffer minus the amount of lines we want to scroll.
uint32_t pixelSize = (bootboot.fb_height - numLinesInt)
                     * bootboot.fb_width;
// The total number of bytes is that times the size of one pixel.
uint32_t byteSize = pixelSize * sizeof(uint32_t);
// Do the move
memmove(destination, source, byteSize);
I haven't tested this, and I'm making a number of assumptions about how your framebuffer is laid out, so please make sure it works before using it. :)
(P.S. Also, if you put atoi(numLines) inside the end condition of the for loop, atoi will be called every time through the loop, instead of once at the beginning like you intended.)
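As a variation on the snippet above, here is a minimal, self-contained sketch that also clears the rows exposed at the bottom and takes the row pitch in bytes instead of assuming it equals width * 4. The function and parameter names are hypothetical, and it still assumes a plain linear framebuffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scroll a 32-bit linear framebuffer up by `lines` pixel rows and clear the
 * rows exposed at the bottom. All names here are hypothetical;
 * `pitch_bytes` is the distance in bytes between the starts of two rows. */
void fb_scroll_up(void *fb_base, uint32_t height, uint32_t pitch_bytes,
                  uint32_t lines)
{
    uint8_t *base = (uint8_t *)fb_base;

    if (lines == 0) {
        return;
    }
    if (lines >= height) {
        /* Everything scrolls off screen: just clear the framebuffer. */
        memset(base, 0, (size_t)height * pitch_bytes);
        return;
    }

    /* Bytes that remain visible after the scroll. */
    size_t keep = (size_t)(height - lines) * pitch_bytes;

    /* Source and destination overlap, so memmove rather than memcpy. */
    memmove(base, base + (size_t)lines * pitch_bytes, keep);
    memset(base + keep, 0, (size_t)lines * pitch_bytes);
}
```

With the question's variables, a call might look like fb_scroll_up(&fb, bootboot.fb_height, bootboot.fb_width * 4, 5), assuming the rows really are packed with no padding.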

Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
The first problem is that the framebuffer is typically much slower to access than RAM (especially reads); so you want to do all the drawing in a buffer in RAM and then blit it efficiently (with a smaller number of much larger writes).
Once you have a buffer in RAM, you can make the buffer bigger than the screen. E.g. for a 1024 x 768 video mode you might have a 1024 x 1024 buffer. In that case small amounts of scrolling can often be done using the same "blit it efficiently" function; but sometimes you'll have to scroll the buffer in RAM.
To scroll the buffer in RAM you can cheat - treat it as a circular buffer and map a second copy into virtual memory immediately after the first. This allows you to (e.g.) copy 768 lines starting from the middle of the first copy without caring about hitting the end of the first buffer. The end result is that you can scroll the buffer in RAM without moving any data or changing the "blit it efficiently" function.
As a bonus, this also minimizes "tearing" artifacts. E.g. often you want to scroll the pixel data up and add more pixel data to the bottom, then blit it (without the user seeing an intermediate "half finished" frame).
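The circular-buffer idea can be sketched roughly as follows. Note that this is a simplification: instead of mapping a second copy of the buffer into virtual memory, it handles the wrap-around with at most two large copies, which needs no page-table support. The type and function names are hypothetical, and the framebuffer is assumed to have the same width as the back buffer with no row padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical back buffer kept in ordinary RAM. `rows` may be larger than
 * the number of visible rows; `top` is the buffer row shown at screen row 0. */
typedef struct {
    uint32_t *pixels;   /* rows * width pixels */
    uint32_t  width;    /* pixels per row */
    uint32_t  rows;     /* total rows in the ring */
    uint32_t  top;      /* first visible row */
} backbuf_t;

/* Scroll up by `lines` rows: only the start index moves, no pixels do. */
void backbuf_scroll(backbuf_t *bb, uint32_t lines)
{
    bb->top = (bb->top + lines % bb->rows) % bb->rows;
}

/* Copy `visible` rows to the framebuffer using at most two large copies. */
void backbuf_blit(const backbuf_t *bb, uint32_t *fb, uint32_t visible)
{
    assert(visible <= bb->rows);

    size_t row_bytes = (size_t)bb->width * sizeof(uint32_t);
    uint32_t first = bb->rows - bb->top;    /* rows available before the wrap */
    if (first > visible) {
        first = visible;
    }

    /* Rows from `top` to the end of the ring... */
    memcpy(fb, bb->pixels + (size_t)bb->top * bb->width, first * row_bytes);
    /* ...then whatever is left, taken from the start of the ring. */
    memcpy(fb + (size_t)first * bb->width, bb->pixels,
           (size_t)(visible - first) * row_bytes);
}
```

New text rows would then be drawn into the back buffer at (top + visible) % rows before the next blit, so the user never sees a half-finished frame.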

Is there a way to use MPI_Reduce with dynamic size of send buffer?

I would like to use the "MPI_Reduce" function with a variable number of elements for each process.
For example.
We have 4 processes with an allocated buffer with a dynamic size.
P (0) size buffer = 21
P (1) size buffer = 24
P (2) size buffer = 21
P (3) size buffer = 12
I would like to reduce the values of these elements on the processor with rank 0.
My idea is to allocate a receive buffer whose size equals the largest number of elements held by any process (24 in this case) and use it to receive the values from the various processes.
Is there a way to do this without increasing the execution time too much?
I am using Open MPI 2.1.1 in C. Thanks.

There is no reduction variant that works with different numbers of elements per rank in MPI. It wouldn't know what to fill in for missing operands in the reduction operation. It's pretty straightforward to write though, just as you suggested:
Determine the maximum buffer size
Allocate max-sized buffer on each rank, copy in local buffer, pad with whatever the neutral element of your reduction operation is
Run reduction on the now equal-sized buffers
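The padding step can be sketched in plain C as below. The helper name pad_for_reduce and the choice of double elements are assumptions made for illustration; the MPI calls themselves are left out so the snippet stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated buffer of `max_count`
 * elements holding the `local_count` local values followed by `neutral`
 * (0.0 for a sum, 1.0 for a product, and so on).
 * Returns NULL on allocation failure; the caller frees the result. */
double *pad_for_reduce(const double *local, size_t local_count,
                       size_t max_count, double neutral)
{
    assert(local_count <= max_count);

    double *padded = malloc(max_count * sizeof *padded);
    if (padded == NULL) {
        return NULL;
    }

    /* Copy the real data, then fill the tail with the neutral element. */
    memcpy(padded, local, local_count * sizeof *padded);
    for (size_t i = local_count; i < max_count; i++) {
        padded[i] = neutral;
    }
    return padded;
}
```

In an MPI program, every rank would first agree on max_count, for example with MPI_Allreduce using MPI_MAX on its local count, and then pass the padded buffer to MPI_Reduce with that count. The extra collective and the copy add some cost, but for buffers of a few dozen elements it should be small.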

Algorithm/Data structure name request: fixed size array with increasing stride on overwrite

I'm looking for a "canonical" name of a data structure with the characteristics below. I can implement it with a little thought given my application, but I'm curious if there's a matching data structure type or algorithm that's been studied in the past.
Requirements:
1. Fixed-size array backing.
2. Writing uses monotonically increasing strides in the following sense:
   - The array is filled from start to finish with single index increases until the array is filled. E.g. given an integer i = 0 and an unchanging fixed size integer n, the array is filled with i++ until i = n.
   - The index is then reset to the second element (index 1), so i = 1. The stride thus increases by a value of 1; in general the stride will increase by 1 each time the array bound is met or overshot. For the first time the stride is increased, instead of i++ we have i = i + 2 until i >= n.
   // i, n, stride, and array are assumed to be declared elsewhere
   int sRes = 0;
   template <typename T>
   void addElement(T* element)
   {
       if (i >= n)
       {
           if (stride > 2 * n)
           {
               sRes++;
               // keep stride from overflowing the integer type
               stride = stride % n;
           }
           stride += 1;
           i = stride % n;
       }
       // array holds AR<T>* entries, AR defined in #3 below
       array[i] = new AR<T>{element, stride};
       i += stride;
   }
3. Array elements include the "stride value" when the entry was written/overwritten. Thus I can determine what stride the current element was written under. Thus we may define an array element as:
   template <typename T> struct AR
   {
       T* value;
       int stride;
   };
4. Array reads respect the random-access model (e.g. any index may be read at any time, due to the underlying array).
To me, this has a few issues; if I want to know the current stride, I can look it up in the array. But at some point I will have strides that vary between n and 2*n. While this seems suited to the task above, I'm wondering about the implications. I'm operating on an embedded device with little memory. I want some notion of how long I've been updating the array. I could save the number of times I've modded, with further logic to reduce the necessary meta information on the stride value, like making the current mod count a static; the mod count being the number of times I've increased the stride.
Thus the interest I have in a name might include observations like the mod issue as presented. I can store the current number of stride increases in an int32. Under my constraints, this is acceptable; though I like using the proper name of an idea if it exists and I like reading about observations I may have missed.
I hope this presentation of the issue is understandable! If not, let me know and I'll update accordingly.
Thank you!

Calculate maximum number of repetitions of a matrix for a given memory limit

While using repmat, I get this error:
Error using repmat
Requested 2192800x2400 (39.2GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a
long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information.
I would like a function that accepts two inputs: input_array and max_mem, where the first is the array I would like to replicate, and max_mem is an amount of memory in GB. The function should return N_max, an integer that maximises the number of rows of repmat(input_array, N_max, 1) while constraining it to be within the memory limit specified by max_mem.

If I understand correctly:
function N_max = foo(input_array, max_mem)
    arrayInfo = whos('input_array');
    arraySize = arrayInfo.bytes;
    % max_mem is given in GB; convert to bytes, taking 1 GB = 1024^3 bytes
    N_max = floor(max_mem * 1024^3 / arraySize);
end

Ideal data structure for mapping integers to integers?

I won't go into details, but I'm attempting to implement an algorithm similar to the Boyer-Moore-Horspool algorithm, only using hex color values instead of characters (i.e., there is a much greater range).
Following the example on Wikipedia, I originally had this:
size_t jump_table[0xFFFFFF + 1];
memset(jump_table, default_value, sizeof(jump_table));
However, 0xFFFFFF is obviously a huge number and this quickly causes C to seg-fault (but not stack-overflow, disappointingly).
Basically, what I need is an efficient associative array mapping integers to integers. I was considering using a hash table, but having a malloc'd struct for each entry just seems overkill to me (I also do not need hashes generated, as each key is a unique integer and there can be no duplicate entries).
Does anyone have any alternatives to suggest? Am I being overly pragmatic about this?
Update
For those interested, I ended up using a hash table via the uthash library.

0xffffff is rather too large to put on the stack on most systems, but you absolutely can malloc a buffer of that size (at least on current computers; not so much on a smartphone). Whether or not you should do it for this task is a separate issue.
Edit: Based on the comment, if you expect the common case to have a relatively small number of entries other than the "this color doesn't appear in the input" skip value, you should probably just go ahead and use a hash map (obviously only storing values that actually appear in the input).
(ignore earlier discussion of other data structures, which was based on an incorrect recollection of the algorithm under discussion -- you want to use a hash table)

If the array you were going to make (of size 0xFFFFFF) was going to be sparse you could try making a smaller array to act as a simple hash table, with the size being 0xFFFFFF / N and the hash function being hexValue / N (or hexValue % (0xFFFFFF / N)). You'll have to be creative to handle collisions though.
This is the only way I can foresee getting out of mallocing structs.

You can malloc(3) an array of 0xFFFFFF + 1 size_t elements on the heap (for simplicity), and index it just as you would an ordinary array.
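A minimal sketch of that heap allocation is shown below. The function name jump_table_create is hypothetical. It fills the table with a loop, since memset works byte by byte and cannot store an arbitrary size_t default in each slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `entries` size_t slots on the heap, each set to `default_value`.
 * A loop is used because memset fills bytes, not whole size_t values.
 * Returns NULL on failure; the caller frees the result. */
size_t *jump_table_create(size_t entries, size_t default_value)
{
    /* Refuse sizes whose byte count would overflow size_t. */
    if (entries > SIZE_MAX / sizeof(size_t)) {
        return NULL;
    }

    size_t *table = malloc(entries * sizeof *table);
    if (table == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < entries; i++) {
        table[i] = default_value;
    }
    return table;
}
```

For the full 24-bit colour range the call would be jump_table_create(0xFFFFFF + 1, default_value), which needs 128 MiB when size_t is 8 bytes.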
As for the stack overflow. Basically the program receives a SIGSEGV, which can be a result of a stack overflow or accessing illegal memory or writing on a read-only segment etc... They are all abstracted under the same error message "Segmentation fault".
But why don't you use a higher-level language like Python that supports associative arrays?

At possibly the cost of some speed, you could try modifying the algorithm to find only matches that are aligned to some boundary (every three or four symbols), then perform the search at byte level.

You could create a sparse array of sorts which has "pages" like this (this example uses 256 "pages", so the upper most byte is the page number):
#include <stdlib.h>

int *pages[256];

/* call this first to make sure all of the pages start out NULL! */
void init_pages(void) {
    int i;
    for (i = 0; i < 256; ++i) {
        pages[i] = NULL;
    }
}

/* note: for brevity, calloc failure is not checked in either function */
int get_value(int index) {
    if (pages[index / 0x10000] == NULL) {
        /* calloc so it will zero it out; one int per entry in the page */
        pages[index / 0x10000] = calloc(0x10000, sizeof(int));
    }
    return pages[index / 0x10000][index % 0x10000];
}

void set_value(int index, int value) {
    if (pages[index / 0x10000] == NULL) {
        /* calloc so it will zero it out; one int per entry in the page */
        pages[index / 0x10000] = calloc(0x10000, sizeof(int));
    }
    pages[index / 0x10000][index % 0x10000] = value;
}
This will allocate a page the first time it is touched, whether by a read or a write.

To avoid the overhead of malloc you can use a hash table where the entries in the table are your structs, assuming they are small. In your case a pair of integers should suffice, with a special value to indicate emptiness of the slot in the table.
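One way this could look is the open-addressing table sketched below, where each slot is a key/value pair stored inline and 0xFFFFFFFF marks an empty slot (it can never be a 24-bit colour). The table size, the multiplicative hash constant, the use of linear probing, and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_BITS 10u                   /* 1024 slots, chosen arbitrarily */
#define SLOTS     (1u << SLOT_BITS)
#define EMPTY_KEY 0xFFFFFFFFu           /* never a valid 24-bit colour */

typedef struct { uint32_t key; uint32_t value; } slot_t;
typedef struct { slot_t slots[SLOTS]; size_t used; } int_map_t;

void int_map_init(int_map_t *m)
{
    for (size_t i = 0; i < SLOTS; i++) {
        m->slots[i].key = EMPTY_KEY;
    }
    m->used = 0;
}

/* Linear probing: walk from the home slot until the key or a free slot. */
static slot_t *int_map_find(int_map_t *m, uint32_t key)
{
    /* Multiplicative hash; keep the top SLOT_BITS bits of the product. */
    size_t i = (uint32_t)(key * 2654435761u) >> (32u - SLOT_BITS);

    while (m->slots[i].key != EMPTY_KEY && m->slots[i].key != key) {
        i = (i + 1) & (SLOTS - 1);
    }
    return &m->slots[i];
}

/* Insert or overwrite. Returns 0 on success, -1 if the table is full. */
int int_map_put(int_map_t *m, uint32_t key, uint32_t value)
{
    assert(key != EMPTY_KEY);

    slot_t *s = int_map_find(m, key);
    if (s->key == EMPTY_KEY) {
        /* Always keep one slot free so probing is guaranteed to stop. */
        if (m->used == SLOTS - 1) {
            return -1;
        }
        s->key = key;
        m->used++;
    }
    s->value = value;
    return 0;
}

/* Look up `key`, returning `default_value` when it is absent. */
uint32_t int_map_get(int_map_t *m, uint32_t key, uint32_t default_value)
{
    assert(key != EMPTY_KEY);

    slot_t *s = int_map_find(m, key);
    return s->key == key ? s->value : default_value;
}
```

For a skip table, default_value would be the skip distance for colours that do not appear in the pattern, so only colours that actually occur need a slot.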

How many values are there in your output space, i.e. how many different values do you map to in the range 0-0xFFFFFF?
Using randomized universal hashing you can come up with a collision-free hash function with a table no bigger than 2 times the number of values in your output space (for a static table).