Compression Ratio Calculation in Delta Encoding - c

I am new to delta encoding and am using it to compress hex data with the C implementation from Wikipedia. If my data is 0xFFFFFFF,0xFFFFFFF,0xFFFFFFF,0xFFFFFFF,0xFFFFFFF, the encoded result is 0xFFFFFFF,0x0000000,0x0000000,0x0000000,0x0000000. For other lossless algorithms, compression ratio = original size / compressed size, but here the data is the same size after encoding as before. So how can I calculate the compression ratio for delta encoding? And how can I compress the redundant deltas?
The code is:
void delta_encode(unsigned char *buffer, int length)
{
unsigned char last = 0;
for (int i = 0; i < length; i++)
{
unsigned char current = buffer[i];
buffer[i] = current - last;
last = current;
}
}
void delta_decode(unsigned char *buffer, int length)
{
unsigned char last = 0;
for (int i = 0; i < length; i++)
{
unsigned char delta = buffer[i];
buffer[i] = delta + last;
last = buffer[i];
}
}

Delta encoding does not compress anything by itself. It is a preprocessing step that makes the data easier for a subsequent compressor to handle. Take the output of the delta encoder, feed it to a standard lossless compressor, and measure how much that compresses.
Your question in the comments, "but will Huffman or RLE work with Hex data in numeric format?", suggests some confusion. The result of delta encoding is the binary contents of the array, not a hexadecimal text representation of that binary data.
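As a rough illustration of that pipeline, the sketch below delta-encodes a byte buffer and then run-length encodes the deltas. The names rle_encode and compression_ratio are hypothetical helpers invented for this example, and the (count, value) RLE format is just one simple choice; any standard lossless compressor could take its place.

```c
#include <stddef.h>

/* Delta-encode in place (same transform as in the question). */
static void delta_encode(unsigned char *buffer, size_t length)
{
    unsigned char last = 0;
    for (size_t i = 0; i < length; i++) {
        unsigned char current = buffer[i];
        buffer[i] = (unsigned char)(current - last);
        last = current;
    }
}

/* Hypothetical byte-oriented RLE: emits (count, value) pairs with
   count in 1..255. Returns the number of bytes written to out,
   which must have room for 2 * length bytes in the worst case. */
static size_t rle_encode(const unsigned char *in, size_t length,
                         unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < length; ) {
        size_t run = 1;
        while (i + run < length && in[i + run] == in[i] && run < 255)
            run++;
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    return o;
}

/* Ratio is measured on the final output, not on the delta step. */
static double compression_ratio(size_t original, size_t compressed)
{
    return (double)original / (double)compressed;
}
```

For a 20-byte ramp 10, 11, 12, ..., 29 the deltas are 10 followed by nineteen 1s, which this RLE packs into 4 bytes, a ratio of 5. Run-length coding the raw ramp directly would produce 40 bytes, which is why the delta step is useful here even though it does not shrink anything on its own.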

Related

reading a UPC barcode from an image

I need some guidance on how to get a 12-digit barcode from a BMP file; I'm completely clueless about how to approach this.
I started by reading the image into a bitmap. How can I continue?
Example: the barcode of the image below is 081034489030.
How do I get these numbers?
void part1() {
int width, height;
unsigned char ** img = NULL;
img = readBMP("package.bmp", &height, &width);
}
unsigned char** readBMP(char* filename, int* height_r, int* width_r)
{
int i, j;
FILE* f;
fopen_s(&f,filename, "rb");
unsigned char info[54];
fread(info, sizeof(unsigned char), 54, f); // read the 54-byte header
// extract image height and width
//from header
int width = *(int*)&info[18];
int height = *(int*)&info[22];
int pad_needed = 4 - (3 * width) % 4; // pad calculation
int paddedRow = 3 * width + ((pad_needed != 4) ? pad_needed : 0);
unsigned char** map2d = (unsigned char**)malloc(width * sizeof(unsigned char*)); // allocate memory for img 2d array
for (i = 0; i < width; i++) {
map2d[i] = (unsigned char*)malloc(height * sizeof(unsigned char));
}
unsigned char* data = (unsigned char*)malloc(paddedRow * sizeof(unsigned char)); // allocate memory for each read from file
for (i = 0; i < height; i++) {
fread(data, sizeof(unsigned char), paddedRow, f); //read line from file
for (j = 0; j < width; j++) {
map2d[j][i] = (int)data[3 * j]; // insert data to map2d. jump 3,
//because we need only one value of the colors (RGB)
}
}
free(data);
fclose(f);
*width_r = width;
*height_r = height;
return map2d;
}
You need to apply computer vision techniques to:
1. Segment the barcode from the image.
2. Decode the barcode information so that it can be further used in an application.
There is no single answer to this problem, and it will definitely not be a one-liner.
A way to start is by using a dedicated computer vision library like OpenCV. It will not only handle the image loading on your behalf, but enable you to apply advanced image processing algorithms on the loaded data. It supports C, Python, C#, so you should easily find the version that matches your language of choice.
Once OpenCV is added to your project, it is time to solve point number 1. A good algorithm to start from is described in "Detecting Barcodes in Images with Python and OpenCV". Don't get distracted by the use of Python; the same OpenCV functions are available in C as well, and the idea is to understand the algorithm.
Assuming you now have a working segmentation algorithm, the last step is to decode the barcode itself. Here I would suggest Parts 2 and 3 of this article as a starting point. There are also pre-built libraries (if you Google, there are plenty of UPC decoders written in Java or C# like this one), so with a bit of digging you may be able to find an out-of-the-box solution.
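Whichever decoder you end up with, one cheap sanity check on its output is the UPC-A check digit. The sketch below is only that validation step, not a decoder, and the function name upc_a_is_valid is made up for this example. It assumes the 12 digits arrive as a text string.

```c
#include <string.h>

/* Returns 1 if code is a 12-digit string whose UPC-A check digit is
   consistent, 0 otherwise. Digits in odd positions (1st, 3rd, ...)
   are weighted 3, digits in even positions are weighted 1, and the
   weighted sum including the check digit must be a multiple of 10. */
int upc_a_is_valid(const char *code)
{
    int sum = 0;
    if (strlen(code) != 12)
        return 0;
    for (int i = 0; i < 12; i++) {
        if (code[i] < '0' || code[i] > '9')
            return 0;
        int d = code[i] - '0';
        sum += (i % 2 == 0) ? 3 * d : d; /* i is 0-based */
    }
    return sum % 10 == 0;
}
```

The example code from the question, 081034489030, passes this check, so a decoder that returns a string failing it has most likely misread at least one digit.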
Hope this helps.

Storing bits from an array in an integer

So I have an array of bits, basically 0s and 1s in a character array.
Now what I want to do is store these bits in integers I have in another array (an int array), but I'm not sure how to do this.
Here is my code to get the bits:
char * convertStringToBits(char * string) {
int i;
int stringLength = strlen(string);
int mask = 0x80; /* 10000000 */
char *charArray;
charArray = malloc(8 * stringLength + 1);
if(charArray == NULL) {
printf("An error occurred!\n");
return NULL; //error - cant use charArray
}
int x = 0; /* write index; must persist across characters */
for(i = 0; i < stringLength; i++) {
mask = 0x80;
char c = string[i];
while(mask > 0) {
char n = (c & mask) > 0;
printf("%d", n);
charArray[x++] = n;
mask >>= 1; /* move the bit down */
}
printf("\n");
}
return charArray;
}
This gets a series of bits in an array {1, 0, 1, 1, 0, 0, 1} for example. I want to store this in the integers that I have in another array. I've heard about integers having unused space or something.
For Reference: The integer values are red values from the rgb colour scheme.
EDIT:
To use this I would store this string in the integer values, later to be decoded the same way to retrieve the message (steganography).
So you want to do LSB substitution for the integers, the simplest form of steganography.
It isn't that integers have unused space, it's just that changing the LSB changes the value of an integer by 1, at most. So if you're looking at pixels, changing their value by 1 won't be noticeable by the human eye. In that respect, the LSB holds redundant information.
You've played with bitwise operations. You basically want to clear the last bit of an integer and substitute it with the value of one of your bits. Assuming your integers range between 0 and 255, you can do the following.
pixel = (pixel & 0xfe) | my_bit;
Edit: Based on the code snippet from the comments, you can achieve this like so.
int x;
for (x = 0; x < messageLength; x++) {
rgbPixels[x][0] = (rgbPixels[x][0] & 0xfe) | bitArray[x];
}
Decoding is much simpler, in that all you need to do is read the value of the LSB of each pixel. The question here is how will you know how many pixels to read? You have 3 options:
The decoder knows the message length in advance.
The message length is similarly hidden in some known location so that the decoder can extract it. For example, 16 bits representing in binary the message length, which is hidden in the first 16 pixels before bitArray.
You use an end-of-message marker, where you keep extracting bits until you hit a signature sequence that signals you to stop. For example, eight 0s in a row. Whatever sequence and length you choose, you must make sure it cannot occur prematurely in your bit array.
So say you have somehow obtained the message length. After allocating bitArray, you can simply extract the bits like so.
int x;
for (x = 0; x < messageLength; x++) {
bitArray[x] = rgbPixels[x][0] & 0x01;
}
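To tie the embedding and extraction loops together, here is a minimal round-trip sketch for the second option, where the message length is hidden in the first 16 pixels. The layout of rgbPixels as int[][3], the LENGTH_BITS constant, and the function names are assumptions made for this example, not something taken from the asker's code.

```c
#include <stddef.h>
#include <stdint.h>

#define LENGTH_BITS 16 /* assumed header size, in pixels */

/* Replace the LSB of the red value with one message bit. */
static void put_bit(int pixel[3], int bit)
{
    pixel[0] = (pixel[0] & 0xfe) | (bit & 1);
}

/* Hide a 16-bit length (most significant bit first), then the bits.
   Returns 0 on success, -1 if there are not enough pixels. */
int embed_message(int rgbPixels[][3], size_t pixelCount,
                  const unsigned char *bitArray, uint16_t messageLength)
{
    if ((size_t)LENGTH_BITS + messageLength > pixelCount)
        return -1;
    for (int b = 0; b < LENGTH_BITS; b++)
        put_bit(rgbPixels[b], (messageLength >> (LENGTH_BITS - 1 - b)) & 1);
    for (size_t x = 0; x < messageLength; x++)
        put_bit(rgbPixels[LENGTH_BITS + x], bitArray[x]);
    return 0;
}

/* Read the length header back, then that many bits.
   Returns the number of bits recovered, or 0 on any size mismatch. */
size_t extract_message(int rgbPixels[][3], size_t pixelCount,
                       unsigned char *bitArray, size_t capacity)
{
    size_t messageLength = 0;
    if (pixelCount < LENGTH_BITS)
        return 0;
    for (int b = 0; b < LENGTH_BITS; b++)
        messageLength = (messageLength << 1) | (size_t)(rgbPixels[b][0] & 0x01);
    if (messageLength > capacity || LENGTH_BITS + messageLength > pixelCount)
        return 0;
    for (size_t x = 0; x < messageLength; x++)
        bitArray[x] = (unsigned char)(rgbPixels[LENGTH_BITS + x][0] & 0x01);
    return messageLength;
}
```

Because only the lowest bit of each red value is touched, every modified pixel differs from its original by at most 1.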
This converts a string of '0' and '1' characters to the equivalent int.
char string[] = "101010101";
int result = 0;
for (int i=0; i<strlen(string); i++)
result = (result<<1) | (string[i]=='1');

Simple reverb algorithm when buffer is small

I'm trying to implement simple delay/reverb described in this post https://stackoverflow.com/a/5319085/1562784 and I have a problem. On windows where I record 16bit/16khz samples and get 8k samples per recording callback call, it works fine. But on linux I get much smaller chunks from soundcard. Something around 150 samples. Because of that I modified delay/reverb code to buffer samples:
#define REVERB_BUFFER_LEN 8000
static void reverb( int16_t* Buffer, int N)
{
int i;
float decay = 0.5f;
static int16_t sampleBuffer[REVERB_BUFFER_LEN] = {0};
//Make room at the end of buffer to append new samples
for (i = 0; i < REVERB_BUFFER_LEN - N; i++)
sampleBuffer[ i ] = sampleBuffer[ i + N ] ;
//copy new chunk of audio samples at the end of buffer
for (i = 0; i < N; i++)
sampleBuffer[REVERB_BUFFER_LEN - N + i ] = Buffer[ i ] ;
//perform effect
for (i = 0; i < REVERB_BUFFER_LEN - 1600; i++)
{
sampleBuffer[i + 1600] += (int16_t)((float)sampleBuffer[i] * decay);
}
//copy output sample
for (i = 0; i < N; i++)
Buffer[ i ] = sampleBuffer[REVERB_BUFFER_LEN - N + i ];
}
This results in white noise on the output, so clearly I'm doing something wrong.
On Linux I record in 16bit/16khz, the same as on Windows, and I'm running Linux in VMware.
Thank you!
Update:
As indicated in the answer, I was 'reverbing' old samples over and over again. A simple 'if' solved the problem:
for (i = 0; i < REVERB_BUFFER_LEN - 1600; i++)
{
if((i + 1600) >= REVERB_BUFFER_LEN - N)
sampleBuffer[i + 1600] += (int16_t)((float)sampleBuffer[i] * decay);
}
Your loop that performs the actual reverb effect will be performed multiple times on the same samples, on different calls to the function. This is because you save old samples in the buffer, but you perform the reverb on all samples each time. This will likely cause them to overflow at some point.
You should only perform the reverb on the new samples, not on ones which have already been modified. I would also recommend checking for overflow and clipping to the min/max values instead of wrapping in that case.
A probably better way to perform reverb, which will work for any input buffer size, is to maintain a circular buffer of size REVERB_SAMPLES (1600 in your case), which contains the last samples.
#define REVERB_SAMPLES 1600
void reverb( int16_t* buf, int len) {
const float decay = 0.5f; // same decay as in the question
static int16_t reverb_buf[REVERB_SAMPLES] = {0};
static int reverb_pos = 0;
for (int i=0; i<len; i++) {
int16_t new_value = (int16_t)(buf[i] + reverb_buf[reverb_pos] * decay);
reverb_buf[reverb_pos] = new_value;
buf[i] = new_value;
reverb_pos = (reverb_pos + 1) % REVERB_SAMPLES;
}
}
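The circular-buffer version above still wraps around if the sum leaves the 16-bit range. A possible variant with the clipping suggested earlier might look like the sketch below. The names reverb_clipped, clip16 and REVERB_DECAY are invented for this example, and the 1600-sample delay and 0.5 decay are simply the values from the question.

```c
#include <stdint.h>

#define REVERB_SAMPLES 1600  /* delay length, as in the question */
#define REVERB_DECAY   0.5f  /* decay factor, as in the question */

/* Saturate a float to the int16_t range instead of letting it wrap. */
static int16_t clip16(float v)
{
    if (v > 32767.0f)
        return INT16_MAX;
    if (v < -32768.0f)
        return INT16_MIN;
    return (int16_t)v;
}

/* Works for any chunk size because the delay line keeps its own
   position between calls. Note the static state: one stream only. */
void reverb_clipped(int16_t *buf, int len)
{
    static int16_t reverb_buf[REVERB_SAMPLES] = {0};
    static int reverb_pos = 0;

    for (int i = 0; i < len; i++) {
        float mixed = (float)buf[i]
                    + (float)reverb_buf[reverb_pos] * REVERB_DECAY;
        int16_t out = clip16(mixed);
        reverb_buf[reverb_pos] = out; /* feed the output back in */
        buf[i] = out;
        reverb_pos = (reverb_pos + 1) % REVERB_SAMPLES;
    }
}
```

Feeding a single impulse of 1000 through this in 150-sample chunks yields echoes of 500 and 250 at 1600 and 3200 samples later, regardless of where the chunk boundaries fall.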

In-place run length decoding?

Given a run length encoded string, say "A3B1C2D1E1", decode the string in-place.
The answer for the encoded string is "AAABCCDE". Assume that the encoded array is large enough to accommodate the decoded string, i.e. you may assume that the array size = MAX[length(encodedstring),length(decodedstring)].
This does not seem trivial, since merely decoding A3 as 'AAA' will lead to over-writing 'B' of the original string.
Also, one cannot assume that the decoded string is always larger than the encoded string.
Eg: Encoded string - 'A1B1', Decoded string is 'AB'. Any thoughts?
And it will always be a letter-digit pair, i.e. you will not be asked to convert 0515 to 0000055555.
If we don't already know, we should scan through first, adding up the digits, in order to calculate the length of the decoded string.
It will always be a letter-digit pair, hence you can delete the 1s from the string without any confusion.
A3B1C2D1E1
becomes
A3BC2DE
Here is some code, in C++, to remove the 1s from the string (O(n) complexity).
// remove 1s
int i = 0; // read from here
int j = 0; // write to here
while(i < str.length()) {
assert(j <= i); // optional check
if(str[i] != '1') {
str[j] = str[i];
++ j;
}
++ i;
}
str.resize(j); // to discard the extra space now that we've got our shorter string
Now, this string is guaranteed to be shorter than, or the same length as, the final decoded string. We can't make that claim about the original string, but we can make it about this modified string.
(An optional, trivial, step now is to replace every 2 with the previous letter. A3BCCDE, but we don't need to do that).
Now we can start working from the end. We have already calculated the length of the decoded string, and hence we know exactly where the final character will be. We can simply copy the characters from the end of our short string to their final location.
During this copy process from right-to-left, if we come across a digit, we must make multiple copies of the letter that is just to the left of the digit. You might be worried that this might risk overwriting too much data. But we proved earlier that our encoded string, or any substring thereof, will never be longer than its corresponding decoded string; this means that there will always be enough space.
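The two passes described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not a general decoder: every count is a single digit from 1 to 9, the symbols being repeated are never digits, and the buffer has room for the decoded string plus its terminator. The function name rle_decode_in_place is made up for this example.

```c
#include <stddef.h>

/* Decodes text such as "A3B1C2D1E1" in place and returns the decoded
   length. Assumes well-formed letter-digit pairs with counts 1..9. */
size_t rle_decode_in_place(char *s)
{
    size_t decoded = 0; /* total length after decoding */
    size_t j = 0;       /* write index while dropping the 1s */

    /* Pass 1 (left to right): sum the counts and drop every count
       of 1, so the compact string is never longer than the output. */
    for (size_t i = 0; s[i] != '\0' && s[i + 1] != '\0'; i += 2) {
        char letter = s[i];
        char count = s[i + 1];
        decoded += (size_t)(count - '0');
        s[j++] = letter;
        if (count != '1')
            s[j++] = count;
    }

    /* Pass 2 (right to left): expand into the final positions. */
    size_t r = j;       /* one past the last compact character */
    size_t w = decoded; /* one past the last decoded character */
    s[decoded] = '\0';
    while (r > 0) {
        char c = s[--r];
        size_t n = 1;
        if (c >= '2' && c <= '9') { /* a count: the letter is to its left */
            n = (size_t)(c - '0');
            c = s[--r];
        }
        while (n-- > 0)
            s[--w] = c;
    }
    return decoded;
}
```

The write index never drops below the read index, because each compact token ("X" or "Xd" with d of at least 2) is no longer than its expansion. That is the property the argument above relies on.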
The following solution is O(n) and in-place. The algorithm never reads or writes memory outside the buffer. I did some debugging, and it appears correct on the sample tests I fed it.
High level overview:
Determine the encoded length.
Determine the decoded length by reading all the numbers and summing them up.
End of buffer is MAX(decoded length, encoded length).
Decode the string by starting from the end of the string. Write from the end of the buffer.
Since the decoded length might be greater than the encoded length, the decoded string might not start at the start of the buffer. If needed, correct for this by shifting the string over to the start.
int isDigit (char c) {
return '0' <= c && c <= '9';
}
unsigned int toDigit (char c) {
return c - '0';
}
unsigned int intLen (char * str) {
unsigned int n = 0;
while (isDigit(*str++)) {
++n;
}
return n;
}
unsigned int forwardParseInt (char ** pStr) {
unsigned int n = 0;
char * pChar = *pStr;
while (isDigit(*pChar)) {
n = 10 * n + toDigit(*pChar);
++pChar;
}
*pStr = pChar;
return n;
}
unsigned int backwardParseInt (char ** pStr, char * beginStr) {
unsigned int len, n;
char * pChar = *pStr;
while (pChar != beginStr && isDigit(*pChar)) {
--pChar;
}
++pChar;
len = intLen(pChar);
n = forwardParseInt(&pChar);
*pStr = pChar - 1 - len;
return n;
}
unsigned int encodedSize (char * encoded) {
int encodedLen = 0;
while (*encoded++ != '\0') {
++encodedLen;
}
return encodedLen;
}
unsigned int decodedSize (char * encoded) {
int decodedLen = 0;
while (*encoded++ != '\0') {
decodedLen += forwardParseInt(&encoded);
}
return decodedLen;
}
void shift (char * str, int n) {
do {
str[n] = *str;
} while (*str++ != '\0');
}
unsigned int max (unsigned int x, unsigned int y) {
return x > y ? x : y;
}
void decode (char * encodedBegin) {
int shiftAmount;
unsigned int eSize = encodedSize(encodedBegin);
unsigned int dSize = decodedSize(encodedBegin);
int writeOverflowed = 0;
char * read = encodedBegin + eSize - 1;
char * write = encodedBegin + max(eSize, dSize);
*write-- = '\0';
while (read != encodedBegin) {
unsigned int i;
unsigned int n = backwardParseInt(&read, encodedBegin);
char c = *read;
for (i = 0; i < n; ++i) {
*write = c;
if (write != encodedBegin) {
write--;
}
else {
writeOverflowed = 1;
}
}
if (read != encodedBegin) {
read--;
}
}
if (!writeOverflowed) {
write++;
}
shiftAmount = encodedBegin - write;
if (write != encodedBegin) {
shift(write, shiftAmount);
}
return;
}
int main (int argc, char ** argv) {
//char buff[256] = { "!!!A33B1C2D1E1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char buff[256] = { "!!!A2B12C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
//char buff[256] = { "!!!A1B1C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char * str = buff + 3;
//char buff[256] = { "A1B1" };
//char * str = buff;
decode(str);
return 0;
}
This is a very vague question, though it's not particularly difficult if you think about it. As you say, decoding A3 as AAA and just writing it in place will overwrite the chars B and 1, so why not just move those farther along the array first?
For instance, once you've read A3, you know that you need to make space for one extra character, if it was A4 you'd need two, and so on. To achieve this you'd find the end of the string in the array (do this upfront and store its index).
Then loop through, moving the characters to their new slots:
To start: A|3|B|1|C|2|||||||
Have a variable called end storing the index 5, i.e. the last, non-blank, entry.
You'd read in the first pair, using a variable called cursor to store your current position - so after reading in the A and the 3 it would be set to 1 (the slot with the 3).
Pseudocode for the move:
var n = (array[cursor] - '0') - 2; // n = 1: the 3 from A3, minus 2 for the slots the pair already occupies.
for(i = end; i > cursor; i--) // walk backwards so nothing is overwritten before it is moved
{
array[i + n] = array[i];
}
end += n;
This would leave you with:
A|3|B|B|1|C|2|||||
Now the A is there once already, so now you want to write n + 1 A's starting at the index stored in cursor:
for(i = cursor; i < cursor + n + 1; i++)
{
array[i] = array[cursor - 1];
}
// increment the cursor afterwards!
cursor += n + 1;
Giving:
A|A|A|B|1|C|2|||||
Then you're pointing at the start of the next pair of values, ready to go again. I realise there are some holes in this answer, though that is intentional as it's an interview question! For instance, in the edge cases you specified A1B1, you'll need a different loop to move subsequent characters backwards rather than forwards.
Another O(n^2) solution follows.
Given that there is no limit on the complexity of the answer, this simple solution seems to work perfectly.
while ( there is an expandable element ):
expand that element
adjust (shift) all of the elements on the right side of the expanded element
Where:
Free space size is the number of empty elements left in the array.
An expandable element is an element that:
expanded size - encoded size <= free space size
The point is that in the process of reaching from the run-length code to the expanded string, at each step, there is at least one element that can be expanded (easy to prove).

Mono to Stereo conversion

I have the following issue here: I get a block of bytes (uint16_t*) representing audio data, and the device generating them is capturing mono sound, so obviously I have mono audio data, on 1 channel. I need to pass this data to another device, which is expecting interleaved stereo data (so, 2 channels). What I want to do is basically duplicate the 1 channel in data so that both channels of the stereo data will contain the same bytes. Can you point me to an efficient algorithm doing this?
Thanks,
f.
If you just want interleaved stereo samples then you could use a function like this:
void interleave(const uint16_t * in_L, // mono input buffer (left channel)
const uint16_t * in_R, // mono input buffer (right channel)
uint16_t * out, // stereo output buffer
const size_t num_samples) // number of samples
{
for (size_t i = 0; i < num_samples; ++i)
{
out[i * 2] = in_L[i];
out[i * 2 + 1] = in_R[i];
}
}
To generate stereo from a single mono buffer then you would just pass the same pointer for in_L and in_R, e.g.
interleave(mono_buffer, mono_buffer, stereo_buffer, num_samples);
You might want to do the conversion in-place to save some memory. Depends on how small an amount of memory the device in question has. So you might want to use something like this instead of Paul R's approach:
void interleave(uint16_t buf[], const int len)
{
for (int i = len / 2 - 1, j = len - 1; i >= 0; --i) {
buf[j--] = buf[i];
buf[j--] = buf[i];
}
}
When getting the sound data from the mono device, you allocate a buffer that's twice as big as needed and pass that to the mono device. This will fill half the buffer with mono audio. You then pass that buffer to the above function, which converts it to stereo. And finally you pass the buffer to the stereo device. You save an extra allocation and thus use 33% less memory for the conversion.
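The workflow just described could be sketched as below. The fill_mono callback stands in for whatever call reads from the mono device, and fill_ramp, interleave_in_place and capture_as_stereo are hypothetical names used only in this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Expand mono samples stored in the first half of buf into
   interleaved stereo filling all stereo_len slots. Walking from the
   back means no sample is overwritten before it has been copied. */
static void interleave_in_place(uint16_t buf[], size_t stereo_len)
{
    size_t j = stereo_len;
    for (size_t i = stereo_len / 2; i-- > 0; ) {
        buf[--j] = buf[i]; /* right channel */
        buf[--j] = buf[i]; /* left channel */
    }
}

/* Hypothetical stand-in for the capture device: writes n samples. */
static void fill_ramp(uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(100 + i);
}

/* Allocate once at stereo size, capture mono into the first half,
   then expand. The caller frees the returned buffer. */
uint16_t *capture_as_stereo(size_t mono_samples,
                            void (*fill_mono)(uint16_t *dst, size_t n))
{
    uint16_t *buf = malloc(2 * mono_samples * sizeof *buf);
    if (buf == NULL)
        return NULL;
    fill_mono(buf, mono_samples);
    interleave_in_place(buf, 2 * mono_samples);
    return buf;
}
```

With fill_ramp as the source and three mono samples, the returned buffer holds 100, 100, 101, 101, 102, 102.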
Pass to both channels the same pointer? If that violates restrict rules, use memcpy()?
Sorry, but your question is otherwise too broad. API? OS? CPU architecture?
You are going to have to copy the buffer and duplicate it. As you haven't told us the format, how it is terminated, I can't give code, but it will look like a simple for loop.
int_16* allocateInterleaved(int_16* data, int length)
{
int i;
int_16 *copy = malloc(sizeof(int_16)*length*2);
if(copy == NULL) {
/* handle error */
return NULL;
}
for(i =0; i<length; i++) {
copy[2*i] = data[i];
copy[2*i+1] = data[i];
}
return copy;
}
Forgive any glaring typos, my C is a bit rusty. Typedef whatever type you need for signed 16-bit as int_16. Don't forget to free the copy buffer, or better yet reuse it.
You need to interleave the data, but if the frame length is anything greater than one, none of the above solutions will work. The below code can account for variable frame lengths.
void Interleave(BYTE* left, BYTE* right, BYTE* stereo,int numSamples_in, int frameSize)
{
int writeIndex = 0;
for (int j = 0; j < numSamples_in; j++)
{
for (int k = 0; k < frameSize; k++)
{
int index = j * frameSize + k;
stereo[k + writeIndex] = left[index];
stereo[k + writeIndex + frameSize] = right[index];
}
writeIndex += 2 * frameSize;
}
}
