zdelta compression library segfaults - c

I'm using the zdelta library (http://cis.poly.edu/zdelta/) to compress a bunch of binary files, and I keep running into an issue where decompression almost always segfaults, even with the command-line interface. Just wondering if anyone has run into this before?
I did some error isolation: the compression output from my code is the same as what I get from the CLI (the command is ./zdc reference.bin fileToCompress.bin > compressedFile.bin.del), so I assume compression works fine. The confusing part: if I use A.bin as the reference and compress it against itself, everything works perfectly. As soon as I try a different file it segfaults (compressing B.bin with A.bin as the reference, for example). The same happens with the decompression CLI.
Code for compression; bufferIn is the uncompressed data and bufferOut is an output buffer that is large enough (ten times the size of the input buffer, so even if compression grows the file, things should still work):
int rv = zd_compress(reference, refSize,
                     bufferIn, inputSize,
                     bufferOut, &outputSize);
Documentation for compress:
/* computes zdelta difference between target data and reference data
 *
 * INPUT:
 *   ref     pointer to reference data set
 *   rsize   size of reference data set
 *   tar     pointer to targeted data set
 *   tsize   size of targeted data set
 *   delta   pointer to delta buffer
 *           the delta buffer IS allocated by the user
 *   *dsize  size of delta buffer
 *
 * OUTPUT parameters:
 *   delta   pointer to zdelta difference
 *   *dsize  size of zdelta difference
 *
 * zd_compress returns ZD_OK on success,
 * ZD_MEM_ERROR if there was not enough memory,
 * ZD_BUF_ERROR if there was not enough room in the output buffer.
 */
ZEXTERN int ZEXPORT zd_compress OF ((const Bytef *ref, uLong rsize,
                                     const Bytef *tar, uLong tsize,
                                     Bytef *delta, uLongf* dsize));
==============================
Code for decompression; bufferIn is the compressed data and bufferOut is an output buffer that is 1000 times the size of the input (bad practice, yes, but I'd like to figure out the segfault first...):
int rv = zd_uncompress(reference, refSize,
                       bufferOut, &outputSize,
                       bufferIn, inputSize);
Documentation for uncompress:
/* rebuilds target data from reference data and zdelta difference
 *
 * INPUT:
 *   ref     pointer to reference data set
 *   rsize   size of reference data set
 *   tar     pointer to target buffer
 *           this buffer IS allocated by the user
 *   tsize   size of target buffer
 *   delta   pointer to zdelta difference
 *   dsize   size of zdelta difference
 *
 * OUTPUT parameters:
 *   tar     pointer to recomputed target data
 *   *tsize  size of recomputed target data
 *
 * zd_uncompress returns ZD_OK on success,
 * ZD_MEM_ERROR if there was not enough memory,
 * ZD_BUF_ERROR if there was not enough room in the output buffer.
 */
ZEXTERN int ZEXPORT zd_uncompress OF ((const Bytef *ref, uLong rsize,
                                       Bytef *tar, uLongf *tsize,
                                       const Bytef *delta, uLong dsize));
The size variables are all properly initialized. Whenever I run decompression it segfaults deep inside the zdelta library, at a memcpy in zdelta/inffast.c; the destination pointer looks bad (except in the self-reference case I mentioned above). Has anyone had this issue before? Thanks!
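For reference, here is a minimal round-trip sketch of how zd_compress and zd_uncompress fit together according to the documentation quoted above; the buffer sizing, the error handling, and the zdlib.h include path are assumptions for illustration, not the exact code from the program in question:

#include <stdlib.h>
#include "zdlib.h"   /* zdelta public header; include path assumed */

/* Compress target against ref, then rebuild it, checking for ZD_OK.
 * On input the size parameters hold the buffer capacities; on output
 * they hold the actual delta/target sizes, as documented above. */
int roundtrip(const Bytef *ref, uLong refSize,
              const Bytef *target, uLong targetSize)
{
    uLongf deltaSize = targetSize * 10 + 1024;   /* generous delta buffer */
    uLongf rebuiltSize = targetSize + 1024;      /* room for rebuilt target */
    Bytef *delta = malloc(deltaSize);
    Bytef *rebuilt = malloc(rebuiltSize);
    int rv = ZD_MEM_ERROR;

    if (delta != NULL && rebuilt != NULL) {
        rv = zd_compress(ref, refSize, target, targetSize, delta, &deltaSize);
        if (rv == ZD_OK)
            rv = zd_uncompress(ref, refSize, rebuilt, &rebuiltSize,
                               delta, deltaSize);
    }
    free(delta);
    free(rebuilt);
    return rv;
}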

I figured out that this problem was caused by the negation of an unsigned variable, in file inffast.c at line 138:
ptr = rwptr[best_ptr] + (sign == ZD_PLUS ? d : -d);
d is declared as type uInt, so the negation in the false branch of the ternary wraps around to a huge unsigned value, which was the cause of the bad destination address in memcpy().
Simply changing this to:
if(ZD_PLUS == sign)
{
    ptr = rwptr[best_ptr] + d;
}
else
{
    ptr = rwptr[best_ptr] - d;
}
resolves the issue.
Same story for line 257 in infcodes.c:
c->bp = rwptr[best_ptr] + (c->sign == ZD_PLUS ? c->dist : -c->dist);
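For anyone curious about the mechanics, here is a small standalone illustration of the wrap-around, using plain integers rather than zdelta's pointers, under the assumption of a 64-bit platform where unsigned int is 32 bits:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    unsigned int d = 4;                 /* plays the role of uInt d */
    uintptr_t base = 0x1000;            /* stand-in for rwptr[best_ptr] */

    /* With both branches unsigned, -d wraps to 0xFFFFFFFC, so the "minus"
     * case adds roughly 4 GB instead of subtracting 4. */
    uintptr_t bad  = base + (0 ? d : -d);   /* simulates sign != ZD_PLUS */
    uintptr_t good = base - d;

    printf("bad  = %#lx\n", (unsigned long)bad);   /* 0x100000ffc */
    printf("good = %#lx\n", (unsigned long)good);  /* 0xffc */
    return 0;
}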

Related

Reading bytes from progmem

I'm trying to write a simple program (as a precursor to a more complicated one) that stores an array of bytes in PROGMEM and then reads and prints the array. I've looked through a million blog/forum posts online and think I'm doing everything right, but I'm still getting utter gibberish as output.
Here is my code; any help would be much appreciated!
void setup() {
  byte hello[10] PROGMEM = {1,2,3,4,5,6,7,8,9,10};
  byte buffer[10];
  Serial.begin(9600);
  memcpy_P(buffer, (char*)pgm_read_byte(&hello), 10);
  for (int i = 0; i < 10; i++) {
    //buffer[i] = pgm_read_byte(&(hello[i])); //output is wrong even if i use this
    Serial.println(buffer[i]);
  }
}

void loop() {
}
If I use memcpy, I get the output:
148
93
0
12
148
93
0
12
148
93
And if I use the buffer = .... statement in the for loop (instead of memcpy):
49
5
9
240
108
192
138
173
155
173
You're overcomplicating this by about two orders of magnitude.
memcpy_P wants a destination pointer, a source pointer, and a byte count. The PROGMEM source pointer is simply the array itself. So your memcpy_P line should look like
memcpy_P (buffer, hello, 10);
that's it.
memcpy (without the "_P") will not be able to reach program memory; it will copy from data RAM instead. That is not what you want.
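For completeness, a corrected sketch might look like the following. The array is also moved out of setup() so it has static storage and its PROGMEM placement is guaranteed; the essential fix, as described above, is the memcpy_P call itself:

const byte hello[10] PROGMEM = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  // static storage for PROGMEM

void setup() {
  byte buffer[10];
  Serial.begin(9600);
  memcpy_P(buffer, hello, 10);   // destination, PROGMEM source, byte count
  for (int i = 0; i < 10; i++) {
    Serial.println(buffer[i]);
  }
}

void loop() {
}

With that, the output should be the values 1 through 10.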

C++ fwrite() writes more than expected

I have a problem when copying files.
Code:
bool done;
FILE* fin;
FILE* fout;
const int bs = 1024*64; //64 kb
char* buffer[bs];
int er, ew, br, bw;
long long int size = 0;
long long int sizew = 0;

er = fopen_s(&fin, s.c_str(), "rb");
ew = fopen_s(&fout, s2.c_str(), "wb");
if (er == 0 && ew == 0) {
    while (br = fread(buffer, 1, bs, fin)) {
        size += br;
        sizew += fwrite(buffer, 1, bs, fout);
    }
    done = true;
} else {
    done = false;
}
if (fin != NULL) fclose(fin);
if (fout != NULL) fclose(fout);
Somehow fwrite writes the whole buffer, ignoring the count value (br).
Some examples:
Copying 595 file of 635 DONE. 524288/524288 B
Copying 596 file of 635 DONE. 524288/524288 B
Copying 597 file of 635 DONE. 65536/145 B
Copying 598 file of 635 DONE. 65536/16384 B
Copying 599 file of 635 DONE. 65536/145 B
Copying 600 file of 635 DONE. 65536/67 B
Copying 601 file of 635 DONE. 65536/32768 B
Copying 602 file of 635 DONE. 65536/67 B
Does anyone know where the problem is?
"ignoring count value (br)"
Actually, you wrote bs.
A good example of the dangers of poor variable naming!
You should do
sizew += fwrite(buffer,1,br,fout);
You were passing bs, which was the maximum amount that fread was allowed to read. br was the amount that fread actually did read.
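A corrected loop might look like the sketch below (the file opening and bookkeeping from the question are omitted; the point is passing br to fwrite and checking its return value):

#include <stdio.h>

/* Copy fin to fout, writing exactly what fread returned each time.
 * Returns 0 on success, -1 on a read or write error. */
static int copy_stream(FILE* fin, FILE* fout)
{
    char buffer[1024 * 64];   // 64 KB, matching the original bs
    size_t br;

    while ((br = fread(buffer, 1, sizeof buffer, fin)) > 0) {
        if (fwrite(buffer, 1, br, fout) != br)   // pass br, not bs
            return -1;                           // short write
    }
    return ferror(fin) ? -1 : 0;
}

As a side note, the question declares char* buffer[bs], which is an array of 65536 pointers rather than 64 KB of characters; a plain char buffer[bs] is probably what was intended.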

Reading and writing a process's memory through /dev/mem: the text segment works but the data segment does not. Why?

I want to read from and write to a process's memory through /dev/mem.
First, I get the process's memory map through a Linux kernel module I coded myself; the output looks like this:
start_code_segment 4000000000000000
end_code_segment 4000000000019c38
start_data_segment 6000000000009c38
end_data_segment 600000000000b21d
start_brk 6000000000010000
brk 6000000000034000
start_stack 60000fffffde7b00
Second, I can convert a virtual address (VA) to a physical address (PA) through the Linux kernel module; for example, I can convert VA 0x4000000000000008 to PA 0x100100c49f8008.
Third, the function read_phy_mem can read the memory data at PA 0x100100c49f8008; the code is at the end.
Problem: when I read text-segment memory by PA, everything is OK, but if I read data-segment memory by PA, the *((long *)mapAddr) in line 243 brings the system down. I also tried
memcpy( &data, (void *)mapAddr, sizeof(long) )
but it still makes the system go down.
Other info: my machine is IA64 and the OS is Linux 2.6.18. When the system goes down, I get console output like the following, and then the system restarts.
Entered OS MCA handler. PSP=20010000fff21320 cpu=0 monarch=1
cpu 0, MCA occurred in user space, original stack not modified
All OS MCA slaves have reached rendezvous
MCA: global MCA
mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
Delaying for 5 seconds...
Code of the function read_phy_mem:
/*
* pa: physical address
* data: memory data in pa
*
* return int: success or failed
*/
188 int read_phy_mem(unsigned long pa,long *data)
189 {
190 int memfd;
191 int pageSize;
192 int shift;
193 int do_mlock;
194 void volatile *mapStart;
195 void volatile *mapAddr;
196 unsigned long pa_base;
197 unsigned long pa_offset;
198
199 memfd = open("/dev/mem", O_RDWR | O_SYNC);
200 if(memfd == -1)
201 {
202 perror("Failed to open /dev/mem");
203 return FAIL;
204 }
205
206 shift = 0;
207 pageSize = PAGE_SIZE; //#define PAGE_SIZE 16384
208 while(pageSize > 0)
209 {
210 pageSize = pageSize >> 1;
211 shift ++;
212 }
213 shift --;
214 pa_base = (pa >> shift) << shift;
215 pa_offset = pa - pa_base;
224 mapStart = (void volatile *)mmap(0, PAGE_SIZE, PROT_READ | PROT_WRITE,MAP_SHARED | MAP_LOCKED, memfd, pa_base);
226 if(mapStart == MAP_FAILED)
227 {
228 perror("Failed to mmap /dev/mem");
229 close(memfd);
230 return FAIL;
231 }
232 if(mlock((void *)mapStart, PAGE_SIZE) == -1)
233 {
234 perror("Failed to mlock mmaped space");
235 do_mlock = 0;
236 }
237 do_mlock = 1;
238
239 mapAddr = (void volatile *)((unsigned long)mapStart + pa_offset);
243 printf("mapAddr %p %d\n", mapAddr, *((long *)mapAddr));
256 if(munmap((void *)mapStart, PAGE_SIZE) != 0)
257 {
258 perror("Failed to munmap /dev/mem");
259 }
260 close(memfd);
269 return OK;
270 }
Can anyone explain why the text segment works fine but the data segment does not?
My guess is that this happens because the code section stays resident in memory while the process executes (if it is not shared-library code), whereas the data section is paged in and out continuously.
Try the stack segment and check whether that works.
Write your own test program, allocate a few KB of memory dynamically, and keep that memory in use within a loop. Then try your code on the memory segments of the test program. I think it will work.
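Something along these lines could serve as that test program; the names and sizes are arbitrary, it just keeps a heap buffer resident and prints its virtual address so the kernel module can translate it to a PA:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    size_t len = 64 * 1024;                 /* a few tens of KB on the heap */
    unsigned char *buf = malloc(len);

    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    printf("pid %ld, buffer VA %p\n", (long)getpid(), (void *)buf);

    for (;;) {                              /* keep the memory in use */
        memset(buf, 0xAB, len);
        sleep(1);
    }
    /* not reached */
}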
I have done similar work on Windows, replacing a BIOS address in the IVT.
You should be the root user.

BoehmGC - Understanding memory allocator GC_malloc

I am racking my brain trying to understand the Boehm GC allocation scheme, specifically GC_malloc. I don't see how it allocates memory; I can't find any malloc or mmap that GC_malloc calls internally.
Can someone kindly help me? Any links or code snippets would be a big help.
Huge thanks in advance.
Boehm GC source code
/* Allocate lb bytes of composite (pointerful) data */
#ifdef THREAD_LOCAL_ALLOC
void * GC_core_malloc(size_t lb)
#else
void * GC_malloc(size_t lb)
#endif
{
    void *op;
    void **opp;
    size_t lg;
    DCL_LOCK_STATE;

    if(SMALL_OBJ(lb)) {
        lg = GC_size_map[lb];
        opp = (void **)&(GC_objfreelist[lg]);
        LOCK();
        if( EXPECT((op = *opp) == 0, 0) ) {
            UNLOCK();
            return(GENERAL_MALLOC((word)lb, NORMAL));
        }
        /* See above comment on signals. */
        GC_ASSERT(0 == obj_link(op)
                  || (word)obj_link(op)
                        <= (word)GC_greatest_plausible_heap_addr
                     && (word)obj_link(op)
                        >= (word)GC_least_plausible_heap_addr);
        *opp = obj_link(op);
        obj_link(op) = 0;
        GC_bytes_allocd += GRANULES_TO_BYTES(lg);
        UNLOCK();
        return op;
    } else {
        return(GENERAL_MALLOC(lb, NORMAL));
    }
}
There are two possibilities:
1. It returns a pointer given by GENERAL_MALLOC (there are two such returns).
2. It sets op = *opp (the line with the EXPECT) and then returns op. I'd say the second case is there to reuse freed blocks.
For the second case: look at the value of opp before the if:
opp = (void **)&(GC_objfreelist[lg]);
In opp there is a pointer to the free list of objects for that size class.
The if checks whether there is any block in that list. If there isn't (== 0), it falls back to GENERAL_MALLOC.
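To make the shape of that fast path concrete, here is a generic free-list sketch; this is not Boehm GC's actual code, malloc stands in for GENERAL_MALLOC, and there is no locking:

#include <stddef.h>
#include <stdlib.h>

#define NUM_CLASSES 32

/* Each freed block's first word links to the next free block. */
typedef struct block { struct block *next; } block;

static block *freelist[NUM_CLASSES];        /* one list per size class */

void *alloc_from_class(size_t lg, size_t bytes)
{
    block *head = freelist[lg];
    if (head == NULL)                       /* list empty: slow path */
        return malloc(bytes);               /* stands in for GENERAL_MALLOC */

    freelist[lg] = head->next;              /* like *opp = obj_link(op) */
    head->next = NULL;                      /* like obj_link(op) = 0    */
    return head;
}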
Look at the os_dep.c file, where most of the OS-dependent functions are implemented.
mmap can be used by Boehm GC if it is configured to use it (see the various GC_unix_get_mem(bytes) implementations).
If mmap isn't used, the other (bare) allocator uses sbrk.

Passing char * into fopen with C

I'm writing a program that reads data from a file into an array, but I'm having trouble with fopen(). It works fine when I hardcode the file path into the parameters (e.g. fopen("data/1.dat", "r");), but when I pass the path in as a pointer, it returns NULL.
Note that line 142 prints "data/1.dat" when the path is entered from the command line, so parse_args() appears to be working.
132 int
133 main(int argc, char **argv)
134 {
135 FILE *in_file;
136 int *nextItem = (int *) malloc (sizeof (int));
137 set_t *dictionary;
138
139 /* Parse Arguments */
140 clo_t *iopts = parse_args(argc, argv);
141
142 printf ("INPUT FILE: %s.\n", iopts->input_file); /* This prints correct path */
143 /* Initialise dictionary */
144 dictionary = set_create (SET_INITAL_SIZE);
145
146 /* Use fscanf to read all data values into new set_t */
147 if ((in_file = fopen (iopts->input_file, "r")) == NULL)
148 {
149 printf ("File not found...\n");
150 return 0;
151 }
Thanks!
Rhys
MORE: If I try to print the string after I run set_create() (line 144), the string doesn't print. (But there isn't any reference to the string in that function at all...)
47 set_t *
48 set_create(int size)
49 {
50 set_t *set;
51
52 /* set set_t members */
53 set->items = 0;
54 set->n_max = size;
55 set->lock = FALSE;
56
57 /* allocate memory for dictionary input */
58 set->data = (int *) malloc (size * sizeof (int));
59
60 return set;
61 }
It does work if I call this function after fopen ().
I can't see how this is affecting the filename though...
Thanks again.
Your new code shows that you are writing to invalid memory. set is a pointer but you never initialize it. You're overwriting some random memory and thereby destroying the pointer to the string that you're passing to fopen().
Are you sure parse_args works correctly? If it, for example, returns a pointer to a local variable (or a struct that contains such pointers), the values like iopts->input_file would easily be destroyed by subsequent function calls.
That second part is your problem. set is not initialized.
To clarify: you're modifying stuff that you don't mean to, causing the fopen() to fail.
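A fixed set_create() would allocate the set_t itself before touching its members; the struct layout below is guessed from the fields used in the question:

#include <stdlib.h>

/* Layout assumed from the fields used in set_create() above. */
typedef struct {
    int  items;
    int  n_max;
    int  lock;
    int *data;
} set_t;

set_t *
set_create(int size)
{
    set_t *set = malloc(sizeof *set);   /* this allocation was missing */
    if (set == NULL)
        return NULL;

    set->items = 0;
    set->n_max = size;
    set->lock  = 0;                     /* FALSE in the original */

    set->data = malloc(size * sizeof(int));
    if (set->data == NULL) {
        free(set);
        return NULL;
    }
    return set;
}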
