Writing to allocated char buffer - c

I have the following code:
framingStatus compressToFrame(char* inputBuffer, size_t inputBufferLength, char* frame, size_t* frameLength)
{
dword crc32 = GetMaskedCrc(inputBuffer,inputBufferLength);
size_t length = snappy_max_compressed_length(inputBufferLength);
char* compressed = (char*)malloc(length);
snappy_status status = snappy_compress(inputBuffer,inputBufferLength,compressed,&length);
if( status!=SNAPPY_OK )
return FS_ERROR;
frame[0] = 0x00; // Frame type: compressed
frame[1] = length&0xff;
frame[2] = (length&0xff00)>>8;
frame[3] = (length&0xff00)>>16;
frame[4] = crc32&0xff;
frame[5] = (crc32&0xff00)>>8;
frame[6] = (crc32&0xff0000)>>16;
frame[7] = (crc32&0xff000000)>>24;
frame[8] = '\0'; // Helper for strcat
strcat(frame,compressed);
*frameLength = length+8;
free(compressed);
return FS_OK;
}
Before calling this function I allocate memory for the buffer named frame. Everything seems fine, but the assignments frame[x] = ... don't seem to write anything to frame, and strcat concatenates the compressed data onto an empty buffer without the header I need.
Why don't the assignments frame[x] = ... have any effect?
[EDIT:]
Can you suggest what function I have to use if I want to concatenate frame header with compressed data?
[EDIT2:]
Code presented below works just fine.
framingStatus compressToFrame(char* inputBuffer, size_t inputBufferLength, char* frame, size_t* frameLength)
{
dword crc32 = GetMaskedCrc(inputBuffer,inputBufferLength);
size_t length = snappy_max_compressed_length(inputBufferLength);
char* compressed = (char*)malloc(length);
snappy_status status = snappy_compress(inputBuffer,inputBufferLength,compressed,&length);
if( status!=SNAPPY_OK )
return FS_ERROR;
frame[0] = 0x00; // Frame type: compressed
frame[1] = length;
frame[2] = length >> 8;
frame[3] = length >> 16;
frame[4] = crc32;
frame[5] = crc32 >>8;
frame[6] = crc32 >>16;
frame[7] = crc32 >>24;
memcpy(&frame[8],compressed,length);
*frameLength = length+8;
free(compressed);
return FS_OK;
}

You have
frame[0] = 0x00;
which is the same as
frame[0] = '\0';
No matter what you add after the first character, the string functions treat frame as a zero-length string.

strcat is for strings, not general binary bytes. Because the first byte of frame is zero, strcat will copy compressed starting at frame[0] and will stop copying when it sees a zero byte in compressed.
Try memcpy instead.
memcpy(&frame[8], compressed, length);
Also, since the length of frame is passed as an argument, you might want to be checking the total length you are copying to frame to make sure there's no illegal overwrite in that case.
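The length check suggested above could look like the following sketch. It assumes the caller passes the capacity of frame as an extra parameter (frameCapacity, a name that is not in the original code), and build_frame is a hypothetical helper that takes an already compressed payload, so the snappy calls are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HEADER_SIZE 8u

/* Hypothetical helper: writes the 8-byte header followed by the payload.
 * Returns the total frame length, or 0 if the frame would not fit or the
 * payload length does not fit the 24-bit length field. */
size_t build_frame(unsigned char *frame, size_t frameCapacity,
                   const unsigned char *payload, size_t payloadLength,
                   uint32_t crc32)
{
    assert(frame != NULL && payload != NULL);

    if (payloadLength > 0xFFFFFFu)
        return 0;
    if (frameCapacity < FRAME_HEADER_SIZE + payloadLength)
        return 0;

    frame[0] = 0x00;                                /* frame type: compressed */
    frame[1] = (unsigned char)(payloadLength);      /* length, low byte first */
    frame[2] = (unsigned char)(payloadLength >> 8);
    frame[3] = (unsigned char)(payloadLength >> 16);
    frame[4] = (unsigned char)(crc32);              /* CRC, low byte first */
    frame[5] = (unsigned char)(crc32 >> 8);
    frame[6] = (unsigned char)(crc32 >> 16);
    frame[7] = (unsigned char)(crc32 >> 24);

    /* memcpy copies exactly payloadLength bytes, including any zero bytes */
    memcpy(frame + FRAME_HEADER_SIZE, payload, payloadLength);
    return FRAME_HEADER_SIZE + payloadLength;
}
```

A return value of 0 tells the caller that nothing usable was written, which avoids writing past the end of frame when the compressed data turns out larger than expected.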

As others already pointed out, you use binary data and not text strings. Therefore, strcat function is inappropriate here, use memcpy instead.
Furthermore, you should use unsigned char instead of plain char.
Additionally, you don't need to mask the values before shifting
frame[2] = (length&0xff00)>>8;
could be just
frame[2] = length >> 8;
And in this case it is even buggy, because the mask 0xff00 keeps only bits 8 to 15, and shifting those right by 16 always gives zero
frame[3] = (length&0xff00)>>16;
Same here
frame[5] = crc32 >> 8;
frame[6] = crc32 >> 16;
frame[7] = crc32 >> 24;
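As a small illustration of the point about masking, the sketch below stores a 24-bit length by shifting only and keeps the expression from the question next to it for comparison. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Stores the low 24 bits of length, least significant byte first.
 * Converting to unsigned char keeps only the low 8 bits, so no mask is needed. */
void put_length24(unsigned char *out, size_t length)
{
    assert(out != NULL);
    out[0] = (unsigned char)length;          /* bits 0..7   */
    out[1] = (unsigned char)(length >> 8);   /* bits 8..15  */
    out[2] = (unsigned char)(length >> 16);  /* bits 16..23 */
}

/* The expression from the question: the mask keeps bits 8..15,
 * so shifting the result right by 16 always yields zero. */
unsigned char third_byte_as_in_question(size_t length)
{
    return (unsigned char)((length & 0xff00) >> 16);
}
```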

frame[0] = 0x00;
makes the first character the end-of-string character, so your string frame is empty.
frame[0] = 0x00;
is the same as writing
frame[0] = '\0';
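The effect can be reproduced with a few lines. This is only a sketch with a made-up function name: it fills a header whose first byte is zero, as in the question, and then calls strcat.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mimics the question: writes a header starting with 0x00, then calls strcat.
 * Returns the length that strlen reports just before the strcat call. */
size_t header_then_strcat(char *frame, size_t capacity, const char *text)
{
    assert(frame != NULL && text != NULL);
    assert(capacity >= 9 && capacity > strlen(text));

    frame[0] = 0x00;               /* this byte is also the string terminator */
    frame[1] = 'A';
    frame[2] = 'B';
    frame[8] = '\0';

    size_t seen = strlen(frame);   /* the string ends at frame[0] */
    strcat(frame, text);           /* so text is written starting at frame[0] */
    return seen;
}
```

After the call the header bytes at frame[1] and frame[2] have been overwritten by the appended text.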

Related

String Allocated with malloc Inaccessible after Function Return

I have run into a strange bug and I cannot for the life of me get it figured out. I have a function that decodes a byte array into a string based on another encoding function. The function that decodes looks roughly like this:
char *decode_string( uint8_t *encoded_string, uint32_t length,
uint8_t encoding_bits ) {
char *sequence_string;
uint32_t idx = 0;
uint32_t posn_in_buffer;
uint32_t posn_in_cell;
uint32_t encoded_nucleotide;
uint32_t bit_mask;
// Useful Constants
const uint8_t CELL_SIZE = 8;
const uint8_t NUCL_PER_CELL = CELL_SIZE / encoding_bits;
sequence_string = malloc( sizeof(char) * (length + 1) );
if ( !sequence_string ) {
ERR_PRINT("could not allocate enough space to decode the string\n");
return NULL;
}
// Iterate over the buffer, converting one nucleotide at a time.
while ( idx < length ) {
posn_in_buffer = idx / NUCL_PER_CELL;
posn_in_cell = idx % NUCL_PER_CELL;
encoded_nucleotide = encoded_string[posn_in_buffer];
encoded_nucleotide >>= (CELL_SIZE - encoding_bits*(posn_in_cell+1));
bit_mask = (1 << encoding_bits) - 1;
encoded_nucleotide &= bit_mask;
sequence_string[idx] = decode_nucleotide( encoded_nucleotide );
// decode_nucleotide returns a char on integer input.
idx++;
}
sequence_string[idx] = '\0';
printf("%s", sequence_string); // prints the correct string
return sequence_string;
}
The bug is that the return pointer, if I try to print it, causes a segmentation fault. But calling printf("%s\n", sequence_string) inside of the function will print everything just fine. If I call the function like this:
const char *seq = "AA";
uint8_t *encoded_seq;
encode_string( &encoded_seq, seq, 2, 2);
char *decoded_seq = decode_string( encoded_seq, 2, 2);
if ( decoded_seq ) {
printf("%s\n",decoded_seq); // this crashes
if ( !strcmp(decoded_seq, seq) ) {
printf("Success!");
}
}
then it will crash on the print.
A few notes, the other functions seem to all work, I've tested them fairly thoroughly (i.e. decode_nucleotide, encode_string). The string also prints correctly inside the function. It is only after the function returns that it stops working.
My question is, what might cause this memory to become invalid just by returning the pointer from a function? Thanks in advance!
First (and not that important, but) in the statement:
sequence_string = malloc( sizeof(char) * (length + 1) );
sizeof(char) by definition is always == 1. so the statement becomes:
sequence_string = malloc(length + 1);
In this section of your post:
char *decoded_seq = decode_string( encoded_seq, 2, 2);
...since I cannot see your implementation of decode_string, I can only make assumptions about how you are verifying its output before returning it. I do however understand that you are expecting the return value to contain values that would be legal contents for a C string. I can also assume that because you are working with coding and decoding, that the output type is likely unsigned char. If I am correct, then a legal range of characters for an output type of unsigned char is 0-255.
You are not checking the output before sending the value to the printf statement. If the value at the memory address of decoded_seq happens to be 0, (in the range of unsigned char) your program would crash. String functions do not work well with null pointers.
You should verify the return value of decode_string before sending it to printf:
char *decoded_seq = decode_string( encoded_seq, 2, 2);
if(decoded_seq != NULL)
{
...

How to Hash CPU ID in C

I'm trying to shorten the CPU ID of my microcontroller (STM32F1).
The CPU ID is composed of 3 words (3 x 4 bytes). This is the ID string built from the 3 words: 980416578761680031125348904
I found a very useful library that does this.
The library is Hashids, and there is a C implementation.
I built a test program on a PC with the "Code Blocks IDE" and the code works.
But when I move the code to the embedded side (Keil v5 IDE), I get an error on the strdup() function: "strdup implicit declaration of function".
The problem is that strdup isn't a standard library function and isn't declared in string.h.
I would rather not replace strdup with a custom function (one that mimics the behaviour of strdup), to avoid memory leaks, because strdup copies strings using malloc.
Is there a different approach to compress long numbers?
Thanks for the help!
<---Appendix--->
This is the function that uses the strdup.
/* common init */
struct hashids_t *
hashids_init3(const char *salt, size_t min_hash_length, const char *alphabet)
{
struct hashids_t *result;
unsigned int i, j;
size_t len;
char ch, *p;
hashids_errno = HASHIDS_ERROR_OK;
/* allocate the structure */
result = _hashids_alloc(sizeof(struct hashids_t));
if (HASHIDS_UNLIKELY(!result)) {
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* allocate enough space for the alphabet and its copies */
len = strlen(alphabet) + 1;
result->alphabet = _hashids_alloc(len);
result->alphabet_copy_1 = _hashids_alloc(len);
result->alphabet_copy_2 = _hashids_alloc(len);
if (HASHIDS_UNLIKELY(!result->alphabet || !result->alphabet_copy_1
|| !result->alphabet_copy_2)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* extract only the unique characters */
result->alphabet[0] = '\0';
for (i = 0, j = 0; i < len; ++i) {
ch = alphabet[i];
if (!strchr(result->alphabet, ch)) {
result->alphabet[j++] = ch;
}
}
result->alphabet[j] = '\0';
/* store alphabet length */
result->alphabet_length = j;
/* check length and whitespace */
if (result->alphabet_length < HASHIDS_MIN_ALPHABET_LENGTH) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_LENGTH;
return NULL;
}
if (strchr(result->alphabet, ' ')) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_SPACE;
return NULL;
}
/* copy salt */
result->salt = strdup(salt ? salt : HASHIDS_DEFAULT_SALT);
result->salt_length = (unsigned int) strlen(result->salt);
/* allocate enough space for separators */
result->separators = _hashids_alloc((size_t)
(ceil((float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR) + 1));
if (HASHIDS_UNLIKELY(!result->separators)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* non-alphabet characters cannot be separators */
for (i = 0, j = 0; i < strlen(HASHIDS_DEFAULT_SEPARATORS); ++i) {
ch = HASHIDS_DEFAULT_SEPARATORS[i];
if ((p = strchr(result->alphabet, ch))) {
result->separators[j++] = ch;
/* also remove separators from alphabet */
memmove(p, p + 1,
strlen(result->alphabet) - (p - result->alphabet));
}
}
/* store separators length */
result->separators_count = j;
/* subtract separators count from alphabet length */
result->alphabet_length -= result->separators_count;
/* shuffle the separators */
hashids_shuffle(result->separators, result->separators_count,
result->salt, result->salt_length);
/* check if we have any/enough separators */
if (!result->separators_count
|| (((float)result->alphabet_length / (float)result->separators_count)
> HASHIDS_SEPARATOR_DIVISOR)) {
unsigned int separators_count = (unsigned int)ceil(
(float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR);
if (separators_count == 1) {
separators_count = 2;
}
if (separators_count > result->separators_count) {
/* we need more separators - get some from alphabet */
int diff = separators_count - result->separators_count;
strncat(result->separators, result->alphabet, diff);
memmove(result->alphabet, result->alphabet + diff,
result->alphabet_length - diff + 1);
result->separators_count += diff;
result->alphabet_length -= diff;
} else {
/* we have more than enough - truncate */
result->separators[separators_count] = '\0';
result->separators_count = separators_count;
}
}
/* shuffle alphabet */
hashids_shuffle(result->alphabet, result->alphabet_length,
result->salt, result->salt_length);
/* allocate guards */
result->guards_count = (unsigned int) ceil((float)result->alphabet_length
/ HASHIDS_GUARD_DIVISOR);
result->guards = _hashids_alloc(result->guards_count + 1);
if (HASHIDS_UNLIKELY(!result->guards)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
if (HASHIDS_UNLIKELY(result->alphabet_length < 3)) {
/* take some from separators */
strncpy(result->guards, result->separators, result->guards_count);
memmove(result->separators, result->separators + result->guards_count,
result->separators_count - result->guards_count + 1);
result->separators_count -= result->guards_count;
} else {
/* take them from alphabet */
strncpy(result->guards, result->alphabet, result->guards_count);
memmove(result->alphabet, result->alphabet + result->guards_count,
result->alphabet_length - result->guards_count + 1);
result->alphabet_length -= result->guards_count;
}
/* set min hash length */
result->min_hash_length = min_hash_length;
/* return result happily */
return result;
}
The true question seems to be
Is there a different approach to compress long numbers?
There are many. They differ in several respects, including which bits of the input contribute to the output, how many inputs map to the same output, and what manner of transformations of the input leave the output unchanged.
As trivial examples, you can compress the input to a single bit by any of these approaches:
Choose the lowest-order bit of the input
Choose the highest-order bit of the input
The output is always 1
etc
Or you can compress to 7 bits by using the number of 1 bits in the input as the output.
None of those particular options is likely to be of interest to you, of course.
Perhaps you would be more interested in producing 32-bit outputs for your 96-bit inputs. Do note that in that case on average there will be 2^64 possible inputs that map to each possible output. That depends only on the sizes of input and output, not on any details of the conversion.
For example, suppose that you have
uint32_t *cpuid = ...;
pointing to the hardware CPU ID. You can produce a 32-bit value from it that depends on all the bits of the input simply by doing this:
uint32_t cpuid32 = cpuid[0] ^ cpuid[1] ^ cpuid[2];
Whether that would suit your purpose depends on how you intend to use it.
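One property of the XOR approach is that swapping two of the words gives the same result. If that matters, a byte-wise hash is one alternative. The sketch below uses 32-bit FNV-1a, which is a technique swapped in for illustration and not something the library or the question uses; hash_cpuid is a hypothetical name, and feeding the words in little-endian byte order is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte buffer. */
uint32_t fnv1a32(const unsigned char *data, size_t len)
{
    assert(data != NULL);
    uint32_t hash = 2166136261u;        /* FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 16777619u;              /* FNV prime */
    }
    return hash;
}

/* Hypothetical helper: hashes the three 32-bit ID words, byte by byte,
 * taking the least significant byte of each word first. */
uint32_t hash_cpuid(const uint32_t words[3])
{
    unsigned char bytes[12];
    assert(words != NULL);
    for (size_t w = 0; w < 3; ++w)
        for (size_t b = 0; b < 4; ++b)
            bytes[4 * w + b] = (unsigned char)(words[w] >> (8 * b));
    return fnv1a32(bytes, sizeof bytes);
}
```

As with any 96-bit to 32-bit mapping, distinct IDs can still collide.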
You can easily implement strdup yourself like this:
char* strdup (const char* str)
{
size_t size = strlen(str) + 1; /* include the terminating '\0' */
char* result = malloc(size);
if(result != NULL)
{
memcpy(result, str, size);
}
return result;
}
That being said, using malloc or strdup on an embedded system is most likely just nonsense practice, see this. Nor would you use float numbers. Overall, that library seems to have been written by a desktop-minded person.
If you are implementing something like for example a chained hash table on an embedded system, you would use a statically allocated memory pool and not malloc. I'd probably go with a non-chained one for that reason (upon duplicates, pick next free spot in the buffer).
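For the strdup call in question, one way to avoid malloc is to copy the salt into a statically allocated buffer. This is a minimal sketch, not part of the Hashids library: store_salt and SALT_CAPACITY are made-up names, and the capacity of 32 bytes is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SALT_CAPACITY 32u   /* assumed upper bound, including the '\0' */

static char salt_storage[SALT_CAPACITY];

/* Copies salt into the static buffer instead of calling malloc.
 * Returns a pointer to the stored copy, or NULL if salt does not fit. */
const char *store_salt(const char *salt)
{
    assert(salt != NULL);
    size_t size = strlen(salt) + 1;     /* include the terminator */
    if (size > SALT_CAPACITY)
        return NULL;
    memcpy(salt_storage, salt, size);
    return salt_storage;
}
```

The trade-off is that only one salt can be stored at a time and its maximum length is fixed at compile time.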
Unique device ID register (96 bits) is located under address 0x1FFFF7E8. It is factory programmed and is read-only. You can read it directly without using any other external library. For example:
unsigned int b = *(volatile unsigned int *)0x1FFFF7E8;
should give you the first 32 bits (31:0) of the unique device ID. If you want to retrieve a string as in case of the library mentioned, the following should work:
sprintf(id, "%08X%08X%08X", *(volatile unsigned int *)0x1FFFF7E8, *(volatile unsigned int *)(0x1FFFF7E8 + 4), *(volatile unsigned int *)(0x1FFFF7E8 + 8));
Some additional casting may be required, but generally that's what the library did. Please refer to STM32F1xx Reference Manual (RM0008), section 30.2 for more details. The exact memory location to read from is different in case of Cortex-M4 family of the MCUs.
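Wrapping the formatting in a function that takes a pointer to the three words makes it possible to try it on a PC with fake data. The sketch below assumes the address given above; format_uid and UID_BASE_ADDRESS are names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Base address of the 96-bit unique ID on STM32F1, as given above. */
#define UID_BASE_ADDRESS 0x1FFFF7E8u

/* Formats three 32-bit ID words as 24 hexadecimal digits.
 * out needs room for at least 25 bytes (24 digits plus '\0').
 * Returns what snprintf returns, 24 when everything fits. */
int format_uid(char *out, size_t out_size, const volatile uint32_t *uid)
{
    assert(out != NULL && uid != NULL);
    /* cast to unsigned long so the arguments match %lX on every platform */
    return snprintf(out, out_size, "%08lX%08lX%08lX",
                    (unsigned long)uid[0],
                    (unsigned long)uid[1],
                    (unsigned long)uid[2]);
}
```

On the target the call would be format_uid(id, sizeof id, (const volatile uint32_t *)UID_BASE_ADDRESS); on a PC any array of three uint32_t values can stand in for the register.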

realloc overwrite variable (Xilinx SDK on a Zynq SoC (Cortex A9))

As mentioned I have a Zynq SoC (ZC706 Eval Board) and I'm trying to read an image from the SD Card. To do this I'm using the FatFs lib (http://elm-chan.org/fsw/ff/00index_e.html).
In my code I read 4096 bytes from the file and save them to a buffer. After that I copy the buffer to an unsigned char array whose size I increase after every read operation.
When I'm using realloc, the for loop in the copyU32ArrayToUnsignedCharArray function fails because the size variable is overwritten by writes to the out array.
Code that overwrites "size" in the copyU32ArrayToUnsignedCharArray function:
u32 buffer[1024];
unsigned char *img = NULL;
bytesreaded = 0;
for (;;) {
br=0;
fr = f_read(&fil, buffer, sizeof(buffer), &br); /* Read a chunk of source file */
if (fr || br == 0)
break; /* error or eof */
img = realloc(img,br);
copyU32ArrayToUnsignedCharArray(buffer, &img[bytesreaded], br/4); // /4 because u32(32 bit) in to unsigned char(8 bit)
bytesreaded += br; // update readed bytes
}
The code that worked:
u32 buffer[1024];
unsigned char *img = NULL;
img = malloc(512*512*3+100);
bytesreaded = 0;
for (;;) {
br=0;
fr = f_read(&fil, buffer, sizeof(buffer), &br); /* Read a chunk of source file */
if (fr || br == 0)
break; /* error or eof */
copyU32ArrayToUnsignedCharArray(buffer, &img[bytesreaded], br/4); // /4 because u32(32 bit) in to unsigned char(8 bit)
bytesreaded += br; // update readed bytes
}
The copyU32ArrayToUnsignedCharArray function:
void copyU32ArrayToUnsignedCharArray(u32 *in, unsigned char* out, uint size){
int i,x;
x = 0;
for (i = 0; i < size; i++) {
if(size != 1024)
break;
in[i] = Xil_In32BE(&in[i]);
out[x] = (u32) in[i] >> 24;
out[x + 1] = (u32) in[i] >> 16 & 0x00FF;
out[x + 2] = (u32) in[i] >> 8 & 0x0000FF;
out[x + 3] = (u32) in[i] & 0x000000FF;
x += 4;
}
}
I want to use realloc because I don't know how big the image will be that I read.
Update:
Some further information about the code that doesn't work. I debugged it and the img pointer isn't null, so the realloc was successful. If I'm using gdb, the following things happen in the copyU32ArrayToUnsignedCharArray function:
- pointer to the variable "out" is 0x001125a8
- the address of the "size" variable is 0x0011309c (the value that is stored at this location is correct)
- the space in memory between this two variables is 0xaf4 = 2804 dec (difference of the two addresses)
- if the for loop within the copyU32ArrayToUnsignedCharArray function reached i=702 and x=2808 the size variable is changed to another value
Sincerely,
Arno
I solved the problem with the hint from Notlikethat. The problem was the small heap size. Increasing the heap is done by editing the linker script file.
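Independent of the heap size, note that realloc expects the new total size of the block, while the failing loop passes only the size of the latest chunk (br). A minimal sketch of a growing buffer, with made-up names, could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Appends chunk_len bytes to *buf, which currently holds *used bytes.
 * Returns 0 on success, -1 if realloc fails (the old buffer stays valid). */
int append_chunk(unsigned char **buf, size_t *used,
                 const unsigned char *chunk, size_t chunk_len)
{
    assert(buf != NULL && used != NULL && chunk != NULL);

    if (chunk_len == 0)
        return 0;                       /* nothing to append */

    /* realloc takes the new total size, not the size of the increment */
    unsigned char *grown = realloc(*buf, *used + chunk_len);
    if (grown == NULL)
        return -1;

    memcpy(grown + *used, chunk, chunk_len);
    *buf = grown;
    *used += chunk_len;
    return 0;
}
```

Growing by one chunk per call keeps the sketch short; growing in larger steps would reduce the number of realloc calls.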

Create a string from uint32/16_t and then parse back the original numbers

I need to put into a char* some uint32_t and uint16_t numbers. Then I need to get them back from the buffer.
I have read some questions and I've tried to use sprintf to put them into the char* and sscanf get the original numbers again. However, I'm not able to get them correctly.
Here's an example of my code with only 2 numbers. But I need more than 2, that's why I use realloc. Also, I don't know how to use sprintf and sscanf properly with uint16_t
uint32_t gid = 1100;
uint32_t uid = 1000;
char* buffer = NULL;
uint32_t offset = 0;
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
uint32_t valorGID;
uint32_t valorUID;
sscanf(buffer, "%d", &valorGID);
buffer += sizeof(uint32_t);
sscanf(buffer, "%d", &valorUID);
printf("ValorGID %d ValorUID %d \n", valorGID, valorUID);
And what I get is
ValorGID 11001000 ValorUID 1000
What I need to get is
ValorGID 1100 ValorUID 1000
I am new to C, so any help would be appreciated.
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
This doesn't really make sense, and will not work as intended except in lucky circumstances.
Let us assume that the usual CHAR_BIT == 8 holds, so sizeof(uint32_t) == 4. Further, let us assume that int is a signed 32-bit integer in two's complement representation without padding bits.
sprintf(buffer, "%d", gid) prints the decimal string representation of the bit-pattern of gid interpreted as an int to buffer. Under the above assumptions, gid is interpreted as a number between -2147483648 and 2147483647 inclusive. Thus the decimal string representation may contain a '-', contains 1 to 10 digits and the 0-terminator, altogether it uses two to twelve bytes. But you have allocated only four bytes, so whenever 999 < gid < 2^32-99 (the signed two's complement interpretation is > 999 or < -99), sprintf writes past the allocated buffer size.
That is undefined behaviour.
It's likely to not crash immediately because allocating four bytes usually gives you a larger chunk of memory effectively (if e.g. malloc always returns 16-byte aligned blocks, the twelve bytes directly behind the allocated four cannot be used by other parts of the programme, but belong to the programme's address space, and writing to them will probably go undetected). But it can easily crash later when the end of the allocated chunk lies on a page boundary.
Also, since you advance the write offset by four bytes for subsequent sprintfs, part of the previous number gets overwritten if the string representation (excluding the 0-terminator) used more than four bytes (while the programme didn't yet crash due to writing to non-allocated memory).
The line
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
contains further errors.
buffer = realloc(buffer, new_size); loses the reference to the allocated memory and causes a leak if realloc fails. Use a temporary and check for success
char *temp = realloc(buffer, new_size);
if (temp == NULL) {
/* reallocation failed, recover or cleanup */
free(buffer);
exit(EXIT_FAILURE);
}
/* it worked */
buffer = temp;
/* temp = NULL; or let temp go out of scope */
The new size sizeof(uint32_t) + sizeof(buffer) of the new allocation is always the same, sizeof(uint32_t) + sizeof(char*). That's typically eight or twelve bytes, so it doesn't take many numbers to write outside the allocated area and cause a crash or memory corruption (which may cause a crash much later).
You must keep track of the number of bytes allocated to buffer and use that to calculate the new size. There is no (portable¹) way to determine the size of the allocated memory block from the pointer to its start.
Now the question is whether you want to store the string representations or the bit patterns in the buffer.
Storing the string representations has the problem that the length of the string representation varies with the value. So you need to include separators between the representations of the numbers, or ensure that all representations have the same length by padding (with spaces or leading zeros) if necessary. That would for example work like
#include <stdint.h>
#include <inttypes.h>
#define MAKESTR(x) # x
#define STR(x) MAKESTR(x)
/* A uint32_t can use 10 decimal digits, so let each field be 10 chars wide */
#define FIELD_WIDTH 10
uint32_t gid = 1100;
uint32_t uid = 1000;
size_t buf_size = 0, offset = 0;
char *buffer = NULL, *temp = NULL;
buffer = realloc(buffer, FIELD_WIDTH + 1); /* one for the '\0' */
if (buffer == NULL) {
exit(EXIT_FAILURE);
}
buf_size = FIELD_WIDTH + 1;
sprintf(buffer, "%0" STR(FIELD_WIDTH) PRIu32, gid);
offset += FIELD_WIDTH;
temp = realloc(buffer, buf_size + FIELD_WIDTH);
if (temp == NULL) {
free(buffer);
exit(EXIT_FAILURE);
}
buffer = temp;
temp = NULL;
buf_size += FIELD_WIDTH;
sprintf(buffer + offset, "%0" STR(FIELD_WIDTH) PRIu32, uid);
offset += FIELD_WIDTH;
/* more */
uint32_t valorGID;
uint32_t valorUID;
/* rewind for scanning */
offset = 0;
sscanf(buffer + offset, "%" STR(FIELD_WIDTH) SCNu32, &valorGID);
offset += FIELD_WIDTH;
sscanf(buffer + offset, "%" STR(FIELD_WIDTH) SCNu32, &valorUID);
printf("ValorGID %" PRIu32 " ValorUID %" PRIu32 " \n", valorGID, valorUID);
with zero-padded fixed-width fields. If you'd rather use separators than a fixed width, the calculation of the required length and the offsets becomes more complicated, but unless the numbers are large, it would use less space.
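For comparison, a separator-based variant for exactly two values can be sketched as follows. The function names are made up, and the buffer is supplied by the caller, so the realloc bookkeeping is left out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes both values separated by a space. Returns the number of characters
 * written (excluding the '\0'), or -1 if out is too small. */
int pack_ids(char *out, size_t out_size, uint32_t gid, uint32_t uid)
{
    assert(out != NULL);
    int n = snprintf(out, out_size, "%" PRIu32 " %" PRIu32, gid, uid);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Parses the two values back. Returns 0 on success, -1 otherwise. */
int unpack_ids(const char *in, uint32_t *gid, uint32_t *uid)
{
    assert(in != NULL && gid != NULL && uid != NULL);
    return sscanf(in, "%" SCNu32 " %" SCNu32, gid, uid) == 2 ? 0 : -1;
}
```

Because the separator marks where each number ends, no padding is needed and sscanf can read the values back without a field width.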
If you'd rather store the bit-patterns, which would be the most compact way of storing, you'd use something like
size_t buf_size = 0, offset = 0;
unsigned char *buffer = NULL, *temp = NULL;
buffer = realloc(buffer, sizeof(uint32_t));
if (buffer == NULL) {
exit(EXIT_FAILURE);
}
buf_size = sizeof(uint32_t);
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
buffer[offset + b] = (gid >> b*8) & 0xFF;
}
offset += sizeof(uint32_t);
temp = realloc(buffer, buf_size + sizeof(uint32_t));
if (temp == NULL) {
free(buffer);
exit(EXIT_FAILURE);
}
buffer = temp;
temp = NULL;
buf_size += sizeof(uint32_t);
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
buffer[offset + b] = (uid >> b*8) & 0xFF;
}
offset += sizeof(uint32_t);
/* And for reading the values */
uint32_t valorGID, valorUID;
/* rewind */
offset = 0;
valorGID = 0;
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
valorGID |= (uint32_t)buffer[offset + b] << b*8;
}
offset += sizeof(uint32_t);
valorUID = 0;
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
valorUID |= (uint32_t)buffer[offset + b] << b*8;
}
offset += sizeof(uint32_t);
¹ If you know how malloc etc. work in your implementation, it may be possible to find the size from malloc's bookkeeping data.
The format specifier '%d' is for int and thus is wrong for uint32_t. First, uint32_t is an unsigned type, so you should at least use '%u', but then it might also have a different width than int or unsigned. The standard provides macros for this in <inttypes.h>: PRIu32 for printf and SCNu32 for scanf. As an example:
sprintf(buffer, "%" PRIu32, gid);
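The same header also has macros for uint16_t, which the question asks about: PRIu16 and SCNu16. A short sketch with hypothetical helper names:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a uint16_t as decimal text. Returns what snprintf returns. */
int format_u16(char *out, size_t out_size, uint16_t value)
{
    assert(out != NULL);
    return snprintf(out, out_size, "%" PRIu16, value);
}

/* Parses decimal text into a uint16_t. Returns 0 on success, -1 otherwise. */
int parse_u16(const char *in, uint16_t *value)
{
    assert(in != NULL && value != NULL);
    return sscanf(in, "%" SCNu16, value) == 1 ? 0 : -1;
}
```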
The representation returned by sprintf is a char*. If you are trying to store an array of integers as their string representations then your fundamental data type is a char**. This is a ragged matrix of char if we are storing only the string data itself, but since the longest string a uint32_t can yield is 10 chars, plus one for the terminating null, it makes sense to preallocate this many bytes to hold each string.
So to store n uint32_t's from array a in array s as strings:
const size_t kMaxIntLen=11; // 10 digits plus the terminating null
uint32_t *a,b;
size_t n,i;
char **s,*d;
// fill a and set n somehow
...
if((d=(char*)malloc(n*kMaxIntLen))==NULL)
{
// error!
}
if((s=(char**)malloc(n*sizeof(char*)))==NULL)
{
// error!
}
for(i=0;i<n;i++)
{
s[i]=d+i*kMaxIntLen; // each string gets its own kMaxIntLen-byte slot in d
snprintf(s[i],kMaxIntLen,"%" PRIu32,a[i]); // PRIu32 comes from <inttypes.h>; snprintf to be safe
}
Now the ith number is at s[i] so to print it is just printf("%s",s[i]);, and to retrieve it as an integer into b is sscanf(s[i],"%u",&b);.
Subsequent memory management is a bit trickier. Rather than constantly using realloc() to grow the buffer, it is better to preallocate a chunk of memory and only alter it when exhausted. If realloc() fails it returns NULL, so store a pointer to your main buffer before calling it and that way you won't lose a reference to your data. Reallocate the d buffer first - again allocate enough room for several more strings - then if it succeeds see if d has changed. If so, destroy (free()) the s buffer, malloc() it again and rebuild the indices (you have to do this since if d has changed all your indices are stale). If not, realloc() s and fix up the new indices. I would suggest wrapping this whole thing in a structure and having a set of routines to operate on it, e.g.:
typedef struct StringArray
{
char **strArray;
char *data;
size_t nStrings;
} StringArray;
This is a lot of work. Do you have to use C? This is vastly easier as a C++ STL vector<string> or list<string> with the istringstream classes and the push_back() container method.
uint32_t gid = 1100;
uint32_t uid = 1000;
char* buffer = NULL;
uint32_t offset = 0;
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
uint32_t valorGID;
uint32_t valorUID;
sscanf(buffer, "%4d", &valorGID);
buffer += sizeof(uint32_t);
sscanf(buffer, "%d", &valorUID);
printf("ValorGID %d ValorUID %d \n", valorGID, valorUID);
I think this may resolve the issue!

How to process a string char by char in the XS code

Let's suppose there is a piece of code like this:
my $str = 'some text';
my $result = my_subroutine($str);
and my_subroutine() should be implemented as Perl XS code. For example it could return the sum of bytes of the (unicode) string.
In the XS code, how do I process a string (a) char by char, as a general method, and (b) byte by byte, if the string consists only of the ASCII subset (is there a built-in function to convert from the native data structure of a string to char[])?
At the XS layer, you'll get byte or UTF-8 strings. In the general case, your code will likely contain a char * to point at the next item in the string, incrementing it as it goes. For a useful set of UTF-8 support functions to use in XS, read the "Unicode Support" section of perlapi.
An example of mine from http://cpansearch.perl.org/src/PEVANS/Tickit-0.15/lib/Tickit/Utils.xs
int textwidth(str)
SV *str
INIT:
STRLEN len;
const char *s, *e;
CODE:
RETVAL = 0;
if(!SvUTF8(str)) {
str = sv_mortalcopy(str);
sv_utf8_upgrade(str);
}
s = SvPV_const(str, len);
e = s + len;
while(s < e) {
UV ord = utf8n_to_uvchr(s, e-s, &len, (UTF8_DISALLOW_SURROGATE
|UTF8_WARN_SURROGATE
|UTF8_DISALLOW_FE_FF
|UTF8_WARN_FE_FF
|UTF8_WARN_NONCHAR));
int width = wcwidth(ord);
if(width == -1)
XSRETURN_UNDEF;
s += len;
RETVAL += width;
}
OUTPUT:
RETVAL
In brief, this function iterates the given string one Unicode character at a time, accumulating the width as given by wcwidth().
If you're expecting bytes:
STRLEN len;
char* buf = SvPVbyte(sv, len);
while (len--) {
char byte = *(buf++);
... do something with byte ...
}
If you're expecting text or any non-byte characters:
STRLEN len;
U8* buf = SvPVutf8(sv, len);
while (len) {
STRLEN ch_len;
UV ch = utf8n_to_uvchr(buf, len, &ch_len, 0);
buf += ch_len;
len -= ch_len;
... do something with ch ...
}
