SegFault due to invalid assignment? - c

I have an array that is declared inside a public struct like this:
uint16_t *registers;
In a function I'm retrieving a char string (stored in buffer, see code below) that contains numerical values separated by a comma (e.g., "1,12,0,136,5,76,1243"). My goal is to get each individual numerical value and store it in the array, one after another.
i = 0;
const char delimiter[] = ",";
char *end;
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
tmp.registers[i] = strtol(tmp.vals, &end, 10);
tmp.vals = strtok(NULL, delimiter);
i++;
}
The problem is that the line containing strtol is producing a Segmentation fault (core dumped) error. I'm pretty sure it's caused by trying to fit unsigned long values into uint16_t array slots but no matter what I try I can't get it fixed.
Changing the code as follows seems to have solved the problem:
unsigned long num = 0;
size_t size = 0;
i = 0;
size = 1;
tmp.vals = (char *)calloc(strlen(buffer) + 1, sizeof(char));
tmp.registers = (uint16_t *)calloc(size, sizeof(uint16_t));
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
num = strtoul(tmp.vals, &end, 10);
if (0 <= num && num < 65536) {
tmp.registers = (uint16_t *)realloc(tmp.registers, size + i);
tmp.registers[i] = (uint16_t)num;
} else {
fprintf(stderr, "==> %lu is too large to fit in register[%d]\n", num, i);
}
tmp.vals = strtok(NULL, delimiter);
i++;
}

A long integer is at least 32 bits, so yes, you're going to lose information trying to shove a signed 32 bit integer into an unsigned 16 bit integer. If you have compiler warnings on (I use -Wall -Wshadow -Wwrite-strings -Wextra -Wconversion -std=c99 -pedantic) it should tell you that.
test.c:20:28: warning: implicit conversion loses integer precision: 'long' to 'uint16_t'
(aka 'unsigned short') [-Wconversion]
tmp.registers[i] = strtol(tmp.vals, &end, 10);
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~
However, this isn't going to cause a segfault. You'll lose 16 bits and the change in sign will do funny things.
#include <stdio.h>
#include <inttypes.h>
int main() {
long big = 1234567;
uint16_t small = big;
printf("big = %ld, small = %" PRIu16 "\n", big, small);
}
If you know what you're reading will fit into 16 bits, you can make things a little safer first by using strtoul to read an unsigned long, verify that it's small enough to fit, and explicitly cast it.
unsigned long num = strtoul(tmp.vals, &end, 10);
if( 0 <= num && num < 65536 ) {
tmp.registers[i] = (uint16_t)num;
}
else {
fprintf(stderr, "%lu is too large to fit in the register\n", num);
}
More likely tmp.registers (and possibly buffer) weren't properly initialized and allocated points to garbage. If you simply declared the tmp on the stack like so:
Registers tmp;
This only allocates memory for tmp, not the things it points to. And it will contain garbage. tmp.registers will point to some random spot in memory. When you try to write to it it will segfault... eventually.
The register array needs to be allocated.
size_t how_many = 10;
uint16_t *registers = malloc( sizeof(uint16_t) * how_many );
Thing tmp = {
.registers = registers,
.vals = NULL
};
This is fine so long as your loop only ever runs how_many times. But you can't be sure of that when reading input. Your loop is potentially reading an infinite number of registers. If it goes over the 10 we've allocated it will again start writing into someone else's memory and segfault.
Dynamic memory is too big a topic for here, but we can at least limit the loop to the size of the array by tracking the maximum size of registers and how far in it is. We could do it in the loop, but it really belongs in the struct.
typedef struct {
uint16_t *registers;
char *vals;
size_t max;
size_t size;
} Registers;
While we're at it, put initialization into a function so we're sure it's done reliably each time.
void Registers_init( Registers *registers, size_t size ) {
registers->registers = malloc( sizeof(uint16_t) * size );
registers->max = size;
registers->size = 0;
}
And same with our bounds check.
void Registers_push( Registers *registers, uint16_t num ) {
if( registers->size == registers->max ) {
fprintf(stderr, "Register has reached its limit of %zu\n", registers->max);
exit(1);
}
registers->registers[ registers->size ] = (uint16_t)num;
registers->size++;
}
Now we can add registers safely. Or at least it will error nicely.
Registers registers;
Registers_init( &registers, 10 );
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
unsigned long num = strtoul(tmp.vals, &end, 10);
if( 0 <= num && num < 65536 ) {
Registers_push( &tmp, (uint16_t)num );
}
else {
fprintf(stderr, "%lu is too large to fit in the register\n", num);
}
tmp.vals = strtok(NULL, delimiter);
i++;
}
At this point we're re-implementing a size-bound array. It's a good exercise, but for production code use an existing library such as GLib which provides self-growing arrays and a lot more features.

Related

Splitting a string and store to the heap algorithm question

For this code below that I was writing. I was wondering, if I want to split the string but still retain the original string is this the best method?
Should the caller provided the ** char or should the function "split" make an additional malloc call and memory manage the ** char?
Also, I was wondering if this is the most optimizing method, or could I optimize the code better than this?
I still have not debug the code yet, I am a bit undecided whether if the caller manage the ** char or the function manage the pointer ** char.
#include <stdio.h>
#include <stdlib.h>
size_t split(const char * restrict string, const char splitChar, char ** restrict parts, const size_t maxParts){
size_t size = 100;
size_t partSize = 0;
size_t len = 0;
size_t newPart = 1;
char * tempMem;
/*
* We just reverse a long page of memory
* At reaching the space character that is the boundary of the new
*/
char * mem = (char*) malloc( sizeof(char) * size );
if ( mem == NULL ) return 0;
for ( size_t i = 0; string[i] != 0; i++ ) {
// If it is a split char we at a new part
if ( string[i] == splitChar) {
// If the last character was not the split character
// Then mem[len] = 0 and increase the len by 1.
if (newPart == 0) mem[len++] = 0;
newPart = 1;
continue;
} else {
// If this is a new part
// and not a split character
// we make a new pointer
if ( newPart == 1 ){
// if reach maxpart we break.
// It is okay here, to not worry about memory
if ( partSize == maxParts ) break;
parts[partSize++] = &mem[len];
newPart = 0;
}
mem[len++] = string[i];
if ( len == size ){
// if ran out of memory realloc.
tempMem = (char*)realloc(mem, sizeof(char) * (size << 1) );
// if fail quit loop
if ( tempMem == NULL ) {
// If we can't get more memory the last part could be corrupted
// We have to return.
// Otherwise the code below can seg.
// There maybe a better way than this.
return partSize--;
}
size = size << 1;
mem = tempMem;
}
}
}
// If we got here and still in a newPart that is fine no need
// an additional character.
if ( newPart != 1 ) mem[len++] = 0;
// realloc to give back the unneed memory
if ( len < size ) {
tempMem = (char*) realloc(mem, sizeof(char) * len );
// If the resizing did not fail but yielded a different
// memory block;
if ( tempMem != NULL && tempMem != mem ){
for ( size_t i = 0; i < partSize; i++ ){
parts[i] = tempMem + (parts[i] - mem);
}
}
}
return partSize;
}
int main(){
char * tStr = "This is a super long string just to test the str str adfasfas something split";
char * parts[10];
size_t len = split(tStr, ' ', parts, 10);
for (size_t i = 0; i < len; i++ ){
printf("%zu: %s\n", i, parts[i]);
}
}
What is "best" is very subjective, as well as use case dependent.
I personally would keep the parameters as input only, define a struct to contain the split result, and probably return such by value. The struct would probably contain pointers to memory allocation, so would also create a helper function free that memory. The parts might be stored as list of strings (copy string data) or index&len pairs for the original string (no string copies needed, but original string needs to remain valid).
But there are dozens of very different ways to do this in C, and all a bit klunky. You need to choose your flavor of klunkiness based on your use case.
About being "more optimized": unless you are coding for a very small embedded device or something, always choose a more robust, clear, easier to use, harder to use wrong over more micro-optimized. The useful kind of optimization turns, for example, O(n^2) to O(n log n). Turning O(3n) to O(2n) of a single function is almost always completely irrelevant (you are not going to do string splitting in a game engine inner rendering loop...).

how to create array of struct

I want to implement a searching table and
here's the data:
20130610 Diamond CoinMate 11.7246 15.7762 2897
20130412 Diamond Bithumb 0.209 0.2293 6128
20130610 OKCash Bithumb 0.183 0.2345 2096
20130412 Ethereum Chbtc 331.7282 401.486 136786
20170610 OKCash Tidex 0.0459 0.0519 66
...
and my code
typedef struct data{
int *date;
string currency[100];
string exchange[100];
double *low;
double *high;
int *daily_cap;
} Data;
int main()
{
FILE *fp = fopen("test_data.txt", "r");
Data tmp[50];
int i = 0;
while (!feof(fp)){
fscanf(fp, "%d%s%s%f%f%7d", &tmp[i].date, tmp[i].currency, tmp[i].exchange, &tmp[i].low, &tmp[i].high, &tmp[i].daily_cap);
i++;
}
fclose(fp);
}
but the first problem is that I can't create a large array to store my struct like
Data tmp[1000000]
and even I try just 50 elements , the program break down when finish main().
can anyone tell how to fix it or give me a better method, thanks.
You can not scan a value to an unallocated space, in other words, you need room for all those pointers in the struct, switch to
typedef struct data{
int date;
string currency[100];
string exchange[100];
double low;
double high;
int daily_cap;
} Data;
Or use malloc to assign space to those pointers before using them.
while (!feof(fp)){
tmp[i].date = malloc(sizeof(int));
...
But in this case, you don't need to pass the address of such members to fscanf since they are already pointers:
fscanf(fp, "%d%s%s%f%f%7d", &tmp[i].date, ..
should be
fscanf(fp, "%d%s%s%lf%lf%7d", tmp[i].date, ...
Notice that double wants %lf instead of %f
This is also very confusing:
typedef struct data{
int *date;
string currency[100];
...
Is string a typedef of char? I think you mean string currency; since string is usually an alias of char *, in this case you need room for this member too: currency = malloc(100);
Finally, take a look to Why is “while ( !feof (file) )” always wrong?
There are too many errors in a short snippet, I suggest you to read a good C book.
Your code corrected using dynamic memory that allows you to reserve space for a big amount of data (see the other answer of #LuisColorado) and using fgets and sscanf instead of fscanf:
#include <stdio.h>
#include <stdlib.h>
typedef struct data{
int date;
char currency[100];
char exchange[100];
double low;
double high;
int daily_cap;
} Data;
int main(void)
{
FILE *fp = fopen("test_data.txt", "r");
/* Always check the result of fopen */
if (fp == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
Data *tmp;
tmp = malloc(sizeof(*tmp) * 50);
if (tmp == NULL) {
perror("malloc");
exit(EXIT_FAILURE);
}
char buf[512];
int i = 0;
/* Check that you don't read more than 50 lines */
while ((i < 50) && (fgets(buf, sizeof buf, fp))) {
sscanf(buf, "%d%99s%99s%lf%lf%7d", &tmp[i].date, tmp[i].currency, tmp[i].exchange, &tmp[i].low, &tmp[i].high, &tmp[i].daily_cap);
i++;
}
fclose(fp);
/* Always clean what you use */
free(tmp);
return 0;
}
Of course you can't. Think you are creating an array of 1.0E6 registers of sizeof (Data) which I guess is not less than 32 (four pointers) and 200 bytes (not less than this, as you don't give the definition of type string) and this is 232MBytes (at least) in a 64 byte machine (in 32bit it is 216MBytes) and that in case the type string is only one character wide (what I fear is not) In case string is a typedef of char * then you have 432 pointers in your struct giving to 432MBytes in only one variable. Next, if you are declaring this absolutely huge variable as a local variable, you must know that te stack in most unix operating systems is limited to around 8Mb, and this means you need to build your program with special parameters to allow a larger stack max size. And also you probably need your account to raise to that size also the ulimits to make the kernel to allow you such a large stack size segment.
Please, next time, give us full information, as not knowing the definition of the string type, or posting an incomplete program, only allows us to make guesses on what can be ongoing, and not to be able to discover actual errors. This makes you to waste your time, and for us the same. Thanks.
If your list of currency and exchange are known before hand, then there is no need to allocate or store any arrays within your struct. The lists can be global arrays of pointers to string literals and all you need do is store a pointer to the literal for both currency and exchange (you can even save a few more bytes by storing the index instead of a pointer).
For example, your lists of exchanges can be stored once as follows:
const char *currency[] = { "Diamond", "OKCash", "Ethereum" },
*exchange[] = { "CoinMate", "Bithumb", "Chbtc", "Tidex" };
(if the number warrants, allocate storage for the strings and read them from a file)
Now you have all of the possible strings for currency and exchange stored, all you need in your data struct is a pointer for each, e.g.
typedef struct {
const char *currency, *exchange;
double low, high;
unsigned date, daily_cap;
} data_t;
(unsigned gives a better range and there are no negative dates or daily_cap)
Now simply declare an array of data_t (or allocate for them, depending on number). Below is a simply array of automatic storage for example purposes. E.g.
#define MAXD 128
...
data_t data[MAXD] = {{ .currency = NULL }};
Since you are reading 'lines' of data, fgets or POSIX getline are the line-oriented choices. After reading a line, you can parse the line with sscanf using temporary values, compare whether the values for currency and exchange read from the file match values stored, and then assign a pointer to the appropriate string to your struct, e.g.
int main (void) {
char buf[MAXC] = "";
size_t n = 0;
data_t data[MAXD] = {{ .currency = NULL }};
while (n < MAXD && fgets (buf, MAXC, stdin)) {
char curr[MAXE] = "", exch[MAXE] = "";
int havecurr = 0, haveexch = 0;
data_t tmp = { .currency = NULL };
if (sscanf (buf, "%u %31s %31s %lf %lf %u", &tmp.date,
curr, exch, &tmp.low, &tmp.high, &tmp.daily_cap) == 6) {
for (int i = 0; i < NELEM(currency); i++) {
if (strcmp (currency[i], curr) == 0) {
tmp.currency = currency[i];
havecurr = 1;
break;
}
}
for (int i = 0; i < NELEM(exchange); i++) {
if (strcmp (exchange[i], exch) == 0) {
tmp.exchange = exchange[i];
haveexch = 1;
break;
}
}
if (havecurr & haveexch)
data[n++] = tmp;
}
}
...
Putting it altogether in a short example, you could do something similar to the following:
#include <stdio.h>
#include <string.h>
#define MAXC 256
#define MAXD 128
#define MAXE 32
#define NELEM(x) (int)(sizeof (x)/sizeof (*x))
const char *currency[] = { "Diamond", "OKCash", "Ethereum" },
*exchange[] = { "CoinMate", "Bithumb", "Chbtc", "Tidex" };
typedef struct {
const char *currency, *exchange;
double low, high;
unsigned date, daily_cap;
} data_t;
int main (void) {
char buf[MAXC] = "";
size_t n = 0;
data_t data[MAXD] = {{ .currency = NULL }};
while (n < MAXD && fgets (buf, MAXC, stdin)) {
char curr[MAXE] = "", exch[MAXE] = "";
int havecurr = 0, haveexch = 0;
data_t tmp = { .currency = NULL };
if (sscanf (buf, "%u %31s %31s %lf %lf %u", &tmp.date,
curr, exch, &tmp.low, &tmp.high, &tmp.daily_cap) == 6) {
for (int i = 0; i < NELEM(currency); i++) {
if (strcmp (currency[i], curr) == 0) {
tmp.currency = currency[i];
havecurr = 1;
break;
}
}
for (int i = 0; i < NELEM(exchange); i++) {
if (strcmp (exchange[i], exch) == 0) {
tmp.exchange = exchange[i];
haveexch = 1;
break;
}
}
if (havecurr & haveexch)
data[n++] = tmp;
}
}
for (size_t i = 0; i < n; i++)
printf ("%u %-10s %-10s %8.4f %8.4f %6u\n", data[i].date,
data[i].currency, data[i].exchange, data[i].low,
data[i].high, data[i].daily_cap);
}
Example Use/Output
$ ./bin/coinread <dat/coin.txt
20130610 Diamond CoinMate 11.7246 15.7762 2897
20130412 Diamond Bithumb 0.2090 0.2293 6128
20130610 OKCash Bithumb 0.1830 0.2345 2096
20130412 Ethereum Chbtc 331.7282 401.4860 136786
20170610 OKCash Tidex 0.0459 0.0519 66
With this approach, regardless whether you allocate for your array of struct or use automatic storage, you minimize the size of the data stored by not duplicating storage of known values. On x86_64, your data_t struct size will be approximately 40-bytes. With on average a 1-4 Megabyte stack, you can store a lot of 40-byte structs safely before you need to start allocating. You can always start with automatic storage, and if you reach some percentage of the available stack space, dynamically allocate, memcpy, set a flag to indicate the storage in use and keep going...

How to convert bytes stored inside a buffer to a variable?

So I'm reading from a file descriptor which contains an int variable in its raw byte format.
So I'm doing:
char buffer[sizeof(int)];
ssize_t sizeOfFile = read(sock_fd, buffer, sizeof(int));
int extractedInt = ???;
How can I convert that buffer to an integer? I was thinking of memcpy but was wondering if there are better ways.
You could read directly an integer
int extractedInt;
ssize_t sizeOfFile = read(sock_fd, &extractedInt, sizeof(int));
read will read the size of an int bytes, and store them into extractedInt.
If your int is actually a string in a file you want to convert to an int, the procedure is a bit different.
#define SIZE 20
char buffer[SIZE]; // ensure there is enough space for a string containing an integer
ssize_t sizeOfFile = read(sock_fd, buffer, SIZE);
int extractedInt = atoi(buffer); // convert string to integer
I can guess from your code that you're reading from the network. This is then not portable to just read a int from the buffer, in your network protocol you chose a certain endianness but you cannot expect that all the platforms where your program will run to have the same, so it will lead to bad convertions.
And other proposed solutions of asking read to return an int will lead to the same problem.
So in your case, I can only advice to iterate through your array and compute the integer by progressively placing the bytes at the right place depending on the endianness of the platform.
You can detect the endianness of the build target platform by using the macro __BYTE_ORDER__in GCC.
There is an example for network data that is big endian:
// construct an `int` with the bytes in the given buffer
// considering the buffer contains the representation
// of an int in big endian
int buffer_to_int(char* buffer, int buffer_size) {
int result = 0;
int i;
char sign = buffer[0] & 0x80;
char * res_bytes = (char*)&result; // this pointer allows to access the int bytes
int offset = sizeof(int) - buffer_size;
if( sign != 0 )
sign = 0xFF;
if( offset < 0 ) {
// not representable with a `int` type
// we chose here to return the closest representable value
if( sign == 0 ) { //positive
return INT_MAX;
} else {
return INT_MIN;
}
}
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
for(i=0; i<buffer_size; i++) {
res_bytes[i] = buffer[buffer_size-i-1]; // invert the bytes
}
for(i=0; i<offset; i++){
res_bytes[buffer_size+i] = sign;
}
#else
// same endianness, so simply copy bytes using memcpy
memcpy(&result + offset, buffer, buffer_size);
for(i=0; i<offset; i++){
res_bytes[i] = sign;
}
#endif
return result;
}

Memory comparison causes system halt

I am working on a kernel module and I need to compare two buffers to find out if they are equivalent. I am using the memcmp function defined in the Linux kernel to do so. My first buffer is like this:
cache_buffer = (unsigned char *)vmalloc(4097);
cache_buffer[4096] = '/0';
The second buffer is from a page using the page_address() function.
page = bio_page(bio);
kmap(page);
write_buffer = (char *)page_address(page);
kunmap(page);
I have printed the contents of both buffers before hand and not only to they print correctly, but they also have the same content. So next, I do this:
result = memcmp(write_buffer, cache_buffer, 2048); // only comparing up to 2048 positions
This causes the kernel to freeze up and I cannot figure out why. I checked the implementation of memcmp and saw nothing that would cause the freeze. Can anyone suggest a cause?
Here is the memcmp implementation:
int memcmp(const void *cs, const void *ct, size_t count)
{
const unsigned char *su1, *su2;
int res = 0;
for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--)
if ((res = *su1 - *su2) != 0)
break;
return res;
}
EDIT: The function causing the freeze is memcmp. When I commented it out, everything worked. Also, when I did I memcmp as follows
memcmp(write_buffer, write_buffer, 2048); //comparing two write_buffers
Everything worked as well. Only when I throw the cache_buffer into the mix is when I get the error. Also, above is a simplification of my actual code. Here is the entire function:
static int compare_data(sector_t location, struct bio * bio, struct cache_c * dmc)
{
struct dm_io_region where;
unsigned long bits;
int segno;
struct bio_vec * bvec;
struct page * page;
unsigned char * cache_data;
char * temp_data;
char * write_data;
int result, length, i;
cache_data = (unsigned char *)vmalloc((dmc->block_size * 512) + 1);
where.bdev = dmc->cache_dev->bdev;
where.count = dmc->block_size;
where.sector = location << dmc->block_shift;
printk(KERN_DEBUG "place: %llu\n", where.sector);
dm_io_sync_vm(1, &where, READ, cache_data, &bits, dmc);
length = 0;
bio_for_each_segment(bvec, bio, segno)
{
if(segno == 0)
{
page = bio_page(bio);
kmap(page);
write_data = (char *)page_address(page);
//kunmap(page);
length += bvec->bv_len;
}
else
{
page = bio_page(bio);
kmap(page);
temp_data = strcat(write_data, (char *)page_address(page));
//kunmap(page);
write_data = temp_data;
length += bvec->bv_len;
}
}
printk(KERN_INFO "length: %u\n", length);
cache_data[dmc->block_size * 512] = '\0';
for(i = 0; i < 2048; i++)
{
printk("%c", write_data[i]);
}
printk("\n");
for(i = 0; i < 2048; i++)
{
printk("%c", cache_data[i]);
}
printk("\n");
result = memcmp(write_data, cache_data, length);
return result;
}
EDIT #2: Sorry guys. The problem was not memcmp. It was the result of memcmp. When ever it returned a positive or negative number, the function that called my function would play with some pointers, one of which was uninitialized. I don't know why I didn't realize it before. Thanks for trying to help though!
I'm no kernel expert, but I would assume you need to keep this memory mapped while doing the comparison? In other words, don't call kunmap until after the memcmp is complete. I would presume that calling it before will result in write_buffer pointing to a page which is no longer mapped.
Taking your code in the other question, here is a rough attempt at incremental. Still needs some cleanup, I'm sure:
static int compare_data(sector_t location, struct bio * bio, struct cache_c * dmc)
{
struct dm_io_region where;
unsigned long bits;
int segno;
struct bio_vec * bvec;
struct page * page;
unsigned char * cache_data;
char * temp_data;
char * write_data;
int length, i;
int result = 0;
size_t position = 0;
size_t max_size = (dmc->block_size * 512) + 1;
cache_data = (unsigned char *)vmalloc(max_size);
where.bdev = dmc->cache_dev->bdev;
where.count = dmc->block_size;
where.sector = location << dmc->block_shift;
printk(KERN_DEBUG "place: %llu\n", where.sector);
dm_io_sync_vm(1, &where, READ, cache_data, &bits, dmc);
bio_for_each_segment(bvec, bio, segno)
{
// Map the page into memory
page = bio_page(bio);
write_data = (char *)kmap(page);
length = bvec->bv_len;
// Make sure we don't go past the end
if(position >= max_size)
break;
if(position + length > max_size)
length = max_size - position;
// Compare the data
result = memcmp(write_data, cache_data + position, length);
position += length;
kunmap(page);
// If the memory is not equal, bail out now and return the result
if(result != 0)
break;
}
cache_data[dmc->block_size * 512] = '\0';
return result;
}

Create a string from uint32/16_t and then parse back the original numbers

I need to put into a char* some uint32_t and uint16_t numbers. Then I need to get them back from the buffer.
I have read some questions and I've tried to use sprintf to put them into the char* and sscanf get the original numbers again. However, I'm not able to get them correctly.
Here's an example of my code with only 2 numbers. But I need more than 2, that's why I use realloc. Also, I don't know how to use sprintf and sscanf properly with uint16_t
uint32_t gid = 1100;
uint32_t uid = 1000;
char* buffer = NULL;
uint32_t offset = 0;
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
uint32_t valorGID;
uint32_t valorUID;
sscanf(buffer, "%d", &valorGID);
buffer += sizeof(uint32_t);
sscanf(buffer, "%d", &valorUID);
printf("ValorGID %d ValorUID %d \n", valorGID, valorUID);
And what I get is
ValorGID 11001000 ValorUID 1000
What I need to get is
ValorGID 1100 ValorUID 1000
I am new in C, so any help would be appreciated.
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
This doesn't really make sense, and will not work as intended except in lucky circumstances.
Let us assume that the usual CHAR_BIT == 8 holds, so sizeof(uint32_t) == 4. Further, let us assume that int is a signed 32-bit integer in two's complement representation without padding bits.
sprintf(buffer, "%d", gid) prints the decimal string representation of the bit-pattern of gid interpreted as an int to buffer. Under the above assumptions, gid is interpreted as a number between -2147483648 and 2147483647 inclusive. Thus the decimal string representation may contain a '-', contains 1 to 10 digits and the 0-terminator, altogether it uses two to twelve bytes. But you have allocated only four bytes, so whenever 999 < gid < 2^32-99 (the signed two's complement interpretation is > 999 or < -99), sprintf writes past the allocated buffer size.
That is undefined behaviour.
It's likely to not crash immediately because allocating four bytes usually gives you a larger chunk of memory effectively (if e.g. malloc always returns 16-byte aligned blocks, the twelve bytes directly behind the allocated four cannot be used by other parts of the programme, but belong to the programme's address space, and writing to them will probably go undetected). But it can easily crash later when the end of the allocated chunk lies on a page boundary.
Also, since you advance the write offset by four bytes for subsequent sprintfs, part of the previous number gets overwritten if the string representation (excluding the 0-termnator) used more than four bytes (while the programme didn't yet crash due to writing to non-allocated memory).
The line
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
contains further errors.
buffer = realloc(buffer, new_size); loses the reference to the allocated memory and causes a leak if realloc fails. Use a temporary and check for success
char *temp = realloc(buffer, new_size);
if (temp == NULL) {
/* reallocation failed, recover or cleanup */
free(buffer);
exit(EXIT_FAILURE);
}
/* it worked */
buffer = temp;
/* temp = NULL; or let temp go out of scope */
The new size sizeof(uint32_t) + sizeof(buffer) of the new allocation is always the same, sizeof(uint32_t) + sizeof(char*). That's typically eight or twelve bytes, so it doesn't take many numbers to write outside the allocated area and cause a crash or memory corruption (which may cause a crash much later).
You must keep track of the number of bytes allocated to buffer and use that to calculate the new size. There is no (portable¹) way to determine the size of the allocated memory block from the pointer to its start.
Now the question is whether you want to store the string representations or the bit patterns in the buffer.
Storing the string representations has the problem that the length of the string representation varies with the value. So you need to include separators between the representations of the numbers, or ensure that all representations have the same length by padding (with spaces or leading zeros) if necessary. That would for example work like
#include <stdint.h>
#include <inttypes.h>
#define MAKESTR(x) # x
#define STR(x) MAKESTR(x)
/* A uint32_t can use 10 decimal digits, so let each field be 10 chars wide */
#define FIELD_WIDTH 10
uint32_t gid = 1100;
uint32_t uid = 1000;
size_t buf_size = 0, offset = 0;
char *buffer = NULL, *temp = NULL;
buffer = realloc(buffer, FIELD_WIDTH + 1); /* one for the '\0' */
if (buffer == NULL) {
exit(EXIT_FAILURE);
}
buf_size = FIELD_WIDTH + 1;
sprintf(buffer, "%0" STR(FIELD_WIDTH) PRIu32, gid);
offset += FIELD_WIDTH;
temp = realloc(buffer, buf_size + FIELD_WIDTH);
if (temp == NULL) {
free(buffer);
exit(EXIT_FAILURE);
}
buffer = temp;
temp = NULL;
buf_size += FIELD_WIDTH;
sprintf(buffer + offset, "%0" STR(FIELD_WIDTH) PRIu32, uid);
offset += FIELD_WIDTH;
/* more */
uint32_t valorGID;
uint32_t valorUID;
/* rewind for scanning */
offset = 0;
sscanf(buffer + offset, "%" STR(FIELD_WIDTH) SCNu32, &valorGID);
offset += FIELD_WIDTH;
sscanf(buffer + offset, "%" STR(FIELD_WIDTH) SCNu32, &valorUID);
printf("ValorGID %u ValorUID %u \n", valorGID, valorUID);
with zero-padded fixed-width fields. If you'd rather use separators than a fixed width, the calculation of the required length and the offsets becomes more complicated, but unless the numbers are large, it would use less space.
If you'd rather store the bit-patterns, which would be the most compact way of storing, you'd use something like
size_t buf_size = 0, offset = 0;
unsigned char *buffer = NULL, temp = NULL;
buffer = realloc(buffer, sizeof(uint32_t));
if (buffer == NULL) {
exit(EXIT_FAILURE);
}
buf_size = sizeof(uint32_t);
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
buffer[offset + b] = (gid >> b*8) & 0xFF;
}
offset += sizeof(uint32_t);
temp = realloc(buffer, buf_size + sizeof(uint32_t));
if (temp == NULL) {
free(buffer);
exit(EXIT_FAILURE);
}
buffer = temp;
temp = NULL;
buf_size += sizeof(uint32_t);
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
buffer[offset + b] = (uid >> b*8) & 0xFF;
}
offset += sizeof(uint32_t);
/* And for reading the values */
uint32_t valorGID, valorUID;
/* rewind */
offset = 0;
valorGID = 0;
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
valorGID |= buffer[offset + b] << b*8;
}
offset += sizeof(uint32_t);
valorUID = 0;
for(size_t b = 0; b < sizeof(uint32_t); ++b) {
valorUID |= buffer[offset + b] << b*8;
}
offset += sizeof(uint32_t);
¹ If you know how malloc etc. work in your implementation, it may be possible to find the size from malloc's bookkeeping data.
The format specifier '%d' is for int and thus is wrong for uint32_t. First uint32_t is an unsigned type, so you should at least use '%u', but then it might also have a different width than int or unsigned. There are macros foreseen in the standard: PRIu32 for printf and SCNu32 for scanf. As an example:
sprintf(buffer, "%" PRIu32, gid);
The representation returned by sprintf is a char*. If you are trying to store an array of integers as their string representatins then your fundamental data type is a char**. This is a ragged matrix of char if we are storing only the string data itself, but since the longest string a uint32_t can yield is 10 chars, plus one for the terminating null, it makes sense to preallocate this many bytes to hold each string.
So to store n uint32_t's from array a in array s as strings:
const size_t kMaxIntLen=11;
uint32_t *a,b;
// fill a somehow
...
size_t n,i;
char **s.*d;
if((d=(char*)malloc(n*kMaxIntLen))==NULL)
// error!
if((s=(char**)malloc(n*sizeof(char*)))==NULL)
// error!
for(i=0;i<n;i++)
{
s[i]=d+i; // this is incremented by sizeof(char*) each iteration
snprintf(s[i],kMaxIntLen,"%u",a[i]); // snprintf to be safe
}
Now the ith number is at s[i] so to print it is just printf("%s",s[i]);, and to retrieve it as an integer into b is sscanf(s[i],"%u",&b);.
Subsequent memory management is a bit trickier. Rather than constantly using using realloc() to grow the buffer, it is better to preallocate a chunk of memory and only alter it when exhausted. If realloc() fails it returns NULL, so store a pointer to your main buffer before calling it and that way you won't lose a reference to your data. Reallocate the d buffer first - again allocate enough room for several more strings - then if it succeeds see if d has changed. If so, destroy (free()) the s buffer, malloc() it again and rebuild the indices (you have to do this since if d has changed all your indices are stale). If not, realloc() s and fix up the new indices. I would suggest wrapping this whole thing in a structure and having a set of routines to operate on it, e.g.:
typedef struct StringArray
{
char **strArray;
char *data;
size_t nStrings;
} StringArray;
This is a lot of work. Do you have to use C? This is vastly easier as a C++ STL vector<string> or list<string> with the istringstream classes and the push_back() container method.
uint32_t gid = 1100;
uint32_t uid = 1000;
char* buffer = NULL;
uint32_t offset = 0;
buffer = realloc(buffer, sizeof(uint32_t));
sprintf(buffer, "%d", gid);
offset += sizeof(uint32_t);
buffer = realloc(buffer, sizeof(uint32_t) + sizeof(buffer));
sprintf(buffer+sizeof(uint32_t), "%d", uid);
uint32_t valorGID;
uint32_t valorUID;
sscanf(buffer, "%4d", &valorGID);
buffer += sizeof(uint32_t);
sscanf(buffer, "%d", &valorUID);
printf("ValorGID %d ValorUID %d \n", valorGID, valorUID);
`
I think this may resolve the issue !

Resources