Update: whether the buffer was char, signed char, or unsigned char was ultimately moot here. It was more appropriate to use memcpy in this situation, since it works indiscriminately on bytes.
It couldn't be a simpler operation, but I seem to be missing a critical step. In the following code, I am attempting to append buffer to bufferdata, and the compiler warns me about a difference in signedness.
unsigned char buffer[4096] = {0};
char *bufferdata;
bufferdata = (char*)malloc(4096 * sizeof(bufferdata));
if (! bufferdata)
return false;
while( ... )
{
// nextBlock( voidp _buffer, unsigned _length );
read=nextBlock( buffer, 4096);
if( read > 0 )
{
bufferdata = strncat(bufferdata, buffer, read); // (help)
// leads to: pointer targets in passing argument 2 of strncat differ in signedness.
if(read == 4096) {
// let's go for another chunk
bufferdata = (char*)realloc(bufferdata, ( strlen(bufferdata) + (4096 * sizeof(bufferdata)) ) );
if (! bufferdata) {
printf("failed to realloc\n");
return false;
}
}
}
else if( read<0 )
{
printf("error.\n");
break;
}
else {
printf("done.\n");
break;
}
}
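On your compiler char is evidently signed (and even where it is unsigned, char and unsigned char remain distinct types), hence the warning message.
char * strncat ( char * destination, const char * source, size_t num );
Your source buffer is unsigned char, so passing it as the second argument triggers the signedness warning. You can either declare buffer as a char array if you do not need unsigned values, or cast the argument to (const char *). Suppressing all warnings with the compiler's -w option also works, but it hides every other diagnostic too.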
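To illustrate the point from the update at the top of the question, here is a minimal sketch comparing the two ways of appending. The helper names append_bytes and append_text are hypothetical, and both assume the caller has already made the destination large enough. The memcpy version needs no cast because memcpy takes void pointers; the strncat version needs the cast and stops at the first zero byte, so it only suits text.

```c
#include <stddef.h>
#include <string.h>

/* Appends n raw bytes from src to dst at offset *used and advances *used.
 * Hypothetical helper for illustration; the caller must guarantee that
 * dst has room for *used + n bytes. memcpy takes void pointers, so the
 * unsigned char source needs no cast and no signedness warning appears. */
static void append_bytes(char *dst, size_t *used,
                         const unsigned char *src, size_t n)
{
    memcpy(dst + *used, src, n);
    *used += n;
}

/* Same append via strncat, with the cast that silences the warning.
 * It stops at the first zero byte in src, so it is only suitable for text.
 * dst must hold a NUL-terminated string with room for n more chars + NUL.
 * Returns the resulting string length. */
static size_t append_text(char *dst, const unsigned char *src, size_t n)
{
    strncat(dst, (const char *)src, n);
    return strlen(dst);
}
```

If the blocks coming from nextBlock can contain zero bytes (compressed or binary data usually can), only the memcpy variant preserves them.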
I was writing the code below and was wondering: if I want to split a string but still retain the original string, is this the best method?
Should the caller provide the char ** array, or should the function split make an additional malloc call and manage the char ** memory itself?
Also, is this the most optimized method, or could I optimize the code further?
I have not debugged the code yet, and I am still undecided about whether the caller or the function should manage the char ** pointer.
#include <stdio.h>
#include <stdlib.h>
size_t split(const char * restrict string, const char splitChar, char ** restrict parts, const size_t maxParts){
size_t size = 100;
size_t partSize = 0;
size_t len = 0;
size_t newPart = 1;
char * tempMem;
/*
* We just reserve one long block of memory.
* Reaching the split character marks the boundary of a new part.
*/
char * mem = (char*) malloc( sizeof(char) * size );
if ( mem == NULL ) return 0;
for ( size_t i = 0; string[i] != 0; i++ ) {
// If it is a split char we at a new part
if ( string[i] == splitChar) {
// If the last character was not the split character
// Then mem[len] = 0 and increase the len by 1.
if (newPart == 0) mem[len++] = 0;
newPart = 1;
continue;
} else {
// If this is a new part
// and not a split character
// we make a new pointer
if ( newPart == 1 ){
// if reach maxpart we break.
// It is okay here, to not worry about memory
if ( partSize == maxParts ) break;
parts[partSize++] = &mem[len];
newPart = 0;
}
mem[len++] = string[i];
if ( len == size ){
// if ran out of memory realloc.
tempMem = (char*)realloc(mem, sizeof(char) * (size << 1) );
// if fail quit loop
if ( tempMem == NULL ) {
// If we can't get more memory the last part could be corrupted
// We have to return.
// Otherwise the code below can seg.
// There maybe a better way than this.
return partSize - 1; // drop the unterminated last part (partSize-- would return the old value)
}
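// realloc may have moved the block, so rebase the part pointers
// the same way the shrinking realloc at the end of the function does
for ( size_t p = 0; p < partSize; p++ ) parts[p] = tempMem + (parts[p] - mem);
size = size << 1;
mem = tempMem;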
}
}
}
// If we got here and still in a newPart that is fine no need
// an additional character.
if ( newPart != 1 ) mem[len++] = 0;
// realloc to give back the unneed memory
if ( len < size ) {
tempMem = (char*) realloc(mem, sizeof(char) * len );
// If the resizing did not fail but yielded a different
// memory block;
if ( tempMem != NULL && tempMem != mem ){
for ( size_t i = 0; i < partSize; i++ ){
parts[i] = tempMem + (parts[i] - mem);
}
}
}
return partSize;
}
int main(){
char * tStr = "This is a super long string just to test the str str adfasfas something split";
char * parts[10];
size_t len = split(tStr, ' ', parts, 10);
for (size_t i = 0; i < len; i++ ){
printf("%zu: %s\n", i, parts[i]);
}
}
What is "best" is very subjective, as well as use case dependent.
I personally would keep the parameters as input only, define a struct to contain the split result, and probably return it by value. The struct would probably contain pointers to allocated memory, so I would also create a helper function to free that memory. The parts might be stored as a list of strings (copying the string data) or as index and length pairs into the original string (no string copies needed, but the original string needs to remain valid).
But there are dozens of very different ways to do this in C, and all a bit klunky. You need to choose your flavor of klunkiness based on your use case.
About being "more optimized": unless you are coding for a very small embedded device or something, always choose the more robust, clearer, easier to use, harder to misuse design over the more micro-optimized one. The useful kind of optimization turns, for example, O(n^2) into O(n log n). Turning O(3n) into O(2n) in a single function is almost always completely irrelevant (you are not going to do string splitting in a game engine inner rendering loop...).
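As one possible reading of the struct-based design described above, here is a minimal sketch of the "index and length pairs" flavor. The names Slice, SplitResult, split_slices and split_free are made up for illustration. It skips empty parts like the question's code does, and for brevity it reports an allocation failure the same way as "no parts" (count 0, items NULL), which a real API would probably want to distinguish.

```c
#include <stddef.h>
#include <stdlib.h>

/* One part of the split: a slice of the original string (not NUL-terminated). */
typedef struct {
    size_t start;
    size_t len;
} Slice;

typedef struct {
    Slice *items;   /* heap-allocated; release with split_free() */
    size_t count;
} SplitResult;

/* Splits s on sep, skipping empty parts. Returns count 0 and items NULL
 * if there are no parts or if allocation fails. */
static SplitResult split_slices(const char *s, char sep)
{
    SplitResult r = { NULL, 0 };
    size_t cap = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        while (s[i] == sep) i++;                 /* skip separators */
        if (s[i] == '\0') break;
        size_t start = i;
        while (s[i] != '\0' && s[i] != sep) i++; /* scan one part */

        if (r.count == cap) {
            size_t new_cap = cap ? cap * 2 : 8;
            Slice *grown = realloc(r.items, new_cap * sizeof *grown);
            if (grown == NULL) {
                free(r.items);
                r.items = NULL;
                r.count = 0;
                return r;
            }
            r.items = grown;
            cap = new_cap;
        }
        r.items[r.count].start = start;
        r.items[r.count].len = i - start;
        r.count++;
    }
    return r;
}

/* Releases the slices and resets the struct so it cannot be reused by accident. */
static void split_free(SplitResult *r)
{
    free(r->items);
    r->items = NULL;
    r->count = 0;
}
```

A part can be printed without copying it via printf("%.*s\n", (int)r.items[k].len, s + r.items[k].start). Because only offsets are stored, a realloc that moves the array never invalidates anything the caller is holding.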
I have an array that is declared inside a public struct like this:
uint16_t *registers;
In a function I'm retrieving a char string (stored in buffer, see code below) that contains numerical values separated by a comma (e.g., "1,12,0,136,5,76,1243"). My goal is to get each individual numerical value and store it in the array, one after another.
i = 0;
const char delimiter[] = ",";
char *end;
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
tmp.registers[i] = strtol(tmp.vals, &end, 10);
tmp.vals = strtok(NULL, delimiter);
i++;
}
The problem is that the line containing strtol is producing a Segmentation fault (core dumped) error. I'm pretty sure it's caused by trying to fit unsigned long values into uint16_t array slots but no matter what I try I can't get it fixed.
Changing the code as follows seems to have solved the problem:
unsigned long num = 0;
i = 0;
tmp.registers = NULL; // realloc(NULL, n) behaves like malloc(n)
// strtok returns pointers into buffer, so tmp.vals needs no allocation of its own
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
num = strtoul(tmp.vals, &end, 10);
if (num <= UINT16_MAX) {
// realloc takes a size in bytes: make room for i + 1 elements
uint16_t *grown = realloc(tmp.registers, (i + 1) * sizeof(uint16_t));
if (grown == NULL) break; // tmp.registers is still valid here
tmp.registers = grown;
tmp.registers[i] = (uint16_t)num;
} else {
fprintf(stderr, "==> %lu is too large to fit in register[%d]\n", num, i);
}
tmp.vals = strtok(NULL, delimiter);
i++;
}
A long integer is at least 32 bits, so yes, you're going to lose information trying to shove a signed 32 bit integer into an unsigned 16 bit integer. If you have compiler warnings on (I use -Wall -Wshadow -Wwrite-strings -Wextra -Wconversion -std=c99 -pedantic) it should tell you that.
test.c:20:28: warning: implicit conversion loses integer precision: 'long' to 'uint16_t'
(aka 'unsigned short') [-Wconversion]
tmp.registers[i] = strtol(tmp.vals, &end, 10);
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~
However, this isn't going to cause a segfault. You'll lose 16 bits and the change in sign will do funny things.
#include <stdio.h>
#include <inttypes.h>
int main() {
long big = 1234567;
uint16_t small = big;
printf("big = %ld, small = %" PRIu16 "\n", big, small);
}
If you know what you're reading will fit into 16 bits, you can make things a little safer by using strtoul to read an unsigned long, verifying that it's small enough to fit, and explicitly casting it.
unsigned long num = strtoul(tmp.vals, &end, 10);
if( num <= UINT16_MAX ) { // num is unsigned, so no lower-bound check is needed
tmp.registers[i] = (uint16_t)num;
}
else {
fprintf(stderr, "%lu is too large to fit in the register\n", num);
}
More likely tmp.registers (and possibly buffer) wasn't properly initialized and allocated, so it points to garbage. If you simply declared tmp on the stack like so:
Registers tmp;
This only allocates memory for tmp, not the things it points to. And it will contain garbage. tmp.registers will point to some random spot in memory. When you try to write to it it will segfault... eventually.
The register array needs to be allocated.
size_t how_many = 10;
uint16_t *registers = malloc( sizeof(uint16_t) * how_many );
Registers tmp = {
.registers = registers,
.vals = NULL
};
This is fine so long as your loop only ever runs how_many times. But you can't be sure of that when reading input. Your loop is potentially reading an infinite number of registers. If it goes over the 10 we've allocated it will again start writing into someone else's memory and segfault.
Dynamic memory is too big a topic for here, but we can at least limit the loop to the size of the array by tracking the maximum size of registers and how far in it is. We could do it in the loop, but it really belongs in the struct.
typedef struct {
uint16_t *registers;
char *vals;
size_t max;
size_t size;
} Registers;
While we're at it, put initialization into a function so we're sure it's done reliably each time.
void Registers_init( Registers *registers, size_t size ) {
registers->registers = malloc( sizeof(uint16_t) * size );
registers->max = size;
registers->size = 0;
}
And same with our bounds check.
void Registers_push( Registers *registers, uint16_t num ) {
if( registers->size == registers->max ) {
fprintf(stderr, "Register has reached its limit of %zu\n", registers->max);
exit(1);
}
registers->registers[ registers->size ] = (uint16_t)num;
registers->size++;
}
Now we can add registers safely. Or at least it will error nicely.
Registers tmp;
Registers_init( &tmp, 10 );
tmp.vals = strtok(buffer, delimiter);
while (tmp.vals != NULL) {
unsigned long num = strtoul(tmp.vals, &end, 10);
if( num <= UINT16_MAX ) {
Registers_push( &tmp, (uint16_t)num );
}
else {
fprintf(stderr, "%lu is too large to fit in the register\n", num);
}
tmp.vals = strtok(NULL, delimiter);
}
At this point we're re-implementing a size-bound array. It's a good exercise, but for production code use an existing library such as GLib which provides self-growing arrays and a lot more features.
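For anyone who wants the exercise version to grow instead of erroring out, here is a minimal sketch under a few assumptions: RegArray, reg_push and parse_registers are hypothetical names rather than anything from the question, the input is a writable buffer of comma-separated decimal numbers, and any malformed or out-of-range token aborts the parse.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint16_t *data;
    size_t size;   /* elements in use */
    size_t max;    /* elements allocated */
} RegArray;

/* Appends num, doubling the capacity when full. Returns 0 on success,
 * -1 on allocation failure (the array is left unchanged in that case). */
static int reg_push(RegArray *a, uint16_t num)
{
    if (a->size == a->max) {
        size_t new_max = a->max ? a->max * 2 : 4;
        uint16_t *grown = realloc(a->data, new_max * sizeof *grown);
        if (grown == NULL) return -1;
        a->data = grown;
        a->max = new_max;
    }
    a->data[a->size++] = num;
    return 0;
}

/* Parses comma-separated decimal values from buffer (which strtok modifies)
 * into a. Returns 0 on success, -1 on a malformed or out-of-range token
 * or on allocation failure. Values parsed before the failure stay in a. */
static int parse_registers(char *buffer, RegArray *a)
{
    for (char *tok = strtok(buffer, ","); tok != NULL; tok = strtok(NULL, ",")) {
        char *end;
        errno = 0;
        unsigned long num = strtoul(tok, &end, 10);
        /* end == tok: no digits; *end != '\0': trailing junk */
        if (end == tok || *end != '\0' || errno == ERANGE || num > UINT16_MAX)
            return -1;
        if (reg_push(a, (uint16_t)num) != 0)
            return -1;
    }
    return 0;
}
```

Start from RegArray a = { NULL, 0, 0 }; and call free(a.data) when done. Doubling keeps the number of realloc calls logarithmic in the number of values.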
I found something really strange today.
In the code below, I saw that utrlen(&eot) (utrlen == strlen) is equal to 2, and utrlen(&etx) == 1.
The best part: when I swap the declaration order of eot and etx, utrlen(&eot) == 1 and utrlen(&etx) == 2...
char **get_ukey_string()
{
static char eot = 0x4;
static char etx = 0x3;
static char *ukey_string[NB_UKEY] = {
"\b", "\r", "[B", "\0", "\n", "[D", "[C", "[A", &etx, &eot
};
return (ukey_string);
}
t_hashtable *new_ukey_htable(int fd)
{
char **ukey_string;
t_hashtable *htable;
t_hashnode *hnode;
char *key;
unsigned int i;
size_t size;
if ((htable = new_hashtable(NB_UKEY)) == NULL)
return (NULL);
ukey_string = get_ukey_string();
i = 2;
while (i < NB_UKEY + 2)
{
key = ukey_string[i - 2];
size = utrlen(key);
if (((hnode = new_hashnode(sutrdup(key, size), size, i)) == NULL)
|| htable->add_node(htable, hnode))
{
delete_hashtable(htable);
return (NULL);
}
++i;
}
return (htable);
}
Does anyone have an idea why?
eot and etx are char, not char *, so you can't apply strlen to them because there is no null terminator.
The strlen function (why are you calling it utrlen?) takes an argument of type const char*. That argument must point to the initial character of a string, defined as "a contiguous sequence of characters terminated by and including the first null character".
Calling strlen with the address of a single char object compiles, because the address has type char* (pointer to char), which matches the type strlen requires. But unless that object happens to hold '\0', it's not the initial character of a string, so the behavior is undefined.
In practice, strlen will start at the char object whose address you gave it, and iterate until it sees a null character '\0'.
If the char object happens to have the value '\0', it finds it immediately and returns 0 -- which is not particularly useful.
If it has some other value, strlen will attempt to scan bytes in memory that aren't part of the object. The result is undefined behavior.
Don't do that.
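One way to avoid the problem, sketched here with a shortened hypothetical table (NB_UKEY_DEMO and ukey_demo are not the question's names), is to store the control codes as string literals so that each entry carries its own terminator. The lengths of 2 seen in the question are consistent with strlen running past one variable into its neighbour, though with undefined behavior nothing is guaranteed.

```c
#include <string.h>

#define NB_UKEY_DEMO 3  /* hypothetical, smaller than the original table */

/* String literals are arrays with an implicit terminating '\0', so
 * strlen is well defined on every entry, including the control codes. */
static const char *const ukey_demo[NB_UKEY_DEMO] = {
    "\b",     /* backspace */
    "\x03",   /* ETX */
    "\x04"    /* EOT */
};

/* Length of entry i; the caller must keep i below NB_UKEY_DEMO. */
static size_t ukey_len(size_t i)
{
    return strlen(ukey_demo[i]);
}
```

Declaring static char eot[] = "\x04"; would work just as well if the values need to stay in named variables.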
I have a college project where I need to convert an int to a buffer of char.
I need to use memcpy, but when I copy the values it doesn't work because msg_buf is still empty.
I have one constraint:
- I need to use memcpy because my teacher will test my code with a check like memcmp(msg_str, &opcode, 2) == 0.
Here is my code:
int message_to_buffer(struct message_t *msg, char **msg_buf){
int opcode = htons(msg->opcode);
int c_type = htons(msg->c_type);
int result;
int buffer = sizeof(opcode) + sizeof(c_type);
switch(msg->c_type){
case CT_RESULT:
result = htonl(msg->content.result);
buffer += sizeof(result);
*msg_buf = (char*)malloc(sizeof(char) * 12);
if(msg_buf == NULL)
return -1;
memcpy(*msg_buf,&opcode,sizeof(opcode));
break;
};
return buffer;
}
What is wrong here?
More specifically, you need to be copying the shorts as shorts, not ints. sizeof(short) != sizeof(int) (usually, depending on the architecture):
int message_to_buffer(struct message_t *msg, char **msg_buf){
short opcode = htons(msg->opcode);
short c_type = htons(msg->c_type);
int result;
char* buffer = NULL, *buf_start=NULL;
*msg_buf = NULL;
switch(msg->c_type){
case CT_RESULT:
result = htonl(msg->content.result);
buffer = (char*)malloc(sizeof(char) * 12);
if (buffer == NULL)
return -1;
buf_start = buffer;
memcpy(buffer,&opcode,sizeof(opcode)); // sizeof(short) == 2; sizeof(int) == 4
buffer += sizeof(opcode);
memcpy(buffer,&c_type,sizeof(c_type)); // sizeof(short) == 2; sizeof(int) == 4
buffer += sizeof(c_type);
memcpy(buffer,&result, sizeof(result));
buffer += sizeof(result);
*msg_buf = buf_start;
break;
};
return buffer - buf_start;
}
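The same idea can be written with fixed-width types, which takes the guesswork out of sizeof(short). This is only a sketch: pack_result is a hypothetical stand-in for the CT_RESULT branch, and it assumes opcode and c_type are 16-bit fields and result is a 32-bit field, which the question does not actually show. htons and htonl come from the POSIX header arpa/inet.h.

```c
#include <arpa/inet.h>  /* htons, htonl (POSIX) */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Writes opcode (2 bytes), c_type (2 bytes) and result (4 bytes) in
 * network byte order into a freshly allocated buffer stored in *msg_buf.
 * Returns the number of bytes written, or -1 on allocation failure. */
static int pack_result(uint16_t opcode, uint16_t c_type, uint32_t result,
                       char **msg_buf)
{
    uint16_t net_opcode = htons(opcode);
    uint16_t net_c_type = htons(c_type);
    uint32_t net_result = htonl(result);
    size_t total = sizeof net_opcode + sizeof net_c_type + sizeof net_result;

    char *buf = malloc(total);
    if (buf == NULL)
        return -1;

    /* Copy each field and advance the write position by its exact size. */
    char *p = buf;
    memcpy(p, &net_opcode, sizeof net_opcode); p += sizeof net_opcode;
    memcpy(p, &net_c_type, sizeof net_c_type); p += sizeof net_c_type;
    memcpy(p, &net_result, sizeof net_result);

    *msg_buf = buf;
    return (int)total;
}
```

With this layout a check such as memcmp(buf, &net_opcode, 2) == 0 holds regardless of host endianness, because the first two bytes of the buffer are exactly the bytes of the converted 16-bit value.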
I think your problem may be that you are calling htons() on an int. htons() is meant to be used with values of type short, so you may be losing the upper 16 bits of your msg->opcode and msg->c_type there. Try replacing htons() with htonl() instead.
Also, it looks like you are allocating a 12-byte buffer with malloc(), but only writing 4 bytes into it, leaving the latter 8 bytes of it uninitialized/undefined. Is that intentional?
Why don't you use the itoa function to convert the int to a char*? You would replace your memcpy with a call to itoa.
Reference: http://www.cplusplus.com/reference/cstdlib/itoa/
[EDIT]
If your compiler does not support itoa, you can use sprintf instead.
On occasion, the following code works, which probably means good concept, but poor execution. Since this crashes depending on where the bits fell, this means I am butchering a step along the way. I am interested in finding an elegant way to fill bufferdata with <=4096 bytes from buffer, but admittedly, this is not it.
EDIT: the error I receive is illegal access on bufferdata
unsigned char buffer[4096] = {0};
char *bufferdata;
bufferdata = (char*)malloc(4096 * sizeof(*bufferdata));
if (! bufferdata)
return false;
while( ... )
{
// int nextBlock( voidp _buffer, unsigned _length );
read=nextBlock( buffer, 4096);
if( read > 0 )
{
memcpy(bufferdata+bufferdatawrite,buffer,read);
if(read == 4096) {
// let's go for another chunk
bufferdata = (char*)realloc(bufferdata, ( bufferdatawrite + ( 4096 * sizeof(*bufferdata)) ) );
if (! bufferdata) {
printf("failed to realloc\n");
return false;
}
}
}
else if( read<0 )
{
printf("error.\n");
break;
}
else {
printf("done.\n");
break;
}
}
free(bufferdata);
It's hard to tell where the error is, there's some code missing here and there.
if(read == 4096) { looks like a culprit. What if nextBlock returned 4000 on one iteration and 97 on the next? Now you need to store 4097 bytes, but you don't reallocate the buffer to accommodate them.
You need to accumulate the bytes, and realloc whenever you pass a 4096 boundary.
Something like:
#define CHUNK_SIZE 4096
int total_read = 0;
int buffer_size = CHUNK_SIZE ;
char *bufferdata = malloc(CHUNK_SIZE );
char buffer[CHUNK_SIZE];
while( ... )
{
// int nextBlock( voidp _buffer, unsigned _length );
read=nextBlock( buffer, CHUNK_SIZE );
if( read > 0 )
{
total_read += read;
if(buffer_size < total_read) {
// let's go for another chunk
char *tmp_buf;
tmp_buf= (char*)realloc(bufferdata, buffer_size + CHUNK_SIZE );
if (! tmp_buf) {
free(bufferdata);
printf("failed to realloc\n");
return false;
}
bufferdata = tmp_buf;
buffer_size += CHUNK_SIZE ;
}
memcpy(bufferdata+total_read-read,buffer,read);
}
...
}
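For completeness, here is a self-contained variant of the accumulate-and-realloc loop that can actually be compiled and run. Everything outside the loop logic is an assumption made for the sketch: the reader is passed in as a callback with a context pointer, since nextBlock's real source isn't shown, and demo_next is a fake source that hands out bytes in pieces of at most 1000. The buffer doubles instead of growing by one chunk, which is a swapped-in growth policy, not what the answer above does.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096

/* Reader callback: fills buf with up to len bytes and returns the count,
 * 0 at end of data, or a negative value on error. Stand-in for nextBlock. */
typedef int (*block_reader)(void *ctx, void *buf, unsigned len);

/* Reads everything the reader produces into one heap buffer.
 * On success returns the buffer (caller frees) and stores its length in
 * *out_len; on read or allocation failure returns NULL. */
static char *read_all(block_reader next, void *ctx, size_t *out_len)
{
    unsigned char chunk[CHUNK_SIZE];
    size_t cap = CHUNK_SIZE, used = 0;
    char *data = malloc(cap);
    if (data == NULL)
        return NULL;

    for (;;) {
        int n = next(ctx, chunk, CHUNK_SIZE);
        if (n < 0) { free(data); return NULL; }
        if (n == 0) break;

        if (used + (size_t)n > cap) {
            /* cap >= CHUNK_SIZE >= n, so doubling always makes enough room */
            size_t new_cap = cap * 2;
            char *grown = realloc(data, new_cap);
            if (grown == NULL) { free(data); return NULL; }
            data = grown;
            cap = new_cap;
        }
        memcpy(data + used, chunk, (size_t)n);
        used += (size_t)n;
    }
    *out_len = used;
    return data;
}

/* Demo source for trying the loop out: yields `remaining` bytes of 'x'
 * in pieces of at most 1000 bytes, so reads never align with CHUNK_SIZE. */
typedef struct { size_t remaining; } DemoSource;

static int demo_next(void *ctx, void *buf, unsigned len)
{
    DemoSource *s = ctx;
    size_t n = s->remaining < 1000 ? s->remaining : 1000;
    if (n > len) n = len;
    memset(buf, 'x', n);
    s->remaining -= n;
    return (int)n;
}
```

The capacity check uses the running total rather than read == 4096, so short reads such as 4000 followed by 97 are handled the same way as full ones.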
A few comments:
Please #define 4096 or make it a const. You will get burned if you ever need to change it. realloc chaining is an extremely inefficient way to build a buffer. Is there any way you could fetch the size up front and allocate it all at once? Perhaps not, but I always cringe when I see realloc(). I'd also like to know what kZipBufferSize is and whether it's in bytes like the rest of your counts. Also, what exactly is bufferdatawrite? I'm assuming it's source data, but I'd like to see its declaration to make sure it's not a memory alignment issue, which is kind of what this feels like. Or a buffer overrun due to bad sizing.
Finally, are you sure that nextBlock isn't overrunning memory somehow? This is another point of potential weakness in your code.