I want to convert this uint8_t array to a char array in C, and I have been stuck on it for a while. My first attempt copies the values into a temporary buffer, copies that temporary buffer into a writable char array, and then frees the temporary buffer. This is used alongside a BLAKE hash function. Here is my code snippet:
char * bl(char *input)
{
uint8_t output[64];
char msg[]= "";
char *tmp;
int dInt;
memset(output,0,64);
tmp = (char*) malloc(64);
if (!tmp){
exit( 1);
}
dInt = strlen(input);
if (dInt > 0xffff){
exit( 1);
}
uint8_t data[dInt];
memset(data,0, dInt);
strlcpy(data,input,dInt);
uint64_t dLen =dInt;
blake512_hash(output, data,dLen);
int k;
for (k=0;k<64;k++){
tmp[k] = output[k]; //does this "copy" is buggy code?
}
memcpy(msg, tmp,64);
//so here I can to delete tmp value
// I dont want there were left unused value in memory
// delete tmp;
free(tmp);
return msg;
}
I think the code above is still not efficient, so what are your opinions, hints, and fixes?
Thank you very much in advance!
First of all, you should never return a pointer to a local variable, since the variable is destroyed when the function exits. You probably want to pass the output array to the bl function and use that to return the string.
In most cases (if uint8_t is a character type, which is usually the case), memcpy(msg, output, 64) should be sufficient. If you want to be strict about it (quite frankly, blake512_hash shouldn't return a uint8_t array in the first place if you always expect a char array as the output), you could simply write msg[k] = (char)output[k] in your for loop and remove the memcpy.
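Quite a bit is wrong here.
dInt = strlen(input) + 1; // dInt is the size of the string including the terminating '\0'.
strlcpy takes the size of the destination buffer, not the strlen of the source.
Declare msg as a char * and use msg = tmp; without freeing tmp. As written, msg is an array with room only for the empty string "" (a const char* "" in C++ terms).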
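Putting those points together, here is a minimal sketch of just the copy step, using a caller-supplied buffer so that nothing local is returned and nothing needs freeing. The name digest_to_chars and the DIGEST_LEN constant are made up for illustration, and the call to blake512_hash is left out because its exact prototype is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_LEN 64 /* assumed: size of a 512-bit digest in bytes */

/* Copy a raw digest into a caller-supplied char buffer.
 * out must have room for DIGEST_LEN bytes. The result is binary data,
 * not a NUL-terminated string, so do not hand it to strlen() or printf("%s"). */
void digest_to_chars(char *out, const uint8_t *digest)
{
    /* uint8_t and char are both exactly one byte wide, so memcpy is enough */
    memcpy(out, digest, DIGEST_LEN);
}
```

The caller would declare char msg[DIGEST_LEN]; and pass it in, which avoids both the malloc and the pointer to a local variable.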
A fairly simple question regarding malloc: what is the highest index I can set within the allocated area? For instance:
char *buffer;
buffer = malloc(20);
buffer[19] = 'a'; //Is this the highest spot I can set?
buffer[20] = 'a'; //Or is this the highest spot I can set?
free(buffer);
The phrasing of your question is a bit off. You mean "what is the maximum index I can use for an allocated block of memory". The answer is the same as for arrays.
If you are reading or writing the memory, you may safely use indices between (and including) 0 and one less than the size of the block (in your case, that means index 19). All up, that means you can access the 20 values that you asked for.
If you are simply obtaining the pointer for comparison with other pointers inside the same block (and you are not going to read or write to it), you may additionally obtain the pointer one-past-the-end (in your case that means index 20).
To clarify these things with examples:
Yes, buffer[19] = 'a'; is the very last value you may access in a read or write capacity. Don't forget that if you want to store a string in this memory and hand it to functions that expect a null-terminated string, this slot is your last chance to store the terminating '\0'.
You are allowed to refer to &buffer[20] (without reading or writing it) in the following manner:
char *p;
for( p = &buffer[0]; p != &buffer[20]; ++p )
{
putc( *p, stdout );
}
This is useful because of the way we tend to iterate over memory and store sizes. It would make our code considerably less readable if we had to subtract 1 all over the place.
Oh, and it gives you the neat trick:
size_t buf_size = 20;
char *buffer = malloc(buf_size);
char *start = buffer;
char *end = buffer + buf_size;
size_t oops_i_forgot_the_size = end - start;
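As a rough sketch of that idiom (the helper name fill_block is made up here), note that the end pointer is only ever compared against and subtracted, never dereferenced:

```c
#include <stddef.h>

/* Fill a block of n bytes with the value c and return how many bytes were written.
 * end points one past the last element; it is compared but never dereferenced. */
size_t fill_block(char *start, size_t n, char c)
{
    char *end = start + n; /* one-past-the-end: legal to form and compare */
    char *p;
    for (p = start; p != end; ++p)
        *p = c;
    return (size_t)(end - start); /* recovers the size from the two pointers */
}
```

With the 20-byte buffer from the question, fill_block(buffer, 20, 'a') touches indices 0 through 19 and nothing else.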
malloc(x) will allocate x bytes.
So by accessing buffer[0] you access the first byte, by accessing buffer[1] you access the second.
For example:
char * buffer = (char *) malloc(1);
buffer[0] = 0; // legal
buffer[1] = 0; // illegal
I have a program that accepts a string from the command line through argv. I copy argv[1] with strcpy into structptr->words, for which memory has been allocated. I then copy it character by character from structptr->words into another allocated buffer called words. After copying each character I print element [c] to make sure it was copied correctly (it was). When all of the characters have been copied, I return the result as a char pointer, but for some reason it is blank/null. After copying each character I also checked whether the previous elements were still correct, but they no longer show up ([c-2], [c-1], [c]). Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct StructHolder {
char *words;
};
typedef struct StructHolder Holder;
char *GetCharacters(Holder *ptr){
int i=0;
char *words=malloc(sizeof(char));
for(i;i<strlen(ptr->words);i++){
words[i]=ptr->words[i];
words=realloc(words,sizeof(char)+i);
}
words[strlen(ptr->words)]='\0';
return words;
}
int main(int argc, char **argv){
Holder *structptr=malloc(sizeof(Holder));
structptr->words=malloc(strlen(argv[1]));
strcpy(structptr->words, argv[1]);
char *charptr;
charptr=(GetCharacters(structptr));
printf("%s\n", charptr);
return 0;
}
At first I thought this was the problem:
char *words=malloc(sizeof(char)) allocates 1 byte (the size of one char). You probably meant char *words = malloc(strlen(ptr->words)+1);. You probably also want to null-check ptr and its member just to be safe.
Then I saw the realloc. Your realloc is always 1 char short: when i = 0 the buffer is resized to 1 byte, then the loop increments i and writes a char one past the end of the realloc'ed array (at index 1).
Also, the malloc before your strcpy in main does not allocate enough memory in the holder for the terminating null byte.
In these two lines,
structptr->words=malloc(strlen(argv[1]));
strcpy(structptr->words, argv[1]);
you need to add one to the size to hold the nul-terminator: strlen(argv[1]) should be strlen(argv[1])+1.
I think the same thing is happening in the loop, and it should be larger by 1. And sizeof(char) is always 1 by definition, so:
...
words=realloc(words,i+2);
}
words=realloc(words,i+2); // one more time to make room for the '\0'
words[strlen(ptr->words)]='\0';
FYI: Your description talks about structptr but your code uses struct StructHolder and Holder.
This code is a disaster:
char *GetCharacters(Holder *ptr){
int i=0;
char *words=malloc(sizeof(char));
for(i;i<strlen(ptr->words);i++){
words[i]=ptr->words[i];
words=realloc(words,sizeof(char)+i);
}
words[strlen(ptr->words)]='\0';
return words;
}
It should be:
char *GetCharacters(const Holder *ptr)
{
char *words = malloc(strlen(ptr->words) + 1);
if (words != 0)
strcpy(words, ptr->words);
return words;
}
Or even:
char *GetCharacters(const Holder *ptr)
{
return strdup(ptr->words);
}
And all of those accept that passing the structure type makes sense; there's no obvious reason why you don't just pass the const char *words instead.
Dissecting the 'disaster' (and ignoring the argument type):
char *GetCharacters(Holder *ptr){
int i=0;
OK so far, though you're not going to change the structure so it could be a const Holder *ptr argument.
char *words=malloc(sizeof(char));
Allocating one byte is expensive — more costly than calling strlen(). This is not a good start, though of itself, it is not wrong. You do not, however, check that the memory allocation succeeded. That is a mistake.
for(i;i<strlen(ptr->words);i++){
The i; first term is plain weird. You could write for (i = 0; ... (and possibly omit the initializer in the definition of i), or you could write for (int i = 0; ....
Using strlen() repeatedly in a loop like that is bad news too. You should be using:
int len = strlen(ptr->words);
for (i = 0; i < len; i++)
Next:
words[i]=ptr->words[i];
This assignment is not a problem.
words=realloc(words,sizeof(char)+i);
This realloc() assignment is a problem. If you get back a null pointer, you've lost the only reference to the previously allocated memory. You need, therefore, to save the return value separately, test it, and only assign if successful:
void *space = realloc(words, i + 2); // When i = 0, allocate 2 bytes.
if (space == 0)
break;
words = space;
This would be better/safer. It isn't completely clean; it might be better to replace break; with { free(words); return 0; } to do an early exit. But this whole business of allocating one byte at a time is not the right way to do it. You should work out how much space to allocate, then allocate it all at once.
}
words[strlen(ptr->words)]='\0';
You could avoid recalculating the length by using i instead of strlen(ptr->words). This would have the side benefit of being correct if the if (space == 0) break; was executed.
return words;
}
The rest of this function is OK.
I haven't spent time analyzing main(); it is not, however, problem-free.
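For what it's worth, here is a minimal sketch of how the allocation done in main() could be tidied up, assuming the Holder type from the question. The helper names holder_create and holder_destroy are invented here and are not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

struct StructHolder {
    char *words;
};
typedef struct StructHolder Holder;

/* Allocate a Holder that owns a private copy of text; returns NULL on failure. */
Holder *holder_create(const char *text)
{
    Holder *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    size_t len = strlen(text) + 1; /* +1 for the terminating '\0' */
    h->words = malloc(len);
    if (h->words == NULL) {
        free(h);
        return NULL;
    }
    memcpy(h->words, text, len);
    return h;
}

/* Copy the holder's string in one allocation; the caller must free() the result. */
char *GetCharacters(const Holder *ptr)
{
    size_t len = strlen(ptr->words) + 1;
    char *words = malloc(len);
    if (words != NULL)
        memcpy(words, ptr->words, len);
    return words;
}

/* Release a Holder and the string it owns. */
void holder_destroy(Holder *h)
{
    if (h != NULL) {
        free(h->words);
        free(h);
    }
}
```

In main() you would also want to check that argc is greater than 1 before touching argv[1], and free both the copy and the holder before returning.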
I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following the best practice regarding malloc, realloc, and free.
The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...
http://www.cplusplus.com/reference/cstdlib/realloc/
that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.
So the prototype for my function is:
//decode a uri encoded string
char *net_uri_to_text(char *);
I don't like the way I'm doing it because I have to free the pointer after running the function:
char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);
Which means that malloc() and realloc() are called inside my function and free() is called outside my function.
I have a background in high level languages, (perl, plpgsql, bash) so my instinct is proper encapsulation of such things, but that might not be the best practice in C.
The question: Is my way best practice, or is there a better way I should follow?
full example
Compiles and runs with two warnings on unused argc and argv arguments, you can safely ignore those two warnings.
example.c:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *net_uri_to_text(char *);
int main(int argc, char ** argv) {
char * chr_input = "testing123%5a%5b%5cabc";
char * chr_output = net_uri_to_text(chr_input);
printf("%s\n", chr_output);
free(chr_output);
return 0;
}
//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
//define variables
int int_length = strlen(chr_input);
int int_new_length = int_length;
char * chr_output = malloc(int_length);
char * chr_output_working = chr_output;
char * chr_input_working = chr_input;
int int_output_working = 0;
unsigned int uint_hex_working;
//while not a null byte
while(*chr_input_working != '\0') {
//if %
if (*chr_input_working == *"%") {
//then put correct char in
sscanf(chr_input_working + 1, "%02x", &uint_hex_working);
*chr_output_working = (char)uint_hex_working;
//printf("special char:%c, %c, %d<\n", *chr_output_working, (char)uint_hex_working, uint_hex_working);
//realloc
chr_input_working++;
chr_input_working++;
int_new_length -= 2;
chr_output = realloc(chr_output, int_new_length);
//output working must be the new pointer plus how many chars we've done
chr_output_working = chr_output + int_output_working;
} else {
//put char in
*chr_output_working = *chr_input_working;
}
//increment pointers and number of chars in output working
chr_input_working++;
chr_output_working++;
int_output_working++;
}
//last null byte
*chr_output_working = '\0';
return chr_output;
}
It's perfectly OK to return malloc'd buffers from functions in C, as long as you document the fact that they do. Lots of libraries do that, even though hardly any function in the standard library does (strdup, from POSIX and now C23, is the well-known exception).
If you can compute (a not too pessimistic upper bound on) the number of characters that need to be written to the buffer cheaply, you can offer a function that does that and let the user call it.
It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:
/*
* Decodes the uri-encoded string encoded into buf of length len (including NUL).
* Returns the number of characters needed (including NUL). If that number is
* greater than len, the output is truncated and you should try again with a
* larger buffer.
*/
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
size_t space_needed = 0;
while (decoding_needs_to_be_done()) {
// decode characters, but only write them to buf
// if it wouldn't overflow;
// increment space_needed regardless
}
return space_needed;
}
Now the caller is responsible for the allocation, and would do something like
size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);
size_t needed = net_uri_to_text(input, result, len);
if (needed > len) {
    // try again with a buffer of the required size
    result = xrealloc(result, needed);
    net_uri_to_text(input, result, needed);
}
(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
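To make the buffer-filling variant concrete, here is one possible runnable sketch. The name uri_decode_into is made up to keep it apart from the question's function, and treating a malformed escape as a literal '%' is my own assumption, not something the question specifies.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Decode a uri-encoded string into buf (capacity len, including the NUL).
 * Returns the number of bytes the full result needs, including the NUL.
 * If the return value is greater than len, the output was truncated. */
size_t uri_decode_into(const char *encoded, char *buf, size_t len)
{
    size_t needed = 0;
    while (*encoded != '\0') {
        char c = *encoded;
        if (c == '%' && isxdigit((unsigned char)encoded[1])
                     && isxdigit((unsigned char)encoded[2])) {
            char hex[3] = { encoded[1], encoded[2], '\0' };
            c = (char)strtoul(hex, NULL, 16);
            encoded += 3;
        } else {
            encoded++; /* ordinary character, or a malformed escape kept as-is */
        }
        if (needed + 1 < len) /* keep one byte free for the NUL */
            buf[needed] = c;
        needed++;
    }
    if (len > 0)
        buf[needed < len ? needed : len - 1] = '\0';
    return needed + 1;
}
```

Because nothing is written when len is 0, a caller can pass a NULL buffer with length 0 to ask for the required size first, then allocate exactly that much and call again.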
The thing is that C is low-level enough to force the programmer to get her memory management right. In particular, there's nothing wrong with returning a malloc()ated string. It's a common idiom to return malloc()ated objects and have the caller free() them.
And anyways, if you don't like this approach, you can always take a pointer to the string and modify it from inside the function (after the last use, it will still need to be free()d, though).
One thing, however, that I don't think is necessary is explicitly shrinking the string. If the new string is shorter than the old one, there's obviously enough room for it in the memory chunk of the old string, so you don't need to realloc().
(Apart from the fact that you forgot to allocate one extra byte for the terminating NUL character, of course...)
And, as always, you can just return a different pointer each time the function is called, and you don't even need to call realloc() at all.
If you accept one last piece of good advice: it's advisable to const-qualify your input strings, so the caller can ensure that you don't modify them. Using this approach, you can safely call the function on string literals, for example.
All in all, I'd rewrite your function like this:
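#include <ctype.h>
#include <stdlib.h>
#include <string.h>

char *unescape(const char *s)
{
    size_t l = strlen(s);
    char *p = malloc(l + 1), *r = p;
    if (p == NULL)
        return NULL; // allocation failed; the caller must check for this
    while (*s) {
        // only decode when two hex digits really follow the '%'
        if (*s == '%' && isxdigit((unsigned char)s[1]) && isxdigit((unsigned char)s[2])) {
            char buf[3] = { s[1], s[2], 0 };
            *p++ = (char)strtol(buf, NULL, 16); // yes, I prefer this over scanf()
            s += 3;
        } else {
            *p++ = *s++;
        }
    }
    *p = 0;
    return r;
}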
And call it as follows:
int main()
{
const char *in = "testing123%5a%5b%5cabc";
char *out = unescape(in);
printf("%s\n", out);
free(out);
return 0;
}
It's perfectly OK to return newly-malloc-ed (and possibly internally realloced) values from functions, you just need to document that you are doing so (as you do here).
Other obvious items:
Instead of int int_length you might want to use size_t. This is "an unsigned type" (usually unsigned int or unsigned long) that is the appropriate type for lengths of strings and arguments to malloc.
You need to allocate n+1 bytes initially, where n is the length of the string, as strlen does not include the terminating 0 byte.
You should check for malloc failing (returning NULL). If your function will pass the failure on, document that in the function-description comment.
sscanf is pretty heavy-weight for converting the two hex bytes. Not wrong, except that you're not checking whether the conversion succeeds (what if the input is malformed? you can of course decide that this is the caller's problem but in general you might want to handle that). You can use isxdigit from <ctype.h> to check for hexadecimal digits, and/or strtoul to do the conversion.
Rather than doing one realloc for every % conversion, you might want to do a final "shrink realloc" if desirable. Note that if you allocate (say) 50 bytes for a string and find it requires only 49 including the final 0 byte, it may not be worth doing a realloc after all.
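As an illustration of the isxdigit/strtoul suggestion, a small validating helper could look roughly like this. The name decode_hex_pair and its return convention are made up for the sketch.

```c
#include <ctype.h>
#include <stdlib.h>

/* Try to decode the two characters that follow a '%'.
 * p points at the first character after the '%'.
 * Returns 1 and stores the byte in *out on success, 0 if the input is malformed. */
int decode_hex_pair(const char *p, unsigned char *out)
{
    /* the || short-circuits, so p[1] is never read past a terminating '\0' */
    if (!isxdigit((unsigned char)p[0]) || !isxdigit((unsigned char)p[1]))
        return 0;
    char hex[3] = { p[0], p[1], '\0' };
    *out = (unsigned char)strtoul(hex, NULL, 16);
    return 1;
}
```

The decoding loop can then decide what to do with a malformed escape (copy the '%' through, or report an error) instead of silently writing whatever sscanf left behind.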
I would approach the problem in a slightly different way. Personally, I would split your function in two. The first function would calculate the size you need to malloc. The second would write the output string to the given pointer (which has been allocated outside of the function). That saves several calls to realloc and keeps the complexity the same. A possible function to find the size of the new string is:
int getNewSize (const char *string) {
    int size = 0, percent = 0;
    for (const char *i = string; *i != '\0'; i++, size++) {
        if (*i == '%')
            percent++;
    }
    /* every %XX escape shrinks to one character; add 1 for the '\0' when allocating */
    return size - percent * 2;
}
However, as mentioned in other answers there is no problem in returning a malloc'ed buffer as long as you document it!
In addition to what was already mentioned in the other posts, you should also document the fact that the string is reallocated. If your code is called with a static string or a string allocated with alloca, you must not reallocate it.
I think you are right to be concerned about splitting up mallocs and frees. As a rule, whatever makes it, owns it and should free it.
In this case, where the strings are relatively small, one good procedure is to make the string buffer larger than any possible string it could contain. For example, URLs have a de facto limit of about 2000 characters, so if you malloc 10000 characters you can store any possible URL.
Another trick is to store both the length and capacity of the string at its front, so that *(int *)mystring == length of string and *(int *)(mystring + 4) == capacity of string (assuming a 4-byte int). Thus, the string itself only starts at the 8th position, mystring + 8. By doing this you can pass around a single pointer to a string and always know how long it is and how much memory capacity the string has. You can make macros that automatically generate these offsets and make "pretty code".
The value of using buffers this way is you do not need to do a reallocation. The new value overwrites the old value and you update the length at the beginning of the string.
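A rough sketch of that idea follows. Note that it swaps the hand-computed byte offsets for a struct header with a C99 flexible array member, which gives the same layout without casts; the names lstring, lstring_new and lstring_set are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Header stored in front of the characters (hypothetical layout). */
struct lstring {
    size_t length;   /* bytes in use, not counting the '\0' */
    size_t capacity; /* bytes available in data, including the '\0' */
    char data[];     /* flexible array member (C99) */
};

/* Allocate an empty string with room for capacity bytes (capacity >= 1). */
struct lstring *lstring_new(size_t capacity)
{
    if (capacity == 0)
        return NULL;
    struct lstring *s = malloc(sizeof *s + capacity);
    if (s == NULL)
        return NULL;
    s->length = 0;
    s->capacity = capacity;
    s->data[0] = '\0';
    return s;
}

/* Overwrite the contents in place; returns 0 on success, -1 if text does not fit. */
int lstring_set(struct lstring *s, const char *text)
{
    size_t n = strlen(text);
    if (n + 1 > s->capacity)
        return -1; /* no reallocation: the buffer is fixed-size by design */
    memcpy(s->data, text, n + 1);
    s->length = n;
    return 0;
}
```

A single free(s) releases the header and the characters together, since they live in one allocation.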
I'm very new to C, and I'm getting stuck using the strncpy function.
Here's an example of what I'm working with:
int main()
{
const char *s = "how";
struct test {
char *name;
};
struct test *t1 = malloc(sizeof(struct test));
strncpy(t1->name, s, sizeof(*s));
t1->name[NAMESIZE] = '\0';
printf("%s\n", t1->name);
}
I have a const char *, I need to set the "name" value of test to the const char. I'm having a really tough time figuring this out. Is this even the correct approach?
Thank you very much!
Well, you allocate the structure, but not the string inside the structure. You need to do that before you copy to it. Even when you do, you will probably overwrite unallocated memory when you attempt to set the string terminator.
And, due to a high intake of wine, I only just noticed that you actually copy only one character, but it's still undefined behavior.
Let's take this one step at a time:
struct test *t1 = malloc(sizeof(struct test));
This allocates space for a struct test: enough space for the pointer name, but no memory for the pointer to point to. At a minimum, you'll want to do the following:
t1->name = malloc(strlen(s) + 1);
Having done that, you can proceed to copy the string. However, you already computed the length of the string once to allocate the memory; there's no sense in doing it again implicitly by calling strncpy. Instead, do the following:
const size_t len = strlen(s) + 1; // +1 accounts for terminating NUL
t1->name = malloc(len);
memcpy(t1->name, s, len);
In general, try to use this basic pattern; compute the length of strings once when they come into your code, but then use explicit-sized memory buffers and the mem* operations instead of implicit-length strings with str* operations. It is at least as safe (and often safer) and more efficient if done properly.
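Wrapped up with error checks, that pattern might look like the sketch below. The helper names test_create and test_destroy are made up; the struct is the one from the question.

```c
#include <stdlib.h>
#include <string.h>

struct test {
    char *name;
};

/* Allocate a struct test whose name is a private copy of s; NULL on failure. */
struct test *test_create(const char *s)
{
    const size_t len = strlen(s) + 1; /* computed once, +1 for the NUL */
    struct test *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->name = malloc(len);
    if (t->name == NULL) {
        free(t);
        return NULL;
    }
    memcpy(t->name, s, len); /* copies the NUL as well */
    return t;
}

/* Free both allocations made by test_create. */
void test_destroy(struct test *t)
{
    if (t != NULL) {
        free(t->name);
        free(t);
    }
}
```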
You might use strncpy if t1->name was a fixed-size array instead (though many people prefer to use strlcpy). That would look like the following:
struct test { char name[MAXSIZE]; };
struct test *t1 = malloc(sizeof *t1);
strncpy(t1->name, s, MAXSIZE - 1);
t1->name[MAXSIZE-1] = 0; // force NUL-termination
Note that the size argument to strncpy should always be the size of the destination, not the source, to avoid writing outside the bounds of the destination buffer.
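Here is a small sketch of the fixed-size variant that also reports truncation. MAXSIZE is deliberately tiny so the effect is easy to see, and the names test_fixed and set_name are made up for the example.

```c
#include <string.h>

#define MAXSIZE 8 /* assumed size, small so truncation is easy to see */

struct test_fixed {
    char name[MAXSIZE];
};

/* Copy s into t->name, truncating if needed; returns 1 if s fit completely. */
int set_name(struct test_fixed *t, const char *s)
{
    strncpy(t->name, s, MAXSIZE - 1); /* size of the destination, minus the NUL */
    t->name[MAXSIZE - 1] = '\0';      /* strncpy does not terminate on truncation */
    return strlen(s) < MAXSIZE;
}
```

Whether silent truncation is acceptable depends on the program; returning a flag at least lets the caller notice it.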
Without any attempt at completeness or educational direction, here's a version of your code that should work. You can play "spot the difference" and search for an explanation for each one separately on this site.
int main()
{
const char s[] = "how"; // s is an array, const char[4]
struct test{ char name[NAMESIZE]; }; // test::name is an array
struct test * t1 = malloc(sizeof *t1); // DRY
strncpy(t1->name, s, NAMESIZE); // size of the destination
t1->name[NAMESIZE - 1] = '\0'; // because strncpy is evil
printf("%s\n", t1->name);
free(t1); // clean up
}
strncpy() is almost always the wrong tool:
if the source string is too long, the target string will not be nul-terminated
if the target is too long (the third argument), the trailing end will be completely padded with NULs. This will waste a lot of cycles if you have large buffers and short strings.
Instead, you could use memcpy() or strcpy() (or in your case even strdup()):
int main()
{
const char *s = "how";
struct test {
char *name;
};
struct test *t1;
size_t len;
t1 = malloc(sizeof *t1);
#if USE_STRDUP
t1->name = strdup(s);
#else
len = strlen(s);
t1->name = malloc (1+len);
memcpy(t1->name, s, len);
t1->name[len] = '\0';
#endif
printf("%s\n", t1->name);
return 0;
}
I'm trying to convert some code from a dynamic-typed language to C. Please
bear with me as I have no practical experience yet with C.
I have a dispatcher function that decides how to convert its input based on
the value of the flag argument.
void output_dispatcher(char *str, int strlen, int flag) {
char output[501];
char *result;
switch (flag) {
/* No conversion */
case 0:
result = str;
break;
case 1:
result = convert_type1(output, str, strlen);
len = strlen(result);
break;
/* ... */
}
/* do something with result */
}
I currently have 5 different output converters and they all (even future
ones) are guaranteed to only produce 300-500 characters. From my reading, it
is preferable to use a heap variable than dynamically allocate space on the
stack, if possible. The function declaration for one looks like:
static char * convert_type1(char *out, const char *in, int inlen);
I want to avoid the strlen in the dispatcher, since it is unnecessary to
recalculate the output size because the output converters know it when they
construct the output. Also, since I'm passing in a pointer to the output
variable, I shouldn't need to return the result pointer, right? So I modify
it to the following, but get an 'incompatible type' compilation error.
void output_dispatcher(char *str, int strlen, int flag) {
char output[501];
switch (flag) {
/* No conversion */
case 0:
output = str; /* ERROR: incompatible type */
break;
case 1:
strlen = convert_type1(output, str, strlen);
break;
/* ... */
}
/* do something with result */
}
Can this approach work, or is there a better way to go?
To avoid the recalculation your output converters would need to have a prototype like this:
static char * convert_type1(char *out, const char *in, int *len);
called thus:
result = convert_type1(output, str, &strlen);
Internally, the output converter would read the input length through that pointer and overwrite it with the output length before returning.
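A minimal sketch of such a converter is shown below. convert_bracket is a made-up stand-in for the real converters (it just wraps the input in square brackets), chosen because its output length differs from its input length.

```c
#include <string.h>

/* Hypothetical converter: wraps the input in square brackets.
 * On entry *len is the input length; on return it is the output length.
 * out must have room for *len + 3 bytes. Returns out for convenience. */
char *convert_bracket(char *out, const char *in, int *len)
{
    int n = *len;
    out[0] = '[';
    memcpy(out + 1, in, (size_t)n);
    out[n + 1] = ']';
    out[n + 2] = '\0';
    *len = n + 2; /* the converter already knows the size, so no strlen() later */
    return out;
}
```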
On the issue of heap vs. stack: you only need the heap if the result has to outlive the function, since variables allocated on the stack disappear as soon as the function returns.
The line:
output = str;
is giving you problems because, while arrays and pointers are similar, they're not the same.
"output" is an array, not a pointer.
str = output;
will work, because an array name decays to a pointer to its first element when used like this.
But the opposite does not because the "output" variable is not just the pointer to the array, but the array itself.
For example, if you had:
char output[501];
char output1[501];
and you did:
output1 = output;
This would not be OK either: C does not copy arrays on assignment, so the compiler rejects it. To copy the contents of output into output1 you would use memcpy or strcpy.
So, you're just a little confused about arrays and ptrs.
char output[501];
output = str; /* ERROR: incompatible type */
=>
strncpy(output, str, sizeof(output) - 1);
output[sizeof(output) - 1] = '\0'; /* strncpy does not terminate on truncation */
Note: you should check whether 'output' is big enough to hold 'str'.
The error in this case makes sense. output is a buffer that will hold some char data, while str is a pointer to some other area in memory. You don't want to assign the address of what str is pointing to to output, right? If you want to go with this approach, I think you would just copy the data pointed to by str into output. Better yet, just use str if no conversion is required.
C does not allow arrays to be modified by direct assignment - you must individually modify the array members. Thus, if you want to copy the string pointed to by str into the array output, you must use:
strcpy(output, str);
or perhaps
memcpy(output, str, strlen + 1);
(In both cases, after first checking that strlen < sizeof output).
Note that naming a local variable strlen, thus shadowing the standard function of that name, is going to be more than a little confusing for someone who looks at your code later. I'd pick another name.
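Pulling the answers together, one way the dispatcher could be shaped is sketched below: it selects a result pointer instead of assigning to the array, and the length parameter is no longer called strlen. The function names and the convert_reverse stand-in are all hypothetical, since the real converters are not shown.

```c
#include <stddef.h>

/* Hypothetical stand-in for one of the real converters: reverses the input.
 * Writes len characters plus a NUL into out and returns the output length. */
static int convert_reverse(char *out, const char *in, int len)
{
    for (int i = 0; i < len; i++)
        out[i] = in[len - 1 - i];
    out[len] = '\0';
    return len;
}

/* Picks a result pointer instead of assigning to the array.
 * buf is the caller's scratch array of bufsize bytes; *outlen gets the length.
 * Returns NULL if the converted output would not fit in buf. */
const char *dispatch(const char *str, int len, int flag,
                     char *buf, int bufsize, int *outlen)
{
    const char *result = str; /* flag 0: no conversion, no copy */
    *outlen = len;
    if (flag == 1) {
        if (len >= bufsize)
            return NULL; /* would not fit */
        *outlen = convert_reverse(buf, str, len);
        result = buf; /* a pointer can be re-pointed; an array cannot */
    }
    return result;
}
```

The no-conversion case costs nothing this way, because result simply points at the original string.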