C printing extremely weird values with printf - arrays

I am rewriting this post because I managed to figure out the problem. The problem with my extremely broken output was due to an improper dynamic memory allocation.
Basically, I needed to allocate memory for an array of pointers to structs, but the array itself was nested inside another struct; the nesting confused me slightly and I ended up overcomplicating it.
So I had a struct named Catalog that held my array, and that array pointed to another struct named Books.
When I originally allocated memory for it, I was only allocating memory for an array, not an array of pointers:
catalog->arrayP = malloc(INITIAL_CAPACITY * sizeof( Books ));
// But I should have done this:
catalog->arrayP = (Books *) malloc(INITIAL_CAPACITY * sizeof( Books ));
// That first (Books *) was extremely important
The second issue I was having was that when I was trying to update the memory to allow for more books I was actually decreasing it:
catalog->arrayP = realloc(catalog->arrayP, 2 * sizeof( catalog->arrayP));
// I did this thinking it would just increase the memory to twice that of what it currently was, but it didn't
// Instead I needed to double the capacity first and then size the block from it:
catalog->capacity = catalog->capacity * 2;
catalog->arrayP = realloc(catalog->arrayP, catalog->capacity * sizeof( *catalog->arrayP ));
So whenever I needed to grow my array of pointers, I ended up allocating only enough memory for two pointers rather than double the current capacity.
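A minimal sketch of the allocate-then-double pattern described above, assuming arrayP is declared as Books ** (an array of pointers to Books). The count field, the fields inside Books, the value of INITIAL_CAPACITY, and the function names catalog_init and catalog_grow are hypothetical. Sizing each element as sizeof *pointer keeps the arithmetic tied to the declared type, and in C the result of malloc converts to any object pointer type without a cast, so the size expression is the part to get right.

```c
#include <stdlib.h>

#define INITIAL_CAPACITY 5          /* assumed value */

typedef struct {
    char title[40];                 /* hypothetical fields */
    char author[20];
} Books;

typedef struct {
    Books **arrayP;                 /* array of pointers to Books (assumed type) */
    size_t count;                   /* slots in use (hypothetical field) */
    size_t capacity;                /* slots allocated */
} Catalog;

/* Allocates the initial pointer array. Returns 0 on success, -1 on failure. */
int catalog_init(Catalog *catalog)
{
    catalog->arrayP = malloc(INITIAL_CAPACITY * sizeof *catalog->arrayP);
    if (catalog->arrayP == NULL)
        return -1;
    catalog->count = 0;
    catalog->capacity = INITIAL_CAPACITY;
    return 0;
}

/* Doubles the capacity. On failure the old array is left untouched. */
int catalog_grow(Catalog *catalog)
{
    size_t new_capacity = catalog->capacity * 2;
    Books **tmp = realloc(catalog->arrayP, new_capacity * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    catalog->arrayP = tmp;
    catalog->capacity = new_capacity;
    return 0;
}
```

Assigning the realloc result to a temporary first means a failed call does not overwrite (and leak) the existing array.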

Frankenstein; Or, The Modern Prometh..Shelley, Mary Woll.
Your printed output more or less gives away the answer. You forgot the null terminator on your strings, so printf ran on into the next field until it reached a null terminator.
In the following fields it couldn't find one either, and ran on into even more data.
Here's a minimal example:
#include <stdio.h>
#include <string.h>
struct test{
    char test[37]; // space for 36 chars + null
    char test2[16]; // space for 15 chars + null
};
int main(void) {
    struct test Test;
    strcpy(Test.test, "randomrandomrandomrandomrandomrandom"); // Copy the 37 bytes
    strcpy(Test.test2, "notnotnotnotnot"); // Copy the 16 bytes
    //Replace null terminator with trash for demonstration purposes
    Test.test[36] = '1'; // replaces 37th byte containing the terminator (\0) with trash
    printf("%38s", Test.test); // should print randomrandomrandomrandomrandomrandom1notnotnotnotnot
    return 0;
}
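One way to avoid losing the terminator in the first place is to copy through a bounded routine that always terminates the destination. The sketch below assumes fixed-size fields like the ones in the example; struct record and copy_field are hypothetical names, and snprintf requires C99 or later.

```c
#include <stdio.h>

struct record {
    char title[37];   /* 36 chars + '\0' */
    char author[16];  /* 15 chars + '\0' */
};

/* Copies src into dst (capacity dst_size bytes), truncating if necessary.
   snprintf always writes a terminating '\0' when dst_size > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_field(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

A call such as copy_field(r.title, sizeof r.title, text) then leaves r.title printable with %s even when text is longer than the field.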

Related

C: Realloc() after reading second line in file results in garbage values

I'm attempting to read sequences from a FASTA file into a table of structs that I've created, which each contain a character array member called "seq". My code seems to work well for the first loop, but when I realloc() memory for the second sequence, the pointer seems to point to garbage values and then the strcat() method gives me a segfault.
Here's the whole FASTA file I'm trying to read from:
>1
AAAAAAAAAAGWTSGTAAAAAAAAAAA
>2
LLLLLLLLLLGWTSGTLLLLLLLLLLL
>3
CCCCCCCCCCGWTSGTCCCCCCCCCCC
Here's the code (sorry that some of the variable names are in french):
typedef struct _tgSeq { char *titre ; char *seq ; int lg ; } tgSeq ;
#define MAX_SEQ_LN 1000
tgSeq* readFasta(char *nomFile) {
char ligne[MAX_SEQ_LN];
tgSeq *lesSeq = NULL;
int nbSeq=-1;
FILE *pF = fopen(nomFile, "r");
while(fgets(ligne, MAX_SEQ_LN, pF) != NULL) {
if(ligne[0] == '>') {
/*create a new sequence*/
nbSeq++;
//reallocate memory to keep the new sequence in the *lesSeq table
lesSeq = realloc(lesSeq, (nbSeq)*sizeof(tgSeq));
//allocate memory for the title of the new sequence
lesSeq[nbSeq].titre = malloc((strlen(ligne)+1)*sizeof(char));
//lesSeq[nbSeq+1].titre becomes a pointer that points to the same memory as ligne
strcpy(lesSeq[nbSeq].titre, ligne);
//Now we create the new members of the sequence that we can fill with the correct information later
lesSeq[nbSeq].lg = 0;
lesSeq[nbSeq].seq = NULL;
} else {
/*fill the members of the sequence*/
//reallocate memory for the new sequence
lesSeq[nbSeq].seq = realloc(lesSeq[nbSeq].seq, (sizeof(char)*(lesSeq[nbSeq].lg+1+strlen(ligne))));
strcat(lesSeq[nbSeq].seq, ligne);
lesSeq[nbSeq].lg += strlen(ligne);
}
}
// Close the file
fclose(pF);
return lesSeq;
}
For the first line (AAAAAAAAAAGWTSGTAAAAAAAAAAA), lesSeq[nbSeq].seq = realloc(lesSeq[nbSeq].seq, (sizeof(char)*(lesSeq[nbSeq].lg+1+strlen(ligne)))); gives me an empty character array that I can concatenate onto, but for the second line (LLLLLLLLLLGWTSGTLLLLLLLLLLL) the same code gives me garbage characters like "(???". I'm assuming the problem is that the reallocation is pointing towards some sort of garbage memory, but I don't understand why it would be different for the first line versus the second line.
Any help you could provide would be greatly appreciated! Thank you!
The problem here is that the first realloc is called with nbSeq equal to 0, so it requests zero bytes and gives you no usable memory.
Replace
int nbSeq=-1;
with
int nbSeq=0;
and access the current element with lesSeq[nbSeq - 1].
Some programmer dude already pointed out that you do not allocate enough memory.
You also seem to expect some behaviour from realloc that will not happen.
You call realloc with NULL pointers. This makes it behave the same as malloc.
For the first line (AAAAAAAAAAGWTSGTAAAAAAAAAAA), ...= realloc(); gives me an empty character array that I can concatenate onto, but for the second line (LLLLLLLLLLGWTSGTLLLLLLLLLLL) the same code gives me garbage characters like "(???".
You should not expect any specific content in your allocated memory. In particular, the memory is not set to 0. If you want to rely on that, you can use calloc.
Or you can simply assign a 0 to the first memory location.
You do not really concatenate anything. Instead you allocate new memory, where you could simply use strcpy instead of strcat.
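The points above can be sketched as a small append helper. It assumes the caller tracks the current length itself (like the lg member in the question); append_line is a hypothetical name. Because the copy is made at a known offset, it never depends on what the freshly allocated memory happens to contain.

```c
#include <stdlib.h>
#include <string.h>

/* Appends line to *seq, growing the buffer as needed. *len is the current
   length without the terminator. Returns 0 on success, -1 on allocation
   failure (in which case *seq is unchanged and still valid). */
int append_line(char **seq, size_t *len, const char *line)
{
    size_t add = strlen(line);
    char *tmp = realloc(*seq, *len + add + 1);
    if (tmp == NULL)
        return -1;
    /* realloc(NULL, n) acts like malloc(n): the contents are indeterminate,
       so copy at offset *len instead of letting strcat search for a '\0'. */
    memcpy(tmp + *len, line, add + 1);   /* add + 1 copies the terminator too */
    *seq = tmp;
    *len += add;
    return 0;
}
```

Starting from a NULL pointer and a length of 0, the first call allocates and the following calls extend the same string.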

C - resizing an array of pointers

I more or less have an idea, but I'm not sure if I've even got the right idea and I was hoping maybe I was just missing something obvious. Basically, I have an array of strings (C strings, so basically an array of pointers to character arrays) like so:
char **words;
I don't know how many words I'll have in the end. As I parse the string, I want to be able to resize the array, add a pointer to the word, and move on to the next word, then repeat.
The only way I can think of is to maybe start with a reasonable number and realloc every time I hit the end of the array, but I'm not entirely sure that works. Like I want to be able to access words[0], words[1], etc. If I had char **words[10] and called
realloc(words, n+4) //assuming this is correct since pointers are 4 bytes
once I hit the end of the array, if I did words[11] = new word, is that even valid?
Keep track of your array size:
size_t arr_size = 10;
And give it an initial chunk of memory:
char **words = malloc( arr_size * sizeof(char*) );
Once you have filled all positions, you may want to double the array size:
size_t tailIdx = 0;
while( ... ) {
    if( tailIdx >= arr_size ) {
        char **newWords;
        arr_size *= 2;
        newWords = realloc(words, arr_size * sizeof(char*) );
        if( newWords == NULL ) { some_error(); }
        words = newWords;
    }
    words[tailIdx++] = get_next_word();
}
...
free(words);
That approach is fine, although you may want to double the capacity instead, e.g. realloc(words, n * 2 * sizeof(char *)). Calling realloc and malloc is expensive, so you want to reallocate as little as possible; doubling means you can go for longer without reallocating (and possibly copying data). This is how most buffers are implemented to amortize allocation and copy costs. So just double the size of your buffer every time you run out of space.
You are probably going to want to allocate multiple blocks of memory. One for words, which will contain the array of pointers. And then another block for each word, which will be pointed to by elements in the words array.
Adding elements then involves realloc()ing the words array and then allocating new memory blocks for each new word.
Be careful how you write your clean up code. You'll need to be sure to free up all those blocks.
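A minimal sketch of that layout, with one block for the pointer array and one block per word, plus the matching cleanup. The WordList type and the function names are hypothetical, and the starting capacity of 4 is an arbitrary choice.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char **items;     /* block 1: the array of pointers */
    size_t count;     /* pointers in use */
    size_t capacity;  /* pointers allocated */
} WordList;

/* Stores a private copy of word. Returns 0 on success, -1 on failure. */
int wordlist_push(WordList *wl, const char *word)
{
    char *copy;
    if (wl->count == wl->capacity) {
        size_t new_cap = wl->capacity ? wl->capacity * 2 : 4;
        char **tmp = realloc(wl->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        wl->items = tmp;
        wl->capacity = new_cap;
    }
    copy = malloc(strlen(word) + 1);   /* one more block for each word */
    if (copy == NULL)
        return -1;
    strcpy(copy, word);
    wl->items[wl->count++] = copy;
    return 0;
}

/* Frees every word first, then the pointer array itself. */
void wordlist_free(WordList *wl)
{
    size_t i;
    for (i = 0; i < wl->count; i++)
        free(wl->items[i]);
    free(wl->items);
    wl->items = NULL;
    wl->count = wl->capacity = 0;
}
```

A list initialized as WordList wl = {NULL, 0, 0}; can be filled with wordlist_push and is then indexed as wl.items[0], wl.items[1], and so on, up to wl.count - 1.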

40 easy lines, 1 annoying segmentation fault. I really don't know where else to turn now

I apologize if this is a waste of time and/or not what should be on this site, but I'm kind of out of ideas... I'm still a novice at programming, can't get a hold of my teacher for guidance, so... TO THE INTERNET!
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void months( FILE* monthfp, char** monthGroup );
int main (void){
FILE *monthfp; /*to be used for reading in months from months.txt*/
char** monthGroup;
int i;
if (( monthfp = fopen ( "months.txt", "r" )) == NULL ){
printf( "unable to open months.txt. \n" );
exit ( 1 );
}
months( monthfp, monthGroup );
/*test so far*/
for ( i = 0; i < 12; i++ ){
printf( "%s", monthGroup[i] );
}
fclose( monthfp );
}
void months ( FILE* monthfp, char** monthGroup ){
/*****************************************
name: months
input: input file, data array
returns: No return. Modifies array.
*/
char buffer[50];
int count = 0;
while ( fgets( buffer, sizeof(buffer), monthfp ) != NULL ){
count++;
monthGroup = malloc( count * sizeof ( char* ));
monthGroup[count] = malloc( sizeof( buffer ) * sizeof( char ));
strcpy(monthGroup[ count - 1 ], buffer );
}
}
I'm compiling in C89, everything seems to work, except for a segmentation fault. Any guidance would be very much appreciated.
edit
Thanks to everyone who took the time to provide a little bit of insight into something I've been having trouble wrapping my head around. I feel like a little kid in a village of elders in a foreign land. Much appreciation for the courtesy and guidance.
I'm afraid you don't realize how far you are from getting it right. Sit tight, this is going to be long. Welcome to C.
char** monthGroup
All this really means is "a pointer-to-pointer-to-char". However, C has many reasons why you would want to point to something. In your case, the "inner" pointing is so that you can actually point at a sequence of chars in memory (which you colloquially treat as a "string", which C properly does not have), and the "outer" pointing is so that you can point at a sequence of those char*s, and treat that sequence as an "array" (even though it isn't; you're going to dynamically allocate it).
Here's the problem: When you pass in this char** that came from main, it doesn't actually point at anything. "That's fine", you say; "the function is going to make it point at some memory that I'll allocate with malloc()".
Nope.
C passes everything by value. The char** that months receives is a copy of the char** in main's chunk of local variables. You overwrite the pointer (with the result of the malloc call), write some pointers into that pointed-at memory (more malloc results), copy some data into those chunks of pointed-at memory... and then, at the end of the function, the parameter monthGroup (which is a local variable in months) no longer exists, and you've lost all that data, and the variable monthGroup in main is still unchanged at pointing at nothing. When you try to use it as if it points at something, boom you're dead.
So how do we get around this? With another level of pointing, of course. C properly does not have "pass by reference", so we must fake it. We accept a char***, and pass it &monthGroup. This is still a copied value, but it points directly into the local variable storage for that invocation of main (on the stack). That lets us write a value that will be visible in main. We assign the first malloc result to *monthGroup, and write pointers into that storage ((*monthGroup)[count], with the parentheses, since [] binds tighter than *), etc.
Except we don't really want to do that, because it's incredibly ugly and confusing and hard to get right. Let's instead do what should be an incredibly obvious thing that you're meant to do and that basic instruction doesn't emphasize nearly enough: use the return value of the function to return the result of the calculation - that's why it's called the return value.
That is, we set up a char** in months (not accepting any kind of parameter for it), return it, and use it to initialize the value in main.
Are we done? No.
You still have some logical errors:
You re-allocate the "outer" layer within your while-loop. That's clearly not what you want; you're allocating several "strings", but only one "array", so that allocation goes outside the loop. Otherwise, you throw away (without properly deallocating them!) the old arrays each time.
Actually, you do want to do something like this, but only because you don't know in advance how many elements you need. The problem is that the new allocation is just that - a new allocation - not containing the previously-set-up pointers.
Fortunately, C has a solution for this: realloc. This will allocate the new memory, copy the old contents across (the pointers to your allocated "strings"), and deallocate the old chunk. Hooray! Better yet, realloc will behave like malloc if we give it a NULL pointer for the "old memory". That lets us avoid special-casing our loop.
You're using the value count incorrectly. The first time through the loop, you'll increment count to 1, allocate some space for monthGroup[1] to point at, and then attempt to write into the space pointed at by monthGroup[0], which was never set up. You want to write into the same space for a "string" that you just allocated. (BTW, sizeof(char) is useless: it is always 1. Even if your system uses more than 8 bits to represent a char! The char is the fundamental unit of storage on your system.)
Except not, because there's a simpler way: use strdup (a POSIX function, not part of standard C89) to get a pointer to an allocated copy of your buffer.
char** months(FILE* monthfp) {
    char buffer[50];
    int count = 0;
    char** monthGroup = NULL;
    while (fgets(buffer, sizeof(buffer), monthfp) != NULL) {
        // (re-)allocate the storage, growing it by one slot:
        monthGroup = realloc(monthGroup, ++count * sizeof(char*));
        // ask for a duplicate of the buffer contents, and put a pointer to the
        // duplicate sequence into the last element of the storage:
        monthGroup[count - 1] = strdup(buffer);
    }
    return monthGroup;
}
Adjusting main to match is left as a (hopefully trivial) exercise. Please also read the documentation for realloc and strdup.
Are we done? No.
You should still be checking for NULL returns from realloc and strdup (since they both attempt to allocate memory, and thus may fail in that way in C), and you still need code to free the allocated memory.
And, as others pointed out, you shouldn't be assuming there will be 12 months. If you could assume that, you wouldn't be dynamically allocating monthGroup in the first place; you'd just use an array. So you need to communicate the size of the result "array" somehow (adding an explicit NULL pointer to the end is one way; another is to do the horribly ugly thing, pass in a char***, and use the return value to count the size).
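The remaining points (checking every allocation, reporting the size, and freeing the memory) could be sketched as follows. This version uses malloc and strcpy instead of strdup so that it stays within standard C; read_lines and free_lines are hypothetical names, and the 50-byte buffer is carried over from the question.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Frees the first n lines and then the array itself. */
void free_lines(char **lines, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        free(lines[i]);
    free(lines);
}

/* Reads every line of fp into a NULL-terminated array of heap strings and
   stores the number of lines in *out_count. Returns NULL on allocation
   failure; an empty file yields an array holding only the NULL terminator. */
char **read_lines(FILE *fp, size_t *out_count)
{
    char buffer[50];
    char **lines = NULL;
    size_t count = 0;

    for (;;) {
        char *got = fgets(buffer, sizeof buffer, fp);
        /* Always keep room for one more slot: the next line or the final NULL. */
        char **tmp = realloc(lines, (count + 1) * sizeof *tmp);
        if (tmp == NULL) {
            free_lines(lines, count);
            return NULL;
        }
        lines = tmp;
        if (got == NULL)
            break;
        lines[count] = malloc(strlen(buffer) + 1);
        if (lines[count] == NULL) {
            free_lines(lines, count);
            return NULL;
        }
        strcpy(lines[count], buffer);
        count++;
    }
    lines[count] = NULL;
    *out_count = count;
    return lines;
}
```

The caller can loop either up to the returned count or until it reaches the NULL terminator, and then hands the array back to free_lines.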
C has pass-by-value semantics for function calls. This is a fancy way of saying that
int main() {
int a = 5;
addOneTo(a);
printf("%d\n", a);
return 0;
}
will print 5 no matter what addOneTo() does to its parameter.
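To make the contrast concrete, here is a sketch with an assumed body for addOneTo (the original does not show one) next to a hypothetical pointer-taking variant, addOneToPtr, that the caller can actually observe.

```c
/* Receives a copy of the caller's value; the change is lost on return. */
void addOneTo(int n)
{
    n = n + 1;
}

/* Receives the address of the caller's variable, so the change is visible. */
void addOneToPtr(int *n)
{
    *n = *n + 1;
}
```

With int a = 5;, calling addOneTo(a) leaves a at 5, while addOneToPtr(&a) changes it to 6. The same reasoning applies when the variable being passed is itself a pointer, such as a char**.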
In your code, your months() function sets its local variable monthGroup to the value returned by the first malloc(), then throws away that value when it returns.
You have a few choices here on how to fix this problem. You could malloc into monthGroup outside the months() function then pass it in. You could return the monthGroup value. Or you could pass a pointer to monthGroup for pass-by-reference semantics (char***).
In any case, I would encourage you to learn how to use a debugger (e.g. gdb) so you can see why it segfaults next time!
Your problem lies in the months function, specifically your understanding of how memory works.
Looking at your code:
monthGroup = malloc( count * sizeof ( char* ));
This line allocates a chunk of memory which is equivalent to an array of char * of size count.
monthGroup[count] = malloc( sizeof( buffer ) * sizeof( char ));
Here, a buffer is allocated of size sizeof (buffer) (the sizeof (char) is unnecessary). This is one problem here: you are assigning it to monthGroup[count]. Arrays in C are zero-based, which means that the array:
int array [3];
has elements:
array [0], array [1] and array [2]
array [3] is outside the memory of the array. So monthGroup[count] is also outside the memory of the array. You want monthGroup[count-1] instead. This will write to the last element in the array.
The second problem is that every time you do the first allocation, you lose the previously allocated block (this is known as a memory leak) and the data it contained.
To fix this, there are two approaches.
When allocating the array, copy the contents of the old array to the new array:
oldarray = monthGroup;
monthGroup = malloc (count * sizeof (char *));
memcpy (monthGroup, oldarray, (count-1) * sizeof (char *));
free (oldarray);
monthGroup [count-1] = ....
or use realloc.
Use a linked list. A lot more complex this one but has the advantage of not requiring the arrays to be copied every time a new item is read.
Also, the monthGroup parameter doesn't get passed back to the caller. Either change the function to:
char **months (FILE *fp)
or:
void months (FILE *fp, char ***ugly_pointer)
Finally, the caller currently assumes that there are 12 entries and attempts to print each one out. What happens if there are fewer than 12, or more than 12? One way to cope is to use a special pointer to terminate the monthGroup array; a NULL would do nicely. Just allocate one extra element in the array and set the last one to NULL.
To me the most obvious of your problems is that you pass char** monthGroup as a parameter by value, then malloc it inside the function months, and afterwards try to use it in the caller function. However, since you passed it by value, you only stored the malloced address in a local copy of monthGroup, which does not change the value of the original variable in main.
As a quick fix, you need to pass a pointer to monthGroup, rather than (a copy of) its current value:
int main (void){
...
char** monthGroup;
...
months( monthfp, &monthGroup );
...
}
void months ( FILE* monthfp, char*** monthGroup ){
...
*monthGroup = malloc( count * sizeof ( char* ));
...
}
This is ugly (IMHO there should be no real reason to use char*** in real code) but at least a step in the right direction.
Then, as others rightly mentioned, you should also rethink your approach of reallocating monthGroup in a loop and forgetting about the previous allocations, leaving memory leaks and dangling pointers behind. What happens in the loop in your current code is
// read the first bunch of text from the file
count++;
// count is now 1
monthGroup = malloc( count * sizeof ( char* ));
// you allocated an array of size 1
monthGroup[count] = malloc( sizeof( buffer ) * sizeof( char ));
// you try to write to the element at index 1 - another segfault!
// should be monthGroup[count - 1] as below
strcpy(monthGroup[ count - 1 ], buffer );
Even with the fix suggested above, after 10 iterations, you are bound to have an array of 10 elements, the first 9 of which are dangling pointers and only the 10th pointing to a valid address.
The completed code would be this:
int main (void)
{
FILE *monthfp; /*to be used for reading in months from months.txt*/
char **monthGroup = NULL;
char **iter;
if ((monthfp = fopen("c:\\months.txt", "r")) == NULL){
printf("unable to open months.txt. \n");
exit(1);
}
months(monthfp, &monthGroup);
iter = monthGroup;
/* We know that the last element is NULL, and that element will stop the while */
while (*iter) {
printf("%s", *iter);
free(*iter);
iter++;
}
/* iter was advanced while printing, so free the array through the original pointer */
free(monthGroup);
fclose(monthfp);
}
void months(FILE *monthfp, char ***monthGroup)
{
/*****************************************
name: months
input: input file, data array
returns: No return. Modifies array.
*/
char buffer[50];
int count = 0;
while (fgets(buffer, sizeof(buffer), monthfp) != NULL){
count++;
/* We realloc the buffer */
*monthGroup = (char**)realloc(*monthGroup, count * sizeof(char*));
/* Here I'm allocating an exact buffer by counting the length of the line using strlen */
(*monthGroup)[count - 1] = (char*)malloc((strlen(buffer) + 1) * sizeof( char ));
strcpy((*monthGroup)[count - 1], buffer);
}
/* We add a terminating NULL element here. Other possibility would be returning count. */
count++;
*monthGroup = (char**)realloc(*monthGroup, count * sizeof(char*));
(*monthGroup)[count - 1] = NULL;
}
As said by others a char*** is ugly.
The principal error that I see immediately, is that your allocation for monthGroup will never make it back into your main.

Dynamic string array struct in C

I have to write a function in c which will return a dynamic array of strings. Here are my requirements:
I have 10 different examine functions, each of which returns either true or false plus an associated error text. (The error text string is also dynamic.)
My function must collect the result (true or false) and the error string, and it will call n examine functions. So my function must collect n results and finally return a dynamic array of strings to other functions.
You can allocate an array of arbitrary length with malloc() (it's like "new" in Java), and make it grow or shrink with realloc().
You have to remember to free the memory with free(), as there is no garbage collector in C.
Check: http://www.gnu.org/software/libc/manual/html_node/Memory-Allocation.html#Memory-Allocation
Edit:
#include <stdlib.h>
#include <string.h>
int main(){
char * string;
// Let's say we have an initial string of 8 chars
string = malloc(sizeof(char) * 9); // Nine because we need 8 chars plus one \0 to terminate the string
strcpy(string, "12345678");
// Now we need to expand the string to 10 chars (plus one for \0)
string = realloc(string, sizeof(char) * 11);
// you should check that string is not NULL here...
// Now we append some chars
strcat(string, "90");
// ...
// at some point you need to free the memory if you don't want a memory leak
free(string);
// ...
return 0;
}
Edit 2:
This is a sample that allocates and expands an array of pointers to chars (an array of strings):
#include <stdlib.h>
int main(){
// Array of strings
char ** messages;
char * pointer_to_string_0 = "Hello";
char * pointer_to_string_1 = "World";
unsigned size = 0;
// Initial size one
messages = malloc(sizeof(char *)); // Note I allocate space for 1 pointer to char
size = 1;
// ...
messages[0] = pointer_to_string_0;
// We expand to contain 2 strings (2 pointers really)
size++;
messages = realloc(messages, sizeof(char *) * size);
messages[1] = pointer_to_string_1;
// ...
free(messages);
// ...
return 0;
}
Consider creating appropriate types suited to your problem. For example, you can create a struct holding a pointer and an integer length to represent the dynamic arrays.
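As a rough sketch of such a type applied to the question: the struct below pairs a pointer with a count, and a collector runs each check through a function pointer. Everything here is an assumption made for illustration, including the examine_fn signature, the Report type, the two sample checks, and the simplification that error messages are string literals the report does not own (the question says the real error text is dynamic, which would need a copy per message).

```c
#include <stdbool.h>
#include <stdlib.h>

/* One check: returns true on success; on failure sets *error to a message. */
typedef bool (*examine_fn)(const char **error);

typedef struct {
    const char **errors;  /* messages from failed checks (not owned) */
    size_t count;         /* number of messages stored */
} Report;

/* Hypothetical sample checks. */
static bool check_ok(const char **error)
{
    (void)error;
    return true;
}

static bool check_bad(const char **error)
{
    *error = "value out of range";
    return false;
}

/* Runs n checks and collects the error text of each failed one.
   Returns 0 on success, -1 on allocation failure. */
int validate(examine_fn *checks, size_t n, Report *out)
{
    size_t i;
    out->count = 0;
    /* At most n messages are needed; request at least one slot. */
    out->errors = malloc((n ? n : 1) * sizeof *out->errors);
    if (out->errors == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        const char *msg = NULL;
        if (!checks[i](&msg))
            out->errors[out->count++] = msg;
    }
    return 0;
}
```

The caller reads out.count messages from out.errors and then releases the array with free(out.errors).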
Do you have some constraints on the prototype of the examine() function and of the function you have to write? (Let's call it validate().)
You say you have 10 examine() functions; does that mean you will have a maximum of 10 messages/results in the array returned by validate()?
I'm a Java programmer with a C background, so maybe I can highlight a few things for you :
there is no equivalent of Array.length in C : you'll have to supply a side integer value to store the effective size of your array
C arrays can't "grow" : you'll have to use pointers and allocate/reallocate the memory pointed by your array begin pointer as this array grows or shrinks
you should already know that there is no notion of class or method in C, however you can use struct, typedef and function pointers to add some kind of object oriented / genericity behavior to your C programs...
Depending on your needs and obligations, arrays might be a good way to go, or not : perhaps you should try to figure out a way of building/finding an equivalent of the java List interface in C, so that you can add, remove/destroy or sort examine result elements without having to duplicate memory allocation / reallocation / freeing code each time you manipulate your result set (and you should perhaps send a header file with your structs/examine functions to describe what you did for now anyway, and express your needs a bit more precisely, so that we can guide you to the good direction)
Don't hesitate to provide more information or ask for specifics about the above bullets points ;)

keeping track of how much memory malloc has allocated

After a quick scan of related questions on SO, I have deduced that there's no function that would check the amount of memory that malloc has allocated to a pointer. I'm trying to replicate some of std::string basic functionality (mainly dynamic size) using simple char*'s in C and don't want to call realloc all the time. I guess I'll need to keep track of how much memory has been allocated. In order to do that, I'm considering creating a typedef that will contain the string itself and an integer with the amount of memory currently allocated, something like this:
typedef struct {
char * str;
int mem;
} my_string_t;
Is that an optimal solution, or perhaps you can suggest something that will bear better results? Thanks in advance for your help.
You will want to allocate the space for both the length and the string in the same block of memory. This may be what you intended with your struct, but you have reserved space for only a pointer to the string.
There must be space allocated to contain the characters of the string.
For example:
typedef struct
{
int num_chars;
char string[];
} my_string_t;
my_string_t * alloc_my_string(char *src)
{
my_string_t * p = NULL;
int N_chars = strlen(src) + 1;
p = malloc( N_chars + sizeof(my_string_t));
if (p)
{
p->num_chars = N_chars;
strcpy(p->string, src);
}
return p;
}
In my example, to access the pointer to your string, you address the string member of the my_string_t:
my_string_t * p = alloc_my_string("hello free store.");
printf("String of %d bytes is '%s'\n", p->num_chars, p->string);
Be careful to realize that you are obtaining the pointer for the string as a consequence of allocating space to store the characters. The resource you are allocating is the storage for the characters, the pointer obtained is a reference to the allocated storage.
In my example, the memory allocated is laid out sequentially as follows:
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 00 | 00 | 00 | 12 | 'h'| 'e'| 'l'| 'l'| 'o'| 20 | 'f'| 'r'| 'e'| 'e'| 20 | 's'| 't'| 'o'| 'r'| 'e'| '.'| 00 |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
^^                   ^
||                   |
p|                   |
 p->num_chars        p->string
Notice that the value of p->string is not stored in the allocated memory, it is four bytes from the beginning of the allocated memory, immediately subsequent to the (presumed 32-bit, four-byte) integer.
Your compiler may require that you declare the flexible C array as:
typedef struct
{
int num_chars;
char string[0];
} my_string_t;
but the version without the zero is the standard C99 flexible array member; the zero-length form is a compiler extension (for example in GNU C).
You can accomplish the equivalent thing with no array member as follows:
typedef struct
{
int num_chars;
} mystr2;
char * str_of_mystr2(mystr2 * ms)
{
return (char *)(ms + 1);
}
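mystr2 * alloc_mystr2(char *src)
{
    mystr2* p = NULL;
    size_t N_chars = strlen(src) + 1;
    /* assumed guard: the length must fit in the int member (needs <limits.h>) */
    if (N_chars < INT_MAX)
    {
        p = malloc(N_chars + sizeof(mystr2));
        if (p)
        {
            p->num_chars = (int)N_chars;
            strcpy(str_of_mystr2(p), src);
        }
    }
    return p;
}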
printf("String of %d bytes is '%s'\n", p->num_chars, str_of_mystr2 (p));
In this second example, the value equivalent to p->string is calculated by str_of_mystr2(). It will have approximately the same value as the first example, depending on how the end of structs are packed by your compiler settings.
While some would suggest tracking the length in a size_t I would look up some old Dr. Dobb's article on why I disagree. Supporting values greater than INT_MAX is of doubtful value to your program's correctness. By using an int, you can write assert(p->num_chars >= 0); and have that test something. With an unsigned, you would write the equivalent test something like assert(p->num_chars < UINT_MAX / 2); As long as you write code which contains checks on run-time data, using a signed type can be useful.
On the other hand, if you are writing a library which handles strings in excess of UINT_MAX / 2 characters, I salute you.
This is the obvious solution. And while you are at it, you might want to have a struct member that maintains the amount of allocated memory actually in use. This will avoid having to call strlen() all the time, and would enable you to support non null-terminated strings, as the C++ std::string class does.
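A minimal sketch of that idea, tracking both the bytes in use and the bytes allocated so that realloc is only called when the buffer is actually full. The dyn_string type, its field names, and the starting capacity of 16 are hypothetical choices.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *str;    /* null-terminated once anything has been appended */
    size_t len;   /* bytes in use, excluding the terminator */
    size_t cap;   /* bytes allocated */
} dyn_string;

/* Appends s, doubling the capacity as needed so realloc is called rarely.
   Returns 0 on success, -1 on allocation failure (string left unchanged). */
int dyn_string_append(dyn_string *ds, const char *s)
{
    size_t add = strlen(s);
    size_t need = ds->len + add + 1;
    if (need > ds->cap) {
        size_t new_cap = ds->cap ? ds->cap : 16;
        char *tmp;
        while (new_cap < need)
            new_cap *= 2;
        tmp = realloc(ds->str, new_cap);
        if (tmp == NULL)
            return -1;
        ds->str = tmp;
        ds->cap = new_cap;
    }
    memcpy(ds->str + ds->len, s, add + 1);  /* copies the terminator too */
    ds->len += add;
    return 0;
}
```

Starting from dyn_string s = {NULL, 0, 0};, repeated appends only reallocate when the stored length outgrows the capacity, and s.len replaces most calls to strlen().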
That is how it was done in the Pleistocene, and that's how you should do it today. You are dead on the money that malloc does not offer any portable, supported, mechanism to query the size of an allocated block.
A more common way is to wrap malloc (and realloc) and keep a list of sizes and pointers
That way you don't need to change any string functions.
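One possible shape for such a wrapper is sketched below. Note that it swaps in a different bookkeeping technique from the one described: instead of a separate list of sizes and pointers, it stores the requested size in a small header placed just before the block handed to the caller. The names sized_malloc, sized_size and sized_free are hypothetical, and the union is an assumption about what is enough to keep the user block suitably aligned on typical platforms.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block; the extra members only pad the
   header so the memory after it stays aligned for common types. */
typedef union {
    size_t size;
    long double align_ld;
    void *align_p;
} block_header;

/* Allocates n bytes and remembers n. Returns NULL on failure. */
void *sized_malloc(size_t n)
{
    block_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;   /* the caller sees the memory just past the header */
}

/* Returns the size originally requested for a block from sized_malloc. */
size_t sized_size(const void *p)
{
    return ((const block_header *)p - 1)->size;
}

/* Releases a block from sized_malloc; NULL is accepted like free(NULL). */
void sized_free(void *p)
{
    if (p != NULL)
        free((block_header *)p - 1);
}
```

Pointers returned by sized_malloc work with the ordinary string functions, but they must be released with sized_free rather than free.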
Write wrapper functions. If you are using malloc then you should do that anyway.
For an example, look in "Writing Solid Code".
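I think you could use malloc_usable_size, although it is a glibc extension rather than a portable function, and it reports the usable size of the block, which may be larger than what you requested.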

Resources