I have a large string, where I want to use pieces of it but I don't want to necessarily copy them, so I figured I can make a structure that marks the beginning and length of the useful chunk from the big string, and then create a function that reads it.
struct descriptor {
    int start;
    int length;
};
So far so good, but when I got to writing the function I realized that I can't really return the chunk without copying into memory...
char* getSegment(char* string, struct descriptor d) {
    char* chunk = malloc(d.length + 1);
    strncpy(chunk, string + d.start, d.length);
    chunk[d.length] = '\0';
    return chunk;
}
So the questions I have are:
Is there any way that I can return the piece of string without copying it
If not, how can I deal with this memory leak, since the copy is in heap memory and I don't have control over who will call getSegment?
Answering your two questions:
No
The caller should provide a buffer for the copied string.
I would personally pass a pointer to the descriptor:
char *getSegment(const char *string, char *buff, const struct descriptor *d)
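A minimal sketch of the caller-provided-buffer approach. It assumes the caller also passes the buffer's capacity; the `buff_size` parameter is an addition for illustration, not part of the signature suggested above, and the descriptor is assumed to lie within the source string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct descriptor {
    int start;
    int length;
};

/* Copies the segment described by d into buff and null-terminates it.
 * Returns buff, or NULL if the descriptor is negative or buff is too small.
 * Assumption: d lies within string; buff_size is a hypothetical extra parameter. */
char *getSegment(const char *string, char *buff, size_t buff_size,
                 const struct descriptor *d)
{
    assert(string != NULL && buff != NULL && d != NULL);
    if (d->start < 0 || d->length < 0 || (size_t)d->length + 1 > buff_size)
        return NULL;
    memcpy(buff, string + d->start, (size_t)d->length);
    buff[d->length] = '\0';
    return buff;
}
```

Because the buffer belongs to the caller, nothing is allocated inside the function and there is nothing for the caller to forget to free.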
Is there any way that I can return the piece of string without copying it
A string, by definition, includes its terminating null character. So unless the piece you want is the tail of the original string, a pointer into the middle of it cannot refer to a "piece of the string" that is itself a string.
how can I deal with this memory leak, since the copy is in heap memory and I don't have control over who will call getSegment?
Create temporary space with a variable length array (available since C99, optionally supported since C11). It is valid until the end of the enclosing block, at which point the memory is released and must not be used again.
char* getSegment(char* string, struct descriptor d, char *dest) {
    // form result in `dest`
    return dest;
}
Usage
char *t;
{
    struct descriptor des = bar();
    char *large_string = foo();
    char sub[des.length + 1u]; // VLA
    t = getSegment(large_string, des, sub);
    puts(t); // use sub or t
}
// do not use `t` here, invalid pointer.
Keep size in mind: a VLA typically lives on the stack, so if the code may produce large substrings, it is best to malloc() a buffer and oblige the calling code to free it when done.
Is there any way that I can return the piece of string without copying it
You're right that if you want to use the chunks in conjunction with any of the many C functions that expect to work with null-terminated character arrays, then you have to make copies. Otherwise, adding the terminators modifies the original string.
If you're prepared to handle the chunks as fixed-length, unterminated arrays, however, then you can represent them without copying as a combination of a pointer to the first character and a length. Some standard library functions work with user-specified string lengths, thus supporting operations on such segments without null termination. You would need to be very careful with them, however.
If you take that approach, I would recommend colocating the pointer and length in a structure. For example,
struct string_segment {
    char *start;
    size_t length;
};
You could declare variables of this type, pass and return objects of this type, and create compound literals of this type without any dynamic memory allocation, thus avoiding opening any avenue for memory leakage.
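As a rough sketch of working with such unterminated segments: the helper names `make_segment`, `segment_equals`, and `print_segment` are hypothetical, and the `start` member is declared `const char *` here (a small deviation from the structure above) so that views into read-only strings are allowed. The length-aware standard calls used are `memcmp()` and `printf()` with a `%.*s` precision.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct string_segment {
    const char *start;
    size_t length;
};

/* Builds a view into s; nothing is copied or allocated.
 * Assumption: offset + length does not run past the end of s. */
struct string_segment make_segment(const char *s, size_t offset, size_t length)
{
    assert(s != NULL);
    struct string_segment seg = { s + offset, length };
    return seg;
}

/* Returns nonzero if the segment holds exactly the characters of text. */
int segment_equals(struct string_segment seg, const char *text)
{
    return strlen(text) == seg.length
        && memcmp(seg.start, text, seg.length) == 0;
}

/* Prints the segment; the precision bounds how many bytes printf reads.
 * Returns printf's result (the number of characters written). */
int print_segment(struct string_segment seg)
{
    return printf("%.*s\n", (int)seg.length, seg.start);
}
```

The segment is only valid for as long as the big string it points into stays alive and unmodified.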
If not, how can I deal with this memory leak, since the copy is in heap memory and I don't have control over who will call getSegment?
Returning dynamically-allocated objects does not automatically create a memory leak -- it merely confers a responsibility on the caller to free the allocated memory. It is when the caller fails to either satisfy that responsibility or pass it on to other code that a memory leak occurs. Several standard library functions indeed do return dynamically-allocated objects, and it's not so unusual in third-party libraries. The canonical example (other than malloc() itself) would probably be the POSIX-standard strdup() function.
If your function returns a pointer to a dynamically-allocated object -- whether a copied string, or a chunk definition structure -- then it should document the responsibility to free that falls on callers. You must ensure that you satisfy your obligation when you call it from your own code, but having clearly documented the function's behavior, you cannot take responsibility for errors other callers may make by failing to fulfill their obligations.
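One way to document that responsibility is in the function's header comment. The sketch below is a variant of the question's function under a hypothetical name, `copySegment`, with the ownership rule spelled out and the allocation checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated copy of `length` bytes of
 * `string`, starting at index `start`, or NULL if allocation fails.
 * OWNERSHIP: the caller must pass the returned pointer to free().
 * Assumption: start + length does not run past the end of string. */
char *copySegment(const char *string, size_t start, size_t length)
{
    assert(string != NULL);
    char *chunk = malloc(length + 1);
    if (chunk == NULL)
        return NULL;
    memcpy(chunk, string + start, length);
    chunk[length] = '\0';
    return chunk;
}
```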
I just started learning C. I am doing an exercise, and the question is as follows.
Write a function called insertString to insert one character string into another string. The arguments to the function should consist of the source string, the string to be inserted, and the position in the source string where the string is to be inserted. So, the call insertString (text, "per", 10); with text as originally defined "the wrong son" results in the character string "per" being inserted inside text, beginning at text[10]. Therefore, the character string "the wrong person" is stored inside the text array after the function returned.
#include<stdio.h>
int insertString(char[],char[],int);
int stringLength(char[]);
int main()
{
    char text[]="the wrong son";
    int result=insertString(text,"per",10);
    if(result!=-1)
        printf("string 1 is : %s \n",text);
    else
        printf("Not possible\n");
    return 0;
}
int insertString(char a[],char b[],int pos)
{
    int i=0,j=0;
    int lengthA=stringLength(a);
    int lengthB=stringLength(b);
    if(pos>lengthA)
        return -1;
    for(i=lengthA;i>=pos;i--)
        a[i+lengthB]=a[i];
    for ( i = 0; i < lengthB; ++i )
        a[i + pos] = b[i];
    return 1;
}
int stringLength(char x[])
{
    int length=0;
    while(x[length]!='\0')
        length++;
    return length;
}
I have done this and it works, but I also get the message "Abort trap: 6". When I looked it up, I learned that it means I am writing to memory that I don't own. My understanding is that since I have used variable-length character arrays, the null character marks the end of the array, and I am trying to extend the array by inserting a string. Am I right so far?
I am also moving the null character; I don't know whether that is right or wrong.
So is there a way to get around this error? Also, I don't know pointers yet; they're in the next chapter of the textbook.
Any help in this would be appreciated very much.
A variable-length array is a very specific C construct that has nothing to do with what your textbook calls "variable length arrays". If I were you I would not trust this textbook if it said that 1+1=2. So much for it.
A character array that ends with a null character is called string by pretty much everyone, everywhere.
char text[]="the wrong son";
Your textbook led you to believe that text will hold as many characters as you need. Alas, there is no such thing in C. In fact text will hold exactly as many characters as there are in its initializer, plus 1 for the null terminator, so you cannot insert anything in it.
In order for your program to work, you need to explicitly allocate as many characters for text as the resulting string will contain.
So as there are 14 characters in "the wrong son" (including the terminator) and three characters in "per" (not including the terminator), you need 17 characters in total:
char text[17]="the wrong son";
You can also check your calculations:
int result = insertString(text, "per", 10, sizeof(text));
...
int insertString(char a[], char b[], int pos, int capacity)
{
    ...
    if (lengthA + lengthB + 1 > capacity)
        return -1;
    ...
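Filled in, the capacity-checked version might look like the sketch below. It keeps the array-index style of the question (no pointer arithmetic) and adds a check for a negative `pos`, which is an addition not discussed above.

```c
#include <assert.h>

/* Length of a null-terminated string, without using <string.h>. */
int stringLength(const char x[])
{
    int length = 0;
    while (x[length] != '\0')
        length++;
    return length;
}

/* Inserts b into a at index pos. capacity is the total size of array a.
 * Returns 1 on success, -1 if pos is out of range or the result would not fit. */
int insertString(char a[], const char b[], int pos, int capacity)
{
    int i;
    int lengthA = stringLength(a);
    int lengthB = stringLength(b);

    assert(capacity > 0);
    if (pos < 0 || pos > lengthA)
        return -1;
    if (lengthA + lengthB + 1 > capacity)
        return -1;

    /* Shift the tail (including the terminator) right by lengthB. */
    for (i = lengthA; i >= pos; i--)
        a[i + lengthB] = a[i];
    /* Copy b into the gap that was opened up. */
    for (i = 0; i < lengthB; i++)
        a[pos + i] = b[i];
    return 1;
}
```

With `char text[17] = "the wrong son";` the call `insertString(text, "per", 10, sizeof(text))` succeeds, while the same call on `char text[] = "the wrong son";` (14 bytes) returns -1 instead of writing past the end of the array.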
First, you must understand that what sets C apart from most other programming languages is manual memory management and pointers.
In C you have to do everything yourself, but you have total control over everything; in other languages like Java a lot is done automatically for you, but you can't open the hood.
Memory handling is the essence of C and is very different from, for instance, Java. Java and C syntax look very much alike, but the two live in completely different worlds.
C++ is an extension of C that allows features similar to Java's, but memory-wise it is still C.
There are two types of memory in C:
An automatic character array (declared as char xx[]) has a fixed size: the number between the [], or, if that is omitted, the length of the initializer string + 1 (for null termination). Its size can't be changed.
Dynamic memory (accessed through a char*) is allocated with calloc() or malloc(), can be resized with realloc(), and must be manually freed or else the program will leak memory (some OSes reclaim it automatically when the program ends, but not freeing memory is bad C style).
Dynamic memory is delivered as a pointer (char*) that points to the allocated memory. Pointers can point at any type of memory, including string arrays and even integers.
A pointer holds a memory address, which you can think of as a number identifying a location in the program's memory. The allocator keeps track of the memory behind each pointer it hands out, but nothing cleans it up for you as in Java.
Also note that after a successful realloc the old pointer must no longer be used: realloc releases the old block if it had to move the data, and returns a pointer to the new block, which you must free manually after use.
It is possible to send a pointer (it is just a number) into a function and the function changes the pointer (it is just another number pointing at memory (that might not be the same)).
Because of this it is essential to return the new pointer from functions that might have changed its content.
In practice the core of C programming is pointer programming, and the programmer must keep firm track of the memory or the program goes berserk; you have to learn the routines.
With pointer programming it is possible to have absolute control over all the memory, and the functions normally become very efficient, fast, and memory-lean.
This also matters when handling huge data such as high-resolution pictures or video content, where it is often the only way to get performance.
Extended level - pointer to pointers
When getting more advanced it is possible to send the pointer of a pointer (char**) to a function allowing the function to amend the content of a pointer like reallocation of a string and the updated pointer will be readable by the calling function. This way multiple pointers can be amended (there is only one return value).
A pointer to a pointer points to the memory where the pointer's value (the address it holds) is stored, so a function that receives it can change what the pointer points at, and the calling function will see the new value in the same pointer variable.
Pointers to pointers are normally used for instance in database programming with linked lists being able to control a huge number of memory chunks in a long chain, and being able to handle them smoothly.
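A small sketch of the pointer-to-pointer idea, using a hypothetical `appendString` helper: the function receives the address of the caller's pointer, so it can store the possibly moved block back into it and use the return value purely for success or failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends suffix to the heap string *pStr, growing it with realloc.
 * On success *pStr is updated (the block may have moved) and 1 is returned.
 * On failure *pStr is left untouched and still valid, and 0 is returned.
 * appendString is a hypothetical name used for illustration. */
int appendString(char **pStr, const char *suffix)
{
    assert(pStr != NULL && *pStr != NULL && suffix != NULL);
    size_t oldLen = strlen(*pStr);
    size_t addLen = strlen(suffix);
    char *tmp = realloc(*pStr, oldLen + addLen + 1);
    if (tmp == NULL)
        return 0;
    memcpy(tmp + oldLen, suffix, addLen + 1); /* copies the terminator too */
    *pStr = tmp;
    return 1;
}
```

A caller that holds `char *s` pointing at heap memory would call `appendString(&s, " World")` and keep using `s` afterwards, whether or not the block moved.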
Most other programming languages have their core systems written in C, so it is normally possible to integrate chunks of C code to improve performance.
ANSI C is standardized, so it is also a way of making code really portable from system to system.
Let's check out your case; here is some sample code to show it.
#include<stdlib.h>
#include<string.h>
#include <stdio.h>
char* insertString(
char* pTarget,
char* pInput);
int main(void)
{
char Target[9] = { "Hello" };
char Target2[9] = { "Hi" };
char Input[] = { " World" };
Target and Target2 are automatic character arrays whose size is the number between the []; the size of Input is defined by its initializer's length (+1 for null termination). None of these sizes can be changed.
So Input is 7 bytes (six characters + 1), while Target and Target2 are 9 bytes each (room for 8 characters plus the terminator).
The call below will not work because Target is too short: it has only 9 chars of space (enough for 8 characters), while Target + Input is 11 characters. Writing past the end is undefined behavior, and the program will likely crash.
strcat(Target, Input);
But this will work because Target2 is 9 chars (space for 8 letters) and Target2 + Input is 8 letters, so it fits.
strcat(Target2, Input);
printf("%s\n", Target2);
The call below will not work because Target is an automatic char array whose size is fixed at its declaration.
Such arrays cannot be extended or shrunk, cannot be passed to realloc, and are released automatically at the end of the function.
In fact, they typically live in a different region of memory (the stack) than dynamic memory (the heap), and only pointers obtained from malloc, calloc, or realloc may be handed to realloc.
pTarget = insertString(Target, Input);
{
The code below will work because it uses dynamically allocated memory (from calloc or malloc), which can be reallocated to any size.
Dynamic memory in C is not managed automatically as in other programming languages; it must be taken care of manually.
A common convention is to prefix pointer names with p to differentiate them from automatic arrays.
Dynamically allocated memory must be manually freed after use or the program will leak memory; this is not Java with automatic clean-up.
char* pTarget = calloc(strlen(Target) + 1, sizeof(char));
if (pTarget) {
strcpy(pTarget, Target);
pTarget = insertString(pTarget, Input);
Also notice that you, as the programmer, must check that you got the memory you asked for from calloc.
If not (running out of memory is very unlikely in 2022), you can't perform the action and must report failure; otherwise the program will crash.
printf("%s\n", pTarget);
free(pTarget);
}
else
printf("%s\n", "Failure!");
}
}
char* insertString(
char* pTarget,
char* pInput)
{
Here we reallocate the memory to enlarge it to fit our use. The result goes into a temporary first, because if realloc fails it returns NULL and the old block is still allocated.
char* pNew = realloc(pTarget, (strlen(pTarget) + strlen(pInput) + 1) * sizeof(char));
On success, realloc has released the old block if it needed to move the data, and pNew points to the larger block.
The pointer value might not be the same as before realloc, so the caller must use the returned pointer from now on.
if (pNew)
    strcat(pNew, pInput);
else
    free(pTarget); /* realloc failed: release the old block so it does not leak */
return pNew;
}
int replace_substring (char *str, char *substr, char *new_substr) {
    int pos = delete_substring (str, substr); /* first delete the existing substring */
    if (pos == -1) return pos; /* substring not found, return */
    insert_substring (str, pos, new_substr); /* add the new substring at the deleted position */
    return pos; /* position where the replacement was made */
}
gcc (GCC) 4.7.2
c89
Hello,
All error checking removed from snippet - to keep the code short.
I have a problem freeing some memory that I have allocated and copied a string to.
My program will check for digits and increment the pointer until it gets to a non-digit.
When I go to free the memory I get a stack dump with invalid free.
I think this is because I have incremented the pointer and now it is pointing to halfway down the string, as that is when the non-digits start.
If I don't increment its ok to free. However, if I do increment it and then try and free I get the stack dump.
int parse_input(const char *input)
{
    char *cpy_input = calloc(strlen(input) + 1, sizeof(char));
    size_t i = 0;
    apr_cpystrn(cpy_input, input, strlen(input) + 1);
    /* Are we looking for a range of channels */
    for(i = 0; i < strlen(cpy_input); i++) {
        if(isdigit(*cpy_input)) {
            /* Do something here */
            cpy_input++;
        }
    }
    /* We're finished, free the memory */
    free(cpy_input); /* Crash here */
    return 0;
}
I resolved the issue by declaring another pointer and assigning the address, so it points to the first character, then I free that. It works ok i.e.
char *mem_input = cpy_input;
free(mem_input);
My question is why do I need to declare another pointer to be able to free the memory? Is there another way of doing this?
Many thanks in advance,
You need to save the original pointer. Only the original pointer can be used when freeing the memory. You can just create another variable to hold the original pointer.
Or put the loop in a separate function. As variables are passed by value by default, i.e. copied, when you change the pointer in the function you only change the copy of the pointer.
Besides that, your loop seems a little weird. You loop using an index from zero to the length of the string, so you can easily use that index instead of modifying the pointer. Either that, or change the loop to something like while (*cpy_input != '\0'). I have never seen the two variants mixed.
By the way, you have a bug in that code. You only increase the pointer if the current character is a digit. But if the first character is not a digit, the loop will just loop until it reaches the end of the string, but the pointer will not be increased and you will check the first character over and over again. If you just want to get leading digits from the string (if any), you could use a loop such as
for (; isdigit((unsigned char)*cpy_input); cpy_input++)
{
    /* do something, using `*cpy_input` */
}
Or of course
for (i = 0; i < strlen(cpy_input); i++)
{
    /* do something, using `cpy_input[i]` */
}
char *cpy_input = calloc(strlen(input) + 1, sizeof(char));
Let's say cpy_input is 0x1000. The point is that this same pointer value must be passed to free().
With your logic, if the input is 5 characters long and all of them are digits, then after the for loop cpy_input points to location 0x1005. Calling free(cpy_input) is then free(0x1005), which is an invalid pointer for free, so it crashes.
Well, of course there is another way of doing this: just decrement the cpy_input pointer exactly as many times as you incremented it. Or subtract the length of the string (assuming you saved it) from the final cpy_input value. That way you will restore the original cpy_input value and properly free the memory.
The bottom line here is simple: you have to pass to free the same pointer value that you received from calloc. There is no way around it. So, in one way or another you have to be able to obtain the original pointer value. Saving it in another pointer is actually the best solution in your situation. But if you know how to do it in any other way, go ahead and use whatever you like most.
calloc returns a pointer to the requested memory block, so you can only free the exact pointer value that calloc returned.
Either free the original pointer or free a backup copy of it.
You can change your loop to
for(i = 0; i < strlen(cpy_input); i++) {
    if(isdigit(cpy_input[i])) {
        /* Do something here */
    }
}
or use pointer arithmetic to recover the initial value later.
It is important to understand that a pointer is just a memory address.
The resource management system behind free and calloc keeps some bookkeeping data associated with each chunk of memory, in particular how big the chunk you requested from calloc is. This might be in some lookup container, which stores it keyed by the pointer returned by calloc (i.e. the initial value of cpy_input), or the information is stored in memory right in front of the chunk, which as far as I know is more common.
If you now pass the changed value of cpy_input to free, it will either not find the bookkeeping data in its lookup container, or it will look for the bookkeeping data in front of the pointer, where it will find the data of your string, which probably makes no sense at all.
So your solution of keeping a copy of the original pointer is an appropriate one.
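A minimal sketch of that approach, written to compile as C89 like the question. The function name `count_leading_digits` is hypothetical, and plain `malloc`/`memcpy` stand in for the `calloc`/`apr_cpystrn` pair of the original. The allocated pointer is never modified; a separate cursor does the walking.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Counts the leading digits of input, working on a heap copy as in the
 * question. Returns the count, or -1 if allocation fails.
 * count_leading_digits is a hypothetical name used for illustration. */
int count_leading_digits(const char *input)
{
    size_t size;
    char *cpy_input;
    const char *cursor;
    int count = 0;

    assert(input != NULL);
    size = strlen(input) + 1;
    cpy_input = malloc(size);      /* this exact value must go to free() */
    if (cpy_input == NULL)
        return -1;
    memcpy(cpy_input, input, size);

    /* Advance the cursor, never cpy_input itself. */
    for (cursor = cpy_input; isdigit((unsigned char)*cursor); cursor++)
        count++;

    free(cpy_input);               /* unchanged since malloc, so valid */
    return count;
}
```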
I have the following test function to copy and concatenate a variable number of string arguments, allocating automatically:
char *copycat(char *first, ...) {
    va_list vl;
    va_start(vl, first);
    char *result = (char *) malloc(strlen(first) + 1);
    char *next;
    strcpy(result, first);
    while (next = va_arg(vl, char *)) {
        result = (char *) realloc(result, strlen(result) + strlen(next) + 1);
        strcat(result, next);
    }
    return result;
}
Problem is, if I do this:
puts(copycat("herp", "derp", "hurr", "durr"));
it should print out a 16-byte string, "herpderphurrdurr". Instead, it prints out a 42-byte string, which is the correct 16 bytes plus 26 more bytes of junk characters.
I'm not quite sure why yet. Any ideas?
The variable-argument-list functions don't magically know how many arguments there are, so you're most likely walking the stack until you happen to hit a NULL.
You either need an argument numStrings, or supply an explicit null-terminator argument after your list of strings.
You need a sentinel marker on your list:
puts(copycat("herp", "derp", "hurr", "durr", NULL));
Otherwise, va_arg doesn't actually know when to stop. The fact that you're getting junk is pure accident, since you're invoking undefined behaviour. For example, when I ran your code as-is, I got a segmentation fault.
Variable argument functions, such as printf need some sort of indication as to how many items are passed in: printf itself uses the format string up front to figure this out.
The two general methods are a count (or format string) and a sentinel (a marker at the end). A count is useful when you can't use one of the possible values as a sentinel.
If you can use a sentinel (like NULL in the case of pointers, or -1 in the case of non-negative signed integers), that's usually better because you don't have to count the elements (and possibly get the element count and element list out of step).
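For comparison, the count-based method might be sketched as follows. `sum_ints` is a hypothetical example function, chosen because no `int` value can be reserved as a sentinel when every value is a legal argument.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments; the count tells va_arg when to stop.
 * sum_ints is a hypothetical example function. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

If the caller passes a count larger than the number of actual arguments, the behaviour is undefined, which is the "out of step" risk mentioned above.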
Keep in mind that puts(copycat("herp", "derp", "hurr", "durr")); is a memory leak since you're allocating memory then losing the pointer to it. Using:
char *s = copycat("herp", "derp", "hurr", "durr", NULL);
puts(s);
free (s);
is one way to fix that, and you may want to put in error checking code in case the allocations fail.
What I understand from your code is that you assume va_arg will return NULL once every argument has been "popped". That's wrong, as va_arg has absolutely no way to determine the number of arguments: your while loop will keep running until a NULL is randomly hit.
Solution: either provide the number of arguments, or call your function with an additional NULL argument.
PS: if you are wondering why printf doesn't require such an additional argument, it's because the number of expected arguments is deduced from the format string (the number of '%' conversion specifications).
As an addition to the other answers, you should cast the NULL to the expected type when using it as an argument to a variadic function: (char *)NULL. If NULL is defined as 0, then an int will be stored instead, which will accidentally work when int has the same size as the pointer and NULL is represented by all bits 0. But none of this is guaranteed, so you may run into strange behaviour that's hard to debug when porting the code or even when only changing the compiler.
As others have mentioned, va_arg does not know when to stop. It is up to you to provide NULL (or some other marker) when you call the function. Just a few side notes:
You must call free on pointers you obtain from malloc and realloc.
There is no reason to cast the result of malloc or realloc in C.
When calling realloc, it is best to store the return value into a temporary variable. If realloc is unable to reallocate enough memory, it returns NULL but the original pointer is not freed. If you use realloc the way you do, and it is unable to reallocate the memory, then you have lost the original pointer and your subsequent call to strcat will likely fail. You could use it like this:
char *tmp = realloc(result, strlen(result) + strlen(next) + 1);
if (tmp == NULL)
{
    // handle error here and free the memory
    free(result);
}
else
{
    // reallocation was successful, re-assign the original pointer
    result = tmp;
}
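Putting the suggestions from this thread together, a revised `copycat` might look like the sketch below. It assumes callers always terminate the list with `(char *)NULL`, and it tracks the running length itself rather than calling `strlen(result)` on every pass, which is a small addition not raised above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates all string arguments up to a (char *)NULL sentinel.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *copycat(const char *first, ...)
{
    va_list vl;
    const char *next;
    size_t len;
    char *result;

    assert(first != NULL);
    len = strlen(first);
    result = malloc(len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, first, len + 1);

    va_start(vl, first);
    while ((next = va_arg(vl, char *)) != NULL) {
        size_t add = strlen(next);
        char *tmp = realloc(result, len + add + 1);
        if (tmp == NULL) {          /* result is still valid here */
            free(result);
            result = NULL;
            break;
        }
        result = tmp;
        memcpy(result + len, next, add + 1); /* includes the terminator */
        len += add;
    }
    va_end(vl);
    return result;
}
```

A call would then read `char *s = copycat("herp", "derp", "hurr", "durr", (char *)NULL);`, followed by `free(s)` once the string is no longer needed.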
I have a structure that has an array of pointers. I would like to insert into the array digits in string format, i.e. "1", "2", etc..
However, is there any difference in using either sprintf or strncpy?
Any big mistakes with my code? I know I have to call free, I will do that in another part of my code.
Many thanks for any advice!
struct port_t
{
    char *collect_digits[100];
}ports[20];
/** store all the string digits in the array for the port number specified */
static void g_store_digit(char *digit, unsigned int port)
{
    static int marker = 0;
    /* allocate memory */
    ports[port].collect_digits[marker] = (char*) malloc(sizeof(digit)); /* sizeof includes 0 terminator */
    // sprintf(ports[port].collect_digits[marker++], "%s", digit);
    strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
}
Yes, your code has a few issues.
In C, don't cast the return value of malloc(). It's not needed, and can hide errors.
You're allocating space based on the size of a pointer, not the size of what you want to store.
The same for the copying.
It is unclear what the static marker does, and if the logic around it really is correct. Is port the slot that is going to be changed, or is it controlled by a static variable?
Do you want to store only single digits per slot in the array, or multiple-digit numbers?
Here's how that function could look, given the declaration:
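/* Store the given number, as a decimal string, in one slot of the given port. */
static void g_store_digit(struct port_t *ports, unsigned int port, unsigned int slot, unsigned int number)
{
    char tmp[32];
    snprintf(tmp, sizeof tmp, "%u", number);
    /* collect_digits is an array of pointers, so a slot index is needed;
       the slot parameter is an assumption added here. */
    ports[port].collect_digits[slot] = strdup(tmp);
}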
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
This is incorrect.
You have allocated a certain amount of memory for a slot in collect_digits.
You copy char *digit into that memory.
The length you should copy is strlen(digit) + 1, so that the terminator is included. What you're actually copying is sizeof(ports[port].collect_digits[marker]) bytes, which is the size of a single char *.
You cannot use sizeof() to find the length of allocated memory. Furthermore, unless you know a priori that digit is the same length as the memory you've allocated, even if sizeof() did tell you the length of allocated memory, you would be copying the wrong number of bytes (too many; you only need to copy the length of digit).
Also, even if the two lengths are always the same, obtaining the length this way is not expressive; it misleads the reader.
Note also that strncpy() pads with trailing null characters if the specified copy length is greater than the length of the source string, but writes no terminator at all if the source is at least as long as the copy length. As such, if digit is as long as the memory allocated, you will have a non-terminated string.
The sprintf() line is functionally correct, but for what you're doing, strcpy() (as opposed to strncpy()) is, from what I can see and know of the code, the correct choice.
I have to say, I don't know what you're trying to do, but the code feels very awkward.
The first thing: why have an array of pointers? Do you expect multiple strings for a port object? You probably only need a plain array or a pointer (since you are malloc-ing later on).
struct port_t
{
    char *collect_digits;
}ports[20];
You need to pass the address of the string, otherwise, the malloc acts on a local copy and you never get back what you paid for.
static void g_store_digit(char **digit, unsigned int port);
Finally, the sizeof applies in a pointer context and doesn't give you the correct size.
Instead of using malloc() and strncpy(), just use strdup() - it allocates a buffer big enough to hold the content and copies the content to the new string, all in one shot.
So you don't need g_store_digit() at all - just use strdup(), and maintain marker on the caller's level.
Another problem with the original code: The statement
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
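references marker and marker++ in the same expression. In general the order of evaluation of function arguments is unspecified, and reading a variable in one argument while modifying it in another is undefined behavior. In this particular statement the second reference sits inside sizeof, whose operand is not evaluated, so it happens to be harmless, but the pattern is fragile and best avoided by incrementing marker in a separate statement.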
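A sketch that applies the points raised in this thread: size the allocation from the string, not from a pointer; check the allocation; and increment the marker in its own statement. Keeping one marker per port (the added `marker` field) and returning a status code are assumptions made for illustration, since the question does not say how the slots are meant to be tracked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PORT_COUNT 20
#define DIGITS_PER_PORT 100

struct port_t {
    char *collect_digits[DIGITS_PER_PORT];
    int marker;                      /* next free slot; an added field */
} ports[PORT_COUNT];

/* Stores a heap copy of digit in the next free slot of the given port.
 * Returns 0 on success, -1 if the port is full or allocation fails.
 * The stored copies must eventually be released with free(). */
static int g_store_digit(const char *digit, unsigned int port)
{
    struct port_t *p;
    size_t size;
    char *copy;

    assert(digit != NULL && port < PORT_COUNT);
    p = &ports[port];
    if (p->marker >= DIGITS_PER_PORT)
        return -1;

    size = strlen(digit) + 1;        /* string length plus terminator */
    copy = malloc(size);
    if (copy == NULL)
        return -1;
    memcpy(copy, digit, size);

    p->collect_digits[p->marker] = copy;
    p->marker++;                     /* incremented in its own statement */
    return 0;
}
```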