Modifying a string in a function in C

Let's say I want to modify a char array using a function.
I always see people using malloc, calloc, or pointers to modify int, char, or 2D arrays.
Am I right that a string can be returned from a function only if I use malloc, create the array through a pointer, and return it? Then why not get or alter the string by passing it as a function parameter?
Isn't my demonstration, which takes a char array as a parameter, easier than allocating and freeing? Is my concept wrong, or why do I never see people passing arrays to functions? I only see code that passes "char *array", not "char array[]", and uses malloc and so on, while this method of altering a char array looks easy to me. Am I missing something?
#include <stdio.h>
void change(char array[]){
    array[0]='K';
}
int main(){
    char array[]="HEY";
    printf("%s\n", array);
    change(array);
    printf("%s\n",array );
    return 0;
}

If you only need to change existing characters in the string, and the string will be in a variable, and you don't mind the side-effect of your original string being modified, then your solution may be acceptable and indeed easier. But:
What if you want to get a modified string, but also want to retain the original? To avoid destroying an arbitrary-sized original, you need to malloc space, make a copy, and modify that.
And what if you want to extend the string? If your change is to add " YOU" to the string, it can't modify the original because there's no space for it: it would cause a buffer overflow, since there are only 4 bytes allocated for "HEY" (three letters plus the null terminator). Again, the solution involves mallocing space to work with.
Functions that make changes using your technique typically need a size or length parameter to avoid overflowing the array and causing a crash and a potential security risk. But although that avoids the overflow, there's still the question of what happens if there's not enough space: Silently drop some data? Pass back a flag or special value to indicate there wasn't enough space, and expect the caller to handle it? In the long run, it ends up easier to write it right the first time, and malloc/calloc the space and deal with having to free it up later and all that.

Related

A C function that returns a char array Vs a function working with 2 char arrays

I'm a C beginner so my apologies if this doubt is too obvious.
What would be considered the most efficient way to solve this problem: Imagine that you have a char array ORIG and after working with it, you should have a new char array DEST. So, if I wanted to create a function for this goal, what would the best approach be:
A function that takes only one char array parameter ( argument ORIG ) and returning a char array DEST or
A void function that takes two char array arguments and does its job changing DEST as wished?
Thanks!
This very much depends on the nature of your function.
In your first case, the function has to allocate storage for the result (or return a pointer to some static object, but this wouldn't be thread-safe). This can be the right thing to do, e.g. for a function that duplicates a string, like POSIX' strdup(). This also means the caller must use free() on the result when it is no longer needed.
The second case requires the caller to provide the storage. This is often the idiomatic way to do these things in C, because in this case, the caller could just write
char result[256];
func(result, "bla bla");
and therefore use an automatic object to hold the result. It also has the benefit that you can use the actual return value to signal errors.
Both are valid ways of doing it, but I'd suggest using the latter, since it means you can load the contents into any block of memory, while a returned array will have to be on the heap and, by design, be freed by the caller.
Again, both are valid ways of doing things, and this is just a guideline. What should be done usually depends on the situation.
It depends,
If you know that the length of DEST will be the same as the length of ORIG, I would go for the 2nd approach, because then you won't have to dynamically allocate memory for DEST inside the function (and remember to free it outside the function).
If the length is different, you have to dynamically allocate memory, and you can do so in two ways:
1. Like your first approach: to return an array from a function in C, you have to allocate a new array and return its address (a pointer).
2. The function can receive two arguments: one is ORIG and the second is a double pointer to RES. Because the function receives a double pointer, it can allocate an array inside and return it via the argument.
Option 1 is the "cleaner" way in terms of code, and easier to use in terms of user experience (the user is the caller).
Good luck!
In option 1 you will have to dynamically allocate (malloc) the output array. Which means you have a potential for a memory leak. In option 2 the output array is provided for you, so there is no chance of a leak, but there is a chance that the output array is not of sufficient size and you will get a buffer overrun when writing to it.
Both methods are acceptable. There might be a small performance difference between them, but it's really down to your choice.
Personally, being a cautious programmer, I would go for option 3:
/* Returns 0 on success, 1 on failure
   Requires: inputSize <= outputSize
             input != output
             input != NULL
             output != NULL
*/
int DoStuff(char *output, size_t outputSize, const char *input, size_t inputSize);
(Sorry if that's not proper C, it's been decades :) )
(Edited in accordance with Felix Palmen's points.)

Working with Pointers and Strcpy in C

I'm fairly new to the concept of pointers in C. Let's say I have two variables:
char *arch_file_name;
char *tmp_arch_file_name;
Now, I want to copy the value of arch_file_name to tmp_arch_file_name and add the word "tmp" to the end of it. I'm looking at them as strings, so I have:
strcpy(&tmp_arch_file_name, &arch_file_name);
strcat(tmp_arch_file_name, "tmp");
However, when strcat() is called, both of the variables change and are the same. I want one of them to change and the other to stay intact. I have to use pointers because I use the names later for the fopen(), rename() and delete() functions. How can I achieve this?
What you want is:
strcpy(tmp_arch_file_name, arch_file_name);
strcat(tmp_arch_file_name, "tmp");
You are just copying the pointers (and other random bits until you hit a 0 byte) in the original code, that's why they end up the same.
As shinkou correctly notes, make sure tmp_arch_file_name points to a buffer of sufficient size (it's not clear if you're doing this in your code). Simplest way is to do something like:
char buffer[256];
char* tmp_arch_file_name = buffer;
Before you use pointers, you need to allocate memory. Assuming that arch_file_name is assigned a value already, you should calculate the length of the result string, allocate memory, do strcpy, and then strcat, like this:
char *arch_file_name = "/temp/my.arch";
// Add lengths of the two strings together; add one for the \0 terminator:
char * tmp_arch_file_name = malloc((strlen(arch_file_name)+strlen("tmp")+1)*sizeof(char));
strcpy(tmp_arch_file_name, arch_file_name);
// ^ this and this ^ are pointers already; no ampersands!
strcat(tmp_arch_file_name, "tmp");
// use tmp_arch_file_name, and then...
free(tmp_arch_file_name);
First, you need to make sure those pointers actually point to valid memory. As they are, they're either NULL pointers or arbitrary values, neither of which will work very well:
char *arch_file_name = "somestring";
char tmp_arch_file_name[100]; // or malloc
Then you cpy and cat, but with the pointers, not pointers-to-the-pointers that you currently have:
strcpy (tmp_arch_file_name, arch_file_name); // i.e., no "&" chars
strcat (tmp_arch_file_name, "tmp");
Note that there is no bounds checking going on in this code - the sample doesn't need it since it's clear that all the strings will fit in the allocated buffers.
However, unless you totally control the data, a more robust solution would check sizes before blindly copying or appending. Since it's not directly related to the question, I won't add it in here, but it's something to be aware of.
The & operator is the address-of operator; that is, it returns the address of a variable. However, using it on a pointer returns the address where the pointer itself is stored, not the address it points to.

C Programming: Find Length of a Char* with Null Bytes

If I have a character pointer that contains NULL bytes is there any built in function I can use to find the length or will I just have to write my own function? Btw I'm using gcc.
EDIT:
Should have mentioned the character pointer was created using malloc().
If you have a pointer then the ONLY way to know the size is to store the size separately or have a unique value which terminates the string. (typically '\0') If you have neither of these, it simply cannot be done.
EDIT: since you have specified that you allocated the buffer using malloc then the answer is the paragraph above. You need to either remember how much you allocated with malloc or simply have a terminating value.
If you happen to have an array (like: char s[] = "hello\0world";) then you could resort to sizeof(s). But be very careful, the moment you try it with a pointer, you will get the size of the pointer, not the size of an array. (but strlen(s) would equal 5 since it counts up to the first '\0').
In addition, arrays decay to pointers when passed to functions. So if you pass the array to a function, you are back to square one.
NOTE:
void f(int *p) {}
and
void f(int p[]) {}
and
void f(int p[10]) {}
are all the same. In all 3 versions, p is a pointer, not an array.
How do you know where the string ends, if it contains NULL bytes as part of it? Certainly no built in function can work with strings like that. It'll interpret the first null byte as the end of the string.
If you want the length, you'll have to store it yourself. Keep in mind that no standard library string functions will work correctly on strings like these.
You'll need to keep track of the length yourself.
C strings are null terminated, meaning that the first null character signals the end of the string. All builtin string functions rely on this, so if you have a buffer that can contain NULLs as part of the data then you can't use them.
Since you're using malloc then you may need to keep track of two sizes: the size of your allocated buffer, and how many characters within that buffer constitute valid data.

what is the correct way to define a string in C?

what is the correct way to define a string in C?
using:
char string[10];
or
char *string="???";
If I use an array, I can use any pointer to point to it and then manipulate it.
It seems like using the second one will cause trouble because we didn't allocate memory for it. I was taught that an array is just a pointer value, so I thought these two were the same.
Until I did something like string* = *XXXX, and realized it didn't work like a pointer.
As @affenlehrer points out, how you "define" a string depends on how you want to use it. In reality, 'defining' a string in C really just amounts to putting it in quotes somewhere in your program. You should probably read more about how memory works and is allocated in C, but if you write:
char *ptr = "???";
What happens is that the compiler will take the string "???" (which is really four bytes of data: three '?'s followed by one zero byte for the NUL terminator). It will place that in static storage in your program (typically a read-only data section such as .rodata), and when your program starts running, the value of ptr will be initialized to point to that location in memory. This means you have a pointer to four bytes of memory that you must treat as read-only: modifying a string literal is undefined behavior, and if you try to read or write outside of those bytes, your program is doing something bad (and probably violating memory safety).
On the other hand, if you write
char string[10];
Then this basically tells the compiler to allocate 10 bytes of space in your program and make the variable 'string' name that space (an array is the storage itself, and it converts to a pointer to its first element in most expressions). It depends where you put this: if you put it inside a function, then you will have a stack allocated buffer of 10 bytes. If you manipulate this buffer inside a function, and then don't do anything with the pointer afterwards, you're all fine. However, if you pass back the address of string -- or use the pointer in any way -- after the function returns, you're in the wrong. This is because, after the function returns, you lose all of the stack allocated variables.
There are even more ways to create strings in C (e.g. using malloc). What is your usecase? Basically you need a place in memory where the data is stored (on the stack, on the heap, static as in your second example) and then a character pointer to the first character of your string. Most string related functions will "see" the end of the string by the trailing '\0', in some other cases (mostly general purpose data related functions) you also have to provide the length of the string.

C char* pointers pointing to same location where they definitely shouldn't

I'm trying to write a simple C program on Ubuntu using Eclipse CDT (yes, I'm more comfortable with an IDE and I'm used to Eclipse from Java development), and I'm stuck with something weird. On one part of my code, I initialize a char array in a function, and it is by default pointing to the same location with one of the inputs, which has nothing to do with that char array. Here is my code:
char* subdir(const char input[], const char dir[]){
[*] int totallen = strlen(input) + strlen(dir) + 2;
char retval[totallen];
strcpy(retval, input);
strcat(retval, dir);
...}
OK, at the part I've marked with [*], there is a breakpoint. Even at that breakpoint, when I check my locals, I see that retval is pointing to the same address as my argument input. That shouldn't even be possible, as input comes from another function and retval is created in this function. Is it me being inexperienced with C and missing something, or is there a bug somewhere in the C compiler?
It seems so obvious to me that they shouldn't point to the same location (a valid one, of course; they aren't NULL). As the code goes on, it literally messes up everything; I get random characters and shapes in the console and the program crashes.
I don't think it makes sense to check the address of retval BEFORE it appears, it being a VLA and all (by definition the compiler and the debugger don't know much about it, it's generated at runtime on the stack).
Try checking its address after its point of definition.
EDIT
I just read the "I get random characters and shapes in console". It's obvious now that you are returning the VLA and expecting things to work.
A VLA is only valid inside the block where it was defined. Using it outside is undefined behavior and thus very dangerous. Even if the size were constant, it still wouldn't be valid to return it from the function. In this case you most definitely want to malloc the memory.
What cnicutar said.
I hate people who do this, so I hate me ... but ... Arrays of non-constant size (VLAs) are a C99 feature and are not part of standard C++. Of course GCC has an extension that accepts them in C++ too.
Under the covers you are essentially doing an _alloca, so your odds of blowing out the stack are proportional to who has access to abuse the function.
Finally, I hope it doesn't actually get returned, because that would be returning a pointer to a stack allocated array, which would be your real problem since that array is gone as of the point of return.
In C++ you would typically use a string class.
In C you would either pass a pointer and length in as parameters, or a pointer to a pointer (or return a pointer) and specify that callers should call free() on it when done. These solutions all suck because they are prone to leaks, truncation, or overflow. :/
Well, your fundamental problem is that you are returning a pointer to the stack allocated VLA. You can't do that. Pointers to local variables are only valid inside the scope of the function that declares them. Your code results in Undefined Behaviour.
At least I am assuming that somewhere in the ..... in the real code is the line return retval.
You'll need to use heap allocation, or pass a suitably sized buffer to the function.
As well as that, you only need +1 rather than +2 in the length calculation - there is only one null-terminator.
Try changing retval to a character pointer and allocating your buffer using malloc().
Pass the two string arguments as char * or const char *.
Rather than returning char *, you should just pass another parameter with a string pointer that you already malloc'd space for.
Return bool or int describing what happened in the function, and use the parameter you passed to store the result.
Lastly don't forget to free the memory since you're having to malloc space for the string on the heap...
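#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// retstr is not const like the other two; the caller must make it large enough
bool subdir(const char *input, const char *dir, char *retstr){
    strcpy(retstr, input);
    strcat(retstr, dir);
    return true;
}
int main(void)
{
    char h[] = "Hello ";
    char w[] = "World!";
    char *greet = malloc(strlen(h) + strlen(w) + 1); // size of the result plus room for the terminator
    if (greet == NULL)
        return 1; // allocation failed
    subdir(h, w, greet);
    printf("%s", greet);
    free(greet); // release the buffer once it is no longer needed
    return 0;
}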
This will print: "Hello World!" added together by your function.
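Also, when you're creating a string on the fly that has to outlive the function that builds it, you must malloc. A variable-length array such as char greet[totallen]; does compile in C99, but it lives on the stack and is gone when the function returns, so it cannot serve as the returned result.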
