best method to assign new string value to char array - c

I know that I have to use strcpy / strncpy to assign a new string value to an existing char array. Recently I saw a lot of code like this
char arr[128] = "\0";
sprintf(arr, "Hello World"); // only string constants no variable input
// or
sprintf(arr, "%s", "Hello World");
Both variants give the same result. What is the advantage of the latter variant?

It depends on whether the string to be copied is a literal, as shown, or can vary.
The best technique for the array shown would be:
char arr[128] = "Hello World";
If you're in charge of the string and it contains no % symbols, then there's not much difference between the two sprintf() calls. Strictly, the first uses the string as the format and copies the characters directly, while the second notes it has %s as the format and copies the characters from the extra argument directly — it's immeasurably slower. There's a case for:
snprintf(arr, sizeof(arr), "%s", "Hello World");
which ensures no buffer overflow even if "Hello World" becomes a much longer diatribe.
If you're not in charge of the string, then using snprintf() as shown becomes important as even if the string contains % symbols, it is simply copied and there's no overflow. You have to check the return value to establish whether any data was truncated.
Using strcpy() is reasonable if you know how long the string is and that there's space to hold it. Using strncpy() is fraught — it null pads to full length if the source is shorter than the target, and doesn't null terminate if the source is too long for the target.
If you've established the length of the string is short enough, using memmove() or memcpy() is reasonable too. If the string is too long, you have to choose an error handling strategy — truncation or error.
If the trailing (unused) space in the target array must be null bytes (for security reasons, to ensure there's no leftover password hidden in it), then using strncpy() may be sensible — but beware of ensuring null termination if the source is too long. In most cases, the initializer for the array is not really needed.
The compiler may be able to optimize the simple cases.

The first version won't work if the string contains any % characters, because sprintf() will treat them as formatting operators that need to be filled in using additional arguments. This isn't a problem with a fixed string like Hello World, but if you're getting the string dynamically it could cause undefined behavior because there won't be any arguments to match the formatting operators. This can potentially cause security exploits.
If you're not actually doing any formatting, a better way is to just use strcpy():
strcpy(arr, "Hello World");
Also, when initializing the string it's not necessary to put an explicit \0 in the string. A string literal always ends with a null byte. So you can initialize it as:
char arr[128] = "";
And if you're immediately overwriting the variable with sprintf() or strcpy(), you don't need to initialize it in the first place.

Related

Segmentation fault of small code

I am trying to test something and I made a small test file to do so. The code is:
void main(){
int i = 0;
char array1 [3];
array1[0] = 'a';
array1[1] = 'b';
array1[2] = 'c';
printf("%s", array1[i+1]);
printf("%d", i);
}
I receive a segmentation error when I compile and try to run. Please let me know what my issue is.
First, char array1[3] is not null-terminated, because there is not enough space to put '\0' at the end of array1. To avoid this undefined behavior, increase the size of array1 to 4 and store '\0' in the last element.
Second, array1[i+1] is a single char, not a string, so use %c instead of %s:
printf("%c", array1[i+1]);
I suggest you get yourself a good book/video series on C. It's not a language that's fun to pick up out of the blue.
Regardless, your problem here is that you haven't formed a correct string. In C, a string is a pointer to the start of a contiguous region of memory that happens to be filled with characters. There is no data whatsoever stored about its size or any other characteristics, only where it starts and what it is. Therefore you must explicitly provide information about where the string ends. This is done by setting the very last character in a string to the so-called null character (in C represented by the escape sequence '\0').
This implies that any string must be one character longer than the content you want it to hold. You should also avoid setting up a string manually like this. Use a library function like strlcpy to do it (note that strlcpy is a BSD extension rather than part of standard C, so it may not be available on every platform). It will automatically add a null character, even if your array is too small (by truncating the string). Alternatively you can initialize an array from a string literal like this:
char array[] = "abc";
It will automatically be null terminated and be of size 4.
Strings need to have a NUL terminator, and you don't have one, nor is there room for one.
The solution is to add one more character:
char array1[4];
// ...
array1[3] = 0;
Also you're asking to print a string but supplying a character instead. You need to supply the whole buffer:
printf("%s", array1);
Then you're fine.
Spend the time to learn about how C strings work, in particular about the requirement for the terminator, as buffer overflow bugs are no joke.
When printf sees a "%s" specifier in the format string, it expects a char * as the corresponding argument, but you passed the char value of the expression array1[i+1]. That char gets promoted to int, which is still incompatible with char *. And even if it were converted, it would have no chance of being a valid pointer to any meaningful character string.

Concatenate strings without using string functions: Replacing the end of string (null character) gives seg fault

Without using the string.h functions (want to use only the std libs), I wanted to create a new string by concatenating the string provided as an argument to the program. For that, I decided to copy the argument to a new char array of larger size and then replace the end of the string by the characters I want to append.
unsigned int argsize=sizeof(argv[1]);
unsigned char *newstr=calloc(argsize+5,1);
newstr=argv[1]; //copied arg string to new string of larger size
newstr[argsize+4]=oname[ns]; //copied the end-of-string null character
newstr[argsize]='.'; //this line gives seg fault
newstr[argsize+1]='X'; //this executes without any error
I believe there must be another more secure way of concatenating string without using string functions or by copying and appending char by char into a new char array. I would really want to know such methods. Also, I'm curious to know what is the reason of this segfault.
Read here: https://stackoverflow.com/a/164258/1176315 and I guess, the compiler is making my null character memory block read only but that's only a guess. I want to know the real reason behind this.
I will appreciate all your efforts to answer the question. Thanks.
Edit: By using std libs only, I mean to say I don't want to use the strcpy(), strlen(), strcat() etc. functions.
Without using the string.h functions (want to use only the std libs)
string.h is part of the standard library.
unsigned int argsize=sizeof(argv[1]);
This is wrong. sizeof does not tell you the length of a C string; it just tells you how big the type of its argument is. argv[1] is a pointer, so sizeof will just tell you how big a pointer is on your platform (typically 4 or 8 bytes), regardless of the actual content of the string.
If you want to know how long a C string is, you have to examine its characters and count until you find a 0 character (which incidentally is what strlen does).
newstr=argv[1]; //copied arg string to new string of larger size
Nope. You just copied the pointer stored in argv[1] to the variable newstr, incidentally losing the pointer that calloc returned to you previously, so you also have a memory leak.
To copy a string from a buffer to another you have to copy its characters one by one until you find a 0 character (which incidentally is what strcpy does).
All the following lines are thus operating on argv[1], so if you are going out of its original bounds anything can happen.
I believe there must be another more secure way of concatenating string without using string functions or by copying and appending char by char into a new char array.
C strings are just arrays of characters, everything boils down to copying/reading them one at time. If you don't want to use the provided string functions you'll end up essentially reimplementing them yourself. Mind you, it's a useful exercise, but you have to understand a bit better what C strings are and how pointers work.
First of all, sizeof(argv[1]) will not return the length of the string. You need to count the number of characters in the string, either with a loop or with the standard library function strlen(). Second, if you want to copy the string, you need to use the strcpy() function (or copy it character by character).
You are supposed to do it like this:
size_t argsize = strlen(argv[1]); // you can also count the characters in a loop
char *newstr = calloc(argsize + 5, 1); // plain char, so it can be passed to strcpy()
strcpy(newstr, argv[1]); // copies the characters, not just the pointer
newstr[argsize+4]=oname[ns];
newstr[argsize]='.';
newstr[argsize+1]='X';

Space for Null character in c strings

When is it necessary to explicitly provide space for a NULL character in C strings?
For example:
This works without any error, although I haven't declared str to be 7 characters long, i.e. the characters of the string plus the NULL character.
#include<stdio.h>
int main(){
char str[6] = "string";
printf("%s", str);
return 0;
}
Though in this question https://stackoverflow.com/a/7652089 the user says
"This is useful if you need to modify the string later on, but know that it will not exceed 40 characters (or 39 characters followed by a null terminator, depending on context)."
What does it mean by "depending on context" ?
When is it necessary to explicitly provide space for a NULL character in C strings?
Always. Not having that \0 character there will make functions like strcpy and strlen, and printing via %s, misbehave. It might work for some examples (like your own), but I wouldn't bet anything on that.
On the other hand, if your string is binary and you know the length of the packet you don't need that extra space. But then you cannot use str* functions. And this is not the case of your question, anyway.
It is buggy, keyword "buffer overflow". The memory is overwritten.
char str[4] = "stringulation";
char str2[20];
printf("%s", str);
printf("%s", str2);
Accessing memory that you have not reserved may lead to data corruption, random output, or otherwise undefined behavior.
Your code invokes undefined behaviour. You may think it works, but the code is broken.
To store a C string with 6 characters, and a null-terminator, you need a character array of length 7 or more.
When is it necessary to explicitly provide space for a NULL character in C strings
There are no exceptions. A C string must always include a null terminating character.
What does it mean by "depending on context"?
The answer there is drawing the distinction between a string variable that you intend to modify at a later time, or a string variable that you will not modify. In the former case, you may choose to allocate more than you need for the initial contents, because you want to be able to add more later. In the latter case, you can simply allocate as many characters as are needed for the initial value, and no more.
That 0 terminator (see note 1 below) is how the various library functions (strcpy(), strlen(), printf(), etc.) identify the end of a string. When you call a function like
char foo[6] = "hello";
printf( "%s\n", foo );
the array expression foo is converted to a pointer value before it's passed to the function, so all the function receives is the address of the first character; it doesn't know how long the foo array is. So it needs some way to know where the end of the string is. If foo didn't have that space for the 0 terminator, printf() would continue to print characters beyond the end of the array until it saw a 0-valued byte.
1. I prefer using the term "0 terminator" instead of "NULL terminator", just to avoid confusion with the NULL pointer, which is a different thing.

Reduce string length in c, where is the fault?

i have two different filenames, which are defined in a header file:
1: "physio_sensor_readout.csv"
2: "statethresh_configuration.csv"
they are initialised by
char* filename;
and later
filename = FILENAMEINAMACRO; which is the corresponding filename above
Later, filename is passed to another function which alters the ending:
filename[strnlen(filename, FILENAME_LENGTH) - 4] = '\0';
This should remove the ending .csv, and I strncat a new one afterwards.
FILENAME_LENGTH is 60, so there is enough space.
It works if I pass "statethresh_..." (even the strncat afterwards) but not with "physio_se...". That throws a segmentation fault.
strnlen(filename, FILENAME_LENGTH) - 4
returns 21 in case 1 and 25 in case 2. This is the correct position of the dot, where I want to put the terminating null.
Is this a problem with char*, and should I initialise filename with char filename[60]?
Regards and thank you
edit:
your suggestions solved the problem. thanks!
I think you declare FILENAMEINAMACRO as a string literal (without more code I cannot be sure about it).
String literals might be stored in read-only memory, so you might not be able to change them.
In any case, trying to change a string literal results in undefined behavior.
You might want to make a copy of FILENAMEINAMACRO with strcpy() and work on the copy.
It is not safe to modify the contents of a string literal. Something like this:
char *filename = "yes";
filename[2] = 'p'; // change to "yep"
is undefined behavior, and can cause disastrous results, because filename can be pointing to memory that can't be modified. Instead, try something like this:
char filename[] = "yes";
filename[2] = 'p'; // change to "yep"
which will allocate a new array filename and initialize its contents with "yes".
You appear to be pointing your char* pointer filename at a string constant. I assume you have defined #define FILENAMEINAMACRO "physio_sensor_readout.csv". This makes your assignment filename = "physio_sensor_readout.csv";. You then use the filename pointer to modify the string constant. Here is a more suitable sequence:
char filename[256]; // choose a size that is suitably large
...
strcpy(filename, FILENAMEINAMACRO); // also look at strncpy for safer copying
...
... manipulate the content of filename as you wish ...
Because you have made a copy of the string literal, modifying it is safe (as long as you stay within the bounds of the declared size of filename, which includes keeping any terminating null within the bounds as well).
You should be careful using the char filename[] = "..." form. It allocates enough space for the string literal you give it, but if you later copy some other string literal into that space you must be certain that the second literal is no longer than the first. A safer practice is to dimension the space to be large enough that you're certain your code will never attempt to use more than what you have dimensioned. If you accept input from outside the program (or from another person's code), you should check the length of what you are accepting before trying to copy it into the space you have dimensioned. Any use of space beyond the dimensioned size is likely to cause issues that can be hard to diagnose. In the example above, you must make every effort to ensure you never use more space (including the terminating nul char) than 256 chars (because filename is dimensioned at 256).

string manipulation without alloc mem in c

I'm wondering if there is another way of getting a sub string without allocating memory. To be more specific, I have a string as:
const char *str = "9|0\" 940 Hello";
Currently I'm getting the 940, which is the sub-string I want as,
char *a = strstr(str,"9|0\" ");
char *b = substr(a+5, 0, 3); // gives me the 940
Where substr is my sub string procedure. The thing is that I don't want to allocate memory for this by calling the sub string procedure.
Is there a much easier way, perhaps by doing some string manipulation without allocating memory?
I'll appreciate any feedback.
No, it can't be done. At least, not without modifying the original string and not without departing from the usual C concept of what a string is.
In C, a string is a sequence of characters terminated by a NUL (a \0 character). In order to obtain from "9|0\" 940 Hello" the substring "940", there would have to be a sequence of characters 9, 4, 0, \0 somewhere in memory. Since that sequence of characters does not exist anywhere in your original string, you would have to modify the original string.
The other option would just be to use a pointer into the original string at the place where your desired substring starts, and then also remember how long your substring is supposed to be in lieu of having the terminating \0 character. However, all C standard library functions that work on strings (and pretty much all third party C libraries that work with strings) expect strings to be NUL-terminated, and so won't accept this pointer-and-count format.
Try this:
char *mysubstr(char *dst, const char *src, const char *substr, size_t maxdst) {
... do substr logic, but stick result in dst respecting maxdst ...
}
Basically, punt and let the caller allocate space on the stack via:
char s[100];
Or something.
A C string is simply an array of chars in memory. If you want to access the substring without allocating a copy of the characters, you can simply access it directly:
char *b = &a[5];
The problem with this approach is that b will not be null-terminated to the appropriate length. It would essentially be a pointer to the string: "940 Hello".
If that doesn't matter to the code that uses b, then you are good to go. Keep in mind, however, that this would probably surprise other programmers later on in the product lifetime (including yourself)!
As xyld suggested, you could let the caller allocate the memory and pass your substr function a buffer to fill; though, strictly speaking, that still involves "allocating memory".
Without allocating any memory at all, the only way you'd be able to do this would be by modifying the original string by changing the character after the substring to a '\0', but of course then your function couldn't take a const char * anymore, and you're modifying the original string, which may not be desirable.
If you don't require a \0 terminated string you can make a substring finding function that just tells you where in the full string (haystack) your partial string (needle) is. This would be considered a hot-copy or alias as the data could be changed by changes to the full string (haystack).
I was writing up a long thing on how to allocate memory using alloca and implement a macro (because it wouldn't work as a function) that would do what you want, but just happened to run across strndupa which is like strndup except allocates the memory on the stack rather than from the heap. It's a GNU extension, so it might not be available for you.
Writing your own macro for this is awkward, because it has to look like a function that returns a value while also working on the caller's stack memory, but it is possible.
