How to robustly copy text to char* without any errors - c

I have 2 questions..
is it necessary to add a termination character when executing the following commands against a char *string ?
strcpy();
strncpy();
Is it necessary to allocate memory before doing any operation with the above two functions against the char *string ?
for example..
char *str;
str = malloc(strlen(texttocopy));
strcpy(texttocopy, str); // see the below edit
Please explain.
EDIT :
in the above code I inverted the argument. it is just typo i made while asking the question here. The correct way should be
strcpy(str, texttocopy); // :)

The strcpy function always adds the terminator, but strncpy may not do it in some cases.
And for the second question, yes you need to make sure there is enough memory allocated for the destination. In your example you have not allocated enough memory, and will have a buffer overflow. Remember that strlen returns the length of the string without counting the terminator. You also have inverted the arguments to strcpy, the destination is the first argument.
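As a minimal sketch of the corrected allocation, assuming a hypothetical helper named copy_string (it is not part of the question), the size passed to malloc is strlen(src) + 1 so that the terminator written by strcpy has somewhere to go:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL if malloc fails.
   The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);      /* length without the '\0' */
    char *dst = malloc(len + 1);   /* +1 leaves room for the '\0' */
    if (dst == NULL)
        return NULL;
    strcpy(dst, src);              /* destination first; copies the '\0' too */
    return dst;
}
```

A caller would write something like char *str = copy_string(texttocopy); and call free(str) once the copy is no longer needed.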

The 'strcpy' function copies data from the source to the destination address, including the '\0' termination character. The 'strncpy' function copies data the same way, but if no '\0' termination character exists in the first n bytes to be copied, no termination character is copied and you need to add it yourself to terminate the string.
You will always have to statically or dynamically allocate a memory space to play with. Therefore, you should declare a character array or dynamically allocate a chunk of memory first; then you can play nice with your strings.
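The strncpy caveat can be sketched as follows. The helper name bounded_copy is an assumption made for illustration; it copies at most dstsize - 1 characters and then writes the terminator explicitly, since strncpy will not do so when the source is too long:

```c
#include <string.h>

/* Hypothetical helper: copies at most dstsize - 1 characters of src into dst
   and always terminates dst. dstsize must be at least 1. */
void bounded_copy(char *dst, size_t dstsize, const char *src)
{
    strncpy(dst, src, dstsize - 1); /* may leave dst unterminated if src is long */
    dst[dstsize - 1] = '\0';        /* so terminate explicitly */
}
```

With char small[4], a call such as bounded_copy(small, sizeof small, "abcdefgh") leaves "abc" in small. The string is truncated, but it is still a valid C string.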

Related

The use of strcat in C overwrites irrelevant strings

The code is as follows:
char seg1[] = "abcdefgh";
char seg2[] = "ijklmnop";
char seg3[] = "qrstuvwx";
strcat(seg2, seg3);
Then the value stored in seg1 will become:
"rstuvwx\0\0"
I once learned that strings declared close together are also adjacent in the stack area, but I forgot the details.
I guess the memory address of seg1 was overwritten when strcat() was executed, but I'm not sure about the specific process. Can someone tell me the specific process of this event? Thanks
C does not have a string class, it has character arrays which may be used as strings by appending a null terminator. And since there is no string class, all memory management of strings/arrays must be done manually.
char seg1[] = "abcdefgh"; Allocates space for exactly 8 characters and 1 null terminator. There is no room to append anything else at the end. If you try anyway, that's the realm of undefined behavior, where anything can happen. Crashes, overwriting other variables, program ceasing to function as expected and so on.
Solve this by allocating enough space to append something in the end, for example
char seg1[50] = "abcdefgh";. Alternatively allocate a new, third array and copy the strings into that one.
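One way to make the "enough space" requirement explicit is to check it before concatenating. This is only a sketch; concat_into is a hypothetical helper, not a standard function:

```c
#include <string.h>

/* Hypothetical helper: writes a followed by b into dst.
   Returns 0 on success, -1 if dst (dstsize bytes) is too small. */
int concat_into(char *dst, size_t dstsize, const char *a, const char *b)
{
    size_t needed = strlen(a) + strlen(b) + 1; /* +1 for the '\0' */
    if (needed > dstsize)
        return -1;          /* refuse rather than overflow */
    strcpy(dst, a);
    strcat(dst, b);
    return 0;
}
```

Joining "ijklmnop" and "qrstuvwx" needs 17 bytes (16 characters plus the terminator), so a 9-byte array like seg2 in the question is rejected instead of silently overwriting its neighbours.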

malloc and strcpy interactions

I've been testing out interactions between malloc() and various string functions in order to try to learn more about how pointers and memory work in C, but I'm a bit confused about the following interactions.
char *myString = malloc(5); // enough space for 5 characters (no '\0')
strcpy(myString, "Hello"); // shouldn't work since there isn't enough heap memory
printf("%s, %zu\n", myString, strlen(myString)); // also shouldn't work without '\0'
free(myString);
Everything above appears to work properly. I've tried using printf() for each character to see if the null terminator is present, but '\0' appears to just print as a blank space anyways.
My confusion lies in:
String literals will always have an implicit null terminator.
strcpy should copy over the null terminator onto myString, but there isn't enough allocated heap memory
printf/strlen shouldn't work unless myString has a terminator
Since myString apparently has a null terminator, where is it? Did it just get placed at a random memory location? Is the above code an error waiting to happen?
Addressing your three points:
String literals will always have an implicit null terminator.
Correct.
strcpy should copy over the null terminator onto myString, but there isn't enough allocated heap memory
strcpy has no way of knowing how large the destination buffer is, and will happily write past the end of it, overwriting whatever comes after the buffer in memory. For more on this off-the-end access, look up 'buffer overrun' or 'buffer overflow'; these are common security weaknesses.
For a safer version, use strncpy, which takes the length of the destination buffer as an argument so as not to write past the end of it. Be aware that strncpy does not write a null terminator when the source does not fit within that length, so you still have to terminate the destination yourself.
printf/strlen shouldn't work unless myString has a terminator
The phrase 'shouldn't work' is a bit vague here. printf/strlen/etc will continue reading through memory until a null terminator is found, which could be immediately after the string or could be thousands of bytes away (in your case you have written the null terminator to the memory immediately after myString so printf/strlen/etc will stop there).
Lastly:
Is the above code an error waiting to happen?
Yes. You are overwriting memory that has not been allocated, which could cause any manner of problems depending on what happened to be overwritten.
From the strcpy man page:
If the destination string of a strcpy() is not large enough, then anything might happen. Overflowing fixed-length string buffers is a favorite cracker technique for taking complete control of the machine. Any time a program reads or copies data into a buffer, the program first needs to check that there's enough space. This may be unnecessary if you can show that overflow is impossible, but be careful: programs can get changed over time, in ways that may make the impossible possible.
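The "check that there's enough space" advice can be sketched with snprintf, which never writes more than the size it is given and returns the length the full string would have needed. The helper name copy_checked is an assumption for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst without writing past dstsize bytes.
   Returns 1 if the whole string fit, 0 if it was truncated (or on error). */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    int n = snprintf(dst, dstsize, "%s", src); /* always terminates if dstsize > 0 */
    return n >= 0 && (size_t)n < dstsize;
}
```

For the code in the question, copying "Hello" into a 5-byte buffer would report truncation and store "Hell", while a 6-byte buffer holds the whole string and its terminator.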

Space for Null character in c strings

When is it necessary to explicitly provide space for a NULL character in C strings.
For example:
This works without any error although I haven't declared str to be 7 characters long, i.e. for the characters of the string plus the NULL character.
#include<stdio.h>
int main(){
char str[6] = "string";
printf("%s", str);
return 0;
}
Though in this question https://stackoverflow.com/a/7652089 the user says
"This is useful if you need to modify the string later on, but know that it will not exceed 40 characters (or 39 characters followed by a null terminator, depending on context)."
What does it mean by "depending on context" ?
When is it necessary to explicitly provide space for a NULL character in C strings?
Always. Not having that \0 character there will make functions like strcpy, strlen and printing via %s behave wrong. It might work for some examples (like your own) but I won't bet anything on that.
On the other hand, if your string is binary and you know the length of the packet you don't need that extra space. But then you cannot use str* functions. And this is not the case of your question, anyway.
It is buggy, keyword "buffer overflow". The memory is overwritten.
char str[4] = "stringulation";
char str2[20];
printf("%s", str);
printf("%s", str2);
Trying to write to an address that you have not requested may lead to data corruption, random output, or undefined behaviour.
Your code invokes undefined behaviour. You may think it works, but the code is broken.
To store a C string with 6 characters, and a null-terminator, you need a character array of length 7 or more.
When is it necessary to explicitly provide space for a NULL character in C strings
There are no exceptions. A C string must always include a null terminating character.
What does it mean by "depending on context"?
The answer there is drawing the distinction between a string variable that you intend to modify at a later time, or a string variable that you will not modify. In the former case, you may choose to allocate more than you need for the initial contents, because you want to be able to add more later. In the latter case, you can simply allocate as many characters are needed for the initial value, and no more.
That 0 terminator [1] is how the various library functions (strcpy(), strlen(), printf(), etc.) identify the end of a string. When you call a function like
char foo[6] = "hello";
printf( "%s\n", foo );
the array expression foo is converted to a pointer value before it's passed to the function, so all the function receives is the address of the first character; it doesn't know how long the foo array is. So it needs some way to know where the end of the string is. If foo didn't have that space for the 0 terminator, printf() would continue to print characters beyond the end of the array until it saw a 0-valued byte.
1. I prefer using the term "0 terminator" instead of "NULL terminator", just to avoid confusion with the NULL pointer, which is a different thing.
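To see whether an array actually has room for the terminator, one can look for a 0 byte inside its bounds. The sketch below uses a hypothetical helper, is_terminated, built on the standard memchr function:

```c
#include <string.h>

/* Hypothetical helper: reports whether the first size bytes of buf
   contain a 0 terminator, i.e. whether buf holds a valid C string. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

For char str[6] = "string"; the call is_terminated(str, sizeof str) returns 0, because all six bytes are taken by the letters. Declaring the array as char str[7] or char str[] makes it return 1, and sizeof reports 7 in both cases.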

string manipulation without alloc mem in c

I'm wondering if there is another way of getting a sub string without allocating memory. To be more specific, I have a string as:
const char *str = "9|0\" 940 Hello";
Currently I'm getting the 940, which is the sub-string I want as,
char *a = strstr(str,"9|0\" ");
char *b = substr(a+5, 0, 3); // gives me the 940
Where substr is my sub string procedure. The thing is that I don't want to allocate memory for this by calling the sub string procedure.
Is there a much easier way?, perhaps by doing some string manipulation and not alloc mem.
I'll appreciate any feedback.
No, it can't be done. At least, not without modifying the original string and not without departing from the usual C concept of what a string is.
In C, a string is a sequence of characters terminated by a NUL (a \0 character). In order to obtain from "9|0\" 940 Hello" the substring "940", there would have to be a sequence of characters 9, 4, 0, \0 somewhere in memory. Since that sequence of characters does not exist anywhere in your original string, you would have to modify the original string.
The other option would just be to use a pointer into the original string at the place where your desired substring starts, and then also remember how long your substring is supposed to be in lieu of having the terminating \0 character. However, all C standard library functions that work on strings (and pretty much all third party C libraries that work with strings) expect strings to be NUL-terminated, and so won't accept this pointer-and-count format.
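The pointer-and-length idea can be sketched as below. The struct view type and both helpers are hypothetical names. Most string functions need the terminator, but a few standard interfaces do take an explicit length, such as printf with a %.*s precision and memcmp, and the sketch relies on those:

```c
#include <stdio.h>
#include <string.h>

/* A substring view: a pointer into the original string plus a length.
   No terminator is stored, so nothing is allocated or modified. */
struct view {
    const char *start;
    size_t len;
};

/* Hypothetical helper: returns the len characters that follow the first
   occurrence of prefix in str, or an empty view if they are not there. */
struct view view_after(const char *str, const char *prefix, size_t len)
{
    struct view v = { NULL, 0 };
    const char *p = strstr(str, prefix);
    if (p != NULL && strlen(p + strlen(prefix)) >= len) {
        v.start = p + strlen(prefix);
        v.len = len;
    }
    return v;
}

/* Prints a view; the precision passed through '*' limits how many
   characters %s reads, so no terminator is needed. */
void print_view(struct view v)
{
    if (v.start != NULL)
        printf("%.*s\n", (int)v.len, v.start);
}
```

Calling view_after(str, "9|0\" ", 3) on the string from the question yields a view of the three characters 940 inside the original storage. The view is only valid for as long as the original string is.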
Try this:
char *mysubstr(char *dst, const char *src, const char *substr, size_t maxdst) {
... do substr logic, but stick result in dst respecting maxdst ...
}
Basically, punt and let the caller allocate space on the stack via:
char s[100];
Or something.
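A possible shape for such a caller-allocated version is sketched here. The name substr_into and its offset-based parameters are assumptions for illustration and differ from the signature outlined above:

```c
#include <string.h>

/* Hypothetical helper: copies up to len characters of src, starting at
   offset start, into the caller's buffer dst of maxdst bytes.
   Returns dst, or NULL if start is past the end of src or maxdst is 0. */
char *substr_into(char *dst, size_t maxdst, const char *src,
                  size_t start, size_t len)
{
    size_t srclen = strlen(src);
    if (maxdst == 0 || start > srclen)
        return NULL;
    if (len > srclen - start)
        len = srclen - start;   /* clamp to what src actually holds */
    if (len > maxdst - 1)
        len = maxdst - 1;       /* clamp to what dst can hold */
    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return dst;
}
```

With char s[100]; on the stack, substr_into(s, sizeof s, str, 5, 3) fills s with "940" and no heap allocation takes place.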
A C string is simply an array of chars in memory. If you want to access the substring without allocating a copy of the characters, you can simply access it directly:
char *b = a + 5;
The problem with this approach is that b will not be null-terminated to the appropriate length. It would essentially be a pointer to the string: "940 Hello".
If that doesn't matter to the code that uses b, then you are good to go. Keep in mind, however, that this would probably surprise other programmers later on in the product lifetime (including yourself)!
As xyld suggested, you could let the caller allocate the memory and pass your substr function a buffer to fill; though, strictly speaking, that still involves "allocating memory".
Without allocating any memory at all, the only way you'd be able to do this would be by modifying the original string by changing the character after the substring to a '\0', but of course then your function couldn't take a const char * anymore, and you're modifying the original string, which may not be desirable.
If you don't require a \0 terminated string you can make a substring finding function that just tells you where in the full string (haystack) your partial string (needle) is. This would be considered a hot-copy or alias as the data could be changed by changes to the full string (haystack).
I was writing up a long thing on how to allocate memory using alloca and implement a macro (because it wouldn't work as a function) that would do what you want, but just happened to run across strndupa which is like strndup except allocates the memory on the stack rather than from the heap. It's a GNU extension, so it might not be available for you.
Writing your own macro that looks like a function is awkward, because it needs to return a value while also working on the caller's stack memory, but it is possible.

How to copy a few chars from a char[] to a char* in C?

Yo!
I'm trying to copy a few chars from a char[] to a char*. I just want the chars from index 6 to (message length - 9).
Maybe the code example will explain my problem more:
char buffer[512] = "GET /testfile.htm HTTP/1.0";
char* filename; // I want *filename to hold only "/testfile.htm"
msgLen = recv(connecting_socket, buffer, 512, 0);
strncpy(filename, buffer+5, msgLen-9);
Any response would help a lot!
I assume you meant...
strncpy(filename, buffer+5, msgLen-9);
The problem is you haven't allocated any memory to hold the characters you're copying. "filename" is a pointer, but it doesn't point at anything.
Either just declare
char filename[512];
or malloc some memory for the new name (and don't forget to free() it...)
There are a few problems with the use of strncpy() in your code.
buffer+5 points to the sixth character in the string (the "t"), while you said you wanted the slash.
The last parameter is the maximum number of bytes to copy, so it should probably be msgLen-13.
strncpy() won't null terminate the copied string, so you need to do that manually.
Also, from a readability perspective,
I prefer
strncpy(filename, &buffer[4], msgLen-(9 + 4));
&buffer[4] is the address of the fifth character in the array. That's a personal thing, though.
Also, worth pointing out that the result of "recv" could be one byte or 512 bytes. It won't just read a line. You should really loop calling recv until you have a complete line to work with.
First of all you should allocate a buffer for filename. The next problem is your offset.
char buffer[512] = "GET /testfile.htm HTTP/1.0";
char filename[512]; // I want *filename to hold only "/testfile.htm"
msgLen = recv(connecting_socket, buffer, 512, 0);
strncpy(filename, buffer+4, msgLen-4-9);
//the source argument should be buffer+4, not buffer+5. Indexes are zero based.
//the third parameter is a count, not an end position. You should subtract
//the first 4 chars too.
Also you should make sure you add a null at the end of string as strncpy doesn't do it.
filename[msgLen-4-9] = 0;
You could also use memcpy instead of strncpy as you want to just copy some bytes:
memcpy(filename, buffer+4, msgLen-4-9);
filename[msgLen-4-9] = 0;
In either case, make sure you validate your input. You might receive invalid input from the socket.
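As one way to validate the input, the sketch below locates the two spaces around the request target instead of relying on fixed offsets. This is a different technique from the msgLen-4-9 arithmetic above, and extract_target is a hypothetical helper. It assumes the caller has already null-terminated the received data, which recv does not do on its own:

```c
#include <string.h>

/* Hypothetical helper: copies the request target (the text between the
   first and second space) of an HTTP request line into out.
   req must be null-terminated. Returns 0 on success, -1 if the line is
   malformed or the target does not fit in outsize bytes. */
int extract_target(const char *req, char *out, size_t outsize)
{
    const char *start = strchr(req, ' ');
    if (start == NULL)
        return -1;
    start++;                                /* skip the space after the method */
    const char *end = strchr(start, ' ');
    if (end == NULL)
        return -1;
    size_t len = (size_t)(end - start);
    if (len == 0 || len >= outsize)
        return -1;                          /* empty, or no room for the '\0' */
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

For the request line "GET /testfile.htm HTTP/1.0" this stores "/testfile.htm" in out, and it returns -1 for a line without two spaces rather than copying from a guessed position.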
Your example code has the line:
char* filename;
This is an uninitialised pointer - it points nowhere, and isn't backed by any storage. You need to allocate some memory for it, e.g. using malloc() (and remember to free() it when you're done), or, in this case, you can probably simply declare it as a character array, e.g.
char filename[SOME_BUFFER_SIZE];
Declaring an array on the stack has the advantage that you don't need to explicitly free it up when you're done with it.
Arrays in C are not pointers, but in most expressions an array is converted to a pointer to its first element, so you can (usually) pass a char[] where a char* is expected.
You've not allocated any space for the filename
Either replace the declaration of filename with something like
char filename[512];
or (probably better) allow enough space for the filename
filename = (char *)malloc(msgLen - 9 - 6 + 1 ); /* + 1 for the terminating null */
You have not allocated any memory space to filename yet.
It is a pointer... but until initialized, it is pointing to some random area of memory.
You need to allocate memory for filename. As Roddy said, you could declare filename to be 512 bytes. Or you could:
filename = (char*)malloc(512*sizeof(char));
(note: sizeof(char) isn't strictly needed, but I think helps clarify exactly what is being allocated).
After this statement, filename is a pointer to allocated memory, you are free to use it, including copying data to it from buffer. If you copy only a limited region, be sure you leave filename null-terminated.
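Copying a limited region into freshly allocated memory and terminating it could look like the sketch below. The helper name copy_range is an assumption, and the caller is responsible for making sure src really has len readable bytes:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the len bytes starting at src,
   with a '\0' appended, or NULL if malloc fails. Caller must free() it.
   The caller guarantees that src has at least len readable bytes. */
char *copy_range(const char *src, size_t len)
{
    char *dst = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

Here the allocation is sized to the region being copied rather than to a fixed 512 bytes, and the result is always null-terminated.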

Resources