I get a lot of strcat lines in my code. Is there a better way to concatenate strings in C?
char material[50]; // it is defined before this code.
char result[10000];
strcpy(result, "// Assign new material to worldGlobe\n");
strcat(result, "shadingNode -asShader lambert -n ");
strcat(result, material);
strcat(result, ";\n");
You could use a format string with snprintf(), which is safer than sprintf() because it never writes more than the given number of bytes:
snprintf(result, sizeof(result),
"// Assign new material to worldGlobe\nshadingNode -asShader lambert -n %s;\n",
material);
strcat is only really suitable for really small strings; it has several problems for anything non-trivial, such as:
Due to the Schlemiel the Painter problem, each strcat call is O(n) in the current length of the destination, because strcat has to walk the entire string to find its end. Building a string from many pieces is therefore quadratic overall: the longer the result gets, the longer each concatenation takes. To solve this, store the length of the string along with the string data, which lets you jump directly to the end of the string.
It does not do any bounds checking. If you strcat too much onto the end of a string, it will happily write past the end of the buffer, producing a segfault in the best case, a severe security vulnerability in the worst, and most likely some bugs that will make you bash your head against the wall. strncat partially solves this problem, as long as you pass it the correct limit: its size argument is the maximum number of characters to append (the space remaining in the destination, minus one for the terminator), not the total size of the destination buffer.
If your destination buffer is too small, neither strcat nor strncat will increase its size: you'll have to do this yourself.
There are two practical solutions in your situation:
a) The Tower Of Hanoi algorithm: Build a stack of strings. If a new string is shorter than the stack top, push it onto the stack. If it's longer, pop off the top, concatenate, and repeat the process with the result. When you're done pushing, concatenate what's on the stack. This is what std::stringstream in C++ or StringBuilder in .NET do, and if you look around, I'm sure you'll find a suitable implementation in C.
b) Write your strings directly to a stream. What you're outputting looks a lot like code - why not write it to a file directly?
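Option b) can be sketched as follows. This is only an illustration: the function name write_material is made up here, and it assumes the caller has already opened the output stream.

```c
#include <stdio.h>

/* Hypothetical helper: writes the snippet for one material straight to
   the stream `out`, so no intermediate buffer is needed.
   Returns 0 on success, -1 if fprintf reported an error. */
int write_material(FILE *out, const char *material)
{
    if (fprintf(out,
                "// Assign new material to worldGlobe\n"
                "shadingNode -asShader lambert -n %s;\n",
                material) < 0)
        return -1;
    return 0;
}
```

Calling write_material(stdout, material) prints the two lines; passing a FILE * obtained from fopen() writes them to a file instead.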
What about
sprintf(result, "// Assign new material to worldGlobe\nshadingNode -asShader lambert -n %s;\n", material);
Try stpcpy. Your sample code becomes:
char material[50]; // it is defined before this code.
char result[10000], *p = result;
p = stpcpy(p, "// Assign new material to worldGlobe\n");
p = stpcpy(p, "shadingNode -asShader lambert -n ");
p = stpcpy(p, material);
p = stpcpy(p, ";\n");
This function is available in Linux; the man page for stpcpy on my system states:
This function is not part of the C or POSIX.1 standards, and is not customary on Unix systems, but is not a GNU invention either. Perhaps it comes from MS-DOS.
If you don't have it, it is easy enough to write:
char *stpcpy(char *restrict dst, const char *restrict src) {
return strcpy(dst, src) + strlen(src);
}
This assumes you are aware of the dangers of strcpy.
C is mostly a do-it-yourself language.
Now that you know how to concat strings, you should write your own function to make it easier.
I'd suggest something like:
char* str_multicat(char* result, ...);
And call it something like:
str_multicat(result, "// Assign new material to worldGlobe\n",
"shadingNode -asShader lambert -n ",
material,
";\n",
NULL);
(hint, if you don't know the ... syntax, look into va_arg, va_start, va_end)
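A minimal sketch of such a function is shown below. Note one deliberate difference from the prototype above: this version takes an extra size parameter (an assumption added here, not part of the original suggestion) so that it can refuse to overflow the buffer. The argument list must end with a null pointer.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Concatenates a NULL-terminated list of strings into result, which has
   room for `size` bytes. Returns result, or NULL if the pieces do not fit
   (result then holds the pieces that did fit, still NUL-terminated).
   The `size` parameter is an addition made for this sketch. */
char *str_multicat(char *result, size_t size, ...)
{
    va_list ap;
    const char *piece;
    size_t pos = 0;

    if (size == 0)
        return NULL;
    result[0] = '\0';

    va_start(ap, size);
    while ((piece = va_arg(ap, const char *)) != NULL) {
        size_t len = strlen(piece);
        if (len >= size - pos) {            /* no room for piece plus NUL */
            va_end(ap);
            return NULL;
        }
        memcpy(result + pos, piece, len + 1);  /* copies the NUL as well */
        pos += len;
    }
    va_end(ap);
    return result;
}
```

A call would look like str_multicat(result, sizeof result, "a", material, ";\n", (char *)NULL). The cast on the final argument matters because a bare NULL is not guaranteed to have pointer type when passed through "...".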
It would be pretty straightforward to build a string buffer struct that keeps track of the current position in your buffer, and combine that with vsnprintf() to get a catf(). The function vsnprintf() (assuming it's available) is just like snprintf(), except it takes a va_list instead of ... after the format string.
This approach has the advantage over other answers that it lets you 'cat' from anywhere in your code that has access to the struct, without explicitly carrying around the current length or recalculating it each time like strcat does.
Here's a rough sketch free of charge.....
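#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
/* Note: the typedef is for the pointer, not the struct. */
typedef struct StrBufStruct {
    char * buffer;
    size_t size;
    size_t pos;
} * StrBuf;
/* Create a new StrBuf. NOTE: Could statically allocate too. */
StrBuf newStrBuf(size_t size){
    StrBuf sb = malloc( sizeof(struct StrBufStruct) );
    if( sb == NULL )
        return NULL;
    sb->buffer = malloc( size );
    if( sb->buffer == NULL ){
        free(sb);
        return NULL;
    }
    sb->size = size;
    sb->pos = 0;
    if( size > 0 )
        sb->buffer[0] = '\0';
    return sb;
}
/* Append formatted text. Returns what vsnprintf returned,
   or -1 if the buffer is already full. */
int sbcatf( StrBuf b, const char * fmt, ... )
{
    va_list ap;
    int res;
    size_t avail;
    if( b->pos >= b->size ){
        /* If you want to get really fancy, use realloc so you don't have to worry
           about buffer size at all. But be careful, you can run out of memory. */
        return -1;
    }
    avail = b->size - b->pos;
    va_start(ap, fmt);
    res = vsnprintf( b->buffer + b->pos, avail, fmt, ap );
    va_end(ap);
    if( res < 0 )
        return res;
    /* vsnprintf reports the length it wanted to write; on truncation
       only avail - 1 characters were actually stored. */
    b->pos += ((size_t)res < avail) ? (size_t)res : avail - 1;
    return res;
}
/* Free a StrBuf created by newStrBuf. */
void deleteStrBuf( StrBuf sb ){
    if( sb != NULL ){
        free(sb->buffer);
        free(sb);
    }
}
int main(int argc, char **argv){
    int i;
    /* initialize your structure */
    StrBuf sb = newStrBuf(10000);
    if( sb == NULL )
        return 1;
    /* concatenate numbers 0-999 */
    for(i=0; i < 1000; i++){
        sbcatf(sb, "I=%d\n", i);
    }
    /* TODO: whatever needs to be done with sb->buffer */
    /* free your structure */
    deleteStrBuf(sb);
    return 0;
}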
Also note that if all you're trying to do is build a really long string while keeping line breaks in your source code, you can let the compiler do the work: adjacent string literals are concatenated at compile time, and this is standard C. I also use the technique to split strings at "\n" line breaks, so that the code looks like the resulting string.
const char * someString = "this is one really really really really"
"long stttttttttrrrrrrrrrrrrrrrrrrrriiiiiiiiiiinnnnnnngggggg"
" because the compiler will automatically concatenate string"
" literals until we reach a ';' after a \" character";
You could have a function that returns a pointer to the end of the string, and use that end pointer in future calls. That'd eliminate a bunch of the extra "first, find the end of the string" stuff.
char* faster_cat(char* dest, const char* src)
{
strcpy(dest, src);
return dest + strlen(src);
}
Use like:
char result[10000];
char *end = &result[0];
result[0] = '\0'; // not strictly necessary if you cat a string, but why not
end = faster_cat(end, "// Assign new material to worldGlobe\n");
end = faster_cat(end, "shadingNode -asShader lambert -n ");
end = faster_cat(end, material);
end = faster_cat(end, ";\n");
// result now contains the whole catted string
I'm working in C and want to implement a string concatenation function.
I implemented following function:
void mystr_concat(char* dest, char* src)
{
char* temp = dest;
while(*temp)
{
temp++;
}
while(*src)
{
*temp++ = *src;
src++;
}
*temp = '\0';
return;
}
This function appends the "src" string to the "dest" string.
The problem arises when the caller passes a "dest" buffer that is too small to hold the appended "src" string.
For example, suppose the caller has these strings and invokes the function:
char dest[6] = "abcnd";
char src[100] = "zdfhjksdfskdfsdfsdfj";
mystr_concat(dest, src);
In this case the function writes past the end of dest.
How can I detect this condition, and what is the right way to handle it?
C does not perform any bounds checks on array references. If you need them, you will have to either pass the maximum size of the destination array into the function and verify that the source will fit (or truncate it if required), or introduce an additional data structure that tracks the length of strings, in the way that typical Pascal implementations prefix each string with its length.
Neither solution is automatic and to support this functionality in a safe way requires the use of a language like Java or C# to prevent the use of unsafe constructs.
Notice that arrays, when passed as function arguments, decay to char pointers, and there is no portable way in C to know at runtime the size of a memory zone given only a pointer to it. So your mystr_concat cannot know the size of dest unless you give it that size somehow, e.g. by passing it as an additional function parameter:
void mystr_concat(char* dest, size_t destsize, char* src)
Then you might call it with e.g.
char destbuf[36];
strncpy (destbuf, "start", sizeof(destbuf)); /* sizeof(destbuf) is 36 */
mystr_concat(destbuf, sizeof(destbuf), "some-more");
Notice that the standard snprintf(3) function has a similar convention.
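One possible body for that size-aware signature is sketched below. The int return value (instead of void) and the choice to leave dest untouched on failure are assumptions made for this illustration; truncating instead would be an equally valid policy.

```c
#include <stddef.h>
#include <string.h>

/* Appends src to dest, where destsize is the total capacity of dest in bytes.
   Returns 0 on success, or -1 if dest is not NUL-terminated within destsize
   or if the result (including the NUL) would not fit. On failure dest is
   left unchanged. The int return type is an assumption of this sketch. */
int mystr_concat(char *dest, size_t destsize, const char *src)
{
    const char *end = memchr(dest, '\0', destsize);  /* bounded search */
    size_t destlen, srclen;

    if (end == NULL)
        return -1;
    destlen = (size_t)(end - dest);
    srclen = strlen(src);
    if (srclen >= destsize - destlen)   /* need srclen + 1 bytes of room */
        return -1;
    memcpy(dest + destlen, src, srclen + 1);
    return 0;
}
```

With the buffers from the question, mystr_concat(dest, sizeof dest, src) returns -1 instead of overflowing the 6-byte array.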
Another possible way is to decide that your function returns a heap allocated string and that it is the responsibility of its caller to free that string. Then you could code
char *my_string_catenation (const char* s1, const char* s2)
{
if (!s1||!s2) return NULL;
size_t s1len = strlen(s1);
size_t s2len = strlen(s2);
char* res = malloc(s1len+s2len+1);
if (!res)
{ perror("my_string_catenation malloc"); exit(EXIT_FAILURE); };
memcpy (res, s1, s1len);
memcpy (res+s1len, s2, s2len);
res[s1len+s2len] = (char)0;
return res;
}
You then might code things like
char buf[32];
snprintf(buf, sizeof(buf), "+%d", i);
char* catstr = my_string_catenation((i>30)?"foo":"boo", buf);
do_something_with(catstr);
free(catstr), catstr = NULL;
The above example is a bit stupid, since one could just use snprintf without my_string_catenation but I can't think of a short better example.
It is common in C libraries to have conventions about who is responsible to free some heap-allocated data. You should document such conventions (at least in comments in the header files declaring them).
Perhaps you might be interested in using Boehm's conservative garbage collector; you'll then use GC_MALLOC instead of malloc etc... and you won't bother about free ...
I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following best practice regarding malloc, realloc, and free.
The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...
http://www.cplusplus.com/reference/cstdlib/realloc/
that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.
So the prototype for my function is:
//decode a uri encoded string
char *net_uri_to_text(char *);
I don't like the way I'm doing it because I have to free the pointer after running the function:
char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);
Which means that malloc() and realloc() are called inside my function and free() is called outside my function.
I have a background in high level languages, (perl, plpgsql, bash) so my instinct is proper encapsulation of such things, but that might not be the best practice in C.
The question: Is my way best practice, or is there a better way I should follow?
full example
Compiles and runs with two warnings on unused argc and argv arguments, you can safely ignore those two warnings.
example.c:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *net_uri_to_text(char *);
int main(int argc, char ** argv) {
char * chr_input = "testing123%5a%5b%5cabc";
char * chr_output = net_uri_to_text(chr_input);
printf("%s\n", chr_output);
free(chr_output);
return 0;
}
//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
//define variables
int int_length = strlen(chr_input);
int int_new_length = int_length;
char * chr_output = malloc(int_length);
char * chr_output_working = chr_output;
char * chr_input_working = chr_input;
int int_output_working = 0;
unsigned int uint_hex_working;
//while not a null byte
while(*chr_input_working != '\0') {
//if %
if (*chr_input_working == *"%") {
//then put correct char in
sscanf(chr_input_working + 1, "%02x", &uint_hex_working);
*chr_output_working = (char)uint_hex_working;
//printf("special char:%c, %c, %d<\n", *chr_output_working, (char)uint_hex_working, uint_hex_working);
//realloc
chr_input_working++;
chr_input_working++;
int_new_length -= 2;
chr_output = realloc(chr_output, int_new_length);
//output working must be the new pointer plys how many chars we've done
chr_output_working = chr_output + int_output_working;
} else {
//put char in
*chr_output_working = *chr_input_working;
}
//increment pointers and number of chars in output working
chr_input_working++;
chr_output_working++;
int_output_working++;
}
//last null byte
*chr_output_working = '\0';
return chr_output;
}
It's perfectly ok to return malloc'd buffers from functions in C, as long as you document the fact that they do. Lots of libraries do that, even though no function in the standard library does.
If you can compute (a not too pessimistic upper bound on) the number of characters that need to be written to the buffer cheaply, you can offer a function that does that and let the user call it.
It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:
/*
* Decodes the uri-encoded string encoded into buf of length len (including NUL).
* Returns the number of characters needed. If that number is greater than len,
* nothing is written and you should try again with a larger buffer.
*/
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
size_t space_needed = 0;
while (decoding_needs_to_be_done()) {
// decode characters, but only write them to buf
// if it wouldn't overflow;
// increment space_needed regardless
}
return space_needed;
}
Now the caller is responsible for the allocation, and would do something like
size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);
len = net_uri_to_text(input, result, len);
if (len > SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH) {
// try again with a buffer of the size the function asked for
result = xrealloc(result, len);
net_uri_to_text(input, result, len);
}
(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
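To make the caller-supplied-buffer idea concrete, here is one way the decoding loop could be filled in. It is a sketch under assumptions: the name uri_decode_into is invented, malformed escapes are simply copied through, and unlike the comment above it writes a truncated (but NUL-terminated) result when the buffer is too small rather than writing nothing.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical variant of the buffer-filling interface described above.
   Decodes %XX escapes from `encoded` into buf (capacity len, including NUL).
   Returns the number of bytes required, including the NUL. If the return
   value is greater than len, the output was truncated. */
size_t uri_decode_into(const char *encoded, char *buf, size_t len)
{
    size_t needed = 0;

    while (*encoded) {
        char c;
        if (encoded[0] == '%' &&
            isxdigit((unsigned char)encoded[1]) &&
            isxdigit((unsigned char)encoded[2])) {
            char hex[3] = { encoded[1], encoded[2], '\0' };
            c = (char)strtoul(hex, NULL, 16);
            encoded += 3;
        } else {
            c = *encoded++;       /* malformed escapes pass through as-is */
        }
        if (needed + 1 < len)     /* keep one byte free for the NUL */
            buf[needed] = c;
        needed++;
    }
    if (len > 0)
        buf[needed < len ? needed : len - 1] = '\0';
    return needed + 1;
}
```

A caller can compare the return value with its buffer size and, if it is larger, reallocate to exactly that many bytes and call again.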
The thing is that C is low-level enough to force the programmer to get her memory management right. In particular, there's nothing wrong with returning a malloc()ated string. It's a common idiom to return malloc()ated objects and have the caller free() them.
And anyways, if you don't like this approach, you can always take a pointer to the string and modify it from inside the function (after the last use, it will still need to be free()d, though).
One thing, however, that I don't think is necessary is explicitly shrinking the string. If the new string is shorter than the old one, there's obviously enough room for it in the memory chunk of the old string, so you don't need to realloc().
(Apart from the fact that you forgot to allocate one extra byte for the terminating NUL character, of course...)
And, as always, you can just return a different pointer each time the function is called, and you don't even need to call realloc() at all.
If you accept one last piece of good advice: it's advisable to const-qualify your input strings, so the caller can ensure that you don't modify them. Using this approach, you can safely call the function on string literals, for example.
All in all, I'd rewrite your function like this:
char *unescape(const char *s)
{
    size_t l = strlen(s);
    char *p = malloc(l + 1), *r = p;
    if (p == NULL)
        return NULL;
    while (*s) {
        if (*s == '%' && s[1] && s[2]) { // don't read past a truncated escape
            char buf[3] = { s[1], s[2], 0 };
            *p++ = (char)strtol(buf, NULL, 16); // yes, I prefer this over scanf()
            s += 3;
        } else {
            *p++ = *s++;
        }
    }
    *p = 0;
    return r;
}
And call it as follows:
int main()
{
const char *in = "testing123%5a%5b%5cabc";
char *out = unescape(in);
printf("%s\n", out);
free(out);
return 0;
}
It's perfectly OK to return newly-malloc-ed (and possibly internally realloced) values from functions, you just need to document that you are doing so (as you do here).
Other obvious items:
Instead of int int_length you might want to use size_t. This is "an unsigned type" (usually unsigned int or unsigned long) that is the appropriate type for lengths of strings and arguments to malloc.
You need to allocate n+1 bytes initially, where n is the length of the string, as strlen does not include the terminating 0 byte.
You should check for malloc failing (returning NULL). If your function will pass the failure on, document that in the function-description comment.
sscanf is pretty heavy-weight for converting the two hex bytes. Not wrong, except that you're not checking whether the conversion succeeds (what if the input is malformed? you can of course decide that this is the caller's problem but in general you might want to handle that). You can use isxdigit from <ctype.h> to check for hexadecimal digits, and/or strtoul to do the conversion.
Rather than doing one realloc for every % conversion, you might want to do a final "shrink realloc" if desirable. Note that if you allocate (say) 50 bytes for a string and find it requires only 49 including the final 0 byte, it may not be worth doing a realloc after all.
I would approach the problem in a slightly different way. Personally, I would split your function in two. The first function to calculate the size you need to malloc. The second would write the output string to the given pointer (which has been allocated outside of the function). That saves several calls to realloc, and will keep the complexity the same. A possible function to find the size of the new string is:
size_t getNewSize (const char *string) {
    size_t size = 0, percent = 0;
    for (const char *i = string; *i != '\0'; i++, size++) {
        if (*i == '%')
            percent++;
    }
    /* length of the decoded text; add 1 for the terminating NUL when allocating */
    return size - percent * 2;
}
However, as mentioned in other answers there is no problem in returning a malloc'ed buffer as long as you document it!
In addition to what was already mentioned in the other postings, you should also document the fact that the string is reallocated. If your code is called with a static string or a string allocated with alloca, you may not reallocate it.
I think you are right to be concerned about splitting up mallocs and frees. As a rule, whatever makes it, owns it and should free it.
In this case, where the strings are relatively small, one good procedure is to make the string buffer larger than any possible string it could contain. For example, URLs have a de facto limit of about 2000 characters, so if you malloc 10000 characters you can store any possible URL.
Another trick is to store both the length and capacity of the string at its front, so that *(int *)mystring == length of string and *(int *)(mystring + 4) == capacity of string (assuming a 4-byte int). Thus, the string itself only starts at the 8th position, mystring + 8. By doing this you can pass around a single pointer to a string and always know how long it is and how much memory capacity the string has. You can make macros that automatically generate these offsets and make "pretty code".
The value of using buffers this way is you do not need to do a reallocation. The new value overwrites the old value and you update the length at the beginning of the string.
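The same idea can be expressed with a struct header instead of hand-computed offsets, which avoids the casts and the dependence on the size of int. The sketch below is illustrative only; the names LenStr, lenstr_new and lenstr_set are invented for it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a small header followed directly by the characters. */
typedef struct {
    size_t length;    /* characters in use, not counting the NUL */
    size_t capacity;  /* bytes available in data[], including the NUL */
    char data[];      /* C99 flexible array member */
} LenStr;

/* Allocates a string with room for `capacity` bytes; NULL on failure. */
LenStr *lenstr_new(size_t capacity)
{
    LenStr *s;
    if (capacity == 0)
        return NULL;
    s = malloc(sizeof *s + capacity);
    if (s == NULL)
        return NULL;
    s->length = 0;
    s->capacity = capacity;
    s->data[0] = '\0';
    return s;
}

/* Overwrites the stored value in place, without reallocating.
   Returns 0 on success, -1 if text does not fit. */
int lenstr_set(LenStr *s, const char *text)
{
    size_t len = strlen(text);
    if (len >= s->capacity)
        return -1;
    memcpy(s->data, text, len + 1);
    s->length = len;
    return 0;
}
```

The whole object is released with a single free(), and s->data can be passed to any function that expects an ordinary C string.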
I have a string pointer like below,
char *str = "This is cool stuff";
Now, I've references to this string pointer like below,
char* start = str + 1;
char* end = str + 6;
So, start and end are pointing to different locations of *str. How can I copy the string chars falls between start and end into a new string pointer. Any existing C++/C function is preferable.
Just create a new buffer called dest and use strncpy
char dest[end-start+1];
strncpy(dest,start,end-start);
dest[end-start] = '\0';
Use STL std::string:
#include <string>
const char *str = "This is cool stuff";
std::string part( str + 1, str + 6 );
This uses iterator range constructor, so the part of the C-string does not have to be zero-terminated.
It's best to do this with memcpy(), and terminate the result yourself. The standard strncpy() function has very strange semantics.
If you really want a "new string pointer", and be a bit safe with regard to lengths and static buffers, you need to dynamically allocate the new string:
char * ranged_copy(const char *start, const char *end)
{
    char *s;
    s = malloc(end - start + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, start, end - start);
    s[end - start] = 0;
    return s;
}
If you want to do this with C++ STL:
#include <string>
...
std::string cppStr (str, 1, 5); // copy 5 characters starting at index 1 of str
const char *newStr = cppStr.c_str(); // pointer into cppStr; valid only while cppStr is alive and unmodified
char *newChar = new char[end - start + 1];
char *p = newChar;
while (start < end)
    *p++ = *start++;
*p = '\0';
This is one of the rare cases when function strncpy can be used. Just calculate the number of characters you need to copy and specify that exact amount in the strncpy. Remember that strncpy will not zero-terminate the result in this case, so you'll have to do it yourself (which, BTW, means that it makes more sense to use memcpy instead of the virtually useless strncpy).
And please, do yourself a favor, start using const char * pointers with string literals.
Assuming that end follows the idiomatic semantics of pointing just past the last item you want copied (STL semantics are a useful idiom even if we're dealing with straight C) and that your destination buffer is known to have enough space:
memcpy( buf, start, end-start);
buf[end-start] = '\0';
I'd wrap this in a sub-string function that also took the destination buffer size as a parameter so it could perform a check and truncate the result or return an error to prevent overruns.
I'd avoid using strncpy() because too many programmers forget about the fact that it might not terminate the destination string, so the second line might be mistakenly dropped at some point by someone believing it unnecessary. That's less likely if memcpy() were used. (In general, just say no to using strncpy())
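The wrapper suggested above, a sub-string function that also takes the destination buffer size, might look like the following sketch. The name copy_range and the choice to return an error rather than truncate are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Copies the characters in [start, end) into buf, which has room for
   bufsize bytes, and NUL-terminates the result.
   Returns 0 on success, -1 if the range is invalid or does not fit. */
int copy_range(char *buf, size_t bufsize, const char *start, const char *end)
{
    size_t n;

    if (end < start)
        return -1;
    n = (size_t)(end - start);
    if (n >= bufsize)          /* need n characters plus the NUL */
        return -1;
    memcpy(buf, start, n);
    buf[n] = '\0';
    return 0;
}
```

For the question's pointers, copy_range(buf, sizeof buf, str + 1, str + 6) stores the five characters starting at the second character of str.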
I have a string:
char * someString;
If I want the first five letters of this string and want to set it to otherString, how would I do it?
#include <string.h>
...
char otherString[6]; // note 6, not 5, there's one there for the null terminator
...
strncpy(otherString, someString, 5);
otherString[5] = '\0'; // place the null terminator
Generalized:
char* subString (const char* input, int offset, int len, char* dest)
{
    int input_len = strlen (input);
    if (offset < 0 || len < 0 || offset + len > input_len)
    {
        return NULL;
    }
    strncpy (dest, input + offset, len);
    dest[len] = '\0'; // strncpy does not add the terminator here
    return dest;
}
char dest[80];
const char* source = "hello world";
if (subString (source, 0, 5, dest))
{
printf ("%s\n", dest);
}
char* someString = "abcdedgh";
char* otherString = 0;
otherString = (char*)malloc(5+1);
memcpy(otherString,someString,5);
otherString[5] = 0;
UPDATE:
Tip: A good way to understand definitions is called the right-left rule (some links at the end):
Start reading from identifier and say aloud => "someString is..."
Now go to right of someString (statement has ended with a semicolon, nothing to say).
Now go left of identifier (* is encountered) => so say "...a pointer to...".
Now go to left of "*" (the keyword char is found) => say "..char".
Done!
So char* someString; => "someString is a pointer to char".
Since a pointer simply points to a certain memory address, it can also be used as the "starting point" for an "array" of characters.
That works with anything .. give it a go:
char* s[2]; //=> s is an array of two pointers to char
char** someThing; //=> someThing is a pointer to a pointer to char.
//Note: We look in the brackets first, and then move outward
char (* s)[2]; //=> s is a pointer to an array of two char
Some links:
How to interpret complex C/C++ declarations and
How To Read C Declarations
You'll need to allocate memory for the new string otherString. In general for a substring of length n, something like this may work for you (don't forget to do bounds checking...)
char *subString(char *someString, int n)
{
char *new = malloc(sizeof(char)*n+1);
strncpy(new, someString, n);
new[n] = '\0';
return new;
}
This will return a substring of the first n characters of someString. Make sure you free the memory when you are done with it using free().
You can use snprintf to get a substring of a char array with precision:
#include <stdio.h>
int main()
{
const char source[] = "This is a string array";
char dest[17];
// get first 16 characters using precision
snprintf(dest, sizeof(dest), "%.16s", source);
// print substring
puts(dest);
} // end main
Output:
This is a string
Note:
For further information see printf man page.
You can treat C strings like pointers. So when you declare:
char str[10];
str can be used as a pointer. So if you want to copy just a portion of the string you can use:
char str1[] = "This is a simple string.";
char str2[6];
strncpy(str2, str1 + 10, 5);
str2[5] = '\0';
This will copy 5 characters from the str1 array into str2, starting at the 11th element, and terminate the result.
I had not seen this post until now. The present collection of answers forms an orgy of bad advice and compiler errors; only the few recommending memcpy are correct. Basically the answer to the question is:
otherString = allocated_memory; // statically or dynamically, at least 6 bytes
memcpy(otherString, someString, 5);
otherString[5] = '\0';
Assuming we know that someString is at least 5 characters long, this is the correct answer, period. memcpy is faster and safer than strncpy and there is no confusion about whether memcpy null terminates the string or not - it doesn't, so we definitely have to append the null termination manually.
The main problem here is that strncpy is a very dangerous function that should not be used for any purpose. The function was never intended to be used for null terminated strings and its presence in the C standard is a mistake. See "Is strcpy dangerous and what should be used instead?"; I will quote some relevant parts from that post for convenience:
Somewhere at the time when Microsoft flagged strcpy as obsolete and dangerous, some other misguided rumour started. This nasty rumour said that strncpy should be used as a safer version of strcpy. Since it takes the size as parameter and it's already part of the C standard lib, so it's portable. This seemed very convenient - spread the word, forget about non-standard strcpy_s, lets use strncpy! No, this is not a good idea...
Looking at the history of strncpy, it goes back to the very earliest days of Unix, where several string formats co-existed. Something called "fixed width strings" existed - they were not null terminated but came with a fixed size stored together with the string. One of the things Dennis Ritchie (the inventor of the C language) wished to avoid when creating C, was to store the size together with arrays [The Development of the C Language, Dennis M. Ritchie]. Likely in the same spirit as this, the "fixed width strings" were getting phased out over time, in favour for null terminated ones.
The function used to copy these old fixed width strings was named strncpy. This is the sole purpose that it was created for. It has no relation to strcpy. In particular it was never intended to be some more secure version - computer program security wasn't even invented when these functions were made.
Somehow strncpy still made it into the first C standard in 1989. A whole lot of highly questionable functions did - the reason was always backwards compatibility. We can also read the story about strncpy in the C99 rationale 7.21.2.4:
The strncpy function
strncpy was initially introduced into the C library to deal with fixed-length name fields in
structures such as directory entries. Such fields are not used in the same way as strings: the
trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter
names to null assures efficient field-wise comparisons. strncpy is not by origin a “bounded
strcpy,” and the Committee preferred to recognize existing practice rather than alter the function
to better suit it to such use.
The Codidact link also contains some examples showing how strncpy will fail to terminate a copied string.
I think this is an easy way. I didn't know how to return the result directly, so the caller passes in a destination buffer (with room for len + 1 characters) and the function returns it.
char* substr(char *buff, uint8_t start, uint8_t len, char* dest)
{
strncpy(dest, buff+start, len);
dest[len] = 0;
return dest;
}
strncpy(otherString, someString, 5);
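Don't forget to allocate memory for otherString (at least 6 bytes) and to terminate it with otherString[5] = '\0', since strncpy does not add a terminator when someString has five or more characters.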
#include <stdio.h>
#include <string.h>
int main ()
{
char someString[]="abcdedgh";
char otherString[]="00000";
memcpy (otherString, someString, 5);
printf ("someString: %s\notherString: %s\n", someString, otherString);
return 0;
}
You will not need stdio.h if you don't use the printf statement and putting constants in all but the smallest programs is bad form and should be avoided.
Doing it all in two fell swoops:
char *otherString = strncpy((char*)malloc(6), someString, 5); // note: the malloc result is not checked
otherString[5] = 0;
char largeSrt[] = "123456789-123"; // original string
char * substr;
substr = strchr(largeSrt, '-'); // we save the new string "-123"
int substringLength = strlen(largeSrt) - strlen(substr); // 13-4=9 (bigger string size) - (new string size)
char *newStr = malloc(sizeof(char) * substringLength + 1);// keep memory free to new string
strncpy(newStr, largeSrt, substringLength); // copy only 9 characters
newStr[substringLength] = '\0'; // close the new string with final character
printf("newStr=%s\n", newStr);
free(newStr); // you free the memory
Try this code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char* substr(const char *src, unsigned int start, unsigned int end);
int main(void)
{
char *text = "The test string is here";
char *subtext = substr(text,9,14);
printf("The original string is: %s\n",text);
printf("Substring is: %s",subtext);
return 0;
}
char* substr(const char *src, unsigned int start, unsigned int end)
{
unsigned int subtext_len = end-start+2;
char *subtext = malloc(sizeof(char)*subtext_len);
strncpy(subtext,&src[start],subtext_len-1);
subtext[subtext_len-1] = '\0';
return subtext;
}
Okay, so I have the following code, which appends one string to another in C#. Note that this is just an example, so suggesting alternative string concatenation methods in C# is not necessary; this is just to simplify the example.
string Data = "";
Data +="\n\nHTTP/1.1 " + Status_code;
Data += "\nContent-Type: " + Content_Type;
Data += "\nServer: PT06";
Data += "\nContent-Length: " + Content_Lengt;
Data += "\nDate: " + Date;
Data += "\n" + HTML;
Now I'd like to do the exact same thing in C and I'm trying to do this the following way
time_t rawtime;
time ( &rawtime );
char *message = "\n\nHTTP/1.1 ";
message = strcat(message, Status_code);
message = strcat(message, "\nContent-Type: ");
message = strcat(message, Content_Type);
message = strcat(message, "\nServer: PT06");
message = strcat(message, "\nContent-Length: ");
message = strcat(message, Content_Lengt);
message = strcat(message, "\nDate: ");
message = strcat(message, ctime(&rawtime));
message = strcat(message, "\n");
message = strcat(message, HTML);
Now, this gives me a segmentation fault. I know why: I access and read memory that I shouldn't. But the question is, how do I solve it? Could I use string.h and just do it the same way that I did in C#?
Change
char *message = "\n\nHTTP/1.1 ";
to
char message[1024];
strcpy(message,"\n\nHTTP/1.1 ");
and you should be ok, up to a total message length of 1023.
Edit: (as per mjy's comment). Using strcat in this fashion is a great way of getting buffer overflows. You could readily write a small function that checks the size of the buffer and length of incoming string addition to overcome this, or use realloc on a dynamic buffer. IMO, the onus is on the programmer to check correct buffer sizes where they are used, as with sprintfs and other C strings functions. I assume that C is being used over C++ for performance reasons, and hence STL is not an option.
Edit: As per request from Filip's comment, a simple strcat implementation based on a fixed size char buffer:
char buffer[MAXSIZE] = "";
int mystrcat(const char *addition)
{
    /* the combined text plus the terminator must fit in MAXSIZE bytes */
    if (strlen(buffer) + strlen(addition) + sizeof(char) > MAXSIZE)
        return(FAILED);
    strcat(buffer,addition);
    return(OK);
}
Using dynamic allocation:
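char *buffer = NULL;
int mystrcat(const char *addition)
{
    /* buffer starts out NULL, and strlen(NULL) is undefined, so treat it as empty */
    size_t old_len = buffer ? strlen(buffer) : 0;
    /* use a temporary so the old buffer is not leaked if realloc fails */
    char *tmp = realloc(buffer, old_len + strlen(addition) + sizeof(char));
    if (!tmp)
        return(FAILED);
    buffer = tmp;
    /* copy to the known end; also correct when the buffer was just created */
    strcpy(buffer + old_len, addition);
    return(OK);
}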
In this case you have to free your buffer manually when you are finished with it. (Handled by destructors in C++ equivalents)
Addendum (Pax):
Okay, since you didn't actually explain why you had to create message[1024], here it is.
With char *x = "hello", the actual bytes ('h','e','l','l','o',0) (null on the end) are stored in an area of memory separate from the variables (and quite possibly read-only) and the variable x is set to point to it. After the null, there's probably something else very important. So you can't append to that at all.
With char x[1024]; strcpy(x,"hello");, you first allocate 1K of memory which is totally dedicated to x. Then you copy "hello" into it, and still leave quite a bit of space at the end for appending more strings. You won't get into trouble until you append more than the 1K-odd allowed.
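To make the difference concrete, here is a minimal sketch. The function name build_greeting and its parameters are made up for illustration; the point is only that the literal is read from and never written to, while the appending happens in storage the caller owns.

```c
#include <string.h>

/* Hypothetical helper: builds "hello world" in the caller's buffer.
   Returns 0 on success, -1 if the buffer is too small. */
int build_greeting(char *out, size_t out_size)
{
    const char *literal = "hello";  /* string literal: must never be written to */
    const char *suffix  = " world";

    /* both strings plus the terminating null must fit */
    if (strlen(literal) + strlen(suffix) + 1 > out_size)
        return -1;

    strcpy(out, literal);  /* copy the literal into storage we own */
    strcat(out, suffix);   /* appending is fine now: out is writable and big enough */
    return 0;
}
```

Called with char x[1024]; build_greeting(x, sizeof x); this leaves "hello world" in x.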
End addendum (Pax):
I wonder why no one mentioned snprintf() from stdio.h yet. That's the C way to output multiple values and you won't even have to convert your primitives to strings beforehand.
The following example uses a stack allocated fixed-sized buffer. Otherwise, you have to malloc() the buffer (and store its size), which would make it possible to realloc() on overflow...
char buffer[1024];
int len = snprintf(buffer, sizeof(buffer), "%s %i", "a string", 5);
if(len < 0 || (size_t)len >= sizeof(buffer))
{
// buffer too small or error
}
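Applied to the HTTP response from the question, this could look roughly like the sketch below. The function build_response and its parameter names are my own invention, and I'm assuming the content length is available as a number. Also note that ctime() already ends its result with a newline, so you may want to trim that before passing the date in.

```c
#include <stdio.h>

/* Hypothetical helper: formats the whole response into out.
   Returns the length written, or -1 on error or if out is too small. */
int build_response(char *out, size_t out_size,
                   const char *status, const char *content_type,
                   size_t content_length, const char *date,
                   const char *html)
{
    /* adjacent string literals are joined by the compiler into one format */
    int len = snprintf(out, out_size,
                       "\n\nHTTP/1.1 %s"
                       "\nContent-Type: %s"
                       "\nServer: PT06"
                       "\nContent-Length: %zu"
                       "\nDate: %s"
                       "\n%s",
                       status, content_type, content_length, date, html);

    /* snprintf returns the length it wanted to write, so >= means truncated */
    if (len < 0 || (size_t)len >= out_size)
        return -1;
    return len;
}
```

One call replaces the whole chain of strcat calls, and a too-small buffer is reported instead of overflowing.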
Edit: You might also consider using the asprintf() function. It's a widely available GNU extension and part of TR 24731-2 (which means it might make it into the next C standard). The example from above would read
char * buffer;
if(asprintf(&buffer, "%s %i", "a string", 5) < 0)
{
// (allocation?) error
}
Remember to free() the buffer when done using it!
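If asprintf() is not available on your platform, something similar can be sketched with standard C99 only: call vsnprintf() once with a NULL buffer and size 0 to measure, then allocate and format for real. The name format_alloc is hypothetical, not a library function.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'ed formatted string,
   or NULL on a formatting or allocation error. The caller must free() it. */
char *format_alloc(const char *fmt, ...)
{
    va_list args, args_copy;
    va_start(args, fmt);
    va_copy(args_copy, args);          /* the argument list is consumed twice */

    /* C99: with a NULL buffer and size 0, vsnprintf only reports the length */
    int len = vsnprintf(NULL, 0, fmt, args);
    va_end(args);
    if (len < 0) {
        va_end(args_copy);
        return NULL;
    }

    char *out = malloc((size_t)len + 1);
    if (out != NULL)
        vsnprintf(out, (size_t)len + 1, fmt, args_copy);
    va_end(args_copy);
    return out;
}
```

With this, char *s = format_alloc("%s %i", "a string", 5); plays the same role as the asprintf() call above.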
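Start by using the safer strncat function. In general, prefer the bounded 'n' functions, which will not write more than a given number of characters. Be aware that the third argument of strncat is the maximum number of characters to append, not the total size of the destination buffer, so you still have to work out how much room is left.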
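A minimal sketch of what that looks like in practice, assuming dst already holds a null-terminated string (append_bounded is a made-up name):

```c
#include <string.h>

/* Hypothetical helper: appends as much of src as fits into dst,
   whose total capacity is dst_size. dst stays null-terminated.
   Returns 1 if all of src fit, 0 if it was truncated. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    if (used >= dst_size)
        return 0;                      /* no room at all, not even for the null */

    size_t room = dst_size - used - 1; /* keep one byte for the terminating null */
    strncat(dst, src, room);           /* appends at most room chars plus a null */
    return strlen(src) <= room;
}
```

The caller can then decide whether a truncated result is acceptable or should be treated as an error.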
In C you need to take care of string sizes yourself. So you need to know how big the resulting string will be and allow for it. If you know the sizes of all the strings being joined, you should create a buffer big enough to hold the resulting string.
message points to a string literal, which you must not write to (modifying it is undefined behavior), yet that's exactly where strcat is writing. You need a sufficiently large writable buffer, for example one obtained from malloc().
As said previously, you have to write to a sufficiently large buffer. Unfortunately, doing so is a lot of extra work. Most C applications that deal with strings use a dynamically resizable string buffer for doing concatenations.
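For illustration, here is a rough sketch of such a resizable buffer. The StrBuf type and the strbuf_ functions are names I made up; real libraries differ in the details. It keeps the length alongside the data, so appending never has to rescan the string, and it doubles the capacity when it runs out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; start with all fields zeroed. */
typedef struct {
    char  *data;  /* null-terminated contents, or NULL while empty */
    size_t len;   /* length excluding the terminating null */
    size_t cap;   /* allocated size in bytes */
} StrBuf;

/* Appends s, growing the buffer as needed. Returns 0 on success, -1 on failure. */
int strbuf_append(StrBuf *sb, const char *s)
{
    size_t add  = strlen(s);
    size_t need = sb->len + add + 1;

    if (need > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap : 64;
        while (new_cap < need)
            new_cap *= 2;                     /* doubling keeps reallocations rare */
        char *tmp = realloc(sb->data, new_cap);
        if (tmp == NULL)
            return -1;                        /* old contents remain valid */
        sb->data = tmp;
        sb->cap  = new_cap;
    }
    memcpy(sb->data + sb->len, s, add + 1);   /* copies the null as well */
    sb->len += add;
    return 0;
}

/* Releases the storage and resets the buffer to empty. */
void strbuf_free(StrBuf *sb)
{
    free(sb->data);
    sb->data = NULL;
    sb->len = sb->cap = 0;
}
```

Usage would be StrBuf sb = {0}; followed by one strbuf_append(&sb, ...) per piece, and strbuf_free(&sb) at the end.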
glib includes an implementation of this, glib strings, which I recommend using for any application that uses strings heavily. It makes managing the memory easier, and prevents buffer overflows.
I have not seen any mention of the strlcpy and strlcat functions, which are similar to the 'n' functions except that they also take the trailing 0 into account. Both take a third argument giving the total size of the output buffer, and on platforms that provide them they are declared in string.h.
example:
char blah[]="blah";
char buffer[1024];
strlcpy(buffer,"herro!!!",sizeof(buffer));
strlcat(buffer,blah,sizeof(buffer));
printf("%s\n",buffer);
Will output "herro!!!blah"
char blah[]="blah";
char buffer[10];
strlcpy(buffer,"herro!!!",sizeof(buffer));
strlcat(buffer,blah,sizeof(buffer));
printf("%s\n",buffer);
will output "herro!!!b" due to the limited size of buffer[], with no segfaulting. ^^
The only problem is that not all platforms include them in their libc. Linux with glibc lacked them for a long time (they were only added in glibc 2.38), whereas almost all BSD variants do have them.
In that case, code for both functions can be found here and easily added: strlcpy, strlcat, the rest of string.h
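If you just want a rough idea of what these functions do, here is a simplified sketch. This is not the BSD source, only my approximation of the documented behavior, and I've used the names my_strlcpy and my_strlcat so they can't clash with a libc that already provides the real ones.

```c
#include <string.h>

/* Copies src into dst (total size bytes), always null-terminating when size > 0.
   Returns strlen(src), so a result >= size means truncation occurred. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);
    if (size > 0) {
        size_t n = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}

/* Appends src to dst, where size is the total size of dst.
   Returns the length it tried to create; >= size means truncation. */
size_t my_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len = 0;
    while (dst_len < size && dst[dst_len] != '\0')
        dst_len++;                       /* never scan past the buffer */
    if (dst_len == size)
        return size + strlen(src);       /* dst was not terminated within size */
    return dst_len + my_strlcpy(dst + dst_len, src, size - dst_len);
}
```

The return value is what makes these convenient: comparing it against the buffer size tells you whether anything was cut off.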
The safe way to do this in classic C style is:
char *strconcat(const char *s1, const char *s2)
{
    size_t old_size;
    char *t;
    old_size = strlen(s1);
    /* cannot use realloc() on initial const char* */
    t = malloc(old_size + strlen(s2) + 1);
    if (t == NULL)
        return NULL; /* allocation failed; the caller has to check for this */
    strcpy(t, s1);
    strcpy(t + old_size, s2);
    return t;
}
...
char *message = "\n\nHTTP/1.1 ";
message = strconcat (message, Status_code);
message = strconcat (message, "\nContent-Type: ");
Now you can say a lot of bad things about it: it's inefficient, it fragments your memory, it's ugly ... but it's more or less what any language with a string concatenation operator and C type (zero-terminated) strings will do (except that most of these languages will have garbage collection built-in).