I have:
char *var1 = "foo";
char *var2 = "bar";
and I want to create this string: "foo\0bar\0"
How can I do that? I tried this but of course it does not work:
sprintf(buffer, "%s\0%s", var1, var2);
You have two problems here:
1. Putting \0 (aka NUL) in the middle of a string is legal, but every C string API will treat the string as ending there. Every C-style string ends with NUL, and an API cannot tell a NUL you added from the "real" terminator: it has to assume the first NUL it meets is the end, because reading further could read uninitialized memory or run past the end of the array, invoking undefined behavior. So even if you succeed, C APIs that work with strings will never see bar. You'd have to keep track of how long the "real" string is, and use non-string APIs to work with it (sprintf does return how many characters it printed, so you're not operating completely blind).
2. Putting the \0 in the format string itself means that sprintf thinks the format string ends there; from its point of view, "%s\0%s" is exactly the same as "%s", and it literally can't tell them apart.
You can work around problem number 2 by inserting the NUL with a format code that inserts a single char (where NUL is not special), e.g.:
sprintf(buffer, "%s%c%s", var1, '\0', var2);
but even when you're done, doing printf("%s", buffer); will only show foo (because the embedded NUL is where scanning stops). The data is there, and can be accessed, just not with C string APIs:
#include <stdio.h>
int main(int argc, char **argv) {
char *var1 = "foo";
char *var2 = "bar";
char buffer[10] = "0123456789";
sprintf(buffer, "%s%c%s", var1, '\0', var2);
for (size_t i = 0; i < sizeof(buffer); ++i) {
printf("'%c': %hhd\n", buffer[i], buffer[i]);
}
return 0;
}
Try it online!
which outputs:
'f': 102
'o': 111
'o': 111
'': 0
'b': 98
'a': 97
'r': 114
'': 0
'8': 56
'9': 57
The empty quotes contain a NUL byte if you look at the TIO link, but lo and behold, my browser stops the copy/paste at the NUL byte (yay C string APIs), so I can't actually copy it here.
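As a minimal sketch of "keep track of the real length", the return value of snprintf can be used as the byte count. The helper name pack_two and its return convention are assumptions made for this illustration, not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: writes first, a NUL byte, then second (plus the usual
   terminator) into buf. Returns the number of payload bytes, not counting
   the final terminator, or 0 if the result did not fit in cap bytes. */
size_t pack_two(char *buf, size_t cap, const char *first, const char *second)
{
    /* %c with '\0' emits a NUL byte without ending the format string. */
    int n = snprintf(buf, cap, "%s%c%s", first, '\0', second);
    if (n < 0 || (size_t)n >= cap)
        return 0;   /* encoding error or truncated output */
    return (size_t)n;
}
```

The caller would then pass the returned length to length-aware routines such as fwrite or memcpy instead of relying on strlen.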
This is a fairly common problem when dealing with binary data.
If you want to manipulate binary data, don't use the string tools of strcat, strcpy, etc., because they use null-termination to determine the length of the string.
Instead use the memcpy library routine that requires you to specify a length. Keep track of every binary string as a pointer and a length.
char *var1="foo";
unsigned len1 = 3;
char *var2="bar";
unsigned len2 = 3;
/* write var1 and var2 to buffer with null-separation */
/* assuming buffer is large enough */
char buffer[10];
unsigned len_buffer = 0;
/* write var1 to start of buffer */
memcpy(buffer, var1, len1);
len_buffer = len1;
/* append null */
buffer[len_buffer++] = '\0';
/* append var2 */
memcpy(buffer+len_buffer, var2, len2);
len_buffer += len2;
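Reading such a buffer back needs the same pointer-plus-length discipline. The sketch below counts the NUL-terminated fields in a packed buffer; count_fields is a hypothetical name, and it assumes each field, including the last, ends with a NUL inside the given length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the NUL-terminated fields packed into the
   first len bytes of buf. A trailing run of bytes with no NUL is ignored. */
size_t count_fields(const char *buf, size_t len)
{
    size_t count = 0;
    size_t pos = 0;
    while (pos < len) {
        /* memchr never looks past the bytes we say are valid. */
        const char *end = memchr(buf + pos, '\0', len - pos);
        if (end == NULL)
            break;                       /* unterminated tail: stop here */
        pos = (size_t)(end - buf) + 1;   /* skip the field and its NUL */
        ++count;
    }
    return count;
}
```

Using memchr with an explicit remaining length, rather than strlen, keeps the scan inside the buffer even if the final terminator is missing.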
Not particularly fast or short, but this should do the job:
strcpy(buffer, var1);
/* write var2 just past var1's terminator */
strcpy(buffer + strlen(var1) + 1, var2);
I'm trying to use sprintf() to put a string "inside itself", so I can change it to have an integer prefix. I was testing this on a character array of length 12 with "Hello World" inside it already.
The basic premise is that I want a prefix that denotes the number of words within a string. So I copy 11 characters into a character array of length 12.
Then I try to put the integer followed by the string itself by using "%i%s" in the function. To get past the integer (I don't just use myStr as the argument for %s), I make sure to use myStr + snprintf(NULL, 0, "%i", wordCount), which should be myStr + characters taken up by the integer.
The problem I'm having is that it eats the 'H' when I do this and prints "2ello World" instead of having the '2' right beside the "Hello World".
So far I've tried different options for getting "past the integer" in the string when I try to copy it inside itself, but nothing really seems to be the right case, as it either comes out as an empty string or just the integer prefix itself '222222222222' copied throughout the entire array.
int main() {
char myStr[12];
strcpy(myStr, "Hello World");//11 Characters in length
int wordCount = 2;
//Put the integer wordCount followed by the string myStr (past whatever amount of characters the integer would take up) inside of myStr
sprintf(myStr, "%i%s", wordCount, myStr + snprintf(NULL, 0, "%i", wordCount));
printf("\nChanged myStr '%s'\n", myStr);//Prints '2ello World'
return 0;
}
First, to insert a one-digit prefix into a string “Hello World”, you need a buffer of 13 characters—one for the prefix, eleven for the characters in “Hello World”, and one for the terminating null character.
Second, you should not pass a buffer to snprintf as both the output buffer and an input string. Its behavior is not defined by the C standard when objects passed to it overlap.
Below is a program that shows you how to insert a prefix by moving the string with memmove. This is largely tutorial, as it is not generally a good way to manipulate strings. For short strings, where space is not an issue, most programmers would simply print the desired string into a temporary buffer, avoiding overlap issues.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* Insert a decimal numeral for Prefix into the beginning of String.
Length specifies the total number of bytes available at String.
*/
static void InsertPrefix(char *String, size_t Length, int Prefix)
{
// Find out how many characters the numeral needs.
int CharactersNeeded = snprintf(NULL, 0, "%i", Prefix);
// Find the current string length.
size_t Current = strlen(String);
/* Test whether there is enough space for the prefix, the current string,
and the terminating null character.
*/
if (Length < CharactersNeeded + Current + 1)
{
fprintf(stderr,
"Error, not enough space in string to insert prefix.\n");
exit(EXIT_FAILURE);
}
// Move the string to make room for the prefix.
memmove(String + CharactersNeeded, String, Current + 1);
/* Remember the first character, because snprintf will overwrite it with a
null character.
*/
char Temporary = String[0];
// Write the prefix, including a terminating null character.
snprintf(String, CharactersNeeded + 1, "%i", Prefix);
// Restore the first character of the original string.
String[CharactersNeeded] = Temporary;
}
int main(void)
{
char MyString[13] = "Hello World";
InsertPrefix(MyString, sizeof MyString, 2);
printf("Result = \"%s\".\n", MyString);
}
The best way to deal with this is to create another buffer to output to, and then if you really need to copy back to the source string then copy it back once the new copy is created.
There are other ways to "optimise" this if you really needed to, like putting your source string into the middle of the buffer so you can append and change the string pointer for the source (not recommended, unless you are running on an embedded target with limited RAM and the buffer is huge). Remember code is for people to read so best to keep it clean and easy to read.
#include <stdio.h>
#include <string.h>
#define MAX_BUFFER_SIZE 128
int main() {
char srcString[MAX_BUFFER_SIZE];
char destString[MAX_BUFFER_SIZE];
strncpy(srcString, "Hello World", MAX_BUFFER_SIZE);
int wordCount = 2;
snprintf(destString, MAX_BUFFER_SIZE, "%i%s", wordCount, srcString);
printf("Changed string '%s'\n", destString);
// Or if you really want the string put back into srcString then:
strncpy(srcString, destString, MAX_BUFFER_SIZE);
printf("Changed string in source '%s'\n", srcString);
return 0;
}
Notes:
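To guard against buffer overflows, use the bounded functions strncpy and snprintf rather than strcpy and sprintf. Keep in mind that strncpy does not write a terminator when the source fills the whole buffer.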
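The temporary-buffer approach can also be wrapped in a function that reports when the result would not fit. This is only a sketch: prefix_with_count and the 128-byte scratch size are assumptions chosen for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites str (capacity cap bytes) as "<count><str>".
   Returns 0 on success, -1 if the result would not fit; str is left
   untouched on failure. The scratch size of 128 is an arbitrary choice. */
int prefix_with_count(char *str, size_t cap, int count)
{
    char scratch[128];
    /* Format into a separate buffer so input and output never overlap. */
    int n = snprintf(scratch, sizeof scratch, "%i%s", count, str);
    if (n < 0 || (size_t)n >= sizeof scratch || (size_t)n >= cap)
        return -1;
    memcpy(str, scratch, (size_t)n + 1);   /* +1 copies the terminator too */
    return 0;
}
```

With a 13-byte array holding "Hello World" the call succeeds; with the 12-byte array from the question it returns -1 instead of overflowing.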
I'm using the code below to add some "0" chars to my string, but there seems to be a problem and the program crashes. Everything seems logical, so I don't know where the problem is.
#include <stdlib.h>
#include <string.h>
int main()
{
char *Ten; int i=0; Ten = malloc(12);
Ten="1";
for (i=0;i<10;i++)
strcat(Ten,"0");
printf("%s",Ten);
return 0;
}
You allocate memory for Ten with malloc, but then Ten="1"; makes Ten point at a string literal instead (leaking the allocated block). Modifying a string literal is undefined behavior, so the strcat call crashes the program.
To fix this, you can declare Ten as an array instead:
int main()
{
char Ten[12]="1"; int i=0;
for (i=0;i<10;i++)
strcat(Ten,"0");
printf("%s",Ten);
return 0;
}
Note that you need 12 bytes; 11 for the characters and one for the terminating NUL character.
Ten points to a string literal, which you cannot modify. Try an array instead:
char Ten[12] = "1";
for (i=0;i<10;i++)
strcat(Ten,"0");
printf("%s",Ten);
Notice that I created an array of 12 characters, because there must be room for the terminating '\0'.
You actually don't need strcat here; you can just do this:
char *Ten = malloc(12);
if (Ten != NULL)
{
Ten[0] = '1';
for (i = 1 ; i < 11 ; i++)
Ten[i] = '0';
Ten[11] = '\0';
/* Use Ten here, for example printf it. */
printf("%s",Ten);
/* You should release memory. */
free(Ten);
}
or
char *Ten = malloc(12);
if (Ten != NULL)
{
Ten[0] = '1';
memset(Ten + 1, '0', 10);
Ten[11] = '\0';
/* Use Ten here, for example printf it. */
printf("%s",Ten);
/* You should release memory. */
free(Ten);
}
To quote from strcat manual on linux:
The strcat() function appends the src string to the dest string,
overwriting the terminating null byte ('\0') at the end of dest, and
then adds a terminating null byte. The strings may not overlap, and
the dest string must have enough space for the result. If dest is not
large enough, program behavior is unpredictable; buffer overruns are
a favorite avenue for attacking secure programs.
Ten points at the literal "1", which is only long enough to store the original string. You need to allocate a buffer as long as the final desired string, including its terminator.
String literals might be stored in read only section of memory. Any attempt to modify such a literal causes undefined behavior.
To concatenate two strings, the destination must have enough space allocated for the characters to be added and space for '\0'. Change the declaration of Ten to
char Ten[12] = "1";
and it will work.
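The size requirement ('1', the zeros, and the terminator) can be checked before writing anything. A minimal sketch, assuming a hypothetical helper named one_and_zeros:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes '1' followed by `zeros` '0' characters and a
   terminator into dst. Returns 0 on success, -1 if cap bytes is too small. */
int one_and_zeros(char *dst, size_t cap, size_t zeros)
{
    if (cap < 2 || zeros > cap - 2)   /* '1' + zeros + '\0' must fit */
        return -1;
    dst[0] = '1';
    memset(dst + 1, '0', zeros);
    dst[zeros + 1] = '\0';
    return 0;
}
```

For ten zeros this needs a 12-byte destination, matching the char Ten[12] used in the answers above.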
I want to copy X to Y words of a string to the out char * array.
unsigned char * string = "HELLO WORLD!!!" // length 14
unsigned char out[9];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
printf("%s = string\n%s = out\n", string, out);
When looking at the output of out, why is there gibberish after a certain point of my string? I see the string of out as LO WORLD!# . Why are there weird characters appearing after the content I copied? Isn't out supposed to be an array of 9? I expected the output to be
LO WORLD!
In C you need to terminate your string with a 0x00 value so a string of length 9 needs ten bytes to store it with the last set to 0. Otherwise your print statements run off into random data.
unsigned char * string = "HELLO WORLD!!!"; // length 14
unsigned char out[10];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[length] = 0x00;
printf("%s = string\n%s = out\n", string, out);
A minor point, but string literals have type char[N] in C (const char[N] in C++) and decay to char* (const char* in C++), not unsigned char*. Plain char may behave like unsigned char on your implementation, but it doesn't have to.
Furthermore, this is not true:
unsigned char * string = "HELLO WORLD!!!" // length 14
The string actually occupies 15 bytes -- there is an extra, hidden '\0' at the end, called a nul byte, which marks the end of the string. These nul terminators are very important, because if they're not present, then many C library functions which manipulate strings will keep going until they hit a byte with a value equal to '\0' -- and so can end up reading or trampling over bits of memory they shouldn't do. This is called a buffer overrun, and is a classic bug (and exploitable security problem) in C programmes.
In your example, you haven't included this nul terminator in your copied string, so printf() just keeps going until it finds one, hence the gibberish you're seeing. In general, it's a good idea to use C library functions to manipulate C strings where possible, as most of them add the terminator for you. strncpy from string.h is a partial exception: it copies the characters you're after, but when the source is longer than the count you pass (as it is here) it does not write a terminator, so you must still set out[9] yourself.
A 9 character string needs 10 bytes because it must be null ( 0 ) terminated. Try this:
unsigned char out[10]; // make this 10
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[i] = 0; // add this to terminate the string
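A shorter approach is strncpy, but it does not terminate the result when the source is longer than the count, so the terminator is still needed:
strncpy((char *)out, (const char *)string + 3, 9);
out[9] = 0;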
C strings must be null terminated. You only created an array large enough for 8 characters + the null terminator, but you never added the terminator.
So, you need to allocate the length plus 1 and add the terminator.
// initializes all elements to 0
char out[10] = {0};
// alternatively, add it at the end.
out[9] = '\0';
Think of it this way; you're passed a char* which represents a string. How do you know how long it is? How can you read it? Well, in C, a sentinel value is added to the end. This is the null terminator. It is how strings are read in C, and passing around unterminated strings to functions which expect C strings results in undefined behavior.
And then... just use strncpy to copy strings, remembering that it does not add the terminator when it truncates.
If you want to copy 9 characters from your string, you'll need an array of 10 to do that, because a C string needs '\0' as its terminating character. So your code should be rewritten like this:
unsigned char * string = "HELLO WORLD!!!"; // length 14
unsigned char out[10];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[9] = 0;
printf("%s = string\n%s = out\n", string, out);
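The answers above all come down to "copy the bytes, then terminate". One way to package that, as a sketch with a hypothetical helper named copy_range (plain char is used here rather than unsigned char):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies up to count characters of src, starting at
   offset start, into out and always terminates out (when out_size > 0).
   Returns the number of characters copied. */
size_t copy_range(char *out, size_t out_size,
                  const char *src, size_t start, size_t count)
{
    if (out_size == 0)
        return 0;
    size_t src_len = strlen(src);
    if (start > src_len)
        start = src_len;                 /* nothing left to copy */
    if (count > src_len - start)
        count = src_len - start;         /* do not read past src */
    if (count > out_size - 1)
        count = out_size - 1;            /* leave room for '\0' */
    memcpy(out, src + start, count);
    out[count] = '\0';
    return count;
}
```

Calling copy_range(out, sizeof out, "HELLO WORLD!!!", 3, 9) with a 10-byte out yields the expected LO WORLD! with a terminator in the tenth byte.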
char str[50];
char strToCopy[16];
int numberOfBytesToFill = 7; // fills 1st n bytes of the memory area pointed to
char charToFillInAs = '*'; //48: ascii for '0' // '$';
void* targetStartAddress = str;
strncpy(strToCopy, "tO sEe The wOrLd", strlen("tO sEe The wOrLd")+1);
puts(strToCopy); // print
strcpy(str, "Test statement !##$%^&*()=+_~```::||{}[]");
puts(str); // print
memset(targetStartAddress, charToFillInAs, numberOfBytesToFill); // fill memory with a constant byte (ie. charToFillInAs)
puts(str); // print
memcpy(targetStartAddress, strToCopy, strlen(strToCopy)+1); // +1 for null char
puts(str); // print
The output is:
tO sEe The wOrLd
Test statement !##$%^&*()=+_~```::||{}[]
*******atement !##$%^&*()=+_~```::||{}[]
tO sEe The wOrLd*******atement !##$%^&*()=+_~```::||{}[]
Hence, my question is why
tO sEe The wOrLd*******atement !##$%^&*()=+_~```::||{}[]
instead of
tO sEe The wOrLd\0#$%^&*()=+_~```::||{}[]
with '\0' as the null char?
strncpy(strToCopy, "tO sEe The wOrLd", strlen("tO sEe The wOrLd")+1);
This is the wrong way to use strncpy. You should specify the size of the output buffer, not the size of the input. In this case the output buffer is 16 bytes, but you are copying 17 bytes into it, which results in undefined behavior.
Note that even if you do specify the output buffer size, strncpy won't write the null terminator if the string is truncated so you'd need to add it yourself:
strncpy(strToCopy, "tO sEe The wOrLd", sizeof(strToCopy));
strToCopy[sizeof(strToCopy)-1] = '\0'; //null terminate in case of truncation.
For this reason, some people prefer to use another function. For example, you can implement a safe copy function by using strncat:
void safeCopy(char *dest, const char *src, size_t destSize) {
if (destSize > 0) {
dest[0] = '\0';
strncat(dest, src, destSize-1);
}
}
To be used like:
safeCopy(strToCopy, "tO sEe The wOrLd", sizeof(strToCopy));
Note also that you won't actually see the null character and what comes after it as in your expected output. The output will simply stop at the null terminator because it indicates the end of the string.
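If the caller also needs to know whether truncation happened, snprintf can serve as a terminating copy that reports it. This is an alternative sketch, not the approach shown above; copy_checked is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dest (dest_size bytes), always
   terminating dest when dest_size > 0. Returns 1 if the whole string fit,
   0 if it was truncated or an error occurred. */
int copy_checked(char *dest, size_t dest_size, const char *src)
{
    /* snprintf returns the length src would need, excluding the terminator. */
    int n = snprintf(dest, dest_size, "%s", src);
    return n >= 0 && (size_t)n < dest_size;
}
```

With the 16-byte strToCopy from the question, the 16-character source does not fit alongside its terminator, so the function returns 0 and stores only the first 15 characters.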
I want to define a constant string containing non-printable characters in C. For example, let's say I have a string
char str1[] ={0x01, 0x05, 0x0A, 0x15};
Now I want to define it like this
char *str2 = "<??>"
What should I write in place of <??> to define a string equivalent to str1?
You can use "\x01\x05\x0a\x15"
If you want to use a string literal but avoid having an extra terminator (NUL character) added, do it like this:
static const char str[4] = "\x1\x5\xa\x15";
When the string literal's length exactly matches the declared length of the character array, the compiler will not add the terminating NUL character. (This is valid C; C++ rejects the initializer as too long.)
The following test program:
#include <stdio.h>
int main(void)
{
size_t i;
static const char str[4] = "\x1\x5\xa\x15";
printf("str is %zu bytes:\n", sizeof str);
for(i = 0; i < sizeof str; ++i)
printf("%zu: %02x\n", i, (unsigned int) str[i]);
return 0;
}
Prints this:
str is 4 bytes:
0: 01
1: 05
2: 0a
3: 15
I don't understand why you would prefer using this method rather than the much more readable and maintainable original one with the hex numbers separated by commas, but perhaps your real string contains normal printable characters too, or something.
You could use :
const char *str2 = "\x01\x05\x0A\x15";
See escape sequences on MSDN (couldn't find a more neutral link).
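One caveat when mixing hex escapes with printable text: a \x escape consumes every hex digit that follows it, so a literal may need to be split into adjacent pieces. The sketch below also compares array sizes with and without the implicit terminator; all identifiers are made up for the illustration.

```c
#include <stddef.h>

/* Hex escapes consume every following hex digit, so "\x01" "ab" is split
   into two adjacent literals to keep 'a' and 'b' as ordinary characters. */
static const char tagged[] = "\x01" "ab";

/* Same bytes as the question's str1, plus the implicit terminator. */
static const char with_nul[] = "\x01\x05\x0A\x15";

/* Exact-size array: valid C (not C++), and no terminator is stored. */
static const char without_nul[4] = "\x01\x05\x0A\x15";

size_t tagged_size(void)      { return sizeof tagged; }       /* 4 bytes */
size_t with_nul_size(void)    { return sizeof with_nul; }     /* 5 bytes */
size_t without_nul_size(void) { return sizeof without_nul; }  /* 4 bytes */
```

Adjacent string literals are concatenated by the compiler after escapes are processed, which is why the split form keeps \x01 as a single byte.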