I want to define a constant string containing non-printable characters in C. For example, let's say I have a string
char str1[] ={0x01, 0x05, 0x0A, 0x15};
Now I want to define it like this
char *str2 = "<??>";
What should I write in place of <??> to define a string equivalent to str1?
You can use "\x01\x05\x0a\x15"
If you want to use both a string literal and avoid having an extra terminator (NUL character) added, do it like this:
static const char str[4] = "\x1\x5\xa\x15";
When the string literal's length exactly matches the declared length of the character array, the compiler will not add the terminating NUL character.
The following test program:
#include <stdio.h>
int main(void)
{
size_t i;
static const char str[4] = "\x1\x5\xa\x15";
printf("str is %zu bytes:\n", sizeof str);
for(i = 0; i < sizeof str; ++i)
printf("%zu: %02x\n", i, (unsigned int) str[i]);
return 0;
}
Prints this:
str is 4 bytes:
0: 01
1: 05
2: 0a
3: 15
I don't understand why you would prefer using this method rather than the much more readable and maintainable original one with the hex numbers separated by commas, but perhaps your real string contains normal printable characters too, or something.
You could use :
const char *str2 = "\x01\x05\x0A\x15";
See escape sequences on MSDN (couldn't find a more neutral link).
I have:
char *var1 = "foo";
char *var2 = "bar";
and I want to create this string: "foo\0bar\0"
How can I do that? I tried this but of course it does not work:
sprintf(buffer, "%s\0%s", var1, var2);
You have two problems here:
Putting \0 (aka NUL) in the middle of any string is legal, but it also means all C string APIs will consider the string as ending early; every C-style string ends with NUL, and there's no way to tell the difference between a new NUL you added and the "real NUL", because it has to assume the first NUL encountered is the end of the string (reading further could read uninitialized memory, or read beyond the end of the array entirely, invoking undefined behavior). So even if you succeed, C APIs that work with strings will never see bar. You'd have to keep track of how long the "real" string was, and use non-string APIs to work with it (sprintf does return how many characters it printed, so you're not operating completely blind).
Trying to put the \0 in the format string itself means that sprintf thinks the format string ends there; from its point of view, "%s\0%s" is exactly the same as "%s", it literally can't tell them apart.
You can work around problem number 2 by inserting the NUL with a format code that inserts a single char (where NUL is not special), e.g.:
sprintf(buffer, "%s%c%s", var1, '\0', var2);
but even when you're done, doing printf("%s", buffer); will only show foo (because the embedded NUL is where scanning stops). The data is there, and can be accessed, just not with C string APIs:
#include <stdio.h>
int main(int argc, char **argv) {
char *var1 = "foo";
char *var2 = "bar";
char buffer[10] = "0123456789";
sprintf(buffer, "%s%c%s", var1, '\0', var2);
for (int i = 0; i < sizeof(buffer); ++i) {
printf("'%c': %hhd\n", buffer[i], buffer[i]);
}
return 0;
}
Try it online!
which outputs:
'f': 102
'o': 111
'o': 111
'': 0
'b': 98
'a': 97
'r': 114
'': 0
'8': 56
'9': 57
The empty quotes contain a NUL byte if you look at the TIO link, but lo and behold, my browser stops the copy/paste at the NUL byte (yay C string APIs), so I can't actually copy it here.
This is a fairly common problem when dealing with binary data.
If you want to manipulate binary data, don't use the string tools of strcat, strcpy, etc., because they use null-termination to determine the length of the string.
Instead use the memcpy library routine that requires you to specify a length. Keep track of every binary string as a pointer and a length.
char *var1="foo";
unsigned len1 = 3;
char *var2="bar";
unsigned len2 = 3;
/* write var1 and var2 to buffer with null-separation */
/* assuming buffer is large enough */
char buffer[10];
unsigned len_buffer = 0;
/* write var1 to start of buffer */
memcpy(buffer, var1, len1);
len_buffer = len1;
/* append null */
buffer[len_buffer++] = '\0';
/* append var2 */
memcpy(buffer+len_buffer, var2, len2);
len_buffer += len2;
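Not particularly fast or short, but this should do the job. The second call has to be strcpy rather than strcat, because the bytes after the first terminator are still uninitialized and strcat would search them for an end of string:
strcpy (buffer, var1);
strcpy (buffer+strlen(var1)+1, var2);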
In an introductory C course, I learned that strings are stored with a null character \0 at the end. But what if I print a string, say printf("hello")? From the following statement it looks as if it doesn't end with \0:
printf("%d", printf("hello"));
Output: 5
But this seems inconsistent. As far as I know, variables such as strings are stored in main memory, and I guess a string that is being printed is also stored in main memory, so why the difference?
The null byte marks the end of a string. It isn't counted in the length of the string and isn't printed when a string is printed with printf. Basically, the null byte tells functions that do string manipulation when to stop.
Where you will see a difference is if you create a char array initialized with a string. Using the sizeof operator will reflect the size of the array including the null byte. For example:
char str[] = "hello";
printf("len=%zu\n", strlen(str)); // prints 5
printf("size=%zu\n", sizeof(str)); // prints 6
printf returns the number of characters printed. '\0' is not printed; it just signals that there are no more chars in this string. It is not counted towards the string length either.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "hello";
printf("sizeof(string) = %zu, strlen(string) = %zu\n", sizeof(string), strlen(string));
return 0;
}
https://godbolt.org/z/wYn33e
sizeof(string) = 6, strlen(string) = 5
Your assumption is wrong. Your string indeed ends with a \0.
It consists of 5 characters h, e, l, l, o and the 0 character.
What the "inner" printf() call returns is the number of characters that were printed, and that's 5.
In C all literal strings are really arrays of characters, which include the null-terminator.
However, the null terminator is not counted in the length of a string (literal or not), and it's not printed. Printing stops when the null terminator is found.
All answers are really good but I would like to add another example to complete all these
#include <stdio.h>
int main()
{
char a_char_array[12] = "Hello world";
printf("%s", a_char_array);
printf("\n");
a_char_array[4] = 0; //0 is ASCII for null terminator
printf("%s", a_char_array);
printf("\n");
return 0;
}
For those who don't want to try this on OnlineGDB, the output is:
Hello world
Hell
https://linux.die.net/man/3/printf
Does this help to understand what the null terminator does? It's not a boundary for a char array or a string. It's the character that tells whoever parses the string: STOP, parse (print) only up to here.
PS: And if you parse and print it as a char array
for(i=0; i<12; i++)
{
printf("%c", a_char_array[i]);
}
printf("\n");
you get:
Hell world
where the whitespace after the double l is the null terminator. Parsing it as a char array simply prints the char value of every byte. If you do another pass and print the int value of each byte (printf("%d", a_char_array[i])), you'll see that the whitespace has a value of 0 (what you get is the ASCII code, the int representation of each byte).
In C, the function printf() returns the number of characters printed. \0 is the null terminator, which indicates the end of a string in C. Unlike C++, C has no built-in string type, so your array needs to be at least one element larger than the number of chars you want to store.
Here is the ref: cpp ref printf()
But what if I wanted to print a string, say printf("hello") although
I've found that that it doesn't end with \0 by following statement
printf("%d", printf("hello"));
Output: 5
You are wrong. This statement does not confirm that the string literal "hello" does not end with the terminating zero character '\0'. This statement confirmed that the function printf outputs elements of a string until the terminating zero character is encountered.
When you are using a string literal as in the statement above then the compiler
creates a character array with the static storage duration that contains elements of the string literal.
So in fact this expression
printf("hello")
is processed by the compiler something like the following
static char string_literal_hello[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
printf( string_literal_hello );
You can imagine the action of the function printf in this case the following way
int printf( const char *string_literal )
{
int result = 0;
for ( ; *string_literal != '\0'; ++string_literal )
{
putchar( *string_literal );
++result;
}
return result;
}
To get the number of characters stored in the string literal "hello" you can run the following program
#include <stdio.h>
int main(void)
{
char literal[] = "hello";
printf( "The size of the literal \"%s\" is %zu\n", literal, sizeof( literal ) );
return 0;
}
The program output is
The size of the literal "hello" is 6
You have to get the concept clear first.
It will become clearer once you work with arrays. The printf call you are using just counts the characters placed within the parentheses. A string stored in an array must end with \0.
A string is an array of characters. It contains the sequence of characters that form the string, followed by the special end-of-string character '\0'.
Example:
char str[10] = {'H', 'e', 'l', 'l', 'o', '\0'};
Example: the following character array is not a string because it doesn't end with '\0'
char str[2] = {'h', 'e'};
I'm trying to use sprintf() to put a string "inside itself", so I can change it to have an integer prefix. I was testing this on a character array of length 12 with "Hello World" inside it already.
The basic premise is that I want a prefix that denotes the number of words within a string. So I copy 11 characters into a character array of length 12.
Then I try to put the integer followed by the string itself by using "%i%s" in the function. To get past the integer (I don't just use myStr as the argument for %s), I make sure to use myStr + snprintf(NULL, 0, "%i", wordCount), which should be myStr + characters taken up by the integer.
The problem I'm having is that it eats the 'H' when I do this and prints "2ello World" instead of having the '2' right beside the "Hello World"
So far I've tried different options for getting "past the integer" in the string when I try to copy it inside itself, but nothing really seems to be the right case, as it either comes out as an empty string or just the integer prefix itself '222222222222' copied throughout the entire array.
int main() {
char myStr[12];
strcpy(myStr, "Hello World");//11 Characters in length
int wordCount = 2;
//Put the integer wordCount followed by the string myStr (past whatever amount of characters the integer would take up) inside of myStr
sprintf(myStr, "%i%s", wordCount, myStr + snprintf(NULL, 0, "%i", wordCount));
printf("\nChanged myStr '%s'\n", myStr);//Prints '2ello World'
return 0;
}
First, to insert a one-digit prefix into a string “Hello World”, you need a buffer of 13 characters—one for the prefix, eleven for the characters in “Hello World”, and one for the terminating null character.
Second, you should not pass a buffer to sprintf or snprintf as both the output buffer and an input string. The behavior is not defined by the C standard when the objects passed to these functions overlap.
Below is a program that shows you how to insert a prefix by moving the string with memmove. This is largely tutorial, as it is not generally a good way to manipulate strings. For short strings, where space is not an issue, most programmers would simply print the desired string into a temporary buffer, avoiding overlap issues.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* Insert a decimal numeral for Prefix into the beginning of String.
Length specifies the total number of bytes available at String.
*/
static void InsertPrefix(char *String, size_t Length, int Prefix)
{
// Find out how many characters the numeral needs.
int CharactersNeeded = snprintf(NULL, 0, "%i", Prefix);
// Find the current string length.
size_t Current = strlen(String);
/* Test whether there is enough space for the prefix, the current string,
and the terminating null character.
*/
if (Length < CharactersNeeded + Current + 1)
{
fprintf(stderr,
"Error, not enough space in string to insert prefix.\n");
exit(EXIT_FAILURE);
}
// Move the string to make room for the prefix.
memmove(String + CharactersNeeded, String, Current + 1);
/* Remember the first character, because snprintf will overwrite it with a
null character.
*/
char Temporary = String[0];
// Write the prefix, including a terminating null character.
snprintf(String, CharactersNeeded + 1, "%i", Prefix);
// Restore the first character of the original string.
String[CharactersNeeded] = Temporary;
}
int main(void)
{
char MyString[13] = "Hello World";
InsertPrefix(MyString, sizeof MyString, 2);
printf("Result = \"%s\".\n", MyString);
}
The best way to deal with this is to create another buffer to output to, and then if you really need to copy back to the source string then copy it back once the new copy is created.
There are other ways to "optimise" this if you really needed to, like putting your source string into the middle of the buffer so you can append and change the string pointer for the source (not recommended, unless you are running on an embedded target with limited RAM and the buffer is huge). Remember code is for people to read so best to keep it clean and easy to read.
#include <stdio.h>
#include <string.h>
#define MAX_BUFFER_SIZE 128
int main() {
char srcString[MAX_BUFFER_SIZE];
char destString[MAX_BUFFER_SIZE];
strncpy(srcString, "Hello World", MAX_BUFFER_SIZE);
int wordCount = 2;
snprintf(destString, MAX_BUFFER_SIZE, "%i%s", wordCount, srcString);
printf("Changed string '%s'\n", destString);
// Or if you really want the string put back into srcString then:
strncpy(srcString, destString, MAX_BUFFER_SIZE);
printf("Changed string in source '%s'\n", srcString);
return 0;
}
Notes:
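To guard against buffer overflows, prefer strncpy and snprintf over strcpy and sprintf. Keep in mind that strncpy does not write a terminating null character when the source is at least as long as the size limit, so terminate the buffer yourself in that case.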
I want to copy characters X to Y of a string into the out char array.
unsigned char * string = "HELLO WORLD!!!"; // length 14
unsigned char out[9];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
printf("%s = string\n%s = out\n", string, out);
When I look at the output of out, why is there gibberish after a certain point in my string? I see out printed as LO WORLD!# . Why do weird characters appear after the content I copied? Isn't out supposed to be an array of 9? I expected the output to be
LO WORLD!
In C you need to terminate your string with a 0x00 value so a string of length 9 needs ten bytes to store it with the last set to 0. Otherwise your print statements run off into random data.
unsigned char * string = "HELLO WORLD!!!"; // length 14
unsigned char out[10];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[length] = 0x00;
printf("%s = string\n%s = out\n", string, out);
A minor point, but string literals are arrays of char (const char in C++) that decay to char* (const char* in C++), not unsigned char*. Plain char might happen to be unsigned in your implementation, but it doesn't need to be.
Furthermore, this is not true:
unsigned char * string = "HELLO WORLD!!!"; // length 14
The string actually occupies 15 bytes -- there is an extra, hidden '\0' at the end, called a nul byte, which marks the end of the string. These nul terminators are very important, because if they're not present, then many C library functions which manipulate strings will keep going until they hit a byte with a value equal to '\0' -- and so can end up reading or trampling over bits of memory they shouldn't do. This is called a buffer overrun, and is a classic bug (and exploitable security problem) in C programmes.
In your example, you haven't included this nul terminator in your copied string, so printf() just keeps going until it finds one, hence the gibberish you're seeing. In general, it's a good idea to use C library functions to manipulate C strings where possible, as most of them add the terminator for you. In this case, strncpy from string.h does the copy you're after, but note that it does not add the terminator when the source has more characters than the count you pass, so you still need to set out[9] = '\0' yourself.
A 9 character string needs 10 bytes because it must be null ( 0 ) terminated. Try this:
unsigned char out[10]; // make this 10
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[i] = 0; // add this to terminate the string
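A better approach would be strncpy, followed by an explicit terminator because strncpy does not add one when it stops at the count:
strncpy((char *)out, (const char *)string + 3, 9);
out[9] = 0;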
C strings must be null terminated. You only created an array large enough for 8 characters + the null terminator, but you never added the terminator.
So, you need to allocate the length plus 1 and add the terminator.
// initializes all elements to 0
char out[10] = {0};
// alternatively, add it at the end.
out[9] = '\0';
Think of it this way; you're passed a char* which represents a string. How do you know how long it is? How can you read it? Well, in C, a sentinel value is added to the end. This is the null terminator. It is how strings are read in C, and passing around unterminated strings to functions which expect C strings results in undefined behavior.
And then... just use strncpy to copy strings.
If you want to copy 9 characters from your string, you'll need an array of 10 to do that, because a C string needs '\0' as its terminating null character. So your code should be rewritten like this:
unsigned char * string = "HELLO WORLD!!!"; // length 14
unsigned char out[10];
size_t length = 9;
for(i=0 ;i < length ;++i)
{
out[i] = string[i+3];
}
out[9] = 0;
printf("%s = string\n%s = out\n", string, out);
As simple as that. I'm on C++ btw. I've read the cplusplus.com's cstdlib library functions, but I can't find a simple function for this.
I know the length of the char, I only need to erase last three characters from it. I can use C++ string, but this is for handling files, which uses char*, and I don't want to do conversions from string to C char.
If you don't need to copy the string somewhere else and can change it
/* make sure strlen(name) >= 3 */
namelen = strlen(name); /* possibly you've saved the length previously */
name[namelen - 3] = 0;
If you need to copy it (because it's a string literal or you want to keep the original around)
/* make sure strlen(name) >= 3 */
namelen = strlen(name); /* possibly you've saved the length previously */
strncpy(copy, name, namelen - 3);
/* add a final null terminator */
copy[namelen - 3] = 0;
I think some of your post was lost in translation.
To truncate a string in C, you can simply insert a terminating null character in the desired position. All of the standard functions will then treat the string as having the new length.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "one one two three five eight thirteen twenty-one";
printf("%s\n", string);
string[strlen(string) - 3] = '\0';
printf("%s\n", string);
return 0;
}
If you know the length of the string you can use pointer arithmetic to get a string with the last three characters:
const char* mystring = "abc123";
const int len = 6;
const char* substring = mystring + len - 3;
Please note that substring points to the same memory as mystring and is only valid as long as mystring is valid and left unchanged. The reason that this works is that a C string doesn't have any special markers at the beginning, only the null terminator at the end.
I interpreted your question as wanting the last three characters, getting rid of the start, as opposed to how David Heffernan read it, one of us is obviously wrong.
bool TakeOutLastThreeChars(char* src, int len) {
if (len < 3) return false;
memset(src + len - 3, 0, 3);
return true;
}
I assume mutating the string memory is safe since you did say erase the last three characters. I'm just overwriting the last three characters with null characters (0).
It might help to understand how C char* "strings" work:
You start reading them from the char that the char* points to until you hit a \0 char (or simply 0).
So if I have
char* str = "theFile.nam";
then str+3 represents the string File.nam.
But you want to remove the last three characters, so you want something like:
char str2[9];
strncpy (str2,str,8); // now str2 contains "theFile.#" where # is some character you don't know about
str2[8]='\0'; // now str2 contains "theFile.\0" and is a proper char* string.