Append extra null char to wide string - c

Some Win32 API structures require an extra null character to be appended to a string, as in the following example taken from here:
c:\temp1.txt'\0'c:\temp2.txt'\0''\0'
When it comes to wide strings, what is the easiest way to append a L'\0' to the end of an existing wide string?
Here's what works for me but seems too cumbersome:
wchar_t my_string[10] = L"abc";
size_t len = wcslen(my_string);
wchar_t nullchar[1] = {'\0'};
memcpy(my_string + len + 1, nullchar, sizeof(wchar_t));

In your example you can just assign the value just like any other array. There's nothing special about wchar_t here.
my_string already has a single null-termination, so if you want double null-termination, then just add another 0 after it.
wchar_t my_string[10] = L"abc";
size_t len = wcslen(my_string);
// todo: check out-of-bounds
my_string[len + 1] = 0;
Or even simpler, if it's really just a string literal,
wchar_t my_string[10] = L"abc\0";
This will be doubly-null-terminated.
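The "todo: check out-of-bounds" step can be sketched as a small helper. This is only an illustration: the name append_extra_nul and its capacity parameter (the total number of wchar_t elements in the buffer) are assumptions, not part of any Win32 API.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes a second L'\0' directly after the existing
 * terminator. capacity is the total number of wchar_t elements in buf.
 * Returns 1 on success, 0 if the buffer has no room for the extra null. */
int append_extra_nul(wchar_t *buf, size_t capacity)
{
    assert(buf != NULL);
    size_t len = wcslen(buf);   /* index of the existing terminator */
    if (len + 2 > capacity)     /* need room for both terminators */
        return 0;
    buf[len + 1] = L'\0';
    return 1;
}
```

Called as append_extra_nul(my_string, 10) on the array from the question, it would leave "abc" followed by two null characters.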

Assuming you have the various paths in a std::vector<std::wstring>, you can just build the required format in a loop:
std::vector<std::wstring> paths;
// ... fill paths ...
paths.emplace_back(L""); // This empty path will add the extra NUL
std::wstring buf;
buf.reserve(1000); // reserve capacity only; buf(1000, 0) would start with 1000 NULs
for (const auto &p : paths) {
buf.append(p);
buf.append(1, L'\0');
}
const wchar_t *ptr = buf.c_str(); // Now do stuff with it
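Since the question is tagged C, the same loop can be sketched without std::wstring. The function name build_multi_string and its parameters are made up for this illustration; it assumes the caller supplies a destination buffer and its capacity in wchar_t elements.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: packs count strings into dst as
 * item1 \0 item2 \0 ... \0 \0.
 * Returns the number of wchar_t written (including the final extra null),
 * or 0 if dst is too small. */
size_t build_multi_string(wchar_t *dst, size_t capacity,
                          const wchar_t *const *items, size_t count)
{
    assert(dst != NULL && items != NULL);
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = wcslen(items[i]);
        /* room for this item, its terminator, and the final terminator */
        if (len + 2 > capacity - pos)
            return 0;
        wmemcpy(dst + pos, items[i], len + 1); /* copies the terminator too */
        pos += len + 1;
    }
    if (pos >= capacity)
        return 0;
    dst[pos++] = L'\0';                        /* the extra null */
    return pos;
}
```

The returned count is what a caller would multiply by sizeof(wchar_t) if an API asks for the size in bytes.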

Assuming my_string is long enough:
my_string[wcslen(my_string)+1]='\0';
The terminating null will be translated to a wide char.
(Posted as a first comment to the question)

If you use std::wstring instead of wchar_t[], you can use its operator+= to append the extra null terminator, eg:
wstring my_string = L"abc";
...
my_string += L'\0';
// use my_string.c_str() as needed...

Related

Problems with wchar and registry entry

I want to add a string to the registry. Since registry strings have to be written null-terminated, my string contains a null char in several places.
This is what my string looks like.
char names[550] = "1DFA-3327-*\01DFA-3527-*\001DFA-E527-*\00951-1500-
I convert this to a wchar_t string like so.
wchar_t names_w[1000];
size_t charsConverted = 0;
mbstowcs_s(&charsConverted, names_w, names, SIZE);
RegSetValueEx(*pKeyHandle, valueName, 0, REG_MULTI_SZ, (LPBYTE)names_w, size);
The registry entry should be
1DFA-3327-*
1DFA-3527-*
1DFA-E527-*
0951-1500-*
0951-0004-*
0951-160D-*
But this is the registry entry now,
1DFA-3327-*
<a box here>DFA-3527-*
<a box here>DFA-E527-*
951-1500-*
951-0004-*
951-160D-*
So it eats up the 0 in 0951 and also the 1 in 1DFA.
What I have tried:
1> I tried changing the string to
char names[550] = "1DFA-3327-*\0\01DFA-3527-*\0\001DFA-E527-*\0\00951-1500-
^ ^ Two nulls
2> I tried different conversion.
for(int i = 0; i < SIZE; i++)
names_w[i] = (wchar_t)names[i];
The problem is in your string literal.
char names[550] = "1DFA-3327-*\01DFA-3527-*\001DFA-E527-*\00951-1500-...\0";
^^ ^^^ ^^
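You can provide ASCII characters in octal notation (\ooo) or in hexadecimal notation (\xhh).
In your case the escapes are read as octal notation, and an octal escape consumes up to three octal digits after the backslash. So \01 and \001 each become a single character with value 1 (the box), and \00 in front of 951 is a null byte that has swallowed the leading 0 of 0951. You should change your string to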
char names[550] = "1DFA-3327-*\0001DFA-3527-*\0001DFA-E527-*\0000951-1500-...\0";
or
char names[550] = "1DFA-3327-*\0" // put null byte at end of GUID
"1DFA-3527-*" "\0" // add null byte as extra literal
"1DFA-E527-*" "\0"
"0951-1500-...\0";
which also makes it easier to identify the GUIDs in the string.
If you want to use hexadecimal notation, take care: a hex escape has no length limit and consumes every hexadecimal digit that follows it, so you also need to work with the literal concatenation. (See How to properly add hex escapes into a string-literal?)
char names[550] = "1DFA-3327-*\x00" // put null byte at end of GUID
"1DFA-3527-*" "\x00" // add null byte as extra literal
"1DFA-E527-*" "\x00"
"0951-1500-...\x00";
BTW, is there any reason not to directly store a wide char string?
wchar_t names[550] = L"1DFA-3327-*\0001DFA-3527-*\0001DFA-E527-*\0000951-1500-...\0";
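The effect of the escapes can be checked with shortened literals. In this sketch the entries are cut down to "1DFA-*" for readability, and count_strings is a hypothetical helper that walks a block ending in an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the strings in a block that ends with an
 * empty string (that is, with two consecutive null bytes). */
size_t count_strings(const char *block)
{
    assert(block != NULL);
    size_t n = 0;
    while (*block != '\0') {
        block += strlen(block) + 1;  /* skip the entry and its terminator */
        n++;
    }
    return n;
}

/* \01 is ONE character with value 1, so the '1' of the second entry is lost
 * and both entries run together into a single string. */
const char swallowed[] = "1DFA-*\01DFA-*\0";

/* \000 uses all three octal digits, so the following '1' survives. */
const char octal3[] = "1DFA-*\0001DFA-*\0";

/* Splitting the literal ends the escape before the next entry starts. */
const char split[] = "1DFA-*" "\0" "1DFA-*" "\0";
```

Here count_strings(swallowed) sees a single entry, while octal3 and split each contain two and are byte-for-byte identical.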

Concatenate char array and char

I am new to C. I need to concatenate a char array and a char. In Java we can use the '+' operator, but in C that is not allowed. strcat and strcpy are also not working for me. How can I achieve this? My code is as follows:
void myFunc(char prefix[], struct Tree *root) {
char tempPrefix[30];
strcpy(tempPrefix, prefix);
char label = root->label;
//I want to concat tempPrefix and label
My problem differs from "concatenate char array in C": that question concatenates one char array with another, but mine is a char array with a single char.
Rather simple, really. The main concern is that tempPrefix must have enough space for the prefix plus the extra character. Since C strings must be null-terminated, your function shouldn't copy more than 28 characters of the prefix: 30 (the size of the buffer) - 1 (the root label character) - 1 (the terminating null character). Fortunately, the standard library has strncpy:
size_t const buffer_size = sizeof tempPrefix; // Only because tempPrefix is declared an array of characters in scope.
strncpy(tempPrefix, prefix, buffer_size - 2); // at most 28 characters, filling indices 0..27
tempPrefix[buffer_size - 2] = root->label;
tempPrefix[buffer_size - 1] = '\0';
It's also worthwhile not to hard code the buffer size in the function calls, thus allowing you to increase its size with minimum changes.
If your buffer isn't an exact fit, some more legwork is needed. The approach is pretty much the same as before, but a call to strchr is required to complete the picture.
size_t const buffer_size = sizeof tempPrefix; // Only because tempPrefix is declared an array of characters in scope.
strncpy(tempPrefix, prefix, buffer_size - 2); // at most 28 characters, NUL-padded if shorter
tempPrefix[buffer_size - 2] = tempPrefix[buffer_size - 1] = '\0';
*strchr(tempPrefix, '\0') = root->label;
We again copy no more than 28 characters, but explicitly pad the end with NUL bytes. Since strncpy fills the buffer with NUL bytes up to count when the string being copied is shorter, everything after the copied prefix is now \0. This is why I dereference the result of strchr right away: it is guaranteed to point at a valid character, the first free slot to be exact.
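The padding argument can be wrapped into a function that takes the buffer size as a parameter instead of relying on sizeof. This is a sketch under that assumption; the name copy_and_append is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies as much of prefix as fits into dst, then
 * appends label. dst_size is the total size of dst and must be at least 3. */
void copy_and_append(char *dst, size_t dst_size, const char *prefix, char label)
{
    assert(dst != NULL && prefix != NULL && dst_size >= 3);
    /* Copies at most dst_size - 2 characters; if prefix is shorter,
     * strncpy pads the rest of that range with NUL bytes. */
    strncpy(dst, prefix, dst_size - 2);
    dst[dst_size - 2] = dst[dst_size - 1] = '\0';
    /* The first NUL is the first free slot, and another NUL follows it. */
    *strchr(dst, '\0') = label;
}
```

With an 8-byte buffer, a prefix of "ab" yields "abc" for label 'c', and a prefix that is too long is cut to six characters before the label is added.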
strXXX() family of functions mostly operate on strings (except the searching related ones), so you will not be able to use the library functions directly.
You can find the position of the existing null-terminator, replace it with the char value you want to concatenate, and add a null-terminator after that. However, you need to make sure there is enough room left in the buffer to hold the concatenated string.
Something like this (not tested)
#define SIZ 30
//function
char tempPrefix[SIZ] = {0}; //initialize
strcpy(tempPrefix, prefix); //copy the string
char label = root->label; //take the char value
size_t res = strlen(tempPrefix); // index of the current null (strchr returns a pointer, not an index)
if (res < (SIZ - 1)) //Check: Do we have room left?
{
tempPrefix[res] = label; //replace with the value
tempPrefix[res + 1] = '\0'; //add a null to next index
}
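The in-place variant (find the terminator, overwrite it, write a new one) can be packaged as a reusable function. The name append_char and the capacity parameter are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends c to the string already stored in buf.
 * capacity is the total size of buf in bytes.
 * Returns 1 on success, 0 if there is no room for c plus the terminator. */
int append_char(char *buf, size_t capacity, char c)
{
    assert(buf != NULL);
    size_t len = strlen(buf);   /* index of the current terminator */
    if (len + 2 > capacity)
        return 0;
    buf[len] = c;               /* overwrite the old terminator */
    buf[len + 1] = '\0';        /* and write a new one after it */
    return 1;
}
```

In the question's function this would be called as append_char(tempPrefix, sizeof tempPrefix, root->label) after the strcpy.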

Iterating over all files in OFN_ALLOWMULTISELECT with Unicode

What is the recommended way of iterating over all files selected in an OFN_ALLOWMULTISELECT open file dialog with Unicode enabled?
My first idea was something like this:
TCHAR *tmp = ofn.lpStrFile + ofn.nFileOffset;
while(*tmp) {
wprintf("Got file: %s\n", tmp);
tmp += wcslen(tmp) + 1;
}
But then it occurred to me that this won't work in case there are characters in the string buffer that can't be represented in 16 bits. So for a safe approach I'd first need to find out the byte length of the tmp TCHAR string, then cast the TCHAR pointer to char and add that byte length in each iteration. Something like this:
TCHAR *tmp = ofn.lpStrFile + ofn.nFileOffset;
while(*tmp) {
wprintf("Got file: %s\n", tmp);
tmp = (TCHAR *) (((char *) tmp) + get_byte_len_of_tstr(tmp));
}
Note that get_byte_len_of_tstr() is just a placeholder for a function that would've to be written for this purpose. Since this approach looks somewhat clumsy, I'd first like to ask for some feedback whether this is really the way to go or whether I've missed or misunderstood something here...
Your first example was on the right track, but has a couple of mistakes:
your variable should be declared WCHAR* instead of TCHAR*.
wprintf() does not accept a char* format string as input, it takes a wchar_t* instead.
WCHAR *tmp = ofn.lpStrFile + ofn.nFileOffset;
while (*tmp)
{
wprintf(L"Got file: %s\n", tmp);
tmp += (wcslen(tmp) + 1);
}
If you want to use TCHAR (and you really shouldn't, unless you need to support Win9x/ME), then it would look like this instead:
TCHAR *tmp = ofn.lpStrFile + ofn.nFileOffset;
while (*tmp)
{
_tprintf(_T("Got file: %s\n"), tmp);
tmp += (_tcslen(tmp) + 1);
}
That being said, your understanding of wcslen() is wrong (but your use of it is correct). In Windows, a Unicode string is encoded in UTF-16, where each WCHAR element is a UTF-16 codeunit. wcslen() counts the number of WCHAR elements in the string, not the number of Unicode codepoints that they represent, like you are thinking. So, if a given codepoint requires a UTF-16 surrogate pair, it will use two WCHAR elements in the string, and wcslen() will count 2 for it. Otherwise, it will use 1 WCHAR and wcslen() will count 1 for it.
The same is true for strlen() and MBCS strings, when a given Unicode codepoint is encoded using more than 1 codeunit (char element) in the string.
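The iteration itself does not depend on any Win32 type, so it can be sketched in portable C. The names next_entry and count_entries are hypothetical, and the block is assumed to end with an empty string, as the buffer in the question does.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns the entry that follows the given one.
 * wcslen() counts wchar_t elements (code units), which is exactly the
 * unit that pointer arithmetic on wchar_t* uses. */
const wchar_t *next_entry(const wchar_t *entry)
{
    assert(entry != NULL);
    return entry + wcslen(entry) + 1;
}

/* Hypothetical helper: counts the entries in a block that ends with an
 * empty string. */
size_t count_entries(const wchar_t *block)
{
    assert(block != NULL);
    size_t n = 0;
    for (const wchar_t *p = block; *p != L'\0'; p = next_entry(p))
        n++;
    return n;
}
```

Because both the length and the pointer step are measured in wchar_t elements, no byte-length calculation or cast to char* is needed.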

How to truncate C char*?

As simple as that. I'm on C++ btw. I've read the cplusplus.com's cstdlib library functions, but I can't find a simple function for this.
I know the length of the char, I only need to erase last three characters from it. I can use C++ string, but this is for handling files, which uses char*, and I don't want to do conversions from string to C char.
If you don't need to copy the string somewhere else and can change it
/* make sure strlen(name) >= 3 */
namelen = strlen(name); /* possibly you've saved the length previously */
name[namelen - 3] = 0;
If you need to copy it (because it's a string literal or you want to keep the original around)
/* make sure strlen(name) >= 3 */
namelen = strlen(name); /* possibly you've saved the length previously */
strncpy(copy, name, namelen - 3);
/* add a final null terminator */
copy[namelen - 3] = 0;
I think some of your post was lost in translation.
To truncate a string in C, you can simply insert a terminating null character in the desired position. All of the standard functions will then treat the string as having the new length.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "one one two three five eight thirteen twenty-one";
printf("%s\n", string);
string[strlen(string) - 3] = '\0';
printf("%s\n", string);
return 0;
}
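The length check that the first answer asks for can be folded into a small function. This is a sketch; drop_last is a made-up name, and it assumes the string lives in writable memory (not a string literal).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the last n characters of s in place.
 * Returns 1 on success, 0 if s is shorter than n (s is left unchanged). */
int drop_last(char *s, size_t n)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < n)
        return 0;
    s[len - n] = '\0';  /* the new terminator hides the tail */
    return 1;
}
```

For a writable array holding "theFile.nam", drop_last(name, 3) leaves "theFile." in place.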
If you know the length of the string you can use pointer arithmetic to get a string with the last three characters:
const char* mystring = "abc123";
const int len = 6;
const char* substring = mystring + len - 3;
Please note that substring points to the same memory as mystring and is only valid as long as mystring is valid and left unchanged. The reason that this works is that a c string doesn't have any special markers at the beginning, only the NULL termination at the end.
I interpreted your question as wanting the last three characters, getting rid of the start, as opposed to how David Heffernan read it, one of us is obviously wrong.
bool TakeOutLastThreeChars(char* src, int len) {
if (len < 3) return false;
memset(src + len - 3, 0, 3);
return true;
}
I assume mutating the string memory is safe since you did say erase the last three characters. I'm just overwriting the last three characters with "NULL" or 0.
It might help to understand how C char* "strings" work:
You start reading them from the char that the char* points to until you hit a \0 char (or simply 0).
So if I have
char* str = "theFile.nam";
then str+3 represents the string File.nam.
But you want to remove the last three characters, so you want something like:
char str2[9];
strncpy (str2,str,8); // now str2 contains "theFile.#" where # is some character you don't know about
str2[8]='\0'; // now str2 contains "theFile.\0" and is a proper char* string.
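When the source is a string literal, as str is here, the truncated result has to go into a separate buffer. The following sketch generalizes the copy above; copy_without_tail and its parameters are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src without its last n characters into dst.
 * capacity is the total size of dst in bytes.
 * Returns 1 on success, 0 if src is shorter than n or dst is too small. */
int copy_without_tail(char *dst, size_t capacity, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len < n || len - n + 1 > capacity)
        return 0;
    memcpy(dst, src, len - n);  /* copy only the part that is kept */
    dst[len - n] = '\0';        /* terminate explicitly */
    return 1;
}
```

With a 9-byte destination, copy_without_tail(str2, sizeof str2, str, 3) produces the same "theFile." as the strncpy version.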

Blanking out cstrings in a loop

I am trying to iterate through char*
Is there any way to like reset these char* strings back to blank?
I am trying to reset from1 and send1.
Is there anything else wrong with my code.. it is only copying the first file in my array
for(i = 0; i < 3; i++)
{
from1 = " ";
send1 = " ";
from1 = strncat(fileLocation,filesToExport[i],50);
send1 = strncat(whereAmI,filesToExport[i],50);
CopyFile(from1,send1,TRUE);
printf("%s\n",from1);
printf("%s",send1);
}
The strings are NUL-terminated, which means they have a zero character at the end. You can set the first char in the string to zero to truncate it back to being empty:
from1[0] = '\0';
Another way would be to copy a blank string:
strcpy(from1, "");
What do you mean by "blank"? Zeroed, spaces, or empty?
For filling a memory area you're best off using memset(), so
#include <string.h>
memset(pBuffer, ' ', length); /* Fill with spaces */
pBuffer[length] = '\0'; /* Remember to null-terminate manually when using memset */
memset(pBuffer, '\0', length); /* Fill with zeroes */
pBuffer[0] = '\0'; /* Set first element to null -- effectively set the string
* to length 0
*/
The easiest way is to set the first byte to 0. Like this:
from1[0] = 0;
send1[0] = 0;
C/C++ checks the end of a char* string by looking for the 0 byte. It doesn't care what follows that.
To clear a string to empty, so that strncat() has an empty string to concatenate to, just do:
from1[0] = '\0';
This sets the first character to the zero terminator that indicates end of string, thus making the string have length 0. This assumes that from1 is an actual modifiable char buffer, but your call to strncat() implies that it is.
You are copying into fileLocation and whereAmI. Are they buffers or strings? You may be writing off the end of your string.
I think you would do better to allocate a suitably sized buffer
fromLen = strlen(fileLocation);
fileLen = strlen(filesToExport[i]);
from1 = malloc(fromLen + fileLen + 1);
/* add check here that malloc succeeded */
strcpy( from1, fileLocation);
strcat( from1 + fromLen, filesToExport[i]);
/** etc **/
free(from1);
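A fixed-size alternative is to rebuild each path from scratch inside the loop, so nothing accumulates between iterations. This sketch uses snprintf rather than strncat; join_path is a hypothetical name, and the directory is assumed to already end with a path separator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes dir followed by file into dst, replacing
 * whatever dst held before. capacity is the total size of dst in bytes.
 * Returns 1 if the whole path fit, 0 if it was truncated or failed. */
int join_path(char *dst, size_t capacity, const char *dir, const char *file)
{
    assert(dst != NULL && dir != NULL && file != NULL && capacity > 0);
    int n = snprintf(dst, capacity, "%s%s", dir, file);
    return n >= 0 && (size_t)n < capacity;
}
```

Because the function never appends to dir, fileLocation and whereAmI stay unchanged and each loop iteration starts from the plain directory again.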
you mean like
send1[0] = 0;
from1[0] = 0;
?

Resources