How can I convert C-String to LPCSTR on Windows

In order to find if a file exists, I want to use the GetFileAttributes WinAPI function.
The function accepts a LPCSTR argument. How can I convert my classic const char* string to this type?
Please note, I'm using C, not C++. Is this the right way to go in C too?

According to this Microsoft documentation page, LPCSTR is defined in WinNT.h as follows:
typedef __nullterminated CONST CHAR *LPCSTR;
This evaluates to const char *.
So, you are essentially asking how to convert const char* to itself. In other words, the answer to your question is that no conversion is required.
Regarding your question, there is no difference between C and C++. However, C++ offers additional ways of handling strings.

It is GetFileAttributesA (note the A) which uses LPCSTR. The wide character version is GetFileAttributesW, and its argument is LPCWSTR. The generic name GetFileAttributes is a macro that resolves to one of these two functions at compile time; its parameter type is LPCTSTR, a const string of TCHAR. TCHAR is either CHAR or WCHAR, depending on whether you build the program for Unicode support (that is, whether UNICODE is defined).
If you have a const char * input intended to be passed to GetFileAttributes being compiled for Unicode, or to be passed to GetFileAttributesW, then a conversion is needed from byte string to wide string.
It's best to avoid mixing wide and narrow strings in the entire program, if at all possible, to avoid the cumbersome conversions.

Related

How can I convert a PCHAR* to a TCHAR*?

I am looking for a way to convert a PCHAR* variable to a TCHAR* without getting any warnings in Visual Studio (this is a requirement).
Looking online I can't find a function or a method to do so without having warnings. Maybe somebody has come across something similar?
Thank you !
convert a PCHAR* variable to a TCHAR*
PCHAR is a typedef that resolves to char*, so PCHAR* means char**.
TCHAR is a typedef that resolves to either the "wide" wchar_t or the "narrow" char, depending on whether UNICODE is defined.
In neither case can you (safely) convert between a char ** and a simple character pointer, so the following assumes the question is actually about converting a PCHAR to a TCHAR*.
PCHAR is the same as TCHAR* in ANSI builds, and no conversion would be necessary in that case, so it can be further assumed that the question is about Unicode builds.
The PCHAR comes from the function declaration (which can't be changed) and the TCHAR comes from GetCurrentDirectory. I want to concatenate the two using _tcscat_s, but I need to convert the PCHAR first.
The general question of converting between narrow and wide strings has been answered before, see for example Convert char * to LPWSTR or How to convert char* to LPCWSTR?. However, in this particular case, you could weigh the alternatives before choosing the general approaches.
Change your build settings to ANSI, instead of Unicode, then no conversion is necessary.
That's as easy as making sure neither UNICODE nor _UNICODE macros are defined when compiling, or changing in the IDE the project Configuration Properties / Advanced / Character Set from Use Unicode Character Set to either Not Set or Use Multi-Byte Character Set.
Disclaimer: it is retrograde nowadays to compile against an 8-bit Windows codepage. I am not advising it, and doing that means many international characters cannot be represented literally. However, a chain is only as strong as its weakest link, and if you are forced to use narrow strings returned by an external function that you cannot change, then that's limiting the usefulness of going full Unicode elsewhere.
Keep the build as Unicode, but change just the concatenation code to use ANSI strings.
This can be done by explicitly calling the ANSI version GetCurrentDirectoryA of the API, which returns a narrow string. Then you can strcat that directly with the other PCHAR string.
Keep it as is, but combine the narrow and wide strings using a printf-style function instead of _tcscat_s.
char szFile[] = "test.txt";
PCHAR pszFile = szFile; // narrow string from ext function
wchar_t wszDir[_MAX_PATH];
GetCurrentDirectoryW(_MAX_PATH, wszDir); // wide string from own code
wchar_t wszPath[_MAX_PATH];
// %ls = wide string, %hs = narrow string (Microsoft CRT); the size argument bounds the write
swprintf(wszPath, _MAX_PATH, L"%ls\\%hs", wszDir, pszFile); // combined into wide string

Usage of SafeStr in C

I am reading about using of safe strings at following location
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=5111861
It is mentioned as below.
SafeStr strings, when used properly, can eliminate many of these errors and provide backward compatibility to legacy code as well.
My question is what does author mean by "provide backward compatibility to legacy code as well." ? Request to explain with example.
Thanks for your time and help
It means that functions from the standard libc (and others) which expect plain, null-terminated char arrays will work even on those SafeStrs. This is probably achieved by putting a control structure at a negative offset (or some other trick) from the start of the string.
Examples: strcmp() printf() etc can be used directly on the strings returned by SafeStr.
In contrast, there are also other string libraries for C which are very "smart" and dynamic, but these strings can not be sent without conversion to "old school" functions.
From that page:
The library is based on the safestr_t type which is completely
compatible with char *. This allows casting of safestr_t structures to
char *.
That's some backward compatibility with all the existing code that takes char * or const char * pointers.

cannot convert parameter 1 from 'const char *' to 'LPCWSTR'

Basically I have some simple code that does some things for files and I'm trying to port it to windows. I have something that looks like this:
int SomeFileCall(const char * filename){
#ifndef __unix__
SomeWindowsFileCall(filename);
#endif
#ifdef __unix__
/**** Some unix only stat code here! ****/
#endif
}
the line SomeWindowsFileCall(filename); causes the compiler error:
cannot convert parameter 1 from 'const char *' to 'LPCWSTR'
How do I fix this, without changing the SomeFileCall prototype?
Most of the Windows APIs that take strings have two versions: one that takes char * and one that takes WCHAR * (the latter is equivalent to wchar_t *).
SetWindowText, for example, is actually a macro that expands to either SetWindowTextA (which takes char *) or SetWindowTextW (which takes WCHAR *).
In your project, it sounds like all of these macros are referencing the -W versions. This is controlled by the UNICODE preprocessor macro (which is defined if you choose the "Use Unicode Character Set" project option in Visual Studio). (Some of Microsoft's C and C++ run time library functions also have ANSI and wide versions. Which one you get is selected by the similarly-named _UNICODE macro that is also defined by that Visual Studio project setting.)
Typically, both of the -A and -W functions exist in the libraries and are available, even if your application is compiled for Unicode. (There are exceptions; some newer functions are available only in "wide" versions.)
If you have a char * that contains text in the proper ANSI code page, you can call the -A version explicitly (e.g., SetWindowTextA). The -A versions are typically wrappers that make wide character copies of the string parameters and pass control to the -W versions.
An alternative is to make your own wide character copies of the strings. You can do this with MultiByteToWideChar. Calling it can be tricky, because you have to manage the buffers. If you can get away with calling the -A version directly, that's generally simpler and already tested. But if your char * string is using UTF-8 or any encoding other than the user's current ANSI code page, you should do the conversion yourself.
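To illustrate calling the -A version explicitly while leaving the prototype untouched, here is a hedged sketch. The asker's SomeWindowsFileCall is not shown, so GetFileAttributesA stands in for it purely as an assumption, and the sketch tests _WIN32 rather than the asker's __unix__ check.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

/* Same prototype as in the question. As a stand-in for the real work,
   this version returns 1 if `filename` exists and 0 otherwise. */
int SomeFileCall(const char *filename)
{
#ifdef _WIN32
    /* The -A function takes const char * even in a Unicode build, so
       the "cannot convert to LPCWSTR" error does not arise. */
    return GetFileAttributesA(filename) != INVALID_FILE_ATTRIBUTES;
#else
    struct stat st;
    return stat(filename, &st) == 0;
#endif
}
```

This only remains correct as long as filename is in the user's ANSI code page; for UTF-8 input the conversion route described above is the safer choice.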
Bonus Info
The -A suffix stands for "ANSI", which was the common Windows term for a single-byte code-page character set.
The -W suffix stands for "Wide" (meaning the encoding units are wider than a single byte). Specifically, Windows uses little-endian UTF-16 for wide strings. The MSDN documentation simply calls this "Unicode", which is a little bit of a misnomer.
Configure your project to use ANSI character set. (General -> Character Set)
What are TCHAR, WCHAR, LPSTR, LPWSTR, LPCTSTR etc.
typedef const wchar_t* LPCWSTR;
{Project Properties -> Advanced -> Character Set -> Use Multi-Byte Character Set} If you follow these steps, your problem is solved.
You are building with WinApi in Unicode mode, so all string parameters resolve to wide strings. The simplest fix would be to change the WinApi to ANSI, otherwise you need to create a wchar_t* with the contents from filename and use that as an argument.
I was able to solve this error by setting the Character Set to "Use Multi-Byte Character Set":
[Project Properties -> Configuration Properties -> General -> Character Set -> "Use Multi-Byte Character Set"]
I'm not sure which compiler you are using, but in Visual Studio you can specify the default character type, either Unicode or multi-byte. In your case it sounds as if Unicode is the default, so the simplest solution is to find the setting in your compiler that determines the default character type. That saves you some work; otherwise you end up adding code to convert to and from Unicode, which may add unnecessary overhead and could be an additional source of errors.

How to convert argv to wide chars in Win32 command line application?

I'm using the win32 api for C in my program to read from a serial port, it seems to be pretty low level stuff. Assuming that there is no better way of reading from a serial port, the CreateFile function involves a LPCWSTR argument, I've read and it looks like LPCWSTR is a wchar_t type. Firstly, I don't really understand the difference between wchar and char, I've read stuff about ansi and unicode, but I don't really know how it applies to my situation.
My program uses a main function, not wmain, and needs to get an argument from the command line and store it in a wchar_t variable. Now I know I could do this if I just made the string up on the spot;
wchar_t variable[1024];
swprintf(variable,1024,L"%s",L"randomstringETC");
Because it looks like the L converts char arrays to wchar arrays. However it does not work when I do;
wchar_t variable[1024];
swprintf(variable,1024,L"%s",Largv[1]);
obviously because it's a syntax error. I guess my question is, is there an easy way to convert normal strings to wchar_t strings?
Or is there a way to avoid this Unicode stuff completely and read from serial another way using C on windows..
There is no winapi function named CreateFile. There's CreateFileW and CreateFileA. CreateFile is a macro that maps to one of these real functions depending on whether the UNICODE macro is defined. CreateFileW takes an LPCWSTR (aka const wchar_t*), CreateFileA takes an LPCSTR (aka const char*).
If you are not ready yet to move to Unicode then simply use the CreateFileA() function explicitly. Or change the project setting: Project + Properties, General, Character Set. There is a non-zero cost, because the underlying operating system is entirely Unicode based: CreateFileA() goes through a translation layer that turns the const char* into a const wchar_t* according to the current system code page.
The L thing is only for string literals.
You need to convert the argv string (a plain char string) to wchar_t by using something like the C standard library function mbstowcs().
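A minimal sketch of that conversion with mbstowcs follows. The helper name narrow_to_wide is hypothetical. mbstowcs interprets the input according to the current C locale, so non-ASCII arguments may additionally need a setlocale call, and MSVC may flag mbstowcs with a deprecation warning that points to mbstowcs_s.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts the narrow string `src` into the wide
   buffer `dst`, which holds `cap` wide characters. Returns 0 on
   success, -1 on an invalid input sequence or if `dst` is too small. */
int narrow_to_wide(const char *src, wchar_t *dst, size_t cap)
{
    if (cap == 0)
        return -1;

    size_t n = mbstowcs(dst, src, cap);
    if (n == (size_t)-1)
        return -1;              /* invalid multibyte sequence */

    if (n == cap) {
        /* Buffer filled completely: no room was left for the
           terminator, so the result is truncated. */
        dst[cap - 1] = L'\0';
        return -1;
    }
    return 0;
}
```

In main, something like wchar_t port[1024]; narrow_to_wide(argv[1], port, 1024); would then produce a string that can be passed to CreateFileW.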
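MultiByteToWideChar can be used to map from ANSI to Unicode. To do your swprintf call, you need to define an array of WCHAR and, with the Microsoft C runtime, use the %hs specifier for the narrow argv string (the L prefix cannot be applied to a variable):
WCHAR lala[256] = {0};
swprintf(lala, _countof(lala), L"%hs", argv[1]);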
It is possible to avoid unicode by compiling your application against a multibyte character set but it's bad practice to do this unless you're doing so for legacy reasons. Windows will need to convert it back to unicode at some point eventually anyway because that is the encoding of the underlying OS.

C basic datatype problem - const char * to LPCTSTR

#include "stdafx.h"
#include "string.h"
#include "windows.h"
bool SCS_GetAgentInfo(char name[32],char version[32], char description[256], const char * dwAppVersion)
{
strcpy(name,gName);
strcpy(version,gVersion);
strcpy(description,gDescription);
notify(dwAppVersion);
return true;
}
void notify(const char * msg)
{
MessageBox(NULL, TEXT(msg), NULL, NULL);
}
I have managed to work with the first three fields fine, but I am running into issues with the const char *. I have tried passing and casting in a lot of different ways, but can't get it to work. I googled around, but couldn't find much on Lmsg. I am new to a lot of this. I have read around and I think it may have to do with encoding. What really confuses me is that LPCTSTR is defined as a const char *, but straight typecasting doesn't give me anything from the field.
I get an error that Lmsg is undeclared, which I am guessing means that the macro expansion of TEXT is causing this. How can I get this working?
Doing MessageBox(NULL, (LPCTSTR)msg, NULL, NULL); instead gives me a bunch of boxes, indicating it is probably referencing the wrong characters, but copying the dwAppVersion parameter into the description shows the correct information.
The problem is that you're building your application to use the Unicode Win32 APIs, but you're passing around non-Unicode strings. You have two options:
convert the msg string to Unicode using something like MultiByteToWideChar(). This is probably the 'right' way to do it, if a bit more complex because you need to deal with codepages and managing the buffers used for the conversion.
you can force the ANSI version of the API to be used:
MessageBoxA(NULL, msg, NULL, NULL);
That's a simple workaround, if not elegant.
Other options include only building the application to use Win32 ANSI APIs instead of the Unicode APIs, or changing the strings you pass around to LPTSTR and using the TEXT() or _T() macros for your literals. However, if you're reading non-Unicode data from files or elsewhere, then you still have to deal with the conversion at some point...
An LPCTSTR is an alias for const TCHAR *, and TCHAR is a type used in Windows programming to ease the transition between the ANSI (a locale-dependent code page such as Windows-1252, which is very similar to the internationally standardized ISO 8859-1) and Unicode text encodings.
If your project is set up to build your app using ANSI, TCHAR is really char, and you would be able to pass msg to MessageBox without a cast.
If your app is set up to build using Unicode (which is what it sounds like), TCHAR is really wchar_t, and you would have to convert the string from ANSI to Unicode using a function like MultiByteToWideChar().
Simply casting just forces the compiler to interpret the type differently without changing the data; in this case, that's not enough because the actual data must be converted from one format to another.
It's hard to tell exactly what's going on in your question since you appear to have left some relevant context out. For example LPCTSTR isn't mentioned anywhere, so I can only guess at what you're talking about, or what "the first three fields" are.
One thing to note is that LPCTSTR is not always const char*: it is in an ANSI build, but it is const wchar_t* in a Unicode build. This is most likely the issue you're running into.
Also, the TEXT() macro is only for defining string constants. You can't use it to perform a conversion on a variable, this is why you're getting 'Lmsg undeclared'.
If you aren't intentionally using a Unicode build, you may want to change your project settings to an ANSI build as a work-around. Otherwise, you may want to read a tutorial on working with Unicode, which you really should be familiar with if you are writing software for Windows these days.

Resources