How can I validate whether a file name is valid in Windows? - c

Is there a Windows API function that I can pass a string value to that will return a value indicating whether a file name is valid or not?
I need to verify that a file name is valid, and I'm looking for an easy way to do it without re-inventing the wheel. I'm working in straight C, but targeting the Win32 API.
If there's no such function built-in, how would I go about writing my own? Is there a general algorithm or pattern that Windows follows for determining file name validity?

The problem is not so simple, because it depends on what you consider a "valid file name".
When called with a path that starts with the \\?\ prefix, the Windows APIs will happily let you create many names that are deemed invalid in normal paths: the prefix tells them to hand the path to the file system driver without performing any checks. The file systems themselves often do not care what is used as a file name; once the path/name split has been done, they generally treat the name as an opaque sequence of characters.
On the other hand, if you want to play it safe, you should validate according to the rules in the MSDN document you already linked for Win32 names; I don't think any file system is allowed to have more stringent naming rules than these. Violating those rules, although the kernel itself may support it, often causes headaches for "normal" applications that expect to deal with "traditional" Win32 paths.
But, in my opinion, if you have to create the file immediately, the best validation is to actually try to create/open the file, let the OS do the work for you, and be prepared to handle a failure gracefully (GetLastError should then return an error such as ERROR_INVALID_NAME or ERROR_BAD_PATHNAME). This also checks every other restriction on creating the file, e.g. that your application has the appropriate permissions, that the path is not on a read-only medium, ...
If, for some reason, this is not possible, you may like the shell function PathCleanupSpec: given the requested file name and the directory in which it is to be created, it removes all invalid characters (I'm not sure about reserved DOS names; they are not listed in its documentation), making the path "probably valid", and tells you whether any modification was made (so you can also use it purely for validation).
Notice that this function is marked as "modifiable or removable in any future Windows version", although Microsoft's policy is generally that "anything that made its way to a public header will remain public forever".
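As a rough illustration of the "just try to create it" approach, here is a minimal sketch. It is a portable stand-in that uses fopen and errno rather than CreateFile and GetLastError, and the function name create_or_report is made up for this example. The idea is the same either way: attempt the operation and handle the failure, instead of predicting it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to create (or open for appending) the file at
 * `path`. Returns the open stream on success, or NULL after printing why
 * the attempt failed. */
FILE *create_or_report(const char *path)
{
    FILE *fp = fopen(path, "ab");   /* "a" creates the file, never truncates it */

    if (fp == NULL) {
        /* errno is set by fopen on POSIX systems and by the Microsoft CRT;
         * ISO C itself does not require it. */
        fprintf(stderr, "cannot create \"%s\": %s\n", path, strerror(errno));
    }
    return fp;
}
```

On success the caller keeps the returned stream and goes on to write the file, so nothing is validated twice.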

If you are checking whether the file name is valid in the sense "can the file be named like this?":
No, there is no function that checks that directly. You will have to write your own function.
But if you know what a valid file name is (a valid file name does not contain any of the following: \ / : * ? " < > |), that shouldn't be much of a problem.
You could perhaps help yourself with some of the functions from ctype.h (they let you check whether a character belongs to a specific character class):
http://www.cplusplus.com/reference/clibrary/cctype/
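A character check along those lines could look like the sketch below. The function name has_only_valid_chars is made up for this example. Rejecting control characters (codes 1 to 31) goes beyond the list above and follows Microsoft's file naming rules. The function looks at a single file name, not a whole path, and does not deal with reserved device names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` is non-empty and contains none
 * of \ / : * ? " < > | and no control characters (codes 1-31); returns 0
 * otherwise. Checks one path component, not a full path. */
int has_only_valid_chars(const char *name)
{
    static const char forbidden[] = "\\/:*?\"<>|";
    const unsigned char *p;

    if (name == NULL || name[0] == '\0')
        return 0;
    for (p = (const unsigned char *)name; *p != '\0'; ++p) {
        if (*p < 32 || strchr(forbidden, *p) != NULL)
            return 0;
    }
    return 1;
}
```

For instance, has_only_valid_chars("report.txt") returns 1, while "a:b.txt" and "what?.txt" are rejected.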

This function gives you the list of invalid chars for a filename. Up to you to check that your filename doesn't contain any:
public static char[] Path.GetInvalidFileNameChars()
Docs here.
Note that if you want to validate a directory name, you should use GetInvalidPathChars().
EDIT: Oooops! Sorry, I thought you were on .NET. Using Reflector, here's what this function boils down to:
'"', '<', '>', '|',
'\0', '\x0001', '\x0002', '\x0003', '\x0004', '\x0005', '\x0006',
'\a', '\b', '\t', '\n', '\v', '\f', '\r',
'\x000e', '\x000f', '\x0010', '\x0011', '\x0012', '\x0013', '\x0014', '\x0015',
'\x0016', '\x0017', '\x0018', '\x0019', '\x001a', '\x001b', '\x001c', '\x001d',
'\x001e', '\x001f',
':', '*', '?', '\\', '/'
Note that, in addition, there are reserved names such as prn, con, com1, com2,... , lpt1, lpt2,...
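The reserved names need a separate check, since they consist of perfectly ordinary characters. Below is a small sketch in C. The function name is_reserved_device_name is made up, and the list (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) is the classic one, so treat it as an assumption rather than a complete list. A reserved name followed by an extension, such as nul.txt, is treated as reserved too.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the part of `name` before the first '.'
 * is a reserved DOS device name (case-insensitive), 0 otherwise. */
int is_reserved_device_name(const char *name)
{
    static const char *const fixed[] = { "CON", "PRN", "AUX", "NUL" };
    char base[5];
    size_t len, i;

    if (name == NULL)
        return 0;
    len = strcspn(name, ".");          /* length of the part before any extension */
    if (len < 3 || len > 4)
        return 0;                      /* all names in the list have 3 or 4 characters */
    for (i = 0; i < len; ++i)
        base[i] = (char)toupper((unsigned char)name[i]);
    base[len] = '\0';

    if (len == 3) {
        for (i = 0; i < sizeof fixed / sizeof fixed[0]; ++i) {
            if (strcmp(base, fixed[i]) == 0)
                return 1;
        }
        return 0;
    }
    /* len == 4: COM1-COM9 or LPT1-LPT9 */
    if (base[3] < '1' || base[3] > '9')
        return 0;
    return strncmp(base, "COM", 3) == 0 || strncmp(base, "LPT", 3) == 0;
}
```

A name would then be accepted only if it passes both the character check and this one.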

Related

Default extension for message catalog files

I want to localize my application using the catopen()/catgets() family of functions.
As far as I understand, in the absence of NLSPATH variable, message catalogs will be looked up under /usr/share/locale/xx_YY/LC_MESSAGES.
What is the "traditional" file extension for message catalog files? I see some code examples using *.cat while others don't use any extension at all. Is it dependent on a particular UNIX flavour?
On my Linux boxes I see plenty of *.mo files, but those are GNU gettext archives. It seems catgets() can rarely be seen "in the wild" nowadays.
I meant this to be a comment, but it's a bit too long :P
Looking at the doc you've linked to, it seems likely that the code isn't opinionated about the file extension. Since you're not using MIME or anything to automatically find a handler for this file, the only requirement is likely to be that the name is correct. In UNIX, especially in the shell, file extensions often mean nothing to the system - for example, any file extension can be used on an executable script as long as the executable bit is set and the shebang line at the top of the file specifies an appropriate interpreter.
It's possible the user community, if one still exists for this crufty-sounding library, has a standard naming convention that the docs don't describe - but I wouldn't sweat it too much. It's trivial to change file names, even if it means a recompile (command line variables would make the program agnostic as to file name and extension).
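To show what "agnostic as to file name and extension" could mean in code, here is a minimal sketch. The helper names open_catalog and message_or_default are made up for this example. The catalog name is simply passed through to catopen(), so it can carry any extension or none, and a default string is returned when the catalog could not be opened.

```c
#include <nl_types.h>

/* Hypothetical helper: open a catalog by whatever name the caller supplies.
 * Returns (nl_catd)-1 on failure, exactly as catopen() does. */
nl_catd open_catalog(const char *catalog_name)
{
    return catopen(catalog_name, NL_CAT_LOCALE);
}

/* Hypothetical helper: fetch message `msg_id` from set `set_id`, or return
 * `fallback` when the catalog is not open or the message is missing. */
const char *message_or_default(nl_catd cat, int set_id, int msg_id,
                               const char *fallback)
{
    if (cat == (nl_catd)-1)
        return fallback;
    return catgets(cat, set_id, msg_id, fallback);
}
```

The program could take the catalog name from argv or an environment variable and call open_catalog() once at startup, so renaming the file never requires a recompile.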

GSSAPI: gss_export_name returns a blank

I am having a problem with exporting a name using gss_export_name. I thought that once the name is exported I should be able to just print it, but I am turning up a blank. Literally:
EXPORTED NAME: , EXPORTED NAME LENGTH: 47
Here is my code
OM_uint32 major_status;
gss_cred_usage_t usage;
OM_uint32 lifetime;
gss_name_t inquired_name;
major_status = gss_inquire_cred(&minor_status, GSS_C_NO_CREDENTIAL, &inquired_name,
&lifetime, &usage, &oid_set);
gss_buffer_desc exported_name_buffer;
major_status = gss_export_name(&minor_status, inquired_name, &exported_name_buffer);
printf("EXPORTED NAME: %s, EXPORTED NAME LENGTH: %d\n",
exported_name_buffer.value, exported_name_buffer.length);
For clarity I decided not to include checks, but I also take care to make sure that major_status is always GSS_S_COMPLETE.
Appreciate any ideas
Unfortunately the buffer output by gss_export_name is an ASN.1 data structure, not a human-readable string. See section 3.2 of RFC 2743. The token starts with the bytes 0x04 0x01 followed by a two-byte length whose high byte is normally zero, so printf with %s stops almost immediately, which is why the output looks blank even though the length is 47. You'd need to skip over the header of that structure and then parse the name in a mechanism-dependent manner.
Some of the GSS-API developers strongly recommend doing this. As an example, the GSS-API patches to OpenSSH do this for parsing Kerberos names. This is the theoretically correct approach. In practice, though, using gss_display_name and handling the output of that call produces more portable results, even though it may produce strange results in a multi-mechanism application. You'll get significant arguments over how to handle this in the GSS-API community. Everyone will agree that you should use gss_display_name for producing output for debugging and logs. The question is what you should do if you want a name for searching an access control list. If you can directly use the output of gss_export_name and do binary comparisons, do that. However, if you need to compare against input entered by a human, I'd argue that using the output of gss_display_name is better, while others will argue that parsing the gss_export_name output is better.
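For the "skip over the header" route, a sketch of the layout described in RFC 2743 section 3.2 might look like this. The function name find_exported_name is made up for this example, and it works on a plain byte buffer, so it does not need the GSS-API headers. You would pass it exported_name_buffer.value and exported_name_buffer.length.

```c
#include <stddef.h>

/* Hypothetical helper: locate the mechanism-specific name inside an
 * exported name token laid out as in RFC 2743 section 3.2:
 *   2 bytes  token ID 0x04 0x01
 *   2 bytes  mechanism OID length, big-endian
 *   N bytes  DER-encoded mechanism OID
 *   4 bytes  name length, big-endian
 *   M bytes  name
 * Returns 0 and sets *name and *name_len on success (the name is NOT
 * NUL-terminated); returns -1 if the token is malformed. */
int find_exported_name(const unsigned char *tok, size_t tok_len,
                       const unsigned char **name, size_t *name_len)
{
    size_t oid_len, n, pos;

    if (tok == NULL || tok_len < 4 || tok[0] != 0x04 || tok[1] != 0x01)
        return -1;
    oid_len = ((size_t)tok[2] << 8) | tok[3];
    pos = 4;
    /* the OID and the 4-byte name length must both fit in the buffer */
    if (tok_len - pos < oid_len || tok_len - pos - oid_len < 4)
        return -1;
    pos += oid_len;
    n = ((size_t)tok[pos] << 24) | ((size_t)tok[pos + 1] << 16) |
        ((size_t)tok[pos + 2] << 8) | tok[pos + 3];
    pos += 4;
    if (n > tok_len - pos)
        return -1;
    *name = tok + pos;
    *name_len = n;
    return 0;
}
```

What the name bytes mean is up to the mechanism. For Kerberos they are, as far as I know, the principal in text form, in which case printf("%.*s\n", (int)name_len, (const char *)name) would print it.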

Which path format to use in C on Windows, "D:\\source.txt" or "D:/source.txt"?

I only knew that we can't use D:\demo.txt, as \d will be treated as an escape sequence, and hence we have to use D:\\demo.txt. But minutes ago I found out that D:/demo.txt works just as well, since we don't have to worry about escape sequences with /. I am using CodeBlocks on Windows, and I want to know which of these path formats is valid for C on my platform. Here's my code; the commented-out lines work just as well.
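#include <stdio.h>
int main(void)
{
    int ch;   /* int, not char, so that EOF can be told apart from data */
    FILE *fp, *tp;
    fp = fopen("D:\\source.txt", "r");
    //fp = fopen("D:/source.txt", "r");
    tp = fopen("D:\\encrypt.txt", "w");
    //tp = fopen("D:/encrypt.txt", "w");
    if (fp == NULL || tp == NULL) {
        printf("ERROR");
        return 1;   /* do not read from or write to a NULL stream */
    }
    while ((ch = getc(fp)) != EOF)
        putc(~ch, tp);
    fclose(fp);
    fclose(tp);
    return 0;
}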
Windows (like MS-DOS before it) requires back-slashes as the path separator for the command line tools built into/provided by Windows.
Internal functions, however, have always accepted forward or backward slashes interchangeably. Personally, I prefer forward slashes as a general rule, but it's mostly personal preference -- either works fine.
It's true that Windows and MS-DOS accept either the forward slash / or the backslash \ as a directory path delimiter. And there are good arguments for using the forward slash in C code, because it doesn't have to be escaped in string and character literals.
But my own preference is to use the backslash (and remember to escape it properly), because most Windows users likely don't know that you can use / as a directory delimiter. It doesn't matter for an fopen call; these are equivalent (on Windows):
fopen("D:\\foo\\bar\\blah.txt", "r");
fopen("D:/foo/bar/blah.txt", "r");
But if that file name is ever shown to a user, IMHO it's a lot better if the message refers to D:\foo\bar\blah.txt.
You could use forward slashes for paths that are used only internally, and backslashes for paths that appear in the user interface, but that's going to be more difficult and error-prone than using one or the other consistently.
Incidentally, the C language says nothing about which character is used as a path delimiter; the language standard doesn't even specify directory support. It's determined by the operating system and file system.
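If you do go the "forward slashes internally, backslashes in the user interface" route, the conversion itself is tiny. Here is a sketch; the function name to_display_path is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: copy `src` into `dst`, replacing every '/' with '\\'
 * so the path looks familiar to Windows users.
 * Returns 0 on success, -1 if an argument is invalid or `dst` is too small. */
int to_display_path(const char *src, char *dst, size_t dst_size)
{
    size_t i;

    if (src == NULL || dst == NULL || dst_size == 0)
        return -1;
    for (i = 0; src[i] != '\0'; ++i) {
        if (i + 1 >= dst_size) {       /* no room left for the terminator */
            dst[0] = '\0';
            return -1;
        }
        dst[i] = (src[i] == '/') ? '\\' : src[i];
    }
    dst[i] = '\0';
    return 0;
}
```

The path passed to fopen stays untouched; only the copy shown in messages is converted.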

prepend the "\\?\" string to the path - DriverPackageUninstall

I used DriverPackageUninstall, to uninstall my driver. For this API I need to give "Inf Path" as the input. And I need to give this path as UNICODE string. To do this, I took the following statement from MSDN as reference.
For a Unicode string, the maximum length is 32,767 characters. If you
use the Unicode version, prepend the "\\?\" string to the path. For
general information about the format of file path strings, see Naming
a File in the MSDN Library.
But when I try the same in my code it's not working. Can someone give me some examples on how to prepend the "\\?\" before the path? Thanks..
UPDATE :
I tried with the below code as sample
#define UNICODE
#define _UNICODE
#define WINVER 0x501
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <tchar.h>
int main () {
    PTCHAR DriverPackageInfPath = TEXT("\\?\\c:\\Documents and Settings\\Desktop\\My.inf");
    FILE * Log;
    Log = _wfopen( DriverPackageInfPath, TEXT("a") );
    if ( Log == NULL ) {
        MessageBox(NULL, TEXT ( "Unable to open INF file\n" ),
                   TEXT ( "Installation Error" ), 0 | MB_ICONSTOP );
        exit ( 1 );
    } else {
        printf ("INF file opened successfully\n");
    }
    return 0;
}
UPDATE:
".\dist\Driver\My.inf" How to add "\\?\" before this kind of paths? "\\?\.\dist\Driver\My.inf" is not working.
You have an error in the string constant:
TEXT("\\?\\c:\\Documents ...."
should be
TEXT("\\\\?\\c:\\Documents ...."
Read carefully, escape carefully : http://msdn.microsoft.com/en-us/library/windows/hardware/ff552316%28v=vs.85%29.aspx
UPDATE:
From http://msdn.microsoft.com/en-us/library/aa365247.aspx :
Win32 File Namespaces
The Win32 namespace prefixing and conventions are summarized in this section and the following section, with descriptions of how they are used. Note that these examples are intended for use with the Windows API functions and do not all necessarily work with Windows shell applications such as Windows Explorer. For this reason there is a wider range of possible paths than is usually available from Windows shell applications, and Windows applications that take advantage of this can be developed using these namespace conventions.
For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system. For example, if the file system supports large paths and file names, you can exceed the MAX_PATH limits that are otherwise enforced by the Windows APIs. For more information about the normal maximum path limitation, see the previous section Maximum Path Length Limitation.
Because it turns off automatic expansion of the path string, the "\\?\" prefix also allows the use of ".." and "." in the path names, which can be useful if you are attempting to perform operations on a file with these otherwise reserved relative path specifiers as part of the fully qualified path.
Win32 Device Namespaces
The "\\.\" prefix will access the Win32 device namespace instead of the Win32 file namespace. This is how access to physical disks and volumes is accomplished directly, without going through the file system, if the API supports this type of access. You can access many devices other than disks this way (using the CreateFile and DefineDosDevice functions, for example).
For example, if you want to open the system's serial communications port 1, you can use "COM1" in the call to the CreateFile function. This works because COM1–COM9 are part of the reserved names in the NT namespace, although using the "\\.\" prefix will also work with these device names. By comparison, if you have a 100 port serial expansion board installed and want to open COM56, you cannot open it using "COM56" because there is no predefined NT namespace for COM56. You will need to open it using "\\.\COM56" because "\\.\" goes directly to the device namespace without attempting to locate a predefined alias.
Another example of using the Win32 device namespace is using the CreateFile function with "\\.\PhysicalDiskX" (where X is a valid integer value) or "\\.\CdRomX". This allows you to access those devices directly, bypassing the file system. This works because these device names are created by the system as these devices are enumerated, and some drivers will also create other aliases in the system. For example, the device driver that implements the name "C:\" has its own namespace that also happens to be the file system.
APIs that go through the CreateFile function generally work with the "\\.\" prefix because CreateFile is the function used to open both files and devices, depending on the parameters you use.
If you're working with Windows API functions, you should use the "\\.\" prefix to access devices only and not files.
Most APIs won't support "\\.\"; only those that are designed to work with the device namespace will recognize it. Always check the reference topic for each API to be sure.
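So your relative path cannot simply be prefixed: because \\?\ turns off the expansion of "." and "..", \\?\.\dist\driver\My.inf is not resolved against the current directory, and the prefix only works with fully qualified paths. Convert the relative path to a fully qualified one first (for example with GetFullPathName) and prepend the prefix to the result, which then has the form
\\?\<drive>:\<current directory>\dist\driver\My.inf
In a C string literal every backslash has to be doubled, so the prefix itself is written
\\\\?\\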
You only need to prepend \\?\ to the path if it is longer than MAX_PATH characters.
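To make the escaping concrete, here is a small sketch that builds the prefixed string at run time. The function name add_long_path_prefix is made up for this example. It accepts only drive-qualified absolute paths, on the assumption that a \\?\ path has to be fully qualified, and it leaves UNC paths (which would need \\?\UNC\) out entirely.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: write "\\?\" followed by `abs_path` into `out`.
 * `out_len` is the capacity of `out` in wide characters.
 * Returns 0 on success, -1 if the path is not of the form X:\... or if
 * `out` is too small. */
int add_long_path_prefix(const wchar_t *abs_path, wchar_t *out, size_t out_len)
{
    static const wchar_t prefix[] = L"\\\\?\\";   /* the four characters \\?\ */
    size_t need;

    if (abs_path == NULL || out == NULL)
        return -1;
    /* require a drive letter, a colon and a backslash */
    if (!(((abs_path[0] >= L'A' && abs_path[0] <= L'Z') ||
           (abs_path[0] >= L'a' && abs_path[0] <= L'z')) &&
          abs_path[1] == L':' && abs_path[2] == L'\\'))
        return -1;
    need = wcslen(prefix) + wcslen(abs_path) + 1;   /* +1 for the terminator */
    if (need > out_len)
        return -1;
    wcscpy(out, prefix);
    wcscat(out, abs_path);
    return 0;
}
```

Note that the eight characters in the literal L"\\\\?\\" produce just the four characters \\?\ in memory; a relative path such as .\dist\Driver\My.inf is rejected and has to be made absolute first.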

_findfirst and wildcard matching

I am trying to use _findfirst() Windows API in C to match file name using wildcards.
If I am passing ????????.txt then I am expecting it will match all the files in a directory with 8 characters only, but it matches more than that.
Is there anything wrong with this usage?
I would guess that it is matching on the short name. On Windows, files normally have both a long name and a DOS 8.3 short name. Therefore "????????.txt" is effectively the same as "*.txt".
Also, on a pedantic note, _findfirst() is not part of the Windows API. It is part of the Microsoft C run-time library.
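If that guess is right, one way around it is to search with "*.txt" and filter the returned long names yourself. Below is a sketch of such a filter. The function name wildcard_match is made up for this example, and it implements the strict reading the question expects: '?' matches exactly one character.

```c
#include <ctype.h>

/* Hypothetical helper: match `name` against `pattern`, where '?' matches
 * exactly one character and '*' matches any run of characters (including
 * none). The comparison ignores case, as Windows file names usually do. */
int wildcard_match(const char *pattern, const char *name)
{
    while (*pattern != '\0') {
        if (*pattern == '*') {
            ++pattern;
            /* try the rest of the pattern at every remaining position,
             * including the end of the name */
            do {
                if (wildcard_match(pattern, name))
                    return 1;
            } while (*name++ != '\0');
            return 0;
        }
        if (*name == '\0')
            return 0;
        if (*pattern != '?' &&
            tolower((unsigned char)*pattern) != tolower((unsigned char)*name))
            return 0;
        ++pattern;
        ++name;
    }
    return *name == '\0';
}
```

Each name returned by _findfirst()/_findnext() would be passed to wildcard_match("????????.txt", name), keeping only the names for which it returns 1.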
