I'm not asking about general syntactic rules for file names. I mean gotchas that jump out of nowhere and bite you. For example, trying to name a file "COM<n>" on Windows?
From: http://www.grouplogic.com/knowledge/index.cfm/fuseaction/view_Info/docID/111.
The following characters are invalid as file or folder names on Windows using NTFS: / ? < > \ : * | " and any character you can type with the Ctrl key.
In addition to the above illegal characters the caret ^ is also not permitted under Windows Operating Systems using the FAT file system.
Under Windows using the FAT file system file and folder names may be up to 255 characters long.
Under Windows using the NTFS file system, file and folder names may be up to 255 characters long.
Under Windows, the maximum length of a full path on both file systems is 260 characters.
In addition to these characters, the following conventions are also illegal:
Placing a space at the end of the name
Placing a period at the end of the name
The following file names are also reserved under Windows:
aux,
com1,
com2,
...
com9,
lpt1,
lpt2,
...
lpt9,
con,
nul,
prn
Full description of legal and illegal filenames on Windows: http://msdn.microsoft.com/en-us/library/aa365247.aspx
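The rules listed above can be sketched as a small checker. This is only an illustration of the list in this answer, not a complete implementation of the Windows naming rules; the function name windows_name_problems and the 255-character limit used here are assumptions made for the example.

```python
import re

# Characters listed above as invalid, plus control characters (codes 0-31).
_ILLEGAL = re.compile(r'[/?<>\\:*|"\x00-\x1f]')

# Reserved device names from the list above (compared case-insensitively).
_RESERVED = {"aux", "con", "nul", "prn"} | {
    f"{dev}{n}" for dev in ("com", "lpt") for n in range(1, 10)
}


def windows_name_problems(name):
    """Return a list of reasons why `name` is a risky Windows file name."""
    if not name:
        return ["empty name"]
    problems = []
    if _ILLEGAL.search(name):
        problems.append("contains an illegal character")
    if name.endswith(" "):
        problems.append("ends with a space")
    if name.endswith("."):
        problems.append("ends with a period")
    # Reserved names also cause trouble with an extension, e.g. "NUL.TXT".
    stem = name.split(".", 1)[0].rstrip(" ").lower()
    if stem in _RESERVED:
        problems.append("uses a reserved device name")
    if len(name) > 255:  # assumed per-name limit
        problems.append("longer than 255 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("report.txt", "COM1", "nul.txt", "a<b.txt", "notes."):
        print(repr(candidate), windows_name_problems(candidate))
```

An empty list means none of the gotchas above were detected, which is not the same as a guarantee that the name is valid.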
A tricky Unix gotcha if you don't know about it:
Files which start with - or -- are legal but a pain in the butt to work with, as many command line tools think you are providing options to them.
Many of those tools have a special marker "--" to signal the end of the options:
gzip -9vf -- -mydashedfilename
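When building such a command from a program, the same two defences apply: pass "--" before the file name, or turn the name into a path that cannot be mistaken for an option. A minimal sketch, where protect_dash_name and gzip_command are hypothetical helper names:

```python
def protect_dash_name(filename):
    """Prefix a name that starts with '-' with './' so it reads as a path."""
    if filename.startswith("-"):
        return "./" + filename
    return filename


def gzip_command(filename):
    """Build the argument list from the example above; '--' ends the options."""
    return ["gzip", "-9vf", "--", filename]


if __name__ == "__main__":
    print(protect_dash_name("-mydashedfilename"))
    print(gzip_command("-mydashedfilename"))
```

The "./" prefix also helps with tools that do not understand "--".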
As others have said, device names like COM1 are not possible as filenames under Windows because they are reserved devices.
However, there is an escape method to create and access files with these reserved names. For example, this command redirects the output of the ver command into a file called COM1:
ver > "\\?\C:\Users\username\COM1"
Now you will have a file called COM1 that 99% of programs won't be able to open, and that will probably freeze if they try to access it.
Here's the Microsoft article that explains how this "file namespace" works. Basically it tells Windows not to do any string processing on the text and to pass it straight through to the filesystem. This trick can also be used to work with paths longer than 260 characters.
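Because Windows does no string processing on such paths, the path has to be absolute and already normalized before the prefix is added. A rough sketch of a helper that does this, assuming a hypothetical function name to_extended_path and using ntpath so that it can be run on any platform:

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"          # the characters \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # the characters \\?\UNC\


def to_extended_path(path):
    """Return `path` with the \\\\?\\ prefix; `path` must be absolute."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended form
    if not ntpath.isabs(path):
        raise ValueError("the extended prefix only works with absolute paths")
    # Windows will not collapse '..' or convert '/' for us, so do it here.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC path such as \\server\share\file
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path


if __name__ == "__main__":
    print(to_extended_path("C:/Users/username/COM1"))
```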
The boost::filesystem Portability Guide has a lot of good info.
Well, for MSDOS/Windows, NUL, PRN, LPT<n> and CON. They even cause problems if used with an extension: "NUL.TXT"
Unless you're touching special directories, the only illegal names on Linux are '.' and '..'. Any other name is possible, although accessing some of them from the shell requires using escape sequences.
EDIT: As Vinko Vrsalovic said, files starting with '-' and '--' are a pain from the shell, since those character sequences are interpreted by the application, not the shell.
For example: '">sometext<.txt
I am currently trying to save a file in that form, so that if I upload the file to a website I can hopefully find the XSS bug.
Windows (but not necessarily NTFS) prohibits the following characters in filenames: \/:*?"<>|, which precludes the characters necessary for most XSS attacks (<>"). Windows also disallows reserved DOS device file-names like COM, NUL, etc (though it is possible to create a file with that name, it cannot be done using the normal Win32 filesystem API).
Linux (and UNIX and POSIX in general) is more permissive: every character is allowed in a filename except for / (the directory separator character) and \0 (NUL, a raw zero byte).
I imagine an insecure web-application that saves uploaded files with their filenames intact and without having sanitized filenames probably will succumb to an XSS attack - unless they're also careful to never render HTML raw.
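On the defending side, "never render HTML raw" amounts to escaping the filename before it is placed in a page. A minimal sketch using Python's html.escape; the render_file_item helper and the list-item markup are made up for the example:

```python
import html


def render_file_item(filename):
    """Return an HTML list item with the filename safely escaped."""
    safe = html.escape(filename, quote=True)
    return f"<li>{safe}</li>"


if __name__ == "__main__":
    # The filename from the question, as it could arrive from a Linux client.
    print(render_file_item("'\">sometext<.txt"))
```

The escaped output no longer contains raw quotes or angle brackets, so the browser displays the name instead of interpreting it.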
Windows prohibits these characters. But you could try Azure Blob Storage
Do some file systems care about spaces at the beginning or end of a file or directory name?
Do they (the file systems) convert "/ directory /" to "/directory/" when creating a file?
English is not my native language, so I apologize for any mistakes.
Yes they do care.
For instance in Linux Ext3 / Ext4:
touch "file1"
touch " file1"
touch "file1 "
This will create three different files: one without spaces, one with a leading space, and one with a trailing space.
It works just the same with directories, as Linux follows the Unix principle of everything is a file.
The Windows file naming rules advise against using trailing spaces for files or directories, even though the underlying filesystem may support them.
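The same experiment can be run from a script to check a particular system. This sketch creates the three names in a temporary directory and lists what actually exists afterwards; on Linux all three names should show up, while the result on Windows depends on how the API treats the trailing space. The helper name create_spaced_names is an assumption for the example.

```python
import os
import tempfile


def create_spaced_names(directory):
    """Create 'file1', ' file1' and 'file1 ' and return what the directory holds."""
    for name in ("file1", " file1", "file1 "):
        with open(os.path.join(directory, name), "w"):
            pass  # an empty file is enough, like touch
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    names = create_spaced_names(tmp)

print(len(names), names)
```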
I have run into a problem in one of my Tcl scripts where I am uploading a file from a Windows computer to a Unix server. I would like to get just the original file name from the Windows file and save the new file with the same name. The problem is that [file tail windows_file_name] does not work, it returns the whole file name like "c:\temp\dog.jpg" instead of just "dog.jpg". File tail works correctly on a Unix file name "/usr/tmp/dog.jpg", so for some reason it is not detecting that the file is in Windows format. However Tcl on my Windows computer works correctly for either name format. I am using Tcl 8.4.18, so maybe it is too old? Is there another trick to get it to split correctly?
Thanks
The problem here is that on Windows, both \ and / are valid path separators as far as the Windows API is concerned (even though only \ is deemed to be "official" on Windows). On the other hand, in POSIX, the only valid path separator is /, and the only two bytes which can't appear in a pathname component are / and \0 (a byte with value 0).
Hence, on a POSIX system, "C:\foo\bar.baz" is a perfectly valid short filename, and running
file normalize {C:\foo\bar.baz}
would yield /path/to/current/dir/C:\foo\bar.baz. By the same logic, [file tail $short_filename] is the same as $short_filename.
The solution is to either do what Glenn Jackman proposed or to somehow pass the short name from the browser via some other means (some JS bound to an appropriate file entry?). Also you could attempt to detect the user's OS from the User-Agent header.
To make Glenn's idea more agnostic to user's platform, you could go like this:
Scan the file name for "/".
If none found, do set fname [string map {\\ /} $fname] then go to the next step.
Use [file tail $fname] to extract the tail name.
It's not very bullet-proof, but supposedly better than nothing.
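For comparison, the same three steps can be sketched in Python (not Tcl); upload_tail is a hypothetical name, and the heuristic has the same weakness as above: a genuine POSIX file name that contains backslashes but no slash gets split too.

```python
import posixpath


def upload_tail(fname):
    """Return the last path component of a name sent by a Windows or Unix client."""
    # Steps 1 and 2: with no "/" present, assume a Windows path and convert it.
    if "/" not in fname:
        fname = fname.replace("\\", "/")
    # Step 3: take the tail using POSIX rules.
    return posixpath.basename(fname)


if __name__ == "__main__":
    print(upload_tail("c:\\temp\\dog.jpg"))
    print(upload_tail("/usr/tmp/dog.jpg"))
```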
You could always do [lindex [split $windows_file_name \\] end]
Is putting a space in a directory name still a big deal? I've been doing some reading, but all the articles are from the early 2000s. Is it a problem now?
For those who don't get what I mean: public_html/space directory/index.html
If this is still an issue, why shouldn't I use spaces when naming files and directories?
Spaces in URLs are still special characters that need to be escaped or encoded (as %20 in the path, or as + in a form-encoded query string).
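As an illustration, Python's urllib.parse shows both encodings for the path from the question; the query parameter name "dir" is made up for the example.

```python
from urllib.parse import quote, urlencode

path = "public_html/space directory/index.html"

# Path segments: the space becomes %20 and the slashes are kept.
encoded_path = quote(path)

# Form-encoded query strings: the space becomes +.
encoded_query = urlencode({"dir": "space directory"})

print(encoded_path)   # public_html/space%20directory/index.html
print(encoded_query)  # dir=space+directory
```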
Well, I am still crossing my fingers when executing external processes (from ant or Java's ProcessBuilder, for example). If you just pass this dir to the external process within the command, it may break apart into two arguments, which is clearly not what you want.
Some quoting and minding the spaces is still required in some use cases.
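The difference is easy to see by asking a child process how many arguments it received. This sketch uses Python's subprocess module rather than ant or ProcessBuilder, purely as an illustration; count_received_args is a hypothetical helper.

```python
import shlex
import subprocess
import sys

# Tiny child program that prints how many arguments it was given.
COUNT_ARGS = "import sys; print(len(sys.argv) - 1)"


def count_received_args(*args):
    """Run the child with `args` passed as separate list items."""
    result = subprocess.run(
        [sys.executable, "-c", COUNT_ARGS, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    directory = "space directory"
    # Passed as one list item, the directory stays one argument.
    print(count_received_args(directory))
    # Split on whitespace, as a naive command string would be, it becomes two.
    print(count_received_args(*directory.split()))
    # If a shell command string is unavoidable, quote the value first.
    print(shlex.quote(directory))
```

Passing arguments as a list avoids the quoting question entirely, which is usually the simpler choice.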
In some documentation, I have gotten the instructions to write
SERVER_PATH\theme\
When I check _SERVER["DOCUMENT_ROOT"] from php info, it's
/storage/content/75/113475/frilansbyran.se/public_html
this renders of course
/storage/content/75/113475/frilansbyran.se/public_html\theme\
This looks really weird to me. What's the difference, and which should I use? (It is a Unix server.)
Using backslashes is typically the Windows way of writing paths, e.g.:
C:\Windows\System32
Forward slashes are usually used on Unix systems, e.g.:
/usr/home/jdoe
If you are using a Unix server I would stick to forward slashes (/), though in practice the system is often smart enough to accept either.
The path separator in Unix is /. In the shell, the backslash \ is used to escape some special characters (including the space) in directory and file names.
The backslash \ is used as a path separator in the Windows world. It is possible that some documentation uses it.
If you are on Unix, use slash / only.
In *nix, forward slash [/] is used as the directory separator, so I would use that.
Back slashes [\] are used as directory separators on Windows systems.
To add to what others have said:
in URLs on the web, the Unix-style forward-slash / is always used; a backslash won't work in most situations.
in Windows filepaths, the forward-slash is an acceptable alternative to the backslash, so although it's not the normal way of writing it, C:/Foo/Bar will work.
So if in doubt, use the forward slash.
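Python's pathlib can illustrate both points without touching the disk, since its "pure" path classes only apply each platform's parsing rules. The variable names below are chosen for the example; the document root is the one from the question.

```python
from pathlib import PurePosixPath, PureWindowsPath

doc_root = PurePosixPath("/storage/content/75/113475/frilansbyran.se/public_html")

# Joining with the platform's own separator gives the expected directory.
theme_dir = doc_root / "theme"

# Gluing "\theme\" onto a Unix path does not add a directory level:
# the backslashes simply become part of the last component's name.
glued = PurePosixPath(str(doc_root) + "\\theme\\")

# Under Windows rules, both slash styles describe the same path.
same_on_windows = PureWindowsPath("C:/Foo/Bar") == PureWindowsPath("C:\\Foo\\Bar")

print(theme_dir)
print(repr(glued.name))
print(same_on_windows)
```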