What is the maximum length allowed for filenames? And is the maximum different for different operating systems? I'm asking because I have trouble creating and deleting files, and I suspect the errors are caused by long file names.
1. Creating:
I wrote a program that reads an XML source and saves a copy of each file. The XML contains hundreds of <Document> elements, each with child nodes <Name> and <Format>, and the saved file is named based on what I read from the XML. For example, given the snippet below, I will save a file called test.txt:
<Document>
<Name>test</Name>
<Format>.txt</Format>
</Document>
I declared a counter in my code, and I found that not all files were successfully saved. After going through the large XML file, I found that the program fails to save files whose <Name> is a whole paragraph long. I modified my code to save under a different name whenever <Name> is longer than 15 characters, and it ran through with no problem. So I think the issue is that the filenames are too long.
2. Deleting:
I found a random file on my computer that I was not able to delete. The error says the file name is too long, even after I renamed the file to a single character. The file doesn't take up much space, but it is annoying just sitting there doing nothing.
So my overall questions are: What are the maximum and minimum lengths for filenames? Do they differ by operating system? And how can I delete the file I mentioned in 2?
It depends on the filesystem. Have a look here: http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
255 characters (or bytes) is a common maximum length for a single filename these days. Note also that Windows limits the full path (directory names plus filename) to 260 characters by default, which is likely why the file in your second example still could not be deleted after you renamed it: move it to a directory closer to the drive root, or shorten the parent directory names, and the delete should go through.
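If you want the program from your first example to stop guessing at a safe length, POSIX systems can report the per-filename limit at run time. A minimal C sketch (querying the current directory as an example):

#include <stdio.h>
#include <unistd.h>
#include <limits.h>

int main(void)
{
    /* _PC_NAME_MAX: maximum length of one filename component on the
       filesystem that holds the given directory ("." here). */
    long name_max = pathconf(".", _PC_NAME_MAX);
    if (name_max == -1)
        name_max = NAME_MAX;   /* no limit reported; fall back to the compile-time value */
    printf("Max filename length here: %ld characters\n", name_max);
    return 0;
}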
I have many files that are extracted to .txt with a batch file, but they don't have headers. I've read that a possible solution is to append the exported rows to a .txt file that already contains the headers.
With this:
rem append a blank line, then the exported data, to the header file
echo. >> titles.txt
type data.txt >> titles.txt
This takes a lot of time and is inefficient, since it appends the big data file to the small file with the headers.
Another possible solution is to hardcode the titles into the SQL query, but this changes the type of the columns (if they are numeric, they are changed to varchar).
Is there a way to insert the headers into the first row of the data .txt, rather than the other way around?
I might be wrong, but as far as I know (and from earlier experiments in doing exactly this): no, it is not possible. The commands mentioned act on the file sequentially: you can open a file for reading, writing, or appending. If you open titles.txt for writing, it is overwritten, and thus emptied. If you open it for appending, you can only write at the end of the file, so the data always lands after the header. The only way it might work, and it is pretty nasty, is to append the titles to the end of the data file and then, during later processing (e.g. in xls or whatever), re-sort the rows and move the last one to the beginning. But as mentioned: nasty and not really the way to go.
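To illustrate: the copy cannot be avoided; something has to write a fresh file that starts with the headers. A rough C sketch of that rewrite (the filenames are placeholders):

/* Build a new file that starts with the headers, then append the
   data, then swap it into place. */
#include <stdio.h>

int main(void)
{
    FILE *hdr = fopen("titles.txt", "rb");
    FILE *dat = fopen("data.txt", "rb");
    FILE *out = fopen("data.tmp", "wb");
    if (!hdr || !dat || !out) { perror("fopen"); return 1; }

    char buf[8192];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, hdr)) > 0)  /* headers first */
        fwrite(buf, 1, n, out);
    while ((n = fread(buf, 1, sizeof buf, dat)) > 0)  /* then the data */
        fwrite(buf, 1, n, out);

    fclose(hdr); fclose(dat); fclose(out);
    remove("data.txt");              /* Windows rename() needs the target gone */
    rename("data.tmp", "data.txt");
    return 0;
}

On Windows the same concatenation is also available as a one-liner, copy titles.txt + data.txt combined.txt. That is still a full rewrite under the hood, but a much cheaper one than re-appending the big file to the header file for every export.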
If the number of files to process is a bigger problem than any individual file size, switching from bcp to sqlcmd might help.
I want to assign unique file numbers to files during run time.
Creating a hash of the file name is not an option for me, as I do not want any collisions.
One good option would be to create running numbers for all the files. But I do not have access to the source files to walk the directory in the place where I am running my binary.
So I need some option that can extract the file name from the binary (say, using the symbol table, similar to what GDB does). I am not sure how to do that. Any help is appreciated.
You could try to use the inode number (st_ino) of the file itself -- you get that by calling fstat (http://linux.die.net/man/2/fstat).
The inode number is how the file system keeps track of files, and it is unique within a given file system -- so as long as the files are not located on different file systems (different mount points), the inode number is unique.
This holds even if there are multiple links to the same file, if that worries you as well.
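A minimal sketch of reading the inode number (the path is a placeholder):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    int fd = open("somefile.txt", O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

    /* st_ino is unique per filesystem; st_dev tells filesystems apart
       if the files can live on different mount points. */
    printf("inode: %lu on device %lu\n",
           (unsigned long)st.st_ino, (unsigned long)st.st_dev);

    close(fd);
    return 0;
}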
I have a program which logs its activity.
I want to implement a log file mechanism to keep the log file under a certain size, let's say 10 MB.
The log file itself just holds commands the program executed; those commands are variable length.
Right now, the program runs in a Windows environment, but I'm likely to port it to UNIX soon.
I've come up with two methods for managing the log files:
1. Keep multiple files of smaller size, and if the new command would exceed the current file's length limit, truncate the oldest file to zero size and start writing there (a sketch of this follows the list below).
2. Keep a header in the file which holds metadata about the first command in the file and the next place to write to. I also think each command would have to hold metadata about its own length for this to work.
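Something like this C sketch is what I have in mind for method 1 (the file names, slot count, and size cap are placeholders):

#include <stdio.h>
#include <string.h>

#define LOG_FILES     4
#define LOG_MAX_BYTES (10L * 1024 * 1024)

static int current;   /* index of the log file being written */

void log_command(const char *cmd)
{
    char name[32];
    snprintf(name, sizeof name, "app.%d.log", current);

    FILE *f = fopen(name, "a");
    if (!f) return;

    fseek(f, 0, SEEK_END);
    if (ftell(f) + (long)strlen(cmd) + 1 > LOG_MAX_BYTES) {
        fclose(f);
        current = (current + 1) % LOG_FILES;    /* move to the oldest slot */
        snprintf(name, sizeof name, "app.%d.log", current);
        f = fopen(name, "w");                   /* truncate it to zero */
        if (!f) return;
    }
    fprintf(f, "%s\n", cmd);
    fclose(f);
}

int main(void)
{
    log_command("example command");   /* hypothetical usage */
    return 0;
}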
My questions are as follows:
In terms of efficiency, which of these methods would you use, and why?
Is there a unix command / function to do this easily?
Thanks a lot for your help,
Nihil.
On UNIX/Linux platforms there's a logrotate program that manages logfiles. Details can be found, for example, here:
http://linuxcommand.org/man_pages/logrotate8.html
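A minimal configuration sketch, dropped into /etc/logrotate.d/ (the log path and the rotation policy values here are assumptions for illustration):

/var/log/myapp.log {
    size 10M
    rotate 5
    compress
    missingok
    notifempty
}

If your program keeps the log file open while writing, the copytruncate directive is also worth a look: it lets logrotate rotate the log without the program having to reopen the file.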
I am reading info (numbers) from a txt file, and after that I add to those numbers others that I have in another file with the same structure.
At the start of each line in the file is a number that identifies a specific product. That code allows me to search for the same product in the other file. In my program I have to add the other "variables" from one file to the other, and then replace the information in place in one of those files.
I didn't open either of those files with a or a+; I opened them with r and r+, because I want to replace the information in lines that may be in the middle of the file, not append at the end of it.
The program compiles and runs, but when it comes to replacing the info in the file, it just doesn't do anything.
How should I resolve the problem?
A program can replace (overwrite) text in the middle of a file, but the question is whether this should be done. (One note on your symptom: with a stream opened in r+ mode, C requires a call to fseek() or fflush() between a read and a subsequent write; forgetting that is a common reason the write silently does nothing.)
In order to insert larger or smaller text (and close up the gap), a new text file must be written; this assumes the file does not use fixed-width records. The fundamental rule is: copy all the original text before the insertion point to a new file, write the new text, and finally write the remaining original text. This is a lot of work and will slow down even the simplest programs.
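If you can make the records fixed-width, the in-place overwrite becomes straightforward. A minimal sketch (the file name, record length, and replacement text are assumptions):

/* Overwrite a fixed-width record in place via r+ mode. Note the
   fseek() before writing: C requires a positioning call when
   switching between reading and writing on a stream opened for
   update. */
#include <stdio.h>

#define RECORD_LEN 32   /* every line is exactly this long, '\n' included */

int main(void)
{
    FILE *f = fopen("products.txt", "r+");
    if (!f) { perror("fopen"); return 1; }

    long record = 2;    /* overwrite the third record */
    fseek(f, record * RECORD_LEN, SEEK_SET);
    fprintf(f, "%-*s\n", RECORD_LEN - 1, "1007 12.50 3.25");

    fclose(f);
    return 0;
}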
I suggest you design your data layout before you go any further. Also consider using a database, see my post: At what point is it worth using a database?
Your objective is to design the data to minimize duplication and data fetching.
I am working with a text file, which contains a list of processes under my program's control, along with relevant data.
At some point, one of the processes will finish, and it will then need to be removed from the file (as it's no longer under control).
Here is a sample of the file contents (which has entries added "randomly"):
PID=25729 IDLE=0.200000 BUSY=0.300000 USER=-10.000000
PID=26416 IDLE=0.100000 BUSY=0.800000 USER=-20.000000
PID=26522 IDLE=0.400000 BUSY=0.700000 USER=-30.000000
So for example, if I wanted to remove the line that says PID=26416.... how could I do that, without writing the file over again?
I can use external unix commands; however, I am not very familiar with them, so if that is your suggestion, please give an example.
Thanks!
Either you keep the contents of the file in memory and then rewrite the file, or you keep one file per PID with the relevant information in it and simply delete that file when the process is no longer running. Or you could use a database for this instead.
As others have already pointed out, your only real choice is to rewrite the file.
The obvious way to do that with "external UNIX commands" would be grep -v "PID=26416" (or whatever PID you want to remove, obviously).
Edit: It is probably worth mentioning that if the lines are all the same length (as you've shown here) and order doesn't matter, you can delete a line more efficiently by copying the last line into the space being vacated, then shortening the file to eliminate what had been the last line. This only works if they really are all the same length, though (e.g., if you got a PID of '1', you'd need to pad it to the same length as the others in the file).
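A minimal sketch of that trick on POSIX (the function name and how you obtain the record count are assumptions; with the sample lines above, every record is 54 bytes including the newline):

#include <stdio.h>
#include <unistd.h>

#define LINE_LEN 54

int delete_record(const char *path, long victim, long count)
{
    FILE *f = fopen(path, "r+");
    if (!f) return -1;

    char last[LINE_LEN];
    fseek(f, (count - 1) * LINE_LEN, SEEK_SET);    /* grab the last line   */
    fread(last, 1, LINE_LEN, f);
    fseek(f, victim * LINE_LEN, SEEK_SET);         /* paste it over victim */
    fwrite(last, 1, LINE_LEN, f);
    fflush(f);

    ftruncate(fileno(f), (count - 1) * LINE_LEN);  /* chop off the old last line */
    fclose(f);
    return 0;
}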
The only way to do it in place is by copying each character that comes after the deleted line down over the characters that are deleted.
It is far more efficient to simply rewrite the file.
how could I do that, without writing the file over again?
You cannot. Filesystems (perhaps besides more esoteric, record-based ones) do not support insertion or deletion in the middle of a file.
So you'll have to write the lines to a temporary file up to the line you want to delete, skip over that line, and write the rest of the lines to the temporary file. When done, rename/copy the temp file to the original filename.
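A minimal sketch of that approach in C (the temp filename and the prefix-matching rule are assumptions):

#include <stdio.h>
#include <string.h>

int remove_line(const char *path, const char *prefix)  /* e.g. "PID=26416" */
{
    FILE *in  = fopen(path, "r");
    FILE *out = fopen("procs.tmp", "w");
    if (!in || !out) return -1;

    char line[256];
    while (fgets(line, sizeof line, in))
        if (strncmp(line, prefix, strlen(prefix)) != 0)
            fputs(line, out);                /* keep everything else */

    fclose(in);
    fclose(out);
    remove(path);                            /* Windows needs this before rename */
    return rename("procs.tmp", path);
}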
Why are you maintaining these in a text file? That's not the best model for such a task. But if you're stuck with it: if these lines are guaranteed to all be the same length (it appears that way from the sample), and if the order of the lines in the file doesn't matter, then you can write the last line over the line for the process that has died and then shorten the file by one line with the (f)truncate() call if you're on a POSIX system; see Jonathan Leffler's answer in How to truncate a file in C?
But note carefully netrom's answer, which gives three different, better ways to maintain this info.
Also, if you stick with a text file (preferably written from scratch each time from the data structures you maintain, as per netrom's first suggestion), and you want to be sure the file is always well formed, then write the new data into a temp file on the same device (putting it in the same directory is easiest) and then make a rename() call, which is an atomic operation.
You can use sed:
sed -i.bak -e '/PID=26416/d' test
-i is for editing the file in place. With a suffix given (.bak here), it also keeps a backup copy of the original file with that extension added.
-e is for specifying the editing command. /PID=26416/ is the pattern, and the trailing d indicates that all lines matching the pattern should be deleted.
test is the filename
The unix command for it is:
grep -v "PID=26416" myfile > myfile.tmp
mv myfile.tmp myfile
The grep -v part outputs the file without the rows containing the search term.
The > myfile.tmp part creates a new temp file for this output.
The mv part renames the temp file to the original file.
Note that we are rewriting the file here, and moreover, we could lose data if someone writes to the file between the two commands.