I want to assign unique file numbers to files at run time.
Creating a hash of the file name is not an option for me, as I do not want any collisions.
One good option is to create running numbers for all files. But I do not have access to the source files to walk the directory in the place where I am running my binary.
So I need some option that can extract the file name from the binary (say, using a symbol table, similar to GDB). I am not sure how to do that. Any help is appreciated.
You could try to use the inode number (st_ino) of the file itself -- you get that from fstat (http://linux.die.net/man/2/fstat).
The inode number is how the file system keeps track of files, and it is unique within a given file system -- hence, as long as the files are not located on different file systems (different mount points), the inode number is unique.
This includes the case where there are multiple links to the same file, if that worries you as well.
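For illustration, here is a minimal C sketch of reading st_ino with fstat(); the command-line argument is just an example, and if your files may live on different file systems, note the st_dev pairing mentioned in the comment:

#include <stdio.h>
#include <stdint.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        close(fd);
        return 1;
    }

    /* st_ino is unique only within one file system; pairing it with
       st_dev gives a key that is unique across mount points too. */
    printf("device: %ju  inode: %ju\n",
           (uintmax_t)sb.st_dev, (uintmax_t)sb.st_ino);

    close(fd);
    return 0;
}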
In a C program, I have got the inode number of a file, and I need to get the file name for this inode number.
I have a hint: get the directory entry for this inode number, which will, of course, contain the file name. But I am unable to figure out how to get the directory entry for a file from its inode number.
I need to do this in a C program on Ubuntu. Any solutions?
A single file, identified by its inode number, may have any number of links (a.k.a. "directory entries" a.k.a. "names") associated with it. (It is a bit more complicated than that, because files that are open have additional links that may or may not be directory entries, but that's not important for our purposes). One can add and remove links that refer to the same file (inode) freely (unless that file is actually a directory). The file doesn't keep track of links associated with it. It only keeps track of their number (the reference count). As soon as the number of links goes down to zero, the file gets deleted.
Given just an inode number, there's absolutely positively no way whatsoever to find any or all of its associated directory entries, short of checking all directory entries of a filesystem.
N.B. Links above are hard links. Soft links are something else entirely.
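To make the brute-force option concrete, here is a rough sketch using nftw(3) that walks a tree and prints every name whose inode matches; treat it as illustrative rather than production code:

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

static ino_t target;   /* the inode number we are searching for */

static int check(const char *fpath, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf)
{
    (void)typeflag;
    (void)ftwbuf;
    if (sb->st_ino == target)
        printf("%s\n", fpath);   /* a file may have several names */
    return 0;                    /* keep walking */
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <directory> <inode>\n", argv[0]);
        return 1;
    }
    target = (ino_t)strtoull(argv[2], NULL, 10);

    /* FTW_PHYS: do not follow symlinks; FTW_MOUNT: stay on one file
       system, since inode numbers are only unique per file system. */
    if (nftw(argv[1], check, 64, FTW_PHYS | FTW_MOUNT) == -1) {
        perror("nftw");
        return 1;
    }
    return 0;
}

From the shell, find /mount/point -xdev -inum N performs the same scan.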
I understand that the whole reason the file name is not stored within the inode is so that multiple hard links can point to the same inode number. But symbolic links use the file name to access the file, rather than the i-number. So could a symbolic link point to a file name that was stored within the inode itself?
Hard links only work within the same file system.
To work across file systems, soft links point to a file name, which in turn [may] point to an inode.
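A small sketch of the difference (the paths are hypothetical; the point is that link() fails with EXDEV across file systems, while symlink() does not care):

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

int main(void)
{
    /* hypothetical paths: assume /home and /tmp are separate file systems */
    if (link("/home/user/data.txt", "/tmp/data-hard.txt") == -1)
        printf("link():    %s\n", strerror(errno));  /* expect EXDEV */

    /* a symlink only records the target name as bytes, so the call
       succeeds regardless of where (or whether) the target exists */
    if (symlink("/home/user/data.txt", "/tmp/data-soft.txt") == -1)
        printf("symlink(): %s\n", strerror(errno));
    return 0;
}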
Symlinks don't point to files; they are just a bunch of bytes which replace the symlink's own name during lookup.
That is, if 'foo' is a symlink to 'bar', looking up 'foo/crap' will end up looking up 'bar/crap'.
It is trivial to construct examples where following a symlink makes you end up at different files. For instance, one can chroot(1) somewhere and then follow a symlink which is either absolute or has several '..'s. Let's say the symlink is named 'crap', contains "/meh", and is placed in /mycontainer. Then processes which chroot to /mycontainer and follow 'crap' will end up at /mycontainer/meh from the "outside" perspective. Meanwhile, processes which did not chroot will end up at /meh.
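To see that a symlink really is just stored bytes, readlink(2) returns them without resolving anything; a minimal sketch, assuming a symlink named 'crap' exists in the current directory as in the example above:

#include <stdio.h>
#include <unistd.h>
#include <limits.h>

int main(void)
{
    char buf[PATH_MAX];

    /* readlink() returns the raw bytes stored in the link;
       it does not resolve them against anything */
    ssize_t len = readlink("crap", buf, sizeof(buf) - 1);
    if (len == -1) {
        perror("readlink");
        return 1;
    }
    buf[len] = '\0';   /* readlink() does not NUL-terminate */
    printf("'crap' contains the bytes: %s\n", buf);
    return 0;
}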
I am trying to delete all files from a directory apart from two (which will be erased, then re-written). One of these files, not to be deleted, contains the names of all files in the folder/directory (the other contains the number of files in the directory).
I believe there (possibly?) are 2 solutions:
Read the names of each file from the un-deleted file and delete them individually until only the final 2 files remain,
or...
Because all other files end in .txt, I could use some sort of filter which would only delete files with this ending.
Which of these 2 would be most efficient and how could it be done?
Any help would be appreciated.
You are going to end up deleting files one by one, regardless of which method you use. Any optimizations you make are going to be minuscule. Without actually timing your algorithms, I'd say they'd both take about the same amount of time (and this would vary from one computer to the next, based on CPU speed, HDD type, etc.). So, instead of debating that, I'll provide you code for both the ways you've mentioned:
Method 1:
import os

def deleteAll(infilepath):
    with open(infilepath) as infile:
        for line in infile:
            # each line carries a trailing newline; strip it before deleting
            os.remove(line.strip())
Method 2:
import os

def deleteAll():
    # the two files to keep (placeholder names from the question)
    blacklist = set(['names/of/files/to/be/deleted', 'number/of/files'])
    for fname in (f for f in os.listdir() if f not in blacklist):
        os.remove(fname)
What is the maximum length allowed for filenames? And is the maximum different for different operating systems? I'm asking because I am having trouble creating and deleting files, and I suspect the errors were caused by long file names.
1. Creating:
I wrote a program that reads an XML source and saves a copy of the file. The XML contains hundreds of <Document> elements, and each has the child nodes <Name> and <Format>; the saved file is named based on what I read in the XML. For example, with the code below, I will save a file called test.txt:
<Document>
<Name>test</Name>
<Format>.txt</Format>
</Document>
I declared a counter in my code, and I found out not all files were successfully saved. After going through the large XML file, I found out the program fails to save the files whose <Name> is a whole paragraph long. I modified my code to save under a different name if <Name> is longer than 15 characters, and it went through with no problem. So I think the issue was that the file name was too long.
2. Deleting:
I found a random file on my computer, and I was not able to delete it. The error said that the file name was too long, even after I renamed the file to 1 character. The file doesn't take up much space, but it was just annoying, sitting there and not doing anything.
So my overall question is: what are the maximum and minimum lengths for filenames? Do they differ based on the operating system? And how can I delete the file I mentioned in 2?
It depends on the filesystem. Have a look here: http://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
255 characters is a common maximum length these days.
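On POSIX systems you can also query the limits at run time instead of hard-coding a number; a small sketch using pathconf(3):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Ask the file system backing the current directory for its limits;
       -1 can mean "no fixed limit" (check errno in real code). */
    long name_max = pathconf(".", _PC_NAME_MAX);
    long path_max = pathconf(".", _PC_PATH_MAX);

    printf("NAME_MAX (longest single file name): %ld\n", name_max);
    printf("PATH_MAX (longest relative path):    %ld\n", path_max);
    return 0;
}

This only covers POSIX. On Windows, the classic limit is MAX_PATH (260 characters) for the entire path, which is likely what bit the deletion case above; tools that use the \\?\ extended-length path prefix can usually remove such files.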
I want to recursively copy one directory into another (like cp -R) using POSIX scandir().
The problem is that when I copy a directory like /sys/bus/, which contains links to higher levels (for example: foo/foo1/foo2/foo/foo1/foo2/foo/...), the copy enters an infinite loop and copies the directories "in the middle" forever...
How can I check whether the file I'm opening with dirent is a symbolic link or not?
Look at this: How to check whether two file names point to the same physical file
You need to store a list of inodes that you have visited to make sure that you don't get any duplicates. If you have two hard links to the same file, there is no "one" canonical name. One possibility is to first store all the files and then recurse through all the filenames. You can store the path structure separately from the inodes and file contents.
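To answer the symlink part of the question directly: lstat(2) examines the entry itself instead of following it, so S_ISLNK() on its st_mode tells you whether you are looking at a link. A minimal sketch listing the current directory with scandir():

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <sys/stat.h>

/* Returns 1 if `path` is a symbolic link, 0 if not, -1 on error. */
static int is_symlink(const char *path)
{
    struct stat sb;
    /* lstat() examines the link itself; stat() would follow it */
    if (lstat(path, &sb) == -1)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

int main(void)
{
    struct dirent **entries;
    int n = scandir(".", &entries, NULL, alphasort);
    if (n == -1) {
        perror("scandir");
        return 1;
    }
    for (int i = 0; i < n; i++) {
        printf("%-9s %s\n",
               is_symlink(entries[i]->d_name) == 1 ? "[symlink]" : "",
               entries[i]->d_name);
        free(entries[i]);
    }
    free(entries);
    return 0;
}

Many systems also fill in struct dirent's d_type field (DT_LNK for symlinks), but POSIX does not guarantee it is populated on every file system, so the lstat() check is the portable one.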