I have an app which opens the device files of hard disks, /dev/sda or something like that.
Now let's say my app opens the disk and, in the middle of whatever work is being done on it, I disconnect the disk and connect a different disk, which again shows up as the device file /dev/sda.
Is the file descriptor still valid, or does Linux know it is a different disk and fail operations on that file descriptor accordingly?
A good way to deal with this would be to write a udev rule so that a particular hard disk with a particular vendor ID is mounted in a certain way; that way you could be certain that the file descriptor would fail if you unplugged one hard disk and connected another.
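As an illustration of what typically happens (a sketch, not part of the answer above): the descriptor stays bound to the device that was present when open() was called, so after a hot-unplug, I/O on that descriptor usually fails with an error such as EIO or ENODEV rather than silently reaching the newly attached disk. The device path below is just the one from the question, and the program needs privileges to open it:

```c
/* Minimal sketch: probing whether an fd that was opened on /dev/sda still
 * refers to a usable device. Once the disk behind the fd is unplugged,
 * reads typically fail, even if a new disk has reappeared as /dev/sda. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY);   /* requires appropriate privileges */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* ... the disk may be unplugged and a different one attached here ... */

    char buf[512];
    ssize_t n = pread(fd, buf, sizeof buf, 0);
    if (n < 0)
        fprintf(stderr, "fd no longer usable: %s\n", strerror(errno));
    else
        printf("fd still refers to the original device (%zd bytes read)\n", n);

    close(fd);
    return 0;
}
```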
It's possible to change the UID/GID of the current process programmatically as it runs, with setresgid/setresuid, which affects access rights for files opened in the future.
However, what happens to files that are already opened or memory-mapped? Are they still accessible for I/O operations like read/write? I'm asking more in the context of "non-explicit" I/O operations performed by libraries, for example an sqlite database or other libraries that operate on files more internally. Files opened in DIRECT_IO mode sound even more uncertain in this respect.
When you open a file, your ability to do so is determined by your effective uid and gid at the time you open the file.
When you change your effective uid or gid, it has no effect on any open file descriptors that you may have.
In most cases, if you have a valid file descriptor, that's all you need to read or write the resource that descriptor is connected to. The fact that you hold the valid file descriptor is supposed to be all the proof you need that you have permission to read/write the underlying resource.
When you read or write using an ordinary file descriptor, no additional authorization checks are performed. This is partly for efficiency (because those checks would be expensive to perform on every operation), and partly so that (this may be exactly what you are trying to do) you can open a privileged resource, downgrade your process's privileges, and continue to access the open resource.
Bottom line: Yes, it's entirely possible for a process to use an open file descriptor to read or write a file which (based on its current uid/gid) it would not be able to open.
Footnote: What I've said is true for ordinary Unix file descriptors connected to ordinary resources (files, devices, pipes, network streams, etc.). But as @Mark Plotnick reminds in a comment, some file descriptors and underlying resources are "different" -- NFS and Linux /proc files are two examples. For those, it's possible for additional checks to be performed at the time of read/write.
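A minimal sketch of the "open privileged, then drop privileges" pattern described above. The file path and the uid/gid 65534 ("nobody" on many systems) are illustrative assumptions; the program must start with enough privilege to open the file and call setresgid()/setresuid():

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/shadow", O_RDONLY);   /* readable only by root */
    if (fd < 0) { perror("open"); return 1; }

    /* Drop to an unprivileged gid/uid (group first, then user). */
    if (setresgid(65534, 65534, 65534) < 0 || setresuid(65534, 65534, 65534) < 0) {
        perror("setres[ug]id");
        return 1;
    }

    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);    /* still succeeds: no re-check */
    printf("read after privilege drop: %zd bytes\n", n);

    /* A fresh open() of the same path now fails with EACCES. */
    if (open("/etc/shadow", O_RDONLY) < 0)
        perror("re-open after privilege drop");

    close(fd);
    return 0;
}
```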
From Wikipedia
Disk formatting
Disk formatting is the process of preparing a data storage device such as a hard disk drive, solid-state drive, floppy disk or USB flash drive for initial use.
Mount (computing)
Mounting is a process by which the operating system makes files and directories on a storage device (such as a hard drive, CD-ROM, or network share) available for users to access via the computer's file system.
I don't understand how these two sentences are different
You format a device. You mount a filesystem.
Mounting a Filesystem
Typically, to make a device ready to be written to by the system, you'd have to do the following:
Physically connect the device to the system
(optional but recommended) Create a partition on your device
Create a filesystem on the device/partition
Mount the filesystem on the device/partition to a directory on your system's filesystem, so that you can access the filesystem of the device/partition from your system.
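As a sketch of that last step (the device /dev/sdb1, the ext4 filesystem on it, and the mount point /mnt/usb are all assumptions for illustration), mounting can be done with the mount(2) system call, given sufficient privileges:

```c
/* Minimal sketch of mounting an existing filesystem onto a directory. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* Equivalent to: mount -t ext4 /dev/sdb1 /mnt/usb */
    if (mount("/dev/sdb1", "/mnt/usb", "ext4", 0, NULL) < 0) {
        perror("mount");
        return 1;
    }
    printf("filesystem on /dev/sdb1 mounted at /mnt/usb\n");

    /* Later, detach it again (equivalent to: umount /mnt/usb). */
    if (umount("/mnt/usb") < 0) {
        perror("umount");
        return 1;
    }
    return 0;
}
```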
Formatting a Device
Every filesystem is backed by some block storage device. The storage medium of a block storage device (e.g. a hard disk) is divided into many sections, called sectors. Each sector has a unique address. Whenever you write to, or read from, the device, it is done by writing/reading whole sectors. The sector is therefore the smallest addressable physical unit of a block storage device.
When you format a device, you are adding 'marks' onto the storage medium to mark the location of these sectors.
Source of Confusion
Formatting has nothing to do with the filesystem, or with mounting. The confusion often arises among Windows users, because on Windows the term 'formatting' also includes the creation of the filesystem: when you format a disk there, you are also creating a filesystem on it (and Windows also automatically mounts devices you connect).
To distinguish between the Windows version of 'formatting' and actual formatting, people use the term low-level formatting for actual formatting, and high-level formatting for creating the filesystem.
Formatting a disk or partition usually means that you are creating, or have already created, a volume, and it is the volume that you are formatting. Formatting a volume means writing an empty filesystem, with initialised filesystem structures such as the MFT, to the volume.
Mounting carries the notion of putting one thing on top of another, but what is considered to be on top and what is at the bottom varies.
For mount points, the mount directory and the existing directory tree are considered to be at the bottom, and you mount the volume on top of, i.e. to, the directory. This is despite the mount point actually being logically higher than the volume in the driver stack and in the process of interacting with files. You sometimes see people speak of mounting the filesystem to the mount point, as the Linux df command does. This makes sense because the filesystem is indeed logically between the volume and the mount point.
For mounting filesystems to volumes, the filesystem is on top and the volume is below: you mount the filesystem onto the volume. The filesystem is logically on top of the volume because, on Windows, it is consulted by the I/O Manager before the volume, and it is up to the filesystem to interact with the volume stack.
On Windows, the volume is mounted to its mount points (C:\ if the symbolic link is \DosDevices\C:), and then you mount the filesystem to the volume by formatting the volume. Formatting the volume with a filesystem causes the filesystem driver to mount the volume. The filesystem will then mount itself to the volume every time the volume 'arrives' (is detected) on the system, so long as the volume has been formatted with that filesystem, because the filesystem driver detects its filesystem on the volume.
Formatting is usually done ONCE during the lifetime of the hard drive.
The hard drive is written to, wiping away anything that was on it previously, so that it can hold a filesystem.
Mounting usually happens every time the system has been restarted. It allows the operating system to access the hard drive as a filesystem. Mounting a drive does NOT alter the hard drive, although once a filesystem has been mounted it can be modified (unless it was mounted read-only) by typical filesystem operations like creating a directory/folder, creating files, modifying files, etc ....
Mounting tells the operating system which filesystem type (ext3, ext4, HFS+, NTFS, ZFS ...) the hard drive holds, allowing the OS kernel to map the virtual filesystem functions to the real filesystem functions.
I am developing a system which copies and writes files on NTFS inside a virtual machine. At any time the VM can power off (direct shutdown). The power-off is controlled from the outside, so I do not have any way to detect it. Because of that, files and complete directories that are being written get lost. Is there any way to prevent that, or do I have to develop my own file system? I have to store the files on the local disk and cannot send them via the network.
There always exists a [short] period between when your data is written (sent to the API) and when this data is written to the physical hardware. If the system crashes in the middle, the data will be lost.
There is a setting in Windows to disable the system write cache for certain disks. This setting can help you ensure that the data is at least sent to the host's hardware; probably that's the answer you've been looking for.
Writing your own filesystem won't help much, because it's mainly the write cache that causes the data to be lost. There can also exist a filesystem-level cache, though, and I don't know whether the write cache setting mentioned above affects the internal filesystem cache as well.
If you write data to a file opened with "write through" enabled, the write call only returns after the data has been physically written to the disk, so you can be sure it got written. You normally do that by passing a WRITE_THROUGH flag when you open the file.
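A minimal Windows sketch of that approach, using the Win32 FILE_FLAG_WRITE_THROUGH flag (the file name and data below are placeholders, not from the question):

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* With FILE_FLAG_WRITE_THROUGH, WriteFile does not return until the
     * data has been written through the cache to the disk. */
    HANDLE h = CreateFileA("C:\\data\\important.bin",
                           GENERIC_WRITE,
                           0, NULL,
                           CREATE_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    const char data[] = "critical record";
    DWORD written = 0;
    if (!WriteFile(h, data, sizeof data, &written, NULL)) {
        fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());
        CloseHandle(h);
        return 1;
    }

    /* FlushFileBuffers() is an extra belt-and-braces step; with
     * FILE_FLAG_WRITE_THROUGH it is often redundant. */
    FlushFileBuffers(h);

    CloseHandle(h);
    return 0;
}
```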
I would like to know whether the open() system call in the latest Linux kernel would block if the filesystem is mounted as a remote device, for example a Ceph filesystem or NFS, and there is a network failure of some sort.
Yes. How long depends on the speed (and state) of the uplink, but your process or thread will block until the remote operation finishes. NFS is a bit notorious for this, and some FUSE file systems handle the blocking for whatever has the file handle, but you will block on open(), read() and write(), often at the mercy of the network and the other system.
Don't use O_NONBLOCK to get around it, or you're potentially reading from or writing to a black hole (which would just block anyway).
Yes, an open() call can block when trying to open a file on a remote file system if there is a network failure of some sort.
Depending on how the remote file system is mounted, it may just take a long time (multiple seconds) to determine that the remote file system is unavailable and return unsuccessfully after what seems like an inordinate amount of time, or it may simply lock up indefinitely until the remote resource becomes available once more (or until the mapping is removed from the system).
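Not from the answers above, but one common way to keep the whole process from stalling on such an open() is to perform it on a worker thread and stop waiting after a timeout. A sketch, assuming a placeholder remote path and using the glibc-specific pthread_timedjoin_np() (build with -lpthread):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Worker: this open() may block indefinitely on an unreachable NFS/Ceph mount. */
static void *opener(void *arg)
{
    int fd = open((const char *)arg, O_RDONLY);
    return (void *)(intptr_t)fd;
}

int main(void)
{
    const char *path = "/mnt/nfs/data.bin";    /* placeholder remote path */
    pthread_t tid;

    if (pthread_create(&tid, NULL, opener, (void *)path) != 0) {
        perror("pthread_create");
        return 1;
    }

    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 5;                      /* wait at most 5 seconds */

    void *res;
    if (pthread_timedjoin_np(tid, &res, &deadline) != 0) {
        /* The worker thread (and its open()) may still be stuck. */
        fprintf(stderr, "open() still blocked after 5s; giving up waiting\n");
        return 1;
    }

    int fd = (int)(intptr_t)res;
    if (fd < 0) { perror("open"); return 1; }
    printf("opened %s as fd %d\n", path, fd);
    close(fd);
    return 0;
}
```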
I read somewhere that any device file can be accessed by only one process at a time. But in my case I am able to access my /dev/ttyS0 device file from two different processes at the same time. I opened minicom on /dev/ttyS0 and then wrote a program in C which opens the same file and tries to read from/write to it. I am able to have both open at the same time. Why is it happening in my case?
Comments converted to an answer:
Why not? Lots of processes have a given terminal as the I/O device at any one time, in general.
Are you saying terminal device files are different from other device files?
No; they're the same as other device files, and multiple processes can have most device files open at any given time. Unix/Linux does not enforce exclusive access to devices. Device files such as /dev/null can be in use by many processes at once. Disk devices can be opened by multiple processes too (though generally you only want one process at a time using any given disk device, some DBMSes will have multiple processes accessing a single disk device). When a process forks, both processes have access to the same set of open files.
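If you do want exclusive access to a serial device like the /dev/ttyS0 from the question, you have to request it yourself. A minimal sketch using the TIOCEXCL ioctl, which makes further open() attempts on the tty by other non-root processes fail with EBUSY:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Request exclusive mode: subsequent open()s of this tty by other
     * (non-root) processes fail with EBUSY until TIOCNXCL is issued
     * or this descriptor is closed. */
    if (ioctl(fd, TIOCEXCL) < 0) {
        perror("ioctl(TIOCEXCL)");
        close(fd);
        return 1;
    }

    /* ... use the serial port ... */

    close(fd);
    return 0;
}
```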