I know that each element of the OS array contains one FCB, but I don't understand how the OS uses them to control files. I don't understand the relation. Please explain simply.
C views each file simply as a sequential stream of bytes. Each file ends either with an end-of-file marker or at a specific byte number recorded in a system-maintained, administrative data structure. When a file is opened, a stream is associated with the file. Three files and their associated streams are automatically opened when program execution begins: the standard input, the standard output and the standard error. Opening a file returns a pointer to a FILE structure (defined in <stdio.h>) that contains information used to process the file. This structure includes a file descriptor, i.e., an index into an operating system array called the open file table. Each array element contains a file control block (FCB) that the operating system uses to administer a particular file. The standard input, standard output and standard error are manipulated using file pointers stdin, stdout and stderr.
Deitel, C How to Program, 6th edition, page 420
I can't understand the meaning of "stream" in the C language. Is it an abstraction (just a name describing many operations)? Is it an object (monitor, keyboard, file on a hard drive) that a program exchanges data with? Or is it a memory space in RAM temporarily holding the exchanged data?
Thanks for the help.
A stream is an abstraction of an I/O channel. It can map to a physical device such as a terminal or tape drive or a printer, or it can map to a file in a file system, or a network socket, or something else completely. How that mapping is accomplished is not exposed to you, the programmer.
From the perspective of your code, a stream is simply a source (input stream) or sink (output stream) of characters (text stream) or bytes (binary stream). Streams are managed through FILE objects and the stdio routines.
As far as your code is concerned, all streams behave the same way, regardless of what they are mapped to. It's a uniform interface to operations that can have wildly different implementations.
A stream is just a sequence of data made available over time. It is distinct from a file, for example, because you can't set the position. Examples: data coming/going through RS232, USB, Ethernet, IP networks, etc.
But my question is: what exactly is a stream at the machine level?
Nothing special. Machine level does not know anything about the streams.
What is exactly a stream in C language?
Same - the core C language does not know anything about streams; they are provided by the standard library (<stdio.h>).
In C when we use the term stream, we indicate any input source or output destination.
Some example may be:
stdin (standard input which is the keyboard by default)
stdout (standard output which by default is the screen)
stderr (standard error which is the screen by default)
Functions such as printf, scanf, gets, puts and getchar use stdin as their input stream and stdout as their output stream, which by default means the keyboard and the screen.
But we can create streams to files too!
The stdio.h library supports two types of files: text files and binary files. Within a text file, the bytes represent characters, which makes it possible for a human to read what the file contains. By contrast, in a binary file, bytes do not necessarily represent characters. In summary, text files have two things that binary files do not. First, text files are divided into lines, and each line ends with one or two special characters; which characters are used depends on the operating system. Second, text files can contain an end-of-file marker (END OF FILE).
Streams are specific to the running program as well. Let me explain this further.
When you run a program through the terminal (Unix-like/Windows), what essentially happens is:
The shell forks a child process, which runs your specified program (./name_of_program).
The child inherits the standard streams of the parent that forked it, so everything written with printf goes to the parent's stdout, and everything read with scanf comes from the parent's stdin.
The operating system handles the characteristics of the streams, i.e. how many bytes can be streamed to stdin/out at once. Generally in Unix it is 4096 bytes. (Hint: Use pipes to overcome this issue).
There are three buffering modes for streams in C: fully buffered, line-buffered and unbuffered. (Hint: put a delay, such as a call to sleep(), between successive printf() calls to see what this means.)
Now, read and write access to files is handled by another OS facility, the file descriptor. File descriptors are non-negative integers used by the OS to keep track of open files and devices (like a serial port).
Basically, the same result as creating a temporary file in the desired file system, opening it, and then unlinking it.
Even better, though unlikely, if this could be done without creating an inode that is visible to other processes.
The ability to do so is OS-specific, since the relevant POSIX function calls all result in a link being generated. Linux in particular has allowed, since version 3.11, the use of O_TMPFILE in the flags argument of open(2) in order to create an anonymous file in a given directory.
There are several POSIX APIs at your disposal:
mkstemp - generates a unique temporary filename from
template, creates and opens the file, and returns an open file
descriptor for the file.
tmpfile - opens a unique temporary file in binary
read/write (w+b) mode. The file will be automatically deleted when
it is closed or the program terminates.
Both of these functions do create files on the filesystem. Creating an inode is unavoidable, if you want to use a real file.
The first provides you a file descriptor for making low-level system calls, like read and write. The second gives you a FILE* for all of the <stdio.h> APIs.
If you don't need/desire an actual file on disk, you should consider the memory stream APIs provided by POSIX.1-2008.
open_memstream() - opens a stream for writing to a buffer.
The buffer is dynamically allocated (as with malloc(3)), and
automatically grows as required.
libtmpfilefd (create a temporary unnamed file) seems to fulfill your requirements.
Looking at the source, this function creates a temporary file with mkstemp and then unlinks the file right after.
Can any of you guys tell me what "int filedes" refers to?
http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html
I've noticed I can put any int in there and it seems to work but I don't know what it's for...
Thanks.
It's a file descriptor. See http://en.wikipedia.org/wiki/File_descriptor. Since it is an index into a table of open files and pipes, more than one descriptor value may return valid data. 0=stdin, 1=stdout and 2=stderr exist by default, or you can use the open function to create your own.
The very first sentence of the description says, "the file associated with the open file descriptor, fildes". In other words, it indicates the file you're reading from. If your read function call works no matter what file descriptor you pass it, your program isn't doing what you think it is.
Somewhere inside the kernel, there is a table comprising file descriptor entries on a per-process basis. A file descriptor is a structure which describes the state of the file. What kind of information does a file descriptor hold? First of all, the position from which the next read/write operation will be performed. Then, the access mode of the file, specified by the open system call. And last but not least, a data structure which represents the on-disk information of the file. In *nix, this is an inode structure. Here, the main question to be answered is: where do the blocks of the file reside on the disk? If you have the inode of a file in memory, you can quickly find where the Nth block of the file is (which means you don't need to parse the path every time and scan each directory in the path to resolve the inode).
In C, what happens when we open a file? As far as I know, the contents of the file are not loaded into memory when we open it. Does it just set the file descriptor? So what is this file descriptor, then? And if the contents of the file are not loaded into memory, then how is the file opened?
Typically, if you're opening a file with fopen or (on a POSIX system) open, the function, if successful, will "open the file" - it merely gives you a value (a FILE * or an int) to use in future calls to a read function.
Under the hood, the operating system might read some or all of the file in, it might not. You have to call some function to request data to be read anyways, and if it hasn't done it by the time you call fread/fgets/read/etc... then it will at that point.
A "file descriptor" typically refers to the integer returned by open in POSIX systems. It is used to identify an open file. If you get a value 3, somewhere, the operating system is keeping track that 3 refers to /home/user/dir/file.txt, or whatever. It's a short little value to indicate to the OS which file to read from. When you call open, and open say, foo.txt, the OS says, "ok, file open, calling it 3 from here on".
This question is not entirely related to the programming language. Although the library does have an impact on what happens when opening a file (using open or fopen, for example), the main behavior comes from the operating system.
Linux, and I assume other OSs, perform read-ahead in most cases. This means that the file is actually read from the physical storage even before you call read for the file. This is done as an optimization, reducing the time for the read when the file is actually read by the user. This behavior can be controlled partially by the programmer, using specific flags for the open functions. For example, the Win32 API CreateFile can specify FILE_FLAG_RANDOM_ACCESS or FILE_FLAG_SEQUENTIAL_SCAN to specify random access (in which case the file is not read ahead) or sequential access (in which case the OS will perform quite aggressive read-ahead), respectively. Other OS APIs might give more or less control.
For the basic POSIX API of open, read and write, which uses a file descriptor, the file descriptor is a simple integer that is passed to the OS and identifies the file. (These functions are not part of ANSI C; the standard C library offers fopen, fread and fwrite instead.) In the OS itself this is most often translated to some structure that contains all the needed information for the file (name, path, seek offsets, size, read and write buffers, etc.). The OS will open the file, meaning that it finds the specific file system entry (an inode under Linux) that corresponds to the path you've given to open, creates the file structure and returns an ID to the user: the file descriptor. From that point on the OS is free to read whatever data it sees fit, even if not requested by the user (reading more than was requested is often done, to at least work in the file system's native block size).
C has no primitives for file I/O; it all depends on what operating system and what libraries you are using.
File descriptors are just abstractions. Everything is done by the operating system.
If the program uses fopen() then a buffering package will use an implementation-specific system call to get a file descriptor and it will store it in a FILE structure.
The system call (at least on Unix, Linux, and the Mac) will look around on (usually) a disk-based filesystem to find the file. It creates data structures in the kernel memory that collects the information needed to read or write the file.
It also creates a table for each process that links to the other kernel data structures necessary to access the file. The index into this table is a (usually) small number. This is the file descriptor that is returned from the system call to the user process, and then stored in the FILE struct.
As already mentioned it is OS functionality.
But for C file I/O most probably you need info on fopen function.
If you will check description for that function, it says :
Description:
Opens a stream.
fopen opens the file named by
filename and associates a stream with
it. fopen returns a pointer to be used
to identify the stream in subsequent
operations.
So on successful completion fopen just returns a pointer to the newly opened stream. And it returns NULL in case of any error.
When you open the file, the file pointer gets the address of a FILE structure that describes the open stream (not the address of the file's contents). You then use different functions to work on the file.
EDIT:
Thanks to Chris, here is the structure named FILE, as defined by one particular implementation (the layout is implementation-specific):
typedef struct {
    int            level;    /* fill/empty level of buffer */
    unsigned       flags;    /* File status flags */
    char           fd;       /* File descriptor */
    unsigned char  hold;     /* Ungetc char if no buffer */
    int            bsize;    /* Buffer size */
    unsigned char *buffer;   /* Data transfer buffer */
    unsigned char *curp;     /* Current active pointer */
    unsigned       istemp;   /* Temporary file indicator */
    short          token;    /* Used for validity checking */
} FILE;
I'm studying for my operating systems midterm and was wondering if I can get some help.
Can someone explain the checks and what the kernel does during the open() system call?
Thanks!
Very roughly, you can think of the following steps:
Translate the file name into an inode, which is the actual file system object describing the contents of the file, by traversing the filesystem data structures.
During this traversal, the kernel will check that you have sufficient access through the directory path to the file, and check access on the file itself. The precise checks depend on what modes were passed to open.
Create what's sometimes called an open file descriptor within the kernel. There is one of these objects for each file the kernel has opened on behalf of any process.
Allocate an unused index in the per-process file descriptor table, and point it at the open file descriptor.
Return this index from the system call as the file descriptor.
This description should be essentially correct for opening plain files and/or directories, but things are different for various sorts of special files, in particular for devices.
I would go back to what the prof told you - there are a lot of things that happen during open(), depending on what you're opening (i.e. a device, a file, a directory), and unless you write what the professor's looking for, you'll lose points.
That being said, it mostly involves the checks to see if this open is valid (i.e. does this file exist, does the user have permissions to read/write it, etc.), then an entry in the kernel handle table is allocated to keep track of the fd and its current file position (and of course, some other things).