This is a question about OS file storage management and Inode - filesystems

This is a review question for a final exam. The lecturer didn't give the answer to the second part. Can anybody solve it, or give me some hints?
Thanks!
[10 points] File Storage Management and Inode
b) Consider the organization of a Unix file as represented by an Inode. Assume that there are 10 direct block pointers, and singly, doubly and triply indirect pointers in each Inode. Assume that the system block size is 4K and that a disk block pointer is 4 bytes.
i. What is the maximum file size supported by the system?
ii. Assuming no information other than the file Inode is in main memory, how many disk accesses are required to access the byte at position 54,423,956?

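10 direct block pointers = 10 4K blocks = 40KB
singly indirect: 1 block full of pointers = 4K / 4 bytes per pointer = 1024 pointers = 1024 4K blocks = 4MB
doubly indirect: 1 block of pointers = 1024 singly indirect blocks = 1024 * 4MB = 4GB
triply indirect: 1 block of pointers = 1024 doubly indirect blocks = 1024 * 4GB = 4TB
total max size = 4TB + 4GB + 4MB + 40KB = 4402345713664 bytes
Position 54,423,956 lies past the direct blocks (the first 40,960 bytes) and past the singly indirect range (the next 4,194,304 bytes, ending at byte 4,235,263), so it falls in the doubly indirect range. Reaching it means reading the doubly indirect block, one singly indirect block, and the data block => 3 disk accesses.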
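The arithmetic above can be checked with a small C sketch. It assumes exactly the parameters from the question (4K blocks, 4-byte pointers, 10 direct pointers); the names max_file_size and disk_accesses are made up for this illustration and are not part of any real file system API.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE     4096ULL                  /* bytes per block */
#define PTR_SIZE       4ULL                     /* bytes per block pointer */
#define NUM_DIRECT     10ULL                    /* direct pointers in the inode */
#define PTRS_PER_BLOCK (BLOCK_SIZE / PTR_SIZE)  /* 1024 pointers per block */

/* Bytes reachable through each part of the inode. */
#define DIRECT_BYTES (NUM_DIRECT * BLOCK_SIZE)
#define SINGLE_BYTES (PTRS_PER_BLOCK * BLOCK_SIZE)
#define DOUBLE_BYTES (PTRS_PER_BLOCK * SINGLE_BYTES)
#define TRIPLE_BYTES (PTRS_PER_BLOCK * DOUBLE_BYTES)

/* Largest file the inode layout can address, in bytes. */
uint64_t max_file_size(void)
{
    return DIRECT_BYTES + SINGLE_BYTES + DOUBLE_BYTES + TRIPLE_BYTES;
}

/* Disk reads needed to reach byte `pos` when only the inode is in memory:
 * one read per level of indirection, plus one for the data block. */
int disk_accesses(uint64_t pos)
{
    assert(pos < max_file_size());
    if (pos < DIRECT_BYTES) return 1;
    pos -= DIRECT_BYTES;
    if (pos < SINGLE_BYTES) return 2;
    pos -= SINGLE_BYTES;
    if (pos < DOUBLE_BYTES) return 3;
    return 4;
}
```

Calling disk_accesses(54423956) takes the third branch, which matches the 3 accesses worked out by hand.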

Related

Using Linux AIO, able to do IOs but writing garbage as well into the file

This might seem silly, but I am using libaio (not POSIX AIO). I am able to write something into the file, but I am also writing extra stuff into the file.
I read about the alignment requirement and the data type of the buffer field of the iocb.
Here is the code sample (only the relevant sections, for illustration):
aio_context_t someContext;
struct iocb somecb;
struct io_event someevents[1];
struct iocb *somecbs[1];
somefd = open("/tmp/someFile", O_RDWR | O_CREAT);
char someBuffer[4096];
... // error checks
someContext = 0; // this is necessary
io_setup(32, &someContext ); // no error checks pasted here
strcpy(someBuffer, "hello stack overflow");
memset(&somecb, 0, sizeof(somecb));
somecb.aio_fildes = somefd ;
somecb.aio_lio_opcode = IOCB_CMD_PWRITE;
somecb.aio_buf = (uint64_t)someBuffer;
somecb.aio_offset = 0;
somecb.aio_nbytes = 100; // // //
// I am leaving out the memalign and sysconf page-size part in this sample
somecbs[0] = &somecb; // address of the solid struct, avoiding heap
// avoiding error checks for this sample listing
io_submit(someContext, 1, somecbs);
// not checking for events count or errors
io_getevents(someContext, 1, 1, someevents, NULL);
The Output:
This code does create the file, and does write the intended string
hello stack overflow into the file /tmp/someFile.
The problem:
The file /tmp/someFile also contains after the intended string, in series,
#^#^#^#^#^#^#^#^#^ and some sections from the file itself ( code section), can say garbage.
I am certain to an extent that this is some pointer gone wrong in the data field, but cannot crack this.
How to use aio ( not posix) to write exactly and only 'hello world' into a file?
I am aware that aio calls might be not supported on all file systems as of now. The one I am running against does support.
Edit - If you want the starter pack for this attempt , you can get from here.
http://www.fsl.cs.sunysb.edu/~vass/linux-aio.txt
Edit 2: Carelessness on my part. I was asking for more bytes to be written than the string contains, and the code honored that. Put simply, to write exactly 'hw', one needs no more than 2 bytes in the aio_nbytes field of the iocb.
There are a few things going on here. First up, the alignment requirement that you mentioned is either 512 bytes or 4096 bytes, depending on your underlying device. Try 512 bytes to start. It applies to:
The offset that you're writing in the file must be a multiple of 512 bytes. It can be 0, 512, 1024, etc. You can write at offset 0 like you're doing here, but you can't write at offset 100.
The length of data that you're writing to the file must be a multiple of 512 bytes. Again, you can write 512 bytes, 1024 bytes, or 2048 bytes, and so on - any multiple of 512. You can't write 100 bytes like you're trying to do here.
The address of the memory that contains the data you're writing must be a multiple of 512. (I typically use 4096, to be safe.) Here, you'll need to be able to do someBuffer % 512 and get 0. (With the code the way it is, it most likely won't be.)
In my experience, failing to meet any of the above requirements doesn't actually give you an error back! Instead, it'll complete the I/O request using normal, regular old blocking I/O.
Unaligned I/O: If you really, really need to write a smaller amount of data or write at an unaligned offset, then things get tricky even above and beyond the io_submit interface. You'll need to do an aligned read to cover the range of data that you need to write, then modify the data in memory and write the aligned region back to disk.
For example, say you wanted to modify offsets 768 through 1023 on the disk. You'd need to read 512 bytes at offset 512 into a buffer. Then, memcpy() the 256 bytes you wanted to write to a position 256 bytes into that buffer. Finally, you issue a write of the 512-byte buffer at offset 512.
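The bookkeeping for that read-modify-write can be sketched as a small helper. This is only an illustration of the arithmetic: struct aligned_window and window_for are hypothetical names, and the alignment is assumed to be a power of two (512 or 4096 as discussed above). The actual read, memcpy() and write are left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Aligned region that must be read, patched in memory, and written back. */
struct aligned_window {
    uint64_t start;   /* aligned file offset of the region */
    uint64_t length;  /* aligned length covering the requested range */
    uint64_t skip;    /* where the new data goes inside the buffer */
};

/* Compute the aligned window that covers [offset, offset + len). */
struct aligned_window window_for(uint64_t offset, uint64_t len, uint64_t align)
{
    assert(len > 0);
    assert(align != 0 && (align & (align - 1)) == 0);  /* power of two */

    struct aligned_window w;
    w.start = offset & ~(align - 1);                   /* round start down */
    uint64_t end = (offset + len + align - 1) & ~(align - 1);  /* round end up */
    w.length = end - w.start;
    w.skip = offset - w.start;
    return w;
}
```

For the example in the text, window_for(768, 256, 512) gives start 512, length 512 and skip 256: read 512 bytes at offset 512, copy the new data 256 bytes into the buffer, and write the buffer back at offset 512.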
Uninitialized Data: As others have pointed out, you haven't fully initialized the buffer that you're writing. Use memset() to initialize it to zero to avoid writing junk.
Allocating an Aligned Pointer: To meet the pointer requirements for the data buffer, you'll need to use posix_memalign(). For example, to allocate 4096 bytes with a 512 byte alignment restriction: posix_memalign(&ptr, 512, 4096);
Lastly, consider whether you need to do this at all. Even in the best of cases, io_submit still "blocks", albeit at the 10 to 100 microsecond level. Normal blocking I/O with pread and pwrite offers a great many benefits to your application. And, if it becomes onerous, you can relegate it to another thread. If you've got a latency-sensitive app, you'll need to do io_submit in another thread anyway!

how to print indirect block in ext2

I'm trying to print all the single indirect blocks in an ext2 file system. I can print the direct blocks easily enough (0-11), but I don't understand how to get to the single indirect blocks, and later the double and triple indirect blocks. If I look at the value of ino->i_block[12], how do I use that to get to where it points to? I'm sure I'm missing something easy here.
An inode in EXT2 is 128 bytes long, and contains many different fields.
the i_size field indicates the number of bytes stored in the file, i.e., the file's length.
the i_block array is an array of 15 block numbers.
The first 12 entries in the array (i_block[0] through i_block[11]) contain the block numbers of direct blocks: they name data blocks that contain the first 12 blocks worth of the file's content.
The 13th entry in the array (i_block[12]) contains the block number of a singly indirect block: it names a block that contains an array of 4-byte block numbers; each of the blocks they name contains additional file contents.
The 14th entry in the array (i_block[13]) contains the block number of a doubly indirect block: it names a block that contains an array of 4-byte block numbers, each of which names a singly indirect block, which in turn contains an array of 4-byte block numbers of data blocks.
The 15th entry in the array (i_block[14]) contains the block number of a triply indirect block, which adds one more level of the same scheme.
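To follow i_block[12], multiply the block number by the file system's block size to get a byte offset, read that block, and treat it as an array of 32-bit little-endian block numbers. A minimal sketch follows, assuming the whole file system image is already in memory; a real tool would read the block from the device instead. The function names are hypothetical and are not part of any ext2 library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of a block inside the device or image. */
uint64_t block_offset(uint32_t block_no, uint32_t block_size)
{
    return (uint64_t)block_no * block_size;
}

/* Decode one little-endian 32-bit block number (ext2 is little-endian on disk). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Copy the non-zero block numbers listed in the singly indirect block
 * `indirect_block` into `out` (capacity `max_out`).
 * Returns how many block numbers were stored. */
size_t list_single_indirect(const uint8_t *image, uint32_t block_size,
                            uint32_t indirect_block,
                            uint32_t *out, size_t max_out)
{
    assert(block_size % 4 == 0);
    if (indirect_block == 0)
        return 0;                       /* 0 means no block is allocated */

    const uint8_t *blk = image + block_offset(indirect_block, block_size);
    size_t entries = block_size / 4;    /* 4-byte block numbers per block */
    size_t n = 0;
    for (size_t i = 0; i < entries && n < max_out; i++) {
        uint32_t b = read_le32(blk + 4 * i);
        if (b != 0)                     /* skip unallocated entries */
            out[n++] = b;
    }
    return n;
}
```

The doubly indirect case applies the same step twice: each non-zero entry of the block named by i_block[13] is itself passed to list_single_indirect.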

numa, mbind, segfault

I have allocated memory using valloc, let's say array A of [15*sizeof(double)]. Now I divided it into three pieces and I want to bind each piece (of length 5) into three NUMA nodes (let's say 0,1, and 2). Currently, I am doing the following:
double* A=(double*)valloc(15*sizeof(double));
piece=5;
nodemask=1;
mbind(&A[0],piece*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
nodemask=2;
mbind(&A[5],piece*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
nodemask=4;
mbind(&A[10],piece*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
First question: am I doing it right? I.e., are there any problems with being properly aligned to the page size, for example? Currently, with a size of 15 for array A it runs fine, but if I change the array size to something like 6156000 and piece=2052000, so that the three calls to mbind start with &A[0], &A[2052000], and &A[4104000], then I get a segmentation fault (and sometimes it just hangs). Why does it run fine for a small size but segfault for a larger one? Thanks.
For this to work, you need to deal with chunks of memory that are at least page-size and page-aligned - that means 4KB in most systems. In your case, I suspect the page gets moved twice (possibly three times), due to you calling mbind() three times over.
The way NUMA memory is laid out is that CPU socket 0 has a range of 0..X-1 MB, socket 1 has X..2X-1, socket 2 has 2X..3X-1, etc. Of course, if you stick a 4GB stick of RAM next to socket 0 and a 16GB stick next to socket 1, then the distribution isn't even. But the principle still stands that a large chunk of memory is allocated for each socket, in accordance with where the memory is actually located.
As a consequence of how the memory is located, the physical location of the memory you are using will have to be placed in the linear (virtual) address space by page-mapping.
So, for large "chunks" of memory, it is fine to move it around, but for small chunks, it won't work quite right - you certainly can't "split" a page into something that is affine to two different CPU sockets.
Edit:
To split an array, you first need to find the page-aligned size.
page_size = sysconf(_SC_PAGESIZE);
objs_per_page = page_size / sizeof(A[0]);
// There should be a whole number of "objects" per page. This checks that
// no object straddles a page boundary.
ASSERT(page_size % sizeof(A[0]) == 0);
split_three = SIZE / 3;
aligned_size = (split_three / objs_per_page) * objs_per_page;
remnant = SIZE - (aligned_size * 3);
piece = aligned_size;
nodemask = 1;
mbind(&A[0],piece*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
nodemask = 2;
mbind(&A[aligned_size],piece*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
nodemask = 4;
// The last piece also takes the remnant, so every start address stays page-aligned.
mbind(&A[aligned_size*2],(piece+remnant)*sizeof(double),MPOL_BIND,&nodemask,64,MPOL_MF_MOVE);
Obviously, you will now need to split the three threads similarly using the aligned size and remnant as needed.
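The split arithmetic on its own can be sketched as a pure function, which makes it easy to check for a given array size. The names struct numa_split and split_three are hypothetical, and the page size is passed in as a parameter (in real code it would come from sysconf(_SC_PAGESIZE)).

```c
#include <assert.h>
#include <stddef.h>

struct numa_split {
    size_t piece;    /* elements per page-aligned piece */
    size_t remnant;  /* elements left over after three such pieces */
};

/* Split `count` elements of `elem_size` bytes into three pieces whose
 * sizes are whole multiples of the page size. */
struct numa_split split_three(size_t count, size_t elem_size, size_t page_size)
{
    /* No element may straddle a page boundary. */
    assert(elem_size > 0 && page_size % elem_size == 0);

    size_t per_page = page_size / elem_size;
    struct numa_split s;
    s.piece = (count / 3 / per_page) * per_page;  /* round down to whole pages */
    s.remnant = count - 3 * s.piece;
    return s;
}
```

With 6156000 doubles and 4096-byte pages this gives pieces of 2051584 elements and a remnant of 1248. With only 15 doubles the piece size comes out as 0, which reflects the point above: the whole array fits in one page and cannot be split across nodes.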

Counting page transfers between disk and main memory

for I := 1 to 1024 do
for J := 1 to 1024 do
A[J,I] := A[J,I] * B[I,J]
For the given code, I want to count how many pages are transferred between disk and main memory given the following assumptions:
page size = 512 words
no more than 256 pages can be in main memory
LRU replacement strategy
all 2d arrays size (1:1024,1:1024)
each array element occupies 1 word
2d arrays are mapped in main memory in row-major order
I was given the solution, and my question stems from that:
A[J,I] := A[J,I] * B[I,J]
writeA := readA * readB
Notice that there are 2 transfers changing every J loop and 1 transfer
that only changes every I loop.
1024 * (8 + 1024 * (1 + 1)) = 2105344 transfers
So the entire row of B is read every time we use it, therefore we
count the entire row as transferred (8 pages). But since we only read
a portion of each A row (1 value) when we transfer it, we only grab 1
page each time.
So what I'm trying to figure out is, how do we get that 8 pages are transferred every time we read B but only 1 transfer for each read and write of A?
I'm not surprised you're confused, because I certainly am.
Part of the confusion comes from labelling the arrays 1:1024. I couldn't think like that, I relabelled them 0:1023.
I take "row-major order" to mean that A[0,0] is in the same disk block as A[0,511]. The next block is A[0,512] to A[0,1023]. Then A[1,0] to A[1,511]... And the same arrangement for B.
As the inner loop first executes, the system will fetch the block containing A[0,0], then B[0,0]. As J increments, each element of A referenced will come from a separate disk block. A[1,0] is in a different block from A[0,0]. But only every 512th B element referenced will come from a different block; B[0,0] is in the same block as B[0,511]. So for one complete iteration through the inner loop, 1024 calculations, there will be 1024 fetches of blocks from A, 1024 writes of dirty blocks from A, and 2 fetches of blocks from B. 2050 accesses overall. I don't understand why the answer you have says there will be 8 fetches from B. If B were not aligned on a 512-word boundary, there would be 3 fetches from B per cycle; but not 8.
This same pattern happens for each value of I in the outer loop. That makes 2050*1024 = 2099200 total blocks read and written, assuming B is 512-word aligned.
I'm entirely prepared for someone to point out my obvious bloomer - they usually do - but the explanation you've been given seems wrong to me.
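One way to check the count is to simulate it. The sketch below is a simple LRU simulation under the stated assumptions (512-word pages, 256 frames, row-major arrays aligned on a page boundary), with indices relabelled 0:1023 as above. It also assumes that dirty pages still in memory at the end are written back, so every modified page of A is counted once. All names are made up for the illustration.

```c
#include <assert.h>
#include <string.h>

#define N       1024                   /* arrays are N x N words */
#define PAGE    512                    /* words per page */
#define FRAMES  256                    /* pages that fit in main memory */
#define PAGES_PER_ARRAY (N * N / PAGE)

static int  page_frame[2 * PAGES_PER_ARRAY]; /* frame holding a page, or -1 */
static int  frame_page[FRAMES];              /* page held by a frame, or -1 */
static int  frame_dirty[FRAMES];
static long frame_used[FRAMES];              /* time of last reference */
static long now, reads, writes;

/* Reference one page, faulting it in (and evicting the LRU page) if needed. */
static void touch(int page, int is_write)
{
    int f = page_frame[page];
    if (f < 0) {                                        /* page fault */
        f = 0;
        for (int i = 0; i < FRAMES; i++) {
            if (frame_page[i] < 0) { f = i; break; }    /* free frame */
            if (frame_used[i] < frame_used[f]) f = i;   /* older frame */
        }
        if (frame_page[f] >= 0) {
            if (frame_dirty[f]) writes++;               /* write back victim */
            page_frame[frame_page[f]] = -1;
        }
        frame_page[f] = page;
        frame_dirty[f] = 0;
        page_frame[page] = f;
        reads++;
    }
    frame_used[f] = ++now;
    if (is_write) frame_dirty[f] = 1;
}

/* Run the loop nest and return the total number of page transfers. */
long count_transfers(void)
{
    memset(page_frame, -1, sizeof page_frame);
    memset(frame_page, -1, sizeof frame_page);
    memset(frame_dirty, 0, sizeof frame_dirty);
    now = reads = writes = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int a = (j * N + i) / PAGE;                    /* page of A[j][i] */
            int b = PAGES_PER_ARRAY + (i * N + j) / PAGE;  /* page of B[i][j] */
            touch(a, 0);                                   /* read A[J,I] */
            touch(b, 0);                                   /* read B[I,J] */
            touch(a, 1);                                   /* write A[J,I] */
        }
    for (int f = 0; f < FRAMES; f++)                       /* final flush */
        if (frame_page[f] >= 0 && frame_dirty[f]) writes++;
    return reads + writes;
}
```

Under these assumptions the simulation counts 1,050,624 page reads and 1,048,576 write-backs, 2,099,200 in total, which agrees with the 2050*1024 figure rather than with the 8-pages-per-row solution.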

2way cache associative ? how many bytes do I read from memory?

Given the code :
void transpose2(array dst,array src)
{
int i,j;
for ( i=0; i<4; i++) {
for ( j=0; j<4; j++) {
dst[i][j] = src[j][i];
}
}
}
Assumptions :
int is 4 bytes
src array starts at address 0 , dst starts at address 64
the size of the cache is 32 bytes , at the beginning the cache is empty
Assuming a 32-byte cache with write-through, write-allocate and LRU, 2-way set associative, where each block is 8 bytes:
When I read from memory, how many bytes do I take from memory on each iteration? Is it 4 or 8?
What I'm quite sure about is that the cache has 4 cells, or rows, and each row has 8 bytes. Is this correct?
What is a little confusing is the 2-way part. I think that each way has 4 bytes, right? Please correct me if I'm wrong.
Then when I "take" a block from memory, I just don't understand exactly how many bytes that is.
Thanks in advance
Ron
The number of ways in a cache (aka its associativity) does not affect the amount of data that's transferred when a transfer occurs; the block size is the block size.
Associativity is simply a measure of how many possible locations there are in the cache where a given block from memory could be stored. So:
For a direct-mapped cache (associativity=1), memory address xyz will always map to the same cache location.
For a two-way cache, xyz could map to either of two cache locations.
For a fully-associative cache, xyz could map to anywhere in cache.
I'm really not saying anything here which isn't already explained at e.g. Wikipedia: http://en.wikipedia.org/wiki/CPU_cache#Associativity.
When the CPU references (loads or stores) a word from a block that is not in the cache, that block is requested from memory. So, with the parameters supplied, every cache miss involves an 8-byte transfer from memory to cache.
Regarding terminology, your cache has 4 entries, containers or cache lines (32 bytes / 8 bytes per block). As it is 2-way associative, there are 2 sets of 2 entries. Blocks with even block addresses map to set 0, while blocks with odd block addresses map to set 1.
Block addresses are obtained by shifting the byte address right by log2(block_size) bits (3 bits in your cache).
For example:
address 64 belongs to block 8
address 72 belongs to block 9
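The same mapping can be written as a short C sketch, using the cache parameters from the question (32-byte cache, 8-byte blocks, 2 ways). The function names block_of and set_of are hypothetical.

```c
#include <stdint.h>

#define CACHE_BYTES 32u
#define BLOCK_BYTES 8u
#define WAYS        2u
#define NUM_SETS    (CACHE_BYTES / BLOCK_BYTES / WAYS)  /* 4 lines / 2 ways = 2 sets */

/* Block address: the byte address with the 3 offset bits dropped. */
uint32_t block_of(uint32_t addr)
{
    return addr / BLOCK_BYTES;
}

/* Set index: even block addresses go to set 0, odd ones to set 1. */
uint32_t set_of(uint32_t addr)
{
    return block_of(addr) % NUM_SETS;
}
```

Here set_of(64) is 0 and set_of(72) is 1. Since src starts at address 0 and dst at address 64, src[0][0] and dst[0][0] both fall in set 0, where the two ways let them coexist.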
