Determine cache miss rate for a code snippet - c

I am preparing for an upcoming exam and I was having trouble with this problem:
A direct-mapped cache of size 64K with a block size of 16 bytes. The cache starts empty.
What is the cache miss rate if...
ROWS = 128, COLS = 128
ROWS = 128 and COLS = 192
ROWS = 128 and COLS = 256
[solution: page 5 http://www.inf.ethz.ch/personal/markusp/teaching/263-2300-ETH-spring11/midterm/midterm.pdf ]
I was confused about how they got "the cache stores 128 x 128 elements". I thought the cache size was 64K (2^16).
Also, can someone explain how to approach each question? My professor had some formula to calculate the number of accesses in each block: block size/stride, but it doesn't seem to work here.

As far as I understand, in case 1 both the src and dst matrices are 64 KB in size (128 * 128 * 4 bytes). Since the cache is direct-mapped and is also 64 KB, elements of src and dst with the same indexes map to the same cache line, since (0 + i) mod 64K = (64K + i) mod 64K, and both are needed at the same time in the line
dest[i][j]=src[i][j]
Each access therefore evicts the block the other array just loaded, so you have a 100% miss rate. The same applies to case 3, since the new size (128 * 256 * 4 bytes = 128 KB) is a multiple of 64 KB, so it doesn't make any difference.
But for case 2 the size of each matrix becomes 96 KB (128 * 192 * 4 bytes), so src[i][j] and dst[i][j] no longer map to the same cache line; both can be in the cache at the same time and you will have a lower miss rate.
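The reasoning above can be checked with a small simulation. This is a minimal sketch, assuming the exam loop copies dst[i][j] = src[i][j] in row-major order, that src starts at address 0 with dst immediately after it, and that an int is 4 bytes. The names copy_miss_rate and access_cache are illustrative and do not come from the exam.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SIZE 65536u /* 64 KB direct-mapped cache */
#define BLOCK_SIZE 16u    /* bytes per cache block */
#define NUM_LINES (CACHE_SIZE / BLOCK_SIZE)

/* Simulate one access to the cache; return 1 on a miss, 0 on a hit. */
static int access_cache(uint32_t *tags, uint8_t *valid, uint32_t addr)
{
    uint32_t block = addr / BLOCK_SIZE;
    uint32_t index = block % NUM_LINES; /* which cache line */
    uint32_t tag = block / NUM_LINES;   /* which block occupies it */

    if (valid[index] && tags[index] == tag)
        return 0;
    valid[index] = 1;
    tags[index] = tag;
    return 1;
}

/* Miss rate of a row-major copy dst[i][j] = src[i][j].
 * Assumed layout: src at address 0, dst directly after it, 4-byte ints. */
double copy_miss_rate(uint32_t rows, uint32_t cols)
{
    uint32_t *tags = calloc(NUM_LINES, sizeof *tags);
    uint8_t *valid = calloc(NUM_LINES, sizeof *valid);
    assert(tags != NULL && valid != NULL);

    const uint32_t elem = 4;
    const uint32_t src_base = 0;
    const uint32_t dst_base = rows * cols * elem;
    unsigned long misses = 0, accesses = 0;

    for (uint32_t i = 0; i < rows; i++) {
        for (uint32_t j = 0; j < cols; j++) {
            uint32_t off = (i * cols + j) * elem;
            misses += access_cache(tags, valid, src_base + off); /* read src */
            misses += access_cache(tags, valid, dst_base + off); /* write dst */
            accesses += 2;
        }
    }
    free(tags);
    free(valid);
    return (double)misses / (double)accesses;
}
```

Under these assumptions the sketch returns 1.0 for 128 x 128 and 128 x 256, and 0.25 for 128 x 192, where each 16-byte block holds four ints and only the first access to a block misses.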

Related

Maximum file size supported by a file representation of a node?

For part a, I feel like it would be 63504 bytes, because the file size would be (496/4)*512 + 16 bytes. But I can't seem to get that in the requested format, which leads me to believe that I attempted it wrong.
For part b, I have no idea how to approach it. Any help or hints would be appreciated.
Part [a]
Starting with
Extent size 1 ==> File size will be 2^0 * 512 bytes
Extent size 2 ==> File size will be 2^1 * 512 bytes
Extent size 3 ==> File size will be 2^2 * 512 bytes
......
Extent size 12 ==> File size will be 2^11 * 512 bytes
Extent size x ==> File size will be 2^(x-1) * 512 bytes
Part [b]
Considering 512 bytes as the block size,
the number of disk accesses required to read 100,000 bytes will be
100000/512 = 195.3, rounded up to 196
I hope it does make sense to you....

Determining details of a cache

Your machine has an L1 cache and memory with the following properties.
Memory address space: 24 bits
Cache block size: 16 bytes
Cache associativity: direct-mapped
Cache size: 256 bytes
I am asked to determine the following: 1. the number of tag bits. 2. the number of bits of the cache index. 3. number of bits for cache size.
tag bits = m - (s+b)
m = 24. s = log2 S, S = C/(B*E). E = 1 due to it being direct mapped. so S = 256/16 = 16. s = log2 16 = 4. B = 16 (cache block size) b = log2 B; which is log2 16= 4. so s=4,b=4,m=24. t = 24-(4+4) = 16 total tag bits.
I am not sure how to figure this out.
I believe number of bits for cache size is just C*(num bits/byte) = 256*8 = 2048.
Can anyone help me figure out 2., and determine if the logic in 1. & 3. are correct?
1) This is correct for m=32 (isn't it 24?).
2) The number of index bits: this is the number of bits needed to address a block in the cache when it's direct-mapped, since the index identifies the set (which consists of only one block in this case). If it was 2-way, one bit less would be needed for the index (and added to the tag bits). For this problem, since there are 16 sets, the index has to distinguish 16 values, which takes 4 index bits.
3) It is not completely clear how to interpret this question. I would understand it as the number of bits needed to address the cache, which would be 4 in this case? If indeed, as you assume, the number of bits in the cache was meant, you would have to add 16*16 bits for the tag bits to your solution.
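The formula t = m - (s + b) from the question can be sketched in code. This is a minimal illustration, assuming all sizes are powers of two; the struct and function names (cache_bits, split_address, log2_exact) are made up for the example.

```c
#include <assert.h>

struct cache_bits {
    unsigned tag;    /* t */
    unsigned index;  /* s */
    unsigned offset; /* b */
};

/* log2 of a power of two; asserts that x really is a power of two. */
static unsigned log2_exact(unsigned x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* m = address bits, C = cache size in bytes,
 * B = block size in bytes, E = lines per set (1 for direct-mapped). */
struct cache_bits split_address(unsigned m, unsigned C, unsigned B, unsigned E)
{
    unsigned S = C / (B * E); /* number of sets */
    struct cache_bits r;
    r.offset = log2_exact(B);
    r.index = log2_exact(S);
    r.tag = m - (r.index + r.offset);
    return r;
}
```

For the values in the question, split_address(24, 256, 16, 1) gives 16 tag bits, 4 index bits and 4 offset bits. With E = 2 the index shrinks to 3 bits and the tag grows to 17, matching the remark about a 2-way cache.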

Largest size of a file from inode

Could someone explain the answer to this to me? I got this in a quiz and couldn't answer it.
Assume that
All blocks in a disk are of size 4KB (4096 bytes).
The top level of an inode is stored in a disk block of size 4KB.
All file attributes, except data block locations, take up a total of
128 bytes (out of the above 4KB).
Each direct block address takes up 8 bytes of space and gives the
address of a disk block of size 4KB.
Last three entries of the first level of the inode point to single,
double, and triple indirect blocks respectively.
Question: What is the largest size of a file that can be accessed through direct block entries of the inode?
The calculations are quite simple:
(( 4096 − 128 ) / 8 − 3) × 4096 = 2019328
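The calculation above breaks down into three steps: the space left for addresses, the number of address entries, and the number of those that are direct. A minimal sketch, with an illustrative function name and parameter names that are not part of the quiz:

```c
#include <assert.h>
#include <stdint.h>

/* Largest file size reachable through direct entries only.
 * block_size:       size of the inode block and of each data block
 * attr_bytes:       bytes of the inode block used by file attributes
 * addr_size:        bytes per block address
 * indirect_entries: entries reserved for single/double/triple indirect */
uint64_t max_direct_bytes(uint64_t block_size, uint64_t attr_bytes,
                          uint64_t addr_size, uint64_t indirect_entries)
{
    assert(addr_size > 0 && block_size >= attr_bytes);
    uint64_t entries = (block_size - attr_bytes) / addr_size; /* 496 here */
    assert(entries >= indirect_entries);
    uint64_t direct = entries - indirect_entries;             /* 493 here */
    return direct * block_size;
}
```

With the quiz values, max_direct_bytes(4096, 128, 8, 3) evaluates to 493 * 4096 = 2019328 bytes.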

Do not understand where 2048 comes from

Where does the number 2048 come from in this problem?
Consider a file system that uses inodes to represent files. Disk blocks are 8 KB in size and a pointer to a disk block requires 4 bytes. This file system has 12 direct disk blocks, as well as single, double, and triple indirect disk blocks. What is the maximum size of a file that can be stored in this file system?
(12 * 8 KB) + (2048 * 8 KB) + (2048 * 2048 * 8 KB) + (2048 * 2048 * 2048 * 8 KB) = 64 terabytes
I was thinking 8KB/4B, but isn't that 2000? 8000/4.
Sometimes when discussing numbers in the context of computers, kB = 1024 bytes, MB = 1,048,576 bytes, etc.
In this case, 8kB = 8192 bytes. 8192 / 4 = 2048.
2048 is 8K (the block size) divided by 4 (the size of a pointer).
You need to allocate an entire 8192-byte block of pointers to 8K blocks; you can fit 2048 pointers into one of these.
Further, you can fit 2048 pointers to blocks of pointers to blocks, for an additional 2048 * 2048 * 8 KB of capacity, and then another 2048 * 2048 * 2048 * 8 KB through pointers to blocks of pointers to blocks of pointers to 8K blocks.
If you think that it goes a little like a cumulative tale, you're not alone.
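The sum from the question can be written out as a short function. This is a sketch under the question's assumptions (one single, one double and one triple indirect block, with KB meaning 1024 bytes); the function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum file size for an inode with `direct` direct blocks plus one
 * single, one double and one triple indirect block. */
uint64_t max_file_bytes(uint64_t block_size, uint64_t ptr_size, uint64_t direct)
{
    assert(ptr_size > 0 && block_size >= ptr_size);
    uint64_t p = block_size / ptr_size; /* pointers per block: 8192 / 4 = 2048 */
    uint64_t blocks = direct            /* direct blocks */
                    + p                 /* single indirect */
                    + p * p             /* double indirect */
                    + p * p * p;        /* triple indirect */
    return blocks * block_size;
}
```

For 8192-byte blocks, 4-byte pointers and 12 direct blocks, the triple indirect term alone is 2^33 blocks of 2^13 bytes, which is 2^46 bytes or 64 TB; the other terms add a little over 32 GB on top of that.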

Padding in 24-bits rgb bitmap

Could somebody explain to me why, in a 24-bit RGB bitmap file, I have to add padding whose size depends on the width of the image? What is it for?
I mean I must add this code to my program (in C):
if( read % 4 != 0 ) {
read = 4 - (read%4);
printf( "Padding: %d bytes\n", read );
fread( pixel, read, 1, inFile );
}
Because 24 bits is an odd number of bytes (3) and for a variety of reasons all the image rows are required to start at an address which is a multiple of 4 bytes.
According to Wikipedia, the bitmap file format specifies that:
The bits representing the bitmap pixels are packed in rows. The size of each row is rounded up to a multiple of 4 bytes (a 32-bit DWORD) by padding. Padding bytes (not necessarily 0) must be appended to the end of the rows in order to bring up the length of the rows to a multiple of four bytes. When the pixel array is loaded into memory, each row must begin at a memory address that is a multiple of 4. This address/offset restriction is mandatory only for Pixel Arrays loaded in memory. For file storage purposes, only the size of each row must be a multiple of 4 bytes while the file offset can be arbitrary. A 24-bit bitmap with Width=1, would have 3 bytes of data per row (blue, green, red) and 1 byte of padding, while Width=2 would have 2 bytes of padding, Width=3 would have 3 bytes of padding, and Width=4 would not have any padding at all.
The Wikipedia article on Data Structure Padding is also an interesting read that explains the reasons that padding is generally used in computer science.
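The row-size rule quoted above can be sketched as a pair of helpers. This is a minimal illustration for 24-bit images only; the function names are made up for the example and are not part of any BMP library.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of actual pixel data in one row of a 24-bit image. */
uint32_t row_data_bytes(uint32_t width)
{
    assert(width <= UINT32_MAX / 3u); /* avoid overflow in width * 3 */
    return width * 3u;
}

/* Padding bytes needed to round the row up to a multiple of 4. */
uint32_t row_padding(uint32_t width)
{
    return (4u - row_data_bytes(width) % 4u) % 4u;
}

/* Total bytes per row as stored in the file (data plus padding). */
uint32_t row_stride(uint32_t width)
{
    return row_data_bytes(width) + row_padding(width);
}
```

This reproduces the quoted cases: widths 1, 2, 3 and 4 need 1, 2, 3 and 0 padding bytes. The outer % 4 is what keeps the padding at 0 instead of 4 when the row already fits.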
I presume this was a design decision to align rows for better memory access patterns while not wasting much space (for a 319 px wide image you would waste 3 bytes, or about 0.3%).
Imagine you need to access some odd row directly. You could access first 4 pixels of n-th row by doing:
uint8_t *startRow = bmp + n * width * 3; //3 bytes per pixel
uint8_t r1 = startRow[0];
uint8_t g1 = startRow[1];
//... Repeat
uint8_t b4 = startRow[11];
Note that if n and width are odd (and bmp is even), startRow is going to be odd.
Now if you tried to do following speedup:
uint32_t *startRow = (uint32_t *) (bmp + n * width * 3);
uint32_t a = startRow[0]; //Loading register at a time is MUCH faster
uint32_t b = startRow[1]; //but only if address is aligned
uint32_t c = startRow[2]; //else code can hit bus errors!
uint8_t r1 = (a & 0xFF000000) >> 24;
uint8_t g1 = (a & 0x00FF0000) >> 16;
//... Repeat
uint8_t b4 = (c & 0x000000FF) >> 0;
You'd run into lots of problems. In the best-case scenario (that is, an Intel CPU) every load of a, b and c would need to be broken into two loads, since startRow is not divisible by 4. In the worst-case scenario (e.g. Sun SPARC) your program would crash with a "bus error".
In newer designs it is common to force rows to be aligned to at least the L1 cache line size (64 bytes on Intel CPUs or 128 bytes on NVIDIA GPUs).
Short version
Because the BMP file format specifies that rows must fit perfectly into 32-bit "memory cells". Because pixels are 24 bits, some row widths will not fit perfectly into 32-bit "cells". In this case, the row is "padded up to" a whole number of 32-bit cells.
8bits per byte ∴
cell: 32bit = 4bytes ∴
pixel: 24bits = 3bytes
// If doesn't fit perfectly in 4 byte "cell"
if( read % 4 != 0 ) {
// find the difference between the "cell", and "the partial fit"
read = 4 - (read%4);
printf( "Padding: %d bytes\n", read );
// skip the difference
fread( pixel, read, 1, inFile );
}
Long version
In computing, a word is the natural unit of data used by a particular processor design. A word is a fixed-sized piece of data handled as a unit by the instruction set or the hardware of the processor
-wiki: Word_(computer_architecture)
Computer systems basically have a preferred "word length" (though it is not so important these days). A standard data unit allows all sorts of optimisations in the architecture of the computer system (think what shipping containers did for the shipping industry). There is a 32-bit standard called DWORD, aka double word (I guess), and that's what typical bitmap images are optimised for.
So if you have 24bits per pixel, there will be various "literal pixels" row lengths that will not fit nicely into the 32bits. So in that case, pad it out.
Note: today, you are probably using a computer with a 64bit word size. Check your processor.
It depends on the format whether or not there is padding at the end of each row.
There really isn't much reason for it for 3 x 8 bit channel images since I/O is byte oriented anyway. For images with pixels packed into less than a byte (1 bit / pixel for example), padding is useful so that each row starts at a byte offset.
