How do you calculate the file slack?
For example:
File system: FAT 16
Drive size: 1.6 GB
Cluster size: 32 KB
A text file with a size of 150,000 bytes is created. How do you calculate the file slack?
Thank you
FileSize / ClusterSize (in bytes) = # clusters needed (integer division).
If (FileSize modulo ClusterSize <> 0), add 1 additional cluster.
"File slack" = (clusters needed * ClusterSize in bytes) - FileSize
So, for your example:
32 * 1024 = 32768 bytes per cluster
150000 / 32768 = 4 clusters (integer division)
150000 mod 32768 = 18928, which is non-zero, so add 1 cluster
4 + 1 = 5 clusters needed
5 * 32768 = 163840; 163840 - 150000 = 13840 slack bytes
Note that, even though disk drive sizes are quoted with 1 KB = 1000 bytes, cluster sizes are based on 1024 bytes per KB, so you need to use 1024 in your calculations.
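In Python, the whole calculation is a couple of lines (a minimal sketch of the formula above; the function name is just for illustration):

def file_slack(file_size, cluster_size):
    # Whole clusters, plus one more if there is a remainder.
    clusters = file_size // cluster_size
    if file_size % cluster_size != 0:
        clusters += 1
    # Slack = allocated space minus actual file size.
    return clusters * cluster_size - file_size

print(file_slack(150000, 32 * 1024))  # -> 13840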
I have to calculate storage requirements in terms of blocks for the following table:
Bill(billno number(7),billdate date,ccode varchar2(20),amount number(9,2))
The table storage attributes are :
PCTFREE=20 , INITRANS=4 , PCTUSED=60 , BLOCKSIZE=8K , NUMBER OF ROWS=100000
I searched a lot on the internet and referred to many books, but didn't find anything.
First you need to figure out the typical value for the varchar2 column; the total size will depend on that. I created two tables from your BILL table: BILLMAX, where ccode always holds 20 characters ('12345678901234567890'), and BILLMIN, where ccode is always NULL.
The results are:
TABLE_NAME NUM_ROWS AVG_ROW_LEN BLOCKS
BILLMAX 3938 37 28
BILLMIN 3938 16 13
select table_name, num_rows, avg_row_len, blocks from user_tables
where table_name in ( 'BILLMIN', 'BILLMAX')
As you can see, the number of blocks depends heavily on the actual column contents. Use exec dbms_stats.GATHER_TABLE_STATS('YourSchema','BILL') to refresh the values inside user_tables.
The other thing you need to take into consideration is how big your extents will be. For example:
STORAGE (
INITIAL 64K
NEXT 1M
MINEXTENTS 1
MAXEXTENTS UNLIMITED
PCTINCREASE 0
BUFFER_POOL DEFAULT
)
will generate the first 16 extents with a size of 8 blocks (64 KB) each. After that it will start to create extents with a size of 1 MB (128 blocks).
So BILLMAX will take 768 blocks and BILLMIN will take 384 blocks.
As you can see, the difference is quite big.
For BILLMAX: 16 * 8 + 5 * 128 = 768
For BILLMIN: 16 * 8 + 2 * 128 = 384
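As a rough cross-check (a sketch only; real Oracle block usage also depends on PCTFREE, row headers, and so on), you can scale the measured statistics up to 100,000 rows and then round up to whole extents. The function below is hypothetical:

def estimate_blocks(rows_target, rows_sampled, blocks_sampled,
                    initial_extents=16, initial_blocks=8, next_blocks=128):
    # Scale the measured block count linearly to the target row count (ceiling).
    data_blocks = -(-rows_target * blocks_sampled // rows_sampled)
    # First 16 extents are 8 blocks each, then 1 MB (128-block) extents.
    allocated = initial_extents * initial_blocks
    while allocated < data_blocks:
        allocated += next_blocks
    return allocated

print(estimate_blocks(100000, 3938, 28))  # BILLMAX -> 768
print(estimate_blocks(100000, 3938, 13))  # BILLMIN -> 384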
Is it possible to compute the size of data if I know its size when it's base64 encoded?
I have a file that is 450 KB in size when Base64 encoded, but what size is it decompressed?
Is there a method to find output size without decompressing the file first?
I have a file that is 450 KB in size when Base64 encoded, but what size is it decompressed?
In fact, you don't "decompress", you decode. The result will be smaller than the encoded data.
As Base64 encoding needs ~8 bits for every 6 bits of the original data (i.e., 4 bytes to store 3), the math is simple:
450 KB (encoded) / 4 * 3 = ~337 KB (decoded)
The overhead of Base64 relative to the decoded data is nearly constant at 33.33%. I say "nearly" only because of the padding bytes (=) at the end that make the string length a multiple of 4. See some examples:
String Encoded Len B64 Pad Space needed
A QQ== 1 2 2 400.00%
AB QUI= 2 3 1 200.00%
ABC QUJD 3 4 0 133.33%
ABCD QUJDRA== 4 6 2 200.00%
ABCDEFGHIJKLMNOPQ QUJDREVGR0hJSktMTU5PUFE= 17 23 1 141.18%
( 300 bytes ) ( 400 bytes ) 300 400 0 133.33%
( 500 bytes ) ( 668 bytes ) 500 667 1 133.60%
( 5000 bytes ) ( 6668 bytes ) 5000 6667 1 133.36%
... tends to 133.33% ...
Calculating the space for the unencoded data:
Let's take the value QUJDREVGR0hJSktMTU5PUFE= mentioned above.
1. There are 24 bytes in the encoded value.
2. Calculate 24 / 4 * 3 => the result is 18.
3. Count the number of = signs at the end of the encoded value: in this case, 1
   (we only need to check the last 2 bytes of the encoded data).
4. Subtracting: 18 (from step 2) - 1 (from step 3) gives 17.
So we need 17 bytes to store the decoded data.
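If you have the encoded text itself rather than just its length, here is a minimal Python sketch of the steps above (the function name is just for illustration):

def decoded_size(encoded):
    # Strip whitespace/newlines that are often embedded in Base64 text.
    encoded = "".join(encoded.split())
    padding = encoded[-2:].count("=")  # only the last 2 chars can be '='
    return len(encoded) // 4 * 3 - padding

print(decoded_size("QUJDREVGR0hJSktMTU5PUFE="))  # -> 17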
Base64 adds roughly a third to the original size, so your file should be more or less 0.75 * 450 KB ≈ 337 KB in size.
I'm trying to calculate CRC32 with multiple threads, using OpenCL.
The GPU code is:
__kernel void crc32_Sarwate( __global int *lenghtIn,
                             __global unsigned char *In,
                             __global int *OutCrc32,
                             int size )
{
    int i, j, len;

    i = get_global_id( 0 );
    if( i >= size )
        return;
    /* Sum the lengths of all preceding chunks to find this chunk's offset. */
    len = j = 0;
    while( j != i )
        len += lenghtIn[ j++ ];
    OutCrc32[ i ] = crc32( In + len, lenghtIn[ i ] );
}
I received these results (times) with a thousand repetitions:
for 4 work-items: 29.82
for 8 work-items: 29.9
for 16 work-items: 35.16
for 32 work-items: 35.93
for 64 work-items: 38.69
for 128 work-items: 52.83
for 256 work-items: 152.08
for 512 work-items: 333.63
I have Intel HD Graphics at 350 MHz, with 3 work-groups of 256 work-items each.
I assumed that increasing the number of work-items from 128 to 256 would cause a slight increase in time, but the time tripled. Why?
(I'm sorry for my very bad English.)
The
while( j != i )
    len += lenghtIn[ j++ ];
part runs get_global_id( 0 ) times.
When it is 128, the last work-item to complete does 128 loop iterations.
When it is 256, it does 256 iterations, which should be a 100% increase from memory's point of view, but only for the last work-item. When we add up all work-items' total memory accesses:
1 item, reading 0 to 0                    --->   1 access
2 items, reading 0 to 0 and 0 to 1        --->   3 accesses
4 items, reading 0 to 0 through 0 to 3    --->  10 accesses
8 items: SUM(1 to 8)                      =>    36 accesses
16 items: SUM(1 to 16)                    =>   136 accesses (more than a 200% increase)
32 items:                                 =>   528 (~4x)
64 items:                                 =>  2080 (~4x)
128 items:                                =>  8256 (~4x) (the cache of your iGPU starts failing here)
256 items:                                => 32896 (~4x) (now caching is saturated, and you keep
                                                seeing 4x per doubling of work-items)
512 items: the second compute unit is used too, but 4x the work is done,
           so it costs more than 2x the time.
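In closed form these are just the triangular numbers: N work-items perform SUM(1..N) = N*(N+1)/2 total loop reads, i.e., O(N²), which is why doubling the work-item count roughly quadruples the memory traffic.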
So each time you double the number of work-items, you quadruple the total memory accesses. Caching helps up to a point; once you cross that limit, memory accesses slow down badly, and the fixed execution overhead (drivers, ...) becomes unimportant.
You are accessing memory in a non-parallel way. You would need to cache it first, but that may not be possible on this hardware, so you should distribute the job equally among work-items and make memory accesses contiguous between cores (vectorize). This should give more performance.
For now, each vector unit does:
unit        : v0 v1 v2 v3 v4 ... v7
read address:  0  0  0  0  0 ...  0
               -  1  1  1  1 ...  1
               -  -  2  2  2 ...  2
               -  -  -  3  3 ...  3
               -  -  -  -  4 ...  4
              ...
               -  -  -  -  - ...  7
done in 8 steps on 8 streaming cores.
At the last step, only a single work-item is actually computing anything. It should instead be something like:
Some Optimization
unit        : v0 v1 v2 v3   (other work-items not needed)
read address:  0  0  0  0  \
               1  1  1  1   \
               2  2  2  2    \
               3  3  3  3    / this is the 5th work-item's work
               4  4  4  4   /
               5  5  5  0  \
               6  6  0  1   \ this is 0 to 3 as the 4th work-item's work
               7  0  1  2   /
first item <-- 0  1  2  3  /
done in 8 steps on only 4 streaming cores, doing the same job as the first
half above (probably faster).
Further Optimization Suggestion
I think it would be better to run a prefix-scan (prefix sum) algorithm in another kernel before getting to the crc32() part (probably just 3 steps for this example rather than 8, and also more efficient).
Using precomputed values of
while( j != i )
    len += lenghtIn[ j++ ];
should make crc32 immune to the current algorithm's O(n²) complexity.
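As an illustration of what "precomputed" means here, this is a minimal host-side sketch in Python (names are illustrative); in practice you would do the scan in another OpenCL kernel as suggested:

from itertools import accumulate

lengths = [100, 250, 80, 300]          # lenghtIn, one entry per chunk

# Exclusive prefix sum: offsets[i] is where chunk i starts in the buffer,
# i.e., exactly the value the while loop recomputes for every work-item.
offsets = [0] + list(accumulate(lengths))[:-1]

print(offsets)  # -> [0, 100, 350, 430]
# Pass `offsets` to the kernel so each work-item finds its start in O(1)
# instead of summing lenghtIn[0..i-1] itself.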
Suppose I create a B+-tree index on the key (a,b,c) of a table, in a database with 2 KB pages and 64-bit pointers, where a, b, and c are each 4 bytes and the total size of each record is 88 bytes.
What is the range of possible values for the depth of the index if the table has 36,279 rows?
For minimum capacity:
2 * ceiling(n/2)^(d-2) * ceiling((n-1)/2) = 36279
Solving for d gives 3.5, so the depth is 4.
For maximum capacity:
n^(d-1) * (n-1) = 36279
Solving for d gives 2.3, so the depth is 3.
Therefore the answer is 3-4.
Oh, and n is 102: the order of the tree, i.e., the number of child pointers per node, which follows from packing 12-byte (a,b,c) keys and 8-byte pointers into a 2 KB page.
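Solving the two equations above numerically (a Python sketch that mirrors this answer's computation, with n = 102 as stated):

import math

n, rows = 102, 36279

# Minimum capacity: 2 * ceil(n/2)^(d-2) * ceil((n-1)/2) = rows, solved for d.
d_min_cap = 2 + math.log(rows / (2 * math.ceil((n - 1) / 2))) / math.log(math.ceil(n / 2))
print(round(d_min_cap, 1))  # -> 3.5, rounded up: depth 4

# Maximum capacity: n^(d-1) * (n-1) = rows, solved for d.
d_max_cap = 1 + math.log(rows / (n - 1)) / math.log(n)
print(round(d_max_cap, 1))  # -> 2.3, rounded up: depth 3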
I have a table with sorted numbers like:
1 320102
2 5200100
3 92010023
4 112010202
5 332020201
6 332020411
:
5000000000 3833240522044511
5000000001 3833240522089999
5000000002 4000000000213312
Given the record number, I need the value in O(log n) time. The record numbers are 64-bit and there are no missing record numbers. The values are 64-bit, they are sorted, and value(n) < value(n+1).
The obvious solution is simply to use an array and index it by record number. That costs 64 bits per value.
But I would like a more space-efficient way of doing it. Since we know the values are always increasing, that should be doable, but I do not remember a data structure that allows it.
A solution would be to run deflate over the array, but that would not give me O(log n) access to an element, which makes it unacceptable.
Do you know of a data structure that will give me:
O(log n) for access
space requirement < 64-bit/value
= Edit =
Since we know all the numbers in advance, we could compute the difference between each pair of adjacent numbers. Taking the 99th percentile of these differences gives a relatively modest number, and taking log2 of it gives the number of bits needed to represent that modest number; let us call that modest-bits.
Then create this:
64-bit value of record 0
64-bit value of record 1024
64-bit value of record 2048
64-bit value of record 3072
64-bit value of record 4096
Then a delta table for all records:
modest-bits difference to record 0
modest-bits difference to previous record
1022 * modest-bits difference to previous record
modest-bits difference to record 1024
The modest-bits difference at record k*1024 will always be 0 (those records' values are stored exactly above), so we can use that for signaling: if it is non-zero, the following 64 bits are a pointer to a plain array holding the next 1024 records as 64-bit values.
As the modest value is chosen as the 99th percentile, that will happen at most 1% of the time, thus wasting at most 1% * n * modest-bits + 1% * n * 64-bit * 1024.
space: O(modest-bits * n + 64-bit * n / 1024 + 1% * n * modest-bits + 1% * n * 64-bit * 1024)
lookup: O(1 + 1024)
(99% and 1024 may have to be adjusted)
= Edit2 =
Based on the idea above, but wasting less space. Create this:
64-bit value of record 0
64-bit value of record 1024
64-bit value of record 2048
64-bit value of record 3072
64-bit value of record 4096
And for every value that cannot be represented in modest-bits, create a big-value table, stored as a tree:
64-bit position, 64-bit value
64-bit position, 64-bit value
64-bit position, 64-bit value
Then a delta table for all records, which is reset every 1024 records:
modest-bits difference to record 0
modest-bits difference to previous record
1022 * modest-bits difference to previous record
modest-bits difference to record 1024
but it is also reset after every value that appears in the big-value table.
space: O(modest-bits * n + 64-bit * n / 1024 + 1% * n * 2 * 64-bit).
A lookup requires searching the big-value table, then looking up the nearest 1024th anchor value, and finally summing up the modest-bits deltas.
lookup: O(log(big-value table) + 1 + 1024) = O(log n)
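For concreteness, a minimal Python sketch of this lookup (ignoring the bit-packing; the array names are illustrative, with the anchors, deltas, and big-value table materialized as plain arrays):

import bisect

def lookup(i, anchors, deltas, big_positions, big_values):
    # anchors[k] = exact 64-bit value of record k*1024
    # deltas[j]  = modest-bits delta of record j from record j-1 (0 at block starts)
    # big_positions/big_values = sorted outlier table (the "big-value" tree)
    value = anchors[i // 1024]
    for j in range(i // 1024 * 1024 + 1, i + 1):
        k = bisect.bisect_left(big_positions, j)      # O(log n) outlier check
        if k < len(big_positions) and big_positions[k] == j:
            value = big_values[k]                     # outlier stored exactly
        else:
            value += deltas[j]
    return value

print(lookup(2, [100], [0, 5, 2], [], []))  # -> 107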
Can you improve this? Or do better in a different way?
The OP proposes splitting the numbers into blocks (only once). But this process may be continued: split every block once more, and again... Finally we get a binary trie.
The root node contains the value of the number with the least index. Its right descendant stores the difference between the middle number in the table and the number with the least index: d = A[N/2] - A[0] - N/2. This continues for the other right descendants (the red nodes on the diagram, shown as "+-r" in the listing below). Leaf nodes contain deltas from the preceding numbers: d = A[i+1] - A[i] - 1.
So most of the values stored in the trie are delta values. Each of them occupies less than 64 bits, and for compactness they may be stored as variable-bit-length numbers in a bit stream. To get the length of each number and to navigate this structure in O(log N) time, the bit stream should also contain the lengths of (some) numbers and (some) subtrees:
Each node contains the length (in bits) of its left subtree (if it has one).
Each right descendant ("red" node), except leaf nodes, contains the length (in bits) of its value. A leaf node's length may be calculated from the other lengths on the path from the root to it.
Each right descendant ("red" node) contains the difference between the corresponding value and the value of the nearest "red" node up the path.
All nodes are packed into the bit stream, starting from the root node, in order: a left descendant always follows its ancestor; a right descendant follows the subtree rooted at the left descendant.
To access an element given its index, use the index's binary representation to follow the path in the trie. While traversing this path, add together all the values of the "red" nodes, then add the index itself. Stop when no more non-zero bits are left in the index.
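Ignoring the bit-packing, the delta arithmetic itself is compact. Here is a minimal Python sketch; the nearest "red" ancestor of index p is p with its lowest set bit cleared:

def build_deltas(A):
    # d[p] = A[p] - A[q] - (p - q), where q is p with its lowest set bit
    # cleared (the nearest "red" ancestor in the trie).
    d = {}
    for p in range(1, len(A)):
        q = p & (p - 1)
        d[p] = A[p] - A[q] - (p - q)
    return d

def lookup(i, A0, d):
    # Follow the set bits of the index from MSB to LSB, accumulating deltas.
    value, p = A0, 0
    for bit in reversed(range(i.bit_length())):
        if i >> bit & 1:
            p |= 1 << bit
            value += d[p]
    return value + i

A = [320102, 5200100, 92010023, 112010202, 332020201,
     332020411, 3833240522044511, 3833240522089999, 4000000000213312]
d = build_deltas(A)
print(lookup(7, A[0], d))  # walks red nodes 4, 6, 7 -> 3833240522089999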
There are several options for storing the N/2 value lengths:
Allocate as many bits for each length as needed to represent all values, from the largest length down to somewhere below the mean length (excluding some very short outliers).
Also exclude some long outliers (keep them in a separate map).
Since the lengths may not be evenly distributed, it is reasonable to use Huffman coding for the value lengths.
Either fixed-length or Huffman encodings should be different for each trie depth.
N/4 of the subtree lengths are, in fact, value lengths, because the N/4 smallest subtrees contain a single value.
The other N/4 subtree lengths may be stored in words of fixed (predefined) length, so that for large subtrees we know only approximate (rounded-up) lengths.
For 2^30 full-range 64-bit numbers we have to pack approximately 34-bit values, for 3/4 of the nodes approx. 4-bit value lengths, and for every fourth node 10-bit subtree lengths, which saves 34% of the space.
Example values:
0 320102
1 5200100
2 92010023
3 112010202
4 332020201
5 332020411
6 3833240522044511
7 3833240522089999
8 4000000000213312
Trie for these values:
root d=320102 vl=19 tl=84+8+105+4+5=206
+-l tl=75+4+5=84
| +-l tl=23
| | +-l
| | | +-r d=4879997 (vl=23)
| | +-r d=91689919 vl=27
| | +-r d=20000178 (vl=25)
| +-r d=331700095 vl=29 tl=8
| +-l
| | +-r d=209 (vl=8)
| +-r d=3833240190024308 vl=52
| +-r d=45487 (vl=16)
+-r d=3999999999893202 vl=52
Value length encoding:
bits start end
Root 0 19 19
depth 1 0 52 52
depth 2 0 29 29
depth 3 5 27 52
depth 4 4 8 23
Sub-tree lengths need 8 bits each.
Here is the encoded stream (binary values still shown in decimal for readability):
bits value comment
19 320102 root value
8 206 left subtree length of the root
8 84 left subtree length
4 15 smallest left subtree length (with base value 8)
23 4879997 value for index 1
5 0 value length for index 2 (with base value 27)
27 91689919 value for index 2
25 20000178 value for index 3
29 331700095 value for index 4
4 0 smallest left subtree length (with base value 8)
8 209 value for index 5
5 25 value length for index 6 (with base value 27)
52 3833240190024308 value for index 6
16 45487 value for index 7
52 3999999999893202 value for index 8
Altogether 285 bits, or 5 64-bit words. We also need to store the bits/start values from the value-length encoding table (350 bits). To store the total of 635 bits we need 10 64-bit words, which means such a small number table cannot be compressed. For larger number tables, the size of the value-length encoding table is negligible.
To search for the value at index 7: read the root value (320102), skip 206 bits, add the value for index 4 (331700095), skip 8 bits, add the value for index 6 (3833240190024308), add the value for index 7 (45487), and add the index (7). The result is 3833240522089999, as expected.
I would do it in blocks, as you outline in your question. Pick a block size k such that you can accept having to decode on average k/2 values before getting to the one you're after. For n total values, you will have n/k blocks. A table with n/k entries would point into the data stream to find the starting point of each block. Finding where to go in that table would be O(log(n/k)) with a binary search; or, if the table is small enough and it matters, you could make it about O(1) with an auxiliary hash table.
Each block would start with a 64-bit starting value. All values after that would be stored as deltas from the preceding value. My suggestion is to store those deltas as a Huffman code that says how many bits are in the next value, followed by that many bits. The Huffman code would be optimized for each block, and a description of that code would be stored at the beginning of the block.
You could simplify this by just preceding each value with six bits giving the number of bits that follow, in the range 1..64, effectively a flat Huffman code. Depending on the histogram of the bit lengths, an optimized Huffman code could knock off a good number of bits compared to the flat code.
Once you have this set up, you can experiment with k and see how small you can make it while still having limited impact on the compression.
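A minimal sketch of the simplified (flat six-bit length) scheme, writing one block into a Python integer used as a bit stream (the helper name is made up for illustration):

def encode_block(values):
    # Bit stream as (integer, bit_count); the first value is stored raw in 64 bits.
    bits, n = values[0], 64
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev                  # > 0 because values are sorted
        length = delta.bit_length()         # 1..64
        bits = (bits << 6) | (length - 1)   # 6-bit length prefix (1..64 -> 0..63)
        bits = (bits << length) | delta     # then the delta itself
        n += 6 + length
    return bits, n

block = [320102, 5200100, 92010023, 112010202]
bits, n = encode_block(block)
print(n)  # -> 157 bits; compare with 4 * 64 = 256 for the raw array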
I do not know of a data structure that does that.
The obvious solution to gain space without losing too much speed would be to create your own structure, with different array sizes based on the different integer sizes you store.
Pseudo-code
class memoryAwareArray {
  array16 = Int16[]   // 2 bytes per value
  array32 = Int32[]   // 4 bytes per value
  array64 = Int64[]   // 8 bytes per value
  max16Index = 0;
  max32Index = 0;
  max64Index = 0;

  // Values must be appended in sorted order, so all values that fit in
  // 16 bits come first, then the 32-bit ones, then the rest.
  append(value) {
    if (value <= 32767) {
      array16[max16Index] = value;
      max16Index++;
      return;
    }
    if (value <= 2147483647) {
      array32[max32Index] = value;
      max32Index++;
      return;
    }
    array64[max64Index] = value;
    max64Index++;
  }

  getObject(index) {
    if (index < max16Index) return(array16[index]);
    if (index < max16Index + max32Index) return(array32[index - max16Index]);
    return(array64[index - max16Index - max32Index]);
  }
}
Something along those lines shouldn't alter the speed too much, and you'd save around 7 GB if you filled up the entire structure. You won't save as much in practice, of course, since you have gaps between your values.