HDFS Mechanism
If each block is saved multiple times for fault tolerance (say 3 times), does that mean that a 100TB data file actually takes up 300TB on HDFS?
Yes, that is correct. The data must be stored on 3 different data nodes and is therefore duplicated 3 times, so a 100TB file occupies roughly 300TB of raw storage.
I have a TEMPDB database with 8 files, but they have different sizes, which is not recommended, as follows:
TempDB Files
I plan to resize them all to the recommended equal size of 20GB each, so TEMPDB will total 160GB. My question is: if SQL Server executes an operation that needs 23GB, will the remaining 3GB spill over to another file, or will a single tempdb file grow to accommodate the operation?
If the files grow, I will end up with 184GB instead of 160GB just because of 3GB.
My question is: if SQL Server executes an operation that needs 23GB, will the remaining 3GB spill over to another file, or will a single tempdb file grow to accommodate the operation?
SQL Server uses a proportional fill algorithm to fill its data files: it spreads new allocations across all the files in proportion to the amount of free space in each, so the file with the most free space receives the most data.
If all your files are the same size and you have 8 data files, the 23GB will be distributed roughly evenly among them, about 2.9GB per file.
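For illustration, here is a minimal sketch of that proportional distribution (not SQL Server's actual allocator; the 8 files and their equal free-space figures are assumptions that mirror the numbers in the question):

    /* Sketch: each file receives a share of the 23 GB request in proportion to
     * its free space. 8 files with equal free space mirror the question; this
     * is a simplification of proportional fill, not SQL Server's real code. */
    #include <stdio.h>

    int main(void) {
        double free_gb[8] = {20, 20, 20, 20, 20, 20, 20, 20};  /* free space per data file */
        double request_gb = 23.0;

        double total_free = 0;
        for (int i = 0; i < 8; i++)
            total_free += free_gb[i];

        for (int i = 0; i < 8; i++) {
            double share = request_gb * free_gb[i] / total_free;
            printf("tempdb file %d receives about %.2f GB\n", i + 1, share);
        }
        return 0;
    }

With equal free space this prints roughly 2.88GB per file, so no single file needs to grow.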
I have an Apache Solr 4.2.1 instance that has 4 cores with a total size of about 21GB (625MB + 30MB + 20GB + 300MB).
It runs on a 4 Core CPU, 16GB RAM, 120GB HD, CentOS dedicated machine.
1st core is fully imported once a day.
2nd core is fully imported every two hours.
3rd core is delta imported every two hours.
4th core is fully imported every two hours.
The server also handles a decent amount of queries (searches, plus document creates, updates, and deletes).
Every core has maxDocs: 100 and maxTime: 15000 for autoCommit, and maxTime: 1000 for autoSoftCommit.
The System usage is:
Around 97% of 14.96 GB Physical Memory
0MB Swap Space
Around 94% of 4096 File Descriptor Count
From 60% to 90% of 1.21GB of JVM-Memory.
When I reboot the machine, the File Descriptor Count falls to near 0 and then, steadily over the course of a week or so, climbs back to the aforementioned value.
So, to conclude, my questions are:
Is 94% of 4096 File Descriptor Count normal?
How can I increase the maximum File Descriptor Count?
How can I calculate the theoretical optimal value for the maximum and used File Descriptor Count?
Will the File Descriptor Count reach 100%? If it does, will the server crash? Or will it keep itself below 100% and function as it should?
Thanks a lot beforehand!
Sure.
ulimit -n <number>. See Increasing ulimit on CentOS. (A small sketch of the equivalent getrlimit/setrlimit calls follows after these answers.)
There really isn't one; it's as many as needed, depending on a lot of factors, such as your mergeFactor (if you have many index files, the number of open files will be large as well; this is especially true for the indices that aren't full imports, so check the number of files in your data directories and issue an optimize if an index has become very fragmented and has a large mergeFactor), the number of searchers, other software running on the same server, etc.
It could. Yes (or at least it won't function properly, as it won't be able to open any new files). No. In practice you'll see errors saying "Too many open files" whenever it tries to open one more.
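For reference, here is a minimal sketch of what ulimit -n manipulates underneath (POSIX getrlimit/setrlimit; the 65536 target is an arbitrary example, and raising the hard limit still requires root or a limits.conf change):

    /* Sketch: inspect and raise this process's open-file limit (RLIMIT_NOFILE),
     * the same limit that `ulimit -n` changes in the shell. 65536 is an
     * arbitrary example value. */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void) {
        struct rlimit rl;
        getrlimit(RLIMIT_NOFILE, &rl);
        printf("soft limit: %llu, hard limit: %llu\n",
               (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);

        /* Raise the soft limit, but never above the hard limit. */
        rl.rlim_cur = (rl.rlim_max < 65536) ? rl.rlim_max : 65536;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            perror("setrlimit");
        return 0;
    }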
So, the issue with the File Descriptor Count (FDC), and more precisely with the ever-increasing FDC, was that I was committing after every update!
I noticed that Solr wasn't deleting old transaction logs. Thus, after a period of about one week, the FDC was maxing out and I was forced to reboot.
I stopped committing after every update and now my Solr stats are:
Around 55% of 14.96 GB Physical Memory
0MB Swap Space
Around 4% of 4096 File Descriptor Count
From 60% to 80% of 1.21GB of JVM-Memory.
Also, the old transaction logs are now deleted by auto commit (soft & hard), and Solr no longer reports performance warnings!
So, as pointed out very well in this article:
Understanding Transaction Logs, Soft Commit and Commit in SolrCloud
"Be very careful committing from the client! In fact, don’t do it."
I have an array of 3 different drives which I use in the single profile (no RAID). I don't use RAID because the data isn't important enough to justify spending extra money on additional drives.
But what I could not figure out exactly is at what granularity the data is distributed across the 3 drives.
I could find this on the wiki page:
When you have drives with differing sizes and want to use the full capacity of each drive, you have to use the single profile for the data blocks, rather than raid0
As far as I understand, this means that allocation is not done per file (whole files placed on one of the 3 drives) but per data block.
This is unfortunate, because losing just 1 drive will destroy the whole array. Is it possible to balance a single-profile array at the file level?
I would be fine with the risk of losing all the files on 1 drive in the array, but not with losing the whole array if 1 drive fails.
While reading this, I found a reasonable answer, which says:
Case 1: Directly Writing to File On Disk
100 times x 1 ms = 100 ms
I understood that. Next,
Case 3: Buffering in Memory before Writing to File on Disk
(100 times x 0.5 ms) + 1 ms = 51 ms
I didn't understand the 1 ms. What is the difference between writing 100 pieces of data to disk and writing 1 piece of data to disk? Why do both of them cost 1 ms?
Disk access (transferring data to the disk) does not happen byte by byte; it happens in blocks. So we cannot conclude that if writing 1 byte of data takes 1 ms, then writing x bytes of data will take x ms. It is not a linear relationship.
The amount of data written to the disk at a time depends on the block size. For example, if a disk access costs you 1 ms and the block size is 512 bytes, then a write of anywhere between 1 and 512 bytes costs you the same: 1 ms.
So, coming back to the equation: if you have, say, 16 bytes of data to be written in each operation, for 20 iterations, then
for the direct write case:
time = 20 iterations * 1 ms = 20 ms
for buffered access:
time = (20 iterations * 0.5 ms buffering time) + 1 ms (to write it all at once) = 10 ms + 1 ms = 11 ms
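Here is a minimal C sketch of the same idea (the file names are placeholders; in practice the OS page cache softens the difference, as a later answer notes, but the number of write requests is still what dominates):

    /* Sketch: write the same data once byte-by-byte with one write() call per
     * byte, and once as a single buffered write. The cost is dominated by the
     * number of write requests, not the number of bytes. POSIX calls only. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    #define N 320   /* 20 iterations of 16 bytes, as in the example above */

    int main(void) {
        char data[N];
        memset(data, 'x', sizeof data);

        /* "Direct" case: N separate write requests of 1 byte each. */
        int fd1 = open("direct.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        for (int i = 0; i < N; i++)
            write(fd1, &data[i], 1);
        close(fd1);

        /* "Buffered" case: accumulate in memory, then issue one write request. */
        int fd2 = open("buffered.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        write(fd2, data, sizeof data);
        close(fd2);

        return 0;
    }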
It is because of how the disk physically works.
Disks can take larger buffers (called pages) and save them in one go.
If you save to disk every time, you need multiple alterations of the same page; if you use a buffer instead, you edit quickly accessible memory and then save everything in one go.
The example is explaining the cost of the operations.
For loading the data into memory you have 100 operations at 0.5 ms each, and then you have one operation that alters the disk (an I/O operation). What is not described in the answer, and is probably not obvious, is that nearly all disks provide a bulk transfer operation. So 1 I/O operation means 1 save to the disk, not necessarily 1 byte saved (it can be much more data).
When writing 1 byte at a time, each write requires:
disk seek time (which can vary) to place the 'head' over the correct track on the disk,
disk rotational latency while waiting for the correct sector of the disk to be under the 'head',
disk read time while the sector is read (the rotational latency and sector read time may have to be performed more than once if the CRC does not match that saved on the disk),
insertion of the new byte into the correct location in the sector,
rotational latency waiting for the proper sector to again be under the 'head',
sector write time (including the new CRC).
Repeating all the above for each byte (esp. since a disk is orders of magnitude slower than memory) takes a LOT of time.
It takes no longer to write a whole sector of data than to update a single byte.
That is why writing a buffer full of data is so very much faster than writing a series of individual bytes.
There are also other overheads like updating the inodes that:
track the directories
track the individual file
Each of those directory and file inodes is updated each time the file is updated.
Those inodes are (simply) other sectors on the disk. Overall, lots of disk activity occurs each time a file is modified.
So modifying the file only once rather than numerous times is a major time saving. Buffering is the technique used to minimize the number of disk activities.
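As an aside, C's stdio already applies this buffering technique; here is a minimal sketch (the file name and the 64 KiB buffer size are arbitrary choices) of letting the runtime batch many tiny writes into a few large ones:

    /* Sketch: a fully buffered stream collects many one-byte writes in RAM and
     * flushes them to the OS in far fewer, larger writes. "out.dat" and the
     * 64 KiB buffer size are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        FILE *f = fopen("out.dat", "wb");
        if (!f) return 1;

        char *buf = malloc(64 * 1024);
        setvbuf(f, buf, _IOFBF, 64 * 1024);   /* fully buffered, 64 KiB */

        /* 100,000 one-byte writes: almost all of them only touch memory. */
        for (int i = 0; i < 100000; i++)
            fputc('x', f);

        fclose(f);   /* flushes whatever is still buffered in one go */
        free(buf);
        return 0;
    }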
Among other things, data is written to disk in whole "blocks" only. A block is usually 512 bytes. Even if you only change a single byte inside the block, the OS and the disk will have to write all 512 bytes. If you change all 512 bytes in the block before writing, the actual write will be no slower than when changing only one byte.
The automatic caching inside the OS and/or the disk does in fact avoid this issue to a great extent. However, every "real" write operation requires a call from your program to the OS, and probably all the way through to the disk driver. This takes some time. In comparison, writing into a char/byte/... array in your own process's memory in RAM costs virtually nothing.
I have 256 CSV files totaling 5GB, which I need to read at optimum speed and then write back in binary form.
I made the following arrangements to achieve this:
For each file, there is one corresponding thread.
I am using the C functions fscanf and fwrite.
But Resource Monitor shows no more than 12 MB/s of hard disk throughput and 100% Active Time (highest).
Google says a hard disk can read/write at up to 100 MB/s.
My machine configuration is:
Intel Core i7, 3.4 GHz, with 8 cores.
Please give me your perspective.
My aim is to complete this process within 1 minute.
Using one thread it took me 12 minutes.
If all the files reside on the same disk, using multiple threads is likely to be counter-productive. If you read from many files in parallel, the HDD heads will keep moving back and forth between different areas of the disk, drastically reducing throughput.
I would measure how long it takes a built-in OS utility to read the files (on Unix, something like dd or cat into /dev/null) and then use that as a baseline, bearing in mind that you also need to write stuff back. Writing can be costly both in terms of throughput and seek times.
I would then come up with a single-threaded implementation that reads and writes data in large chunks (see the sketch after this answer) and see whether I can get it to perform similarly to the OS tools.
P.S. If you have 5GB of data, your HDD's top raw throughput is 100MB/s, and you also need to write the converted data back onto the same disk, your goal of 1 minute is not realistic: at 100MB/s, reading 5GB alone already takes about 50 seconds, before any writing or seeking.
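Here is a minimal single-threaded sketch of the large-chunk approach (the file names, the 4 MiB chunk size, and the pass-through "conversion" are placeholders, not the asker's actual code):

    /* Sketch: process files one after another so the disk streams sequentially,
     * reading and writing in large chunks. The CSV-to-binary conversion is left
     * as a placeholder comment. */
    #include <stdio.h>
    #include <stdlib.h>

    #define CHUNK (4 * 1024 * 1024)   /* 4 MiB per read/write */

    int convert_file(const char *in_path, const char *out_path) {
        FILE *in = fopen(in_path, "rb");
        FILE *out = fopen(out_path, "wb");
        char *buf = malloc(CHUNK);
        if (!in || !out || !buf) {
            if (in) fclose(in);
            if (out) fclose(out);
            free(buf);
            return -1;
        }

        size_t n;
        while ((n = fread(buf, 1, CHUNK, in)) > 0) {
            /* Real code would parse the CSV chunk and emit binary records here. */
            fwrite(buf, 1, n, out);
        }

        free(buf);
        fclose(in);
        fclose(out);
        return 0;
    }

    int main(void) {
        /* Placeholder file names; loop over all 256 files in real code. */
        return convert_file("input000.csv", "output000.bin");
    }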