Sequential Scan Pages - database

Does a sequential scan imply that all pages in a database table file are sequential on disk? Or are the pages laid out in different parts of the disk, so that a sequential scan must still do multiple seeks to locate those pages?

Related

What is the difference between having one data file or multiple data files for a tablespace?

Oracle allows creating a tablespace with multiple data files. What is the difference between one 1TB data file and two 500GB data files? Is there any performance gain?
Performance? Could be. If you have one large (or two smaller) datafiles on the same hard disk, that will probably run somewhat slower than having two smaller datafiles on different hard disks. Say you and I are accessing data at the same time: the HDD head will have to "jump" from one place to another to serve data to both of us. If those were two disks, there's a chance that each disk would serve its data separately, and that would be faster.

TempDB with big files - Performance issue

I have a TEMPDB database with 8 files, but they have different sizes, which is not recommended, as follows:
TempDB Files
I have a plan to resize them to the same size as recommended, 20GB each, so the TempDB total will be 160GB. My question is, if SQL Server is executing an operation which needs 23GB, will the remaining 3GB be split to another file, or will the files grow to accommodate the operation in just one tempdb file?
If the files grow, instead of 160GB I will end up with 184GB just because of 3GB...
My question is, if SQL Server is executing an operation which needs 23GB, will the remaining 3GB be split to another file, or will the files grow to accommodate the operation in just one tempdb file?
SQL Server uses a proportional fill algorithm to fill up its data files: it spreads the data across all your files in proportion to the amount of free space in each of them, so the file with the most free space receives the most data.
If all your files are of equal size and you have 8 files, the 23GB will be distributed roughly evenly among those 8 files, about 2.9GB per file.
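As a rough illustration of how proportional fill plays out (a simplified model, not SQL Server's actual code; the file names and the 64 MB extent size are made up), an allocator that always targets the file with the most free space spreads the 23GB evenly across eight equal files:

    #include <stdio.h>

    /* Simplified model of proportional fill: each new extent goes to the
     * data file with the most free space, so equally sized, equally full
     * files receive data in a round-robin fashion. */
    typedef struct {
        char      name[16];
        long long free_mb;
    } DataFile;

    static int pick_file(const DataFile *files, int n)
    {
        int best = 0;
        for (int i = 1; i < n; i++)
            if (files[i].free_mb > files[best].free_mb)
                best = i;
        return best;
    }

    int main(void)
    {
        enum { NFILES = 8, EXTENT_MB = 64 };   /* extent size is illustrative */
        DataFile files[NFILES];
        long long written_mb[NFILES] = {0};

        /* Eight tempdb files of 20 GB each, all empty. */
        for (int i = 0; i < NFILES; i++) {
            snprintf(files[i].name, sizeof files[i].name, "tempdb%d", i + 1);
            files[i].free_mb = 20 * 1024;
        }

        /* Allocate a 23 GB operation, extent by extent. */
        for (long long left_mb = 23 * 1024; left_mb > 0; left_mb -= EXTENT_MB) {
            int f = pick_file(files, NFILES);
            files[f].free_mb -= EXTENT_MB;
            written_mb[f]    += EXTENT_MB;
        }

        for (int i = 0; i < NFILES; i++)
            printf("%s: %.1f GB\n", files[i].name, written_mb[i] / 1024.0);
        return 0;
    }

Running this prints roughly 2.9 GB per file: no single file absorbs the whole 23 GB, and the files stay the same size.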

Read a file after write and closing it in C

My code does the following:
1. do this 100 times: open a new file; write 10M of data; close it
2. open the 100 files together, read and merge their data into a larger file
3. repeat steps 1 and 2 many times in a loop
I was wondering if I can keep the 100 files open without opening and closing them so many times. What I can do is fopen them with "w+". After writing, I set the position to the beginning to read; after reading, I set the position to the beginning to write; and so on.
The questions are:
1. If I read after a write without closing, will I always read all the written data?
2. Would this save some overhead? File open and close must have some overhead, but is this overhead large enough to be worth saving?
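For reference, a minimal sketch of the current pattern described above (the file names, record size, and data generator are made up, the merge interleaves fixed-size records from each file, and error handling is trimmed):

    #include <stdio.h>
    #include <stdlib.h>

    #define NFILES 100
    #define CHUNK  (10 * 1024 * 1024)   /* 10M per temporary file, as in the question */
    #define REC    16                   /* illustrative record size */

    /* Stand-in for the real calculation. */
    static void produce_data(char *buf, long len, int which)
    {
        for (long i = 0; i < len; i++)
            buf[i] = (char)('a' + which % 26);
    }

    int main(void)
    {
        char *buf = malloc(CHUNK);
        char name[32];
        if (!buf) return 1;

        /* Step 1, done 100 times: open a new file, write 10M of data, close it. */
        for (int i = 0; i < NFILES; i++) {
            snprintf(name, sizeof name, "part%03d.tmp", i);
            FILE *f = fopen(name, "wb");
            produce_data(buf, CHUNK, i);
            fwrite(buf, 1, CHUNK, f);
            fclose(f);
        }

        /* Step 2: open the 100 files together and merge their data,
         * record by record, into a larger file. */
        FILE *in[NFILES], *out = fopen("merged.dat", "wb");
        for (int i = 0; i < NFILES; i++) {
            snprintf(name, sizeof name, "part%03d.tmp", i);
            in[i] = fopen(name, "rb");
        }
        char rec[REC];
        for (long r = 0; r < CHUNK / REC; r++)
            for (int i = 0; i < NFILES; i++) {
                fread(rec, 1, REC, in[i]);
                fwrite(rec, 1, REC, out);
            }
        for (int i = 0; i < NFILES; i++)
            fclose(in[i]);
        fclose(out);
        free(buf);
        return 0;
    }

The question is whether replacing the fclose()/fopen() cycle in step 1 with streams kept open in "w+" mode is worth the effort.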
Based on the comments and discussion, I will explain why I need to do this in my work. It is also related to my other post:
how to convert large row-based tables into column-based tables efficiently
I have a calculation that generates a stream of results. So far the results are saved in a row-storage table. This table has 1M columns, and each column could be 10M long; each column is actually one attribute the calculation produces. As the calculation runs, I dump and append the intermediate results to the table. The intermediate results could be 2 or 3 double values per column. I want to dump them soon because they already consume >16M of memory, and the calculation needs more memory. This ends up as a table like the following
aabbcc...zzaabbcc..zz.........aabb...zz
A row of data is stored together. The problem appears when I want to analyze the data column by column: I have to read 16 bytes, then seek to the next row to read another 16 bytes, and keep going. There are too many seeks; it is much slower than if all the columns were stored together so I could read them sequentially.
I can make the calculation dump less frequently, but to make the later reads more efficient I may want to have 4K of data stored together, since I assume each fread gets 4K by default even if I read only 16 bytes. But that means I would need to buffer 1M*4K = 4G in memory...
So I was wondering if I can merge the fragmented data into larger chunks, as that post suggests:
how to convert large row-based tables into column-based tables efficiently
So I wanted to use files as offline buffers. I may need 256 files to get 4K of contiguous data per column after the merge, if each file holds 2 doubles (16 bytes) per column. This work can be done asynchronously with respect to the main calculation. But I want to make sure the merge overhead is small, so that when it runs in parallel it can finish before the main calculation is done. So I came up with this question.
I guess this is closely related to how column-based databases are constructed. When people create them, do they have similar issues? Is there any description of how this works at creation time?
You can use w+ as long as the maximum number of open files on your system allows it; this is usually 255 or 1024, and can be set (e.g. on Unix by ulimit).
But I'm not too sure this will be worth the effort.
On the other hand, 100 files of 10M each is one gigabyte; you might want to experiment with a RAM disk. Or with a large file system cache.
I suspect that much bigger savings might be reaped by analyzing your specific problem structure. Why is it 100 files? Why 10M? What kind of "merge" are you doing? Are those 100 files always accessed in the same order and with the same frequency? Could some data be kept in RAM and never be written at all?
Update
So, you have several large buffers like,
ABCDEFG...
ABCDEFG...
ABCDEFG...
and you want to pivot them so they read
AAA...
BBB...
CCC...
If you already know the total size (i.e., you know that you are going to write 10 GB of data), you can do this with two files, pre-allocating the output file and using fseek() to write into it. With memory-mapped files, this should be quite efficient. In practice, row Y, column X (of 1,000,000 columns) has been dumped at offset 16*X in file Y.dat; to make each column contiguous, you need to write it at offset 16*(X*number_of_rows + Y) in largeoutput.dat.
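A minimal sketch of that pivot under those assumptions (the 100-row count is made up for illustration, the per-cell writes are deliberately naive, and error handling is omitted; a real implementation would need 64-bit offsets, e.g. fseeko() or mmap(), for multi-gigabyte outputs):

    #include <stdio.h>

    #define NUM_ROWS  100        /* number of row dumps / Y.dat files (illustrative) */
    #define NUM_COLS  1000000L   /* 1M columns, as in the question */
    #define CELL      16         /* 2 doubles per cell */

    /* Pivot sketch: row Y, column X sits at offset CELL*X in Y.dat and is
     * written to offset CELL*(X*NUM_ROWS + Y) in largeoutput.dat, so that
     * each column ends up contiguous. */
    int main(void)
    {
        FILE *out = fopen("largeoutput.dat", "wb");
        char cell[CELL], name[32];

        /* Extend the output file to its final size up front (sparse on most
         * filesystems; posix_fallocate() would reserve real blocks). */
        fseek(out, NUM_ROWS * NUM_COLS * CELL - 1, SEEK_SET);
        fputc(0, out);

        for (long y = 0; y < NUM_ROWS; y++) {
            snprintf(name, sizeof name, "%ld.dat", y);
            FILE *in = fopen(name, "rb");
            for (long x = 0; x < NUM_COLS; x++) {
                fread(cell, 1, CELL, in);                        /* sequential read */
                fseek(out, (x * NUM_ROWS + y) * CELL, SEEK_SET);
                fwrite(cell, 1, CELL, out);                      /* scattered write */
            }
            fclose(in);
        }
        fclose(out);
        return 0;
    }

In practice you would batch the scattered writes or mmap() the output, as suggested above; the sketch only shows the offset arithmetic.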
Actually, you could write the data even during the first calculation. Or you could have two processes communicating via a pipe, one calculating, one writing to both the row-column and column-row files, so that you can monitor the performance of each.
Frankly, I think that adding more RAM and/or a fast I/O layer (SSD maybe?) could get you more bang for the same buck. Your time costs too, and the memory will remain available after this one work has been completed.
Yes. You can keep the 100 files open without doing the opening-closing-opening cycle. Most systems do have a limit on the number of open files though.
If I read after a write without closing, will I always read all the written data?
That is up to you: you can fseek to wherever you want in the file and read the data from there. It is entirely a matter of your own logic.
Would this save some overhead? File open and close must have some overhead, but is this overhead large enough to be worth saving?
This would definitely save some overhead, such as additional unnecessary I/O operations. Also, on some systems the content you write to a file is not immediately flushed to the physical file; it may be buffered and flushed periodically and/or at fclose time.
So such overheads are saved, but the real question is: what do you achieve by saving them? How does it fit into the overall picture of your application? That is the call you must make before deciding on the logic.
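A minimal sketch of the reuse pattern being discussed (the file name and data are made up): per the C standard, an update-mode ("w+") stream needs fflush() or a file-positioning call (fseek, fsetpos, rewind) between a write and a subsequent read, and a positioning call again before switching back to writing.

    #include <stdio.h>

    /* Keep one file open in update mode ("w+") and reuse it: write,
     * reposition, read back, reposition, overwrite -- with no
     * fclose()/fopen() in between. */
    int main(void)
    {
        FILE *f = fopen("buffer0.tmp", "w+b");   /* hypothetical file name */
        if (!f) return 1;

        const char data[] = "intermediate results";
        char back[sizeof data];

        /* Write a chunk of results... */
        fwrite(data, 1, sizeof data, f);

        /* ...switch to reading: reposition first, as required when going
         * from output to input on an update stream. */
        rewind(f);
        fread(back, 1, sizeof back, f);
        printf("read back: %s\n", back);

        /* ...switch back to writing: reposition again before the next write. */
        rewind(f);
        const char next[] = "next round of results";
        fwrite(next, 1, sizeof next, f);

        fclose(f);
        return 0;
    }

Whether this actually beats 100 open/close cycles is exactly the "is the overhead large enough" question above; measuring both variants on the real workload is the only reliable way to decide.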

Would reading a file sequentially result in random disk seeks?

I was under the impression that a sequential scan of a file would actually be a sequential seek on disk. However, I read recently that the blocks of a file might not be written contiguously on disk by the file system. If inodes are used as a map and each block is obtained by following a block pointer, I am wondering whether the mechanism by which a file system retrieves the blocks of a file is actually sequential.
If the answer is file-system dependent, it would be great to cite some major filesystems.
Thanks.
Filesystems try to allocate as many sequential blocks as possible during writes. But as they age (i.e. lots of creates and deletes over time), fragmentation becomes inevitable. There are heuristics to reduce fragmentation, such as speculative preallocation and delayed allocation. Applications themselves can do things like preallocation (e.g. fallocate), enabling readahead, and running defragmentation tools, depending on the features available in the filesystem, to make the blocks contiguous or at least make reads faster.
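As an example of the application-side options just mentioned, here is a minimal Linux/POSIX sketch (file names are illustrative): posix_fallocate() asks the filesystem to reserve space up front, which gives it a chance to pick large contiguous extents, and posix_fadvise(POSIX_FADV_SEQUENTIAL) hints that a file will be read front to back so the kernel can use readahead more aggressively.

    #define _POSIX_C_SOURCE 200112L
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Pre-allocate 1 GiB for a file we are about to write, so the
         * filesystem can try to reserve one large contiguous extent. */
        int wfd = open("big.dat", O_CREAT | O_WRONLY, 0644);
        if (wfd < 0) { perror("open big.dat"); return 1; }
        if (posix_fallocate(wfd, 0, 1024L * 1024 * 1024) != 0)
            fprintf(stderr, "posix_fallocate failed (filesystem may not support it)\n");
        close(wfd);

        /* Tell the kernel we will read another file sequentially, so it can
         * schedule readahead ahead of our reads. */
        int rfd = open("input.dat", O_RDONLY);
        if (rfd >= 0) {
            posix_fadvise(rfd, 0, 0, POSIX_FADV_SEQUENTIAL);
            /* ...read the file front to back here... */
            close(rfd);
        }
        return 0;
    }

How much this helps depends on the filesystem: ext4 and XFS support extent preallocation natively, while on filesystems that do not, glibc emulates posix_fallocate() by writing to the file, which is slow.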

Data distribution in btrfs single profile array: using file instead of block level?

I have an array of 3 different drives which I use in the single profile (no RAID). I don't use RAID because the data isn't important enough to spend extra money on additional drives.
But what I could not figure out exactly is at what granularity the data is distributed across the 3 drives.
I could find this on the wiki page:
When you have drives with differing sizes and want to use the full capacity of each drive, you have to use the single profile for the data blocks, rather than raid0
As far as I understand, this means that it is not whole files that are distributed/allocated to one of the 3 drives, but each of a file's data blocks.
This is unfortunate because losing just 1 drive will destroy the whole array. Is there a way to balance a single-profile array at the file level?
I would be fine with the risk of losing all the files on 1 drive in the array, but not with losing the whole array if 1 drive fails.
