What is the maximum size of an SQLite database? [duplicate]

I have read their limits FAQ; it talks about many limits, but not about a limit on the size of the whole database.

This is fairly easy to deduce from the implementation limits page:
An SQLite database file is organized as pages. The size of each page is a power of 2 between 512 and SQLITE_MAX_PAGE_SIZE. The default value for SQLITE_MAX_PAGE_SIZE is 32768.
...
The SQLITE_MAX_PAGE_COUNT parameter, which is normally set to 1073741823, is the maximum number of pages allowed in a single database file. An attempt to insert new data that would cause the database file to grow larger than this will return SQLITE_FULL.
So we have 32768 * 1073741823, which is 35,184,372,056,064 (35 trillion bytes)!
You can modify SQLITE_MAX_PAGE_COUNT or SQLITE_MAX_PAGE_SIZE in the source, but this of course will require a custom build of SQLite for your application. As far as I'm aware, there's no way to set a limit programmatically other than at compile time (but I'd be happy to be proven wrong).
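If you just want to see what a particular build and database file actually use, here is a minimal sketch against the standard sqlite3 C API that reads the relevant PRAGMAs (test.db is a placeholder file name):

#include <stdio.h>
#include <sqlite3.h>

/* Print one integer-valued PRAGMA. */
static void show_pragma(sqlite3 *db, const char *pragma)
{
    char sql[64];
    sqlite3_stmt *stmt = NULL;
    snprintf(sql, sizeof sql, "PRAGMA %s;", pragma);
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) == SQLITE_OK &&
        sqlite3_step(stmt) == SQLITE_ROW)
        printf("%s = %lld\n", pragma, (long long)sqlite3_column_int64(stmt, 0));
    sqlite3_finalize(stmt);
}

int main(void)
{
    sqlite3 *db;
    if (sqlite3_open("test.db", &db) != SQLITE_OK)   /* placeholder database */
        return 1;
    show_pragma(db, "page_size");       /* bytes per page for this database */
    show_pragma(db, "max_page_count");  /* effective page limit for this connection */
    sqlite3_close(db);
    return 0;
}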

The limits page has since been updated; the database size limit is now 281 TB (256 TiB):
Every database consists of one or more "pages". Within a single database, every page is the same size, but different databases can have page sizes that are powers of two between 512 and 65536, inclusive. The maximum size of a database file is 4294967294 pages. At the maximum page size of 65536 bytes, this translates into a maximum database size of approximately 2.8e+14 bytes (281 terabytes, or 256 tebibytes, or 281474 gigabytes or 256,000 gibibytes).
This particular upper bound is untested since the developers do not have access to hardware capable of reaching this limit. However, tests do verify that SQLite behaves correctly and sanely when a database reaches the maximum file size of the underlying filesystem (which is usually much less than the maximum theoretical database size) and when a database is unable to grow due to disk space exhaustion.

The new limit is 281 terabytes. https://www.sqlite.org/limits.html

Though this is an old question, let me share my findings for people who reach it.
Although the SQLite documentation states that the maximum size of a database file is ~140 terabytes, your OS imposes its own restrictions on the maximum file size for any type of file.
For example, on a FAT32 disk on Windows, the maximum file size I could achieve for an SQLite db file was 2 GB. (According to Microsoft's site the limit on a FAT32 system is 4 GB, but my SQLite db was still restricted to 2 GB.) On Linux, I was able to reach 3 GB (where I stopped; it could have grown larger).
NOTE: I had written a small Java program that starts populating the SQLite db from 0 rows and keeps inserting until a stop command is given.
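For reference, a rough C equivalent of that kind of fill test against the sqlite3 C API (the original was a Java program; the file and table names here are made up, and the loop simply stops on the first failed insert, e.g. SQLITE_FULL):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    char *err = NULL;
    if (sqlite3_open("fill_test.db", &db) != SQLITE_OK)      /* placeholder name */
        return 1;
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS t(x BLOB);", NULL, NULL, NULL);

    /* Insert 1 MB rows until an insert fails (disk full, size limit, ...). */
    for (long i = 0; ; i++) {
        int rc = sqlite3_exec(db, "INSERT INTO t VALUES (zeroblob(1048576));",
                              NULL, NULL, &err);
        if (rc != SQLITE_OK) {
            printf("stopped after about %ld MB: %s (rc=%d)\n",
                   i, err ? err : "?", rc);
            sqlite3_free(err);
            break;
        }
    }
    sqlite3_close(db);
    return 0;
}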

The maximum number of bytes in a string or BLOB in SQLite is defined by the preprocessor macro SQLITE_MAX_LENGTH. The default value of this macro is 1 billion (1 thousand million or 1,000,000,000). 
The current implementation will only support a string or BLOB length up to 2^31-1, or 2,147,483,647.
The default setting for SQLITE_MAX_COLUMN is 2000. You can change it at compile time to values as large as 32767. On the other hand, many experienced database designers will argue that a well-normalized database will never need more than 100 columns in a table.
SQLite does not support joins containing more than 64 tables.
The theoretical maximum number of rows in a table is 2^64 (18446744073709551616 or about 1.8e+19). This limit is unreachable since the maximum database size of 140 terabytes will be reached first.
Max size of DB: 140 terabytes
Please check this URL for more info: https://www.sqlite.org/limits.html
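If you want to see the limits your particular build enforces at run time, a small sketch using the sqlite3_limit() interface (a negative second argument just reads the current value without changing it):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK)
        return 1;

    printf("max string/BLOB length: %d\n", sqlite3_limit(db, SQLITE_LIMIT_LENGTH, -1));
    printf("max columns:            %d\n", sqlite3_limit(db, SQLITE_LIMIT_COLUMN, -1));
    printf("max attached databases: %d\n", sqlite3_limit(db, SQLITE_LIMIT_ATTACHED, -1));

    sqlite3_close(db);
    return 0;
}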

I'm just starting to explore SQLite for a project I'm working on, but it seems to me that the effective size of a database is actually more flexible than the file system would seem to allow.
By utilizing the ATTACH capability, a combined database could be assembled that exceeds the file system's maximum file size by up to 125 times, so a FAT32 effective limit would actually be 500 GB (125 x 4 GB), if the data could be balanced perfectly between the various files.
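For anyone who wants to experiment with that idea, a minimal sketch (the file names are placeholders; note that the default build allows 10 attached databases, and raising that toward the 125 maximum needs the SQLITE_MAX_ATTACHED compile-time option):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    char *err = NULL;
    if (sqlite3_open("main.db", &db) != SQLITE_OK)            /* placeholder names */
        return 1;

    /* Attach a second file and spread data across both; a temp view (or an
       application-level rule) can make the pieces look like one table. */
    if (sqlite3_exec(db,
            "ATTACH DATABASE 'part2.db' AS part2;"
            "CREATE TABLE IF NOT EXISTS main.data(id INTEGER, v BLOB);"
            "CREATE TABLE IF NOT EXISTS part2.data(id INTEGER, v BLOB);"
            "CREATE TEMP VIEW all_data AS "
            "  SELECT * FROM main.data UNION ALL SELECT * FROM part2.data;",
            NULL, NULL, &err) != SQLITE_OK) {
        fprintf(stderr, "%s\n", err);
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return 0;
}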

Related

How to compute the FAT size in FAT File System

I am implementing a FAT file system in C. I am following the specs published by Microsoft (http://read.pudn.com/downloads77/ebook/294884/FAT32%20Spec%20%28SDA%20Contribution%29.pdf).
But I don't understand how to compute the FAT size field of the boot sector. The following code appears on page 14 of the specification document.
RootDirSectors = ((BPB_RootEntCnt * 32) + (BPB_BytsPerSec - 1)) / BPB_BytsPerSec;
TmpVal1 = DskSize - (BPB_ResvdSecCnt + RootDirSectors);
TmpVal2 = (256 * BPB_SecPerClus) + BPB_NumFATs;
If(FATType == FAT32)
    TmpVal2 = TmpVal2 / 2;
FATSz = (TMPVal1 + (TmpVal2 - 1)) / TmpVal2;
If(FATType == FAT32) {
    BPB_FATSz16 = 0;
    BPB_FATSz32 = FATSz;
} else {
    BPB_FATSz16 = LOWORD(FATSz);
    /* there is no BPB_FATSz32 in a FAT16 BPB */
}
From this code I don't understand:
What is TmpVal2?
Why is the number 256 used?
Why, if it is FAT32, is it divided by 2?
I am not sure why the constant 256 was chosen; however, here are some thoughts on your other questions.
There is a note below the source code segment which states that the math is an approximation.
NOTE: The above math does not work perfectly. It will occasionally set
a FATSz that is up to 2 sectors too large for FAT16, and occasionally
up to 8 sectors too large for FAT32. It will never compute a FATSz
value that is too small, however. Because it is OK to have a FATSz
that is too large, at the expense of wasting a few sectors, the fact
that this computation is surprisingly simple more than makes up for it
being off in a safe way in some cases.
The way I read the code, the calculation is done for a FAT16 size, and the result is then adjusted if the target is actually FAT32.
The variable TmpVal2 looks like a unit size: the amount of space calculated in TmpVal1 is divided by the unit size in TmpVal2 to determine the number of units of disk space. In the case of FAT32, however, the unit size is smaller than in FAT16, so an adjustment is needed.
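To see how the approximation behaves, here is a minimal runnable transcription of the spec's formula in C; the parameter names mirror the BPB fields, and the 64 MB example values in main are only for illustration:

#include <stdint.h>
#include <stdio.h>

/* Straight transcription of the spec's FAT-size estimate (fatgen103, p. 14).
   fat32 is nonzero when formatting FAT32. */
static uint32_t fat_size_sectors(uint32_t disk_size_sectors,
                                 uint16_t bytes_per_sec,
                                 uint8_t  sec_per_clus,
                                 uint16_t resvd_sec_cnt,
                                 uint8_t  num_fats,
                                 uint16_t root_ent_cnt,
                                 int      fat32)
{
    uint32_t root_dir_sectors =
        ((root_ent_cnt * 32u) + (bytes_per_sec - 1u)) / bytes_per_sec;
    uint32_t tmp1 = disk_size_sectors - (resvd_sec_cnt + root_dir_sectors);
    uint32_t tmp2 = (256u * sec_per_clus) + num_fats;
    if (fat32)
        tmp2 = tmp2 / 2;                   /* FAT32 entries are twice as wide as FAT16 */
    return (tmp1 + (tmp2 - 1u)) / tmp2;    /* rounded up, so never too small */
}

int main(void)
{
    /* Example: 64 MB volume, 512-byte sectors, 4 sectors per cluster,
       2 FATs, 512 root entries, FAT16 -> prints 128 sectors per FAT. */
    printf("FAT16 FATSz = %u sectors\n",
           fat_size_sectors(131072, 512, 4, 1, 2, 512, 0));
    return 0;
}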
It appears that FAT16 used a specific size for the File Allocation Table, and as the disk space available to a volume grew with improvements in disk technology, the cluster size was chosen based on the volume size. So with a smaller volume, the cluster size (the number of disk sectors in an allocation unit) was smaller than the cluster size for a large volume. See FAT16 vs. FAT32 in Microsoft TechNet as well as the tables in the source code on page 13 of the document you reference.
With FAT32, a standard cluster size of 4K was used and the File Allocation Table storage was changed from a fixed size to a variable size and was no longer at a fixed location on the disk.
This article, File systems (FAT, FAT8, FAT16, FAT32, and NTFS) explained, goes into some detail about the differences between these file system versions.
The Wikipedia article, File Allocation Table, has quite a bit of technical information with links to other articles.
You may also find the following stackoverflow articles of interest.
Converting the cluster number stored in FAT table (of FAT12 filesystem) for reading from a floppy disk
Why did Windows use the FAT structure instead of a conventional linked list with a next pointer for each data block of a file?

Better to store data in RAM, text file, or database

I am working on a project where I am using words encoded as vectors, each about 2000 floats long. When I process raw text I need to retrieve the vector for each word as I come across it and do some computations with it. Needless to say, for a large vocabulary (~100k words) this has a large storage requirement (about 8 GB in a text file).
I initially had a system where I split the large text file into smaller ones and then for a particular word, I read its file, and retrieved its vector. This was too slow as you might imagine.
I next tried reading everything into RAM (takes about ~40GB RAM) figuring once everything was read in, it would be quite fast. However, it takes a long time to read in and a disadvantage is that I have to use only certain machines which have enough free RAM to do this. However, once the data is loaded, it is much faster than the other approach.
I was wondering how a database would compare with these approaches. Retrieval would be slower than the RAM approach, but there wouldn't be the overhead requirement. Also, any other ideas would be welcome and I have had others myself (i.e. caching, using a server that has everything loaded into RAM etc.). I might benchmark a database, but I thought I would post here to see what others had to say.
Thanks!
UPDATE
I used Tyler's suggestion, although in my case I did not think a B-tree was necessary. I just hashed the words and stored their offsets; I could then look up a word and read in its vector at runtime. I cached the vectors as the words occurred in the text, so each vector is read in at most once; this saves the overhead of reading in and storing unneeded words, making it superior to the RAM approach.
Just an FYI, I used Java's RandomAccessFile class and made use of the readLine(), getFilePointer(), and seek() methods.
Thanks to all who contributed to this thread.
UPDATE 2
For a further performance improvement, check out the buffered RandomAccessFile from:
http://minddumped.blogspot.com/2009/01/buffered-javaiorandomaccessfile.html
Apparently readLine from RandomAccessFile is very slow because it reads byte by byte; the buffered version gave me a nice improvement.
As a rule, anything custom coded should be much faster than a generic database, assuming you have coded it efficiently.
There are specific C libraries that solve this problem using B-trees. In the old days there was a famous library called Btrieve that was very popular because it was fast. In this application a B-tree will be faster and easier than fooling around with a database.
If you want optimal performance you would use a data structure called a suffix tree. There are libraries which are designed to create and use suffix trees. This will give you the fastest word lookup possible.
In either case there is no reason to store the entire dataset in memory, just store the B-tree (or suffix tree) with an offset to the data in memory. This will require about 3 to 5 megabytes of memory. When you query the tree you get an offset back. Then open the file, seek forwards to the offset and read the vector off disk.
You could use a simple text-based index file that just maps the words to indices, and another file containing just the raw vector data for each word. Initially you read the index into a hashmap that maps each word to its position in the data file and keep that map in memory. If you need the data for a word, you calculate the offset in the data file (index * 2000 * 4 bytes, if the vectors are stored as raw 32-bit floats) and read it as needed. You probably want to cache this data in RAM (if you are in Java, perhaps just use a weak map as a starting point).
This is basically implementing your own primitive database, but it may still be preferable because it avoids database setup/deployment complexity.
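As a rough illustration of that scheme in C (assuming the vectors are stored back to back as raw 32-bit floats and the word-to-index map has already been built from the index file; vectors.bin and the index 42 are placeholders):

#include <stdio.h>

#define DIM 2000   /* floats per word vector */

/* Read the vector for the word at position `index` in the data file.
   Assumes the file holds DIM raw 32-bit floats per word, back to back. */
static int read_vector(FILE *data, long index, float *out)
{
    long offset = index * (long)DIM * (long)sizeof(float);
    if (fseek(data, offset, SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof(float), DIM, data) == DIM ? 0 : -1;
}

int main(void)
{
    FILE *data = fopen("vectors.bin", "rb");   /* hypothetical data file */
    if (!data) { perror("vectors.bin"); return 1; }

    float vec[DIM];
    /* In the real program the index would come from the in-memory
       word -> index map; 42 is just an example position. */
    if (read_vector(data, 42, vec) == 0)
        printf("first component: %f\n", vec[0]);

    fclose(data);
    return 0;
}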

Binary Search on Large Disk File in C - Problems

This question recurs frequently on StackOverflow, but I have read all the previous relevant answers, and have a slight twist on the question.
I have a 23 GB file containing 475 million lines of equal size, with each line consisting of a 40-character hash code followed by an identifier (an integer).
I have a stream of incoming hash codes - billions of them in total - and for each incoming hash code I need to locate it and print out the corresponding identifier. This job, while large, only needs to be done once.
The file is too large for me to read into memory, so I have been trying to use mmap in the following way:
codes = (char *) mmap(0,statbuf.st_size,PROT_READ,MAP_SHARED,codefile,0);
Then I just do a binary search using address arithmetic based on the address in codes.
This seems to start working beautifully and produces a few million identifiers in a few seconds, using 100% of the cpu, but then after some, seemingly random, amount of time it slows down to a crawl. When I look at the process using ps, it has changed from status "R" using 100% of the cpu, to status "D" (diskbound) using 1% of the cpu.
This is not repeatable - I can start the process off again on the same data, and it might run for 5 seconds or 10 seconds before the "slow to crawl" happens. Once last night, I got nearly a minute out of it before this happened.
Everything is read only, I am not attempting any writes to the file, and I have stopped all other processes (that I control) on the machine. It is a modern Red Hat Enterprise Linux 64-bit machine.
Does anyone know why the process becomes disk-bound and how to stop it?
UPDATE:
Thanks to everyone for answering, and for your ideas; I had not tried all the various improvements before because I was wondering whether I was somehow using mmap incorrectly. But the gist of the answers seemed to be that unless I could squeeze everything into memory, I would inevitably run into problems. So I squashed the hash code down to the leading prefix that did not create any duplicates - the first 15 characters were enough. Then I pulled the resulting file into memory, and ran the incoming hash codes in batches of about 2 billion each.
The first thing to do is split the file.
Make one file with the hash codes and another with the integer IDs. Since the rows are the same size, the two files will line up, so once a hash is found at row n you can read the ID from row n of the other file. You can also try an approach that puts every nth hash into another file together with its index.
For example, put every 1000th hash key into a new file with its index and load that into memory. Then binary-search that instead; it will tell you the range of 1000 entries in the big file that needs to be scanned further. You can probably go much denser than that: keeping every 20th record, say, gives an index roughly 1/20th the size of the file.
In other words after scanning you only need to touch a few kilobytes of the file on disk.
Another option is to split the file and put it in memory on multiple machines. Then just binary scan each file. This will yield the absolute fastest possible search with zero disk access...
Have you considered hacking a PATRICIA trie algorithm up? It seems to me that if you can build a PATRICIA tree representation of your data file, which refers to the file for the hash and integer values, then you might be able to reduce each item to node pointers (2*64 bits?), bit test offsets (1 byte in this scenario) and file offsets (uint64_t, which might need to correspond to multiple fseek()s).
Does anyone know why the process becomes disk-bound and how to stop it?
Binary search requires a lot of seeking within the file. In the case where the whole file doesn't fit in memory, the page cache doesn't handle the big seeks very well, resulting in the behaviour you're seeing.
The best way to deal with this is to reduce/prevent the big seeks and make the page cache work for you.
Three ideas for you:
If you can sort the input stream, you can search the file in chunks, using something like the following algorithm:
code_block <- mmap the first N entries of the file, where N entries fit in memory
max_code   <- code_block[N - 1]
while (input codes remain) {
    input_code <- next input code
    while (input_code > max_code) {
        code_block <- mmap the next N entries of the file
        max_code   <- code_block[N - 1]
    }
    binary search for input_code in code_block
}
If you can't sort the input stream, you could reduce your disk seeks by building an in-memory index of the data. Pass over the large file, and make a table that is:
record_hash, offset into file where this record starts
Don't store all records in this table - store only every Kth record. Pick a large K, but small enough that this fits in memory.
To search the large file for a given target hash, do a binary search in the in-memory table to find the biggest hash in the table that is smaller than the target hash. Say this is table[h]. Then, mmap the segment starting at table[h].offset and ending at table[h+1].offset, and do a final binary search. This will dramatically reduce the number of disk seeks.
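A minimal sketch of that sparse-index idea, assuming fixed 48-byte records (40-character hash, identifier, newline), that the table of every Kth hash has already been built in ascending order in a single pass, and that the caller parses the identifier out of the matched record (on a 64-bit system, so long offsets cover the 23 GB file):

#include <stdio.h>
#include <string.h>

#define HASH_LEN 40
#define LINE_LEN 48     /* assumed fixed record size: 40-char hash, id, newline */
#define K        1024   /* keep every Kth record in the in-memory table */

struct sparse_entry {
    char hash[HASH_LEN];
    long record;        /* record number of this hash in the big file */
};

/* tab[] holds every Kth hash in ascending order (n >= 1 entries).
   Returns the record number of target, or -1 if it is not present. */
static long find_record(FILE *big, const struct sparse_entry *tab, size_t n,
                        const char *target)
{
    size_t lo = 0, hi = n;
    while (lo + 1 < hi) {              /* largest lo with tab[lo].hash <= target */
        size_t mid = (lo + hi) / 2;
        if (memcmp(tab[mid].hash, target, HASH_LEN) <= 0)
            lo = mid;
        else
            hi = mid;
    }

    /* Only K records (a few dozen KB) are read from disk. */
    char block[K * LINE_LEN];
    long first = tab[lo].record;
    if (fseek(big, first * (long)LINE_LEN, SEEK_SET) != 0)
        return -1;
    size_t got = fread(block, LINE_LEN, K, big);   /* last block may be short */
    for (size_t i = 0; i < got; i++) {
        if (memcmp(block + i * LINE_LEN, target, HASH_LEN) == 0)
            return first + (long)i;    /* the id sits at block + i*LINE_LEN + HASH_LEN */
    }
    return -1;
}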
If this isn't enough, you can have multiple layers of indexes:
record_hash, offset into index where the next index starts
Of course, you'll need to know ahead of time how many layers of index there are.
Lastly, if you have extra money available you can always buy more than 23 GB of RAM and make this a memory-bound problem again (I just looked at Dell's website - you can pick up a new low-end workstation with 32 GB of RAM for just under 1,400 Australian dollars). Of course, it will take a while to read that much data in from disk, but once it's there, you'll be set.
Instead of using mmap, consider just using plain old lseek+read. You can define some helper functions to read a hash value or its corresponding integer:
/* Assumes fd is the open descriptor for the big file and line_len is the
   fixed record length (40-char hash + identifier + line terminator). */
#define _LARGEFILE64_SOURCE
#include <stdint.h>
#include <unistd.h>

extern int fd;
extern uint32_t line_len;

void read_hash(int line, char *hashbuf) {
    lseek64(fd, ((uint64_t)line) * line_len, SEEK_SET);   /* start of record */
    read(fd, hashbuf, 40);                                /* the 40-char hash */
}

int read_int(int line) {
    lseek64(fd, ((uint64_t)line) * line_len + 40, SEEK_SET);
    int ret;
    read(fd, &ret, sizeof(int));                          /* identifier stored after the hash */
    return ret;
}
then just do your binary search as usual. It might be a bit slower, but it won't start chewing up your virtual memory.
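For completeness, the binary search over the helpers above might look roughly like this (requires <string.h> for memcmp; num_lines is the total record count, about 475 million here, and the records are assumed sorted by hash):

/* Returns the identifier for target_hash, or -1 if it is not present. */
int find_id(const char *target_hash, int num_lines)
{
    char hashbuf[40];
    int lo = 0, hi = num_lines - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;          /* avoids overflow of lo + hi */
        read_hash(mid, hashbuf);
        int cmp = memcmp(hashbuf, target_hash, 40);
        if (cmp == 0)
            return read_int(mid);
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}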
We don't know the back story. So it is hard to give you definitive advice. How much memory do you have? How sophisticated is your hard drive? Is this a learning project? Who's paying for your time? 32GB of ram doesn't seem so expensive compared to two days of work of person that makes $50/h. How fast does this need to run? How far outside the box are you willing to go? Does your solution need to use advanced OS concepts? Are you married to a program in C? How about making Postgres handle this?
Here is a low-risk alternative. This option isn't as intellectually appealing as the other suggestions but has the potential to give you significant gains. Separate the file into 3 chunks of 8 GB or 6 chunks of 4 GB (depending on the machines you have around; it needs to comfortably fit in memory). On each machine run the same software, but entirely in memory, and put an RPC stub around each. Then write an RPC caller that asks each of your 3 or 6 workers for the integer associated with a given hash code.

I'm trying to rebuild the indexes on a Progress database after a huge binary load, getting this error

Just imported 655 tables via binary load using a batch script to a newly created database on a 650 GB hard drive.
Idxbuild is running with threads, the maximum number of threads is 1. (13942)
TMB value is 8, TM value is 8, SG value is 48, packing factor is 100. (16141)
Temporary sort file at: C:\Progress\OpenEdge\bin will use the available disk space. (11443)
SYSTEM ERROR: Unable to extend database within area Schema Area. (8897)
I can't find any solution to this in the documentation.
Tom Bascom -- I know you know a solution to this.
Thank You community!
What version of Progress?
As Tim says, that's a very odd place for your temp files. How did that happen? My guess is that your working directory is %DLC%\bin.
You're extending the schema area? Why? Did you forget to move all of your data, indexes and LOBs to type 2 storage?
By eliminating all of the AREA information from the structure file you put everything in the schema area.
You probably also did not create a structure file with multiple extents and so forth? Thus there is just the single initial extent.
It also seems likely that you did not enable large files. Which means that once that extent hits 2GB it cannot grow.
So the quick and easy solution is probably:
proutil dbName -C enablelargefiles
Note: this is a terrible way to set up a database -- don't do it for a real system. But, as I understand it, you are just trying to do a one-time load of this data so that you can export it as CSV data.
From the KB:
This situation arises when the database Storage Area is either:
a.) composed entirely of fixed-length extents and the last extent has become filled up,
b.) a variable-length extent needs to exceed the 2 GB file size limit to accommodate writes and LargeFiles have not been enabled, or
c.) the user hits their ulimit as defined in their .profile (UNIX) or disk quota limit (Windows).
In any of the above cases, the PROGRESS run is aborted and recovery must be run.
Also:
More extents for the database to grow must be made available. The prostrct utility must be used to add additional space. In future, the high-water mark of the last extent in an area can be monitored to forewarn of this occurrence. Once this is done, restart the database and allow crash recovery to take place.
and
References to Written Documentation:
Progress Manuals: Database Administration Guide and Reference - Chapter 9, "Maintaining the Database Structure"
Also, why are you pointing your temp space for sorting at the progress bin directory? Point it at an empty temp directory instead.
Use the Temporary Directory (-T) startup parameter to identify or redirect temporary files created by the PROUTIL utility to a specified directory when sorting and handling space issues.

The FAT, Linux, and NTFS file systems

I heard that the NTFS file system is basically a b-tree. Is that true? What about the other file systems? What kind of trees are they?
Also, how is FAT32 different from FAT16?
What kind of tree are the FAT file systems using?
FAT (FAT12, FAT16, and FAT32) does not use a tree of any kind. Two interesting data structures are used, in addition to a block of data describing the partition itself. Full details, at the level required to write a compatible implementation in an embedded system, are available from Microsoft and third parties. Wikipedia has a decent article as an alternative starting point that also includes a lot of the history of how it got the way it is.
Since the original question was about the use of trees, I'll provide a quick summary of what little data structure is actually in a FAT file system. Refer to the above references for accurate details and for history.
The set of files in each directory is stored in a simple list, initially in the order the files were created. Deletion is done by marking an entry as deleted, so a subsequent file creation might re-use that slot. Each entry in the list is a fixed-size struct, just large enough to hold the classic 8.3 file name along with the flag bits, size, dates, and the starting cluster number. Long file names (which also include international character support) are handled by using extra directory entry slots to hold the long name alongside the original 8.3 slot that holds the rest of the file attributes.
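For illustration, the classic 32-byte short-name entry can be sketched in C roughly like this (field names follow the Microsoft spec; this is an illustrative layout, not a tested on-disk reader):

#include <stdint.h>

/* Classic 32-byte FAT directory entry; packing matters on disk. */
#pragma pack(push, 1)
struct fat_dirent {
    uint8_t  DIR_Name[11];      /* 8.3 name, space padded */
    uint8_t  DIR_Attr;          /* flag bits: read-only, hidden, directory, ... */
    uint8_t  DIR_NTRes;
    uint8_t  DIR_CrtTimeTenth;
    uint16_t DIR_CrtTime;
    uint16_t DIR_CrtDate;
    uint16_t DIR_LstAccDate;
    uint16_t DIR_FstClusHI;     /* high 16 bits of first cluster (used by FAT32) */
    uint16_t DIR_WrtTime;
    uint16_t DIR_WrtDate;
    uint16_t DIR_FstClusLO;     /* low 16 bits of first cluster */
    uint32_t DIR_FileSize;      /* size in bytes */
};
#pragma pack(pop)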
Each file on the disk is stored in a sequence of clusters, where each cluster is a fixed number of adjacent disk blocks. Each directory (except the root directory of a disk) is just like a file, and can grow as needed by allocating additional clusters.
Clusters are managed by the (misnamed) File Allocation Table from which the file system gets its common name. This table is a packed array of slots, one for each cluster in the disk partition. The name FAT12 implies that each slot is 12 bits wide, FAT16 slots are 16 bits, and FAT32 slots are 32 bits. The slot stores code values for empty, last, and bad clusters, or the cluster number of the next cluster of the file. In this way, the actual content of a file is represented as a linked list of clusters called a chain.
Larger disks require wider FAT entries and/or larger allocation units. FAT12 is essentially only found on floppy disks where its upper bound of 4K clusters makes sense for media that was never much more than 1MB in size. FAT16 and FAT32 are both commonly found on thumb drives and flash cards. The choice of FAT size there depends partly on the intended application.
Access to the content of a particular file is straightforward. From its directory entry you learn its total size in bytes and its first cluster number. From the cluster number, you can immediately calculate the address of the first logical disk block. From the FAT indexed by cluster number, you find each allocated cluster in the chain assigned to that file.
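As a sketch of that lookup for FAT16, assuming the FAT has already been read into memory and using the usual cluster-to-sector formula (the names and the missing error handling are simplifications):

#include <stdint.h>
#include <stdio.h>

/* Convert a cluster number to the LBA of its first sector.
   first_data_sector = reserved sectors + (num FATs * FAT size) + root dir sectors. */
static uint32_t cluster_to_lba(uint16_t cluster, uint32_t first_data_sector,
                               uint8_t sec_per_clus)
{
    return first_data_sector + (uint32_t)(cluster - 2) * sec_per_clus;
}

/* Follow a file's cluster chain through an in-memory copy of the FAT.
   FAT16 values >= 0xFFF8 mark end of chain; 0xFFF7 marks a bad cluster. */
static void walk_chain(const uint16_t *fat, uint16_t first_cluster,
                       uint32_t first_data_sector, uint8_t sec_per_clus)
{
    uint16_t cluster = first_cluster;
    while (cluster >= 2 && cluster < 0xFFF7) {
        printf("cluster %u starts at LBA %u\n", (unsigned)cluster,
               (unsigned)cluster_to_lba(cluster, first_data_sector, sec_per_clus));
        cluster = fat[cluster];        /* next link in the chain */
    }
}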
Discovery of free space suitable for storage of a new file or extending an existing file is not as easy. The FAT file system simply marks free clusters with a code value. Finding one or more free clusters requires searching the FAT.
Locating the directory entry for a file is not fast either since the directories are not ordered, requiring a linear time search through the directory for the desired file. Note that long file names increase the search time by occupying multiple directory entries for each file with a long name.
FAT still has the advantage that it is simple enough to implement that it can be done in small microprocessors so that data interchange between even small embedded systems and PCs can be done in a cost effective way. I suspect that its quirks and oddities will be with us for a long time as a result.
ext3 and ext4 use "H-trees", which are apparently a specialized form of B-tree.
BTRFS uses B-trees (B-Tree File System).
ReiserFS uses B+trees, which are apparently what NTFS uses.
By the way, if you search for these on Wikipedia, it's all listed in the info box on the right side under "Directory contents".
Here is a nice chart on FAT16 vs FAT32.
The numerals in the names FAT16 and FAT32 refer to the number of bits required for a file allocation table entry.
FAT16 uses a 16-bit file allocation table entry (2^16 allocation units).
Windows 2000 reserves the first 4 bits of a FAT32 file allocation table entry, which means FAT32 has a maximum of 2^28 allocation units. However, this number is capped at 32 GB by the Windows 2000 format utilities.
http://technet.microsoft.com/en-us/library/cc940351.aspx
FAT32 uses 32-bit numbers to store cluster numbers. It supports larger disks and files up to 4 GiB in size.
As far as I understand the topic, FAT uses File Allocation Tables, which store the allocation status of clusters on disk. It doesn't appear to use trees, though I could be wrong.
