SQL Server - max # of 8k pages in a file? - sql-server

MSDN's maximum capacity specifications page says that the max data file size is "16 Terabytes" - not sure whether their definition of a terabyte is 1024^4 or 1000^4 bytes - so the valid max page number might be 2,147,483,648 (1024 basis) or 1,953,125,000 (1000 basis), or perhaps something else. Does anyone know with certainty?
I have heard that this limit may increase in future releases - right now I'm using SQL Server 2012.

Yes, it is based on 1024: 1024 bytes is a kilobyte, multiply that by 1024 and you get a megabyte, and so on.
You're also correct that newer versions have larger maximums.
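As a quick sanity check of the figures in the question, here is the arithmetic for both readings of "16 Terabytes"; the binary reading works out to exactly 2^31 pages, which is why it is generally taken to be 1024-based. A minimal sketch, assuming an 8 KB page is exactly 8192 bytes:

# Number of 8 KB pages in a 16 TB data file, under both the
# binary (1024^4) and decimal (1000^4) readings of "terabyte".
PAGE_SIZE = 8 * 1024                  # 8192 bytes per page

print(16 * 1024**4 // PAGE_SIZE)      # 2147483648  (= 2^31 pages)
print(16 * 1000**4 // PAGE_SIZE)      # 1953125000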

The minimum unit for storing any type of data in SQL Server is a page. A page is 8 KB in size, i.e. exactly 8192 bytes, and pages are grouped into logical extents.
Page Header
Not all of the 8192 bytes are available for data storage, though; some of that space is used to store information about the page itself. This is called the page header, and it takes up 96 bytes.
Row Set
This is another section of the page containing information about the rows stored on it. It begins at the end of the page and takes another 36 bytes out of the 8192-byte total.
Total Space Available for Data Storage
8192 Total space on a page
-  96 Space taken by the page header
-  36 Space taken by the row set
----------------------------------------------
8060 Total space available for data storage
So if you are trying to calculate how much data you will be able to store in a database, especially when you are talking in terabytes, don't forget to take the page header and row set into consideration.
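A minimal sketch of that overhead applied to a full 16 TiB data file; the figures below ignore allocation pages, indexes, and fill factor, so treat them as a rough upper bound rather than a precise capacity:

PAGE_SIZE   = 8192
PAGE_HEADER = 96
ROW_SET     = 36
USABLE      = PAGE_SIZE - PAGE_HEADER - ROW_SET     # 8060 bytes per page

pages = 16 * 1024**4 // PAGE_SIZE                   # 2,147,483,648 pages
print(USABLE)                                       # 8060
print(round(pages * USABLE / 1024**4, 2))           # ~15.74 TiB of row data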

Related

Store attribute as binary or String (JSON)

I have to store some attributes in DynamoDB and am not sure whether some of the JSON attributes should be stored as String or Binary. I understand that storing them as Binary will reduce the attribute size.
I have taken into account the DynamoDB limit that 1 read/write unit covers 4 KB.
The total data in my item is less than 4 KB even if I store it as String.
What should I consider when choosing Binary vs String?
Thanks.
Given that your item sizes are less than 4 KB uncompressed, whether to store an attribute as Binary or String depends on whether the attribute will be a partition / sort key of the table and on your typical read patterns.
A partition key has a max size of 2048 bytes (~2 KB).
A sort key (if you specify one on the table) has a max size of 1024 bytes (~1 KB).
If you foresee your string attribute exceeding those maximums on any items, it would make sense to compress to Binary first to keep your attribute sizes within DynamoDB's limits.
Depending on how many items are in your typical query and your tolerance for throttled queries, your RCUs may not be enough to satisfy a Query / Scan performed in a single request.
For instance,
If you have 1 KB items and want to query 100 items in a single request, your RCU requirement will be as follows:
(100 * 1024 bytes = 100 KB) / 4 KB = 25 read capacity units
Converting some attributes to binary could reduce your RCU requirement in this case. Again it largely depends on your typical usage pattern.
See http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ProvisionedThroughput.html#HowItWorks.ProvisionedThroughput.Reads
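A minimal sketch of that arithmetic, assuming strongly consistent reads (eventually consistent reads cost half); the function name is just illustrative:

import math

def query_rcu(item_size_bytes, item_count, strongly_consistent=True):
    # For a Query, DynamoDB sums the sizes of the items it reads,
    # rounds up to the next 4 KB boundary, and charges 1 RCU per 4 KB
    # for strongly consistent reads (0.5 RCU for eventually consistent).
    total_4kb_units = math.ceil(item_size_bytes * item_count / 4096)
    return total_4kb_units if strongly_consistent else total_4kb_units / 2

print(query_rcu(1024, 100))         # 25   -> matches the example above
print(query_rcu(1024, 100, False))  # 12.5 with eventually consistent reads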

The total size of the index is too large or too many parts in index in informix

I am trying to run following script on informix:
CREATE TABLE REG_PATH (
    REG_PATH_ID        SERIAL UNIQUE,
    REG_PATH_VALUE     LVARCHAR(750) NOT NULL,
    REG_PATH_PARENT_ID INTEGER,
    REG_TENANT_ID      INTEGER DEFAULT 0,
    PRIMARY KEY (REG_PATH_ID, REG_TENANT_ID) CONSTRAINT PK_REG_PATH
);

CREATE INDEX IDX1 ON REG_PATH(REG_PATH_VALUE, REG_TENANT_ID);
But it gives the following error:
517: The total size of the index is too large or too many parts in index.
I am using Informix version 11.50FC9TL. My dbspace chunk size is 5M.
What is the reason for this error, and how can I fix it?
I believe 11.50 has support for large page sizes, and to create an index on a column that is LVARCHAR(750) (plus a 4-byte INTEGER), you will need to use a bigger page size for the dbspace that holds the index. Offhand, I think the page size will need to be at least 4 KiB, rather than the default 2 KiB you almost certainly are using. The rule of thumb I remember is 'at least 5 index keys per page', and at 754 bytes plus some overhead, 5 keys squeaks in at just under 4 KiB.
This is different from the value quoted by Bohemian in his answer.
See the IDS 12.10 Information Center for documentation about Informix 12.10.
Creating a dbspace with a non-default page size
CREATE INDEX statement
Index key specification
This last reference has a table of dbspace page sizes and maximum key sizes permitted:
Page Size       Maximum Index Key Size
2 kilobytes       387 bytes
4 kilobytes       796 bytes
8 kilobytes     1,615 bytes
12 kilobytes    2,435 bytes
16 kilobytes    3,245 bytes
If 11.50 doesn't have support for large page sizes, you will have to migrate to a newer version (12.10 recommended, 11.70 a possibility) if you must create such an index.
One other possibility to consider is whether you really want such a large key string; could you reduce it to, say, 350 bytes? That would then fit in your current system.
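A quick check against the table above, showing why a 2 KB page dbspace rejects this key while a 4 KB page would accept it (the key width here is just the declared column sizes, 750 + 4 bytes; on-disk key overhead is ignored):

# Documented maximum index key size per dbspace page size (from the
# table above), and the smallest page size that fits this index key.
MAX_KEY_SIZE = {2: 387, 4: 796, 8: 1615, 12: 2435, 16: 3245}   # page KB -> bytes

key_width = 750 + 4          # LVARCHAR(750) + INTEGER tenant id
fits = [kb for kb, limit in sorted(MAX_KEY_SIZE.items()) if limit >= key_width]
print(fits[0])               # 4 -> a dbspace with 4 KB pages is the minimum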
From the Informix documentation:
You can include up to 16 columns in a composite index. The total width of all indexed columns in a single composite index cannot exceed 380 bytes.
One of the columns you want to add to your index is REG_PATH_VALUE LVARCHAR(750); 750 bytes is longer than the 380 maximum allowed.
You can't "fix" this per se; either make the column size smaller, or don't include it in the index.

Worth a unique table for database values that repeat ~twice?

I have a static database of ~60,000 rows. There is a certain column for which there are ~30,000 unique entries. Given that ratio (60,000 rows/30,000 unique entries in a certain column), is it worth creating a new table with those entries in it, and linking to it from the main table? Or is that going to be more trouble than it's worth?
To put the question in a more concrete way: will I gain a lot more efficiency by separating this field out into its own table?
** UPDATE **
We're talking about a VARCHAR(100) field, but in reality, I doubt any of the entries use that much space -- I could most likely trim it down to VARCHAR(50). Example entries: "The Gas Patch and Little Canada" and "Kora Temple Masonic Bldg. George Coombs"
If the field is a VARCHAR(255) that normally contains about 30 characters, and the alternative is to store a 4-byte integer in the main table and use a second table with a 4-byte integer and the VARCHAR(255), then you're looking at some space saving.
Old scheme:
T1: 30 bytes * 60 K entries = 1800 KiB.
New scheme:
T1: 4 bytes * 60 K entries = 240 KiB
T2: (4 + 30) bytes * 30 K entries = 1020 KiB
So, that's crudely 1800 - 1260 = 540 KiB space saving. If, as would be necessary, you build an index on the integer column in T2, you lose some more space. If the average length of the data is larger than 30 bytes, the space saving increases. If the ratio of repeated rows ever increases, the saving increases.
Whether the space saving is significant depends on your context. If you need half a megabyte more memory, you just got it — and you could squeeze more if you're sure you won't need to go above 65535 distinct entries by using 2-byte integers instead of 4 byte integers (120 + 960 KiB = 1080 KiB; saving 720 KiB). On the other hand, if you really won't notice the half megabyte in the multi-gigabyte storage that's available, then it becomes a more pragmatic problem. Maintaining two tables is harder work, but guarantees that the name is the same each time it is used. Maintaining one table means that you have to make sure that the pairs of names are handled correctly — or, more likely, you ignore the possibility and you end up without pairs where you should have pairs, or you end up with triplets where you should have doubletons.
Clearly, if the type that's repeated is a 4 byte integer, using two tables will save nothing; it will cost you space.
A lot, therefore, depends on what you've not told us. The type is one key issue. The other is the semantics behind the repetition.
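A small sketch of the back-of-the-envelope arithmetic above; the 30-byte average string, the row counts, and the key widths are the assumptions stated in the answer, and real storage adds per-row and index overhead on top:

def scheme_sizes(rows=60_000, distinct=30_000, avg_str=30, key_bytes=4):
    one_table = rows * avg_str                                  # string stored inline
    two_table = rows * key_bytes + distinct * (key_bytes + avg_str)
    return one_table, two_table

one, two = scheme_sizes()
print(one // 1000, two // 1000, (one - two) // 1000)    # 1800 1260 540  (KB)

one, two = scheme_sizes(key_bytes=2)                     # 2-byte keys, <= 65535 values
print(two // 1000, (one - two) // 1000)                  # 1080 720  (KB)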

Loading Huge resolution images causing Heap error in j2me

I am trying to load a 3776 * 2816, 24-bit PNG image (804 KB) onto my phone, the MOTO ROKR E6. It fails with java.lang.OutOfMemoryError. Is there a general way to handle loading such high-resolution images? The phone's internal memory is only 8 MB; I think this has something to do with the error.
I have also tried splitting the image into 16 parts and loading them, but there still seems to be some limit on what it can handle.
Please advise.
So just some quick calculations:
24 bits = 3 bytes
space required (in bytes) = 3776 * 2816 * 3
= 31,899,648 bytes
= 31.9MB
That means once you've loaded the image (using ImageIO or JAI or whatever) you need 31.9MB to store the raw image data. As a result you can't load it on a device with only 8MB of memory (and I'm assuming no other kind of swap space).
You could load the raw file as bytes of data rather than an image -- the data is heavily compressed -- but I don't think that's what you're looking for.
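A minimal sketch of the memory arithmetic, including the per-tile cost of the 16-way split mentioned in the question; the 3 bytes/pixel figure follows the answer, and many decoders actually keep 4 bytes/pixel (ARGB), which would be larger still:

width, height, bytes_per_pixel = 3776, 2816, 3

full = width * height * bytes_per_pixel                  # whole image, decoded
tile = (width // 4) * (height // 4) * bytes_per_pixel    # one tile of a 4x4 split

print(full, f"{full / 1e6:.1f} MB")   # 31899648  31.9 MB
print(tile, f"{tile / 1e6:.1f} MB")   # 1993728   2.0 MB per tile

Even a single tile is around 2 MB decoded, a quarter of the 8 MB heap, which may be why the 16-part approach still ran out of memory.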

Predicting Oracle Table Growth

How can I predict the future size / growth of an Oracle table?
Assuming:
linear growth of the number of rows
known columns of basic datatypes (char, number, and date)
ignore the variability of varchar2
basic understanding of the space required to store them (e.g. number)
basic understanding of blocks, extents, segments, and block overhead
I'm looking for something more proactive than "measure now, wait, measure again."
Estimate the average row size based on your data types.
Estimate the available space in a block. This will be the block size, minus the block header size, minus the space left over by PCTFREE. For example, if your block header size is 100 bytes, your PCTFREE is 10, and your block size is 8192 bytes, then the free space in a given block is (8192 - 100) * 0.9 = 7282.
Estimate how many rows will fit in that space. If your average row size is 1 kB, then roughly 7 rows will fit in an 8 kB block.
Estimate your rate of growth, in rows per time unit. For example, if you anticipate a million rows per year, your table will grow by roughly 1 GB annually given 7 rows per 8 kB block.
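A small sketch of the estimate described in the steps above; the block header size, PCTFREE, and average row size are the illustrative values from that answer, and real segments add extent allocation and row header overhead:

import math

def estimate_growth(rows_per_year, avg_row_bytes,
                    block_size=8192, block_header=100, pctfree=10):
    usable = (block_size - block_header) * (1 - pctfree / 100)   # free space per block
    rows_per_block = max(1, int(usable // avg_row_bytes))
    blocks = math.ceil(rows_per_year / rows_per_block)
    return blocks * block_size                                   # bytes added per year

growth = estimate_growth(rows_per_year=1_000_000, avg_row_bytes=1024)
print(f"{growth / 1024**3:.2f} GiB/year")   # ~1.09 GiB/year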
I suspect that the estimate will depend 100% on the problem domain. Your proposed method seems as good a general procedure as is possible.
Given your assumptions, "measure, wait, measure again" is perfectly predictive. In 10g+, Oracle even does the "measure, wait, measure again" for you: http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/statviews_3165.htm#I1023436
