This question already has answers here:
What is the major difference between Varchar2 and char
(7 answers)
Closed 6 years ago.
I'm currently developing for a big company, and they have asked me to create a database for the project following some guidelines, but I have a small question about char and varchar.
I have a string that can be either 8 or 11 characters long, and I would like to know which is the better solution:
myColumn varchar(11)
or
myColumn char(11)
At first I was thinking about the second one, but char is for fixed-length values, no?
Thanks for the answers.
char(11) can hold strings of up to 11 characters, but even if you store only 2 characters, it will take the space of the entire 11 characters (the rest is filled with spaces), whereas varchar(11) holds strings up to the same length but does not reserve the unused space.
What is the major difference between Varchar2 and char
CHAR Data Type is a Fixed Length Data Type. For example, if you declare a column of CHAR (11) data type, then it will always take 11 bytes irrespective of whether you are storing 1 character or 11 characters in this column.
On the other hand, VARCHAR is a variable-length data type. For example, if you declare a column of VARCHAR (11) data type, it will take a number of bytes equal to the number of characters stored in this column (SQL Server adds 2 bytes of length overhead per value). So if you store only one character in this column, the data takes only one byte, and if you store 11 characters, it takes 11 bytes. And since the column is declared as VARCHAR (11), you can store at most 11 characters in it.
https://msdn.microsoft.com/en-in/library/ms176089.aspx
varchar(11) is preferable because storage is allocated dynamically (e.g. if the string length is 8, space is allocated for just 8 characters, not for all 11), whereas with char(11) the full 11 characters are allocated for every value.
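To make the trade-off in the answers above concrete, here is a minimal Python sketch of the storage rules they describe. It assumes a single-byte code page and SQL Server's 2-byte length overhead for varchar; the function names `char_bytes` and `varchar_bytes` are made up for illustration and ignore row-level overhead.

```python
def char_bytes(declared_len: int) -> int:
    """CHAR(n): the value is space-padded, so it always occupies n bytes."""
    return declared_len


def varchar_bytes(value: str, declared_len: int, overhead: int = 2) -> int:
    """VARCHAR(n): actual length plus a length prefix (2 bytes assumed here)."""
    if len(value) > declared_len:
        raise ValueError("value longer than declared length")
    return len(value) + overhead


# The two possible lengths from the question: 8 or 11 characters.
for value in ("ABCDEFGH", "ABCDEFGHIJK"):
    print(len(value), char_bytes(11), varchar_bytes(value, 11))
```

Under these assumptions an 8-character value costs 10 bytes as varchar(11) versus 11 as char(11), while an 11-character value costs 13 versus 11, so the saving depends on how the two lengths are distributed.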
Does nvarchar(N) occupy a fixed N*2 bytes, or may it use less storage if the actual value to be stored is smaller than N*2 bytes?
I have a huge table with many fields of fixed nvarchar type. Some are nvarchar(100) and some are nvarchar(400) etc.
Data in a column is never an exact size; it varies from 0 to N. Most of the data is shorter than N/2.
For example, a field called RecipientName is of type nvarchar(400) and there are 9026424 rows.
Size of only RecipientName would be 800*9026424 = 6.72 GB.
But the actual storage size of the entire table is only 2.02 GB. Is some compression applied, or is some size smaller than N (a power of 2) chosen?
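The arithmetic in the question can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the row count and the 2.02 GB figure from the question, and assumes "GB" means 2^30 bytes.

```python
rows = 9_026_424
declared_chars = 400          # nvarchar(400)
bytes_per_char = 2            # UTF-16 code unit

# Size if every row really used the full declared width.
worst_case_bytes = rows * declared_chars * bytes_per_char
worst_case_gib = worst_case_bytes / 2**30

# Average bytes per row implied by the reported size of the whole table.
reported_table_gib = 2.02
avg_row_bytes = reported_table_gib * 2**30 / rows

print(f"worst case for RecipientName: {worst_case_gib:.2f} GiB")
print(f"average bytes per row (all columns): {avg_row_bytes:.0f}")
```

The average row (all columns together) is far smaller than the 800 bytes the one column would need at full width, which is what you would expect if nvarchar stores only the actual data.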
NCHAR data type:
It is a fixed length data type.
It Occupies 2 bytes of space for EACH CHARACTER.
It is used to store Unicode characters (e.g. other languages like Spanish, French, Arabic, German, etc.)
For Example:
Declare @Name NChar(20);
Set @Name = N'Sachin';
Select @Name As Name, DATALENGTH(@Name) As [Datalength In Bytes], LEN(@Name) As [Length];
Name Datalength Length
Sachin 40 6
Even though declared size is 20, the data length column shows 40 bytes storage memory size because it uses 2 bytes for each character.
And this 40 bytes of memory is irrespective of the actual length of data stored.
NVARCHAR data type:
It is a variable length data type.
It Occupies 2 bytes of space for EACH CHARACTER.
It is used to store Unicode characters (e.g. other languages like Spanish, French, Arabic, German, etc.)
For Example:
Declare @Name NVarchar(20);
Set @Name = N'Sachin';
Select @Name As Name, DATALENGTH(@Name) As [Datalength], LEN(@Name) As [Length];
Name Datalength Length
Sachin 12 6
Even though declared size is 20, the data length column shows 12 bytes storage memory size because it uses 2 bytes for each character.
And this 12 bytes of memory is irrespective of the length of data in the declaration.
Hope this is helpful :)
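The two results above can be reproduced outside SQL Server by encoding the string as UTF-16 little endian, which is how nchar/nvarchar data is stored. The helper names below are hypothetical stand-ins for DATALENGTH and LEN, not real APIs.

```python
def datalength_nchar(value: str, n: int) -> int:
    """NCHAR(n): value is padded with spaces to n characters, 2 bytes each."""
    return len(value.ljust(n).encode("utf-16-le"))


def datalength_nvarchar(value: str) -> int:
    """NVARCHAR(n): only the characters actually supplied are stored."""
    return len(value.encode("utf-16-le"))


def sql_len(value: str) -> int:
    """LEN() ignores trailing spaces, so padding does not count."""
    return len(value.rstrip(" "))


name = "Sachin"
print(datalength_nchar(name, 20), sql_len(name.ljust(20)))   # 40 6
print(datalength_nvarchar(name), sql_len(name))              # 12 6
```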
Yes,
it may use less storage if the actual value to be stored is smaller
then N*2 bytes
n just gives the maximum number of characters that can be stored in this field; the number of characters actually stored is the number of characters you pass in.
And here is the documentation: nchar and nvarchar (Transact-SQL)
For non-MAX, non-XML string types, the length that they are declared as (i.e. the value within the parenthesis) is the maximum number of smallest (in terms of bytes) characters that will be allowed. But, the actual limit isn't calculated in terms of characters but in terms of bytes. CHAR and VARCHAR characters can be 1 or 2 bytes, so the smallest is 1 and hence a [VAR]CHAR(100) has a limit of 100 bytes. That 100 bytes can be filled up by 100 single-byte characters, or 50 double-byte characters, or any combination that does not exceed 100 bytes. NCHAR and NVARCHAR (stored as UTF-16 Little Endian) characters can be either 2 or 4 bytes, so the smallest is 2 and hence a N[VAR]CHAR(100) has a limit of 200 bytes. That 200 bytes can be filled up by 100 two-byte characters or 50 four-byte characters, or any combination that does not exceed 200 bytes.
If you enable ROW or PAGE compression (this is a per-index setting), then the actual space used will usually be less. NCHAR and NVARCHAR use the Unicode Compression Algorithm, which is somewhat complex, so it is not easy to calculate what the size would be. And I believe that the MAX types don't allow for compression.
Outside of those technicalities, the difference between the VAR and non-VAR types is simply that the VAR types take up only the space of each individual value inserted or updated, while the non-VAR types are blank-padded and always take up the declared amount of space (which is why one almost always uses the VAR types). The MAX types are only variable (i.e. there is no CHAR(MAX) or NCHAR(MAX)).
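The byte-based limit described in this answer can be illustrated in Python by encoding strings as UTF-16 LE and comparing against 2*n bytes. `fits_nvarchar` is a made-up helper for this sketch, and U+1F600 is just one example of a character that needs 4 bytes (a surrogate pair).

```python
def utf16_bytes(value: str) -> int:
    """Number of bytes the string needs in UTF-16 little endian."""
    return len(value.encode("utf-16-le"))


def fits_nvarchar(value: str, n: int) -> bool:
    """N[VAR]CHAR(n) allows at most 2 * n bytes, whatever the character mix."""
    return utf16_bytes(value) <= 2 * n


two_byte = "a" * 100              # 100 characters, 2 bytes each
four_byte = "\U0001F600" * 50     # 50 characters, 4 bytes each
mixed = "a" * 50 + "\U0001F600" * 25

print(fits_nvarchar(two_byte, 100))             # True  (200 bytes)
print(fits_nvarchar(four_byte, 100))            # True  (200 bytes)
print(fits_nvarchar(mixed, 100))                # True  (200 bytes)
print(fits_nvarchar("\U0001F600" * 51, 100))    # False (204 bytes)
```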
The documentation isn't super clear: https://msdn.microsoft.com/en-us/library/ms186939.aspx
What happens if I try to store a 20 character length string in a column defined as nvarchar(10)? Is 10 the max length the field could be or is it the expected length? If I can exceed n characters in the string, what are the performance implications of doing that?
The maximum number of characters you can store in a column or variable typed as nvarchar(n) is n. If you try to store more in a variable, your string will be truncated; in the case of an insert into a table, the insert is rejected with an error about truncation:
String or binary data would be truncated. The statement has been terminated.
declare @n nvarchar(10)
set @n = N'more than ten chars'
select @n
Result:
----------
more than
(1 row(s) affected)
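The two behaviours described in this answer (silent truncation for a variable, a rejected statement for an insert) can be modelled roughly as follows. This is a simplified sketch: `assign_variable`, `insert_into_column` and `TruncationError` are hypothetical names, and length is counted in Python characters rather than UTF-16 byte pairs.

```python
class TruncationError(Exception):
    """Stands in for the 'String or binary data would be truncated' error."""


def assign_variable(value: str, n: int) -> str:
    """Assigning to an nvarchar(n) variable silently keeps the first n characters."""
    return value[:n]


def insert_into_column(value: str, n: int) -> str:
    """Inserting into an nvarchar(n) column fails instead of truncating."""
    if len(value) > n:
        raise TruncationError("String or binary data would be truncated.")
    return value


print(repr(assign_variable("more than ten chars", 10)))  # 'more than '

try:
    insert_into_column("more than ten chars", 10)
except TruncationError as err:
    print(err)
```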
From my understanding, nvarchar will only store the provided characters, up to the declared maximum. Nchar will actually fill the unused characters with whitespace.
If varchar(max) is used as the datatype and the inserted data is less than the full allocation, i.e. only 200 chars, then will SQL Server always take the full space of varchar(max) or just the 200 chars' space?
Further, what are the other data types that will take the max space even if lesser data is inserted?
Are there any documents that specify this?
From MS DOCS on char and varchar (Transact-SQL):
char [ ( n ) ]
Fixed-length, non-Unicode string data. n defines the string length and must be a value from 1 through 8,000. The storage size is n bytes. The ISO synonym for char is character.
varchar [ ( n | max ) ]
Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are char varying or character varying.
So for varchar, including max - the storage will depend on actual data length, while char is always fixed size even when entire space is not used.
Use CHAR only for strings
whose length you know to be fixed. For example, if you define a domain
whose values are restricted to 'T' and 'F', you should probably make
that CHAR[1]. If you're storing US social security numbers, make the
domain CHAR[9] (or CHAR[11] if you want punctuation).
Use VARCHAR for strings that can vary in length, like names, short
descriptions, etc. Use VARCHAR when you don't want to worry about
stripping trailing blanks. Use VARCHAR unless there's a good reason
not to.
varchar size depends on the length of the data. So in your case, it will just take 200 chars.
This question already has answers here:
What is the maximum number of characters that nvarchar(MAX) will hold?
(3 answers)
Closed 1 year ago.
I have declared a column of type NVARCHAR(MAX) in SQL Server 2008, what would be its exact maximum characters having the MAX as the length?
The max size for a column of type NVARCHAR(MAX) is 2 GByte of storage.
Since NVARCHAR uses 2 bytes per character, that's approx. 1 billion characters.
Leo Tolstoj's War and Peace is a 1'440 page book, containing about 600'000 words - so that might be 6 million characters - well rounded up. So you could stick about 166 copies of the entire War and Peace book into each NVARCHAR(MAX) column.
Is that enough space for your needs? :-)
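The numbers in this answer are easy to re-derive. The sketch below uses the 2^31-1 byte limit and the answer's own rounded estimates (about 1 billion characters, about 6 million characters per copy of the book); nothing here is measured.

```python
max_bytes = 2**31 - 1                 # documented storage limit for MAX types
max_chars = max_bytes // 2            # 2 bytes per character

approx_chars = 1_000_000_000          # the answer's rounded "1 billion"
book_chars = 6_000_000                # the answer's rounded estimate per copy

print(max_chars)                      # 1073741823
print(approx_chars // book_chars)     # 166
```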
By default, nvarchar(MAX) values are stored exactly the same as nvarchar(4000) values would be, unless the actual length exceeds 4000 characters; in that case, the in-row data is replaced by a pointer to one or more separate pages where the data is stored.
If you anticipate data possibly exceeding 4000 characters, nvarchar(MAX) is definitely the recommended choice.
Source: https://social.msdn.microsoft.com/Forums/en-US/databasedesign/thread/d5e0c6e5-8e44-4ad5-9591-20dc0ac7a870/
From MSDN Documentation
nvarchar [ ( n | max ) ]
Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB).
The storage size, in bytes, is two times the actual length of data entered + 2 bytes
I think nvarchar(MAX) can actually store approximately 1,070,000,000 characters (2^31-1 bytes at 2 bytes per character).
Does a char occupy 1 byte in a database?
EDIT:
If I define a column as varchar(1), will it reserve 1 or 2 bytes for me?
char(k) takes k bytes no matter what the value is;
varchar(k) takes n+1 bytes, where n = number of characters in the value, so at most k+1 bytes.
Value       CHAR(4)  Storage Required  VARCHAR(4)  Storage Required
''          '    '   4 bytes           ''          1 byte
'ab'        'ab  '   4 bytes           'ab'        3 bytes
'abcd'      'abcd'   4 bytes           'abcd'      5 bytes
'abcdefgh'  'abcd'   4 bytes           'abcd'      5 bytes
http://dev.mysql.com/doc/refman/5.1/en/char.html
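The table above follows from two simple rules, sketched here in Python. The sketch assumes a single-byte character set, a 1-byte length prefix for varchar, and that over-long values are truncated rather than rejected; `mysql_char` and `mysql_varchar` are illustrative names only.

```python
def mysql_char(value: str, n: int = 4) -> tuple[str, int]:
    """CHAR(n): truncate to n, right-pad with spaces, always n bytes."""
    stored = value[:n].ljust(n)
    return stored, n


def mysql_varchar(value: str, n: int = 4) -> tuple[str, int]:
    """VARCHAR(n): truncate to n, store as-is plus a 1-byte length prefix."""
    stored = value[:n]
    return stored, len(stored) + 1


for value in ("", "ab", "abcd", "abcdefgh"):
    print(repr(value), mysql_char(value), mysql_varchar(value))
```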
It depends on what kind of char it is. If the string type is char/varchar, then 1 byte; if it is Unicode (nchar/nvarchar), then most probably 2 bytes.
It depends on the RDBMS system, and how you define the column. You certainly could define one that only requires one byte of storage space [in SQL Server, it'd be CHAR(1) ]. Overhead for row headers, null bitmasks, possibly indexing uniquefication, and lots of other cruft can complicate things, but yeah, you should be able to create a column that's one byte wide.
Yes, if you specify the length of the char field as one, and the database is using a codepage based character mapping so that each character is represented as one byte.
If the database is, for example, set up to use UTF-8 for storing characters, each character will take anything from one to four bytes depending on what character it is.
However, the char data type is rather old, and some databases may actually store a char(1) field the same way as a varchar(1) field. In that case the field also needs a length, so it will take up at least one or two bytes, depending on whether it is a space that you store in the field (which would be stored as an empty string), and maybe more depending on the database.
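How many bytes a single character needs under a variable-width encoding can be checked directly. The sketch below encodes a few sample characters (chosen arbitrarily for illustration) as UTF-8, where standard UTF-8 uses between one and four bytes per code point, and as Latin-1 for the single-byte code page case.

```python
samples = {
    "A": "Latin capital letter",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}

# UTF-8 is variable width: the byte count depends on the code point.
for char, description in samples.items():
    print(f"U+{ord(char):04X} {description}: {len(char.encode('utf-8'))} byte(s)")

# In a single-byte code page every representable character takes exactly 1 byte.
print(len("A".encode("latin-1")), len("\u00e9".encode("latin-1")))
```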