I have a big chunk of textual data which I split and write to multiple rows of a varchar(255) column of a table. Sometimes, the last character of a row happens to be a space. When I read back this row, the trailing space is chopped and I get only 254 characters. This messes up my data when I append the next row to the end of this one.
My code sends the full 255 char (incl space) to the DB API. How can I check that the trailing space is actually written to the table?
I am not in a position to rewrite/redesign legacy code. Is there any setting - either in the DB, DB interface, read/write calls etc - that I can use to preserve the trailing space?
This is designed behaviour: varchars will strip trailing spaces. If you want to keep all the filling spaces, you have to use char columns. So the only thing you can do is change the schema.
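One way to check where the space goes missing is to compare lengths before writing and after reading back. The sketch below does not talk to any database: `lossy_store` is a hypothetical stand-in for a write/read round trip that strips trailing spaces as described in the question, and `CHUNK` is the assumed column width.

```python
CHUNK = 255  # assumed column width, matching varchar(255) in the question


def split_chunks(text, size=CHUNK):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def lossy_store(chunk):
    """Hypothetical stand-in for a round trip that drops trailing spaces."""
    return chunk.rstrip(" ")


def damaged_rows(chunks, round_trip=lossy_store):
    """Return the indexes of chunks whose length changed in the round trip."""
    return [i for i, c in enumerate(chunks) if len(round_trip(c)) != len(c)]


text = "a" * 254 + " " + "b" * 10  # the 255th character is a space
chunks = split_chunks(text)
print(damaged_rows(chunks))
print("".join(lossy_store(c) for c in chunks) == text)
```

Passing a real write-then-read function as `round_trip` instead of `lossy_store` would show which rows come back shorter than they were sent.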
I would like to save postcodes in my database. There are postcodes that begin with a 0, and not only in Germany. Unfortunately, the integer format would not work because the 0 would be omitted. So 01234 would then become 1234. Is varChar(5) the only possibility?
The German ZIP Code is always 5 digits.
Well, alphanumeric (anything with CHAR in the data type name...). Same as for phone numbers or other data that does primarily contain digits, but not only.
Varchar(5) certainly isn't the only possibility. You could save them as integers, then pad them with leading zeroes in whatever application is using the database.
But IMO, saving them as strings (whether varchar or char) is the best option. Even though they're comprised of digits, they're not really numbers (e.g. it doesn't make sense to add them together, and leading zeroes are important). Saving them as strings would also give you flexibility if you do eventually need to use postcodes with letters in them.
If they're always going to be 5 characters, then use a datatype that's exactly 5 characters, i.e. char(5). You could even add a column constraint to ensure that anything inserted into the table is exactly 5 characters long and every char is a digit.
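As a rough application-side illustration of the two ideas above (a "five digits exactly" rule, and padding an integer back out with leading zeroes), here is a minimal Python sketch. The names `is_valid_postcode` and `pad_postcode` are made up for this example and are not part of any database API.

```python
import re

POSTCODE_RE = re.compile(r"[0-9]{5}")  # exactly five ASCII digits


def is_valid_postcode(value):
    """True if value is a string of exactly five digits."""
    return isinstance(value, str) and POSTCODE_RE.fullmatch(value) is not None


def pad_postcode(number):
    """Rebuild the 5-digit text form from a postcode stored as an integer."""
    return f"{number:05d}"


print(is_valid_postcode("01234"))
print(is_valid_postcode("1234"))
print(pad_postcode(1234))
```

Storing the value as text makes `pad_postcode` unnecessary, which is one reason the string option is simpler.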
I'm having trouble understanding how to define a column for my text that has the right size for my max. number of characters. In Oracle I can create a VARCHAR2(10 CHAR) which will be big enough for 10 characters. The size depends on the encoding used in the database. But how do I do that in SQL Server? Do I use varchar(10)? nvarchar(10)? I want to be able to store all kinds of characters (even Chinese).
If you want Chinese characters, you need to use nvarchar(n) and specify a length of n that makes sense.
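Those are characters you're defining, and the space you need is twice that number of bytes, since SQL Server stores nvarchar data as UTF-16 with 2 bytes per code unit. Characters outside the Basic Multilingual Plane, such as some rare Chinese characters and emoji, take two code units (4 bytes).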
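To get a feel for the arithmetic, the following Python sketch counts the bytes a string needs when encoded as UTF-16, which is the encoding nvarchar uses. It is only an estimate of the character data itself and ignores any per-row or per-column overhead; the function names are invented for this example.

```python
def utf16_bytes(text):
    """Bytes needed to hold text as UTF-16 (little-endian, no byte order mark)."""
    return len(text.encode("utf-16-le"))


def utf16_units(text):
    """Number of 2-byte UTF-16 code units, i.e. the n in nvarchar(n) the text needs."""
    return utf16_bytes(text) // 2


print(utf16_bytes("abc"))
print(utf16_bytes("中文"))
print(utf16_units("\U0001F600"))  # a character outside the Basic Multilingual Plane
```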
Max. size is nvarchar(4000) - or if you really need more, use nvarchar(max) (for up to 1 billion characters).
I would recommend NOT to just use nvarchar(max) for everything out of laziness about considering what size you really need! Since it's a really large column, you won't be able to index it, for one thing.
If you use nvarchar(max) this will allow for any number of characters for all character sets. The system will optimise storage.
Limitations on row size are addressed here. See the answer from @marc_s for limitations on the use of max.
What is the best way to store the following value in SQL Server ?
1234-56789 or
4567-12892
The value will always have 4 digits followed by a hyphen and 5 digits
char(10) is a possibility that I was thinking of using or removing the hyphen and storing as int
If it is a business requirement that "The value will always have 4 digits followed by a hyphen and 5 digits", then use CHAR(10). But if you think users should be able to add values even if they aren't in the expected format, then use VARCHAR(10) or VARCHAR(15), whatever suits you better.
You should store those kinds of values as int only if they really represent a number as opposed to a series of digits. A number is something that you can do calculations on, compare numerically, etc.
Otherwise store it as char. Make it length of 10 if the format is set and won't change.
Another option would be to create a CHAR(4) column and a CHAR(5) column. This would be useful (only) if you envision ever having to query against one or the other part independently.
Very easy to concatenate these back together using a view, computed column, or inline - so you don't have to waste storage space on a dash that will always be there, and so that you can keep these two pieces of data separate if, in fact, they are independent.
Since you didn't provide much detail about what these "numbers" represent or how they will be used / queried, you're going to get a whole bunch of opinions, some of which might not be very relevant to your data model.
Well, if it's guaranteed to always be like that, a char(10) datatype seems appropriate.
But you should also add a check constraint:
column LIKE '[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9]'
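For comparison, the same "4 digits, hyphen, 5 digits" rule can be checked in application code before the value ever reaches the database. This is a minimal Python sketch, not a replacement for the check constraint; `is_valid_code` and `split_code` are hypothetical helper names, and `split_code` mirrors the idea of keeping the two parts in separate columns.

```python
import re

CODE_RE = re.compile(r"[0-9]{4}-[0-9]{5}")  # same shape as the LIKE pattern above


def is_valid_code(value):
    """True if value is exactly 4 digits, a hyphen, then 5 digits."""
    return CODE_RE.fullmatch(value) is not None


def split_code(value):
    """Return the 4-digit and 5-digit parts, e.g. for two separate columns."""
    if not is_valid_code(value):
        raise ValueError(f"not in NNNN-NNNNN format: {value!r}")
    prefix, suffix = value.split("-")
    return prefix, suffix


print(is_valid_code("1234-56789"))
print(split_code("4567-12892"))
```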
Here is a SO answer that should help you sort out what you need -
nchar and nvarchar can store Unicode characters.
char and varchar cannot store Unicode characters.
char and nchar are fixed-length, which means they reserve storage space for the number of characters you specify even if you don't use up all that space.
varchar and nvarchar are variable-length, which means they only use up space for the characters you store. They do not reserve storage like char or nchar.
When I am saving data into a table, extra spaces are being added to the tail of the value. I observed that, as the column length is 5, if I insert a value that is 3 characters long, 2 extra spaces are added. Can anyone tell me how to solve this problem?
Is the column type CHAR(5) instead of VARCHAR(5)?
CHAR(x) creates a column that always stores x characters, and pads the data with spaces.
VARCHAR(x) creates a column that varies the lengths of the strings to match the data inserted.
This is a property of CHAR data type. If you want no extra spaces, you need to use VARCHAR although for a small field there is a minimal overhead compared to standard CHAR. Having said that, it is believed that VARCHAR nowadays is as good as CHAR.
CHAR variables will store this extra padding. Maybe you need to be using VARCHAR2 variables instead?
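The difference between the two types can be mimicked in a few lines of Python. This is only a simulation of the padding behaviour described in the answers above, with invented function names, and it assumes the value fits in the column.

```python
def as_char(value, width):
    """Simulate CHAR(width): pad with trailing spaces up to the full width."""
    if len(value) > width:
        raise ValueError("value is longer than the column")
    return value.ljust(width)


def as_varchar(value, width):
    """Simulate VARCHAR(width): keep the value at its own length."""
    if len(value) > width:
        raise ValueError("value is longer than the column")
    return value


stored = as_char("abc", 5)
print(repr(stored), len(stored))
print(repr(as_varchar("abc", 5)))
print(repr(stored.rstrip(" ")))  # trimming on read recovers the original
```

If the schema cannot be changed, trimming on read is a possible workaround, provided the real values never legitimately end in a space.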
I have a form that records a student ID number. Some of those numbers contain a leading zero. When the number gets recorded into the database it drops the leading 0.
The field is set up to only accept numbers. The length of the student ID varies.
I need the field to be recorded and displayed with the leading zero.
If you are always going to have a number of a certain length (say, it will always be 10 characters), then you can just get the length of the number in the database (after it is converted to a string) and then add the appropriate 0's.
However, if this is an arbitrary amount of leading zeros, then you will have to store the content as a string in the database so you can capture the leading zeros.
It sounds like this should be stored as string data. It sounds like the leading zeros are part of the data itself, not just part of its formatting.
You could reformat the data for display with the leading zeros in it; however, I believe you should store the correct form of the ID number. It will lead to fewer bugs down the road (e.g. you format it in one place but forget to in another).
There are a few ways of doing this - depending on the answers to my comments in your question:
1. Store the extra data in the database by converting the datatype from numeric to varchar/string.
Advantages: Very simple in its implementation; You can treat all the values in the same way.
Disadvantage: If you've got very large amounts of data, storage sizes will escalate; indexing and sorting on strings doesn't perform so well.
Use if: Each number may have an arbitrary length (and hence number of zeros).
Don't use if: You're going to be spending a lot of time sorting data, sorting numeric strings is a pain in the ass - look up natural sorting to see some of the pitfalls;
2. Continue to store the data in the database as numeric but pad the numeric back to a set length (i.e. 10 as I have suggested in my example below):
Advantages: Data will index better, search better, not require such large amounts of storage if you've got large amounts of data.
Disadvantage: Every query or display of data will require every data instance to be padded to the correct length causing a slight performance hit.
Use if: All the output numbers will be the same length (i.e. including zeros they're all [for example] 10 digits); Large amounts of sorting will be necessary.
3. Add a field to your table to store the original length of the number. Continue to store the value as numeric (to leverage the sorting/indexing performance gains of numeric vs. string), and in the new field store the length including the significant leading zeros:
Advantages: Reduction in required storage space; maximum use of indexing; sorting of numerics is far easier than sorting text numerics; You still get the ability to pad numerics to arbitrary lengths like you have with option 1.
Disadvantages: An extra field is required in your database, so all your queries will have to pull that extra field thus potentially requiring a slight increase in resources at query/display time.
Use if: Storage space/indexing/sorting performance is any sort of concern.
Don't use if: You don't have the luxury of changing the table structure to include the extra value; This will overcomplicate already complex queries.
If I were you and I had access to modify the db structure slightly, I'd go with option 3. Sure, you need to pull out an extra field to get the length, but the slightly increased complexity pays huge dividends in the advantages versus the disadvantages. The performance hit of padding the string back out to the correct length will be far outweighed by the gains in indexing performance and storage space.
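A rough Python sketch of the "numeric value plus original length" idea follows. The names `encode_id` and `decode_id` are invented for illustration; in a real table the two returned values would live in two columns. The last lines also show why sorting the text form differs from sorting the numeric form.

```python
def encode_id(text):
    """Split a digit string into (numeric value, original length)."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a digit string: {text!r}")
    return int(text), len(text)


def decode_id(value, length):
    """Rebuild the original digit string, restoring the leading zeros."""
    return str(value).zfill(length)


ids = ["0042", "7", "00100", "15"]
as_text = sorted(ids)                                     # character-by-character order
as_numbers = sorted(ids, key=lambda s: encode_id(s)[0])   # numeric order
print(encode_id("00123"), decode_id(123, 5))
print(as_text)
print(as_numbers)
```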
I worked with a database with a similar problem. They were storing zip codes as a number. The consequence was that people in New Jersey couldn't use our app.
You're using data that is logically a text string and not a number. It just happens to look like a number, but you really need to treat it as text. Use a text-oriented data type, or at least create a database view that enables you to pull back a properly formatted value for this.
See here: Pad or remove leading zeroes from numbers
declare @recordNumber integer;
set @recordNumber = 93088;
declare @padZeroes integer;
set @padZeroes = 8;
select
    right(replicate('0', @padZeroes)
    + convert(varchar, @recordNumber), @padZeroes);
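The same pad-then-take-the-rightmost-characters trick can be written in application code. The Python sketch below mirrors the query above under the assumption that the width is at least 1; `pad_zeroes` is a name chosen for this example. Like the SQL version, it cuts digits off the left if the number is longer than the requested width.

```python
def pad_zeroes(record_number, width):
    """Prepend `width` zeros, then keep only the rightmost `width` characters."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return ("0" * width + str(record_number))[-width:]


print(pad_zeroes(93088, 8))
print(pad_zeroes(123456789, 8))  # too long for the width: the leading digit is cut off
```

For numbers that are known to fit, Python's built-in `str(record_number).zfill(width)` gives the same result without truncating.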
Unless you intend to do calculations on that ID, it's probably best to store it as text/string.
Another option: since the field is an ID, I would recommend creating a secondary field for the display number (nvarchar) that you can use for reports, etc.
Then in your application when the student id is entered you can insert that into the database as the number, as well as the display number.
An Oracle solution
Store the ID as a number and convert it into a character for display. For instance, to display 42 as a zero-padded, three-character string:
SQL> select to_char(42, '099') from dual;
042
Change the format string to fit your needs.
(I don't know if this is transferable to other SQL flavors, however.)
You could just concatenate '1' to the beginning of the ID when storing it in the database. When retrieving it, treat it as a string and remove the first char.
MySQL Example:
SET @student_id = '123456789';
INSERT INTO student_table (id,name) VALUES(CONCAT('1',@student_id),'John Smith');
...
SELECT SUBSTRING(id,2) FROM student_table;
Mathematically:
Initially I thought too much and did it mathematically by adding an integer to the student ID, depending on its length (like 1,000,000,000 if it's 9 digits), before storing it.
SET @new_student_id = ABS(@student_id) + POW(10, CHAR_LENGTH(@student_id));
INSERT INTO student_table (id,name) VALUES(@new_student_id,'John Smith');
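Both variants of the sentinel idea produce the same stored number, which the following Python sketch illustrates. The function names are hypothetical, and the sketch assumes the ID consists only of digits. Note that Python slicing is 0-indexed, so dropping the first character is `[1:]`.

```python
def encode_with_prefix(student_id):
    """Put a '1' in front of the ID so leading zeros survive in an integer column."""
    if not (student_id.isascii() and student_id.isdigit()):
        raise ValueError(f"not a digit string: {student_id!r}")
    return int("1" + student_id)


def encode_mathematically(student_id):
    """Same result by arithmetic: add 10 to the power of the ID's length."""
    return int(student_id) + 10 ** len(student_id)


def decode(stored):
    """Drop the leading sentinel digit to recover the original ID."""
    return str(stored)[1:]


print(encode_with_prefix("007"), encode_mathematically("007"))
print(decode(1007))
```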