Hi all,
I am using SQL Server Express to store some data, but it also stores trailing spaces with the data. For example, if I have an NCHAR(20) column in a table and I store "computer" (8 characters) in this column, the remaining characters (20 - 8 = 12) are filled with blank spaces. Is there any way to overcome this problem? When I show this data in a flow document (center alignment), it produces an alignment error.
Thanks for the help.
You can use the NVARCHAR data type instead. NVARCHAR is a variable-length data type and will only store the actual data.
If you don't have control over the data types, then you'll need to trim off the extra characters manually. In T-SQL you can do this with the RTRIM function, as in the sketch below.
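Here is a minimal sketch (the temp table and column names are made up for illustration) showing the padding difference and the RTRIM workaround:

CREATE TABLE #Demo
(
    FixedCol    NCHAR(20),     -- always padded to 20 characters
    VariableCol NVARCHAR(20)   -- stores only what you insert
);

INSERT INTO #Demo (FixedCol, VariableCol)
VALUES (N'computer', N'computer');

SELECT DATALENGTH(FixedCol)    AS FixedBytes,    -- 40: 20 chars * 2 bytes, padded
       DATALENGTH(VariableCol) AS VariableBytes, -- 16: 8 chars * 2 bytes
       RTRIM(FixedCol)         AS Trimmed        -- trailing blanks removed on read
FROM #Demo;

DROP TABLE #Demo;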
Related
I have some data which I believe is Unicode, and I'm seeing what happens when I store it in my database column, which is of the VARCHAR(MAX) data type.
And here's the source, from the file which is UTF-8...
looking for that ‘X’ and • 3 large bedrooms with 2 ensuites and • Main bedroom with ensuite & surround with plantation shutters
and using the Visual Studio debugger:
=> so 2x apostrophes and 2x bullets.
I thought SQL Server can only store Unicode if the column is of type NVARCHAR?
I'm assuming my source data is not Unicode and therefore, I totally suck at all this Unicode/UTF-8 stuff :(
I thought SQL Server can only store Unicode if the column is of type NVARCHAR?
That's correct. As far as I can guess from your example, it is not storing Unicode. Probably it is storing bytes encoded in Windows code page 1252, which would be the default encoding for a Western install of SQL Server.
Code page 1252 happens to include mappings for characters ‘, ’ and •, so those characters can be safely stored. But step outside that limited repertoire and you'll start losing characters.
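To see the effect, here is a small hedged example (it assumes a code page 1252 collation such as Latin1_General_CI_AS):

SELECT CAST(N'•'  AS VARCHAR(10)) AS InsideCp1252,   -- '•' exists in 1252 and survives
       CAST(N'你' AS VARCHAR(10)) AS OutsideCp1252;  -- no 1252 mapping, comes back as '?'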
I have a mobile chat conversation text area which is stored as an NTEXT column in SQL Server 2008. I am doing some processing character by character, and I need some way (I do not know how) to skip over emoji characters. Should I eliminate them, collate to a different collation, or encode to a different character set? My table's collation is Latin1_General_CI_AS. I need something like this:
IF SUBSTRING(@chat_Conversation, @i, 1) = 'Emoji'
    CONTINUE;
As a first guess I'd suggest placing an N in front of your literal.
Compare the results:
SELECT '😊' AS ExtASCII
      ,N'😊' AS Unicode;
The result:
ExtASCII  Unicode
??        😊
Without the N the literal is read as extended ASCII, unknown characters are returned as question marks. With N you are dealing with UNICODE (to be exact: UCS-2)...
UPDATE
As pointed out in comments: Do not use NTEXT!
NTEXT, TEXT and IMAGE are deprecated for centuries! These types will not be supported in future versions!
Convert all your work (columns, variables...) as follows (a conversion sketch follows the list):
NTEXT -> NVARCHAR(MAX) (covering UCS-2 characters)
TEXT -> VARCHAR(MAX) (covering extended ASCII, depending on COLLATION and code page)
IMAGE -> VARBINARY(MAX) (covering BLOBs)
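A minimal conversion sketch for the first case (the table and column names dbo.ChatLog / Conversation are made up for illustration):

ALTER TABLE dbo.ChatLog
    ALTER COLUMN Conversation NVARCHAR(MAX);

-- Optional follow-up: rewriting the values moves them out of the old
-- NTEXT LOB storage into the NVARCHAR(MAX) format.
UPDATE dbo.ChatLog
   SET Conversation = Conversation;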
Hint
If you are dealing with special characters like foreign alphabets or emojis you should always use the N with literals and with types...
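Coming back to the original goal of skipping emojis while scanning character by character: most emojis are stored as UTF-16 surrogate pairs, so one hedged approach is to skip any code unit in the high-surrogate range. A sketch with illustrative variable names, assuming a non-supplementary-character collation (as here), where LEN and SUBSTRING count UTF-16 code units:

DECLARE @chat NVARCHAR(MAX) = N'Hi 😊 there';
DECLARE @i INT = 1;

WHILE @i <= LEN(@chat)
BEGIN
    -- 55296-56319 (0xD800-0xDBFF) is the high-surrogate range that opens a pair.
    IF UNICODE(SUBSTRING(@chat, @i, 1)) BETWEEN 55296 AND 56319
        SET @i = @i + 2;   -- skip the whole surrogate pair (the emoji)
    ELSE
    BEGIN
        PRINT SUBSTRING(@chat, @i, 1);   -- process a normal character here
        SET @i = @i + 1;
    END
END;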
Currently, I have a DB2 database and some columns of CHAR type.
If I insert data with fewer characters than the column specifies, DB2 fills the rest with blank characters.
For example:
Column: orderkey of type CHAR(8)
If I insert "AB12", it is saved as "AB12XXXX" (X indicates the blank characters).
Is it possible to prevent DB2 from filling CHAR columns with blanks?
DB2 version 9.5
In SQL, the CHAR data type is a fixed-length character string. By definition, the additional characters are padded with spaces.
What you want is the VARCHAR data type. So just change your data type to VARCHAR(8) and it will store your strings with no appended spaces.
By the way, this is true in all databases, not only DB2.
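A small sketch in DB2 syntax (the table names are hypothetical):

-- Define the column as VARCHAR to avoid padding:
CREATE TABLE orders_new (orderkey VARCHAR(8) NOT NULL);
INSERT INTO orders_new VALUES ('AB12');   -- stored as 'AB12', no trailing blanks

-- If you cannot change the existing CHAR(8) column, trim the padding on read:
SELECT RTRIM(orderkey) FROM orders;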
I have got an Excel sheet which inserts data into SQL Server, but I noticed that for a particular field the data is being inserted with an 'e'; this particular field is of type VARCHAR and size 20.
Why is the 'e' being inserted when the actual data for these respective fields is 54607677038, 77200818179 and 9920996?
Help me out
Thanks in anticipation.
You may think of '2007038971' as being just a string of numbers (some kind of article code, I guess). Excel just sees numbers and treats it as a numerical value. It is probably right-aligned (the default for numbers) and not left-aligned (the default for strings).
When asked to store it as a string, Excel 'helpfully' formats that number into a string, thereby introducing that "e" notation (the value 2007038971 is about 2.00704 * 10^9).
You need to convince Excel that that code really is a string, maybe by adding a quote in front of it.
How about this: when you read the value from Excel, convert it with ToString() and then insert it into the DB. You'll need to adjust the relevant data type based on the data in your Excel sheet.
// For a value of this magnitude the default double.ToString()
// produces plain digits rather than scientific notation:
double doub = 2.00704e+009;
string val = doub.ToString();   // "2007040000"
I'm facing a strange issue trying to move from SQL Server to Oracle.
In one of my tables I have a column defined as NVARCHAR(255).
After reading a bit I understood that SQL Server counts characters while Oracle counts bytes.
So I defined my column in Oracle as VARCHAR2(510), since 255 * 2 = 510.
But when using sqlldr to load the data from a tab-delimited text file I get an error indicating some entries have exceeded the length of this column.
After checking in SQL Server using:
SELECT MAX(DATALENGTH(column))
FROM table
I get that the max data length is 510.
I do use the Hebrew_CI_AS collation, even though I don't think it changes anything...
I also checked in SQL Server whether any of the entries contains a TAB, but no... so I guess it's not corrupted data...
Anyone have an idea?
EDIT
After further checking, I've noticed that the issue is due to the data file (in addition to the issue solved by @Justin Cave's post).
I have changed the row delimiter to '^', since none of my data contains this character, and '|^|' as the column delimiter,
creating a control file as follows:
load data
infile data.txt "str '^'"
badfile "data_BAD.txt"
discardfile "data_DSC.txt"
into table table
FIELDS TERMINATED BY '|^|' TRAILING NULLCOLS
(
col1,
col2,
col3,
col4,
col5,
col6
)
The problem is that my data contains <CR> characters, and since sqlldr expects a stream file, it fails on the <CR>! I do not want to change the data, since it is textual data (error messages, for example).
What is your database character set?
SELECT parameter, value
FROM v$nls_parameters
WHERE parameter LIKE '%CHARACTERSET'
Assuming that your database character set is AL32UTF8, each character could require up to 4 bytes of storage (though almost every useful character can be represented with at most 3 bytes of storage). So you could declare your column as VARCHAR2(1020) to ensure that you have enough space.
You could also simply use character length semantics. If you declare your column VARCHAR2(255 CHAR), you'll allocate space for 255 characters regardless of the amount of space that requires. If you change the NLS_LENGTH_SEMANTICS initialization parameter from the default BYTE to CHAR, you'll change the default so that VARCHAR2(255) is interpreted as VARCHAR2(255 CHAR) rather than VARCHAR2(255 BYTE). Note that the 4000-byte limit on a VARCHAR2 remains even if you are using character length semantics.
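For example (Oracle syntax; the table and column names are made up):

-- Character-length semantics declared per column:
CREATE TABLE my_table (
    my_col VARCHAR2(255 CHAR)   -- room for 255 characters, however many bytes each takes
);

-- Or change the default, here at session level, before running unqualified DDL:
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;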
If your data contains line breaks, do you need the TRAILING NULLCOLS parameter? That implies that sometimes columns may be omitted from the end of a logical row. If you combine columns that may be omitted with columns that contain line breaks and data that is not enclosed by at least an optional enclosure character, it's not obvious to me how you would begin to identify where a logical row ended and where it began. If you don't actually need the TRAILING NULLCOLS parameter, you should be able to use the CONTINUEIF parameter to combine multiple physical rows into a single logical row. If you can change the data file format, I'd strongly suggest adding an optional enclosure character.
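If you can add an enclosure character, the control file might look roughly like this (a hedged sketch that just adds OPTIONALLY ENCLOSED BY to the delimiters from the question):

load data
infile data.txt "str '^'"
badfile "data_BAD.txt"
discardfile "data_DSC.txt"
into table table
FIELDS TERMINATED BY '|^|' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
col1,
col2,
col3,
col4,
col5,
col6
)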
The bytes used by an NVARCHAR field are equal to two times the number of characters plus two (see http://msdn.microsoft.com/en-us/library/ms186939.aspx), so if you make your VARCHAR field 512 bytes you may be OK. There's also some indication that some character sets use 4 bytes per character, but I've found no indication that Hebrew is one of these character sets.