SQL Server 2012 8k limitation confusion - sql-server

I have a question about the SQL Server 8k limitation. I have a destination table with 453 columns, all of type varchar(max). My import table has the same number of columns, all of the same data type; the flow is simply import table -> destination table. I'm sure people are going to suggest normalizing and redesigning, but I am more interested in why SQL Server is behaving this way.
The character count of the rows in the import table is around 4000 to 4500. Below are the scenarios I need help with:
If I do a select * into sometable from "import table", I get a successful run.
If I do an insert into "destination table" select * from "import table", I get an error saying "Cannot create a row of size 8239 which is greater than the allowable maximum row size of 8060."
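For clarity, the two statements look roughly like this (the table names are placeholders for the real import and destination tables):

-- Scenario 1: SELECT INTO creates a brand-new table from the import data - this succeeds
SELECT * INTO dbo.SomeTable
FROM dbo.ImportTable;

-- Scenario 2: inserting into the pre-existing 453-column destination table fails with
-- "Cannot create a row of size 8239 which is greater than the allowable maximum row size of 8060."
INSERT INTO dbo.DestinationTable
SELECT * FROM dbo.ImportTable;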
I am totally losing my mind. I thought using varchar(max) lifts the 8060-byte limitation and allows up to 2 GB per value. All of my destination table columns and import table columns are of type varchar(max), which allows for LOB and/or out-of-row storage.
Need help please.

Related

SQLSERVER: BCP import and encrypt columns using Database Key

I have a large file which needs to be imported to SQL Server. The file contains columns of personal information (like first_name, phone_number). Currently I'm importing the large file into SQL Server using the BCP tool, and as a next step I'm encrypting the columns using a database key as shown below.
CREATE TABLE users (
first_name VARCHAR(4000)
)
CREATE CERTIFICATE db_cert1
WITH SUBJECT = 'Encrypt PII data';
GO
CREATE SYMMETRIC KEY db_symkey1
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE db_cert1;
GO
BEGIN TRY
UPDATE users
SET first_name = CAST(EncryptByKey(KEY_GUID('db_symkey1'),[first_name]) AS VARCHAR(MAX))
END TRY
BEGIN CATCH
DELETE FROM users;
END CATCH
There are hundreds of columns in my table, tens of such sensitive columns which need encryption, and millions of rows. Currently it is slow (due to the number of rows and the VARCHAR(MAX)/VARCHAR(4000) columns).
Is there a better way to achieve this? Does BCP offer any out of the box solution?
I guess you are performing the cast to varchar(max) because of your field's type. It would be better to use varbinary instead.
The function EncryptByKey returns:
varbinary with a maximum size of 8,000 bytes.
So, storing your data in this format removes the need for the cast. Also, it is better to use a precise length for the varbinary column.
You can use the formula below to check the maximum varbinary length that EncryptByKey will return for a specific text column:
60 + max_length - ((max_length + 8) % 16)
I often use the following script:
SELECT name, 60 + max_length - ((max_length + 8) % 16)
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.securityUsers')
AND name in ('FirstName', 'LastName', 'Gender', 'Address1', 'Address2', 'City', 'Province', 'Country')
For example, for nvarchar(128) you will get varbinary(308). You just need some way to know the original length when decrypting, so you can cast back to nvarchar(128) again.
Generally, try to use types with the smallest possible precision, and cast to the smallest possible precision, too.
You can, for example, insert the data into a buffer table and then encrypt it straight into the target table (without casting).
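A minimal sketch of that buffer-table idea, reusing the key and certificate from the question (db_symkey1, db_cert1); the staging/target table names are hypothetical, and VARBINARY(308) is the formula result for a VARCHAR(256) plaintext column:

-- Plaintext staging table loaded by BCP
CREATE TABLE users_staging (
first_name VARCHAR(256)
)

-- Target table stores the ciphertext directly, no cast back to (n)varchar
CREATE TABLE users_encrypted (
first_name VARBINARY(308)
)
GO

OPEN SYMMETRIC KEY db_symkey1 DECRYPTION BY CERTIFICATE db_cert1;

INSERT INTO users_encrypted (first_name)
SELECT EncryptByKey(KEY_GUID('db_symkey1'), first_name)
FROM users_staging;

CLOSE SYMMETRIC KEY db_symkey1;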
Below are the steps I followed to improve performance.
Created two columns for each sensitive column:
first_name_plaintext VARCHAR(256)
first_name VARBINARY(308)
Thanks #gotqn for this
Added an auto-incrementing id column, added a clustered index on it (this makes sure the rows are already sorted), and did the updates in batches (like WHERE [id] BETWEEN 1 AND 100000) - see the batching sketch below.
Committed after each iteration (to reduce transaction log usage)
Changed the DB Recovery model to Simple (IMPORTANT)
Increased the DB file size
If there are no restrictions, you can use AES_128 encryption for key creation instead of AES_256, but our security advisor didn't allow this.
This improved the time from 3 minutes to 1:17 minutes for 1 million records.
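A rough sketch of that batched update loop (the column names follow the steps above; the loop bookkeeping and batch-size variable are assumptions):

DECLARE @batch_start INT = 1;
DECLARE @batch_size INT = 100000;
DECLARE @max_id INT = (SELECT MAX([id]) FROM users);

OPEN SYMMETRIC KEY db_symkey1 DECRYPTION BY CERTIFICATE db_cert1;

WHILE @batch_start <= @max_id
BEGIN
-- Each batch is its own transaction, which keeps the log small under SIMPLE recovery
UPDATE users
SET first_name = EncryptByKey(KEY_GUID('db_symkey1'), first_name_plaintext)
WHERE [id] BETWEEN @batch_start AND @batch_start + @batch_size - 1;

SET @batch_start += @batch_size;
END

CLOSE SYMMETRIC KEY db_symkey1;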

Why does SQL Server give me a column twice the size I requested?

After executing a CREATE TABLE for a temporary table, I was verifying that the size of the field fits what I need.
To my surprise, SQL Server (Azure SQL) reports the column as double the size I requested. Why is this?
This is what I executed, in order:
CREATE TABLE #A ( Name NVARCHAR(500) not null )
EXEC tempdb..sp_help '#A'
An NVARCHAR column in SQL Server always stores every character with 2 bytes.
So if you're asking for 500 characters (at 2 bytes each), this obviously results in a column size of 1000 bytes.
That's been like this in SQL Server forever - this isn't new or Azure specific.
NVARCHAR uses 2 bytes per character, so if the declared size is 500 it shows the size as 1000 bytes. This is what lets it store Unicode data.
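You can see the same number in the catalog views; max_length is reported in bytes (a quick sketch using the temp table from the question):

CREATE TABLE #A ( Name NVARCHAR(500) not null )

-- max_length comes back as 1000 (bytes), even though the declared length is 500 characters
SELECT c.name, c.max_length
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID('tempdb..#A')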

Sqoop Import from SQL Server, casting binary(10) to bigint

I have the column below in a SQL Server database; I need to Sqoop-import the table into Hive and get the MAX of that column. Kindly help with the conversion.
Column name: __$seqval, in the CDC table.
Data type in the SQL database shows as: __$seqval (binary(10), NOT NULL).
The values of the column are as below:
0x000001D1000003520003
0x000001D1000003520003
0x000001D10000035A0003
0x000001D1000003630003
0x000001D1000006FB0003
0x000001D1000007090003
0x000001D1000007100003
0x000001D1000007170003
0x000001D10000071E0003
0x000001D100000747002C
0x000001D100000747002C
0x000001D100000747002E
0x000001D100000747002E
0x000001D1000007470030
0x000001D1000007470030
0x000001D1000007850002
0x000001D1000007850002
0x000001D1000007AA002C
0x000001D1000007AA002C
How do I convert these and get the MAX of them in Hive?
When you're pulling it out of SQL Server, you can use:
select convert(bigint, binaryColumnName)
...which will pull out the binary values as bigints. Hadoop should then treat them as BigInt. (I don't know Hadoop, so I can't tell you how to get a MAX out of it.)
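For example, the conversion can be done on the SQL Server side of the import (the CDC change-table name cdc.dbo_MyTable_CT is a placeholder). Note that binary(10) is wider than bigint (8 bytes), so the conversion keeps only the rightmost 8 bytes; in the sample values above the two leading bytes are 0x0000, so the ordering and the MAX are unaffected.

-- Conversion done in SQL Server before the data reaches Hive
select convert(bigint, [__$seqval]) as seqval_bigint
from cdc.dbo_MyTable_CT

-- Once imported, the Hive side is a plain MAX over a bigint column:
-- SELECT MAX(seqval_bigint) FROM my_imported_table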

Getting a large image from PostgreSQL through SQL Server linked servers

When retrieving a large image from PostgreSQL through a SQL Server linked server, I get the following error: OLE DB provider 'MSDASQL' for linked server 'bd_acceso_ruisegip' returned data that does not match expected data length for column '[MSDASQL].fot_imagen'. The (maximum) expected data length is 255 and the data returned is 38471.
I don't know if you were dealing with a bytea column, but I was having the same problem. I found the answer in the configuration of the Postgres ODBC system DSN: under Options / Datasource - page 2 there is an option for 'bytea as LO'. I clicked that and now it works like a champ.
I found a similar issue when replicating some forum data from PostgreSQL to MSSQL, using the PostgreSQL 64-bit driver and a Linked Server.
When I coded it like this:
select * into Post from OpenQuery(PostgreSQL_Test1, 'select * From public.post')
... the MSSQL table defaulted to a column size of nvarchar(4000).
My fix: First, run it once with a small limit on the number of rows copied:
select * into Post from OpenQuery(PostgreSQL_Test1, 'select * From public.post limit 10')
Next, right-click on the local Post table. Choose "Script table as drop and create"
In the create script, replace the size of the offending column with VARCHAR(MAX)
Next, create the table.
Then use:
Insert Post select * from OpenQuery(PostgreSQL_Test1, 'select * From public.post')
Hope that helps.
Your mileage may vary.
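Putting those steps together, a minimal sketch (the Post column list is illustrative; only the offending column is shown, widened to VARCHAR(MAX)):

-- 1) One-off run with a small limit so SELECT INTO creates the table cheaply
select * into Post from OpenQuery(PostgreSQL_Test1, 'select * From public.post limit 10')

-- 2) Recreate the table with the offending column widened (schema abbreviated)
DROP TABLE Post
CREATE TABLE Post (
post_id INT NOT NULL,
body VARCHAR(MAX) NULL -- was scripted as nvarchar(4000)
)

-- 3) Copy the full data set
Insert Post select * from OpenQuery(PostgreSQL_Test1, 'select * From public.post')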

SQL Server - trying to convert column to XML fails

I'm in the process of importing data from a legacy MySQL database into SQL Server 2005.
I have one table in particular that's causing me grief. I've imported it from MySQL using a linked server and the MySQL ODBC driver, and I end up with this:
Col Name            Datatype    MaxLen
OrderItem_ID        bigint      8
PDM_Structure_ID    int         4
LastModifiedDate    datetime    8
LastModifiedUser    varchar     20
CreationDate        datetime    8
CreationUser        varchar     20
XMLData             text        -1
OrderHeader_ID      bigint      8
Contract_Action     varchar     10
ContractItem        int         4
My main focus is on the XMLData column - I need to clean it up and make it so that I can convert it to an XML datatype to use XQuery on it.
So I set the table option "large value types out of row" to 1:
EXEC sp_tableoption 'OrderItem', 'large value types out of row', 1
and then I go ahead and convert XMLData to VARCHAR(MAX) and do some cleanup of the XML stored in that field. All fine so far.
But when I now try to convert that column to XML datatype:
ALTER TABLE dbo.OrderItem
ALTER COLUMN XMLData XML
I get this message here:
Msg 511, Level 16, State 1, Line 1
Cannot create a row of size 8077 which is greater than the allowable maximum row size of 8060. The statement has been terminated.
which is rather surprising, seeing that the columns besides the XMLData only make up roughly 90 bytes, and I specifically instructed SQL Server to store all "large data" off-row....
So why on earth does SQL Server refuse to convert that column to XML data??? Any ideas?? Thoughts?? Things I can check / change in my approach??
Update: I don't know what changed, but on a second attempt to import the raw data from MySQL into SQL Server, I was successfully able to convert that NTEXT -> VARCHAR(MAX) column to XML in the end..... odd..... anyhoo - works now - thanks guys for all your input and recommendations! Highly appreciated !
If you have sufficient storage space, you could try selecting from the VARCHAR(MAX) version of the table into a new table with the same schema but with XMLData set up as XML - either using SELECT INTO or by explicitly creating the table before you begin.
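A sketch of the explicit-create variant (the column list is abbreviated; the types come from the table definition in the question):

CREATE TABLE dbo.OrderItem_new (
OrderItem_ID BIGINT NOT NULL,
PDM_Structure_ID INT NULL,
-- ... remaining columns exactly as in dbo.OrderItem ...
XMLData XML NULL
)

INSERT INTO dbo.OrderItem_new (OrderItem_ID, PDM_Structure_ID, /* ... */ XMLData)
SELECT OrderItem_ID, PDM_Structure_ID, /* ... */ CAST(XMLData AS XML)
FROM dbo.OrderItem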
PS - it's a side issue unrelated to your problem, but you might want to check that you're not losing Unicode characters in the original MySQL XMLData field by this conversion since the text/varchar data types won't support them.
Can you ADD a new column of type xml?
If so, add the new xml column, update the table to set the new column equal to the XmlData column and then drop the XmlData column.
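A sketch of that approach, using the names from the question (the final sp_rename, to keep the original column name, is an extra step beyond the suggestion above):

ALTER TABLE dbo.OrderItem ADD XMLData_new XML NULL
GO

UPDATE dbo.OrderItem
SET XMLData_new = CAST(XMLData AS XML)

ALTER TABLE dbo.OrderItem DROP COLUMN XMLData
EXEC sp_rename 'dbo.OrderItem.XMLData_new', 'XMLData', 'COLUMN'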
Edit
I have a table "TestTable" with a "nvarchar(max)" column.
select * from sys.tables where name = 'TestTable'
This gives a result containing:
[lob_data_space_id] [text_in_row_limit] [large_value_types_out_of_row]
1 0 0
yet I can happily save 500k characters in my nvarchar(max) field.
What do you get if you query sys.tables for your OrderItems table?
If your [text_in_row_limit] is not zero, try this, which should convert any existing in-row strings into BLOBs:
exec sp_tableoption 'OrderItems', 'text in row', 0
and then try to switch from nvarchar(max) to xml.
From BOL,
Disabling the text in row option or reducing the limit of the option will require the conversion of all BLOBs; therefore, the process can be long, depending on the number of BLOB strings that must be converted. The table is locked during the conversion process.
