SSIS Cannot map the lookup column. NVARCHAR(MAX) error - sql-server

I am writing an ETL. I have created a view in my source database; the view is a join of two tables. Now I need to fetch data from the view, but two columns in the view have the nvarchar(max) data type.
When I perform a Lookup operation in the Data Flow Task (DFT), I get this error:
Cannot map the lookup column, 'Description', because the column data type is a binary large object block (BLOB).
I have seen the following links:
SSIS Lookup By NVARCHAR(MAX) Column
SSIS Lookup with Derived Columns
Note that the Description column may contain a large amount of text.
An image is attached for reference. Thank you!

What you are doing is a lookup, and the Lookup transformation supports join columns of any data type except DT_R4, DT_R8, DT_TEXT, DT_NTEXT, and DT_IMAGE (i.e. BLOBs).
Personally, I try to avoid handling BLOBs in SSIS as much as possible. Convert the BLOB to an nvarchar with an explicit maximum length and treat it as such, and you should be fine.
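For example, you could expose a cast copy of the column in the source query so the Lookup never sees a BLOB. A minimal sketch; dbo.MyView is a hypothetical name, the 450-character limit is an arbitrary assumption, and matching on a truncated prefix only works if the prefix is unique enough for your data:

-- Hypothetical source query: cast the nvarchar(max) column down to a
-- fixed-length nvarchar so the Lookup transformation can join on it
SELECT Id,
       CAST(Description AS NVARCHAR(450)) AS DescriptionKey
FROM dbo.MyView;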

You might also get this problem if the column you are comparing against in the lookup table has been assigned different constraints.
For example, if custid in the source table allows NULL but custid in the target table does not allow NULL, you might get this error.

Related

Snowflake change data type of column in table, varchar -> Date

How do I change a column's data type from varchar to DATE in a table?
I tried the following:
ALTER TABLE table.name.here MODIFY COLUMN insert_dt DATE;
I just get
SQL compilation error: cannot change column INSERT_DT from type VARCHAR(16777216) to DATE
Unfortunately this sort of data type change is not allowed; generally your best option is to (see the sketch after this list):
Add a new column with a temp name, with the new data type
Run an update statement to set the new column to the old column's value (with any required transformations)
Rename the columns, and drop the old column if desired.
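In Snowflake SQL the steps might look like this; my_table is a hypothetical name, and TRY_TO_DATE assumes the varchar values are parseable dates (pass a format argument if they are not):

ALTER TABLE my_table ADD COLUMN insert_dt_new DATE;
UPDATE my_table SET insert_dt_new = TRY_TO_DATE(insert_dt); -- NULL when unparseable
ALTER TABLE my_table RENAME COLUMN insert_dt TO insert_dt_old;
ALTER TABLE my_table RENAME COLUMN insert_dt_new TO insert_dt;
ALTER TABLE my_table DROP COLUMN insert_dt_old; -- optional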
It is also sometimes easiest to do this change in a clone or CTAS table and then do an ALTER TABLE SWAP WITH.
Note that a full table update like this does mean recreating micro-partitions, which is generally OK (if a little slow), but you may want to keep an eye on whether this affects your clustering. This is easier to control in a CTAS approach, because you can explicitly maintain the ordering with an ORDER BY clause.
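The CTAS variant could look like the following sketch; col1, col2, and cluster_key are hypothetical placeholders for your real columns:

CREATE OR REPLACE TABLE my_table_new AS
SELECT col1, col2,                           -- the other columns, listed explicitly
       TRY_TO_DATE(insert_dt) AS insert_dt   -- the converted column
FROM my_table
ORDER BY cluster_key;                        -- explicitly maintain the ordering
ALTER TABLE my_table SWAP WITH my_table_new; -- instant metadata swap
DROP TABLE my_table_new;                     -- after the swap this holds the old data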

Why can't columnar databases like Snowflake and Redshift change the column order?

I have been working with Redshift and am now testing Snowflake. Both are columnar databases. Everything I have read about this type of database says that they store the information by column rather than by row, which helps with massive parallel processing (MPP).
But I have also seen that they are not able to change the order of columns or add a column in between existing columns (I don't know about other columnar databases). The only way to add a new column is to append it at the end. If you want to change the order, you need to recreate the table with the new order, drop the old one, and rename the new one (this is called a deep copy). But this sometimes isn't possible because of dependencies or even memory utilization.
I'm more surprised by the fact that this can be done in row databases but not in columnar ones. Of course, there must be a reason why it's not a feature yet, but I clearly don't have enough information about it. I thought it was going to be just a matter of changing the ordinals of the columns in information_schema, but clearly it is not that simple.
Does anyone know the reason for this?
Generally, column ordering within the table is not considered to be a first-class attribute. Columns can be retrieved in whatever order you require by listing their names in that order.
Emphasis on column order within a table suggests frequent use of SELECT *. I'd strongly recommend not using SELECT * in columnar databases without an explicit LIMIT clause, to minimize the impact.
If the column order must be changed, you do that in Redshift by creating a new, empty version of the table with the columns in the desired order and then using ALTER TABLE APPEND to move the data into the new table very quickly, as sketched below.
https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE_APPEND.html
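Sketched out with hypothetical table and column names, the sequence is (note that ALTER TABLE APPEND cannot run inside a transaction block):

CREATE TABLE my_table_new (colA INT, colB VARCHAR(10), colC DATE); -- desired order
ALTER TABLE my_table_new APPEND FROM my_table; -- moves (not copies) rows, matched by column name
DROP TABLE my_table;
ALTER TABLE my_table_new RENAME TO my_table;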
The order in which the columns are stored internally cannot be changed without dropping and recreating them.
Your SQL can retrieve the columns in any order you want.
The general requirement to have columns listed in some particular order is usually for viewing purposes.
You could define a view with the columns in the desired order and use the view in the required operation.
CREATE OR REPLACE TABLE CO_TEST (B NUMBER, A NUMBER); -- stored order: B, A
INSERT INTO CO_TEST VALUES (1,2),(3,4),(5,6);
SELECT * FROM CO_TEST;    -- returns B, A (the stored order)
SELECT A, B FROM CO_TEST; -- returns A, B (the order you list)
CREATE OR REPLACE VIEW CO_VIEW AS SELECT A, B FROM CO_TEST;
SELECT * FROM CO_VIEW;    -- returns A, B, without touching the table
Creating a view to list the columns in the required order does not disturb the actual table underneath the view, and the resources associated with recreating the table are not wasted.
In some databases (Oracle especially) the ordering of columns in a table can make a difference to performance, because NULLable columns are best stored at the end of the list. This has to do with how storage is being utilized within the data block.

Is it possible to select all columns as nvarchar or varchar without explicit casting in SQL Server 2008?

Basically, what I want is this: when I fill a DataSet by executing some query, every column must be a string, irrespective of its type in the database tables. Is that possible?
I want to know whether any short SQL syntax exists, instead of casting every column, for when you have a huge number of columns or even a dynamic number of columns.
No, this isn't possible. If you select data from a table, the data has the data type of the column as defined.
You can create a table with only nvarchar fields instead.
No. The database assumes that you defined your columns with a given type because you wanted them to be that type. That's how the data was given to it to store, so that's how it's going to return it. If you don't want them to be that type, the database requires you to explicitly state that.
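There is no shorthand syntax, but as a workaround you can generate the casts with dynamic SQL. A minimal sketch, assuming a hypothetical dbo.MyTable whose columns are all explicitly convertible to nvarchar (types such as xml or sql_variant may need special handling):

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build "CAST([col] AS NVARCHAR(MAX)) AS [col]" for every column of the table
SELECT @cols = STUFF((
    SELECT ', CAST(' + QUOTENAME(name) + ' AS NVARCHAR(MAX)) AS ' + QUOTENAME(name)
    FROM sys.columns
    WHERE object_id = OBJECT_ID(N'dbo.MyTable')
    ORDER BY column_id
    FOR XML PATH('')), 1, 2, '');

SET @sql = N'SELECT ' + @cols + N' FROM dbo.MyTable;';
EXEC sp_executesql @sql;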

Sybase - Change a table column datatype on an IDENTITY column which is a User Defined Datatype

I'm pretty good around Oracle but I've been struggling to find a decent solution to a problem I'm having with Sybase.
I have a table with an IDENTITY column that is also a User Defined Datatype (UDD) "id", defined as numeric(10,0). I've decided to replace the UDD with the native datatype, but I get an error when I do this.
I've found that the only way to do this is:
Rename the original table (table_a to table_a_backup) using the procedure sp_rename
Recreate the original table (table_a) but use native data types
Copy the contents of the backup table to the original (i.e. insert into table_a select * from table_a_backup)
This works; however, I have over 10M records and the copy eventually runs out of log segment and halts (I can't increase the segment any more due to physical constraints).
Does anybody have a solution, preferably one that does not involve processing the records as anything other than one large set?
Cheers,
JLove
conceptually, something like this works (in Sybase ASE 12.5.x) ...
do an "alter table drop column" on your current ID column
do "alter table add column" stmt to add new column (w/ native datatype) with IDENTITY attribute
Note that the ID field might not have the same numbers, so be very wary of doing the above if the ID field is used as an explicit or implicit key to other tables.

Storing varchar(max) & varbinary(max) together - Problem?

I have an app that will have entries of both the varchar(max) and varbinary(max) data types. I was considering putting both in a separate table, together, even though only one of the two will be used at any given time.
The question is whether storing them together has any impact on performance. Considering that they are stored in the heap, I'm thinking that having them together will not be a problem. However, the varchar(max) column will probably have the text in row table option set.
I couldn't find any performance testing or profiling while "googling bing," probably too specific a question?
The SQL Server 2008 table looks like this:
Id
ParentId
Version
VersionDate
StringContent - varchar(max)
BinaryContent - varbinary(max)
The app will decide which of the two columns to select when the data is queried. The string column will be used much more frequently than the binary column - will this have any impact on performance?
See this earlier answer and this answer.
The text in row option is deprecated and applies to the text, ntext, and image data types.
varchar(max) by default stores the text in the row up to the limit and then outside the row above the limit, unless the large value types out of row option is set, in which case it's always stored out of row - which would now mean you're storing the data out of the table, and then out of that table, too ;-).
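That option is set per table with sp_tableoption; dbo.ContentTable is a hypothetical name:

-- Force varchar(max)/varbinary(max) values to always be stored out of row
EXEC sp_tableoption N'dbo.ContentTable', 'large value types out of row', 1;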
If you are already storing them in this separate table, that might not strictly be necessary unless you need the one-to-many relationship that your other columns suggest, since the size of the existing data in your parent row may force these elements out of row anyway. With the data logically in a separate table, you do get more options, however.
