Make Kingswaysoft truncate input data that is too long - sql-server

I have an SSIS project that I'm using to automate pulling CRM data into a SQL Server Database using Kingswaysoft. These SSIS packages are autogenerated, so my solution to this issue needs to be compatible with that.
The description field on Contact in CRM is a nvarchar(2000), but this CRM org still has old data, and some of those old contact records have a description longer than 2000 characters. When I try to pull those using Kingsway, I get this error:
Error: 0xC002F304 at Stage Data for contact, Export contact Data [2]: An error occurred with the following error message: "The input value for 'description' field (or one of its related fields) does not fit into the output buffer, please consider increasing the output column's Length property or changing its data type to one that can accommodate more data such as ntext (DT_NTEXT). This change can be done using the component's Advanced Editor window.".
This makes sense, since I'm pulling a column longer than specified in the metadata, but the problem is that I want to ignore this error, truncate the column, and continue the data load. Obviously I could set the column to DT_NTEXT and not worry about it, but since these packages are autogenerated I have no way of knowing beforehand which columns have old data and which don't, so I won't know which should be DT_NTEXT.
So is there a way to make Kingswaysoft truncate input data which is longer than what's specified in the metadata?

Thank you for choosing KingswaySoft as your integration solution. For this situation, unfortunately there is no way to make that work without making those changes in the component’s Advanced Editor.
If the source component just simply ignores the error and truncates the value, you will lose some of your data and thus affect the data integrity during the integration. Therefore, you may need to change the data type to DT_NTEXT or increase the length of this field in order to handle this situation properly. Alternatively, you can try to change the field length on your CRM side so that the SSIS package can be generated correctly.

Related

Specifying flat file data types vs using data conversion

This may be a stupid question but I must ask since I see it a lot... I have inherited quite a few packages in which developers will use the the Data Conversion transformation shape when dumping flat files into their respective sql server tables. This is pretty straight forward however I always wonder why wouldn't the developer just specify the correct data types within the flat file connection and then do a straight load into the the table?
For example:
Typically I will see flat file connections with columns that are DT_STR and then converted into the correct type within the package ie: DT_STR of length 50 to DT_I4. However, if the staging table and the flat file are based on the same schema - why wouldn't you just specify the correct types (DT_I4) in the flat file connection? Is there any added benefit (performance, error handling) for using the data conversion task that I am not aware of?
This is a good question with not one right answer. Here is the strategy that I use:
If the data source is unreliable
i.e. sometimes int or date values are strings, like when you have the literal word 'null' instead of the value being blank. I would let the data source be treated as strings and deal with converting the data downstream.
This could mean just staging the data in a table and using the database to do conversions and loading from there. This pattern avoid the source component throwing errors which is always tricky to troubleshoot. Also, it avoids having to add error handling into data conversion components.
Instead, if the database throws a conversion error, you can easily look at the data in your staging table to examine the problem. Lastly, SQL is much more forgiving with date conversions than ssis.
If the data source is reliable
If the dates and numbers are always dates and numbers, I would define the datatypes in the connection manager. This makes it clear what you are expecting from the file and makes the package easier to maintain with fewer components.
Additionally, if you go to the advanced properties of the flatfile source, integers and dates can be set to fast parse which will speed up the read time: https://msdn.microsoft.com/en-us/library/8893ea9d-634c-4309-b52c-6337222dcb39?f=255&MSPPError=-2147217396
When I use data conversion
I rarely use the data conversion component. But one case I find it useful is for converting from / to unicode. This could be necessary when reading from an ado.net source which always treats the input as unicode, for example.
You could change the output data type in the flat file connection manager in Advanced page or right click the source in Data flow, Advanced editor to change the data type before loading it.
I think one of the benefit is the conversion transformation could allow you output the extra column, usually named copy of .., which in some case, you might use both of the two columns. Also, sometimes when you load the data from Excel source, all coming with Unicode, you need to use Data conversion to do the data TF, etc.
Also, just FYI, you could also use Derived Column TF to convert the data type.
UPDATE [Need to be further confirmed]:
From the flat file source connection manager, the maximum length of string type is 255, while in the Data Conversion it could be set over 255.

"Conversion failed because the data value overflowed the specified type" error applies to only one column of the same table

I am trying to import data from database access file into SQL server. To do that, I have created SSIS package through SQL Server Import/Export wizard. All tables have passed validation when I execute package through execute package utility with "validate without execution" option checked. However, during the execution I received the following chunk of errors (using a picture, since blockquote uses a lot of space):
Upon the investigation, I found exactly the table and the column, which was causing the problem. However, this is problem I have been trying to solve for a couple days now, and I'm running dry on possible options.
Structure of the troubled table column
As noted from the error list, the trouble occurs in RHF Repairs table on the Date Returned column. In Access, the column in question is Date/Time type. Inside the actual table, all inputs are in a form of 'mmddyy', which when clicked upon, turn into 'mm/dd/yyyy' format:
In SSIS package, it created OLEDB Source/Destination relationship like following:
Inside this relationship, in both output columns and external columns data type is DT_DATE (I still think it is a key cause of my problems). What bugs me the most is that the adjacent to Date Returned column is exactly the same as what I described above, and none of the errors applied to it or any other columns of the same type, Date Returned is literally the only black sheep in the flock.
What have I tried
I have tried every option from the following thread, the error remains the same.
I tried Data conversion option, trying to convert this column into datestamp or even unicode string. It didn't work.
I tried to specify data type with the advanced source editor to both datestamp/unicode string. I tried specifying it only in output columns, tried in both external and output columns, same result.
Plowing through the data in access table also did not give me anything. All of them use the same 6-char formatting through it all.
At this point, I literally exhausted all options I could think of. Can you please point me in the right direction on what else I could possibly try to resolve it, since it drives me nuts for last two days.
PS: On my end, I will plow through each row individually, while not trying to get discouraged by the fact that there are 4000+ row entries...
UPDATE:
I resolved this matter by plowing through data. There were 3 faulty entries among 4000+ rows... Since the issue was resolved in a manner unlikely to help others, please close that question.
It sounds to me like you have one or more bad dates in the column. With 4,000 rows, I actually would visually scan and look for something very short or very long.
You could change your source to selecting top 1 instead of all 4,000. Do those insert? If so, that would lend weight to the bad date scenario. If 1 row does not flow through, it is another issue.
(I will just share my experience, how I overcame this problem, in case it helps someone)
My scenario:
One of the column Identifier in the ole db data source has changed from int to bigint. I was getting the error message - Conversion failed because the data value overflowed the specified type.
Basically, it was telling me the source data size was greater than the destination data size.
What I have tried:
In the ole db data source and destination both places, I clicked "show advanced editior", checkd the data type Identifier was bigint. But still, I was getting the error message
The solution worked for me:
In the ole db data source--> show advanced edition option--> Input and Output Properties--> OLE DB Source Output--> there are two options - External columns & Output columns.
In my case, though the Identifier column in the External columns was showing the data type bigint, but in the Output columns was showing the data type int. So, I changed the data type to bigint and it has solved my problem.
Now and then I get this problem, specially when I have a big table with lots of data.
I hope it helps.
We had this error when someone had entered the year as 216 instead of 2016. The data source was reading the data ok but it was failing on the OLEDB destination task.
We use a script task in the data flow for validation. By adding a check that dates aren't too far in the past we are able to trap this kind of error and at least generate a meaningful error message to find and correct the problem quickly.

Storing Serialized Information In SQL Server using F#

I am currently working on a project in F# that takes in data from Excel spreadsheets, determines if it is compatible with an existing table in SQL Server, and then adds the relevant rows to the existing table.
Some of the data I am working with is more specific than the types provided by T-SQL. That is, T-SQL has a type "date", but I need to distinguish between sets of dates that are at the beginning of each month or the end of each month. This same logic applies to many other types as well. If I have types:
Date(Beginning)
Date(End)
they will both be converted to the T-SQL type "date" before being added to the table, therefore erasing some of the more specific information.
In order to solve this problem, I am keeping a log of the serialized types in F#, along with which column number in the SQL Server table they apply to. My question is: is there any way to store this log somewhere internally in SQL Server so that I can access it and compare the serialized types of the incoming data to the serialized types of the data that already exists in the table before making new inserts?
Keeping metadata outside of the DB and maintaining them manually makes your DB "expensive" to manage plus increases the risk of errors that you might not even detect until something bad happens.
If you have control over the table schema, there are at least a couple of simple options. For example, you can add a column that stores the type info. For something simple with just a couple of possible values as you described, just add a new column to store the actual type value. Update the F# code to de-serialize the source into separate DATE and type (BEGINNING/END) values which are then inserted to the table. Simple, easy to maintain and easily consumed.
You could also create a user defined type for each date subtype but that can be confusing to another DBA/dev plus makes it more complicated when retrieving data from your application. This is generally not a good approach.
Yes, you can do that if you want to.

Unable to change an SSIS Excel Destination Column Data Type

I have an SSIS package that is importing data from SQL Server and placing it into an Excel destination file. When going into the Advanced Editor of the ADO Source component, I have a field Description that has an External Data Type of Unicode String, length 4000, and an Output Data Type of Unicode Text Stream (This is to ensure a String length > 255 can be imported into Excel). Now, when I go into the Advanced Editor of the Excel Destination component the Data Type is stuck as Unicode String, length 4000. It allows me to change it, but reverts back immediately after I click save. Running the package results in a failure since I have data in the Description field with a length > 255. I have searched countless threads regarding this issue such as this but haven't come to a solution. Any help would be greatly appreciated.
This might be very simple: after you make any change related to the Source component, I find I have to double-click the green arrow -- showing that metadata does more than just show it -- it updates that metadata, too, based on the source component. Only after that will the Destination component be able to "see" the changes to the Source component.
But if that isn't enough: when making these kind of changes, before I could get them to take effect, I've often had to (1) delete the destination component, (2) delete the destination connection object in SSIS, and (3) delete/rename/move the actual Excel spreadsheet, and then generate a new one by clicking the button (in the Destination component), that generates a new destination file from the metadata.
I've had this issue with the Union All component before, and they only way I've managed to fix it without deleting and re-creating the component was to open/edit it, set the incriminating fields to "ignore" on the inputs, press OK and then go back in and set the inputs back to the original fields and press OK.
This seemed to do the trick. Maybe a similar approach for other components may work.

SSIS getting wrong column type with OLEDB connector

Halfway through a SSIS project certain table fields changed from char(30) to nvarchar(30)
However, when running the SSIS packages, an error stating cannot convert from unicode to non-unicode appears.
I am trying to transfer data directly from a database source to its destination.
Both connections use the same database schema, so there should be no conversion.
When checking the external column data type it shows D_STR, which is not the case anymore.
I tried deleting both source and destination in hope that it would clean any sort of cached data, but it did not work.
Any ideas?
Sounds to me like the metadata in your data flow task is cached and needs to be refreshed to reflect the new type.
Open the source, go to columns, and uncheck the column, then check the column. Click ok. The metadata should refresh now.
nvarchar and nchar are unicode. Conversely, varchar and char are non-unicode.
http://msdn.microsoft.com/en-us/library/ms187752.aspx
As a result if you are moving data from one data type to another you will have to perform some additional transformation (CAST or CONVERT). The other option is to look at your adapters such that char will use SSIS DataType of DT-STR and nvarchar will use SSIS DataType DT-WSTR
http://msdn.microsoft.com/en-us/library/ms141036.aspx
Without knowing how your packages work I cannot be much more specific but hopefully this will get you going.

Resources