Using schema files with decimals for DataStage sequential file import - database

I have a series of CSVs that I import into a database via DataStage. I am attempting to do this using RCP and schema files.
I generate the schema files from the CSVs using an accompanying master table list that comes with the CSVs.
I am down to one problem: null handling. Whenever a numeric column is the last column in a particular table, it is also the last entry in the schema file. The CSV is comma-delimited, double-quoted for strings, with no data for null.
The master list identifies some of these number columns as number(), which is indicative of an Oracle description of the output. To that end, I am trying this:
:nullable decimal[38,9] { default=0, text };
In this example, the precision and scale default to 38,9 unless specified elsewhere, such as decimal[10,2].
A null entry results in this error:
When validating import/export function: APT_GFIX_Decimal::validateParameters: the decimal "text" format is variable length, and no external length is specified;
you should possibly specify an appropriate "width" property; external format: {text, padchar=32, nofix_zero, precision=38, scale=9, round=trunc_zero, ascii}. [decimal/impexp.C:939]
So I tried:
:nullable decimal[38,9] { default=0, text, width=47 };
In this example, the precision and scale again default to 38,9, and the width is the sum of the two values (38 + 9 = 47), unless specified elsewhere, such as decimal[10,2].
And I got:
ODBC_Connector_3,0: Input buffer overrun at field "", at offset: ### [impexp/group_comp.C:6006]
Lastly, I tried exactly what the error message said, and did this:
:nullable decimal[38,9] { default=0, text, padchar=32, nofix_zero, precision=, scale=, round=trunc_zero, ascii, width=47 };
As before, the precision and scale default to 38,9 and the width to 47 unless specified elsewhere.
This third time, I received the same error: Input buffer overrun at field "", at offset: ### [impexp/group_comp.C:6006]
Has anyone run into this? It only happens if a decimal is the last column in the table.
My record settings are: {intact, final_delim=none, record_delim='\n', charset='UTF8', delim=','}
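Putting it together, the tail of a generated schema file looks something like this (the record properties and final decimal line are as above; the column names are made up for illustration):
record {intact, final_delim=none, record_delim='\n', charset='UTF8', delim=','} (
  CUSTOMER_NAME:nullable string[max=50];
  BALANCE:nullable decimal[38,9] {default=0, text};
)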
Thank you very much.

I had the same issue. I tried the solutions mentioned in the question and the other answers, but they didn't work. It turned out my target column was decimal(14,10), i.e. 4 digits before the decimal point and 10 digits after. I was getting null values in the target even though I had actual data at the source; the issue was that the source had more than 4 digits before the decimal point. I modified the target and source columns to decimal(16,10). On top of this, as mentioned in the question, you shouldn't put decimal columns at the end when using schema files, so I put a string column at the end at the source. Combining both of these, voila! My data loaded properly into the target.
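As a rough, untested sketch of the combined fix in schema-file form (column names are hypothetical), the record would end with the widened decimal followed by a trailing string column:
record {final_delim=none, record_delim='\n', delim=','} (
  BALANCE:nullable decimal[16,10] {default=0, text};
  STATUS:nullable string[max=10];
)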

Related

Handling truncation error in derived column in data flow task

I have a data flow task which contains a derived column. The derived column transforms a CSV file column, let's say A, which is an order number, to data type char with length 10.
This works perfectly fine when the text file column is equal to or less than 10 characters. Of course, it throws an error when the order number in column A is more than 10 characters.
Sample values from column A (error-prone):
12PR567890
254W895X98
ABC 56987K5239
485P971259 SPTGER
459745WERT
I would like to catch the error-prone records and extract the order number only.
I can already configure the error output from the derived column, but this just ignores the error records and processes the others.
The expected output should process the ABC 56987K5239 and 485P971259 SPTGER order numbers as 56987K5239 and 485P971259 respectively. Removing the unexpected characters is not the important part; the question is how to achieve this during the run time of the derived column (stripping and processing the data in case of error).
If the valid order number always starts with a digit and its length is exactly 10, you could use a Script Component (Transformation) together with a regular expression to transform the source data.
Drag and drop the Script Component as a Transformation
Connect the source to the Script Component
In the Script Component Edit window, check the Order column under Input Columns and set its Usage Type to ReadWrite
In the script, add: using System.Text.RegularExpressions;
The code goes in the Input0_ProcessInputRow method:
string pattern = "[0-9].{9}";
Row.Order = Regex.Match(Row.Order, pattern).Value; // the full match: a digit plus the next 9 characters
The output going to the destination should be the matched 10 characters starting with a digit.
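For reference, here is a sketch of the complete method as it might look inside the Script Component (it assumes the input column is named Order and is marked ReadWrite; rows with no match pass through unchanged):
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // A digit followed by any 9 characters = the 10-character order number
    string pattern = "[0-9].{9}";
    Match m = Regex.Match(Row.Order, pattern);
    if (m.Success)
        Row.Order = m.Value; // keep only the matched 10 characters
}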

Data type cannot be converted in MATLAB or Excel

I recently copied and pasted some data from a database (USGS stream gauge data, to be specific) into Excel, creating a column of my own for time.
When I import the data into MATLAB, only the column I made shows up.
obs = xlsread('ObservedMR.xlsx','jun30For');
I tried to convert the values in Excel to numbers, but to no avail. In Excel the numbers are left-justified (which I know means they are not being registered as numbers), but there are no other characters visible.
When I create an empty matrix and try to copy and paste the data in, I get an error saying that I cannot paste data that contains strings.
Using the following,
p = readtable('ObservedMR.xlsx','Sheet','jun30For')
I get
p =
x0 x0_31
___ _________
1 '0.31  '
2 '0.31  '
3 '0.31  '
4 '0.31  '
5 '0.31  '
6 '0.31  '
I got error messages trying to use str2num (requires string or character input) and table2array (types double and cell).
I was going to try to use
regexprep(p, ''' , '')
to replace the quotes, but I am getting messages about the single quotes being unclosed.
Does anyone know how I can use this data, whether by writing code to edit out the quotes and spaces, importing it another way, converting it somehow, etc.?
Thank you!
You can specify the format of the columns in readtable.
p = readtable('ObservedMR.xlsx','Sheet','jun30For', 'Format', '%f%f')
will read the columns as floats. If it doesn't do the conversion correctly, try reading them as strings with %s and then using str2num once you have them in MATLAB.
Anyway, I would suggest correcting your data in Excel. If you click on the cell and look at the formula bar, you will probably see a quote ' to the left of the number, which indicates the cell is stored as text. Convert it to a number, save the Excel file, and done.
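If you'd rather fix it on the MATLAB side instead, here is a small sketch (the column name x0_31 is taken from the readtable printout above; note that str2double, unlike str2num, accepts a cell array directly and returns NaN for anything it cannot parse):
p = readtable('ObservedMR.xlsx','Sheet','jun30For');
% x0_31 arrives as a cell array of char like '0.31  ';
% str2double ignores the padding and converts each cell to a double
p.x0_31 = str2double(p.x0_31);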

SSIS - How to convert real values for Oracle?

I'm facing a problem with a package that imports data from a MySQL table into an Oracle table and an MS SQL Server table. It works well from MySQL to SQL Server; however, I get an error when I want to import to Oracle.
The table I want to import contains an attribute (unitPrice) of data type DT_R8.
The destination data type for Oracle is DT_NUMERIC.
I added a conversion step to convert the unitPrice data from DT_R8 to DT_NUMERIC.
It doesn't work; I get the following error.
I found the detail of the error:
An ORA-01722 ("invalid number") error occurs when an attempt is made to convert a character string into a number, and the string cannot be converted into a valid number. Valid numbers contain the digits '0' through '9', with possibly one decimal point, a sign (+ or -) at the beginning or end of the string, or an 'E' or 'e' (if it is a floating point number in scientific notation). All other characters are forbidden.
However, I don't know how to fix.
EDIT: I added a component to redirect rows/errors to an Excel file.
Browsing the 3000 rows recorded, it seems the process accepts only integer values, not reals. So if the price is equal to 10 it's OK, but if it's 10,5 it fails.
Any idea how to solve this issue?
Your NLS environment does not match the expected one. By default, Oracle assumes that "," is the grouping character and "." is the decimal separator. Make sure that your session uses the correct value for the NLS_NUMERIC_CHARACTERS parameter.
See Setting Up a Globalization Support Environment in the Oracle documentation.
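For example, you can inspect the session value and, if needed, set "." as the decimal separator and "," as the group separator:
SELECT value FROM nls_session_parameters WHERE parameter = 'NLS_NUMERIC_CHARACTERS';
ALTER SESSION SET NLS_NUMERIC_CHARACTERS = '.,';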

TClientDataset Widestring field doubles in size after reading NVARCHAR from database

I'm converting one of our Delphi 7 projects to Delphi XE3 because we want to support Unicode. We're using MS SQL Server 2008/R2 as our database server. After changing some database fields from VARCHAR to NVARCHAR (and the corresponding fields in the accompanying ClientDatasets to ftWideString), random crashes started to occur. While debugging I noticed some unexpected behaviour in TClientDataset/DBExpress:
For an NVARCHAR(10) database column I manually create a TWideStringField in a ClientDataset and set the 'Size' property to 10. The 'DataSize' property of the field tells me 22 bytes are needed, which is expected since TWideStringField's encoding is UTF-16, so it needs two bytes per character plus some space for storing the length. Now when I call 'CreateDataset' on the ClientDataset and write the dataset to XML (using .SaveToFile), in the XML file the field is defined as
<FIELD WIDTH="20" fieldtype="string.uni" attrname="TEST"/>
which looks ok to me.
Now, instead of calling .CreateDataset, I call .Open on the TClientDataset so that it gets its data through the linked components -> TDatasetProvider -> TSQLDataset (.CommandText = a simple select * from table) -> TSQLConnection. When I inspect the properties of the field in my watch list, Size is still 10 and DataSize still 22. After saving to an XML file, however, the field is defined as
<FIELD WIDTH="40" fieldtype="string.uni" attrname="TEST"/>
...the width has doubled?
Finally, if I call .Open on the TClientDataset without creating any field definitions in advance at all, the Size of the field afterwards is 20 (incorrect!) and DataSize is 42. After saving to XML, the field is still defined as
<FIELD WIDTH="40" fieldtype="string.uni" attrname="TEST"/>
Does anyone have any idea what is going wrong here?
Check the field type and its size at the SQLCommand component (which comes before the DatasetProvider).
Size doubling may be the result of two implicit "conversions": first, the server provides NVarchar data which is stored into an ANSI string field (and every byte becomes a separate character); second, that is stored into the ClientDataset's field of type WideString and each character becomes 2 bytes (the size doubles).
Note that in prior versions of Delphi a string field size mismatch between the ClientDataset's field and the corresponding Query/Command field did not result in an exception, but starting from one of the XE versions it often results in an AV. So you have to carefully check string field sizes during migration.
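As a quick runtime check, a sketch like this (component names are hypothetical; TEST is the attrname from the question) makes any size mismatch visible:
// Compare the size DBExpress reports with what the ClientDataset stores
ShowMessage(Format('SQLDataSet: %d, ClientDataSet: %d',
  [SQLDataSet1.FieldByName('TEST').Size,
   ClientDataSet1.FieldByName('TEST').Size]));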
Sounds like the column datatype change has created unexpected issues for you. My suggestion (sketched in SQL below) is to:
1. Back up the table (there are multiple ways of doing this; pick your poison, figuratively speaking).
2. Delete the table.
3. Recreate the table.
4. Import the data from the old table into the newly created table, and see if that helps.
SQL tables DO NOT like it when column datatypes get changed, and unexpected issues may arise from doing just that. So try it; worst case scenario, you have wasted maybe ten minutes of your time trying a possible solution.
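A sketch of those four steps in T-SQL (table and column definitions are hypothetical):
-- 1. Back up the table
SELECT * INTO MyTable_backup FROM MyTable;
-- 2. Delete the table
DROP TABLE MyTable;
-- 3. Recreate the table with the new column types
CREATE TABLE MyTable (TEST NVARCHAR(10));
-- 4. Import the data from the backup into the new table
INSERT INTO MyTable SELECT * FROM MyTable_backup;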

Why does a number imported to SQL Server from Excel contain the letter e?

I have an Excel sheet which inserts data into SQL Server, but I noticed that for a particular field the data is being inserted with an e. This particular field is of type varchar and size 20.
Why is e being inserted when the actual data for these respective fields is 54607677038, 77200818179 and 9920996?
Help me out
Thanks in anticipation.
You may think of '2007038971' as being just a string of numbers (some kind of article code, I guess). Excel just sees digits and treats it as a numerical value. It is probably right-aligned (the default for numbers) and not left-aligned (the default for strings).
When asked to store it as a string, Excel 'helpfully' formats that number into a string, thereby introducing that "e" notation (the value 2007038971 is about 2.00704 * 10^9).
You need to convince Excel that the code really is a string, for example by adding a quote in front of it (entering '54607677038 instead of 54607677038).
How about this: when you read the value from Excel, convert it with ToString() and then insert it into the DB. Change the relevant data type based on the data in your Excel file.
double doub = 2.00704e+009;
string val = doub.ToString(); // yields "2007040000"
