I'm currently using LDAP to grab User information within an SSIS package. I'm using an ADO.NET Datasource which is running this query:
SELECT sAMAccountName, cn, givenName, sn FROM 'LDAP://domainController' where objectClass='User'
The problem is that I would like to prepend the domain to the sAMAccountName, since it is not included, e.g. "DOMAIN\sAMAccountName". However, I cannot figure out how to convert this data from Unicode Text Stream (DT_NTEXT) to String (DT_STR). Converting this column to String would let me append the prefix fairly easily using the Derived Column tool.
Is there a simple way of converting a DT_NTEXT to a DT_STR within my SSIS package?
Active Directory SSIS Data Source
I chain two Data Conversion tasks together, "NTEXT -> TEXT" and then "TEXT -> STR", as you can see in the second screenshot. I don't have access to that package at the moment, but something like 128 characters should be sufficiently wide.
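If you'd rather avoid chaining two Data Conversion components, a single Derived Column expression can also do the cast and the prefixing in one go. This is a sketch, not tested against that package: "MYDOMAIN" is a placeholder, the lengths (128/140) are guesses at a sufficient width, and 1252 is an assumed destination code page:

```
(DT_STR, 140, 1252)("MYDOMAIN\\" + (DT_WSTR, 128)sAMAccountName)
```

The inner cast takes the DT_NTEXT column to DT_WSTR (SSIS expressions can't cast DT_NTEXT straight to DT_STR), the literal is concatenated while everything is still wide, and the outer cast produces the final DT_STR.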
Related
I'm loading data from an Excel file (.xlsx) into a SQL table using an SSIS package. For one column, the values show up in scientific notation; the notation is already there in the Excel file, but the actual underlying value is not loaded into the SQL table. I tried multiple options with derived columns, expressions etc., but I couldn't get the proper value.
This column contains a mix of numeric and text (nvarchar) values. Below is an example of the column.
ApplicationNumber
1.43E+15
923576663
25388447
TXY020732087
18794588
TXAP0000140343
**Actual values:**
ApplicationNumber
1425600000000000
923576663
25388447
TXY020732087
18794588
TXAP0000140343
There is no issue with the data coming from the business side into Excel. But how can we handle this scenario in SSIS?
I also tried (DT_I8)ApplicationNumber == (DT_I8)ApplicationNumber, but it gives values like the following:
1.43E+15 -> 1.430000000000000, and not 1425600000000000
One thing you can do is set the output column in the Advanced Editor of the Excel source to decimal [DT_DECIMAL] with a large scale, for example 20 digits:
UPDATE
To handle the strings in the same column as well, you may need to redirect the error output, since those values will throw a conversion error:
in advanced editor:
Default output:
Error output:
Then you can update your database from both the default and the error output.
I faced this problem recently using SSIS too.
1. Change the column type in Excel to "Number".
2. Remove the decimal positions.
3. Upload the file using SSIS.
I have an SSIS job to import data from a flat file into an SQL Server table. I'm having an issue regarding the encoding of the source file and destination table.
The file is an UTF8 encoded CSV file with some standard accented latin characters (ãóé, etc). My destination table is defined as having the Latin1_General_CI_AS Collation, which means I can manually insert the following text with no problem: "JOÃO ANTÓNIO".
When I declare the Flat File source, it automatically determines the file as having the 65001 code page (UTF-8), and infers the string [DT_STR] data type for each column. However, the SSIS package automatically assumes the destination table as having the 1252 Code Page, giving me the following error:
Validation error. <STEPNAME>: <STEPNAME>: The code page 65001 specified on output column "<MYCOLUMN>" (180) is not valid. Select a different code page for output column "<MYCOLUMN>".
I understand why, since the database collation is defined with that code page. However, if I set the flat file source to the Latin1 1252 encoding, the SSIS package executes but imports the accented characters incorrectly:
JOÃO ANTÓNIO (flat file) -> JOAO ANTÓNIO (database).
I have already tried configuring the flat file source as Unicode, but after I configure each column with a Unicode-compliant data type, I can't update the destination step: SSIS infers the data types directly from the database and doesn't allow me to change them.
Is there a way to keep the flat file source as being CP 1252, but also importing the correct characters? What am I missing here?
Thanks to Larnu's comment I've been able to get around this problem.
Since SSIS doesn't do implicit data conversion, I needed to set up a conversion step first (a Derived Column transformation). Since the source columns were already set up as DT_STR with code page 65001, I had to configure new derived columns from an expression, converting from the source code page into the destination code page, with the following expression:
(DT_STR, 50, 1252)<SourceColumn>
Where a direct cast to DT_STR is being made, stating the column will have a maximum size of 50 characters and the data will be represented with the 1252 code page.
Every time I try to import an Excel file into SQL Server I get a particular error. When I try to edit the mappings, the default type for all numerical fields is float. None of the fields in my table have decimals in them, and they aren't a money data type; they're only 8-digit numbers. However, I don't want my primary key stored as a float when it's an int, so how can I fix this? It gives me a truncation error of some sort; I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy to this), but even when I try to import .xls files, every value that contains numerals is automatically imported as a float, and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard and then INSERT into the real destination table using CAST or CONVERT to convert the data
Create an SSIS package and use the Data Conversion transformation to convert the data
You might also want to note the comments in the Import Wizard documentation about data type mappings.
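The staging-table route above can be sketched in T-SQL. This is only an illustration: the table names (dbo.Staging, dbo.Production) and columns are hypothetical, with the ID loaded as float by the Import Wizard and the real table expecting an int:

```sql
-- Staging was loaded by the Import Wizard with ID as float;
-- cast it on the way into the real table.
INSERT INTO dbo.Production (ID, Name)
SELECT CAST(ID AS int), Name
FROM dbo.Staging;
```

If any staged value doesn't fit the target type, the CAST raises an error, which is usually what you want during a load rather than a silent truncation.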
Going off of what Derloopkat said, which can still fail on conversion (no offense, Derloopkat) because Excel is terrible at this:
Paste from Excel into Notepad and save as a normal .txt file.
From within Excel, open said .txt file.
Select Next, as it is obviously tab delimited.
Select "none" for the text qualifier, then Next again.
Select the first column, hold Shift, select the last column, and select the Text radio button. Click Finish.
It will open; check it to make sure it's accurate, and then save it as an Excel file.
There is a workaround.
Import the Excel sheet with the numbers as float (the default).
After importing, go to the table design.
Change the data type of the column from float to int or bigint.
Save the changes.
Change the data type of the column from bigint to a text type (varchar, nvarchar, text, ntext, etc.).
Save the changes.
That's it.
When Excel finds mixed data types in the same column, it guesses the right format for the column (the majority of the values determines the column's type) and dismisses all other values by inserting NULLs. And it does this badly (e.g. if a column is considered text and Excel finds a number, it decides the number is a mistake and inserts a NULL instead; or if some cells containing numbers are formatted as "text", you may get NULL values in an integer column of the database).
Solution:
Create a new excel sheet with the name of the columns in the first row
Format the columns as text
Paste the rows without formatting (use CSV format, or copy/paste via Notepad to get plain text only)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, at the end of the Excel connection string, add ";IMEX=1" inside the Extended Properties:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to connection property, right click on Excel connection manager below control flow and hit properties. It'll be to the right under solution explorer. Hope that helps.
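For what it's worth, the same IMEX trick should apply with the newer ACE provider used for .xlsx files. A sketch only; the path here is a placeholder and HDR depends on whether your file has a header row:

```
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Data\FILE.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";
```

IMEX=1 tells the driver to treat mixed-type columns as text instead of guessing a majority type.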
To avoid the float type field in a simple way:
Open your Excel sheet.
Insert a blank row after the header row and type any text in all of its cells.
Right-click the header of the columns that cause the float issue, select Format Cells, then choose the Text category and press OK.
Then export the Excel sheet to your SQL Server.
This simple way worked for me.
A workaround to consider in a pinch:
Save a copy of the Excel file and change the column's format type to Text.
Copy the column values and paste them into a text editor; save the file (call it tmp.txt).
Modify the data in the text file so each value starts and ends with a character that the SQL Server import mechanism will recognize as text. If you have a fancy editor, use its included tools; I use awk in Cygwin on my Windows laptop. For example, to start and end each column value with a single quote: awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt
Copy and paste the data from tmp2.txt over the necessary column in the Excel file, and save the Excel file.
Run the SQL Server import for your modified Excel file. Be sure to double-check that the data type chosen by the importer is not numeric; if it is, repeat the above steps with a different set of characters.
The data in the database will have the quotes once the import is done. You can update the data later on to remove them, or use the replace function in your read query, such as replace([dbo].[MyTable].[MyColumn], '''', '').
I have an annoying issue working with SQL Server DATETIME objects in Excel 2013. The problem has been stated several times here in SO, and I know that the work around is to just reformat the DATETIME objects in Excel by doing this:
Right click the cell
Choose Format Cells
Choose Custom
In the Type: input field enter yyyy-mm-dd hh:mm:ss.000
This works fine, BUT I loathe having to do it every time. Is there a permanent workaround aside from creating macros? I need to maintain the granularity of the DATETIME object, so I cannot use SMALLDATETIME. I am currently using Microsoft SQL Server Management Studio 2008 R2 on a Win7 machine.
Without any code it's hard to guess how the data gets from SQL Server to Excel. I assume it's not through a data connection, because Excel wouldn't have any issues displaying the data as dates directly.
What about data connections?
Excel doesn't support any kind of formatting, or any useful designer for that matter, when working with plain data connections. That functionality is provided by Power Query or the PivotTable designer. Power Query is integrated in Excel 2016 and available as a download for Excel 2010+.
Why you need to format dates
Excel doesn't preserve type information. Everything is a string or number and its display is governed by the cell's format.
Dates are stored as decimal numbers in the OLE Automation format: the integer part is the number of days since 1899-12-30 (the OLE Automation epoch) and the fractional part is the time of day. This is why System.DateTime has the FromOADate and ToOADate functions.
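As a quick illustration of the arithmetic (Python here purely for brevity; the epoch and day-fraction convention are the point, and the results match DateTime.ToOADate for modern dates):

```python
from datetime import datetime

# OLE Automation dates count days from 1899-12-30;
# the fractional part encodes the time of day.
OLE_EPOCH = datetime(1899, 12, 30)

def to_oadate(dt):
    delta = dt - OLE_EPOCH
    return delta.days + delta.seconds / 86400.0

print(to_oadate(datetime(2020, 1, 1)))         # 43831.0
print(to_oadate(datetime(2020, 1, 1, 12, 0)))  # 43831.5
```

43831 is exactly the serial number Excel displays for 2020-01-01 when you format the cell as General, which is a handy sanity check.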
To create an Excel sheet with dates, you should set the cell format at the same time you generate the cell.
How to format cells
Doing this is relatively easy if you use the Open XML SDK or a library like EPPlus. The following example creates an Excel sheet from a list of customers:
using System;
using System.IO;
using OfficeOpenXml; // EPPlus

class Customer
{
    public string Name { get; set; }
    public DateTime Created { get; set; }
    public Customer(string name, DateTime created) { Name = name; Created = created; }
}

class Program
{
    static void Main(string[] args)
    {
        var customers = new[]
        {
            new Customer("A", DateTime.Now),
            new Customer("B", DateTime.Today.AddDays(-1))
        };
        File.Delete("customers.xlsx");
        var newFile = new FileInfo(@"customers.xlsx");
        using (ExcelPackage pck = new ExcelPackage(newFile))
        {
            var ws = pck.Workbook.Worksheets.Add("Content");
            // This format string *is* affected by the user locale,
            // and so is "mm-dd-yy"!
            ws.Column(2).Style.Numberformat.Format = "m/d/yy h:mm";
            // That's all it needs to load the data
            ws.Cells.LoadFromCollection(customers, true);
            pck.Save();
        }
    }
}
The code uses the LoadFromCollection method to load a list of customers directly, without dealing with cells. true means that a header is generated.
There are equivalent methods to load data from other sources: LoadFromDataTable, LoadFromDataReader, LoadFromText for CSV data, and even LoadFromArrays for jagged object arrays.
The weird thing is that specifying the m/d/yy h:mm or mm-dd-yy format uses the user's locale for formatting, not the US format. That's because these formats are built into Excel and are treated as locale-dependent. In Excel's list of date formats they are shown with an asterisk, meaning they are affected by the user's locale.
The reason for this weirdness is that when Excel moved to the XML-based XLSX format 10 years ago, it preserved the quirks of the older XLS format for backward-compatibility reasons.
When EPPlus saves the xlsx file it detects them and stores a reference to the built-in format ID (22 and 14 respectively) instead of storing the entire format string.
Finding Format IDs
The list of standard format IDs is shown in the NumberingFormat element documentation page of the Open XML standard. Excel originally defined IDs 0 (General) through 49.
EPPlus doesn't allow setting the ID directly. It checks the format string and maps only formats 0-49, as shown in the GetBfromBuildIdFromFormat method of ExcelNumberFormat. In order to get ID 22, we need to set the Format property to "m/d/yy h:mm".
Another trick is to check the stylesheets of an existing sheet. xlsx is a zipped package of XML files that can be opened with any decompression utility. The styles are stored in the xl\styles.xml file.
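For example, a cell format that references built-in ID 22 shows up in xl\styles.xml roughly like this (a sketch; attribute order and the surrounding stylesheet elements vary between producers):

```xml
<cellXfs count="1">
  <!-- numFmtId 22 is the built-in "m/d/yy h:mm" date-time format;
       no <numFmt> element is needed for built-in IDs 0-49 -->
  <xf numFmtId="22" applyNumberFormat="1" xfId="0"/>
</cellXfs>
```

A custom format, by contrast, would get its own numFmt element with an ID of 164 or higher.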
I am using an Access database for one system and SQL Server for another. The data gets synced between these two systems.
The problem is that one of the fields in a table in the Access database is a Memo field in double-byte format. When I read this data using a DataGridView in a Windows form, the text is displayed as ???.
Also, when data from this field is inserted into a SQL Server nvarchar(max) field, non-English characters are inserted as ???.
How can I fetch the data from the memo field and convert its encoding to Unicode so that it appears correctly in the SQL Server database?
I have no direct experience with data grid controls, but I have noticed that some database values are not displayed correctly through MS-Access controls. Uniqueidentifiers, for example, show up as '?????' when displayed on a form. You can try this in the debug window, where the "myIdField" control is bound to the "myIdField" field (a uniqueidentifier) from the underlying recordset:
? screen.activeForm.recordset.fields("myIdField")
{F0E3C822-BEE9-474F-8A4D-445A33F363EE}
? screen.activeForm.controls("myIdField")
????
Here is what the Access Help says on this issue:
The Microsoft Jet database engine stores GUIDs as
arrays of type Byte. However, Microsoft Access can't return Byte data
from a control on a form or report. In order to return the value of a
GUID from a control, you must convert it to a string. To convert a
GUID to a string, use the StringFromGUID function. To convert a string
back to a GUID, use the GUIDFromString function.
So if you are extracting values from controls to update a table (either directly or through a recordset), you might face similar issues...
One solution would be to update the data directly from the recordset's original value. Another option would be to open the original recordset with a query containing the necessary conversion instructions so that the field is displayed correctly through the control.
What I usually do in similar situation, where I have to manipulate uniqueIdentifier fields from multiple datasources (MS-Access and SQL Server for Example), is to 'standardize' these fields as text in the recordsets. Recordsets are then built with queries such as:
SQL Server
"SELECT convert(nvarchar(36),myIdField) as myIdField, .... FROM .... "
MS-Access
"SELECT stringFromGUID(myIdField) as myIdField, .... FROM .... "
I solved this issue by converting the encoding as follows:
// The memo field contained Big5 (double-byte) text that had been
// decoded with the Windows-1252 code page, so the characters were mangled.
// Re-encoding with 1252 recovers the original bytes, which can then be
// reinterpreted as Big5 and converted to Unicode (UTF-16).
System.Text.Encoding enc1252 = System.Text.Encoding.GetEncoding(1252);
System.Text.Encoding encBig5 = System.Text.Encoding.GetEncoding(950);
System.Text.Encoding encUTF16 = System.Text.Encoding.Unicode;
byte[] arrByte1 = enc1252.GetBytes(note); // 'note' is the string to be converted
byte[] arrByte2 = System.Text.Encoding.Convert(encBig5, encUTF16, arrByte1);
string convertedText = encUTF16.GetString(arrByte2);
return convertedText;