SSIS: Code page goes back to 65001

In an SSIS package that I'm writing, I have a CSV file as a source. On the Connection Manager General page, it has 65001 as the Code page (I was testing something). Unicode is not checked.
The columns map to a SQL Server destination table with varchar (among others) columns.
There's an error at the destination: The column "columnname" cannot be processed because more than one code page (65001 and 1252) are specified for it.
My SQL columns have to be varchar, not nvarchar due to other applications that use it.
On the Connection Manager General page I then change the Code page to 1252 (ANSI - Latin I) and OK out, but when I open it again it's back to 65001. It doesn't make a difference if (just for test) I check Unicode or not.
As a note, all this started happening after the CSV file and the SQL table had columns added and removed (users, you know.) Before that, I had no issues whatsoever. Yes, I refreshed the OLE DB destination in the Advanced Editor.
This is SQL Server 2012 and whichever versions of BIDS and SSIS come with it.

If it is a CSV file column of text stream [DT_TEXT] that you want to convert to a SQL varchar(max) data type, change the flat file Connection Manager Editor's Code page property to 1252 (ANSI - Latin I).

65001 Code page = Unicode (UTF-8)
Based on this Microsoft article (Flat File Connection Manager):
Code page
Specify the code page for non-Unicode text.
Also
You can configure the Flat File connection manager in the following ways:
Specify the file, locale, and code page to use. The locale is used to interpret locale-sensitive data such as dates, and the code page is used to convert string data to Unicode.
So when the flat file has a Unicode encoding:
Unicode, UTF-8, UTF-16, UTF-32
then this property cannot be changed; it will always revert to the file's original encoding.
For more information about the code page identifiers, you can refer to this article:
Code Page Identifiers
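One quick way to confirm what the connection manager is detecting is to look at the file's byte-order mark. Here is a minimal Python sketch for that check; the file path is a placeholder:

```python
# Check whether a flat file begins with a Unicode byte-order mark (BOM).
# The path below is a placeholder - point it at your own CSV.
BOMS = {
    b"\xef\xbb\xbf": "UTF-8 (code page 65001)",
    b"\xff\xfe": "UTF-16 LE",
    b"\xfe\xff": "UTF-16 BE",
    b"\xff\xfe\x00\x00": "UTF-32 LE",
    b"\x00\x00\xfe\xff": "UTF-32 BE",
}

def detect_bom(path):
    with open(path, "rb") as f:
        head = f.read(4)
    # Check the longer BOMs first so UTF-32 LE is not reported as UTF-16 LE.
    for bom, name in sorted(BOMS.items(), key=lambda kv: -len(kv[0])):
        if head.startswith(bom):
            return name
    return None

print(detect_bom(r"C:\data\source.csv") or "no BOM found (plain ANSI/ASCII)")
```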

I solved this in SSIS by using a Derived Column transformation.

If it's a CSV file, you can still use code page 1252 to process it. When you open the flat file connection manager, it shows you the code page for the file, but you don't need to save that setting. If you have other changes to make in the connection manager, change the code page back to 1252 before you save the changes. It will process fine if there are no Unicode characters in the file.
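If you want to verify that assumption before flipping the code page back, a small script can flag any characters that won't fit in Windows-1252. This is a rough sketch and the path is a placeholder:

```python
# Scan a CSV for characters that cannot be represented in Windows-1252.
# If nothing is reported, processing the file with code page 1252 should be safe.
def find_non_1252(path):
    problems = []
    with open(path, encoding="utf-8", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            try:
                line.encode("cp1252")
            except UnicodeEncodeError as exc:
                problems.append((lineno, line[exc.start:exc.end]))
    return problems

for lineno, chars in find_non_1252(r"C:\data\source.csv"):
    print(f"line {lineno}: cannot encode {chars!r} in cp1252")
```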

I was running into a similar challenge, which is how I ended up on this page looking for a solution. I resolved it using a different approach.
I opened the csv in Notepad++. One of the menu options is called Encoding. If you select that, it will give you the option to "Convert to ANSI."
I knew that my file did not contain any Unicode specific characters.
When I went back to the SSIS package, I edited the flat file connection and it automatically changed it to 1252.
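The same conversion can be scripted if Notepad++ isn't handy. A minimal sketch, assuming the source really is UTF-8 and contains only characters that exist in Windows-1252 (both paths are placeholders):

```python
# Re-encode a UTF-8 CSV as ANSI (Windows-1252), the same result as
# Notepad++'s "Convert to ANSI". Paths are placeholders.
src = r"C:\data\source.csv"
dst = r"C:\data\source_ansi.csv"

with open(src, encoding="utf-8-sig") as fin:   # utf-8-sig also strips a BOM if present
    text = fin.read()

with open(dst, "w", encoding="cp1252", errors="strict", newline="") as fout:
    fout.write(text)                           # raises if the file has non-1252 characters
```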

In my case the file was generated in Excel and (mistakenly) saved as CSV UTF-8 (Comma delimited) (*.csv) instead of simply CSV (Comma delimited) (*.csv). Once I saved the file as the correct form of CSV, the code page no longer changed from 1252 (ANSI - Latin I).
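For files saved as CSV UTF-8 by Excel, the only difference from plain CSV is usually the three-byte BOM at the start, so stripping it (when the data itself is plain ASCII) has the same effect as re-saving the file. A hedged sketch, with a placeholder path:

```python
# Excel's "CSV UTF-8" format prepends an EF BB BF byte-order mark.
# If the data itself is plain ASCII, removing the BOM is enough to stop the
# connection manager from detecting the file as 65001. Path is a placeholder.
path = r"C:\data\export.csv"

with open(path, "rb") as f:
    data = f.read()

if data.startswith(b"\xef\xbb\xbf"):
    with open(path, "wb") as f:
        f.write(data[3:])       # drop the 3-byte UTF-8 BOM, keep everything else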

Related

Changing Connections Properties of multiple Excel files using Text editors

We have over 400 Excel files, that connect to a specific SQL server's databases. Currently we are undergoing an upgrade and the Server will change. The db names and tables will remain the same, so the only change in the connection property will be the server name.
I am looking for a quick way to do this using any text editor, and replacing the old name with the new one.
I've tried Notepad++ and EmEditor, opening the .xls file as binary (hex view) and trying to replace the hex equivalent of the ASCII characters, with no success.
I also tried opening it in non-binary view, but after saving, much of the Excel functionality is lost and I also get a message that the file is not a valid .xls file.
It was successfully done by using UltraEdit and replacing the string with "Replace in all files" over the entire folder. So it seems that it can be done with a text editor that doesn't open the file, but instead searches and replaces all instances of the string in a selected folder.
Thank you all for the contribution.
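For reference, a scripted equivalent of that folder-wide "Replace in all files" could look like the sketch below. It assumes a plain byte-for-byte substitution is safe for these .xls files (as it turned out to be with UltraEdit), so back the folder up first; the folder path and server names are hypothetical:

```python
# Byte-level "replace in all files" over a folder of .xls workbooks.
# Back up the folder first: this edits the files in place.
from pathlib import Path

folder = Path(r"C:\reports")           # hypothetical folder
old, new = b"OLDSERVER", b"NEWSERVER"  # hypothetical server names

# Connection strings inside .xls can also be stored as UTF-16 LE,
# so replace both the narrow and the wide form of the name.
old_w, new_w = old.decode().encode("utf-16-le"), new.decode().encode("utf-16-le")

for xls in folder.rglob("*.xls"):
    data = xls.read_bytes()
    patched = data.replace(old, new).replace(old_w, new_w)
    if patched != data:
        xls.write_bytes(patched)
        print(f"updated {xls}")
```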

Change the file encoding of the file which is created using SSIS Log provider for Text Files

I am new to SSIS. I have already designed a package and configured the SSIS Log provider for Text Files.
This works fine and log files are generated successfully.
We have a monitoring team, they use this log file for monitoring. They are unable to read the log files since the file encoding is in Unicode format.
They are expecting a non-Unicode format for their monitoring.
I tried to change the existing log file's encoding to ANSI, but when I re-run the package, the log file is created again with Unicode encoding.
Is there any way to create log files using the SSIS Log provider for Text Files with non-Unicode encoding? Kindly suggest any workaround; I have been unable to find a solution for the past two days.
Trying to figure out the issue
Since the SSIS Log provider for Text Files uses a File connection manager for logging, you don't have the option to edit the file encoding within the SSIS package, because this type of connection manager can be used for different file formats (Excel, text, ...).
While searching for this issue, it looks like if the log file is created for the first time by SSIS, it will be written with Unicode data.
why are my log files getting generated with a space between every two characters?
Why is my SSIS text logfile formatted in this way?
Possible workaround
Try creating an empty text file using Notepad and saving it with ANSI encoding.
Then select this file from the SSIS logging configuration.
Other helpful links
Change the default of encoding in Notepad
Add Logging with SSIS
Update 1 - Experiments
To test the workaround I provided, I ran the following experiments:
I added SSIS logging and created a new log file.
After executing the package, the file is created in Unicode (to check that, I opened the file in Notepad and clicked Save As; the encoding shown in the combo box is Unicode).
I created a new file using Notepad and saved it with ANSI encoding as mentioned above.
In SSIS, I changed the File connection manager to Use Existing instead of Create New and selected the file I created.
After executing the package, the log is written to the file and the encoding is still ANSI.
I repeated the package execution several times and the encoding does not change.
TL;DR: Create a file with ANSI encoding outside the SSIS package. Within the package, create a File connection manager, select the Use Existing option, and choose the created file. Use this File connection manager for logging purposes.
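If creating the seed file by hand becomes tedious (for example across environments), the same empty ANSI file can be produced with a short script; the path is a placeholder and should match the File connection manager's Use Existing setting:

```python
# Create an empty ANSI (Windows-1252) log file for the package to append to.
# Point the package's File connection manager (Use Existing) at this path.
log_path = r"C:\logs\package_log.txt"   # placeholder path

with open(log_path, "w", encoding="cp1252"):
    pass  # an empty file is enough; per the experiments above, SSIS keeps the ANSI encoding
```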

SQL Server Management Studio saves .sql file with binary character

While saving .sql files from SQL Server Management Studio into my local Windows folder, it seems to include some binary characters, making AccuRev comparisons impossible. I looked for possible save options and couldn't find any. Any suggestions, please?
If you can't tell AccuRev to handle this as UTF-8 files (this sucks - these days, all software should really know about UTF-8 and handle it correctly!), then you might need to do something in SQL Server Management Studio instead.
When you have a SQL statement open and you click on "File > Save", there is a little down-arrow to the right of the Save button in the "Save" dialog.
If you click that (instead of just clicking the button itself), you can select "Save with Encoding", which allows you to pick what encoding to use for your files. Pick something like Windows-1252 Western European, which will not put any UTF-8 byte-order-mark bytes at the start.
AccuRev does handle UTF-8 character encoding. However, older versions may not have that capability.
Make sure that the file is being saved using UTF-8. Anything else will have binary content and should be typed as such.
When you export .sql files from MS SQL Server Management Studio in Unicode (the default), it puts an FF FE BOM at the front of the file, which forces programs to treat it as binary. Exporting as ANSI solved it: choose "Save as ANSI Text".
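If a batch of scripts has already been saved with the UTF-16 BOM, they can be re-saved as ANSI in bulk with something like the sketch below; the folder path is a placeholder, and it's worth keeping backups since characters outside Windows-1252 will make the conversion fail:

```python
# Re-save .sql files that SSMS wrote as UTF-16 (FF FE BOM) as plain ANSI text,
# so diff tools stop treating them as binary. Folder path is a placeholder;
# keep backups before rewriting files in place.
from pathlib import Path

for sql in Path(r"C:\scripts").glob("*.sql"):
    raw = sql.read_bytes()
    if raw.startswith(b"\xff\xfe"):                 # UTF-16 LE BOM written by SSMS
        text = raw.decode("utf-16")
        sql.write_bytes(text.encode("cp1252"))      # fails if the script has non-1252 characters
        print(f"converted {sql}")
```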

SSIS Import UTF-8 TSV file into SQL Server 2014

First time user/question:
I have numerous TSV files exported from a computer forensics application encoded as "Unicode (UTF-8)". I created a package using Visual Studio 2013 and have a flat file connection manager where the code page is 65001 (UTF-8) and the advanced settings are all Unicode string (DT_WSTR). My OLE DB destination is hooked to a table with the same Unicode settings, and in the component properties the default code page is 65001 with "always use default code page" set to True.
However, the package fails with the error: "The data type for "Flat File Source.Outputs[Flat File Source Output].Columns[MyCOLUMN]" is DT_NTEXT, which is not supported with ANSI files. Use DT_TEXT instead and convert the data to DT_NTEXT using the data conversion component."
I'm puzzled: how does this have anything to do with ANSI? The file was exported encoded as UTF-8. Now, as the error suggests, I can work around this by setting my connection manager's advanced properties to varchar (DT_STR) and then using a Data Conversion component to convert each and every field to DT_WSTR, but that seems unnecessary.
Thank you.

SSIS - ANSI flatfile always saved as UTF-8 (w/o BOM)

I am facing an issue with SSIS where a customer wants a file (previously delivered in UTF-8) to be delivered in ANSI-1252. No big deal, I thought: change the file connection manager and done... unfortunately it wasn't that simple. I've been stuck on this for a day and am clueless on what to try next.
The package itself:
IN - OLE DB source with a query. Source database fields are NVARCHAR.
Next, I created a Data Conversion block where I convert the incoming DT_WSTR to DT_STR using the 1252 code page.
After that is an outbound flat file destination. The flat file connection is tab delimited using code page 1252, and I have mapped the converted columns to the columns used in this flat file.
Now, when I create a new .txt file from Explorer, it is ANSI (as detected by Notepad++).
When the package runs, the file becomes UTF-8 without BOM.
I have tried experimenting with the checkbox for overwriting, as suggested in SSIS - Flat file always ANSI never UTF-8 encoded,
as well as rebuilding the project from scratch and experimenting with the data conversion.
Does anyone have a suggestion on what I am missing here? The strange thing is we have a different package with exactly the same blocks, built previously, and it does output an ANSI file (I checked the package from top to bottom). However, we are getting mixed results on different machines: some machines give an ANSI file, others the UTF-8 file.
Has this been solved already? My idea is to delete the whole Data Flow Task and re-create it; I suppose the metadata is stuck and gets overwritten at each execution.
I believe you don't need to change anything in your SSIS package; just check your editor settings in Notepad++. Go to Settings --> Preferences --> New Document.
You need to uncheck the 'Apply to opened ANSI files' checkbox.
Kindly check and let me know if it works for you.
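One way to settle what the package actually produced, independent of any editor's detection heuristics, is to look at the raw bytes: a file with no BOM and no bytes above 0x7F is identical whether it is labeled ANSI or UTF-8. A small sketch with a placeholder path:

```python
# Inspect the delivered file's actual bytes: a file with no BOM and no bytes
# above 0x7F is byte-for-byte the same whether you call it ANSI or UTF-8,
# so Notepad++'s label may only reflect its own detection preferences.
path = r"C:\out\delivery.txt"

with open(path, "rb") as f:
    data = f.read()

has_bom = data.startswith(b"\xef\xbb\xbf")
high_bytes = sorted({b for b in data if b > 0x7F})
print(f"UTF-8 BOM present: {has_bom}")
print(f"bytes above 0x7F: {[hex(b) for b in high_bytes] or 'none (pure ASCII)'}")
```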
