SQL import wizard drops leading zero - sql-server

I've read all the posts about prefixing the numbers with ', and setting IMEX=1 in the connection string; nothing seems to do the trick for me.
Here's the setup: Excel column with mixed data - 99% numbers (some start with 0) 1% text.
PROGRAMMATICALLY importing into SQL Server 2005 table / column type - varchar(255).
Import works fine locally, but once I move the code to production (GoDaddy), it drops the leading 0's in the column.
Any ideas?
p.s. I knew about the registry change solution, matter of fact - the value was set to 0 on my dev box, but the answer made me realize that the value wasn't set on the PRODUCTION SERVER :)

The ISAM driver only samples the first 8 rows, but you can change that behaviour through a registry change:
http://sqlserversd.wordpress.com/2008/09/14/ssis-excel-values-import-as-nulls/
But yes, using Excel for machine-to-machine data transfer is a nightmare...Is there no other way you can be sent the data?

Yes. The Excel driver only samples the first 8 rows to determine the data type.
This means that it assumes the column is numeric if "bob" does not appear in rows 1 to 8.
The target table column datatype is irrelevant.
This issue has been there for a long time; I first saw it in 2003.
BOL notes on excel import
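To make the sampling behaviour concrete, here is a toy model in Python. It is not the driver's real algorithm, only a sketch under the assumption that a column is treated as numeric when every sampled value parses as a number; the function names and the sample column are made up for illustration.

```python
def _is_number(value):
    """True when the text parses as a number (toy stand-in for the driver's check)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def load_column(values, sample_rows=8):
    """Guess the column type from the first `sample_rows` values, then load all rows.

    If the sample looks numeric, every value is converted to a number:
    leading zeros disappear and non-numeric text becomes None (NULL).
    Otherwise the values are kept as text.
    """
    sample = values[:sample_rows]
    if not all(_is_number(v) for v in sample):
        return list(values)
    return [float(v) if _is_number(v) else None for v in values]


# Hypothetical column: "bob" only shows up in row 9, outside the sample.
column = ["0123", "0456", "1", "2", "3", "4", "5", "6", "bob"]
guessed_numeric = load_column(column)                  # sample of 8 rows
guessed_text = load_column(column, sample_rows=9)      # larger sample sees "bob"
print(guessed_numeric)
print(guessed_text)
```

With the default sample the zeros are dropped and "bob" is lost; once the sample is large enough to include "bob", everything stays text.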

We usually save the file as a .csv file or .txt file and then the issue doesn't occur.
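A rough sketch of why the text-file route helps: a CSV reader hands back every field as a string, so nothing guesses a numeric type. The sample data and the column names below are invented for the example; Python's standard csv module is used only as an illustration, not as the importer anyone in this thread actually ran.

```python
import csv
import io

# Hypothetical content standing in for the worksheet saved as .csv.
SAMPLE_CSV = "code,description\n00123,widget\n04567,gadget\nABC01,mixed text\n"


def read_rows_as_text(stream):
    """Return every row as a dict of strings; no type guessing is applied."""
    return list(csv.DictReader(stream))


rows = read_rows_as_text(io.StringIO(SAMPLE_CSV))
for row in rows:
    # Leading zeros survive because the values are never converted to numbers.
    print(row["code"])
```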

There is a quick and tricky way to do this. Follow these steps, BUT first copy all the data / columns and rows from the actual Excel sheet into another Excel sheet, just to be on the safe side, so that you have the actual data to compare with.
Steps:
Copy all the values in the column and paste them into a notepad.
Now change the column type to text in the Excel sheet (it will trim the preceding / trailing Zeros), don't worry about that.
Go to Notepad and copy all the values that you have pasted just now.
Go to your excel sheet and paste the values in same column.
If you have more than one column with 0 values then just repeat the same.
Now your excel document is ready to be imported with Zero Values :).
Happy Days.

Related

Work around to 255 column limit in SQL import export wizard for excel

I am trying to import an Excel sheet into a SQL database. This sheet has 700+ columns, but I understand there is a limit of 255 columns. Is there a workaround to include all the columns while uploading to the database? I selected Excel 2007 when choosing the Excel version.
Sadly, of the resources I saw online, there wasn’t an easy way to do this. I tried the above solution of choosing “Excel 2007” but that didn’t work for me. One would expect Excel and SQL Server to have tighter integration.
However, by converting to a .txt file and then working through some of the truncation errors, I managed to load my data set. Below is some detail of my use case and process.
I had a fairly small dataset row-wise (~200 rows) but with a large set of columns (+500 columns). I first converted the Excel file to a text, 'tab-delimited' file (i.e. *.txt). In loading through the SQL Server Import and Export Wizard, I faced two truncation errors: one, a few of the column names were greater than 128 characters and two, the length of values in some of the rows where the default datatype was DT_STR was greater than 50 (the default output column width). For the first error, I just renamed the column(s) to something shorter. For the second error, I manually ran a count of lengths and found the maximum length of values for each column, which allowed me to isolate which columns would throw up an error. I followed the steps below:
1) In the SQL Server Import & Export Wizard 'General' Tab, I selected 'Flat File source' and accepted all the normal defaults.
2) In the second tab ('Columns'), ensure column delimiter is selected as Tab. I preferred this over a CSV format since there was some heavy text with commas in my dataset.
3) If the mapping works out, in the third tab ('Advanced'), you should get a laundry list of all your columns with their natural defaults. As detailed above, I isolated which columns had values that exceeded the default of DT_STR (50) and changed that to DT_TEXT.
4) The remaining steps just specify the destination, and whether you want to save the SSIS steps.
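The manual "count of lengths" step above could be scripted. The following is only a sketch: it assumes a tab-delimited file with unique column names in the first row, and the function names and sample data are made up. The limit of 50 mirrors the default DT_STR width mentioned above.

```python
import csv
import io


def max_column_lengths(stream, delimiter="\t"):
    """Return {column name: longest value length} for a delimited text file."""
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader)
    longest = [0] * len(header)
    for row in reader:
        # Ignore any stray extra fields beyond the header width.
        for i, value in enumerate(row[:len(header)]):
            longest[i] = max(longest[i], len(value))
    return dict(zip(header, longest))


def columns_over_limit(lengths, limit=50):
    """Names of the columns whose longest value exceeds `limit` characters."""
    return [name for name, length in lengths.items() if length > limit]


# Hypothetical two-column file; the second row has a 60-character value.
sample = "id\tnotes\n1\t" + "x" * 60 + "\n2\tshort\n"
lengths = max_column_lengths(io.StringIO(sample))
too_wide = columns_over_limit(lengths)
print(lengths)
print(too_wide)
```

The columns it reports are the ones to switch from DT_STR (50) to a wider type in the 'Advanced' tab.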
A more simple, straightforward solution: on excel, sort your rows so the one with the longest value is moved to the top. It looks like the import wizard only looks into the first 8 rows of your data file to determine the width of each column (weird!).
source: How does one change the default varchar 255 of a column when importing data from Excel to Sql Server using Import Export Wizard?

Excel destination character size in SSIS

I am trying to export data from SQL server 2008 to Excel file using BIDS.
One of the fields 'DESCRIPTION' coming from SQL database is VARCHAR(4000).
I can export everything to Excel, but the 'DESCRIPTION' field size in Excel is restricted to Unicode 255, and no matter what I try it does not allow me to export data over 255 characters (it exports as blank). I tried changing the SQL field to varchar(max) or ntext, but none of the attempts worked. I used the advanced editor in BIDS on the Excel destination to change the 'DESCRIPTION' character length manually, but as soon as I hit 'OK' it resets to Unicode 255.
Could anybody please help me to resolve this issue?
Thanks,
Vishal
So, I did some testing. Excel data transformation is funky but I came up with a solution. I created an excel spreadsheet with fields as needed. I then created fake, dummy data in excel with character length far greater than 255 and hid the row. I then did the SSIS data transformation to the excel spreadsheet which worked. It's a weird and not preferable option but it works.
Problem: Excel only accepts 255 chars per cell when I attempt to use Excel Destination in SSIS (2008 R2) from a SQL Server table. SalesForce data loader would not accept CSV (with “” text qualifiers) created by the SSIS flat file connection manager. SalesForce will only accept CSV (with “” text qualifiers). SalesForce will accept CSV as exported by Excel (2010).
Solution:
1. Create your Excel connection manager, set the name/path of the destination Excel file in your “Excel Destination Data Flow Component” and map the metadata.
2. Open a new Excel file, remove all extra “sheets”, rename “sheet1” to the sheet name that was created in step #1 above, select all cells and format them as “text”, then add all the column header names to the first row of your template sheet. In the columns that need to hold more data than the 255 limit, paste in any characters that exceed your limit by 50% (just in case). These columns are now configured to hold your large data. Save the file, naming it something like TEMPLATE_Excel_forLargeCellValues.xlsx
3. Copy this template into your DESTINATION connection: before your “Excel Destination Data Flow Component” in the SSIS Control Flow, create a new “File System Task”. Create an SSIS package-level variable to hold the path/filename of your template Excel file. In your “File System Task” set “IsSourcePathVariable” = TRUE and set “SourceVariable” to User::Template_Excel. Set “IsDestinationPathVariable” = FALSE, and set “DestinationConnection” = the connection from step #1 above. Set “Operation” = Copy file and “OverwriteDestination” = TRUE. This will copy your formatted Excel workbook/sheet into your destination folder with the file name you designated in step #1 above, and because you put a larger amount of sample data in the columns that require more than 255 chars, all your data will fit.
Note: It is not necessary to delay validation on any components.
You're saying that the excel field is set to 255 right? Changing the SQL field won't have an effect on excel, you'd have to modify the excel file.
I don't believe you can modify the Excel output column to write more than 255 characters. Why not simply write your output to a csv, it can be opened and later modified in Excel anyway.
The SSIS Excel engine inspects the first 8 rows to determine the data type and assigns it to the Excel source or destination automatically. Even defining the Excel column as memo won't work. I tried to resolve the error by changing the TypeGuessRows registry value of the Excel engine, but that did not work either. So I was left with no option but to create a dummy row (2nd row) with more than 255 characters and hide it. The Excel source then identifies the column as a Unicode text stream. You have to write some logic in the SSIS package to exclude this row if you are importing the data from Excel. I heard that this issue is resolved in Excel 2010 and later, but BIDS 2008 has no option to choose any version after 2007, so this is the only solution if you are working with BIDS 2008 and Excel.
You have to select Microsoft Excel 97-2003 and use the xls as file extension in your file name for destination.
I got the same issue of the excel destination not allowing more than 255 characters. After spending almost a day, I tried adding more characters (to simplify, I added spaces more than 255) in the header of the column that has the issue with more than 255 characters. And it magically worked!
You can insert a dummy row (260 characters) under the header of the column you want in your Excel sheet, using an Execute SQL Task.
Script to create and insert:
CREATE TABLE `YourSheet` (`myColumn260char` LongText)
GO
INSERT INTO YourSheet(myColumn260char) Values('....................................................................................................................................................................................................................................................................')
You can delete the dummy row after the import.

Import data from .xls to table by removing unwanted columns? [duplicate]

I need to import sheets which look like the following:
March Orders
***Empty Row
Week Order # Date Cust #
3.1 271356 3/3/10 010572
3.1 280353 3/5/10 022114
3.1 290822 3/5/10 010275
3.1 291436 3/2/10 010155
3.1 291627 3/5/10 011840
The column headers are actually row 3. I can use an Excel Source to import them, but I don't know how to specify that the information starts at row 3.
I Googled the problem, but came up empty.
have a look:
the links have more details, but I've included some text from the pages (just in case the links go dead)
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb
Q:
While we are loading the text file to SQL Server via SSIS, we have the
provision to skip any number of leading rows from the source and load
the data to SQL server. Is there any provision to do the same for
Excel file.
The source Excel file for me has some description in the leading 5
rows, I want to skip it and start the data load from the row 6. Please
provide your thoughts on this.
A:
Easiest would be to give each row a number (a bit like an identity in
SQL Server) and then use a conditional split to filter out everything
where the number <=5
http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217
Q:
Is it possible during import data from Excel to DB table skip first 6 rows for example?
Also Excel data divided by sections with headers. Is it possible for example to skip every 12th row?
A:
YES YOU CAN. Actually, you can do this very easily if you know the number of columns that will be imported from your Excel file. In
your Data Flow task, you will need to set the "OpenRowset" Custom
Property of your Excel Connection (right-click your Excel connection >
Properties; in the Properties window, look for OpenRowset under Custom
Properties). To ignore the first 5 rows in Sheet1, and import columns
A-M, you would enter the following value for OpenRowset: Sheet1$A6:M
(notice, I did not specify a row number for column M. You can enter a
row number if you like, but in my case the number of rows can vary
from one iteration to the next)
AGAIN, YES YOU CAN. You can import the data using a conditional split. You'd configure the conditional split to look for something in
each row that uniquely identifies it as a header row; skip the rows
that match this 'header logic'. Another option would be to import all
the rows and then remove the header rows using a SQL script in the
database...like a cursor that deletes every 12th row. Or you could
add an identity field with seed/increment of 1/1 and then delete all
rows with row numbers that divide perfectly by 12. Something like
that...
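As a plain illustration of the row-number idea described in these answers (number the rows, then filter on the number), here is a small Python sketch. It is not SSIS or T-SQL; the function names are invented, and the lists of integers stand in for imported rows.

```python
def skip_leading_rows(rows, count):
    """Drop the first `count` rows, e.g. a description block above the data."""
    return rows[count:]


def drop_every_nth(rows, n):
    """Drop rows whose 1-based row number divides evenly by `n` (repeated headers)."""
    return [row for number, row in enumerate(rows, start=1) if number % n != 0]


# Hypothetical data: integers standing in for imported rows.
without_preamble = skip_leading_rows(list(range(10)), 5)
without_headers = drop_every_nth(list(range(1, 25)), 12)
print(without_preamble)
print(without_headers)
```

In the database the same filter would be a DELETE on the identity column; in SSIS it would be the condition in the conditional split.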
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2
Q:
I have an SSIS package that imports from an Excel file with data
beginning in the 7th row.
Unlike the same operation with a csv file ('Header Rows to Skip' in
Connection Manager Editor), I can't seem to find a way to ignore the
first 6 rows of an Excel file connection.
I'm guessing the answer might be in one of the Data Flow
Transformation objects, but I'm not very familiar with them.
A:
rbhro, actually there were
2 fields in the upper 5 rows that had some data that I think prevented
the importer from ignoring those rows completely.
Anyway, I did find a solution to my problem.
In my Excel source object, I used 'SQL Command' as the 'Data Access
Mode' (it's drop down when you double-click the Excel Source object).
From there I was able to build a query ('Build Query' button) that
only grabbed records I needed. Something like this: SELECT F4,
F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4
<> 'TheHeaderFieldName')
Note: I initially tried an ISNUMERIC instead of 'IS NOT NULL', but
that wasn't supported for some reason.
In my particular case, I was only interested in rows where F4 wasn't
NULL (and fortunately F4 didn't contain any junk in the first 5
rows). I could skip the whole header row (row 6) with the 2nd WHERE
clause.
So that cleaned up my data source perfectly. All I needed to do now
was add a Data Conversion object in between the source and destination
(everything needed to be converted from unicode in the spreadsheet),
and it worked.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
We provide guidance to our customers and vendors about how files must be formatted before we can process them, and it is up to them to meet the guidelines as much as possible. People often aren't aware that files like that create a problem in processing (next month it might have six lines before the data starts), and they need to be educated that Excel files must start with the column headers, have no blank lines in the middle of the data, have no headers repeated multiple times and, most important of all, have the same columns with the same column titles in the same order every time. If they can't provide that, then you probably don't have something that will work for automated import, as you will get the file in a different format every time depending on the mood of the person who maintains the Excel spreadsheet. Incidentally, we push really hard to never receive any data from Excel (this only works some of the time, but if they have the data in a database, they can usually accommodate). They also must know that any changes they make to the spreadsheet format will result in a change to the import package and that they will be charged for those development changes (assuming that these are outside clients and not internal ones). These changes must be communicated in advance and developer time scheduled; otherwise a file with the wrong format will fail and be returned to them to fix.
If that doesn't work, may I suggest that you open the file, delete the first two rows and save a text file in a data flow. Then write a data flow that will process the text file. SSIS did a lousy job of supporting Excel and anything you can do to get the file in a different format will make life easier in the long run.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
Not entirely correct.
SSIS forces you to use the format and quite often it does not work correctly with excel
If you can't change the format, consider using our Advanced ETL Processor.
You can skip rows or fields and you can validate the data the way you want.
http://www.dbsoftlab.com/etl-tools/advanced-etl-processor/overview.html
Sky is the limit
You can just use the OpenRowset property you can find in the Excel Source properties.
Take a look here for details:
SSIS: Read and Export Excel data from nth Row
Regards.

Truncation errors trying to import from Excel

I'm trying to import the NDC database that you can download here: http://www.fda.gov/drugs/informationondrugs/ucm142438.htm
When I initially tried to import the excel in the zip file it complained about the format, so I started with a blank excel, and imported it into excel from the txt file.
I've created a table to import the data into and set all the columns to nvarchar(MAX). The column it complains about is the SUBSTANCENAME column. I checked, and the longest value in that column is about 2700 characters.
My understanding is that the nvarchar(MAX) should easily hold that much. I'm not sure what to do about this other than changing that column to a text field. Should that fit into that column how it is?
I've tried setting it to ignore errors, but as far as I can tell that does nothing, at least it never seems to ignore them when I try.
How are you importing the data into the SQL Server table? If I remember correctly, SSIS uses the first 5 or 10 rows of the Excel file to determine the datatype and length. I remember I had to make a change to the registry in order to get a larger sample size
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel
The TypeGuessRows entry can be modified to get a larger sample size.
That is assuming you are using SSIS - but if you are using SQL Server Import then it would be doing the same thing as well.

Error converting data types when importing from Excel to SQL Server 2008

Every time that I try to import an Excel file into SQL Server I'm getting a particular error. When I try to edit the mappings the default value for all numerical fields is float. None of the fields in my table have decimals in them and they aren't a money data type. They're only 8 digit numbers. However, since I don't want my primary key stored as a float when it's an int, how can I fix this? It gives me a truncation error of some sort, I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy to this), but even when I try to import .xls files every value that contains numerals is automatically imported as a float and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard and then INSERT into the real destination table using CAST or CONVERT to convert the data
Create an SSIS package and use the Data Conversion transformation to convert the data
You might also want to note the comments in the Import Wizard documentation about data type mappings.
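The staging-table option can be sketched end to end. The example below uses Python's built-in sqlite3 module purely as a stand-in so it can run anywhere; it is not SQL Server, and the table and column names are invented. The INSERT ... SELECT with CAST is the part that carries over to T-SQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption for the demo).
conn = sqlite3.connect(":memory:")

# Staging table: accept the values as float, the way the wizard wants to load them.
conn.execute("CREATE TABLE staging_orders (order_id REAL)")
# Real destination: the key is an integer.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")

# Hypothetical 8-digit keys that arrived as floats from Excel.
conn.executemany(
    "INSERT INTO staging_orders (order_id) VALUES (?)",
    [(12345678.0,), (87654321.0,)],
)

# Convert explicitly while copying from staging to the destination.
conn.execute(
    "INSERT INTO orders (order_id) "
    "SELECT CAST(order_id AS INTEGER) FROM staging_orders"
)

result = [row[0] for row in conn.execute("SELECT order_id FROM orders ORDER BY order_id")]
print(result)
```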
Going off of what Derloopkat said, which still can fail on conversion (no offense Derloopkat) because Excel is terrible at this:
Paste from excel into Notepad and save as normal (.txt file).
From within excel, open said .txt file.
Select next as it is obviously tab delimited.
Select "none" for text qualifier, then next again.
Select the first row, hold shift, select the last row, and select the text radio button. Click Finish
It will open, check it to make sure it's accurate and then save as an excel file.
There is a workaround.
Import excel sheet with numbers as float (default).
After importing, Goto Table-Design
Change DataType of the column from Float to Int or Bigint
Save Changes
Change DataType of the column from Bigint to any Text Type (Varchar, nvarchar, text, ntext etc)
Save Changes.
That's it.
When Excel finds mixed data types in the same column, it guesses the right format for the column (the majority of the values determines the column type) and discards all other values by inserting NULLs. But Excel does this quite badly (e.g. if a column is considered text and Excel finds a number, it decides that the number is a mistake and inserts a NULL instead; or if some cells containing numbers are formatted as "text", you may get NULL values in an integer column of the database).
Solution:
Create a new excel sheet with the name of the columns in the first row
Format the columns as text
Paste the rows without format (use CSV format or copy/paste in Notepad to get only text)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, add IMEX=1 to the Extended Properties at the end of the Excel connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to connection property, right click on Excel connection manager below control flow and hit properties. It'll be to the right under solution explorer. Hope that helps.
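If the connection string is built in code rather than typed into the properties window, the Extended Properties quoting is easy to get wrong. Here is a small helper as a sketch; the function name and the example path are made up, and the provider and property values are simply the ones from the string above.

```python
def excel_connection_string(path, header=True, imex=True):
    """Build a Jet OLE DB connection string for an .xls file.

    The Extended Properties value is wrapped in double quotes because it
    contains semicolons of its own.
    """
    extended = ["Excel 8.0", "HDR=YES" if header else "HDR=NO"]
    if imex:
        # IMEX=1 asks the driver to read mixed-type columns as text.
        extended.append("IMEX=1")
    properties = ";".join(extended)
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        f"Data Source={path};"
        f'Extended Properties="{properties}";'
    )


# Hypothetical file path, used only for illustration.
connection_string = excel_connection_string(r"C:\data\FILE.xls")
print(connection_string)
```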
To avoid float type field in a simple way:
Open your excel sheet..
Insert blank row after header row and type (any text) in all cells.
Mouse Right-Click on the head of the columns that cause a float issue and select (Format Cells), then choose the category (Text) and press OK.
And then export the excel sheet to your SQL server.
This simple way worked with me.
A workaround to consider in a pinch:
save a copy of the excel file, modify the column to format type 'text'
copy the column values and paste to a text editor, save the file (call it tmp.txt).
modify the data in the text file to start and end with a character so that the SQL Server import mechanism will recognize it as text. If you have a fancy editor, use its included tools. I use awk in cygwin on my Windows laptop. For example, I start and end the column value with a single quote, like "$ awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt"
copy and paste the data from tmp2.txt over top of the necessary column in the excel file, and save the excel file
run the sql server import for your modified excel file... be sure to double check the data type chosen by the importer is not numeric... if it is, repeat the above steps with a different set of characters
The data in the database will have the quotes once the import is done... you can update the data later on to remove the quotes, or use the "replace" function in your read query, such as "replace([dbo].[MyTable].[MyColumn], '''', '')"
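For anyone without awk handy, the quoting step and the later clean-up can be sketched in Python. This is only an approximation of the awk one-liner above: awk's $1 takes the first whitespace-separated field, while this version wraps the whole trimmed line. The function names are invented.

```python
def quote_values(lines):
    """Wrap each non-empty value in single quotes so it reads as text."""
    return ["'" + line.strip() + "'" for line in lines if line.strip()]


def strip_quotes(value):
    """Undo the wrapping after import, like the replace() call in the read query."""
    return value.replace("'", "")


# Hypothetical contents of tmp.txt, one value per line.
quoted = quote_values(["00123\n", "04567\n", "\n"])
restored = [strip_quotes(value) for value in quoted]
print(quoted)
print(restored)
```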

Resources