Importing CSV files to Excel 2003 with more than 256 columns - arrays

I like to import some csv files with 375 columns into excel 2003. But excel 2003 limits the number of columns to 256.
I must edit this question due to relevancy
I have several CSV-files with exactly 375 columns and different number of rows. How can I delete every second (odd) column which contains unnecessary information. The unneeded columns are from number 5 to 375 (5, 7, 9, ...373, 375).
Is there a useful solution to delete all unneeded columns with a script in VBS?
I will keep my fingers cross to get a solution and thank you in advance.
Mike

Have a look at the CSV Editors. They will generally allow you to delete columns. You will have to specify which columns to delete (i.e. you will not be able to delete columns via the content). But you should be able to delete enough to view / edit in Excel. Also you could
cut/paste from the Csv Editor into Excel as well.
Here are several (first 2 are free, the next 2 have a 30 day trial)
http://csved.sjfrancke.nl/
http://recsveditor.sourceforge.net/
http://www.whitepeaksoftware.com/killink-csv-editor/
http://www.gammadyne.com/csv_editor_pro.htm
You could also google "CSV editor" / check SourceForge for more

After a long time and no response...
I tried several CSV-editors (free or purchasable), but nobody seems to have the options to delete columns by having the string "percent" in their header). Only ReCsvEditor_0.80.6 is working with filters, but the filter seems not to work properly.
If anyone have a suitable VBS (WSH) script solution I would be very grateful.

Related

SSIS dynamic column names mapping

My main task is to upload multiple files in a table. The source files doesn't contain headers and once I configure it on source editor, the column names both in input and output columns are empty (see picture below)
The table consists of 16 columns so I had to manually configure it on the output column to properly map it. It wouldn't be much of a problem to me to do it manually if I only need to do it once. The hard part is I need to upload 12 files for a year (monthly data) and 4 years total.
Any help would very much be appreciated.
Thanks in advance.

Import Flat File into SQL Server Repeated Headers Break Integrity Constraints

I'm having several issues with importing a flat file into MS SQL Server using the SQL Server import / export wizard. I'd like to know how to effectively load the file into a SQL Server table.
File Conditions:
The flat file is fairly large (800MB, and serveral million rows)
It's poorly formatted
The first column is empty
The header is a 3 row set: top blank, middle has field names, bottom blank
This 3 row header is repeated approximately every 60,000 rows
Some values are nulls
It's tab delimited
First, I tried to load it in as Flat File, but SQL server failed to recognize the tab delimiters. Excel opens it correctly (although partially), but SQL Server sticks it all in 1 column.
Second, I tried opening and saving it as an excel file and loading it as an excel file into the SQL Server import wizard (which I'm not sure if it resaves all the data anyway). Now SQL Server parses the columns correctly, but it says integrity constrints are broken when it hits the repeated headers (every numeric type field has a string header every 60000 rows).
If anyone can tell me how to get around this that would be great. I'd ideally like to upload it without the integrity constraints and remove the extra headers with a DELETE WHERE header or blank clause. Not the only solution I'll take, but an idea.
Also, this is my first stackoverflow post, so patience is appreciated.
Thanks,
Since I don't have a formal answer yet, I'll post what I ended up doing.
Essentially, I just made everything a varchar so it would just load into a table. Then I wrote several queries to clean up the garbage in it. Later I made new typed fields and filled them with an insert and cast from the varchar typed fields.
I don't know that this will ever help someone, but at least there's an answer here.

How to turn huge Excel sheets into a database?

I have 12 very-large Excel sheets.
Each one is 102 columns wide, and an average of 600K rows long.
They're all identical in structure.
Quite often I need to run a specific query, from only a subset of columns, with a specific criteria. And the process of doing so by opening each file and filtering/sorting/finding what I need has become very tedious.
If they were all in a database, such queries would be so much easier.
I tried MS Access, and I tried SQL Server Express.
Both die on me during the respective Import wizards.
And the failure is certainly due to the size of the data set, because if I manually trim a file to say 10 rows, the import works fine.
Any ideas how to do this? I'm open to using Access, or SQL Server Express, or any other tool that does the job really.
Note: Some of the columns contain records that contain commas, hence several suggestions I found to turn the files into CSVs prior to import, ended up with broken structure.
Edit 1: They're 12 separate workbooks, with 1 sheet in each. And the aim is create a single table where all the data is appended.

Import data from .xls to table by removing unwanted columns? [duplicate]

I need to import sheets which look like the following:
March Orders
***Empty Row
Week Order # Date Cust #
3.1 271356 3/3/10 010572
3.1 280353 3/5/10 022114
3.1 290822 3/5/10 010275
3.1 291436 3/2/10 010155
3.1 291627 3/5/10 011840
The column headers are actually row 3. I can use an Excel Sourch to import them, but I don't know how to specify that the information starts at row 3.
I Googled the problem, but came up empty.
have a look:
the links have more details, but I've included some text from the pages (just in case the links go dead)
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb
Q:
While we are loading the text file to SQL Server via SSIS, we have the
provision to skip any number of leading rows from the source and load
the data to SQL server. Is there any provision to do the same for
Excel file.
The source Excel file for me has some description in the leading 5
rows, I want to skip it and start the data load from the row 6. Please
provide your thoughts on this.
A:
Easiest would be to give each row a number (a bit like an identity in
SQL Server) and then use a conditional split to filter out everything
where the number <=5
http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217
Q:
Is it possible during import data from Excel to DB table skip first 6 rows for example?
Also Excel data divided by sections with headers. Is it possible for example to skip every 12th row?
A:
YES YOU CAN. Actually, you can do this very easily if you know the number columns that will be imported from your Excel file. In
your Data Flow task, you will need to set the "OpenRowset" Custom
Property of your Excel Connection (right-click your Excel connection >
Properties; in the Properties window, look for OpenRowset under Custom
Properties). To ignore the first 5 rows in Sheet1, and import columns
A-M, you would enter the following value for OpenRowset: Sheet1$A6:M
(notice, I did not specify a row number for column M. You can enter a
row number if you like, but in my case the number of rows can vary
from one iteration to the next)
AGAIN, YES YOU CAN. You can import the data using a conditional split. You'd configure the conditional split to look for something in
each row that uniquely identifies it as a header row; skip the rows
that match this 'header logic'. Another option would be to import all
the rows and then remove the header rows using a SQL script in the
database...like a cursor that deletes every 12th row. Or you could
add an identity field with seed/increment of 1/1 and then delete all
rows with row numbers that divide perfectly by 12. Something like
that...
http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2
Q:
I have an SSIS package that imports from an Excel file with data
beginning in the 7th row.
Unlike the same operation with a csv file ('Header Rows to Skip' in
Connection Manager Editor), I can't seem to find a way to ignore the
first 6 rows of an Excel file connection.
I'm guessing the answer might be in one of the Data Flow
Transformation objects, but I'm not very familiar with them.
A:
Question Sign in to vote 1 Sign in to vote rbhro, actually there were
2 fields in the upper 5 rows that had some data that I think prevented
the importer from ignoring those rows completely.
Anyway, I did find a solution to my problem.
In my Excel source object, I used 'SQL Command' as the 'Data Access
Mode' (it's drop down when you double-click the Excel Source object).
From there I was able to build a query ('Build Query' button) that
only grabbed records I needed. Something like this: SELECT F4,
F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4
<> 'TheHeaderFieldName')
Note: I initially tried an ISNUMERIC instead of 'IS NOT NULL', but
that wasn't supported for some reason.
In my particular case, I was only interested in rows where F4 wasn't
NULL (and fortunately F4 didn't containing any junk in the first 5
rows). I could skip the whole header row (row 6) with the 2nd WHERE
clause.
So that cleaned up my data source perfectly. All I needed to do now
was add a Data Conversion object in between the source and destination
(everything needed to be converted from unicode in the spreadsheet),
and it worked.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
We provide guidance to our customers and vendors about how files must be formatted before we can process them and it is up to them to meet the guidlines as much as possible. People often aren't aware that files like that create a problem in processing (next month it might have six lines before the data starts) and they need to be educated that Excel files must start with the column headers, have no blank lines in the middle of the data and no repeating the headers multiple times and most important of all, they must have the same columns with the same column titles in the same order every time. If they can't provide that then you probably don't have something that will work for automated import as you will get the file in a differnt format everytime depending on the mood of the person who maintains the Excel spreadsheet. Incidentally, we push really hard to never receive any data from Excel (only works some of the time, but if they have the data in a database, they can usually accomodate). They also must know that any changes they make to the spreadsheet format will result in a change to the import package and that they willl be charged for those development changes (assuming that these are outside clients and not internal ones). These changes must be communicated in advance and developer time scheduled, a file with the wrong format will fail and be returned to them to fix if not.
If that doesn't work, may I suggest that you open the file, delete the first two rows and save a text file in a data flow. Then write a data flow that will process the text file. SSIS did a lousy job of supporting Excel and anything you can do to get the file in a different format will make life easier in the long run.
My first suggestion is not to accept a file in that format. Excel files to be imported should always start with column header rows. Send it back to whoever provides it to you and tell them to fix their format. This works most of the time.
Not entirely correct.
SSIS forces you to use the format and quite often it does not work correctly with excel
If you can't change he format consider using our Advanced ETL Processor.
You can skip rows or fields and you can validate the data the way you want.
http://www.dbsoftlab.com/etl-tools/advanced-etl-processor/overview.html
Sky is the limit
You can just use the OpenRowset property you can find in the Excel Source properties.
Take a look here for details:
SSIS: Read and Export Excel data from nth Row
Regards.

SQL import wizard drops leading zero

I've read all the posts about prefixing the numbers with ', and setting IMEX=1 in the connection string; nothing seems to do the trick for me.
Here's the setup: Excel column with mixed data - 99% numbers (some start with 0) 1% text.
PROGRAMATICALLY mporting into SQL Server 2005 table / column type - varchar(255).
Import works fine locally, but once i move the code to production (GoDaddy), it drops the leading 0's in the column.
Any ideas?
p.s. I knew about the registry change solution, matter of fact - the value was set to 0 on my dev box, but the answer made me realize that the value wasn't set on the PRODUCTION SERVER :)
The ISAM driver only samples the first 8 rows, but you can change that behaviour through a registry change:
http://sqlserversd.wordpress.com/2008/09/14/ssis-excel-values-import-as-nulls/
But yes, using Excel for machine-to-machine data transfer is a nightmare...Is there no other way you can be sent the data?
Yes. The Excel driver only sample the 1st 8 rows to determine the data type.
This means that it assumes the column is numeric is "bob" does not appear in rows 1 to 8.
The target table column datatype is irrelevant.
Ths issue has been there for a long time, I saw it in 2003 the first time.
BOL notes on excel import
We usually save the file as a .csv file or .txt file and then the issue doesn't occur.
There is a quick and tricky way for this. following these steps BUT first copy all the data / columns and rows from the actual excel sheet into another excel sheet just to be of the safe side so that you have the actual data to compare with.
Steps:
Copy all the values in the column and paste them into a notepad.
Now change the column type to text in the Excel sheet (it will trim the preceding / trailing Zeros), don't worry about that.
Go to Notepad and copy all the values that you have pasted just now.
Go to your excel sheet and paste the values in same column.
If you have more than one column with 0 values then just repeat the same.
Now your excel document is ready to be imported with Zero Values :).
Happy Days.

Resources