Reading and writing to xls and doc files in C

I have to write a C program that reads numerical data from a text file. The data is tab-delimited. Here is a sample from the text file:
1 23099 345565 345569
2 908 66766 66768
This is client data, with one row per client. The columns are customer number, previous balance, previous reading, and current reading. I then have to generate a .doc document
that summarizes all this information and calculates the balance. I can write a function that does the calculation, but how do I create an xls document
and a Word document from the program, with all the results summarized? The text file contains only numerical data. Any ideas?

The easiest way is to create a CSV file rather than an xls file.
Office opens CSV files with good results.
It is also much easier to write an ASCII text file with comma-separated values
than to write a closed format such as the MS Office formats.
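A minimal sketch of the CSV approach in C, assuming each input row holds four whitespace- or tab-separated integers as in the sample above. The function names row_to_csv and convert_to_csv and the header names are illustrative only; they do not come from any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turns one input row (customer no., previous balance,
   previous reading, current reading) into one CSV row.
   Returns 0 on success, -1 if the line does not hold four integers
   or the output buffer is too small. */
int row_to_csv(const char *line, char *out, size_t out_size)
{
    long no, prev_balance, prev_reading, cur_reading;
    int n;

    assert(line != NULL && out != NULL);
    /* "%ld" skips leading whitespace, so tabs and spaces both work as separators. */
    if (sscanf(line, "%ld %ld %ld %ld",
               &no, &prev_balance, &prev_reading, &cur_reading) != 4)
        return -1;
    n = snprintf(out, out_size, "%ld,%ld,%ld,%ld\n",
                 no, prev_balance, prev_reading, cur_reading);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Copies every valid row from `in` to `out` as CSV, preceded by a header row.
   Blank lines are skipped silently; malformed lines are reported on stderr.
   Returns the number of data rows written. */
int convert_to_csv(FILE *in, FILE *out)
{
    char line[256], csv[256];
    int rows = 0;

    fputs("customer_no,previous_balance,previous_reading,current_reading\n", out);
    while (fgets(line, sizeof line, in) != NULL) {
        if (line[strspn(line, " \t\r\n")] == '\0')
            continue;                       /* blank line */
        if (row_to_csv(line, csv, sizeof csv) == 0) {
            fputs(csv, out);
            rows++;
        } else {
            fprintf(stderr, "skipping malformed line: %s", line);
        }
    }
    return rows;
}
```

A caller would open the text file with fopen(..., "r"), open the target .csv with fopen(..., "w"), and pass both streams to convert_to_csv. The balance calculation itself is left out because the question does not state the formula.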

The simplest way to create a spreadsheet that contains formulas and formatting, and can be opened by Excel, is to create an XML Spreadsheet file.
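As a rough sketch of that idea, the C function below writes a small workbook in the Excel 2003 XML Spreadsheet format. The struct client type, the function names, the sheet name "Summary" and the extra fifth column (current reading minus previous reading) are assumptions made for illustration; the real balance formula is not given in the question. Formulas in this format use R1C1 notation, so RC[-1] means "same row, one column to the left".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type matching the four columns of the input file. */
struct client {
    long no, prev_balance, prev_reading, cur_reading;
};

/* Writes one numeric cell. */
static void number_cell(FILE *out, long value)
{
    fprintf(out, "    <Cell><Data ss:Type=\"Number\">%ld</Data></Cell>\n", value);
}

/* Writes `count` rows as an XML Spreadsheet workbook.
   Returns the number of rows written, or -1 on a write error. */
int write_xml_spreadsheet(FILE *out, const struct client *rows, size_t count)
{
    size_t i;

    assert(out != NULL && (rows != NULL || count == 0));
    fputs("<?xml version=\"1.0\"?>\n"
          "<?mso-application progid=\"Excel.Sheet\"?>\n"
          "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"\n"
          "          xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">\n"
          " <Worksheet ss:Name=\"Summary\">\n"
          "  <Table>\n", out);
    for (i = 0; i < count; i++) {
        fputs("   <Row>\n", out);
        number_cell(out, rows[i].no);
        number_cell(out, rows[i].prev_balance);
        number_cell(out, rows[i].prev_reading);
        number_cell(out, rows[i].cur_reading);
        /* Assumed extra column: column 4 minus column 3, as a live formula.
           The cached value is written too, for readers that do not recalculate. */
        fprintf(out,
                "    <Cell ss:Formula=\"=RC[-1]-RC[-2]\">"
                "<Data ss:Type=\"Number\">%ld</Data></Cell>\n",
                rows[i].cur_reading - rows[i].prev_reading);
        fputs("   </Row>\n", out);
    }
    fputs("  </Table>\n"
          " </Worksheet>\n"
          "</Workbook>\n", out);
    return ferror(out) ? -1 : (int)count;
}
```

Saving the output with an .xml extension should let Excel open it as a workbook; whether a renamed .xls is accepted without a warning depends on the Excel version.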

Related

Extract data from Word documents with SSIS to ETL into SQL

I could really use some help with extracting data from Word documents using SSIS and inserting the extracted data into SQL. There are 10,000 - 13,000 Word files to process. The files most likely aren't consistent over the years. Any help is greatly appreciated!
Below is the example data from the Word documents that I'm interested in capturing. Note that Date and Job No are in the Header section.
Customer : Test Customer
Customer Ref. : 123456
Contact : Test Contact
Part No. : 123456789ABCDEFG
Manufacturer : Some Mfg.
Package : 123-456
Date Codes : 1234
Lot Number : 123456
Country of Origin : Country
Total Incoming Qty : 1 pc
XRF Test Result : PASS
HCT Result : PASS
Solder Test Result : PASS
My approach would be this:
Create a script in Python that extracts your data from the Word files and saves it in XML or JSON format
Create an SSIS package to load the data from each XML/JSON file into SQL Server
1. Using a script component as a source
To import data from Microsoft Word into SQL Server, you can use a script component as a data source where you can implement a C# script to parse document files using Office Interoperability libraries or any third-party assembly.
Example of reading tables from a Word file
2. Extracting XML from DOCX file
A DOCX file is composed of several embedded files. The text is mainly stored in an XML file. You can use a Script Task or an Execute Process Task to extract the DOCX file content and then use an XML source to read the data.
How can I extract the data from a corrupted .docx file?
How to extract just plain text from .doc & .docx files?
3. Converting the Word document into a text file
The third approach is to convert the Word document into a text file and use a flat-file connection manager to read the data.
convert a word doc to text doc using C#
Converting a Microsoft Word document to a text file in C#
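Once a document has been converted to plain text, each "Label : Value" line from the sample can be split at its first colon. The sketch below shows that step in C purely as an illustration (an SSIS script component would normally use C#); the function names parse_field and copy_trimmed are hypothetical, and it assumes the label itself never contains a colon.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copies src[0..len) into dst without surrounding whitespace.
   Truncates to dst_size - 1 characters and always NUL-terminates. */
static void copy_trimmed(char *dst, size_t dst_size, const char *src, size_t len)
{
    while (len > 0 && isspace((unsigned char)*src)) {
        src++;
        len--;
    }
    while (len > 0 && isspace((unsigned char)src[len - 1]))
        len--;
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Splits a line such as "Customer Ref. : 123456" at the first colon.
   Returns 0 on success, -1 if there is no colon or the label is empty. */
int parse_field(const char *line, char *key, size_t key_size,
                char *value, size_t value_size)
{
    const char *colon;

    assert(line != NULL && key != NULL && value != NULL);
    assert(key_size > 0 && value_size > 0);

    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;
    copy_trimmed(key, key_size, line, (size_t)(colon - line));
    copy_trimmed(value, value_size, colon + 1, strlen(colon + 1));
    return key[0] != '\0' ? 0 : -1;
}
```

Because the files are said to be inconsistent over the years, lines that return -1 are worth logging rather than dropping silently.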

I need to extract three columns from a file

I am doing a project in Association Rule Mining. For my data set, I need to extract three columns from a text file.
Here is the link for the text file.
I need to extract Billno, Product and Batch columns and write them on to a text file.
The easiest way would be to use grep
http://www.dreamincode.net/forums/topic/290545-using-a-grep-command-in-a-c-program-in-linux/
Otherwise, the data is consistent: all the bill numbers are the same length and appear easy to match with a regular expression.
http://www.cplusplus.com/reference/regex/

J2ME Writing in CSV file through file Connection API

I am creating an application in which I need to add a new column to a CSV file and then fill in the entries for that column.
I have tried OutputStream and PrintStream, but the data is written at the start of the file, whereas I want to write it at an arbitrary position of my choosing.
RandomAccessFile is not recognized by the application.
For e.g.
My CSV is:
Name,any_date
A,
B,
C,
And after writing it will look like
Name,any_date
A,p
B,a
C,p
I am using the File Connection API to read and write. Can anyone suggest how to do that?
Thanks in advance.
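I think you want to append data to the file.
You can try this:
// fconn is an already opened FileConnection; start writing at the current end of the file
os = fconn.openOutputStream(fconn.fileSize());
os.write(data.getBytes());
os.close();
This is a simple example that appends data at the end of the file.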

Read from excel file in C

I want to read from an Excel file in C. The Excel 2007 file contains about 6000 rows and 2 columns. I want to store the contents in a 2-D array in C. If a C library or any other method exists, please let me know.
Excel 2007 stores the data in a bunch of files, most of them in XML, all crammed together into a zip file. If you want to look at the contents, you can rename your .xlsx to whatever.zip and then open it and look at the files inside.
Assuming your Excel file just contains raw data, and all you care about is reading it (i.e., you do not need/want to update its contents and get Excel to open it again), reading the data is actually pretty easy. Inside the zip file, you're looking for the subdirectory xl\worksheets\, which will contain a number of .xml files, one for each worksheet from Excel (e.g., a default workbook will have three worksheets named sheet1.xml, sheet2.xml and sheet3.xml).
Inside of those, you're looking for the <sheetData> tag. Inside of that, you'll have <row> tags (one for each row of data), and inside of them <c> tags with an attribute r=RC, where RC is replaced by the normal column/row notation (e.g., "A1"). The <c> tag will have a nested <v> tag where you'll find the value for that cell.
I do feel obliged to add a warning though: while reading really simple data can indeed be just this easy, life can get a lot more complex in a hurry if you decide to do much more than read simple rows/columns of numbers.
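The scan described above can be sketched in C as follows, assuming the worksheet XML has already been extracted from the zip and loaded into a NUL-terminated string. This is plain string searching, not a real XML parser: it expects unprefixed element names, handles numeric cells only, and skips any cell with a t attribute other than "n" (for example t="s", where <v> holds an index into the shared-strings file rather than the value itself). The struct cell type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct cell {
    char ref[16];   /* cell reference such as "A1" */
    double value;   /* numeric content of the <v> element */
};

/* Finds needle inside [begin, end); returns NULL if it is absent. */
static const char *find_in(const char *begin, const char *end, const char *needle)
{
    size_t n = strlen(needle);
    const char *p;

    for (p = begin; p + n <= end; p++)
        if (memcmp(p, needle, n) == 0)
            return p;
    return NULL;
}

/* Collects up to max_cells numeric cells that follow <sheetData>.
   Returns the number of cells stored. */
size_t read_numeric_cells(const char *xml, struct cell *cells, size_t max_cells)
{
    size_t count = 0;
    const char *p;

    assert(xml != NULL && cells != NULL);
    p = strstr(xml, "<sheetData");
    while (p != NULL && count < max_cells && (p = strstr(p, "<c ")) != NULL) {
        const char *tag_end = strchr(p, '>');
        const char *r, *t, *cell_end, *v;
        int numeric;

        if (tag_end == NULL)
            break;
        r = find_in(p, tag_end, " r=\"");
        t = find_in(p, tag_end, " t=\"");
        numeric = (t == NULL) || (t[4] == 'n' && t[5] == '"');
        /* A self-closing <c .../> has no value; otherwise the cell ends at </c>. */
        cell_end = (tag_end[-1] == '/') ? tag_end : strstr(tag_end, "</c>");
        if (cell_end == NULL)
            break;
        v = find_in(tag_end, cell_end, "<v>");
        if (r != NULL && numeric && v != NULL) {
            const char *ref = r + 4;
            size_t len = strcspn(ref, "\"");

            if (len < sizeof cells[count].ref) {
                memcpy(cells[count].ref, ref, len);
                cells[count].ref[len] = '\0';
                cells[count].value = strtod(v + 3, NULL);
                count++;
            }
        }
        p = cell_end;
    }
    return count;
}
```

For anything beyond simple numeric sheets, a zip library plus a proper XML parser would be the safer route.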
You have several choices:
1) Save your Excel worksheet to a CSV file and parse that.
2) Use the COM API (Windows proprietary and tricky).
3) See this link for a C++ class that you could modify.
Another C lib to read data from excel files can be found here.
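For option 1, a minimal sketch of loading a two-column CSV export into a 2-D array might look like this. It assumes both columns are numeric and that no field is quoted or contains an embedded comma; COLS, parse_csv_row and load_csv are names chosen for the example.

```c
#include <assert.h>
#include <stdio.h>

#define COLS 2

/* Parses one "a,b" line into row[0] and row[1].
   Returns 0 on success, -1 if the line does not hold two numbers. */
int parse_csv_row(const char *line, double row[COLS])
{
    assert(line != NULL && row != NULL);
    /* Spaces in the format match optional whitespace around the comma. */
    return sscanf(line, " %lf , %lf", &row[0], &row[1]) == COLS ? 0 : -1;
}

/* Reads up to max_rows rows from `in` into data.
   Lines that do not parse (for example a header row) are skipped.
   Returns the number of rows stored. */
size_t load_csv(FILE *in, double data[][COLS], size_t max_rows)
{
    char line[256];
    size_t n = 0;

    assert(in != NULL && data != NULL);
    while (n < max_rows && fgets(line, sizeof line, in) != NULL)
        if (parse_csv_row(line, data[n]) == 0)
            n++;
    return n;
}
```

With about 6000 rows, the array can be declared as static double data[6000][COLS]; so that it does not live on the stack.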

components of an SPSS project

I have given some data in an Excel sheet to a third party for SPSS data processing. After the processing is complete, which files should I get back from them?
I have received one file with a ".sav" extension. I presume this file contains the data imported from my Excel file.
I have also received documents (.rtf, rich text format) containing only the charts and graphs. Is there anything else I need to get so that I can use the files later for further analysis?
Thanks in advance
V Karthick
Yes, the ".sav" extension is the data file. You should also request the syntax file(s), ".sps" extension. The syntax file is a record of all data transformations which have been performed and allows you to review their work. The syntax file can be opened with notepad or any text editor.
Arthur
