I'd like to create a PowerPoint (not Javascript/HTML/PDF/Keynote/.mov) using code (any language, C preferred) for free.
(I've seen this SO question which references how to create them in C#)
Is this even possible? How can I write the raw bits that make up a PowerPoint file? Any good libraries for doing this?
UPDATE The Microsoft Reference Page for the binary format is here.
Open Office has an API. You can use the C++ bindings (doc available here). If you really need C, you'll have to do some wrapping.. but hey, it's Christmas, isn't it ;-)
Open Office has export functions to create .ppt compatible files.
PowerPoint you may not, but OpenOffice Impress you may. (Yoda style answer :) )
Take a look at the ODF Toolkit project. They aim to produce lots of libraries for generating this kind of content programmatically.
Unless you're specifically interested in PowerPoint 2003 binary files, PowerPoint 2007 and later .pptx files are actually a collection of XML files inside a ZIP archive. You can see this by simply renaming a .pptx file to .zip and opening it.
You can create these XML files in any way you like, such as writing code to do it.
PresentationML defines the PowerPoint XML documents. Have a look here, for example:
http://msdn.microsoft.com/en-us/openspecifications/hh295812.aspx
The standards can be found here:
http://www.ecma-international.org/publications/standards/Ecma-376.htm
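To illustrate the "XML files inside a ZIP" layout from C, here is a minimal sketch that reads the name of the first entry from the start of a .pptx file already loaded into memory. The function name `zip_first_entry_name` is made up for this example; the offsets (signature `PK\x03\x04`, name length at byte 26, name at byte 30) come from the ZIP local file header layout. It only looks at the first header and does not decompress anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: copy the name of the first entry of a ZIP archive
 * (for example a .pptx file) held in buf into out.
 * Returns 0 on success, -1 if buf does not start with a ZIP local file
 * header or if the name does not fit into out.
 */
int zip_first_entry_name(const unsigned char *buf, size_t len,
                         char *out, size_t out_size)
{
    static const unsigned char sig[4] = { 'P', 'K', 0x03, 0x04 };
    size_t name_len;

    assert(buf != NULL && out != NULL);
    if (len < 30 || memcmp(buf, sig, sizeof sig) != 0)
        return -1;                          /* not a local file header */

    name_len = read_le16(buf + 26);         /* file name length field */
    if (30 + name_len > len || name_len + 1 > out_size)
        return -1;                          /* truncated input or small out */

    memcpy(out, buf + 30, name_len);        /* name follows the 30-byte header */
    out[name_len] = '\0';
    return 0;
}
```

Writing a .pptx goes the other way: generate the XML parts, then pack them with any ZIP library.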
If you don't mind going to Java, Apache POI provides readers and writers for most MS Office formats (up to the 2003 version anyway).
Related
An MSI database contains a set of tables, and I can successfully enumerate the File table, which has the metadata of all deployable files. What I need to extract is the actual contents of those files. msiexec, lessmsi and 7-Zip can all do it, but I couldn't find any source/API to do it.
What I've discovered is that all other (resource) files are in the Binary table, and the Data field can be used to get the content of those files (icons, custom DLLs, etc.).
Further, I found that the Media table contains information about the .CAB file (the MSI has all content embedded with <MediaTemplate EmbedCab="yes"/>). This simply means the CAB file contains the actual content. I probably need to read the contents from the "Structured Storage" of the .msi file.
How to extract the contents of CAB/MSI file, using native C Msi* functions?
Phil has given you the easy/simple answer, but I thought I might give you a little more information since you've done some research. Check out:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa372919(v=vs.85).aspx
This is where the structured storage is. You'll see something like Disk1.cab as the Name (PK) and binary data. The data is a CAB file, with each file entry in the cab matching the File.File column. From there you can use the File.FileName column to get the short name and long name (you'll want the long name, no doubt) and do a join to the Component table to get the Directory table ID.
You'll also need to recurse the directory table to build the tree of directories and know where to put the files.
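Recursing the Directory table can be sketched roughly like this. It assumes the rows have already been read into memory; `struct dir_row`, `build_path` and the other helpers are names invented for this example. The three fields mirror the Directory, Directory_Parent and DefaultDir columns. The sketch only handles the `short|long` form and the `.` placeholder of DefaultDir, not the `target:source` form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the MSI Directory table, already loaded into memory. */
struct dir_row {
    const char *directory;        /* Directory (primary key) */
    const char *directory_parent; /* Directory_Parent, NULL or "" for a root */
    const char *default_dir;      /* DefaultDir, possibly "short|long" */
};

static const struct dir_row *find_row(const struct dir_row *rows, size_t n,
                                      const char *key)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(rows[i].directory, key) == 0)
            return &rows[i];
    return NULL;
}

/* Return the long name from "short|long", or the value itself. */
static const char *long_name(const char *default_dir)
{
    const char *bar = strchr(default_dir, '|');
    return bar ? bar + 1 : default_dir;
}

/* Append the path of `key` to out, resolving the parent chain first. */
static int build_path_rec(const struct dir_row *rows, size_t n, const char *key,
                          char *out, size_t out_size, int depth)
{
    const struct dir_row *row = find_row(rows, n, key);
    const char *parent, *name;
    size_t used;

    if (row == NULL || depth > 64)          /* unknown key or likely cycle */
        return -1;
    parent = row->directory_parent;
    if (parent != NULL && parent[0] != '\0' && strcmp(parent, key) != 0)
        if (build_path_rec(rows, n, parent, out, out_size, depth + 1) != 0)
            return -1;

    name = long_name(row->default_dir);
    if (strcmp(name, ".") == 0)             /* "." means: same as the parent */
        return 0;
    used = strlen(out);
    if (used + strlen(name) + 2 > out_size) /* separator + name + NUL */
        return -1;
    if (used > 0)
        strcat(out, "\\");
    strcat(out, name);
    return 0;
}

/* Build the full path of directory `key`. Returns 0 on success, -1 on error. */
int build_path(const struct dir_row *rows, size_t n, const char *key,
               char *out, size_t out_size)
{
    assert(rows != NULL && key != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    return build_path_rec(rows, n, key, out, out_size, 0);
}
```

Each File row then lands in the path built for its component's directory, followed by the long file name.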
Fun stuff. There are some libraries in C# that make this WAY simpler. Or just call msiexec /a as Phil says. :)
The most straightforward way to extract all the files to some location is to run an administrative installation of the product. If you do a:
msiexec /a [path to msi] TARGETDIR=[some folder]
you'll see what happens.
In C++, call MsiInstallProduct() with the package path and a property string such as ACTION=ADMIN TARGETDIR=[some folder].
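As a rough sketch of that call from C: MsiInstallProduct takes the package path plus a string of PROPERTY=value pairs, and ACTION=ADMIN requests an administrative installation. The helper names `format_admin_args` and `extract_msi` are made up for this example, and the Windows-only part is guarded so the string formatting can be tried on any platform.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <msi.h>                /* link with msi.lib */
#endif

/*
 * Hypothetical helper: format the property string for an administrative
 * installation into out. Returns 0 on success, -1 if the target directory
 * contains a double quote or the result does not fit.
 */
int format_admin_args(char *out, size_t out_size, const char *target_dir)
{
    int n;

    assert(out != NULL && target_dir != NULL);
    if (strchr(target_dir, '"') != NULL)    /* would break the quoting below */
        return -1;
    n = snprintf(out, out_size, "ACTION=ADMIN TARGETDIR=\"%s\"", target_dir);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

#ifdef _WIN32
/* Hypothetical wrapper: extract msi_path into target_dir without any UI. */
unsigned int extract_msi(const char *msi_path, const char *target_dir)
{
    char args[1024];

    if (format_admin_args(args, sizeof args, target_dir) != 0)
        return ERROR_INSUFFICIENT_BUFFER;
    MsiSetInternalUI(INSTALLUILEVEL_NONE, NULL);    /* no dialogs */
    return MsiInstallProductA(msi_path, args);      /* ERROR_SUCCESS on success */
}
#endif
```

On Windows, something like `extract_msi("C:\\temp\\setup.msi", "C:\\temp\\out")` should behave like the msiexec /a command line above.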
You have gotten many good answers already, including the use of dark.exe from the WiX toolkit. By downloading the WiX source code you should be able to get the code you need ready-made from there. I assume you may already have done this.
Chris has already linked to the DTF code you can check, but here is a link directly to dark.exe as well: https://github.com/wixtoolset/wix3/tree/develop/src/tools/dark. I would try both. This is C#, you seem to want native.
UPDATE: Before I get to the Win32 features you can use, check out this little summary of the C# DTF features: How to programmatically read the properties inside an MSI file?
Native Win32 functions: The database functions to deal with an MSI file can be found on MSDN (this is to deal with the MSI file as a database). There are also MSI Installer Functions (used to deal with the MSI file as an actual installer).
You can certainly find good examples of native code for this with a good Google search. Have fun!
BTW: It would help to have a description of the actual problem you are trying to solve as well as what you need technically. There could, as always, be less involved ways to achieve what you need, unless you are writing security software, a malware scanner or something similarly involved.
And so it is clear: WiX's dark.exe fully decompiles MSI files into WiX source files and the resource files used to build them - you can then text and binary compare the various types of content (text compare for tables, binary compare for binaries, etc...). The process to do so via command line is described in the following answer: How can I compare the content of two (or more) MSI files? (this is about comparing MSI files, but one option to do so is to decompile them - see section on dark.exe - just for reference for others who find your question).
I like to link things together so we can find content easily at a later point in time. Strictly speaking it doesn't seem necessary here, you have what you need I think but others could perhaps benefit from some further links. Here are some related links:
Extract MSI from EXE.
What is the purpose of administrative installation initiated using msiexec /a?
How do I extract files from an MSI package? (explains why you should not use 7-Zip to extract).
I'm a newbie in MS Search so please forgive the dumb question :-)
I'm storing a large amount of specialized text files for a card game (bridge).
These files are plain textfiles with a specific format to describe a bridge game played in a championship.
The only difference with a regular .txt file is the file extension that is NOT ".txt" but ".lin"
What I need is to implement a new IFilter that is an exact copy of the standard MS Search text IFilter, but with another file extension.
Is this possible by copy/pasting an existing filter and tweaking (tampering with) its content?
Or do I have to use C# to edit the IFilter and recompile?
The Windows 7 SDK has a sample IFilter implementation that would be a good blueprint for what you are trying to do. It contains a project called "SmpFilt". The code shows parsing of a text file with a custom file extension. You will need to modify the code to parse your text instead and pull out any custom attributes from your .lin files.
Unfortunately, you can no longer build custom IFilters with managed code (C#/VB, etc.). The sample project is in C++. Windows 7 and Server 2008 won't load IFilters written in managed code.
Good luck.
I have an application that uses encrypted (txt) files to store data. After investigating the decompiled assembly I concluded that it's a file of some DBMS. So how can I find out which DBMS this application is using to store its data, so that I can attach that file to the correct DBMS?
This is a small application and there is no license problem. I could just ask the owner to give me the data, but I'm curious to solve this myself.
MORE INFO:
The platform is Windows, and after trying a couple of decompilers I concluded that it WAS written in Visual C++. However, I couldn't fully decompile this exe; otherwise I could just find it out from the source code.
A couple of ideas.
If opening the file in a hex editor doesn't give you any information (like a magic identifier at the start of the file, which you can pop into Google), then:
Use the Dependency Walker (depends.exe) tool from Microsoft to grab a list of the DLLs being loaded by the application. Chances are whatever DBMS it's using is contained in an external library.
If the first two suggestions yield nothing, load the executable into IDA Pro Freeware and have a look at the code which is creating these files.
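The "magic identifier" check from the first idea can also be automated. Below is a minimal sketch in C that compares the first bytes of a file against a few well-known database signatures. The table is illustrative and far from complete, and `identify_format` is a name chosen for this example. If the file really is encrypted as a whole, none of these will match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A known signature: `length` bytes expected at `offset` in the file. */
struct magic {
    const char *name;
    size_t offset;
    const unsigned char *bytes;
    size_t length;
};

/* The SQLite header string is 16 bytes including its terminating NUL. */
static const unsigned char SQLITE_MAGIC[] = "SQLite format 3";
static const unsigned char JET_MAGIC[] = "Standard Jet DB";
static const unsigned char ACE_MAGIC[] = "Standard ACE DB";
static const unsigned char OLE_MAGIC[] = {
    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1
};

static const struct magic KNOWN[] = {
    { "SQLite 3 database", 0, SQLITE_MAGIC, sizeof SQLITE_MAGIC },
    { "Microsoft Access (Jet) database", 4, JET_MAGIC, sizeof JET_MAGIC - 1 },
    { "Microsoft Access (ACE) database", 4, ACE_MAGIC, sizeof ACE_MAGIC - 1 },
    { "OLE compound file (structured storage)", 0, OLE_MAGIC, sizeof OLE_MAGIC },
};

/*
 * Return a description of the format whose signature matches the start
 * of buf, or NULL if none of the known signatures match.
 */
const char *identify_format(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < sizeof KNOWN / sizeof KNOWN[0]; i++) {
        const struct magic *m = &KNOWN[i];
        if (m->offset + m->length <= len &&
            memcmp(buf + m->offset, m->bytes, m->length) == 0)
            return m->name;
    }
    return NULL;
}
```

Read the first few dozen bytes of the data file with fread and pass them in; a NULL result means you are back to looking at the loaded DLLs.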
I am trying to locate a set of source code that would allow me to open and read the contents of an Excel file on Linux from within a C program.
I don't really want to link it to the OpenOffice SDK if I can find something that just does these two things.
carl
If the following suits you, you can take the read routines from
Sourceforge
and write routines from
What is a simple and reliable C library for working with Excel files?
As far as I know there is no library that does this. The common method is to save the file as CSV in Excel, although then markup etc. is lost.
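If you go the CSV route, reading the exported file from C is fairly simple. Here is a minimal sketch of a field splitter for one line; `csv_split` is a name made up for this example. It assumes comma separators and the usual convention of double-quoted fields with `""` as an escaped quote, and it does not handle quoted fields that span several lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one CSV line in place. Quoted fields may contain commas, and a
 * doubled quote ("") inside a quoted field stands for a single quote.
 * Stores pointers to at most max_fields fields in fields[] and returns
 * the number of fields found. The line buffer is modified.
 */
size_t csv_split(char *line, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *src = line;

    assert(line != NULL && fields != NULL);
    while (count < max_fields) {
        char *dst = src;            /* unescaped text is written back here */
        fields[count++] = dst;

        if (*src == '"') {          /* quoted field */
            src++;
            while (*src != '\0') {
                if (*src == '"') {
                    if (src[1] == '"') {    /* "" becomes one quote */
                        *dst++ = '"';
                        src += 2;
                    } else {                /* closing quote */
                        src++;
                        break;
                    }
                } else {
                    *dst++ = *src++;
                }
            }
        }
        /* unquoted text up to the next comma or end of line */
        while (*src != '\0' && *src != ',' && *src != '\n' && *src != '\r')
            *dst++ = *src++;

        if (*src != ',') {          /* end of line: terminate the last field */
            *dst = '\0';
            break;
        }
        src++;                      /* skip the comma, then terminate */
        *dst = '\0';
    }
    return count;
}
```

Call it on each line you get from fgets and convert the fields with strtod or strtol as needed.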
You could try to use the Excel plugin of Gnumeric:
http://svn.gnome.org/viewvc/gnumeric/trunk/plugins/excel/
It works very well (inside gnumeric).
You can use xlhtml to convert the Excel files into HTML, and then use your favorite HTML parser to extract the cell data.
Check out the answers to What is the best C library that can access Excel files?
Possible things for you to look at:
C : xlsLib
C++ : LibExcel
Though I think both are write-only, which is perhaps not what you need.
Grab the xls reading code from Open Office.
Why don't you just use Google Docs? With Gears it has offline support and you can edit files too. Just a thought: http://docs.google.com
Check out XLSX I/O at https://sourceforge.net/projects/xlsxio/
It is a cross-platform C library to read from and write to Excel .xlsx files.
Works on Windows, OS X, Linux and does not require Excel or Office to be installed.
It is intended for sequential access to data in .xlsx files, so if it's only the values you are interested in this is what you need.
I would like to know the procedure for parsing and extracting text content from Microsoft Word (.doc and .docx) documents. The programming language should be plain C (compiled with gcc).
Are there any libraries that already do this job?
Extension: can I use the same procedure to parse text from Microsoft PowerPoint files as well?
Microsoft Word documents are an enormous beast - you definitely don't want to be writing this code yourself. Look into using an existing free Word library such as antiword or wvWare.
I don't know about libraries that exist, but the format specifications are available from Microsoft for free and under a promise not to sue you for using them.
On Windows, let Word do the job and interface with the COM object; on Linux, the job has been done in antiword. Or you can automate OpenOffice.org on any platform with the UNO object model.
If you're willing to go through the effort of using a COM interface in C, you can use the IFilter interface built into every version of Windows since Windows 2000. You can use it to extract text from any office document (Word, Excel, etc.), PDF file or any file type that has IFilter support installed.
I wrote a blog post about it a few years back. It's all C++, but you can use COM objects from C.