Legacy DOS system with flat file data store (ISAM files) - export

I have a legacy system which used to run on DOS. It is an ERP system for retail stores (fashion). I think it stores its data in flat files.
I have files ending with *.KEY and other files ending with *.D00 (counting up).
I think the key files hold the key information and the D files hold the data ... there are a lot of D77 files...
As far as my investigation goes, this is not DBF or FoxPro; it could be proprietary...
The company that wrote it is out of business, of course, so there is no chance of support or any hints.
When I open these files in vim or other editors I get some binary characters and some text... I tried hex mode but still got nothing usable...
Is there any chance I can dump the data out... as CSV, ASCII, XML?
I am pretty sure that this is not a standard format. Can someone point me in the right direction: how was such data stored back in the day, and how could I make it readable...
Any tools, tips or tricks?
// EDIT
After some time I made some progress and can now post some details which I did not know about back then and whose absence made a good answer impossible.
I assume that the DOS system was written in Visual COBOL and that the files could be B-tree files stored in ISAM format. The closest guess I can offer is that the format may be C-ISAM.
How can I access, view or modify these files... C#, Java, Ruby... any modern language would be cool... I am not sure if I can handle COBOL... It would be great to have a converter or a viewer tool, preferably open source...
Hope this clarifies my question =)

OpenCOBOL has a very active user group. The language itself is free and runs on Linux and Windows and perhaps MacOSX. Have a chat to the user group there; they may be able to help.

Peachtree Accounting Software used those file extensions back in 1992.
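For a first look at the *.KEY and *.D00 files before committing to any particular ISAM library, it can help to list the readable text runs together with their byte offsets; regular spacing between runs often hints at a fixed record length. The following is only a minimal sketch of that idea: the function names and the file name in the comment are hypothetical, and it makes no assumption about the actual file format.

```python
import re

def printable_runs(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

def dump_strings(path: str, min_len: int = 4) -> None:
    """Print every printable run in a file with its hexadecimal offset."""
    with open(path, "rb") as handle:
        for offset, text in printable_runs(handle.read(), min_len):
            print(f"{offset:08x}  {text}")

# Hypothetical usage: dump_strings("ARTICLE.D00")
```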

Related

Importing COBOL Data Into SQL

I am not even sure where to begin with this one. Our old accounting system used Cobol and flat files as a database. I was wondering if there was any way to import all of this into SQL and make it useful. Ideally I would like to get to a point where I could import this historical data into our ERP. The header in one of the files shows a RMKF entry and I also see some Cobol dll files on the server like Cob32api.dll
Any insight appreciated.
Directly, no. Any answer will depend on which Cobol dialect you have (I would guess RM Cobol). Some of the Cobol compilers have their own file system. You may need to unload the files.
In general, while some Cobol files will be suitable for loading into a database, others will require programming:
Multi-Record files - probably split in to several different tables
Files with redefines
Accessing the data in the files
Loading the Files into a Database is either going to be expensive or time consuming (or both):
There are some commercial products that provide access to Cobol files (I would imagine they are expensive). Googling revealed: http://www.cobolproducts.com/datafile/data-viewer.html. But you will still need to analyse the files. Things like redefines can cause issues.
Look at this answer: Dynamically Reading COBOL Redefines with C#. It looks to be a similar problem; Thomas used cb2xml to generate the Cobol.
If you get the files into a Text format (see 2 above), cobolToCsv may be useful - Csv files can generally be loaded into databases. cobolToCsv will not handle RM-Cobol files directly.
The RecordEditor mentioned by Simon is unlikely to handle RM-Cobol binary files but should handle unloaded Text files. It may prove useful (note: I am the author of the RecordEditor).
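Once the files have been unloaded to plain text, turning fixed-width records into CSV is mostly a matter of slicing each line by column position. The sketch below illustrates the idea only; it is not how cobolToCsv or the RecordEditor work internally. The field names and column positions in LAYOUT are hypothetical, since a real layout would have to come from the Cobol copybook, and multi-record files or redefines would need extra logic.

```python
import csv
import io

# Hypothetical layout: (field name, start column, length). A real layout
# would be derived from the Cobol copybook that describes the file.
LAYOUT = [
    ("account", 0, 6),
    ("name", 6, 20),
    ("balance", 26, 9),
]

def record_to_row(line, layout=LAYOUT):
    """Slice one fixed-width text record into a list of trimmed field values."""
    return [line[start:start + length].strip() for _, start, length in layout]

def fixed_width_to_csv(lines, out, layout=LAYOUT):
    """Write a header row plus one CSV row per non-blank fixed-width record."""
    writer = csv.writer(out)
    writer.writerow([name for name, _, _ in layout])
    for line in lines:
        if line.strip():
            writer.writerow(record_to_row(line.rstrip("\n"), layout))
```

Passing an open text file as `lines` and another as `out` converts a whole unloaded file; the resulting CSV can then be bulk-loaded by most databases.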

Signable, streamable, "readable" archive format?

Is there any archive format that offers the following:
be digitally sign-able with a digital certificate from a trusted source like Verisign - for preventing changes to the file (I am not referring to read only, but in case the file was changed it should no longer be signed telling the user this is not the original file)
be stream-able - be able to be opened even if not all of the content has been transferred (also not strictly linearly)
be "readable" - be able to read the data without extracting to a temporary folder (AFAIK if you open a file in a zip archive it is extracted first, and this stays true even for zip based formats like OOXML. This is not what I want)
be portable - support on at least Windows, Linux and Mac OS X is a must, or at least future support
be free of patents - be open source - also preferably under a license that allows commercial use (as far as I know the GPL is a share-alike license so it doesn't allow commercial use; BSD on the other hand allows it)
Note: Though it may come in handy eventually, I cannot think right now of a scenario that would require both point 1 and point 2 simultaneously. Or let's leave it at being able to check the signature only when the whole file has been downloaded.
I am not interested in:
being able to be compressed
being supported on legacy systems
Does any existing archive format fit this description (tar evolutions like DAR and pax come to mind) ?
If there is, are there programming libraries available for the above-mentioned OSs?
If not, would it be hard to create such a thing?
Usage scenario:
I want to use this to create a new media container.
Current media containers contain the audio, video and subtitle streams directly.
Matroska, currently the most advanced container, has supplementary features like attachments and menus.
The menu functionality however is not implemented and very limited.
What I want to create is one level higher.
I want to create a file similar in a way to OOXML.
Also all of the menuing should be done in web technologies like HTML5 (as it is now, the <video> tag allows for any kind of codec to be used) and CSS.
Also, just like you have holograms on DVDs to prove authenticity, I want to create a sign-able file.
Research notes:
Before asking this question I stumbled upon this:
Whats the best way digitally sign a zip file for download using .Net
While detached signing would be feasible for the individual files contained in this archive, it is not an elegant solution for the archive file itself. It is not end-user friendly. End users should be able to double-click the file to open it in a media player like VLC, and see a message that the file is legit (just like you see in a browser whether or not the page is transmitted with SSL through HTTPS).
EDIT: clarified point 5
EDIT 2: added a note to clarify point 1 and 2
EDIT 3: added usage scenario
EDIT 4: added research notes section
P.S.: This is my first question on StackOverflow
I doubt that you will find such a format out of the box. I understand how such a solution can be built with the help of our SolFS, but SolFS doesn't have built-in signing (you can add signing easily).
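To make the "readable" and "sign-able" requirements from the question more concrete, here is a rough sketch using a plain tar archive and Python's standard library. It is not SolFS and not a finished format. Each member is read straight out of the archive without extracting to a temporary folder, and a manifest of per-member SHA-256 digests is built; signing that manifest with a certificate is assumed to happen elsewhere and is not shown. The function names are hypothetical.

```python
import hashlib
import io
import tarfile

def member_digests(archive_path):
    """Return {member name: SHA-256 hex digest} for every regular file in a tar archive."""
    digests = {}
    with tarfile.open(archive_path, "r") as archive:
        for member in archive:
            if member.isfile():
                # extractfile() returns a file-like object backed by the archive;
                # nothing is written to a temporary folder.
                stream = archive.extractfile(member)
                digest = hashlib.sha256()
                for chunk in iter(lambda: stream.read(65536), b""):
                    digest.update(chunk)
                digests[member.name] = digest.hexdigest()
    return digests

def verify(archive_path, manifest):
    """Return the sorted names whose content does not match the trusted manifest."""
    actual = member_digests(archive_path)
    return sorted(name for name in manifest if actual.get(name) != manifest[name])
```

Because the digests are per member, a player could in principle check each stream as soon as it has arrived, while the signature over the manifest itself is checked once.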

Configuration Management for FPGA Designs

Which configuration management tool is the best for FPGA designs, specifically Xilinx FPGAs programmed with VHDL and C for the embedded (MicroBlaze) software?
There isn't a "best", but configuration control solutions that work for software will be OK for FPGAs - the flow is very similar. I use Subversion at work and git at home, and wrote a little on 'why' at my blog.
In other answers, binary files keep getting mentioned - the only binary files I deal with are compilation products (equivalent to software objects and executables), so I don't keep them in the version control repository. Instead, I keep a zipfile for each release/tag that I create, containing all the important (and irritatingly slow to reproduce) ones.
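The zipfile-per-release habit described above could be automated along these lines. This is only a sketch under assumptions: the function name, the directory arguments and the choice of file patterns (*.bit bitstreams and *.mcs PROM files) are illustrative and would need adjusting per project.

```python
import zipfile
from pathlib import Path

# Hypothetical choice of build products worth keeping; adjust per project.
KEEP_PATTERNS = ("*.bit", "*.mcs")

def archive_release(build_dir, tag, out_dir, patterns=KEEP_PATTERNS):
    """Zip matching build products found under build_dir into <out_dir>/<tag>.zip."""
    build_dir = Path(build_dir)
    target = Path(out_dir) / f"{tag}.zip"
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for pattern in patterns:
            for path in sorted(build_dir.rglob(pattern)):
                # Store paths relative to the build directory.
                archive.write(path, path.relative_to(build_dir).as_posix())
    return target
```

Calling it with the same name as the version control tag keeps the archive and the tagged sources easy to match up.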
I don't think it much matters what revision control tool you use -- anything that you would consider good in general will probably be OK here. I personally use Git for a sizable Verilog + software project, and I'm quite happy with it.
What will bite you in the ass -- no matter what version control you use -- is this: The Xilinx tools don't generally respect a clean division between "input" and "output" or between (human edited) "source" and (opaque) "binary." Many of the tools like to store some state information, like a last-run time or a hash value, in their "input" files meaning that you'll get lots of false changes. Coregen does this to its .xco files, and project navigator (the main GUI) does this to its .xise files. Also, both tools have a habit of inserting or removing lines for default-valued parameters, seemingly at random.
The biggest issue I've encountered is the work-flow with Coregen: In many cases, at least one of the following is true:
You have to manually edit the HDL files produced by Coregen.
The parameters that went into Coregen are stored somewhere other than the .xco file (usually in what looks like an output file).
You have to copy-and-paste the output from Coregen into your top-level design.
This means that there is no single logical source/master location for your input to the core-generating process. So even if you have the .xco file under version control, there's no expectation that the design you're running corresponds to it. If you re-generate "the same" core from its nominal inputs, you probably won't get the right outputs. And don't even think about merging.
I suggest CM tools that support version labeling and binary files. Most Software CM applications are fine with ASCII text files. They may just store a "difference" file rather than the entire file for updates.
My recommendations: PVCS, ClearCase and Subversion. DO NOT USE Microsoft SourceSafe. I don't like it because it only supports one label per revision.
I've seen Perforce and Subversion used in a couple of FPGA-intensive companies.
We use Perforce, and it's great. You can have your code that lives in Linux-land checked in side-by-side with your Specs and Docs that live in Windows-land. And you get branching, labels, etc.
I've seen everything from Clearcase to RCS used, and it is really all okay for this kind of thing. The important thing is to get a good set of check-in policies established for your group, and make sure they stick to it.
And have automated nightly regressions. That way, when someone breaks the rules, they can be identified and publicly shamed.
I have personally used Perforce, Subversion, git and ClearCase for FPGA projects. Since VHDL and C are just text files, any of them works fine. However, be sure to capture the other project and constraint files and any libraries you use.
Also think about what to do with the outputs, e.g. log file and bitstreams. Both tend to be big and the bitstreams are binaries.
Previously I used Subversion but have switched to git two years ago. Git handles FPGA design files just as well as it handles every other text and binary file. Git is all you need for version controlling your files and artifacts.
For building the designs, I recommend just using a single ISE project called "ise" (living in a subdirectory called "ise/"). You can take a look at my (very modest) FPGA open-source project on github for the file layout. I don't bother storing the ISE files at all since they are easy to regenerate. The only things I save are the Verilog files and some ISIM waveform config files. In other projects that use coregen I save the coregen.cgp project file and all of the *.xco scripts for regenerating cores. Then I use a Makefile for actually running coregen on the *.xco files. There are a few other Xilinx-specific files you should version control too: *.ucf, *.coe, *.xcf, etc.
I experimented with using Makefiles and the Xilinx command-line tools but found that ISE did a much better job tracking dependencies and calling the tools with the right arguments. Just don't make the mistake of trying to version control your ise/ project files or you will go mad. Xilinx has something like 300 different file types which change every release. If you want to save a file, you can try the ISE project file itself with a .xise extension. Anything that is hard to recreate, like the golden bitfile that you know works and took 6 hours to build, you might want to copy that and configuration manage it explicitly.

Find out which DBMS the file belongs to

I have an application that uses encrypted (txt) files to store data. After investigating the decompiled assembly I concluded that it is a file of some DBMS. So how can I find out which DBMS this application uses to store its data, so that I can attach that file to the correct DBMS?
This is a small application and there is no license problem. I could just ask the owner to give me the data, but I am curious to solve this myself.
MORE INFO:
Platform is Windows, and after trying couple of decompilers I concluded that it WAS written in Visual C++. However I couldn't fully decompile this exe, otherwise I just could find out it from the source code.
A couple ideas.
If opening the file in a hex editor doesn't give you any information (like a magic identifier at the start of the file, which you can pop into Google), then:
Use the depends tool from Microsoft to grab a list of the DLLs being loaded by the application. Chances are whatever DBMS it's using is contained in an external library.
If the first two suggestions yield nothing, load the executable into IDA Pro Freeware and have a look at the code which is creating these files.
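The magic-identifier check from the first suggestion can also be scripted. The sketch below compares the start of a file against a few well-known database signatures; the list is deliberately short and illustrative (SQLite 3 and Microsoft Access Jet/ACE), the function names are hypothetical, and a file that really is encrypted will not match any of them.

```python
# (offset, magic bytes, name). An illustrative, far from complete list.
SIGNATURES = [
    (0, b"SQLite format 3\x00", "SQLite 3"),
    (4, b"Standard Jet DB", "Microsoft Access (Jet)"),
    (4, b"Standard ACE DB", "Microsoft Access (ACE)"),
]

def identify(header: bytes):
    """Return the name of the first matching signature, or None if nothing matches."""
    for offset, magic, name in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return name
    return None

def identify_file(path):
    """Read the first bytes of a file and try to identify its database format."""
    with open(path, "rb") as handle:
        return identify(handle.read(64))
```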

Read data from damaged media

Is it possible to read damaged media (cd, hdd, dvd,...) even if windows explorer bombs out?
What I mean to ask is, whether there is a set of APIs or something that can access the disk at a very low level (below explorer?) and read whatever can be retrieved even if it is only partial, especially if you can still see the file is there from explorer, but can't do anything with it because it is damaged somehow (scratch on cd, etc)?
The main problem with Windows Explorer is that it doesn't support resuming copying after a read error. Most superficially scratched CDs, for example, will fail on different areas of the disk every time you eject and reinsert them.
Therefore, with a utility that supports resuming copy operations, it is possible to read the entire contents of a damaged CD by doing "eject/reload/resume" a few times.
In fact, this is what a utility I wrote does, and I've never needed anything fancier to read scratched disks. (It simply uses ReadFile and WriteFile.)
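The answerer's utility is not shown, so the following is only a guess at the general shape of such a resumable copy, written with ordinary file I/O in Python rather than ReadFile and WriteFile directly. The function name and block size are assumptions. The copy continues from however many bytes the destination already holds and stops cleanly at the first read error.

```python
import os

def resume_copy(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, continuing from however many bytes dst already holds.

    Returns True when the copy is complete, False if a read error stopped it.
    """
    done = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(done)
        while True:
            try:
                block = src.read(block_size)
            except OSError:
                # Read error: keep what was copied so far so the caller can
                # eject/reload the disc and call resume_copy() again.
                return False
            if not block:
                return True
            dst.write(block)
```

Calling `resume_copy()` in a loop, with an eject and reload between attempts, mirrors the "eject/reload/resume" routine described above.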
One step lower would be opening the raw partition (i.e. disk image) by passing a string such as "\\.\F:" (note: the backslashes are literal here) to CreateFile. It would allow you to read raw sectors from a drive, but reconstructing files from that data would be hard.
In fact, the "\\.\" syntax allows you to open devices in the "\GLOBAL??" branch of the Windows Object Manager namespace as if they were files. It's not unlike calling dd with /dev/x as a parameter. There is also a "\Device" branch, but that's only accessible via DeviceIoControl() (i.e. ioctl()), meaning there's no simple ReadFile()/WriteFile() interface.
Anything lower level than that would be device-specific, I guess; like reading raw CD-ROM data (including ECC bits) the way some CD-burning programs do. You'd have to do some research on the specific media (CD, flash, DVD) and what your hardware allows you to do on them.
Note: Backslashes tend to get lost on the way to the web page; you need to pass "backslash backslash dot backslash DeviceName" to CreateFile. You need to escape them in source code, too, of course.
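As a rough illustration of reading raw sectors, the sketch below uses Python's built-in open(), which should accept the same device path on Windows (usually only with administrator rights). The helper name and the 512-byte sector size are assumptions; CDs and DVDs, for example, typically use 2048-byte sectors. The same function works on an ordinary disk image file.

```python
def read_sectors(device_path, first_sector, count, sector_size=512):
    """Read `count` whole sectors starting at `first_sector` from a device or image."""
    # buffering=0 keeps reads unbuffered, so each read is issued with the
    # sector-aligned offset and size that raw volume handles require.
    with open(device_path, "rb", buffering=0) as device:
        device.seek(first_sector * sector_size)
        return device.read(count * sector_size)

# On Windows the volume would be named like this; the raw string avoids
# having to escape each backslash.
WINDOWS_VOLUME = r"\\.\F:"

# Hypothetical usage: boot_sector = read_sectors(WINDOWS_VOLUME, 0, 1)
```

An unbuffered read may return fewer bytes than requested near the end of the device, so a real tool would check the length of what comes back.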
If you want to do it, do it from the Linux side - see: http://sourceforge.net/projects/monkeycity/ (open source)
or a ready-made freeware app: http://www.theabsolute.net/sware/dskinv.html
The first step is dd_rescue. After that, you're free to try anything to reconstruct the data.
And there's GNU ddrescue
GNU ddrescue is a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying to rescue the good parts first in case of read errors.
Make sure to use the 3-arg version (manual):
ddrescue [options] infile outfile [mapfile]
That is, do use a mapfile even if it's optional, because:
If you use the mapfile feature of ddrescue, the data is rescued very efficiently, (only the needed blocks are read). Also you can interrupt the rescue at any time and resume it later at the same point. The mapfile is an essential part of ddrescue's effectiveness. Use it unless you know what you are doing.
And it's also included in Cygwin and Homebrew.
I don't know what layer exists between Windows Explorer and the Win32 APIs. You can try to write a program with the Win32 File I/O stuff. If that doesn't work, then you have to write your own device driver to get any lower.
I've had some luck from the Linux side, or using BartPE (http://www.nu2.nu/pebuilder/), but just seeing the file doesn't always mean the file is going to be recoverable, whether you're trying from Windows or Linux. Your best bet might be to use a trial of a recovery program.
I have had two disks start to disintegrate on me. From the pattern of unreadable sectors I think they had internal flaking of their emulsion. WinXP Explorer just threw up its hands and said the drive didn't even exist.
In both cases I used "GetDataBack for NTFS" from Runtime Software (http://www.runtime.org/). You can download a free trial which will show you what you could get back if you paid for it. When I bought it it was $49, but I see it is now $79.
This program is amazing. It's not necessarily fast as it will reread some sectors over and over, trying to get a consensus value from multiple tries, but when it's done you can get back stuff that you thought was gone forever. I had one drive that it took over 10 hours to analyze, but when it was done I got back over 97% of a 500GB drive. Definitely worth the price.
Another great tool is Beyond Compare. I have rev 2.5.3, but it is currently at 3.?? and costs $30. They have a full-functionality, 30-day trial. It does a great job of copying large quantities of files (and only those that need to be copied) and, unlike Explorer, it doesn't blow up if something fails. It's sort of like a visual rsync for Windows, if you're familiar with that program from the Samba people.
I have no connection with either of the companies mentioned other than being a very satisfied customer.
The gold standard for recovering data from a magnetic storage device would have to be SpinRite. It's a commercial app though, so you probably wouldn't learn much from it.
If you have a Linux machine around, I can recommend dvdisaster. It is originally meant for creating error correction files, but it also reads DVDs into an image and ignores read errors; and you can use different drives one after another to get missing sectors filled in the image.

Resources