Why is there no program-data independence in traditional file processing? - database

"In traditional file processing, the structure of data files is embedded in the application programs, so any changes to the structure of a file may require changing all programs that access that file. By contrast, DBMS access programs do not require such changes in most cases. The structure of data files is stored in the DBMS catalog separately from the access programs. We call this property program-data independence."
The following text is taken from the book Fundamentals of Database Systems. I didn't get the part about traditional file processing; can somebody please explain it (an example would be appreciated)?

I'll give you a simple example.
Microsoft Excel used to save its files in a proprietary binary format. In practical terms, this meant that you could only work on those files using Excel.
But now Excel supports an open, text-based XML document format, which allows other programs, like the OpenOffice SDK, to interact with those files. So you no longer need to rely on Excel to work with them.
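To tie this back to the textbook's wording, here is a hedged sketch (in Python; the fixed-width file layout and the CATALOG dict are invented for illustration, not taken from the book) of the difference between a program that hard-codes the file structure and one that reads it from a catalog:

```python
# Traditional file processing: the record layout is baked into the program,
# so any change to STUDENT.DAT forces a change to every program like this one.
def read_students_traditional(path):
    with open(path) as f:
        for line in f:
            # Hard-coded columns: name = bytes 0-19, GPA = bytes 20-23.
            yield line[0:20].strip(), float(line[20:24])

# DBMS-style program-data independence: the structure lives in a catalog and
# the program only asks for fields by name, so most structural changes to the
# file never touch this code. The "catalog" here is a toy stand-in.
CATALOG = {"STUDENT": [("name", 0, 20), ("gpa", 20, 24)]}

def read_students_catalog(path, table="STUDENT"):
    layout = CATALOG[table]          # structure fetched from the catalog
    with open(path) as f:
        for line in f:
            yield {field: line[start:end].strip() for field, start, end in layout}
```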

Related

Is it possible to manipulate pdf files in Visual Basic without an external library/SDK?

I am looking at how to implement PDF merging with raw VB code so that the code may be invoked by a bot for business process automation.
The software used to create the bot provides a function to invoke VB code, but I don't believe it can access any externally imported libraries because it expects plain source, so I essentially need to produce code that one could run in a VB shell environment without anything fancy (or convenient, it seems).
All the research I've done so far points me in the direction of external packages I would need to install, such as iText; this is what I'm looking to avoid.
(former iText employee here)
PDF is not an easy (binary) format.
Essentially, blobs of information (text that has to be rendered, fonts, images, vector graphics, etc) are compressed and gathered into objects.
Each object gets a number. Objects are allowed to reference each other (a piece of text might say 'I want to be rendered with font 4433').
All object numbers and their byte offsets in the file are gathered in the cross-reference (often called XREF) table.
A PDF includes a 'Pages' dictionary object that tells the viewer which objects belong on which page.
In order to merge PDF files, you would need to:
- read all XREF tables of all files
- adjust all of those byte offsets to match the positions of the objects in the merged file
- update various dictionary objects within the PDF file that tell it where all the objects per page are kept
This is by no means a trivial task, but it can be done using only VB.
If you are serious about implementing a robust, scalable version of this tool, perhaps it's better to look at the iText source code and try to port it to VB?
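To give a feel for the first of those steps, here is a rough Python sketch (not the iText approach; it assumes a classic uncompressed xref table with LF or CRLF line endings, not the cross-reference streams introduced in PDF 1.5) that locates the XREF table and reads the byte offset of every in-use object:

```python
# Read the classic cross-reference table of a single PDF: find the "startxref"
# pointer at the end of the file, then walk the xref subsections that follow.
def read_xref_offsets(path):
    with open(path, "rb") as f:
        data = f.read()

    # The file ends with: startxref / <byte offset of xref table> / %%EOF
    tail = data[-1024:]
    pos = tail.rfind(b"startxref")
    xref_offset = int(tail[pos + len(b"startxref"):].split()[0])

    # An xref section is "xref", then one or more subsections: a line with
    # "<first object number> <count>" followed by one 20-byte entry per object.
    lines = data[xref_offset:].split(b"\n")
    assert lines[0].strip() == b"xref"
    offsets = {}
    i = 1
    while not lines[i].strip().startswith(b"trailer"):
        first, count = (int(x) for x in lines[i].split())
        for n in range(count):
            byte_offset, generation, kind = lines[i + 1 + n].split()[:3]
            if kind == b"n":                 # "n" = in use, "f" = free
                offsets[first + n] = int(byte_offset)
        i += 1 + count
    return offsets

# Usage: print where each numbered object starts in the file.
# for num, off in sorted(read_xref_offsets("input.pdf").items()):
#     print(num, off)
```

Merging then means renumbering these objects, rewriting the offsets, and stitching the Pages dictionaries together, which is where most of the real work is.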

Export Databases of DOS Clipper Application

Our current database system is a Clipper DOS application. The database inside its folder is fragmented/divided into many parts. I want to decrypt the database so that I will have only one database in all and avoid reshuffling of data. I'll attach a screenshot of the file folder; the database is in .DBF format.
[Screenshot of files]
Often you can decompile the Clipper EXE file to source code and work from the .prg files; I've done it many times. The program to use is called WALKYRIE.
In Clipper and FoxPro for DOS, a .dbf file is a simple table file.
If you want to use them as a database with many tables in one unit, you can import these tables into an MS SQL database and/or into an MS Access database.
I see that you got several answers. Most are partially right. Let's address these one at a time:
All those files essentially comprise the "database" for the application you're using. They could be used by other applications as well. Besides having a lot of files, what is the problem you're trying to solve?
People mentioned indexes. You can generally ignore these. They are there primarily to make access to the data files faster. Any properly written Clipper application will recreate these if they're missing or corrupted. You could test this by renaming one, running the app, and seeing what happens. If it doesn't recreate it you can rename it back. Not recreating missing index files would be unusual behavior.
The DBF file format is binary, but barely. Most of what's in a DBF is text and is readable with an editor. But there's no reason to do so - I'm sure there are several free DBF utilities out there to read DBF files. Getting the structure of the files could be very helpful.
Getting the data out of the files would also be fairly simple with a utility. If you look up the DBF format you could even write one fairly easily in Clipper, any other language that uses DBF files, or in something like Python (see the sketch below). Any language that can open and write files, really. It's not hard - any competent developer could do this in a matter of hours. Much less if you're using Clipper or another language that natively reads DBF files.
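For illustration, a minimal Python sketch of reading the field structure out of a dBase III/Clipper-style DBF header (the file name is hypothetical, and memo fields and later DBF variants are ignored):

```python
import struct

# Read the structure (field names, types, sizes) from a .DBF file header.
def dbf_structure(path):
    with open(path, "rb") as f:
        header = f.read(32)
        # Bytes 4-7: record count, 8-9: header size, 10-11: record size.
        num_records, header_size, record_size = struct.unpack("<IHH", header[4:12])

        fields = []
        # Field descriptors are 32 bytes each; a lone 0x0D byte ends the list.
        while True:
            descriptor = f.read(32)
            if descriptor[0:1] == b"\r":
                break
            name = descriptor[:11].split(b"\x00")[0].decode("ascii")
            ftype = descriptor[11:12].decode("ascii")   # C, N, D, L, M...
            length = descriptor[16]
            decimals = descriptor[17]
            fields.append((name, ftype, length, decimals))
    return num_records, record_size, fields

# Usage:
# count, size, fields = dbf_structure("CUSTOMER.DBF")
# for name, ftype, length, dec in fields:
#     print(f"{name:<11} {ftype} {length}.{dec}")
```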
Most people create dBase/Clipper programs with relational data, like SQL Server. Where SQL Server has tables that relate to each other, dBase/Clipper has a file for each "table." This isn't a requirement, but it was almost certainly done this way.
Given that, if you get the table structures through a utility or by reading the headers in an editor (don't save them from an editor!) you could quite likely recreate the database schema (i.e. the map of the data). Once you have that it's fairly trivial to get the data into another type of database (SQL Server, Access, or whatever you like to use). If none of the files are too large it's conceivable to put all the files into Excel sheets. It really depends on what you want to do with it.
As others have said, you may be able to get the code back via Valkyrie. Some people have used it very successfully. I don't know where you'd get it and I've never used it. Why do you not have the code? If this is a commercial application you likely should not have it. If it's a custom app, whoever wrote it or paid to have it written should have the code.
Again, it's not clear to me what problem you're trying to solve. But there are many options for doing something with those DBF files. Fortunately they are one of the easier to read data formats you could be working with.
Let me know if you have any questions. Apologies for the typos that are no doubt scattered throughout this reply.
You can sort of get an idea of how they relate to each other by opening the index files they use (.NTX files). If you have the DBU utility (executable) around, you can open the DBF and load the index (NTX). LibreOffice Calc is also able to open DBFs (haven't tested .NTX).
If you open the .NTX in a text editor you will see the indexes at the beginning.
I open them with Access, but I can save the data using a PrintFill program.

Manually create a multi-sheet file for excel from C

I am working on a C application. I was planning on using a CSV file to read the values into a spreadsheet, but as the data got more and more complex (around 100 columns), I saw the need to start using multiple sheets. I am working on a single-board computer, and the file is used for storing diagnostic information. I would like to be able to write the file from the SBC in an ASCII format, then import it into Excel (or the open-source alternative) and have multiple sheets. Is this even possible, or should I start working on macros to run on the data?
Maybe consider using XML to write the data; then it can easily be transformed into whatever format you want, and you can have the data be emitted in a way that makes semantic sense rather than according to the architecture of a spreadsheet. This might allow more flexibility for different external programs to interact with the data. You can use XSLT to transform it into multiple CSV files if that's desired, and there are probably reasonable ways to import it into spreadsheets directly.
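As a sketch of that suggestion: the plain-text "XML Spreadsheet 2003" (SpreadsheetML) layout already supports multiple worksheets and opens directly in Excel and LibreOffice. The Python below only shows the shape of the output (sheet names and values are made up, and no XML escaping is done); the same strings are just as easy to fprintf() from the C application on the SBC:

```python
# Write a multi-sheet workbook as plain-text SpreadsheetML.
def write_workbook(path, sheets):
    """sheets: dict mapping sheet name -> list of rows (lists of values)."""
    with open(path, "w") as f:
        f.write('<?xml version="1.0"?>\n')
        f.write('<?mso-application progid="Excel.Sheet"?>\n')
        f.write('<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"\n'
                '          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">\n')
        for name, rows in sheets.items():
            f.write(f'  <Worksheet ss:Name="{name}">\n    <Table>\n')
            for row in rows:
                f.write("      <Row>")
                for value in row:
                    kind = "Number" if isinstance(value, (int, float)) else "String"
                    f.write(f'<Cell><Data ss:Type="{kind}">{value}</Data></Cell>')
                f.write("</Row>\n")
            f.write("    </Table>\n  </Worksheet>\n")
        f.write("</Workbook>\n")

# Usage: one sheet per diagnostic category instead of one 100-column CSV.
write_workbook("diagnostics.xml", {
    "Voltages": [["rail", "volts"], ["3V3", 3.31], ["5V", 4.98]],
    "Temperatures": [["sensor", "degC"], ["cpu", 51.2]],
})
```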

Best tool to document T-SQL *source* files?

At work, the database is not documented at all. Furthermore, the stored procedures, functions and views are all encrypted; this rules out a lot of tools that document these objects for you. All I have are the plain .SQL files that generate the database, schemas, tables, functions and all.
I'd like to know, is there a tool that can read these files and generate a Doxygen-like documentation? Preferably open-source or freeware.
I found IzzySoft's HyperSQL and SourceForge's project PLDoc do something very close to what I'd need, though both seem to be very PL/SQL specific. I want something that reads SQL source files (and understands T-SQL's idiosyncrasies), parses them, and gets me:
List of SPs, UDFs, etc. defined within each file
List of objects (both tables/views and procs/functions) each object depends on (directly and, if possible, also indirectly)
Calling and dependencies graphs (i.e. what calls what and is called by what)
If possible, when an SP uses a table/view, how's it using it (INSERT/DELETE/UPDATE/SELECT/mix???)
I've already developed a tiny Perl script that minimally parses these files, attempting to cover the first point - but it's just a hack and lacks a lot of polish. I'm sure there must be a tool out there which does the job; I want to believe I won't have to code it myself.
Thanks in advance,
Joe
We use Red Gate SQL Doc to generate ours.
However, it works from a database, not from files: it's easier to read everything from the system tables (permissions, dependencies, datatypes etc.) than to parse scripts. Parsing scripts is what the DB engine does...
Can you not generate an empty DB from the source files (remove WITH ENCRYPTION) and generate from that?
Or decrypt if you have sa rights?
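If neither of those is an option, a small script pass over the .SQL source files can at least cover the first bullet in the question. A rough sketch (the sql/ path is made up, and it assumes the files contain ordinary, unencrypted CREATE statements):

```python
import glob
import re

# List the procedures, functions and views defined in each .sql file.
# This only covers the object list; dependency and call graphs would need
# a real T-SQL parser.
CREATE_RE = re.compile(
    r"\bCREATE\s+(PROCEDURE|PROC|FUNCTION|VIEW)\s+([^\s(]+)",
    re.IGNORECASE)

for path in sorted(glob.glob("sql/**/*.sql", recursive=True)):
    with open(path, encoding="utf-8", errors="replace") as f:
        source = f.read()
    objects = CREATE_RE.findall(source)
    if objects:
        print(path)
        for kind, name in objects:
            print(f"  {kind.upper():<10} {name}")
```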

dsofile c# API / NTFS custom file properties

I'm searching for a good way to add metadata to a file. dsofile.dll works fine for NTFS. The metadata is lost when one drops a copy on a FAT32 share (it uses NTFS hidden streams, I guess). Microsoft Word documents contain metadata that is not lost; how do they do it? Similar to FAT, sending the file via e-mail strips off all metadata created with dsofile (and also metadata created by hand with Windows Explorer). Separate metadata files are not an option. It must be compatible with standard Windows techniques. If I send someone a file with Outlook and he sends it back, the metadata should not be lost.
(the required meta data is actually only an ID)
The issue is that all file systems provide a single-stream view of the file as a greatest common denominator. Only properties stored through that interface, i.e. inside the file's "contents", will be transported along with the contents by naive system (or user) utilities. For example, CopyFile in Windows will carefully lose alternate data streams and has no notion of "shadow files".
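To make that concrete, here is a small sketch (Windows and NTFS only; the ":id" stream name is made up) of storing an ID in an NTFS alternate data stream, which is what the question guesses dsofile uses outside Office documents, and why it doesn't survive a trip through FAT32 or e-mail:

```python
# The ID lives in a named stream next to the file's normal contents.
path = r"C:\temp\report.doc"

with open(path + ":id", "w") as stream:     # create/overwrite the ":id" stream
    stream.write("4711")

with open(path + ":id") as stream:          # works on the same NTFS volume
    print(stream.read())                    # -> 4711

# A copy of report.doc on a FAT32 share or inside an e-mail attachment carries
# only the main stream, so the ":id" stream (and the 4711) is silently dropped.
```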
The question is whether or not the format of the "contents" allows for arbitrary addition of properties.
Some formats allow arbitrary content (e.g., MSFT's docfile aka .doc/.xls/etc). Some allow limited content (.mp3, .jpg, .exe).
Some are completely SOL (.txt, .bmp).
Any solution would be format-dependent. MS Office files are (all) compound files and there's a place for properties there. In some formats (PE files, for example) it's safe to just append data to the end of the file, if you know how to read it back later. In a ZIP file you can probably find a place in the directory or just add a helper file with your data to the archive. Other formats can't stand this, and you'd need to find your own way of solving the problem.
Actually, the file name can also be a good placeholder for your ID.
If you need to store the files somewhere but don't need them to remain readable by outside applications, you can pack them into a ZIP archive or use something like our SolFS library.
What about the standard properties rather than custom DSOFile properties, i.e. Comments, Author, etc.? Do they get wiped?
Not sure if it's ideal, but the way we've gotten around it is that we have a tool that takes the DSOFile properties and saves them to a text file, which is then emailed along with the file; at the other end the user runs a tool to re-import the DSOFile properties from the text.
