I'm trying to identify a type of file whose contents start with "[CS Format=A]".
I've extracted files from blobs in a database I was handed. I do not have access to the software that created this database. There is a column that I assume signifies compression (it's called COMPRESS), and the database also stores the names of the files and their extensions. I've extracted all the files out of the database and everything works, except that anything marked as COMPRESS is not readable as its own file type (i.e. if it was a PDF before it was stored in this DB, now that I've pulled it back out it is not parsable as a PDF like the other non-COMPRESS PDFs). When I crack them open and look at them, the first 13 bytes are always "[CS Format=A]" (which I swear I've seen somewhere before, but can't for the life of me remember where), followed by binary data. file/libmagic can't tell me what I'm looking at, and Google is not being very helpful with my very strict search term. These were stored in an MSSQL database before I was given the files, most likely SQL Server 2005 by the time the data was pulled.
Probably not helpful, but just to make sure... Oracle will decompress automatically on select.
If it's still compressed afterwards, then you're looking at some 3rd-party component, which could be almost anything, but I'd start with testing Mac/Win first before you run through all the 3rd-party compression tools.
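One quick sanity check is to try the standard stream formats on whatever follows the header. Here's a minimal Python sketch; the file name compressed.bin is a placeholder, and the assumption that the payload starts immediately after the 13-byte "[CS Format=A]" header may well be wrong (there could be extra metadata before the compressed data):

import bz2
import gzip
import lzma
import zlib

# Assumption: the compressed payload starts right after the 13-byte
# "[CS Format=A]" header. Adjust the offset if there is extra metadata.
with open("compressed.bin", "rb") as f:
    data = f.read()

payload = data[len(b"[CS Format=A]"):]

candidates = {
    "zlib": zlib.decompress,
    "raw deflate": lambda b: zlib.decompress(b, -15),
    "gzip": gzip.decompress,
    "bz2": bz2.decompress,
    "lzma/xz": lzma.decompress,
}

for name, fn in candidates.items():
    try:
        out = fn(payload)
        print(name, "succeeded:", len(out), "bytes, starts with", out[:8])
    except Exception as exc:
        print(name, "failed:", exc)

If one of these succeeds and the output starts with a recognizable signature (%PDF, for example), you've found your answer; if none do, it's probably a proprietary scheme from whatever middleware wrote the blobs.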
Our current database system is a Clipper DOS application. The database inside its folder is fragmented/divided into many parts. I want to convert the database so that I end up with only one database overall and avoid reshuffling data. I'll attach a screenshot of the file folder. The database is in .DBF format.
Screenshot of files
Often you can decompile the Clipper .exe file to source code and work from the .prg; I've done it many times. The program to use is called Valkyrie.
In Clipper and FoxPro for DOS, a .dbf file is a simple table file.
If you want to use them as a database with many tables in one unit, you can import these tables into an MS SQL database and/or an MS Access database.
I see that you got several answers. Most are partially right. Let's address these one at a time:
All those files essentially comprise the "database" for the application you're using. They could be used by other applications as well. Besides having a lot of files, what is the problem you're trying to solve?
People mentioned indexes. You can generally ignore these. They are there primarily to make access to the data files faster. Any properly written Clipper application will recreate them if they're missing or corrupted. You could test this by renaming one, running the app, and seeing what happens. If it doesn't recreate it, you can rename it back. Not replacing missing index files would be unusual behavior.
The DBF file format is binary, but barely. Most of what's in a DBF is text and is readable with an editor. But there's no reason to do that; I'm sure there are several free DBF utilities out there to read DBF files. Getting the structure of the files could be very helpful.
Getting the data out of the files would also be fairly simple with a utility. If you look up the DBF format you could even write one fairly easily in Clipper, any other language that uses DBF files, or in something like Python. Any language that can open and write files, really. It's not hard; any competent developer could do this in a matter of hours. Much less if you're using Clipper or another language that natively reads DBF files.
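To give an idea of how little is involved, here is a rough Python sketch of a bare-bones DBF reader. It assumes plain dBase III-style files with ASCII character fields, ignores memo (.DBT) columns and code pages, and the file name ACCOUNT.DBF is just a placeholder:

import struct

def read_dbf(path):
    """Minimal DBF reader: returns (field_defs, records)."""
    with open(path, "rb") as f:
        header = f.read(32)
        # Bytes 4-7: record count, 8-9: header length, 10-11: record length.
        n_records, header_len, record_len = struct.unpack("<IHH", header[4:12])

        # Field descriptors are 32 bytes each, terminated by a 0x0D byte.
        fields = []
        while True:
            desc = f.read(32)
            if desc[0:1] == b"\r":
                break
            name = desc[:11].split(b"\x00")[0].decode("ascii")
            ftype = desc[11:12].decode("ascii")
            length = desc[16]
            fields.append((name, ftype, length))

        # Records start right after the header; byte 0 of each is a deletion flag.
        f.seek(header_len)
        records = []
        for _ in range(n_records):
            raw = f.read(record_len)
            if not raw or raw[0:1] == b"*":      # skip deleted rows
                continue
            row, pos = {}, 1
            for name, ftype, length in fields:
                row[name] = raw[pos:pos + length].decode("ascii", "replace").strip()
                pos += length
            records.append(row)
        return fields, records

if __name__ == "__main__":
    fields, records = read_dbf("ACCOUNT.DBF")
    print(fields)
    print(records[:3])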
Most people create dBase/Clipper programs with relational data, like SQL Server. Where SQL Server has tables that relate to each other, dBase/Clipper has a file for each "table." This isn't a requirement, but it was almost certainly done this way.
Given that, if you get the table structures through a utility or by reading the headers in an editor (don't save them from an editor!), you could quite likely recreate the database schema (i.e. the map of the data). Once you have that, it's fairly trivial to get the data into another type of database (SQL Server, Access, or whatever you like to use). If none of the files are too large, it's conceivable to put all the files into Excel sheets. It really depends on what you want to do with it.
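As an illustration of how little glue is needed, a sketch like the following (reusing the read_dbf helper from the sketch above; the table and file names are placeholders) would push one table into SQLite, which SQL Server or Access can then import:

import sqlite3

# Assumes read_dbf() from the earlier sketch is available in the same script.
fields, records = read_dbf("ACCOUNT.DBF")
cols = [name for name, _ftype, _length in fields]

conn = sqlite3.connect("converted.sqlite")
conn.execute("CREATE TABLE IF NOT EXISTS account (%s)" %
             ", ".join('"%s" TEXT' % c for c in cols))
conn.executemany("INSERT INTO account VALUES (%s)" % ", ".join("?" * len(cols)),
                 [tuple(r[c] for c in cols) for r in records])
conn.commit()
conn.close()

Everything lands as text here; you'd refine column types (dates, numerics) once you've confirmed the structure.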
As others have said, you may be able to recover the code with Valkyrie. Some people have used it very successfully. I don't know where you get it and I've never used it. Why do you not have the code? If this is a commercial application you likely should not have it. If it's a custom app, whoever wrote it or paid to have it written should have the code.
Again, it's not clear to me what problem you're trying to solve. But there are many options for doing something with those DBF files. Fortunately they are one of the easier to read data formats you could be working with.
Let me know if you have any questions. Apologies for the typos that are no doubt scattered throughout this reply.
You can sort of get an idea of how they relate to each other by opening the index files they use (.NTX files). If you have the DBU utility (executable) around, you can open the DBF and load the index (NTX). LibreOffice Calc is also able to open DBFs (I haven't tested .NTX).
If you open the .NTX in a text editor you will see the index expressions at the beginning.
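If you'd rather not squint at a hex dump, a tiny Python sketch like this pulls the printable ASCII runs out of the start of an .NTX file, which is usually enough to see the key expression (the file name is a placeholder):

import re

# Assumption: the index key expression is stored as plain ASCII near the
# start of the .NTX file, so printable runs in the first block reveal it.
with open("ACCOUNT.NTX", "rb") as f:
    head = f.read(1024)

for match in re.finditer(rb"[ -~]{4,}", head):
    print(match.start(), match.group().decode("ascii"))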
I open them with Access, but I can save the data using a PrintFill program.
I am trying to create a site where users can upload images, videos and other types of files.
I did some research and people seem to suggest that saving the files as BLOBs in the database is a bad idea; instead, save the file paths in the database.
My questions are, if I save the file paths in a database:
1. How do I generate the file names?
I thought about computing the MD5 value of the file name, but what if two files have the same name? Adding the username and a time-stamp etc. to the file name? Does that even make sense?
2. What is the best directory structure?
If a user uploads images on 12/17/2013 and 12/18/2013, can I just put them in user_ABC/images/ and then create time-stamped sub-directories 20131217, 20131218, etc.? What is the best structure for all this stuff?
3. How do all these come together?
It seems like maintaining this system is such a pain, because the file system manipulation scripts are tightly coupled with the database operations (I may also need to worry about database transactions? Say in one transaction I updated the database but failed to modify the file system, so I need to roll back my database?).
And I think this system doesn't scale (what if my machine runs out of hard disk space, so I need to upload the files to a second machine? What if my content is on a cluster?).
I think my real question is:
4. Is there any existing framework/design pattern/db that handles this problem?
What is the standard way of handling this kind of problem?
Thanks in advance for your answers.
I actually asked this same question when I was designing a social website for food chefs. I decided to store the URL of the image in a MySQL database along with the recipe. If you plan on storing multiple images for one recipe, as in my example, maybe a comma-separated list would work. When the recipe loaded on the page, I would fetch the image associated with that recipe and show it on the screen.
Since it was a hackathon and wasn't meant for production purposes, I didn't encode the file name into something unique. However, if I were developing for production, I would append a time-stamp to the media file name when storing it on the server and in the database/backend.
I believe what I've proposed is the best way of handling this scenario. Storing the image on the server's file system is not only faster, but it should also take less space. I have found that when converting a standard JPG file of reasonable resolution to base64 encoding, the encoded text representation took 30% more space. There is also the cost of encoding and decoding the file for storage and retrieval when using some BLOB type of data format instead of straight up storing the file on the server.
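That figure matches what base64 does by construction: 4 output characters for every 3 input bytes, so roughly 33% overhead. A quick way to check it yourself in Python (the file name is a placeholder):

import base64
import os

path = "photo.jpg"   # any reasonably sized binary file
raw_size = os.path.getsize(path)
with open(path, "rb") as f:
    encoded_size = len(base64.b64encode(f.read()))

print(raw_size, encoded_size, encoded_size / raw_size)  # ratio is ~1.33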
Using some sort of backend server scripting like PHP, you'll be able to do some pretty neat stuff with the information you have available. Fetch the result from the database and load it into the page using HTML.
As far as I know, there isn't a standard way of fetching media from a database yet. Perhaps there will be one day.
There is no standard way to do this; it differs from application to application. The idea is that you need to generate a different path + file name for every upload. Here is one way:
HashId = sha1(microsecond + random(1,1000000));
Path = /[user_id]/[HashId{0,2}]/[HashId{-2}];
FileName = HashId
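In Python, that idea might look roughly like this; the upload root, the user_id, and the choice of SHA-1 over something like uuid4 are all just illustrative:

import hashlib
import os
import random
import time

def make_upload_path(upload_root, user_id):
    """Build a collision-resistant path: <root>/<user>/<first 2 hash chars>/<hash>."""
    seed = f"{time.time_ns()}{random.randint(1, 1_000_000)}"
    hash_id = hashlib.sha1(seed.encode()).hexdigest()
    directory = os.path.join(upload_root, str(user_id), hash_id[:2])
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, hash_id)

# Example: save the uploaded bytes under the generated name and store only
# the returned path (plus the original file name) in the database.
print(make_upload_path("/var/www/uploads", 42))

Storing the generated path along with the original file name and content type in the database row keeps the file system layout and the metadata loosely coupled, and the two-character prefix directory keeps any single folder from accumulating millions of files.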
Yes, I know. This question has already been answered in Where to store the Core Data file? and in Store coredata file outside of documents directory?.
@Kendall Helmstetter Gelner and @Matthias Bauch provided very good replies. I upvoted them.
Now my question is quite conceptual and I'll try to explain it.
From the Where You Should Put Your App's Files section in the Apple docs, I've read the following:
Handle support files — files your application downloads or generates and can recreate as needed — in one of two ways:
In iOS 5.0 and earlier, put support files in the /Library/Caches directory to prevent them from being backed up.
In iOS 5.0.1 and later, put support files in the /Library/Application Support directory and apply the com.apple.MobileBackup extended attribute to them. This attribute prevents the files from being backed up to iTunes or iCloud. If you have a large number of support files, you may store them in a custom subdirectory and apply the extended attribute to just the directory.
Apple says that for handling support files you can follow two different ways based on the installed iOS. In my opinion (but maybe I'm wrong) a Core Data file is a support file and so it falls in these categories.
That said, does the approach by Matthias and Kendall continue to be valid or not? In particular, if I create a directory, say Private, within the Library folder, does this directory continue to remain hidden in both iOS 5 versions (5.0 and 5.0.1), or do I need to follow Apple's solution? If the latter, could you provide any sample or link?
Thank you in advance.
I would say that a Core Data file is not really a support file; unless you have some way to replicate the data stored, you would want it backed up.
The support files are more things like images, or databases that are only caches for a remote web site.
So, you could continue to place your Core Data databases where you like (though it should be under Application Support).
Recent addition as of Jan 2013: Apple has started treating pre-loaded Core Data stores that you copy from a bundle into a writable area as if they were support files, even if you also write user data into the same databases. The solution (from DTS) is to make sure that when you copy the databases into place you set the do-not-backup flag, and then un-set it if user data is written into the database.
If your CoreData store is purely a cache of downloaded network data, continue to make sure it goes someplace like Caches or has the Do Not Backup flag set.
I have some database files I'd like to pull data from (and push to).
The first problem is that I don't know what format the database is in.
Each table (or object) seems to have a separate pair of files, such as ACCOUNT.FS5 and ACCOUNT.IDX. Some of them also have .SAV files.
A friend suggested that they are likely to be Flagship database files, presumably because of the FS5 extension. Edit: this is incorrect, they are not Flagship files, they are database files for the software 'EXACT'.
If this is the case, the second problem is that I don't know how I'd go about querying on these files. I have no schema per se, although the application is capable of exporting the data in csv format. Judging by the unfriendly nature of the csv, I'd imagine it to be pretty closely aligned to the database schema.
Any ideas?
If whatever is in these files is not confidential, I would create a project on one of the freelance sites, like vWorker, and ask for a complete data extraction there.
You can also specify a destination file format (say, .sqlite) that you know how to deal with.
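Since the application can already export CSV, one pragmatic option before outsourcing it is to load those exports into SQLite yourself. A rough Python sketch, assuming the first CSV row holds column names (the file and table names are placeholders):

import csv
import sqlite3

conn = sqlite3.connect("exact_export.sqlite")

with open("ACCOUNT.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)                      # assumes a header row
    cols = ", ".join('"%s" TEXT' % c.strip() for c in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS account ({cols})")
    placeholders = ", ".join("?" * len(header))
    conn.executemany(f"INSERT INTO account VALUES ({placeholders})",
                     (row for row in reader if len(row) == len(header)))

conn.commit()
conn.close()

Even if the CSV is ugly, having it queryable makes it much easier to reverse-engineer the schema before deciding whether to pay someone to crack the native .FS5 format.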
Hope it helped.
Regards
I am trying to find some development best practises for SQL Server Reporting Services, Analysis Services and Integration Services.
Does anyone have some useful links or guidance they can offer on this subject?
I can only speak specifically to SSIS, although some of this will be applicable to the others as well.
Save your packages as files and put them in Source Control.
Where possible use variables for things that will change from server to server or run to run.
Use configuration files to save the configuration for different environments.
When processing data that comes from an outside source, assume it will change format without warning (i.e. check that the data you expect in each column is the data you actually got!). There's nothing like putting the emails in the last-name field (or, as happened to us once in DTS, the social security number in the field that said how much to pay the person; sure glad we caught that before someone got paid that amount).
Things I have seen happen include: adding new columns; removing columns that are critical to your process; rearranging the order of the columns (especially bad when the file itself does not have the column names); leaving the column titles the same but changing the data they contain (yes, once I got a file where the last name data was in the column labelled First_name and vice versa); data with new values that don't have a match to values in your system (I'm thinking of lookup-type things here, like medical specialties); flat-out strange data, such as notes in an email field; and names in this format: lastname - 'Willams, Jo', first_name - 'hn' (combine the two fields to get the whole name; apparently their data entry people just typed the name until they ran out of spaces and continued on in the next field, no matter where they were in the name!).
Don't put uncleaned data into your database.
Always retain a copy of any files that you process or send out. It's amazing how often you will need to go back and research them.
Log errors and log records that needed cleaning, especially if the problem in the field was such that it caused the process to fail. It is a whole lot easier to see the errors in a table than to know your 20 million record file failed because one record had an extra | in it and try to figure out which one it was.
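Outside of SSIS itself, the same pre-flight idea is easy to sketch in Python: verify the column layout before loading and divert bad rows to a rejects file instead of letting one stray delimiter kill the whole load. The expected column names, the pipe delimiter, and the file names below are all placeholders for whatever your feed actually uses:

import csv

EXPECTED = ["last_name", "first_name", "email", "pay_amount"]  # placeholder schema

def validate_feed(path, rejects_path):
    """Fail fast on header drift and divert bad rows instead of killing the load."""
    good, rejects = [], []
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="|")
        header = next(reader)
        if header != EXPECTED:
            raise ValueError(f"column layout changed: {header!r}")
        for lineno, row in enumerate(reader, start=2):
            # Example content checks: wrong field count (extra '|') or a
            # value that clearly doesn't belong in the email column.
            if len(row) != len(EXPECTED) or "@" not in row[2]:
                rejects.append((lineno, row))
            else:
                good.append(dict(zip(EXPECTED, row)))
    with open(rejects_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerows([lineno, *row] for lineno, row in rejects)
    return good

rows = validate_feed("sales_feed.txt", "sales_feed_rejects.csv")
print(f"{len(rows)} clean rows ready to load")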
If you do a lot of similar imports in SSIS, create a template project that has all the standard logging and data cleaning in it. It is a whole lot faster to start from a template, adjust to new mappings based on the new file you are working with, and make minor adjustments to things specific to that file than to rewrite every SSIS package from scratch.
Store metadata. Sooner or later you will be asked how often it failed, or how soon after the file was received the import happened, or even when the last import was. All our packages start and end with a task to store start and stop times in our metadata table. All failure paths include a task to mark the import as failed in our metadata. Eventually you can build a system that knows how many records to expect and fails the load if the new file is significantly off. Metadata can also be used to store things like the number of records, which can help identify when they sent a partial file instead of the whole file you were expecting and prevent you from blowing away 300,000 sales targets they actually still want.