Storing data in a struct or txt file? - c

Im making an mock ATM program for practice and was wondering is it better
to
store someones account information in a struct on another file and have each users information on that
Have it all in a txt file and just read it while making an algorithm to grab what you need based on pin location ?
right now iv just made a pin list in a text file which counts and stores the data into an array which i check which then opens an account based on that user input ect ect.
i can see both working but was just wondering whats the proper process for stuff such as this ?

It may depend on your operating system and the language/development tools you use as some will have libraries allowing easy access to structured data in a standardized format as pointed out above.
Text files are rarely the way to go long term since you will have to write all the code yourself, not only creating a lot of work but opening yourself to bugs and potential security issues. If you can use a library, a lot of that work is already done for you.

Related

How to store data from a .TIFF file to then modify it with my C program

I know few about this and i'm trying to keep building upon it. My goal is to do image stacking with some criteria using C language, as i came upon some cool ideas i think i should be capable of doing with my photos. My C background should be enough to understand what i may need. That being said...
So far i've learned how to read an existing .TIFF file and save it into a char array. The problem is i don't know in which way its data is contained so that i can then be able to analize individual pixels and modify them, or build another .TIFF file from data i previously read.
I've read some things about (a so called) libtiff.h which may be usefull but i can't find where to get it, neither how to install it.
Does anyone know how a .TIFF file data is stored so that i can read it and apply changes to it?
Also,
Does anyone have any experience with handling image files and editing in C? Where did you learn it from?
Do you know of any place i could search for information/tutorials?
Any help will be very usefull,
Thanks in advance.
You can do an enormous amount of very sophisticated processing on TIFFs, or any one of 190+ other formats with ImageMagick without any need to understand TIFF format or write any C. Try searching on Stack Overflow for [imagemagick]
If you want to do processing yourself, consider https://cimg.eu
Another option might be to convert your TIFFs to NetPBM which is much, much simpler to read and write in C. That would be as follows with ImageMagick:
magick INPUT.TIFF -compress none OUTPUT.PPM

Can file write mode affect HMAC of text written into file?

in an application we have a file with data, which we want to "protect" from tampering. By "protect" I mean - make it hard for a user to just edit the text file.
Data is stored in JSON format. Before writing to the file, we create JSON string and calculate HMAC. Then both of the information are written to a file. When file is read, we again generate the HMAC of the data and compare it to the stored hash.
It sounds like something trivial and something that should not cause any issues, however we are getting a lot of reports that this check fails. We get them also from our testers, who definitely didn't tamper the file.
I'm wondering if it's possible that reading/writing file could affect the process. Due to some legacy code, the file is read and written in binary mode. I'm wondering if this could affect the process? I'm not going to "let's change it and see what happens" before I have a confirmation that this could be the case.
The file is read and stored on the same system all the time, it's not transferred anywhere.
All of the operations are performed using Lua, if that makes any difference.
Thank you
Krystian

Is there a way to read HD data past EOF?

Is there a way to read a file's data but continue reading the data on the hard drive past the end of file? For normal file I/O I could just use fread(), but, obviously, that will only read to the end of the file. And it might be beneficial if I add that I need this on a Windows computer.
All my Googling for a way to do this is instead coming up with results about unrelated topics concerning EOF, such as people having problems with normal I/O.
My reasoning for this is that I accidentally deleted part of the text in a text file I was working on, and it was an entire day's worth of work. I Googled up a bunch of file recovery stuff, but it all seems to be about recovering deleted files, where my problem is that the file is still there but without some of its information, and I'm hoping some of that data still exists directly after the currently marked end of file and is neither fragmented elsewhere or already claimed or otherwise overwritten. Since I can't find a program that helps with this specifically, I'm hoping I can quickly make something up for it (I understand that, depending on what is involved, this might not be as feasible as just redoing the work, but I'm hoping that's not the case).
As far as I can foresee, though I might not be correct (not sure, which is why I'm asking for help), there are 3 possibilities.
Worst of the three: I have to look up Windows API functions that allow direct access to the entire hard drive (similar to its functions for memory, perhaps? those I have experience with) and scan the entire thing for the data that I still have access to from the file and then just continue looking at what's after it.
Second: I can get a pointer to the file, then I still have to get raw access to HD but at least have a pointer to the file in it?
Best of the three: Just open the file for write access, seek to the end, then write a ways past EOF to claim more space, but first hope that Windows won't clean the data before it hands it over to me so that I get garbage data which was the previous data in that spot which would actually be what I'm looking for? This would be awesome if it were that simple, but I'm afraid to test it out because I'd lose the data if it failed, so hopefully someone else already knows. The PC in question is running Vista Home Premium if that matters to anyone that knows the gory details of Windows.
Do either of those three seem plausible? Whether yea or nay, I'm also open (and eager) for other suggestions, especially those which are better than my silly ideas, and especially if they come with direction toward specific functions to use to get the job done.
Also, if anyone else actually has heard of a recovery program that doesn't just recover deleted files but which would actually work for a situation like this, and which is free and trustworthy, that works too.
Thanks in advance for any assistance.
You should get a utility for scanning the free space of a hard drive and recovering data from it, for example PhotoRec or foremost. Note however that if you've been using the machine much at all (even web browsing, which will create files in your cache), the data has likely already been overwritten. Do not save your recovery tools on the same hard drive, or even use the same PC to download them; get them from another computer and save them to a USB device, then run them from that device.
As for the conceptual content of your question, files are abstract objects. There is no such thing as data "past eof" except (depending on the implementation) perhaps up to the next multiple of the filesystem/disk "blocksize". Also it's possible (very likely) that your editor "saved" the file by truncating it and writing everything newly from the beginning, meaning there's not necessarily any correspondence between the old and new storage.
Your question doesn't make a lot of sense -- by definition there is nothing in the file after the EOF. By your further description, it appears that you want to read whatever happens to be on the disk after the last byte that is used by the file, which might be random garbage (unused space) or might be some other file. But in either case, this isn't 'data after the EOF' its just data on the disk that's not part of the file. Its even possible that it might be some other part of the same file, if the filesystem happens to lay out its data that way -- some filesystems scatter blocks in seemingly random ways across the disk and figuring out what bytes belong to which files requires understanding the filesystem metadata.

What should I know before poking around an unknown archive file for things?

A game that I play stores all of its data in a .DAT file. There has been some work done by people in examining the file. There are also some existing tools, but I'm not sure about their current state. I think it would be fun to poke around in the data myself, but I've never tried to examine a file, much less anything like this before.
Is there anything I should know about examining a file format for data extraction purposes before I dive headfirst into this?
EDIT: I would like very general tips, as examining file formats seems interesting. I would like to be able to take File X and learn how to approach the problem of learning about it.
You'll definitely want a hex editor before you get too far. It will let you see the raw data as numbers instead of as large empty blocks in whatever font notepad is using (or whatever text editor).
Try opening it in any archive extractors you have (i.e. zip, 7z, rar, gz, tar etc.) to see if it's just a renamed file format (.PK3 is something like that).
Look for headers of known file formats somewhere within the file, which will help you discover where certain parts of the data are stored (i.e. do a search for "IPNG" to find any (uncompressed) png files somewhere within).
If you do find where a certain piece of data is stored, take a note of its location and length, and see if you can find numbers equal to either of those values near the beginning of the file, which usually act as pointers to the actual data.
Some times you just have to guess, or intuit what a certain value means, and if you're wrong, well, keep moving. There's not much you can do about it.
I have found that http://www.wotsit.org is particularly useful for known file type formats, for help finding headers within the .dat file.
Back up the file first. Once you've restricted the amount of damage you can do, just poke around as Ed suggested.
Looking at your rep level, I guess a basic primer on hexadecimal numbers, endianness, representations for various data types, and all that would be a bit superfluous. A good tool that can show the data in hex is of course essential, as is the ability to write quick scripts to test complex assumptions about the data's structure. All of these should be obvious to you, but might perhaps help someone else so I thought I'd mention them.
One of the best ways to attack unknown file formats, when you have some control over contents is to take a differential approach. Save a file, make a small and controlled change, and save again. Do a binary compare of the files to find the difference - preferably using a tool that can detect inserts and deletions. If you're dealing with an encrypted file, a small change will trigger a massive difference. If it's just compressed, the difference will not be localized. And if the file format is trivial, a simple change in state will result in a simple change to the file.
The other thing is to look at some of the common compression techniques, notably zip and gzip, and learn their "signatures". Most of these formats are "self identifying" so when they start decompressing, they can do quick sanity checks that what they're working on is in a format they understand.
Barring encryption, an archive file format is basically some kind of indexing mechanism (a directory or sorts), and a way located those elements from within the archive via pointers in the index.
With the the ubiquitousness of the standard compression algorithms, it's mostly a matter of finding where those blocks start, and trying to hunt down the index, or table of contents.
Some will have the index all in one spot (like a file system does), others will simply precede each element within the archive with its identity information. But in the end somewhere, there is information about offsets from one block to another, there is information about data types (for example, if they're storing GIF files, GIF have a signature as well), etc.
Those are the patterns that you're trying to hunt down within the file.
It would be nice if somehow you can get your hand on two versions of data using the same format. For example, on a game, you might be able to get the initial version off the CD and a newer, patched version. These can really highlight the information you're looking for.

Writing more to a file than just plain text

I have always been able to read and write basic text files in C++, but so far no one has discussed much more than that.
My question is this:
If developing a file type by myself for use by an application I also create, how would I go about writing the data to a file and preserve the layout, formatting, etc.? Are there any standards, or does it just depend on the creativity of the programmer?
You basically have to come up with your own file format and write binary data.
You can also serialize your object model and write the output to a file, but that's usually less efficient.
Better to use an existing database, or use xml (or other) for simple needs. If you want to write a file in a format that already exists, find a library that supports it.
You have to know the binary file format for the file you are trying to create. Consider Joel's post on this topic: the 97-2003 File Format is a 349 page spec.
Nearly all the time, to do something like that, you use an API, to avoid the grunt work. Be careful however, because trial and error and figuring out "what works" by trial and error can result in an upgrade of the program breaking your code. Plus you have to take into account other operating systems, minor version differences, patches, etc.
There are a number of standards of course. The likely one to use is some flavor of xml since there are libraries and tools that already exist to help you work with it, but nothing is stopping you from inventing your own.
Well you could store the data in a format you could read, but which maintained the integrity of your data (XML or JSON for instance).
Or (shudder) you could come up with your own propriatory binary format, and use that.
you would go at it exactly the same way as you would a text file. writing your data byte by byte, encoded in such a way that when you read the file you know what you are reading.
for a spreadsheet application you could even use a text format (OOXML, OpenDocument) to store presentation and content information.
Or you could define binary datastructures and write that directly to the file.
the choice between text or binary format depends on the application. for a configuration file you may prefer a text file which can be modified outside your app, for a database you will most likely choose a binary format for performance reasons.
See wotsit.org for information on file formats for various file types. Example: You can figure out exactly how to write out a .BMP file and how it is composed.
Writing to a database can be done by using a wrapper class in your language, mainly passing it SQL commands.
If you create a binary file , you can write any file to it . The only drawback is that you have to know exactly where it starts and where it ends .
Use xml (something open, descriptive, and validatable), and stick with the text. There are standards for this sort of thing as well, including ODF
You can open the file as binary, instead of text (how one does this depends somewhat on the platform), from there you can write the data directly out to disk. The only real caveat to this is endianess, which can become an issue when moving the files from one architecture to another (x86 to PPC for instance).
Writing binary data to disk is really no harder than writing text, and really, your creativity is key for how you store the data.
The general problem is usually referred to as serialization of your application state and in your case with a source/target of a file in whatever format makes sense for you. These days the preferred input/output format is XML, and you may want to look into the existing standards in this field. The problem then becomes how do I map from the state of my system to the particular schema. Boost has a serialization framework that you may want to check out.
/Allan
There are a variety of approaches you can take, but in general you'll want some sort of serialization library. BOOST::Serialization, or Google's Protocal Buffers are a good example of these. The basic idea is that you have memory structures (classes and objects) that represent your data, and you want to write that data to a file in a way that can be used to reconstruct those structures again.
If you're hesitant to use a library, you can do it all manually, but realize that you can end up writing a lot of redundant code, or developing your own library. See fopen, fread, fwrite and fclose for a starting point.
A typical binary file format for custom data is an "indexed file format" consisting of
-------
|index|
-------
|data |
-------
Where the index contains records "pointing" to the data.
The index consists of records containing an offset and a size. The offset tells you where in the file the data is stored and the size tells you the size of the data at that offset (i.e. the number of bytes to read).
typedef struct {
size_t offset
size_t size
} Index
typedef struct {
int ID
char First[20]
char Last[20]
char *RandomInfo
} Data
Suppose you want to store 50 records in the file you would create 50 indices and 50 data structures. The 50 index structures would be written to the file first, followed by the 50 data structures.
To read the file you would read in the 50 index structures, then from the data in the read-in index structures you could tell where to "seek" to read the data records.
Look up (fopen, fread, fwrite, fclose, ftell) for functions to read/write the data.
(Sorry my semicolon key doesn't work)
You usually use a third party library for these things. For example, you would link in a database library for say Oracle that would allow you to talk to the database. Because the underlying file type, ( i.e. Excel spreadsheet vs Openoffice, Oracle vs MySQL, etc. ) differ these libraries abstract away your need to care how the file is constructed.
Hope that helps you find what you're looking for!
1985 called, and said they have some help IFF you are willing to read up. The interchange file format is still in use today and provides some basic metadata around binary files, such as RIFF or WAV audio. (Unfortunately, TIFF is a false friend.) It allegedly even inspired PNG, so it can't be that bad.

Resources