Atomicity with openTempFile - file

I have the following function:
safeWrite :: Text -> IO ()
safeWrite c = bracket (openTempFile "/tmp" "list.tmp")
                      (\(path, h) -> hClose h
                                  >> copyFile path dataFile
                                  >> removeFile path)
                      (\(_, h) -> TI.hPutStr h c)
I was under the impression that this would be safe: no copying would happen if there were errors at any moment, and the original file would still be usable. However, just yesterday I ended up with an empty file, and I have no idea where to start looking. The program had been running well for over a month without any hiccups, which points to some corner case I didn't think of.
Does the method guarantee atomicity, meaning the error is somewhere else, or if not, why not? What should I do to guarantee atomicity?

Your definition of mkTemp is atomic with respect to Haskell exceptions. If there is an exception it will print a message about the failure (leaving the file there).
It is not atomic with respect to the Unix file system -- other programs could overwrite the same file.
It does not clean up should there be a failure.
You can do a bit more to clean up, by optionally removing the file if there is an exception, or simply using the provided (atomic) mkTemp function:
openTempFile: http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/System-IO.html#g:22
or using the posix layer:
http://hackage.haskell.org/packages/archive/unix/latest/doc/html/System-Posix-Temp.html#Temp
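One thing worth noting about the code in the question: the release action of bracket runs whether or not the write succeeded, so the copy step still happens after a failed or partial write. The usual way to avoid that at the POSIX layer is to write a temporary file in the same directory as the target and rename() it over the target only after every step has succeeded. Below is a minimal sketch of that pattern in C, not the answerer's code; the function name atomic_write is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: replace `target` with `len` bytes from `data`.
 * Returns 0 on success, -1 on failure; on failure `target` is untouched. */
int atomic_write(const char *target, const char *data, size_t len)
{
    char tmp[4096];
    /* Put the temp file next to the target so rename() never has to
     * cross a filesystem boundary. */
    int n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", target);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = mkstemp(tmp);          /* creates and opens a unique file */
    if (fd == -1)
        return -1;

    size_t off = 0;
    while (off < len) {
        ssize_t written = write(fd, data + off, len - off);
        if (written < 0)
            goto fail;
        off += (size_t)written;
    }
    if (fsync(fd) != 0)             /* push the data to disk first */
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;
    if (rename(tmp, target) != 0)   /* the only step that touches target */
        goto fail;
    return 0;

fail:
    if (fd != -1)
        close(fd);
    unlink(tmp);                    /* never leave the temp file behind */
    return -1;
}
```

Two caveats with this sketch: mkstemp creates the file with mode 0600, so the permissions of the old target are not carried over, and a write interrupted by a signal is simply treated as a failure.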

Related

Writing in the executable while running the program

I'm writing a C program and I would like to be able to store data inside the executable file.
I tried making a function to write a single byte at the end of the file but it looks like it can't open the file because it reaches the printf and then gives "segmentation fault".
void writeByte(char c){
    FILE *f;
    f = fopen("game","wb");
    if(f == 0)
        printf("\nFile not found\n");
    fseek(f,-1,SEEK_END);
    fwrite(&c,1,sizeof(char),f);
    fclose(f);
}
The file is in the correct directory and the name is correct. When I try to read the last byte instead of writing it works without problems.
Edit: I know I should abort the program instead of trying to write anyway but my main problem is that the program can't open the file despite being in the same directory.
There are several unrelated problems in your code and the problem you're trying to solve.
First you lack proper error handling. If any function that can fail (like e.g. fopen) fails, you should act accordingly. If, for example you did
#include <error.h>
#include <errno.h>
...
f = fopen("game","wb");
if ( f == NULL ) {
    error(1,errno,"File could not be opened");
}
...
You would have received a useful error message like
./game: File could not be opened: Text file busy
You printed a message, which is not even correct (a file that cannot be opened is something different from a file that cannot be found), and continued the program, which resulted in a segmentation fault because the NULL pointer stored in f was dereferenced after the failure of fopen.
Second, as the message tells us (at least on my Linux machine), the file is busy. That means that my operating system does not allow me to open the executable I'm running in write mode. The answers to this question list numerous sources explaining this error message. There might be ways to get around this and open a running executable in write mode, but I doubt this is easy, and I doubt that it would solve your problem because:...
Third, executable files are stored in a special binary format (usually ELF on Linux). They are not designed to be manually modified. I don't know what happens if you just append data to one, but you could run into serious problems if you're not very careful and don't know what you're doing.
If you just try to store data, use another plain and fresh file. If you're hoping to append code to an executable, you really should gather some background information about ELF files (e.g. from man elf) before continuing.
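As a rough sketch of the "separate data file" suggestion with the error handling described above: the helper below appends one byte to a file named by the caller and reports the reason when something fails. The name append_byte is invented for this example. It opens with "ab" rather than "wb", because "wb" truncates an existing file to zero length on open.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append one byte to a separate data file.
 * Returns 0 on success, -1 on failure after reporting the cause. */
int append_byte(const char *path, char c)
{
    FILE *f = fopen(path, "ab");    /* append, binary; never truncates */
    if (f == NULL) {
        fprintf(stderr, "%s: cannot open: %s\n", path, strerror(errno));
        return -1;                  /* f must not be used after this */
    }
    if (fwrite(&c, sizeof c, 1, f) != 1) {
        fprintf(stderr, "%s: write failed: %s\n", path, strerror(errno));
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0) {           /* buffered data is flushed here */
        fprintf(stderr, "%s: close failed: %s\n", path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Returning early when fopen fails is what prevents the segmentation fault from the question: the NULL pointer never reaches fseek or fwrite.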

Renaming & moving my file based on the size is not always working in c. Why?

I have an application, written in C, which generates various data parameters that I am logging into a text file named debug_log.txt. Whenever this log file reaches 1 MB, I rename it with a timestamp (e.g. debug_log_20200106_133000.txt), keeping it in the same directory. I then reopen debug_log.txt to log new parameters.
if(stat("/home/log/debug_log.txt", &statFiledbg) == 0)
{
    if(statFiledbg.st_size >= 1048576) // 1MB
    {
        current_time = time(0);
        strftime(time_buffer, sizeof(time_buffer), "%Y%m%d_%H-%M-%S", gmtime(&current_time));
        sprintf(strSysCmddbg, "mv /home/log/debug_log.txt /home/log/debug_log%s.txt", time_buffer);
        system(strSysCmddbg);
        fp_dbglog = freopen("/home/log/debug_log.txt", "w", fp_dbglog);
    }
}
The code works most of the time, until it doesn't. After running the application for a couple of days, I see that debug_log.txt grows beyond 1 MB while the last moved and renamed log file is empty.
What could be the reason?
Use the rename function from the C standard library (in stdio.h) and check errno if it failed to know the exact reason why it is failing.
When working with files, and I/O in general, there are many, many things that can go wrong.
One of the senior developers at my company told me so. Is there anything wrong with using system()?
Yes: it is unnecessary (C and POSIX provide you with a function for basic usages like this), nonportable (it assumes you are on a system that has a "mv"), slower (it needs to spawn another process) and wrong for many use cases (e.g. here there is no way to know what exactly failed unless you save the textual output of mv).
See questions and answers like Moving a file on Linux in C for an in-depth explanation.
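To make the rename() suggestion concrete, here is a small sketch of the rotation step with every call checked. It is only an illustration under assumptions: the function name rotate_log, its parameters and the "<live>.<timestamp>" archive naming are not from the question.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: if `live` has reached `limit` bytes, rename it to
 * "<live>.<UTC timestamp>" and reopen *fp on a fresh, empty `live`.
 * Returns 1 if rotated, 0 if nothing to do, -1 on error.
 * The archive name is written to `archived` only when rotating. */
int rotate_log(FILE **fp, const char *live, long limit,
               char *archived, size_t archived_size)
{
    struct stat st;
    if (stat(live, &st) != 0) {
        fprintf(stderr, "stat %s: %s\n", live, strerror(errno));
        return -1;
    }
    if (st.st_size < limit)
        return 0;

    char stamp[32];
    time_t now = time(NULL);
    struct tm *utc = gmtime(&now);
    if (utc == NULL || strftime(stamp, sizeof stamp, "%Y%m%d_%H%M%S", utc) == 0)
        return -1;
    int n = snprintf(archived, archived_size, "%s.%s", live, stamp);
    if (n < 0 || (size_t)n >= archived_size)
        return -1;

    fflush(*fp);                    /* flush buffered lines before the move */
    if (rename(live, archived) != 0) {
        fprintf(stderr, "rename %s -> %s: %s\n", live, archived, strerror(errno));
        return -1;                  /* errno says exactly what went wrong */
    }
    *fp = freopen(live, "w", *fp);  /* on failure the old stream is closed */
    if (*fp == NULL) {
        fprintf(stderr, "freopen %s: %s\n", live, strerror(errno));
        return -1;
    }
    return 1;
}
```

With this naming, two rotations within the same second would map to the same archive name and the second rename would replace the first archive, so a real logger might add a counter or check for an existing file first.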

tcl "open" command not working when replacing Tcl_Filesystem with a duplicate

I'm trying to write a custom filesystem for Tcl using the Tclapi (it's work related, won't go into details), but I'm stuck trying to figure out why this is not working.
In this code segment I'm getting the original/native Tcl_Filesystem, copying over all its contents (function pointers) to my_fs, and then calling Tcl_FSRegister on my_fs. Very simple, thought it should work.
// global scope
const Tcl_Filesystem *ori_fs;
Tcl_Filesystem *my_fs;
...
// in Init
// Get the original Tcl_Filesystem.
Tcl_Obj *root_obj = Tcl_NewStringObj("/", -1);
Tcl_IncrRefCount(root_obj);
ori_fs = Tcl_FSGetFileSystemForPath(root_obj);
Tcl_DecrRefCount(root_obj);
// create a duplicate of the original Tcl_Filesystem struct.
my_fs = malloc(sizeof(Tcl_Filesystem));
memmove(my_fs, ori_fs, ori_fs->structureLength);
int ret = Tcl_FSRegister((ClientData)1, my_fs);
if (ret == TCL_ERROR) {
...
When I ran
load <path to .so>/my_fs[info sharedlibextension]
# sanity check
puts [pwd]
set fp [open test.txt]
however, I get this
<my current directory>
while executing
"open test.txt"
invoked from within
"set fp [open test.txt]"
(file "test.tcl" line 3)
Notice how "puts [pwd]" works but not "open test.txt" ?
Replacing "my_fs" with "ori_fs" in the call to Tcl_FSRegister seems to work...
I've already spent far too much time trying to figure this out. I would appreciate if anyone could help me with this!
The native filesystem is special. In particular, there are some places where its identity is used directly: for example, it's the only FS that can have temporary files made on it, it's assumed to own the roots, and it is handled specially in path management. (Well, that's according to where the source code has direct references to the Tcl internal variable tclNativeFilesystem, which isn't something you can cheat at. It's also possibly in read-only memory, so you can't hack around this.)
For most sane uses of a Tcl virtual filesystem, this doesn't matter. Temp files have to be native because you may well be passing them to the OS (e.g., for loading libraries or running programs that were inside the VFS; with these, they have to be copied out or the OS will think “what are you talking about?!”) and you put the things that you are mounting somewhere other than the native root. So long as you're not trying to use a VFS as a security measure (not recommended; there are safe interpreters for that as they offer a stronger sandboxing solution) it shouldn't be a problem as you can just make your code know that it needs to work below a particular location to get things done. (FWIW, it's a bad idea to cd anyway, except in response to user requests, since it changes the meaning of user-supplied relative paths, so good code handles “make everything relative to a defined location” from the start.)

SHA-1, RFC3174 and RFC4634

New to the community, but not new to programming.
I've been trying to get a collection of hash functions up and running, and I succeeded. However, I found some weird results and haven't been able to put my finger on the cause yet. RFC4634 contains a C implementation of SHA-1 and the SHA-2 family, which can also accept a file to hash. RFC3174 contains a C implementation, but doesn't process file streams. I've been using the C implementation from RFC4634 to verify files, yet the verification process returns different results when I compare them against the provided SHA-1 hashes.
Any idea what the reasons could be?
Did you check if you opened the files in ASCII or binary mode? Line end translation may be performed before the hash is being calculated.
Update:
I just compiled the RFC4634 shatest and tried it on a sample text file. As long as there isn't a line break, all tools agree. Once you insert a line break, results vary: if the text file uses CR and LF (DOS mode), then shatest produces a different result. If the line end is only LF (UNIX), it still agrees with the other tools.
Update 2:
In the file shatest.c of RFC4634, in function hashfile(...), set fopen to binary mode:
FILE *hashfp = (strcmp(hashfilename, "-") == 0) ? stdin :
    fopen(hashfilename, "rb");
/*                       ^ HERE */
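For illustration, here is a small self-contained sketch of reading a file in binary mode, so that every byte, including any CR before an LF, reaches the consumer unchanged. The function name scan_file_binary is made up, and the additive checksum is only a stand-in for the point where a real program would feed each chunk to the hash; it is not SHA-1.

```c
#include <stdio.h>

/* Hypothetical helper: read `path` in binary mode, reporting the number of
 * bytes seen and a trivial additive checksum (NOT a cryptographic hash).
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int scan_file_binary(const char *path, unsigned long *count, unsigned long *sum)
{
    FILE *fp = fopen(path, "rb");   /* "b": no line-end translation */
    if (fp == NULL)
        return -1;

    unsigned char buf[4096];
    size_t n;
    *count = 0;
    *sum = 0;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        *count += n;
        for (size_t i = 0; i < n; i++)
            *sum += buf[i];         /* a real program would update the hash here */
    }
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : 0;
}
```

On POSIX systems "r" and "rb" behave identically, which fits the observation above that LF-only files already agreed; the difference only shows up on platforms that translate line endings in text mode.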

Linux C: Shell-like environment - for individual execution - of C commands? (C interpreter)

Sorry if the question is worded wrong - I don't know the right word for what I'm asking for! :)
Say, you have some simple C program like:
#include <stdio.h>
int main()
{
    int a=2;
    printf("Hello World %d\n", a);
    return 0;
}
Typically, this would have to be saved in a file (say, hello.c); then we run gcc on the source file and obtain executable file - and if we compiled in debug information, then we can use gdb on the executable, to step through lines of code, and inspect variables.
What I would like to have, is basically some sort of a "C" shell - similar to the Python shell; in the sense that I can have a sequence of Python commands in a file (a script) - or I can just paste the same commands in the shell, and they will execute the same. In respect to the simple program above, this is what I'd like to be able to do (where C> represents the imagined prompt):
C> #include <stdio.h>
(stdio.h included)
C> int a=2;
C> printf("Hello World %d\n", a);
Hello World 2
C>
In other words, I'd like to be able to execute individual C commands interactively (I'm guessing this would represent on-the-fly compilation of sorts?). Initially I was misled by the name of the C shell (csh) - but I don't think it will be able to execute C commands on the fly.
So, first and foremost, I'd like to know if it is possible somehow to persuade, say, gdb to perform in this manner? If not, is there anything else that would allow me to do something similar (some special shell, maybe)?
As for the context - I have some code where I have problems troubleshooting pointers between structs and such; here the way gdb can printout structs works very well - however, to isolate the problem, I have to make new source files, paste in data, compile and debug all over again. In this case, I'd much rather have the possibility to paste several structs (and their initialization commands) in some sort of a shell - and then, inspect using printf (or even better, something akin to gdb's print) typed directly on the shell.
Just for the record - I'm not really persuaded something like this really exists; but I thought I'd ask anyways :)
Thanks in advance for any answers,
Cheers!
EDIT: I was a bit busy, so haven't had time to review all answers yet for accept (sorry :) ); just wanted to add a little comment re: "interpreted vs. machine code"; or as mentioned by @doron:
The problem with running C /C++ source interactively is that
the compiler is not able to perform line by line interpretation of the code.
I am fully aware of this - but let's imagine a command line application (could even be an interpreted one), that gives you a prompt with a command line interface. At start, let's assume this application generates this simple "text file" in memory:
##HEADER##
int main()
{
##MAIN##
return 0;
}
Then, the application simply waits for a text to be entered at the prompt, and ENTER to be pressed; and upon a new line:
The application checks:
if the line starts with #define or #include, then it is added below the ##HEADER## - but above the int main() line - in the temp file
anything else, goes below ##MAIN## line - but above return 0; line - in the temp file
the temp file is stripped of ##HEADER## and ##MAIN## lines, and saved to disk as temp.c
gcc is called to compile temp.c and generate temp.out executable
if fail, notify user, exit
gdb is called to run the temp.out executable, with a breakpoint set at the return 0; line
if fail, notify user, exit
execution is returned to the prompt; the next commands the user enters, are in fact passed to gdb (so the user can use commands like p variable to inspect) - until the user presses, say, Ctrl+1 to exit gdb
Ctrl+1 - gdb exits, control is returned to our application - which waits for the next code line all over again.. etc
(subsequent code line entries are kept in the temp file - placed below the last entry from the same category)
Obviously, I wouldn't expect to be able to paste the entire linux kernel code into an application like this, and expect it to work :) However, I would expect to be able to paste in a couple of structs, and to inspect the results of statements like, say:
char dat = (char) (*(int16_t*)(my->structure->pdata) >> 32 & 0xFF) ^ 0x88;
... so I can be sure what the proper syntax is (which is usually what I mess up) - without the overhead of rebuilding and debugging the entire software, just to figure out whether I should have moved a right parenthesis before or after the asterisk sign (in the cases when such an action doesn't raise a compilation error, of course).
Now, I'm not sure of the entire scope of problems that can arise from a simplistic application architecture as above. But, it's an example, that simply points that something like a "C shell" (for relatively simple sessions/programs) would be conceptually doable, by also using gcc and gdb - without any serious clashes with the, otherwise, strict distinction between 'machine code' and 'interpreted' languages.
There are C interpreters.
Look for Ch or CINT.
Edit: found a new (untested) thing that appears to be what the OP wants
c-repl
Or just use it [...] like driving a Ferrari on city streets.
Tiny C Compiler
[... many features, including]
C script supported : just add '#!/usr/local/bin/tcc -run' at the first line of your C source, and execute it directly from the command line.
When your CPU runs a computer program, it runs something called machine code. This is a series of binary instructions that are specific to the CPU that you are using. Since machine code is quite hard to write by hand, people invented higher-level languages like C and C++. Unfortunately the CPU only understands machine code. So what happens is that we run a compiler that converts the high-level source language into machine code. Languages in this class, such as C and C++, are called compiled languages. They are said to run natively, since the generated machine code is run by the CPU without any further interpretation.
Now certain languages like Python, Bash and Perl do not need to be compiled beforehand and are interpreted instead. This means that the source file is read line by line by the interpreter and the correct task for each line is performed. This gives you the ability to run things in an interactive shell, as we see in Python.
The problem with running C / C++ source interactively is that the compiler is not able to perform line-by-line interpretation of the code. It is designed solely to generate the corresponding machine code and therefore cannot run your C / C++ source interactively.
@buddhabrot and @pmg - thank you for your answers!
For the benefit of n00bery, here is a summary of the answers (as I couldn't immediately grasp what is going on): what I needed (in OP) is handled by what is called a "C Interpreter" (not a 'C shell'), of which the following were suggested:
CINT | ROOT - Ubuntu: install as sudo apt-get install root-system-bin (5.18.00-2.3ubuntu4 + 115MB of additional disk space)
c-repl (c-repl README)- Ubuntu: install as sudo apt-get install c-repl (c-repl_0.0.20071223-1_i386.deb + 106kB of additional disk space)
Ch standard edition - standard edition is freeware for windows/Unix
For c-repl - there is a quick tutorial on c-repl homepage as an example session; but here is how the same commands behave on my Ubuntu Lucid system, with the repository version (edit: see Where can I find c-repl documentation? for a better example):
$ c-repl
> int x = 3
> ++x
> .p x
unknown command: p
> printf("%d %p\n", x, &x)
4 0xbbd014
> .t fprintf
repl is ok
> #include <unistd.h>
<stdin>:1:22: warning: extra tokens at end of #include directive
> getp
p getp
No symbol "getp" in current context.
> printf("%d\n", getpid())
10284
> [Ctrl+C]
/usr/bin/c-repl:185:in `readline': Interrupt
from /usr/bin/c-repl:185:in `input_loop'
from /usr/bin/c-repl:184:in `loop'
from /usr/bin/c-repl:184:in `input_loop'
from /usr/bin/c-repl:203
Apparently, it would be best to build c-repl from latest source.
For cint it was a bit difficult to find something related to it directly (the webpage refers to ROOT Tutorials instead), but then I found "Le Huy: Using CINT - C/C++ Interpreter - Basic Commands"; and here is an example session from my system:
(Note: if cint is not available on your distribution's package root-system-bin, try root instead.)
$ cint
cint : C/C++ interpreter (mailing list 'cint@root.cern.ch')
Copyright(c) : 1995~2005 Masaharu Goto (gotom@hanno.jp)
revision : 5.16.29, Jan 08, 2008 by M.Goto
No main() function found in given source file. Interactive interface started.
'?':help, '.q':quit, 'statement','{statements;}' or '.p [expr]' to evaluate
cint> L iostream
Error: Symbol Liostream is not defined in current scope (tmpfile):1:
*** Interpreter error recovered ***
cint> {#include <iostream>}
cint> files
Error: Symbol files is not defined in current scope (tmpfile):1:
*** Interpreter error recovered ***
cint> {int x=3;}
cint> {++x}
Syntax Error: ++x Maybe missing ';' (tmpfile):2:
*** Interpreter error recovered ***
cint> {++x;}
(int)4
cint> .p x
(int)4
cint> printf("%d %p\n", x, &x)
4 0x8d57720
(const int)12
cint> printf("%d\n", getpid())
Error: Function getpid() is not defined in current scope (tmpfile):1:
*** Interpreter error recovered ***
cint> {#include <unistd.h>}
cint> printf("%d\n", getpid())
10535
(const int)6
cint> .q
Bye... (try 'qqq' if still running)
In any case, that is exactly what I needed: ability to load headers, add variables, and inspect the memory they will take! Thanks again, everyone - Cheers!
Python and C belong to different kinds of language. Python is interpreted line by line at run time, whereas C must be compiled and linked to generate code before it can run.
