In sqlite3, is reset required after a failed call to step? - c

Is it a requirement to call sqlite3_reset() on a prepared statement after a failed call to sqlite3_step()? I'm using sqlite3 version 3.23.1. The lifecycle of my prepared statements is as follows:
At the start of my application, I globally do sqlite3_prepare_v2() and keep the handle to the prepared statement available for the lifetime of the application.
When I'm ready to do a query, I invoke one of the sqlite3_bind_*() functions, then do sqlite3_step() on that statement until I get something other than SQLITE_ROW returned.
Then the code below is executed to reset the statement.
Here is the part of the code that happens after I call sqlite3_step(). Note that variable resultCode holds the return value of the last call to sqlite3_step().
if (resultCode == SQLITE_DONE || resultCode == SQLITE_ROW)
{
    if (sqlite3_reset(m_statement) != SQLITE_OK)
    {
        LogDbFailure(*m_db, "sqlite3_reset()");
    }
}
else
{
    LogDbFailure(*m_db, "sqlite3_step()");
    success = false;
}
Notice that if the call to step failed, I don't do a reset. Nothing in the documentation or search results on Google indicates that sqlite3_reset() must be called on failures. In fact, the documentation states that calling sqlite3_reset() after a failure will also fail:
If the most recent call to sqlite3_step(S) for the prepared statement S indicated an error, then sqlite3_reset(S) returns an appropriate error code.
Reading this made me think that maybe I shouldn't call the reset function if step fails.
Can anyone clarify? Note that in my case, sqlite3_step() is failing with SQLITE_BUSY, and I am using WAL journaling mode. Once step fails on a prepared statement, that statement is forever in a busy state on subsequent calls to sqlite3_step(). Calls to sqlite3_bind_*() after that return sqlite3_bind_int64() failed (21): bad parameter or other API misuse (the log format is my own, but 21 is the error code). This makes me think that reset should be called in failure cases, since all of the errors seem to indicate that the database is busy because the prepared statement is stuck mid-transaction for lack of a reset.

Notice that if the call to step failed, I don't do a reset. Nothing in
the documentation or search results on Google indicates that
sqlite3_reset() must be called on failures.
Well no, not specifically, but the docs for sqlite3_reset() do say
The sqlite3_reset() function is called to reset a prepared statement
object back to its initial state, ready to be re-executed.
You add,
In fact, the documentation
states that calling sqlite3_reset() after a failure will also fail:
If the most recent call to sqlite3_step(S) for the prepared statement S indicated an error, then sqlite3_reset(S) returns an
appropriate error code.
No, you are misinterpreting that. There is an important distinction between "returns an appropriate error code" and "will fail". That should be clearer when considered in the context of this excerpt from the docs for sqlite3_step():
In the legacy interface, the sqlite3_step() API always returns a
generic error code, SQLITE_ERROR, following any error other than
SQLITE_BUSY and SQLITE_MISUSE. You must call sqlite3_reset() or
sqlite3_finalize() in order to find one of the specific error codes
that better describes the error.
Although that behavior of sqlite3_step() applies only to the legacy interface, not the V2 interface, it explains why the return value of sqlite3_reset() reports on the result of previous calls (if any) to sqlite3_step(), not on its own success or failure. It is implicit that the reset itself cannot fail, or at least cannot report on its own failure via its return code.
Reading this made me think that maybe I shouldn't call the reset
function if step fails.
The docs for sqlite3_step() have this to say on that point:
For all versions of SQLite up to and including 3.6.23.1, a call to
sqlite3_reset() was required after sqlite3_step() returned anything
other than SQLITE_ROW before any subsequent invocation of
sqlite3_step().
Note: it is therefore not wrong to call sqlite3_reset() after sqlite3_step() reports an error. The docs go on to say,
Failure to reset the prepared statement using
sqlite3_reset() would result in an SQLITE_MISUSE return from
sqlite3_step(). But after version 3.6.23.1 (2010-03-26), sqlite3_step()
began calling sqlite3_reset() automatically in this circumstance
rather than returning SQLITE_MISUSE.
That seems inconsistent with the behavior you report, but note that,
[...] The SQLITE_OMIT_AUTORESET
compile-time option can be used to restore the legacy behavior.
Thus, your safest bet is to reset the statement unconditionally, rather than to avoid resetting it after an error is reported. That might be unnecessary with many SQLite3 builds, but it is not wrong or harmful, and it is necessary with some builds.
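As a concrete illustration, here is a minimal self-contained sketch of that unconditional-reset flow (the table, the run_statement helper, and the stderr logging are placeholders for this answer, not code from the question):
#include <sqlite3.h>
#include <stdio.h>

/* Step a prepared statement to completion, then reset it unconditionally
   so it can be re-bound and re-executed later. */
static int run_statement(sqlite3 *db, sqlite3_stmt *stmt)
{
    int rc;

    /* Step until the statement stops producing rows. */
    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW) {
        /* ... read columns with sqlite3_column_*() ... */
    }

    if (rc != SQLITE_DONE) {
        /* e.g. SQLITE_BUSY: log it, but still fall through to the reset. */
        fprintf(stderr, "sqlite3_step() failed: %s\n", sqlite3_errmsg(db));
    }

    /* The reset's return value only echoes the prior step error, so it is
       used for logging at most, never as a reason to skip the reset. */
    sqlite3_reset(stmt);
    sqlite3_clear_bindings(stmt);

    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *stmt;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK)
        return 1;
    sqlite3_exec(db, "CREATE TABLE t(x); INSERT INTO t VALUES (1);", 0, 0, 0);
    if (sqlite3_prepare_v2(db, "SELECT x FROM t;", -1, &stmt, NULL) == SQLITE_OK) {
        run_statement(db, stmt);   /* can be repeated: bind, step, reset */
        sqlite3_finalize(stmt);
    }
    sqlite3_close(db);
    return 0;
}
With this flow, an SQLITE_BUSY from sqlite3_step() no longer leaves the statement stuck mid-transaction: the next attempt can bind and step again from a clean state.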

Related

cgo Interacting with C Library that uses Thread Local Storage

I'm in the midst of wrapping a C library with cgo to be usable by normal Go code.
My problem is that I'd like to propagate error strings up to the Go API, but the C library in question makes error strings available via thread-local storage; there's a global get_error() call that returns a pointer to thread local character data.
My original plan was to call into C via cgo, check if the call returned an error, and if so, wrap the error string using C.GoString to convert it from a raw character pointer into a Go string. It'd look something like C.GoString(C.get_error()).
The problem that I foresee here is that TLS in C works on the level of native OS threads, but in my understanding, the calling Go code will be coming from one of potentially N goroutines that are multiplexed across some number of underlying native threads in a thread pool managed by the Go scheduler.
What I'm afraid of is running into a situation where I call into the C routine, then after the C routine returns, but before I copy the error string, the Go scheduler decides to swap the current goroutine out for another one. When the original goroutine gets swapped back in, it could potentially be on a different native thread for all I know, but even if it gets swapped back onto the same thread, any goroutines that ran there in the intervening time could've changed the state of the TLS, causing me to load an error string for an unrelated call.
My questions are these:
Is this a reasonable concern? Am I misunderstanding something about the go scheduler, or the way it interacts with cgo, that would cause this to not be an issue?
If this is a reasonable concern, how can I work around it?
cgo somehow manages to propagate errno values back to the calling Go code, which are also stored in TLS, which makes me think there must be a safe way to do this.
I can't think of a way that the C code itself could get preempted by the Go scheduler, so should I introduce a wrapper C function and have IT make the necessary call and then conditionally copy the error string before returning back up to Go land?
I'm interested in any solution that would allow me to propagate the error strings out to the rest of Go, but I'm hoping to avoid any solution that would require me to serialize accesses around the TLS, as adding a lock just to grab an error string seems greatly unfortunate to me.
Thanks in advance!
What I'm afraid of is running into a situation where I call into the C routine, then after the C routine returns, but before I copy the error string, the Go scheduler decides to swap the current goroutine out for another one. ...
Is this a reasonable concern?
Yes. The cgo "call C code" wrappers lock on to one POSIX / OS thread for the duration of each call, but the thread they lock is not fixed for all time; it does in fact bop around, as it were, to multiple different threads over time, as long as your goroutines are operating normally. (Since Go is cooperatively scheduled in the current implementations, you can, in some circumstances, be careful not to do anything that might let you switch underlying OS threads, but this is probably not a good plan.)
You can use runtime.LockOSThread here, but I think the best plan is otherwise:
how can I work around it?
Grab the error before Go resumes its normal scheduling algorithm (i.e., before unlocking the goroutine from the C / POSIX thread).
cgo somehow manages to propagate errno values ...
It grabs the errno value before unlocking the goroutine from the POSIX thread.
My original plan was to call into C via cgo, check if the call returned an error, and if so, wrap the error string using C.GoString to convert it from a raw character pointer into a Go string. It'd look something like C.GoString(C.get_error()).
If there is a variant of this that takes the error number (rather than fishing it out of a TLS variable), that plan should still work: just make sure that your C routines provide both the return value and the error number.
If not, write your own C wrapper, just as you suggested:
ftype wrapper_for_realfunc(char **errp, arg1type arg1, arg2type arg2) {
    ftype ret = realfunc(arg1, arg2);
    if (IS_ERROR(ret)) {
        /* capture the thread-local error text while still on this thread */
        *errp = get_error();
    } else {
        *errp = NULL;
    }
    return ret;
}
Now your Go wrapper simply calls the wrapper, which fills in a pointer to C memory with an extra *C.char argument, setting it to nil if there is no error, and setting it to something on which you can use C.GoString if there is an error.
If that's not feasible for some reason, consider using runtime.LockOSThread and its counterpart, runtime.UnlockOSThread.

libcurl call back functions, using the easy interface

I am trying to understand, and make sure I am right about, the nature of the "callback functions" used by libcurl.
As usual, after setting all the options using curl_easy_setopt, I would call curl_easy_perform.
But when a callback function is set, will libcurl actually have delivered absolutely all the data before curl_easy_perform returns?
I understand that the multi interface is there to provide non-blocking capabilities. But callback functions are meant to be called "later in time" while other work is taking place, right? So with the easy interface, is it really blocking until all data is received?
How can I test this?
I have been researching this and have put two quotes from the libcurl docs below, but I am stuck trying to reconcile the concepts of callback functions and blocking behavior.
http://curl.haxx.se/libcurl/c/curl_easy_perform.html
curl_easy_perform - man page:
curl_easy_perform performs the entire request in a blocking manner and returns when done, or if it failed. For non-blocking behavior, see curl_multi_perform.
http://curl.haxx.se/libcurl/c/curl_multi_perform.html
curl_multi_perform - man page:
This function handles transfers on all the added handles that need attention in a non-blocking fashion.
Please note that the aim is to make sure that, by the time my function call returns, the application has ALL the data. We are doing things strictly sequentially and cannot afford chunks of data arriving at different times.
Yes, the easy interface is blocking until the entire request is complete. You can test this by doing lots of requests and verifying that it works this way - or just trust the docs and the thousands of users who depend on this behavior.
"Callbacks" means that they are functions you write and provide that get "called back" from the function you invoke. So, you call curl_easy_perform() and then libcurl itself calls back to your callback function(s) according to the documentation all the way until either something failed or the transfer is complete and then curl_easy_perform() returns back to your program again.

Does aio_write always write the whole buffer?

I know that the POSIX write function can return successfully even though it didn't write the whole buffer (if interrupted by a signal). You have to check for short writes and resume them.
But does aio_write have the same issue? I don't think it does, but it's not mentioned in the documentation, and I can't find anything that states that it doesn't happen.
Short answer
Excluding any case of error: practically yes, theoretically not necessarily.
Long answer
In my experience, the caller does not need to call aio_write() more than once to write the whole buffer.
This, however, is not a guarantee that the whole buffer passed in will really be written. A final call to aio_error() gives the result of the whole asynchronous I/O operation, which could indicate an error.
Anyhow, the documentation does not explicitly exclude the case where the final call to aio_return() returns a value less than the number of bytes specified in the original call to aio_write(). That would indeed have to be interpreted as meaning that not the whole buffer was written, in which case it would be necessary to call aio_write() again, passing in whatever the previous call indicated was left over.
The list of error codes on this page doesn't include EINTR, which is the value in errno that means "please call again to do some more work". So, no, you shouldn't need to call aio_write again for the same piece of data to be written.
This doesn't mean that you can rely on every write being completed. You could still, for example, get a partial write because the disk is full or some such. But you don't need to check for EINTR and "try again".
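To make the checking concrete, here is a minimal sketch under the assumption of a POSIX system with the AIO library available (on Linux, link with -lrt); it queues one aio_write(), waits for completion, and compares aio_return() against the requested length.
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    static const char buf[] = "hello, aio\n";
    int fd = open("aio_demo.out", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = (void *)buf;
    cb.aio_nbytes = sizeof buf - 1;
    cb.aio_offset = 0;

    if (aio_write(&cb) != 0) {        /* this call only queues the request */
        perror("aio_write");
        return 1;
    }

    /* Wait until the operation is no longer in progress. */
    const struct aiocb *list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    int err = aio_error(&cb);         /* 0 on success, otherwise an errno value */
    ssize_t done = aio_return(&cb);   /* bytes written, like write()'s return */

    if (err != 0)
        fprintf(stderr, "aio failed: %s\n", strerror(err));
    else if ((size_t)done < cb.aio_nbytes)
        fprintf(stderr, "short write: %zd of %zu bytes\n", done, cb.aio_nbytes);

    close(fd);
    return 0;
}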

Is there a way to check if a (file) handle is valid?

Is there any way to check if a handle, in my case returned by CreateFile, is valid?
The problem I face is that a valid file handle returned by CreateFile (it is not INVALID_HANDLE_VALUE) later causes WriteFile to fail, and GetLastError claims that it is because of an invalid handle.
Since it seems that you are not setting the handle value to INVALID_HANDLE_VALUE after closing it, what I would do is set a read watchpoint on the HANDLE variable, which will cause the debugger to break at each line that accesses the value of the HANDLE. You will be able to see the order in which the variable is accessed, including when the variable is read in order to pass it to CloseHandle.
See: Adding a watchpoint (breaking when a variable changes)
Your problem is most probably caused by one of two things:
You close the file handle but nevertheless still try to use it
The file handle is overwritten due to memory corruption
Generally it's good practice to assign INVALID_HANDLE_VALUE to every handle variable whenever it is not supposed to contain a valid handle value.
In simple words: when your variable is declared, immediately initialize it to this value, and also write this value into the variable immediately after you close the file handle.
This will give you an indication of (1): an attempt to use a file handle that is already closed (or hasn't been opened yet).
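A minimal sketch of that convention (the file name and the error reporting are placeholders for illustration):
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Initialize to the "no handle" sentinel as soon as the variable exists. */
    HANDLE h = INVALID_HANDLE_VALUE;

    h = CreateFileA("example.txt", GENERIC_WRITE, 0, NULL,
                    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* ... use the handle ... */

    CloseHandle(h);
    h = INVALID_HANDLE_VALUE;   /* mark it invalid again immediately */

    /* A later misuse is now caught by a cheap check instead of silently
       hitting whatever object happens to reuse the old handle value. */
    if (h == INVALID_HANDLE_VALUE)
        printf("handle is closed, not using it\n");
    return 0;
}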
The other answers are all important for your particular problem.
However, if you are given a HANDLE and simply want to find out whether it is indeed an open file handle (as opposed to, e.g., a handle to a mutex or a GDI object etc.), there is the Windows API function GetFileInformationByHandle for that.
Depending on the permissions your handle grants you for the file, you can also try to read some data from it using ReadFile or perform a null write operation using WriteFile with nNumberOfBytesToWrite set to 0.
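A rough sketch of such a probe, assuming all you want is a yes/no answer about whether the handle currently refers to an open file object (the helper name is made up for illustration):
#include <windows.h>
#include <stdio.h>

/* Returns nonzero if h currently refers to an open file object.
   It says nothing about whether it is the file you expect. */
static int looks_like_open_file(HANDLE h)
{
    BY_HANDLE_FILE_INFORMATION info;
    if (h == NULL || h == INVALID_HANDLE_VALUE)
        return 0;
    return GetFileInformationByHandle(h, &info) != 0;
}

int main(void)
{
    HANDLE h = CreateFileA("probe_demo.txt", GENERIC_READ | GENERIC_WRITE,
                           0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    printf("open file handle? %d\n", looks_like_open_file(h));
    if (h != INVALID_HANDLE_VALUE)
        CloseHandle(h);
    return 0;
}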
Open files are kept as a data structure in the kernel. I don't think there is an official way to detect whether a file handle is valid; just use it and check the error code (e.g. ERROR_INVALID_HANDLE). Are you sure no other thread closed that file handle?
Checking the validity of the handle is a band-aid, at best.
You should debug the process: set a breakpoint at the point where the handle is set up (the file open), and once you hit that code and the handle has been set up, set a second conditional breakpoint to trigger when the handle value changes.
This should enable you to work out the underlying cause rather than just check the handle is valid on each access, which is unreliable, costly and not necessary given correct logic.
Just to add to what everyone else is saying, make sure that you check the return value when you call CreateFile. IIRC, it will return INVALID_HANDLE_VALUE on failure, at which point you should call GetLastError to find out why.

C how to handle malloc returning NULL? exit() or abort()

When malloc() fails, which would be the best way to handle the error? If it fails, I want to immediately exit the program, which I would normally do with using exit(). But in this special case, I'm not quite sure if exit() would be the way to go here.
In library code, it's absolutely unacceptable to call exit or abort under any circumstances except when the caller broke the contract of your library's documented interface. If you're writing library code, you should gracefully handle any allocation failures, freeing any memory or other resources acquired in the attempted operation and returning an error condition to the caller. The calling program may then decide to exit, abort, reject whatever command the user gave which required excessive memory, free some unneeded data and try again, or whatever makes sense for the application.
In all cases, if your application is holding data which has not been synchronized to disk and which has some potential value to the user, you should make every effort to ensure that you don't throw away this data on allocation failures. The user will almost surely be very angry. It's best to design your applications so that the "save" function does not require any allocations, but if you can't do that in general, you might instead want to perform frequent auto-save-to-temp-file operations, or provide a way of dumping the memory contents to disk in a form that's not the standard file format (which might, for example, require ugly XML and ZIP libraries, each with their own allocation needs, to write) but instead a more "raw dump" which your application can read and recover from on the next startup.
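As a minimal sketch of that library-style handling (the pair type and the function names are made up for illustration), free whatever was partially acquired and report the failure to the caller rather than exiting:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pair { char *first; char *second; };

/* Plain C string copy so the sketch stays within the standard library. */
static char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* Hypothetical library routine: on allocation failure it frees whatever it
   already acquired and returns NULL so the caller decides what to do. */
static struct pair *pair_create(const char *a, const char *b)
{
    struct pair *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->first = copy_string(a);
    p->second = copy_string(b);
    if (p->first == NULL || p->second == NULL) {
        free(p->first);      /* free(NULL) is a no-op */
        free(p->second);
        free(p);
        return NULL;         /* caller may retry, save its data, or exit */
    }
    return p;
}

int main(void)
{
    struct pair *p = pair_create("hello", "world");
    if (p == NULL) {
        /* The application, not the library, chooses how to fail. */
        fputs("out of memory\n", stderr);
        return 1;
    }
    free(p->first);
    free(p->second);
    free(p);
    return 0;
}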
If malloc() returns NULL it means that the allocation was unsuccessful. It's up to you to deal with this error case. I personally find it excessive to exit your entire process because of a failed allocation. Deal with it some other way.
Use Both?
It depends on whether the core file will be useful. If no one is going to analyze it, then you may as well simply _exit(2) or exit(3).
If the program will sometimes be used locally and you intend to analyze any core files produced, then that's an argument for using abort(3).
You could always choose conditionally: with --debug, use abort(3); without it, use exit.
