Resuming a TLS 1.3 session in OpenSSL


I am working on a TLS 1.3 client, and trying to support session resumption. I managed to do it in the following way:
When the server sends the NewSessionTicket, I have a callback function new_session_cb_func(SSL *ssl, SSL_SESSION *session) that:
Creates a file. BIO *stmp = BIO_new_file(filepath, "w")
Writes the session into it. PEM_write_bio_SSL_SESSION(stmp, session).
When the client wants to resume the session, it uses the function PskSessionResume(SSL *ssl) that:
Opens the file where the session is stored. BIO *stmp = BIO_new_file(filepath, "r")
Reads the session from the file. SSL_SESSION *sess = PEM_read_bio_SSL_SESSION(stmp, NULL, 0, NULL)
Resumes the session. SSL_set_session(ssl, sess).
Currently, it works fine.
I tried to change the code so that the session will be kept in memory instead of a file.
new_session_cb_func(SSL *ssl, SSL_SESSION *session):
Saves the session to a global variable. SSL_SESSION *prev_sess = session
PskSessionResume(SSL *ssl):
Ensure prev_sess != NULL.
Resumes the session. SSL_set_session(ssl, prev_sess).
For some reason, the resumption doesn't work after the change. SSL_set_session returns 1, but the client doesn't send the pre-shared key, and therefore a full handshake is conducted.

It seems the pointer to the SSL_SESSION passed to your callback is not expected to be kept as is. Actually, when storing the session, you want to copy the entire content of the structure, not just keep track of its address.
When you save your session in a file, the file is written with the structure content, not the address of the structure.
I strongly suspect that the pointer you keep track of gets freed and re-used somehow during the execution of your program (before you do the resumption). That is, when you want to resume the session and use the saved SSL_SESSION pointer, it no longer points to valid, consistent data.
That is why the session cannot resume as expected and the process starts over from the very beginning.
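A tempting fix is to change the type of your global variable from SSL_SESSION *prev_sess; to SSL_SESSION prev_sess and save the session with prev_sess = *session; in your callback, so that you keep a copy of the structure content rather than the pointer. Be aware, though, that SSL_SESSION is an opaque type in OpenSSL 1.1.0 and later, which includes every release that supports TLS 1.3, so this by-value copy will not compile there; the copy has to go through the API instead (see the edit below).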
If everything is good, resuming the session will go nice and smooth!
EDIT:
The API provides a function for duplicating a session: SSL_SESSION_dup(SSL_SESSION *session). You may want to use it to duplicate the session passed to the callback.
Quoted from the documentation link:
SSL objects may be using the SSL_SESSION object; as a session may be reused, several SSL objects may be using one SSL_SESSION object at the same time. It is therefore crucial to keep the reference count (usage information) correct and not delete a SSL_SESSION object that is still used, as this may lead to program failures due to dangling pointers.
From there, it seems you can keep the original pointer as long as you also increment its reference count with SSL_SESSION_up_ref. This is supposed to guarantee that the structure is not freed for as long as you need it; release your reference with SSL_SESSION_free once you are done with it.
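The reference-counting idea can be sketched with a toy model. This is not OpenSSL code: session_t, session_new, session_up_ref, session_free and remember_session are hypothetical stand-ins for SSL_SESSION, the library's internal allocation, SSL_SESSION_up_ref, SSL_SESSION_free and the new-session callback. It only illustrates why a bare pointer dangles while a pointer that owns a reference does not. In real OpenSSL, the SSL_CTX_sess_set_new_cb documentation also says that returning 1 from the callback means the application keeps the reference it was handed, so check the callback's return value as well.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque SSL_SESSION type. */
typedef struct session {
    int refs;    /* number of owners */
    int ticket;  /* placeholder for the session contents */
} session_t;

/* The "library" creates the session and holds the only reference. */
session_t *session_new(int ticket) {
    session_t *s = malloc(sizeof *s);
    if (s != NULL) {
        s->refs = 1;
        s->ticket = ticket;
    }
    return s;
}

/* Plays the role of SSL_SESSION_up_ref(): register one more owner. */
void session_up_ref(session_t *s) {
    assert(s != NULL);
    s->refs++;
}

/* Plays the role of SSL_SESSION_free(): drop one owner and release the
 * memory when the last one is gone. Returns the remaining owner count. */
int session_free(session_t *s) {
    assert(s != NULL && s->refs > 0);
    int left = --s->refs;
    if (left == 0)
        free(s);
    return left;
}

session_t *prev_sess = NULL;  /* in-memory session store */

/* Plays the role of the new-session callback: keep the pointer AND a
 * reference, so the library dropping its own reference cannot free it. */
void remember_session(session_t *s) {
    session_up_ref(s);             /* take our reference first */
    if (prev_sess != NULL)
        session_free(prev_sess);   /* drop the previously stored session */
    prev_sess = s;
}
```

With this pattern the stored pointer stays valid after the library releases its own reference, and the memory is only freed when the application calls the free function on prev_sess.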


Is this a bug in the Log class in CodenameOne?

I've been trying to use the log class to capture some strange device-specific failures using local storage. When I went into the Log class and traced the code I noticed what seems to be a bug.
When I call the p(String) method, it calls getWriter() to get the 'output' instance of the Writer. It will notice output is null, so it calls createWriter() to create it. Since I haven't set a File URL, the following code gets executed:
if(getFileURL() == null) {
    return new OutputStreamWriter(Storage.getInstance().createOutputStream("CN1Log__$"));
}
On the Simulator, I notice this file is created and contains log info.
So in my app I want to display the logs after an error is detected (to debug). I call getLogContent() to retrieve them as a string, but it does some strange things:
if(instance.isFileWriteEnabled()) {
    if(instance.getFileURL() == null) {
        instance.setFileURL("file:///" + FileSystemStorage.getInstance().getRoots()[0] + "/codenameOne.log");
    }
    Reader r = new InputStreamReader(FileSystemStorage.getInstance().openInputStream(instance.getFileURL()));
The main problem I see is that it's using a different file URL than the default writer location, and since the creation of the Writer didn't set the File URL, the getLogContent method will never see the logged data. (The other issue I have is a style issue: a method that gets content shouldn't persistently set the location of that content for the instance, but that's another story.)
As a workaround, I think I can just call getLogContent() at the beginning of the application, which should set the file URL to the place it will later read from. I'll test that next.
In the meantime, is this a bug, or is it functionality I don't understand from my user perspective?

It's more like "unimplemented functionality". This specific API dates back to LWUIT.
The main problem with that method is that the log file is being written to while the app runs, so reading its contents, possibly in the middle of a write, can be a problem and might actually cause a failure. So this approach was mostly abandoned in favor of the more robust crash protection approach.

Transferring data to/from a callback from/to a worker thread

My current application is a toy web service written in C designed to replicate the behaviour of http://sprunge.us/ (takes data in via http POST, stores it to disk, returns the client a url to the data - also serves data that has been previously stored upon request).
The application is structured such that a thread pool is instantiated with worker threads (just a function pointer that takes a void* parameter) and a socket is opened to listen to incoming connections. The main loop of the program comprises a sock = accept(...) call and then a pool_add_task(worker_function_parse_http, sock) to enable requests to be handled quickly.
The parse_http worker parses the incoming request and either adds another task to the work queue for storing the data (POST) or serving previously stored data (GET).
My problem with this approach stems from the use of the http-parser library which uses a callback design to return parsed data (all http parsers that I looked at used this style). The problem I encounter is as such:
My parse_http worker:
Buffers data from the accepted socket (the function's only parameter, at this stage)
Sets up a http-parser object as per its API, complete with setting callback functions for it to call when it finishes parsing the URL or BODY or whatever. (These functions are of a fixed type signature defined by the http-parser lib, with a pointer to a buffer containing the parsed data relevant to the call, so I can't pass in my own variables and solve the problem that way. These functions also return a status code to the http parser, so I can't use the return values either. The suggested way to get data out of the parser for later use is to copy it out to a global variable during the callback - fun with multiple threads.)
Execute the parser on the buffered socket data. At this stage, the parser is expected to call its set up callbacks when it parses different sections of the buffer. The callback is supplied with parsed data relevant to each callback (e.g. BODY segment supplied to body_parsed callback function).
Well, this is where the problem shows. The parser has executed, but I don't have any access to the parsed data. Here is where I would add a new task to the queue with a worker function to store the received body data or another to handle the GET request for previously stored data. These functions would need to be supplied with both the parsed information (POST data or GET url) as well as the accepted socket so that the now delegated work can respond to the request and close the connection.
Of course, the obvious solution to the problem is simply to not use this thread-pool model with asynchronous practices, but I would like to know, for now and for later, how best to tackle this problem.
How can I get the parsed data from these callbacks back to the worker thread function? I've considered simply making my on_url_parsed and on_body_parsed do the rest of the application's job (storing and retrieving data), but of course I no longer have the client's socket to respond back to in these contexts.
If needed, I can post up the source code to the project when I get the chance.
Edit: It turns out that it is possible to access a user defined void * from within the callbacks of this particular http-parser library as the callbacks are passed a reference to the caller (the parser object) which has a user-definable data field.

A well-designed callback interface would provide for you to give the parser a void * which it would pass on to each of the callback functions when it calls them. The callback functions you provide know what type of object it points to (since you provide both the data pointer and the function pointers), so they can cast and properly dereference it. Among other advantages, this way you can provide for the callbacks to access a local variable of the function that initiates the parse, instead of having to rely on global variables.
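A minimal sketch of that user-pointer pattern follows. The parser here is a hypothetical miniature (mini_parser, mini_parse), not the http-parser library, and request_ctx, on_body_parsed and worker_parse are illustrative names; the point is only that the callback recovers the worker's own context, socket included, from the void * the parser carries.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature parser, shaped like a callback-style library. */
typedef struct mini_parser mini_parser;
typedef int (*body_cb)(mini_parser *p, const char *at, size_t len);

struct mini_parser {
    body_cb on_body;  /* set by the caller */
    void *data;       /* opaque user pointer, never touched by the parser */
};

/* Treat everything after the first blank line as the body and report it. */
int mini_parse(mini_parser *p, const char *buf, size_t len) {
    const char *sep = "\r\n\r\n";
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, sep, 4) == 0)
            return p->on_body(p, buf + i + 4, len - i - 4);
    }
    return -1;  /* no body found */
}

/* Per-request state owned by the worker; it can live on the worker's stack. */
typedef struct {
    int sock;         /* accepted client socket */
    char body[256];   /* copy of the parsed body */
    size_t body_len;
} request_ctx;

/* Callback: recover the worker's context from the user pointer. */
int on_body_parsed(mini_parser *p, const char *at, size_t len) {
    request_ctx *ctx = p->data;
    if (len >= sizeof ctx->body)
        return 1;  /* body too large for this sketch */
    memcpy(ctx->body, at, len);
    ctx->body[len] = '\0';
    ctx->body_len = len;
    return 0;
}

/* Worker body: after parsing, ctx holds both the socket and the data. */
int worker_parse(int sock, const char *buf, size_t len, request_ctx *ctx) {
    mini_parser p = { .on_body = on_body_parsed, .data = ctx };
    ctx->sock = sock;
    ctx->body_len = 0;
    return mini_parse(&p, buf, len);
}
```

After worker_parse returns 0, the worker can hand ctx (or a heap copy of it) to the next task in the pool, since it contains everything needed to store the data and answer the client. No global variable is involved, so several workers can parse at the same time.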
If the parser library you are using does not have such a feature (and you don't want to switch to a better-designed one), then you can probably use thread-local storage instead of global variables. How exactly you would do that depends on your thread library and compiler, or you could roll your own by using thread identifiers as keys to thread-specific slots in some global data structure (a hash table for instance).

Save to .settings file: difference between 2 different ways of saving

I was reading about the .settings file on MSDN and I noticed they give 2 examples of how to set the value of an item in the settings. Now my question is: what is the real difference between the 2, and when would you use one instead of the other? To me they seem pretty much the same.
To Write and Persist User Settings at Run Time
Access the user setting and assign it a new value, as shown in the following example:
Properties.Settings.Default.myColor = Color.AliceBlue;
If you want to persist changes to user settings between application sessions, call the Save method, as shown in the following code:
Properties.Settings.Default.Save();

The first statement updates the value of the setting in memory. The second statement updates the persisted value in the user.config file on the disk. That second statement is required to get the value back when you restart the program.
It is very, very important to realize that these two statements must be separate and never be written close together in your code. Keeping them close is harakiri-code. Settings tend to implement unsubtle features in your code, making it operate differently. Which isn't always perfectly tested. What you strongly want to avoid is persisting a setting value that subsequently crashes your program.
That's the harakiri angle, if you saved that value then it is highly likely that the program will immediately crash again when the user restarts it. Or in other words, your program will never run correctly again.
The Save() call must be made when you have a reasonable guarantee that nothing bad happened when the new setting value was used. It belongs at the end of your Main() method. Only reached when the program terminated normally.

Why should I use thread-specific data?

Since each thread has its own stack, its private data can be put on it. For example, each thread can allocate some heap memory to hold some data structure, and use the same interface to manipulate it. Then why is thread-specific data helpful?
The only case that I can think of is that, each thread may have many kinds of private data. If we need to access the private data in any function called within that thread, we need to pass the data as arguments to all those functions, which is boring and error-prone.

Thread-local storage is a solution for avoiding global state. If data isn't shared across threads but is accessed by several functions, you can make it thread-local. No need to worry about breaking reentrancy. Makes debugging that much easier.
From a performance point of view, using thread-local data is a way of avoiding false sharing. Let's say you have two threads, one responsible for writing to a variable x, and the other responsible for reading from a variable y. If you were to define these as global variables, they could be on the same cache line. This means that every time one thread writes to x, the copy of that cache line held by the other core is invalidated, and since the line also contains y, the reader has to fetch it again even though y itself never changed. Cache performance degrades for no good reason.
If you used thread-local data, one thread would only store the variable x and the other would only store the variable y, thus avoiding false sharing. Bear in mind, though, that there are other ways to go about this, e.g. cache line padding.
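Cache line padding can be sketched as follows. The 64-byte line size is an assumption (it is common on x86-64 but not universal), and the struct and function names are illustrative only.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed line size; check your target CPU */

/* x and y are adjacent, so they will usually share one cache line. */
struct packed_pair {
    long x;
    long y;
};

/* Each member is aligned to its own cache line, so a write to x
 * does not invalidate the line that holds y. */
struct padded_pair {
    alignas(CACHE_LINE) long x;
    alignas(CACHE_LINE) long y;
};

/* Distance in bytes between x and y in each layout. */
size_t gap_packed(void) {
    return offsetof(struct packed_pair, y) - offsetof(struct packed_pair, x);
}

size_t gap_padded(void) {
    return offsetof(struct padded_pair, y) - offsetof(struct padded_pair, x);
}
```

The padded layout trades memory for isolation: the pair now occupies two full cache lines instead of a few bytes, which is usually acceptable for a handful of hot variables but not for large arrays.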

Unlike stack data (which, like thread-local data, is private to each thread), thread-local data persists across function calls; stack data may already have been overwritten once the function that owned it has returned.
The alternative would be to use adjacent pieces of global data dedicated to each thread, but that has some performance implications when the CPU caches are concerned. Since different threads are likely to run on different cores, such "sharing" of a global piece of data may bring some undesirable performance degradation because an access from one core may invalidate the cache-line of another, with the latter contributing to more inter-core traffic to ensure cache consistency.
In contrast, working with thread-local data should conceptually not interfere with the caches of other cores.

Think of thread local storage as another kind of global variable. It's global in the sense that you don't have to pass it around; different pieces of code can access it as they please (given the declaration, of course). However, each thread has its own separate variable. Normally, globals are extra bad in multithreaded programming because other threads can change the value. If you make it thread local, only your thread can see it, so it is impossible for another thread to unexpectedly change it.
Another use case is when you are forced to use a (badly designed) API that expects you to use global variables to carry information to callback functions. This is a simple instance of being forced into a global variable, but using thread local storage to make it thread safe.
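As a minimal sketch of the "global, but one per thread" idea, assuming a C11 compiler and POSIX threads (the names call_depth, helper, worker and run_two_workers are illustrative):

```c
#include <pthread.h>
#include <stddef.h>

/* One independent copy of this variable exists per thread (C11). */
static _Thread_local int call_depth = 0;

/* Any function can reach it without taking it as a parameter. */
static void helper(void) {
    call_depth++;
}

/* Thread entry: bump this thread's own counter n times and report it
 * back through the argument. */
static void *worker(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        helper();
    *(int *)arg = call_depth;
    return NULL;
}

/* Run two workers with different loads; each sees only its own counter.
 * Returns the calling thread's copy, which nobody has touched. */
int run_two_workers(int *a, int *b) {
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, worker, a) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, b) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return call_depth;
}
```

Calling run_two_workers with 3 and 5 leaves 3 and 5 in the two arguments, and the caller's own counter stays at 0, even though all three threads used the same variable name and no locking.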

Well, I've been writing multithreaded apps for 30 odd years and have never, ever found any need to use TLS. If a thread needs a DB connection that the DB binds to the thread, the thread can open one of its own and keep it on the stack. Since threads cannot be called, only signaled, there is no problem. Every time I've ever looked at this magic 'TLS', I've realized it's not a solution to my problem.
With my typical message-passing design, where objects are queued in to threads that never terminate, there is just no need for TLS.
With thread-pools it's even more useless.
I can only say that using TLS=bad design. Someone put me right, if you can :)

I've used thread local storage for database connections and sometimes for request / response objects. To give two examples, both from a Java webapp environment, but the principles hold.
A web app might consist of a large spread of code that calls various subsystems. Many of these might need to access the database. In my case, I had written each subsystem that required the db to get a db connection from a db pool, use the connection, and return the connection to the pool. Thread-local storage provided a simpler alternative: when the request is created, fetch a db connection from the pool and store it in thread-local storage. Each subsystem then just uses the db connection from thread-local storage, and when the request is completing, it returns the connection to the db pool. This solution had performance advantages, while also not requiring me to pass the db connection through every level: ie my parameter lists remained shorter.
In the same web app, I decided in one remote subsystem that I actually wanted to see the web Request object. So I had either to refactor to pass this object all the way down, which would have involved a lot of parameter passing and refactoring, or I could simply place the object into Thread Local storage, and retrieve it when I wanted it.
In both cases, you could argue that I had messed up the design in the first place, and was just using Thread Local storage to save my bacon. You might have a point. But I could also argue that Thread Local made for cleaner code, while remaining thread-safe.
Of course, I had to be very sure that the things I was putting into Thread Local were indeed one-and-only-one per thread. In the case of a web app, the Request object or a database connection fit this description nicely.

I would like to add to the above answers that, as far as I know, performance-wise, allocation on the stack is faster than allocation on the heap.
Regarding passing the local data across calls: if you allocate on the heap, you will need to pass the pointer / reference (I'm a Java guy :) ) to the calls. Otherwise, how will you access the memory?
TLS is also good in order to store a context for passing data across calls within a thread (We use it to hold information on a logged on user across the thread - some sort of session management).

Thread Specific Data is used when all the functions of a particular thread need to access one common variable. This variable is local to that particular thread but acts as a global variable for all the functions of that thread.
Let's say we have two threads t1 and t2 of any process. Variable 'a' is the thread specific data for t1. Then, t2 has no knowledge over 'a' but all the functions of t1 can access 'a' as a global variable. Any change in 'a' will be seen by all the functions of t1.

With new OOP techniques available, I find thread-specific data irrelevant. Instead of passing a function to the thread, you can pass a functor. The functor class that you pass can hold any thread-specific data that you need.
E.g., sample code with C++11 (or Boost, using boost::thread instead of std::thread) would look like the following:
#include <list>
#include <thread>
class MyClassFunctor
{
private:
    std::list<int> mylist; // Thread-specific data (element type chosen for illustration)
public:
    void operator()()
    {
        // This function is called when the thread runs.
        // It can access the thread-specific data mylist.
        mylist.push_back(1);
    }
};
int main()
{
    MyClassFunctor functorobj;        // Holds the thread's function and its data
    std::thread mythread(functorobj); // The thread runs its own copy of functorobj
    mythread.join();
}

Multithreaded C program design help

I don't have much experience with multithreading and I'm writing a C program which I believe is suited to running in two threads. The program will listen on the serial port for data, read and process new data when it's available, and publish the newest processed data to other (irrelevant) modules via a third-party IPC API (it's confusingly named IPC) when requested.
In order to receive the request to publish data via IPC, the program must call IPC_listenwait(wait_time);. Then if a request to publish is received while "listenwaiting" a handler is invoked to publish the newest data.
One option is to do this in one thread like:
for(;;) {
    read_serial(inputBuffer);
    process_data(inputBuffer, processedData); // Process and store
    // If a request to publish is received during this call, a handler is
    // invoked and the newest piece of processedData is published to other modules.
    IPC_listenwait(wait_time);
}
publishRequestHandler() { // Invoked when a message is received during IPC_listenwait
    IPC_publish(newest(processedData));
}
And this works, but for the application it is important that the program is very responsive to the request to publish new data, and that the data published is the newest available. These goals are not satisfied with the above because data may arrive after the process begins listenwaiting and before a request to publish message is received. Or the process may be reading/processing when a request to publish message is incoming, but won't be able to service it until the next IPC_listenwait call.
The only design I can think of is to have one thread to read, which will just do something like:
readThread() {
    for(;;) { // pseudocode
        select();
        read(inputBuffer);
        process(inputBuffer, processedData);
    }
}
And have the main thread just listening for incoming messages:
mainThread() {
    IPC_listenwait(forever);
}
publishRequestHandler() { // Invoked when a message is received during IPC_listenwait
    IPC_publish(newest(processedData));
}
Is this the design you would use? If so, will I need to use a semaphore when accessing or writing processedData?
Will this give me good responsiveness?
Thanks

You're mostly on the right track.
The one thing you have to watch out for is concurrent access to the publishable data, because you don't want one thread clobbering it while another is trying to read it. To prevent that, use a pair of buffers and a mutex-protected pointer to whichever one is considered current. When process_data() has something ready, it should dump its results in the non-current buffer, lock the pointer mutex, repoint the pointer to the buffer containing the new data and then release the mutex. Similarly, the publisher should lock the pointer mutex while it reads the current data, which will force anything that might want to clobber it to wait. This is a bit more complex than having a single, mutex-protected buffer but will assure that you always have something current to publish while new data is being prepared.
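The two-buffer scheme could look roughly like this. It is a sketch under stated assumptions: POSIX threads, a single writer thread, a fixed-size text payload, and illustrative names (buffers, current, store_processed, copy_newest) that are not part of the IPC library in the question.

```c
#include <pthread.h>
#include <string.h>

#define DATA_MAX 64

static char buffers[2][DATA_MAX];  /* one current, one being filled */
static int current = 0;            /* index of the publishable buffer */
static pthread_mutex_t current_lock = PTHREAD_MUTEX_INITIALIZER;

/* Reader/processor thread: fill the spare buffer without holding the
 * lock, then switch 'current' to it under the lock. */
void store_processed(const char *data) {
    int spare = 1 - current;  /* only this thread ever changes 'current' */
    strncpy(buffers[spare], data, DATA_MAX - 1);
    buffers[spare][DATA_MAX - 1] = '\0';

    pthread_mutex_lock(&current_lock);
    current = spare;
    pthread_mutex_unlock(&current_lock);
}

/* Publisher (e.g. the publish-request handler): copy the newest data
 * out while holding the lock, so the switch cannot happen mid-read. */
void copy_newest(char *out, size_t out_size) {
    if (out_size == 0)
        return;
    pthread_mutex_lock(&current_lock);
    strncpy(out, buffers[current], out_size - 1);
    out[out_size - 1] = '\0';
    pthread_mutex_unlock(&current_lock);
}
```

The lock is held only for the pointer switch and for the publisher's copy, so the long-running processing step never blocks a publish request. The scheme relies on there being exactly one writer; with several writers the spare-buffer selection would need protection too.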
If your processing step takes long enough that you could get multiple sets of data to read, you might split the read/process thread into two and let the reader make sure the processor only ever gets the latest and greatest so you don't end up processing stuff you won't ever publish.
Excellent first question, by the way. Have an upvote.
