transfer std::shared_ptr via mailbox - shared-ptr

We have a Real Time Operating System which offers Inter-Task-Communication by so called Mailboxes.
A Mailbox is described by a Handle of type RTKMailbox.
The API looks like:
int RTKPut(RTKMailbox h, const void* data);
int RTKGet(RTKMailbox h, void* data);
The size of the data is known by the Mailbox. The data transfer can be thought of as a memcpy from sender to receiver.
Imagine I have a Producer-Task and a Consumer-Task; is it a good idea to send a shared_ptr through that system?
Since the Mailbox knows nothing about shared_ptr, my idea is to wrap the shared_ptr in a transport structure.
The code could look like:
class MyData {
//...
};
struct TransportWrapper {
void BeforePut();
void AfterGet();
std::shared_ptr<MyData> Data;
TransportWrapper() {}
TransportWrapper(std::shared_ptr<MyData>& _data) : Data(_data)
{}
};
void Send(RTKMailbox mbHandle, std::shared_ptr<MyData>& data)
{
TransportWrapper wrap(data);
wrap.BeforePut();
RTKPut(mbHandle, &wrap);
}
std::shared_ptr<MyData> Receive(RTKMailbox mbHandle)
{
TransportWrapper wrap;
RTKGet(mbHandle, &wrap);
wrap.AfterGet();
return wrap.Data;
}
What do I have to do in BeforePut to prevent the shared_ptr from being deleted when the lifetime of the wrapper ends?
What do I have to do in AfterGet to restore the shared_ptr to the state it had before Put?
Regards Andreas

Your example code won't work. You can't just memcpy a shared_ptr, because all that does is copy the pointers it contains; it doesn't make a new copy of the shared_ptr and increase the reference count. memcpy is only valid for trivially copyable types, and a type with a non-trivial copy constructor or destructor, such as shared_ptr, is not one.
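To make the point about memcpy concrete, here is a minimal sketch (not part of the original answer; the helper name is made up for illustration). It checks at compile time that std::shared_ptr is not trivially copyable, and shows that the reference count only changes when the copy constructor actually runs.

```cpp
#include <memory>
#include <type_traits>

// memcpy is only well-defined for trivially copyable types;
// std::shared_ptr has a user-provided copy constructor, so it is not one.
static_assert(!std::is_trivially_copyable<std::shared_ptr<int>>::value,
              "shared_ptr must not be copied with memcpy");

// Hypothetical helper: returns the use count seen after one real copy.
long use_count_after_copy()
{
    std::shared_ptr<int> a = std::make_shared<int>(42);
    std::shared_ptr<int> b = a;   // copy constructor runs, count goes from 1 to 2
    return a.use_count();
}
```

A byte-wise copy would leave the count at 1 while two objects believe they own the data, which is exactly the premature deletion the question runs into.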
Assuming the sender and receiver share an address space (because otherwise this is pretty much impossible to do via your mailbox API, you need shared memory), you need to increase the shared_ptr's reference count on the sender side, to ensure that the sender doesn't drop its last reference to the owned object and delete it before the receiver has received it. Then the receiver has to decrease the reference count, so they need to coordinate.
If delivery to a mailbox is asynchronous (i.e. the sender does not block until delivery is complete and the receiver has received the data) you can't do that with local variables in the Send function, because those variables will go out of scope as soon as the RTKPut call returns, which will decrease the reference count (and maybe destroy the data) before the receiver has got it.
The simplest way to solve that is to create a new shared_ptr on the heap and transfer its address.
void Send(RTKMailbox mbHandle, const std::shared_ptr<MyData>& data)
{
std::shared_ptr<MyData>* p = new std::shared_ptr<MyData>(data);
if (RTKPut(mbHandle, &p) != success)
{
delete p;
// deal with it
}
}
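std::shared_ptr<MyData> Receive(RTKMailbox mbHandle)
{
std::shared_ptr<MyData>* p = nullptr;
if (RTKGet(mbHandle, &p) == success)
{
auto sp = *p; // copy first, so the object stays alive when the heap copy is deleted
delete p;
return sp;
}
// else deal with it; return an empty shared_ptr so that every path returns a value
return nullptr;
}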
This assumes that if RTKPut returns successfully then delivery will not fail, otherwise you leak the shared_ptr created on the heap, and will never delete the object it owns.
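The round trip can be tried out without the RTOS. The following is only a sketch: FakeMailbox, FakePut and FakeGet are hypothetical stand-ins that copy a pointer-sized message with memcpy, which is how the question describes the real mailbox; they are not the RTK API. Send and Receive follow the same pattern as above.

```cpp
#include <cstring>
#include <memory>

struct MyData { int value; };

// Hypothetical stand-in for the RTOS mailbox: one pointer-sized slot.
struct FakeMailbox {
    unsigned char slot[sizeof(void*)];
    bool full = false;
};

bool FakePut(FakeMailbox& mb, const void* data)
{
    if (mb.full) return false;
    std::memcpy(mb.slot, data, sizeof mb.slot);   // raw byte copy, like the real mailbox
    mb.full = true;
    return true;
}

bool FakeGet(FakeMailbox& mb, void* data)
{
    if (!mb.full) return false;
    std::memcpy(data, mb.slot, sizeof mb.slot);
    mb.full = false;
    return true;
}

bool Send(FakeMailbox& mb, const std::shared_ptr<MyData>& data)
{
    auto* p = new std::shared_ptr<MyData>(data);  // extra owner lives on the heap
    if (!FakePut(mb, &p)) {                       // only the address is copied
        delete p;
        return false;
    }
    return true;
}

std::shared_ptr<MyData> Receive(FakeMailbox& mb)
{
    std::shared_ptr<MyData>* p = nullptr;
    if (!FakeGet(mb, &p))
        return nullptr;
    std::shared_ptr<MyData> sp = *p;              // copy, then drop the heap owner
    delete p;
    return sp;
}

// Returns the payload if it survived the sender dropping its reference, else -1.
int round_trip()
{
    FakeMailbox mb;
    {
        auto data = std::make_shared<MyData>(MyData{7});
        if (!Send(mb, data)) return -1;
    }   // the sender's shared_ptr is gone; the heap copy keeps MyData alive
    std::shared_ptr<MyData> got = Receive(mb);
    return (got && got.use_count() == 1) ? got->value : -1;
}
```

The important detail is that only a plain pointer crosses the mailbox, and a plain pointer is safe to memcpy. Ownership travels with the heap-allocated shared_ptr that the pointer refers to.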

Related

How to avoid a double copy when implementing get API methods

I want to optimize a thread-safe "get" function for a variable globalBig that looks like the following:
extern struct big *globalBig; // points to a big struct defined elsewhere
struct big getGlobalBig(void) {
// lock globalBig here...
struct big ret = *globalBig; // <------ copy #1 (into stack of getGlobalBig)
// unlock globalBig here...
return ret; // <------ copy #2 (into stack of caller)
}
When I look at the assembly, I see that there are two invocations of memcpy, one into the stack of the get method, and the second into the stack of the calling function caused by the return. I DO NOT want to have to pass in a pointer argument into getGlobalBig() because I want a function that returns an rvalue so that I could do stuff like:
if(getGlobalBig().someField == 9) {
// yay
}
I am also aware that I could do a single copy by skipping the unlocking of the variable, leaving that to the caller as a cleanup activity, but this is undesirable as I want to spend as little time as possible with the lock on.
struct big getGlobalBig(void) {
// lock globalBig here...
return *globalBig; // <------ copy #1 (directly into stack of caller, caller is responsible for unlocking)
}
So based on those needs, is there a way to avoid the two copies? For example, I envision something like "returning in advance":
struct big getGlobalBig(void) {
// lock globalBig here...
// single copy directly into the destination struct allocated in the stack area of the caller
*(struct big*)SOMEHOW_GET_THE_ADDR_THAT_THIS_FUNCTION_COPIES_TO_WHEN_RETURNING = *globalBig;
// unlock globalBig here...
return; // return nothing, you already "did" (?)
}

Understanding a stack-use-after-scope error

I am working on a multithreaded client using C and the pthreads library, using a boss/worker arch design and am having issues understanding/debugging a stack-use-after-scope error that is causing my client to fail. (I am kinda new to C)
I have tried multiple things, including defining the variable globally, passing a double pointer reference, etc.
Boss logic within main:
for (i = 0; i < nrequests; i++)
{
struct Request_work_item *request_ctx = malloc(sizeof(*request_ctx));
request_ctx->server = server;
request_ctx->port = port;
request_ctx->nrequests = nrequests;
req_path = get_path(); //Gets a file path to work on
request_ctx->path = req_path;
steque_item work_item = &request_ctx; // steque_item is a void* so passing it a pointer to the Request_work_item
pthread_mutex_lock(&mutex);
while (steque_isempty(&work_queue) == 0) //Wait for the queue to be empty to add more work
{
pthread_cond_wait(&c_boss, &mutex);
}
steque_enqueue(&work_queue, work_item); //Queue the workItem in a workQueue (type steque_t, can hold any number of steque_items)
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&c_worker);
}
Worker logic inside a defined function:
struct Request_work_item **wi;
while (1)
{
pthread_mutex_lock(&mutex);
while (steque_isempty(&work_queue) == 1) //Wait for work to be added to the queue
{
pthread_cond_wait(&c_worker, &mutex);
}
wi = steque_pop(&work_queue); //Pull the steque_item into a Request_work_item type
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&c_boss);
char *path_to_file = (*wi)->path; //When executing, I get this error in this line: SUMMARY: AddressSanitizer: stack-use-after-scope
...
...
...
continues with additional worker logic
I expect the worker to pull the work_item from the queue, dereference the values and then perform some work. However, I keep getting AddressSanitizer: stack-use-after-scope, and the information for this error online is not very abundant so any pointers would be greatly appreciated.
The red flag here is that &request_ctx is the address of a local variable. It's not the pointer to the storage allocated with malloc, but the address of the variable which holds that storage. That variable is gone once this scope terminates, even though the malloc-ed block endures.
Maybe the fix is simply to delete the address-of & operator in this line?
steque_item work_item = &request_ctx; // steque_item is a void* so passing
// it a pointer to the Request_work_item
If we do that, then the comment actually tells the truth. Because otherwise we're making work_item a pointer to a pointer to the Request_work_item.
Since work_item has type void*, it compiles either way, unfortunately.
If the consumer of the item on the other end of the queue is extracting it as a Request_work_item *, then you not only have an access to an object that has gone out of scope, but also a type mismatch even if that object happens to still be in the producer's scope when the consumer uses it. The consumer ends up using a piece of the producer's stack as if it were a Request_work_item structure.
Edit: I see that you are using a pointer-to-pointer when dequeuing the item and accessing it as (*wi)->path. Think about changing the design to avoid doing that. Otherwise, the pointer that wi points to also has to be dynamically allocated, and freed. The producer has to do something like:
struct Request_work_item **p_request_ctx = malloc(sizeof *p_request_ctx);
struct Request_work_item *request_ctx = malloc(sizeof *request_ctx);
if (p_request_ctx && request_ctx) {
*p_request_ctx = request_ctx;
request_ctx->field = init_value;
// ... etc
// then p_request_ctx is enqueued.
} else {
// allocation failed: free(NULL) is a no-op, so both calls are safe
free(p_request_ctx);
free(request_ctx);
}
The consumer then has to free the structure, and also free the pointer. That extra pointer just seems like pure overhead here; it doesn't provide any essential or useful level of indirection.
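For comparison, here is a sketch of the simpler design without the extra pointer, written in C++ so it can be compiled on its own. The std::queue of void* is only a stand-in for the steque work queue, the struct is cut down to two fields, and the function names are made up; the mutex and condition variables from the question are left out to keep the ownership hand-off visible.

```cpp
#include <cstdlib>
#include <queue>

struct Request_work_item { const char* path; int port; };

// Stand-in for the steque work queue: it stores untyped pointers.
static std::queue<void*> work_queue;

void boss_enqueue(const char* path, int port)
{
    auto* ctx = static_cast<Request_work_item*>(
        std::malloc(sizeof(Request_work_item)));
    if (!ctx) return;
    ctx->path = path;
    ctx->port = port;
    work_queue.push(ctx);   // enqueue the heap pointer itself, not &ctx
}   // the local variable ctx dies here; the malloc-ed block does not

// Returns the port of the next item, or -1 if the queue is empty.
int worker_dequeue_port()
{
    if (work_queue.empty()) return -1;
    auto* wi = static_cast<Request_work_item*>(work_queue.front());
    work_queue.pop();
    int port = wi->port;    // wi points at heap storage, so this is valid
    std::free(wi);          // the consumer owns the item and releases it
    return port;
}
```

With this layout the worker declares a single-level Request_work_item pointer and reads wi->path directly, so nothing ever refers to the boss's stack.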

Shared pointers and queues in FreeRTOS

A C++ wrapper around a FreeRTOS queue can be simplified into something like this:
template<typename T>
class Queue
{
public:
bool push(const T& item)
{
return xQueueSendToBack(handle, &item, 0) == pdTRUE;
}
bool pop(T& target)
{
return xQueueReceive(handle, &target, 0) == pdTRUE;
}
private:
QueueHandle_t handle;
};
The documentation of xQueueSendToBack states:
The item is queued by copy, not by reference.
Unfortunately, it is literally by copy, because it all ends in a memcpy, which makes sense since it is a C API. While this works well for plain old data, more complex items such as the following event message give serious problems.
class ConnectionStatusEvent
{
public:
ConnectionStatusEvent() = default;
ConnectionStatusEvent(std::shared_ptr<ISocket> sock)
: sock(sock)
{
}
const std::shared_ptr<ISocket>& get_socket() const
{
return sock;
}
private:
const std::shared_ptr<ISocket> sock;
bool connected;
};
The problem is obviously the std::shared_ptr which doesn't work at all with a memcpy since its copy constructor/assignment operator isn't called when copied onto the queue, resulting in premature deletion of the held object when the event message, and thus the shared_ptr, goes out of scope.
I could solve this by using dynamically allocated T-instances and change the queues to only contain pointers to the instance, but I'd rather not do that since this shall run on an embedded system and I very much want to keep the memory static at run-time.
My current plan is to change the queue to contain pointers to a memory area held locally in the wrapper class, in which I can implement a full C++ object copy. But I would also need to protect that memory area against access from multiple threads, which essentially defeats the already thread-safe implementation of the FreeRTOS queues (which are surely more efficient than anything I can write myself), so I might as well skip them entirely.
Finally, the question:
Before I implement my own queue, are there any tricks I can use to make the FreeRTOS queues function with C++ object instances, in particular std::shared_ptr?
The issue is what happens to the original once you put the pointer into the queue.
Copying seems trivial but not optimal.
To get around this issue I use a mailbox instead of a queue:
T* data = (T*) osMailAlloc(m_mail, osWaitForever);
...
osMailPut (m_mail, data);
Here you allocate the storage explicitly up front, and just put the pointer into the mailbox.
And to retrieve:
osEvent ev = osMailGet(m_mail, osWaitForever);
...
osStatus freeStatus = osMailFree(m_mail, p);
All of this can be neatly wrapped into C++ template methods.
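What such a wrapper has to do for a non-trivial type can be sketched without any RTOS calls. Everything below is an assumption made for illustration: StaticPool is a hypothetical fixed-size pool standing in for the mail pool, it is single-threaded, and the pointer it returns is what would travel through the queue or mailbox. The object is constructed in the slot with placement new and destroyed explicitly by the receiver, so the shared_ptr copy constructor and destructor do run.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical fixed-size pool; a real system would use the RTOS mail pool
// and its locking. This sketch is not thread-safe.
template <typename T, std::size_t N>
class StaticPool {
public:
    // Constructs a T in a free slot; returns nullptr when the pool is full.
    T* create(T value)
    {
        for (std::size_t i = 0; i < N; ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return ::new (static_cast<void*>(slots_[i].bytes)) T(std::move(value));
            }
        }
        return nullptr;
    }

    // Moves the object out, runs its destructor and frees the slot.
    T take(T* p)
    {
        T out = std::move(*p);
        p->~T();
        for (std::size_t i = 0; i < N; ++i)
            if (static_cast<void*>(slots_[i].bytes) == static_cast<void*>(p))
                used_[i] = false;
        return out;
    }

private:
    struct Slot { alignas(T) unsigned char bytes[sizeof(T)]; };
    Slot slots_[N];
    bool used_[N] = {};
};

// Returns the use count the receiver sees after the sender's copy is gone.
long pool_round_trip()
{
    static StaticPool<std::shared_ptr<int>, 4> pool;
    std::shared_ptr<int>* msg = nullptr;
    {
        auto sp = std::make_shared<int>(5);
        msg = pool.create(sp);      // this pointer would go through the queue
    }                               // the sender's copy is released here
    if (!msg) return -1;
    std::shared_ptr<int> received = pool.take(msg);
    return received.use_count();    // the pooled copy kept the object alive
}
```

Memory stays static at run time because the slots live inside the pool object, which addresses the concern raised in the question about dynamic allocation.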

Efficient way to detect changes in structure members?

This seems like it should be simple but I wasn't able to find much related to it. I have a structure which has different fields used to store data about the program's operation. I want to log that data so that I can analyse it later. Attempting to continuously log data over the course of the program's operation eats up a lot of resources. Thus I would only like to call the logging function when the data has changed. I would love it if there was an efficient way to check whether the structure members have been updated. Currently I am playing a shell game with 3 structures (old, current, and new) in order to detect when the data has changed. Thanks in advance.
You can track the structures and their hashes in your log function.
Suppose you have a hash function:
int hash(void* ptr, size_t size);
Suppose you also have a mapping from a pointer to a struct to that struct's hash, like:
/* Stores hash value for ptr*/
void ptr2hash_update_hash(void* ptr, int hash);
/* Remove ptr from mapping */
void ptr2hash_remove(void* ptr);
/* Returns 0 if ptr was not stored, or the stored hash otherwise */
int ptr2hash_get_hash(void* ptr);
Then you may check if your object was changed between log calls like this:
int new_hash = hash(ptr, sizeof(TheStruct));
int old_hash = ptr2hash_get_hash(ptr);
if (old_hash == new_hash)
return;
ptr2hash_update_hash(ptr, new_hash);
/* Then do the logging */
Don't forget to remove ptr from mapping when you do free(ptr) :)
Here is a simple hash table implementation; you will need it to implement the ptr2hash mapping.
Simple hash functions are here.
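Put together, the approach might look like the sketch below. It is only an illustration: FNV-1a is swapped in as one possible hash function, std::unordered_map plays the role of the ptr2hash mapping, and TheStruct with its two fields is made up. Note that hashing raw bytes also hashes any padding, so the struct should be zeroed (for example with memset) before its fields are filled in.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

struct TheStruct { int mode; float level; };   // hypothetical example struct

// FNV-1a over the raw bytes of an object.
std::uint32_t hash_bytes(const void* ptr, std::size_t size)
{
    const unsigned char* p = static_cast<const unsigned char*>(ptr);
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < size; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Maps each tracked struct to the hash seen at its last log call.
static std::unordered_map<const void*, std::uint32_t> ptr2hash;

// Returns true (and would log) only if the bytes changed since the last call.
bool log_if_changed(const TheStruct* s)
{
    std::uint32_t new_hash = hash_bytes(s, sizeof *s);
    auto it = ptr2hash.find(s);
    if (it != ptr2hash.end() && it->second == new_hash)
        return false;               // unchanged, skip the logging
    ptr2hash[s] = new_hash;
    // ... do the logging here ...
    return true;
}

// Call this when the struct is freed, so a reused address starts fresh.
void forget(const TheStruct* s)
{
    ptr2hash.erase(s);
}
```

Unlike the interface above, this version does not reserve 0 as a "not stored" marker, because the map lookup already distinguishes a missing entry. A hash collision can still hide a real change, so this trades a small chance of a missed log entry for not keeping a full copy of the struct.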
If you're running on Linux (x86 or x86_64) then another possible approach is the following:
Install a segment descriptor for a non-writable segment in the local descriptor table using the modify_ldt system call. Place your data inside this segment (or install the segment such that your data structure is within it).
Upon write access, your process will receive a SIGSEGV (segmentation fault). Install a handler using sigaction to catch segmentation faults. Within that handler, first check that the fault occurred inside the previously set segment (si_addr member of the siginfo_t) and if so prepare to record a notification. Now, change the segment descriptor such that the segment becomes writable and return from the signal handler.
The write will now be performed, but you need a way to change the segment to be non-writable again and to actually check what was written and if your data actually changed.
A possible approach could be to send oneself (or a "delay" process and then back to the main process) another signal (SIGUSR1 for example), and doing the above in the handler for this signal.
Is this portable? No.
Is this reliable? No.
Is this easy to implement? No.
So if you can, and I really hope you do, use an interface like the one already suggested.
The easiest thing you can try is to keep two copies of the structure. Each time you receive updated values, compare the new structure with the old one; if they differ, you have detected a change, and you then copy the new values into the old structure so that further changes can be detected later.
#include <iostream>
using std::cout;
using std::endl;
typedef struct testStruct
{
    int x;
    float y;
} TESTSTRUCT;
// Fills *out with the latest values.
void getUpdatedValue(TESTSTRUCT* out)
{
    out->x = 5;
    out->y = 6;
    // You can put your code to update the value here.
}
// Compares member by member, so padding bytes are never looked at.
bool isSame(const TESTSTRUCT* a, const TESTSTRUCT* b)
{
    return a->x == b->x && a->y == b->y;
}
void updateTheChange(TESTSTRUCT* oldObj, const TESTSTRUCT* newObj)
{
    cout << "Change Detected\n";
    *oldObj = *newObj; // remember the new values for the next comparison
}
int main()
{
    TESTSTRUCT oldObj = {0, 0.0f};
    TESTSTRUCT newObj = {0, 0.0f};
    getUpdatedValue(&newObj);
    // each time a value is updated, compare with the old structure
    if (isSame(&oldObj, &newObj))
    {
        cout << "Same" << endl;
    }
    else
    {
        updateTheChange(&oldObj, &newObj);
    }
    return 0;
}
I am not sure whether this answers your question exactly.
Hope this Helps.

Store extra data in a c function pointer

Suppose there is a library function (can not modify) that accept a callback (function pointer) as its argument which will be called at some point in the future. My question: is there a way to store extra data along with the function pointer, so that when the callback is called, the extra data can be retrieved. The program is in c.
For example:
// callback's type, no argument
typedef void (*callback_t)();
// the library function
void regist_callback(callback_t cb);
// store data with the function pointer
callback_t store_data(callback_t cb, int data);
// retrieve data within the callback
int retrieve_data();
void my_callback() {
int a;
a = retrieve_data();
// do something with a ...
}
int my_func(...) {
// some variables that i want to pass to my_callback
int a;
// ... regist_callback may be called multiple times
regist_callback(store_data(my_callback, a));
// ...
}
The problem is that callback_t accepts no arguments. My idea is to generate a small piece of asm code for each registration and pass that to regist_callback. When it is called, it finds the real callback and its data, stores the data on the stack (or in some unused register), and then jumps to the real callback, inside which the data can be retrieved.
pseudocode:
typedef struct {
// some asm code knows the following is the real callback
char trampoline_code[X];
callback_t real_callback;
int data;
} func_ptr_t;
callback_t store_data(callback_t cb, int data) {
// ... malloc a func_ptr_t
func_ptr_t * fpt = malloc(...);
// fill the trampoline_code; this differs between machines
// and between calling conventions
// ...
fpt->real_callback = cb;
fpt->data = data;
return (callback_t)fpt;
}
int retrieve_data() {
// ... some asm code to retrieve data on stack (or some register)
// and return
}
Is it reasonable? Is there any previous work done for such problem?
Unfortunately you're likely to be prohibited from executing your trampoline in more and more systems as time goes on, as executing data is a pretty common way of exploiting security vulnerabilities.
I'd start by reporting the bug to the author of the library. Everybody should know better than to offer a callback interface with no private data parameter.
Having such a limitation would make me think twice about whether or not the library is reentrant. I would suggest ensuring you can only have one call outstanding at a time, and storing the callback parameter in a global variable.
If you believe that the library is fit for use, then you could extend this by writing n different callback trampolines, each referring to their own global data, and wrap that up in some management API.
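The "n different trampolines" idea can be sketched as follows. This uses C++ templates to stamp out the trampolines; in plain C the same thing would be written out by hand or with a macro. The slot count, real_callback_t, release and the remember callback are assumptions made for the example, and store_data here takes a callback that receives the data as an argument rather than calling retrieve_data.

```cpp
#include <cstddef>

typedef void (*callback_t)();           // the library's callback type
typedef void (*real_callback_t)(int);   // hypothetical: callback that gets the data

struct Slot { real_callback_t cb; int data; bool used; };

const std::size_t kSlots = 4;           // assumed upper bound on live callbacks
static Slot slots[kSlots];

// One distinct function per slot index; each reads only its own global slot.
template <std::size_t I>
void trampoline() { slots[I].cb(slots[I].data); }

static const callback_t trampolines[kSlots] = {
    trampoline<0>, trampoline<1>, trampoline<2>, trampoline<3>
};

// Binds data to cb and returns a plain function pointer, or nullptr if full.
callback_t store_data(real_callback_t cb, int data)
{
    for (std::size_t i = 0; i < kSlots; ++i) {
        if (!slots[i].used) {
            slots[i] = Slot{cb, data, true};
            return trampolines[i];
        }
    }
    return nullptr;
}

// Frees the slot behind a pointer previously returned by store_data.
void release(callback_t t)
{
    for (std::size_t i = 0; i < kSlots; ++i)
        if (trampolines[i] == t)
            slots[i].used = false;
}

// Example callback: records the value it was called with.
static int last_seen = 0;
void remember(int v) { last_seen = v; }
```

The result of store_data(remember, 11) can be handed to regist_callback like any other callback_t. No code is generated at run time, so this works on systems that forbid executing data, at the cost of a fixed limit on simultaneous registrations.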

Resources