C programming - globally seen instance - c

I want to declare an instance of a structure which will be accessible in all source files. To be more precise, I have a structure which represents a ring buffer. Two parts of my program can write to the buffer, so I need somehow to share the same instance of the buffer between my source files. This was my idea but it is not working:
To declare instance of buffer as extern in buf.h file and to make function ringBuf_get() which would return pointer to instance of my buffer.
extern ringBuf buf_frames;
ringBuf *ringBuf_get(void);
So I would implement ringBuf_get() like this in buf.c:
ringBuf *ringBuf_get(void)
{
return &buf_frames;
}
Then whenever I want to make some operation on buffer, I would first call ringBuf_get to get instance of buffer and then I would make operation. But it seems that I am doing something wrong. Feel free to comment.
bool ringBuf_write(ringBuf *_this, uint8_t *mac_protocol_data_unit, uint8_t length)
{
if(_this->write->alloc == false)
{
_this->write->alloc = true;
_this->write->len = length;
memcpy(_this->write->data, mac_protocol_data_unit, length);
if(_this->write == &_this->buf_pool[MAX_BUF_POOL_SIZE])
_this->write = &_this->buf_pool[0];
else
_this->write++;
xil_printf("\n\n Write Suceeded! \n\n");
return true;
}
else
{
if (ringBuf_full(_this))
{
xil_printf("\n\n BUFFER IS FULL! \n\n");
}
return false;
}
}

There is no need for the header to declare
extern ringBuf buf_frames;
The function is sufficient — and it is better to expose just the function and not the global variable. Indeed, the variable should be made static in the file that defines it (and there must be such a file; presumably it would be buf.c). Making the variable static means other files cannot access it by name, but they can call the function to get a pointer to the variable and then access it.
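A minimal sketch of that layout, to make the linkage concrete. The structure definition here is hypothetical (the question does not show the real ringBuf), and initialising the write pointer lazily inside ringBuf_get() is just one way to do it, not necessarily what the asker's code needs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the question does not show the real definition. */
#define MAX_BUF_POOL_SIZE 4
#define MAX_FRAME_LEN     127

typedef struct {
    bool    alloc;
    uint8_t len;
    uint8_t data[MAX_FRAME_LEN];
} frame;

typedef struct {
    frame  buf_pool[MAX_BUF_POOL_SIZE];
    frame *write;
} ringBuf;

/* --- this part would live in buf.c --- */

/* static gives the object internal linkage: no other file can name it. */
static ringBuf buf_frames;

/* The only way in from other files; buf.h declares just this function. */
ringBuf *ringBuf_get(void)
{
    /* A static object starts zeroed, so the write pointer starts as NULL.
       Point it at the first frame once (assumed one-time initialisation). */
    if (buf_frames.write == NULL)
        buf_frames.write = &buf_frames.buf_pool[0];
    return &buf_frames;
}
```

Every call returns the address of the same object, so state written through one call is visible through the next.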
Auxiliary Q&A
It seems that every time I write:
ringBuf_write(ringBuf_get(), packet1, 127);
ringBuf_write(ringBuf_get(), packet2, 64);
ringBuf_write(ringBuf_get(), packet1, 127);
I get a new instance by calling ringBuf_get(). So instead of moving the pointer and writing to another frame in the buffer, I write to the same frame. Also the buffer does not remember that I have already allocated, for example, frame 0.
Then you need to review your question and explain what is going wrong, preferably with an MCVE (Minimal, Complete, and Verifiable Example) or SSCCE (Short, Self-Contained, Correct Example), which are two names for the same basic idea.
You wrote the ringBuf_get() function to return a pointer to the same data structure each time. That's probably correct, but it is not then clear what you mean by 'I get new instance by calling ringBuf_get()'. You get the same instance each time. If you've not updated that instance correctly, or you need to point to a different instance each time, you need to fix the code.
It is not clear where you check that the data to be copied to _this->write->data is small enough to fit.
Also, it is not clear how big the _this->buf_pool array is. If it is not of size MAX_BUF_POOL_SIZE+1, you have a problem writing out of bounds in the code:
if(_this->write == &_this->buf_pool[MAX_BUF_POOL_SIZE])
_this->write = &_this->buf_pool[0];
else
_this->write++;
Arguably the damage is done by the memcpy() before that. The code that sets _this->write->alloc = true; without allocating memory is worrying, too. I can't say whether it is wrong or not; you've not shown enough of your code or the detailed definition of the structure.
How have you unit tested the ring-buffer code? Where are all the diagnostic print statements that tell you that the buffers are being handled correctly? Have you run the function in a debugger if you can't or refuse to add printing code? What is the design of the ring buffers really?

You haven't shown us much of the code, but does this statement (or others!) break an array?
if(_this->write == &_this->buf_pool[MAX_BUF_POOL_SIZE])
Typically, when you define the SIZE of an array, it can then be indexed in the range 0..(SIZE-1). I propose this since one of your problems is being able to write a 5th buffer when there are only 4.
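To illustrate the bounds point, here is a sketch of a write routine that keeps an index instead of a pointer, so the wrap happens before the index can ever equal the array size. The type and constant names (rb, rb_frame, POOL_SIZE, FRAME_LEN) are made up for this example, and it also adds the length check that the question's code does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 4      /* assumed number of frames */
#define FRAME_LEN 127    /* assumed maximum payload per frame */

typedef struct {
    bool    alloc;
    uint8_t len;
    uint8_t data[FRAME_LEN];
} rb_frame;

typedef struct {
    rb_frame pool[POOL_SIZE];
    size_t   write;      /* index of the next frame to fill, always < POOL_SIZE */
} rb;

bool rb_write(rb *b, const uint8_t *src, size_t length)
{
    /* Reject anything that would overflow the frame before copying. */
    if (b == NULL || src == NULL || length > FRAME_LEN)
        return false;

    rb_frame *f = &b->pool[b->write];
    if (f->alloc)
        return false;    /* next slot is still in use: buffer is full */

    memcpy(f->data, src, length);
    f->len   = (uint8_t)length;
    f->alloc = true;

    /* Valid indices are 0..POOL_SIZE-1, so wrap with modulo. */
    b->write = (b->write + 1) % POOL_SIZE;
    return true;
}
```

With POOL_SIZE set to 4, four writes succeed, the index wraps back to 0, and a fifth write fails until a reader clears the alloc flag of frame 0.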

Related

Returning a pointer to a static buffer

In C on a small embedded system, is there any reason not to do this:
const char * filter_something(const char * original, const int max_length)
{
static char buffer[BUFFER_SIZE];
// checking inputs for safety omitted
// copy input to buffer here with appropriate filtering etc
return buffer;
}
This is essentially a utility function. The source is flash memory, which may be corrupted, so we do a kind of "safe copy" to make sure we have a null-terminated string. I chose to use a static buffer and make it available read-only to the caller.
A colleague tells me that I am somehow not respecting the scope of the buffer by doing this, but to me it makes perfect sense for our use case.
I really do not see any reason not to do this. Can anyone give me one?
(LATER EDIT)
Many thanks to all who responded. You have generally confirmed my ideas on this, which I am grateful for. I was looking for major reasons not to do this, I don't think that there are any. To clarify a few points:
Reentrancy/thread safety is not a concern. It is a small (bare metal) embedded system with a single run loop. This code will not be called from ISRs, ever.
In this system we are not short on memory, but we do want very predictable behavior. For this reason I prefer declaring an object like this statically, even though it might be a little "wasteful". We have already had issues with large objects declared carelessly on the stack, which caused intermittent crashes (now fixed, but it took a while to diagnose). So in general I prefer static allocation, simply to get predictability, reliability, and fewer potential issues downstream.
So basically it's a case of taking a certain approach for a specific system design.
Pro
The behavior is well defined; the static buffer exists for the duration of the program and may be used by the program after filter_something returns.
Cons
Returning a static buffer is prone to error because people writing calls to the routines may neglect or be unaware that a static buffer is returned. This can lead to attempts to use multiple instances of the buffer from multiple calls to the function (in the same thread or different threads). Clear documentation is essential.
The static buffer exists for the duration of the program, so it occupies space at times when it may not be needed.
It really depends on how filter_something is used. Take the following as an example
#include <stdio.h>
#include <string.h>
const char* filter(const char* original, const int max_length)
{
static char buffer[1024];
memset(buffer, 0, sizeof(buffer));
memcpy(buffer, original, max_length);
return buffer;
}
int main()
{
const char *strone, *strtwo;
char deepone[16], deeptwo[16];
/* Case 1 */
printf("%s\n", filter("everybody", 10));
/* Case 2 */
printf("%s %s %s\n", filter("nobody", 7), filter("somebody", 9), filter("anybody", 8));
/* Case 2b - the same problem inside a comparison */
if (strcmp(filter("same",5), filter("different", 10)) == 0)
printf("Strings same\n");
else
printf("Strings different\n");
/* Case 3 - Both of these end up with the same pointer */
strone = filter("same",5);
strtwo = filter("different", 10);
if (strcmp(strone, strtwo) == 0)
printf("Strings same\n");
else
printf("Strings different\n");
/* Case 4 - You need a deep copy if you wish to compare */
strcpy(deepone, filter("same", 5));
strcpy(deeptwo, filter("different", 10));
if (strcmp(deepone, deeptwo) == 0)
printf("Strings same\n");
else
printf("Strings different\n");
}
The output when gcc is used is
everybody
nobody nobody nobody
Strings same
Strings same
Strings different
When filter is used by itself, it behaves quite well.
When it is used multiple times in one expression, the order in which the calls are evaluated is unspecified, so there is no telling which result you will see. All instances will show the contents left by whichever call to filter executed last.
If an instance is taken, the contents of the instance will not stay the same as when the instance was taken. This is also a common problem when C++ coders switch to C# or Java.
If a deep copy of the instance is taken, then the contents of the instance when the instance was taken will be preserved.
In C++, this technique is often used when returning objects with the same consequences.
It is true that the identifier buffer only has scope local to the block in which it is declared. However, because it is declared static, its lifetime is that of the full program.
So returning a pointer to a static variable is valid. In fact, many standard functions do this such as strtok and ctime.
The one thing you need to watch for is that such a function is not reentrant. For example, if you do something like this:
printf("filter 1: %s, filter 2: %s\n",
filter_something("abc", 3), filter_something("xyz", 3));
The two function calls can occur in any order, and both return the same pointer, so you'll get the same result printed twice (i.e. the result of whatever call happens to occur last) instead of two different results.
Also, if such a function is called from two different threads, you end up with a race condition with the threads reading/writing the same place.
Just to add to the previous answers, I think the problem, in a more abstract sense, is to make the filtering result broader in scope than it ought to be. You introduce a 'state' which seems useless, at least if the caller's intention is only to get a filtered string. In this case, it should be the caller who should create the array, likely on the stack, and pass it as a parameter to the filtering method. It is the introduction of this state that makes possible all the problems referred to in the preceding responses.
From a program design point of view, it's frowned upon to return pointers to private data, in case that data was made private for a reason. That being said, it's less bad design to return a pointer to a local static than it is to use spaghetti programming with "globals" (external linkage). Particularly when the pointer returned is const qualified.
One general issue with static variables, which may or may not be a problem regardless of embedded or hosted system, is re-entrancy. If the code needs to be interrupt/thread safe, then you need to implement means to achieve that.
The obvious alternative to it all is caller allocation and you've got to ask yourself why that's not an option:
void filter_something (size_t size, char dest[size], const char original[size]);
(Or if you will, [restrict size] on both pointers for a mini-optimization.)
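A sketch of what caller allocation could look like in practice. The name filter_into, its parameter order, and the "replace non-printable bytes with '?'" rule are all assumptions for illustration; the question never says what the real filtering does.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies at most dest_size - 1 characters from src (reading no more than
   src_max bytes), replaces non-printable bytes with '?', and always
   NUL-terminates dest. Returns the number of characters written. */
size_t filter_into(char *dest, size_t dest_size, const char *src, size_t src_max)
{
    if (dest == NULL || dest_size == 0)
        return 0;

    size_t n = 0;
    if (src != NULL) {
        while (n + 1 < dest_size && n < src_max && src[n] != '\0') {
            unsigned char c = (unsigned char)src[n];
            dest[n] = isprint(c) ? (char)c : '?';
            n++;
        }
    }
    dest[n] = '\0';
    return n;
}
```

Because each caller owns its own destination array, two results can be held and compared at the same time without the aliasing problem shown with the static buffer.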

How to test if a value is (or is not) a valid pointer? [duplicate]

Is there any way to determine (programmatically, of course) if a given pointer is "valid"? Checking for NULL is easy, but what about things like 0x00001234? When trying to dereference this kind of pointer an exception/crash occurs.
A cross-platform method is preferred, but platform-specific (for Windows and Linux) is also ok.
Update for clarification:
The problem is not with stale/freed/uninitialized pointers; instead, I'm implementing an API that takes pointers from the caller (like a pointer to a string, a file handle, etc.). The caller can send (on purpose or by mistake) an invalid value as the pointer. How do I prevent a crash?
You can't make that check. There is simply no way you can check whether a pointer is "valid". You have to trust that when people use a function that takes a pointer, those people know what they are doing. If they pass you 0x4211 as a pointer value, then you have to trust it points to address 0x4211. And if they "accidentally" hit an object, then even if you used some scary operating system function (IsValidPtr or whatever), you would still slip into a bug and not fail fast.
Start using null pointers for signaling this kind of thing and tell the user of your library that they should not use pointers if they tend to accidentally pass invalid pointers, seriously :)
Here are three easy ways for a C program under Linux to get introspective about the status of the memory in which it is running, and why the question has appropriate sophisticated answers in some contexts.
After calling getpagesize() and rounding the pointer to a page
boundary, you can call mincore() to find out if a page is valid and
if it happens to be part of the process working set. Note that this requires
some kernel resources, so you should benchmark it and determine if
calling this function is really appropriate in your api. If your api
is going to be handling interrupts, or reading from serial ports
into memory, it is appropriate to call this to avoid unpredictable
behaviors.
After calling stat() to determine if there is a /proc/self directory available, you can fopen and read through /proc/self/maps
to find information about the region in which a pointer resides.
Study the man page for proc, the process information pseudo-file
system. Obviously this is relatively expensive, but you might be
able to get away with caching the result of the parse into an array
you can efficiently lookup using a binary search. Also consider the
/proc/self/smaps. If your api is for high-performance computing then
the program will want to know about /proc/self/numa_maps, which is
documented under the man page for numa, the non-uniform memory
architecture.
The get_mempolicy(MPOL_F_ADDR) call is appropriate for high performance computing api work where there are multiple threads of
execution and you are managing your work to have affinity for non-uniform memory
as it relates to the cpu cores and socket resources. Such an api
will of course also tell you if a pointer is valid.
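As a rough illustration of the /proc/self/maps approach, here is a Linux-only sketch that re-reads the file on every call instead of caching it. The function name address_is_mapped and its three-way return value are my own choices. It only tells you that the address lies inside some mapping at the moment of the call, not that it points at a live object of the right type.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if addr lies inside a mapping listed in /proc/self/maps,
   0 if it does not, and -1 if the file could not be opened
   (for example on a system without procfs). */
int address_is_mapped(const void *addr)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int found = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned long start, end;

        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
            break;
        }

        /* If the line was longer than the buffer, skip its remainder so
           the tail is not mistaken for the start of a new entry. */
        if (strchr(line, '\n') == NULL) {
            int ch;
            while ((ch = fgetc(fp)) != EOF && ch != '\n')
                ;
        }
    }

    fclose(fp);
    return found;
}
```

Opening and parsing the file on every call is slow, which is why the answer suggests caching the parsed ranges; a cache would of course go stale whenever the process maps or unmaps memory.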
Under Microsoft Windows there is the function QueryWorkingSetEx that is documented under the Process Status API (also in the NUMA API).
As a corollary to sophisticated NUMA API programming this function will also let you do simple "testing pointers for validity (C/C++)" work, as such it is unlikely to be deprecated for at least 15 years.
Preventing a crash caused by the caller sending in an invalid pointer is a good way to make silent bugs that are hard to find.
Isn't it better for the programmer using your API to get a clear message that his code is bogus by crashing it rather than hiding it?
On Win32/64 there is a way to do this. Attempt to read the pointer and catch the resulting SEH exception that will be thrown on failure. If it doesn't throw, then it's a valid pointer.
The problem with this method though is that it just returns whether or not you can read data from the pointer. It makes no guarantee about type safety or any number of other invariants. In general this method is good for little else other than to say "yes, I can read that particular place in memory at a time that has now passed".
In short, Don't do this ;)
Raymond Chen has a blog post on this subject: http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx
AFAIK there is no way. You should try to avoid this situation by always setting pointers to NULL after freeing memory.
On Unix you should be able to utilize a kernel syscall that does pointer checking and returns EFAULT, such as:
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdbool.h>
bool isPointerBad( void * p )
{
int fh = open( p, 0, 0 );
int e = errno;
if ( -1 == fh && e == EFAULT )
{
printf( "bad pointer: %p\n", p );
return true;
}
else if ( fh != -1 )
{
close( fh );
}
printf( "good pointer: %p\n", p );
return false;
}
int main()
{
int good = 4;
isPointerBad( (void *)3 );
isPointerBad( &good );
isPointerBad( "/tmp/blah" );
return 0;
}
returning:
bad pointer: 0x3
good pointer: 0x7fff375fd49c
good pointer: 0x400793
There's probably a better syscall to use than open() [perhaps access], since there's a chance that this could lead to actual file creation codepath, and a subsequent close requirement.
Regarding the answer a bit up in this thread:
IsBadReadPtr(), IsBadWritePtr(), IsBadCodePtr(), IsBadStringPtr() for Windows.
My advice is to stay away from them, someone has already posted this one:
http://blogs.msdn.com/oldnewthing/archive/2007/06/25/3507294.aspx
Another post on the same topic and by the same author (I think) is this one:
http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx ("IsBadXxxPtr should really be called CrashProgramRandomly").
If the users of your API send in bad data, let it crash. If the problem is that the data passed isn't used until later (and that makes it harder to find the cause), add a debug mode where the strings etc. are logged at entry. If they are bad it will be obvious (and probably crash). If it is happening way too often, it might be worth moving your API out of process and letting them crash the API process instead of the main process.
Firstly, I don't see any point in trying to protect yourself from the caller deliberately trying to cause a crash. They could easily do this by trying to access through an invalid pointer themselves. There are many other ways - they could just overwrite your memory or the stack. If you need to protect against this sort of thing then you need to be running in a separate process using sockets or some other IPC for communication.
We write quite a lot of software that allows partners/customers/users to extend functionality. Inevitably any bug gets reported to us first so it is useful to be able to easily show that the problem is in the plug-in code. Additionally there are security concerns and some users are more trusted than others.
We use a number of different methods depending on performance/throughput requirements and trustworthiness. From most preferred:
separate processes using sockets (often passing data as text).
separate processes using shared memory (if large amounts of data to pass).
same process separate threads via message queue (if frequent short messages).
same process separate threads all passed data allocated from a memory pool.
same process via direct procedure call - all passed data allocated from a memory pool.
We try never to resort to what you are trying to do when dealing with third party software - especially when we are given the plug-ins/library as binary rather than source code.
Use of a memory pool is quite easy in most circumstances and needn't be inefficient. If YOU allocate the data in the first place then it is trivial to check the pointers against the values you allocated. You could also store the length allocated and add "magic" values before and after the data to check for valid data type and data overruns.
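A small sketch of the pool idea, with a fixed number of fixed-size slots and guard words on both sides of the payload. Slot count, payload size, the magic value and all function names are invented for this example. The check compares the incoming pointer against addresses the pool itself handed out, so it never dereferences the caller's pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT   8             /* assumed pool capacity */
#define SLOT_PAYLOAD 64            /* assumed bytes per allocation */
#define GUARD_MAGIC  0xC0FFEE42u   /* arbitrary marker value */

typedef struct {
    uint32_t      head_guard;
    unsigned char payload[SLOT_PAYLOAD];
    uint32_t      tail_guard;      /* directly after the payload */
    bool          in_use;
} pool_slot;

static pool_slot pool[SLOT_COUNT];

/* Hands out the payload of the first free slot, or NULL if none is free. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use     = true;
            pool[i].head_guard = GUARD_MAGIC;
            pool[i].tail_guard = GUARD_MAGIC;
            return pool[i].payload;
        }
    }
    return NULL;
}

/* Finds the slot whose payload starts exactly at p, or NULL. */
static pool_slot *slot_of(const void *p)
{
    for (size_t i = 0; i < SLOT_COUNT; i++)
        if ((const void *)pool[i].payload == p)
            return &pool[i];
    return NULL;
}

/* True only for pointers this pool handed out, that are still live,
   and whose guard words have not been overwritten. */
bool pool_owns(const void *p)
{
    pool_slot *s = slot_of(p);
    return s != NULL && s->in_use &&
           s->head_guard == GUARD_MAGIC && s->tail_guard == GUARD_MAGIC;
}

void pool_free(void *p)
{
    pool_slot *s = slot_of(p);
    if (s != NULL)
        s->in_use = false;
}
```

An API entry point can then reject anything for which pool_owns() returns false, which catches foreign pointers, interior pointers, freed slots and overruns that reached a guard word.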
I've got a lot of sympathy with your question, as I'm in an almost identical position myself. I appreciate what a lot of the replies are saying, and they are correct - the routine supplying the pointer should be providing a valid pointer. In my case, it is almost inconceivable that they could have corrupted the pointer - but if they had managed, it would be MY software that crashes, and ME that would get the blame :-(
My requirement isn't that I continue after a segmentation fault - that would be dangerous - I just want to report what happened to the customer before terminating so that they can fix their code rather than blaming me!
This is how I've found to do it (on Windows): http://www.cplusplus.com/reference/clibrary/csignal/signal/
To give a synopsis:
#include <signal.h>
#include <cstdlib>    // exit
#include <iostream>   // cerr
using namespace std;
/// Function executed if a segmentation fault is encountered during the cast to an instance.
/// (Named terminate_on_segv so it cannot be confused with std::terminate.)
void terminate_on_segv(int param)
{
    cerr << "\nThe function received a corrupted reference - please check the user-supplied dll.\n";
    cerr << "Terminating program...\n";
    exit(1);
}
...
void MyFunction()
{
    void (*previous_sigsegv_function)(int);
    previous_sigsegv_function = signal(SIGSEGV, terminate_on_segv);
    // <-- insert risky stuff here -->
    signal(SIGSEGV, previous_sigsegv_function);
}
Now this appears to behave as I would hope (it prints the error message, then terminates the program) - but if someone can spot a flaw, please let me know!
There are no provisions in C++ to test for the validity of a pointer as a general case. One can obviously assume that NULL (0x00000000) is bad, and various compilers and libraries like to use "special values" here and there to make debugging easier (For example, if I ever see a pointer show up as 0xCECECECE in visual studio I know I did something wrong) but the truth is that since a pointer is just an index into memory it's near impossible to tell just by looking at the pointer if it's the "right" index.
There are various tricks that you can do with dynamic_cast and RTTI to ensure that the object pointed to is of the type that you want, but they all require that you are pointing to something valid in the first place.
If you want to ensure that your program can detect "invalid" pointers then my advice is this: Set every pointer you declare either to NULL or a valid address immediately upon creation and set it to NULL immediately after freeing the memory that it points to. If you are diligent about this practice, then checking for NULL is all you ever need.
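In plain C that discipline is often wrapped in a small macro. FREE_AND_NULL is a name I made up for this sketch, not a standard facility. Note the limit of the technique: it only resets the one pointer variable you pass, so any other copies of the address remain dangling.

```c
#include <stdlib.h>

/* Frees the block and resets the pointer variable in one step.
   p must be a modifiable pointer variable and is evaluated twice,
   so do not pass an expression with side effects. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Since free(NULL) is defined to do nothing, calling the macro twice on the same variable is harmless.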
Setting the pointer to NULL before and after using is a good technique. This is easy to do in C++ if you manage pointers within a class for example (a string):
#include <stdlib.h>   // malloc, free
#include <string.h>   // strlen, strcpy
class SomeClass
{
public:
SomeClass();
~SomeClass();
void SetText( const char *text);
char *GetText() const { return MyText; }
void Clear();
private:
char * MyText;
};
SomeClass::SomeClass()
{
MyText = NULL;
}
SomeClass::~SomeClass()
{
Clear();
}
void SomeClass::Clear()
{
if (MyText)
free( MyText);
MyText = NULL;
}
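void SomeClass::SetText( const char *text)
{
    Clear();
    if (text == NULL)
        return;
    // strlen() excludes the terminating NUL, so allocate one extra byte
    MyText = static_cast<char *>(malloc( strlen(text) + 1));
    if (MyText)
        strcpy( MyText, text);
}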
Indeed, something can be done in specific cases. For example, to check whether a string pointer is valid, the write(fd, buf, size) syscall can help: let fd be a file descriptor for a temporary file you create for the test, and let buf point to the string you are testing. If the pointer is invalid, write() returns -1 and sets errno to EFAULT, which indicates that buf is outside your accessible address space.
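A sketch of the write() trick. To avoid creating a temporary file it writes into a pipe instead, which is my substitution rather than what the answer describes; range_is_readable is an invented name. It answers only "could the kernel read these bytes just now", and another thread could unmap the memory right after the call returns.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Returns true if the kernel was able to read len bytes starting at p.
   Any failure, including failure to create the pipe, is reported as false. */
bool range_is_readable(const void *p, size_t len)
{
    int fds[2];
    if (pipe(fds) != 0)
        return false;

    bool ok = true;
    const char *cur = p;
    char sink[256];

    while (len > 0) {
        /* Small chunks keep the pipe from filling up and blocking. */
        size_t chunk = len < sizeof sink ? len : sizeof sink;

        /* write() fails with errno set to EFAULT if the kernel cannot
           read the source bytes. */
        ssize_t n = write(fds[1], cur, chunk);
        if (n < 0 || (size_t)n != chunk) {
            ok = false;
            break;
        }

        /* Drain what was just written so the next chunk has room. */
        if (read(fds[0], sink, chunk) != n) {
            ok = false;
            break;
        }

        cur += chunk;
        len -= chunk;
    }

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Two system calls per 256 bytes is expensive, so this is better suited to a debug build than to a hot path.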
Peeter Joos's answer is pretty good. Here is an "official" way to do it:
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>
bool is_pointer_valid(void *p) {
/* get the page size */
size_t page_size = sysconf(_SC_PAGESIZE);
/* find the address of the page that contains p */
void *base = (void *)((((size_t)p) / page_size) * page_size);
/* call msync, if it returns non-zero, return false */
int ret = msync(base, page_size, MS_ASYNC) != -1;
return ret ? ret : errno != ENOMEM;
}
There isn't any portable way of doing this, and doing it for specific platforms can be anywhere between hard and impossible. In any case, you should never write code that depends on such a check - don't let the pointers take on invalid values in the first place.
As others have said, you can't reliably detect an invalid pointer. Consider some of the forms an invalid pointer might take:
You could have a null pointer. That's one you could easily check for and do something about.
You could have a pointer to somewhere outside of valid memory. What constitutes valid memory varies depending on how the run-time environment of your system sets up the address space. On Unix systems, it is usually a virtual address space starting at 0 and going to some large number of megabytes. On embedded systems, it could be quite small. It might not start at 0, in any case. If your app happens to be running in supervisor mode or the equivalent, then your pointer might reference a real address, which may or may not be backed up with real memory.
You could have a pointer to somewhere inside your valid memory, even inside your data segment, bss, stack or heap, but not pointing at a valid object. A variant of this is a pointer that used to point to a valid object, before something bad happened to the object. Bad things in this context include deallocation, memory corruption, or pointer corruption.
You could have a flat-out illegal pointer, such as a pointer with illegal alignment for the thing being referenced.
The problem gets even worse when you consider segment/offset based architectures and other odd pointer implementations. This sort of thing is normally hidden from the developer by good compilers and judicious use of types, but if you want to pierce the veil and try to outsmart the operating system and compiler developers, well, you can, but there is not one generic way to do it that will handle all of the issues you might run into.
The best thing you can do is allow the crash and put out some good diagnostic information.
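Of the cases listed above, only the null pointer and the misaligned pointer can be checked cheaply and portably. A sketch of such a check follows; plausible_ptr is a hypothetical helper name. Passing it proves nothing about whether the address is mapped or holds a live object.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if p is a multiple of alignment. An alignment of 0 is rejected. */
bool is_aligned(const void *p, size_t alignment)
{
    return alignment != 0 && ((uintptr_t)p % alignment) == 0;
}

/* Cheap sanity check for an API entry point: catches NULL and
   misaligned addresses only. */
bool plausible_ptr(const void *p, size_t alignment)
{
    return p != NULL && is_aligned(p, alignment);
}
```

For a parameter of type int *, for example, the caller of this helper would pass _Alignof(int) as the alignment.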
In general, it's impossible to do. Here's one particularly nasty case:
struct Point2d {
int x;
int y;
};
struct Point3d {
int x;
int y;
int z;
};
void dump(Point3d *p)
{
printf("[%d %d %d]\n", p->x, p->y, p->z);
}
Point2d points[2] = { {0, 1}, {2, 3} };
Point3d *p3 = reinterpret_cast<Point3d *>(&points[0]);
dump(p3);
On many platforms, this will print out:
[0 1 2]
You're forcing the runtime system to incorrectly interpret bits of memory, but in this case it's not going to crash, because the bits all make sense. This is part of the design of the language (look at C-style polymorphism with struct sockaddr, sockaddr_in, sockaddr_in6), so you can't reliably protect against it on any platform.
It's unbelievable how much misleading information you can read in articles above...
Even Microsoft's MSDN documentation says that the IsBadXxxPtr functions should not be used. Oh well, I prefer a working application to a crashing one, even if "working" might mean working incorrectly (as long as the end user can continue with the application).
By googling I haven't found any useful example for windows - found a solution for 32-bit apps,
http://www.codeproject.com/script/Content/ViewAssociatedFile.aspx?rzp=%2FKB%2Fsystem%2Fdetect-driver%2F%2FDetectDriverSrc.zip&zep=DetectDriverSrc%2FDetectDriver%2Fsrc%2FdrvCppLib%2Frtti.cpp&obid=58895&obtid=2&ovid=2
but I need also to support 64-bit apps, so this solution did not work for me.
But I've dug through Wine's source code and managed to put together similar code that works for 64-bit apps as well. Attaching the code here:
#include <typeinfo.h>
typedef void (*v_table_ptr)();
typedef struct _cpp_object
{
v_table_ptr* vtable;
} cpp_object;
#ifndef _WIN64
typedef struct _rtti_object_locator
{
unsigned int signature;
int base_class_offset;
unsigned int flags;
const type_info *type_descriptor;
//const rtti_object_hierarchy *type_hierarchy;
} rtti_object_locator;
#else
typedef struct
{
unsigned int signature;
int base_class_offset;
unsigned int flags;
unsigned int type_descriptor;
unsigned int type_hierarchy;
unsigned int object_locator;
} rtti_object_locator;
#endif
/* Get type info from an object (internal) */
static const rtti_object_locator* RTTI_GetObjectLocator(void* inptr)
{
cpp_object* cppobj = (cpp_object*) inptr;
const rtti_object_locator* obj_locator = 0;
if (!IsBadReadPtr(cppobj, sizeof(void*)) &&
!IsBadReadPtr(cppobj->vtable - 1, sizeof(void*)) &&
!IsBadReadPtr((void*)cppobj->vtable[-1], sizeof(rtti_object_locator)))
{
obj_locator = (rtti_object_locator*) cppobj->vtable[-1];
}
return obj_locator;
}
And following code can detect whether pointer is valid or not, you need probably to add some NULL checking:
CTest* t = new CTest();
//t = (CTest*) 0;
//t = (CTest*) 0x12345678;
const rtti_object_locator* ptr = RTTI_GetObjectLocator(t);
#ifdef _WIN64
char *base = ptr->signature == 0 ? (char*)RtlPcToFileHeader((void*)ptr, (void**)&base) : (char*)ptr - ptr->object_locator;
const type_info *td = (const type_info*)(base + ptr->type_descriptor);
#else
const type_info *td = ptr->type_descriptor;
#endif
const char* n =td->name();
This gets class name from pointer - I think it should be enough for your needs.
One thing I'm still worried about is the performance of pointer checking. The code snippet above already makes 3-4 API calls, which might be overkill for time-critical applications.
It would be good if someone could measure overhead of pointer checking compared for example to C#/managed c++ calls.
It is not a very good policy to accept arbitrary pointers as input parameters in a public API. It's better to have "plain data" types like an integer, a string or a struct (I mean a classical struct with plain data inside, of course; officially anything can be a struct).
Why? Well because as others say there is no standard way to know whether you've been given a valid pointer or one that points to junk.
But sometimes you don't have the choice - your API must accept a pointer.
In these cases, it is the duty of the caller to pass a good pointer. NULL may be accepted as a value, but not a pointer to junk.
Can you double-check in any way? Well, what I did in a case like that was to define an invariant for the type the pointer points to, and call it when you get it (in debug mode). At least if the invariant fails (or crashes) you know that you were passed a bad value.
// API that does not allow NULL
void PublicApiFunction1(Person* in_person)
{
assert(in_person != NULL);
assert(in_person->Invariant());
// Actual code...
}
// API that allows NULL
void PublicApiFunction2(Person* in_person)
{
assert(in_person == NULL || in_person->Invariant());
// Actual code (must keep in mind that in_person may be NULL)
}
Following does work in Windows (somebody suggested it before):
static void copy(void * target, const void* source, int size)
{
__try
{
CopyMemory(target, source, size);
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
doSomething(--whatever--);
}
}
The function has to be static, standalone or static method of some class.
To test on read-only, copy data in the local buffer.
To test on write without modifying contents, write them over.
You can test first/last addresses only.
If pointer is invalid, control will be passed to 'doSomething',
and then outside the brackets.
Just do not use anything requiring destructors, like CString.
On Windows I use this code:
void * G_pPointer = NULL;
const char * G_szPointerName = NULL;
void CheckPointerIternal()
{
char cTest = *((char *)G_pPointer);
}
bool CheckPointerIternalExt()
{
bool bRet = false;
__try
{
CheckPointerIternal();
bRet = true;
}
__except (EXCEPTION_EXECUTE_HANDLER)
{
}
return bRet;
}
void CheckPointer(void * A_pPointer, const char * A_szPointerName)
{
G_pPointer = A_pPointer;
G_szPointerName = A_szPointerName;
if (!CheckPointerIternalExt())
throw std::runtime_error("Invalid pointer " + std::string(G_szPointerName) + "!");
}
Usage:
unsigned long * pTest = (unsigned long *) 0x12345;
CheckPointer(pTest, "pTest"); //throws exception
On macOS, you can do this with mach_vm_region, which as well as telling you if a pointer is valid, also lets you validate what access you have to the memory to which the pointer points (read/write/execute). I provided sample code to do this in my answer to another question:
#include <mach/mach.h>
#include <mach/mach_vm.h>
#include <stdio.h>
#include <stdbool.h>
bool ptr_is_valid(void *ptr, vm_prot_t needs_access) {
vm_map_t task = mach_task_self();
mach_vm_address_t address = (mach_vm_address_t)ptr;
mach_vm_size_t size = 0;
vm_region_basic_info_data_64_t info;
mach_msg_type_number_t count = VM_REGION_BASIC_INFO_COUNT_64;
mach_port_t object_name;
kern_return_t ret = mach_vm_region(task, &address, &size, VM_REGION_BASIC_INFO_64, (vm_region_info_t)&info, &count, &object_name);
if (ret != KERN_SUCCESS) return false;
return ((mach_vm_address_t)ptr) >= address && ((info.protection & needs_access) == needs_access);
}
#define TEST(ptr,acc) printf("ptr_is_valid(%p,access=%d)=%d\n", (void*)(ptr), (acc), ptr_is_valid((void*)(ptr),(acc)))
int main(int argc, char**argv) {
TEST(0,0);
TEST(0,VM_PROT_READ);
TEST(123456789,VM_PROT_READ);
TEST(main,0);
TEST(main,VM_PROT_READ);
TEST(main,VM_PROT_READ|VM_PROT_EXECUTE);
TEST(main,VM_PROT_EXECUTE);
TEST(main,VM_PROT_WRITE);
TEST((void*)(-1),0);
return 0;
}
The SEI CERT C Coding Standard recommendation MEM10-C. Define and use a pointer validation function says it is possible to do a check to some degree, especially under Linux OS.
The method described in the link is to keep track of the highest memory address returned by malloc and add a function that tests if someone tries to use a pointer greater than that value. It is probably of limited use.
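A minimal sketch of that idea, as described above, might look like the following. The names tracked_malloc and maybe_heap_pointer are hypothetical, and the sketch assumes every allocation goes through the wrapper. It tracks the lowest address as well as the highest. A true result only means "inside the range malloc has handed out so far", not "valid".

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: lowest and highest heap addresses seen so far. */
static uintptr_t heap_low = UINTPTR_MAX;
static uintptr_t heap_high = 0;

/* malloc wrapper that records the address range it has handed out. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        uintptr_t start = (uintptr_t)p;
        uintptr_t end = start + size;       /* one past the end of the block */
        if (start < heap_low)
            heap_low = start;
        if (end > heap_high)
            heap_high = end;
    }
    return p;
}

/* Coarse check: false is definite, true only means "could be heap memory". */
bool maybe_heap_pointer(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return p != NULL && a >= heap_low && a < heap_high;
}
```

Freed blocks, stack objects that happen to lie inside the range, and pointers into the middle of a block all pass this check, which is why it is of limited use.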
IsBadReadPtr(), IsBadWritePtr(), IsBadCodePtr(), IsBadStringPtr() for Windows.
These take time proportional to the length of the block, so for sanity check I just check the starting address.
I have seen various libraries use some method to check for unreferenced memory and such. I believe they simply "override" the memory allocation and deallocation methods (malloc/free), which has some logic that keeps track of the pointers. I suppose this is overkill for your use case, but it would be one way to do it.
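As a rough illustration of that approach in C, the sketch below wraps malloc/free and remembers every live block in a small table. All names (track_malloc, track_free, track_is_live, MAX_TRACKED) are made up for this example, and a real library would use a hash table rather than a linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 1024            /* assumption: small fixed-size table */

static void *tracked[MAX_TRACKED];  /* NULL marks a free slot */

/* Allocates a block and records its start address. */
void *track_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == NULL) {
            tracked[i] = p;
            return p;
        }
    }
    free(p);                        /* table full: refuse rather than lose track */
    return NULL;
}

/* True only if p is the start of a live block from track_malloc(). */
bool track_is_live(const void *p)
{
    if (p == NULL)
        return false;
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (tracked[i] == p)
            return true;
    return false;
}

/* Frees a tracked block; unknown pointers and NULL are ignored. */
void track_free(void *p)
{
    for (size_t i = 0; p != NULL && i < MAX_TRACKED; i++) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            free(p);
            return;
        }
    }
}
```

This only answers "is this the start of a block I allocated and have not freed yet"; it says nothing about stack or static objects.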
Technically you can override operator new (and delete) and collect information about all allocated memory, so you can have a method to check if heap memory is valid.
but:
you still need a way to check whether a pointer refers to memory allocated on the stack
you will need to define what a 'valid' pointer is:
a) memory at that address is allocated
b) memory at that address is the start address of an object (e.g. the address is not in the middle of a huge array)
c) memory at that address is the start address of an object of the expected type
Bottom line: the approach in the question is not the C++ way; you need to define some rules which ensure that the function receives valid pointers.
There is no way to make that check in C++. What should you do if other code passes you an invalid pointer? You should crash. Why? Check out this link: http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx
Addendum to the accepted answer(s):
Assume that your pointer could hold only three values, 0, 1 and -1, where 1 signifies a valid pointer, -1 an invalid one and 0 another invalid one. What is the probability that your pointer is NULL, all values being equally likely? 1/3. Now take the valid case out: of the two invalid cases, the NULL check catches one, so you catch 50% of the errors. Looks good, right? Scale this up to a 4-byte pointer. There are 2^32 or 4294967296 possible values. Of these, only ONE value is correct, one is NULL, and you are still left with 4294967294 other invalid cases. Recalculate: you have a test for 1 out of (4294967294 + 1) invalid cases. That is a probability of about 2.3e-10, or 0 for most practical purposes. Such is the futility of the NULL check.
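The arithmetic can be checked with a few lines of C. This is only a sketch: the function name null_check_coverage is made up, and it assumes, as the argument does, that exactly one pointer value is valid and all bit patterns are equally likely.

```c
#include <stdint.h>

/* Fraction of invalid pointer values that a NULL check catches,
   given the total number of possible pointer values (must be >= 2). */
double null_check_coverage(uint64_t total_values)
{
    uint64_t invalid = total_values - 1;    /* everything except the one valid value */
    return 1.0 / (double)invalid;           /* NULL is exactly one of the invalid values */
}
```

For the three-value toy case, null_check_coverage(3) gives 0.5; for a 32-bit pointer, null_check_coverage(4294967296) is 1/4294967295, roughly 2.3e-10.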
You know, a new driver (at least on Linux) that is capable of this probably wouldn't be that hard to write.
On the other hand, it would be folly to build your programs like this. Unless you have some really specific and single use for such a thing, I wouldn't recommend it. If you built a large application loaded with constant pointer validity checks it would likely be horrendously slow.
you should avoid these methods because they do not work. blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx – JaredPar Feb 15 '09 at 16:02
If they don't work, will the next Windows update fix them?
If they don't work at the conceptual level, the functions will probably be removed from the Windows API completely.
The MSDN documentation claims that they are banned, and the reason for this is probably that relying on them points to a design flaw in the application (e.g. generally you should not swallow invalid pointers silently, if you're in charge of the design of the whole application of course), plus the performance cost of pointer checking.
But you should not claim that they do not work because of some blog.
In my test application I've verified that they do work.
these links may be helpful
_CrtIsValidPointer
Verifies that a specified memory range is valid for reading and writing (debug version only).
http://msdn.microsoft.com/en-us/library/0w1ekd5e.aspx
_CrtCheckMemory
Confirms the integrity of the memory blocks allocated in the debug heap (debug version only).
http://msdn.microsoft.com/en-us/library/e73x0s4b.aspx

Using function parameters as both input and output

I found myself using function parameters for both input and output and was wondering if what I'm doing is going to bite me later.
In this example buffer_len is such a parameter. It is used by the foo to determine the size of the buffer and tell the caller, main, how much of the buffer has been used up.
#define MAX_BUFFER_LENGTH 16
char buffer[MAX_BUFFER_LENGTH] = {0};
void main(void)
{
uint32_t buffer_len = MAX_BUFFER_LENGTH;
printf("BEFORE: Max buffer length = %u", buffer_len);
foo(buffer, &buffer_len);
printf("AFTER: Buffer length used = %u", buffer_len);
}
void foo(char *buffer, uint32_t *buffer_len)
{
/* Remember max buffer length */
uint32_t buffer_len_max = *buffer_len;
uint32_t buffer_len_left = buffer_len_max;
/* Add things to the buffer, decreasing the buffer_len_left
in the process */
...
/* Return the length of the buffer used up to the caller */
*buffer_len = buffer_len_max - buffer_len_left;
}
Is this an OK thing to do?
EDIT:
Thank you for your responses, but I'd prefer to keep the return value of foo for the actual function result (which makes sense with larger functions). Would something like this be more pain-free in the long run?
typedef struct
{
char *data_ptr;
uint32_t length_used;
uint32_t length_max;
} buffer_t;
#define ACTUAL_BUFFER_LENGTH 16
char actual_buffer[ACTUAL_BUFFER_LENGTH] = {0};
void main(void)
{
buffer_t my_buffer = { .data_ptr = &actual_buffer[0],
.length_used = 0,
.length_max = ACTUAL_BUFFER_LENGTH };
}
For the original version of the question where the called function doesn't return a value, you got three similar answers, all roughly saying "Yes, but…":
Jonathan Leffler said:
It's "OK" as long as it is documented, but really not the preferred way of operating. Why not have the function return the length used, and leave the buffer length parameter as a regular uint32_t (or perhaps size_t, so you can pass sizeof(buffer) to the function)?
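A sketch of what that suggestion could look like, with the capacity passed by value as a size_t and the length used returned. The payload here is a stand-in, since the question does not show what foo actually writes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of foo(): the capacity goes in by value and the
   number of bytes written comes back as the return value. */
size_t foo(char *buffer, size_t buffer_len)
{
    static const char payload[] = "hello";  /* stand-in for the real data */
    size_t n = sizeof payload - 1;          /* bytes to write, without the '\0' */

    if (n > buffer_len)
        n = buffer_len;                     /* never write past the capacity */
    memcpy(buffer, payload, n);
    return n;                               /* length actually used */
}
```

The caller can then write size_t used = foo(buffer, sizeof(buffer)); and its own copy of the capacity is never overwritten.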
DevSolar said:
It's syntactically OK, but personally I much prefer return codes. If that is not possible, I would want a dedicated output parameter, especially if it's pass-by-reference. (I might not pay attention and continue under the assumption that my buffer_len still holds the original value.)
BeyelerStudios said:
As long as you have not used up your return value, I prefer to receive the result rather than having to define a variable every time I use the function (or maybe I want to input an expression).
The unanimity is remarkable.
The question was then updated to indicate that instead of returning void, the function's return value would be used for another purpose. This completely changes the assessment.
Don't show a void function if your real function is going to return a value. Doing so completely alters the answers. If you need to return more than one value, then an in-out parameter is OK (even necessary — getchar() is a counter-example), though a pure in parameter and a separate pure out parameter might be better. Using a structure is OK too.
Perhaps I should explain the 'counter-example' a bit. The getchar() function returns a value that indicates failure or a char value. This leads to many pitfalls for beginners (because getchar() returns an int, not a char as its name suggests). It would be better, in some ways, if the function was:
bool get_char(char *c);
returning true if it reads a character and false if it fails, and assigning the character value to c. It could be used like:
char c;
while (get_char(&c))
…use character just read…
This is a case where the function needs to return two values.
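One possible implementation of such a function, as a sketch only. get_char_from is a hypothetical helper added here so that the same logic works on any stream; get_char is the interface proposed above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Reads one character from fp. The status is the return value and the
   character comes back through *c, so EOF cannot be confused with data. */
bool get_char_from(FILE *fp, char *c)
{
    int ch = getc(fp);      /* int, so EOF stays distinguishable from any char */
    if (ch == EOF)
        return false;
    *c = (char)ch;
    return true;
}

/* The interface suggested above, reading from standard input. */
bool get_char(char *c)
{
    return get_char_from(stdin, c);
}
```

Inside the wrapper the result of getc still has to be held in an int; the point is that callers no longer need to know that.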
Harking back to the suggested revised code in the question, the code using the structure.
That is not a bad idea at all; it is often sensible to package up a set of values into a structure. It would make a lot of sense if the called function has to compute some value that it will return and yet it also modifies the buffer array and needs to report on the number of entries in it (as well as knowing how much space there is to be used). Here, keeping the 'space available' separate from the 'space used' is definitely preferable; it will be easier to see what's going on than having the 'in-out' parameter which informs the function how much space is available on entry and reports back how much space was used on exit. Even if it reported how much space was still available on exit, it would be harder to use.
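To make that concrete, here is a sketch of a function that takes the structure from the question, keeps its return value for a status code, and reports the space used through the structure. The helper buffer_append and its 0 / -1 convention are assumptions for illustration, not part of the question.

```c
#include <stdint.h>
#include <string.h>

typedef struct
{
    char    *data_ptr;
    uint32_t length_used;
    uint32_t length_max;
} buffer_t;

/* Hypothetical helper: appends n bytes to the buffer.
   Returns 0 on success, -1 if the bytes do not fit. */
int buffer_append(buffer_t *buf, const char *src, uint32_t n)
{
    if (n > buf->length_max - buf->length_used)
        return -1;                          /* not enough room left */
    memcpy(buf->data_ptr + buf->length_used, src, n);
    buf->length_used += n;                  /* capacity stays untouched */
    return 0;
}
```

Because length_max is never modified, the caller can always compute the remaining space as length_max - length_used.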
Which gets back to the original diagnosis: yes, the in-out parameter is technically legal, and can be made to work, but isn't as easy to use as separate values.
Side note: void main(void) is not the standard way to write main() — see What should main() return in C and C++? for the full story.
There's nothing wrong with using the same buffer for input and output, but it probably limits the function's utility elsewhere. For example, what if you want to use it with two different values? (For some reason, as a user of the function, I need the original preserved.) In the example you provide there's no harm in taking two parameters in the function and then just passing the same pointer in twice. Then you've wrapped up both uses and it probably simplifies the function code.
For more complex data types like arrays, as well as the same problem above, you'll need to make sure your function doesn't need a larger output, or if it shrinks the buffer that you memset( 0.. ) the difference and so on.
So because of those headaches I tend to avoid this as a pattern, but as I say, there is nothing particularly wrong with it.

Comparing a volatile array to a non-volatile array

Recently I needed to compare two uint arrays (one volatile and the other non-volatile) and the results were confusing; there must be something I misunderstood about volatile arrays.
I need to read an array from an input device and write it to a local variable before comparing this array to a global volatile array. If there is any difference, I need to copy the new one onto the global one and publish the new array to other platforms. The code is something like below:
#define ARRAYLENGTH 30
volatile uint8 myArray[ARRAYLENGTH];
void myFunc(void){
uint8 shadow_array[ARRAYLENGTH],change=0;
readInput(shadow_array);
for(int i=0;i<ARRAYLENGTH;i++){
if(myArray[i] != shadow_array[i]){
change = 1;
myArray[i] = shadow_array[i];
}
}
if(change){
char arrayStr[ARRAYLENGTH*4];
array2String(arrayStr,myArray);
publish(arrayStr);
}
}
However, this didn't work: every time myFunc runs, a new message is published, mostly identical to the earlier message.
So I inserted a log line into code:
for(int i=0;i<ARRAYLENGTH;i++){
if(myArray[i] != shadow_array[i]){
change = 1;
log("old:%d,new:%d\r\n",myArray[i],shadow_array[i]);
myArray[i] = shadow_array[i];
}
}
The logs I got were as below:
old:0,new:0
old:8,new:8
old:87,new:87
...
Since solving the bug was time-critical, I worked around the issue as below:
char arrayStr[ARRAYLENGTH*4];
char arrayStr1[ARRAYLENGTH*4];
array2String(arrayStr,myArray);
array2String(arrayStr1,shadow_array);
if(strCompare(arrayStr,arrayStr1)){
publish(arrayStr1);
}
}
But this approach is far from efficient. If anyone has a reasonable explanation, I would like to hear it.
Thank you.
[updated from comments:]
For the volatile part, the global array has to be volatile, since other threads are accessing it.
If the global array is volatile, your tracing code could be inaccurate:
for(int i=0;i<ARRAYLENGTH;i++){
if(myArray[i] != shadow_array[i]){
change = 1;
log("old:%d,new:%d\r\n",myArray[i],shadow_array[i]);
myArray[i] = shadow_array[i];
}
}
The trouble is that the comparison line reads myArray[i] once, but the logging message reads it again, and since it is volatile, there's no guarantee that the two reads will give the same value. An accurate logging technique would be:
for (int i = 0; i < ARRAYLENGTH; i++)
{
uint8 value;
if ((value = myArray[i]) != shadow_array[i])
{
change = 1;
log("old:%d,new:%d\r\n", value, shadow_array[i]);
myArray[i] = shadow_array[i];
}
}
This copies the value used in the comparison and reports that. My gut feel is it is not going to show a difference, but in theory it could.
global array has to be volatile, since other threads are accessing it
As you "nicely" observed, declaring an array volatile is not the way to protect it against concurrent read/write access by different threads.
Use a mutex for this. For example by wrapping access to the "global array" into a function which locks and unlocks this mutex. Then only use this function to access the "global array".
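A minimal sketch of such a wrapper, assuming POSIX threads are available (the question does not say which threading API the platform offers, so pthread_mutex_t is an assumption, as are the names myArray_update and myArray_read). It uses uint8_t from <stdint.h> in place of the project's uint8 type. With the mutex in place the array itself no longer needs to be volatile.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ARRAYLENGTH 30

static uint8_t myArray[ARRAYLENGTH];    /* shared; only touched under the lock */
static pthread_mutex_t myArrayLock = PTHREAD_MUTEX_INITIALIZER;

/* Copies new_data into the shared array if it differs.
   Returns true when the shared array was changed. */
bool myArray_update(const uint8_t *new_data)
{
    bool changed;

    pthread_mutex_lock(&myArrayLock);
    changed = memcmp(myArray, new_data, ARRAYLENGTH) != 0;
    if (changed)
        memcpy(myArray, new_data, ARRAYLENGTH);
    pthread_mutex_unlock(&myArrayLock);
    return changed;
}

/* Gives a reader a consistent snapshot of the shared array. */
void myArray_read(uint8_t *out)
{
    pthread_mutex_lock(&myArrayLock);
    memcpy(out, myArray, ARRAYLENGTH);
    pthread_mutex_unlock(&myArrayLock);
}
```

In myFunc, the comparison loop would then reduce to if (myArray_update(shadow_array)) followed by building and publishing the string from shadow_array.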
References:
Why is volatile not considered useful in multithreaded C or C++ programming?
https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
Also for printf()ing unsigned integers use the conversion specifier u not d.
A variable (or array) should be declared volatile when it may change outside the current program execution flow. This may happen through concurrent threads or an ISR.
If there is, however, only one writer and all others are just readers, then the actual writing code may treat it as being non-volatile (even though there is no way to tell the compiler to do so).
So if the comparison function is the only point in the project where the global array is actually changed (updated), then there is no problem with multiple reads. The code can be designed with the (external) knowledge that there will be no change by an external source, despite the volatile declaration.
The 'readers', however, do know that the variable (or the array content) may change and won't buffer their reads (e.g. by storing the read value in a register for further use), but the array content may still change while they are reading it, and the information as a whole might be inconsistent.
So the suggested use of a mutex is a good idea.
It does not, however, help with the original problem that the comparison loop fails, even though nobody is messing with the array from outside.
Also, I wonder why myArray is declared volatile if it is only used locally and the publishing is done by sending out a pointer to arrayStr (which is a pointer to a non-volatile char array).
There is no reason why myArray should be volatile. Actually, there is no reason for its existence at all:
Just read in the data, create a temporary string, and if it differs from the original one, replace the old string and publish it. Well, it's maybe less efficient to always build the string, but it makes the code much shorter and apparently works.
static char arrayStr[ARRAYLENGTH*4] = {0};
char tempStr[ARRAYLENGTH*4];
array2String(tempStr,shadow_array);
if(strCompare(arrayStr,tempStr)){
strCopy(arrayStr, tempStr);
publish(arrayStr);
}
}

Understanding how to turn a chunk of code that appears in main(){} into a function or functions in C

I'm working on a user-space driver which reads data off of a device by sending and receiving reports (it's a hid device).
Here is my initial code http://pastebin.com/ufbvziUR
EDIT NOTE: From the sound of the answers and comments it would seem I will need to wrap my C code inside of some Objective-C since the app that will be consuming this driver is written in Objective-C.
As of now I have just added all the code into the main() function and I'm able to grab data and print it out to the log window, with the main purpose of logging out the buffer array. This code will be a part of a much larger app, though, and will need to be run by getting called rather than just running automatically.
So I figured I would wrap all the code that appears in main() inside a large function called getData(). I want to return an array called char buffer[] (a char array full of bytes) when this function is run. My initial thought was to declare it like so, since I know you cannot really return an array from a function, only a pointer, so I just let the caller allocate the buffer and pass it the size.
unsigned char *getData(int user){
unsigned char *buffer = malloc(2800);
// ... do all the stuff
return buffer;
}
Then in main()
int main(){
int user = 1;
unsigned char *buffer = getData(user);
}
Is this a proper approach?
A couple of things feel wrong to me here. First is wrapping that much code into a single function, and second, I'm not quite sure how to break out of the function when one of my error checks returns 1, since this function will need to return an array. I'm still really new to C and am getting confused about how to approach this when I don't have objects or classes to work with.
void getData(unsigned char *buffer, int user){...}
This defines a function that does not return anything.
If you want to return some value, for instance an error code, you need a function returning an int.
int getData(unsigned char *buffer, int user){...}
To leave a function returning void before reaching its end, you can also use return, but without any argument.
As you've already noticed, you aren't passing the array itself to the function, only a pointer to it, so there is no reason to return it at all. You could return the pointer if you wanted, but there is no need to do so. The main function knows where the array is anyway.
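Put together, a caller-allocated version with an error code might look like the sketch below. The status constants, the extra size parameter and the placeholder body are assumptions for illustration; the real device I/O is in the pastebin code and is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

#define GETDATA_OK        0
#define GETDATA_BAD_ARGS  1
#define GETDATA_TOO_SMALL 2

#define REPORT_SIZE 2800    /* size taken from the malloc(2800) in the question */

/* The caller owns the buffer; the return value is only a status code. */
int getData(unsigned char *buffer, size_t size, int user)
{
    if (buffer == NULL || user < 0)
        return GETDATA_BAD_ARGS;    /* early exit when an error check fails */
    if (size < REPORT_SIZE)
        return GETDATA_TOO_SMALL;

    /* Placeholder for the device I/O, which is not shown here. */
    memset(buffer, 0, REPORT_SIZE);
    return GETDATA_OK;
}
```

The caller declares unsigned char buffer[2800]; calls getData(buffer, sizeof buffer, user); and checks the result against GETDATA_OK before using the data, so nothing has to be freed afterwards.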
It's generally considered a good habit to keep main as short as possible. You can also divide getData into smaller functions which would make the code more readable. You could for instance make every chunk marked by # pragma mark - ... a separate function. Or even better, see whether there are any parts of the program that are doing the same or similar thing. Then you can generalize this functionality into one function and use it multiple times.

Resources