I am trying to code a C project to simulate air traffic control in an airline. There is a Plane struct that has to have a mutex lock and cond.
This is the struct:
typedef struct Plane {
int ID;
int arrival_time;
pthread_mutex_t lock;
pthread_cond_t cond;
}Plane;
And there are landing/departing functions that create and use Plane structures. However, I do not know how to initialize a struct's mutex lock inside a function.
void landing_func(int arrival_time)
{
Plane plane;
plane.ID = ++plane_id;
plane.arrival_time = arrival_time;
// initialize lock and cond here
};
However, I am not sure how to do that. I also declare a function that creates and returns a Plane*
Plane* createPlane(int arrival_time){
Plane* plane;
plane = (Plane*)malloc(sizeof(Plane));
plane->arrival_time = arrival_time;
plane->ID = ++plane_id;
pthread_mutex_init(&plane->lock, NULL);
pthread_cond_init(&plane->cond, NULL);
return plane;
};
I am not sure about correctness of this initialization.
A minor question: Can I return Plane instead of Plane*? I need Plane in other functions instead of Plane*.
However, I do not know how to initialize a struct's mutex lock inside a function.
It's unclear what your confusion is.
The mutex and condition variable initialization functions require as first argument a pointer to the object to initialize, which is perfectly natural. Their second arguments can be null pointers if you want default attributes, which you probably do.
Given
Plane plane;
, as in your first example, the mutex you want to initialize is plane.lock. The address-of operator will produce a pointer to it: &plane.lock (or if you're uncertain about precedence and too lazy to look it up, then &(plane.lock) avoids the precedence question). Thus, you might initialize it with
pthread_mutex_init(&plane.lock, NULL);
Similar for the CV.
I guess the main thing to appreciate is that the mutex being a member of a structure does not constitute a special case. Objects that are structure members or array elements have the same properties as objects of the same type that are not contained in such aggregates. The difference is simply in how you reference such objects.
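As a minimal sketch of the point above, the first example could initialize the members in place like this. The helper names plane_init and plane_destroy are hypothetical (they do not appear in the question), and the ID is passed in rather than taken from the global plane_id counter so that the snippet stands alone.

```c
#include <assert.h>
#include <pthread.h>

typedef struct Plane {
    int ID;
    int arrival_time;
    pthread_mutex_t lock;
    pthread_cond_t cond;
} Plane;

/* Hypothetical helper: initializes a caller-owned Plane in place.
   Returns 0 on success, or the error number from the pthread call. */
int plane_init(Plane *plane, int id, int arrival_time)
{
    int rc;

    assert(plane != NULL);
    plane->ID = id;
    plane->arrival_time = arrival_time;

    /* NULL attributes request the defaults. */
    rc = pthread_mutex_init(&plane->lock, NULL);
    if (rc != 0)
        return rc;

    rc = pthread_cond_init(&plane->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&plane->lock); /* undo the first step */
        return rc;
    }
    return 0;
}

/* Hypothetical counterpart: releases what plane_init set up. */
void plane_destroy(Plane *plane)
{
    assert(plane != NULL);
    pthread_cond_destroy(&plane->cond);
    pthread_mutex_destroy(&plane->lock);
}
```

Inside landing_func this would be called as plane_init(&plane, ++plane_id, arrival_time), with plane_destroy(&plane) before the function returns.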
I also declare a function that creates and returns a Plane* [...] I am not sure about correctness of this initialization.
It's fine, as you should also be able to infer from my previous comments.
A minor question: Can I return Plane instead of Plane*? I need Plane
in other functions instead of Plane*.
I take it that you're referring specifically to the second example, where you allocate space dynamically for the Plane. Yes, you can return a Plane instead of a Plane *, supposing that you adjust the function's return type, but that defeats the purpose of the dynamic allocation. Moreover, if you do that naively (just return *plane;, for example) then you will leak memory.
But it is extremely important to understand that when you pass or return an object, you are delivering a copy of that object to the receiver. Where once there was one, suddenly there are two.* Additionally, it is a shallow copy, so the two might not be wholly independent. That can be ok under some circumstances, but yours are not such circumstances. I guarantee that passing your Plane structures around that way will cause you no end of grief in your particular program, on account of its reliance on the mutex and CV belonging to each. There are other gotchas there, too, but the mutex and CV are at the center of the most glaring ones.
Do yourself a tremendous favor, and adjust to exchanging Planes via pointer instead of by copying. It may take some adjustment, and that mode has its own issues to contend with, but trust me: the other way leads to disaster for you.
* This applies just as much to pointers as to any other kind of object, but if you pass a copy of a pointer then only the pointer itself is copied, and the copy points to the same object as the original does.
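To illustrate exchanging Planes by pointer, here is a rough sketch in which every function receives a Plane * and therefore works with the one mutex and CV that belong to that plane. The names plane_create, plane_delay and plane_free are made up for this example, and the error handling is only one possible choice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct Plane {
    int ID;
    int arrival_time;
    pthread_mutex_t lock;
    pthread_cond_t cond;
} Plane;

/* Hypothetical constructor: returns NULL if allocation or init fails. */
Plane *plane_create(int id, int arrival_time)
{
    Plane *plane = malloc(sizeof *plane);
    if (plane == NULL)
        return NULL;

    plane->ID = id;
    plane->arrival_time = arrival_time;
    if (pthread_mutex_init(&plane->lock, NULL) != 0) {
        free(plane);
        return NULL;
    }
    if (pthread_cond_init(&plane->cond, NULL) != 0) {
        pthread_mutex_destroy(&plane->lock);
        free(plane);
        return NULL;
    }
    return plane;
}

/* Works on the caller's Plane, not a copy, so the lock that is taken
   here is the same lock every other holder of the pointer sees. */
void plane_delay(Plane *plane, int minutes)
{
    assert(plane != NULL);
    pthread_mutex_lock(&plane->lock);
    plane->arrival_time += minutes;
    pthread_cond_signal(&plane->cond); /* wake one waiter, if any */
    pthread_mutex_unlock(&plane->lock);
}

/* Hypothetical destructor: the single place where the Plane is freed. */
void plane_free(Plane *plane)
{
    if (plane == NULL)
        return;
    pthread_cond_destroy(&plane->cond);
    pthread_mutex_destroy(&plane->lock);
    free(plane);
}
```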
Related
Working on my C muscle lately and looking through the many libraries I've been working with has certainly given me a good idea of what is good practice. One thing that I have NOT seen is a function that returns a struct:
something_t make_something() { ... }
From what I've absorbed this is the "right" way of doing this:
something_t *make_something() { ... }
void destroy_something(something_t *object) { ... }
The architecture in code snippet 2 is FAR more popular than snippet 1. So now I ask, why would I ever return a struct directly, as in snippet 1? What differences should I take into account when I'm choosing between the two options?
Furthermore, how does this option compare?
void make_something(something_t *object)
When something_t is small (read: copying it is about as cheap as copying a pointer) and you want it to be stack-allocated by default:
something_t make_something(void);
something_t stack_thing = make_something();
something_t *heap_thing = malloc(sizeof *heap_thing);
*heap_thing = make_something();
When something_t is large or you want it to be heap-allocated:
something_t *make_something(void);
something_t *heap_thing = make_something();
Regardless of the size of something_t, and if you don’t care where it’s allocated:
void make_something(something_t *);
something_t stack_thing;
make_something(&stack_thing);
something_t *heap_thing = malloc(sizeof *heap_thing);
make_something(heap_thing);
This is almost always about ABI stability. Binary stability between versions of the library. In the cases where it is not, it is sometimes about having dynamically sized structs. Rarely it is about extremely large structs or performance.
It is exceedingly rare that allocating a struct on the heap and returning it is nearly as fast as returning it by-value. The struct would have to be huge.
Really, speed is not the reason behind technique 2, return-by-pointer, instead of return-by-value.
Technique 2 exists for ABI stability. If you have a struct and your next version of the library adds another 20 fields to it, consumers of your previous version of the library are binary compatible if they are handed pre-constructed pointers. The extra data beyond the end of the struct they know about is something they don't have to know about.
If you return it on the stack, the caller is allocating the memory for it, and they must agree with you on how big it is. If your library updated since they last rebuilt, you are going to trash the stack.
Technique 2 also permits you to hide extra data both before and after the pointer you return (which versions appending data to the end of the struct is a variant of). You could end the structure with a variable sized array, or prepend the pointer with some extra data, or both.
If you want stack-allocated structs in a stable ABI, almost all functions that talk to the struct need to be passed version information.
So
something_t make_something(unsigned library_version) { ... }
where library_version is used by the library to determine what version of something_t it is expected to return and it changes how much of the stack it manipulates. This isn't possible using standard C, but
void make_something(something_t* here) { ... }
is. In this case, something_t might have a version field as its first element (or a size field), and you would require that it be populated prior to calling make_something.
Other library code taking a something_t would then query the version field to determine what version of something_t they are working with.
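A rough sketch of the size-field variant described above. The fields a and b, the SOMETHING_V1_SIZE macro and the return convention are assumptions made for illustration; they do not come from any real library.

```c
#include <assert.h>
#include <stddef.h>

typedef struct something_t {
    size_t size; /* caller sets this to sizeof its own something_t */
    int a;       /* present since the hypothetical "version 1" */
    int b;       /* hypothetical field added in "version 2" */
} something_t;

/* Size of the struct as an old, version-1 caller would have seen it. */
#define SOMETHING_V1_SIZE offsetof(something_t, b)

/* Fills in only the fields that the caller's struct is big enough to
   hold. Returns 0 on success, -1 if the size field is implausible. */
int make_something(something_t *here)
{
    assert(here != NULL);
    if (here->size < SOMETHING_V1_SIZE)
        return -1;

    here->a = 1;
    if (here->size >= sizeof(something_t))
        here->b = 2; /* only touch b when the caller knows about it */
    return 0;
}
```

A caller built against the current header writes something_t s = { sizeof s }; before calling make_something(&s). A caller built against the older header would have stored the smaller size, and the library would leave the memory past a alone.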
As a rule of thumb, you should never pass struct objects by value. In practice, it will be fine to do so as long as they are smaller or equal to the maximum size that your CPU can handle in a single instruction. But stylistically, one typically avoids it even then. If you never pass structs by value you can later on add members to the struct and it won't affect performance.
I think that void make_something(something_t *object) is the most common way to use structures in C. You leave the allocation to the caller. It is efficient but not pretty.
However, object-oriented C programs use something_t *make_something() since they are built with the concept of opaque type, which forces you to use pointers. Whether the returned pointer points at dynamic memory or something else depends on the implementation. OO with opaque type is often one of the most elegant and best ways to design more complex C programs, but sadly, few C programmers know/care about it.
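For readers who have not met the opaque-type pattern, a minimal sketch follows. The counter type and its three functions are invented for this example; in a real project the first part would live in a header and the second in a single .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what the public header would contain ---- */
typedef struct counter counter;      /* incomplete type: layout hidden */
counter *counter_create(int start);
int counter_next(counter *c);
void counter_destroy(counter *c);

/* ---- what the private .c file would contain ---- */
struct counter {
    int value; /* callers cannot see or depend on this member */
};

counter *counter_create(int start)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c; /* NULL if allocation failed */
}

/* Returns the current value, then advances the counter by one. */
int counter_next(counter *c)
{
    assert(c != NULL);
    return c->value++;
}

void counter_destroy(counter *c)
{
    free(c); /* free(NULL) is a no-op */
}
```

Because callers only ever hold a counter *, members can be added to struct counter later without recompiling them.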
Some pros of the first approach:
Less code to write.
More idiomatic for the use case of returning multiple values.
Works on systems that don't have dynamic allocation.
Probably faster for small or smallish objects.
No memory leak due to forgetting to free.
Some cons:
If the object is large (say, a megabyte) , may cause stack overflow, or may be slow if compilers don't optimize it well.
May surprise people who learned C in the 1970s when this was not possible, and haven't kept up to date.
Does not work with objects that contain a pointer to a part of themself.
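The last con is easy to overlook, so here is a small sketch of it. The buffer type and both function names are hypothetical. After a by-value copy, the copy's cursor still points into the original's storage, not into its own.

```c
#include <assert.h>
#include <string.h>

typedef struct buffer {
    char storage[16];
    char *cursor; /* points at a position inside storage */
} buffer;

/* Initializes the buffer in place so that cursor refers to this
   object's own storage. */
void buffer_init(buffer *b)
{
    assert(b != NULL);
    memset(b->storage, 0, sizeof b->storage);
    b->cursor = b->storage;
}

/* Returns nonzero if cursor points at the start of this object's own
   storage, zero if it points somewhere else. */
int buffer_cursor_is_own(const buffer *b)
{
    assert(b != NULL);
    return b->cursor == b->storage;
}
```

With buffer a; buffer_init(&a); buffer copy = a; the check succeeds for a but fails for copy, because copy.cursor was copied bit for bit and still refers to a.storage. Returning such a struct by value has the same effect.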
I'm somewhat surprised.
The difference is that example 1 creates a structure on the stack, while example 2 creates it on the heap. In C, or C++ code which is effectively C, it's idiomatic and convenient to create most objects on the heap. In C++ it is not; mostly they go on the stack. The reason is that if you create an object on the stack, the destructor is called automatically; if you create it on the heap, it must be called explicitly. So it's a lot easier to ensure there are no memory leaks and to handle exceptions if everything goes on the stack. In C, the destructor must be called explicitly anyway, and there's no concept of a special destructor function (you have destructors, of course, but they are just normal functions with names like destroy_myobject()).
Now the exception in C++ is for low-level container objects, e.g. vectors, trees, hash maps and so on. These do retain heap members, and they have destructors. Now most memory-heavy objects consist of a few immediate data members giving sizes, ids, tags and so on, and then the rest of the information in STL structures, maybe a vector of pixel data or a map of English word / value pairs. So most of the data is in fact on the heap, even in C++.
And modern C++ is designed so that this pattern
#include <string>
#include <vector>
struct big
{
    std::vector<double> observations; // thousands of observations
    int station_x; // a bit of data associated with them
    int station_y;
    std::string station_name;
};
big retrieveobservations(int a, int b, int c)
{
    big answer;
    // lots of code to fill in the structure here
    return answer;
}
void high_level()
{
    big myobservations = retrieveobservations(1, 2, 3);
}
Will compile to pretty efficient code. The large observation member won't generate unnecessary makework copies.
Unlike some other languages (like Python), C does not have the concept of a tuple. For example, the following is legal in Python:
def foo():
    return 1, 2

x, y = foo()
print(x, y)
The function foo returns two values as a tuple, which are assigned to x and y.
Since C doesn't have the concept of a tuple, it's inconvenient to return multiple values from a function. One way around this is to define a structure to hold the values, and then return the structure, like this:
#include <stdio.h>
typedef struct { int x, y; } stPoint;
stPoint foo( void )
{
stPoint point = { 1, 2 };
return point;
}
int main( void )
{
stPoint point = foo();
printf( "%d %d\n", point.x, point.y );
}
This is but one example where you might see a function return a structure.
Generally it is preferred to pass pointer to structure to a function in C, in order to avoid copying during function call. This has an unwanted side effect that the called function can modify the elements of the structure inadvertently. What is a good programming practice to avoid such errors without compromising on the efficiency of the function call ?
Passing a pointer-to-const is the obvious answer:
void foo(const struct some_struct *p)
That will prevent you from modifying the immediate members of the struct inadvertently. That's what const is for.
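The word "immediate" matters here. A small sketch, with made-up members count and label, shows what const does and does not protect:

```c
#include <assert.h>
#include <stddef.h>

struct some_struct {
    int count;
    char *label; /* pointer member: the struct owns the pointer only */
};

int foo(const struct some_struct *p)
{
    assert(p != NULL && p->label != NULL);

    /* p->count = 0;    would not compile: member of a const struct */
    /* p->label = NULL; would not compile for the same reason       */

    /* This compiles: const applies to the pointer stored in the
       struct, not to the characters it points at. */
    p->label[0] = 'X';

    return p->count;
}
```

If the pointed-to data must be protected as well, the member itself has to be declared as const char *label.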
In fact, your question sounds like a copy-paste from some quiz card, with const being the expected answer.
In general, when it comes to simple optimizations like what you've described, it is often preferable to use a pointer-to-struct rather than passing a struct itself, as passing a whole struct means more overhead from extra data being copied onto the call stack.
The example below is a fairly common approach:
#include <errno.h>
typedef struct myStruct {
int i;
char c;
} myStruct_t;
int myFunc(myStruct_t* pStruct) {
if (!pStruct) {
return EINVAL;
}
// Do some stuff
return 0;
}
If you want to avoid modifying the data passed to the function, just make sure that the data is immutable by modifying the function prototype.
int myFunc(const myStruct_t* pStruct)
You will also benefit from reading up on "const correctness".
A very common idiom, particularly in unix/posix style system code is to have the caller allocate a struct, and pass a pointer to that struct through the function call.
This is a little different from what I think you're asking about, where you pass data into a function with a struct (and where, as others have mentioned, you may want the function to treat the struct as const). In these cases, the struct is empty (or only partially filled) before the function call. The caller allocates an empty struct and then passes a pointer to it. Probably different from your general question, but relevant to the discussion, I think.
This accomplishes a couple of handy things. It avoids copying a possibly large structure, and it lets the caller fill in some fields and the callee fill in others (giving an effective shared space for communication).
The most important aspect of this idiom is that the caller has full control over the allocation of the struct. It can put it on the stack or the heap, or reuse the same one repeatedly, but wherever it comes from, the caller is responsible for handling the memory.
This is one of the problems with passing around struct pointers: you can easily lose track of who allocated the struct and whose responsibility it is to free it. This idiom gives you the advantage of not having to copy the struct around, while making it clear who has the job of freeing the memory.
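A minimal sketch of the caller-allocates idiom, assuming a hypothetical request struct in which the caller fills in id and the callee fills in result:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct request {
    int id;     /* filled in by the caller before the call */
    int result; /* filled in by the callee */
} request;

/* The callee never allocates or frees: it only reads and writes the
   struct the caller handed over. Returns 0 on success or EINVAL if
   the caller supplied a negative id. */
int process_request(request *req)
{
    assert(req != NULL); /* passing NULL is a programming error here */
    if (req->id < 0)
        return EINVAL;

    req->result = req->id * 2; /* stand-in for real work */
    return 0;
}
```

The caller is free to declare request r = { 21, 0 }; on the stack, or to malloc one and free it afterwards; process_request(&r) works the same either way.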
My question regards parameter passing to a thread.
I have a function Foo that operates on an array, say arrayA. To speed things up, Foo is coded to operate at both directions on the array. So, Foo takes arrayA and an integer X as parameters. Depending on the value of X, it operates in forward or reverse direction.
I'm looking to avoid making global use of "arrayA" and "X". So, I'm after passing "arrayA" and "X" as parameters to Foo, and creating two threads to run Foo-- one in each direction. Here's what I did:
typedef struct {int* arrayA[MSIZE]; int X; } TP; //arrayPack=TP
void Foo (void *tP) {
TP *tp = (TP*)tP; // cast the parameter tP back to what it is and assign to pointer *tp
int x;
printf("\nX: %d", tp->X);
printf("\n arrayA: "); for (x=0; x<tp->arrayA.size(); printf("%d ", aP->arrayA[x]), x++);
} // end Foo
void callingRouting () {
int* arrayA[MSIZE] = {3,5,7,9};
TP tp; tp.arrayA=arrayA;
tp.X=0; _beginthread(Foo, 0, (void*)&tp); // process -- forward
tp.X=1; _beginthread(Foo, 0, (void*)&tp); // process -- reverse
}
The values aren't passed: my array prints empty and the value of X isn't printed correctly. What am I missing?
I would also appreciate suggestions on some readings on this-- passing parameters to threads-- especially on passing the resources shared by the threads. Thanks.
You're passing the address of a stack variable to your thread function. Once callingRouting exits, the TP structure no longer exists. The data needs to be either global or allocated on the heap.
However, you'll also need a separate TP for each thread, as the change tp.X=1 may be visible to both threads.
There are problems there but how you see them depends on how the OS decides to schedule the threads on each execution.
The first thing to remember is that you have a thread that is starting up two other threads. Since you do not have any control over the processor time slice and how it is allocated, you cannot be sure when the two other threads will start, or even in which order they will start.
Since you are using an array on the stack that is local to the function callingRouting(), as soon as that function returns, its local variables are out of scope and can no longer be depended on.
So there are a couple of ways to do this.
The first is to use global or static memory variables for these data items being passed to the threads.
The other is to start both threads and then wait for both to complete before continuing.
Since you do not know when, or in what order, the threads will start, you really should use two different TP variables, one for each thread. Otherwise you run the risk that the time slices are allocated such that both threads see the same TP data.
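Putting both suggestions together (one TP per thread, and waiting for both threads before the locals go away) might look like the sketch below. It swaps the question's _beginthread for POSIX threads so that the join step is explicit; the length and result members, the digit-folding work done in Foo, and the run_both_directions wrapper are all assumptions added for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    const int *array; /* shared, read-only input */
    size_t length;
    int reverse;      /* 0 = forward, 1 = reverse (the question's X) */
    long result;      /* written only by the thread that owns this TP */
} TP;

/* Thread entry point: folds the elements into a number in visiting
   order, so the result reveals the direction that was used. */
static void *Foo(void *arg)
{
    TP *tp = arg;
    long acc = 0;

    for (size_t i = 0; i < tp->length; i++) {
        size_t idx = tp->reverse ? tp->length - 1 - i : i;
        acc = acc * 10 + tp->array[idx];
    }
    tp->result = acc;
    return NULL;
}

/* Starts one thread per direction, each with its own TP, and joins
   both before the stack-allocated TPs go out of scope. */
int run_both_directions(const int *array, size_t length,
                        long *forward, long *backward)
{
    TP fwd = { array, length, 0, 0 };
    TP rev = { array, length, 1, 0 };
    pthread_t t1, t2;

    assert(array != NULL && forward != NULL && backward != NULL);

    if (pthread_create(&t1, NULL, Foo, &fwd) != 0)
        return -1;
    if (pthread_create(&t2, NULL, Foo, &rev) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    *forward = fwd.result;
    *backward = rev.result;
    return 0;
}
```

With the question's data {3, 5, 7, 9}, the forward thread produces 3579 and the reverse thread 9753, regardless of which thread the scheduler happens to run first.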
void my_cool_function()
{
obj_scene_data scene;
obj_scene_data *scene_ptr = &scene;
parse_obj_scene(scene_ptr, "test.txt");
}
Why would I ever create a pointer to a local variable as above if I can just do
void my_cool_function()
{
obj_scene_data scene;
parse_obj_scene(&scene, "test.txt");
}
Just in case it's relevant:
int parse_obj_scene(obj_scene_data *data_out, char *filename);
In the specific code you linked, there isn't really a reason.
It could be functionally necessary if you have a function taking an obj_scene_data **. You can't do &&scene, so you'd have to create a local variable before passing the address on.
Yes absolutely you can do this for many reasons.
For example if you want to iterate over the members of a stack allocated array via a pointer.
Or in other cases if you want to point sometimes to one memory address and other times to another memory address. You can setup a pointer to point to one or the other via an if statement and then later use your common code all within the same scope.
Typically in these cases your pointer variable goes out of scope at the same time as your stack allocated memory goes out of scope. There is no harm if you use your pointer within the same scope.
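Both uses can be sketched in a few lines. The function name and the array contents are invented for the example; the pointer is chosen once and then shared by the common loop, and it never outlives the arrays it points into.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first n elements of one of two stack-allocated arrays.
   n must not exceed the array length of 3. */
int sum_selected(int use_first, size_t n)
{
    int first[3] = { 1, 2, 3 };
    int second[3] = { 10, 20, 30 };
    int total = 0;

    assert(n <= 3);

    /* Pick the target once; the loop below does not care which. */
    const int *p = use_first ? first : second;
    const int *end = p + n;

    /* Iterate over the stack array through the pointer. */
    for (; p != end; ++p)
        total += *p;

    return total;
}
```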
In your exact example there is no good reason to do it.
If the function accepts a NULL pointer as input, and you want to decide whether to pass NULL based on some condition, then a pointer to a stack variable is useful to avoid having to call the same function in separate code paths, especially if the rest of the parameters are the same otherwise. For example, instead of this:
void my_function()
{
obj_data obj = {0};
if( some condition )
other_function(&obj, "test.txt");
else
other_function(NULL, "test.txt");
}
You could do this:
void my_function()
{
obj_data obj = {0};
obj_data *obj_ptr = (condition is true) ? &obj : NULL;
other_function(obj_ptr, "test.txt");
}
If parse_obj_scene() is a function there may be no good reason to create a separate pointer. But if for some unholy reason it is a macro it may be necessary to reassign the value to the pointer to iterate over the subject data.
Not in terms of semantics, and in fact there is a more general point that you can replace all local variables with function calls with no change in semantics, and given suitable compiler optimisations, equal efficiency. (see section 2.3 of "Lambda: The Ultimate Imperative".)
But the point of writing code is to communicate with the next person who maintains it, and in an imperative language without tail call optimisation, it is usual to use local variables for things which are iterated over, for automatic structures, and to simplify expressions. So if it makes the code more readable, then use it.
I have a char pointer which would be used to store a string. It is used later in the program.
I have declared and initialized like this:
char * p = NULL;
I am just wondering if this is good practice. I'm using gcc 4.3.3.
Yes, it's a good idea.
Google Code Style recommends:
To initialize all your variables even if you don't need them right now.
Initialize pointers with NULL, ints with 0 and floats with 0.0 -- just for better readability.
int i = 0;
double x = 0.0;
char* c = NULL;
You cannot store a string in a pointer.
Your definition of mgt_dev_name is good, but you need to point it somewhere with space for your string. Either malloc() that space or use a previously defined array of characters.
char *mgt_dev_name = NULL;
char data[4200];
/* ... */
mgt_dev_name = data; /* use array */
/* ... */
mgt_dev_name = malloc(4200);
if (mgt_dev_name != NULL) {
/* use malloc'd space */
free(mgt_dev_name);
} else {
/* error: not enough memory */
}
It is good practice to initialize all variables.
If you're asking whether it's necessary, or whether it's a good idea to initialize the variable to NULL before you set it to something else later on: It's not necessary to initialize it to NULL, it won't make any difference for the functionality of your program.
Note that in programming, it's important to understand every line of code - why it's there and what exactly it's doing. Don't do things without knowing what they mean or without understanding why you're doing them.
Another option is to not define the variable until the place in your code where you have access to its initial value. So rather than doing:
char *name = NULL;
...
name = initial_value;
I would change that to:
...
char *name = initial_value;
The compiler will then prevent you from referencing the variable in the part of the code where it has no value. Depending on the specifics of your code, this may not always be possible (for example, the initial value is set in an inner scope but the variable has a different lifetime). Even so, moving the definition as late as possible in the code prevents errors.
That said, this is only allowed starting with the c99 standard (it's also valid C++). To enable c99 features in gcc, you'll need to either do:
gcc -std=gnu99
or if you don't want gcc extensions to the standard:
gcc -std=c99
No, it is not a good practice, if I understood your context correctly.
If your code actually depends on the mgt_dev_name having the initial value of a null-pointer, then, of course, including the initializer into the declaration is a very good idea. I.e. if you'd have to do this anyway
char *mgt_dev_name;
/* ... and soon after */
mgt_dev_name = NULL;
then it is always a better idea to use initialization instead of assignment
char *mgt_dev_name = NULL;
However, initialization is only good when you can initialize your object with a meaningful useful value. A value that you will actually need. In general case, this is only possible in languages that allow declarations at any point in the code, C99 and C++ being good examples of such languages. By the time you need your object, you usually already know the appropriate initializer for that object, and so can easily come up with an elegant declaration with a good initializer.
In C89/90, on the other hand, declarations can only be placed at the beginning of the block. At that point, in the general case, you won't have meaningful initializers for all of your objects. Should you just initialize them with something, anything (like 0 or NULL) just to have them initialized? No!!! Never do meaningless things in your code. It will not improve anything, regardless of what various "style guides" might tell you. In reality, meaningless initialization might actually cover up bugs in your code, making them harder to discover and fix.
Note that even in C89/90 it is always beneficial to strive for better locality of declarations. I.e. a well-known good practice guideline states: always make your variables as local as they can be. Don't pile up all your local object declarations at the very beginning of the function, but rather move them to the beginning of the smallest block that envelops the entire lifetime of the object as tightly as possible. Sometimes it might even be a good idea to introduce a fictive, otherwise unnecessary block just to improve the locality of declarations. Following this practice will help you to provide good useful initializers to your objects in many (if not most) cases. But some objects will remain uninitialized in C89/90 just because you won't have a good initializer for them at the point of declaration. Don't try to initialize them with "something" just for the sake of having them initialized. This will achieve absolutely nothing good, and might actually have negative consequences.
Note that some modern development tools (like MS Visual Studio 2005, for example) will catch run-time access to uninitialized variables in the debug version of the code. I.e. these tools can help you to detect situations when you access a variable before it had a chance to acquire a meaningful value, indicating a bug in the code. But by performing unconditional premature initialization of your variables you essentially kill that capability of the tool and sweep these bugs under the carpet.
This topic has already been discussed here:
http://www.velocityreviews.com/forums/t282290-how-to-initialize-a-char.html
It refers to C++, but it might be useful for you, too.
There are several good answers to this question, one of them has been accepted. I'm going to answer anyway in order to expand on practicalities.
Yes, it is good practice to initialize pointers to NULL, as well as set pointers to NULL after they are no longer needed (i.e. freed).
In either case, it's very practical to be able to test a pointer prior to dereferencing it. Let's say you have a structure that looks like this:
struct foo {
int counter;
unsigned char ch;
char *context;
};
You then write an application that spawns several threads, all of which operate on a single allocated foo structure (safely) through the use of mutual exclusion.
Thread A gets a lock on foo, increments counter and checks for a value in ch. It does not find one, so it does not allocate (or modify) context. Instead, it stores a value in ch so that thread B can do this work instead.
Thread B sees that counter has been incremented, notes a value in ch but isn't sure if thread A has done anything with context. If context was initialized as NULL, thread B no longer has to care what thread A did; it knows context is safe to dereference (if not NULL) or allocate (if NULL) without leaking.
Thread B does its business, thread A reads its context, frees it, then re-initializes it to NULL.
The same reasoning applies to global variables, without the use of threads. It's good to be able to test them in various functions prior to dereferencing them (or attempting to allocate them, thus causing a leak and undefined behavior in your program).
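A single-threaded sketch of that convention, using the struct foo from above. The two helper names are hypothetical and the locking from the threaded scenario is left out for brevity; the point is only that NULL reliably means "nothing allocated".

```c
#include <assert.h>
#include <stdlib.h>

struct foo {
    int counter;
    unsigned char ch;
    char *context;
};

/* Allocates context only if it is not already there, so calling this
   twice cannot leak. Returns 0 on success, -1 on allocation failure. */
int foo_ensure_context(struct foo *f, size_t size)
{
    assert(f != NULL && size > 0);
    if (f->context == NULL) {
        f->context = calloc(1, size);
        if (f->context == NULL)
            return -1;
    }
    return 0;
}

/* Frees context and resets it to NULL, so later tests stay valid and
   a second call is harmless (free(NULL) does nothing). */
void foo_release_context(struct foo *f)
{
    assert(f != NULL);
    free(f->context);
    f->context = NULL;
}
```

This only works if the struct starts out as struct foo f = { 0, 0, NULL }; (or equivalent); an uninitialized context would defeat both helpers.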
When it gets silly is when the scope of the pointer does not go beyond a single function. If you have a single function and can't keep track of the pointers within it, usually this means the function should be re-factored. However, there is nothing wrong with initializing a pointer in a single function, if only to keep uniform habits.
The only time I've ever seen an 'ugly' case of relying on an initialized pointer (before and after use) is in something like this:
void my_free(void **p)
{
if (*p != NULL) {
free(*p);
*p = NULL;
}
}
Not only is dereferencing a type punned pointer frowned upon on strict platforms, the above code makes free() even more dangerous, because callers will have some delusion of safety. You can't rely on a practice 'wholesale' unless you are sure every operation is in agreement.
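One way to sidestep the void ** conversion, offered here as an alternative sketch rather than as what the answer recommends, is a macro that operates on the caller's own pointer variable. FREE_AND_NULL and the demo function are names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the pointer variable p and resets that same variable to NULL.
   Being a macro, it works for any object pointer type without a cast.
   Note that p is evaluated twice, so it must be a plain variable. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Small demonstration; returns 0 on success, -1 if malloc failed. */
int free_and_null_demo(void)
{
    char *s = malloc(8);
    if (s == NULL)
        return -1;

    FREE_AND_NULL(s);
    assert(s == NULL);

    FREE_AND_NULL(s); /* harmless: free(NULL) is a no-op */
    return 0;
}
```

The caveat from the answer still applies: only the named variable is reset. Any other copy of the pointer is left dangling, so the macro is a convenience, not a safety guarantee.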
Probably a lot more information than you actually wanted.
Preferred styles:
in C: char * c = NULL;
in C++: char * c = 0;
My rationale is that if you don't initialize with NULL, and then forget to initialize altogether, the kinds of bugs you will get in your code when dereferencing are much more difficult to trace due to the potential garbage held in memory at that point. On the other hand, if you do initialize to NULL, most of the time you will only get a segmentation fault, which is better, considering the alternative.
Initializing variables even when you don't need them initialized right away is a good practice. Usually, we initialize pointers to NULL, int to 0 and floats to 0.0 as a convention.
int* ptr = NULL;
int i = 0;
float r = 0.0;
It is always good to initialize pointer variables in C++ as shown below:
int *iPtr = nullptr;
char *cPtr = nullptr;
Initializing as above helps in conditions like the ones below: a null pointer converts to false in a boolean context, so the test is well defined. Reading an uninitialized pointer instead is undefined behaviour, and compilers may warn about it:
if(iPtr){
//then do something.
}
if(cPtr){
//then do something.
}