The Linux kernel manpages declare the epoll_ctl procedure as follows:
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
As evident, the event parameter is declared as a pointer to the epoll_event struct.
The significance of said observation in the context of this question is that there is no const ahead of the pointer type declaration, and thus, the procedure appears to be permitted to modify the contents of the passed structure.
Is this an omission of sorts, or was the procedure made like that by design and we have to assume that the passed structure may indeed be modified inside the procedure?
I understand that the declaration is unambiguous here, but is there reason to believe this to be an omission?
I have also looked at the relevant source code in the kernel 4.6 tree, and I don't see much evidence that the procedure even intends to modify the structure.
Found a rather conclusive answer on the Linux mailing list. Quoting Davide Libenzi here, chief or sole author of "epoll":
From: Davide Libenzi <davidel <at> xmailserver.org>
Subject: Re: epoll_ctl and const correctness
Newsgroups: gmane.linux.kernel
Date: 2009-03-25 16:23:21 GMT
On Wed, 25 Mar 2009, nicolas sitbon wrote:
Currently, the prototype of epoll_ctl is :
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
I searched in the man of epoll_ctl and google, and it seems that the
structure pointed to by event isn't modify, valgrind confirms this
behaviour, so am I wrong? or the good prototype is
int epoll_ctl(int epfd, int op, int fd, struct epoll_event const *event);
According to the current ctl operations, yes. But doing that would prevent
other non-const operations to be added later on.
Davide
The takeaway is that even though de-facto behavior is not to modify the structure, the interface omits const modifier deliberately because other control operations may be added in the future through the same system call, necessitating a potentially modifiable structure pointed at by the event argument.
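To make the observation concrete, here is a minimal Linux-only sketch that snapshots an epoll_event, passes it to epoll_ctl with EPOLL_CTL_ADD, and compares the bytes afterwards. The helper name event_unchanged_after_add is made up for this illustration. It only demonstrates the de-facto behavior for one operation; given the quote above, callers should still pass a modifiable structure and not rely on it staying untouched.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a fresh pipe with a fresh
 * epoll instance and reports whether the kernel left our epoll_event alone.
 * Returns 1 if unchanged, 0 if changed, -1 if any setup step failed. */
int event_unchanged_after_add(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    struct epoll_event ev;
    memset(&ev, 0, sizeof ev);          /* zero everything, including unused union bytes */
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];

    struct epoll_event before;
    memcpy(&before, &ev, sizeof ev);    /* byte-for-byte snapshot */

    /* Note: the argument is a plain (non-const) pointer, as the prototype demands. */
    int rc = epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev);
    int result = (rc == 0) ? (memcmp(&before, &ev, sizeof ev) == 0) : -1;

    close(epfd);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A caller could check the result with assert(event_unchanged_after_add() == 1), keeping in mind that this says nothing about operations that may be added later.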
I should have checked the kernel mailing list first; apologies for adding what may be redundant information on SO. Leaving the question and this answer for posterity.
Here's the signature of pthread_setschedparam:
#include <pthread.h>
int pthread_setschedparam(pthread_t thread, int policy, const struct sched_param *param);
Will this piece of code result in unexpected behavior:
void schedule(const thread &t, int policy, int priority) {
sched_param params;
params.sched_priority = priority;
pthread_setschedparam(t.native_handle(), policy, &params);
}
It is completely unclear if the scope of params needs to be broader than the function call alone. When I see a function that takes in a pointer, it suggests (to me at least) that it's asking for ownership of it. Is this signature just badly designed? Should "sched_params params" live on the heap? Does it need to outlive the thread to stay valid? Can it be deleted?
I have no idea.
Thanks!
pthread_setschedparam sets the scheduling policy for the given thread. The parameters need not be alive after the call.
If the lifetime of the last argument mattered (as you put it, if pthread_setschedparam took ownership of it), that would have been documented explicitly. But the POSIX documentation for pthread_setschedparam says nothing of the sort.
The probable reason why it takes a pointer (instead of value) is that it's less expensive to pass a pointer than a struct.
When I see a function that takes in a pointer, it suggests (to me at least) that it's asking for ownership of it.
I don't jump straight there when I see a function that accepts a pointer parameter, and I don't think you should, either. Although it is important to be aware of the possibility, and you do well to look for documentation, there is a variety of reasons for a function to take a pointer parameter, among them:
the function accepts arrays via the parameter. This is surely the most common reason.
the function wants to modify an object specified to it by the caller (via the pointer). This is probably the second most common reason.
the function accepts a pointer to a structure or union of large or potentially-large size to lighten the function-call overhead
the function accepts a pointer to a structure or union because it conforms to interface conventions that accommodate ancient C compilers that did not accept structures and unions as arguments. This was normal for early C compilers, as it's the way the language was originally specified:
[T]he only operations you can perform on a structure are take its address with & and access one of its members. [... Structures] can not be passed to or returned from functions. [...] Pointers to structures do not suffer these limitations[.]
(Kernighan & Ritchie, The C Programming Language, 1st ed., section 6.2)
Standard C does not have those restrictions, but their effect can still be felt in some places.
That the function expects to take (and typically reassign) responsibility for freeing dynamically-allocated space to which the pointer points, or that it otherwise intends to make a copy of the pointer that survives the function's return, are way down the list. If a function intends to do one of those things, then I fully expect its documentation to indicate so in some manner.
Is this signature just badly designed?
No, I think its design is prompted by one or both of the latter two points from my list.
Should "sched_params params" live on the heap?
I would not expect that to be a requirement.
Does it need to outlive the thread to stay valid? Can it be deleted?
I do not think it needs to outlive the thread whose properties are set. In addition to my general interpretation of the interface, I read (weak) support for that position in the wording of the function's POSIX specification:
The pthread_setschedparam() function shall set the scheduling policy
and associated scheduling parameters for the thread whose thread ID is
given by thread to the policy and associated parameters provided in
policy and param, respectively.
(POSIX specification for pthread_setschedparam(); emphasis added)
The "provided in" language indicates to me (again, weakly) that the function uses the contents of the pointed-to structure, not the structure itself.
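As a sketch of that reading, the following C version of the question's function keeps the sched_param in automatic storage and lets it go out of scope as soon as the call returns. The helper names set_schedule and reapply_own_schedule are made up for this illustration; the second one simply reads the calling thread's current settings and applies them again, so it should not need special privileges.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper mirroring the question's schedule() function.
 * `params` is an automatic variable: under the interpretation above it only
 * has to exist for the duration of the pthread_setschedparam() call. */
int set_schedule(pthread_t thread, int policy, int priority)
{
    struct sched_param params = {0};
    params.sched_priority = priority;
    return pthread_setschedparam(thread, policy, &params);
}

/* Reads the calling thread's current policy and priority and applies them
 * again. Returns 0 on success, an error number otherwise. */
int reapply_own_schedule(void)
{
    int policy;
    struct sched_param current;

    int rc = pthread_getschedparam(pthread_self(), &policy, &current);
    if (rc != 0)
        return rc;
    return set_schedule(pthread_self(), policy, current.sched_priority);
}
```

Nothing here is allocated on the heap, and nothing has to outlive the thread whose properties are being set.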
On Linux, sched.h contains the definition of
int sched_rr_get_interval(pid_t pid, struct timespec * tp);
to get the time slice of a process. However the file shipping with OS X El Capitan doesn't hold that definition.
Is there an alternative for this on OS X?
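For reference, this is roughly how the Linux call mentioned above is used. It is a Linux-side sketch only, not an OS X answer, and the helper name rr_interval_ns is made up for this illustration.

```c
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Hypothetical helper: returns the round-robin interval reported for the
 * calling process in nanoseconds, or -1 if the call fails.
 * A pid of 0 means "the calling process". */
long long rr_interval_ns(void)
{
    struct timespec ts;

    if (sched_rr_get_interval(0, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```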
The APIs related to this stuff are pretty byzantine and poorly documented, but here's what I've found.
First, the datatypes related to RR scheduling seem to be in /usr/include/mach/policy.h, around line 155. There's this struct:
struct policy_rr_info {
...
integer_t quantum;
....
};
The quantum is, I think, the timeslice (not sure of units.) Then grepping around for this or related types defined in the same place, I found the file /usr/include/mach/mach_types.def, which says that the type struct thread_policy_t contains a field policy_rr_info_t on line 203.
Next, I found in /usr/include/mach/thread_act.h the public function thread_policy_get, which can retrieve information about a thread's policy into a struct thread_policy_t *.
So, working backwards, I think (but haven't tried at all) that you can:
Use the thread_policy_get() routine to return information about the thread's scheduling state into a thread_policy_t
That struct seems to have a policy_rr_info_t sub-substructure
That sub-structure should have a quantum field.
That field appears to be the timeslice, but I don't know about the units.
There are no man pages for this part of the API, but this Apple Developer page explains at least a little bit about how to use this API.
Note that this is all gleaned from just grepping the various kernel headers, and I've definitely not tried to use any of these APIs in any actual code.
Sorry if this has been asked before, I wasn't really even sure what to search for to come up with this.
When I create a typedef struct, I usually do something like this:
typedef struct myStruct {
int a;
int b;
struct myStruct *next;
} MyStruct;
So I declare it with MyStruct at the end. Then when I create functions that pass that in as a parameter, I write
int doSomething(MyStruct *ptr){
}
Yet I am collaborating with a friend on a project and I have come across his coding style, which is to also declare *MyStructP like this:
typedef struct myStruct {
int a;
int b;
struct myStruct *next;
} MyStructR, *MyStructP;
And then he uses MyStructP in his functions, so his parameters look like:
int doSomething(MyStructP)
So he doesn't have to use the * in the parameter list. This confused me because when I look at the parameter list, I always look for the * to determine if the arg is a pointer or not. On top of that, I am creating a function that takes in a struct I created and a struct he created, so my arg has the * and his does not. Ultra confusing!!
Can someone give insight/comparison/advice on the differences between the two? Pros? Cons? Which way is better or worse, or more widely used? Any information at all. Thanks!
It is generally considered poor style to hide pointers behind typedefs, unless they are meant to be opaque handles (for example SDL_GLContext is a void*).
This being not the case here, I agree with you that it's more confusing than helping.
The Linux kernel coding style says to avoid these kinds of typedefs:
Chapter 5: Typedefs
Please don't use things like "vps_t".
It's a mistake to use typedef for structures and pointers. When you see a
vps_t a;
in the source, what does it mean?
In contrast, if it says
struct virtual_container *a;
you can actually tell what "a" is.
Some people like to go with ideas from Hungarian Notation when they name variables. And some people take that concept further when they name types.
I think it's a matter of taste.
However, I think it obscures things (like in your example) because you'd have to dig up the declaration of the name in order to find its type. I prefer things to be obvious and explicit, and I would avoid such type names.
(And remember, typedef does not introduce a new type but merely a new name that aliases an existing type.)
The main good reason why people occasionally typedef pointers is to represent the type as a "black box object" to the programmer and to allow its implementation to more easily be changed in the future.
For example, maybe today the type is a pointer to a struct but tomorrow the type becomes an index into some table, a handle/key of some sort, or a file descriptor. Typedef'ing this way tells the programmer that they shouldn't try things they might normally do to a pointer such as comparing it against 0 / NULL, dereferencing it (e.g. - directly accessing members), incrementing it, etc., as their code may become broken in the future. Of course, using a naming convention, such as your friend did, that reveals and encodes that the underlying implementation actually is a pointer conflicts with that purpose.
The other reason to do this is to make this kind of error less likely:
MyStructR *ptr1, ptr2; /* ptr1 is a pointer, ptr2 is a struct */
MyStructP ptr3, ptr4;  /* both are pointers */
That's pretty weak sauce as the compiler will typically catch you misusing ptr2 later, but that is a reason given for doing this.
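A small sketch of that difference, using the typedef from the question. The macro IS_STRUCT_POINTER and the two function names are made up for this illustration, and the check relies on C11's _Generic.

```c
#include <assert.h>
#include <stddef.h>

typedef struct myStruct {
    int a;
    int b;
    struct myStruct *next;
} MyStructR, *MyStructP;

/* C11 _Generic: yields 1 when x has type MyStructR *, 0 for any other type. */
#define IS_STRUCT_POINTER(x) _Generic((x), MyStructR *: 1, default: 0)

/* Second declarator without the pointer typedef. */
int plain_second_is_pointer(void)
{
    MyStructR *ptr1 = NULL, ptr2 = {0};   /* the * binds to ptr1 only */
    (void)ptr1;
    (void)ptr2;
    return IS_STRUCT_POINTER(ptr2);       /* ptr2 is a struct, so this is 0 */
}

/* Second declarator with the pointer typedef. */
int typedef_second_is_pointer(void)
{
    MyStructP ptr3 = NULL, ptr4 = NULL;   /* both declarators are pointers */
    (void)ptr3;
    (void)ptr4;
    return IS_STRUCT_POINTER(ptr4);       /* ptr4 is a pointer, so this is 1 */
}
```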
I have a header and a sample application that uses it, all in C. I understand almost all of the logic of this software except for the following, which is the interesting part of the header:
struct A;
typedef struct A A;
in the C application this A is only used when declaring a pointer like this
A* aName;
I'm quite sure that this is a solution for just including A in the scope/namespace and give just a name to a basically void pointer, because this kind of pointer is only used to handle some kind of data, it is more like some namespace sugar.
What could this be for?
You're correct that it's like a void pointer, in that void is an incomplete type, and in this file A is also an incomplete type. About all you can do with incomplete types is pass around pointers to them.
It has one advantage over void* in this file, that it's a different and incompatible type from some other bit of code that has done the same thing with B. So you get a bit of type safety. If A is windowHandle and B is jpgHandle, then you can't pass the wrong one to a function.
It has an advantage over void* in the .c file that defines the functions that accept an A* -- that file can contain a definition of struct A, and give A whatever members it wants, that the first file doesn't need to know about.
However, you say there are no other mentions of A in any header file, which means there are no functions that accept or return it. You also say that the only use of A in your source file is to declare pointers -- I wonder where the values of those pointers come from, if any.
If all that happens is that someone defines an uninitialized A* and never uses it, then clearly this is a remnant of some old code, or the start of some code that never got written, and it shouldn't be in the file at all.
Finally, if the real type is called something a bit less stupid than A, then the name might give a clue to its use.
I assume struct A is a forward declaration. It most likely is defined in one of the .c-files.
That way, struct A's members are private to the module defining it.
This is an example of an opaque pointer, which is useful for passing handles. See http://en.wikipedia.org/wiki/Opaque_pointer for some further info. What may be interesting here from a C++ perspective is the notion that you can define a class with a member that is a pointer to an (as yet) undefined struct. Although this struct is not yet defined in the header, some later cpp implementation gives it a body, and the compiler does the rest. This strategy is also called the Pimpl idiom (about which you will find plenty on the internet). Microsoft discusses it briefly at http://msdn.microsoft.com/en-us/library/hh438477.aspx.
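A minimal single-file sketch of the opaque-pointer pattern described in these answers. The functions a_create, a_value and a_destroy and the member value are made up for this illustration; in a real project the first part would live in the header and the second part in one .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what the header would expose --- */
struct A;               /* incomplete type: size and members are unknown here */
typedef struct A A;

A   *a_create(int value);
int  a_value(const A *a);
void a_destroy(A *a);

/* --- what one .c file would keep private --- */
struct A {
    int value;          /* hypothetical member, invisible to users of the header */
};

A *a_create(int value)
{
    A *a = malloc(sizeof *a);
    if (a != NULL)
        a->value = value;
    return a;
}

int a_value(const A *a)
{
    assert(a != NULL);
    return a->value;
}

void a_destroy(A *a)
{
    free(a);
}
```

Code that sees only the header part can declare and pass around A* values, but it cannot dereference them or take sizeof(A), which is what keeps the members private.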
I was perusing some code using arbitrary-length integers using the GNU Multi-Precision (GMP) library code. The type for a MP integer is mpz_t as defined in gmp.h header file.
But, I've some questions about the lower-level definition of this library-defined mpz_t type. In the header code:
/* THIS IS FROM THE GNU MP LIBRARY gmp.h HEADER FILE */
typedef struct
{
/* SOME OTHER STUFF HERE */
} __mpz_struct;
typedef __mpz_struct mpz_t[1];
First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?
Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?
Third question (perhaps indirectly related to second question): The GMP documentation for the mpz_init_set_ui(mpz_t, unsigned long int) function says to use it as pass-by-value only, although one would assume that this function modifies the contents of its argument within the called function (and thus would need pass-by-reference syntax). Refer to my code:
/* FROM MY CODE */
mpz_t fact_val; /* declaration */
mpz_init_set_ui(fact_val, 1); /* Initialize fact_val */
Does the single-occurrence array enable pass-by-reference automatically (due to the breakdown of array/pointer semantics in C)? I freely admit I'm kinda over-analyzing this, but I'd certainly love any discussion on this. Thanks!
This does not appear to be a struct hack in the sense described on C2. It appears that they want mpz_t to have pointer semantics (presumably, they want people to use it like an opaque pointer). Consider the syntactic difference between the following snippets:
__mpz_struct data[1];
(&data[0])->access = 1;
gmp_func(data, ...);
And
mpz_t data;
data->access = 1;
gmp_func(data, ...);
Because C arrays decay into pointers, this also allows for automatic pass by reference for the mpz_t type.
It also allows you to use a pointer-like type without needing to malloc or free it.
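The effect can be reproduced without GMP. The sketch below uses a made-up stand-in type (num_struct, num_t and the two functions are hypothetical and do not reflect GMP's real layout) to show that a one-element array typedef gives automatic storage plus pass-by-reference at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __mpz_struct; not GMP's real layout. */
typedef struct {
    int sign;
    unsigned long magnitude;
} num_struct;

/* Same trick as mpz_t: an array of exactly one struct. */
typedef num_struct num_t[1];

/* As a parameter, num_t is adjusted to num_struct *, so the callee writes
 * into the caller's object even though the call looks like pass-by-value. */
void num_init_set_ui(num_t n, unsigned long v)
{
    assert(n != NULL);
    n->sign = (v != 0);
    n->magnitude = v;
}

unsigned long num_get_ui(const num_t n)
{
    assert(n != NULL);
    return n->magnitude;
}
```

Declaring num_t x; reserves one num_struct with no malloc, and num_init_set_ui(x, 7) modifies it in place because x decays to a pointer to its single element.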
First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?
Yes.
Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?
Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation looking like
malloc(sizeof(struct whatever) + sizeof(mpz_t) * some_number)
would be a dead giveaway.
Does the single-occurrence array enable pass-by-reference automatically...?
Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference at the expense of more complex references.
I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.
The reason for this comes from the implementation of mpn. Specifically, if you're mathematically inclined you'll realise N is the set of natural numbers (1,2,3,4...) whereas Z is the set of integers (...,-2,-1,0,1,2,...).
Implementing a bignum library for Z is equivalent to doing so for N and taking into account some special rules for sign operations, i.e. keeping track of whether you need to do an addition or a subtraction and what the result is.
Now, as for how a bignum library is implemented... here's a line to give you a clue:
typedef unsigned int mp_limb_t;
typedef mp_limb_t * mp_ptr;
And now let's look at a function signature operating on that:
__GMP_DECLSPEC mp_limb_t mpn_add __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr,mp_size_t));
Basically, what it comes down to is that a "limb" is an integer field representing the bits of a number and the whole number is represented as a huge array. The clever part is that gmp does all this in a very efficient, well optimised manner.
Anyway, back to the discussion. Basically, the only way to pass arrays around in C is, as you know, to pass pointers to those arrays, which effectively enables pass by reference. Now, in order to keep track of what's going on, two types are defined: mp_ptr, which points to an array of mp_limb_t big enough to store your number, and mp_srcptr, which is a const version of that, so that you cannot accidentally alter the bits of the source bignums on which you are operating. The basic idea is that most of the functions follow this pattern:
func(ptr output, src in1, src in2)
etc. Thus, I suspect the mpz_* functions follow this convention simply to be consistent, because that is how the authors think about the problem.
Short version: Because of how you have to implement a bignum lib, this is necessary.
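To illustrate the limb idea and the func(output, in1, in2) pattern, here is a simplified sketch of adding two equal-length limb arrays with carry propagation. The names limb_t, limb_ptr, limb_srcptr and limbs_add_n are made up for this illustration, the limb is assumed to be 32 bits wide, and this is not GMP's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;              /* hypothetical 32-bit limb */
typedef limb_t *limb_ptr;             /* destination operand: writable */
typedef const limb_t *limb_srcptr;    /* source operand: read-only */

/* out = a + b over n limbs, least significant limb first.
 * Returns the carry out of the most significant limb (0 or 1). */
limb_t limbs_add_n(limb_ptr out, limb_srcptr a, limb_srcptr b, size_t n)
{
    limb_t carry = 0;

    for (size_t i = 0; i < n; i++) {
        /* Widen to 64 bits so the carry out of this limb is not lost. */
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        out[i] = (limb_t)sum;           /* low 32 bits */
        carry = (limb_t)(sum >> 32);    /* high bit becomes the next carry */
    }
    return carry;
}
```

The const-qualified source type means the compiler rejects an accidental write to an input operand, which is the purpose the answer ascribes to mp_srcptr.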