We are all familiar with how the sizeof operator works in C. I am trying to write a function that works the same way: it should accept any kind of datatype and return its size.
Can somebody tell me how to write such a function in C?
int myOwnSizeOf(/*what would be parameter type?*/)
{
//and what about the definition?
}
Thanks
You can't do that with a function. That's why sizeof is an operator built into the language, not a library function. It's magic.
You could do
#define myOwnSizeOf(x) (sizeof(x))
but I don't really see the point.
Since a function is evaluated at runtime, it can never consume a datatype but only objects. That's why sizeof is a built in operator.
You might get it to work for this limited case in C++ with a template function. But in C your only options for consuming an object of any datatype are void pointers or macros. The former won't work, of course, as it loses all type information, and the latter was already suggested by aschepler; as he noted, it won't buy you anything (and it isn't a function, anyway).
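To make the difference concrete, here is a minimal sketch. The names MY_SIZEOF, size_via_function and macro_keeps_type_info are made up for illustration. A function only ever receives a pointer, so the most it can report is the size of that pointer, while a macro passes its argument straight through to sizeof.

```c
#include <stddef.h>

/* Hypothetical wrapper macro: the argument is substituted textually,
   so sizeof still sees the full type or expression. */
#define MY_SIZEOF(x) (sizeof(x))

/* Hypothetical function: it only receives a pointer, so the best it
   can report is the size of that pointer, not of the pointee. */
static size_t size_via_function(const void *p)
{
    return sizeof p; /* always sizeof(const void *) */
}

/* Returns 1 if the macro reports the whole array while the function
   reports only a pointer size. */
static int macro_keeps_type_info(void)
{
    double values[10] = { 0 };
    return MY_SIZEOF(values) == 10 * sizeof(double)
        && size_via_function(values) == sizeof(const void *);
}
```

The macro also accepts a bare type name, e.g. MY_SIZEOF(int), which no function parameter could.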
A function can't do it, but a macro can, if you don't mind throwing in a little technical UB that can't/won't really matter:
#define mysizeof(T) ((size_t)((char *)((T*)0+1)-(char *)0))
If you replaced 0 with the address of a static-storage-duration object larger than any other object you'd ever try to take the size of, the UB would go away.
Edit: Note that this is for types T; a version for variables is much easier:
#define mysizeof(v) ((size_t)((char *)(&(v)+1)-(char *)&(v)))
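A small sketch of the variable version in use, assuming a renamed macro MYSIZEOF_VAR and a made-up struct sample (neither is from the answer). The argument is parenthesized so that expressions such as arr[1] expand correctly, and the result is compared against the real sizeof.

```c
#include <stddef.h>

/* Size of an object: the distance in bytes between its address and
   the address one element past it. */
#define MYSIZEOF_VAR(v) ((size_t)((char *)(&(v) + 1) - (char *)&(v)))

struct sample { /* hypothetical type, for illustration only */
    char tag;
    double value;
};

/* Returns 1 if the macro agrees with sizeof for a few objects. */
static int mysizeof_matches_sizeof(void)
{
    int i = 0;
    struct sample s = { 'x', 1.0 };
    long arr[4] = { 0 };

    return MYSIZEOF_VAR(i) == sizeof i
        && MYSIZEOF_VAR(s) == sizeof s
        && MYSIZEOF_VAR(arr) == sizeof arr
        && MYSIZEOF_VAR(arr[1]) == sizeof arr[1];
}
```

Unlike sizeof, this only works on lvalues, since it takes the address of its argument.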
I have to return nul in a function but I'm not allowed to include any library. I tried to find out how NULL is defined, which led me to sys/_types/_null.h, only to find that NULL is actually __DARWIN_NULL. Great! Now I have no idea where to search in order to find the __DARWIN_NULL definition...
I have to return nul in a function but I'm not allowed to include any library.
The solution to this problem is far simpler than you're making it; you've actually asked a form of XY problem.
You won't find the NUL character defined in any standard library; the best way to return that will be using the constant '\0' or 0.
If your professor is teaching you to avoid using <stddef.h> to find NULL then he/she has set a silly exercise which involves using something other than the most appropriate tool for the job, a tool which is guaranteed by the standard to exist, by the way... I would be raising this as a concern.
Nonetheless, sometimes professors don't care and will teach you to do stupid things anyway. NULL is defined as an implementation-defined null pointer constant, usually 0 or a conversion of 0 to void * like so: ((void *) 0). That may not be the implementation-defined value your NULL resolves to; in that case, adjust to suit :)
You could add a preprocessor definition such as #define NULL ((void *) 0) and then you would be able to return NULL; from your functions. Ta-da! Stupid exercises deserve stupid solutions.
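As a rough sketch of that approach, with no headers included at all: MY_NULL and find_char below are hypothetical names chosen for illustration (MY_NULL is used instead of NULL only so the sketch cannot clash with a header that defines NULL).

```c
/* No headers included: MY_NULL is a hypothetical stand-in for NULL. */
#define MY_NULL ((void *)0)

/* Hypothetical helper: returns a pointer to the first occurrence of c
   in s, or a null pointer if c does not occur. */
static const char *find_char(const char *s, char c)
{
    for (; *s != '\0'; ++s) {
        if (*s == c) {
            return s;
        }
    }
    return MY_NULL; /* void * converts implicitly to any object pointer in C */
}
```

The loop condition also shows the other half of the answer: the NUL character is simply '\0'.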
If your professor says you're not allowed to use #define, either, I would be tempted to ask him what you are allowed to use. Changing requirements on the fly is not fair. Most compilers will allow you to set preprocessor constants using a command line argument, for example cc -DNULL='((void *) 0)' .... This is useful for exposing compile-time configuration options, but again, using this to define NULL is dumb.
The question in your title is different to the rest of your post. __DARWIN_NULL could also be defined using either of the above, providing it isn't already defined, but I really don't think that's required to answer your actual question.
I am replacing macros in a large C99 code base with inline functions to see if the compiler can do a better job optimizing. There are a lot of macros which expand to a data structure. The code uses them not as functions, but as a replacement for the data structure itself.
For example...
#define AvARRAY(av) ((av)->sv_u.svu_array)
AvARRAY(av)[--key] = &PL_sv_undef;
Not only is there a lot of code which does this, published code outside of my control does this, so I would rather leave this idiom in place if possible.
Is there a way to define an inline function version of AvARRAY which is compatible with the sort of code above?
Yes, you can use the return value without assigning it to a variable.
However, in the general case it is not possible to replace this macro with a function. The whole point of using macros for this purpose is that macros can "evaluate" to lvalues. Functions in C cannot produce lvalues, unfortunately. In other words, no, in the general case you can't directly replace such macros with functions.
But in specific case it could be different. Do they really use those macros as lvalues? In your specific example it is not used as an lvalue, so in your case you can just do
inline whatever_type_it_has *AvARRAY(struct_type *av)
{
return av->sv_u.svu_array;
}
and use it later exactly as it is used in your example
AvARRAY(av)[--key] = &PL_sv_undef;
But if somewhere else in the code you have something like
AvARRAY(av) = malloc(some_size);
or
whatever_type_it_has **pptr = &AvARRAY(av);
then you'll be out of luck. The function version will not work, while the original macro will.
The only way to [almost] fully "emulate" the functionality of macro with a function in this case is to lift it to a higher level of indirection, i.e. assume that the function always returns a pointer to the target data field
inline whatever_type_it_has **AvARRAY_P(struct_type *av)
{
return &av->sv_u.svu_array;
}
In that case you will have to remember to dereference that pointer every time
(*AvARRAY_P(av))[--key] = &PL_sv_undef;
but this will work
*AvARRAY_P(av) = malloc(some_size);
whatever_type_it_has **pptr = &*AvARRAY_P(av);
But this will not work with bit-fields, while the macro version will.
I'm quite often confused when coming back to C by the inability to create an array using the following initialisation pattern...
const int SOME_ARRAY_SIZE = 6;
const int myArray[SOME_ARRAY_SIZE];
My understanding of the problem is that the const operator does not guarantee const-ness but rather merely asserts that the value pointed to by SOME_ARRAY_SIZE will not change at runtime. But why can the compiler not assume that the value is constant at compile time? It says 6 right there in the source code...
I think I'm missing something core in my fundamental understanding of C. Somebody help me out here. :)
[UPDATE]After reading a bit more about C99 and variable length arrays I think I understand this a bit better. What I was trying to create was a variable length array: const does not create a compile-time constant but rather a runtime constant. Therefore I was declaring a variable length array, which is only valid in C99 at function/block scope. A variable length array at file scope is impossible, as the compiler cannot assign a fixed memory address to an unbounded array.[/UPDATE]
Well, in C++ the semantics are a bit different. In C++ your code would work fine. You must distinguish between two things: const and constant expression. const simply means, as you described, that the value is read-only. A constant expression, on the other hand, is a value known at compile time. The semantics of const in C are always of the first type. A const-qualified variable is never a constant expression in C; constant expressions are built from things like literals, enumeration constants and sizeof, which is why #define is used for this kind of thing.
In C++ however, any const object initialized with a constant expression is in itself a constant expression.
I don't know exactly WHY this is so in C, it's just the way it is
The problem is that the language syntax demands an integer constant between the [ ]. SOME_ARRAY_SIZE is still a variable (even if you told the compiler nobody is allowed to vary it!)
The const keyword is basically a read-only indication. It does not, really, indicate the underlying value will not change, even though that is the case in your example.
When it comes to pointers, this is more clear:
void foo(int const * p)
{
if (*p == 100)
{
bar();
/* Here, the compiler can not assume that *p is 100 */
}
}
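A runnable variant of the same idea, as a sketch: the global counter and the body of bar are assumptions added for illustration. foo cannot write through p, yet the value it sees still changes, because bar modifies the same object directly.

```c
static int counter = 100; /* hypothetical global, for illustration */

/* Modifies the object directly; no const is involved here. */
static void bar(void)
{
    counter = 7;
}

/* p is a read-only view: foo may not write through p, but the object
   it points to can still change by other means. */
static int foo(const int *p)
{
    if (*p == 100) {
        bar();
        return *p; /* must be re-read: yields 7 when p == &counter */
    }
    return -1;
}
```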
In this case, a compiler should not accept the code in your example, as it requires the array size to be constant. If it did accept it, the user could later run into trouble when porting the code to a stricter compiler.
You can do this in C99, and some compilers prior to C99 also had support for this as an extension to C89 (e.g. gcc). If you're stuck with an old compiler that doesn't have C99 support though (e.g. MSVC) then you'll have to do it the old skool way and use a #define for the array size.
Note that the above comments apply only to such declarations at local scope (i.e. automatic variables). C99 still doesn't allow such declarations at global scope.
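For file-scope arrays, a minimal sketch of the two usual alternatives to a const int size: a #define and an enumeration constant. All names below are made up for illustration.

```c
#include <stddef.h>

#define ARRAY_SIZE_MACRO 6          /* replaced by the preprocessor */
enum { ARRAY_SIZE_ENUM = 6 };       /* an integer constant expression */

/* Both are valid at file scope because the sizes are constant
   expressions, unlike a const int variable in C. */
static const int from_macro[ARRAY_SIZE_MACRO] = { 1, 2, 3, 4, 5, 6 };
static int from_enum[ARRAY_SIZE_ENUM];

/* Number of elements in an array (not a pointer). */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Copies from_macro into from_enum and returns the sum of the copy. */
static int copy_and_sum(void)
{
    int sum = 0;
    for (size_t i = 0; i < COUNT_OF(from_enum); ++i) {
        from_enum[i] = from_macro[i];
        sum += from_enum[i];
    }
    return sum;
}
```

The enum form has the small advantage that the name is visible to a debugger and obeys normal scoping rules.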
I just did a very quick test with Xcode and an Objective-C file I currently had open on my machine, and put this in the .m file:
const int arrs = 6;
const int arr[arrs];
This compiles without any issues.
Having been writing Java code for many years, I was amazed when I saw this C++ statement:
int a,b;
int c = (a=1, b=a+2, b*3);
My question is: Is this a choice of coding style, or does it have a real benefit? (I am looking for a practical use case)
I think the compiler will see it the same as the following:
int a=1, b=a+2;
int c = b*3;
(What's the official name for this? I assume it's standard C/C++ syntax.)
It's the comma operator, used twice. You are correct about the result, and I don't see much point in using it that way.
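A quick way to convince yourself that the two forms agree, sketched as two small functions (the function names are made up). The comma operator evaluates its left operand, discards the result, then evaluates the right operand and yields its value.

```c
/* Mirrors the statement from the question. */
static int comma_version(void)
{
    int a, b;
    int c = (a = 1, b = a + 2, b * 3); /* value of the last operand */
    return c;
}

/* The same computation written as separate declarations. */
static int plain_version(void)
{
    int a = 1, b = a + 2;
    int c = b * 3;
    return c;
}
```

Both functions return 9.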
Looks like an obscure use of a , (comma) operator.
It's not a representative way of doing things in C++.
The only "good-style" use for the comma operator might be in a for statement that has multiple loop variables, used something like this:
// Copy from source buffer to destination buffer until we see a zero
for (char *src = source, *dst = dest; *src != 0; ++src, ++dst) {
*dst = *src;
}
I put "good-style" in scare quotes because there is almost always a better way than to use the comma operator.
Another context where I've seen this used is with the ternary operator, when you want to have multiple side effects, e.g.,
bool didStuff = DoWeNeedToDoStuff() ? (Foo(), Bar(), Baz(), true) : false;
Again, there are better ways to express this kind of thing. These idioms are holdovers from the days when we could only see 24 lines of text on our monitors, and squeezing a lot of stuff into each line had some practical importance.
Dunno its name, but it seems to be missing from the Job Security Coding Guidelines!
Seriously: C++ allows you to a do a lot of things in many contexts, even when they are not necessarily sound. With great power comes great responsibility...
This is called 'obfuscated C'. It is legal, but intended to confuse the reader. And it seems to have worked. Unless you're trying to be obscure it's best avoided.
Hotei
Your sample code uses two features of C expressions that are not very well known by beginners (but not really hidden either):
the comma operator: a normal binary operator whose role is to return the second of its two operands. If the operands are expressions they are evaluated from left to right.
assignment as an operator that returns a value. C assignment is not a statement as in other languages, and returns the value that has been assigned.
Most use cases of both these features involve some form of obfuscation. But there are some legitimate ones. The point is that you can use them anywhere you can provide an expression: inside an if or a while conditional, in a for loop iteration block, in function call parameters (if using a comma you must use parentheses to avoid confusion with the actual function parameters), in macro parameters, etc.
The most usual use of comma is probably in loop control, when you want to change two variables at once, or store some value before performing loop test, or loop iteration.
For example a reverse function can be written as below, thanks to comma operator:
void reverse(int * d, int len){
int i, j;
for (i = 0, j = len - 1 ; i < j ; i++, j--){
SWAP(d[i], d[j]);
}
}
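The function above relies on a SWAP macro that is not shown. A complete sketch might look like this, assuming a simple SWAP for int lvalues (this definition is a guess, not the original author's).

```c
/* Hypothetical SWAP for int lvalues; the original does not show its
   definition. Each argument is evaluated more than once. */
#define SWAP(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Reverses the first len elements of d in place. */
static void reverse(int *d, int len)
{
    int i, j;
    /* The comma operator lets both indices be set up and advanced
       inside the for header. */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        SWAP(d[i], d[j]);
    }
}
```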
Another legitimate (not obfuscated, really) use of the comma operator I have in mind is a DEBUG macro I found in some project, defined as:
#if defined(DEBUGMODE)
#define DEBUG(x) printf x
#else
#define DEBUG(x) x
#endif
You use it like:
DEBUG(("my debug message with some value=%d\n", d));
If DEBUGMODE is on you get a printf call; if not, printf is not called, but the expression between parentheses is still valid C (a comma expression). The point is that any side effect of the printing code applies in both release and debug builds, like the one introduced by:
DEBUG(("my debug message with some value=%d\n", d++));
With the above macro d will always be incremented regardless of debug or release mode.
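A self-contained sketch of this behaviour, using #if defined(DEBUGMODE) for the conditional; the function debug_side_effect is made up for illustration. Without DEBUGMODE the call site expands to the comma expression ("value=%d\n", d++), which a compiler may warn about but must accept.

```c
#include <stdio.h>

#if defined(DEBUGMODE)
#define DEBUG(x) printf x
#else
#define DEBUG(x) x
#endif

/* Returns the value of d after the DEBUG line. It is 1 in both
   configurations because d++ is evaluated either way. */
static int debug_side_effect(void)
{
    int d = 0;
    DEBUG(("value=%d\n", d++));
    return d;
}
```

Compiling once with -DDEBUGMODE and once without shows the same return value, with output only in the first case.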
There are probably some other rare cases where comma and assignment values are useful and code is easier to write when you use them.
I agree that assignment operator is a great source of errors because it can easily be confused with == in a conditional.
I agree that as comma is also used with a different meaning in other contexts (function calls, initialisation lists, declaration lists) it was not a very good choice for an operator. But basically it's not worse than using < and > for template parameters in C++ and it exists in C from much older days.
It's strictly a matter of coding style and won't make any difference in your program. Especially since any decent C++ compiler will optimize it to
int a=1;
int b=3;
int c=9;
The math won't even be performed during assignment at runtime. (and some of the variables may even be eliminated entirely).
As to choice of coding style, I prefer the second example. Most of the time, less nesting is better, and you won't need the extra parenthesis. Since the use of commas exhibited will be known to virtually all C++ programmers, you have some choice of style. Otherwise, I would say put each assignment on its own line.
Is this a choice of coding style, or does it have a real benefit? (I am looking for a practical use case)
It's both a choice of coding style and it has a real benefit.
It's clearly a different coding style as compared to your equivalent example.
The benefit is that I already know I would never want to employ the person who wrote it, not as a programmer anyway.
A use case: Bob comes to me with a piece of code containing that line. I have him transferred to marketing.
You have found a hideous abuse of the comma operator written by a programmer who probably wishes that C++ had multiple assignment. It doesn't. I'm reminded of the old saw that you can write FORTRAN in any language. Evidently you can try to write Dijkstra's language of guarded commands in C++.
To answer your question, it is purely a matter of (bad) style, and the compiler doesn't care; the compiler will generate exactly the same code as from something a C++ programmer would consider sane and sensible.
You can see this for yourself if you make two little example functions and compile both with the -S option.
I want to get the size of a specific member in a struct.
sizeof(((SomeStruct *) 0)->some_member) works for me but I feel like there might be a nicer way to do it.
I could #define SIZEOF_ELEM(STRUCT, ELEM) sizeof(((STRUCT *) 0)->ELEM) and then use SIZEOF_ELEM(SomeStruct, some_member), but I wonder whether there is already something better built-in.
My specific use-case is in hsc2hs (Haskell C bindings).
pokeArray (plusPtr context (#offset AVFormatContext, filename)) .
take (#size ((AVFormatContext *) 0)->filename) .
(++ repeat '\NUL') $ filename
What you've got is about as clean as it gets if you can't guarantee you have a variable to dereference. (If you can, then use just sizeof(var.member) or sizeof(ptr->member), of course, but this won't work in some contexts where a compile-time constant is needed.)
Once upon a long, long time ago (circa 1990), I ran into a compiler that had 'offsetof' defined using the base address 0, and it crashed. I worked around the problem by hacking <stddef.h> to use 1024 instead of 0. But you should not run into such problems now.
Microsoft has the following in one of their headers:
#define RTL_FIELD_SIZE(type, field) (sizeof(((type *)0)->field))
I see no reason to do any different.
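A short sketch of the macro in use, with a made-up struct packet. Since sizeof does not evaluate its operand here, the null pointer is never dereferenced, and the result is a constant expression that can serve as an array bound.

```c
#include <stddef.h>

/* sizeof does not evaluate its operand here, so the null pointer is
   never dereferenced. */
#define SIZEOF_ELEM(type, member) (sizeof(((type *)0)->member))

struct packet { /* hypothetical struct, for illustration only */
    int id;
    char filename[64];
    double weight;
};

/* Usable where a constant expression is required, e.g. an array bound. */
static char name_copy[SIZEOF_ELEM(struct packet, filename)];

/* Size in bytes of the buffer declared above. */
static size_t name_copy_size(void)
{
    return sizeof name_copy;
}
```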
They have related macros for:
RTL_SIZEOF_THROUGH_FIELD()
RTL_CONTAINS_FIELD()
and the nifty:
CONTAINING_RECORD()
which helps implement generic lists in straight C without having to require that link fields be at the start of a struct. See this Kernel Mustard article for details.
I believe you've already got the correct solution there. You could dig up your stddef.h and look for how offsetof is defined, since it does a very similar thing.
Remember that there may well be a difference between the sizeof a member and the difference between the offsetofs of that member and the next one, due to padding.
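To illustrate the padding caveat, here is a sketch with a hypothetical struct padded. The gap between two consecutive members is computed with offsetof; how large it is depends on the implementation's alignment rules, so the sketch only relies on it being at least the size of the first member.

```c
#include <stddef.h>

struct padded { /* hypothetical struct, for illustration only */
    char flag;
    double value;
};

/* Bytes from the start of flag to the start of value; this includes
   any padding the implementation inserts after flag. */
static size_t gap_after_flag(void)
{
    return offsetof(struct padded, value) - offsetof(struct padded, flag);
}

/* Size of the flag member alone, without padding. */
static size_t size_of_flag(void)
{
    return sizeof(((struct padded *)0)->flag);
}
```

On many common platforms the gap is larger than the member itself, which is exactly why the two numbers should not be used interchangeably.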
In C++ you could do sizeof(SomeStruct::some_member), but this is C and you have no scope resolution operator. What you've written is as good as can be written, as far as I know.