Why aren't typedefs strongly typed?

What's the reason for typedefs not being strongly typed? Is there any benefit I can't see or is it due to backward compatibility? See this example:
typedef int Velocity;
void foo(Velocity v) {
//do anything;
}
int main() {
int i=4;
foo(i); //Should result in compile error if strongly typed.
return 0;
}
I am not asking for workarounds to get a strongly typed data type; I only want to know why the standard doesn't require typedefs to be strongly typed.
Thank you.

Because C is not strongly typed, and typedef has its origin in that thinking.
typedef is just for convenience and readability; it doesn't create a new type.

typedef is just a misnomer (like many other keywords). Think of it as typealias.
On the contrary, C has a whole notion of what compatible types are. This makes it possible, for example, to link compilation units together even if function prototypes are declared only with compatible types and not with identical ones. All this comes from simple practical necessity in everyday life, while still being able to give some guarantees to implementations.

Even if Velocity were a distinct type from int, your code would compile and work just fine due to type conversion rules. What would not work is passing an expression of type Velocity * to a function expecting int *, etc. If you want to achieve the latter form of type enforcement, simply make Velocity a structure or union type containing a single integer, and you'll now have a new real type.
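A minimal sketch of the single-member struct approach described above. The helper names velocity_make and velocity_double are hypothetical and exist only for illustration.

```c
/* A distinct type: a struct with one int member is not compatible with int. */
typedef struct { int value; } Velocity;

/* Hypothetical helper that builds a Velocity from a plain int. */
Velocity velocity_make(int v) {
    Velocity result = { v };
    return result;
}

/* Accepts only a Velocity; passing a bare int no longer compiles. */
int velocity_double(Velocity v) {
    return 2 * v.value;
}
```

With this in place, velocity_double(velocity_make(4)) is accepted, while velocity_double(4) is rejected by the compiler because there is no implicit conversion from int to a struct type.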

Related

How create user-defined types in C?

For unknown reasons, I need to know how to replace the standard char a[10]; with string a; (Yes, I saw it in the CS50). So, how to create your own variable type named string?

Amplifying on what @Lundin said in his answer:
One of the things that makes C hard to learn -- especially for students coming from other languages -- is that C does not have a first-class "string" type. Important as they are, strings in C are cobbled together out of arrays of char, and often accessed via char * pointers. The cobbling together is performed by a loose collaboration between the compiler, the programmer, and library functions like strcpy and printf.
Although I said that "strings are cobbled together out of arrays of char, and often accessed via char * pointers", this does not mean that C's string type is char [], and it also does not mean that C's string type is char *.
If you imagine that C has a first-class string type, handled for you automatically by the language just like char, int, and double, you will be badly confused and frustrated. And if you try to give yourself a typedef called string, this will not insulate you from that confusion, will not ease your frustration, will not make your life easier in any way. It will only cloud the issue still further.

It is as simple as typedef char string[10];.
But please don't do this. Hiding arrays or pointers behind a typedef is very bad practice. The code gets much harder to read and you gain nothing from it. See Is it a good idea to typedef pointers? - the same arguments apply to arrays.
It is particularly bad to name the hidden array string since that is the exact spelling used by C++ std::string.
Please note that the CS50 is a bad course since it teaches you to do this. The SO community is sick and tired of "un-teaching" bad habits to the victims of this course. Stay away from questionable Internet tutorials in general.
If you want to create some manner of custom string type, the correct and proper way is to use a struct instead.
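As a rough sketch of what such a struct might look like, assuming a heap-allocated buffer with an explicit length. The type and the functions string_create and string_destroy are hypothetical names, not part of any standard library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type: owns a heap buffer and remembers its length. */
struct string {
    char  *data;   /* NUL-terminated contents, or NULL if allocation failed */
    size_t length; /* number of characters, excluding the terminator */
};

/* Copies text into a newly allocated buffer. */
struct string string_create(const char *text) {
    struct string s = { NULL, 0 };
    size_t n = strlen(text);
    s.data = malloc(n + 1);
    if (s.data != NULL) {
        memcpy(s.data, text, n + 1);   /* n + 1 copies the terminator too */
        s.length = n;
    }
    return s;
}

/* Releases the buffer and resets the struct so it cannot be reused by mistake. */
void string_destroy(struct string *s) {
    free(s->data);
    s->data = NULL;
    s->length = 0;
}
```

Unlike a typedef of an array or pointer, the struct makes ownership and length explicit, and the name cannot be confused with a plain char array at the call site.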

User defined types in C are structures, unions, enumerations and functions. And it seems you can also include arrays in the list.
For example
struct Point
{
int x;
int y;
};
or
enum Dimension { N = 100 };
struct Point a[N];
In this example the array type is struct Point[N].
In fact any derived type (including pointers) can be considered as a user-defined type. The C Standard does not define or use the term user-defined type.

User defined types are one of the following: struct, union or enum. For example, struct:
struct worker {
int id;
char firstName[255];
char lastName[255];
};
To create an instance:
struct worker w1 = { 1234, "John", "Smith" };

Warn if another typedef'd name of a type is used in an argument list

Consider a large project, where many types are typedef'd, e.g.
typedef int age;
typedef int height;
and some functions getting arguments of those types:
void printPerson(age a, height h) {
printf("Age %d, Height %d\n", a, h);
}
Is there a way to warn at compile time, if those arguments are of the wrong type, e.g.
age a = 30;
height h = 180;
printPerson(h, a); /* No warning, because a and h are both integers */
Does gcc (or some static code analysis tool) have an option to warn in such cases?

There is no built-in support for this in GCC.
There is a feature request to add this, based on the Sparse nocast attribute. However, this hasn't been implemented. If you can use Sparse, though, you could do this by marking each typedef with __attribute__((nocast)).
In C++ you can do this by making wrapper classes rather than typedefs, and then simply not defining implicit conversions for them.

Klocwork has some checks related to what they call "strong typing".
For your code it throws STRONG.TYPE.ASSIGN.ARG because argument types do not match.
It also complains about assigning int values (the consts) to age and height typed variables and about using the variables as int in printf.
I heard it is quite expensive, though.

As has been made clear by the other responses, you're not going to get this for free from gcc. You are definitely into the world of static analysis tools to solve this.
There have been several suggestions for this, some of which require extra annotation, some of which don't but may be more than you're looking for. I therefore thought I'd throw one more into the mix...
A long-time standby for me has been the various command line lint tools. In your case, I think PC-lint/flexelint fits very well even though it is a commercial tool. See here for its strong type checking.

No, there cannot be. A compiler will warn you if you're doing (or about to do) something illegal. It is not supposed to know (or determine the correctness of) the values which you will be passing as function parameters. As long as the types are the same, it does not have any reason to complain.
However, in case of type mismatch, it will alert you.

An error will be generated only when the types are different. You can wrap these types inside structs and use macros to automate definition and assignment.
If you are willing to use enums instead of integers, there is an option for warnings on the use of mixed enums in a static code analysis tool named Coverity.
https://wiki.ubuntu.com/CoverityCheckerDictionary
Look for MIXED_ENUMS.
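The struct-and-macro idea mentioned above could be sketched roughly like this. The macro DEFINE_STRONG_INT, the generated constructors age_of and height_of, and the function age_in_months are all hypothetical names chosen for illustration.

```c
/* Hypothetical macro: defines a one-member struct type plus a constructor. */
#define DEFINE_STRONG_INT(name)                 \
    typedef struct { int value; } name;         \
    static inline name name##_of(int v) {       \
        name result = { v };                    \
        return result;                          \
    }

DEFINE_STRONG_INT(age)
DEFINE_STRONG_INT(height)

/* Accepts only an age; passing a height is a compile-time error. */
int age_in_months(age a) {
    return a.value * 12;
}
```

Here age_in_months(age_of(30)) compiles, whereas age_in_months(height_of(180)) does not, because age and height are now two unrelated struct types.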

As others already stated, there is no support for this in C. If you absolutely want strong type checking to happen, you could do like this:
typedef struct {int a;} age;
typedef struct {int h;} height;
void printPerson(age a, height h)
{
printf("Age %d, height %d\n", a.a, h.h);
}
age a = {30};
height h = {180};
printPerson(h, a); // will generate errors
Beware that this might have some performance impact, though.

Extending a struct in C

I recently came across a colleague's code that looked like this:
typedef struct A {
int x;
}A;
typedef struct B {
A a;
int d;
}B;
void fn(){
B *b;
((A*)b)->x = 10;
}
His explanation was that since struct A is the first member of struct B, ((A*)b)->x is the same as b->a.x, and that the cast provides better readability.
This makes sense, but is this considered good practice? And will this work across platforms? Currently this runs fine on GCC.

Yes, it will work cross-platform(a), but that doesn't necessarily make it a good idea.
As per the ISO C standard (all citations below are from C11), 6.7.2.1 Structure and union specifiers /15, padding is not allowed before the first element of a structure.
In addition, 6.2.7 Compatible type and composite type states that:
Two types have compatible type if their types are the same
and it is undisputed that the A and A-within-B types are identical.
This means that the memory accesses to the A fields will be the same in both A and B types, as would the more sensible b->a.x which is probably what you should be using if you have any concerns about maintainability in future.
And, though you would normally have to worry about strict type aliasing, I don't believe that applies here. It is illegal to alias pointers but the standard has specific exceptions.
6.5 Expressions /7 states some of those exceptions, with the footnote:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
The exceptions listed are:
a type compatible with the effective type of the object;
some other exceptions which need not concern us here; and
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union).
That, combined with the struct padding rules mentioned above, including the phrase:
A pointer to a structure object, suitably converted, points to its initial member
seems to indicate this example is specifically allowed for. The core point we have to remember here is that the type of the expression ((A*)b) is A*, not B*. That makes the variables compatible for the purposes of unrestricted aliasing.
That's my reading of the relevant portions of the standard, I've been wrong before (b), but I doubt it in this case.
So, if you have a genuine need for this, it will work okay but I'd be documenting any constraints in the code very close to the structures so as to not get bitten in future.
(a) In the general sense. Of course, the code snippet:
B *b;
((A*)b)->x = 10;
will be undefined behaviour because b is not initialised to something sensible. But I'm going to assume this is just example code meant to illustrate your question. If anyone's concerned about it, think of it instead as:
B b, *pb = &b;
((A*)pb)->x = 10;
(b) As my wife will tell you, frequently and with little prompting :-)
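For anyone who wants to see the first-member guarantee in action, here is a small sketch using the structs from the question. The function names write_via_cast and first_member_matches are made up for this example.

```c
#include <stddef.h>

typedef struct A { int x; } A;
typedef struct B { A a; int d; } B;

/* Writes through the cast pointer and reads back through the member path. */
int write_via_cast(B *pb, int value) {
    ((A *)pb)->x = value;   /* refers to the same object as pb->a */
    return pb->a.x;
}

/* Returns 1 if the struct's address equals the address of its first member. */
int first_member_matches(B *pb) {
    return (void *)pb == (void *)&pb->a && offsetof(B, a) == 0;
}
```

Both functions rely only on the rule that a pointer to a structure, suitably converted, points to its initial member.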

I'll go out on a limb and oppose @paxdiablo on this one: I think it's a fine idea, and it's very common in large, production-quality code.
It's basically the most obvious and nice way to implement inheritance-based object oriented data structures in C. Starting the declaration of struct B with an instance of struct A means "B is a sub-class of A". The fact that the first structure member is guaranteed to be 0 bytes from the start of the structure is what makes it work safely, and it's borderline beautiful in my opinion.
It's widely used and deployed in code based on the GObject library, such as the GTK+ user interface toolkit and the GNOME desktop environment.
Of course, it requires you to "know what you're doing", but that is generally always the case when implementing complicated type relationships in C. :)
In the case of GObject and GTK+, there's plenty of support infrastructure and documentation to help with this: it's quite hard to forget about it. It might mean that creating a new class isn't something you do just as quickly as in C++, but that's perhaps to be expected since there's no native support in C for classes.

That's a horrible idea. As soon as someone comes along and inserts another field at the front of struct B your program blows up. And what is so wrong with b.a.x?
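The breakage described above can at least be turned into a compile-time failure. A minimal sketch, assuming a C11 compiler (for _Static_assert); b_as_a is a hypothetical helper name.

```c
#include <stddef.h>

typedef struct A { int x; } A;
typedef struct B { A a; int d; } B;

/* Compilation fails here if someone moves `a` away from the front of B. */
_Static_assert(offsetof(B, a) == 0, "A must remain the first member of B");

/* Hypothetical upcast helper: takes the member's address, so no cast is needed. */
A *b_as_a(B *pb) {
    return &pb->a;
}
```

Going through a helper like b_as_a keeps working even if the layout changes, while the static assertion protects any remaining code that still casts B * to A * directly.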

Anything that circumvents type checking should generally be avoided.
This hack relies on the order of the declarations, and neither the cast nor this order can be enforced by the compiler.
It should work cross-platform, but I don't think it is a good practice.
If you really have deeply nested structures (you might have to wonder why, however), then you should use a temporary local variable to access the fields:
A deep_a = e->d.c.b.a;
deep_a.x = 10;
deep_a.y = deep_a.x + 72;
e->d.c.b.a = deep_a;
Or, if you don't want to copy a along:
A* deep_a = &(e->d.c.b.a);
deep_a->x = 10;
deep_a->y = deep_a->x + 72;
This shows from where a comes and it doesn't require a cast.
Java and C# also regularly expose constructs like "c.b.a", so I don't see what the problem is. If what you want to simulate is object-oriented behaviour, then you should consider using an object-oriented language (like C++), since "extending structs" in the way you propose provides neither encapsulation nor runtime polymorphism (although one may argue that ((A*)b) is akin to a "dynamic cast").

I am sorry to disagree with all the other answers here, but this system is not compliant with standard C. It is not acceptable to have two pointers with different types which point to the same location at the same time; this is called aliasing and is not allowed by the strict aliasing rules in C99 and many other standards. A less ugly way of doing this would be to use inline getter functions, which then do not have to look neat in that way. Or perhaps this is a job for a union? A union is specifically allowed to hold one of several types; however, there are a myriad of other drawbacks there too.
In short, this kind of dirty casting to create polymorphism is not allowed by most C standards. Just because it seems to work on your compiler does not mean it is acceptable. See here for an explanation of why it is not allowed, and why compilers at high optimization levels can break code which does not follow these rules: http://en.wikipedia.org/wiki/Aliasing_%28computing%29#Conflicts_with_optimization

Yes, it will work. And it is one of the core principles of object-oriented programming in C. See this answer 'Object-orientation in C' for more examples about extending (i.e. inheritance).

This is perfectly legal, and, in my opinion, pretty elegant. For an example of this in production code, see the GObject docs:
Thanks to these simple conditions, it is possible to detect the type
of every object instance by doing:
B *b;
b->parent.parent.g_class->g_type
or, more quickly:
B *b;
((GTypeInstance*)b)->g_class->g_type
Personally, I think that unions are ugly and tend to lead towards huge switch statements, which is a big part of what you've worked to avoid by writing OO code. I write a significant amount of code myself in this style --- typically, the first member of the struct contains function pointers that can be made to work like a vtable for the type in question.
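A rough sketch of the "function pointers in the first member" style described above. This is not GObject's actual layout; every name here (shape_ops, shape, rect, rect_make, shape_area) is hypothetical.

```c
/* Hypothetical vtable: one function pointer per "virtual method". */
struct shape_ops {
    int (*area)(const void *self);
};

/* Base "class": holds the pointer to the table of operations. */
struct shape {
    const struct shape_ops *ops;
};

struct rect {
    struct shape base;   /* must stay first so a rect can be viewed as a shape */
    int width;
    int height;
};

/* Implementation for rect; converts back from the base pointer it was given. */
static int rect_area(const void *self) {
    const struct rect *r = self;
    return r->width * r->height;
}

static const struct shape_ops rect_ops = { rect_area };

struct rect rect_make(int width, int height) {
    struct rect r = { { &rect_ops }, width, height };
    return r;
}

/* Generic code: knows only struct shape and dispatches through the table. */
int shape_area(const struct shape *s) {
    return s->ops->area(s);
}
```

Calling shape_area(&r.base) on a rect reaches rect_area without a switch statement; adding another shape means adding another struct and another ops table.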

I can see how this works, but I would not call this good practice. It depends on how the bytes of each data structure are laid out in memory. Any time you are casting one complicated data structure to another (i.e. structs), it's not a very good idea, especially when the two structures are not the same size.

I think the OP and many commenters have latched onto the idea that the code is extending a struct.
It is not.
This is an example of composition. Very useful. (Getting rid of the typedefs, here is a more descriptive example):
struct person {
char name[MAX_STRING + 1];
char address[MAX_STRING + 1];
};
struct item {
int x;
};
struct accessory {
int y;
};
/* fixed size memory buffer.
The Linux kernel is full of embedded structs like this
*/
struct order {
struct person customer;
struct item items[MAX_ITEMS];
struct accessory accessories[MAX_ACCESSORIES];
};
void fn(struct order *the_order){
memcpy(the_order->customer.name, DEFAULT_NAME, sizeof(DEFAULT_NAME));
}
You have a fixed size buffer that is nicely compartmentalized. It sure beats a giant single tier struct.
struct double_order {
struct order order;
struct item extra_items[MAX_ITEMS];
struct accessory extra_accessories[MAX_ACCESSORIES];
};
So now you have a second struct that can be treated (a la inheritance) exactly like the first with an explicit cast.
struct double_order d;
fn((struct order *)&d);
This preserves compatibility with code that was written to work with the smaller struct. Both the Linux kernel (http://lxr.free-electrons.com/source/include/linux/spi/spi.h (look at struct spi_device)) and the BSD sockets library (http://beej.us/guide/bgnet/output/html/multipage/sockaddr_inman.html) use this approach. In the kernel and sockets cases you have a struct that is run through both generic and differentiated sections of code. Not all that different from the use case for inheritance.
I would NOT suggest writing structs like that just for readability.

I think Postgres does this in some of their code as well. Not that it makes it a good idea, but it does say something about how widely accepted it seems to be.

Perhaps you could use macros to implement this feature, putting the function or field that needs to be reused into the macro.

Same set of instructions, different types. How to handle?

In a C program I'm making, I will receive as command lines arguments a file path and a letter. The file is where I read data from, and the letter represents the type of data that is held inside that file.
The instructions I need to perform on the data are basically the same, only the type is different: it might be that the file holds ints, doubles or the values of a struct X. Regardless of type, the operations will be identical; how can I avoid repeating code? In C++ I would handle this with templates. How would this be typically handled in C?

In C you would do it through what you're hoping to avoid -- repeating the code. C++ makes this more convenient with templates, as you're aware, however that's just a simple way to repeat the code and base it on a different type.
Something that might be appropriate for you is to provide the different class functions but to not call them directly. Instead, based on your command line, determine once which function(s) will process your data, and assign them to function pointers. Then, your control loop will just generically call the processing function(s) using those pointer(s). This will obviously include whatever you do with the data, but you might also decide to have separate input functions based on data type.
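A minimal sketch of that function-pointer dispatch, using summation as a stand-in for "the operation". The type letters 'i' and 'd' and the names sum_fn, sum_ints, sum_doubles and select_sum are assumptions for illustration only.

```c
#include <stddef.h>

/* Every handler shares one signature; the data arrives as an untyped pointer. */
typedef double (*sum_fn)(const void *data, size_t count);

double sum_ints(const void *data, size_t count) {
    const int *values = data;
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

double sum_doubles(const void *data, size_t count) {
    const double *values = data;
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Chooses the handler once, from the (assumed) command-line letter. */
sum_fn select_sum(char type_letter) {
    switch (type_letter) {
    case 'i': return sum_ints;
    case 'd': return sum_doubles;
    default:  return NULL;   /* unknown letter: caller must report an error */
    }
}
```

The control loop then calls through the returned pointer and never mentions a concrete type again; supporting struct X would mean adding one more handler and one more case.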
Edit: As Mat says, there are some types which promote well and so one block of code would work fine. I suspect this is why your assignment includes working with some structure type.

The solution to this problem is obvious with modern object-oriented languages -- you make an object of each type that implements an interface (or via inheritance) of the actions you want to perform.
You can't do this in C because the language does not natively support object orientation, but you can "reproduce" the same functionality instead of letting the compiler do it for you. To do so you need to use a level of indirection; specifically, you will need to use function pointers.
So (as an example) one of the actions you might take is to read values from the file. One of your variables will be a function pointer to a function that takes as parameters the file and a pointer of type void * (what it points to will change for each function you write). Write the function for each of your types and then, at run time, assign the function to use based on the type of the file.

In the realms of really ugly pre-processor tricks, if you want to replicate the body of a function for different types, but keep the code "structure" identical, you can do something like this:
foo.hc
#define YNAME(X) foo_ ## X
#define XNAME(X) YNAME(X)
#define NAME XNAME(TYPE)
int NAME(FILE* f) {
TYPE myvar;
...
return whatever;
}
foo.c
#define TYPE int
#include "foo.hc"
#undef TYPE
#define TYPE double
#include "foo.hc"
#undef TYPE
This foo.c will pre-process to:
int foo_int(FILE* f) {
int myvar;
...
return whatever;
}
int foo_double(FILE* f) {
double myvar;
...
return whatever;
}
All you need to do in your main processing loop with that is to dispatch to the right function depending on your file type. A plain switch statement can work pretty well, an array of function pointers could work too.

The new C standard, C11, has type generic expressions that you could use for this. There is not yet much compiler support for C11 but for example the latest version of clang has _Generic. You can also use P99 to emulate C11 features on top of similar extensions that are provided by gcc.
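For illustration, a small sketch of a C11 type-generic expression, assuming a compiler that supports _Generic. The names twice, twice_int and twice_double are hypothetical.

```c
/* One function per type, written out by hand. */
int twice_int(int v) {
    return 2 * v;
}

double twice_double(double v) {
    return 2.0 * v;
}

/* _Generic selects a function at compile time from the argument's type. */
#define twice(x) _Generic((x),      \
        int:    twice_int,          \
        double: twice_double)(x)
```

Note that _Generic only picks among functions you have already written; it does not generate code the way a C++ template does, so it combines well with the preprocessor trick above.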

Determining type of variable during run-time in C

I have several variables of type char (array), int, and double. Is there a way to identify what type they are during run-time?
For example, I'm looking for something like:
int dummyInt = 5;
double dummyDouble = 5.0;
dummyInt == int ?
printf("yes, it's of int type\n") : printf("no, it's not of int type\n");
dummyDouble == int ?
printf("yes, it's of int type\n") : printf("no, it's not of int type\n");
Where the obvious results will be:
yes, it's of int type
no, it's not of int type
Well, the reason why I need it is because I'm transferring data from variables into an SQL database (using SQLite). Now, the headers can change each time I run the program depending on which variables are being used. So when I'm creating the table, I need to tell it if it's VARCHAR, INTEGER, DOUBLE etc.

No, you can't, but why do you even want to do this? C is statically typed, so when you declare int dummyInt = 5, you already know that it's an int.
Edit:
Now, the headers can change each time I run the program depending on which variables are being used. So when I'm creating the table, I need to tell it if it's VARCHAR, INTEGER, DOUBLE etc.
That still doesn't change much. The types are still known at compile-time, so maybe all you need is a #define alongside the actual variable that defines the SQL type, say:
int someVar;
#define SOMEVAR_SQL_TYPE "INTEGER"

There is no facility in standard C to obtain the type of a variable at runtime; you are expected to do all your own bookkeeping in that regard.
If you need to create a generic container in C, you will have to figure out some way of tagging each element in the container with some sort of type information. Depending on the approach, it can be a lot of work.
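One common way to do that bookkeeping is a tagged union. A minimal sketch, with hypothetical names (value_kind, tagged_value, sql_type_name); the SQL type names are the ones mentioned in the question.

```c
/* Hypothetical tag recording which member of the union is in use. */
enum value_kind { KIND_INT, KIND_DOUBLE, KIND_TEXT };

struct tagged_value {
    enum value_kind kind;
    union {
        int         i;
        double      d;
        const char *s;
    } as;
};

/* Maps the tag to an SQL column type name. */
const char *sql_type_name(const struct tagged_value *v) {
    switch (v->kind) {
    case KIND_INT:    return "INTEGER";
    case KIND_DOUBLE: return "DOUBLE";
    case KIND_TEXT:   return "VARCHAR";
    }
    return "UNKNOWN";   /* reached only if the tag holds an invalid value */
}
```

The tag is ordinary data that the program sets itself when it stores a value, so nothing here inspects a variable's type at run time; it only remembers what the programmer already knew.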

I assume you want to have a program where you don't know the type of an object until runtime. For example, you want to implement a polymorphic sorting algorithm which doesn't know the type of the object until runtime (at which point, it knows only the size of each element because C is statically typed, but even this may break down unless you implement your own functions which are independent of all types and do all of the bit arithmetic for the objects).
If you do want to take some sort of polymorphic approach to something like the above, I recommend you use char pointers and allocate enough space on the heap for whatever size is needed at runtime. Then implement functions which receive a pointer to the space, the element size, and the number of elements, and perform the appropriate comparison/arithmetic operations. This implementation can get very complicated. It may not be complicated if you supply a module function which does know the type, but then you are back to knowing the type at compile time.
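A sketch of that pointer-plus-element-size style, with the type-aware part supplied as a callback (the same pattern the standard qsort uses). The names compare_fn, find_max and compare_doubles are hypothetical.

```c
#include <stddef.h>

/* Comparison callback: the only code that knows the real element type. */
typedef int (*compare_fn)(const void *a, const void *b);

/* Returns a pointer to the largest element, or NULL if count is 0.
   Works purely on raw bytes plus an element size. */
const void *find_max(const void *base, size_t count, size_t elem_size,
                     compare_fn cmp) {
    const char *bytes = base;
    const void *best = NULL;
    for (size_t i = 0; i < count; i++) {
        const void *current = bytes + i * elem_size;
        if (best == NULL || cmp(current, best) > 0) {
            best = current;
        }
    }
    return best;
}

/* Type-aware callback for double elements. */
int compare_doubles(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}
```

find_max never learns the element type; it steps through the buffer with char pointer arithmetic and leaves every comparison to the callback.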

No, there is no way to do it in C.

completely out of spec but try here for one example

I had the same problem, but using C++. This is not C, but perhaps an equivalent exists. I had to include the <typeinfo> header and then call typeid().
My code looked like:
#include <iostream>
#include <typeinfo>
int main()
{
int a = 10;
// Prints an implementation-specific name for the type of a.
std::cout << typeid(a).name() << std::endl;
}
I know you can share header files so perhaps it can be made to work?
The output of typeid used a letter for each type (implementation specific). i.e. an unsigned int was output as a 'j'.
Here is my original question, similar to your problem.