I'm trying to connect to MySQL from Rust code. I've tried these steps.
1. I wrote C code using mysql.h and built it with the commands below.
$ gcc -shared mysqlrust.c -o libmysqlrust.so $(mysql_config --cflags) $(mysql_config --libs)
$ cp libmysqlrust.so /usr/local/lib/rustc/i686-unknown-linux-gnu/lib/
2. I wrote Rust code that calls libmysqlrust.so.
But I couldn't figure out a way to use the C struct types "MYSQL", "MYSQL_RES", and "MYSQL_ROW".
Please show me how to use C struct types from Rust code.
There is not yet any way to automatically create Rust type definitions from C structs. In these situations there are a few ways to proceed. Not knowing the MySQL API, I can't say exactly what you should do, but here are some options.
1) Treat them entirely as opaque pointers.
This is the best situation to be in, and depends on the C API always taking the struct as a pointer, having its own constructor and destructor functions, and providing accessor functions for whatever you need to access inside the struct. In these cases you just define type MYSQL = ctypes::void and only ever use it as an unsafe pointer *MYSQL. Sometimes the easiest path is to write your own C wrappers to fill in the gaps and make this scenario possible.
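As a rough illustration of the "write your own C wrappers" idea, here is a minimal sketch of the C side. The struct `db_conn` is a hypothetical stand-in for a library type such as MYSQL, and the `wrap_conn_*` names are invented for this example. The point is only that the C code owns allocation and exposes accessors, so the foreign caller can hold the pointer without knowing the layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library struct such as MYSQL. In a real
   wrapper this definition would come from the library's own header. */
struct db_conn {
    int  port;
    char host[64];
};

/* Constructor: the C side allocates, so the caller never needs to know
   the struct's size or layout. Returns NULL on failure. */
struct db_conn *wrap_conn_create(const char *host, int port)
{
    struct db_conn *c;

    if (host == NULL || strlen(host) >= sizeof c->host) {
        return NULL;
    }
    c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    c->port = port;
    strcpy(c->host, host);   /* length was checked above */
    return c;
}

/* Accessors: the only way the foreign caller looks inside the struct. */
int wrap_conn_port(const struct db_conn *c)
{
    assert(c != NULL);
    return c->port;
}

const char *wrap_conn_host(const struct db_conn *c)
{
    assert(c != NULL);
    return c->host;
}

/* Destructor: releases what the constructor allocated. */
void wrap_conn_destroy(struct db_conn *c)
{
    free(c);
}
```

On the Rust side each of these functions would be declared as taking or returning an unsafe pointer to the opaque type.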
The remaining scenarios all involve redefining a Rust data structure with the same structure as the C struct. Rust tries to lay out its data structures in a way that is compatible with C (though doesn't always succeed yet), so it is often possible to create a Rust record or enum with the size, alignment and layout of the C struct you care about. You will want to make sure you use the types in core::ctypes, as they are defined to match various common C types.
Note that the ctypes module will be going away soon in favor of a more comprehensive libc compatibility module.
2) Define a Rust record that is partially correct.
If the API provides constructors and destructors, but you still need access to some fields of the struct, then you can define just enough of the struct to get at the fields you care about, disregarding things like the correct size and alignment. e.g. type MYSQL = { filler1: ctypes::int, ..., connector_fd: *ctypes::char }. You can stop defining the struct at the last field you care about since you have a C function to allocate it on the heap with the correct size and alignment. In Rust code you always refer to it with an unsafe pointer: let mysql: *MYSQL = mysqlrust::create_mysql();
3) Define a Rust record that is the correct size and alignment, without caring about the contents.
If you don't have constructor/destructor functions, or need to store the struct on the stack, but you otherwise have accessor functions to manipulate the contents of the struct, then you need to define a Rust record with the correct size and alignment. To do this, just add fields of type uint (which is always pointer-sized) or tuples of uint, until both C's sizeof and core::sys::size_of agree on the size. Pad with u8s if the size isn't a multiple of the pointer size. Getting the alignment right is a more mystical process, but by using uint fields you will generally end up with a usable alignment (maybe - I really have no idea how accurate that statement is).
I would recommend adding tests to sanity check that Rust and C agree on the size in order to guard against future breakage.
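One way to build such a sanity check is to have the C wrapper library export the numbers it sees, so that a Rust test can compare them with its own size and field offsets. This is only a sketch: `struct sample` is a hypothetical stand-in for the real struct, and the `sample_*` function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the C struct being mirrored on the Rust side. */
struct sample {
    int    id;
    double weight;
    char   tag[5];
};

/* Helper type used only to measure the alignment of struct sample:
   the offset of `s` after a single char is its alignment requirement. */
struct sample_align_probe {
    char          c;
    struct sample s;
};

/* What the C compiler thinks the total size is. */
size_t sample_sizeof(void)
{
    return sizeof(struct sample);
}

/* What the C compiler thinks the alignment is. */
size_t sample_alignof(void)
{
    return offsetof(struct sample_align_probe, s);
}

/* Offset of one field, useful when only part of the struct is mirrored. */
size_t sample_offsetof_weight(void)
{
    size_t off = offsetof(struct sample, weight);
    assert(off + sizeof(double) <= sizeof(struct sample));
    return off;
}
```

A test on the Rust side would call these through the foreign interface and fail if any value differs from what the Rust definition produces.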
4) Actually redefine the entire C struct.
This is a pretty dire situation for large structs, and it is possible in theory, but I don't think anybody has done it for a struct as big as MYSQL. I would avoid it if you can. Eventually there will be a clang-based tool to do this automatically.
Here are some examples of interop with C structs:
https://github.com/jdm/rust-socket/blob/master/socket.rs - This redefines various socket structs, adding placeholders for fields it doesn't care about. Note that it uses u8 for padding, but I think uint is more likely to produce correct alignment.
https://github.com/erickt/rust-zmq/blob/master/zmq.rs
https://github.com/pcwalton/rust-spidermonkey - This one demonstrates interop with a somewhat complex API.
Related
In my C89 code, I have several units implementing a variety of abstract buffers which are to be treated by the user as if they were classes. That is, there is a public header defining the interfacing functions, and this is all the user ever sees. They are not intended to (need to) know what is going on behind the scenes.
However, at buffer creation, a raw byte-buffer is passed to the creation function, so the user must be able to know how much raw buffer space to allocate at compile time. This requires knowing how much space one item takes up in each abstract type. We are coding for a very limited embedded environment.
Currently, each buffer type has a private header in which a struct defines the format of the data. It is simple to add a macro for the size of the data element:
#define MY_ELEMENT_SIZE (sizeof(component_1_type) + sizeof(component_2_type))
However, component_x_type is intended to be hidden from the user, so this definition cannot go in the public header with the prototypes for the interfacing functions.
Our next idea was to have a const variable in the source:
const int MY_ELEMENT_SIZE = sizeof(component_1_type) + sizeof(component_2_type);
and an extern declaration in the public header:
extern const int MY_ELEMENT_SIZE;
But, because this is C89 and we have pedantry and MISRA and other requirements to fulfill, we cannot use variable-length arrays. In a "user" source file, to get a 50-element raw buffer, we write:
char rawBuffer[50 * MY_ELEMENT_SIZE] = {0u};
Using the extern const... method, this results in the compilation error:
error: variably modified ‘rawBuffer’ at file scope
This was not totally unexpected, but is disappointing in that sizeof(any_type) is genuinely constant and known at compile time.
Please advise me on how to expose the size of the data element in the public header without making the existence of component_x_type known to the user, in such a way that it can be used as an array length in C89.
Many, many thanks.
In my C89 code
It is 2020 now. Discuss with your manager or client the opportunity to use a less obsolete C standard. In practice, most hand-written C89 code can be reasonably ported to C11, and you could use, buy or develop code refactoring tools -or services- helping you with that (e.g. your GCC plugin). Remind your manager or client that technical debt has a lot of cost (probably tens of thousands of US$ or €). Notice that old C89 compilers in practice optimize much less than recent ones, and that most junior developers (your future colleagues) are not even familiar with C89 (so they would need some kind of training, which costs a lot).
How can I hide the contents of a user-exposed C preprocessor definition in non-user code?
As far as I know, you cannot (in theory). Check by reading the C11 standard n1570. Read also the documentation of GNU cpp then of GCC (or of your C compiler).
we have pedantry and MISRA and other requirements to fulfill
Be aware that these requirements have costs. Remind your client or manager of these costs.
(about hiding the content of a user-exposed C preprocessor #define)
However, in practice, a C code (e.g. inside some internal header file #include-d in your translation unit) can be generated, and this is common practice (look into GNU bison or SWIG for a well known example of C code generator, and also consider using GNU m4 or gpp or your own Guile or Python script, or your own C or C++ program emitting C code). You simply have to configure your build automation infrastructure (e.g. write your Makefile) for such a case.
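To make the generated-header idea concrete, here is a minimal sketch of the formatting step of such a generator. It is an assumption-laden illustration: `format_size_define` is a made-up helper, and a real generator would be compiled together with the private header so that it can pass `sizeof` of the hidden types. It uses snprintf (C99), which should be acceptable because the generator runs on the build machine; the line it emits is plain C89.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "#define <name> <size>\n" into out.
   Returns the number of characters written (excluding the terminating
   NUL), or -1 if the buffer is too small or formatting failed. */
int format_size_define(char *out, size_t cap, const char *name, size_t size)
{
    int n;

    assert(out != NULL && name != NULL);
    n = snprintf(out, cap, "#define %s %lu\n", name, (unsigned long)size);
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    return n;
}
```

The build would run the generator before compiling user code and write the resulting line into the public header, so the user sees only a number and never the component types.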
If you have some script or utility generating things like #define MACRO_7oa7eIzzcxv03Tm (where MACRO_7oa7eIzzcxv03Tm is some pseudo-random or name mangled identifier) then the probability of an accidental collision with client code is quite small. A human programmer is very unlikely to think of such identifiers, and with enough care a C generating script usually won't emit identifiers colliding with that. See also this answer.
Perhaps your client or manager allows you to use (on your desktop) some generator of such "random-looking" identifier. AFAIK, they are MISRA compatible (but my MISRA standard is at office, and I am -may 2020- currently Covid19 confined at home, near Paris, France).
we cannot use variable-length arrays.
You could (with approval from manager and client) consider using struct-s with flexible array members or else use arrays of dimension 0 or 1 as the last member of your struct-s. IIRC, that was common practice in SunOS3.2
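For readers who have not seen one, a flexible array member looks roughly like this. Note the assumptions: it needs C99 or later, the sketch allocates from the heap, and `byte_buffer` and `byte_buffer_new` are hypothetical names. Whether either feature is acceptable under your coding rules is something to check against your own rule set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header fields followed by storage whose length is chosen at allocation. */
struct byte_buffer {
    size_t        capacity;
    unsigned char data[];   /* flexible array member (C99) */
};

/* Allocates the header and `capacity` zeroed bytes in one block.
   Returns NULL if the size would overflow or allocation fails. */
struct byte_buffer *byte_buffer_new(size_t capacity)
{
    struct byte_buffer *b;

    if (capacity > (size_t)-1 - sizeof(struct byte_buffer)) {
        return NULL;
    }
    b = calloc(1, sizeof(struct byte_buffer) + capacity);
    if (b == NULL) {
        return NULL;
    }
    b->capacity = capacity;
    assert(b->data[0] == 0 || capacity == 0);   /* calloc zeroed the storage */
    return b;
}
```

The caller releases the whole object with a single free().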
Consider also using tools like Frama-C, Clang static analyzer, or -at end of 2020- my Bismon coupled with a recent GCC. Think of subcontracting the code review of your source code.
In addition to the other answers, this is a quite primitive proposal. But it is easy to understand.
Since presumably you will not publish your header files to your clients too often, and so will not change the sizes of the types, you can use a (manually or automatically) calculated definition:
#define OUR_LIB_TYPE_X_SIZE 23
In your private sources you can then check the correctness of this assumption for example by
typedef char assert_type_x_has_size[2 * (sizeof (TypeX) == OUR_LIB_TYPE_X_SIZE) - 1];
It will error on any decent compiler on unequal sizes, because the array's size will be -1 and illegal. On equal sizes, the array's size is 1 and legal.
Because you're just defining a type, no code or memory is allocated. You might need to mark this as "unused" for some compilers or code checkers.
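Put together in one file for illustration, the pieces might look like the sketch below. The layout of `TypeX` and the value 8 are assumptions chosen for the example; in a real project the first part lives in the public header, the second in the private source, and the third in user code.

```c
#include <assert.h>
#include <stddef.h>

/* ---- public header: the user sees only a number ---- */
#define OUR_LIB_TYPE_X_SIZE 8   /* hand-maintained, checked below */

/* ---- private source: the real layout (hypothetical here) ---- */
typedef struct {
    unsigned char component_1[3];
    unsigned char component_2[5];
} TypeX;

/* Array size is 1 when the sizes agree and -1 (a compile error) when
   they do not. No object is created, so no memory is used. */
typedef char assert_type_x_has_size[2 * (sizeof (TypeX) == OUR_LIB_TYPE_X_SIZE) - 1];

/* ---- user code: a constant expression, so valid as a C89 array length ---- */
static unsigned char raw_buffer[50 * OUR_LIB_TYPE_X_SIZE];

/* Total bytes reserved for the 50 elements. */
size_t raw_buffer_bytes(void)
{
    return sizeof raw_buffer;
}

/* Address of element i inside the raw buffer. */
unsigned char *raw_buffer_element(size_t i)
{
    assert(i < 50);
    return &raw_buffer[i * OUR_LIB_TYPE_X_SIZE];
}
```

If someone later adds a field to `TypeX` without updating the define, the typedef line stops compiling, which is the whole point of the check.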
I've encountered this very problem too - unfortunately private encapsulation also makes the object size encapsulated. Sometimes it is sufficient to simply return the object size through a getter function, but not always.
I solved it exactly as KamilCuk showed in comments: give the caller a raw "magic number" through a #define in the .h file, then a static assert inside the .c implementation checking that the define is consistent with the object size.
If that's not elegant enough, then perhaps you could consider outsourcing the size allocation to a run-time API from the "class":
uint8_t* component1_get_raw_buffer (size_t n);
Where you return a pointer to a statically allocated buffer inside the encapsulated "class". The caller code would then have to be changed to:
uint8_t* raw_buffer;
raw_buffer = component1_get_raw_buffer(50);
This involves some internal trickery keeping track of how much memory has been allocated (and error handling - maybe return NULL on failure). You will need to reserve a fixed maximum size for the internal static buffer, to cover the worst use-case scenario.
(Optionally: const qualify the returned pointer if the user isn't supposed to modify the data)
Advantages are: better OO design, no heap allocation, remain MISRA-C compliant. Disadvantages are function call overhead during initialization and the need to set aside "enough" memory in advance.
Also, this method isn't very safe in a multi-threading environment, but that's not usually an issue in embedded systems.
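A minimal sketch of what the inside of such a function could look like is shown below. The element size and the pool size are hypothetical numbers picked for the example, and this simple version only hands memory out and never takes it back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COMPONENT1_ELEMENT_SIZE 8u      /* hypothetical; private to the .c file */
#define COMPONENT1_POOL_BYTES   1024u   /* hypothetical worst-case budget */

static uint8_t component1_pool[COMPONENT1_POOL_BYTES];
static size_t  component1_used = 0u;

/* Hands out room for n elements from the static pool.
   Returns NULL if n is zero or the pool cannot hold that many more. */
uint8_t *component1_get_raw_buffer(size_t n)
{
    uint8_t *p;

    assert(component1_used <= COMPONENT1_POOL_BYTES);

    /* Dividing the remaining space avoids overflow in n * element size. */
    if (n == 0u ||
        n > (COMPONENT1_POOL_BYTES - component1_used) / COMPONENT1_ELEMENT_SIZE) {
        return NULL;
    }

    p = &component1_pool[component1_used];
    component1_used += n * COMPONENT1_ELEMENT_SIZE;
    return p;
}
```

The bookkeeping variable is the part that would need protection if the function were ever called from more than one thread.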
I guess there must be a duplicated question here but I couldn't find it. I'm recently working on a C project and, while trying to leave the code as concise as possible, I considered typedef-ing a consistently-used array with a certain type.
As an example, suppose the array of a structure type entry has always the fixed length of MAX_N_ENTRIES. I'd like to reduce the redundancy by rewriting the code;
struct entry ents[MAX_N_ENTRIES];
to this code;
typedef struct entry entry_arr_t[MAX_N_ENTRIES];
entry_arr_t ents;
What I'm concerned about is that, as the array type obviously should be handled in a different way to any primitive types in C, this kind of typedef-ing can cause confusion in the future, making it look like an alias of primitives.
Yes, it's possible to create a typedef for an array type -- and there's even an example in the Standard C library, namely the jmp_buf type that's used with setjmp and longjmp.
It's usually considered poor style, however, because type names are usually assumed to refer to first-class types that you can do every ordinary first-class-type thing with, and in particular: assign them. But of course you can't assign arrays in C, because they're not first-class types.
In other words, given the typedef in your question, a later programmer might assume that it would be possible to write
entry_arr_t ents1, ents2;
...
ents1 = ents2;
But of course that assignment would fail.
The fact that you've included "arr" in the typedef name does indeed mitigate this concern, making it less likely that the hypothetical later programmer would make the bad assumption.
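To illustrate the difference, here is a small sketch. With the array typedef a copy has to be spelled out, for instance with memcpy, while an array wrapped in a struct can be assigned directly. The fields of `struct entry`, the value of `MAX_N_ENTRIES`, and the names `entry_arr_copy` and `entry_table` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_N_ENTRIES 4   /* hypothetical value */

struct entry {
    int key;
    int value;
};

/* The typedef from the question: an array type, so not assignable. */
typedef struct entry entry_arr_t[MAX_N_ENTRIES];

/* Copying has to be explicit. The parameters are pointers because an
   array argument decays to a pointer to its first element. */
void entry_arr_copy(struct entry *dst, const struct entry *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(entry_arr_t));
}

/* Alternative: wrap the array in a struct. Struct objects can be
   assigned, passed by value, and returned like any other value. */
struct entry_table {
    struct entry items[MAX_N_ENTRIES];
};
```

With these definitions, `entry_arr_copy(ents1, ents2)` does what `ents1 = ents2` cannot, and two `struct entry_table` objects can simply be assigned.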
If I am developing a C shared library and I have my own structs. To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself? Is it a good practice? Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
I know it goes a lot closer to C++ classes but I wish to stick to C and learn how it would be done in a procedural language as opposed to OOP.
To give an example
typedef struct tag tag;
typedef struct my_custom_struct my_custom_struct;
struct tag
{
// ...
};
struct my_custom_struct
{
tag *tags;
my_custom_struct* (*add_tag)(my_custom_struct* str, tag *tag);
};
my_custom_struct* add_tag(my_custom_struct* str, tag *tag)
{
// ...
}
where add_tag is a helper that adds the tag to the tag list inside *str.
I saw this pattern in libjson-c like here- http://json-c.github.io/json-c/json-c-0.13.1/doc/html/structarray__list.html. There is a function pointer given inside array_list to help free it.
To make common operations on these struct instances easier for library
consumers, can I provide function pointers to such functions inside
the struct itself?
It is possible to endow your structures with members that are function pointers, pointing to function types whose parameters include pointers to your structure type, and that are intended to be used more or less like C++ instance methods, more or less as presented in the question.
Is it a good practice?
TL;DR: no.
The first problem you will run into is getting those pointer members initialized appropriately. Name correspondence notwithstanding, the function pointers in instances of your structure will not automatically be initialized to point to a particular function. Unless you make the structure type opaque, users can (and undoubtedly sometimes will) declare instances without calling whatever constructor-analog function you provide for the purpose, and then chaos will ensue.
If you do make the structure opaque (which after all isn't a bad idea), then you'll need non-member functions anyway, because your users won't be able to access the function pointers directly. Perhaps something like this:
struct my_custom_struct *my_add_tag(struct my_custom_struct *str, tag *tag) {
return str->add_tag(str, tag);
}
But if you're going to provide for that, then what's the point of the extra level of indirection? (Answer: the only good reason for that would be that in different instances, the function pointer can point to different functions.)
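For contrast, here is a minimal sketch of the one case where the indirection earns its keep: instances of the same struct type whose pointer refers to different functions. The `shape` type and every function name in it are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct shape shape;

struct shape {
    int width;
    int height;
    int (*area)(const shape *self);   /* differs per instance */
};

static int rect_area(const shape *self)
{
    return self->width * self->height;
}

static int triangle_area(const shape *self)
{
    return self->width * self->height / 2;
}

/* Constructor-like helpers: the only place the pointer gets set. */
shape shape_make_rect(int w, int h)
{
    shape s;
    s.width = w;
    s.height = h;
    s.area = rect_area;
    return s;
}

shape shape_make_triangle(int w, int h)
{
    shape s;
    s.width = w;
    s.height = h;
    s.area = triangle_area;
    return s;
}

/* Ordinary non-member function that dispatches through the pointer. */
int shape_area(const shape *s)
{
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}
```

Here the caller uses `shape_area` for both kinds of shape, and the pointer decides which computation runs. If every instance pointed at the same function, the pointer would add nothing.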
And something similar applies if you don't make the structure opaque. Then you might suppose that users would (more) directly call
str->add_tag(str, tag);
but what exactly makes that a convenience with respect to simply
add_tag(str, tag);
?
So overall, no, I would not consider this approach a good practice in general. There are limited circumstances where it may make sense to do something along these lines, but not as a general library convention.
Would there be issues with
respect to multithreading where a utility function is called in
parallel with different arguments and so on?
Not more so than with functions designated any other way, except if the function pointers themselves are being modified.
I know it goes a lot closer to C++ classes but I wish to stick to C
and learn how it would be done in a procedural language as opposed to
OOP.
If you want to learn C idioms and conventions then by all means do so. What you are describing is not one. C code and libraries can absolutely be designed with use of OO principles such as encapsulation, and to some extent even polymorphism, but it is not conventionally achieved via the mechanism you describe. This answer touches on some of the approaches that are used for the purpose.
Is it a good practice?
TLDR; no.
Background:
I've been programming almost exclusively in embedded C on STM32 microcontrollers for the last year and a half (as opposed to using C++ or "C+", as I'll describe below). It's been very insightful for me to have to learn C at the architectural level, like I have. I've studied C architecture pretty hard to get to where I can say I "know C". It turns out, as we all know, C and C++ are NOT the same language. At the syntax level, C is almost exactly a subset of C++ (with some key differences where C supports stuff C++ does not), hence why people (myself included before this) frequently think/thought they are pretty much the same language, but at the architectural level they are VASTLY DIFFERENT ANIMALS.
Aside:
Note that my favorite approach to embedded is to use what some colloquially know as "C+". It is basically using a C++ compiler to write C-style embedded code. You basically just write C how you'd expect to write C, except you use C++ classes to vastly simplify the (otherwise pure C) architecture. In other words, "C+" is a pseudonym used to describe using a C++ compiler to write C-like code that uses classes instead of "object-based C" architecture (which is described below). You may also use some advanced C++ concepts on occasion, like operator overloading or templates, but avoid the STL for the most part to not accidentally use dynamic allocation (behind-the-scenes and automatically, like C++ vectors do, for example) after initialization, since dynamic memory allocation/deallocation in normal run-time can quickly use up scarce RAM resources and make otherwise-deterministic code non-deterministic. So-called "C+" may also include using a mix of C (compiled with the C compiler) and C++ (compiled with the C++ compiler), linked together as required (don't forget your extern "C" usage in C header files included in your C++ code, as required).
The core Arduino source code (again, the core, not necessarily their example "sketches" or example code for beginners) does this really well, and can be used as a model of good "C+" design. <== before you attack me on this, go study the Arduino source code for dozens of hours like I have [again, NOT the example "sketches", but their actual source code, linked-to below], and drop your "arduino is for beginners" pride right now.
The AVR core (mix of C and "C+"-style C++) is here: https://github.com/arduino/ArduinoCore-avr/tree/master/cores/arduino
Some of the core libraries ("C+"-style C++) are here: https://github.com/arduino/ArduinoCore-avr/tree/master/libraries
[aside over]
Architectural C notes:
So, regarding C architecture (ie: actual C, NOT "C+"/C-style C++):
C is not an OO language, as you know, but it can be written in an "object-based" style. Notice I say "object-based", NOT "object oriented", as that's how I've heard other pedantic C programmers refer to it. I can say I write object-based C architecture, and it's actually quite interesting.
To make object-based C architecture, here's a few things to remember:
Namespaces can be done in C simply by prepending your namespace name and an underscore in front of something. That's all a namespace really is after-all. Ex: mylibraryname_foo(), mylibraryname_bar(), etc. Apply this to enums, for example, since C doesn't have "enum classes" like C++. Apply it to all C class "methods" too since C doesn't have classes. Apply to all global variables or defines as well that pertain to a particular library.
When making C "classes", you have 2 major architectural options, both of which are very valid and widely used:
Use public structs (possibly hidden in headers named "myheader_private.h" to give them a pseudo-sense of privacy)
Use opaque structs (frequently called "opaque pointers" since they are pointers to opaque structs)
When making C "classes", you have the option of wrapping up pointers to functions inside of your structs above to give it a more "C++" type feel. This is somewhat common, but in my opinion a horrible idea which makes the code nearly impossible to follow and very difficult to read, understand, and maintain.
1st option, public structs:
Make a header file with a struct definition which contains all your "class data". I recommend you do NOT include pointers to functions (will discuss later). This essentially gives you the equivalent of a "C++ class where all members are public." The downside is you don't get data hiding. The upside is you can use static memory allocation of all of your C "class objects" since your user code which includes these library headers knows the full specification and size of the struct.
2nd option: opaque structs:
In your library header file, make a forward declaration to a struct:
/// Opaque pointer (handle) to C-style "object" of "class" type mylibrarymodule:
typedef struct mylibrarymodule_s *mylibrarymodule_h;
In your library .c source file, provide the full definition of the struct mylibrarymodule_s. Since users of this library include only the header file, they do NOT get to see the full implementation or size of this opaque struct. That is what "opaque" means: "hidden". It is obfuscated, or hidden away. This essentially gives you the equivalent of a "C++ class where all members are private." The upside is you get true data hiding. The downside is you can NOT use static memory allocation for any of your C "class objects" in your user code using this library, since any user code including this library doesn't even know how big the struct is, so it cannot be statically allocated. Instead, the library must do dynamic memory allocation at program initialization, one time, which is safe even for embedded deterministic real-time safety-critical systems since you are not allocating or freeing memory during normal program execution.
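As a compact illustration, the header part and the source part are shown together in one listing below. The function names `mylibrarymodule_open` and `mylibrarymodule_close`, the `total` field, and the behavior of `mylibrarymodule_dosomething` are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- mylibrarymodule.h (what users include) ---- */

/* Opaque pointer (handle) to a C-style "object" of "class" mylibrarymodule. */
typedef struct mylibrarymodule_s *mylibrarymodule_h;

mylibrarymodule_h mylibrarymodule_open(void);
int  mylibrarymodule_dosomething(mylibrarymodule_h me, int a, int b);
void mylibrarymodule_close(mylibrarymodule_h me);

/* ---- mylibrarymodule.c (hidden from users) ---- */

struct mylibrarymodule_s {
    int total;   /* hypothetical private data */
};

/* Allocates one object. Intended to be called once at initialization. */
mylibrarymodule_h mylibrarymodule_open(void)
{
    mylibrarymodule_h me = malloc(sizeof *me);
    if (me != NULL) {
        me->total = 0;
    }
    return me;
}

/* Example "method": accumulates a + b and returns the running total. */
int mylibrarymodule_dosomething(mylibrarymodule_h me, int a, int b)
{
    assert(me != NULL);
    me->total += a + b;
    return me->total;
}

void mylibrarymodule_close(mylibrarymodule_h me)
{
    free(me);
}
```

User code can hold and pass the handle but cannot dereference it, because the struct definition is visible only inside the .c file.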
For a detailed and full example of Option 2 (don't be confused: I call it "Option 1.5" in my answer linked-to here) see my other answer on opaque structs/pointers here: Opaque C structs: how should they be declared?.
Personally, I think the Option 1, with static memory allocation and "all public members", may be my preferred approach, but I am most familiar with the opaque struct Option 2 approach, since that's what the C code base I work in the most uses.
Bullet 3 above: including pointers to functions in your structs.
This can be done, and some do it, but I really hate it. Don't do it. It just makes your code so stinking hard to follow. In Eclipse, for instance, which has an excellent indexer, I can Ctrl + click on anything and it will jump to its definition. What if I want to see the implementation of a function I'm calling on a C "object"? I Ctrl + click it and it jumps to the declaration of the pointer to the function. But where's the function??? I don't know! It might take me 10 minutes of grepping and using find or search tools, digging all around the code base, to find the stinking function definition. Once I find it, I forget where I was, and I have to repeat it all over again for every single function, every single time I edit a library module using this approach. It's just bad. The opaque pointer approach above works fantastic instead, and the public struct approach would be easy too.
Now, to directly answer your questions:
To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself?
Yes you can, but it only makes calling something easier. Don't do it. Finding the function to look at its implementation becomes really hard.
Is it a good practice?
No, use Option 1 or Option 2 above instead, where you now just have to call C "namespaced" "methods" on every C "object". You must simply pass the "members of the C class" into the function as the first argument for every call instead. This means instead of in C++ where you can do:
myclass.dosomething(int a, int b);
You'll just have to do in object-based C:
// Notice that you must pass the "guts", or member data
// (`mylibrarymodule` here), of each C "class" into the namespaced
// "methods" to operate on said C "class object"!
// - Essentially you're passing around the guts (member variables)
// of the C "class" (which guts are frequently referred to as
// "private data", or just `priv` in C lingo) to each function that
// needs to operate on a C object
mylibrarymodule_dosomething(mylibrarymodule_h mylibrarymodule, int a, int b);
Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
Yes, same as in any multithreaded situation where multiple threads are trying to access the same data. Just add a mutex to each C struct-based "object", and be sure each "method" acting on your C "objects" properly locks (takes) and unlocks (gives) the mutex as required before operating on any shared volatile members of the C "object".
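A minimal sketch of that pattern follows, assuming POSIX threads are available (the question does not say which threading library is in use). The `counter_t` type and its functions are hypothetical names for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A C "object" that carries its own lock. */
typedef struct {
    pthread_mutex_t lock;
    int             count;   /* shared data guarded by lock */
} counter_t;

/* Returns 0 on success, like pthread_mutex_init. */
int counter_init(counter_t *c)
{
    assert(c != NULL);
    c->count = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

/* Every "method" takes the lock before touching shared members
   and releases it before returning. */
int counter_add(counter_t *c, int n)
{
    int rc;
    int result;

    assert(c != NULL);
    rc = pthread_mutex_lock(&c->lock);
    assert(rc == 0);
    c->count += n;
    result = c->count;
    rc = pthread_mutex_unlock(&c->lock);
    assert(rc == 0);
    (void)rc;   /* silences unused warnings when asserts are disabled */
    return result;
}

void counter_destroy(counter_t *c)
{
    assert(c != NULL);
    pthread_mutex_destroy(&c->lock);
}
```

Because the lock lives inside each object, two threads working on different objects never wait on each other.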
Related:
Opaque C structs: how should they be declared? [use "Object-based" C architecture]
I would suggest reading the COM specification; you will gain a lot from it. All of the COM, OLE and DCOM technology is based on a simple struct that incorporates its own data and methods.
https://www.scribd.com/document/45643943/Com-Spec
A simplified explanation is here:
http://www.voidcn.com/article/p-fixbymia-beu.html
Is it possible to determine the elements(name & datatype) in a structure(C language) in a library ? If yes, how to do it in C language ? If C language does not support it, Is it possible to get the structure elements by other tricks or is there any tool for it?
Do you mean find out when you are programming, or dynamically at runtime?
For the former, sure. Just find the .h file which you are including and you will find the struct definition there including all the fields.
For the latter, no, it is not possible. C compiles structs to machine code in such a way that all of this information is lost. For example, if you have a struct {int x; float y; int z;}, and you have some code which says
a = mystruct.y
in the machine code, all that will remain is something like finding the pointer to mystruct, adding 4 to it (the size of the int), and reading 4 bytes from there, then doing some floating point operations to it. Neither the names nor the types of those struct fields will be accessible at all, and therefore, there is no way to find them out at runtime.
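The same thing can be written out by hand in C, which may make the point clearer. In this sketch `read_y_by_offset` is a made-up helper: it does what the compiled code does, namely start at the address of the struct, step forward by a fixed number of bytes, and read a float from there. The offset comes from the standard offsetof macro at compile time, and nothing about the name `y` survives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mystruct {
    int   x;
    float y;
    int   z;
};

/* Reads the field y using only a base address and a byte offset. */
float read_y_by_offset(const struct mystruct *s)
{
    float y;

    assert(s != NULL);
    memcpy(&y,
           (const unsigned char *)s + offsetof(struct mystruct, y),
           sizeof y);
    return y;
}
```

On a typical platform with a 4-byte int that offset is 4, but the exact value depends on the compiler and target.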
No, it isn't possible. C has no inbuilt reflection-style support.
If by "determine the elements of a structure" you mean "get the declaration of that structure type programmatically", then I do not believe that it is possible - at least not portably. Contrary to more modern languages like C++ or Java, C does not keep type information in a form available to the actual program.
EDIT:
To clarify my comment about it being impossible "portably":
There could very well be some compiler+debugging format combination that would embed the necessary information in the object files that it produces, although I can't say I know of one. You could then, hypothetically, have the program open its own executable file and parse the debugging information. But this is a cumbersome and fragile approach, at best...
Why do you need to do something like that?
Being a developer born and raised on OO, I was curious to hear how it's possible to avoid global state in a procedural program.
You can also write object-oriented code in C. You don't get all the C++ goodies and it's ugly, and you have to manually pass the this pointer (I've seen self used for this, in order to make it compatible with C++), but it works. So technically, you don't need global state in pure procedural languages for the very same reasons you don't need it in object-oriented languages. You just have to pass the state around explicitly, rather than implicitly like in OO languages.
As an example, look at how the file I/O functions in the C standard library work with pointer to FILE objects that are (largely) opaque. Or look at how OS APIs deal with handles and such to encapsulate information. A program creates objects, uses APIs that act on those objects and closes/deletes the objects - all using straight C.
A global variable is nothing but an implicit procedure argument. Make it explicit and the global variable goes away.
Note: the fact that you no longer use a global variable does not mean that you no longer use global state! What we did above was just a purely syntactical transformation, the semantics of the program haven't changed at all. It's just as non-composable, non-modular, non-threadsafe, non-parallelizable as it was before.
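A small before-and-after sketch of that transformation is given below. The names `g_next_id`, `id_source`, and `next_id` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the function silently depends on a global variable. */
static int g_next_id = 0;

int next_id_global(void)
{
    return ++g_next_id;
}

/* After: the same state, passed in explicitly by the caller. */
typedef struct {
    int next;
} id_source;

int next_id(id_source *src)
{
    assert(src != NULL);
    return ++src->next;
}
```

With the explicit version a caller can create several independent `id_source` objects. If the whole program still threads one single instance through every call, the behavior is the same as with the global, which is the caveat made in the note above.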
All OO is a mindset and a whole bunch of compiler support.
You can achieve much the same by discipline, coding conventions, and passing around structures in most languages.
For example I used to have functions/procedures prefixed with their module identity, taking the first parameter as being the related module struct.
// System.h
typedef struct _System
{
struct _System *owner;
LinkedList *elements;
} System;
// System.c
int System_FindName ( System * system, char *name)
{
..
}
etc..
I'd really seriously not like to have to go back to coding like this though. I'm very happy that I haven't had to write and debug a linked list for at least 18 years. It was hard back then without the internet and sitting there isolated in the corner of a cold brightly lit room with green phosphors burning into your retina...
Of course. Just declare a struct somewhere, allocate some memory for it, pass the pointer to the allocated memory to an initialization function, and off you go. Just pass the pointer to all the functions that require using the struct.
Though the question arises as to where you store the pointer to the data you don't want to be global, and then you may end up with a global pointer ;-)
You can have variables on stack or in heap that will exist during all the program life.
Passing object style structure pointers to every function is a good way to have OO C coding style.
(I would suggest to have a look in linux sources)
You could try, as an example, creating a simple class (for example, a square) with dia (the diagramming tool).
http://projects.gnome.org/dia/
http://dia-installer.de/index_en.html
Then, you can transform that class in C code using dia2code:
http://dia2code.sourceforge.net/
Specifically, say you created the class square inside the square.dia diagram. Then, you type:
$ dia2code -t c square.dia
... and you will see that it is possible to convert any object-oriented design into a C program without global variables. Explore the created files square.c and square.h
NOTE: in Windows, you'll need a workaround in order to make dia2code work. Before using dia2code, change square.dia to square.zip, unzip it, and rename the result as square.dia
Simple. Whenever a procedure accesses a global variable, then give this variable as an argument to the procedure instead, either by value or by reference or by pointer, or by whatever your programming language provides. After that there is no more need for the variable to be global.