What are the pitfalls and gotchas of mixing Objective-C and C? - c

At the risk of oversimplifying something I'm worried might be ridiculously complex, what should I be aware of when mixing C and Objective-C?
Edit: Just to clarify, I've never worked with C before, and I'm learning Objective-C through Cocoa. Also I'm using the Chipmunk Dynamics engine, which is C.

I'd put it the other way around: you might be risking overcomplicating something that is ridiculously simple :-)
Ok, I'm being a bit glib. As others are pointing out, Objective-C is really just a minimal set of language extensions to C. When you are writing Objective-C code, you are actually writing C. You can even access the internal machinations of the Objective-C runtime support using some handy C functions that are part of the language (no... I don't recommend you actually DO this unless you really know what you're doing).
About the only time I've ever had mildly tricky moments is when I wanted to pass an Objective-C instance method as a callback to a C function. Say, for example, I'm using a pure-C cross platform library that has functions which accept a callback. I might call the function from within an object instance to process some data, and then want that C function to call my instance BACK when its done, or as part of getting additional input etc etc (a common paradigm in C). This can be done with funky function wrapping, and some other creative methods I've seen, and if you ever need to do it googling "objective-c method for c callback" or something like that will give you the goods.
The only other word of advice is to make sure your objects appropriately manage any manually malloced memory that they create for use by C functions. You'll want your objective-c classes to tidy up that memory on dealloc if, indeed, it is finished.
Other than that, dust off any reference on C and have fun!

You can't 'mix' C and Objective-C: Objective-C is a superset of C.
Now, C++ and Objective-C on the other hand...

Objective C is a superset of C, so it shouldn't conflict.
Except that, as pointed here pure C has different conventions (obviously, since there is no built-in mechanism) to handle OO programming. In C, an object is simply a (struct *) with function pointers.

Related

Emulating lambdas in C?

I should mention that I'm generating code in C, as opposed to doing this manually. I say this because it doesn't matter too much if there's a lot of code behind it, because the compiler should manage it all. Anyway, how would I go around emulating a lambda in C? I was thinking I could just generate a function with some random name somewhere in the source code and then call that? I'm not too sure. I haven't really tried anything just yet, since I wanted to get the idea down before I implement it.
Is there some kind of preprocessor directive I can do, or some macro that will make this cleaner to do? I've been inspired by Jon Blow to try out compiler development, and he seemed to implement Lambdas in his language Jai. However, I think he does something where he generates bytecode, and then into C? I'm not sure.
Edit:
I'm working on a compiler, the compiler is just a project of mine to keep me busy, plus I wanted to learn more about compilers. I primarily use clang, I'm on Ubuntu 14.10. I don't have any garbage collection, but I wanted to try my hand at some kind of smart pointer-y/rust/ARC inspired memory model for garbage collection, i.e. little to no overhead. I chose C because I wanted to dabble in it more. My project is free software, just a hobby project.
There are several ways of doing it ("having" lambdas in C). The important thing to understand is that lambdas give closures and that closures are mixing "code" with "data" (the closed values); notice that objects are also mixing "code" with "data" and there is a similarity between objects and closures. See also this answer on Programmers.
Traditionally, in C, you not only use function pointers, but you adopt a convention regarding callbacks. This for instance is the case with GTK: every time you pass a function pointer, you also pass some data with it. You can view callbacks (the convention of giving C function pointer with some void*data) as a way to implement closures.
Since you generate C code (which is a wise idea, I'm doing similar things in MELT which -on Linux- generates C++ code at runtime, compile it into a shared object, and dlopen-s that) you could adopt a callback convention and pass some closed values to every function that you generate.
You might also consider closed values as static variables, but this approach is generally unwise.
There have been in the past some lambda.h header library which generates a machine-specific trampoline code for closures (essentially generating a code which pushes some closed values as arguments then call some routine). You might use some JIT compilation techniques (using libjit, GNU lightning, LLVM, asmjit, ....) to do the same. See also libffi to call an arbitrary function (of signature known at runtime only).
Notice that there is a strong -but indirect- relation between closures and garbage collection (read the GC handbook for more), and it is not by accident that every functional language has a GC. C++11 lambda functions are an exception on this (and it is difficult to understand all the intricacies of memory management of C++11 closures). So if you are generating C code, you could and probably should use Boehm's conservative garbage collector (which is wrapping dlopen) and you would have closure GC-ed values. (You could use some other GC libraries, e.g. Ravenbrook's MPS or my unmaintained Qish...) Then you could have the convention that every generated C function takes its closure as first argument.
I would suggest to read Scott's book on Programming Language Pragmatics and (assuming you know a tiny bit of Scheme or Lisp; if you don't you should learn a bit of Scheme and read SICP) Queinnec's book Lisp In Small Pieces (if you happen to read French, read the latest French variant).

What 'mark' does a language leaves on a library that we need language bindings?

What mark does a language leaves on a compiled library that we need language bindings if we have call its functions from a different language?
object code looks 'language free' to me.
While learning OpenGL in c in Linux environment I have across language bindings.
Binding provides a simple and consistent way for applications to present and interact with data.
Source: The tag under your question
I'm guessing that you're either young or haven't been programming for more than a decade or so.
Object code should look language free, but it ain't due to history. Back in the 1970s and 1980s, on Intel 80x86 and Motorola 680x0 CPUs, function call arguments were passed on the stack. In the 'Pascal' convention, the number of arguments was fixed and the called function code removed them from the stack before returning. In the 'C' convention, the number of arguments was variable (eg printf) so the calling code had to remove them when the function returned. This cost 2 extra bytes per function call, which is nothing today but was significant back then when PCs only came with 128K or so of RAM. So Microsoft chose to use the Pascal calling convention for the Windows API, even though it was written in C. If your object code called a Windows function with the C convention by mistake, kaboom. This is why the header files are still cluttered up with WINAPI and _stdcall and _fastcall and whatnot.
Starting in the 1990s operating system authors realized this was silly and started imposing standard calling conventions on everyone. The C convention could handle both cases, so it got used everywhere. With the moves to MacOS X, 64 bit Windows, and ARM; we are finally getting language free object code.
Now, OpenGL was designed to be used from C and Fortran. (Which was in the 1990s still an important language for scientific calculations and visualization.) Both languages have integers, floating point numbers, and arrays of various sized ints/floats. C has structs but Fortran doesn't, and I suspect this is a major reason why the OpenGL API never uses any structs. There are also differences in the memory layout of 2D or higher dimension arrays between C and Fortran, and again note that the OpenGL API never specifies 2D arrays, only 1D.
A C API works for most languages. This is partly because C is 'portable assembler' that works onto almost any CPU and operating system. It's also because most other programming languages in common use are either supersets of C (C++, Objective-C) or implemented in C themselves (Python, Perl, Ruby) so can be made to call the OpenGL C API reasonably easily.
Java and C# have more problems, because they define their own object code, so to speak, and memory access is more tightly controlled. The C/OpenGL notion of 'here is a pointer to a block of memory, do what you like with it' breaks the security model of the JVM/CLR. So you end up having to use Java NIOByteBuffer things instead of just passing arrays.
A lot of it also comes down to the skill of the language binding designer. For one example, Python-OpenGL by Mike Fletcher is a really good binding. All the functions and constants have exactly the same names, so a lot of code can be just copied from C and pasted into Python. Python doesn't have C style arrays directly, but the language binding will silently translate any Python sequences/tuples you pass as "arrays" into the underlying C format for you. It feels natural for a Python programmer and still exposes the full capabilities of OpenGL.
For a bad example, JOGL is a pain in the arse. There's no automatic conversion from Java arrays to C, so you have to futz around with NIOByteBuffers yourself. It's so annoying that it's actually easier to use glBegin..glEnd blocks. And extra array offset arguments got added to a lot of OpenGL functions, so your code no longer looks the same as C/C++ and you waste a lot of time sticking ,0 on the end of function calls. Some of this is due to the JVM as mentioned before, but a lot of it is just bad design by (I suspect) somebody who never actually wrote much OpenGL themselves.
A long and rambling answer to a vague question.
Well, all you have to do is think about the myriad of calling conventions in C and C++. In order to prevent serious mishaps, the compiler mangles the function names based on calling convention so that you do not accidentally call a stdcall function using fastcall conventions. Each language has its own set of superfluous details like this that a language independent API should never have to burden itself with. Language bindings serve as an adapter/bridge that separates the language-specific stuff from the standardized API, filling in the gaps wherever necessary.
The OpenGL API is generally implemented in a single language (C) and programs written in other languages interface with the system's implementation through language bindings. OpenGL uses null-terminated ASCII strings for GLSL and has numerous functions that use pointers, things that make perfect sense for an API that is designed to be implemented in C. In Java, strings are not null-terminated and they are UTF-16 encoded; you can see why a bridge is needed. The Java GL bindings take care of string conversion and alter glVertexPointer (...)-like functions to fit Java's conditions for "pointing to" contiguous blocks of memory.

Win32 COM Programming in Pure C

I have a programming project which requires access to some of the lower-level Windows APIs (WASAPI, in particular). I am fairly experienced with higher-level programming languages such as C#, Java, and PHP, and I know only a little bit about C/C++.
I would like to use C (not C++) for this project, as C++ is kind of scary. Upon changing the C/C++ settings of my project in Visual Studio to compile as C code, I noticed that any calls to __uuidof won't work, as they are C++ specific.
My question is twofold:
Is it possible to write Win32 programs that utilize COM in pure C, and
If so, should using pure C be avoided?
Yes it is possible to use write pure C programs that utilize COM, in fact that was common practice 10-15 years ago.
You are not doing yourself a favor by using C (as you already have noticed). E.g. ATL provides lots of help when you are doing COM and can help you to avoid common mistakes.
If I were you I would go the C++ even if the threshold may be a bit higher at first. Also get a book in the subject if you do not have one. There are lots of examples to be found in the net but it is good to have something that leads you by the hand since COM programming is not for the faint of heart, regardless whether you use C or C++.
COM uses a pretty small subset of C++ beyond the C part, and there's no requirement to use many of the 'scary' items such as:
Exceptions - these can really bite you in C++ if you don't get them right (you need to really buy into the RAII idiom, which can take getting used to if you're used to more traditional C/Win32 coding); but exceptions are not part of COM at all: COM uses return codes for all error handling. There are some wrappers and compiler extensions that will convert COM's return codes into C++ exceptions if you want, but it's opt-in.
Class hierarchies and inheritance - Inheritance is also not really part of COM itself, other than the fact that all COM interfaces derive from (or, start off with the methods from) IUnknown. But you don't need to know anything about multiple virtual inheritance to use COM. Some of this is very useful to know if you are implementing COM objects yourself, but not to just use it. (For example, using C++ multiple inheritance is a very common way to implement a COM object that exposes multiple interfaces; but it's not the only way.)
Templates - again, not part of COM, but there are several libraries - eg. MFC and ATL - that use templates to make COM a lot simpler to use. This is especially true for smart pointer classes like CComPtr that will take care of some reference counting for you, allowing your code to concentrate on doing the real interesting stuff instead of having it packed with housekeeping stuff.
The main link between COM and C++ is all C++ compilers on Windows will lay out a C++ object in memory in a way that matches exactly what COM requires. This allows you to use a COM object as though it were a C++ object, which makes the code considerably less verbose, as the language/compiler is taking care of some really simple but verbose stuff for you.
So instead of doing the following in C (step through vtable and pass this param explicitly):
pUnk->lpVtbl->SomeMethod(pUnk, 42);
You can do the following in C++:
pUnk->SomeMethod(42);
You really don't want to have to type ->lpVtbl and make sure you pass the correct 'this' parameter (be careful when cutting and pasting!) with every single COM call you make, do you?
My suggestion would be to find a good COM book - Inside COM is a good one - and just start off using the C++ subset that you are comfortable with. Once you know how to use a COM pointer "raw", and use QI, AddRef and so on yourself, then perhaps you can play with the helper libraries that use templates to do some of the bookkeeping for you. Deciding to use wrappers that map COM errors to C++ exceptions is a bit of a bigger jump, as you need to write C++ code that is exception-safe first so need to understand the various issues with those first. But I can't think of any good reason - other than sheer curiosity - for going back and using COM from plain C.

What libraries would be useful for implementing a small language interpreter in C?

For my own learning experience, I want to try writing an interpreter for a simple programming language in C – the main thing I think I need is a hash table library, but a general purpose collection of data structures and helper functions would be pretty helpful. What would you guys recommend?
libbasekit - by the author of Io. You can also use libcoroutine.
One library I recommend looking into is libgc, a garbage collector for C.
You use it by replacing calls to malloc, realloc, strdup, etc. with their libgc counterparts (e.g. GC_MALLOC). It works by scanning the stack, global variables, and GC-allocated blocks, looking for numbers that might be pointers. Believe it or not, it actually performs quite well (almost on par with the very good ptmalloc, which is the default (non-garbage collected) malloc implementation in GNU/Linux), and a lot of programs use it (including Mono and GCJ). A disadvantage, though, is it might not play well with other libraries you may want to use, and you may even have to recompile some of them by hand to replace calls to malloc with GC_MALLOC.
Honestly - and I know some people will hate me for it - but I recommend you use C++. You don't have to bust a gut to learn it just to be able to start your project. Just use it like C, but in an hour you can learn how to use std::map<> (an associative container), std::string for easy textual data handling, and std::vector<> for a resizable heap-allocated array. If you want to spend an extra hour or two, learn to put member functions in classes (don't worry about polymorphism, virtual functions etc. to begin with), and you'll get a more organised program.
You need no more than the standard library for a suitably small language with simple constructs. The most complex part of an interpreted language is probably expression evaluation. For both that, procedure-calling, and construct-nesting you will need to understand and implement stack data structures.
The code at the link above is C++, but the algorithm is described clearly and you could re-implement it easily in C. There again there are few valid arguments for not using C++ IMO.
Before diving into what libraries to use I suggest you learn about grammars and compiler design. Especially input parsing is for compilers and interpreters similar, that is tokenizing and parsing. The process of tokenizing converts a stream characters (your input) into a stream of tokens. A parser takes this stream of tokens and matches it with your grammar.
You don't mention what language you're writing an interpreter for. But very likely that language contains recursion. In that case you need to use a so-called bottom-up parser which you cannot write by hand but needs to be generated. If you try write such a parser by hand you will end up with a error-prone mess.
If you're developing for a posix platform then you can use lex and yacc. These tools are a bit old but very powerful for building parsers. Lex can generate code that implements the tokenizing process and yacc can generate a bottom-up parser.
My answer probably raises more questions than it answers. That's because the field of compilers/interpreters is quite complex and cannot simply be explained in a short answer. Just get a good book on compiler design.

C for an Object-Oriented programmer

Having learned Java and C++, I've learned the OO-way. I want to embark on a fairly ambitious project but I want to do it in C. I know how to break problems down into classes and how to turn them into class hierarchies. I know how to abstract functionality into abstract classes and interfaces. I'm even somewhat proficient at using polymorphism in an effective way.
The problem is that when I'm presented with a problem, I only way I know how to do it is in an Object-Oriented way. I've become too dependent on Object-Oriented design philosophies and methodologies.
I want to learn how to think in a strictly procedural way. How do I do things in a world that lacks classes, interfaces, polymorphism, function overloading, constructors, etc.
How do you represent complex concepts using only non-object-oriented structs? How do you get around a lack of function overloading? What are some tip and tricks for thinking in a procedural way?
The procedural way is to, on one side, have your data structures, and, on the other, your algorithms. Then you take your data structures and pass them to your algorithms. Without encapsulation, it takes a somewhat higher amount of discipline to do this and if you increase the abstraction level to make it easier to do it right, you're doing a considerable part of OO in C.
I think you have a good plan. Doing things the completely OO way in C, while quite possible, is enough of a pain that you would soon drop it anyway. (Don't fight the language.)
If you want a philosophical statement on mapping the OO way to the C way, in part it happens by pushing object creation up one level. A module can still implement its object as a black box, and you can still use reasonable programming style, but basically its too much of a pain to really hide the object, so the caller allocates it and passes it down, rather than the module allocating it and returning it back up. You usually punt on getters and setters, or implement them as macros.
Consider also that all of those abstractions you mentioned are a relatively thin layer on top of ordinary structs, so you aren't really very far away from what you want to do. It just isn't packaged quite as nicely.
The C toolkit consists of functions, function pointers and macros. Function pointers can be used to emulate polymorphism.
You are taking the reverse trip old C programmers did for learning OO.
Even before c++ was a standart OO techniquis were used in C.
They included defining structs with a pointer to srtuct (usually called this...)
Then defining pointer functions in the struct, and during runtime initialize those pointers to the relevant functions.
All those functions received as first paremeter the struct pointer this.
Don't think C in the complete OOP way. If you have to use C, you should learn procedural programming. Doing this would not take more time than learning how to realize all the OOP features in C. Furthermore, basic encapsulation is probably fine, but a lot of other OOP features come with overhead on performance when you mimic them (not when the language is designed to support OOP). The overhead may be huge if you strictly follow the C++ design methodology to represent every small things as objects. Programming languages have specific purposes in design. When you break the boundary, you always have to pay something as the cost.
Don't think you have to shelve your knowledge of object-oriented work - you can "program into the language".
I had to work in C after being primarily experienced in object-oriented work. C allows for some level of object concepts to pull through. At the job, I had to implement a red-black tree in C, for use in a sweep-line algorithm to find the intersection points in a set of segments. Since the algorithm used different comparison functions, I ended up using function pointers to achieve the same effect as lambdas in Scheme or delegates in C#. It worked well, and also allowed the balanced tree to be reusable.
The other feature of the balanced tree was using void pointers to store arbitrary data. Again, void and function pointers in C are a pain (if you don't know their ins and outs), but they can be used to approximate creating a generic data structure.
One final note: use the right tool for the job. If you want to use C simply to master procedural technique, then choose a problem that is well-suited to a procedural approach. I didn't have a choice in the matter (legacy application written in C, and people demand the world and refuse to enter the 21st century), so I had to be creative. C is great for low/medium abstractions from the machine, say if you wanted to write a command-line packet inspection program.
The standard way to do polymorphic behavior in C is to use function pointers. You'll find a lot of C APIs (such as the standard qsort(3) and bsearch(3)) take function pointers as parameters; some non-standard ones such as qsort_r take a function pointer and a context pointer (thunk in this case) which serves no purpose other than to be passed back to the callback function. The context pointer functions exactly like the this pointer in object-oriented languages, when dealing with function objects (e.g. functors).
See also:
Can you write object-oriented code in C?
Object-Orientation in C
Try not to use OOP in C. But if you need to, use structures. For the functions,
take a structure for an argument, like so:
typedef struct{
int age;
char* name;
char* dialog;
} Human;
void make_dialog(Human human){
char* dialog="Hi";
human.dialog=dialog;
}
which works exactly like python's self, or something like that and to access other functions belonging to that class:
void get_dialog(Human human){
make_dialog(human);
printf(human.dialog);
}

Resources