Knowing where op structs are filled - C

I am currently trying to write a Linux driver, and for this it is necessary that I understand some APIs to make the best use of them. Often I come across a pattern where I start digging into a function and end up at a point where the function reads:
returnType OperationX(args...) {
    ...
    struct operations_t *operations = get_operations();
    if (operations->X)
        return operations->X(args...);
}
Basically, get_operations() returns a pointer to a global struct, which holds pointers to the actual functions running the operations.
I find it very tedious to use the Linux cross-reference to dig into the different places and then work out which assignment actually takes place. Is there a better, faster way?
An example would be DMA mapping.

git grep and cscope are your best friends.
By the way, DMA operations are filled in either by platform code or by IOMMU implementations. Most probably you have lib/swiotlb.c in use for that.
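To make the pattern concrete, here is a minimal sketch (all names invented) of where such assignments typically live. Once you know the struct tag, git grep -n 'backend_ops' finds both the definition and every site that fills in its members:

/* A minimal sketch (all names hypothetical) of the ops-struct pattern:
 * the struct is filled in once, at registration time, with one backend's
 * functions; callers only ever see the indirect call. */
struct backend_ops {
    int (*read)(int id);
    int (*write)(int id, int val);
};

/* One possible backend; platform code would pick this at init time. */
static int fast_read(int id)           { return id * 2; }
static int fast_write(int id, int val) { return id + val; }

/* The assignment you are hunting for usually looks like this:
 * a static designated initializer... */
static const struct backend_ops fast_ops = {
    .read  = fast_read,
    .write = fast_write,
};

/* ...installed into a global during setup: */
static const struct backend_ops *current_ops;

static const struct backend_ops *get_operations(void)
{
    return current_ops;
}

void backend_init(void)
{
    current_ops = &fast_ops;
}

int operation_read(int id)
{
    const struct backend_ops *ops = get_operations();

    if (ops && ops->read)
        return ops->read(id);
    return -1;
}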


IOCTL, a Unix system call, uses a variable number of arguments. Is it reliable or safe to use to monitor and control devices?

I have been learning how to control devices in Linux-based systems using IOCTL. In an article I was reading, the author said that the ioctl prototype stands out in the list of Unix system calls because of the dots, int ioctl(int fd, unsigned long cmd, ...), which prevent type checking during compilation. The last part is what I do not quite get. My concern is: could not checking types create some issues controlling the peripheral? And what would be a more reliable way or best practice to monitor and control a peripheral? Thanks
My concern is: could not checking types create some issues controlling the peripheral?
No, at least not directly. As long as the arguments provided are indeed of the correct number and types, everything will be well (that is, the values will be received correctly by the driver). The problem is that the compiler cannot help users of your device driver recognize when they are providing the wrong number or types of arguments.
And what would be a more reliable way or best practice to monitor and control a peripheral?
Alternative ways to monitor and communicate with a peripheral include character and/or block special files (see mknod()), setting kernel parameters via _sysctl(), and manipulating files presented in the proc filesystem via your driver. Whether any of those is more reliable, more appropriate, or better practice depends on many factors, not least exactly what you're trying to do.
Not checking types means that the function does not prevent you from mistakenly passing a char where your peripheral was expecting an int, and yes, that could create some issues controlling the peripheral.
So, this means that you need to be careful with the types of the parameters you pass.
The general rule is "GIGO", which stands for "Garbage In Garbage Out". If you give something garbage, it will give you back garbage. Type checking is meant to save programmers from really obvious, really dumb errors. No type checking simply means that the programmers need to be extra careful.
Generally, the first thing you need to do with ioctl() is to create a set of functions that fully describe the interface of your peripheral. Of course these functions will accept properly typed parameters. Then, you will implement each one of those functions by delegating to the type-unsafe ioctl() function. From that moment on, you never directly invoke ioctl() again.
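A minimal sketch of such a wrapper, assuming a hypothetical device (the 'M' magic number and the MYDEV_* command names are invented; a real driver would publish them in its UAPI header):

/* Typed wrappers around the untyped ioctl(). MYDEV_* values are
 * hypothetical; _IOW/_IOR are the standard Linux helper macros. */
#include <sys/ioctl.h>   /* ioctl() */
#include <linux/ioctl.h> /* _IOW, _IOR (Linux) */

#define MYDEV_MAGIC     'M'
#define MYDEV_SET_SPEED _IOW(MYDEV_MAGIC, 1, int)
#define MYDEV_GET_SPEED _IOR(MYDEV_MAGIC, 2, int)

/* Callers use these instead of ioctl(), so the compiler checks the types. */
static inline int mydev_set_speed(int fd, int speed)
{
    return ioctl(fd, MYDEV_SET_SPEED, &speed);
}

static inline int mydev_get_speed(int fd, int *speed)
{
    return ioctl(fd, MYDEV_GET_SPEED, speed);
}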

C - sockets: why are IPs sent in integer format?

I am wondering why we connect to sockets using functions like htonl() to take care of endianness when we could have sent the IP as a plain char array.
Say we want to connect to 184.54.12.169.
There must be an explanation for this, but I cannot figure out why we use integers instead of chars, involving ourselves in endianness hell.
I think char out_ip[] = "184.54.12.169" could theoretically have worked.
Please explain the subtleties I'm missing here.
The basic networking APIs are low-level functions, very thin wrappers around kernel system calls. Removing these low-level functions and forcing everything to use strings would be rather bad for a low-level API like that, especially considering how tedious string handling is in C. As a concrete hurdle, IP strings are not even fixed length, so handling them is a lot more complex than handling plain 32-bit integers. And moving string handling into the kernel runs against what a kernel is supposed to do; parsing arbitrary user strings is really a user-space problem.
So, you might want higher-level functions which accept strings and do the conversion in the library. But adding such higher-level "convenience" functions all over the core libraries would bloat them, because passing IP numbers is certainly not the only place where such convenience would be welcome. These functions would need to be maintained forever and included everywhere once they became part of standard (official like POSIX, or de facto) libraries.
So, removing the low-level functions is not really an option, and adding more functions for higher-level API in the same library is not a good option either.
So the solution is to use another library providing a higher-level networking API, which could for example handle address strings directly. I'm not sure what's out there for C, but it's almost a given for other languages, which also have "real" strings built in, so using them is not a hassle.
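In fact, the standard library already provides that conversion layer: inet_pton() parses a dotted-quad string into the 4-byte, network-byte-order integer the kernel expects. A small example:

#include <arpa/inet.h>   /* inet_pton(), struct in_addr */
#include <sys/socket.h>  /* AF_INET */
#include <stdio.h>

int main(void)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, "184.54.12.169", &addr) != 1) {
        fprintf(stderr, "invalid address\n");
        return 1;
    }
    /* addr.s_addr now holds the address as a 32-bit integer in network
     * (big-endian) byte order, ready to drop into a struct sockaddr_in. */
    printf("0x%08lx\n", (unsigned long)addr.s_addr);
    return 0;
}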
Because that's how an IP address is transmitted in a packet. The "www.xxx.yyy.zzz" string form is really just a human-readable rendering of a 4-byte integer that lets us see the hierarchical structure a little more easily. Sending a whole string would also take up a lot more space.
Take the number 127536: as a string it requires seven bytes (six digits plus a terminator), not four. In addition, you would need to parse it.
In other words, integers are more efficient, and you don't have to deal with invalid values.

Why do libraries written in C use so many structs?

I've looked at some open-source libraries, and I've realized that they are basically a great stack of structs, with only a few functions.
Why do libraries written in C use so many structs? What's the reasoning behind this? To me it looks like an attempt to simulate object orientation, because a quick search told me that each struct is "instantiated" by the program using it. For example, in some Linux desktop environments I've looked at, each window is a struct in the GUI library being used.
Structs are a great way to organize data. And data is fundamental, as Fred Brooks knew decades ago:
Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious.
Object-oriented programming doesn't have to be merely simulated in C; it can be realized. For example, did you know that inside your structs you can store function pointers which operate on those same structs? Then you are a little closer to C++'s classes.
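A minimal sketch of that idea (all names invented): a struct that carries both its data and a pointer to the function operating on it.

#include <stdio.h>

struct shape {
    double width, height;
    double (*area)(const struct shape *self);  /* the "method" slot */
};

static double rect_area(const struct shape *self)
{
    return self->width * self->height;
}

int main(void)
{
    struct shape r = { 3.0, 4.0, rect_area };
    /* r.area(&r) plays the role of r.area() in C++ */
    printf("area = %g\n", r.area(&r));
    return 0;
}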
Also consider extensibility: even a function taking many arguments may be improved by taking a single struct, because then its signature does not need to change when a new argument is added.
Finally, C does not have multiple return values from a single function call. But it can return a struct, which is about the same thing. C is a lot about building your own tools from the raw language, and being able to stash a bunch of related data and/or functions together in one place is a good building block.
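A tiny sketch of that, in the spirit of the standard library's div()/div_t pair (names invented):

#include <stdio.h>

struct divmod_result { int quot; int rem; };

/* Two results from one call, packed into a returned struct. */
static struct divmod_result divmod(int a, int b)
{
    struct divmod_result r = { a / b, a % b };
    return r;
}

int main(void)
{
    struct divmod_result r = divmod(17, 5);
    printf("%d remainder %d\n", r.quot, r.rem);  /* 3 remainder 2 */
    return 0;
}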
With or without object orientation, structures are a useful way to group aggregate data into a single symbol. You can copy the structure wherever you like without having to write out all the members each time, and this makes the structure easier to change if you have to.
It also makes it easier to reference certain members using pointer arithmetic, if you're careful (see sockaddr).
Same argument as with arrays.
Simply put, there's no reason not to use structures.
Structures are useful when retrieving data through a pointer, because a single pointer is enough to reach the whole bunch of data within a structure.
One, it keeps the APIs clean. Instead of passing N separate arguments to a function, you pass a single argument containing N members.
Two, it allows the library to hide implementation details from the programmer. For example, the C FILE type abstracts away some details of stream I/O, details which vary from implementation to implementation. We don't need to know those details, so they're not exposed to us; we just use the FILE type to pass that information around.
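A sketch of the same information-hiding trick (all names hypothetical): the public part exposes only an incomplete struct type, so callers can hold and pass pointers but never see or depend on the members.

#include <stdlib.h>

/* --- what would live in widget.h --- */
struct widget;                       /* opaque (incomplete) to callers */
struct widget *widget_open(int id);
int widget_id(const struct widget *w);
void widget_close(struct widget *w);

/* --- what would live in widget.c --- */
struct widget { int id; };           /* hidden definition */

struct widget *widget_open(int id)
{
    struct widget *w = malloc(sizeof *w);
    if (w)
        w->id = id;
    return w;
}

int widget_id(const struct widget *w) { return w->id; }
void widget_close(struct widget *w)   { free(w); }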

Alternative to Hash Map for Small Data set in C

I am currently working on a command line interface for a particle simulator. Its parser reads input in the following format:
[command] [argument]* (-[flag] [flag argument])
Currently, the command is sent through a conditional block, compared to various known commands and its corresponding data packet is sent to the matching function. This, however, seems clunky, inefficient and inelegant.
I am thinking about using a hashmap instead, with a string representation of a command as the key and a function pointer as the value. The function referenced would then be sent a data packet containing arguments, flags, etc.
Is a hash map overkill in this situation? Does the extra infrastructure required to implement one outweigh the potential benefits? I am aiming for speed, elegance, function, and, since this is an open-source project, extensibility.
Thanks for the help.
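For reference, the table-of-function-pointers idea the question describes can be sketched as below; for a small command set, a plain static array searched linearly often does the job (all names hypothetical):

#include <stdio.h>
#include <string.h>

struct command {
    const char *name;
    void (*handler)(const char *args);  /* receives the rest of the line */
};

static void cmd_run(const char *args)  { printf("run: %s\n", args); }
static void cmd_stop(const char *args) { printf("stop: %s\n", args); }

static const struct command commands[] = {
    { "run",  cmd_run  },
    { "stop", cmd_stop },
};

static int dispatch(const char *name, const char *args)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(name, commands[i].name) == 0) {
            commands[i].handler(args);
            return 0;
        }
    }
    return -1;  /* unknown command */
}

int main(void)
{
    if (dispatch("run", "fast -n 100") != 0)
        printf("unknown command\n");
    return 0;
}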
You might want to consider the ternary search tree (TST). It has good performance and efficient use of storage, and you don't need a hash function or a collision strategy.
The linked Bentley/Sedgewick article is a very thorough yet readable explanation of the accompanying C source.
I've been using a TST for name-lookup in the past 3 versions of my postscript interpreter. The only changes that have been needed have been due to changes in memory management. Here's a version I modified (lightly) to use explicit pointers. I use yet another version in my postscript interpreter, any of the xpost2*.zip versions, in the file core.c, which uses byte-offsets for pointers (have to be added to the user-memory byte-pointer to yield a real pointer).
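For the curious, a minimal TST after Bentley & Sedgewick might look like this (a sketch, without error handling or freeing):

#include <stdlib.h>
#include <stdio.h>

struct tnode {
    char splitchar;
    struct tnode *lo, *eq, *hi;
    int value;               /* payload, valid at splitchar == '\0' */
};

static struct tnode *tst_insert(struct tnode *p, const char *s, int value)
{
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        p->splitchar = *s;
    }
    if (*s < p->splitchar)
        p->lo = tst_insert(p->lo, s, value);
    else if (*s > p->splitchar)
        p->hi = tst_insert(p->hi, s, value);
    else if (*s == '\0')
        p->value = value;
    else
        p->eq = tst_insert(p->eq, s + 1, value);
    return p;
}

static int tst_search(const struct tnode *p, const char *s, int *value)
{
    while (p) {
        if (*s < p->splitchar) {
            p = p->lo;
        } else if (*s > p->splitchar) {
            p = p->hi;
        } else if (*s == '\0') {
            *value = p->value;
            return 1;        /* found */
        } else {
            p = p->eq;
            s++;
        }
    }
    return 0;                /* not found */
}

int main(void)
{
    struct tnode *root = NULL;
    int v;

    root = tst_insert(root, "run", 1);
    root = tst_insert(root, "stop", 2);
    printf("%d\n", tst_search(root, "stop", &v) ? v : -1);  /* prints 2 */
    return 0;
}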
Speed gained will probably be minimal, but you could hash the command to convert it to a number and then use a switch statement. Faster than a hash map.

What are some good ways of implementing tail call elimination?

I've written a small Scheme interpreter in an unholy mix of C/C++, but I have yet to implement proper tail calls.
I am aware of the classic Cheney on the MTA algorithm, but are there other nice ways of implementing this? I know I could put the Scheme stack on the heap, but that would still not be proper elimination, as the standard says one should support an unlimited number of active tail calls.
I've also fiddled with longjmps, but so far I think it'll only work well for non-mutual recursive tail calls.
How do the major C-based Schemes implement proper tail recursion?
Simpler than writing a compiler and VM is to registerize and trampoline your interpreter. Since you have an interpreter and not a compiler (I assume), you only need a couple straightforward transformations to get proper support for tail calls.
You'll have to first write everything in continuation-passing style, which may be weird to think about and do in C/C++. Dan Friedman's ParentheC tutorial steps you through transforming a high-level, recursive program into a form that is machine-translatable to C.
In the end, you'll essentially implement a simple VM where instead of using regular function calls to do eval, applyProc, etc., you pass arguments by setting global variables and then jump (goto) to the next procedure (or use a top-level loop and a program counter)...
return applyProc(rator, rand);
becomes
reg_rator = rator;
reg_rand = rand;
reg_pc = APPLY_PROC;
return;
That is, all of your functions that normally call each other recursively are reduced to a pseudo-assembly in which they are just blocks of code that don't recur. A top-level loop controls the program:
for (;;) {
    switch (reg_pc) {
    case EVAL:
        eval();
        break;
    case APPLY_PROC:
        applyProc();
        break;
    ...
    }
}
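To make the shape of the whole thing concrete, here is a self-contained sketch (all names invented): two mutually tail-recursive procedures that run in constant stack space, because each one returns to the driver loop instead of calling the other directly.

#include <stdio.h>
#include <stdbool.h>

enum pc { IS_EVEN, IS_ODD, DONE };

/* The "registers": arguments and program counter live in globals. */
static enum pc reg_pc;
static unsigned long reg_n;
static bool reg_result;

/* is_even(n) = n == 0 ? true  : is_odd(n - 1)  -- the tail call becomes
 * a register update plus a return to the trampoline. */
static void is_even(void)
{
    if (reg_n == 0) { reg_result = true; reg_pc = DONE; return; }
    reg_n--;
    reg_pc = IS_ODD;
}

static void is_odd(void)
{
    if (reg_n == 0) { reg_result = false; reg_pc = DONE; return; }
    reg_n--;
    reg_pc = IS_EVEN;
}

int main(void)
{
    reg_n = 1000001;       /* deep enough to smash a real call stack */
    reg_pc = IS_EVEN;
    while (reg_pc != DONE) {
        switch (reg_pc) {
        case IS_EVEN: is_even(); break;
        case IS_ODD:  is_odd();  break;
        default:      break;
        }
    }
    printf("%s\n", reg_result ? "even" : "odd");  /* prints "odd" */
    return 0;
}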
Edit: I went through the same process for my hobby Scheme interpreter, written in JavaScript. I took advantage of a lot of anonymous procedures, but it might help as a concrete reference. Look at FoxScheme's commit history starting from 2011-03-13 (30707a0432563ce1632a) up through 2011-03-15 (5dd3b521dac582507086).
Edit^2: Non-tail recursion will still consume memory, even if it's not in the stack.
Without knowing what you have, I'd say the easiest (and most enlightening) way to do it is to implement the scheme compiler and VM from Dybvig's "Three Implementation Models for Scheme".
I've done it here in Javascript (a copy of Dybvig's PDF is there too): https://github.com/z5h/zb-lisp
check src/compiler.js: compileCons, and the implementation of the "op codes" in src/vm.js
If you are interested in implementation techniques of interpreters, there is no way around the book "LiSP - Lisp in Small Pieces" by Christian Queinnec. It explains all aspects of implementing a Scheme system very thoroughly, with complete code. It is a wonderful book.
http://www.amazon.com/exec/obidos/ASIN/0521562473/qid=945541473/sr=1-2/002-2995245-1849825
But don't forget to check out the papers on ReadScheme.org.
The section "Compiler Technology/Implementation Techniques and Optimization" (http://library.readscheme.org/page8.html) has quite a few papers on tail call optimization. Among others you will find a link to Dybvig's thesis (a classic), which is very well written. It explains and motivates everything in a very clear manner.
