Contrast CFFI vs FFI

I see from impnotes 32.3 that CLISP has an FFI system. I also see a CFFI project at http://common-lisp.net/project/cffi/.
Can someone knowledgeable please elaborate on any important differences between these two systems? Which is 'better'/'more official'/'recommended'/'more efficient'/'more reliable' etc.?
Many thanks, R.

CLISP FFI is very high-level and, necessarily, CLISP-specific (just like SBCL's FFI is SBCL-specific, etc.).
CFFI is a cross-implementation compatibility layer which is more low-level, and it relies on the underlying implementation's FFI to work.
So, if you are dead set on using a specific implementation, you should be learning its own FFI.
If you want to write code which will run on many different implementations, use CFFI.
PS. Low-level vs high-level means, roughly, that you need to write more code to achieve the same effect in CFFI than in CLISP FFI; the result will also probably run faster with CLISP FFI.

Related

What is the relation between BLAS, LAPACK and ATLAS

I don't understand how BLAS, LAPACK and ATLAS are related and how I should use them together! I have been looking through all of their manuals and I have a general idea of BLAS and LAPACK and how to use them with the very few examples I find, but I can't find any actual examples using ATLAS to see how it is related with these two.
I am trying to do some low-level work on matrices and my primary language is C. First I wanted to use GSL, but it says that if you want the best performance you should use BLAS and ATLAS. Is there any good webpage giving some nice examples of how to use these (in C) all together? In other words, I am looking for a tutorial on using these three (or any subset of them!). In short, I am confused!
BLAS is a collection of low-level matrix and vector arithmetic operations (“multiply a vector by a scalar”, “multiply two matrices and add to a third matrix”, etc.).
LAPACK is a collection of higher-level linear algebra operations. Things like matrix factorizations (LU, LLt, QR, SVD, Schur, etc) that are used to do things like “find the eigenvalues of a matrix”, or “find the singular values of a matrix”, or “solve a linear system”. LAPACK is built on top of the BLAS; many users of LAPACK only use the LAPACK interfaces and never need to be aware of the BLAS at all. LAPACK is generally compiled separately from the BLAS, and can use whatever highly-optimized BLAS implementation you have available.
ATLAS is a portable, reasonably good implementation of the BLAS interfaces that also implements a few of the most commonly used LAPACK operations.
What “you should use” depends somewhat on details of what you’re trying to do and what platform you’re using. You won't go too far wrong with “use ATLAS + LAPACK”, however.
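To make the division of labour concrete, here is a minimal sketch in C. It assumes the CBLAS and LAPACKE C interfaces are installed (e.g. from ATLAS, OpenBLAS or the Netlib reference LAPACK) and linked with something like -lcblas -llapacke; the matrices and numbers are made-up examples:

    /* One BLAS call and one LAPACK call from C. */
    #include <stdio.h>
    #include <cblas.h>
    #include <lapacke.h>

    int main(void)
    {
        /* BLAS level 3: C = 1.0*A*B + 0.0*C with 2x2 row-major matrices. */
        double A[4] = {1, 2, 3, 4};
        double B[4] = {5, 6, 7, 8};
        double C[4] = {0, 0, 0, 0};
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 2, 1.0, A, 2, B, 2, 0.0, C, 2);

        /* LAPACK: solve the linear system M*x = b via LU factorization. */
        double M[4] = {4, 3, 6, 3};
        double b[2] = {10, 12};
        lapack_int ipiv[2];
        lapack_int info = LAPACKE_dgesv(LAPACK_ROW_MAJOR, 2, 1, M, 2, ipiv, b, 1);

        printf("dgemm: C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
        printf("dgesv: info = %d, x = [%g %g]\n", (int)info, b[0], b[1]);
        return 0;
    }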
A while ago, when I started doing some linear algebra in C, it came as a surprise to me that there are so few tutorials for BLAS, LAPACK and the other fundamental APIs, despite the fact that they are the cornerstones of many other libraries. For that reason I started collecting all the examples/tutorials I could find all over the internet for BLAS, CBLAS, LAPACK, CLAPACK, LAPACKE, ATLAS, OpenBLAS ... in this GitHub repo.
Well, I should warn you that as a mechanical engineer I have little experience in managing such a git repository or GitHub. It will at first seem like a complete mess to you guys. However, if you manage to get past the messy structure, you will find all kinds of examples and instructions which might be of help. I have tried most of them to be sure they compile, and the ones which do not compile I have marked as such. I have modified many of them to be compilable with the GNU compilers (gcc, g++ and gfortran). I have made makefiles which you can read to learn how to call individual Fortran/FORTRAN routines from a C or C++ program. I have also put up some installation instructions for macOS and Linux (sorry, Windows guys!), and I have made some bash .sh scripts for automatic compilation of some of these libraries.
But coming to your other question: BLAS and LAPACK are APIs rather than specific SDKs. They are just specifications, not implementations or libraries. That said, there are original implementations by Netlib in FORTRAN 77, which most people refer to (confusingly!) when talking about BLAS and LAPACK. So if you see a lot of strange things when using these APIs, it is because you are actually calling FORTRAN routines from C rather than C libraries and functions. ATLAS and OpenBLAS are some of the best implementations of BLAS and LAPACK as far as I know. They conform to the original API, even though, to my knowledge, they are implemented in C/C++ from scratch (not sure!). There are GPGPU implementations of the APIs using OpenCL: CLBlast, clBLAS, clMAGMA, ArrayFire and ViennaCL, to mention some. There are also vendor-specific implementations optimized for specific hardware or platforms, which I strongly discourage anybody from using.
My recommendation to anyone who wants to learn to use BLAS and LAPACK in C is to learn Fortran-C mixed programming first. The first chapter of the mentioned repo is dedicated to this matter, and there I have collected many different examples.
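To give a flavour of what that mixed programming looks like (a hypothetical snippet, not taken from the repo): the reference BLAS is Fortran 77, so from C you call, say, dgemm_ with a trailing underscore, pass every argument by pointer, and store your matrices column-major. Assuming a Fortran BLAS linked with something like -lblas:

    #include <stdio.h>

    /* Hand-written declaration of the Fortran symbol; most Fortran compilers
       append an underscore. (Some ABIs also pass hidden string-length arguments
       for the character parameters; they are omitted here, as is common practice.) */
    extern void dgemm_(const char *transa, const char *transb,
                       const int *m, const int *n, const int *k,
                       const double *alpha, const double *a, const int *lda,
                       const double *b, const int *ldb,
                       const double *beta, double *c, const int *ldc);

    int main(void)
    {
        int n = 2;
        double alpha = 1.0, beta = 0.0;
        /* Column-major 2x2 matrices: {1,2,3,4} is the matrix [1 3; 2 4]. */
        double A[4] = {1, 2, 3, 4};
        double B[4] = {5, 6, 7, 8};
        double C[4] = {0, 0, 0, 0};

        dgemm_("N", "N", &n, &n, &n, &alpha, A, &n, B, &n, &beta, C, &n);

        printf("C = [%g %g; %g %g]\n", C[0], C[2], C[1], C[3]);
        return 0;
    }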
P.S. I have been working on the dev branch of the repository from time to time. It seems slightly less messy!
ATLAS is by now quite outdated. It was developed at a time when it was thought that optimizing the BLAS for various platforms was beyond the ability of humans, and as a result autogeneration and autotuning were the way to go.
In the early 2000s, along came Kazushige Goto, who showed how highly efficient implementations can be coded by hand. You may enjoy an interesting article in the New York Times: https://www.nytimes.com/2005/11/28/technology/writing-the-fastest-code-by-hand-for-fun-a-human-computer-keeps.html.
Kazushige, on the one hand, had better insight into the theory behind high-performance implementations of matrix-matrix multiplication, and on the other hand engineered these better. His approach, which on current CPUs is usually the highest performing, is not in the search space that ATLAS autotunes over; hence, ATLAS is inherently inferior. Kazushige's implementation of the BLAS became known as the GotoBLAS. It was forked as OpenBLAS when he joined industry.
The ideas behind the GotoBLAS were refactored into a new implementation, the BLAS-like Library Instantiation Software (BLIS) framework (https://github.com/flame/blis), which implements the same algorithms, but structures the code so that less needs to be custom implemented for a new architecture. BLIS is coded in C.
What this discussion shows is that there are many implementations of the BLAS. The BLAS themselves are a de facto standard for the interface. ATLAS was once the state of the art; it no longer is.
As far as I'm aware, and after working through the ATLAS repository, it seems that it includes a re-implementation of BLAS in C. There's a bit more to it than that, but I hope it answers the question.

In what languages besides C can I write a C library?

I want to write a library that is dynamically loadable and callable from C code, but I really don't want to write it in C - the code is security critical, so I want a language that makes it easier to have confidence that my code is correct. What are my options?
To be more specific, I want C programmers to be able to #include this, and -l that, and start using my library just as if I had written it in C. I'd like programmers in other languages to be able to use their favourite tools for linking to C libraries to link to it. Ideally I'd like that to be possible on every platform that supports C, but I'll settle for Linux, Windows and MacOS.
Anything that compiles to native code. So you might Google for that - "languages that compile to native code." See, e.g., Programming languages that compile to native code and have the batteries included
C++ is often the choice for this. It compiles to native code, and provided you keep your interfaces simple, it is easy to write an adapter layer.
Objective C and Fortran are also possible.
It sounds like you are looking for a language with ABI compatibility, or one which can be described as producing native code. So long as it can be compiled to a valid object file (typically an .obj or .o file) which is accepted by the linker, that should be the main criterion. You also then want to write a header file as a convenience for any client code which is written in C (or a closely related language/variant thereof).
As mentioned by others, you need a pretty good reason for choosing a language other than C, as it is the lingua franca of low-level/systems software. Assembler is an option, although harder to port between platforms. D is a more portable - but less widespread - alternative which is designed to produce secure, efficient native code with a minimum of fuss. There are many others.
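Whatever implementation language is chosen, the part the C programmer sees is just an ordinary header plus a shared library. A hypothetical example of such a header (all names made up), usable whether the code behind it is C, C++, Rust, D or anything else that exports C-ABI symbols:

    /* mylib.h - hypothetical public interface. */
    #ifndef MYLIB_H
    #define MYLIB_H

    #include <stddef.h>

    #ifdef __cplusplus
    extern "C" {          /* keep C linkage when included from C++ client code */
    #endif

    /* Opaque handle: clients never see the internal representation. */
    typedef struct mylib_ctx mylib_ctx;

    mylib_ctx *mylib_open(void);
    int        mylib_process(mylib_ctx *ctx, const unsigned char *buf, size_t len);
    void       mylib_close(mylib_ctx *ctx);

    #ifdef __cplusplus
    }
    #endif

    #endif /* MYLIB_H */

Client code then just does #include "mylib.h" and links with -lmylib, exactly as the question asks.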
Almost every security-critical application I know of is written in C. I don't believe there is any other language with a better real-world track record for producing secure applications.
C is said to be a poor language for security by people who don't understand.
If you want C programmers to use your library, use C. Doing anything else is tying one hand behind your back whilst trying to walk on a balance beam (the gymnastics equipment). Sure, there are dozens of other languages that are CAPABLE of interfacing to C, but it typically involves using a C layer and then stuffing the C data types into language-specific data types (Java objects, Python objects, etc.), and when the call is finished, you use the same conversion back to a C data type. That just makes it harder to work with, and potentially slower if you don't get all the design decisions right. And people won't understand the source code, so they won't want to use it (see more about this below).
If you want security, then write very good code, keeping your "security aspects" hat firmly on at all times; find a security mailing list or website and post your code there for review, take the review comments on board, understand them, and fix any that are meaningful to fix. Distribute the source code to the users, so people can see what your code does. Those who understand security will know what to look for and will recognise that you have done a good job (or a bad job, whichever is applicable) - and those who don't will hopefully trust the right people. If it's good, people will use it. If it's "hidden" and not easy to access, you won't get many customers, no matter what language you use.
Don't worry, you won't reveal anything more from releasing source. If there is a flaw in the code, and it is popular (or important) enough, someone will find the flaw, even if you publish only binaries. For those skilled in reverse engineering, not having source code is only a small obstacle.
Security doesn't stem from using a specific language or a specific tool, it stems from good design and good basic understanding of the problems with security.
And remember security by obscurity (whether that means "hidden source code" or "unusual language" or something else obscure) is false security.
You might be interested in ATS, http://ats-lang.sourceforge.net/. ATS compiles via C, can be as efficient as C, and can be used in a way that is ABI-compatible with C. From the project website:
ATS is a statically typed programming language that unifies implementation with formal specification. It is equipped with a highly expressive type system rooted in the framework Applied Type System, which gives the language its name. In particular, both dependent types and linear types are available in ATS. The current implementation of ATS (ATS/Anairiats) is written in ATS itself. It can be as efficient as C/C++ (see The Computer Language Benchmarks Game for concrete evidence) and supports a variety of programming paradigms
ATS's dependent and linear type system helps produce static guarantees about your code, including various aspects of resource management safety.
Chris Double has been writing a series of articles exploring the power of ATS's type system for systems programming here: http://bluishcoder.co.nz/tags/ats/. Of particular note is this article: http://bluishcoder.co.nz/2012/08/30/safer-handling-of-c-memory-in-ats.html
This document covers aspects of calling back and forth between ATS and C code: https://docs.google.com/document/d/1W6DYQApEqKgyBzMbvpCI87DBfLdNAQ3E60u1hUiMoU0
The main downside is that dependently-typed programming is still a daunting prospect, even outside systems programming. The syntax of the language is also a bit weird: consider lexical quirks such as the use of abst@ype as a keyword. Finally, ATS is to some degree a research project, and I personally don't know whether it would be sensible to adopt for a commercial endeavour.
Theoretically, it's going to be Fortran: less indirection (as in: my array is [here], not just a pointer to here, and this is true of most but not all of your data structures and variables).
However... There are many gotchas and quirks in Fortran: not, perhaps, as many as in C but you probably know your way around C rather better than Fortran. Which is the point behind most of the comments saying 'Know your code' - but do you really know what your compiler is doing?
Knowing you, I'm prepared to take it on trust that you do, for C. Most programmers don't. You do not know and cannot know what a local JVM or JIT compiler does, and that's a black hole in your security model if you're using Java or C# or scripting languages.
Ignore anyone who tells you that the hairy-chested he-men of secure computing write their own assembler: they probably don't even know the security errors they're making in any and all nontrivial projects they release. Know your compiler, indeed.
You could write it in Lua - providing a C API to a Lua library is relatively straightforward. C++ is also an option, though of course you'd have to write C wrappers and make sure no exceptions can escape your functions. But honestly, if it's security-critical, the minor inconveniences of the C language shouldn't be that big a deal. What you really should be doing is proving the correctness of your program where feasible, and testing extensively where it's not.
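For the Lua option, the C-facing side might look roughly like this sketch using the standard Lua C API (the script name mylib.lua and the function check_token are made up, and a real library would keep the lua_State around instead of creating one per call):

    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>

    /* Expose a Lua function "check_token" behind a plain C entry point. */
    int mylib_check_token(const char *token)
    {
        lua_State *L = luaL_newstate();
        luaL_openlibs(L);
        if (luaL_dofile(L, "mylib.lua") != 0) {   /* load the Lua implementation */
            lua_close(L);
            return -1;
        }
        lua_getglobal(L, "check_token");          /* push the Lua function */
        lua_pushstring(L, token);                 /* push its argument */
        if (lua_pcall(L, 1, 1, 0) != 0) {         /* call with 1 arg, 1 result */
            lua_close(L);
            return -1;
        }
        int ok = lua_toboolean(L, -1);
        lua_close(L);
        return ok;
    }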
You can write a library in Java. JNI is normally used to call C from Java, but it can be used the other way around.
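A rough sketch of that "other way around" using the JNI invocation API (the class MyLib and its static method check are hypothetical, and a real library would create the JVM once and cache it rather than starting one per call):

    #include <jni.h>

    int mylib_check(int x)
    {
        JavaVM *jvm;
        JNIEnv *env;
        JavaVMOption opt = { .optionString = (char *)"-Djava.class.path=." };
        JavaVMInitArgs args = {
            .version = JNI_VERSION_1_8,
            .nOptions = 1,
            .options = &opt,
            .ignoreUnrecognized = JNI_FALSE,
        };

        /* Start a JVM inside this process. */
        if (JNI_CreateJavaVM(&jvm, (void **)&env, &args) != JNI_OK)
            return -1;

        /* Look up and call the static Java method MyLib.check(int). */
        jclass cls = (*env)->FindClass(env, "MyLib");
        jmethodID mid = cls ? (*env)->GetStaticMethodID(env, cls, "check", "(I)I") : NULL;
        jint result = mid ? (*env)->CallStaticIntMethod(env, cls, mid, (jint)x) : -1;

        (*jvm)->DestroyJavaVM(jvm);
        return (int)result;
    }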
There is finally a decent answer to this question: Rust.

Port GNU C Library to minimal hobby OS

So I have a minimal OS that doesn't do much. There's a bootloader, that loads a basic C kernel in 32-bit protected mode. How do I port in a C library so I can use things like printf? I'm looking to use the GNU C Library. Are there any tutorials anywhere?
OK, porting in a C library isn't that hard; I'm using Newlib in my kernel. Here is a tutorial to start with: http://wiki.osdev.org/Porting_Newlib.
You basically need to:
Compile the library (for example Newlib) using your cross compiler
Provide stub-implementations for a list of system functions (like fork, fstat, etc.) in your kernel
Link the library and your kernel together
If you want to use functions like malloc or printf (which uses malloc internally), you need some kind of memory management and at least a simple working implementation of sbrk.
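For example, a minimal sbrk of the kind the Newlib porting guide asks for might look like this sketch (the exact name - sbrk, _sbrk or _sbrk_r - depends on how Newlib is configured, and the heap bounds are assumed to be symbols from your linker script):

    #include <errno.h>

    extern char _heap_start;              /* assumed linker-script symbol */
    extern char _heap_end;                /* assumed linker-script symbol */
    static char *heap_ptr = &_heap_start;

    /* Grow the heap by incr bytes and return the old break, or fail. */
    void *_sbrk(int incr)
    {
        if (heap_ptr + incr > &_heap_end) {
            errno = ENOMEM;
            return (void *)-1;
        }
        char *prev = heap_ptr;
        heap_ptr += incr;
        return prev;
    }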
I strongly recommend against glibc. It is a beast.
Try newlib instead. Porting it to a new kernel is easy. You just need to write a few support functions, as explained here.
Another new kid on the block is musl, which specifically aims to improve the situation in the embedded space.
It's probably not the best choice for a beginner, though, since it's still pretty much work in progress.
Better to look for a small libc, like uClibc. The GNU C library is huge. And as the comments say, the first step is to get a C compiler going.
What are you trying to do? Building a full operating system is a job for a group of people lasting a few years... better start with something that already works, and hack on the parts that most interest you.

Is it viable to write a Linux kernel-mode debugger for Intel x86-64 in Common Lisp, and with which Common Lisp implementation[s]?

I'm interested in developing some kind of ring0 kernel-mode debugger for x86-64 in Common Lisp that would be loaded as a Linux kernel module, and as I prefer Common Lisp to C for general programming, I wonder how different Common Lisp implementations would fit this kind of programming task.
The debugger would use some external disassembling library, such as udis86, via some FFI. It seems to me that it's easiest to write kernel modules in C, as they need to contain the C functions int init_module(void) and void cleanup_module(void) (The Linux Kernel Module Programming Guide), so the kernel-land module code would call Common Lisp code from C by using CFFI. The idea is to create a ring0 debugger for 64-bit Linux inspired by the Rasta Ring 0 Debugger, which is only available for 32-bit Linux and requires a PS/2 keyboard.

I think the most challenging part would be the actual debugger code with hardware and software breakpoints and low-level video, keyboard or USB input device handling. Inline assembly would help a lot here; it seems to me that in SBCL inline assembly can be implemented using VOPs (SBCL Internals: VOP) (SBCL Internals: Adding VOPs), and this IRC log mentions that ACL (Allegro Common Lisp), CCL (Clozure Common Lisp) and CormanCL have LAPs (Lisp Assembly Programs). Both ACL and CormanCL are proprietary and thus discarded, but CCL (Clozure Common Lisp) could be one option. The ability to build standalone executables is a requirement too; SBCL, which I'm currently using, has it, but as the executables are entire Lisp images, their size is quite big.
My question is: is it viable to create a ring0 kernel-mode debugger for Intel x86-64 in Common Lisp, with the low-level code implemented in C and/or assembly, and if it is, which Common Lisp implementations for 64-bit Linux are best suited to this kind of endeavour, and what are the pros and cons if there is more than one suitable Common Lisp implementation? Scheme could be one possible option too, if it offers some benefit over Common Lisp. I am well aware that the great majority of kernel modules are written in C, and I know C and x86 assembly well enough to be able to write the required low-level code in C and/or assembly. This is not an attempt to port the Linux kernel to Lisp (see: https://stackoverflow.com/questions/1848029/why-not-port-linux-kernel-to-common-lisp), but a plan to write in Common Lisp a Linux kernel module that would be used as a ring0 debugger.
You might want to take a look at the Feb 2 2008 lispvan talk "Doing Evil Things with Common Lisp" by Brad Beveridge on working with a filesystem driver from SBCL.
Talk description & files
In it he mentions:
"A C/C++ Debugger written in CL??
Totally pie in the sky right now
But, how cool would that be?
Not that much of a stretch, only need to be able to write to memory where the library is located to insert break points & then trap signals on the Lisp side
Could use dirty tricks to replace the C functions with calls to Lisp functions
Apart from some details, it's probably not that hard – certainly nothing “new”
The dirty trick would involve overwriting the C code with another jump (branch without link) into a Lisp callback. When the Lisp code returns it can jump directly back to the original calling function via the link register.
Also, I'm totally glossing over the real difficulty in writing a debugger – it would be time consuming."
Nope, it would not be feasible to implement a kernel module in CL, for the obvious reason that just to make that work you would need to do a lot of hacks, and you may end up losing all the benefits the Lisp language provides: your Lisp code will look like C code in S-expressions.
Write your kernel module to export whatever functionality/data you need from the kernel; then, using ioctl or read/write operations, any user-mode program can communicate with the module.
I am not sure if there is any kernel module generic enough that it implements Lisp (or maybe a subset of it), such that you write code in Lisp and this generic module reads the Lisp code and runs it as a subcomponent; basically Kernel -> generic Lisp module -> your Lisp code (which would run in the kernel).
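For reference, the C side of that approach starts from the usual module skeleton (the module name ring0dbg below is made up); the actual debugger logic and the user-space interface via a character device or debugfs file would hang off this, with the Lisp part living in user space:

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>

    static int __init ring0dbg_init(void)
    {
        pr_info("ring0dbg: loaded\n");
        /* register a character device / debugfs entry here */
        return 0;
    }

    static void __exit ring0dbg_exit(void)
    {
        pr_info("ring0dbg: unloaded\n");
    }

    module_init(ring0dbg_init);
    module_exit(ring0dbg_exit);
    MODULE_LICENSE("GPL");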

High-level system language that compiles to C?

I'm looking for a higher-level system language, if possible, suitable for formal verification, that compiles to standard C, so that it can be run cross-platform with (relatively) low overhead.
The two most promising such languages I've stumbled during the past few days are:
BitC - While the design goals of this language match my needs (it even supports the functional paradigm), it is in a very unstable state, the documentation is out of date, and, generally, it seems like a very long shot for a real-world project.
Lisaac - It supports design-by-contract, which is very cool, and has a relatively low performance overhead. However, the website is dead, there hasn't been a new release since '08, and generally it seems the language is dead.
I'd also like to note that it's not meant for a real-time system, so a GC or, generally, non-determinism (in the real-time sense), is not an issue.
The project involves mainly audio processing, though it has to be cross-platform.
I assume someone would point me to the obvious answer - "plain ol' C". While it is truly cross-platform and very effective, the code quantity would probably be greater.
EDIT: I should clarify that I mean cross-platform AND cross-architecture. That is why I consider only languages compiled to C in the first place, but if you can point me to another example, I'd be grateful :)
I think you may be interested in ATS. It compiles to C (actually it expresses and explains many C idioms and patterns from a formal, type-theoretic perspective; it has even been proposed to prepare a book of sorts to show this -- if only we had more time...).
The project involves mainly audio processing, though it has to be cross-platform.
I don't know much about audio processing, I've been mainly doing some computer graphics stuff (mostly the basic things, just to try it out).
Also, I am not sure if ATS works on Windows (never tried that).
(Disclaimer: I've been working with ATS for some time. It is a bulky and large language, and sometimes hard to use, but I very much liked the quality of the programs I've been able to produce with it; for instance, see the TEST subdirectory in the GLES2 bindings for some realistic programs.)
The following doesn't strictly adhere to the requirements but I'd like to mention it anyway and it is too long for a comment:
PyPy's RPython can be translated to C. Here's a nice talk about it. It's been used to implement Smalltalk, JavaScript, Io, Scheme and Gameboy (with varying degrees of completeness), but you can also write standalone programs in it. It is known mainly for the implementation of the Python language that runs on Intel x86 (IA-32) and x86_64 platforms.
The translation process requires a capable C compiler. The toolchain provides means to infer various things about the code (used by the translation process itself) that you might repurpose for formal verification.
If you know both Python and C, you could use Cython, which translates Python-like syntax to C. It is used to write CPython extensions.

Resources