Coq VST Internal structure copying - c

I ran into a problem with the VST (Verified Software Toolchain) 2.5 library for Coq 8.10.1:
With the latest working commit of VST I get the error "Internal structure copying is not supported".
Minimal example:
struct foo {unsigned int a;};
struct foo f() {
    struct foo q;
    return q;
}
On starting the proof I got this error:
Error: Tactic failure: The expression (_q)%expr contains internal structure-copying, a feature of C not currently supported in Verifiable C (level 97).
This is due to the check_normalized in floyd/forward.v :
Fixpoint check_norm_expr (e: expr) : diagnose_expr :=
match e with
| Evar _ ty => diagnose_this_expr (access_mode ty) e
...
So, the questions are:
1) What suggested workarounds exist?
2) What is the reason for this limitation?
3) Where can I get a list of unsupported features?

1) The workaround is to change your C program to copy field by field.
2) The reason is the absurdly complicated and target-ISA-dependent implementation/semantics of C's structure-copying, especially in parameter passing and function-return.
3) The first 10 lines of Chapter 4 ("Verifiable C and clightgen") of the reference manual have a short list of unsupported features, but unfortunately struct-by-copy is not on that list. That's a bug.
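As a rough illustration of the workaround in 1), the function from the question could be rewritten so that the caller supplies the destination and each field is assigned individually. This is only a sketch: the name f_into and the explicit initialisation of q.a are my assumptions, and I have not run it through VST.

```c
struct foo { unsigned int a; };

/* Hypothetical replacement for `struct foo f()`: instead of returning
   the struct by value, write the result through a pointer. */
void f_into(struct foo *out) {
    struct foo q;
    q.a = 0u;        /* assumption: give the field a defined value */
    out->a = q.a;    /* copy field by field, no whole-struct assignment */
}
```

A caller would declare a struct foo r and call f_into(&r) where it previously wrote r = f().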


How to solve "bad pointer in write barrier" panic in cgo when C library uses opaque struct pointers

I'm currently writing a Go wrapper around a C library. That C library uses opaque struct pointers to hide information across the interface. However, the underlying implementation stores size_t values in there. This leads to runtime errors in the resulting program.
A minimum working example to reproduce the problem looks like this:
main.go:
package main

/*
#include "stddef.h"

// Create an opaque type to hide the details of the underlying data structure.
typedef struct HandlePrivate *Handle;

// In reality, the implementation uses a type derived from size_t for the Handle.
Handle getInvalidPointer() {
    size_t actualHandle = 1;
    return (Handle) actualHandle;
}
*/
import "C"

// Create a temporary slice containing invalid pointers.
// The idea is that the local variable slice can be garbage collected at the end of the function call.
// When the slice is scanned for linked objects, the GC comes across the invalid pointers.
func getTempSlice() {
    slice := make([]C.Handle, 1000000)
    for i, _ := range slice {
        slice[i] = C.getInvalidPointer()
    }
}

func main() {
    getTempSlice()
}
Running this program will lead to the following error
runtime: writebarrierptr *0xc42006c000 = 0x1
fatal error: bad pointer in write barrier
[...stack trace omitted...]
Note that the errors disappear when the GC is disabled by setting the environment variable GOGC=off.
My question is which is the best way to solve or work around this problem. The library stores integer values in pointers for the sake of information hiding and this seems to confuse the GC. For obvious reasons I don't want to start messing with the library itself but rather absorb this behaviour in my wrapping layer.
My environment is Ubuntu 16.04, with gcc 5.4.0 and Go 1.9.2.
Documentation of cgo
I can reproduce the error for go1.8.5 and go1.9.2. I cannot reproduce the error for tip: devel +f01b928 Sat Nov 11 06:17:48 2017 +0000 (effectively go1.10alpha).
// Create a temporary slice containing invalid pointers.
// The idea is that the local variable slice can be garbage collected at the end of the function call.
// When the slice is scanned for linked objects, the GC comes across the invalid pointers.
A Go mantra is "do not ignore errors". However, you seem to assume that the GC will gracefully ignore errors. The GC should complain loudly (go1.8.5 and go1.9.2). At worst, with undefined behavior that may vary from release to release, the GC may appear to ignore errors (go devel).
The Go compiler sees a pointer and the Go runtime GC expects a valid pointer.
// go tool cgo
// type _Ctype_Handle *_Ctype_struct_HandlePrivate
// var handle _Ctype_Handle
var handle C.Handle
// main._Ctype_Handle <nil> 0x0
fmt.Fprintf(os.Stderr, "%[1]T %[1]v %[1]p\n", handle)
slice := make([]C.Handle, 1000000)
for i, _ := range slice {
    slice[i] = C.getInvalidPointer()
}
Use type uintptr. For example,
package main

import "unsafe"

/*
#include "stddef.h"

// Create an opaque type to hide the details of the underlying data structure.
typedef struct HandlePrivate *Handle;

// In reality, the implementation uses a type derived from size_t for the Handle.
Handle getInvalidPointer() {
    size_t actualHandle = 1;
    return (Handle) actualHandle;
}
*/
import "C"

// Create a temporary slice of C pointers as Go integer type uintptr.
func getTempSlice() {
    slice := make([]uintptr, 1000000)
    for i, _ := range slice {
        slice[i] = uintptr(unsafe.Pointer(C.getInvalidPointer()))
    }
}

func main() {
    getTempSlice()
}
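The same idea can also be pushed down into the C preamble, so that the Go side only ever sees an integer type and never a pointer-typed value. The following is a minimal sketch under that assumption; handle_to_int, int_to_handle and getHandleAsInt are hypothetical shim names, not part of the library in the question.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct HandlePrivate *Handle;

/* Stand-in for the library function from the question. */
Handle getInvalidPointer(void) {
    size_t actualHandle = 1;
    return (Handle) actualHandle;
}

/* Hypothetical shims: present the opaque handle as a plain integer... */
uintptr_t handle_to_int(Handle h) { return (uintptr_t) h; }

/* ...and turn it back into a Handle just before calling the library. */
Handle int_to_handle(uintptr_t v) { return (Handle) v; }

/* Wrapper the binding would call instead of getInvalidPointer(). */
uintptr_t getHandleAsInt(void) {
    return handle_to_int(getInvalidPointer());
}
```

With shims like these in the preamble, the wrapper layer would store the integer values and convert back with int_to_handle only inside C calls, so the garbage collector never scans the fake pointers.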

Using R random number generators in C [duplicate]

I would like to, within my own compiled C++ code, check whether a library package is loaded in R (and load it if not), call a function from that library, and get the results back in my C++ code.
Could someone point me in the right direction? There seems to be a plethora of info on R and different ways of calling R from C++ and vice versa, but I have not come across exactly what I want to do.
Thanks.
Dirk's probably right that RInside makes life easier. But for the die-hards... The essence comes from Writing R Extensions sections 8.1 and 8.2, and from the examples distributed with R. The material below covers constructing and evaluating the call; dealing with the return value is a different (and in some sense easier) topic.
Setup
Let's suppose a Linux / Mac platform. The first thing is that R must have been compiled to allow linking, either to a shared or static R library. I work with an svn copy of R's source, in the directory ~/src/R-devel. I switch to some other directory, call it ~/bin/R-devel, and then
~/src/R-devel/configure --enable-R-shlib
make -j
this generates ~/bin/R-devel/lib/libR.so; perhaps whatever distribution you're using already has this? The -j flag runs make in parallel, which greatly speeds the build.
Examples for embedding are in ~/src/R-devel/tests/Embedding, and they can be made with cd ~/bin/R-devel/tests/Embedding && make. Obviously, the source code for these examples is extremely instructive.
Code
To illustrate, create a file embed.cpp. Start by including the header that defines R data structures, and the R embedding interface; these are located in bin/R-devel/include, and serve as the primary documentation. We also have a prototype for the function that will do all the work
#include <Rembedded.h>
#include <Rdefines.h>
static void doSplinesExample();
The work flow is to start R, do the work, and end R:
int
main(int argc, char *argv[])
{
    Rf_initEmbeddedR(argc, argv);
    doSplinesExample();
    Rf_endEmbeddedR(0);
    return 0;
}
The examples under Embedding include one that calls library(splines), sets a named option, then runs a function example("ns"). Here's the routine that does this
static void
doSplinesExample()
{
    SEXP e, result;
    int errorOccurred;

    // create and evaluate 'library(splines)'
    PROTECT(e = lang2(install("library"), mkString("splines")));
    R_tryEval(e, R_GlobalEnv, &errorOccurred);
    if (errorOccurred) {
        // handle error
    }
    UNPROTECT(1);

    // 'options(FALSE)' ...
    PROTECT(e = lang2(install("options"), ScalarLogical(0)));
    // ... modified to 'options(example.ask=FALSE)' (this is obscure)
    SET_TAG(CDR(e), install("example.ask"));
    R_tryEval(e, R_GlobalEnv, NULL);
    UNPROTECT(1);

    // 'example("ns")'
    PROTECT(e = lang2(install("example"), mkString("ns")));
    R_tryEval(e, R_GlobalEnv, &errorOccurred);
    UNPROTECT(1);
}
Compile and run
We're now ready to put everything together. The compiler needs to know where the headers and libraries are
g++ -I/home/user/bin/R-devel/include embed.cpp -L/home/user/bin/R-devel/lib -lR
The compiled application needs to be run in the correct environment, e.g., with R_HOME set correctly; this can be arranged easily (obviously a deployed app would want to take a more extensive approach) with
R CMD ./a.out
Depending on your ambitions, some parts of section 8 of Writing R Extensions are not relevant, e.g., callbacks are needed to implement a GUI on top of R, but not to evaluate simple code chunks.
Some detail
Running through that in a bit of detail... An SEXP (S-expression) is a data structure fundamental to R's representation of basic types (integer, logical, language calls, etc.). The line
PROTECT(e = lang2(install("library"), mkString("splines")));
makes a symbol library and a string "splines", and places them into a language construct consisting of two elements. This constructs an unevaluated language object, approximately equivalent to quote(library("splines")) in R. lang2 returns an SEXP that has been allocated from R's memory pool, and it needs to be PROTECTed from garbage collection. PROTECT adds the address pointed to by e to a protection stack; when the memory no longer needs to be protected, the address is popped from the stack (with UNPROTECT(1), a few lines down). The line
R_tryEval(e, R_GlobalEnv, &errorOccurred);
tries to evaluate e in R's global environment. errorOccurred is set to non-0 if an error occurs. R_tryEval returns an SEXP representing the result of the function, but we ignore it here. Because we no longer need the memory allocated to store library("splines"), we tell R that it is no longer PROTECT'ed.
The next chunk of code is similar, evaluating options(example.ask=FALSE), but the construction of the call is more complicated. The S-expression created by lang2 is a pair list, conceptually with a node, a left pointer (CAR) and a right pointer (CDR). The left pointer of e points to the symbol options. The right pointer of e points to another node in the pair list, whose left pointer is FALSE (the right pointer is R_NilValue, indicating the end of the language expression). Each node of a pair list can have a TAG, the meaning of which depends on the role played by the node. Here we attach an argument name.
SET_TAG(CDR(e), install("example.ask"));
The next line evaluates the expression that we have constructed (options(example.ask=FALSE)), using NULL to indicate that we'll ignore the success or failure of the function's evaluation. A different way of constructing and evaluating this call is illustrated in R-devel/tests/Embedding/RParseEval.c, adapted here as
SEXP tmp;
ParseStatus status;  // declared in <R_ext/Parse.h>
PROTECT(tmp = mkString("options(example.ask=FALSE)"));
PROTECT(e = R_ParseVector(tmp, 1, &status, R_NilValue));
R_tryEval(VECTOR_ELT(e, 0), R_GlobalEnv, NULL);
UNPROTECT(2);
but this doesn't seem like a good strategy in general, as it mixes R and C code and does not allow computed arguments to be used in R functions. Instead write and manage R code in R (e.g., creating a package with functions that perform complicated series of R manipulations) that your C code uses.
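To make the CAR / CDR / TAG description of options(example.ask=FALSE) a bit more concrete, here is a toy model of a pair list in plain C. It is deliberately not R's real data structure (an SEXP node carries much more than this), and the names node, toy_lang2 and toy_tag_first_arg are made up for the illustration; it only mirrors the shape that lang2 followed by SET_TAG(CDR(e), ...) builds.

```c
#include <stddef.h>

/* Toy model only: R's real SEXP nodes are far richer than this. */
struct node {
    const char  *car;  /* "left pointer": the value held by this node   */
    struct node *cdr;  /* "right pointer": rest of the list, NULL = end */
    const char  *tag;  /* optional name attached to the node, or NULL   */
};

/* Model of lang2(fun, arg): two linked nodes, the function first. */
void toy_lang2(struct node *head, struct node *arg_node,
               const char *fun, const char *arg) {
    head->car = fun;
    head->cdr = arg_node;
    head->tag = NULL;
    arg_node->car = arg;
    arg_node->cdr = NULL;   /* plays the role of R_NilValue */
    arg_node->tag = NULL;
}

/* Model of SET_TAG(CDR(e), name): give the first argument a name. */
void toy_tag_first_arg(struct node *head, const char *name) {
    head->cdr->tag = name;
}
```

Calling toy_lang2 with "options" and "FALSE" and then toy_tag_first_arg with "example.ask" leaves a two-node list whose second node holds FALSE and carries the tag example.ask, which is the structure the real code evaluates.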
The final block of code above constructs and evaluates example("ns"). R_tryEval returns the result of the function call, so
SEXP result;
PROTECT(result = R_tryEval(e, R_GlobalEnv, &errorOccurred));
// ...
UNPROTECT(1);
would capture that for subsequent processing.
There is Rcpp which allows you to easily extend R with C++ code, and also have that C++ code call back to R. There are examples included in the package which show that.
But maybe what you really want is to keep your C++ program (i.e. you own main()) and call out to R? That can be done most easily with
RInside which allows you to very easily embed R inside your C++ application---and the test for library, load if needed and function call are then extremely easy to do, and the (more than a dozen) included examples show you how to. And Rcpp still helps you to get results back and forth.
Edit: As Martin was kind enough to show things the official way, I cannot help but contrast it with one of the examples shipping with RInside. It is something I once wrote quickly to help someone who had asked on r-help about how to load a (portfolio optimisation) library and use it. It meets your requirements: load a library, access some data in it, pass a weights vector down from C++ to R, deploy R and get the result back.
// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; tab-width: 8; -*-
//
// Simple example for the repeated r-devel mails by Abhijit Bera
//
// Copyright (C) 2009 Dirk Eddelbuettel
// Copyright (C) 2010 - 2011 Dirk Eddelbuettel and Romain Francois

#include <RInside.h>                    // for the embedded R via RInside

int main(int argc, char *argv[]) {

    try {
        RInside R(argc, argv);          // create an embedded R instance

        std::string txt = "suppressMessages(library(fPortfolio))";
        R.parseEvalQ(txt);              // load library, no return value

        txt = "M <- as.matrix(SWX.RET); print(head(M)); M";
        // assign mat. M to NumericMatrix
        Rcpp::NumericMatrix M = R.parseEval(txt);

        std::cout << "M has "
                  << M.nrow() << " rows and "
                  << M.ncol() << " cols" << std::endl;

        txt = "colnames(M)";            // assign columns names of M to ans and
                                        // into string vector cnames
        Rcpp::CharacterVector cnames = R.parseEval(txt);

        for (int i=0; i<M.ncol(); i++) {
            std::cout << "Column " << cnames[i]
                      << " in row 42 has " << M(42,i) << std::endl;
        }

    } catch(std::exception& ex) {
        std::cerr << "Exception caught: " << ex.what() << std::endl;
    } catch(...) {
        std::cerr << "Unknown exception caught" << std::endl;
    }

    exit(0);
}
This is rinside_sample2.cpp, and there are lots more examples in the package. To build it, you just say 'make rinside_sample2' as the supplied Makefile is set up to find R, Rcpp and RInside.

Watch out for C function names with R code

So here is something a bit crazy.
If you have some C code which is called by an R function (as a shared object), try adding this to the code
void warn() {
    int i; // just so the function has some work, but you could make it empty too, or do other stuff
}
If you then call warn() anywhere in the C code being called by the R function you get a segfault;
*** caught segfault ***
address 0xa, cause 'memory not mapped'
Traceback:
1: .C("C_function_called_by_R", as.double(L), as.double(G), as.double(T), as.integer(nrow), as.integer(ncolL), as.integer(ncolG), as.integer(ncolT), as.integer(trios), as.integer(seed), as.double(pval), as.double(pval1), as.double(pval2), as.double(pval3), as.double(pval4), as.integer(ntest), as.integer(maxit), as.integer(threads), as.integer(quietly))
2: package_name::R_function(L, G, T, trios)
3: func()
4: system.time(func())
5: doTryCatch(return(expr), name, parentenv, handler)
6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
7: tryCatchList(expr, classes, parentenv, handlers)
8: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L msg <- conditionMessage(e) sm <- strsplit(msg, "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste(prefix, "\n ", sep = "") } else prefix <- "Error : " msg <- paste(prefix, conditionMessage(e), "\n", sep = "") .Internal(seterrmessage(msg[1L])) if (!silent && identical(getOption("show.error.messages"), TRUE)) { cat(msg, file = stderr()) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
9: try(system.time(func()))
10: .executeTestCase(funcName, envir = sandbox, setUpFunc = .setUp, tearDownFunc = .tearDown)
11: .sourceTestFile(testFile, testSuite$testFuncRegexp)
12: runTestSuite(testSuite)
aborting ...
Segmentation fault (core dumped)
(END)
Needless to say the code runs fine if you call the same function from a C or C++ wrapper instead of from an R function. If you rename warn() it also works fine.
Any ideas? Is this a protected name/symbol? Is there a list of such names? I'm using R version 2.14.1 on Ubuntu 12.01 (i686-pc-linux-gnu (32-bit)). C code is compiled with GNU GCC 4.6.3.
This seems like quite an interesting question. Here's my minimal example, in a file test.c I have
void warn() {}
void my_fun() { warn(); }
I compile it and then run
$ R CMD SHLIB test.c
$ R -e "dyn.load('test.so'); .C('my_fun')"
With my Linux gcc version 4.6.3, the R output is
> dyn.load('test.so'); .C('my_fun')
R: Success
list()
with that "R: Success" coming from the warn function defined in libc (see man warn, defined in err.h). What happens is that R loads several dynamic libraries as a matter of course, and then loads test.so as instructed. When my_fun gets called, the dynamic linker resolves warn, but the rules of resolution are to search globally for the warn symbol, and not just in test.so. I really don't know what the global search rules are, perhaps in the order the .so's were opened, but whatever the case the resolution is not where I was expecting.
What is to be done? Specifying
static void warn() {}
forces resolution at compile time, when the .o is created, and hence avoids the problem. This wouldn't work if, for instance, warn was defined in one file (utilities.c) and my_fun in another. On Linux, dlopen (the function used to load a shared object) can be given the flag RTLD_DEEPBIND, which does symbol resolution locally before globally, but (a) R does not use dlopen that way and (b) there are several (see p. 9) reservations about this kind of approach. So as far as I can tell the best practice is to use static where possible, and to name functions carefully to avoid name conflicts. The latter is not quite as bad as it seems, since R loads package shared objects such that the package symbols themselves are NOT added to the global name space (see ?dyn.load and the local argument, and also note the OS-specific caveats).
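For what it's worth, here is a minimal sketch of the static approach applied to my test.c. The calls counter and the int return value are additions of mine so that the effect can be observed; they are not in the original example.

```c
/* Counter added for illustration only, so the call can be observed. */
static int calls = 0;

/* `static` gives warn internal linkage: the call in my_fun is bound to
   this definition when the object file is created, so the dynamic
   linker never gets a chance to pick libc's warn instead. */
static void warn(void) {
    calls++;
}

/* Entry point that would be reached from R via .C("my_fun"). */
int my_fun(void) {
    warn();
    return calls;
}
```

If the helper has to be shared between several source files, static is not an option, and a package-specific prefix (say mypkg_warn, a made-up name) is the fallback.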
I'd be interested in hearing of a more robust 'best practice'.

Locate unused structures and structure-members

Some time ago we took over the responsibility of a legacy code base.
One of the quirks of this very badly structured/written code was that
it contained a number of really huge structs, each containing
hundreds of members. One of the many steps that we did was to clean
out as much of the code as possible that wasn't used, hence the need
to find unused structs/struct members.
Regarding the structs, I conjured up a combination of python, GNU
Global and ctags to list the struct members that are unused.
Basically, what I'm doing is to use ctags to generate a tags file,
the python-script below parses that file to locate all struct
members and then using GNU Global to do a lookup in the previously
generated global-database to see if that member is used in the code.
This approach has a number of quite serious flaws, but it sort of
solved the issue we faced and gave us a good start for further
cleanup.
There must be a better way to do this!
The question is: How to find unused structures and structure members
in a code base?
#!/usr/bin/env python3
import os
import operator


def printheader(word):
    """generate a nice header string"""
    print("\n%s\n%s" % (word, "-" * len(word)))


class StructFreqAnalysis:
    """one struct from the tags file plus a usage count per member"""
    def __init__(self):
        self.path2hfile = ''
        self.name = ''
        self.id = ''
        self.members = []   # list of [member_name, usage_count]

    def show(self):
        print('path2hfile:', self.path2hfile)
        print('name:', self.name)
        print('members:', self.members)
        print()

    def sort(self):
        return sorted(self.members, key=operator.itemgetter(1))

    def prettyprint(self):
        '''display a sorted list'''
        print('struct:', self.name)
        print('path:', self.path2hfile)
        for i in self.sort():
            print(' ', i[0], ':', i[1])
        print()


f = open('tags', 'r')
x = {}  # struct_name -> class
y = {}  # internal tags id -> class

# first pass: collect the structs
for i in f:
    i = i.strip()
    if 'typeref:struct:' in i:
        line = i.split()
        x[line[0]] = StructFreqAnalysis()
        x[line[0]].name = line[0]
        x[line[0]].path2hfile = line[1]
        for j in line:
            if 'typeref' in j:
                s = j.split(':')
                x[line[0]].id = s[-1]
                y[s[-1]] = x[line[0]]

# second pass: attach each member to its struct
f.seek(0)
for i in f:
    i = i.strip()
    if 'struct:' in i:
        items = i.split()
        name = items[0]
        tag_id = items[-1].split(':')[-1]
        if tag_id:
            if tag_id in y:
                key = y[tag_id]
                key.members.append([name, 0])
f.close()

# do frequency count
for k, v in x.items():
    for i in v.members:
        cmd = 'global -a -s %s' % i[0]  # -a absolute path. use global to give src-file for member
        g = os.popen(cmd)
        for gout in g:
            if '.c' in gout:
                gout = gout.strip()
                f = open(gout, 'r')
                for line in f:
                    if '->' + i[0] in line or '.' + i[0] in line:
                        i[1] = i[1] + 1
                f.close()

printheader('All structures')
for k, v in x.items():
    v.prettyprint()

# show which structs that can be removed
printheader('These structs could perhaps be removed')
for k, v in x.items():
    if len(v.members) == 0:
        v.show()

printheader('Total number of probably unused members')
cnt = 0
for k, v in x.items():
    for i in v.members:
        if i[1] == 0:
            cnt = cnt + 1
print(cnt)
Edit
As proposed by @Jens-Gustedt, using the compiler is a good way to do it. I'm after an approach that can do a sort of "high-level" filtering before using the compiler approach.
If there are only a few structs, and if the code does no bad hacks such as accessing a struct through another type, then you could just comment out all the fields of your first struct and let the compiler tell you.
Uncomment one used field after the other until the compiler is satisfied. Then, once it compiles, do thorough testing to confirm the precondition that there were no hacks.
Iterate over all structs.
Definitively not pretty, but at the end you'd have at least one person who knows the code a bit.
Use coverity. This is a wonderful tool to detect code flaws, but is a bit costly.
This is a very old post, but I recently did the same using Python and gdb. I compiled the following snippet, which references the structure at the top of the hierarchy, and then used gdb to print the type of the structure and recursed into its members.
#include <usedheader.h>
UsedStructureInTop *to_print = 0;
int main(){return 0;}
(gdb) p to_print
(gdb) $1 = (UsedStructureInTop *) 0x0
(gdb) pt UsedStructureInTop
type = struct StructureTag {
members displayed here line by line
}
(gdb)
My purpose is a little different, though: it is to generate a header that contains only the structure UsedStructureInTop and the types it depends on. There are compiler options to do this, but they do not remove unused/unlinked structures found in the included header files.
Under C rules, it's possible to access struct members via another structure which has a similar layout. That means that you can access struct Foo {int a; float b; char c; }; via struct Bar { int x; float y; }; (except of course for Foo::c).
Hence, your algorithm is potentially flawed. It's bloody hard to find what you want, which BTW is why C is hard to optimize.
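To show why a textual search for ->a or .a can miss a use, here is a small sketch with the two structs above. I go through a union because C permits inspecting the common initial sequence of structs that share a union; the names FooOrBar and read_a_without_naming_it are made up for the example.

```c
struct Foo { int a; float b; char c; };
struct Bar { int x; float y; };

/* Foo and Bar share the common initial sequence {int, float}. */
union FooOrBar {
    struct Foo foo;
    struct Bar bar;
};

/* Reads Foo's member `a` although the identifier `a` never appears in
   an access expression: a search for "->a" or ".a" finds nothing here. */
int read_a_without_naming_it(const struct Foo *f) {
    union FooOrBar u;
    u.foo = *f;
    return u.bar.x;
}
```

A script that counts ->member and .member occurrences would report Foo's a as unused in a code base where this is the only access.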

What's wrong with the C code below?

The code below is from this answer:
#include <windows.h>
int main()
{
    HANDLE h = ::CreateFile(L"\\\\.\\d:", 0, 0, NULL, OPEN_EXISTING, 0, NULL);
    STORAGE_DEVICE_NUMBER info = {};
    DWORD bytesReturned = 0;
    ::DeviceIoControl(h, IOCTL_STORAGE_GET_DEVICE_NUMBER, NULL, 0, &info, sizeof(info), &bytesReturned, NULL);
}
When I compile and run the above, I get errors like this:
error C2059: syntax error : ':'
error C2059: syntax error : '}'
error C2143: syntax error : missing ';' before ':'
UPDATE
After saving the above as a .cpp file, I got this error:
error C2664: 'CreateFileA' : cannot convert parameter 1 from 'const wchar_t [7]' to 'LPCSTR'
Plain C doesn't have namespaces, so you need to leave out the :: global namespace specifiers. Those would only be valid in C++.
The L in front of the string specifies that this is a wide character string. If your project doesn't use wide characters, leave out the L to get a normal character string. If you need to support both variants you can also use the _T macro: _T("...") expands to the correct variant of string literal depending on your project settings.
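The difference between the two kinds of literal can be seen in portable C, without any Windows headers. In the sketch below MY_UNICODE, MY_T and my_char are made-up stand-ins that only imitate how UNICODE, _T and TCHAR select a variant; they are not the real Windows definitions.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-ins imitating the UNICODE / _T / TCHAR mechanism. */
#ifdef MY_UNICODE
#define MY_T(s) L##s
typedef wchar_t my_char;
#else
#define MY_T(s) s
typedef char my_char;
#endif

/* Six characters plus the terminator: an array of 7 char. */
const char narrow_path[] = "\\\\.\\d:";

/* Same text as a wide literal: an array of 7 wchar_t, which is the
   'const wchar_t [7]' named in the compiler error. */
const wchar_t wide_path[] = L"\\\\.\\d:";

/* Picks one of the two forms depending on the project setting. */
const my_char generic_path[] = MY_T("\\\\.\\d:");
```

Passing wide_path to a function that expects a narrow string is the mismatch the compiler complains about, so either the literal or the function variant has to change.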
I'm pretty sure that you can't use :: as part of an identifier name in the C programming language. This looks more like some bizarre, bastardized usage of C++. IIRC, :: by itself in front of an identifier specified that this was in the global or file scope (to avoid potentially clashing with, say, methods in a class).
It's not C, it's C++. Drop the double colons (or compile it as C++).
Also, the string constant uses wide characters. Drop the L in front of the open quote.
So, I guess your question is more about getting the result you want than the reason for the code not compiling :)
Now you've got it working, the DeviceIoControl() call will be filling in the STORAGE_DEVICE_NUMBER structure you're passing to it with the results you want. So, you should find that info.DeviceType now holds the device type, info.DeviceNumber holds the device number, and info.PartitionNumber the partition number.
Test for success before using the returned values. I'd maybe try it on the C: drive rather than the D: drive, as you're doing at the moment, until I was sure it was working; at least you know you've pretty much always got a C: drive in Windows :) So, use \\\\.\\c: rather than \\\\.\\d:.
Anyway, untested, but:
if (::DeviceIoControl(h, IOCTL_STORAGE_GET_DEVICE_NUMBER, NULL, 0, &info, sizeof(info), &bytesReturned, NULL))
{
    std::cout << "Device: " << info.DeviceNumber <<
        " Partition: " << info.PartitionNumber << std::endl;
} else {
    std::cerr << "Ooops. The DeviceIoControl call returned failure." << std::endl;
}
...obviously, you'd need a #include <iostream> for this to work, as I'm dumping the values using iostream, but you can print these out however you want, or message box them, bearing in mind you're already bringing in windows.h.
Bear in mind it's been a decade or so since I did any Windows C++ work, but maybe this'll get you going...
