I’m trying to extend OCaml-Xlib with bindings for the MIT-SHM extension.
It’s the first time I’m trying to interface C with OCaml and I’ve never written anything in C, so I guess I’m doing something stupid somewhere.
I first added the XShmQueryExtension function. I added the following to the Xlib.ml file
external xShmQueryExtension: dpy:display -> bool = "ml_xShmQueryExtension"
the following to the wrap_xlib.c file
CAMLprim value
ml_xShmQueryExtension( value dpy )
{
int ans = XShmQueryExtension( Display_val(dpy) );
return Val_bool(ans);
}
I changed the Makefile to link with Xext, and it works: when I call the xShmQueryExtension function from OCaml I get true.
Now I’m trying to write a function creating a shared xImage, initializing the shared memory and attaching it to the X server. I added the following to the Xlib.ml file:
type xShmSegmentInfo
external xShmCreateImageAndAttach:
dpy:display -> visual:visual -> depth:int -> fmt:ximage_format
-> width:uint -> height:uint -> xShmSegmentInfo * xImage
= "ml_xShmCreateImageAndAttach_bytecode"
"ml_xShmCreateImageAndAttach"
and the following to the wrap_xlib.c file:
#define Val_XShmSegmentInfo(d) ((value)(d))
#define XShmSegmentInfo_val(v) ((XShmSegmentInfo *)(v))
CAMLprim value
ml_xShmCreateImageAndAttach( value dpy, value visual, value depth, value format,
value width, value height)
{
CAMLparam5(dpy, visual, depth, format, width);
CAMLxparam1(height);
CAMLlocal1(ret);
XShmSegmentInfo *shminfo = malloc(sizeof(XShmSegmentInfo));
XImage *ximage = XShmCreateImage(
Display_val(dpy),
Visual_val(visual),
Int_val(depth),
XImage_format_val(format),
NULL,
shminfo,
UInt_val(width),
UInt_val(height)
);
shminfo->shmid = shmget (IPC_PRIVATE,
ximage->bytes_per_line * ximage->height, IPC_CREAT|0777);
shminfo->shmaddr = ximage->data = (char *) shmat (shminfo->shmid, 0, 0);
if (shminfo->shmaddr == -1)
fprintf(stderr,"Error");
shminfo->readOnly = False;
XShmAttach (Display_val(dpy), shminfo);
ret = caml_alloc(2, 0);
Store_field(ret, 0, Val_XShmSegmentInfo(shminfo) );
Store_field(ret, 1, Val_XImage(ximage) );
CAMLreturn(ret);
}
CAMLprim value
ml_xShmCreateImageAndAttach_bytecode( value * argv, int argn )
{
return ml_xShmCreateImageAndAttach(argv[0], argv[1], argv[2], argv[3],
argv[4], argv[5]);
}
Now I’m calling this function in my OCaml program:
let disp = xOpenDisplay ""
let screen = xDefaultScreen disp
let (shminfo, image) = xShmCreateImageAndAttach disp
(xDefaultVisual disp screen)
(xDefaultDepth disp screen) ZPixmap 640 174
This is a toplevel call in my OCaml program, and I never use the variables shminfo and image again (this is just to test that the function works). The call does not fail, but my program segfaults a little while later. The rest of the program constantly dumps the screen with xGetImage and processes the pixels, and the segfault occurs in a call to xGetPixel that has nothing to do with the call to xShmCreateImageAndAttach above.
I noticed that if I remove the line shminfo->shmaddr = ximage->data = (char *) shmat (shminfo->shmid, 0, 0); I don’t get the segfault anymore (but of course this won’t do what I want).
I assume that this has to do with the garbage collector somehow but I don’t know how to fix it.
The OCaml documentation has a warning about casting pointers obtained with malloc to the value type, but I don’t really understand what it means or whether it’s relevant here.
Edit:
I replaced the two lines following shmat by the following:
fprintf(stderr,"%i\n",(int)shminfo->shmaddr);
fflush(stderr);
and I get something like 1009700864, so the call to shmat seems to be working.
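A side note on the error check itself: shmat reports failure by returning (void *) -1 rather than NULL, so the comparison needs a cast on the right-hand side, and the result of shmget is worth checking too. The following is a minimal standalone sketch with no X calls involved; the helper names attach_private_segment and release_segment are made up for illustration, and it is not claimed to be the cause of the segfault.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment of `size` bytes and attach it.
   Returns the mapped address, or NULL on failure.
   *shmid_out receives the segment id, or -1 on failure. */
static void *attach_private_segment(size_t size, int *shmid_out)
{
    *shmid_out = -1;

    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid == -1) {
        perror("shmget");
        return NULL;
    }

    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1) {          /* shmat signals failure with (void *) -1 */
        perror("shmat");
        shmctl(shmid, IPC_RMID, NULL);  /* do not leak the segment */
        return NULL;
    }

    *shmid_out = shmid;
    return addr;
}

/* Detach the segment and mark it for removal. */
static void release_segment(void *addr, int shmid)
{
    shmdt(addr);
    shmctl(shmid, IPC_RMID, NULL);
}
```

In the wrapper above, the equivalent check would compare shminfo->shmaddr against (char *) -1.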
Here is the backtrace given by gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7acdde8 in ?? () from /usr/lib/libX11.so.6
(gdb) backtrace
#0 0x00007ffff7acdde8 in ?? () from /usr/lib/libX11.so.6
#1 0x000000000044070c in ml_XGetPixel ()
#2 0x00000000004165b9 in camlInit__rvb_at_1023 () at init.ml:43
#3 0x0000000000415743 in camlParse__find_guy_1046 () at parse.ml:58
#4 0x000000000041610c in camlParse__pre_parse_1044 () at parse.ml:95
#5 0x0000000000415565 in camlGame__entry () at game.ml:26
#6 0x00000000004141f9 in caml_program ()
#7 0x000000000045c03e in caml_start_program ()
#8 0x000000000044afa5 in caml_main ()
#9 0x000000000044afe0 in main ()
The warning is relevant if X is going to call free() on the shminfo pointer that you're casting to the value type. The problem is that OCaml assumes that values can be freely copied and handled later by GC. This isn't true for your pointer value, so there will potentially be dangling copies of the pointer. Also, the space can get reused as part of OCaml's heap, and then you have real trouble.
It doesn't seem to me that X will do this, and since you don't call free() in your code, I don't think this is the problem. But it could be--I don't know how X works.
It might be good to call fflush(stderr) after your call to fprintf(). It probably won't change anything, but I've found my tracing messages tend to get buffered up and never appear when the program crashes.
It would also be good to know what the segfaulting address looks like. Is it near 0? Or a big address in the middle of the heap somewhere?
Sorry I can't pinpoint your error. I don't see anything you're doing wrong after 4 or 5 readings of the code, assuming Display_val and the rest are working correctly. But this is tricky to get right.
What do you need to do to allocate an immortal OCaml object off-heap in a C function? In particular, how do you make an OCaml value that looks to the runtime the way a global variable in OCaml source code would?
Here is my attempt at producing an intentionally broken program that neglects to register a value as a GC root.
Here's the OCaml source file driving everything.
(* immortal_string.ml *)
external make_string : string -> unit = "make_string"
external get_string : unit -> string = "get_string"
let () = make_string "a"
let () = Gc.full_major ()
let () = Printf.printf "%s\n" (get_string ())
And the C implementation. There are probably better ways to do this than using 0 as a sentinel value and a function static, but I think the intent is clear. Note that multiple calls to make_string will clobber the value that was originally there, but that's okay. I want the memory pointed to by old values to be reclaimed by the garbage collector.
// lib_immortal_string.c
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
value *storage(void) {
// BAD! we haven't registered this thing
// as a GC root. No clue how you do that.
static value data = 0; // sentinel, will never be valid OCaml value
if (data == 0) {
data = caml_copy_string("");
}
return &data;
}
CAMLprim value
make_string(value ml_string) {
CAMLparam1(ml_string);
*storage() = ml_string;
CAMLreturn(Val_unit);
}
CAMLprim value
get_string(value ml_unit) {
CAMLparam1(ml_unit);
CAMLreturn(*storage());
}
I was expecting this program to segfault since there's nothing obvious keeping data in the storage function alive. data is not global and isn't on the stack. However, the program appears to run without reclaiming the string.
$ ocamlopt immortal_string.ml lib_immortal_string.c
$ ./a.out
a
So my question is, what is the proper way to make a global OCaml value / off-heap thing? And, furthermore, why does the above program appear to work instead of crashing?
There's virtually no reuse of memory going on in your small program, so I guess the string "a" still looks the same after the garbage collection, even if it's not referenced by anything.
If you call GC every time, it just returns things to a well-ordered state. It would be better to let the GC happen normally, which will run through many more possible memory states.
You also have to do some looping to give it time to fail.
A slightly modified version of your code does in fact segfault for me:
external make_string : string -> unit = "make_string"
external get_string : unit -> string = "get_string"
let () =
while true do
let a = String.make (1024 * 1024) 'a' in
make_string a;
let b = String.make (1024 * 1024) 'b' in
Printf.printf "%s %s\n" (get_string ()) b
done
The way to mark a value as a GC root is with caml_register_global_root:
caml_register_global_root(&data);
Just as you expected, if I call this under if (data == 0) { there is no segfault.
This is documented in Section 19.5 of the OCaml manual.
I've encountered some strange behaviour in my program.
When I run my program 3-4 times and close it immediately each time, it starts giving me segmentation faults before it even starts. If I haven't run it for a while, it opens without problems the first 2-3 times and then segfaults again.
I am open to suggestions about what could cause this kind of problem.
The project is quite big, so I don't know exactly where to look. If someone wants to see the source code, here it is:
https://github.com/rokn/Helsys3
Let me walk you through a debugging session, but in future you should do this yourself.
If the problem can be reproduced easily, tracking it down is fairly straightforward:
gdb ./GAME
(gdb) r
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b2d10c in ?? () from /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0
(gdb) bt
#0 0x00007ffff7b2d10c in ?? () from /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0
#1 0x000000000040650c in sprite_free_age ()
#2 0x00000000004065f9 in AGE_SpriteLoad ()
#3 0x0000000000402740 in BattlefieldLoad () at battlefield.c:89
#4 0x0000000000402794 in BattlefieldInit (battlefield=0x1dca440, battlefieldId=1)
at battlefield.c:96
#5 0x0000000000405349 in BattleInitialize (leftTeam=0x60dff0 <leftTeam>,
rightTeam=0x60e010 <rightTeam>, battlefieldId=1) at battle.c:13
#6 0x0000000000401e8f in LoadContent () at main.c:90
#7 0x0000000000401d2b in main (argc=1, argv=0x7fffffffdfc8) at main.c:49
It crashed inside SDL, and the last function of yours on the stack is sprite_free_age. Here it is (AGE/AGE_Sprite.c):
void sprite_free_age(AGE_Sprite *sprite)
{
if( sprite->texture != NULL )
{
SDL_DestroyTexture( sprite->texture );
sprite->texture = NULL;
sprite->Width = 0;
sprite->Height = 0;
}
}
The only SDL call is SDL_DestroyTexture, and a NULL check is performed first, which means sprite->texture holds garbage (not NULL, but not a valid SDL texture either). sprite_free_age was called from AGE_SpriteLoad:
bool AGE_SpriteLoad(AGE_Sprite *sprite, char *path)
{
sprite_free_age(sprite);
SDL_Texture *finalTexture = NULL;
SDL_Surface *loadedSurface = IMG_Load(path);
// ... the rest is omitted
So whenever you call AGE_SpriteLoad, it first tries to free the texture the sprite may already hold. It was called from BattlefieldLoad at battlefield.c:89:
void BattlefieldLoad()
{
assert(AGE_SpriteLoad(&squareWalkable, "Resources/Battlefield/SquareWalkable.png"));
assert(AGE_SpriteLoad(&squareSelected, "Resources/Battlefield/SquareSelected.png"));
assert(AGE_SpriteLoad(&squareEnemy, "Resources/Battlefield/SquareEnemy.png"));
assert(AGE_SpriteCreateBlank(&battlefieldField, LevelWidth, LevelHeight, SDL_TEXTUREACCESS_TARGET));
AGE_ListInit(&objectsList, sizeof(AGE_Sprite));
AGE_Sprite objectSprite;
int i;
char buffer[100];
for (i = 0; i < BATTLEFIELD_OBJECTS_COUNT; ++i)
{
snprintf(buffer, sizeof(buffer), "Resources/Battlefield/Object_%d.png", i+1);
AGE_SpriteLoad(&objectSprite, buffer);
AGE_ListAdd(&objectsList, &objectSprite);
}
}
Here you have an uninitialised AGE_Sprite objectSprite, and you're calling AGE_SpriteLoad on it, which tries to free the old data (which is uninitialised, hence garbage) and may crash. The first thing that comes to mind is to zero objectSprite, either with memset or by initialising it at declaration, e.g. AGE_Sprite objectSprite = {0}; (empty braces, = {}, are a C++ and C23 form that older C compilers accept only as an extension).
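Both ways of zeroing the struct can be sketched as follows. The Sprite type here is a hypothetical stand-in for AGE_Sprite so that the example needs no SDL headers, and the helper names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for AGE_Sprite; the real struct holds an SDL_Texture *. */
typedef struct {
    void *texture;
    int   Width;
    int   Height;
} Sprite;

/* Mirrors the test in sprite_free_age: a free is attempted only when the
   texture pointer is non-NULL, so garbage in that field is dangerous. */
static int sprite_needs_free(const Sprite *sprite)
{
    return sprite->texture != NULL;
}

/* Option 1: initialise at declaration; {0} zeroes every member. */
static Sprite sprite_zeroed_at_declaration(void)
{
    Sprite s = {0};
    return s;
}

/* Option 2: clear an existing object before its first use.
   All-bits-zero is a null pointer on mainstream platforms. */
static void sprite_clear(Sprite *s)
{
    memset(s, 0, sizeof *s);
}
```

With either option the first call to the load function sees a NULL texture and skips the destroy call.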
Here is a C function that segfaults:
void compileShaders(OGL_STATE_T *state) {
// First testing to see if I can access object properly. Correctly outputs:
// nsHandle: 6
state->nsHandle = 6;
printf("nsHandle: %d\n", state->nsHandle);
// Next testing if glCreateProgram() returns proper value. Correctly outputs:
// glCreateProgram: 1
printf("glCreateProgram: %d\n", glCreateProgram());
// Then the program segfaults on the following line according to gdb
state->nsHandle = glCreateProgram();
}
For the record state->nsHandle is of type GLuint and glCreateProgram() returns a GLuint so that shouldn't be my problem.
gdb says that my program segfaults on line 303 which is actually the comment line before that line. I don't know if that actually matters.
Is gdb lying to me? How do I debug this?
EDIT:
Turned off optimizations (-O3) and now it's working. If somebody could explain why that would be great though.
EDIT 2:
For the purpose of the comments, here's a watered down version of the important components:
typedef struct {
GLuint nsHandle;
} OGL_STATE_T;
int main (int argc, char *argv[]) {
OGL_STATE_T _state, *state=&_state;
compileShaders(state);
}
EDIT 3:
Here's a test I did:
int main(int argc, char *argv[]) {
OGL_STATE_T _state, *state=&_state;
// Assign value and try to print it in other function
state->nsHandle = 5;
compileShaders(state);
}
void compileShaders(OGL_STATE_T *state) {
// Test to see if the first call to state is getting optimized out
// Correctly outputs:
// nsHandle (At entry): 5
printf("nsHandle (At entry): %d\n", state->nsHandle);
}
Not sure if that helps anything or if the compiler would actually optimize the value from the main function.
EDIT 4:
Printed out pointer address in main and compileShaders and everything matches. So I'm gonna assume it's segfaulting somewhere else and gdb is lying to me about which line is actually causing it.
This is guesswork based on what you have posted, but with optimization on, these lines:
state->nsHandle = 6;
printf("nsHandle: %d\n", state->nsHandle);
is probably optimized to just
printf("nsHandle: 6\n");
So the first real access to state is where the segfault occurs. With optimization on, GDB can report odd line numbers for the fault because the running code may no longer map cleanly to source lines, as the example above shows.
As mentioned in the comments, state is almost certainly not initialized. Some other difference in the optimized code causes it to point to an invalid memory area, whereas in the non-optimized code it points somewhere valid.
This might happen if you're doing something with pointers directly that prevents the optimizer from 'seeing' that a given variable is used.
A sanity check that state != 0 would be useful, but it will not help if the pointer is non-zero but invalid.
You'd need to post the calling code for anyone to tell you more. However, you asked how to debug it: I would print (or use GDB to view) the value of state when that function is entered. I imagine it will be vastly different in the optimized and non-optimized versions. Then trace back to the function call to work out why that's the case.
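A minimal sketch of that kind of entry check, assuming a stand-in integer type instead of GLuint so that no GL headers are needed; the function name set_handle is hypothetical. It logs the pointer with %p on entry and refuses a NULL pointer, which catches the zero case but, as noted, not a non-zero garbage pointer.

```c
#include <stdio.h>
#include <stddef.h>

typedef unsigned int handle_t;   /* stand-in for GLuint */

typedef struct {
    handle_t nsHandle;
} OGL_STATE_T;

/* Returns 0 on success, -1 if state is NULL. Logs the pointer on entry so
   the value can be compared between optimized and non-optimized builds. */
static int set_handle(OGL_STATE_T *state, handle_t handle)
{
    fprintf(stderr, "set_handle: state = %p\n", (void *) state);

    if (state == NULL)
        return -1;

    state->nsHandle = handle;
    return 0;
}
```

Printing the same pointer in the caller just before the call makes it easy to see whether the value changes on the way in.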
EDIT
You posted the calling code, and that should be fine. Are you getting warnings when compiling? (Turn all the warnings on with -Wall.) In any case, my advice about printing the value of state in different scenarios still stands.
(removed comment about adding & since you edited the question again)
When you optimize your program, there is no longer a 1:1 mapping between source lines and emitted code.
Typically, the compiler will reorder the code to be more efficient for your CPU, inline function calls, and so on.
This code is wrong:
*state=_state
It should be:
*state=&_state
Well, you edited your post, so ignore the above fix.
Check for NULL before dereferencing or reading through a pointer. If the values you pass or the values stored are NULL and you use them without any checks, you will hit a segfault.
FYI: GDB can't lie!
I ended up starting a new thread with more relevant information and somebody found the answer. New thread is here:
GCC: Segmentation fault and debugging program that only crashes when optimized
I'm trying to share a variable between C and Tcl. The problem is that when I try to read the variable in the C thread, I get a segmentation fault. I'm not sure this is the right way to do it, but it seems to work for ints. The segmentation fault happens on the line where I try to print "Var"; what I actually want is to read the variable and take the corresponding action when it changes.
Here is the C code that I'm using:
void mode_service(ClientData clientData) {
while(1) {
char* Var = (char *) clientData;
printf("%s\n", Var);
usleep(100000); //100ms
}
}
static int mode_thread(ClientData cdata, Tcl_Interp *interp, int objc, Tcl_Obj *const objv[]) {
Tcl_ThreadId id;
ClientData limitData;
limitData = cdata;
id = 0;
Tcl_CreateThread(&id, mode_service, limitData, TCL_THREAD_STACK_DEFAULT, TCL_THREAD_NOFLAGS);
printf("Tcl_CreateThread id = %d\n", (int) id);
// Wait thread process, before returning to TCL prog
int i, aa;
for (i=0 ; i<100000; i++) {aa = i;}
// Return thread ID to tcl prog to allow mutex use
Tcl_SetObjResult(interp, Tcl_NewIntObj((int)id));
printf("returning\n");
return TCL_OK;
}
int DLLEXPORT Modemanager_Init(Tcl_Interp *interp){
if (Tcl_InitStubs(interp, TCL_VERSION, 0) == NULL) {
return TCL_ERROR;
}
if (Tcl_PkgProvide(interp, "PCIe", "1.0") == TCL_ERROR) {
return TCL_ERROR;
}
// Create global Var
int *sharedPtr=NULL;
//sharedPtr = sharedPtr = (char *) Tcl_Alloc(sizeof(char));
Tcl_LinkVar(interp, "mode", (char *) &sharedPtr, TCL_LINK_STRING);
Tcl_CreateObjCommand(interp, "mode_thread", mode_thread, sharedPtr, NULL);
return TCL_OK;
}
In the Tcl code, I'm changing the variable mode whenever the user presses a button, for example:
set mode "Idle"
button .startSamp -text "Sample Start" -width 9 -height 3 -background $btnColor -relief flat -state normal -command {set mode "Sampling"}
set threadId [mode_thread]
puts "Created thread $threadId, waiting"
Your code is a complete mess! You need to decide what you are doing and then do just that. In particular, you are using Tcl_LinkVar so you need to decide what sort of variable you are linking to. If you get a mismatch between the storage, the C access pattern and the declared semantic type, you'll get crashes.
Because your code is in too complicated a mess for me to figure out exactly what you want to do, I'll illustrate with less closely related examples. You'll need to figure out from them how to change things in your code to get the result you need.
Linking Integer Variables
Let's do the simple case: a global int variable (declared outside any function).
int sharedVal;
You want your C code to read that variable and get the value. Easy! Just read it as it is in scope. You also want Tcl code to be able to write to that variable. Easy! In the package initialization function, put this:
Tcl_LinkVar(interp /* == the Tcl interpreter context */,
"sharedVal" /* == the Tcl name */,
(char *) &sharedVal /* == pointer to C variable */,
TCL_LINK_INT /* == what is it! An integer */);
Note that after that (until you Tcl_UnlinkVar) whenever Tcl code reads from the Tcl variable, the current value will be fetched from the C variable and converted.
If you want that variable to be on the heap, you then do:
int *sharedValPtr = malloc(sizeof(int));
C code accesses using *sharedValPtr, and you bind to Tcl with:
Tcl_LinkVar(interp /* == the Tcl interpreter context */,
"sharedVal" /* == the Tcl name */,
(char *) sharedValPtr /* == pointer to C variable */,
TCL_LINK_INT /* == what is it! An integer */);
Linking String Variables
There's a bunch of other semantic types as well as TCL_LINK_INT (see the documentation for a list) but they all follow that pattern except for TCL_LINK_STRING. With that, you do:
char *sharedStr = NULL;
Tcl_LinkVar(interp, "sharedStr", (char *) &sharedStr, TCL_LINK_STRING);
You also need to be aware that the string will always be allocated with Tcl_Alloc (which is substantially faster than most system memory allocators for typical Tcl memory usage patterns) and not with any other memory allocator, and so will also always be deallocated with Tcl_Free. Practically, that means if you set the string from the C side, you must use Tcl_Alloc to allocate the memory.
Posting Update Notifications
The final piece to note is that when you set the variable from the C side but want Tcl to notice that the change has happened (e.g., because a trace has been set or because you've surfaced the value in a Tk GUI), you should call Tcl_UpdateLinkedVar to let Tcl know that a change has happened that it should pay attention to. If you never use traces (or Tk GUIs, or the vwait command) to watch the variable for updates, you can ignore this API call.
Donal's answer is correct, but I'll try to show you what you did with your ClientData.
To clarify: all (or almost all, I'm not sure) Tcl functions that take a function pointer also take a parameter of type ClientData that is passed to your function when Tcl calls it.
Let's take a look at this line:
Tcl_CreateObjCommand(interp, "mode_thread", mode_thread, NULL, NULL);
// ------------------------------------------------------^^^^
You always pass NULL as ClientData to the mode_thread function.
In the mode_thread function you use the passed ClientData (NULL) to pass it as ClientData to the new Thread:
limitData = cdata;
// ...
Tcl_CreateThread(&id, mode_service, limitData, TCL_THREAD_STACK_DEFAULT, TCL_THREAD_NOFLAGS);
In the mode_service function you use the ClientData (which is still NULL) as pointer to a char array:
char* Var = (char *) clientData;
Which is a pointer to the address 0x00.
And then you tell printf to dereference this NULL pointer:
printf("%s\n", Var);
Which obviously crashes your program.
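One possible way around this, sketched in plain C with the Tcl calls left out: hand the thread the address of the linked char * (that is, a char **) instead of its current value, keep that variable at file scope so its address stays valid after the init function returns, and guard against NULL before printing. The names shared_mode, mode_client_data and mode_current are made up for illustration, and the sketch ignores synchronisation between the two threads.

```c
#include <stddef.h>

/* File scope, so the address stays valid after the init function returns.
   In the real extension this would be the variable handed to Tcl_LinkVar. */
static char *shared_mode = NULL;

/* What the init code would pass as ClientData: the variable's address,
   not the (initially NULL) pointer stored in it. */
static void *mode_client_data(void)
{
    return &shared_mode;
}

/* One iteration of the service loop: returns the current string, or a
   placeholder while the variable is still unset. Never dereferences NULL. */
static const char *mode_current(void *client_data)
{
    char **var = (char **) client_data;

    if (var == NULL || *var == NULL)
        return "(unset)";
    return *var;
}
```

Because the thread re-reads *var on every iteration, it sees the new string after the Tcl side replaces it.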
I am learning about threads and implementing them in my application. So far things are going well, but I still have a few unanswered questions that are slowing me down. I would appreciate it if anyone could answer any of them.
1. In CreateThread(), can the thread function take only one argument? On the MSDN website and in all the other examples I have seen, there is only one argument, an LPVOID.
2. What does DWORD WINAPI mean as a return type? Can we use only DWORD, or int or any other return type? I suppose it has something to do with HANDLE (maybe).
3. I want to use an array of threads, so I learned about arrays of functions. As I understand it, a thread is just a function called by the CreateThread() routine, so I tried to apply that concept here, but the DWORD WINAPI return type would not let me.
4. I have one single thread for saving files. Now I want an array of them so that I can save multiple files at the same time (not exactly the same starting time, but saving files in parallel). How can I do that?
Thanks
Shan
1. Indeed, the thread function can take only one argument, of type void * (LPVOID). However, since it can point to anything, it can point to a struct or object (usually allocated on the heap for lifetime reasons).
2. WINAPI is not part of the return type; it's the function's calling convention. The function must return a DWORD or anything that fits in one. It must NOT return a pointer, because a pointer can't fit in a DWORD on Win64.
3. I don't understand; please elaborate on what you're trying to do.
4. Usually for this you need a single thread function, passed several times to CreateThread() with a different argument each time. Don't forget to keep the thread handles (which you'll likely save in an array) until you stop needing them, then close them with CloseHandle().
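The single-pointer argument and the one-function-many-arguments pattern can be sketched as below. It is written without Windows headers so that it stays portable: save_job_t and save_file_thread are hypothetical names, and on Windows the routine would instead be declared as DWORD WINAPI save_file_thread(LPVOID arg) and passed to CreateThread() once per job.

```c
#include <stdio.h>
#include <stddef.h>

/* Everything one worker needs, packed behind the single pointer argument. */
typedef struct {
    const char  *path;     /* output file name */
    const short *samples;  /* data to write */
    size_t       count;    /* number of samples */
} save_job_t;

/* Same shape as a thread routine: one void * in, an integer status out.
   Returns 0 on success, 1 on failure. */
static unsigned long save_file_thread(void *arg)
{
    const save_job_t *job = (const save_job_t *) arg;

    FILE *out = fopen(job->path, "w");
    if (out == NULL)
        return 1;

    /* One sample per line, as in the original file-saving loop. */
    for (size_t i = 0; i < job->count; i++)
        fprintf(out, "%hd\n", job->samples[i]);

    return fclose(out) == 0 ? 0 : 1;
}
```

To save several files in parallel, one would fill an array of save_job_t, start one thread per element with the same routine, and keep the returned handles in a second array. Each job must stay alive until its thread has finished.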
Regarding point number three, I think I understand now and will try a different approach. I was using
DWORD WINAPI save_uwpi_file0( LPVOID )
{
while(1)
{
if(release == 1 && flag_oper1 == 1)
{
int w_cnt = 0; FILE *opfile;
char fname[30] = "txt_file0.txt";
//opening file for write
opfile = fopen(fname , "w");
printf("assigning memory for file 1 \n");
ssint *Lmem = (ssint *)malloc( sizeof(ssint)*size_of_memory);
memcpy(Lmem, pInDMA, sizeof(ssint)*size_of_memory);
release = 0;
printf("relseaing for second file saving\n");
for( int nbr = 0; nbr < size_of_memory; nbr++){
fprintf(opfile , "%hi\n", Lmem[nbr] );
}
printf("aligned free 1\n");
free(Lmem);
fclose(opfile);
printf("File saved 1\n\n");
return 1;
} //if statement ends
}
}
and I was using the following to declare a pointer to the (thread) function
DWORD WINAPI (* save_uwpi_file0)(LPVOID);
I guess I should try something like
DWORD (* save_uwpi_file0)(LPVOID);
I will do it and post the result here
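For reference, a sketch of the declaration syntax in portable C, with a made-up typedef thread_fn and two placeholder routines. With the Windows headers the calling convention goes inside the parentheses, as in DWORD (WINAPI *fn)(LPVOID), and the pointer needs a different name from the function it points to.

```c
#include <stddef.h>

/* Pointer-to-thread-routine type: one void * in, an integer status out. */
typedef unsigned long (*thread_fn)(void *);

/* Placeholder routines; real ones would do the file saving. */
static unsigned long save_file0(void *arg) { (void) arg; return 0; }
static unsigned long save_file1(void *arg) { (void) arg; return 1; }

/* An array of routines; each entry could be handed to a thread-creation call. */
static const thread_fn save_routines[] = { save_file0, save_file1 };

enum { SAVE_ROUTINE_COUNT = sizeof save_routines / sizeof save_routines[0] };
```

An array of functions is only needed when the routines really differ; if they differ only in the file they write, a single routine with a different argument per thread is simpler.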