When to "mortalize" a variable in Perl Inline::C

I am trying to wrap a C library into Perl. I have tinkered with XS but being unsuccessful I thought I should start simply with Inline::C. My question is on Mortalization. I have been reading perlguts as best as I am able, but am still confused. Do I need to call sv_2mortal on an SV* that is to be returned if I am not pushing it onto the stack?
(PS I really am working on a less than functional knowledge of C which is hurting me. I have a friend who knows C helping me, but he doesn't know any Perl).
I am providing a sample below. The function FLIGetLibVersion simply puts len characters of the library version onto char* ver. My question is will the version_return form of my C code leak memory?
N.B. Any other comments on this code are welcome.
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1;
use Inline (
    C => 'DATA',
    LIBS => '-lm -lfli',
    FORCE_BUILD => 1,
);
say version_stack();
say version_return();
__DATA__
__C__
#include <stdio.h>
#include "libfli.h"
void version_stack() {
    Inline_Stack_Vars;
    Inline_Stack_Reset;
    size_t len = 50;
    char ver[len];
    FLIGetLibVersion(ver, len);
    Inline_Stack_Push(sv_2mortal(newSVpv(ver,strlen(ver))));
    Inline_Stack_Done;
}
SV* version_return() {
    size_t len = 50;
    char ver[len];
    FLIGetLibVersion(ver, len);
    SV* ret = newSVpv(ver, strlen(ver));
    return ret;
}
Edit:
In an attempt to answer this myself, I tried changing the line to
SV* ret = sv_2mortal(newSVpv(ver, strlen(ver)));
and now when I run the script I get the same output that I did previously plus an extra warning. Here is the output:
Software Development Library for Linux 1.99
Software Development Library for Linux 1.99
Attempt to free unreferenced scalar: SV 0x2308aa8, Perl interpreter: 0x22cb010.
I imagine that this means that I don't need to mortalize in this case? I suspect that the error is saying that I marked for collection something that was already in line for collection. Can someone confirm for me that that is what that warning means?

I've been maintaining Set::Object for many years and had this question, too - perhaps best to look at the source of that code to see when stuff should be mortalised (github.com/samv/Set-Object). I know Set::Object has it right after many changes. I think though, it's whenever you're pushing the SV onto the return stack. Not sure how Inline changes all that.

Related

How to 'tag' a location in a C source file for a later breakpoint definition?

Problem:
I want to be able to put different potentially unique or repeated "tags" across my C code, such that I can use them in gdb to create breakpoints.
Similar Work:
Breakpoints to line-numbers: The main difference with breakpoints on source lines, is that if the code previous to the tag is modified in such a way that it results in more or less lines, a reference to the tag would still be semantically correct, a reference to the source line would not.
Labels: I am coming from my previous question, How to tell gcc to keep my unused labels?, in which I assumed that the answer was to insert labels. After discussion with knowledgeable members of the platform, I learned that label names are not preserved after compilation. Labels not used within the C code are removed by the compiler.
Injecting asm labels: Related to the previous approach, if I inject asm code in the C source, certain problems arise, due to inline functions, compiler optimizations, and lack of scoping. This makes this approach not robust.
Define a dummy function: On this other question, Set GDB breakpoint in C file, there is an interesting approach, in which a "dummy" function can be placed in the code, and then add a breakpoint to the function call. The problem with this approach is that the definition of such function must be replicated for each different tag.
Is there a better solution to accomplish this? Or a different angle to attack the presented problem?
Using SDT (Statically Defined Tracing) probe points appears to satisfy all the requirements.
GDB documentation links to examples of how to define the probes.
Example use: (gdb) break -probe-stap my_probe (this should be documented in the GDB breakpoints section, but currently isn't).
You could create a dummy variable and set it to different values. Then you can use conditional watchpoints. Example:
#include <stdio.h>
static volatile int loc;
int main()
{
    loc = 1;
    puts("hello world");
    loc = 2;
    return 0;
}
(gdb) watch loc if loc == 2
Hardware watchpoint 1: loc
(gdb) r
Starting program: /tmp/a.out
hello world
Hardware watchpoint 1: loc
Old value = 1
New value = 2
main () at test.c:8
8 return 0;
You can of course wrap the assignment in a macro so you only get it in debug builds. Usual caveats apply: optimizations and inlining may be affected.
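A minimal sketch of the macro wrapper mentioned above, assuming the standard NDEBUG convention decides whether the tags are compiled in. The names DEBUG_TAG, debug_tag and work are made up for illustration and do not come from the original answer.

```c
#include <stdio.h>

#ifndef NDEBUG
/* Debug build: each tag is a store to a volatile variable gdb can watch. */
static volatile int debug_tag;
#define DEBUG_TAG(n) ((void)(debug_tag = (n)))
#else
/* Release build (NDEBUG defined): tags compile to nothing. */
#define DEBUG_TAG(n) ((void)0)
#endif

/* Hypothetical function carrying two tags. */
int work(void)
{
    DEBUG_TAG(1);
    puts("hello world");
    DEBUG_TAG(2);
    return 0;
}
```

In a debug build, something like (gdb) watch debug_tag if debug_tag == 2 should then stop at the second tag, in the same way as the loc example above.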
Use python to search a source file for some predefined labels, and set breakpoints there:
import gdb

def break_on_labels(source, label):
    """add breakpoint on each SOURCE line containing LABEL"""
    with open(source) as file:
        # gdb line numbers are 1-based
        for l, line in enumerate(file, 1):
            if label in line:
                gdb.Breakpoint(source=source, line=l)

main_file = gdb.lookup_global_symbol("main").symtab.fullname()
break_on_labels(main_file, "BREAK-HERE")
Example:
int main(void)
{
    int a = 15;
    a = a + 23; // BREAK-HERE
    return a;
}
You could insert a #warning at each line where you want a breakpoint, then have a script to parse the file and line numbers from the compiler messages and write a .gdbinit file placing breakpoints at those locations.

C: Segmentation fault and maybe GDB is lying to me

Here is a C function that segfaults:
void compileShaders(OGL_STATE_T *state) {
    // First testing to see if I can access object properly. Correctly outputs:
    // nsHandle: 6
    state->nsHandle = 6;
    printf("nsHandle: %d\n", state->nsHandle);
    // Next testing if glCreateProgram() returns proper value. Correctly outputs:
    // glCreateProgram: 1
    printf("glCreateProgram: %d\n", glCreateProgram());
    // Then the program segfaults on the following line according to gdb
    state->nsHandle = glCreateProgram();
}
For the record state->nsHandle is of type GLuint and glCreateProgram() returns a GLuint so that shouldn't be my problem.
gdb says that my program segfaults on line 303 which is actually the comment line before that line. I don't know if that actually matters.
Is gdb lying to me? How do I debug this?
EDIT:
Turned off optimizations (-O3) and now it's working. If somebody could explain why that would be great though.
EDIT 2:
For the purpose of the comments, here's a watered down version of the important components:
typedef struct {
    GLuint nsHandle;
} OGL_STATE_T;
int main (int argc, char *argv[]) {
    OGL_STATE_T _state, *state=&_state;
    compileShaders(state);
}
EDIT 3:
Here's a test I did:
int main(int argc, char *argv[]) {
    OGL_STATE_T _state, *state=&_state;
    // Assign value and try to print it in other function
    state->nsHandle = 5;
    compileShaders(state);
}
void compileShaders(OGL_STATE_T *state) {
    // Test to see if the first call to state is getting optimized out
    // Correctly outputs:
    // nsHandle (At entry): 5
    printf("nsHandle (At entry): %d\n", state->nsHandle);
}
Not sure if that helps anything or if the compiler would actually optimize the value from the main function.
EDIT 4:
Printed out pointer address in main and compileShaders and everything matches. So I'm gonna assume it's segfaulting somewhere else and gdb is lying to me about which line is actually causing it.
This is going to be guesswork based on what you have, but with optimization on this line:
state->nsHandle = 6;
printf("nsHandle: %d\n", state->nsHandle);
is probably optimized to just
printf("nsHandle: 6\n");
So the first access to state is where the segfault is. With optimization on GDB can report odd line numbers for where the issue is because the running code may no longer map cleanly to source code lines as you can see from the example above.
As mentioned in the comments, state is almost certainly not initialized. Some other difference in the optimized code is causing it to point to an invalid memory area, whereas in the non-optimized code it points somewhere valid.
This might happen if you're doing something with pointers directly that prevents the optimizer from 'seeing' that a given variable is used.
A sanity check would be useful to check that state != 0 but it'll not help if it's non-zero but invalid.
You'd need to post the calling code for anyone to tell you more. However, you asked how to debug it -- I would print (or use GDB to view) the value of state when that function is entered, I imagine it will be vastly different in optimized and non-optimized versions. Then track back to the function call to work out why that's the case.
EDIT
You posted the calling code -- that should be fine. Are you getting warnings when compiling (turn all the warnings on with -Wall)? In any case, my advice about printing the value of state in different scenarios still stands.
(removed comment about adding & since you edited the question again)
When you optimize your program, there is no longer a 1:1 mapping between source lines and emitted code.
Typically, the compiler will reorder the code to be more efficient for your CPU, inline function calls, etc.
This code is wrong:
*state=_state
It should be:
*state=&_state
Well, you edited your post, so ignore the above fix.
Check for NULL before dereferencing or reading through the pointer. If the pointer you pass in, or a pointer stored in the structure, is NULL and you dereference it without checking, you will get a segfault.
FYI: GDB Can't Lie !
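The NULL check suggested above might look like the sketch below. OGL_STATE_T is reduced to the single field from the question, with unsigned int standing in for GLuint so that the snippet compiles without the OpenGL headers, and set_handle is a hypothetical helper, not something from the original code. As an earlier answer points out, this only catches a null pointer, not a non-null but invalid one.

```c
#include <stdio.h>

/* Stand-in for the question's struct; unsigned int replaces GLuint here. */
typedef struct {
    unsigned int nsHandle;
} OGL_STATE_T;

/* Hypothetical helper: stores handle in state.
   Returns 0 on success, -1 if state is NULL. */
int set_handle(OGL_STATE_T *state, unsigned int handle)
{
    if (state == NULL) {
        fprintf(stderr, "set_handle: state is NULL\n");
        return -1;
    }
    state->nsHandle = handle;
    return 0;
}
```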
I ended up starting a new thread with more relevant information and somebody found the answer. New thread is here:
GCC: Segmentation fault and debugging program that only crashes when optimized

Search and replace a string as shown below

I am reading a file, say x.c, and I have to find the string "shared". Once that string has been found, the following has to be done.
Example:
shared(x,n)
Output has to be
*var = &x;
*var1 = &n;
Pointers can be of any name. Output has to be written to a different file. How to do this?
I'm developing a source-to-source compiler for concurrent platforms using lex and yacc. The solution can be a routine written in C or, if possible, one using lex and yacc. Can anyone please help?
Thanks.
If, as you state, the arguments can only be variables and not any kind of other expressions, then there are a couple of simple solutions.
One is to use regular expressions, and do a simple search/replace on the whole file using a pretty simple regular expression.
Another is to simply load the entire source file into memory, search using strstr for "shared(", and use e.g. strtok to get the arguments. Copy everything else verbatim to the destination.
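A rough sketch of the strstr/strtok approach just described, working on a buffer already loaded in memory. The function name rewrite_shared, the helper append, the MAX_ARGS_LEN limit and the var, var1, var2, ... naming scheme are assumptions made for illustration. It assumes the arguments are plain variable names, and it would also match "shared(" inside a longer identifier such as unshared(, so treat it as a starting point rather than a finished tool.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ARGS_LEN 256

/* Append n bytes of s to out, keeping out NUL-terminated.
   Returns 0 on success, -1 if out would overflow. */
static int append(char *out, size_t outsz, size_t *used,
                  const char *s, size_t n)
{
    if (*used + n + 1 > outsz)
        return -1;
    memcpy(out + *used, s, n);
    *used += n;
    out[*used] = '\0';
    return 0;
}

/* Copy in to out, replacing each "shared(a,b,...)" with
   "*var = &a;\n*var1 = &b;..." and copying everything else verbatim.
   Returns 0 on success, -1 on overflow or an unterminated call. */
int rewrite_shared(const char *in, char *out, size_t outsz)
{
    static const char key[] = "shared(";
    size_t used = 0;
    const char *p = in;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (;;) {
        const char *hit = strstr(p, key);
        if (hit == NULL)               /* no more calls: copy the tail */
            return append(out, outsz, &used, p, strlen(p));

        const char *args = hit + strlen(key);
        const char *rparen = strchr(args, ')');
        char buf[MAX_ARGS_LEN];
        size_t len;
        if (rparen == NULL || (len = (size_t)(rparen - args)) >= sizeof buf)
            return -1;

        /* text before the call is copied unchanged */
        if (append(out, outsz, &used, p, (size_t)(hit - p)) != 0)
            return -1;

        /* strtok modifies its input, so tokenize a private copy */
        memcpy(buf, args, len);
        buf[len] = '\0';

        int index = 0;
        for (char *tok = strtok(buf, ", \t"); tok != NULL;
             tok = strtok(NULL, ", \t")) {
            char stmt[MAX_ARGS_LEN + 32];
            int n;
            if (index == 0)
                n = snprintf(stmt, sizeof stmt, "*var = &%s;", tok);
            else
                n = snprintf(stmt, sizeof stmt, "\n*var%d = &%s;", index, tok);
            if (n < 0 || (size_t)n >= sizeof stmt ||
                append(out, outsz, &used, stmt, (size_t)n) != 0)
                return -1;
            index++;
        }
        p = rparen + 1;                /* continue after the ')' */
    }
}
```

A caller would read x.c into a buffer, pass it to rewrite_shared together with an output buffer, and write the result to the destination file with fputs.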
Take advantage of the C preprocessor.
Put this at the top of the file
#define shared(x,n) { *var = &(x); *var1 = &(n); }
and run it through cpp. This will also pull in external includes and expand all other macros, but you can work around that: remove all #something lines from the code, convert it using the injected preprocessor rule, and then re-add them.
By the way, why not a simple macro set in a header file for the developer to include?
A doubt: where do var and var1 come from?
EDIT: corrected as shown by johnchen902
When it comes to preprocessor, I'll do this:
#define shared(x,n) (*var=&(x),*var1=&(n))
Why do I think it's better than esseks's answer?
Suppose this situation:
if( someBool )
    shared(x,n);
else { /* something else */ }
With esseks's answer it becomes:
if( someBool )
    { *var = &(x); *var1 = &(n); }; // compile error
else { /* something else */ }
And with my answer it becomes:
if( someBool )
    (*var=&(x),*var1=&(n)); // good!
else { /* something else */ }

Counter variable passed to `msgpack_pack_int()` macro doesn't increment

I have a very odd issue trying to run this quite simple C program which is using zmq and msgpack.
There is no problem with server.c; however, in clinet.c:39 there is this msgpack_pack_int (&mpkg, i); and the value of i seems to be picked up as 0 and doesn't change on each iteration. I have tried a bunch of different things (e.g. making a pointer to i and using that, and splitting it into a function) and nothing seems to help. I can see that msgpack_pack_int() is a macro, but why would it introduce such behaviour, and what can I do to overcome it? Is there a flag that could change the behaviour of this kind of macro (as I see it expands to an inline function)? I have tried -Werror -Wall with both gcc and clang, and no warnings come up either ;(
I tried debugging it and i increments as expected.
I even tried this, and it would do the same thing anyway:
void pack (msgpack_packer *p, msgpack_sbuffer *b) {
    static volatile int i = 0;
    printf("\ni=%d\n", i);
    msgpack_packer_init (p, b, msgpack_sbuffer_write);
    msgpack_pack_array (p, 2);
    msgpack_pack_int (p, i++);
    msgpack_pack_str (p, "/i/am/a/clinet/");
}
I have even tried something which was supposed to be different, but no luck here either -
int count (void) {
    static int i = 0;
    i += 1; return i;
}
Can anyone see why this would happen?
Update 1: Also, I have recompiled the msgpack library itself without optimization flags, and that didn't change the behaviour either.
Update 2: Installed msgpack from the git repo and I still have the same issue.
It turns out that on each iteration I was doing this:
msgpack_packer_init (&mpkg, &sbuf, msgpack_sbuffer_write);
that needs to be done only once, and this should be there instead:
msgpack_sbuffer_init (&sbuf);
or:
msgpack_sbuffer_clear (&sbuf);
It was rather logical to put the msgpack_* functions together, and indeed that was taken from the simple example. The problem is really with the documentation; one extra comment would help!
Update: working version & version without memcpy.

I have a function with a lot of return points. Is there any way that I can make gdb show me which one is returning?

I have a function with an absurd number of return points, and I don't want to caveman each one, and I don't want to next through the function. Is there any way I can do something like finish, except have it stop on the return statement?
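You can try reverse debugging to find out where the function actually returns. Finish executing the current frame, then do a reverse-step, and you should stop at the return statement that was just executed. Note that reverse execution only works if gdb has been recording the run (e.g. with the record command) or the target otherwise supports it.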
(gdb) fin
(gdb) reverse-step
There is already a similar question.
I think you're stuck setting breakpoints. I'd write a script to generate the list of breakpoint commands to run and paste them into gdb.
Sample script (in Python):
# filename, first and last must be set beforehand (see below)
with open(filename) as f:
    lines = f.readlines()
# enumerate from 1 because gdb line numbers are 1-based
break_lines = [line_num for line_num, line in enumerate(lines, 1)
               if 'return' in line and first < line_num <= last]
break_cmds = ['b %s:%d' % (filename, line_num) for line_num in break_lines]
print('\n'.join(break_cmds))
Set filename to the name of the file with the absurd function, first to the first line of the function (this is a quick script, not a C parser) and last to the number of the last line of the function. The output ought to be suitable for pasting into gdb.
Kind of a stretch, but the catch command can stop on many kinds of things (like forking, exiting, receiving a signal). You may be able to use catch catch (which breaks for exceptions) to do what you want in C++ if you wrap the function in try/finally. For that matter, if you break on a line inside the finally you can probably single-step through the return after that (although how much that will tell you about where it came from is highly dependent on optimization: common return cases are often folded by gcc).
How about taking this opportunity to break up what seems to be clearly a too-large function?
This question's come up before on SO. My answer from there:
Obviously you ought to refactor this function, but in C++ you can use this simple expedient to deal with this in five minutes:
class ReturnMarker
{
public:
    ReturnMarker() {}
    ~ReturnMarker()
    {
        dummy += 1; //<-- put your breakpoint here
    }
    static int dummy;
};
int ReturnMarker::dummy = 0;
and then instantiate a single ReturnMarker at the top of your function. When it returns, that instance will go out of scope, and you'll hit the destructor.
void LongFunction()
{
    ReturnMarker foo;
    // ...
}
