I was looking through the manuals on strcpy() and strcat(). Seems there's no way to evaluate the "success" of the function call. (ie return value will never be NULL), is that correct?
It's just assumed that if you follow the rules for the input of these functions that the output will be valid? Just wanted to make sure I wasn’t missing anything here…
These functions cannot fail in any well-defined way. They will either succeed, or things have gone horribly wrong (e.g. missing 0 char or too small output buffer), and anything could happen.
Because I don't think functions like that really have any way of knowing what "success" is. For instance, strcpy is really just a memcpy that stops at the terminating null character. So as far as C is concerned, it is just taking data from one memory location and copying it to another. It doesn't know how the data is supposed to look, or whether it is formatted in the way you expect. I guess the only real way you'd know whether it succeeded is whether you end up getting a segfault or not.
These functions are guaranteed to work, provided that you're not invoking undefined behaviour. In particular, the memory that you're writing to needs to be allocated. There is really no way for them to fail except crashing your program.
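Generally, because it can be hard to tell how many bytes will be written, use of these functions is discouraged. Use strncpy and strncat if you can, but keep in mind that strncpy does not null-terminate the destination when the source is at least as long as the size limit.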
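To illustrate the answers above: strcpy() returns its destination pointer, so the return value carries no status, and any checking has to happen before the call. Below is a minimal sketch of such a check. The helper name checked_copy is made up for this example and is not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
 * terminating null character. Returns 0 on success, -1 if dst is too small. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* bytes strcpy() would write */

    if (needed > dst_size)
        return -1;                     /* would overflow: refuse, dst untouched */

    /* strcpy() returns dst, never NULL, so there is nothing to test here. */
    strcpy(dst, src);
    return 0;
}
```

The caller still has to pass a valid, null-terminated src and the true size of dst; the helper cannot detect violations of those preconditions any more than strcpy() can.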
I am working with some legacy code which uses something like this:
void store_data(FILE *file);
However, I don't want to store data on the disk, I want to store it in memory (char *buf). I could edit all of the code, but the code jumps all over the place and fwrite gets called on the file all over the place. So is there an easier way, for example mapping a FILE* object to an (auto-growing) buffer? I do not know the total size of the data beforehand.
The solution has to be portable.
There is no way to do this using only the facilities provided in the C standard. The closest you can come is
FILE *scratch = tmpfile();
...
store_data(scratch);
...
/* after you're completely done calling the legacy code */
rewind(scratch);
buf = read_into_memory_buffer(scratch);
fclose(scratch);
This does hit the disk, at least potentially, but I'd say it's your best bet if you need wide portability and can't modify the "legacy code".
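The read_into_memory_buffer() call above is a placeholder, not a library function. One possible way to write it, as a sketch using only standard C, is shown below. The extra out_len parameter is an addition for this example, so that binary data containing null bytes can be handled.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical implementation of the read_into_memory_buffer() placeholder.
 * Reads everything from the current position of fp into a malloc'd,
 * null-terminated buffer that grows as needed. Stores the byte count in
 * *out_len if out_len is non-NULL. Returns NULL on allocation or read
 * failure; otherwise the caller must free() the result. */
char *read_into_memory_buffer(FILE *fp, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);

    if (!buf)
        return NULL;

    for (;;) {
        size_t room = cap - len - 1;           /* keep one byte for '\0' */
        size_t n = fread(buf + len, 1, room, fp);

        len += n;
        if (n < room)                          /* short read: EOF or error */
            break;

        char *tmp = realloc(buf, cap * 2);     /* buffer full: double it */
        if (!tmp) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```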
In POSIX.1-2008, there is open_memstream, which does exactly what you want; however, this revision of POSIX is not yet widely adopted. GNU libc (used on Linux and a few others) has it, but it's not available on OSX or the *BSDs as far as I know, and certainly not on Windows.
You might want to look at fmemopen and open_memstream. They do something in the direction of what you want.
From the man page:
The open_memstream() function opens a stream for writing to a buffer.
The buffer is dynamically allocated (as with malloc(3)), and automatically
grows as required. After closing the stream, the caller should
free(3) this buffer.
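As a rough sketch of how this could be applied to the question, assuming a POSIX.1-2008 system that provides open_memstream(): the store_data() body below is only a stand-in for the legacy code, and capture_store_data() is a hypothetical wrapper name.

```c
#define _POSIX_C_SOURCE 200809L   /* expose open_memstream() */
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the legacy function; only its FILE * interface matters. */
static void store_data(FILE *file)
{
    fputs("hello, ", file);
    fputs("memory stream", file);
}

/* Hypothetical wrapper: runs store_data() against an in-memory stream.
 * Returns a malloc'd, null-terminated buffer (caller frees) or NULL. */
char *capture_store_data(size_t *out_len)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *mem = open_memstream(&buf, &len);

    if (!mem)
        return NULL;

    store_data(mem);

    /* buf and len are only guaranteed up to date after fflush()/fclose(). */
    if (fclose(mem) != 0) {
        free(buf);
        return NULL;
    }

    if (out_len)
        *out_len = len;
    return buf;
}
```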
I don't know if it's a good idea, but it's an idea.
You can "redefine" fwrite using a macro.
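#define fwrite(ptr, size, nmemb, stream) your_memory_write_function(ptr, size, nmemb, stream)
Then implement your_memory_write_function to write data to your auto-growing buffer instead of a file. Note that fwrite takes four arguments, so the macro has to take four as well.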
You will need to call store_data with a pointer to something else though (not a pointer to FILE). But that's possible with C so you will have no issues there.
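A minimal sketch of what such a replacement might look like, keeping fwrite()'s four-parameter shape and return convention. The struct membuf type is hypothetical and introduced only for this example; it plays the role of the "something else" passed instead of a FILE pointer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical auto-growing buffer that stands in for FILE. */
struct membuf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Same parameter order and return convention as fwrite(): returns the
 * number of items written, which is less than nmemb on failure. */
size_t your_memory_write_function(const void *ptr, size_t size, size_t nmemb,
                                  struct membuf *mb)
{
    if (size == 0 || nmemb == 0)
        return 0;
    if (nmemb > SIZE_MAX / size)               /* size * nmemb would overflow */
        return 0;

    size_t bytes = size * nmemb;
    if (bytes > SIZE_MAX - mb->len)            /* len + bytes would overflow */
        return 0;

    size_t need = mb->len + bytes;
    if (need > mb->cap) {
        size_t new_cap = mb->cap ? mb->cap : 64;

        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {      /* cannot double any further */
                new_cap = need;
                break;
            }
            new_cap *= 2;
        }

        char *tmp = realloc(mb->data, new_cap);
        if (!tmp)
            return 0;
        mb->data = tmp;
        mb->cap = new_cap;
    }

    memcpy(mb->data + mb->len, ptr, bytes);
    mb->len += bytes;
    return nmemb;
}
```

If the macro approach is used, the #define has to appear after #include <stdio.h>, otherwise the preprocessor would also rewrite the declaration of fwrite inside the header.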
On what platform are you running? Can't you use tmpfs? If you open a file on tmpfs, is it not, from the point of view of the kernel, the same as a regular file, but written to memory?
You may want to look into fmemopen(). If that's not available to you, then you could possibly use a named shared memory segment along with fdopen() to convert the file descriptor returned by shm_open() to a FILE*.
I want to know which C standard library functions use malloc and free under the hood. It looked to me as if printf would be using malloc, but when I tested a program with valgrind, I noticed that printf calls didn't allocate any memory using malloc. How come? How does it manage the memory then?
Usually, the only routines in the C99 standard that might use malloc() are the standard I/O functions (in <stdio.h>), where the file structure and the buffer used by it are often allocated as if by malloc(). Some of the locale handling may use dynamic memory. All the other routines have no need for dynamic memory allocation in general.
Now, is any of that formally documented? No, I don't think it is. There is no blanket restriction 'the functions in the library shall not use malloc()'. (There are, however, restrictions on other functions - such as strtok() and srand() and rand(); they may not be used by the implementation, and the implementation may not use any of the other functions that may return a pointer to a static memory location.) However, one of the reasons why the extremely useful strdup() function is not in the standard C library is (reportedly) because it does memory allocation. It also isn't completely clear whether this was a factor in the routines such as asprintf() and vasprintf() in TR 24731-2 not making it into C1x, but it could have been a factor.
The standard doesn't place any requirements on the implementation, AFAIK.
I don't know exactly how printf is implemented, but off the top of my head, I can't think of a reason why it would need to dynamically allocate memory. You could always look at the source for your platform.
It depends on which libc you are using. The C spec places no restriction here; it is up to the implementation.
For instance, newlib's printf usually gets by with memory in its stack frame, but when it really needs to, it calls an internal function _malloc_r() directly.
I have not used valgrind, so I'm not sure if it can detect use of _malloc_r().
Neither the C nor the POSIX standard force implementors to make use of malloc(), so there's no general answer to your question.
However, every sane standard library implementation that uses malloc() in one of its functions will set errno to ENOMEM if malloc() fails. Hence, the documentation gives you a hint: a library function that is documented as possibly setting ENOMEM may be allocating memory. Case in point: on my system, mmap() may set errno to ENOMEM, so it may allocate memory (although for a system call like mmap() that allocation can happen in the kernel rather than through malloc()).
That said, using valgrind is a poor way to find out whether a particular function calls malloc() or not. Consider the following piece of code:
void foo(int x)
{
if (!x) malloc(1);
}
If you call this function with an argument other than 0, valgrind won't notice that it may actually call malloc(). Think of valgrind as a virtual machine (since that's what it is): it doesn't look at your code, it only sees what the machine would actually execute.
printf doesn't need to form the entire output string in one shot, it can send it to output piece by piece, and when it encounters a format specifier, it can output that piece of data as it is formed, and continue on with the rest of the string.
At most it would need a locally defined array of characters (on the stack) large enough to hold the largest integer or floating point number it can handle, which isn't very large.
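To make that concrete, here is a toy sketch of the idea. It is not how any particular libc implements printf. The hypothetical mini_printf() below supports only %d, %s and %%, emits output character by character, and needs nothing beyond a small array on the stack for integer digits.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical, heavily simplified printf: handles only %d, %s and %%.
 * Returns the number of characters written. No dynamic allocation. */
int mini_printf(FILE *out, const char *fmt, ...)
{
    va_list ap;
    int count = 0;

    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {                 /* literal text: emit directly */
            putc(*fmt, out);
            count++;
            continue;
        }
        fmt++;
        if (*fmt == '\0')                  /* lone '%' at end of format */
            break;

        switch (*fmt) {
        case 'd': {
            char digits[24];               /* ample for a 64-bit int */
            int n = 0;
            int v = va_arg(ap, int);
            /* Negate in unsigned arithmetic so INT_MIN is handled too. */
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;

            do {                           /* digits come out in reverse */
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u);
            if (v < 0) {
                putc('-', out);
                count++;
            }
            while (n > 0) {
                putc(digits[--n], out);
                count++;
            }
            break;
        }
        case 's': {
            const char *s = va_arg(ap, const char *);
            for (; *s; s++) {              /* stream the string through */
                putc(*s, out);
                count++;
            }
            break;
        }
        case '%':
            putc('%', out);
            count++;
            break;
        default:                           /* unknown: echo it unchanged */
            putc('%', out);
            putc(*fmt, out);
            count += 2;
            break;
        }
    }
    va_end(ap);
    return count;
}
```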
I am using open_memstream in a library of mine, but I would like to port this library to MSVC. It seems there is no equivalent function available, but is there something similar enough?
What open_memstream does is take a char** destination and a size, and return a FILE* which you may write to. The data is stored in a dynamically allocated buffer (accessible through the char** argument). After closing the FILE, the char** points to the data that was written to the stream. This makes for an easy way to construct large and complex string streams.
While it is possible to both read and seek from the memstream I only write to it.
Is there a way to open a similar memory FILE stream in MSVC? Also, this is pure C, no C++.
A similar function on Windows would be CreateStreamOnHGlobal(). That however works with the IStream COM interface, it isn't a drop-in replacement for FILE. You might want to take a peek at the Cygwin source code to see what they did.
https://github.com/Snaipe/fmem is a wrapper for different platform/version specific equivalents of open_memstream
It tries in sequence the following implementations:
open_memstream.
fopencookie, with growing dynamic buffer.
funopen, with growing dynamic buffer.
WinAPI temporary memory-backed file.
When no other means is available, fmem falls back to tmpfile().
Is there any method to calculate size of a function? I have a pointer to a function and I have to copy entire function using memcpy. I have to malloc some space and know 3rd parameter of memcpy - size. I know that sizeof(function) doesn't work. Do you have any suggestions?
Functions are not first class objects in C. Which means they can't be passed to another function, they can't be returned from a function, and they can't be copied into another part of memory.
A function pointer though can satisfy all of this, and is a first class object. A function pointer is just a memory address and it usually has the same size as any other pointer on your machine.
It doesn't directly answer your question, but you should not implement call-backs from kernel code to user-space.
Injecting code into kernel-space is not a great work-around either.
It's better to treat the user/kernel barrier like an inter-process barrier. Pass data, not code, back and forth using a well-defined protocol through a char device. If you really need to pass code, just wrap it up in a kernel module. You can then dynamically load/unload it, just like a .so-based plugin system.
On a side note, at first I misread your question as wanting to pass memcpy() itself to the kernel. Keep in mind that it is a very special function. It is defined in the C standard, quite simple, and of quite broad scope, so it is a perfect target to be provided as a built-in by the compiler.
Just like strlen(), strcmp() and others in GCC.
That said, the fact that it is a built-in does not impede your ability to take a pointer to it.
Even if there was a way to get the sizeof() a function, it may still fail when you try to call a version that has been copied to another area in memory. What if the compiler has emitted local or long jumps to specific memory locations? You can't just move a function in memory and expect it to run. The OS can do that, but it has all the information it takes to do it.
I was going to ask how operating systems do this but, now that I think of it, when the OS moves stuff around it usually moves a whole page and handles memory such that addresses translate to a page/offset. I'm not sure even the OS ever moves a single function around in memory.
Even in the case of the OS moving a function around in memory, the function itself must be declared or otherwise compiled/assembled to permit such action, usually through a pragma that indicates the code is relocatable. All the memory references need to be relative to its own stack frame (aka local variables) or include some sort of segment+offset structure such that the CPU, either directly or at the behest of the OS, can pick the appropriate segment value. If there was a linker involved in creating the app, the app may have to be re-linked to account for the new function address.
There are operating systems which can give each application its own 32-bit address space but it applies to the entire process and any child threads, not to an individual function.
As mentioned elsewhere, you really need a language where functions are first class objects, otherwise you're out of luck.
You want to copy a function? I do not think that this is possible in C generally.
Assume, you have a Harvard-Architecture microcontroller, where code (in other words "functions") is located in ROM. In this case you cannot do that at all.
Also, I know several compilers and linkers which optimize at file level (not only at function level). This results in opcode where parts of different C functions are mixed into each other.
The only way which I consider as possible may be:
Generate opcode of your function (e.g. by compiling/assembling it on its own).
Copy that opcode into a C array.
Use a proper function pointer, pointing to that array, to call this function.
Now you can perform all operations, common to typical "data", on that array.
But apart from this: Did you consider a redesign of your software, so that you do not need to copy a function's content?
I don't quite understand what you are trying to accomplish, but assuming you compile with -fPIC and don't have your function do anything fancy, no other function calls, not accessing data from outside function, you might even get away with doing it once. I'd say the safest possibility is to limit the maximum size of supported function to, say, 1 kilobyte and just transfer that, and disregard the trailing junk.
If you really needed to know the exact size of a function, figure out your compiler's epilogue and prologue. This should look something like this on x86:
:your_func_epilogue
mov esp, ebp
pop ebp
ret
:end_of_func
;expect a varying length run of NOPs here
:next_func_prologue
push ebp
mov ebp, esp
Disassemble your compiler's output to check, and take the corresponding assembled sequences to search for. The epilogue alone might be enough, but all of this can bomb if the searched sequence pops up too early, e.g. in data embedded in the function. Searching for the next prologue might also get you into trouble, I think.
Now please ignore everything that I wrote, since you apparently are trying to approach the problem in the wrong and inherently unsafe way. Paint us a larger picture please: WHY are you trying to do that? Then we can see whether we can figure out an entirely different approach.
A similar discussion was done here:
http://www.motherboardpoint.com/getting-code-size-function-c-t95049.html
They propose creating a dummy function after your function-to-be-copied, and then getting the memory pointers to both. But you need to switch off compiler optimizations for it to work.
If you have GCC >= 4.4, you could try switching off the optimizations for your function in particular using #pragma:
http://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html#Function-Specific-Option-Pragmas
Another proposed solution was not to copy the function at all, but define the function in the place where you would want to copy it to.
Good luck!
If your linker doesn't do global optimizations, then just calculate the difference between the function pointer and the address of the next function.
Note that copying the function will produce something which can't be invoked if your code isn't compiled relocatable (i.e. all addresses in the code, for example branch targets, must be relative; globals work, though, since they don't move).
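A sketch of that pointer-difference trick follows, with loud caveats: nothing in the C standard guarantees that functions are laid out in source order, that the marker function directly follows the target, or that a function pointer can be converted to an integer at all (many compilers allow it as an extension). The names add_one, add_one_end_marker and estimate_add_one_size are invented for the example.

```c
#include <stdint.h>

/* The function whose size we would like to estimate. */
int add_one(int x)
{
    return x + 1;
}

/* Marker defined immediately afterwards, in the hope (not a guarantee)
 * that the toolchain places it directly after add_one in memory. */
int add_one_end_marker(int x)
{
    return x - 1;
}

/* Returns the estimated size of add_one in bytes, including any padding
 * before the marker, or -1 if the marker did not end up after add_one.
 * Converting a function pointer to uintptr_t is implementation-specific. */
long estimate_add_one_size(void)
{
    uintptr_t start = (uintptr_t)add_one;
    uintptr_t end   = (uintptr_t)add_one_end_marker;

    return end > start ? (long)(end - start) : -1;
}
```

Whether the result means anything has to be verified against a disassembly or map file for the specific compiler, options and target.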
It sounds like you want to have a callback from your kernel driver to userspace, so that it can inform userspace when some asynchronous job has finished.
That might sound sensible, because it's the way a regular userspace library would probably do things - but for the kernel/userspace interface, it's quite wrong. Even if you manage to get your function code copied into the kernel, and even if you make it suitably position-independent, it's still wrong, because the kernel and userspace code execute in fundamentally different contexts. For just one example of the differences that might cause problems, if a page fault happens in kernel context due to a swapped-out page, that'll cause a kernel oops rather than swapping the page in.
The correct approach is for the kernel to make some file descriptor readable when the asynchronous job has finished (in your case, this file descriptor will almost certainly be the character device your driver provides). The userspace process can then wait for this event with select / poll, or with read - it can set the file descriptor non-blocking if it wants, and basically just use all the standard UNIX tools for dealing with this case. This, after all, is how the asynchronous nature of network sockets (and pretty much every other asynchronous case) is handled.
If you need to provide additional information about the event that occurred, that can be made available to the userspace process when it calls read on the readable file descriptor.
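On the userspace side, this could look roughly like the following sketch, assuming a POSIX system. The helper name wait_for_event is hypothetical; fd would be the descriptor obtained by opening the driver's character device.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms milliseconds for fd to become
 * readable, then reads up to len bytes of event data into buf.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * or -1 on error. */
ssize_t wait_for_event(int fd, int timeout_ms, void *buf, size_t len)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;                 /* poll() itself failed */
    if (ready == 0)
        return 0;                  /* nothing happened within the timeout */
    if (!(pfd.revents & POLLIN))
        return -1;                 /* POLLERR, POLLHUP or POLLNVAL only */

    return read(fd, buf, len);     /* fetch the event details */
}
```

On the kernel side, the driver would implement the poll and read file operations for its device; that part is not shown here.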
A function isn't just an object you can copy. What about cross-references, symbols and so on? Of course you can take something like the standard Linux "binutils" package and torture your binaries, but is that what you want?
By the way, if you are simply trying to replace the memcpy() implementation, look into the LD_PRELOAD mechanism.
I can think of a way to accomplish what you want, but I won't tell you because it's a horrific abuse of the language.
A cleaner method than disabling optimizations and relying on the compiler to maintain the order of functions is to arrange for that function (or a group of functions that need copying) to be in its own section. This is compiler and linker dependent, and you'll also need to use relative addressing if you call between the functions that are copied. For those asking why you would do this, it's a common requirement in embedded systems that need to update the running code.
My suggestion is: don't.
Injecting code into kernel space is such an enormous security hole that most modern OSes forbid self-modifying code altogether.
As near as I can tell, the original poster wants to do something that is implementation-specific, and so not portable; this is going off what the C++ standard says on the subject of casting pointers-to-functions, rather than the C standard, but that should be good enough here.
In some environments, with some compilers, it might be possible to do what the poster seems to want to do (that is, copy a block of memory that is pointed to by the pointer-to-function to some other location, perhaps allocated with malloc, cast that block to a pointer-to-function, and call it directly). But it won't be portable, which may not be an issue. Finding the size required for that block of memory is itself dependent on the environment, and compiler, and may very well require some pretty arcane stuff (e.g., scanning the memory for a return opcode, or running the memory through a disassembler). Again, implementation-specific, and highly non-portable. And again, may not matter for the original poster.
The links to potential solutions all appear to make use of implementation-specific behaviour, and I'm not even sure that they do what they purport to do, but they may be suitable for the OP.
Having beaten this horse to death, I am curious to know why the OP wants to do this. It would be pretty fragile even if it works in the target environment (e.g., could break with changes to compiler options, compiler version, code refactoring, etc). I'm glad that I don't do work where this sort of magic is necessary (assuming that it is)...
I have done this on a Nintendo GBA, where I copied some low-level render functions from flash (slowish memory with 16-bit access) to the high-speed work RAM (32-bit access, at least twice as fast). This was done by taking the address of the function immediately after the function I wanted to copy: size = (int) (NextFuncPtr - SourceFuncPtr). This worked well, but obviously can't be guaranteed on all platforms (it does not work on Windows, for sure).
I think one solution can be as below.
For example, suppose you want to know the size of func() in program a.c, and you have placed indicators before and after the function.
Try writing a Perl script which compiles this file into object format (cc -c); make sure that the pre-processor statements are not removed. You need them later on to calculate the size from the object file.
Now search for your two indicators and find out the code size in between.