verify generated library for the encapsulation - c

I have two C projects prepared under Visual Studio 2015. First project is simply a static library project whereas the second one is a console application which uses the static library file generated by the first project.
I checked the static library file with DUMPBIN tool in Windows and saw that there are many variables and functions exposed to outside which is very bad for the encapsulation issues.
My question is how can I be sure that I don't expose the functions which are supposed to be private. Do I need to check every time with that tool? My question is also covering the variables. All my static global variables are also exposed to outside. How can I force them to be private?

I don't think that presence in dumpbin output can be considered an "exposing". All your static global variables require some space allocation and probably an initialization at runtime. So it's only natural for them to be present in dumpbin output. Also if you are compiling with link-time code generation then everything is actually "exposed".

Related

Creating Backwards Compatible Drop In Statically Linked DLL

I have a C-based DLL that I wrote years ago for a project and it exports a set of functions that define an API. Now I need to re-write this DLL's internals but keep the API exactly the same.
The user of the DLL used static linking and they do not want to or are unable to recompile their executable.
I've noticed that the RVAs of the exported functions are different. My understanding is that means the executable won't be able to find the functions unless it is re-linked with the updated lib file.
Is there a way in VS2017 to force an exported function to use a specific RVA? I checked the Microsoft LINK DEF file format and I didn't see an option in there.
Even if it is possible, is fixing the RVAs enough to ensure the old executable will be able to use the updated DLL or are there additional complications that make this a non-starter?
Thanks.
When you statically link an EXE module against a DLL, you do indeed link against the the DLL's import library (a .LIB) created alongside the DLL when the DLL was built. This is not the same thing as linking against a static library which is confusing because those are also .LIB files.
The first thing you should do is figure out if your EXE module has an import entry for said DLL using a tool like Dependency Walker, Dumpbin, pelook or your favorite PE analyzer tool. If there is no DLL import entry, you have have probably linked the EXE against a static library as described by #HAL9000 's answer. Short of reverse-engineering the EXE, your best bet would be to rebuild the module as suggested if possible.
Otherwise, if you do find an import for said DLL, then yes you can swap out a newly-built DLL provided you have the same export (function) names and/or ordinal values as the original. DLLs find function by export names and/or ordinal values, not RVAs which in this case are only an internal detail. Whether the DLL is implicitly loaded (from being statically-linked) during process (EXE) initialization (before the EXE's entry point is called) or explicitly loaded (via code using LoadLibrary, etc.) the whole point of being a DLL is that it is a module is designed to be dynamically replaced and Windows was designed around this concept. The internal RVAs both within the EXE (referencing the DLL) and the DLL itself do not need to match an old DLL's values; this bookkeeping is automatically handled by the Windows loader during a process also known as runtime linking.
In the event the EXE is linked against said DLL and ALSO specifies hard-coded addresses (RVAs) for the DLL's exported functions (a process known as static binding), Windows will still verify the addresses still internally reflect the correct values in the DLL that is actually loaded which may be a different, updated DLL. This is done via a timestamp check in the import section for the DLL. If there is a mismatch, the Windows loader tosses-out all of the static RVAs and updates them with the current values incurring a slight performance penalty, but the program will still load. FWIW the bind.exe tool to do this static binding no longer ships with the Visual C++ toolset as the performance gain in modern versions of Windows is minimal. This optimization used to be be common practice to speed up load times, especially in OS-supplied system DLLs, but shouldn't affect what you are trying to do one way or the other.
If the user has statically linked in your library, then it is not a DLL, and making a drop in replacement without relinking is not possible. At least not without some ugly hacks. The old library functions have been copied into the executable, so there are no way around editing the executable. If you can't recompile or relink, then it is probably easier to rewrite the executable from scratch.
Mucking around with adresses of functions in your new DLL, if possible, can't have any effect if the executable doesn't have any code to load a DLL at all.

linking, loading, and virtual memory

I know these questions have been asked before - but I still can't reconcile everything together into an overall picture.
static vs dynamic library
static libraries have their code copied and linked into the resulting executable
static libraries have only copy and link the required modules into the executable, not the entire library implementation
static libraries don't need to be compiled as PIC as they are apart of the resulting executable
dynamic libraries copy and link in stubs that describe how to load/link (?) the function implementation at runtime
dynamic libraries can be PIC or relocatable
why are there separate static and dynamic libraries? All of the above seems to be be the job of the static or dynamic linker. Why do I need 2 libraries that implement scanf?
(bonus #1) what does a shared library refer to? I've heard it being used as (1) the overall umbrella term, synonymous to library, (2) directly to a dynamic library, (3) using virtual memory to map the same physical memory of a library to multiple address spaces. Can you do this only with dynamic libraries? (4) having different versions of the same dynamic library in memory.
(bonus #2) are the standard libraries (libc, libc++, stdlibc++, ..) linked dynamically or statically by default? I never need to dlopen()..
static vs dynamic linking
how is this any different than static vs dynamic libraries? I don't understand why there isn't just 1 library, and we use either a static or dynamic linker (other than the PIC issue). Instead of talking about static vs dynamic libraries, should we instead be discussing the more general static s dynamic linking?
is symbol resolution still performed at compile-time for both?
static vs dynamic loading
Static loading means copying the full executable into MM before executing it
Dynamic loading means that only the executable header copied into MM before executing, additional functionality is loaded into MM when requested. How is this any different from paging?
If the executable is dynamically linked, why would it not be dynamically loaded?
both static loading and dynamic loading may or may not perform relocation
I know there are a lot of things I'm confused about here - and I'm not necessary looking for someone to address each issue. I'm hoping by listing out everything that is confusing me, that someone that understands this will see where a lapse in my understanding is at a broad level, and be able to paint a larger picture about how these things cooperate together..
why 2 types of lib loading
dynamic saves space (you dont have hundreds of copies of the same code in all binaries using foo.lib
dynamic allows foo.lib vendor can ship a new version of the library and existing code takes advantage of it
static makes dependency management easier - in theory a binary can be one file
What is 'shared library'
unix name for dynamic library. Windows calls it DLL
Are standard libraries static or dynamic
depends on platform. On some you can choose on others its chosen for you. For example on windwos there are compiler switchs to say if you want static or dynamic runtimes. Not dont confuse dynamic library usage with dlopen - see later
'why we talk about 2 different types of library'
Typically a static library is in a different format from a dynamic one. Typically a static library is input to the linker just like any other compile unit. A dynamic library is typically output by the linker. They are used differently even though they both deliver the same chunk of code to your app
Symbol resolution is finalized at load time for a DLL
Full dynamic loading. This is the realm of dlopen. This is where you want to call entry points in a library that might not have even existing when you compiled. Use cases:
plugins that conform to a well known interface but there can be many implementations (PAM and NSS are good examples). The app chooses to load one or more implementations from specified files at run time
an app needs to load a library and call an arbitrary function. Imagine how , for example , how a scripting language can load and call an arbitrary method
To use a .so on unix you dont need to use dlopen. You can have it loaded for you (Same on windows). To really dynamically load a shared lib / dll you need dlopen or LoadLibrary
Note that statically linked libraries load faster, since there is less disk searching for all the runtime library files. If the libraries are small, and very unusual, probably better to link statically. If there are serious version dependencies / functional differences like MFC, the DLLs need different names.

LoadLibrary() an EXE?

I have an executable (that I created using Visual C++ 10), and I need to use its capabilities from another program I wrote (same environment). Due to complex deployment requirements which I won't go into, building a DLL from the required functionality and loading it in both programs is not something I can do.
So I thought that I can __declspec(dllexport) some functions in the EXE, and then LoadLibrary() will let me GetProcAddress() them.
Obviously this can't be done, though when I started looking at it - it looked feasible.
Specifically, when you __declspec(dllexport) functions in an EXE project, Visual C++ also generates a lib file for dynamic linking - so you don't even need to use LoadLibrary() - just link against the resulting lib and call the functions.
Unfortunately, the main problem is that when you declare the resulting file as an EXE, Visual C++ adds the "CRTmain" entry point into the resulting file, instead of the "CRTDLLmain" that a DLL gets. When Windows (automatically) LoadLibrary() the EXE from your main program, it doesn't call the the "CRTDLLmain" entry point (because it doesn't exist), the C runtime for the module doesn't get initialized, and as a result all interesting work (such as memory allocation) fails with interesting(*) runtime exceptions.
So as follows, my question is: is there a way to cause Visual C++ to build into the resulting file both the "CRTmain" entry point and the "CRTDLLmain" entry point?
(*) "Interesting" as in an old Chinese curse.
Yes it is possible.
http://www.codeproject.com/Articles/1045674/Load-EXE-as-DLL-Mission-Possible
The idea is a) to patch the IAT and b) to call the CRT before calling your exports.
Simply no!
The Problem is that the CRT and in the EXE you want load, uses some globle variables. You main EXE does the same. So how should Memory allocation work?
If you want to use such a structure you must use a DLL to be Aware of meulti threading, CRT initialization ans all this other stuff. You need this!
But what about COM Automation? Wouldn't tis be an easy solution to use your code in one EXE from another?
The short answer, is "no". After looking far and wide, there is no way to get VC++ to do what I want, and quite likely not any other compiler.
The main issue is that the main() entry point most people know and love is not the real entry point in C++ executables: the compiler needs to do a lot of initialization work to get the "C++ Run Time library" to a usable state, as well as initialize globals, statics and the likes. This initialization uses different code in shared libraries than in executables and there is no way to one to behave like another.
One thing that possibly can be done, is to build the shared functionality into a DLL, and for the primary executable to embed the DLL as a resource, and load it from a memory mapped region of the executable file (there are several code samples how to do this with VC++ on stackoverflow and elsewhere on the web). Now another program can do the same thing by loading the DLL from the bundling executable.

efficiency of utilizing dll in c source code

I have a dll which I'd like to use in a c program,
Do you think is efficient to have a dll (lots of common functions) and then create a program that will eventually use them, or have all the source code?
To include the dll, What syntax must be followed?
Do you think is efficient to have a dll (lots of common functions) and then create a program that will eventually use them,or have all the source code.
For memory and disk space, it is more efficient to use a shared library (a DLL is the Windows implementation of shared libraries), assuming that at least two programs use this component. If only one program will ever use this component, then there is no memory or disk space savings to be had.
Shared libraries can be slightly slower than statically linking the code; however, this is likely to be incredibly minor, and shared libraries carry a number of benefits that make it more than worthwhile (such as the ability to load and handle symbols dynamically, which allows for plugin-like architectures). That said, there are also some disadvantages (if you are not careful about where your DLLs live, how they are versioned, and who can update them, then you can get into DLL hell).
To include the dll, What syntax must be followed?
This depends. There are two ways that shared libraries can be used. In the first way, you tell the linker to reference the shared library, and the shared library will automatically be loaded on program startup, and you would basically reference the code like normal (include the various headers and just use the name of the symbol when you want to reference it). The second way is to dynamically load the shared library (on Windows this is done via LoadLibrary while it is done on UNIX with dlopen). This second way makes it possible to change the behavior of the program based on the presence or absence of symbols in the shared library and to inspect the available set of symbols. For the second way, you would use GetProcAddress (Windows) or dlsym (UNIX) to obtain a pointer to a function defined in the library, and you would pass around function pointers to reference the functions that were loaded.
You can put your functions into either a static library ( a .lib) which is merged into your application at compile time and is basically the same as putting the .c files in the project.
Or you can use a dll where the functions are included at run time. the advantage of a dll is that two programs which use the same functions can use the same dll (saving disk space) and you can upgrade the dll without changing the program - neither of these probably matters for you.
The dll is automatically loaded when your program runs there is nothing special you need to do to include it ( you can load a dll specifically in your code - there are sometimes special reasons to do this)
Edit - if you need to create a stub lib for an existing dll see http://support.microsoft.com/kb/131313

Tool to find global/static variables in C codebase

I have created a C++-wrapper (a class) for a larger C codebase which was originally written for a microprocessor. Now we want to simulate multiple instances of "agents" running this C code. As we want to see how they interact, we need to run them simultaneously. If possible, we'd like to run them in one process.
This failed at first because the C code used static variables and thus wasn't thread safe. We thought we had removed all static and global variables but still don't get the expected results. (Everything runs fine if we have only one instance.)
So my question is: instead of searching the whole code base for such variables, is there any tool that might assist to find the problem? The C code was written in Keil μVision and is now compiled in Visual Studio 2008 Team Suite.
Thanks for suggestions!
If you can build it in a more unix-ish environment, you should have a size command you can run on .o files that will tell you the data and bss segment sizes for each .o file. This is a very quick way to find variables of static storage duration (just look for nonzero sizes in either of these fields).
Perhaps you could try building with mingw or cygwin, or looking for a similar tool in the MSVC toolset.

Resources