XE6 Ansi/Unicode-String Linker Errors (Unresolved Externals) - linker

My scenario is porting my projects from XE3 to XE6.
I'm getting these unresolved externals when I switch the calling convention from C to stdcall.
From that point on, UnicodeString(), ~UnicodeString() (and others) become unresolved.
I compared ustring.h between XE3 and XE6; it looks like there are many changes there.
For example, the UnicodeString destructor.
In XE3, it's declared as:
__fastcall ~UnicodeString();
In XE6, it's declared as:
~UnicodeString();
I then modified the declaration to be:
__cdecl ~UnicodeString();
This corrects the linker error.
Is this normal, and is it the right correction to make?
To reproduce the problem:
create a new C++ package;
create a new component, derived from TEdit, and add it to the
package;
build and link => all is OK;
then go to project options and switch the calling convention to stdcall;
build and link => unresolved externals occur.
Thanks in advance for your answers.
N. Fortin

Do you happen to use 32-bit on XE3 and 64-bit on XE6? If so, 64-bit generally doesn't support multiple calling conventions, and thus nearly everything is cdecl. This is normal.

Related

Rust problem linking external module: Rust wants __imp_ on imported DLL library when it isn't there (LNK2019)

error LNK2019: unresolved external symbol __imp_yourexternFunc
I have an external C DLL function called "output" which is similar to printf:
output( format , va_args);
In *.h files its declared:
__declspec( dllexport ) void output( LPCTSTR format, ... );
or
__declspec( dllimport ) void output( LPCTSTR format, ... );
(for *.h includes) there is a MACRO that selects between export/import base on usage
In my rust module I declare it extern as:
#[link(name="aDLL", kind="dylib")]
extern {
fn output( format:LPCTSTR, ...);
}
The dumpbin for this function is as follows (from dumpbin)
31 ?output@@YAXPEBDZZ (void __cdecl output(char const *,...))
But when I attempt to link this, the Rust linker is prepending __imp_ to the function name:
second_rust_lib_v0.second_rust_lib_v0.ay01u8ua-cgu.6.rcgu.o : error LNK2019: unresolved external symbol __imp_output referenced in function print_something
On Windows, linking against DLLs goes through a trampoline library (.lib file) which generates the right bindings. The convention for these is to prefix the function names with __imp_ (there is a related C++ answer).
There is an open issue that explains some of the difficulties creating and linking rust dlls under windows.
Here are the relevant bits:
If you start developing on Windows, Rust will produce a mylib.dll and mylib.dll.lib. To use this lib again from Rust you will have to specify #[link(name = "mylib.dll")], thus giving the impression that the full file name has to be specified. On Mac, however, #[link(name = "libmylib.dylib")] will fail (likewise Linux).
If you start developing on Mac and Linux, #[link(name = "mylib")] just works, giving you the impression Rust handles the name resolution (fully) automatically like other platforms that just require the base name.
In fact, the correct way to cross platform link against a dylib produced by Rust seems to be:
#[cfg_attr(all(target_os = "windows", target_env = "msvc"), link(name = "dylib.dll"))]
#[cfg_attr(not(all(target_os = "windows", target_env = "msvc")), link(name = "dylib"))]
extern "C" {}
Like your previous question you continue to ignore how compilers and linkers work. The two concepts you need to wrap your head around are these:
LPCTSTR is not a type. It is a preprocessor macro that expands to char const*, wchar_t const*, or __wchar_t const* if you are particularly unlucky. Either way, once the compiler is done, LPCTSTR is gone. Forever. It will not ever show up as a type even when using C++ name decoration.
It is not a type, don't use it in places where only types are allowed.
Compilers support different types of language linkage for external symbols. While you insist to have a C DLL, you are in fact using C++ linkage. This is evidenced by the symbol assigned to the exported function. While C++ linkage is great in that it allows type information to be encoded in the decorated names, the name decoration scheme isn't standardized in any way, and varies widely across compilers and platforms. As such, it is useless when the goal is cross language interoperability (or any interoperability).
As explained in my previous answer, you will need to get rid of the LPCTSTR in your C (or C++) interface. That's non-negotiable. It must go, and unwittingly you have done that already. Since DUMPBIN understands MSVC's C++ name decoration scheme, it was able to turn this symbol
?output@@YAXPEBDZZ
into this code
void __cdecl output(char const *,...)
All type information is encoded in the decorated name, including the calling convention used. Take special note that the first formal parameter is of type char const *. That's fixed, set in stone, compiled into the DLL. There is no going back and changing your mind, so make sure your clients can't either.
You MUST change the signature of your C or C++ function. Pick either char const* or wchar_t const*. When it comes to strings in Rust on Windows there is no good option. Picking either one is the best you have.
The other issue you are going up against is insisting on having Rust come to terms with C++' language linkage. That isn't going to be an option until Standard C++ has formally standardized C++ language linkage. In statistics, this is called the "Impossible Event", so don't sink any more time into something that's not going to get you anywhere.
Instead, instruct your C or C++ library to export symbols using C language linkage by prepending an extern "C" specifier. While not formally specified either, most tools agree on a sufficiently large set of rules to be usable. Whether you like it or not, extern "C" is the only option we have when making compiled C or C++ code available to other languages (or C and C++, for that matter).
If for whatever reason you cannot use C language linkage (and frankly, since you are compiling C code I don't see a valid reason for that being the case) you could export from a DLL using a DEF file, giving you control over the names of the exported symbols. I don't see much benefit in using C++ language linkage, then throwing out all the benefits and pretend to the linker that this were C language linkage. I mean, why not just have the compiler do all that work instead?
Of course, if you are this desperately trying to avoid the solution, you could also follow the approach from your proposed answer, so long as you understand, why it works, when it stops working, and which new error mode you've introduced.
It works, in part by tricking the compiler, and in part by coincidence. The link_name = "?output@@YAXPEBDZZ" attribute tells the compiler to stop massaging the import symbol and instead use the provided name when requesting the linker to resolve symbols. This works by coincidence because Rust defaults to __cdecl, which happens to be the calling convention for all variadic functions in C. Most functions in the Windows API use __stdcall, though. Now ironically, had you used C linkage instead, you would have lost all type information, but retained the calling convention in the name decoration. A mismatch between calling conventions would thus have been caught during linking. Another opportunity missed, oh well.
It stops working when you recompile your C DLL and define UNICODE or _UNICODE, because now the symbol has a different name, due to different types. It will also stop working when Microsoft ever decide to change their (undocumented) name decoration scheme. And it will certainly stop working when using a different C++ compiler.
The Rust implementation introduced a new error mode. Presumably, LPCTSTR is a type alias, gated by some sort of configuration. This allows clients to select, whether they want an output that accepts a *const u8 or *const u16. The library, though, is compiled to accept char const* only. Another mismatch opportunity introduced needlessly. There is no place for generic-text mappings in Windows code, and hasn't been for decades.
As always, a few words of caution: Trying to introduce Rust into a business that's squarely footed on C and C++ requires careful consideration. Someone doing that will need to be intimately familiar with C++ compilers, linkers, and Rust. I feel that you are struggling with all three of those, and fear that you are ultimately going to provide a disservice.
Consider whether you should be bringing someone in that is sufficiently experienced. You can either thank me later for the advice, or pay me to fill in that role.
This is not my ideal answer, but it is how I solve the problem.
What I'm still looking for is a way to get the Microsoft Linker (I believe) to output full verbosity in the rust build as it can do when doing C++ builds. There are options to the build that might trigger this but I haven't found them yet. That plus this name munging in maybe 80% less text than I write here would be an ideal answer I think.
The users.rust-lang.org user chrefr helped by asking some clarifying questions which jogged my brain. He mentioned that "name mangling schema is unspecified in C++", which was my aha moment.
I was trying to force Rust to make the linker look for my external output() API function by its mangled name, since the native function I am accessing was not declared with extern "C" to prevent name mangling.
I simply forced Rust to use the mangled name I found with dumpbin (code below).
What I was hoping for as an answer was a way to get linker.exe to output all the symbols it is looking for. Which would have been "output" which was what the compiler error was stating. I was thinking it was looking for a mangled name and wanted to compare the two mangled names by getting the microsoft linker to output what it was attempting to match.
So my solution was to use the dumpbin munged name in my #[link] directive:
//#[link(name="myNativeLib")]
//#[link(name="myNativeLib", kind="dylib")] // prepends _imp to symbol below
#[link(name="myNativeLib", kind="static")] // I'm linking with a DLL
extern {
//#[link_name = "output"]
#[link_name = "?output@@YAXPEBDZZ"] // Name found via DUMPBIN.exe /Exports
fn output( format:LPCTSTR, ...);
}
Although I have access to the sources of myNativeLib, these are not distributed and are not going to change. The *.lib and *.exp are only available internally, so long term I will need a solution for binding to these modules that relies only on the *.dll being present. That suggests I might need to load the DLL dynamically instead of doing what I consider "implicit" linking of the DLL, as I suspect Rust looks only at the *.lib module to resolve the symbols. I need a kind="dylibOnly" for Windows DLLs that are distributed without *.lib and *.exp modules.
But for the moment I was able to get all my link issues resolved.
I can now call my Rust DLL from a VS2019 Platform Toolset v142 "main", and the Rust DLL can call the 'C' DLL function "output", and the data goes to the proprietary stream that the native "output" function was designed to send data to.
There were several hoops involved, but generally cargo/rustc/cbindgen worked well for this newbie. Now I'm trying to identify a compute-intensive task where multithreading is being avoided in 'C' that could be safely implemented in Rust and benchmarked, to illustrate that all this pain is worthwhile.

Which calling convention is used for functions exported via .def file?

I'm compiling some third party C code with Visual C++. The source tree contains the following .def file:
LIBRARY "ThirdParty.dll"
EXPORTS
ThirdPartyFunction @1
and there's no explicit calling convention specification (like __stdcall or __cdecl) near ThirdPartyFunction() definition. The Visual C++ project properties (C++ -> Advanced -> Calling convention) is set to __cdecl (/Gd).
Which calling convention will be used for the exported function and how do I make sure that it's that convention?
A .def file does not control the calling convention, it is purely determined by the compiler. If you don't explicitly use __cdecl or __stdcall in the function declaration then it is the compiler's default, so __cdecl. Corner cases are __thiscall for C++ member functions and __clrcall for managed code.
The calling convention also selects the name decoration style, specifically invented to avoid accidents with client code getting it wrong. __cdecl adds a single underscore before the name; __stdcall appends "@n", where n is the size of the stack activation frame. That protects against a stack imbalance when the client code uses a wrong declaration with a mismatch in the type or number of arguments, a fatal and very hard to diagnose problem with __stdcall. Disabling this decoration with a .def file is actually a bad idea and should only ever be contemplated if the DLL is dynamically loaded with LoadLibrary+GetProcAddress. If you intend your DLL to be used by non-C/C++ clients then it is usually a good idea to use __stdcall explicitly, since that tends to be the default for other language runtimes.
None of this matters for 64-bit code, it blissfully has only one calling convention. Although it looks like Microsoft is about to mess that up by adding the __vectorcall calling convention.

dlopen: Is it possible to trap unresolved symbols, "manually" resolving them as they happen?

Is it possible to trap unresolved symbol references when they happen, so that a function is called to try to resolve the symbol as needed? Or is it possible to add new symbols to the dynamic symbol table at runtime without creating a library file and dlopen'ing it? I am on GNU/Linux, using GCC. (Portability to other Unixes would be nice, but is not a key concern.)
Thanks in advance!
Edit: I should have given more detail about what I am trying to do. I want to write an interpreter for a programming language, which is expected to support both compiled (dlopen'ed) and interpreted modules. I wanted calls from a compiled module to functions defined elsewhere to be resolved by the linker, to avoid a lookup for the function at every call, but calls to interpreted code would be left unresolved. I wanted to trap those calls, so that I could call the appropriate interpreted function when needed (or signal an error if the function does not exist).
If you know which symbols are missing, you could write a library containing just them, and LD_PRELOAD it prior to the application's execution.
If you don't have the list of missing symbols, you could discover them by using either 'nm' or 'objdump' on the binary and, based on that, write a script which builds the library with the missing symbols prior to the application's execution, and then LD_PRELOAD it as well.
Also, you could use gdb to inject new 'code' into applications, making the functions point to what you need.
Finally, you could also override some of the ld.so functions to detect the missing symbols, and do something about them.
But in any case, if you could explain what you are trying to accomplish, it would be easier to provide a proper solution.
I'm making a wild guess that the problem you're trying to address is the case where you dlopen and start using a loadable module, then suddenly crash due to unresolved symbols. If so, this is a result of lazy binding, and you can disable it by exporting LD_BIND_NOW=1 (or any value, as long as it's set) in the environment. This will ensure that all symbols can be resolved before dlopen returns, and if any can't, the dlopen operation will fail, letting you handle the situation gracefully.

Error While Linking Multiple C Object files in Delphi 2007

I am new to Delphi. I was trying to add C object files to my Delphi project and link them directly, since Delphi supports C object linking. I got it working when I link a single object file, but when I try to link multiple object files I get the error 'Unsatisfied forward or external declaration'. I have tried this in Delphi 2007 as well as XE. So what am I doing wrong here?
Working Code:
function a_function():Integer;cdecl;
implementation
{$Link 'a.obj'}
function a_function():Integer;cdecl;external;
end.
Error Code:
function a_function():Integer;cdecl;
function b_function():Integer;cdecl;
function c_function():Integer;cdecl;
implementation
{$LINK 'a.obj'}
{$LINK 'b.obj'}
{$LINK 'c.obj'}
function a_function():Integer;cdecl;external;
function b_function():Integer;cdecl;external;
function c_function():Integer;cdecl;external;
end.
As an aside, the article linked by @vcldeveloper has a good explanation of some of the common issues. The trick of providing missing C RTL functions in Pascal code is excellent and much quicker than trying to link in the necessary functions as C files, or even as .obj files.
However, I have a suspicion that I know what is going on here. I use this same approach but in fact have over 100 .obj files in the unit. I find that when I add new ones, I get the same linker error as you do. The way I work around this is to try re-ordering my $LINK instructions. I try to add the new obj files one by one and I have always been able, eventually, to get around this problem.
If your C files are totally standalone then you could put each one in a different unit and the linker would handle that. However, I doubt that is the case and indeed I suspect that if they really were standalone then this problem would not occur. Also, it's desirable to have the $LINK instructions in a single unit so that any RTL functions that need to be supplied can be supplied once and once only (they need to appear in the same unit as the $LINK instructions).
This oddity in the linker was present in Delphi 6 and is present in Delphi 2010.
EDIT 1: The realisation has now dawned on me that this issue is probably due to Delphi using a single-pass compiler. I suspect that the "missing external reference" error arises because the compiler processes the .obj files in the order in which they appear in the unit.
Suppose that a.obj appears before b.obj, and yet a.obj calls a function b() in b.obj. The compiler wouldn't know where b() resides at the point where it needs to fix up the function call. When I find the time, I'm going to try and test whether this hypothesis is at the very least plausible!
Finally, another easy way out of the problem would be to combine a.c, b.c and c.c into a single C file which would I believe bypass this issue for the OP.
Edit 2: I found another Stack Overflow question that covers this ground: stackoverflow.com/questions/3228127/why-does-the-order-of-linked-object-file-with-l-directive-matter
Edit 3: I have found another truly wonderful way to work around this problem. Every time the compiler complains
[DCC Error] Unit1.pas(1): E2065 Unsatisfied forward or external declaration: '_a'
you simply add, in the implementation section of the unit, a declaration like so:
procedure _a; external;
If it is a routine that you wish to call from Delphi then you clearly need to get the parameter list, calling conventions etc. correct. Otherwise, if it is a routine internal to the external code, then you can ignore the parameter list, calling conventions etc.
To the best of my knowledge this is the only way to import two objects that refer to each other in a circular manner. I believe that declaring an external procedure in this way is akin to making a forward declaration. The difference is that the implementation is provided by an object rather than Pascal code.
I've now been able to add a couple more tools to my armory – thank you for asking the question!

How to catch unintentional function interpositioning?

Reading through my book Expert C Programming, I came across the chapter on function interpositioning and how it can lead to some serious hard to find bugs if done unintentionally.
The example given in the book is the following:
my_source.c
mktemp() { ... }
main() {
mktemp();
getwd();
}
libc
mktemp(){ ... }
getwd(){ ...; mktemp(); ... }
According to the book, what happens in main() is that mktemp() (a standard C library function) is interposed by the implementation in my_source.c. Although having main() call my implementation of mktemp() is intended behavior, having getwd() (another C library function) also call my implementation of mktemp() is not.
Apparently, this example was a real life bug that existed in SunOS 4.0.3's version of lpr. The book goes on to explain the fix was to add the keyword static to the definition of mktemp() in my_source.c; although changing the name altogether should have fixed this problem as well.
This chapter leaves me with some unresolved questions that I hope you guys could answer:
Does GCC have a way to warn about function interposition? We certainly don't ever intend on this happening and I'd like to know about it if it does.
Should our software group adopt the practice of putting the keyword static in front of all functions that we don't want to be exposed?
Can interposition happen with functions introduced by static libraries?
Thanks for the help.
EDIT
I should note that my question is not just aimed at interposing over standard C library functions, but also functions contained in other libraries, perhaps 3rd party, perhaps ones created in-house. Essentially, I want to catch any instance of interpositioning regardless of where the interposed function resides.
This is really a linker issue.
When you compile a bunch of C source files the compiler will create an object file for each one. Each .o file will contain a list of the public functions in this module, plus a list of functions that are called by code in the module, but are not actually defined there i.e. functions that this module is expecting some library to provide.
When you link a bunch of .o files together to make an executable, the linker must resolve all of these missing references. This is the point where interposing can happen. If there are unresolved references to a function called "mktemp" and several libraries provide a public function with that name, which version should it use? There's no easy answer to this, and yes, odd things can happen if the wrong one is chosen.
So yes, it's a good idea in C to "static" everything unless you really do need to use it from other source files. In fact in many other languages this is the default behavior and you have to mark things "public" if you want them accessible from outside.
It sounds like what you want is for the tools to detect that there are name conflicts in functions - i.e., you don't want your externally accessible functions accidentally having the same name as, and therefore 'overriding' or hiding, functions with the same name in a library.
There was a recent SO question related to this problem: Linking Libraries with Duplicate Class Names using GCC
Using the --whole-archive option on all the libraries you link against may help (but as I mentioned in the answer over there, I really don't know how well this works or how easy it is to convince builds to apply the option to all libraries)
Purely formally, the interpositioning you describe is a straightforward violation of C language definition rules (ODR rule, in C++ parlance). Any decent compiler must either detect these situations, or provide options for detecting them. It is simply illegal to define more than one function with the same name in C language, regardless of where these functions are defined (Standard library, other user library etc.)
I understand that many platforms provide means to customize the [standard] library behavior by defining some standard functions as weak symbols. While this is indeed a useful feature, I believe the compilers must still provide the user with means to enforce the standard diagnostics (on per-function or per-library basis preferably).
So, again, you should not worry about interpositioning if you have no weak symbols in your libraries. If you do (or if you suspect that you do), you have to consult your compiler documentation to find out if it offers you with means to inspect the weak symbol resolution.
In GCC, for example, you can disable the weak symbol functionality by using -fno-weak, but this basically kills everything related to weak symbols, which is not always desirable.
If the function does not need to be accessed outside of the C file it lives in then yes, I would recommend making the function static.
One thing you can do to help catch this is to use an editor that has configurable syntax highlighting. I personally use SciTE, and I have configured it to display all standard library function names in red. That way, it's easy to spot if I am re-using a name I shouldn't be using (nothing is enforced by the compiler, though).
It's relatively easy to write a script that runs nm -o on all your .o files and your libraries and checks to see if an external name is defined both in your program and in a library. Just one of the many sane sensible services that the Unix linker doesn't provide because it's stuck in 1974, looking at one file at a time. (Try putting libraries in the wrong order and see if you get a useful error message!)
Interpositioning occurs when the linker is linking separate modules.
It cannot occur within a module. If there are duplicate symbols in a module, the linker will report this as an error.
For *nix linkers, unintended interpositioning is a problem, and it is difficult for the linker to guard against it.
For the purposes of this answer, consider the two linking stages:
The linker links translation units into modules (basically applications or libraries).
The linker resolves any remaining unfound symbols by searching in other modules.
Consider the scenario described in 'Expert C Programming' and in SiegeX's question.
The linker first tries to build the application module. It sees that the symbol mktemp() is external and tries to find a function definition for the symbol. The linker finds the definition in the object code of the application module and marks the symbol as found.
At this stage the symbol mktemp() is completely resolved. It is not considered in any way tentative, so as to allow for the possibility that another module might also define the symbol.
In many ways this makes sense, since the linker should first try to resolve external symbols within the module it is currently linking. It searches other modules only for symbols that remain unfound.
Furthermore, since the symbol has been marked as resolved, the linker will use the application's mktemp() in any other case where it needs to resolve this symbol.
Thus the application's version of mktemp() will be used by the library.
A simple way to guard against the problem is to make all external symbols in your application or library unique.
For modules that are only going to be shared on a limited basis, this can fairly easily be done by appending a unique identifier to all external symbols in your module.
For modules that are widely shared, making up unique names is a problem.
