I have declared a function tolower in assembly language and I am calling it from my C code which includes standard library. I wonder why this works properly without giving any error as now two functions with same name are present.
My C code:
#include <stdio.h>
#include <string.h>
int main() {
char in0[] = "something";
char in1[] = "SomethinG";
char in2[] = "S0mething";
int i = 0;
while (in0[i]) {
in0[i] = tolower(in0[i]);
in1[i] = tolower(in1[i]);
in2[i] = tolower(in2[i]);
i++;
}
return 0;
}
I can't put my assembly code as it comes under an assignment. I have globally declared tolower function.
.section .data
.section .text
.globl _start
.globl tolower
.type tolower, #function
tolower:
I have compiled using gcc cfile.c assfile.s
Are you sure you are calling the assembler version from the C code, and not the standard library version? How is the assembler code defined? Is it a module (.asm) in your C project, in-lined, a static .DLL library, or .DEF/OBJ import? What compiler, assembler, (and IDE, OS?) are you using? There is not enough information currently to yield a concrete answer, however if there is nothing "defining" the assembler function to C, in a way that C can understand, then it will be oblivious to any naming conflicts. This isn't technically an error either; identical function names are like having two identical house addresses (123_Anystreet) in Canada and Europe. Entirely possible and allowable, but C typically doesn't allow it (and assembler does.)
Related
To be specific:
The arm assembly function is written in a separated file such as .S file or .asm file.
I need to call this function in main.c
The ARM assembly is in ARMv8 architecture.
I've written some testing code but it fails to work.
#include <stdio.h>
extern int a_add(int a, int b);
int main(){
int fi=5;
int se=7;
int result=a_add(fi, se);
printf("result is %d", result);
return 0;
}
And following is the assembly code.(a_add.S)
.section .text
.globl a_add
a_add:
add x3,x1,x0
mov x0,x3
br x30
Does anyone know how I can fix these two files to let the a_add function work?
I haven't tried .asm file yet.
Any help is appreciated and please forgive if I made mistakes on expression, but I hope my question is clear.
I have fixed the code by using extern "C" and have solved the problem. What needed is just claim the function as following in main.cpp or main.c
extern "C" {int a_add(int a, int b);}
Replace the old "extern" line by this line with "C".
This is helpful if the compiler is for C++ code instead of C code, basically by letting the function compiled in the way of C code, not C++ code.
I haven't made such test but I suppose if the compiler is C compiler, there won't be such a problem.
main.c
#include<stdio.h>
extern int a_add(int, int);
int main() {
int a = 5;
int b = 3;
printf("%d + %d = %d \n", a, b, a_add(a, b));
}
add.s
.text
.global a_add
a_add:
add w0, w0, w1
ret
Compiled with gcc (9.3.0) on an AArch64 machine (a docker container with ubuntu 20.04) using:
gcc main.c add.s -o run.exe && ./run.exe
A quick way to get the skeleton ready for a desired function in a target assembly would be:
Write a simple c function matching the signature of the desired function, e.g., for the add function above:
int a_add(int a, int b){
return a + b;
}
Generate assembly for such C function to begin with.
gcc -S -O2 add.c
Note: Using higher optimisation level might produce less obvious code.
Tip: Compiler Explorer is a good tool to know while playing around with the assembly.
So I am currently learning assembly language (AT&T syntax). We all know that gcc has an option to generate assembly code from C code with -S argument. Now, I would like to look at some code, how it looks in assembly. The problem is, on laboratories we compile it with as+ld, and as for now, we cannot use C libraries. So for example we cannot use printf. We should do it by syscalls (32 bit is enough). And now I have this code in C:
#include <stdio.h>
int main()
{
int a = 5;
int b = 3;
int c = a + b;
printf("%d", c);
return 0;
}
This is simple code, so I know how it will look with syscalls. But if I have some more complicated code, I don't want to mess around and replace every call printf and modify other registers, cuz gcc generated code for printf, and I should have it with syscalls. So can I somehow make gcc generate assembly code with syscalls (for example for I/O (console, files)), not with C libs?
Under Linux there exist the macro family _syscallX to generate a syscall where the X names the number of parameters. It is marked as obsolete, but IMHO still working. E.g., the following code should work (not tested here):
_syscall3(int,syswrite,int,handle,char*,str,int len);
// ---
char str[]="Hello, world!\n";
// file handle 1 is stdout
syswrite(1,str,14);
Linking C with Assembly in Visual Studio
I've seen that already but that doesn't contain any helpful informations.
I have a C program, in which I'm using function written in assembly.
I include fun.h header in C file with declaration, and have fun.asm with implementation.
It was firstly written using NASM, and there is a global keyword. How can I achieve the same proper linking effect in MASM?
Minimal example:
main.c:
int main()
{
f();
return 0;
}
f.h:
void f();
f.asm:
.DATA
_05 DQ 0.5
_PI DQ 3.14159265358979323846264338327
.CODE
public _f
_f PROC
_f ENDP
END
According to what Michael Petch wrote in comment:
Searched keyword is PUBLIC, and in x86_64 there is NO need for underscore on PROC functions.
It is said that we can write multiple declarations but only one definition. Now if I implement my own strcpy function with the same prototype :
char * strcpy ( char * destination, const char * source );
Then am I not redefining the existing library function? Shouldn't this display an error? Or is it somehow related to the fact that the library functions are provided in object code form?
EDIT: Running the following code on my machine says "Segmentation fault (core dumped)". I am working on linux and have compiled without using any flags.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *strcpy(char *destination, const char *source);
int main(){
char *s = strcpy("a", "b");
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return "a";
}
Please note that I am not trying to implement the function. I am just trying to redefine a function and asking for the consequences.
EDIT 2:
After applying the suggested changes by Mats, the program no longer gives a segmentation fault although I am still redefining the function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *strcpy(char *destination, const char *source);
int main(){
char *s = strcpy("a", "b");
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return "a";
}
C11(ISO/IEC 9899:201x) §7.1.3 Reserved Identifiers
— Each macro name in any of the following subclauses (including the future library
directions) is reserved for use as specified if any of its associated headers is included;
unless explicitly stated otherwise.
— All identifiers with external linkage in any of the following subclauses (including the
future library directions) are always reserved for use as identifiers with external
linkage.
— Each identifier with file scope listed in any of the following subclauses (including the
future library directions) is reserved for use as a macro name and as an identifier with
file scope in the same name space if any of its associated headers is included.
If the program declares or defines an identifier in a context in which it is reserved, or defines a reserved identifier as a macro name, the behavior is undefined. Note that this doesn't mean you can't do that, as this post shows, it can be done within gcc and glibc.
glibc §1.3.3 Reserved Names proveds a clearer reason:
The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:
Other people reading your code could get very confused if you were using a function named exit to do something completely different from what the standard exit function does, for example. Preventing this situation helps to make your programs easier to understand and contributes to modularity and maintainability.
It avoids the possibility of a user accidentally redefining a library function that is called by other library functions. If redefinition were allowed, those other functions would not work properly.
It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user. Some library facilities, such as those for dealing with variadic arguments (see Variadic Functions) and non-local exits (see Non-Local Exits), actually require a considerable amount of cooperation on the part of the C compiler, and with respect to the implementation, it might be easier for the compiler to treat these as built-in parts of the language.
That's almost certainly because you are passing in a destination that is a "string literal".
char *s = strcpy("a", "b");
Along with the compiler knowing "I can do strcpy inline", so your function never gets called.
You are trying to copy "b" over the string literal "a", and that won't work.
Make a char a[2]; and strcpy(a, "b"); and it will run - it probably won't call your strcpy function, because the compiler inlines small strcpy even if you don't have optimisation available.
Putting the matter of trying to modify non-modifiable memory aside, keep in mind that you are formally not allowed to redefine standard library functions.
However, in some implementations you might notice that providing another definition for standard library function does not trigger the usual "multiple definition" error. This happens because in such implementations standard library functions are defined as so called "weak symbols". Foe example, GCC standard library is known for that.
The direct consequence of that is that when you define your own "version" of standard library function with external linkage, your definition overrides the "weak" standard definition for the entire program. You will notice that not only your code now calls your version of the function, but also all class from all pre-compiled [third-party] libraries are also dispatched to your definition. It is intended as a feature, but you have to be aware of it to avoid "using" this feature inadvertently.
You can read about it here, for one example
How to replace C standard library function ?
This feature of the implementation doesn't violate the language specification, since it operates within uncharted area of undefined behavior not governed by any standard requirements.
Of course, the calls that use intrinsic/inline implementation of some standard library function will not be affected by the redefinition.
Your question is misleading.
The problem that you see has nothing to do with the re-implementation of a library function.
You are just trying to write non-writable memory, that is the memory where the string literal a exists.
To put it simple, the following program gives a segmentation fault on my machine (compiled with gcc 4.7.3, no flags):
#include <string.h>
int main(int argc, const char *argv[])
{
strcpy("a", "b");
return 0;
}
But then, why the segmentation fault if you are calling a version of strcpy (yours) that doesn't write the non-writable memory? Simply because your function is not being called.
If you compile your code with the -S flag and have a look at the assembly code that the compiler generates for it, there will be no call to strcpy (because the compiler has "inlined" that call, the only relevant call that you can see from main, is a call to puts).
.file "test.c"
.section .rodata
.LC0:
.string "a"
.align 8
.LC1:
.string "\nThe function ran successfully"
.text
.globl main
.type main, #function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movw $98, .LC0(%rip)
movq $.LC0, -8(%rbp)
movl $.LC1, %edi
call puts
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
.section .rodata
.LC2:
.string "in duplicate function strcpy"
.text
.globl strcpy
.type strcpy, #function
strcpy:
.LFB3:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movq %rsi, -16(%rbp)
movl $.LC2, %edi
movl $0, %eax
call printf
movl $.LC0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE3:
.size strcpy, .-strcpy
.ident "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"
.
I think Yu Hao answer has a great explanation for this, the quote from the standard:
The names of all library types, macros, variables and functions that
come from the ISO C standard are reserved unconditionally; your
program may not redefine these names. All other library names are
reserved if your program explicitly includes the header file that
defines or declares them. There are several reasons for these
restrictions:
[...]
It allows the compiler to do whatever special optimizations it pleases
on calls to these functions, without the possibility that they may
have been redefined by the user.
your example can operate in this way : ( with strdup )
char *strcpy(char *destination, const char *source);
int main(){
char *s = strcpy(strdup("a"), strdup("b"));
printf("\nThe function ran successfully\n");
return 0;
}
char *strcpy(char *destination, const char *source){
printf("in duplicate function strcpy");
return strdup("a");
}
output :
in duplicate function strcpy
The function ran successfully
The way to interpret this rule is that you cannot have multiple definitions of a function end up in the final linked object (the executable). So, if all the objects included in the link have only one definition of a function, then you are good. Keeping this in mind, consider the following scenarios.
Let's say you redefine a function somefunction() that is defined in some library. Your function is in main.c (main.o) and in the library the function is in an a object named someobject.o (in the libray). Remember that in the final link, the linker only looks for unresolved symbols in the libraries. Because somefunction() is resolved already from main.o, the linker does not even look for it in the libraries and does not pull in someobject.o. The final link has only one definition of the function, and things are fine.
Now imagine that there is another symbol anotherfunction() defined in someobject.o that you also happen to call. The linker will try to resolve anotherfunction() from someobject.o, and pull it in from the library, and it will become a part of the final link. Now you have two definitions of somefunction() in the final link - one from main.o and another from someobject.o, and the linker will throw an error.
I use this one frequently:
void my_strcpy(char *dest, char *src)
{
int i;
i = 0;
while (src[i])
{
dest[i] = src[i];
i++;
}
dest[i] = '\0';
}
and you can also do strncpy just by modify one line
void my_strncpy(char *dest, char *src, int n)
{
int i;
i = 0;
while (src[i] && i < n)
{
dest[i] = src[i];
i++;
}
dest[i] = '\0';
}
I am using BOOST_PP for to do precompile computations in the preprocessor. I am focusing on an application where code size is extremely important to me. (So please don't say the compiler should or usually does that, I need to control what is performed at compile time and what code is generated). However, I want to be able to use the same name of the macro/function for both integer constants and variables. As trivial example, I can have
#define TWICE(n) BOOST_PP_MUL(n,2)
//.....
// somewhere else in code
int a = TWICE(5);
This does what I want it to, evaluating to
int a = 10;
during compile time.
However, I also want it to be used at
int b = 5;
int a = TWICE(b);
This should be preprocessed to
int b = 5;
int a = 5 * 2;
Of course, I can do so by using the traditional macros like
#define TWICE(n) n * 2
But then it doesnt do what I want it to do for integer constants (evaluating them during compile time).
So, my question is, is there a trick to check whether the argument is a literal or a variable, and then use different definitions. i.e., something like this:
#define TWICE(n) BOOST_PP_IF( _IS_CONSTANT(n), \
BOOST_PP_MUL(n,2), \
n * 2 )
edit:
So what I am really after is some way to check if something is a constant available at compile time, and hence a good argument for the BOOST_PP_ functions. I realize that this is different from what most people expect from a preprocessor and general programming recommendations. But there is no wrong way of programming, so please don't hate on the question if you disagree with its philosophy. There is a reason the BOOST_PP library exists, and this question is in the same spirit. It might just not be possible though.
You're trying to do something that is better left to the optimizations of the compiler.
int main (void) {
int b = 5;
int a = b * 2;
return a; // return it so we use a and it's not optimized away
}
gcc -O3 -s t.c
.file "t.c"
.text
.p2align 4,,15
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
movl $10, %eax
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 4.5.0-6) 4.5.1 20100617 (prerelease)"
.section .note.GNU-stack,"",#progbits
Optimizing compiler will optimize.
EDIT: I'm aware that you don't want to hear that the compiler "should" or "usually" does that. However, what you are trying to do is not something that is meant to be done in the C preprocessor; the people who designed the C language and the C preprocessor designed it to work with the preprocessing token as it's basic atom. The CPP is, in many ways, "dumb". This is not a bad thing (in fact, this is in many cases what makes it so useful), but at the end of the day, it is a preprocessor. It preprocesses source files before they are parsed. It preprocesses source files before semantic analysis occurs. It preprocesses source files before the validity of a given source file is checked. I understand that you do not want to hear that this is something that the parser and semantic analyzer should handle or usually does. However, this is the reality of the situation. If you want to design code that is incredibly small, then you should rely on your compiler to do it's job instead of trying to create preprocessor constructs to do the work. Think of it this way: thousands of hours of work went into your compiler, so try to reuse as much of that work as possible!
not quite direct approach, however:
struct operation {
template<int N>
struct compile {
static const int value = N;
};
static int runtime(int N) { return N; }
};
operation::compile<5>::value;
operation::runtime(5);
alternatively
operation<5>();
operation(5);
There is no real chance of mixing the two levels (preprocessor and evaluation of variables). From what I understand from you question, b should be a symbolic constant?
I think you should use the traditional one
#define TWICE(n) ((n) * 2)
but then instead of initializing variables with expressions, you should initialize them with compile time constants. The only thing that I see to force evaluation at compile time and to have symbolic constants in C are integral enumeration constants. These are defined to have type int and are evaluated at compile time.
enum { bInit = 5 };
int b = bInit;
enum { aInit = TWICE(bInit) };
int a = aInit;
And generally you should not be too thrifty with const (as for your b) and check the produced assembler with -S.