Error compiling a cuda project - c

I'm having some trouble compiling a CUDA project that uses CUDA C and the lodepng library.
My makefile looks like this.
gpu: super-resolution.cu
gcc -g -O -c lodepng.c
nvcc -c super-resolution.cu
nvcc -o super-resolution-cuda super-resolution.o
rm -rf super-resolution.o
rm -rf lodepng.o
Could anyone tell me what I am doing wrong? The build complains about:
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.o: In function `main':
parallel-algorithm/super-resolution.cu:238: undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'
parallel-algorithm/super-resolution.cu:259: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:269: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:282: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:292: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:301: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
...
I just need a way to compile my .cu file and add a C .o file into it during the compilation process using nvcc.
EDIT: I tried the suggestion, with no success.
gcc -g -O -c lodepng.c
nvcc -c super-resolution.cu
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
#import "cuda.h"
^
super-resolution.cu(106): warning: expression has no effect
super-resolution.cu(116): warning: expression has no effect
super-resolution.cu(141): warning: variable "y" was declared but never referenced
super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
#import "cuda.h"
^
super-resolution.cu(106): warning: expression has no effect
super-resolution.cu(116): warning: expression has no effect
super-resolution.cu(141): warning: variable "y" was declared but never referenced
ptxas /tmp/tmpxft_00000851_00000000-5_super-resolution.ptx, line 197; warning : Double is not supported. Demoting to float
nvcc -o super-resolution-cuda super-resolution.o lodepng.o
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.o: In function `main':
tmpxft_00000851_00000000-3_super-resolution.cudafe1.cpp:(.text+0x5d): undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'
It still can't find the reference to the object file.
Edit: here's our .cu file.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cstdio>
extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

Don't use #import. If you want to include cuda.h (which should be unnecessary), use #include. I would simply delete that line from your super-resolution.cu file.
What you did not show before, but is now evident, is that in your super-resolution.cu you are including lodepng.h and also later specifying C-linkage for 2 functions: lodepng_decode32_file and lodepng_encode32_file. When I tried compiling your super-resolution.cu the compiler gave me errors like this (I don't know why you don't see them):
super-resolution.cu(8): error: linkage specification is incompatible with previous "lodepng_encode32_file"
lodepng.h(184): here
super-resolution.cu(9): error: linkage specification is incompatible with previous "lodepng_decode32_file"
lodepng.h(134): here
So basically you are tripping over C and C++ linkage.
I believe the simplest solution is to use lodepng.cpp (instead of lodepng.c), delete the following lines from your super-resolution.cu:
extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );
And just compile everything and link everything c++ style:
$ g++ -c lodepng.cpp
$ nvcc -c super-resolution.cu
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
$ nvcc -o super-resolution super-resolution.o lodepng.o
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
$
If you really want to link lodepng.o C-style instead of C++-style, you will need to modify lodepng.h with appropriate extern "C" wrappers around the declarations of the functions you call. In my opinion this gets messy.
If you want to get rid of the warnings about sm_10 then add the nvcc switch to compile for a different architecture, e.g.:
nvcc -arch=sm_20 ...
but make sure whatever you choose is compatible with your GPU.

Here is a simple snippet of the code.
The lodepng library is available at http://lodev.org/lodepng/.
Renaming lodepng.cpp to lodepng.c makes it usable from C.
Even at this level, there are compilation issues:
"undefined reference to `lodepng_decode32_file'"
"undefined reference to `lodepng_encode32_file'"
File: Makefile
all: gpu
gcc -g -O -c lodepng.c
nvcc -c super-resolution.cu
nvcc -o super-resolution-cuda super-resolution.o lodepng.o
rm -rf super-resolution.o
rm -rf lodepng.o
File: super-resolution.cu
#import "cuda.h"
#include "lodepng.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cstdio>
extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );
//GPU 3x3 Blur.
__global__ void gpuBlur(unsigned char* image, unsigned char* buffer, int width, int height)
{
int i = threadIdx.x%width;
int j = threadIdx.x/width;
if (i == 0 || j == 0 || i == width - 1 || j == height - 1)
return;
int k;
for (k = 0; k <= 4; k++)
{
buffer[4*width*j + 4*i + k] = (image[4*width*(j-1) + 4*(i-1) + k] +
image[4*width*(j-1) + 4*i + k] +
image[4*width*(j-1) + 4*(i+1) + k] +
image[4*width*j + 4*(i-1) + k] +
image[4*width*j + 4*i + k] +
image[4*width*j + 4*(i+1) + k] +
image[4*width*(j+1) + 4*(i-1) + k] +
image[4*width*(j+1) + 4*i + k] +
image[4*width*(j+1) + 4*(i+1) + k])/9;
}
}
int main(int argc, char *argv[])
{
//Items for image processing;
//int threshold = 100;
unsigned int error;
unsigned char* image;
unsigned int width, height;
//Load the image;
if (argc > 1)
{
error = lodepng_decode32_file(&image, &width, &height, argv[1]);
printf("Loaded file: %s[%d]\n", argv[1], error);
}
else
{
return 0;
}
unsigned char* buffer =(unsigned char*)malloc(sizeof(char) * 4*width*height);
//GPU Blur Section.
unsigned char* image_gpu;
unsigned char* blur_gpu;
cudaMalloc( (void**) &image_gpu, sizeof(char) * 4*width*height);
cudaMalloc( (void**) &blur_gpu, sizeof(char) * 4*width*height);
cudaMemcpy(image_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
cudaMemcpy(blur_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
gpuBlur<<< 1, height*width >>> (image_gpu, blur_gpu, width, height);
cudaMemcpy(buffer, blur_gpu, sizeof(char) * 4*width*height, cudaMemcpyDeviceToHost);
//Spit out buffer as an image.
error = lodepng_encode32_file("GPU_OUTPUT1_Blur.png", buffer, width, height);
cudaFree(image_gpu);
cudaFree(blur_gpu);
free(buffer);
free(image);
}


I have a header file called tracing.h in the /usr/include/bpf directory, and including it seems to cause a problem. I couldn't include it with or without -I.

The header defines PT_REGS_PARM2, which I use as long ptr = PT_REGS_PARM2(ctx);. I first ran the command below, assuming my system configuration would locate the header at /usr/include/bpf/tracing.h,
but the header file could not be found:
root#this:/home/ubuntu/Desktop/ebpf/Linux-exFilter-main/pkg/probe/bpf# clang -O2 -Wall -g -target bpf -I /usr/include/ -c kprobe_send.c -o kprobe_send.o
I also tried with -I and changing <bpf/tracing.h> to "bpf/tracing.h"; that did not work either.
I only started adding -I after compiling the program without it produced an error that I could not understand. This is the output:
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
#this:/home/ubuntu/Desktop/ebpf/Linux-exFilter-main/pkg/probe/bpf# clang -O2 -Wall -g-target bpf -c kprobe_send.c -o kprobe_send.o
clang: error: unknown argument: '-g-target'
clang: error: no such file or directory: 'bpf'
root#this:/home/ubuntu/Desktop/ebpf/Linux-exFilter-main/pkg/probe/bpf# clang -O2 -Wall -g -target bpf -c kprobe_send.c -o kprobe_send.o
kprobe_send.c:31:2: warning: implicit declaration of function 'srand' is invalid in C99 [-Wimplicit-function-declaration]
srand(time(NULL)); /* Seed the random number generator. */
^
kprobe_send.c:37:11: warning: implicit declaration of function 'rand' is invalid in C99 [-Wimplicit-function-declaration]
int c = randrange(MAX-i);
^
kprobe_send.c:11:22: note: expanded from macro 'randrange'
#define randrange(N) rand() / (RAND_MAX/(N) + 1)
^
kprobe_send.c:51:22: warning: implicit declaration of function 'PT_REGS_PARM2' is invalid in C99 [-Wimplicit-function-declaration]
char *ptr = PT_REGS_PARM2(ctx);
^
kprobe_send.c:51:15: warning: incompatible integer to pointer conversion initializing 'char *' with an expression of type 'int' [-Wint-conversion]
char *ptr = PT_REGS_PARM2(ctx);
^ ~~~~~~~~~~~~~~~~~~
kprobe_send.c:61:22: warning: incompatible integer to pointer conversion passing 'int' to parameter of type 'void *' [-Wint-conversion]
bpf_map_update_elem(fd,&key,&data,BPF_ANY);
^~
Error at line 37: Unsupport signed division for DAG: 0x18aff58: i64 = sdiv 0x18af668, 0x18afe20, kprobe_send.c:37:11Please convert to unsigned div/mod.
fatal error: error in backend: Cannot select: 0x18aff58: i64 = sdiv 0x18af668, 0x18afe20, kprobe_send.c:37:11
0x18af668: i64 = sra 0x189bcd8, Constant:i64<32>, kprobe_send.c:37:11
0x189bcd8: i64 = shl 0x189c4f8, Constant:i64<32>, kprobe_send.c:37:11
0x189c4f8: i64,ch,glue = CopyFromReg 0x189c018, Register:i64 $r0, 0x189c018:1, kprobe_send.c:37:11
0x189c150: i64 = Register $r0
0x189c018: ch,glue = callseq_end 0x189bc08, TargetConstant:i64<0>, TargetConstant:i64<0>, 0x189bc08:1, kprobe_send.c:37:11
0x189bfb0: i64 = TargetConstant<0>
0x189bfb0: i64 = TargetConstant<0>
What does the above error even mean? I thought it was complaining that I had not included some header, so I started including tracing.h to provide PT_REGS_PARM2(ctx).
How can I get rid of this error?
This line says that signed division is unsupported:
Error at line 37: Unsupport signed division for DAG: 0x18aff58: i64 = sdiv 0x18af668, 0x18afe20, kprobe_send.c:37:11Please convert to unsigned div/mod.
Does line 37 refer to the assembly or to my source file? On line 37 of the source file I have
int c = randrange(MAX-i);
Why is such a simple line not allowed in an eBPF program? Here is the rest of the BPF program, which contains that line:
#include <linux/ptrace.h>
#include <linux/version.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <string.h>
#include <sys/sendfile.h>
#include <time.h>
//#include <bpf/tracing.h>
#include <stdlib.h>
#define RAND_MAX 0x7fff
#define PERF_SAMPLE_RAW 1U << 0
#define randrange(N) rand() / (RAND_MAX/(N) + 1)
#define MAX 100000000 /* Values will be in the range (1 .. MAX) */
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(int));
__uint(max_entries, 100);
} my_map SEC(".maps");
SEC("kprobe/__x64_sys_recvfrom")
int bpf_prog1(struct pt_regs *ctx,int fd, const char *buf, size_t count)
{
static int vektor[100000000];
int candidates[MAX];
int i;
long key;
srand(time(NULL)); /* Seed the random number generator. */
for (i=0; i<MAX; i++)
candidates[i] = i;
for (i = 0; i < MAX-1; i++) {
int c = randrange(MAX-i);
int t = candidates[i];
candidates[i] = candidates[i+c];
candidates[i+c] = t;
}
for (i=0; i<10; i++)
vektor[i] = candidates[i] + 1;
struct S {
int pid;
char cookie[90];
char *ptr;
} data={1,""};
char *ptr = PT_REGS_PARM2(ctx);
//data.pid =count;// bpf_get_current_pid_tgid();
//if(buf==NULL)
//memcpy(data.cookie,buf,20);
data.ptr=ptr;
// data.cookie[0]=buf[0];
//bpf_get_current_comm(&data.cookie, sizeof(data.cookie));
key=vektor[i];
bpf_map_update_elem(fd,&key,&data,BPF_ANY);
//bpf_perf_event_output(ctx, &my_map, 1, &data, sizeof(data));
return 0;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = 99;
Is there a source that documents what is and isn't allowed? From looking into eBPF, it seems to me that the restrictions are picked arbitrarily.
My kernel: 5.14.1
My clang: 12
libbpf is installed, but I'd love to find a command that reports its version, or at least where it is installed.
One link I found says this may be (or definitely is, I couldn't tell which) a symlink problem on Ubuntu. What does that mean? Did installing libbpf or something else on Ubuntu break a symlink that helps locate the libbpf headers? How should I read this comment? https://github.com/iovisor/kubectl-trace/issues/76#issuecomment-513587108

Weird pointer conversion in C

I'm having trouble writing my garbage collector in C. Here is a minimal, verifiable example.
The first file is in charge of dealing with the virtual machine:
#include <stdlib.h>
#include <stdint.h>
typedef int32_t value_t;
typedef enum {
Lb, Lb1, Lb2, Lb3, Lb4, Lb5,
Ib, Ob
} reg_bank_t;
static value_t* memory_start;
static value_t* R[8];
value_t* engine_get_Lb(void) { return R[Lb]; }
value_t engine_run() {
memory_start = memory_get_start();
for (reg_bank_t pseudo_bank = Lb; pseudo_bank <= Lb5; ++pseudo_bank)
R[pseudo_bank] = memory_start + (pseudo_bank - Lb) * 32;
value_t* block = memory_allocate();
}
Then I have the actual garbage collector, the minimized code is:
#include <stdlib.h>
#include <stdint.h>
typedef int32_t value_t;
static value_t* memory_start = NULL;
void memory_setup(size_t total_byte_size) {
memory_start = calloc(total_byte_size, 1);
}
void* memory_get_start() { return memory_start; }
void mark(value_t* base){
value_t vbase = 0;
}
value_t* memory_allocate() {
mark(engine_get_Lb());
return engine_get_Lb();
}
Finally, minimal main is:
int main(int argc, char* argv[]) {
memory_setup(1000000);
engine_run();
return 0;
}
The problem I'm getting with gdb is that if I print engine_get_Lb() I get the address (value_t *) 0x7ffff490a800 while when printing base inside of the function mark I get the address (value_t *) 0xfffffffff490a800.
Any idea why this is happening?
Complementary files that may help
The makefile
SHELL=/bin/bash
SRCS=src/engine.c \
src/main.c \
src/memory_mark_n_sweep.c
CFLAGS_COMMON=-std=c11 -fwrapv
CLANG_SAN_FLAGS=-fsanitize=address
# Clang warning flags
CLANG_WARNING_FLAGS=-Weverything \
-Wno-format-nonliteral \
-Wno-c++98-compat \
-Wno-gnu-label-as-value
# Flags for debugging:
CFLAGS_DEBUG=${CFLAGS_COMMON} -g ${CLANG_SAN_FLAGS} ${CLANG_WARNING_FLAGS}
# Flags for maximum performance:
CFLAGS_RELEASE=${CFLAGS_COMMON} -O3 -DNDEBUG
CFLAGS=${CFLAGS_DEBUG}
all: vm
vm: ${SRCS}
mkdir -p bin
clang ${CFLAGS} ${LDFLAGS} ${SRCS} -o bin/vm
File with instructions .asm
5c190000 RALO(Lb,25)
value_t* memory_allocate() {
mark(engine_get_Lb());
return engine_get_Lb();
}
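engine_get_Lb is not declared before use in this file, so the compiler assumes it returns int, per an antiquated and dangerous rule of the C language. The returned pointer is therefore truncated to 32 bits and then sign-extended, which is how 0x7ffff490a800 turns into 0xfffffffff490a800. Implicit function declarations were removed from the standard in C99, although many compilers still accept them with only a warning.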
Create a header file with declarations of all your global functions, and #include it in all your source files.
Your compiler should have at least warned you about this error at its default settings. If it did, you should have read and completely understood the warnings before continuing. If it didn't, consider an upgrade. If you cannot upgrade, permanently add -Wall -Wextra -Werror to your compilation flags. Consider also -Wpedantic and -std=c11.

Function is returning wrong type C

I have a problem with my function waveres. It is supposed to return an amplitude as a float, but it does not; it returns a seemingly random large number. I think the declaration in my header file is not "seen" by the main function. The other functions work, so I did not include them. When waveres runs, it prints the correct values of amp.
Header file
#include <stdio.h>
#include <math.h>
#include <time.h> /* for the random function */
#include <stdlib.h>
void faseforskyvning(float epsi[]);
float waveres(float S[],float w[],float *x, float *t, float epsi[]);
void lespar(float S[], float w[]);
Main program
#include "sim.h"
main()
{
float epsi[9], t = 1.0, x = 1.0;
float S[9], w[9];
float amp;
faseforskyvning(epsi);
lespar(S,w);
amp=waveres(S,w,&x,&t,epsi);
printf("%f\n", amp);
}
waveres:
#include "sim.h"
float waveres(float S[],float w[],float *x, float *t, float epsi[])
{
float amp = 0, k;
int i;
for(i=0;i<10;i++)
{
k = pow(w[i],2)/9.81;
amp = amp + sqrt(2*S[i]*0.2)*cos(w[i]*(*t)+k*(*x)+epsi[i]);
printf("%f\n",amp);
}
return(amp);
}
Sample output, where the last two numbers are supposed to be the same:
0.000000
0.261871
3.750682
3.784552
3.741382
3.532950
3.759173
3.734213
3.418669
3.237864
1078933760.000000
One source of the error might be that I am compiling it wrong. Here is the output from the build:
make
gcc -c -o test.o test.c
gcc -c -o faseforskyvning.o faseforskyvning.c
gcc -c -o waveres.o waveres.c
gcc -c -o lespar.o lespar.c
gcc test.o faseforskyvning.o waveres.o lespar.o -o test -lm -E
gcc: warning: test.o: linker input file unused because linking not done
gcc: warning: faseforskyvning.o: linker input file unused because linking not done
gcc: warning: waveres.o: linker input file unused because linking not done
gcc: warning: lespar.o: linker input file unused because linking not done
You have undefined behavior: the loop iterates up to index 9
for(i=0;i<10;i++)
but your arrays have size 9, which means the largest valid index is 8
float epsi[9], t = 1.0, x = 1.0;
float S[9], w[9];
You need to change your loop to
for(i=0;i<9;i++)
Also, your arrays are not initialized, which provokes undefined behavior as well. For example
float w[9]={0};
initializes all elements of array w to 0.

Why does this code cause a Floating point exception - SIGFPE

Using gcc 4.7:
$ gcc --version
gcc (GCC) 4.7.0 20120505 (prerelease)
Code listing (test.c):
#include <stdint.h>
struct test {
int before;
char start[0];
unsigned int v1;
unsigned int v2;
unsigned int v3;
char end[0];
int after;
};
int main(int argc, char **argv)
{
int x, y;
x = ((uintptr_t)(&((struct test*)0)->end)) - ((uintptr_t)(&((struct test*)0)->start));
y = ((&((struct test*)0)->end)) - ((&((struct test*)0)->start));
return x + y;
}
Compile & execute
$ gcc -Wall -o test test.c && ./test
Floating point exception
The SIGFPE is caused by the second assignment (y = ...). In the assembly listing, there is a division on this line? Note that the only difference between x= and y= is casting to (uintptr_t).
Disregarding the undefined behaviour due to the violation of constraints in the standard, what gcc does here is calculate the difference between two pointers to char[0], &(((struct test*)0)->start) and &(((struct test*)0)->end), and divide that difference by the size of a char[0]. That size is 0, so you get a division by 0.

What is the difference between `cc -std=c99` and `c99` on Mac OS?

Given the following program:
/* Find the sum of all the multiples of 3 or 5 below 1000. */
#include <stdio.h>
unsigned long int method_one(const unsigned long int n);
int
main(int argc, char *argv[])
{
unsigned long int sum = method_one(1000000000);
if (sum != 0) {
printf("Sum: %lu\n", sum);
} else {
printf("Error: Unsigned Integer Wrapping.\n");
}
return 0;
}
unsigned long int
method_one(const unsigned long int n)
{
unsigned long int i;
unsigned long int sum = 0;
for (i=1; i!=n; ++i) {
if (!(i % 3) || !(i % 5)) {
unsigned long int tmp_sum = sum;
sum += i;
if (sum < tmp_sum)
return 0;
}
}
return sum;
}
On a Mac OS system (Xcode 3.2.3) if I use cc for compilation using the -std=c99 flag everything seems just right:
nietzsche:problem_1 robert$ cc -std=c99 problem_1.c -o problem_1
nietzsche:problem_1 robert$ ./problem_1
Sum: 233333333166666668
However, if I use c99 to compile it this is what happens:
nietzsche:problem_1 robert$ c99 problem_1.c -o problem_1
nietzsche:problem_1 robert$ ./problem_1
Error: Unsigned Integer Wrapping.
Can you please explain this behavior?
c99 is a wrapper around gcc. It exists because POSIX requires it. c99 generates a 32-bit (i386) binary by default.
cc is a symlink to gcc, so it takes whatever default configuration gcc has. gcc produces a binary in native architecture by default, which is x86_64.
unsigned long is 32 bits wide on i386 on OS X and 64 bits wide on x86_64. Therefore the c99 build hits the "Unsigned Integer Wrapping" error, which the cc -std=c99 build does not.
You can force c99 to generate a 64-bit binary on OS X with the -W 64 flag:
c99 -W 64 problem_1.c -o problem_1
(Note: by gcc I mean the actual gcc binary like i686-apple-darwin10-gcc-4.2.1.)
Under Mac OS X, cc is a symlink to gcc (which defaults to 64-bit), and c99 is not (it defaults to 32-bit).
/usr/bin/cc -> gcc-4.2
And they use different default byte-sizes for data types.
/** sizeof.c
*/
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
printf("sizeof(unsigned long int)==%d\n", (int)sizeof(unsigned long int));
return EXIT_SUCCESS;
}
cc -std=c99 sizeof.c
./a.out
sizeof(unsigned long int)==8
c99 sizeof.c
./a.out
sizeof(unsigned long int)==4
Quite simply, you are overflowing (aka wrapping) your integer variable when using the c99 compiler.
