cl_float4 on host and float4 on device in openCL

I want to define a struct that is usable on both the host and the device in OpenCL and that uses the built-in OpenCL float4 data types.
On the host side, the framework provides a cl_float4 type, but on the device it is just float4.
So if I create a struct like this...
typedef struct
{
cl_float4 a, b;
} MyStruct;
...and then try to pass that struct into a kernel (via a buffer) I get an error.
If I declare it as follows...
typedef struct
{
float4 a,b;
} MyStruct;
...that would work on the device but doesn't work on the host.
So is there a way to use OpenCL's built-in vector types on both sides of my program within the same structs?

The C preprocessor can help you here, by treating the code differently depending on whether it is being compiled on the host or the device.
Here are some possible solutions:
typedef struct
{
#ifdef __OPENCL_C_VERSION__
float4
#else
cl_float4
#endif
a, b;
} MyStruct;
or:
#ifdef __OPENCL_C_VERSION__
typedef float4 cl_float4;
#endif
typedef struct
{
cl_float4 a, b;
} MyStruct;
or:
#ifndef __OPENCL_C_VERSION__
typedef cl_float4 float4;
#endif
typedef struct
{
float4 a, b;
} MyStruct;
or just use cl_float4, and compile the OpenCL code like this:
clBuildProgram(program, 1, &device, "-Dcl_float4=float4", NULL, NULL);

I'd avoid this approach unless you're very careful with your struct definition. The data structure alignment rules of your OpenCL device's architecture may well differ from those of your OpenCL host's architecture. See the Wikipedia article on data structure alignment for an overview.
TL;DR: the size of your struct may vary from device to host, as might the offset of each member from the beginning of the struct. If this happens, your program will break. Even if you get away with this on your current host/device combination, it's not guaranteed to work on other hardware combinations.
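One way to catch such a mismatch early is to pin the expected layout down with compile-time checks on the host side. The following is only a sketch under stated assumptions: host_float4 is a hypothetical stand-in for cl_float4 (assumed here to be four floats with 16-byte alignment) so that the example compiles without the OpenCL headers, and layout_matches is an illustrative helper, not part of any OpenCL API.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical stand-in for cl_float4: four floats, 16-byte aligned. */
typedef struct {
    _Alignas(16) float s[4];
} host_float4;

typedef struct {
    host_float4 a, b;
} MyStruct;

/* Fail the host build if the layout is not the one the kernel code expects. */
static_assert(sizeof(MyStruct) == 32, "MyStruct must be 32 bytes");
static_assert(offsetof(MyStruct, a) == 0, "a must start the struct");
static_assert(offsetof(MyStruct, b) == 16, "b must sit at offset 16");

/* Runtime helper: returns 1 if the host layout matches the given numbers,
   for example numbers reported back by a small test kernel. */
static int layout_matches(size_t size, size_t offset_b)
{
    return sizeof(MyStruct) == size && offsetof(MyStruct, b) == offset_b;
}
```

A struct made only of float4 members tends to be safe because every member has the same size and alignment; the checks matter most once smaller members such as int or char are mixed in.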

Related

Import struct from shared library at runtime - without header file at compile time

I got curious when I tried to create a general purpose socket .so library.
I have a platform defined struct and a function that looks like:
## == sock.c
#ifdef __unix__
typedef int SOCKET;
#else
typedef struct UNISock {
_IN_ int af;
_IN_ int type;
_IN_ int protocol;
} SOCKET;
#endif
SOCKET socket_connect(char * hostname, int portnumber) {
#ifdef __unix__
SOCKET connfd = 0;
#else
SOCKET connfd = INVALID_SOCKET; // Windows struct eq of "0"
#endif
...
return connfd;
}
Which I then try to import and use from application.c.
## == application.c
typedef void* (*_func)();
int main(int argc, char *argv[]) {
void* lib;
_func socket_connect;
_func init;
lib = dlopen("./socket.so", RTLD_NOW|RTLD_GLOBAL);
*(void**)(&socket_connect) = dlsym(lib, "socket_connect");
*(void**)(&init) = dlsym(lib, "init");
SOCKET connection = socket_connect("127.0.0.1", 1337);
}
Obviously I'm not following C best practices here; normally you'd place these structs and definitions in a .h file and all would be well.
But I started to wonder whether I could produce this struct on the fly and somehow set it up in application.c without a header file being included at compile time.
Either by doing something similar to extern typedef int SOCKET; and using dlsym() to load it.
(extern will fail with "multiple storage classes in declaration specifiers", obviously, but it illustrates the idea I'm trying to achieve.)
Or by having an init() function in my socket library that returns sizeof(SOCKET), so that application.c could malloc() a placeholder of however many bytes SOCKET needs and handle it manually.
I don't know anyone who has explored these things in C, and searching online gives no information on the topic except links to different virus categories on Wikipedia, or threads that avoid return values from library functions completely.
So I'm asking a vague question in the hope that someone knows how to:
Keep/Expose the struct SOCKET symbol to the linker/compiler by adding it to the symbol tree, and somehow use dlsym() or similar to set it up.
Get the SOCKET information from socket.so to application.c so that application.c can store the return value of socket_connect without having to clutter the main code with typedefs for SOCKET, since that's already done once in socket.c. And without a socket.h, because the goal is to work around the need for a header file.
Any alternative ways to handle return values from library functions without pre-defining the potentially hundreds of struct/typedef definitions in the main application code.
I get that this is leaning towards OOP and that, again, it's not traditional C. It's both ugly and not something any sane person would do since C is a . And I apologize if this question is borderline off-topic or offends every programmer ever.

I don't think dlopen has anything to do with this... If you want a platform-independent socket module, try something like this:
/* ownsocket.h */
/* opaque type-definition */
typedef struct OwnSocket OwnSocket;
/* constructors */
OwnSocket *newServerSocket (int port);
OwnSocket *newClientSocket (const char *ptnhost, int ptnport);
OwnSocket *acceptClient (OwnSocket *server);
/* other constructors */
...
/* destructor */
void closeSocket (OwnSocket *);
/* other operations */
and then:
/* ownsocket.c */
#include "ownsocket.h"
struct OwnSocket {
int type; /* -1/0/1/2 = closed/client/listening_server/sub_server */
...
#ifdef __unix__
int handle;
#else
SOCKET handle;
#endif
...
};
/* implementation of functions */
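To make the opaque-type idea above concrete, here is a minimal sketch of how the two halves fit together. It is an assumption-laden illustration: no real connection is made, the handle field is a plain int placeholder rather than the platform-specific member, and socketType is a hypothetical accessor that is not part of the header shown above.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what ownsocket.h would expose: an incomplete (opaque) type --- */
typedef struct OwnSocket OwnSocket;
OwnSocket *newClientSocket(const char *host, int port);
int socketType(const OwnSocket *s);   /* hypothetical accessor */
void closeSocket(OwnSocket *s);

/* --- what ownsocket.c would contain: the only place the layout is known --- */
struct OwnSocket {
    int type;    /* -1/0/1/2 = closed/client/listening_server/sub_server */
    int port;
    int handle;  /* placeholder; a real build would store the OS socket here */
};

OwnSocket *newClientSocket(const char *host, int port)
{
    OwnSocket *s;

    if (host == NULL || port <= 0)
        return NULL;
    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->type = 0;     /* client */
    s->port = port;
    s->handle = -1;  /* no real connection is made in this sketch */
    return s;
}

/* Callers query state through functions, never through the fields. */
int socketType(const OwnSocket *s)
{
    return s != NULL ? s->type : -1;
}

void closeSocket(OwnSocket *s)
{
    free(s);
}
```

Because the library allocates the object and hands back only a pointer, the application never needs sizeof(OwnSocket) or the member list, which is what the question was trying to obtain at runtime.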

Understanding per cpu structure in linux kernel

I am studying the Linux kernel, and sometimes I don't understand what the kernel developers intend with a particular piece of code. I was reading about timers in the kernel; a timer is created using a struct timer_list variable, which contains a pointer to a per-CPU variable. To understand this per-CPU variable better, I looked at how these things are created in the Linux kernel.
So I have collected the relevant structures from the kernel and listed the #defines to put things together and get a clear picture of what is actually happening.
The structure where all this started:
struct timer_list {
/*
* All fields that change during normal runtime grouped to the
* same cacheline
*/
struct list_head entry;
unsigned long expires;
struct tvec_base *base;//pointer to per cpu variable, so I checked what is inside
void (*function)(unsigned long);
unsigned long data;
int slack;
#ifdef CONFIG_TIMER_STATS
int start_pid;
void *start_site;
char start_comm[16];
#endif
#ifdef CONFIG_LOCKDEP
struct lockdep_map lockdep_map;
#endif
};
The base pointer points to a struct like this:
struct tvec_base {
spinlock_t lock;
struct timer_list *running_timer;
unsigned long timer_jiffies;
unsigned long next_timer;
unsigned long active_timers;
struct tvec_root tv1;
struct tvec tv2;
struct tvec tv3;
struct tvec tv4;
struct tvec tv5;
} ____cacheline_aligned;//??why such a name __cacheline_aligned
struct tvec_base boot_tvec_bases;
EXPORT_SYMBOL(boot_tvec_bases);
static DEFINE_PER_CPU(struct tvec_base *, tvec_bases) = &boot_tvec_bases;//from here on I am a little puzzled by the way things are written and assigned.
DEFINE_PER_CPU looks like it should be a simple #define to understand; I wish it were:
#define DEFINE_PER_CPU(type, name) \
DEFINE_PER_CPU_SECTION(type, name, "")
#define DEFINE_PER_CPU_SECTION(type, name, sec) \//?? what exactly we achieve here
__PCPU_DUMMY_ATTRS char __pcpu_scope_##name; \
extern __PCPU_DUMMY_ATTRS char __pcpu_unique_##name; \
__PCPU_DUMMY_ATTRS char __pcpu_unique_##name; \
__PCPU_ATTRS(sec) PER_CPU_DEF_ATTRIBUTES __weak \
__typeof__(type) name
#define __PCPU_DUMMY_ATTRS \
__attribute__((section(".discard"), unused))//I think it is a section in the map file, but the kernel is already built and I am building a timer module, so what does it do?
If anyone has good experience with Linux internals, can you point me in the right direction?
Some specific questions which, if answered, would help me understand the whole thing:
static DEFINE_PER_CPU(struct tvec_base *, tvec_bases) = &boot_tvec_bases;
1) What does this mean? Where does the address &boot_tvec_bases go?
2) Why was the name ____cacheline_aligned chosen? Does it do anything special?
3) In #define DEFINE_PER_CPU_SECTION(type, name, sec), what is sec?

1. timer.h:36: extern struct tvec_base boot_tvec_bases;
2. See "____cacheline_aligned_in_smp for structure in linux":
____cacheline_aligned instructs the compiler to instantiate a struct or variable at an address corresponding to the beginning of an L1
cache line, for the specific architecture, i.e., so that it is L1
cache-line aligned. ____cacheline_aligned_in_smp is similar, but is
actually L1 cache-line aligned only when the kernel is compiled in SMP
configuration (i.e., with option CONFIG_SMP). These are defined in
file include/linux/cache.h
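As a userspace illustration of what the quoted passage describes, the sketch below applies the GCC/Clang aligned attribute to a struct, which is the kind of attribute the kernel macro wraps. The 64-byte line size is an assumption (the kernel takes the real value from the architecture), and struct counters and is_cache_line_aligned are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed; the real value is architecture-specific */

/* Every object of this type starts on a cache-line boundary,
   and its size is rounded up to a whole number of lines. */
struct counters {
    unsigned long hits;
    unsigned long misses;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct counters global_counters;

/* Returns 1 if p starts exactly on a cache-line boundary. */
static int is_cache_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE_SIZE) == 0;
}
```

So the name is only a naming convention for an alignment attribute: it changes where the linker places the object, not how the struct is used.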
3. http://www.makelinux.net/ldd3/chp-8-sect-5
Per-CPU variables are an interesting 2.6 kernel feature. When you
create a per-CPU variable, each processor on the system gets its own
copy of that variable. This may seem like a strange thing to want to
do, but it has its advantages. Access to per-CPU variables requires
(almost) no locking, because each processor works with its own copy.
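The idea in the quote can be imitated in plain C with an array that has one slot per CPU. This is only an analogy under stated assumptions, not the kernel's mechanism: the kernel places per-CPU data in a dedicated section and replicates it per processor, whereas here NR_CPUS_SKETCH, struct percpu_counter, counter_inc and counter_sum are all hypothetical names and the CPU index is passed in by hand.

```c
#include <assert.h>

#define NR_CPUS_SKETCH  4   /* hypothetical CPU count */
#define CACHE_LINE_SIZE 64  /* assumed cache-line size */

/* One slot per CPU, each on its own cache line to avoid false sharing. */
struct percpu_counter {
    unsigned long value;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct percpu_counter counters[NR_CPUS_SKETCH];

/* Each CPU touches only its own slot, so the update needs no lock. */
static void counter_inc(int cpu)
{
    counters[cpu].value++;
}

/* Reading the total has to visit every CPU's copy. */
static unsigned long counter_sum(void)
{
    unsigned long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += counters[cpu].value;
    return sum;
}
```

In the question's code, each CPU's copy of tvec_bases is a pointer that initially holds &boot_tvec_bases, in the same way that each slot here starts out with the same initial value.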

IAR Pragma data_alignment not working

I am trying to compile the libvpx library (webm decoder by google) with IAR embedded workbench for an ARM-A7 (bare metal application).
I managed to pull in all the necessary files, and it compiles, but there is a problem with the data alignment of some variables.
In the library, there is a macro DECLARE_ALIGNED() that expands to the GNU C __attribute__((aligned(n))) attribute. I think I managed to get this macro to work with the IAR equivalent (#pragma data_alignment), but I get the following warning:
"Warning [Pe609]: this kind of pragma may not be used here"
and when I run the code, my variables are not aligned!
When I search for the warning online, the advice is that this pragma cannot be used in a type definition, only when creating a variable of that type. However, for data alignment you need to apply it when defining the struct (and GCC allows it, so why wouldn't IAR?).
Any help would be appreciated!
CODE
Macro Definitions:
#if (defined(__GNUC__) && __GNUC__) || defined(__SUNPRO_C)
#define DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#elif defined(__ICCARM__)
#define CONCAT(a,b) a##b
#define DECLARE_ALIGNED(n, typ, val) CONCAT(DECLARE_ALIGNED_,n) (typ,val)
#define DECLARE_ALIGNED_1(typ, val) _Pragma("data_alignment=1") typ val
#define DECLARE_ALIGNED_8(typ, val) _Pragma("data_alignment=8") typ val
#define DECLARE_ALIGNED_16(typ, val) _Pragma("data_alignment=16") typ val
#define DECLARE_ALIGNED_32(typ, val) _Pragma("data_alignment=32") typ val
#define DECLARE_ALIGNED_256(typ, val) _Pragma("data_alignment=256") typ val
#else
#warning No alignment directives known for this compiler.
#define DECLARE_ALIGNED(n, typ, val) typ val
#endif
Example where used:
typedef struct VP9Decoder {
DECLARE_ALIGNED(16, MACROBLOCKD, mb);
DECLARE_ALIGNED(16, VP9_COMMON, common);
int ready_for_new_data;
int refresh_frame_flags;
...
} VP9Decoder;

I have tried this directly in my IAR compiler (7.40.6) and it works fine:
#define CONCAT(a,b) a##b
#define DECLARE_ALIGNED(n, typ, val) CONCAT(DECLARE_ALIGNED_,n) (typ,val)
#define DECLARE_ALIGNED_4(typ, val) _Pragma("data_alignment=4") typ val
typedef struct
{
int a;
char b;
char pad1;
char pad2;
char pad3;
int c;
char d;
} myType;
void main( void)
{
DECLARE_ALIGNED(4, myType, data);
// So data.a will be aligned to a 4-byte boundary
// data.b will be aligned to four bytes
// data.pad1, pad2, pad3 are wasted space.
// data.c will be aligned to four bytes
// data.d will be aligned to four bytes
}
Unless you need your struct members in a specific order (for example, when mapping the struct onto something), ordering them carefully can reduce the struct's size. For example, the padding I inserted in this case would quite likely be inserted by the compiler anyway. A better order would be int a, int c, char b, char d: the original struct is probably 16 bytes long due to padding and alignment, whereas the reordered one could be only 12.
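The size difference can be checked directly with sizeof and offsetof. This sketch assumes a 32-bit integer with 4-byte alignment, as on typical ARM and x86 targets, and uses int32_t to make that explicit; the type names Unordered and Reordered and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original member order: the compiler pads after b and after d. */
typedef struct {
    int32_t a;
    char    b;
    int32_t c;
    char    d;
} Unordered;

/* Largest members first: only tail padding remains. */
typedef struct {
    int32_t a;
    int32_t c;
    char    b;
    char    d;
} Reordered;

static size_t unordered_size(void) { return sizeof(Unordered); }
static size_t reordered_size(void) { return sizeof(Reordered); }
```

With the assumed 4-byte alignment, c lands at offset 8 in the first struct because three padding bytes follow b, which is where the extra four bytes of total size come from.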

C - Struct across multiple files

I'm having a problem reading and writing to structs across multiple files. Essentially I need to write to variables within a struct that are later checked during a timer interrupt. When this happens, the new timer value is taken from a member of that struct. At the moment I hard-code these timer values in a while(1) loop, but later they will come from some algorithm. I'm not quite sure whether I'm reading the struct members correctly. The project compiles, but at runtime it sets the timers to random values. However, debugging with GDB confirmed that they are correct.
If I set the timer values directly, everything works fine.
This is an embedded project on an ARM cortex M4.
I have a structure defined in types.h
#ifndef __TYPES_H
#define __TYPES_H
typedef struct { uint32_t a; uint32_t b; uint32_t c;} myStruct;
#endif
Then in main.c
#include <types.h>
myStruct hello;
int main(void){
while(1){
hello.a = 10;
hello.b = 43;
hello.c = 98;
}
}
Then in interrupt.c
#include <types.h>
myStruct hello;
int count = 0;
void timer_IRQHandler(void){
if(interrupt != RESET){
switch(count){
case 0:
timerSet(hello.a); // if i just put a number here, it works fine
count++;
break;
case 1:
timerSet(hello.b);
count++;
break;
case 2:
timerSet(hello.c);
count++;
break;
}
resetInterrupt();
}
}
--- SOLUTION ---
Ok I've figured it out, could do with moving things around but otherwise it works like this:
types.h
#ifndef __TYPES_H
#define __TYPES_H
typedef struct { uint32_t a; uint32_t b; uint32_t c;} myStruct;
#endif
Then in main.c
#include <types.h>
myStruct volatile hello = {10,10,10};
int main(void){
while(1){
hello.a = 10;
hello.b = 43;
hello.c = 98;
}
}
Then in interrupt.c
#include <types.h>
extern volatile myStruct hello; /* qualifiers must match the definition in main.c */
int count = 0;
void timer_IRQHandler(void){
if(interrupt != RESET){
switch(count){
case 0:
timerSet(hello.a); // if i just put a number here, it works fine
count++;
break;
case 1:
timerSet(hello.b);
count++;
break;
case 2:
timerSet(hello.c);
count++;
break;
}
resetInterrupt();
}
}
The extern seems to have solved the problem of sharing the struct across different files, and the initial value declaration of {10,10,10} has, I think, solved some memory allocation problem. The code compiles without the initializer, but does not hold correct values. I don't know what volatile does yet; it makes no difference if I remove it. Something I shall read up on.

Declare it in the header:
extern myStruct hello;
and define it in only one .c file:
myStruct hello;

Are you having a problem with the compiler optimizing the accesses in the while loop? Since the compiler doesn't know that another thread of execution is looking at the values of hello, maybe it's just not writing them. Try adding volatile to your "extern myStruct hello", or look at the mixed assembly output and see whether it writes through to the hello struct.

You are in effect declaring the struct twice (with the same name) which is why your code is not working. Define it once (say in main) and then use extern to reference it in your other files.
extern myStruct hello;

It appears to me that your timer_IRQHandler is called in the kernel as an interrupt routine. If true, main is probably never called.
I would try to statically initialize your struct, rather than rely on main to initialize it. For example:
myStruct hello = { 10, 43, 98 };
And yes, if multiple files reference it, you should declare it as extern in the header and define/initialize it only once in a source file.
As for volatile, that's for device registers or memory-mapped addresses that might not give the same answer if read twice, and more generally for any object that can change outside the compiler's view, such as a variable shared with an interrupt handler. It tells the compiler not to optimize away accesses to that memory location.
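Putting the advice from these answers together, the arrangement could look like the sketch below. The three files are collapsed into one listing so that it compiles on its own; timerSet here is a hypothetical stand-in that just records its argument, and the handler omits the asker's interrupt flag handling.

```c
#include <assert.h>
#include <stdint.h>

/* --- types.h: the type plus ONE declaration, visible to every file --- */
typedef struct { uint32_t a; uint32_t b; uint32_t c; } myStruct;
extern volatile myStruct hello;   /* declaration only, reserves no storage */

/* --- main.c: the single definition, statically initialized --- */
volatile myStruct hello = { 10, 43, 98 };

/* --- interrupt.c: relies on the declaration from types.h --- */
static int count = 0;
static uint32_t last_timer_value = 0;

/* Hypothetical stand-in for the asker's timerSet(): records the value. */
static void timerSet(uint32_t v)
{
    last_timer_value = v;
}

void timer_IRQHandler(void)
{
    switch (count) {
    case 0: timerSet(hello.a); break;
    case 1: timerSet(hello.b); break;
    case 2: timerSet(hello.c); break;
    default: return;              /* nothing left to load */
    }
    count++;
}
```

The volatile qualifier appears on both the declaration and the definition; the two must agree, otherwise the files disagree about the object's type.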

Managing and handling multiple abstract data types in C with type safety?

I have several hardware signals that are toggled depending on properties that define the scenarios in which each signal may be toggled. The problem is that the signals, the properties, and the scenarios could all change. This pushes me towards a modular, framework-based design in which a SignalManager handles signal creation, and a SignalPropertiesData with its SignalPropertiesDataManager associates a SignalScenario structure with it; all of this is created for any type of signal by the SignalManager. I wish to follow the "public interface, private data" paradigm in C.
My dilemma is type safety in C: for this kind of problem, the only solution seems to be to give up type safety and use void * for any and all types of data. Can you point me to any code or component in the vast open-source sea that could serve as a good reference for this problem?
signal_manager.h:
#ifdef _SIGNAL_MANAGER_H
#define _SIGNAL_MANAGER_H
int createSignal(SignalDescPtr signalDescPtr);
int destroySignal();
typedef struct SignalDesc* SignalDescPtr;
#endif
signal_manager.c:
#include "signal_manager.h"
typedef struct {
char* signalName;
unsigned int signalId;
SignalPropertiesDataPtr signalProperties;
} SignalDesc;
signal_properties_data.h:
#ifdef _SIGNAL_PROPERTIES_DATA
#define _SIGNAL_PROPERTIES_DATA
typedef enum {
SIGNAL_DATA_INT_TYPE,
SIGNAL_DATA_UNSIGNED_INT_TYPE,
SIGNAL_DATA_FLOAT_TYPE,
:
SIGNAL_DATA_UNSPECIFIED_BASIC_TYPE
} eSignalBasicType;
typedef enum {
SIGNAL_DATA_LIST_ARRAY_TYPE,
SIGNAL_DATA_LIST_ADT_TYPE,
:
:
SIGNAL_DATA_LIST_UNSPECIFIED_TYPE
} eSignalComplexType;
typedef union {
eSignalBasicType signalBasicType;
eSignalComplexType signalComplexType;
} eSignalType;
typedef struct {
eSignalType signalType;
unsigned int signalDataLen;
} SignalDataValueType;
typedef SignalPropertiesData* SignalPropertiesDataPtr;
result_t setSignalType(..);
result_t getSignalType(..);
result_t setSignalData(..);
result_t getSignalData(..);
result_t setSignalDataLen(..);
result_t getSignalDataLen(..);
#endif
signal_properties_data.c:
#include "signal_properties_data.h"
typedef struct {
SignalDataValueType signalPropertiesDataType;
void* signalPropertiesDataValue;
} SignalPropertiesData;
signal_properties_data_mgr.h:
#ifdef _SIGNAL_PROPERTIES_DATA_MGR_H
#define _SIGNAL_PROPERTIES_DATA_MGR_H
#include "signal_properties_data.h"
#include "signal_scenario.h"
typedef SignalScenarioDesc* SignalScenarioDescPtr;
result_t createSignalPropertiesData(SignalPropertiesDataPtr *signalPropDataPtr, eSignalType desiredSignalType);
result_t freeSignalPropertiesData(..);
result_t associateSignalToggleScenario(SignalPropertiesDataPtr *signalPropDataPtr, SignalScenPtr signalScenPtr);
result_t disassociateSignalToggleScenario(SignalPropertiesDataPtr *signalPropDataPtr, SignalScenarioDescPtr signalScenPtr);
#endif
signal_properties_data_mgr.c:
#include "signal_properties_data_mgr.h"
typedef struct {
toggleFuncPtr fptr;
} SignalScenarioDesc;

Avoid going for void *. It loses the benefits of prototypes and is not necessary.
Since this is C, you should write in signal_manager.h:
#ifndef SIGNAL_MANAGER_H_INCLUDED
#define SIGNAL_MANAGER_H_INCLUDED
typedef struct SignalDesc* SignalDescPtr;
int createSignal(SignalDescPtr signalDescPtr);
int destroySignal(void);
#endif
Changes:
Critical: the idiom is #ifndef MACRO / #define MACRO / #endif. You used #ifdef which won't work.
Place typedef before it is used.
Add explicit (void) to make destroySignal(void) into a prototype. Your version simply says 'there is a function destroySignal() that returns an int but it takes an unspecified (but not variadic) argument list'.
Do not use reserved name space (leading underscore, capital letter) for the header protection guard.
I prefer not to hide pointers in data type typedefs, so I'd probably write:
#ifndef SIGNAL_MANAGER_H_INCLUDED
#define SIGNAL_MANAGER_H_INCLUDED
typedef struct SignalDesc SignalDesc;
extern int createSignal(SignalDesc *sigdesc);
extern int destroySignal(void);
#endif /* SIGNAL_MANAGER_H_INCLUDED */
I'm not sure that the interfaces to the create and destroy are correct, but that's another subject for discussion. I'd normally expect to find that the interfaces are more like:
extern SignalDesc *createSignal(const char *name, int signum);
extern void destroySignal(SignalDesc *sigdesc);
The implementation file signal_manager.c would then define the structure type and use it; the external interface is through functions that work with pointers.
#include "signal_manager.h"
#include "signal_properties_data.h"
struct SignalDesc
{
char *signalName;
unsigned int signalId;
SignalPropertiesData *signalProperties;
};
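As a rough sketch of how the suggested create/destroy interface could be implemented behind the opaque type, consider the following. It leaves out the signalProperties member to stay self-contained, and getSignalName is a hypothetical accessor added only so that callers have something to query; neither choice comes from the original design.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* --- public part (signal_manager.h): incomplete type plus functions --- */
typedef struct SignalDesc SignalDesc;
SignalDesc *createSignal(const char *name, int signum);
const char *getSignalName(const SignalDesc *sigdesc);  /* hypothetical */
void destroySignal(SignalDesc *sigdesc);

/* --- private part (signal_manager.c): only this file sees the members --- */
struct SignalDesc {
    char *signalName;
    unsigned int signalId;
};

SignalDesc *createSignal(const char *name, int signum)
{
    SignalDesc *sd;
    size_t len;

    if (name == NULL || signum < 0)
        return NULL;
    sd = malloc(sizeof *sd);
    if (sd == NULL)
        return NULL;
    len = strlen(name) + 1;
    sd->signalName = malloc(len);      /* keep a private copy of the name */
    if (sd->signalName == NULL) {
        free(sd);
        return NULL;
    }
    memcpy(sd->signalName, name, len);
    sd->signalId = (unsigned int)signum;
    return sd;
}

const char *getSignalName(const SignalDesc *sigdesc)
{
    return sigdesc->signalName;
}

void destroySignal(SignalDesc *sigdesc)
{
    if (sigdesc != NULL) {
        free(sigdesc->signalName);
        free(sigdesc);
    }
}
```

Consumer code can only hold a SignalDesc pointer and call these functions; any attempt to dereference the pointer fails to compile because the type is incomplete outside signal_manager.c.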
The signal_properties_data.h needs similar cleaning up:
#ifndef SIGNAL_PROPERTIES_DATA_H_INCLUDED
#define SIGNAL_PROPERTIES_DATA_H_INCLUDED
typedef enum {
SIGNAL_DATA_INT_TYPE,
SIGNAL_DATA_UNSIGNED_INT_TYPE,
SIGNAL_DATA_FLOAT_TYPE,
:
SIGNAL_DATA_UNSPECIFIED_BASIC_TYPE
} eSignalBasicType;
typedef enum {
SIGNAL_DATA_LIST_ARRAY_TYPE,
SIGNAL_DATA_LIST_ADT_TYPE,
:
:
SIGNAL_DATA_LIST_UNSPECIFIED_TYPE
} eSignalComplexType;
typedef union {
eSignalBasicType signalBasicType;
eSignalComplexType signalComplexType;
} eSignalType;
typedef struct {
eSignalType signalType;
unsigned int signalDataLen;
} SignalDataValueType;
/* Huge hole in types here - or is SignalValueDataType what you're after? */
typedef struct SignalPropertiesData SignalPropertiesData;
result_t setSignalType(..);
result_t getSignalType(..);
result_t setSignalData(..);
result_t getSignalData(..);
result_t setSignalDataLen(..);
result_t getSignalDataLen(..);
#endif /* SIGNAL_PROPERTIES_DATA_H_INCLUDED */
This revised header is not self-contained yet. It does not define result_t, so it should include the header that does. I assume the .. notation in the function declarations means "do not want to specify in this question", because it isn't valid C; even a real ellipsis (...) must be preceded by at least one argument of known type.
It is not clear from the functions in the header why the user of the header needs to know about the types that are defined in it. Think of a header as a resource that will be shared. It 'belongs to' or describes the external interface to a file (or a small set of files). It gives the users the minimum data that they need to use the facilities provided by the code, but no more than the minimum. You sometimes find that a set of files implementing a facility will need a private header to share between themselves as well as the public header that other code ('customer' code or 'consumer' code) uses.
The key to opaque types is that the consumer doesn't need to know the internal details of a structure to be able to use pointers to the structure type (see Which part of the C Standard allows this code to compile?, Does the C standard consider that there are one or two struct uperms_entry types in this header? and How to structure header files in C for some more insight).
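On the earlier point about avoiding void *: one common alternative for the "value of several possible types" part of the design is a tagged union, where the tag records which member is valid and accessors check it. The sketch below is an illustration with hypothetical names (ValueKind, SignalValue, valueFromInt, valueFromFloat, valueGetInt); it is not taken from the question's code.

```c
#include <assert.h>

typedef enum { VAL_INT, VAL_UINT, VAL_FLOAT } ValueKind;

typedef struct {
    ValueKind kind;        /* says which union member is valid */
    union {
        int          i;
        unsigned int u;
        float        f;
    } as;
} SignalValue;

static SignalValue valueFromInt(int v)
{
    SignalValue s;
    s.kind = VAL_INT;
    s.as.i = v;
    return s;
}

static SignalValue valueFromFloat(float v)
{
    SignalValue s;
    s.kind = VAL_FLOAT;
    s.as.f = v;
    return s;
}

/* Returns 1 and stores the value only if the tag matches; otherwise
   returns 0 and leaves *out untouched. */
static int valueGetInt(const SignalValue *s, int *out)
{
    if (s->kind != VAL_INT)
        return 0;
    *out = s->as.i;
    return 1;
}
```

The compiler still cannot stop someone from reading the wrong union member directly, so this gives checked access rather than full static type safety; combining it with an opaque struct hides the union from consumers altogether.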

I am not sure that I fully understand what your needs are, but I think that you could look at WIN32 handles for a reference implementation.
A short example would be to define a macro that lets you define custom handles like this:
/* example.h */
#ifndef _EXAMPLE_H
#define _EXAMPLE_H
/* Define macro */
#define DECLARE_HANDLE(HandleName) typedef struct HandleName##Tag * HandleName
/* Declare some handles */
DECLARE_HANDLE(SomeHandle);
DECLARE_HANDLE(SomeOtherHandle);
#endif /* _EXAMPLE_H */
/* example.c */
#include "example.h"
struct SomeHandleTag {
int foo;
};
struct SomeOtherHandleTag {
int foo;
};
