Short version: Is there an official/correct way to query for the size of strings like CL_PLATFORM_VENDOR?
Long version:
Looking at OpenCL functions like clGetPlatformInfo I want to find out how much space to allocate for the result. The function has this signature:
cl_int clGetPlatformInfo(cl_platform_id platform,
cl_platform_info param_name,
size_t param_value_size,
void *param_value,
size_t *param_value_size_ret)
The docs say:
param_value_size
Specifies the size in bytes of memory pointed to by
param_value. This size in bytes must be greater
than or equal to size of return type specified in the
table below.
All of the return types are listed as char[]. I wanted to know how much space to reserve so I called it like this, where I pass 0 for param_value_size and NULL for param_value, hoping to get the correct size return in param_value_size_ret:
size_t size = 0;
l_success = clGetPlatformInfo(platform_id,
CL_PLATFORM_VENDOR, 0, NULL, &size);
if( l_success != CL_SUCCESS)
{
printf("Failed getting vendor name size.\n");
return -1;
}
printf("l_success = %d, size = %d\n", l_success, size);
char* vendor = NULL;
vendor = malloc(size);
if( vendor )
{
l_success = clGetPlatformInfo(platform_id,
CL_PLATFORM_VENDOR, size, vendor, &size);
if( l_success != CL_SUCCESS )
{
printf("Failed getting vendor name.\n");
return -1;
}
printf("Vendor name is '%s', length is %d\n", vendor, strlen(vendor));
} else {
printf("malloc failed.\n");
return -1;
}
It behaved as I had hoped, it return a size of 19 for the string, "NVIDIA Corporation" (size included null terminator) and strlen return 18. Is this the "right" way to query for parameter size or am I just getting lucky with my vendor's implementation? Has anyone seen this idiom fail on some vendor?
Edit: The bit that is tripping me up is this, "This size in bytes must be greater than or equal to size of return type", it seems like when I pass 0 and NULL the call should fail because that's not greater than or equal to the size of the returned value. I'm not sure why they say "return type".
Yes, it is the right way.
In the same doc you've mentioned:
If param_value is NULL, it is ignored.
And below:
Returns CL_SUCCESS if the function is executed successfully. Otherwise, it returns the following:
...
CL_INVALID_VALUE if param_name is not one of the supported values or if size in bytes specified by param_value_size is less than size of return type and param_value is not a NULL value.
So even if it is not stated explicitly, if param_value is NULL no error should be produced, so the code is supposed to work properly.
Here is a piece of code from Khronos OpenCL C++ bindings (specs). They too do it this way, and I think it counts as "official":
// Specialized GetInfoHelper for STRING_CLASS params
template <typename Func>
struct GetInfoHelper<Func, STRING_CLASS>
{
static cl_int get(Func f, cl_uint name, STRING_CLASS* param)
{
::size_t required;
cl_int err = f(name, 0, NULL, &required);
if (err != CL_SUCCESS) {
return err;
}
char* value = (char*) alloca(required);
err = f(name, required, value, NULL);
if (err != CL_SUCCESS) {
return err;
}
*param = value;
return CL_SUCCESS;
}
};
Note in the example that while asking for the value, the forth parameter has to be NULL. Otherwise some platforms answer a CL_INVALID_VALUE as return.
As the upfront query and malloc is somehow over the top, I use one large char buffer as workaround.
char vendor[10240];
l_success = clGetPlatformInfo( platform_id,
CL_PLATFORM_VENDOR, sizeof(vendor), vendor, NULL);
Related
Objective: Retrieve the class name of the current foreground window with plain C.
I have the following code to retrieve the class name:
PWSTR win_class = NULL;
GetClassNameW(hwnd,&win_class,MAX_PATH);
if(win_class != NULL)
free(win_class);
I am getting the following warnings:
warning C4047: 'function': 'LPWSTR' differs in levels of indirection
from 'PWSTR *' warning C4024: 'GetClassNameW': different types
for formal and actual parameter 2
I have two questions: How to solve those warnings, and how should I create an if condition to validate the result of the function GetClassName and set the value of win_class to "NOT FOUND" in case that the function does not find the class name?
The GetClassNameW() function does not allocate the memory needed for the returned class name - you have to do that (or simply provide automatic storage).
To check for success, simply test the return value of the function: if it succeeds, that will be the length of the class name string (in characters); if it fails, the value will be zero.
Here's a short, runnable program that gets the class name for the console window:
#include <Windows.h>
#include <stdio.h>
int main()
{
HWND hwnd = GetConsoleWindow();
wchar_t win_class[_MAX_PATH];
int status = GetClassNameW(hwnd, win_class, _MAX_PATH);
if (!status) wcscpy(win_class, L"NOT FOUND");
printf("%ls\n", win_class);
return 0;
}
When calling GetClassNameW(), you are passing a PWSTR* (wchar_t**) where a LPWSTR (wchar_t*) is expected. That is what the compiler is complaining about.
GetClassName() requires you to pre-allocate a character buffer and pass in a pointer to it, along with the buffer size. The function will not allocate a buffer and return a pointer back to you, like you are expecting. It will merely fill in your provided buffer as needed.
Try something more like this:
WCHAR win_class[256] = {0};
int win_class_len = 0;
HWND hwnd = GetForegroundWindow();
if (!hwnd)
{
// error handling as needed...
lstrcpyW(win_class, L"WINDOW NOT FOUND");
win_class_len = lstrlenW(win_class);
}
else
{
win_class_len = GetClassNameW(hwnd, win_class, 256);
if (win_class_len == 0)
{
DWORD err = GetLastError();
// error handling as needed...
lstrcpyW(win_class, L"CLASS NOT FOUND");
win_class_len = lstrlenW(win_class);
}
}
// use win_class as needed, up to win_class_len characters...
TCHAR is the universal char type working for Unicode and ANSI. GetClassName is the corresponding function replaced with GetClassNameA or GetClassNameW. If you need static string use TEXT() macro like TCHAR * s = TEXT("abc");
TCHAR win_class[MAX_PATH];
int res = GetClassName( hwnd, win_class, MAX_PATH );
if ( res > 0 )
// success
Your error is that win_class was unallocated (NULL) string pointer. GetClassName needs valid memory address of buffer.
You need to check if you function succseed. To do so, print the return value of the function.
Failiure will be presented by the value 0.
My situation is that MmCopyVirtualMemory almost always (%99 of the time) returns STATUS_PARTIAL_COPY.
(Im operating in a Ring0 Driver)
I've tried so many different things, like using different variables sizes and types, different addresses etc... It always returns STATUS_PARTIAL_COPY.
Nothing online has helped either, its not a really common error.
Error description:
{Partial Copy} Due to protection conflicts not all the requested bytes could be copied.
My way of reading a processes memory:
DWORD64 Read(DWORD64 SourceAddress, SIZE_T Size)
{
SIZE_T Bytes;
NTSTATUS Status = STATUS_SUCCESS;
DWORD64 TempRead;
DbgPrintEx(0, 0, "\nRead Address:%p\n", SourceAddress); // Always Prints Correct Address
DbgPrintEx(0, 0, "Read szAddress:%x\n", Size); // Prints Correct Size 8 bytes
Status = MmCopyVirtualMemory(Process, SourceAddress, PsGetCurrentProcess(), &TempRead, Size, KernelMode, &Bytes);
DbgPrintEx(0, 0, "Read Bytes:%x\n", Bytes); // Copied bytes - prints 0
DbgPrintEx(0, 0, "Read Output:%p\n", TempRead); // prints 0 as expected since it failed
if (!NT_SUCCESS(Status))
{
DbgPrintEx(0, 0, "Read Failed:%p\n", Status);
return NULL;
}
return TempRead;
}
Example of how I use it:
NetMan = Read(BaseAddr + NET_MAN, sizeof(DWORD64));
//BaseAddr is a DWORD64 and NetMan is also a DWORD64
I've tripped checked my code to many times, all of it appears to be right.
After investigating MiDoPoolCopy (MmCopyVirtualMemoy calls this) it seems that its failing during the move operation:
// MiDoPoolCopy function
// Probe to make sure that the specified buffer is accessible in
// the target process.
//Wont execute (supplied KernelMode)
if ((InVa == FromAddress) && (PreviousMode != KernelMode)){
Probing = TRUE;
ProbeForRead (FromAddress, BufferSize, sizeof(CHAR));
Probing = FALSE;
}
//Failing either here, copying inside the target process's address space to the buffer
RtlCopyMemory (PoolArea, InVa, AmountToMove);
KeUnstackDetachProcess (&ApcState);
KeStackAttachProcess (&ToProcess->Pcb, &ApcState);
//
// Now operating in the context of the ToProcess.
//
//Wont execute (supplied KernelMode)
if ((InVa == FromAddress) && (PreviousMode != KernelMode)){
Probing = TRUE;
ProbeForWrite (ToAddress, BufferSize, sizeof(CHAR));
Probing = FALSE;
}
Moving = TRUE;
//or failing here - moving from the Target Process to Source (target process->Kernel)
RtlCopyMemory (OutVa, PoolArea, AmountToMove);
Here's the SEH returning STATUS_PARTIAL_COPY (wrapped in try and except)
//(wrapped in try and except)
//
// If the failure occurred during the move operation, determine
// which move failed, and calculate the number of bytes
// actually moved.
//
*NumberOfBytesRead = BufferSize - LeftToMove;
if (Moving == TRUE) {
//
// The failure occurred writing the data.
//
if (ExceptionAddressConfirmed == TRUE) {
*NumberOfBytesRead = (SIZE_T)((ULONG_PTR)(BadVa - (ULONG_PTR)FromAddress));
}
}
return STATUS_PARTIAL_COPY;
MmCopyVirtualMemory (undocumented struct)
NTSTATUS NTAPI MmCopyVirtualMemory
(
PEPROCESS SourceProcess,
PVOID SourceAddress,
PEPROCESS TargetProcess,
PVOID TargetAddress,
SIZE_T BufferSize,
KPROCESSOR_MODE PreviousMode,
PSIZE_T ReturnSize
);
Here is the source for both MmCopyVirtualMemory and MiDoPoolCopy:
https://lacicloud.net/custom/open/leaks/Windows%20Leaked%20Source/wrk-v1.2/base/ntos/mm/readwrt.c
Any help would be greatly appreciated, I've been stuck on this for to long...
I know this is old, but because I had the same issue and fixed it, maybe someone else will stumble on this in the future.
Reading your code the issue is the target address you want to copy the data. You are using a single defined DWORD64, which is when running the function a simple variable (register or on stack) and not a memory region where you can write to.
The solution is to provide a correct buffer for which you allocated memory before with malloc.
Example (you want to read a DWORD64, based on your code with some adjustments):
DWORD64* buffer = malloc(sizeof(DWORD64))
Status = MmCopyVirtualMemory(Process, (void*)SourceAddress, PsGetCurrentProcess(), (void*)buffer, sizeof(DWORD64), UserMode, &Bytes);
Do not forget to free your buffer after using to prevent a memory leak.
Tried to print values of M_MMAP_THRESHOLD and M_ARENA_MAX in a sample c program :
if (mallopt(M_ARENA_MAX, 0) == 0) {
printf("mallopt() 2 failed");
exit(EXIT_FAILURE);
}
if (mallopt(M_MMAP_THRESHOLD, 64) == 0) {
printf("mallopt() 2 failed");
exit(EXIT_FAILURE);
}
p = malloc(1000);
if (p == NULL) {
fprintf(stderr, "malloc() failed");
exit(EXIT_FAILURE);
}
printf("Value for M_MMAP_MAX : %d \n",M_MMAP_MAX);
printf("Value for M_MMAP_THRESHOLD : %d \n",M_MMAP_THRESHOLD);
Output:
Value for M_MMAP_MAX : -4
Value for M_MMAP_THRESHOLD : -3
If you can suggest - how to get values for these macros.
The macros are selectors: the value tells mallopt which option to set. The definition of mallopt (only slightly simplified) is:
int mallopt(int which, int value) {
int result = 0;
internal_lock_malloc_state();
switch (which) {
case M_MMAP_MAX:
result = internal_set_maximum_mmap(value);
break;
case M_MMAP_THRESHOLD:
result = internal_set_threshold(value);
break;
// ...
}
internal_unlock_malloc_state();
return result;
}
The internal functions above are probably actually written out, but that makes no difference here. The important thing is that the macro is just a small integer which indicates which option is to be modified.
Unfortunately, from your perspective, there is no way to examine the current value of these options. In fact, there is not even a guarantee that there is a current value. For example, consider an implementation of malloc which never uses mmap, perhaps because the host system doesn't implement memory mapping. Such an implementation could quite reasonably ignore any attempt to set these options, by replacing the internal_set... functions above with result = 1;.
In short, if you want to query what the current value of a malloc option is, it is up to you to remember the last value you set it to. (And there is no way to get the default value other than by reading the documentation.)
I have an application that builds file path names through a series of string concatenations using pieces of text to create a complete file path name.
The question is whether an approach to handle concatenating a small but arbitrary number of strings of text together depends on Undefined Behavior for success.
Is the order of evaluation of a series of nested functions guaranteed or not?
I found this question Nested function calls order of evaluation however it seems to be more about multiple functions in the argument list rather than a sequence of nesting functions.
Please excuse the names in the following code samples. It is congruent with the rest of the source code and I am testing things out a bit first.
My first cut on the need to concatenate several strings was a function that looked like the following which would concatenate up to three text strings into a single string.
typedef wchar_t TCHAR;
TCHAR *RflCatFilePath(TCHAR *tszDest, int nDestLen, TCHAR *tszPath, TCHAR *tszPath2, TCHAR *tszFileName)
{
if (tszDest && nDestLen > 0) {
TCHAR *pDest = tszDest;
TCHAR *pLast = tszDest;
*pDest = 0; // ensure empty string if no path data provided.
if (tszPath) for (pDest = pLast; nDestLen > 0 && (*pDest++ = *tszPath++); nDestLen--) pLast = pDest;
if (tszPath2) for (pDest = pLast; nDestLen > 0 && (*pDest++ = *tszPath2++); nDestLen--) pLast = pDest;
if (tszFileName) for (pDest = pLast; nDestLen > 0 && (*pDest++ = *tszFileName++); nDestLen--) pLast = pDest;
}
return tszDest;
}
Then I ran into a case where I had four pieces of text to put together.
Thinking through this it seemed that most probably there would also be a case for five that would be uncovered shortly so I wondered if there was a different way for an arbitrary number of strings.
What I came up with is two functions as follows.
typedef wchar_t TCHAR;
typedef struct {
TCHAR *pDest;
TCHAR *pLast;
int destLen;
} RflCatStruct;
RflCatStruct RflCatFilePathX(const TCHAR *pPath, RflCatStruct x)
{
TCHAR *pDest = x.pLast;
if (pDest && pPath) for ( ; x.destLen > 0 && (*pDest++ = *pPath++); x.destLen--) x.pLast = pDest;
return x;
}
RflCatStruct RflCatFilePathY(TCHAR *buffDest, int nLen, const TCHAR *pPath)
{
RflCatStruct x = { 0 };
TCHAR *pDest = x.pDest = buffDest;
x.pLast = buffDest;
x.destLen = nLen;
if (buffDest && nLen > 0) { // ensure there is room for at least one character.
*pDest = 0; // ensure empty string if no path data provided.
if (pPath) for (pDest = x.pLast; x.destLen > 0 && (*pDest++ = *pPath++); x.destLen--) x.pLast = pDest;
}
return x;
}
Examples of using these two functions is as follows. This code with the two functions appears to work fine with Visual Studio 2013.
TCHAR buffDest[512] = { 0 };
TCHAR *pPath = L"C:\\flashdisk\\ncr\\database";
TCHAR *pPath2 = L"\\";
TCHAR *pFilename = L"filename.ext";
RflCatFilePathX(pFilename, RflCatFilePathX(pPath2, RflCatFilePathY(buffDest, 512, pPath)));
printf("dest t = \"%S\"\n", buffDest);
printf("dest t = \"%S\"\n", RflCatFilePathX(pFilename, RflCatFilePathX(pPath2, RflCatFilePathY(buffDest, 512, pFilename))).pDest);
RflCatStruct dStr = RflCatFilePathX(pPath2, RflCatFilePathY(buffDest, 512, pPath));
// other stuff then
printf("dest t = \"%S\"\n", RflCatFilePathX(pFilename, dStr).pDest);
Arguments to a function call are completely evaluated before the function is invoked. So the calls to RflCatFilePath* will be evaluated in the expected order. (This is guaranteed by ยง6.5.2.2/10: "There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call.")
As indicated in a comment, the snprintf function is likely to be a better choice for this problem. (asprintf would be even better, and there is a freely available shim for it which works on Windows.) The only problem with snprintf is that you may have to call it twice. It always returns the number of bytes which would have been stored in the buffer had there been enough space, so if the return value is not less than the size of the buffer, you will need to allocate a larger buffer (whose size you now know) and call snprintf again.
asprintf does that for you, but it is a BSD/Gnu extension to the standard library.
In the case of concatenating filepaths, there is a maximum string length supported by the operating system/file system, and you should be able to find out what it is (although it might require OS-specific calls on non-Posix systems). So it might well be reasonable to simply return an error indication if the concatenation does not fit into a 512-byte buffer.
Just for fun, I include a recursive varargs concatenator:
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
static char* concat_helper(size_t accum, char* chunk, va_list ap) {
if (chunk) {
size_t chunklen = strlen(chunk);
char* next_chunk = va_arg(ap, char*);
char* retval = concat_helper(accum + chunklen, next_chunk, ap);
memcpy(retval + accum, chunk, chunklen);
return retval;
} else {
char* retval = malloc(accum + 1);
retval[accum] = 0;
return retval;
}
}
char* concat_list(char* chunk, ...) {
va_list ap;
va_start(ap, chunk);
char* retval = concat_helper(0, chunk, ap);
va_end(ap);
return retval;
}
Since concat_list is a varargs function, you need to supply (char*)NULL at the end of the arguments. On the other hand, you don't need to repeat the function name for each new argument. So an example call might be:
concat_list(pPath, pPath2, pFilename, (char*)0);
(I suppose you need a wchar_t* version but the changes should be obvious. Watch out for the malloc.) For production purposes, the recursion should probably be replaced by an iterative version which traverses the argument list twice (see va_copy) but I've always been fond of the "there-and-back" recursion pattern.
I have some long source code that involves a struct definition:
struct exec_env {
cl_program* cpPrograms;
cl_context cxGPUContext;
int cpProgramCount;
int cpKernelCount;
int nvidia_platform_index;
int num_cl_mem_buffs_used;
int total;
cl_platform_id cpPlatform;
cl_uint ciDeviceCount;
cl_int ciErrNum;
cl_command_queue commandQueue;
cl_kernel* cpKernels;
cl_device_id *cdDevices;
cl_mem* cmMem;
};
The strange thing, is that the output of my program is dependent on the order in which I declare the components of this struct. Why might this be?
EDIT:
Some more code:
int HandleClient(int sock) {
struct exec_env my_env;
int err, cl_err;
int rec_buff [sizeof(int)];
log("[LOG]: In HandleClient. \n");
my_env.total = 0;
//in anticipation of some cl_mem buffers, we pre-emtively init some. Later, we should have these
//grow/shrink dynamically.
my_env.num_cl_mem_buffs_used = 0;
if ((my_env.cmMem = (cl_mem*)malloc(MAX_CL_BUFFS * sizeof(cl_mem))) == NULL)
{
log("[ERROR]:Failed to allocate memory for cl_mem structures\n");
//let the client know
replyHeader(sock, MALLOC_FAIL, UNKNOWN, 0, 0);
return EXIT_FAILURE;
}
my_env.cpPlatform = NULL;
my_env.ciDeviceCount = 0;
my_env.cdDevices = NULL;
my_env.commandQueue = NULL;
my_env.cxGPUContext = NULL;
while(1){
log("[LOG]: Awaiting next packet header... \n");
//read the first 4 bytes of the header 1st, which signify the function id. We later switch on this value
//so we can read the rest of the header which is function dependent.
if((err = receiveAll(sock,(char*) &rec_buff, sizeof(int))) != EXIT_SUCCESS){
return err;
}
log("[LOG]: Got function id %d \n", rec_buff[0]);
log("[LOG]: Total Function count: %d \n", my_env.total);
my_env.total++;
//now we switch based on the function_id
switch (rec_buff[0]) {
case CREATE_BUFFER:;
{
//first define a client packet to hold the header
struct clCreateBuffer_client_packet my_client_packet_hdr;
int client_hdr_size_bytes = CLI_PKT_HDR_SIZE + CRE_BUFF_CLI_PKT_HDR_EXTRA_SIZE;
//buffer for the rest of the header (except the size_t)
int header_rec_buff [(client_hdr_size_bytes - sizeof(my_client_packet_hdr.buff_size))];
//size_t header_rec_buff_size_t [sizeof(my_client_packet_hdr.buff_size)];
size_t header_rec_buff_size_t [1];
//set the first field
my_client_packet_hdr.std_header.function_id = rec_buff[0];
//read the rest of the header
if((err = receiveAll(sock,(char*) &header_rec_buff, (client_hdr_size_bytes - sizeof(my_client_packet_hdr.std_header.function_id) - sizeof(my_client_packet_hdr.buff_size)))) != EXIT_SUCCESS){
//signal the client that something went wrong. Note we let the client know it was a socket read error at the server end.
err = replyHeader(sock, err, CREATE_BUFFER, 0, 0);
cleanUpAllOpenCL(&my_env);
return err;
}
//read the rest of the header (size_t)
if((err = receiveAll(sock, (char*)&header_rec_buff_size_t, sizeof(my_client_packet_hdr.buff_size))) != EXIT_SUCCESS){
//signal the client that something went wrong. Note we let the client know it was a socket read error at the server end.
err = replyHeader(sock, err, CREATE_BUFFER, 0, 0);
cleanUpAllOpenCL(&my_env);
return err;
}
log("[LOG]: Got the rest of the header, packet size is %d \n", header_rec_buff[0]);
log("[LOG]: Got the rest of the header, flags are %d \n", header_rec_buff[1]);
log("[LOG]: Buff size is %d \n", header_rec_buff_size_t[0]);
//set the remaining fields
my_client_packet_hdr.std_header.packet_size = header_rec_buff[0];
my_client_packet_hdr.flags = header_rec_buff[1];
my_client_packet_hdr.buff_size = header_rec_buff_size_t[0];
//get the payload (if one exists)
int payload_size = (my_client_packet_hdr.std_header.packet_size - client_hdr_size_bytes);
log("[LOG]: payload_size is %d \n", payload_size);
char* payload = NULL;
if(payload_size != 0){
if ((payload = malloc(my_client_packet_hdr.buff_size)) == NULL){
log("[ERROR]:Failed to allocate memory for payload!\n");
replyHeader(sock, MALLOC_FAIL, UNKNOWN, 0, 0);
cleanUpAllOpenCL(&my_env);
return EXIT_FAILURE;
}
if((err = receiveAllSizet(sock, payload, my_client_packet_hdr.buff_size)) != EXIT_SUCCESS){
//signal the client that something went wrong. Note we let the client know it was a socket read error at the server end.
err = replyHeader(sock, err, CREATE_BUFFER, 0, 0);
free(payload);
cleanUpAllOpenCL(&my_env);
return err;
}
}
//make the opencl call
log("[LOG]: ***num_cl_mem_buffs_used before***: %d \n", my_env.num_cl_mem_buffs_used);
cl_err = h_clCreateBuffer(&my_env, my_client_packet_hdr.flags, my_client_packet_hdr.buff_size, payload, &my_env.cmMem);
my_env.num_cl_mem_buffs_used = (my_env.num_cl_mem_buffs_used+1);
log("[LOG]: ***num_cl_mem_buffs_used after***: %d \n", my_env.num_cl_mem_buffs_used);
//send back the reply with the error code to the client
log("[LOG]: Sending back reply header \n");
if((err = replyHeader(sock, cl_err, CREATE_BUFFER, 0, (my_env.num_cl_mem_buffs_used -1))) != EXIT_SUCCESS){
//send the header failed, so we exit
log("[ERROR]: Failed to send reply header to client, %d \n", err);
log("[LOG]: OpenCL function result was %d \n", cl_err);
if(payload != NULL) free(payload);
cleanUpAllOpenCL(&my_env);
return err;
}
//now exit if failed
if(cl_err != CL_SUCCESS){
log("[ERROR]: Error executing OpenCL function clCreateBuffer %d \n", cl_err);
if(payload != NULL) free(payload);
cleanUpAllOpenCL(&my_env);
return EXIT_FAILURE;
}
}
break;
Now what's really interesting is the call to h_clCreateBuffer. This function is as follows
int h_clCreateBuffer(struct exec_env* my_env, int flags, size_t size, void* buff, cl_mem* all_mems){
/*
* TODO:
* Sort out the flags.
* How do we store cl_mem objects persistantly? In the my_env struct? Can we have a pointer int the my_env
* struct that points to a mallocd area of mem. Each malloc entry is a pointer to a cl_mem object. Then we
* can update the malloced area, growing it as we have more and more cl_mem objects.
*/
//check that we have enough pointers to cl_mem. TODO, dynamically expand if not
if(my_env->num_cl_mem_buffs_used == MAX_CL_BUFFS){
return CL_MEM_OUT_OF_RANGE;
}
int ciErrNum;
cl_mem_flags flag;
if(flags == CL_MEM_READ_WRITE_ONLY){
flag = CL_MEM_READ_WRITE;
}
if(flags == CL_MEM_READ_WRITE_OR_CL_MEM_COPY_HOST_PTR){
flag = CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR;
}
log("[LOG]: Got flags. Calling clCreateBuffer\n");
log("[LOG]: ***num_cl_mem_buffs_used before in function***: %d \n", my_env->num_cl_mem_buffs_used);
all_mems[my_env->num_cl_mem_buffs_used] = clCreateBuffer(my_env->cxGPUContext, flag , size, buff, &ciErrNum);
log("[LOG]: ***num_cl_mem_buffs_used after in function***: %d \n", my_env->num_cl_mem_buffs_used);
log("[LOG]: Finished clCreateBuffer with id: %d \n", my_env->num_cl_mem_buffs_used);
//log("[LOG]: Finished clCreateBuffer with id: %d \n", buff_counter);
return ciErrNum;
}
The first time round the while loop, my_env->num_cl_mem_buffs_used is increased by 1. However, the next time round the loop, after the call to clCreateBuffer, the value of my_env->num_cl_mem_buffs_used gets reset to 0. This does not happen when I change the order in which I declare the members of the struct! Thoughts? Note that I've omitted the other case statements, all of which so similar things, i.e. updating the structs members.
Well, if your program dumps the raw memory content of the object of your struct type, then the output will obviously depend on the ordering of the fields inside your struct. So, here's one obvious scenario that will create such a dependency. There are many others.
Why are you surprised that the output of your program depends on that order? In general, there's nothing strange in that dependency. If you base your verdict of on the knowledge of the rest of the code, then I understand. But people here have no such knowledge and we are not telepathic.
It's hard to tell. Maybe you can post some code. If I had to guess, I'd say you were casting some input file (made of bytes) into this struct. In that case, you must have the proper order declared (usually standardized by some protocol) for your struct in order to properly cast or else risk invalidating your data.
For example, if you have file that is made of two bytes and you are casting the file to a struct, you need to be sure that your struct has properly defined the order to ensure correct data.
struct example1
{
byte foo;
byte bar;
};
struct example2
{
byte bar;
byte foo;
};
//...
char buffer[];
//fill buffer with some bits
(example1)buffer;
(example2)buffer;
//those two casted structs will have different data because of the way they are defined.
In this case, "buffer" will always be filled in the same manner, as per some standard.
Of course the output depends on the order. Order of fields in a struct matters.
An additional explanation to the other answers posted here: the compiler may be adding padding between fields in the struct, especially if you are on a 64 bit platform.
If you are not using binary serialization, then your best bet is an invalid pointer issue. Like +1 error, or some invalid pointer arithmetics can cause this. But it is hard to know without code. And it is still hard to know, even with the code. You may try to use some kind of pointer validation/tracking system to be sure.
other guesses
by changing the order you are having different uninitialized values in the struct. A pointer being zero or not zero for example
you somehow manage to overrun an item (by casting ) and blast later items. Different items get blasted depending on order
This may happen if your code uses on "old style" initializer as from C89. For a simple example
struct toto {
unsigned a;
double b;
};
.
.
toto A = { 0, 1 };
If you interchange the fields in the definition this remains a valid initializer, but your fields are initialized completely different. Modern C, AKA C99, has designated initializers for that:
toto A = { .a = 0, .b = 1 };
Now, even when reordering your fields or inserting a new one, your initialization remains valid.
This is a common error which is perhaps at the origin of the initializerphobia that I observe in many C89 programs.
You have 14 fields in your struct, so there is 14! possible ways the compiler and/or standard C library can order them during the output.
If you think from the compiler designer's point of view, what should the order be? Random is certainly not useful. The only useful order is the order in which the struct fields were declared by you.