I want to read the values of all memory locations in the program flash memory of an MCU, specifically the CC2538 on the OpenMote-CC2538. For now, the values read are simply added up into one large sum.
At the moment, I have the following code that traverses the memory and reads the values:
uint64_t readMemory() {
unsigned char * bytes = (char *) 0x200000;
size_t size = 0x0007FFD4;
size_t i;
uint64_t amount = 0;
for (i = 0; i < size; i++) {
amount += bytes[i];
}
return amount;
}
uint64_t readFlashMemory() {
unsigned int * bytes = (int *) 0x200000;
size_t size = 0x0007FFD4;
size_t i;
uint64_t amount = 0;
for (i = 0; i < size; i+=4) {
amount += FlashGet(bytes);
bytes++;
}
return amount;
}
The program flash starts at address 0x200000 and its size is 0x0007FFD4. The first function uses a char pointer and visits each address one by one, while the second one uses an existing function FlashGet(uint32_t) from the flash.c file, which is a direct access to a register (HWREG).
FlashGet requires a uint32_t address and returns a uint32_t value; as such it reads 4 bytes at a time, and the address should advance by 4 in the loop. The first function uses char for the addressing, which has a length of 1, so the address should advance by 1 in the loop. Am I correct in these statements? If so, am I executing them correctly? For the second function, incrementing the pointer by 1 should move it by 4 bytes, because it points to a 4-byte type (unsigned int, similar to uint32_t).
However, the functions return a different value.
The first one returns: 674426297757
The second one returns: 8213668631160
As both functions should be doing the same thing, one or both must be incorrect and not reading the entire program flash memory.
How can I fix both functions? Is there a better or easier way to read the entire memory when you have the starting address and size?
Consider you have a 4-byte flash memory with content
00 01 02 03
Adding by byte values will give you 0x0000000000000006.
Adding by 32-bit int values will give you 0x0000000003020100 assuming little-endian.
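The difference can be reproduced on any host with a small sketch. The helper names sum_bytes and sum_words_le are made up for illustration, and the words are assembled explicitly in little-endian order so the result does not depend on the machine running it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum every byte individually (what the char-based loop does). */
uint64_t sum_bytes(const uint8_t *p, size_t len)
{
    uint64_t total = 0;
    for (size_t i = 0; i < len; i++)
        total += p[i];
    return total;
}

/* Sum 32-bit little-endian words (what a word-based read does).
 * len is assumed to be a multiple of 4. */
uint64_t sum_words_le(const uint8_t *p, size_t len)
{
    uint64_t total = 0;
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i += 4) {
        uint32_t w = (uint32_t)p[i]
                   | ((uint32_t)p[i + 1] << 8)
                   | ((uint32_t)p[i + 2] << 16)
                   | ((uint32_t)p[i + 3] << 24);
        total += w;
    }
    return total;
}
```

For the bytes 00 01 02 03, sum_bytes returns 6 and sum_words_le returns 0x03020100, so the two totals are not expected to match even when both loops cover the whole range.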
I have a problem that I'm having difficulty solving.
I have a union that contains a buffer with a struct mapping the bits of the buffer. Something along the lines of (it is pragma packed of course):
union {
    uint32_t buf[512];
    struct {
        uint8_t pad[256];
        uint32_t data[256];
    };
};
The buf[] part is intended to be passed to the Linux SPI driver as a receive buffer. The issue I'm having is that, depending on my transmits, the size of the padding I receive back is variable, and because of this it isn't straightforward to access using the union.
What I need to do is pass buf[] at a specific index to the SPI driver, i.e. the Rx buffer begins at buf[128] instead of buf[0]. The index isn't always the same, so I have an equation that tells me where the start point needs to be, which is &(buf[0 + padmax - padsize]), and which should result in an address between buf[0] and buf[256]. However, the issue is that the SPI driver expects the transfer buffer argument to contain a pointer to a buffer, and passing it the address directly isn't giving me what I want.
I have also tried assigning the address from the above expression to a pointer and passing that to the rxbuffer part of the spi struct, and again it doesn't give me what I want.
Is it possible to create an array that is a subset of another array, starting at a specified address of the outer array? I think this may solve my problem, but I'm also afraid of the memory implications of that.
The reason is most likely that you're calculating the address in 32-bit units (in units of buf elements), not bytes as you expect, based on the arithmetic.
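To make the unit mismatch concrete, here is a minimal sketch. The names word_index_to_bytes and byte_offset_ptr are hypothetical helpers, and buf stands in for the uint32_t buffer from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_WORDS 512

static uint32_t buf[BUF_WORDS];

/* Bytes between &buf[0] and &buf[index]: the index counts 32-bit elements. */
size_t word_index_to_bytes(size_t index)
{
    assert(index <= BUF_WORDS);
    return (size_t)((unsigned char *)&buf[index] - (unsigned char *)&buf[0]);
}

/* Address that lies byte_offset bytes into buf, independent of element type. */
void *byte_offset_ptr(size_t byte_offset)
{
    assert(byte_offset <= sizeof buf);
    return (unsigned char *)buf + byte_offset;
}
```

With these, word_index_to_bytes(128) is 512, not 128, so an offset computed in padding bytes has to be applied through a byte pointer, as byte_offset_ptr does, rather than used as an index into buf.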
Let's simplify the situation, and say the structure is just
#define MAX_PAD 256
#define MAX_DATA 256
struct spi_data {
uint8_t pad[MAX_PAD];
uint32_t data[MAX_DATA];
};
and that you want to implement a function similar to
size_t spi_recv(int fd, struct spi_data *ref, size_t pad, size_t data)
where
fd is the file descriptor to read() from
ref is a pointer to the struct spi_data to be used
pad is the number of padding entries filled at the end of the ref->pad[] array
data is the number of entries filled at the beginning of the ref->data[] array
the return value is the number of data entries received (completely filled)
Consider the following (argument checks for fd == -1, ref == NULL, pad > MAX_PAD, data > MAX_DATA omitted for simplicity):
size_t spi_recv(int fd, struct spi_data *ref, size_t pad, size_t data)
{
ssize_t n;
n = read(fd, &ref->pad[sizeof ref->pad / sizeof ref->pad[0] - pad],
pad * sizeof ref->pad[0] + data * sizeof ref->data[0]);
if (n == -1) {
/* Error; errno already set */
return 0;
} else
if (n < 0) {
/* Should never occur, but let's be paranoid */
errno = EIO;
return 0;
} else
if (n < pad * sizeof ref->pad[0]) {
/* Only partial padding received */
errno = 0;
return 0;
} else {
/* Zero or more data words received */
errno = 0;
return (n - pad * sizeof ref->pad[0]) / sizeof ref->data[0];
}
}
The pointer to the last pad elements of padding is
&ref->pad[sizeof ref->pad / sizeof ref->pad[0] - pad]
which is essentially equivalent to &(ref->pad[MAX_PAD - pad]), except that instead of the MAX_PAD macro, we use (sizeof ref->pad)/(sizeof ref->pad[0]) to evaluate the number of members declared for the ref->pad[] array. (This only works if ref->pad is an array; it does not work if it is a pointer.)
As usual, read() takes the number of bytes -- not elements of ref->pad or ref->data -- as a parameter, so we need to multiply the element counts by their respective element sizes in bytes; thus, the number of bytes in pad elements of padding and data elements of data is pad * sizeof ref->pad[0] + data * sizeof ref->data[0].
Since the function returns the number of complete data words received, the number of padding bytes must be subtracted from the return value, then divided by the data element type (integer division rounding down), to get the number of complete data words.
I don't think the above interface is optimal, however. I particularly dislike the possibility of the SPI transfer ending with a partial word; the above interface does not let the caller detect such a situation reliably.
If you use spidev, the ioctl() interface would be much better to use. For one, you could use SPI_IOC_MESSAGE(2) to read the padding and the data into separate buffers, or even SPI_IOC_MESSAGE(3) to write a command, followed by a read to the padding buffer and another to the data buffer. The Linux-Sunxi Wiki page has a pretty simple example of this kind of usage here, except that it uses single reads instead of reading padding into a separate buffer. However, it should be quite simple to extend the examples to do that.
I have the following .bin file
1f ac 00 78 00 3f 00 c3 00 83....
and I'm supposed to go through it using pointer arithmetic. I'm supposed to grab the first byte that will tell me how many "words" I'm going to process, then every two bytes will tell me the offset where I'm supposed to begin reading. My problem is that I get the first byte without any problem, but now all I'm trying to do is to increase my pointer so that it points to ac, cast it to a uint16_t, print that value out, do some procedures and now I want it to point to 78. Here's what I have written so far:
Pre: Buffer points to a region of memory formatted as specified.
Log points to an opened text file.
Post: The target of Buffer has been parsed and report written as specified
uint8_t doStuff(uint8_t *Buffer, FILE *Log) // given function parameters
{
int wordsToProcess = *(Buffer); // get that first byte
uint16_t offset = 0;
bool firstTime = true;
for (int i = 0; i < wordsToProcess; i++)
{
if (firstTime)
{
Buffer++; // I've tried Buffer += 1;
offset = *((uint16_t*)Buffer); // casting turns into little endian.
// I want 00 ac but I'm not getting that
fprintf(Log, "Looking for %0X words, starting at %0X\n",
wordsToProcess, offset);
firstTime = false;
}
else
{
Buffer += 2;
offset = *((uint16_t*)Buffer);
}
}
}
I've even deleted almost everything from the hexdump except the first two bytes and I still get 66. I've also tried by making a pointer have the same address as Buffer and go from there because I thought maybe playing around with Buffer was causing me problems, but same deal. Could anyone help me figure out what I'm doing wrong?
I don't know what might be the problem, because it might be that the data is not what you expect it to be. But your code has some issues IMHO.
This version would be more readable and easier to understand and maintain (and thus easier to write with fewer bugs).
uint8_t
doStuff(uint8_t *buffer, FILE *log)
{
    int wordCount = *buffer++;               /* first byte: number of words */
    uint16_t *pointer = (uint16_t *) buffer; /* view the rest as 16-bit values */
    uint16_t offset = *pointer++;            /* next two bytes: starting offset */
    pointer += offset;
    for (int i = 0; i < wordCount; i++)
        fprintf(log, "0x%04X ", *pointer++);
    return 0; /* I don't know what you want to return */
}
Please note that if the data does not have the expected structure, this code would cause undefined behavior.
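Reading through a uint16_t pointer at an odd address also depends on the host's alignment rules and byte order. A minimal sketch that avoids both by assembling the value from individual bytes is shown here; read_u16_le and read_u16_be are hypothetical helper names, and which byte order the file format actually uses is an assumption to check against its specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit value whose low byte is stored first (little-endian). */
uint16_t read_u16_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 16-bit value whose high byte is stored first (big-endian). */
uint16_t read_u16_be(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

For the bytes 1f ac 00 78 from the question, read_u16_le(buffer + 1) yields 0x00AC and read_u16_be(buffer + 1) yields 0xAC00, regardless of the machine the code runs on.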
I have an issue with passing data between arrays for processing that I can't seem to iron out. (I'm running the code on a Nios II processor)
HAL Type Definitions:
alt_u8 : Unsigned 8-bit integer.
alt_u32 : Unsigned 32-bit integer.
The core in my FPGA takes in 128 bits at a time for data processing. I have this working in my original code by passing 4 x 32-bit unsigned ints to the function:
alt_u32 load[4] = {0x10101010, 0x10101010, 0x10101010, 0x10101010};
The function processes this data and using another array I retrieve the info.
data_setload(&context,&load); //load data
data_process(&context); //process
memcpy(resultdata,context.result,4*sizeof(unsigned int));
for(i=0; i<4 ; i++){
printf("received 0x%X \n",resultdata[i]); //print to screen
}
The above works perfectly, but when I try to combine it with the second part it does not work.
I have a buffer used to store data:
alt_u8 rbuf[512];
When the data buffer becomes full I'm trying to transfer the contents of 'rbuf' to the array 'load'. The main problem is load[4] takes 4 by 32 bit unsigned int for processing. So I want to 'fill up' these 4 by 32 bit unsigned int with data from rbuf, process the data and save the result to an array. Then loop again and fill the array load[4] with the next set of data (from rbuf) and continue until rbuf is empty. (and pad with zeros if necessary)
alt_u8 rbuf[512];
alt_u8 store[512];
alt_u32 resultdata[512];
alt_u32 *reg;
int d, k, j;
for (j=0; j<512; j++){
read_byte(&ch); //gets data
rbuf[j]=ch; //stores to array rbuf
}
printf(" rbuf is full \n");
memcpy(store,rbuf,512*sizeof(alt_u8)); //store gets the value in rbuf.
for(k=0;k<16;k++) //for loop used take in 4 chars to one unsigned 32 bit int
{
for(d=0;d<4;d++) //store 4 chars into an one 32 bit unsigned int
{
*reg = (*reg<<8 | store[d]) ;
}
reg =+1; //increment pointer to next address location(not working properly)
} //loop back
reg = 0; //set pointer address back to 0
for(j=0;j<16;j++) //trying to process data from here
{
memcpy(load,reg,4*sizeof(alt_u32)); //copy first 4 locations from 'reg' to 'load'
data_setload(&context,&load); //pass 'load' to function
data_process(&context); //process 128 bits
memcpy(resultdata,context.result,4*sizeof(alt_u32)); //results copied to 'resultdata'
*reg = *reg + 4; //increment pointer address by 4?
*resultdata = *resultdata+4; //increment resultdata address by 4 and loop again
}
/** need to put data back in char form for displaying***/
for(k=0;k<16;k++) //for loop used take chars from 32 unsigned int
{
for(d=4;d>=0;d--) //loads 4 chars FROM A 32 unsigned int
{
store[d] = *resultdata;
*resultdata = *resultdata>>8;
}
resultdata =+1; //increment pointer next address location
}
for(d=0; d<512 ; d++){
printf("received 0x%X ",store[d]);
The end goal is to take:
Array_A of unsigned 8 bit copy it into an Array_B[4] of unsigned 32 bit >> Process the Array_B[4] with my HDL code. It requires the input to be 128bits.
Then loop back and take the next 128 bits and process them.
reg is defined but not initialized, so it does not point to valid memory, and you are trying to write a value through it (*reg assigns a value, reg assigns an address).
Also, the k-d loop is wrong. If you got reg initialized correctly, then a really easy way to do that is:
for(k=0;k<16;k += 4) //copy four bytes at a time as one 32-bit unsigned int
{
*reg = *((alt_u32*)&store[k]);
reg++;
}
That loop takes the four integers stored as bytes at the beginning of store and copies them to where reg is pointing.
I'm nearly sure that's not what you want to achieve, but it is what your code was trying to do. If you want to copy all of store to where reg points, then you can do this:
for(k=0;k<512;k += 4) //copy four bytes at a time as one 32-bit unsigned int
{
*reg = *((alt_u32*)&store[k]);
reg++;
}
That will copy all the values stored in store to where reg points.
Also, a better, faster, and cleaner way:
memcpy(reg, store, 512);
reg += 512 / sizeof(alt_u32);
Finally, if you just want to fill load with the first four integers, then you can do that:
for(k = 0; k < 4; k++)
{
load[k] = *((alt_u32*)&rbuf[k * 4]);
}
or
memcpy(&load, &rbuf, 4 * sizeof(alt_u32));
Then you don't need store for anything.
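The question also asks for zero padding when fewer than 16 bytes remain. A minimal sketch of that step follows, assuming plain uint8_t and uint32_t in place of alt_u8 and alt_u32; fill_block is a hypothetical helper, not part of the HAL. As with the cast-based copies above, the bytes land in each word in the host's native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy up to 16 bytes starting at src[offset] into four 32-bit words,
 * zero-filling whatever is missing. Returns the number of bytes copied. */
size_t fill_block(uint32_t load[4], const uint8_t *src, size_t len, size_t offset)
{
    size_t n = 0;

    assert(load != NULL && src != NULL);
    memset(load, 0, 4 * sizeof load[0]);  /* zero padding for a short tail */
    if (offset < len) {
        n = len - offset;
        if (n > 16)
            n = 16;
        memcpy(load, src + offset, n);    /* byte-wise copy, no alignment issue */
    }
    return n;
}
```

Calling fill_block(load, rbuf, 512, j) with j stepping by 16 would hand one 128-bit block per iteration to the processing function; using memcpy instead of a pointer cast avoids relying on the alignment of the byte buffer.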
Finally, here is a fully rewritten function with minimal memory usage and the best performance:
alt_u8 rbuf[512];
alt_u32 resultdata[128]; //fixed its size to 128, (512 / sizeof(alt_u32))
int j;
//Do the loop to load data in rbuf
for (j=0; j<512; j++)
read_byte(&rbuf[j]);
printf(" rbuf is full \n");
//Loop through rbuf in 4 * 32 bits per iteration (4*4 bytes)
for(j = 0; j < 512; j+= sizeof(alt_u32) * 4)
{
data_setload(&context, (alt_u32*)&rbuf[j]); //I assume this function expects an alt_u32 pointer to 4 alt_u32 values
data_process(&context);
memcpy(&resultdata[j / sizeof(alt_u32)], context.result, sizeof(alt_u32) * 4);//I assume context.result is a pointer, if not then add & before it
}
//Print received data
for(j=0; j<128; j++){
printf("received 0x%X ", (unsigned int)resultdata[j]);
}
I'm trying to fill zero regions in a matrix using memset() in this way:
unsigned short *ptr;
for(int i=0; i < nRows; ++i)
{
ptr = DepthMat.ptr<unsigned short>(i); /* OCV matrix of uint16 values */
for(int j=0; j<nCols; ++j)
{
uint n=0;
if(ptr[j] == 0) /* zero region found */
{
d_val = ptr[j-1]; /* saves last non-zero value */
while(ptr[j+n] == 0)
{ /* looks for non zero */
++n;
}
d_val = (d_val < ptr[j+n] ? d_val : ptr[j+n]);
memset( ptr+j, d_val, n*sizeof( ptr[0]) );
j += n;
}
}
}
I look for sequences of zero, then I store the positions (ptr+j-1 and ptr+j+n) and the values of the zero regions boundaries, and finally I use memset() to replace the zeros with d_val.
The problem is that when I check the values stored they don't match with d_val, for example, I put the value '222' but I get '57054'.
Any clue?
The value argument to memset() is only a single byte, even though the type for the argument is int.
The manual page describes the function as:
memset - fill memory with a constant byte
So, no more than the least-significant 8 bits of d_val will be written to memory. Since you're treating the memory as an array of short, you get "mangled" values that consist of the same byte repeated through the bytes of the short.
In ... short, don't do this; use a for loop to do a repeated write of actual shorts.
memset writes one char at a time, while you're accessing them as short. 57054 = 222*256+222
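A loop-based fill, as suggested above, might look like the following sketch. fill_u16 and memset_pattern_u16 are hypothetical names, and the second helper only illustrates what memset leaves behind in each 16-bit element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value into count consecutive 16-bit elements. */
void fill_u16(uint16_t *dst, uint16_t value, size_t count)
{
    assert(dst != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
}

/* The 16-bit pattern produced when memset repeats one byte twice. */
uint16_t memset_pattern_u16(uint8_t byte)
{
    return (uint16_t)(byte * 0x0101u);
}
```

In the question's loop, memset(ptr+j, d_val, n*sizeof(ptr[0])) would become fill_u16(ptr + j, d_val, n). memset_pattern_u16(222) evaluates to 57054, which is the value observed.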
I am having trouble understanding the output of the following simple CUDA code. All that the code does is allocate two integer arrays: one on the host and one on the device each of size 16. It then sets the device array elements to the integer value 3 and then copies these values into the host_array where all the elements are then printed out.
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
int num_elements = 16;
int num_bytes = num_elements * sizeof(int);
int *device_array = 0;
int *host_array = 0;
// malloc host memory
host_array = (int*)malloc(num_bytes);
// cudaMalloc device memory
cudaMalloc((void**)&device_array, num_bytes);
// Constant out the device array with cudaMemset
cudaMemset(device_array, 3, num_bytes);
// copy the contents of the device array to the host
cudaMemcpy(host_array, device_array, num_bytes, cudaMemcpyDeviceToHost);
// print out the result element by element
for(int i = 0; i < num_elements; ++i)
printf("%i\n", *(host_array+i));
// use free to deallocate the host array
free(host_array);
// use cudaFree to deallocate the device array
cudaFree(device_array);
return 0;
}
The output of this program is 50529027 printed line by line 16 times.
50529027
50529027
50529027
..
..
..
50529027
50529027
Where did this number come from? When I replace 3 with 0 in the cudaMemset call then I get correct behaviour. i.e.
0 printed line by line 16 times.
I compiled the code with nvcc test.cu on Ubuntu 10.10 with CUDA 4.0
I'm no cuda expert but 50529027 is 0x03030303 in hex. This means cudaMemset sets each byte in the array to 3 and not each int. This is not surprising given the signature of cuda memset (to pass in the number of bytes to set) and the general semantics of memset operations.
Edit: As to your (I guess) implicit question of how to achieve what you intended I think you have to write a loop and initialize each array element.
As others have pointed out, cudaMemset works like the standard C memset: it sets byte values. From the CUDA documentation:
cudaError_t cudaMemset( void * devPtr, int value, size_t count)
Fills the first count bytes of the memory area pointed to by devPtr
with the constant byte value value.
If you want to set word size values, the best solution is to use your own memset kernel, perhaps something like this:
template<typename T>
__global__ void myMemset(T * x, T value, size_t count )
{
size_t tid = threadIdx.x + blockIdx.x * blockDim.x;
size_t stride = blockDim.x * gridDim.x;
for(size_t i=tid; i<count; i+=stride) {
x[i] = value;
}
}
which could be launched with enough blocks to cover the number of MP in your GPU, and each thread will do as many iterations as required to fill the memory allocation. Writes will be coalesced, so performance shouldn't be too bad. This could also be adapted to CUDA's vector types, if you so desired.
memset sets bytes, and an integer is 4 bytes, so what you get is 50529027 decimal, which is 0x03030303 in hex. In other words, you are using it wrong, and it has nothing to do with CUDA.
This is a classic memset shortcoming: it only produces the expected value for data types with an 8-bit size, i.e. char. Here it writes the value 3 to every byte of the memory range. You can confirm this with a simple C++ program:
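#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main ()
{
    int x=16;
    size_t bytes = x*sizeof(int);
    int *M = (int*)malloc(bytes);
    memset(M,3,bytes);   /* sets every byte, not every int, to 3 */
    for (int i = 0; i < x; ++i) {
        printf("%d\n", M[i]);
    }
    free(M);
    return 0;
}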
The usual case in which memset works regardless of data type is when you set it to 0 (it sets every byte to 0 and hence all data to 0). If you change the data type to char, you'll see the desired output. cudaMemset is an exact counterpart of memset, the only difference being that it takes a GPU pointer as input.
So memset or cudaMemset sets every byte of the memory range given by the third argument to the value passed (in your case 3), regardless of the data type.
Tip:
Google: 50529027 in binary and you'll get the answer :)