Alternative to global arrays in C multithreading environment? - c

does anyone know about an elegant (efficient) alternative to using large global arrays in C for an embedded system, whereby the array is written to in an interrupt service routine and it is read elsewhere asynchronously:
I have no issues with the current implementation, however I was just wondering if it is the best option.
for example:
uint8_t array_data[20] = {0};
volatile bool data_ready = false;
someIsr(void){
for(uint8_t i = 0; i < 20; i++){
array_data[i] = some_other_data[i];
}
data_ready = true;
}
main(void){
for(;;){
if(data_ready){
write_data_somewhere(&array_data[0]);
data_ready = false;
}
}
}
Thanks

Using global arrays is often the best approach unless one would need to use the storage for other purposes when the interrupt routine isn't running. An alternative is to use a global pointer to data that may be stored elsewhere, but one must be very cautious changing that pointer while interrupts are enabled.
An important caveat with your code, by the way: although the Standard regards the implications of a volatile qualifier as implementation-defined, allowing for the possibility that implementations may treat a volatile write as a potential "memory clobber", the authors of gcc require the use of compiler-specific intrinsics to prevent operations on "ordinary" objects from being reordered across operations on volatile-qualified ones.
For example, given:
volatile unsigned short out_count;
int *volatile out_ptr;
int buffer[10];
__attribute__((noinline))
void do_write(int *p, unsigned short count)
{
__asm("");
out_ptr = p;
out_count = count;
do {} while(out_count);
__asm("");
}
void test(void)
{
buffer[0] = 10;
buffer[1] = 20;
do_write(buffer, 2);
buffer[0] = 30;
buffer[1] = 40;
buffer[2] = 50;
do_write(buffer, 3);
}
because the __asm intrinsics don't use gcc-specific syntax to indicate that they might "clobber" the contents of memory in ways the compiler can't understand (even though many compilers support the use of empty __asm intrinsics for that express purpose, and such intrinsics wouldn't really serve any other purpose), and because gcc can see that there's no way that do_write could alter the contents of buffer, it "optimizes out" the code that would store the values 10 and 20 into buffer before the first call to do_write.
Clang doesn't seem quite as bad as gcc. It doesn't seem to reorder writes across volatile writes, it seems to refrain from reordering reads across functions that are not in-line expanded, and it seems to treat empty asm directives as potential memory clobbers, but I I'm not familiar enough with its documentation to know whether such restraint is by design, or merely a consequence of "missed optimizations" which might be "fixed" in future versions.

Consider using translation unit scope rather than global scope. That is declare the array static in the translation unit in which it is used. That translation unit should contain in this case the ISR that writes the data and an access function to read the data. Anything else, including main() should be in other translation units in order that that do not have direct access to the array:
#include <stdbool.h>
#include <stdint.h>
static volatile uint8_t array_data[DATA_LEN] = {0};
static volatile bool data_ready = false;
void someIsr(void)
{
for(uint8_t i = 0; i < 20; i++)
{
array_data[i] = some_other_data[i];
}
data_ready = true;
}
bool getdata( char* dest )
{
bool new_data = data_ready ;
if( data_ready )
{
memcpy( desr, array_data, sizeof(array_data) ) ;
data_ready = false ;
}
}
Then main() in some other translation unit might have:
#include "mydevice.h"
int main( void )
{
uint8_t somewhare[DATA_LEN] = {0};
for(;;)
{
if( getdata( somewhere ) )
{
// process new data
}
}
}
The above is based on your example, and the aim here is to isolate the array so that outside of the ISR the access is enforced to be read-only. In practice it is likely that you will need a "safer" data structure or access method such as a critical-section, double-buffering or a ring buffer so that the data can be accessed without risk of it being modified while it is being read.
This is no less efficient that your original global access, it is simply a restriction of the visibility and accessibility of the array.

As I mentioned in my top comment, one of best ways is to implement a ring queue.
Although I done a few ring queue implementations, here's one I just cooked up for illustration purposes. It is a [cheap] simulation of an Rx ISR for a uart [which is fairly common in embedded systems].
It is fairly complete, but I've not debugged it, so it may have some issues with the queue index calculations.
Anyway, here's the code:
// queue.c -- a ring queue
#include <stdlib.h>
#include <unistd.h>
enum {
QMAX = 1024
};
typedef unsigned char qdata_t; // queue data item
typedef struct {
int qenq; // index for enqueue
int qdeq; // index for dequeue
int qmax; // maximum number of elements in queue
int qover; // number of queue overflows
qdata_t *qbuf; // pointer to queue's buffer
} queue_t;
queue_t *rxisr_q; // pointer to Rx qeueue
// cli -- disable interrupts
void
cli(void)
{
}
// sti -- enable interrupts
void
sti(void)
{
}
// uart_ready -- uart is ready (has Rx data available)
int
uart_ready(void)
{
int rval = rand();
rval = ((rval % 100) > 95);
return rval;
}
// uart_getc -- get character from uart receiver
int
uart_getc(void)
{
int rval = rand();
rval &= 0xFF;
return rval;
}
// qwrap -- increment and wrap queue index
int
qwrap(queue_t *que,int qidx,int inc)
{
int qmax = que->qmax;
qidx += inc;
if (inc > 0) {
if (qidx >= qmax)
qidx -= qmax;
}
else {
if (qidx < 0)
qidx += qmax;
}
return qidx;
}
// qavail_total -- total amount of space available (for enqueue)
int
qavail_total(queue_t *que)
{
int qlen;
qlen = que->qdeq - que->qenq;
if (qlen < 0)
qlen += que->qmax;
qlen -= 1;
return qlen;
}
// qavail_contig -- total amount of space available (for enqueue) [contiguous]
int
qavail_contig(queue_t *que)
{
int qlen;
qlen = que->qdeq - que->qenq;
if (qlen < 0)
qlen = que->qmax - que->qenq;
qlen -= 1;
return qlen;
}
// qready_total -- total amount of space filled (for dequeue)
int
qready_total(queue_t *que)
{
int qlen;
qlen = que->qenq - que->qdeq;
if (qlen < 0)
qlen += que->qmax;
return qlen;
}
// qready_contig -- total amount of space filled (for dequeue) [contiguous]
int
qready_contig(queue_t *que)
{
int qlen;
qlen = que->qenq - que->qdeq;
if (qlen < 0)
qlen = que->qmax - que->qdeq;
return qlen;
}
// qfull -- is queue full?
int
qfull(queue_t *que)
{
int next;
next = qwrap(que,que->qenq,1);
return (next == que->qdeq);
}
// qpush -- push single value
int
qpush(queue_t *que,qdata_t chr)
{
int qenq = que->qenq;
int qnxt;
int push;
qnxt = qwrap(que,qenq,1);
push = (qnxt != que->qdeq);
if (push) {
que->qbuf[qenq] = chr;
que->qenq = qnxt;
}
return push;
}
// qalloc -- allocate a queue
queue_t *
qalloc(int qmax)
{
queue_t *que;
que = calloc(1,sizeof(*que));
que->qbuf = calloc(qmax,sizeof(qdata_t));
return que;
}
// uart_rx_isr -- ISR for uart receiver
void
uart_rx_isr(void)
{
int chr;
queue_t *que;
que = rxisr_q;
while (uart_ready()) {
chr = uart_getc();
#if 0
if (qfull(que)) {
++que->qover;
break;
}
#endif
if (! qpush(que,chr)) {
++que->qover;
break;
}
}
}
int
main(int argc,char **argv)
{
int qlen;
int qdeq;
queue_t *que;
rxisr_q = qalloc(QMAX);
que = rxisr_q;
while (1) {
cli();
qlen = qready_contig(que);
if (qlen > 0) {
qdeq = que->qdeq;
write(1,&que->qbuf[qdeq],qlen);
que->qdeq = qwrap(que,qdeq,qlen);
}
sti();
}
return 0;
}

Related

Struct memory allocation issues for message buffer

I'm trying to use static structs as a buffer for incoming messages, in order to avoid checking the buffer on the MCP2515-external unit. An ISR enters the function with a can_message* value 255 to actually read new messages from my MCP2515.
Other applications register an ID in the message passed as argument, in order to check if the buffer holds any messages with the same value.
This returns wrong IDs, and the rest of the datafields are 0 and uninitialized. What is wrong?
can_message struct:
typedef struct
{
uint8_t id;
uint8_t datalength;
uint8_t data[8];
}can_message;
int CAN_message_receive(can_message* message)
{
static volatile can_message* buffers = (volatile can_message*)0x18FF;
static int birth = 1;
if(birth)
{
for (int i; i < CAN_MESSAGE_UNIQUE_IDS; i++)
{
//These structs gets addresses outside SRAM
buffers[i] = (can_message){0,0,0};
}
birth = 0;
}
if (message == CAN_UPDATE_MESSAGES)
{
/* Sorts messages <3 */
can_message currentMessage;
//These functions are working:
CAN_message_get_from_MCP_buf(&currentMessage, 0);
buffers[currentMessage.id] = currentMessage;
CAN_message_get_from_MCP_buf(&currentMessage, 1);
buffers[currentMessage.id] = currentMessage;
return 0; //returns nothing !
}
if(buffers[message->id].id != 0)
{
printf("test\n");
//This copy gives wrong id and data:
memcpy(message, &buffers[message->id], sizeof(can_message));
buffers[message->id].id = 0;
return 0;
}
return -1;
}
Edit 1:
I did however notice that any buffers[i]-struct gets a totally different address than expected. It does not use the addresses following 0x18FF on the SRAM. Is there any way to change this?
Edit 2:
This is my main-loop:
while (1) {
//printf("tx buf ready: %d\n", MCP2515_TX_buf_empty(0));
//CAN_Loopback_test();
_delay_ms(500);
value = USART_ReadByte(0);
CAN_message_receive(&msg);
printf("CAN_receive: ID: %d, datalength: %d, data: \n",msg.id);
for (int k; k < msg.datalength; k++)
{
printf("%d, ",msg.data[k]);
}
printf("\n");
}
Edit 3: Changing the buffer-pointer to array solved the issue. (It does no longer use the SRAM, but whatever floats my boat)
int CAN_message_receive(can_message* message)
{
static can_message buffers[CAN_MESSAGE_UNIQUE_IDS];
static int birth = 1;
if(birth)
{
for (int i; i < CAN_MESSAGE_UNIQUE_IDS*10; i++)
{
*(char*)(0x18FF+i) = 0;
printf("buffers: %X\n", &buffers[i]);
}
birth = 0;
}
Solved!
Pointer to buffers changed to buffer-array:
int CAN_message_receive(can_message* message)
{
static can_message buffers[CAN_MESSAGE_UNIQUE_IDS];
static int birth = 1;
if(birth)
{
for (int i; i < CAN_MESSAGE_UNIQUE_IDS*10; i++)
{
*(char*)(0x18FF+i) = 0;
printf("buffers: %X\n", &buffers[i]);
}
birth = 0;
}
I would strongly suggest to decouple the ISR logic with the programs own message cache logic. Also the initializing logic with the birth variable looks unnecessary.
I would setup some ring buffer that the ISR can write messages to and from that the main code reads the data into the ID-lookup-buffer.
This would ensure that message updates does not interfere with readouts (at least if you check the read/write indices to your ring buffer) and also eliminates the need to put Mutexes around your whole message buffer.
Currently it smells very badly because of missing read/write synchronization.
// global
#define CAN_MESSAGE_UNIQUE_IDS 50
static can_message g_can_messagebuffers[CAN_MESSAGE_UNIQUE_IDS];
#define MAX_RECEIVEBUFFER 8
static volatile can_message g_can_ringbuffer[MAX_RECEIVEBUFFER];
static volatile int g_can_ringbufferRead = 0;
static volatile int g_can_ringbufferWrite = 0;
// called from ISR
void GetNewMessages()
{
// todo: check ring buffer overflow
can_message currentMessage;
CAN_message_get_from_MCP_buf(&g_can_ringbuffer[g_can_ringbufferWrite], 0);
g_can_ringbufferWrite = (g_can_ringbufferWrite + 1) % MAX_RECEIVEBUFFER;
CAN_message_get_from_MCP_buf(&g_can_ringbuffer[g_can_ringbufferWrite], 1);
g_can_ringbufferWrite = (g_can_ringbufferWrite + 1) % MAX_RECEIVEBUFFER;
}
// called from main loop
void handleNewMessages()
{
while(g_can_ringbufferRead != g_can_ringbufferWrite){
const can_message* currentMessage = &g_can_ringbuffer[g_can_ringbufferRead];
if(currentMessage->id < CAN_MESSAGE_UNIQUE_IDS)
{
g_can_messagebuffers[currentMessage->id] = *currentMessage;
}
g_can_ringbufferRead = (g_can_ringbufferRead + 1) % MAX_RECEIVEBUFFER;
}
}
// called from whoever wants to know
// todo:
// really required a by value interface?
// would it not be sufficient to return a pointer and
// provide an additional interface to mark the message as used?
int getMsg(can_message* message)
{
if(buffers[message->id].id != 0)
{
printf("test\n");
*message = &g_can_messagebuffers[message->id];
g_can_messagebuffers[message->id].id = 0;
return 0;
}
return -1;
}
// alternative to above
const can_message* getMsg(int id)
{
if( (id < CAN_MESSAGE_UNIQUE_IDS)
&& (g_can_messagebuffers[id] != 0))
{
return &g_can_messagebuffers[id].id;
}
return NULL;
}
void invalidateMsg(int id)
{
if(id < CAN_MESSAGE_UNIQUE_IDS)
{
g_can_messagebuffers[id] = 0;
}
}
edit:
after your changes to an message array instead some strange pointer, there is also no need for the setup routine for this code.
edit:
if your micro controller already has a buffer for received messages, then may be it is unnecessary at all to register a ISR and you could empty it from the mainloop directly into your own id-lookup buffer (assuming the mainloop is fast enough)

How to use global variables on a state machine

I made this state machine :
enum states { STATE_ENTRY, STATE_....} current_state;
enum events { EVENT_OK, EVENT_FAIL,EVENT_REPEAT, MAX_EVENTS } event;
void (*const state_table [MAX_STATES][MAX_EVENTS]) (void) = {
{ action_entry , action_entry_fail , action_entry_repeat }, /*
procedures for state 1 */
......}
void main (void){
event = get_new_event (); /* get the next event to process */
if (((event >= 0) && (event < MAX_EVENTS))
&& ((current_state >= 0) && (current_state < MAX_STATES))) {
state_table [current_state][event] (); /* call the action procedure */
printf("OK 0");
} else {
/* invalid event/state - handle appropriately */
}
}
When I modify a global variable in one state the global variable remain the same , and I need that variable in all the states . Do you now what could be the problem ?
My Global variable is this structure:
#if (CPU_TYPE == CPU_TYPE_32)
typedef uint32_t word;
#define word_length 32
typedef struct BigNumber {
word words[64];
} BigNumber;
#elif (CPU_TYPE == CPU_TYPE_16)
typedef uint16_t word;
#define word_length 16
typedef struct BigNumber {
word words[128];
} BigNumber;
#else
#error Unsupported CPU_TYPE
#endif
BigNumber number1 , number2;
Here is how I modify:
//iterator is a number from where I start to modify,
//I already modified on the same way up to the iterator
for(i=iterator+1;i<32;i++){
nr_rand1=661;
nr_rand2=1601;
nr_rand3=1873;
number2.words[i]=(nr_rand1<<21) | (nr_rand2<<11) | (nr_rand3);
}
This is just in case you may want to change your approach for defining the FSM. I'll show you with an example; say you have the following FSM:
You may represent it as:
void function process() {
fsm {
fsmSTATE(S) {
/* do your entry actions heare */
event = getevent();
/* do you actions here */
if (event.char == 'a') fsmGOTO(A);
else fsmGOTO(E);
}
fsmSTATE(A) {
event = getevent();
if (event.char == 'b' || event.char == 'B') fsmGOTO(B);
else fsmGOTO(E);
}
fsmSTATE(B) {
event = getevent();
if (event.char == 'a' ) fsmGOTO(A);
else fsmGOTO(E);
}
fsmSTATE(E) {
/* done with the FSM. Bye bye! */
}
}
}
I do claim (but I believe someone will disagree) that this is simpler, much more readable and directly conveys the structure of the FSM than using a table. Even if I didn't put the image, drawing the FSM diagram would be rather easy.
To get this you just have to define the fsmXXX stuff as follows:
#define fsm
#define fsmGOTO(x) goto fsm_state_##x
#define fsmSTATE(x) fsm_state_##x :
Regarding the code that changese number2:
for(i=iterator+1;i<32;i){
nr_rand1=661;
nr_rand2=1601;
nr_rand3=1873;
number2.words[i]=(nr_rand1<<21) | (nr_rand2<<11) | (nr_rand3);
}
I can't fail to note that:
i is never incremented, so just one element of the array is changed (iterator+1) over an infinite loop;
even if i would be incremented, only the a portion of the words array it's changed depending on the value of iterator (but this might be the intended behaviour).
unless iterator can be -1, the element words[0] is never changed (again this could be the intended behaviour).
I would check if this is really what you intended to do.
If you're sure that it's just a visibility problem (since you said that when you declare it as local it worked as expected), the only other thing that I can think of is that you have the functions in one file and the main (or where you do your checks) in another.
Then you include the same .h header in both files and you end up (due to the linker you're using) with two different number2 because you did not declare it as extern in one of the two files.
Your compiler (or, better, the linker) should have (at least) warned you about this, did you check the compilation messages?
This is not an answer - rather it is a comment. But it is too big to fit the comment field so I post it here for now.
The code posted in the question is not sufficient to find the root cause. You need to post a minimal but complete example that shows the problem.
Something like:
#include<stdio.h>
#include<stdlib.h>
#include <stdint.h>
typedef uint32_t word;
#define word_length 32
typedef struct BigNumber {
word words[4];
} BigNumber;
BigNumber number2;
enum states { STATE_0, STATE_1} current_state;
enum events { EVENT_A, EVENT_B } event;
void f1(void)
{
int i;
current_state = STATE_1;
for (i=0; i<4; ++i) number2.words[i] = i;
}
void f2(void)
{
int i;
current_state = STATE_0;
for (i=0; i<4; ++i) number2.words[i] = 42 + i*i;
}
void (*const state_table [2][2]) (void) =
{
{ f1 , f1 },
{ f2 , f2 }
};
int main (void){
current_state = STATE_0;
event = EVENT_A;
state_table [current_state][event] (); /* call the action procedure */
printf("%u %u %u %u\n", number2.words[0], number2.words[1], number2.words[2], number2.words[3]);
event = EVENT_B;
state_table [current_state][event] (); /* call the action procedure */
printf("%u %u %u %u\n", number2.words[0], number2.words[1], number2.words[2], number2.words[3]);
return 0;
}
The above can be considered minimal and complete. Now update this code with a few of your own functions and post that as the question (if it still fails).
My code doesn't fail.
Output:
0 1 2 3
42 43 46 51

Linux DMA: Using the DMAengine for scatter-gather transactions

I try to use the DMAengine API from a custom kernel driver to perform a scatter-gather operation. I have a contiguous memory region as source and I want to copy its data in several distributed buffers through a scatterlist structure. The DMA controller is the PL330 one that supports the DMAengine API (see PL330 DMA controller).
My test code is the following:
In my driver header file (test_driver.h):
#ifndef __TEST_DRIVER_H__
#define __TEST_DRIVER_H__
#include <linux/platform_device.h>
#include <linux/device.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>
#include <linux/dmaengine.h>
#include <linux/of_dma.h>
#define SG_ENTRIES 3
#define BUF_SIZE 16
#define DEV_BUF 0x10000000
struct dma_block {
void * data;
int size;
};
struct dma_private_info {
struct sg_table sgt;
struct dma_block * blocks;
int nblocks;
int dma_started;
struct dma_chan * dma_chan;
struct dma_slave_config dma_config;
struct dma_async_tx_descriptor * dma_desc;
dma_cookie_t cookie;
};
struct test_platform_device {
struct platform_device * pdev;
struct dma_private_info dma_priv;
};
#define _get_devp(tdev) (&((tdev)->pdev->dev))
#define _get_dmapip(tdev) (&((tdev)->dma_priv))
int dma_stop(struct test_platform_device * tdev);
int dma_start(struct test_platform_device * tdev);
int dma_start_block(struct test_platform_device * tdev);
int dma_init(struct test_platform_device * tdev);
int dma_exit(struct test_platform_device * tdev);
#endif
In my source that contains the dma functions (dma_functions.c):
#include <linux/slab.h>
#include "test_driver.h"
#define BARE_RAM_BASE 0x10000000
#define BARE_RAM_SIZE 0x10000000
struct ram_bare {
uint32_t * __iomem map;
uint32_t base;
uint32_t size;
};
static void dma_sg_check(struct test_platform_device * tdev)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
uint32_t * buf;
unsigned int bufsize;
int nwords;
int nbytes_word = sizeof(uint32_t);
int nblocks;
struct ram_bare ramb;
uint32_t * p;
int i;
int j;
ramb.map = ioremap(BARE_RAM_BASE,BARE_RAM_SIZE);
ramb.base = BARE_RAM_BASE;
ramb.size = BARE_RAM_SIZE;
dev_info(dev,"nblocks: %d \n",dma_priv->nblocks);
p = ramb.map;
nblocks = dma_priv->nblocks;
for( i = 0 ; i < nblocks ; i++ ) {
buf = (uint32_t *) dma_priv->blocks[i].data;
bufsize = dma_priv->blocks[i].size;
nwords = dma_priv->blocks[i].size/nbytes_word;
dev_info(dev,"block[%d],size %d: ",i,bufsize);
for ( j = 0 ; j < nwords; j++, p++) {
dev_info(dev,"DMA: 0x%x, RAM: 0x%x",buf[j],ioread32(p));
}
}
iounmap(ramb.map);
}
static int dma_sg_exit(struct test_platform_device * tdev)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
int ret = 0;
int i;
for( i = 0 ; i < dma_priv->nblocks ; i++ ) {
kfree(dma_priv->blocks[i].data);
}
kfree(dma_priv->blocks);
sg_free_table(&(dma_priv->sgt));
return ret;
}
int dma_stop(struct test_platform_device * tdev)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
int ret = 0;
dma_unmap_sg(dev,dma_priv->sgt.sgl,\
dma_priv->sgt.nents, DMA_FROM_DEVICE);
dma_sg_exit(tdev);
dma_priv->dma_started = 0;
return ret;
}
static void dma_callback(void * param)
{
enum dma_status dma_stat;
struct test_platform_device * tdev = (struct test_platform_device *) param;
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
dev_info(dev,"Checking the DMA state....\n");
dma_stat = dma_async_is_tx_complete(dma_priv->dma_chan,\
dma_priv->cookie, NULL, NULL);
if(dma_stat == DMA_COMPLETE) {
dev_info(dev,"DMA complete! \n");
dma_sg_check(tdev);
dma_stop(tdev);
} else if (unlikely(dma_stat == DMA_ERROR)) {
dev_info(dev,"DMA error! \n");
dma_stop(tdev);
}
}
static void dma_busy_loop(struct test_platform_device * tdev)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
enum dma_status status;
int status_change = -1;
do {
status = dma_async_is_tx_complete(dma_priv->dma_chan, dma_priv->cookie, NULL, NULL);
switch(status) {
case DMA_COMPLETE:
if(status_change != 0)
dev_info(dev,"DMA status: COMPLETE\n");
status_change = 0;
break;
case DMA_PAUSED:
if (status_change != 1)
dev_info(dev,"DMA status: PAUSED\n");
status_change = 1;
break;
case DMA_IN_PROGRESS:
if(status_change != 2)
dev_info(dev,"DMA status: IN PROGRESS\n");
status_change = 2;
break;
case DMA_ERROR:
if (status_change != 3)
dev_info(dev,"DMA status: ERROR\n");
status_change = 3;
break;
default:
dev_info(dev,"DMA status: UNKNOWN\n");
status_change = -1;
break;
}
} while(status != DMA_COMPLETE);
dev_info(dev,"DMA transaction completed! \n");
}
static int dma_sg_init(struct test_platform_device * tdev)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct scatterlist *sg;
int ret = 0;
int i;
ret = sg_alloc_table(&(dma_priv->sgt), SG_ENTRIES, GFP_ATOMIC);
if(ret)
goto out_mem2;
dma_priv->nblocks = SG_ENTRIES;
dma_priv->blocks = (struct dma_block *) kmalloc(dma_priv->nblocks\
*sizeof(struct dma_block), GFP_ATOMIC);
if(dma_priv->blocks == NULL)
goto out_mem1;
for( i = 0 ; i < dma_priv->nblocks ; i++ ) {
dma_priv->blocks[i].size = BUF_SIZE;
dma_priv->blocks[i].data = kmalloc(dma_priv->blocks[i].size, GFP_ATOMIC);
if(dma_priv->blocks[i].data == NULL)
goto out_mem3;
}
for_each_sg(dma_priv->sgt.sgl, sg, dma_priv->sgt.nents, i)
sg_set_buf(sg,dma_priv->blocks[i].data,dma_priv->blocks[i].size);
return ret;
out_mem3:
i--;
while(i >= 0)
kfree(dma_priv->blocks[i].data);
kfree(dma_priv->blocks);
out_mem2:
sg_free_table(&(dma_priv->sgt));
out_mem1:
ret = -ENOMEM;
return ret;
}
static int _dma_start(struct test_platform_device * tdev,int block)
{
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
int ret = 0;
int sglen;
/* Step 1: Allocate and initialize the SG list */
dma_sg_init(tdev);
/* Step 2: Map the SG list */
sglen = dma_map_sg(dev,dma_priv->sgt.sgl,\
dma_priv->sgt.nents, DMA_FROM_DEVICE);
if(! sglen)
goto out2;
/* Step 3: Configure the DMA */
(dma_priv->dma_config).direction = DMA_DEV_TO_MEM;
(dma_priv->dma_config).src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
(dma_priv->dma_config).src_maxburst = 1;
(dma_priv->dma_config).src_addr = (dma_addr_t) DEV_BUF;
dmaengine_slave_config(dma_priv->dma_chan, \
&(dma_priv->dma_config));
/* Step 4: Prepare the SG descriptor */
dma_priv->dma_desc = dmaengine_prep_slave_sg(dma_priv->dma_chan, \
dma_priv->sgt.sgl, dma_priv->sgt.nents, DMA_DEV_TO_MEM, \
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (dma_priv->dma_desc == NULL) {
dev_err(dev,"DMA could not assign a descriptor! \n");
goto out1;
}
/* Step 5: Set the callback method */
(dma_priv->dma_desc)->callback = dma_callback;
(dma_priv->dma_desc)->callback_param = (void *) tdev;
/* Step 6: Put the DMA descriptor in the queue */
dma_priv->cookie = dmaengine_submit(dma_priv->dma_desc);
/* Step 7: Fires the DMA transaction */
dma_async_issue_pending(dma_priv->dma_chan);
dma_priv->dma_started = 1;
if(block)
dma_busy_loop(tdev);
return ret;
out1:
dma_stop(tdev);
out2:
ret = -1;
return ret;
}
int dma_start(struct test_platform_device * tdev) {
return _dma_start(tdev,0);
}
int dma_start_block(struct test_platform_device * tdev) {
return _dma_start(tdev,1);
}
int dma_init(struct test_platform_device * tdev)
{
int ret = 0;
struct dma_private_info * dma_priv = _get_dmapip(tdev);
struct device * dev = _get_devp(tdev);
dma_priv->dma_chan = dma_request_slave_channel(dev, \
"dma_chan0");
if (dma_priv->dma_chan == NULL) {
dev_err(dev,"DMA channel busy! \n");
ret = -1;
}
dma_priv->dma_started = 0;
return ret;
}
int dma_exit(struct test_platform_device * tdev)
{
int ret = 0;
struct dma_private_info * dma_priv = _get_dmapip(tdev);
if(dma_priv->dma_started) {
dmaengine_terminate_all(dma_priv->dma_chan);
dma_stop(tdev);
dma_priv->dma_started = 0;
}
if(dma_priv->dma_chan != NULL)
dma_release_channel(dma_priv->dma_chan);
return ret;
}
In my driver source file (test_driver.c):
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/version.h>
#include <linux/device.h>
#include <linux/platform_device.h>
#include <linux/of_device.h>
#include <linux/of_address.h>
#include <linux/of_irq.h>
#include <linux/interrupt.h>
#include "test_driver.h"
static int dma_block=0;
module_param_named(dma_block, dma_block, int, 0444);
static struct test_platform_device tdev;
static struct of_device_id test_of_match[] = {
{ .compatible = "custom,test-driver-1.0", },
{}
};
static int test_probe(struct platform_device *op)
{
int ret = 0;
struct device * dev = &(op->dev);
const struct of_device_id *match = of_match_device(test_of_match, &op->dev);
if (!match)
return -EINVAL;
tdev.pdev = op;
dma_init(&tdev);
if(dma_block)
ret = dma_start_block(&tdev);
else
ret = dma_start(&tdev);
if(ret) {
dev_err(dev,"Error to start DMA transaction! \n");
} else {
dev_info(dev,"DMA OK! \n");
}
return ret;
}
static int test_remove(struct platform_device *op)
{
dma_exit(&tdev);
return 0;
}
static struct platform_driver test_platform_driver = {
.probe = test_probe,
.remove = test_remove,
.driver = {
.name = "test-driver",
.owner = THIS_MODULE,
.of_match_table = test_of_match,
},
};
static int test_init(void)
{
platform_driver_register(&test_platform_driver);
return 0;
}
static void test_exit(void)
{
platform_driver_unregister(&test_platform_driver);
}
module_init(test_init);
module_exit(test_exit);
MODULE_AUTHOR("klyone");
MODULE_DESCRIPTION("DMA SG test module");
MODULE_LICENSE("GPL");
However, the DMA never calls my callback function and I do not have any idea why it happens. Maybe, I am misunderstanding something...
Could anyone help me?
Thanks in advance.
Caveat: I don't have a definitive solution for you, but merely some observations and suggestions on how to debug this [based on many years of experience writing/debugging linux device drivers].
I presume you believe the callback is not being done because you don't get any printk messages. But, the callback is the only place that has them. But, is the printk level set high enough to see the messages? I'd add a dev_info to your module init, to prove it prints as expected.
Also, you [probably] won't get a callback if dma_start doesn't work as expected, so I'd add some dev_info calls there, too (e.g. before and after the call in step 7). I also notice that not all calls in dma_start check error returns [may be fine or void return, just mentioning in case you missed one]
At this point, it should be noted that there are really two questions here: (1) Did your DMA request start successfully [and complete]? (2) Did you get a callback?
So, I'd split off some code from dma_complete into (e.g.) dma_test_done. The latter does the same checking but only prints the "complete" message. You can call this in a poll mode to verify DMA completion.
So, if you [eventually] get a completion, then the problem reduces to why you didn't get the callback. If, however, you don't [even] get a completion, that's an even more fundamental problem.
This reminds me. You didn't show any code that calls dma_start or how you wait for the completion. I presume that if your callback were working, it would issue a wakeup of some sort that the base level would wait on. Or, the callback would do the request deallocate/cleanup (i.e. more code you'd write)
At step 7, you're calling dma_async_issue_pending, which should call pl330_issue_pending. pl330_issue_pending will call pl330_tasklet.
pl330_tasklet is a tasklet function, but it can also be called directly [to kick off DMA when there are no active requests].
pl330_tasklet will loop on its "work" queue and move any completed items to its "completed" queue. It then tries to start new requests. It then loops on its completed queue and issues the callbacks.
pl330_tasklet grabs the callback pointer, but if it's null it is silently ignored. You've set a callback, but it might be good to verify that where you set the callback is the same place [or propagates to] the place where pl330_tasklet will fetch it from.
When you make the call, everything may be busy, so there are no completed requests, no room to start a new request, so nothing to complete. In that case, pl330_tasklet will be called again later.
So, when dma_async_issue_pending returns, nothing may have happened yet. This is quite probable for your case.
pl330_tasklet tries to start new DMA by calling fill_queue. It will check that a descriptor is not [already] busy by looking at status != BUSY. So, you may wish to verify that yours has the correct value. Otherwise, you'd never get a callback [or even any DMA start].
Then, fill_queue will try to start the request via pl330_submit_req. But, that can return an error (e.g. queue already full), so, again, things are deferred.
For reference, notice the following comment at the top of pl330_submit_req:
Submit a list of xfers after which the client wants notification.
Client is not notified after each xfer unit, just once after all
xfer units are done or some error occurs.
What I'd do is start hacking up pl330.c and add debug messages and cross-checking. If your system is such that pl330 is servicing many other requests, you might limit the debug messages by checking that the device's private data pointer matches yours.
In particular, you'd like to get a message when your request actually gets started, so you could add a debug message to the end of pl330_submit_req
Then, adding messages within pl330_tasklet for requests will help, too.
Those are two good starting points. But, don't be afraid to add more printk calls as needed. You may be surprised by what gets called [or doesn't get called] or in what order.
UPDATE:
If I install the kernel module with the blocking behaviour, everything is initialized well. However, the dma_busy_loop function shows that the DMA descriptor is always IN PROGESS and the DMA transaction never completes. For this reason, the callback function is not executed. What could be happening?
Did a little more research. Cookies are just sequence numbers that increment. For example, if you issue a request that gets broken up into [say] 10 separate scatter/gather operations [descriptors], each one gets a unique cookie value. The cookie return value is the latest/last of the bunch (e.g. 10).
When you're calling (1) dma_async_is_tx_complete, (2) it calls chan->device->device_tx_status, (3) which is pl330_tx_status, (4) which calls dma_cookie_status
Side note/tip: When I was tracking this down, I just kept flipping back and forth between dmaengine.h and pl330.c. It was like: Look at (1), it calls (2). Where is that set? In pl330.c, I presume. So, I grepped for the string and got the name of pl330's function (i.e. (3)). So, I go there, and see that it does (4). So ... Back to dmaengine.h ...
However, when you make the outer call, you're ignoring [setting to NULL] the last two arguments. These can be useful because they return the "last" and "used" cookies. So, even if you don't get full completion, these values could change and show partial progress.
One of them should eventually be >= to the "return" cookie value. (i.e.) The entire operation should be complete. So, this will help differentiate what may be happening.
Also, note that in dmaengine.h, right below dma_async_is_tx_complete, there is dma_async_is_complete. This function is what decides whether to return DMA_COMPLETE or DMA_IN_PROGRESS, based on the cookie value you pass and the "last" and "used" cookie values. It's passive, and not used in the code path [AFAICT], but it does show how to calculate completion yourself.

Configuration of a module's data sending

I want to configure (module A) to send certain data to (module B) at certain slots of time.
(Module B) should send these configuration to (module A) during initialization.
The data is:
struct _data
{
int temp;
int velocity;
int time;
}
For example, (module A) should send 'temp' at first slot, then 'temp & velocity' at second slot, then 'time' at third slot .... etc
I am thinking about making making "configuration flags" structure:
struct _configuration
{
int temp_flag;
int time_flag;
int velocity_flag;
}
Then making an array of this structures:
struct _configuration arr[NUMBER_OF_SLOTS];
and configure using this array:
arr[0].temp_flag = 1;
arr[0].velocity_flag = 0;
arr[0].time_flag = 0;
arr[1].temp_flag = 1;
arr[1].velocity_flag = 1;
arr[1].time_flag = 0;
arr[2].temp_flag = 0;
arr[2].velocity_flag = 0;
arr[2].time_flag = 1;
.... etc
But I am not very happy with this approach ... does anyone has a better way or algorithm to do this task ?
Many thanks in advance
One of the many possible solutions is bitmasks (very popular in computer graphics). As you know, every number can be represented as sequence of 0 and 1, so they can be the flags that mean something. It's quite simple to use them in such way because we have bitwise operations. And there is no need to create some configuration structures.
const int USE_TEMP = 1 << 0; // 01
const int USE_VELOCITY = 1 << 1; // 010
const int USE_TIME = 1 << 2; // 0100
// ...
// you have 32 free-to-use bits
const int USE_ALL = 0111; // just for fun
struct _data {
int temp;
int velocity;
int time;
}
And it bring us to
int arr[NUMBER_OF_SLOTS];
arr[0] = USE_TEMP;
arr[1] = USE_TEMP | USE_VELOCITY;
arr[2] = USE_TIME;
Looks better, isn't it?
Checking values
When you need to check whether some of the parameters included in configuration, it's very simple
if( arr[i] & USE_TEMP ) {
// do smth with temp, it's included
}
or this way
int expected_flags = USE_TIME | USE_VELOCITY;
if( arr[i] & expected_flags == expected_flags ) {
// time and velocity enabled
}
or even declare special function (or scary macro) to check whether some parameters are in your config
bool check(int config, int flags) {
return config & flags == flags;
}
Changing configurations
What if you need to delete/add parameters?
int some_conf = USE_TEMP | USE_VELOCITY;
Simple way
// delete
if( check(some_conf, USE_VELOCITY) )
some_conf -= USE_VELOCITY; // <--- dangerous without if( )
// add
if( !check(some_conf, USE_TIME) )
some_conf += USE_TIME; // <--- dangerous without if( )
Safety way
// delete
some_conf &= ~USE_VELOCITY;
// add
some_conf |= USE_TIME;

Uart Check Receive Buffer interrupt vs. polling

Hello I am learning how to use the Uart by using interrupts in Nios and I am not sure how to start. I have made it in polling, but I am not sure how to start using interrupts.
Any help would be appreciated
Here is my code
#include <stdio.h> // for NULL
#include <sys/alt_irq.h> // for irq support function
#include "system.h" // for QSYS defines
#include "nios_std_types.h" // for standard embedded types
#define JTAG_DATA_REG_OFFSET 0
#define JTAG_CNTRL_REG_OFFSET 1
#define JTAG_UART_WSPACE_MASK 0xFFFF0000
#define JTAG_UART_RV_BIT_MASK 0x00008000
#define JTAG_UART_DATA_MASK 0x000000FF
volatile uint32* uartDataRegPtr = (uint32*)JTAG_UART_0_BASE;
volatile uint32* uartCntrlRegPtr = ((uint32*)JTAG_UART_0_BASE +
JTAG_CNTRL_REG_OFFSET);
void uart_SendByte (uint8 byte);
void uart_SendString (uint8 * msg);
//uint32 uart_checkRecvBuffer (uint8 *byte);
uint32 done = FALSE;
void uart_SendString (uint8 * msg)
{
int i = 0;
while(msg[i] != '\0')
{
uart_SendByte(msg[i]);
i++;
}
} /* uart_SendString */
void uart_SendByte (uint8 byte)
{
uint32 WSPACE_Temp = *uartCntrlRegPtr;
while((WSPACE_Temp & JTAG_UART_WSPACE_MASK) == 0 )
{
WSPACE_Temp = *uartCntrlRegPtr;
}
*uartDataRegPtr = byte;
} /* uart_SendByte */
uint32 uart_checkRecvBuffer (uint8 *byte)
{
uint32 return_value;
uint32 DataReg = *uartDataRegPtr;
*byte = (uint8)(DataReg & JTAG_UART_DATA_MASK);
return_value = DataReg & JTAG_UART_RV_BIT_MASK;
return_value = return_value >> 15;
return return_value;
} /* uart_checkRecvBuffer */
void uart_RecvBufferIsr (void* context)
{
} /* uart_RecvBufferIsr */
int main(void)
{
uint8* test_msg = (uint8*)"This is a test message.\n";
//alt_ic_isr_register ( ); // used for 2nd part when interrupts are enabled
uart_SendString (test_msg);
uart_SendString ((uint8*)"Enter a '.' to exist the program\n\n");
while (!done)
{
uint8 character_from_uart;
if (uart_checkRecvBuffer(&character_from_uart))
{
uart_SendByte(character_from_uart);
}
// do nothing
} /* while */
uart_SendString((uint8*)"\n\nDetected '.'.\n");
uart_SendString((uint8*)"Program existing....\n");
return 0;
} /* main */
I am suppose to use the uart_RecvBufferIsr instead of uart_checkRecvBuffer. How can tackle this situation?
You will need to register your interrupt handler by using alt_ic_isr_register(), which will then be called when an interrupt is raised. Details can be found (including some sample code) in this NIOS II PDF document from Altera.
As far as modifying your code to use the interrupt, here is what I would do:
Remove uart_checkRecvBuffer();
Change uart_RecvBufferIsr() to something like (sorry no compiler here so can't check syntax/functioning):
volatile uint32 recv_flag = 0;
volatile uint8 recv_char;
void uart_RecvBufferIsr(void *context)
{
uint32 DataReg = *uartDataRegPtr;
recv_char = (uint8)(DataReg & JTAG_UART_DATA_MASK);
recv_flag = (DataReg & JTAG_UART_RV_BIT_MASK) >> 15;
}
The moral of the story with the code above is that you should keep your interrupts as short as possible and let anything that is not strictly necessary to be done outside (perhaps by simplifying the logic I used with the recv_char and recv_flag).
And then change your loop to something like:
while (!done)
{
if (recv_flag)
{
uart_SendByte(recv_byte);
recv_flag = 0;
}
}
Note that there could be issues with what I've done depending on the speed of your port - if characters are received too quickly for the "while" loop above to process them, you would be losing some characters.
Finally, note that I declared some variables as "volatile" to prevent the compiler from keeping them in registers for example in the while loop.
But hopefully this will get you going.

Resources