Issue with TFT lcd screen speed

Issue with TFT lcd screen speed - c

I am using TFT LCD screen (ILI9163c - 160*128). It is connected with athros AR9331 module with spi. Athros AR9331 is running with OpenWRT linux distribution. So, I am driving my LCD with spidev0.1. While filling screen or writing any string on LCD, it is taking too much time to print. So, what can i do to get sufficient printing speed.
Thanks.
This is the function i'm using to write data on spi pin using spidev...
void spi_transactor(unsigned char *write_data, int mode,int size)
{
int ret;
struct spi_ioc_transfer xfer[4];
unsigned char *init_reg;
init_reg = (unsigned char*) malloc(size);
memcpy(init_reg,write_data,size);
if (mode)
{
gpio_set_value(_rs, 1); // DATA
}
else
{
gpio_set_value(_rs, 0); // COMMAND
}
memset(xfer, 0, sizeof xfer);
xfer[0].bits_per_word = 8;
xfer[0].tx_buf = (unsigned long)init_reg;
xfer[0].rx_buf = 0; //( unsigned long ) &buf_rx[0];
xfer[0].len = size; //wlength + rlength;
xfer[0].delay_usecs = 0;
xfer[0].speed_hz = speedx; // 8MHZ
//xfer[0].speed_hz = 160000000; // 40MHZ
ret = ioctl(spi_fd, SPI_IOC_MESSAGE(1), &xfer);
gpio_set_value(_rs, 1);
}

The main performance issue here is that you make a hard copy of the data to send on the heap, every time the function is called. You also set up the communication parameters from scratch each time, even though they are always the same. To make things worse, the function has a massive bug: it leaks memory as if there's no tomorrow.
The hard copies aren't really necessary unless the SPI communication takes too much time for the program to sit and busy-wait on it to finish (rather likely). What you can do in that case is this:
Outsource the whole SPI business to a separate thread.
Create a work queue for the thread, using your favourite ADT for such. It should be a thread-safe FIFO.
Data is copied into the ADT as hard copies, by the caller.
The thread picks one chunk of work from the ADT and transmits it from there, without making yet another hard copy.
The thread waits for the SPI communcation to finish, then makes sure that the ADT deletes the data, before grabbing the next one. For hard real-time requirements, you can have the thread prepare the next message in advance while waiting for the previous one.
The communication parameters "xfer" are set up once by the thread, it just changes the data destination address from case to case.

Related

How receive data with HAL_UART?

I'm learning about the STM32. I'm want receive data by UART byte-to-byte with interruption.
HAL_UART_Receive_IT(&huart1, buffer, length)
Where &huart1 is my uart gate, buffer is the input storage and length is the amount of input bytes. I use the following function to read data
static requestRead(void *buffer, uint16_t length)
{
uint8_t teste;
while (HAL_UART_Receive_IT(&huart1, buffer, length) != HAL_OK) osDelay(1);
//HAL_UART_RxCpltCallback
}
I store my data in:
void StartDefaultTask(void const *argument)
{
char sender[] = "Alaska Sending\n";
uint8_t receive[10];
uint8_t data[30];
for (;;)
{
uint8_t i = 0;
memset(data, 0, 30);
requestRead(&receive, 1);
data[i++] = receive;
while (data != '\r')
{
requestRead(&receive, 1);
data[i++] = receive;
}
//HAL_UART_Transmit(&huart1, data, i, HAL_MAX_DELAY);
}
/* USER CODE END StartDefaultTask */
}
My problem is the value receive and store. When I send by serial a string of character as Welcome to Alaska\n, only W is read and stored, then I need send again the buffer and again just store only W. How solve this?

Well, there are a few issues here.
Arrays and their contents
data[i++] = receive;
stores the address of the receive buffer, a memory pointer value, into the data array. That's certainly not what you want. As this is a very basic C programming paradigm, I'd recommend reviewing the chapter on arrays and pointers in a good C textbook.
What you send and what you expect
while (data != '\r')
Even if you'd get the array address and its value right (see above), you are sending a string terminated with '\n', and check for a '\r' character, so change one or the other to get a match.
Missing volatile
uint8_t receive[10];
The receive buffer should be declared volatile, as it would be accessed by an interrupt handler. Otherwise the main program would miss writes to the buffer even if it had checked whether the receiving is complete (see below).
Working with hardware in realtime
while (HAL_UART_Receive_IT(&huart1, buffer, length) != HAL_OK) osDelay(1);
This would enable the UART receive (and error handling) interrupt to receive one byte. That's fine so far, but the function returns before receiving the byte, and as it's called again immediately, it would return HAL_BUSY the second time, and wait a millisecond before attempting it again. In that millisecond, it would miss most of the rest of the transmission, as bytes are arriving faster than that, and your program does nothing about it.
Moreover, the main program does not check when the receive is complete, possibly accessing the buffer before the interrupt handler places a value in it.
If you receive data using interrupts, you'd have to do something about that data in the interrupt handler. (If you don't use interrupts, but polling for data, then be sure that you'd meet the deadline imposed by the speed of the interface).
HAL is not suited for this kind of tasks
HAL has no interface for receiving an unknown length of data terminated by a specific value. Of course the main program can poll the receiver one byte at a time, but then it must ensure that the polling occurs faster than the data comes in. In other words, the program can do very little else while expecting a transmission.
There are some workarounds, but I won't even hint at them now, because they would lead to deadlocks in an RTOS environment which tend to occur at the most inconvenient times, and are quite hard to investigate and to properly avoid.
Write your interrupt handler instead
(all of this is detailed in the Reference Manual for your controller)
set the UART interrupt priority NVIC_SetPriority()
enable the interrupt with NVIC_EnableIRQ()
set the USART_CR1_RXNEIE bit to 1
in the interrupt handler,
read the SR register
if the RXNE bit is set,
read the data from the data register
store it in the buffer
set a global flag if it matches the terminating character
switch to another buffer if more data is expected while the program processes the first line.
Don't forget declaring declaring all variables touched by the interrupt handler as volatile.

delayed write from userspace to kernel space using framebuffer node

I have implemented a linux kernel driver which uses deferred IO mechanism to track the changes in framebuffer node.
static struct fb_deferred_io fb_defio = {
.delay = HZ/2,
.deferred_io = fb_dpy_deferred_io,
};
Per say the registered framebuffer node is /dev/graphics/fb1.
The sample application code to access this node is:
fbfd = open("/dev/graphics/fb1", O_RDWR);
if (!fbfd) {
printf("error\n");
exit(0);
}
screensize = 540*960*4;
/* Map the device to memory */
fbp = (unsigned char *)mmap(0, screensize, PROT_READ | PROT_WRITE, MAP_SHARED,
fbfd, 0);
if ((int)fbp == -1) {
printf("Error: failed to start framebuffer device to memory.");
}
int grey = 0x1;
for(cnt = 0; cnt < screensize; cnt++)
*(fbp + cnt) = grey<<4|grey;
This would fill up entire fb1 node with 1's.
The issue now is at the kernel driver when i try to read the entire buffer I find data mismatch at different locations.
The buffer in kernel is mapped as:
par->buffer = dma_alloc_coherent(dev, roundup((dpyw*dpyh*BPP/8), PAGE_SIZE),(dma_addr_t *) &DmaPhysBuf, GFP_KERNEL);
if (!par->buffer) {
printk(KERN_WARNING "probe: dma_alloc_coherent failed.\n");
goto err_vfree;
}
and finally the buffer is registered through register_framebuffer function.
On reading the source buffer I find that at random locations the data is not been written instead the old data is reflected.
For example:
At buffer location 3964 i was expecting 11111111 but i found FF00FF00.
On running the same application program with value of grey changed to 22222222
At buffer location 3964 i was expecting 22222222 but i found 11111111
It looks like there is some delayed write in the buffer. Is there any solution to this effect, because of partially wrong data my image is getting corrupted.
Please let me know if any more information is required.
Note: Looks like an issue of mapped buffer being cacheable or not. Its a lazy write to copy the data from cache to ram. Need to make sure that the data is copied properly but how still no idea.. :-(

"Deferred io" means that frame buffer memory is not really mapped to a display device. Rather, it's an ordinary memory area shared between user process and kernel driver. Thus it needs to be "synced" for kernel to actually do anything about it:
msync(fbp, screensize, MS_SYNC);
Calling fsync(fbfd) may also work.
You may also try calling ioctl(fbfd, FBIO_WAITFORVSYNC, 0) if your driver supports it. The call will make your application wait until vsync happens and the frame buffer data was definitely transferred to the device.

I was having a similar issue where I was having random artifacts displaying on the screen. Originally, the framebuffer driver was not using dma at all.
I tried the suggestion of using msync(), which improved the situation (artifacts happened less frequently), but it did not completely solve the issue.
After doing some research I came to the conclusion that I need to use dma memory because it is not cached. There is still the issue with mmap because it is mapping the kernel memory to userspace. However, I found that there is already a function in the kernel to handle this.
So, my solution was in my framebuffer driver, set the mmap function:
static int my_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
{
return dma_mmap_coherent(info->dev, vma, info->screen_base,
info->fix.smem_start, info->fix.smem_len);
}
static struct fb_ops my_fb_ops = {
...
.fb_mmap = my_fb_mmap,
};
And then in the probe function:
struct fb_info *info;
struct my_fb_par *par;
dma_addr_t dma_addr;
char *buf
info = framebuffer_alloc(sizeof(struct my_fb_par), &my_parent->dev);
...
buf = dma_alloc_coherent(info->dev, MY_FB_SIZE, dma_addr, GFP_KERNEL);
...
info->screen_base = buf;
info->fbops = &my_fb_ops;
info->fix = my_fb_fix;
info->fix.smem_start = dma_addr;
info->fix.smem_len = MY_FB_SIZE;
...
par = info->par
...
par->buffer = buf;
Obviously, I've left out the error checking and unwinding, but hopefully I have touched on all of the important parts.
Note: Comments in the kernel source say that dmac_flush_range() is for private use only.

Well eventually i found a better way to solve the issue. The data written through app at mmaped device node is first written in cache which is later written in RAM through delayed write policy. In order to make sure that the data is flushed properly we need to call the flush function in kernel. I used
dmac_flush_range((void *)pSrc, (void *)pSrc + bufSize);
to flush the data completely so that the kernel receives a clean data.

Circular buffer using pointers in C

I have a queue structure, that I attempted to implement using a circular buffer, which I am using in a networking application. I am looking for some guidance and feedback. First, let me present the relevant code.
typedef struct nwk_packet_type
{
uint8_t dest_address[NRF24_ADDR_LEN];
uint8_t data[32];
uint8_t data_len;
}nwk_packet_t;
/* The circular fifo on which outgoing packets are stored */
nwk_packet_t nwk_send_queue[NWK_QUEUE_SIZE];
nwk_packet_t* send_queue_in; /* pointer to queue head */
nwk_packet_t* send_queue_out; /* pointer to queue tail */
static nwk_packet_t* nwk_tx_pkt_allocate(void)
{
/* Make sure the send queue is not full */
if(send_queue_in == (send_queue_out - 1 + NWK_QUEUE_SIZE) % NWK_QUEUE_SIZE)
return 0;
/* return pointer to the next add and increment the tracker */
return send_queue_in++;//TODO: it's not just ++, it has to be modular by packet size
}
/* External facing function for application layer to send network data */
// simply adds the packet to the network queue if there is space
// returns an appropriate error code if anything goes wrong
uint8_t nwk_send(uint8_t* address, uint8_t* data, uint8_t len)
{
/* First check all the parameters */
if(!address)
return NWK_BAD_ADDRESS;
if(!data)
return NWK_BAD_DATA_PTR;
if(!len || len > 32)
return NWK_BAD_DATA_LEN;
//TODO: PROBABLY NEED TO START BLOCKING HERE
/* Allocate the packet on the queue */
nwk_packet_t* packet;
if(!( packet = nwk_tx_pkt_allocate() ))
return NWK_QUEUE_FULL;
/* Build the packet */
memcpy(packet->dest_address, address, NRF24_ADDR_LEN);
memcpy(packet->data, data, len);
packet->data_len = len;
//TODO: PROBABLY SAFE TO STOP BLOCKING HERE
return NWK_SUCCESS;
}
/* Only called during NWK_IDLE, pushes the next item on the send queue out to the chip's "MAC" layer over SPI */
void nwk_transmit_pkt(void)
{
nwk_packet_t tx_pkt = nwk_send_queue[send_queue_out];
nrf24_send(tx_pkt->data, tx_pkt->data_len);
}
/* The callback for transceiver interrupt when a sent packet is either completed or ran out of retries */
void nwk_tx_result_cb(bool completed)
{
if( (completed) && (nwk_tx_state == NWK_SENDING))
send_queue_out++;//TODO: it's not just ++, it has to be modular by packet size with in the buffer
}
Ok now for a quick explanation and then my questions. So the basic idea is that I've got this queue for data which is being sent onto the network. The function nwk_send() can be called from anywhere in application code, which by the wall will be a small pre-emptive task based operating system (FreeRTOS) and thus can happen from lots of places in the code and be interrupted by the OS tick interrupt.
Now since that function is modifying the pointers into the global queue, I know it needs to be blocking when it is doing that. Am I correct in my comments on the code about where I should be blocking (ie disabling interrupts)? Also would be smarter to make a mutex using a global boolean variable or something rather than just disabling interrupts?
Also, I think there's a second place I should be blocking when things are being taken off the queue, but I'm not sure where that is exactly. Is it in nwk_transmit_pkt() where I'm actually copying the data off the queue and into a local ram variable?
Final question, how do I achieve the modulus operation on my pointers within the arrays? I feel like it should look something like:
send_queue_in = ((send_queue_in + 1) % (NWK_QUEUE_SIZE*sizeof(nwk_packet_t))) + nwk_send_queue;
Any feedback is greatly appreciated, thank you.

About locking it will be best to use some existing mutex primitive from the OS you use. I am not familiar with FreeRTOS but it should have builtin primitives for locking between interrupt and user context.
For circular buffer you may use these:
check for empty queue
send_queue_in == send_queue_out
check for full queue
(send_queue_in + 1) % NWK_QUEUE_SIZE == send_queue_out
push element [pseudo code]
if (queue is full)
return error;
queue[send_queue_in] = new element;
send_queue_in = (send_queue_in + 1) % NWK_QUEUE_SIZE;
pop element [pseudo code]
if (queue is empty)
return error;
element = queue[send_queue_out];
send_queue_out = (send_queue_out + 1) % NWK_QUEUE_SIZE;
It looks that you copy and do not just reference the packet data before sending. This means that you can hold the lock until the copy is done.

Without an overall driver framework to develop with, and when communicating with interrupt-state on a uC, you need to be very careful.
You cannot use OS synchro primitives to communicate to interrupt state. Attmpting to do so will certainly crash your OS because interrupt-handlers cannot block.
Copying the actual bulk data should be avoided.
On an 8-bit uC, I suggest queueing an index onto a buffer array pool, where the number of buffers is <256. That means that only one byte needs to be queued up and so, with an appropriate queue class that stores the value before updating internal byte-size indexes, it is possible to safely communicate buffers into a tx handler without excessive interrupt-disabling.
Access to the pool array should be thread-safe and 'insertion/deletion' should be quick - I have 'succ/pred' byte-fields in each buffer struct, so forming a double-linked list, access protected by a mutex. As well as I/O, I use this pool of buffers for all inter-thread comms.
For tx, get a buffer struct from teh pool, fill with data, push the index onto a tx queue, disable interrupts for only long enough to determine whether the tx interrupt needs 'primimg'. If priming is required, shove in a FIFO-full of data before re-enabling interrupts.
When the tx interrupt-handler has sent the buffer, it can push the 'used' index back onto a 'scavenge' queue and signal a semaphore to make a handler thread run. This thread can then take the entry from the scavenge queue and return it to the pool.
This scheme only works if interrupt-handlers do not re-enable higher-priority interrupts using the same buffering scheme.

AT commands in embedded systems

I am using embedded C and trying to make application for GPRS terminal. My main problem is working with AT commands. I send AT command using serial line, but if it is some network oriented command its response could take time and because of that I have a lot of waiting, while processor don`t do anything. Idea is to make this waiting to be done in same way parallel like in different thread. Does anyone have idea how to do that because my system does not support threads?
I had idea to use some timers, because we have interrupt which is called every 5ms, but I don't know ho many seconds I have to wait for response, and if I compare strings in interrupt to check if all message is received it could be very inefficient, right?

you could either use interrupts, configure the serial interface to interrupt when data is available, or use an RTOS something, like FreeRTOS, to run two threads, one for the main code and the other to block and wait for the serial data.
Update: based on your comments, you say you don't know the size of the data, that's fine, in the interrupt handler check for the byte that terminates the data, this is a simple and generic example you should check the examples for your MCU:
void on_serial_char()
{
//disable interrupts
disable_interrupts();
//read byte
byte = serial.read();
//check if it's the terminating byte
if (byte == END) {
//set the flag here
MESSAGE_COMPLETE = 1;
}
//add byte to buffer
buf[length++] = byte;
//enable interrupts
enable_interrupts();
}
And check for that flag in your main loop:
...
if (MESSAGE_COMPLETE) {
//process data
...
//you may want to clear the flag here
MESSAGE_COMPLETE = 0;
//send next command
...
}

You can simply call a packetHandler in each mainLoopCycle.
This handler checks if new characters are available from the serial port.
The packetHandler will build the response message bit for bit, if the message is complete (CR LF found) then it calls a messageReceive function, else it simply returns to the mainLoop.
int main()
{
init();
for (;;)
{
packetHandler();
}
}
char msgBuffer[80];
int pos=0;
void packetHandler()
{
char ch;
while ( isCharAvailable() )
{
ch=getChar();
msgBuffer[pos++] = ch;
if ( ch == '\n' )
{
messageReceived(msgBuffer);
pos=0;
}
}
}

It sounds like you are rather close to the hardware drivers. If so, the best way is to use DMA, if the MCU supports it, then use the flag from the DMA hardware to determine when to start parse out the received data.
The second best option is to use rx interrupts, store every received byte in a simple FIFO, such as a circular buffer, then set some flag once you have received them. One buffer for incoming data and one for the latest valid data received may be necessary.

Network receipt timer to ms resolution

My scenario, I'm collecting network packets and if packets match a network filter I want to record the time difference between consecutive packets, this last part is the part that doesn't work. My problem is that I cant get accurate sub-second measurements no matter what C timer function I use. I've tried: gettimeofday(), clock_gettime(), and clock().
I'm looking for assistance to figure out why my timing code isn't working properly.
I'm running on a cygwin environment.
Compile Options: gcc -Wall capture.c -o capture -lwpcap -lrt
Code snippet :
/*globals*/
int first_time = 0;
struct timespec start, end;
double sec_diff = 0;
main() {
pcap_t *adhandle;
const struct pcap_pkthdr header;
const u_char *packet;
int sockfd = socket(PF_INET, SOCK_STREAM, 0);
.... (previous I create socket/connect - works fine)
save_attr = tty_set_raw();
while (1) {
packet = pcap_next(adhandle, &header); // Receive a packet? Process it
if (packet != NULL) {
got_packet(&header, packet, adhandle);
}
if (linux_kbhit()) { // User types message to channel
kb_char = linux_getch(); // Get user-supplied character
if (kb_char == 0x03) // Stop loop (exit channel) if user hits Ctrl+C
break;
}
}
tty_restore(save_attr);
close(sockfd);
pcap_close(adhandle);
printf("\nCapture complete.\n");
}
In got_packet:
got_packet(const struct pcap_pkthdr *header, const u_char *packet, pcap_t * p){ ... {
....do some packet filtering to only handle my packets, set match = 1
if (match == 1) {
if (first_time == 0) {
clock_gettime( CLOCK_MONOTONIC, &start );
first_time++;
}
else {
clock_gettime( CLOCK_MONOTONIC, &end );
sec_diff = (end.tv_sec - start.tv_sec) + ((end.tv_nsec - start.tv_nsec)/1000000000.0); // Packet difference in seconds
printf("sec_diff: %ld,\tstart_nsec: %ld,\tend_nsec: %ld\n", (end.tv_sec - start.tv_sec), start.tv_nsec, end.tv_nsec);
printf("sec_diffcalc: %ld,\tstart_sec: %ld,\tend_sec: %ld\n", sec_diff, start.tv_sec, end.tv_sec);
start = end; // Set the current to the start for next match
}
}
}
I record all packets with Wireshark to compare, so I expect the difference in my timer to be the same as Wireshark's, however that is never the case. My output for tv_sec will be correct, however tv_nsec is not even close. Say there is a 0.5 second difference in wireshark, my timer will say there is a 1.999989728 second difference.

Basically, you will want to use a timer with a higher resolution
Also, I did not check in libpcap, but I am pretty sure that libpcap can give you the time at which each packet was received. In which case, it will be closest that you can get to what Wireshark displays.

I don't think that it is the clocks that are your problem, but the way that you are waiting on new data. You should use a polling function to see when you have new data from either the socket or from the keyboard. This will allow your program to sleep when there is no new data for it to process. This is likely to make the operating system be nicer to your program when it does have data to process and schedule it quicker. This also allows you to quit the program without having to wait for the next packet to come in. Alternately you could attempt to run your program at really high or real time priority.
You should consider getting the current time at the first instance after you get a packet if the filtering can take very long. You may also want to consider multiple threads for this program if you are trying to capture data on a fast and busy network. Especially if you have more than one processor, but since you are doing some pritnfs which may block. I noticed you had a function to set a tty to raw mode, which I assume is the standard output tty. If you are actually using a serial terminal that could slow things down a lot, but standard out to a xterm can also be slow. You may want to consider setting stdout to fully buffered rather than line buffered. This should speed up the output. (man setvbuf)

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight