I'm doing reverse engineering about a ultrasound probe on the Linux side. I want to capture raw data from an ultrasound probe. I'm programming with C and using the libusb API.
There are two BULK IN endpoints in the device (2 and 6). The device is sending 2048 bytes data, but it is sending data as 512 bytes with four block.
This picture is data flow on the Windows side, and I want to copy that to the Linux side. You see four data blocks with endpoint 02 and after that four data blocks with endpoint 06.
But there is a problem about timing. The first data block of endpoint 02's and first data block of endpoint 06's are close to each other acoording to time. But in data flow they are not in sequence.
I see that the computer is reading the first data blocks of endpoint 02 and 06. After that, the computer is reading the other three data blocks of endpoint 02 and endpoint 06. But in USB Analyzer, the data flow is being viewed according to the endpoint number. The sequence is different according to time.
On the Linux side, I write code like this:
int index = 0;
imageBuffer2 = (unsigned char *) malloc(2048);
imageBuffer6 = (unsigned char *) malloc(2048);
while (1) {
libusb_bulk_transfer(devh, BULK_EP_2, imageBuffer2, 2048, &actual2, 0);
libusb_bulk_transfer(devh, BULK_EP_6, imageBuffer6, 2048, &actual6, 0);
//Delay
for(index = 0; index <= 10000000; index ++)
{
}
}
So that result is in picture as below
In other words, in my code all reading data is being read in sequence according to time and endpoint number. My result is different from the data flow on the Windows side.
In brief, I have two BULK IN endpoints, and they are starting read data close according to time. How is it possible?
It's not clear to me whether you're using a different method for getting the data on Windows or not, I'm going to assume that you are.
I'm not an expert on libusb by any means, but my guess would be that you are overwriting you data with each call, since you're using the same buffer each time. Try giving your buffer a fixed value before using the transfer method, and then evaluate the result.
If it is the case, I believe something along the lines of the following would also work in C:
imageBuffer2 = (unsigned char *) malloc(2048);
char *imageBuffer2P = imageBuffer2;
imageBuffer6 = (unsigned char *) malloc(2048);
char *imageBuffer6P = imageBuffer6;
int dataRead2 = 0;
int dataRead6 = 0;
while(dataRead2 < 2048 || dataRead6 < 2048)
{
int actual2 = 0;
int actual6 = 0;
libusb_bulk_transfer(devh, BULK_EP_2, imageBuffer2P, 2048-dataRead2, &actual2, 200);
libusb_bulk_transfer(devh, BULK_EP_6, imageBuffer6P, 2048-dataRead6, &actual6, 200);
dataRead2 += actual2;
dataRead6 += actual6;
imageBuffer2P += actual2;
imageBuffer6P += actual6;
usleep(1);
}
Related
I am having problem understanding difference between below two code cases. Case 1 is working as per expectation and Case 2 is not.
Problem Statement: I need to write some set of DWORDS on my device file and trigger a DMA. DMA capacity is 128*4 bytes (128 DWORDS). So I want to trigger DMA(using ioctl) after writing 128 bytes to device file descriptor to have full utilisation of capacity. I can do this for individual Dwords as well.
Basic difference in two cases:
In first case intention is to write 128 DWORDS to the file at once and in second case write DWORDS individually and trigger DMA after 128 DWORDS are written.
Is the data written to the file same in both cases? Second is not working so something wrong there. please help. By not working I mean the expected result of DMA commands are not happening so in second case data in file descriptor is is not same as first case just before the dma command.
Case 1 (WORKING)
int input_dwords[128] = {0xAABBCCDD, 0XBBCCDDAA, ....} //128 DWORDS (actually this data is in
//text file just putting in array for
//illustration)
int cmd_buf = (int*)malloc(sizeof(int) * 128); //space for 128 DWORDS
int* cur = cmd_buf;
for(int i = 0; i< 128; i++)
{
*cur = input_dword[i]
cur++;
}
//Write to file in one shot
write(fd, cmd_buf, sizeof(int)*128);
//trigger DMA (ioctl)
trigger_dma(fd);
Case 2 (NOT WORKING)
int input_dwords[128] = {0xAABBCCDD, 0xBBCCDDAA, ....}
int* cur = (int*)input_dwords;
for(int i = 0 ; i< 128 i++)
{
//writing to file one DWORD at a time.
write(fd, cur, sizeof(int));
cur++;
}
//trigger DMA (ioctl)
trigger_dma(fd);
I'm continuously sending 2D arrays of pixel values (uint32) from LabVIEW to a C-program through TCP/IP with the resolution 160x120. The purpose of the C-program is to display the received pixel values as 2D arrays in the console application. I'm sending the pixels as stream of bytes, and using the recv function in Ws2_32.lib to receive the bytes in the C-program. Then I'm converting the bytes to uint32 values and displaying them in the console application using a 2D arrays, so every 2D array will represent an image.
I have en issue with the frame rate though. I'm able to send 30 frames per second in LabVIEW, but when I open the TCP/IP connection with the C-program, the frame rate goes down to 1 frame per second. It must be an issue with the C-program, since I managed to send the desired frames per second with the same LabVIEW program to a corresponding C# program.
The C-code:
#define DEFAULT_BUFLEN 256
#define IMAGEX 120
#define IMAGEY 160
WSADATA wsa;
SOCKET s , new_socket;
struct sockaddr_in server , client;
int c;
int iResult;
char recvbuf[DEFAULT_BUFLEN];
int recvbuflen = DEFAULT_BUFLEN;
typedef unsigned int uint32_t;
unsigned int x=0,y=0,i,n;
uint32_t image[IMAGEX][IMAGEY];
size_t len;
uint32_t* p;
p = (uint32_t*)recvbuf;
do
{
iResult = recv(new_socket, recvbuf, recvbuflen, 0);
len = iResult/sizeof(uint32_t);
for(i=0; i < len; i++)
{
image[x][y] = p[i];
x++;
if (x >= IMAGEX)
{
x=0;
y++;
}
if (y >= IMAGEY)
{
y = 0;
x = 0;
//print image
for (n=0; n< IMAGEX*IMAGEY; n++)
{
printf("%d",image[n%IMAGEX][n/IMAGEY]);
if (n % IMAGEX)
{
printf(" ");
}
else
{
printf("\n");
}
}
}
}
} while ( iResult > 0 );
try reducing the prints .. Since you are reading and printing in the same thread, the data in the TCP connection will fill up and it will then back pressure the other end (LABView) and the LABView will stop sending data until it gets the green signal from the other end (you C program)
To start with you can debug by replacing this
for (n=0; n< IMAGEX*IMAGEY; n++)
{
printf("%d",image[n%IMAGEX][n/IMAGEY]);
if (n % IMAGEX)
{
printf(" ");
}
else
{
printf("\n");
}
}
with
printf("One frame recv\n");
and see if it makes any difference. I am assuming your tcp connection has ample bandwidth
Very hard to diagnose without further information. I can give a few suggestions, however.
First of all, your recv call is using a small buffer, so you are spending a lot of time calling it. Why not read a whole frame at a time? Also, you read in the data and then copy it to the image array. Wouldn't it be simpler to just use the image array itself? Combining those two suggestions would have recv reading a full frame directly into the image array, saving a lot of time.
Another source of the problem could be the console. With the sample code you provided, you are attempting to write 30*120*160=57,600 integer values per second to the terminal. If the average value, with delimiter, takes up 8 characters, that's 4 million characters per second. It's entirely possible that the display just can't go that fast, in which case things would back up and slow down all the way to the server writing to the socket.
There are several ways to handle this, but it's too much to go into here.
I'm developing on an AD Blackfin BF537 DSP running uClinux. I have a total of 32MB SD-RAM available. I have an ADC attached, which I can access using a simple, blocking call to read().
The most interesting part of my code is below. Running the program seems to work just fine, I get a nice data package that I can fetch from the SD-card and plot. However, if I comment out the float calculation part (as noted in the code), I get only zeroes in the ft_all.raw file. The same occurs if I change optimization level from -O3 to -O0.
I've tried countless combinations of all sorts of things, and sometimes it works, sometimes it does not - earlier (with minor modifications to below), the code would only work when optimization was disabled. It may also break if I add something else further down in the file.
My suspicion is that the data transferred by the read()-function may not have been transferred fully (is that possible, even though it returns the correct number of bytes?). This is also the first time I initialize pointers using direct memory adresses, and I have no idea how the compiler reacts to this - perhaps I missed something, here?
I've spent days on this issue now, and I'm getting desperate - I would really appreciate some help on this one! Thanks in advance.
// Clear the top 16M memory for data processing
memset((int *)0x01000000,0x0000,(size_t)SIZE_16M);
/* Prep some pointers for data processing */
int16_t *buffer;
int16_t *buf16I, *buf16Q;
buffer = (int16_t *)(0x1000000);
buf16I = (int16_t *)(0x1600000);
buf16Q = (int16_t *)(0x1680000);
/* Read data from ADC */
int rbytes = read(Sportfd, (int16_t*)buffer, 0x200000);
if (rbytes != 0x200000) {
printf("could not sample data! %X\n",rbytes);
goto end;
} else {
printf("Read %X bytes\n",rbytes);
}
FILE *outfd;
int wbytes;
/* Commenting this region results in all zeroes in ft_all.raw */
float a,b;
int c;
b = 0;
for (c = 0; c < 1000; c++) {
a = c;
b = b+pow(a,3);
}
printf("b is %.2f\n",b);
/* Only 12 LSBs of each 32-bit word is actual data.
* First 20 bits of nothing, then 12 bits I, then 20 bits
* nothing, then 12 bits Q, etc...
* Below, the I and Q parts are scaled with a factor of 16
* and extracted to buf16I and buf16Q.
* */
int32_t *buf32;
buf32 = (int32_t *)buffer;
uint32_t i = 0;
uint32_t n = 0;
while (n < 0x80000) {
buf16I[i] = buf32[n] << 4;
n++;
buf16Q[i] = buf32[n] << 4;
i++;
n++;
}
printf("Saving to /mnt/sd/d/ft_all.raw...");
outfd = fopen("/mnt/sd/d/ft_all.raw", "w+");
if (outfd == NULL) {
printf("Could not open file.\n");
}
wbytes = fwrite((int*)0x1600000, 1, 0x100000, outfd);
fclose(outfd);
if (wbytes < 0x100000) {
printf("wbytes not correct (= %d) \n", (int)wbytes);
}
printf(" done.\n");
Edit: The code seems to work perfectly well if I use read() to read data from a simple file rather than the ADC. This leads me to believe that the rather hacky-looking code when extracting the I and Q parts of the input is working as intended. Inspecting the assembly generated by the compiler confirms this.
I'm trying to get in touch with the developer of the ADC driver to see if he has an explanation of this behaviour.
The ADC is connected through a SPORT, and is opened as such:
sportfd = open("/dev/sport1", O_RDWR);
ioctl(sportfd, SPORT_IOC_CONFIG, spconf);
And here are the options used when configuring the SPORT:
spconf->int_clk = 1;
spconf->word_len = 32;
spconf->serial_clk = SPORT_CLK;
spconf->fsync_clk = SPORT_CLK/34;
spconf->fsync = 1;
spconf->late_fsync = 1;
spconf->act_low = 1;
spconf->dma_enabled = 1;
spconf->tckfe = 0;
spconf->rckfe = 1;
spconf->txse = 0;
spconf->rxse = 1;
A bfin_sport.h file from Analog Devices is also included: https://gist.github.com/tausen/5516954
Update
After a long night of debugging with the previous developer on the project, it turned out the issue was not related to the code shown above at all. As Chris suggested, it was indeed an issue with the SPORT driver and the ADC configuration.
While debugging, this error messaged appeared whenever the data was "broken": bfin_sport: sport ffc00900 status error: TUVF. While this doesn't make much sense in the application, it was clear from printing the data, that something was out of sync: the data in buffer was on the form 0x12000000,0x34000000,... rather than 0x00000012,0x00000034,... whenever the status error was shown. It seems clear then, why buf16I and buf16Q only contained zeroes (since I am extracting the 12 LSBs).
Putting in a few calls to usleep() between stages of ADC initialization and configuration seems to have fixed the issue - I'm hoping it stays that way!
The Overview
I am using the low-level calls in the libbzip2 library: BZ2_bzCompressInit(), BZ2_bzCompress() and BZ2_bzCompressEnd() to compress chunks of data to standard output.
I am migrating working code from higher-level calls, because I have a stream of bytes coming in and I want to compress those bytes in sets of discrete chunks (a discrete chunk is a set of bytes that contains a group of tokens of interest — my input is logically divided into groups of these chunks).
A complete group of chunks might contain, say, 500 chunks, which I want to compress to one bzip2 stream and write to standard output.
Within a set, using the pseudocode I outline below, if my example buffer is able to hold 101 chunks at a time, I would open a new stream, compress 500 chunks in runs of 101, 101, 101, 101, and one final run of 96 chunks that closes the stream.
The Problem
The issue is that my bz_stream structure instance, which keeps tracks of the number of compressed bytes in a single pass of the BZ2_bzCompress() routine, seems to claim to be writing more compressed bytes than the total bytes in the final, compressed file.
For example, the compressed output could be a file with a true size of 1234 bytes, while the number of reported compressed bytes (which I track while debugging) is somewhat higher than 1234 bytes (say 2345 bytes).
My rough pseudocode is in two parts.
The first part is a rough sketch of what I do to compress a subset of chunks (and I know that I have another subset coming after this one):
bz_stream bzStream;
unsigned char bzBuffer[BZIP2_BUFFER_MAX_LENGTH] = {0};
unsigned long bzBytesWritten = 0UL;
unsigned long long cumulativeBytesWritten = 0ULL;
unsigned char myBuffer[UNCOMPRESSED_MAX_LENGTH] = {0};
size_t myBufferLength = 0;
/* initialize bzStream */
bzStream.next_in = NULL;
bzStream.avail_in = 0U;
bzStream.avail_out = 0U;
bzStream.bzalloc = NULL;
bzStream.bzfree = NULL;
bzStream.opaque = NULL;
int bzError = BZ2_bzCompressInit(&bzStream, 9, 0, 0);
/* bzError checking... */
do
{
/* read some bytes into myBuffer... */
/* compress bytes in myBuffer */
bzStream.next_in = myBuffer;
bzStream.avail_in = myBufferLength;
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
do
{
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
bzError = BZ2_bzCompress(&bzStream, BZ_RUN);
/* error checking... */
bzBytesWritten = ((unsigned long) bzStream.total_out_hi32 << 32) + bzStream.total_out_lo32;
cumulativeBytesWritten += bzBytesWritten;
/* write compressed data in bzBuffer to standard output */
fwrite(bzBuffer, 1, bzBytesWritten, stdout);
fflush(stdout);
}
while (bzError == BZ_OK);
}
while (/* while there is a non-final myBuffer full of discrete chunks left to compress... */);
Now we wrap up the output:
/* read in the final batch of bytes into myBuffer (with a total byte size of `myBufferLength`... */
/* compress remaining myBufferLength bytes in myBuffer */
bzStream.next_in = myBuffer;
bzStream.avail_in = myBufferLength;
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
do
{
bzStream.next_out = bzBuffer;
bzStream.avail_out = BZIP2_BUFFER_MAX_LENGTH;
bzError = BZ2_bzCompress(&bzStream, (bzStream.avail_in) ? BZ_RUN : BZ_FINISH);
/* bzError error checking... */
/* increment cumulativeBytesWritten by `bz_stream` struct `total_out_*` members */
bzBytesWritten = ((unsigned long) bzStream.total_out_hi32 << 32) + bzStream.total_out_lo32;
cumulativeBytesWritten += bzBytesWritten;
/* write compressed data in bzBuffer to standard output */
fwrite(bzBuffer, 1, bzBytesWritten, stdout);
fflush(stdout);
}
while (bzError != BZ_STREAM_END);
/* close stream */
bzError = BZ2_bzCompressEnd(&bzStream);
/* bzError checking... */
The Questions
Am I calculating cumulativeBytesWritten (or, specifically, bzBytesWritten) incorrectly, and how would I fix that?
I have been tracking these values in a debug build, and I do not seem to be "double counting" the bzBytesWritten value. This value is counted and used once to increment cumulativeBytesWritten after each successful BZ2_bzCompress() pass.
Alternatively, am I not understanding the correct use of the bz_stream state flags?
For example, does the following compress and keep the bzip2 stream open, so long as I keep sending some bytes?
bzError = BZ2_bzCompress(&bzStream, BZ_RUN);
Likewise, can the following statement compress data, so long as there are at least some bytes are available to access from the bzStream.next_in pointer (BZ_RUN), and then the stream is wrapped up when there are no more bytes available (BZ_FINISH)?
bzError = BZ2_bzCompress(&bzStream, (bzStream.avail_in) ? BZ_RUN : BZ_FINISH);
Or, am I not using these low-level calls correctly at all? Should I go back to using the higher-level calls to continuously append a grouping of compressed chunks of data to one main file?
There's probably a simple solution to this, but I've been banging my head on the table for a couple days in the course of debugging what could be wrong, and I'm not making much progress. Thank you for any advice.
In answer to my own question, it appears I am miscalculating the number of bytes written. I should not use the total_out_* members. The following correction works properly:
bzBytesWritten = sizeof(bzBuffer) - bzStream.avail_out;
The rest of the calculations follow.
I'm trying to access a SPI sensor using the SPIDEV driver but my code gets stuck on IOCTL.
I'm running embedded Linux on the SAM9X5EK (mounting AT91SAM9G25). The device is connected to SPI0. I enabled CONFIG_SPI_SPIDEV and CONFIG_SPI_ATMEL in menuconfig and added the proper code to the BSP file:
static struct spi_board_info spidev_board_info[] {
{
.modalias = "spidev",
.max_speed_hz = 1000000,
.bus_num = 0,
.chips_select = 0,
.mode = SPI_MODE_3,
},
...
};
spi_register_board_info(spidev_board_info, ARRAY_SIZE(spidev_board_info));
1MHz is the maximum accepted by the sensor, I tried 500kHz but I get an error during Linux boot (too slow apparently). .bus_num and .chips_select should correct (I also tried all other combinations). SPI_MODE_3 I checked the datasheet for it.
I get no error while booting and devices appear correctly as /dev/spidevX.X. I manage to open the file and obtain a valid file descriptor. I'm now trying to access the device with the following code (inspired by examples I found online).
#define MY_SPIDEV_DELAY_USECS 100
// #define MY_SPIDEV_SPEED_HZ 1000000
#define MY_SPIDEV_BITS_PER_WORD 8
int spidevReadRegister(int fd,
unsigned int num_out_bytes,
unsigned char *out_buffer,
unsigned int num_in_bytes,
unsigned char *in_buffer)
{
struct spi_ioc_transfer mesg[2] = { {0}, };
uint8_t num_tr = 0;
int ret;
// Write data
mesg[0].tx_buf = (unsigned long)out_buffer;
mesg[0].rx_buf = (unsigned long)NULL;
mesg[0].len = num_out_bytes;
// mesg[0].delay_usecs = MY_SPIDEV_DELAY_USECS,
// mesg[0].speed_hz = MY_SPIDEV_SPEED_HZ;
mesg[0].bits_per_word = MY_SPIDEV_BITS_PER_WORD;
mesg[0].cs_change = 0;
num_tr++;
// Read data
mesg[1].tx_buf = (unsigned long)NULL;
mesg[1].rx_buf = (unsigned long)in_buffer;
mesg[1].len = num_in_bytes;
// mesg[1].delay_usecs = MY_SPIDEV_DELAY_USECS,
// mesg[1].speed_hz = MY_SPIDEV_SPEED_HZ;
mesg[1].bits_per_word = MY_SPIDEV_BITS_PER_WORD;
mesg[1].cs_change = 1;
num_tr++;
// Do the actual transmission
if(num_tr > 0)
{
ret = ioctl(fd, SPI_IOC_MESSAGE(num_tr), mesg);
if(ret == -1)
{
printf("Error: %d\n", errno);
return -1;
}
}
return 0;
}
Then I'm using this function:
#define OPTICAL_SENSOR_ADDR "/dev/spidev0.0"
...
int fd;
fd = open(OPTICAL_SENSOR_ADDR, O_RDWR);
if (fd<=0) {
printf("Device not found\n");
exit(1);
}
uint8_t buffer1[1] = {0x3a};
uint8_t buffer2[1] = {0};
spidevReadRegister(fd, 1, buffer1, 1, buffer2);
When I run it, the code get stuck on IOCTL!
I did this way because, in order to read a register on the sensor, I need to send a byte with its address in it and then get the answer back without changing CS (however, when I tried using write() and read() functions, while learning, I got the same result, stuck on them).
I'm aware that specifying .speed_hz causes a ENOPROTOOPT error on Atmel (I checked spidev.c) so I commented that part.
Why does it get stuck? I though it can be as the device is created but it actually doesn't "feel" any hardware. As I wasn't sure if hardware SPI0 corresponded to bus_num 0 or 1, I tried both, but still no success (btw, which one is it?).
UPDATE: I managed to have the SPI working! Half of it.. MOSI is transmitting the right data, but CLK doesn't start... any idea?
When I'm working with SPI I always use an oscyloscope to see the output of the io's. If you have a 4 channel scope ypu can easily debug the issue, and find out if you're axcessing the right io's, using the right speed, etc. I usually compare the signal I get to the datasheet diagram.
I think there are several issues here. First of all SPI is bidirectional. So if yo want to send something over the bus you also get something. Therefor always you have to provide a valid buffer to rx_buf and tx_buf.
Second, all members of the struct spi_ioc_transfer have to be initialized with a valid value. Otherwise they just point to some memory address and the underlying process is accessing arbitrary data, thus leading to unknown behavior.
Third, why do you use a for loop with ioctl? You already tell ioctl you haven an array of spi_ioc_transfer structs. So all defined transaction will be performed with one ioctl call.
Fourth ioctl needs a pointer to your struct array. So ioctl should look like this:
ret = ioctl(fd, SPI_IOC_MESSAGE(num_tr), &mesg);
You see there is room for improvement in your code.
This is how I do it in a c++ library for the raspberry pi. The whole library will soon be on github. I'll update my answer when it is done.
void SPIBus::spiReadWrite(std::vector<std::vector<uint8_t> > &data, uint32_t speed,
uint16_t delay, uint8_t bitsPerWord, uint8_t cs_change)
{
struct spi_ioc_transfer transfer[data.size()];
int i = 0;
for (std::vector<uint8_t> &d : data)
{
//see <linux/spi/spidev.h> for details!
transfer[i].tx_buf = reinterpret_cast<__u64>(d.data());
transfer[i].rx_buf = reinterpret_cast<__u64>(d.data());
transfer[i].len = d.size(); //number of bytes in vector
transfer[i].speed_hz = speed;
transfer[i].delay_usecs = delay;
transfer[i].bits_per_word = bitsPerWord;
transfer[i].cs_change = cs_change;
i++
}
int status = ioctl(this->fileDescriptor, SPI_IOC_MESSAGE(data.size()), &transfer);
if (status < 0)
{
std::string errMessage(strerror(errno));
throw std::runtime_error("Failed to do full duplex read/write operation "
"on SPI Bus " + this->deviceNode + ". Error message: " +
errMessage);
}
}