ALSA snd_pcm_avail always returning 0

I am trying to read data from my codec. For project reasons I would like to use non-blocking reads, but every time I query the amount of data available on the codec it reports zero.
The algorithm is simple: wait 1 ms, check whether at least 160 samples are available on the codec, and if so read them. But every time I check, the sample count is zero.
Can someone help me understand why "rc = snd_pcm_avail(inputCodecHandle);" always returns zero?
Here is the thread function:
void CRadioStack::rcvThread() {
ChannelBuffer_t *buffer_p = NULL;
int8_t *inputBuf_p;
int rc;
int16_t *inputBuf16_p;
int samplesToRead;
const int rxFrameSize = 160;
snd_pcm_sframes_t delay;
snd_pcm_nonblock(inputCodecHandle, 1);
snd_pcm_prepare(inputCodecHandle);
while (true) {
TWTime::msleep(1);
// get the number of samples available
snd_pcm_delay(inputCodecHandle, &delay);
rc = snd_pcm_avail(inputCodecHandle);
if (rc < 0) {
myLog->warn("Error in getting sample count: %s", snd_strerror(rc));
snd_pcm_prepare(outputCodecHandle);
continue;
}
samplesToRead = rc;
// if number of samples > 160 then get 160 samples
if (samplesToRead <= rxFrameSize) {
continue;
}
// read from the codec into the channel buffer
rc = snd_pcm_readi(inputCodecHandle, inputBuf_p, rxFrameSize);
if (rc < 0) {
myLog->warn("Error reading Codec: %s", snd_strerror(rc));
continue;
} else if (rc != rxFrameSize) { // nothing to get
myLog->warn("Input samples on codec not 160");
}
pushToInputQueue(inputBuf_p);
}
}
And here is the code to open the codec.
bool CRadioStack::openInputCodec()
{
unsigned int val;
int dir;
const int NUM_OF_CHAN = 1;
codecRunning = false;
snd_pcm_uframes_t frames;
int rc;
snd_pcm_t *handle;
snd_pcm_hw_params_t *params;
inputCodecHandle = nullptr;
// Open PCM device for capture
rc = snd_pcm_open(&handle, "hw:0,0", SND_PCM_STREAM_CAPTURE, 0);
if (rc < 0) {
myLog->error("Unable to open input codec: %s", snd_strerror(rc));
return false;
}
// allocate a hardware parameters object
snd_pcm_hw_params_alloca(&params);
// fill with default values
snd_pcm_hw_params_any(handle, params);
// now set up the hardware parameters
snd_pcm_hw_params_set_access(handle, params, SND_PCM_ACCESS_RW_INTERLEAVED); // interleaved
snd_pcm_hw_params_set_format(handle, params, SND_PCM_FORMAT_S16_LE); // 16-bit linear little-endian
snd_pcm_hw_params_set_channels(handle, params, NUM_OF_CHAN); // one channel
val = 0;
snd_pcm_hw_params_set_channels_near(handle, params, &val); // one channel
val = 8000;
dir = 0;
snd_pcm_hw_params_set_rate_near(handle, params, &val, &dir); // 8k sample rate.
frames = 160;
snd_pcm_hw_params_set_period_size_near(handle, params, &frames, &dir); // period size = 160 frames
// save the hardware parameters
rc = snd_pcm_hw_params(handle, params);
if (rc < 0) {
myLog->error("Unable to save hardware parameters to output codec.");
return false;
}
// ready to write to output codec.
// so save the handle so that it can be used elsewhere.
inputCodecHandle = handle;
return true;
}
Thanks!

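The device was never started, and a capture stream that is not running has no frames available, so snd_pcm_avail() returns 0.
Starting would happen automatically with the first snd_pcm_read*() call, but the loop above only reads once snd_pcm_avail() reports more than 160 frames, so that first read never happens. The stream can also be started explicitly with snd_pcm_start().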

Related

Playing audio received over CAN FD with ALSA

I am trying to use the ALSA library to play, through my headphones, the audio I receive over my CAN FD link. I don't quite understand how to configure ALSA's parameters properly so that I can listen to the sound I receive from the CAN FD.
static char *device = "plughw:0,0"; /* playback device */
static snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE; /* sample format */
static unsigned int rate = 16000; /* stream rate */
static unsigned int channels = 1; /* count of channels */
static unsigned int buffer_time = 40000; /* ring buffer length in us */
static unsigned int period_time = 120000; /* period time in us */
static int resample = 1; /* enable alsa-lib resampling */
static int period_event = 0; /* produce poll event after each period */
int size;
while (1) {
do {
nbytes = read(s, &frame, sizeof(struct canfd_frame));
} while (nbytes == 0);
for (x = 0; x < 64; x = x + 2) {
buffer[a] = ((uint32_t) frame.data[x] << 8)
| ((uint32_t) (frame.data[x + 1]));
a++;
}
//err=snd_pcm_writei(handle,buffer,32);
//printf("Datos = %d\n", err);
memcpy(total1 + i * 32, buffer, 32 * sizeof(uint32_t));
i++;
a = 0;
if (i == 500) {
buffer_length=16000;
ptr = total1;
while(buffer_length > 0){
err = snd_pcm_writei(handle, ptr, 16000);
printf("Datos = %d\n", err);
snd_pcm_avail_delay(handle, &availp, &delayp);
//printf("available frames =%ld delay = %ld z = %d\n", availp, delayp, z);
if (err == -EAGAIN)
continue;
if(err < 0){
err=snd_pcm_recover(handle, err, 1);
}
else{
ptr += err * channels;
buffer_length -= err;
z++;
}
if(err<0){
printf("snd_pcm_writei failed: %s\n", snd_strerror(err));
break;
}
}
i = 0;
}
This is part of my code; I don't think posting the whole thing is worthwhile. I don't understand which values I should give buffer_time and period_time, or how to listen in real time to what I receive over CAN FD. I am using snd_pcm_writei, passing a buffer that I fill with samples received from the CAN FD. I don't know what size to give the buffer or the "frames" variable, which I also don't quite understand even though I have read about it.
Any idea how I should configure my system (buffer_time, period_time, buffer_size, frames, ...)?
I have tried different buffer and frame sizes, but I don't think I understand how it works. How can I calculate the frame count and buffer size for snd_pcm_writei() so that I can listen to the audio in real time?
Should I use two different threads: one to fill the buffer with the CAN FD data and the other to handle the buffer and the audio output?
Thanks in advance,
Ander.
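For reference, the time-based parameters in the question map onto frame counts through the sample rate. The following is a minimal sketch of that arithmetic only, assuming 16-bit mono samples and 64-byte CAN FD payloads as in the code above; us_to_frames and canfd_samples are hypothetical helper names, not ALSA calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an ALSA call): number of frames that span
 * `time_us` microseconds at `rate` frames per second. */
static uint64_t us_to_frames(uint32_t rate, uint64_t time_us)
{
    return ((uint64_t)rate * time_us) / 1000000u;
}

/* Hypothetical helper: samples carried by `n_frames` CAN FD frames when
 * each 64-byte payload holds 16-bit samples (32 samples per frame). */
static uint64_t canfd_samples(uint64_t n_frames)
{
    assert(n_frames <= UINT64_MAX / 32);
    return n_frames * (64 / 2);
}
```

At 16000 Hz, buffer_time = 40000 us corresponds to 640 frames and period_time = 120000 us to 1920 frames, so the requested period is longer than the whole ring buffer; the buffer has to hold at least one period and typically holds several. Collecting 500 CAN FD frames gives 16000 samples, which is one full second of audio at that rate before anything is written to the device.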
I have finally managed to hear myself through the headphones. I changed the configuration from my previous post to synchronize it with the data I receive from the CAN FD. I am posting part of my code below in case somebody needs an example. The most important part of handling buffers like these is managing the time it takes to fill them and the time it takes to play them out. Working out those times and configuring the ALSA parameters accordingly makes the buffers easier to handle.
static char *device = "plughw:0,0"; /* playback device */
static snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE; /* sample format */
static unsigned int rate = 22000; /* stream rate */
static unsigned int channels = 1; /* count of channels */
static unsigned int buffer_time = 1000; /* ring buffer length in us */
static unsigned int period_time = 10000; /* period time in us */
static int resample = 1; /* enable alsa-lib resampling */
static int period_event = 0; /* produce poll event after each period */
int size;
static snd_pcm_sframes_t buffer_size;
static snd_pcm_sframes_t period_size;
static snd_output_t *output = NULL;
snd_pcm_sframes_t delayp;
snd_pcm_sframes_t availp;
snd_pcm_uframes_t frames;
static void write_loop(snd_pcm_t *handle) {
uint32_t *buffer = malloc(16000 * sizeof(uint32_t));
uint32_t *total1 = malloc(16000 * sizeof(uint32_t)); // array to hold the result
while (1) {
do {
nbytes = read(s, &frame, sizeof(struct canfd_frame));
} while (nbytes == 0);
for (x = 0; x < 64;x = x + 2) {
buffer[a] = ((uint32_t) frame.data[x] << 8)
| ((uint32_t) (frame.data[x + 1]));
//buffer[a]=frame.data[x];
a++;
}
i++;
if (i == 250) {
memcpy(total1, buffer, 16000 * sizeof(uint32_t));
//printf("Address = %lu \n",(unsigned long)total1);
flag = 1;
buffer_length = 16000;
i = 0;
a = 0;
}
if (flag == 1) {
while(buffer_length > 0) {
snd_pcm_prepare(handle);
err = snd_pcm_writei(handle, total1, buffer_length);
//printf("Datos = %d\n", err);
snd_pcm_avail_delay(handle, &availp, &delayp);
//printf("available frames =%ld delay = %ld\n",availp,delayp);
if (err == -EAGAIN)
continue;
if (err < 0) {
err = snd_pcm_recover(handle, err, 1);
} else {
ptr += err * channels;
buffer_length -= err;
z++;
}
if (err < 0) {
printf("snd_pcm_writei failed: %s\n", snd_strerror(err));
break;
}
}
flag = 0;
}
}
}
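One thing to watch in the loops above: each sample is stored in a uint32_t element, while the device is opened as SND_PCM_FORMAT_S16_LE, where one mono frame is 2 bytes, so snd_pcm_writei() would read every 32-bit element as two samples. A minimal sketch of unpacking into int16_t instead, assuming the payload holds big-endian 16-bit samples as the shift in the original code suggests; unpack_be16 is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unpack big-endian byte pairs from a payload into
 * signed 16-bit samples. Returns the number of samples written. */
static size_t unpack_be16(const uint8_t *data, size_t len, int16_t *out)
{
    assert(data != NULL && out != NULL);
    size_t n = len / 2;
    for (size_t i = 0; i < n; i++) {
        /* Combine high byte and low byte into an unsigned 16-bit value. */
        uint16_t u = (uint16_t)(((unsigned)data[2 * i] << 8) | data[2 * i + 1]);
        /* Map to two's-complement signed range without relying on
         * implementation-defined narrowing. */
        out[i] = (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
    }
    return n;
}
```

With a 64-byte payload this yields 32 samples per CAN FD frame, and the resulting int16_t array can be passed to snd_pcm_writei() with a frame count equal to the number of samples for a mono stream.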

Decoding and resampling audio with FFmpeg for output with libao

I'm trying to write a program to read and play an audio file using FFmpeg and libao. I've been following the procedure outlined in the FFmpeg documentation for decoding audio using the new avcodec_send_packet and avcodec_receive_frame functions, but the examples I've been able to find are few and far between (the ones in the FFmpeg documentation either don't use libavformat or use the deprecated avcodec_decode_audio4). I've based a lot of my program off of the transcode_aac.c example (up to init_resampler) in the FFmpeg documentation, but that also uses the deprecated decoding function.
I believe I have the decoding part of the program working, but I need to resample the audio in order to convert it into an interleaved format to send to libao, for which I'm attempting to use libswresample. Whenever the program is run in its current state, it outputs (many times) "Error resampling: Output changed". The test file I've been using is just a YouTube rip that I had on hand. ffprobe reports the only stream as:
Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 125 kb/s (default)
This is my first program with FFmpeg (and I'm still relatively new to C), so any advice on how to improve/fix other parts of the program would be welcome.
#include<stdio.h>
#include<libavcodec/avcodec.h>
#include<libavformat/avformat.h>
#include<libavutil/avutil.h>
#include<libswresample/swresample.h>
#include<ao/ao.h>
#define OUTPUT_CHANNELS 2
#define OUTPUT_RATE 44100
#define BUFFER_SIZE 192000
#define OUTPUT_BITS 16
#define OUTPUT_FMT AV_SAMPLE_FMT_S16
static char *errtext (int err) {
static char errbuff[256];
av_strerror(err,errbuff,sizeof(errbuff));
return errbuff;
}
static int open_audio_file (const char *filename, AVFormatContext **context, AVCodecContext **codec_context) {
AVCodecContext *avctx;
AVCodec *codec;
int ret;
int stream_id;
int i;
// Open input file
if ((ret = avformat_open_input(context,filename,NULL,NULL)) < 0) {
fprintf(stderr,"Error opening input file '%s': %s\n",filename,errtext(ret));
*context = NULL;
return ret;
}
// Get stream info
if ((ret = avformat_find_stream_info(*context,NULL)) < 0) {
fprintf(stderr,"Unable to find stream info: %s\n",errtext(ret));
avformat_close_input(context);
return ret;
}
// Find the best stream
if ((stream_id = av_find_best_stream(*context,AVMEDIA_TYPE_AUDIO,-1,-1,&codec,0)) < 0) {
fprintf(stderr,"Unable to find valid audio stream: %s\n",errtext(stream_id));
avformat_close_input(context);
return stream_id;
}
// Allocate a decoding context
if (!(avctx = avcodec_alloc_context3(codec))) {
fprintf(stderr,"Unable to allocate decoder context\n");
avformat_close_input(context);
return AVERROR(ENOMEM);
}
// Initialize stream parameters
if ((ret = avcodec_parameters_to_context(avctx,(*context)->streams[stream_id]->codecpar)) < 0) {
fprintf(stderr,"Unable to get stream parameters: %s\n",errtext(ret));
avformat_close_input(context);
avcodec_free_context(&avctx);
return ret;
}
// Open the decoder
if ((ret = avcodec_open2(avctx,codec,NULL)) < 0) {
fprintf(stderr,"Could not open codec: %s\n",errtext(ret));
avformat_close_input(context);
avcodec_free_context(&avctx);
return ret;
}
*codec_context = avctx;
return 0;
}
static void init_packet (AVPacket *packet) {
av_init_packet(packet);
packet->data = NULL;
packet->size = 0;
}
static int init_resampler (AVCodecContext *codec_context, SwrContext **resample_context) {
int ret;
// Set resampler options
*resample_context = swr_alloc_set_opts(NULL,
av_get_default_channel_layout(OUTPUT_CHANNELS),
OUTPUT_FMT,
codec_context->sample_rate,
av_get_default_channel_layout(codec_context->channels),
codec_context->sample_fmt,
codec_context->sample_rate,
0,NULL);
if (!(*resample_context)) {
fprintf(stderr,"Unable to allocate resampler context\n");
return AVERROR(ENOMEM);
}
// Open the resampler
if ((ret = swr_init(*resample_context)) < 0) {
fprintf(stderr,"Unable to open resampler context: %s\n",errtext(ret));
swr_free(resample_context);
return ret;
}
return 0;
}
static int init_frame (AVFrame **frame) {
if (!(*frame = av_frame_alloc())) {
fprintf(stderr,"Could not allocate frame\n");
return AVERROR(ENOMEM);
}
return 0;
}
int main (int argc, char *argv[]) {
AVFormatContext *context = 0;
AVCodecContext *codec_context;
SwrContext *resample_context = NULL;
AVPacket packet;
AVFrame *frame = 0;
AVFrame *resampled = 0;
int16_t *buffer;
int ret, packet_ret, finished;
ao_device *device;
ao_sample_format format;
int default_driver;
if (argc != 2) {
fprintf(stderr,"Usage: %s <filename>\n",argv[0]);
return 1;
}
av_register_all();
printf("Opening file...\n");
if (open_audio_file(argv[1],&context,&codec_context) < 0)
return 1;
printf("Initializing resampler...\n");
if (init_resampler(codec_context,&resample_context) < 0) {
avformat_close_input(&context);
avcodec_free_context(&codec_context);
return 1;
}
// Setup libao
printf("Starting audio device...\n");
ao_initialize();
default_driver = ao_default_driver_id();
format.bits = OUTPUT_BITS;
format.channels = OUTPUT_CHANNELS;
format.rate = codec_context->sample_rate;
format.byte_format = AO_FMT_NATIVE;
format.matrix = 0;
if ((device = ao_open_live(default_driver,&format,NULL)) == NULL) {
fprintf(stderr,"Error opening audio device\n");
avformat_close_input(&context);
avcodec_free_context(&codec_context);
swr_free(&resample_context);
return 1;
}
// Mainloop
printf("Beginning mainloop...\n");
init_packet(&packet);
// Read packets until done
while (1) {
packet_ret = av_read_frame(context,&packet);
// Send a packet
if ((ret = avcodec_send_packet(codec_context,&packet)) < 0)
fprintf(stderr,"Error sending packet to decoder: %s\n",errtext(ret));
av_packet_unref(&packet);
while (1) {
if (!frame)
frame = av_frame_alloc();
ret = avcodec_receive_frame(codec_context,frame);
if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) // Need more input
break;
else if (ret < 0) {
fprintf(stderr,"Error receiving frame: %s\n",errtext(ret));
break;
}
// We have a valid frame, need to resample it
if (!resampled)
resampled = av_frame_alloc();
resampled->channel_layout = av_get_default_channel_layout(OUTPUT_CHANNELS);
resampled->sample_rate = codec_context->sample_rate;
resampled->format = OUTPUT_FMT;
if ((ret = swr_convert_frame(resample_context,resampled,frame)) < 0) {
fprintf(stderr,"Error resampling: %s\n",errtext(ret));
} else {
ao_play(device,(char*)resampled->extended_data[0],resampled->linesize[0]);
}
av_frame_unref(resampled);
av_frame_unref(frame);
}
if (packet_ret == AVERROR_EOF)
break;
}
printf("Closing file and freeing contexts...\n");
avformat_close_input(&context);
avcodec_free_context(&codec_context);
swr_free(&resample_context);
printf("Closing audio device...\n");
ao_close(device);
ao_shutdown();
return 0;
}
UPDATE: I've got it playing sound now, but it sounds like samples are missing (and MP3 files warn that "Could not update timestamps for skipped samples"). The issue was that the resampled frame needed to have certain attributes set before being passed to swr_convert_frame. I've also added av_packet_unref and av_frame_unref, but I'm still unsure as to where to best locate them.
ao_play(device,(char*)resampled->extended_data[0],resampled->linesize[0]);
The problem is in this line. The resampled frame's linesize is not the size of the audio data: swr_convert_frame pads the data and extended_data buffers with silence for alignment, and that padding is counted in linesize, so the size passed to ao_play is too large.
ao_play(device, (char*)resampled->extended_data[0], av_samples_get_buffer_size(resampled->linesize, resampled->channels, resampled->nb_samples, resampled->format, 1));
av_samples_get_buffer_size() with align set to 1 (no alignment; 0 selects the default alignment) returns the exact size of the samples, without padding. When I faced a similar problem, this was the solution.
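For a packed format such as AV_SAMPLE_FMT_S16, the unpadded size is simply samples times channels times bytes per sample. The following plain-C sketch shows that arithmetic independently of FFmpeg; packed_audio_bytes is a hypothetical helper, not a libavutil function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes occupied by `nb_samples` frames of
 * interleaved (packed) audio, with no alignment padding. */
static size_t packed_audio_bytes(size_t nb_samples, size_t channels,
                                 size_t bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    return nb_samples * channels * bytes_per_sample;
}
```

For example, a 1024-sample stereo frame of 16-bit audio occupies 4096 bytes; any linesize larger than that includes padding that should not be sent to the audio device.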

Serial port ReadFile reads 0 bytes and returns true

I'm trying to read data from a serial port in Windows 7 using the Windows API. When I try to read, WaitCommEvent() fires just fine and the ReadFile() call returns 1 as the status, but no data is read. The ReadFile documentation says:
When a synchronous read operation reaches the end of a file, ReadFile returns TRUE and sets *lpNumberOfBytesRead to zero.
However, I'm sure there are no EOT characters in the data being sent over the serial port.
I currently have two USB cables plugged into my computer and connected to each other. I know that they can send and receive data as I have tested them with Putty.
Why won't ReadFile() read in any data?
My code is below.
Header:
typedef struct uart_handle
{
uint8_t port_num;
char port_name[10];
uint32_t baud_rate;
uint8_t byte_size;
uint8_t stop;
uint8_t parity;
int32_t error;
HANDLE handle;
} uart_handle;
Main file:
uart_handle* serial_comm_init(uint8_t port_num, uint32_t baud_rate, uint8_t byte_size, uint8_t stop, uint8_t parity)
{
uart_handle* uart;
DCB uart_params = { 0 };
COMMTIMEOUTS timeouts = { 0 };
int status;
uart = (uart_handle*) malloc(1 * sizeof(uart_handle));
status = 0;
// Set port name
if (port_num > 9)
{
sprintf(uart->port_name, "\\\\.\\COM%d", port_num);
}
else
{
sprintf(uart->port_name, "COM%d", port_num);
}
// Set baud rate
uart->baud_rate = baud_rate;
// Set byte size
uart->byte_size = byte_size;
// Set stop bit
uart->stop = stop;
// Set parity
uart->parity = parity;
// Set up comm state
uart_params.DCBlength = sizeof(uart_params);
status = GetCommState(uart->handle, &uart_params);
uart_params.BaudRate = uart->baud_rate;
uart_params.ByteSize = uart->byte_size;
uart_params.StopBits = uart->stop;
uart_params.Parity = uart->parity;
SetCommState(uart->handle, &uart_params);
// Setup actual file handle
uart->handle = CreateFile(uart->port_name, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
if (uart->handle == INVALID_HANDLE_VALUE) {
printf("Error opening serial port %s.\n", uart->port_name);
free(uart);
return NULL;
}
else {
printf("Serial port %s opened successfully.\n", uart->port_name);
}
// Set timeouts
status = GetCommTimeouts(uart->handle, &timeouts);
timeouts.ReadIntervalTimeout = 50;
timeouts.ReadTotalTimeoutConstant = 50;
timeouts.ReadTotalTimeoutMultiplier = 10;
timeouts.WriteTotalTimeoutConstant = 50;
timeouts.WriteTotalTimeoutMultiplier = 10;
status = SetCommTimeouts(uart->handle, &timeouts);
if (status == 0) {
printf("Error setting comm timeouts: %d", GetLastError());
}
return uart;
}
int32_t serial_comm_read(void* handle, uint8_t* msg, uint32_t msg_size, uint32_t timeout_ms, uint32_t flag)
{
uart_handle* uart;
uint32_t num_bytes_read;
uint32_t event_mask;
int32_t status;
uart = (uart_handle*) handle;
num_bytes_read = 0;
event_mask = 0;
status = 0;
memset(msg, 0, msg_size);
// Register Event
status = SetCommMask(uart->handle, EV_RXCHAR);
// Wait for event
status = WaitCommEvent(uart->handle, &event_mask, NULL);
printf("Recieved characters.\n");
do {
status = ReadFile(uart->handle, msg, msg_size, &num_bytes_read, NULL);
printf("Status: %d\n", status);
printf("Num bytes read: %d\n", num_bytes_read);
printf("Message: %s\n", msg);
} while (num_bytes_read > 0);
printf("Read finished.\n");
return 0;
}
Output:
Serial port COM9 opened successfully.
Recieved characters.
Status: 1
Num bytes read: 0
Message:
Read finished.
The code shown calls GetCommState() on an uninitialised handle:
status = GetCommState(uart->handle, &uart_params);
This is undefined behaviour, and the returned status is never tested. As a result uart_params most likely contains no useful data. SetCommState() is likewise called before CreateFile() has assigned uart->handle, so the requested settings are never applied to the port that is opened afterwards.
Do yourself a favour: always check the return value of every relevant function call (and let the code act accordingly)! Consider as "relevant" all functions that return or change data used afterwards.
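The same advice applies to the standard-library calls in the question: port_name holds 10 bytes, so the sprintf() with the "\\.\" prefix would overflow it for three-digit port numbers. A minimal sketch of a checked version, assuming it is acceptable to use the prefixed form for every port number; format_port_name is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the device name for COM port `port_num`
 * into `buf`. Returns 0 on success, -1 if the name does not fit. */
static int format_port_name(char *buf, size_t size, unsigned port_num)
{
    assert(buf != NULL && size > 0);
    /* The "\\.\" prefix works for all port numbers and is required
     * for COM10 and above. */
    int n = snprintf(buf, size, "\\\\.\\COM%u", port_num);
    /* snprintf returns the length it wanted to write; a value >= size
     * means the output was truncated. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The caller can then refuse to continue when the function returns -1 instead of passing a truncated name to CreateFile().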

Decoding a FLAC file with avcodec_decode_audio4 does not work well

I use FFmpeg to decode my FLAC file and write it to a PCM file, then play it in GoldenWave as signed 16-bit, little-endian, mono PCM, and the total play time is correct.
I suspect I am writing both channels to the same place, but I don't know how to get each individual channel and write it to the PCM file.
Any help? Thank you.
while (av_read_frame(fmt_ctx, &pkt) >= 0) {
AVPacket orig_pkt = pkt;
do {
ret = decode_packet(&got_frame, 0);
if (ret < 0)
break;
pkt.data += ret;
pkt.size -= ret;
} while (pkt.size > 0);
av_free_packet(&orig_pkt);
}
pkt.data = NULL;
pkt.size = 0;
do {
decode_packet(&got_frame, 1);
LOG("flush cached frames");
} while (got_frame);
static int decode_packet(int *got_frame, int cached)
{
int ret = 0;
int decoded = pkt.size;
*got_frame = 0;
if (pkt.stream_index == audio_stream_idx) {
ret = avcodec_decode_audio4(audio_dec_ctx, frame, got_frame, &pkt);
if (ret < 0) {
LOG("Error decoding audio frame (%s)\n", av_err2str(ret));
return ret;
}
decoded = FFMIN(ret, pkt.size);
if (*got_frame) {
size_t unpadded_linesize = frame->nb_samples * av_get_bytes_per_sample(audio_dec_ctx->sample_fmt);
//decode packet nb_samples:4608, xx:2, unpadded_linesize: 9216
LOG("decode packet nb_samples:%d, xx:%d, unpadded_linesize: %d",
frame->nb_samples, av_get_bytes_per_sample(audio_dec_ctx->sample_fmt), unpadded_linesize);
fwrite(frame->extended_data[0], 1, unpadded_linesize, audio_dst_file);
//int nb_sample = frame->nb_samples;
//fwrite(frame->extended_data[0], 1, nb_sample, audio_dst_file);
//fwrite(frame->extended_data[0] + nb_sample, 1, nb_sample, audio_dst_file);
}
}
if (*got_frame && api_mode == API_MODE_NEW_API_REF_COUNT)
av_frame_unref(frame);
return decoded;
}
You didn't describe the problem you're having, but from what you're writing, I see two problems:
1. You're not checking the raw audio format of the frame (see frame->format or audio_dec_ctx->sample_fmt). You're writing it as if it were AV_SAMPLE_FMT_S16, but you're not checking that it is.
2. Your unpadded_linesize is not multiplied by the number of channels (see e.g. frame->channels).
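If the decoder outputs a planar format (for example AV_SAMPLE_FMT_S16P), each channel sits in its own extended_data[ch] plane and has to be interleaved before it is written as packed PCM; for a packed format the data is already interleaved in extended_data[0] and only the byte count needs the channel factor. A sketch of the interleaving step in plain C, assuming 16-bit samples; interleave_s16 is a hypothetical helper, not an FFmpeg function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interleave `channels` planar buffers of
 * `nb_samples` 16-bit samples each into `out`, which must have room
 * for nb_samples * channels samples. */
static void interleave_s16(const int16_t *const *planes, size_t channels,
                           size_t nb_samples, int16_t *out)
{
    assert(planes != NULL && out != NULL);
    for (size_t i = 0; i < nb_samples; i++) {
        for (size_t c = 0; c < channels; c++) {
            /* Frame i holds one sample from each channel, in order. */
            out[i * channels + c] = planes[c][i];
        }
    }
}
```

The interleaved buffer then contains nb_samples * channels * 2 bytes, which is the amount to pass to fwrite() for a stereo 16-bit file.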

How to improve CMUSphinx's accuracy?

I want to use PocketSphinx to do some speech-to-text work. I have installed sphinxbase and PocketSphinx, and downloaded the acoustic model, language model and dictionary. Then I tested the example code as follows:
#include <pocketsphinx/pocketsphinx.h>
#include <stdio.h>
#include <stdlib.h>
#include "debug.h"
int main(int argc, char *argv[])
{
ps_decoder_t *ps;
cmd_ln_t *config;
FILE *fh;
int rv;
char const *hyp, *uttid;
int32 score;
config = cmd_ln_init(NULL, ps_args(), TRUE,
"-hmm", "/home/madper/speech/hub4opensrc.cd_continuous_8gau",
"-lm", "/home/madper/speech/language_model.arpaformat.DMP",
"-dict", "/home/madper/speech/cmudict/cmudict/sphinxdict/cmudict_SPHINX_40",
NULL);
if (config == NULL)
{
DBG (("cmd_ln_init() failed.\n"));
exit(1);
}
if ((ps = ps_init (config)) == NULL) /* init decoder */
{
DBG (("ps_init() failed.\n"));
exit(1 );
}
if ((fh = fopen("test.raw", "rb")) == NULL) /* open raw file */
{
DBG (("fopen() failed.\n"));
exit (1);
}
if ((rv = ps_decode_raw (ps, fh, "test", -1)) < 0 )
{
DBG (("ps_decode_raw() error!\n"));
exit (1);
}
if ((hyp = ps_get_hyp(ps, &score, &uttid)) == NULL)
{
DBG (("ps_get_hyp() failed!\n"));
exit (1);
}
printf ("Recognized: %s\n", hyp); /* this is what you say */
fclose(fh);
ps_free(ps);
return 0;
}
DBG is just a macro that prints an error message if DEBUG is defined.
Then I wrote some code to record from the microphone using ALSA, as follows:
#define ALSA_PCM_NEW_HW_PARAMS_API
#include <alsa/asoundlib.h>
int main() {
long loops;
int rc;
int size;
snd_pcm_t *handle;
snd_pcm_hw_params_t *params;
unsigned int val;
int dir;
snd_pcm_uframes_t frames;
char *buffer;
/* Open PCM device for recording (capture). */
rc = snd_pcm_open(&handle, "default",
SND_PCM_STREAM_CAPTURE, 0);
if (rc < 0) {
fprintf(stderr,
"unable to open pcm device: %s\n",
snd_strerror(rc));
exit(1);
}
/* Allocate a hardware parameters object. */
snd_pcm_hw_params_alloca(&params);
/* Fill it in with default values. */
snd_pcm_hw_params_any(handle, params);
/* Set the desired hardware parameters. */
/* Interleaved mode */
snd_pcm_hw_params_set_access(handle, params,
SND_PCM_ACCESS_RW_INTERLEAVED);
/* Signed 16-bit little-endian format */
snd_pcm_hw_params_set_format(handle, params,
SND_PCM_FORMAT_S16_LE);
/* One channel (mono) */
snd_pcm_hw_params_set_channels(handle, params, 1);
/* 16000 Hz sampling rate */
val = 16000;
snd_pcm_hw_params_set_rate_near(handle, params,
&val, &dir);
/* Set period size to 16 frames. */
frames = 16;
snd_pcm_hw_params_set_period_size_near(handle,
params, &frames, &dir);
/* Write the parameters to the driver */
rc = snd_pcm_hw_params(handle, params);
if (rc < 0) {
fprintf(stderr,
"unable to set hw parameters: %s\n",
snd_strerror(rc));
exit(1);
}
/* Use a buffer large enough to hold one period */
snd_pcm_hw_params_get_period_size(params,
&frames, &dir);
size = frames * 2; /* 2 bytes/sample, 1 channel */
buffer = (char *) malloc(size);
/* We want to loop for 2 seconds */
snd_pcm_hw_params_get_period_time(params,
&val, &dir);
loops = 2000000 / val;
while (loops > 0) {
loops--;
rc = snd_pcm_readi(handle, buffer, frames);
if (rc == -EPIPE) {
/* EPIPE means overrun */
fprintf(stderr, "overrun occurred\n");
snd_pcm_prepare(handle);
} else if (rc < 0) {
fprintf(stderr,
"error from read: %s\n",
snd_strerror(rc));
} else if (rc != (int)frames) {
fprintf(stderr, "short read, read %d frames\n", rc);
}
rc = write(1, buffer, size);
if (rc != size)
fprintf(stderr,
"short write: wrote %d bytes\n", rc);
}
snd_pcm_drain(handle);
snd_pcm_close(handle);
free(buffer);
return 0;
}
So I record a raw file, then do speech-to-text on that file. But the accuracy is very poor: saying "hello" or "go home" gives me "hotel" or "MHM MHM" and so on. What's wrong with this code?
I have read the FAQ. Should I use acoustic model adaptation to improve accuracy?
PS. I changed stereo to mono, and the sound is strange; I can't understand what I said. What's wrong with it? This is the raw file: test.raw
If you look at the first Q and A in http://cmusphinx.sourceforge.net/wiki/faq you will notice that the library assumes mono data.
You record in stereo.
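If a recording does end up as interleaved stereo, it can be folded down to the mono stream the decoder expects by averaging the two channels. A minimal sketch, assuming 16-bit interleaved samples; stereo_to_mono_s16 is a hypothetical helper, not part of ALSA or PocketSphinx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average left and right samples of `frames`
 * interleaved stereo frames into `out` (one sample per frame). */
static void stereo_to_mono_s16(const int16_t *in, size_t frames, int16_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < frames; i++) {
        /* Sum in 32 bits so that two full-scale samples cannot overflow. */
        int32_t sum = (int32_t)in[2 * i] + (int32_t)in[2 * i + 1];
        out[i] = (int16_t)(sum / 2);
    }
}
```

Opening the capture device with one channel, as the later version of the recording code does, avoids the need for this step altogether.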
