Store audio signal into array using matlab - arrays

I recorded an audio signal (.wav) and I need to convert this signal into a matrix or array using MATLAB, so I can add it to another one.
[x,fs] = wavread('C:\Users\Amira\Desktop\test222.wav');
% fs = 44100
% length(x) = 339968
How can I sample this signal and convert it into a matrix of size (N,1), where N = 40?

If you only want the first 40 samples of your audio signal, you can simply index into x:
[x,fs] = wavread('C:\Users\Amira\Desktop\test222.wav');
first40 = x(1:40);
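If instead the goal is to reduce the whole recording to N = 40 values (rather than taking the first 40 samples), one simple option is block averaging: split the signal into 40 nearly equal blocks and average each one. This is a hypothetical sketch of the idea in plain Python; in MATLAB the same thing can be done with reshape and mean when length(x) is an exact multiple of N. (Note this is crude decimation; a proper resample would low-pass filter first.)

```python
def block_average(x, n_out):
    """Split x into n_out nearly equal blocks and average each one."""
    n = len(x)
    out = []
    for k in range(n_out):
        # block boundaries chosen so the blocks cover x exactly,
        # even when n is not a multiple of n_out (339968/40 is not whole)
        lo = k * n // n_out
        hi = (k + 1) * n // n_out
        block = x[lo:hi]
        out.append(sum(block) / len(block))
    return out

# small example: a ramp of 8 samples reduced to 4 values
print(block_average(list(range(8)), 4))  # -> [0.5, 2.5, 4.5, 6.5]
```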

How to use KissFFT with audio?

I have an array of 2048 samples of an audio file at 44.1 kHz and want to transform it into a spectrum for an LED effect. I don't know too much about the inner workings of the FFT, but I tried it using KISS FFT:
kiss_fft_cpx *cpx_in  = malloc(FRAMES * sizeof(kiss_fft_cpx));
kiss_fft_cpx *cpx_out = malloc(FRAMES * sizeof(kiss_fft_cpx));
kiss_fft_cfg cfg = kiss_fft_alloc(FRAMES, 0, 0, 0);

for (int j = 0; j < FRAMES; j++) {
    float x = (alsa_buffer[(fft_last_index + j + BUFFER_OVERSIZE * FRAMES)
                           % (BUFFER_OVERSIZE * FRAMES)] - offset);
    cpx_in[j] = (kiss_fft_cpx){.r = x, .i = x};
}
kiss_fft(cfg, cpx_in, cpx_out);
My output seems really off. When I play a simple sine, there are multiple outputs with values well above zero, and in general the first entries seem much higher. Do I have to weight the outputs?
I also don't understand how to treat the complex numbers: I'm currently putting my input values into both the real and imaginary parts, and for the output I take the absolute value. Is that right?
Also, spectrum analyzers for audio usually have logarithmic frequency scaling, so I tried that, but the problem is that the FFT output, as far as I know, isn't logarithmic. The first band might cover, say, 0-100 Hz, while ideally my first LED should only go up to about 60 Hz (a fraction of that first band); the last LED might cover 8 kHz to 10 kHz, which would be about 20 FFT outputs.
Is there any way to make the output logarithmic? How do I limit the spectrum to 20 kHz (or know what the bands of the output are in general), and is there anything else to look out for when working with audio signals?
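The band arithmetic in the question can be made concrete: with an N-point FFT at sample rate fs, output bin k is centered at k*fs/N, so the bins are linearly spaced and a logarithmic display has to group them. Below is a small sketch in plain Python of mapping linear bins to log-spaced LED bands; the constants (N_LEDS, F_LO, F_HI) are illustrative assumptions, not anything from the question.

```python
import math

FS = 44100                   # sample rate (Hz)
N = 2048                     # FFT size
N_LEDS = 20                  # number of log-spaced display bands
F_LO, F_HI = 60.0, 10000.0   # display range (Hz), assumed for illustration

def bin_freq(k):
    """Center frequency of FFT bin k."""
    return k * FS / N

def led_for_freq(f):
    """Map a frequency to an LED index on a log scale (None if outside range)."""
    if f < F_LO or f > F_HI:
        return None
    pos = math.log(f / F_LO) / math.log(F_HI / F_LO)  # 0..1 on a log axis
    return min(int(pos * N_LEDS), N_LEDS - 1)

# group the usable bins (below Nyquist) into bands; low LEDs get few
# bins, high LEDs get many -- exactly the asymmetry the question describes
bands = [[] for _ in range(N_LEDS)]
for k in range(1, N // 2):
    led = led_for_freq(bin_freq(k))
    if led is not None:
        bands[led].append(k)
```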

Vectorise circshift array

I have a for loop that shifts a signal over a certain amount and appends it to an array. How can I vectorize the circshift section so I don't need to use the for loop?
fs_rate = 10;
len_of_sig = 1;  % length of signal in seconds
t = linspace(0, len_of_sig, fs_rate*len_of_sig);
y = .5*sin(2*pi*1*t);
for aa = 1:length(y)
  y_new(aa,:) = circshift(y, [1, aa+3]);  % shift signal and append to array
end
plot(t, y_new)
PS: I'm using Octave 4.2.2 on Ubuntu 18.04, 64-bit.
You can use the gallery function to create a circulant matrix, after using circshift for your base shift:
base_shift = 4;
fs_rate = 10;
len_of_sig = 1; # length of signal in seconds
t = linspace (0, len_of_sig, fs_rate*len_of_sig);
y = .5 * sin (2*pi*1*t);
y = gallery ("circul", circshift (y, [1 base_shift]));
Or, if you want to know how it is implemented, take a look at its source code with type gallery.
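For reference, the circulant matrix that gallery ("circul", ...) builds is easy to sketch directly: the first row is the input vector, and each following row is the previous one rotated right by one element. A minimal illustration in plain Python (not Octave code):

```python
def circulant(v):
    """Circulant matrix: first row is v, each following row is the
    previous one rotated right by one element."""
    n = len(v)
    # entry (i, j) is v[(j - i) mod n]
    return [[v[(j - i) % n] for j in range(n)] for i in range(n)]

C = circulant([1, 2, 3])
# C[0] == [1, 2, 3]
# C[1] == [3, 1, 2]
# C[2] == [2, 3, 1]
```

Applying circshift to v first, as in the answer, just changes which rotation appears in the first row; the matrix structure is the same.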

What is the best way to gather non-uniform chunks of a 1D array of a chosen MPI datatype?

Pi denotes an MPI process with rank i.
X and Y are 1D vectors that hold a chosen MPI_Datatype.
[aij] represents a sub-vector which may be of size 0.
For the vector X the index j in the sub-vector [aij] represents to which MPI process [aij] needs to be sent.
For the vector Y the index i in the sub-vector [aij] represents from which MPI process [aij] is to be received.
What is the best strategy to build Y from X?
A schematic diagram is shown below.
P1 X = {[a11],[a12],[a13],...,[a1n]}; Y = {[a11],[a21],[a31],...,[an1]};
P2 X = {[a21],[a22],[a23],...,[a2n]}; Y = {[a12],[a22],[a32],...,[an2]};
P3 X = {[a31],[a32],[a33],...,[a3n]}; Y = {[a13],[a23],[a33],...,[an3]};
...
Pn X = {[an1],[an2],[an3],...,[ann]}; Y = {[a1n],[a2n],[a3n],...,[ann]};
The index numbering is chosen to start from one for convenience.
EDIT (following comments from eduffy and francis):
Currently I am using point-to-point communication for this operation when the virtual communication topology is sparse, i.e. when size([aij]) = 0 for the majority of the index pairs i,j.
1. Determine the number of receives NRECV each MPI process will get. This can be found for each process with an MPI_Reduce of the boolean FLAG = (i != j) && (size([aij]) != 0).
2. Each process i executes non-blocking sends of the chunks [aij] to process j if FLAG is true.
3. Each process i executes a non-blocking probe and receive of the non-zero-size chunks [aji] from process j. These dynamic receives can be written as a loop of NRECV iterations.
However, implementing this strategy in an MPI-standard-compliant way is getting difficult. For instance, a buffer can only be used in one communication at a time (even in the apparently harmless case of multiple reads; source: a Jeff Squyres article).
UPDATE:
I find MPI_Alltoallv() suitable for the transposition of the sub-vectors [aij]. However, several articles (found on the internet) report scalability issues when the communication topology is sparse. The MPI standard (version 3 and later) includes the collective MPI_Neighbor_alltoallv(), which operates on a communicator with a topology structure.
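The bookkeeping MPI_Alltoallv needs — per-destination send counts and displacements into the packed buffer X — can be computed locally on each rank from the sub-vector sizes alone. A minimal sketch of that bookkeeping in plain Python (no actual MPI calls; the function name is illustrative):

```python
def alltoallv_args(chunk_sizes):
    """Given the sizes of the sub-vectors [a_i1 .. a_in] this rank holds,
    compute the sendcounts and senddispls arrays for MPI_Alltoallv.
    Displacement j is the offset, inside the packed 1-D buffer X, of the
    chunk destined for rank j; zero-size chunks simply get count 0."""
    sendcounts = list(chunk_sizes)
    senddispls = []
    offset = 0
    for c in sendcounts:
        senddispls.append(offset)
        offset += c
    return sendcounts, senddispls

counts, displs = alltoallv_args([3, 0, 5, 2])
# counts == [3, 0, 5, 2], displs == [0, 3, 3, 8]
```

The receive-side arrays are built the same way from the sizes of the [aji] chunks, which each rank can learn with a prior MPI_Alltoall of the size table.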

zero padding a signal in MATLAB

I have an audio signal of length 12769. I'm trying to perform an STFT on it by breaking it into small windows of 1024 samples. This gives me 12 complete windows with 481 points remaining. Since I need 543 (1024 - 481) more points to make up 1024 samples, I used the following code to zero-pad:
f = [a zeros(1,542)];
where a is the audio file.
However i get an error saying
??? Error using ==> horzcat
CAT arguments dimensions are not consistent.
How can I overcome this?
Your vector a is a column vector, so it cannot be concatenated with the row vector zeros(1,542). Use a column vector instead, e.g. f = [a; zeros(543,1)]; (note that, per your own count, you need 543 zeros, not 542, since 12769 + 543 = 13*1024).
However, it is much easier to just use
f = a;
f(1024*ceil(end/1024)) = 0;
MATLAB will zero-pad the vector up to element 1024*ceil(length(a)/1024) (13312 here), and this works whether the array is a column or a row.
You can either remove the excess 481 samples using
Total_Samples = length(a);
for i = 1 : Total_Samples-481
    a_new(i) = a(i);
end
or you could add an additional 543 zero samples by using
Total_Samples = length(a);
for i = Total_Samples+1 : Total_Samples+543
    a(i) = 0;
end
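The padding arithmetic generalizes: append (-length) mod 1024 zeros and the signal becomes an exact multiple of the window size, after which it splits cleanly into whole frames. A hypothetical sketch in plain Python (not the asker's MATLAB):

```python
WIN = 1024

def zero_pad(a, win=WIN):
    """Append just enough zeros so len(a) is a multiple of win."""
    pad = (-len(a)) % win   # 0 if already a multiple
    return a + [0] * pad

a = [0.1] * 12769
f = zero_pad(a)
# len(f) == 13312 == 13 * 1024, i.e. 543 zeros were appended

# the padded signal now splits into whole STFT frames
frames = [f[i:i + WIN] for i in range(0, len(f), WIN)]
# len(frames) == 13, each of length 1024
```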

KissFFT output of kiss_fftr

I'm receiving PCM data through a socket connection in packets containing 320 samples. The sample rate of the sound is 8000 samples per second. I am doing something like this with it:
int size = 160 * 2;
int isinverse = 1;
kiss_fft_scalar zero;
memset(&zero, 0, sizeof(zero));

kiss_fft_cpx fft_in[size];
kiss_fft_cpx fft_out[size];
kiss_fft_cpx fft_reconstructed[size];

kiss_fftr_cfg fft  = kiss_fftr_alloc(size*2, 0, 0, 0);
kiss_fftr_cfg ifft = kiss_fftr_alloc(size*2, isinverse, 0, 0);

for (int i = 0; i < size; i++) {
    fft_in[i].r = zero;
    fft_in[i].i = zero;
    fft_out[i].r = zero;
    fft_out[i].i = zero;
    fft_reconstructed[i].r = zero;
    fft_reconstructed[i].i = zero;
}

// got my data through the socket connection
for (int i = 0; i < size; i++) {
    // samples are of type short
    fft_in[i].r = samples[i];
    fft_in[i].i = zero;
    fft_out[i].r = zero;
    fft_out[i].i = zero;
}

kiss_fftr(fft, (kiss_fft_scalar*) fft_in, fft_out);
kiss_fftri(ifft, fft_out, (kiss_fft_scalar*) fft_reconstructed);

// let's normalize the samples
for (int i = 0; i < size; i++) {
    short* samples = (short*) bufTmp1;
    samples[i] = rint(fft_reconstructed[i].r / (size*2));
}
After that I fill OpenAL buffers and play them. Everything works just fine, but I would like to do some filtering of the audio between kiss_fftr and kiss_fftri. As I understand it, the starting point is to convert the sound from the time domain to the frequency domain, but I don't really understand what kind of data I'm receiving from the kiss_fftr function: what information is stored in each of those complex numbers, and what do the real and imaginary parts tell me about frequency? I also don't know which frequencies are covered in fft_out (what frequency span), i.e. which indices correspond to which frequencies.
I am a total newbie in signal processing and Fourier transform topics.
Any help?
Before you jump in with both feet into a C implementation, get familiar with digital filters, especially FIR filters.
You can design the FIR filter using something like GNU Octave's signal toolbox. Look at the commands fir1 (the simplest), firls, or remez. Alternatively, you might be able to design a FIR filter through a web page; a quick web search for "online fir filter design" found this (I have not used it, but it appears to use the equiripple design from the remez or firpm command).
Try implementing your filter first with a direct convolution (without FFTs) and see if the speed is acceptable -- that is an easier path. If you need an FFT-based approach, there is a sample implementation of overlap-save in the kissfft/tools/kiss_fastfir.c file.
I will try to answer your questions directly.
// a) The real and imaginary components of each output are combined to
//    get the amplitude at that frequency:
float ar, ai, scaling;
scaling = 1.0f / (float)size;
// then for each output [i] from the FFT...
ar = fft_out[i].r;
ai = fft_out[i].i;
amplitude[i] = 2.0f * sqrtf(ar*ar + ai*ai) * scaling;

// b) Which index refers to which frequency? Bin i is centered at
//    i * sample_rate / FFT_length. Only the first half of the FFT
//    results is needed (assuming your 8 kHz sampling rate):
for (i = 1; i < size/2; i++)
    freq[i] = (float)i * 8000.0f / (float)size;

// c) The phase (range +/- pi) for each frequency is calculated like this:
phase[i] = atan2f(fft_out[i].i, fft_out[i].r);
What you might want to investigate is FFT fast convolution using the overlap-add or overlap-save algorithms. You will need to expand the length of each FFT by the length of the impulse response of your desired filter. This is because (1) FFT/IFFT convolution is circular, and (2) each index in the FFT result corresponds to almost all frequencies (a sinc-shaped response), not just one (even if mostly near one), so any single-bin modification will leak throughout the entire frequency response (except at certain exact periodic frequencies).
