Lua - how to loop fewer times than there are entries in a table?

Lua newbie here, having trouble with all the different ways of looping.
I adjusted a template script that maps audio files from my filesystem into an audio sampler, where each audio file goes into one group --> zone --> file.
This works as intended, but for a "Lite" version package of the software I only need some of the samples from my computer, and this is where I'm having problems with my loops.
This is the function causing issues:
for index, file in next, samples do
...
The table "samples" consists of 180 samples but I'm only creating 38 groups and zones in the audio software for putting audio files into, so obviously this returns an error.
ERROR:
" ...- PATH.LINE: bad argument #2 to '--index' (invalid index, size is 58 got 58)"
The 58 makes sense because it is 38 sample groups + 18 empty groups + 2 template groups that get duplicated.
Since 58 is still smaller than 180, the loop wants to keep going, and I'm not sure how to tell it to stop there.
CODE:
local samples = {}
local root = 0
-- [[ FILE SYSTEM ]]
local i = 1
for _, p in filesystem.directoryRecursive(folderPath) do
    if filesystem.isRegularFile(p) then
        if filesystem.extension(p) == '.wav' or filesystem.extension(p) == '.aif' or filesystem.extension(p) == '.aiff' then
            samples[i] = p
            i = i + 1
        end
    end
end
After this I create the number of groups I need in the audio software (38).
A group contains a zone, which contains a sample.
-- [[ ZONES & FILES ]]
-- Create zones and place one file in each of the created groups.
-- file is a string property of the Zone object
-- samples is a table populated with the paths to our samples
for index, file in next, samples do
    -- Initialize the zone variable.
    local z = Zone()
    -- Add a zone for each sample.
    instrument.groups[index + 1].zones:add(z)
    -- Populate the attached zone with a sample from our table.
    z.file = file
    -- Detect and set the root note.
    local detectedPitch = mir.detectPitch(index)
    z.rootKey = math.floor(detectedPitch + 0.5)
end
So my question is: how do I loop through the table samples but only do it up to 38 and not to 180?
I could do
for index = 1, #NUM_FACT_LAYERS do
but what do I do about file then?
samples only holds the paths to the files, and z.file needs that path string, so z.file = file won't work in this case.
I guess that's why the previous script used the generic for ... in form.
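A numeric for can still give you file: just index into samples yourself. A minimal sketch, reusing the names from the question and writing the cap as the literal 38 (if NUM_FACT_LAYERS is a plain number rather than a table, drop the #):
for index = 1, math.min(38, #samples) do
    -- The generic for bound `file` for us; here we look it up by index instead.
    local file = samples[index]
    local z = Zone()
    instrument.groups[index + 1].zones:add(z)
    z.file = file
end
math.min keeps the loop safe even if samples ever holds fewer than 38 entries.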

Related

sum of variables in SPSS-statistics25 for multiple external files

I have 50 external EXCEL files. For each of these files, let's say #I, I import data as follows in the SYNTAX of SPSS-statistics25:
GET DATA /TYPE=XLSX
/FILE='file#I.xlsx'
/SHEET=name 'Sheet2'
/CELLRANGE=full
/READNAMES=on
/ASSUMEDSTRWIDTH=32767.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.
Then, I rank the variables included in file #I (WA CI) and I select one single case, at most, as follows:
RANK VARIABLES= WA CI (D)
/RANK
/PRINT=YES
/TIES=LOW.
COUNT SVAR= RWA RCI (1).
SELECT IF( SVAR=2).
EXECUTE.
The task is the following:
I should print the sum of values of RWA looping on each EXCEL file #I. RWA can have value 1 or can be empty. If there are not selected cases (RWA is empty), the contribution to the sum of values should be 0. The final outcome should be the number of times RWA and RCI have the same TOP rank out of 50 Excel files.
How can I do this in a smart way?
Since I can't see the real data files, the following is a little in the dark, but I think it should be a viable strategy (you might as well try :)):
* first define a macro to stack all the files together.
define stackFiles ()
GET DATA /TYPE=XLSX /FILE='file1.xlsx'
/SHEET=name 'Sheet2' /CELLRANGE=full /READNAMES=on /ASSUMEDSTRWIDTH=32767 /keep WA CI.
compute source=1.
exe.
dataset name gen.
!do !i=2 !to 50
GET DATA /TYPE=XLSX /FILE=!concat("'file", !i, ".xlsx'")
/SHEET=name 'Sheet2' /CELLRANGE=full /READNAMES=on /ASSUMEDSTRWIDTH=32767 /keep WA CI.
compute source=!i.
exe.
add files /file=gen /file=*.
exe.
!doend.
!enddefine.
* now run the macro.
stackFiles .
* now for the rest of the analysis.
* first split the data by source file, then rank and select.
sort cases by source.
split file by source.
RANK VARIABLES= WA CI (D) /RANK /PRINT=YES /TIES=LOW.
COUNT SVAR= RWA RCI (1).
SELECT IF SVAR=2.
EXECUTE.
At this point you have up to 50 rows remaining - 0 or 1 from each original file. You can count or sum using DESCRIPTIVES on RWA.
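A minimal sketch of that last step (standard SPSS syntax; the split is switched off first so the sum runs across all files, and since RWA is 1 on every selected row and missing otherwise, its SUM is exactly the count you want):
split file off.
DESCRIPTIVES VARIABLES=RWA
  /STATISTICS=SUM.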

How to read the same column from multiple files and collect it in an array

I have 9 csv files, each containing the same number of columns (61) as well as the same column headers. The files are basically follow-ups of each other. Each column belongs to a signal reading which has been recorded for a long period of time and hence divided into multiple files. I need to graph the collected data for every single column. To do that I thought I would read one column at a time from all files, store the data into an array, and graph it against time.
Since the data load is too much (the system takes a reading every 5 seconds for a month), I want to read the data for every 30 minutes, which equals reading 1 row per 360 rows.
I've tried plotting everything without skipping rows and it takes forever because of the data load.
import glob
import pandas as pd

file_list = glob.glob('*.csv')
cols = [0, 1]  # add more columns here
df = pd.DataFrame()
for f in file_list:
    df = df.append(
        pd.read_csv(f, delimiter='\s+', header=None, usecols=cols),
        ignore_index=True,
    )
arr = df.values
This is what I tried to read only specific columns from multiple files but I receive this message: "Usecols do not match columns, columns expected but not found: [1]"
The command below will do a parallel read followed by a concatenation, assuming file_list contains a list of files that can be read with the read_file function below:
import multiprocessing as mp
import pandas as pd

def read_file(file):
    return pd.read_csv(file)

pool = mp.Pool(mp.cpu_count())  # one worker per CPU; you can try other values
df = pd.concat(pool.map(read_file, file_list))
pool.terminate()
pool.join()
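For the row-skipping part of the question, a minimal sketch under a few assumptions (comma-separated files with a header row, a hypothetical column name Signal01, one row kept per 360): read_csv accepts a callable for skiprows, so each file can be thinned while it is read rather than after.
import glob
import pandas as pd

STEP = 360  # one 30-minute sample at one reading per 5 seconds

frames = [
    pd.read_csv(
        f,
        usecols=['Signal01'],  # hypothetical header name; pick the column to plot
        skiprows=lambda i: i > 0 and i % STEP != 0,  # keep the header and every STEP-th row
    )
    for f in sorted(glob.glob('*.csv'))
]
arr = pd.concat(frames, ignore_index=True)['Signal01'].to_numpy()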

How to create runtime variable for reading csv file header using Pandas

I have a csv file. It logs some data depending upon the test condition.
The header of this csv file looks like this:
UTC Time(s)  SVID-1  Constel-1  Status-1  Zij-1  SVID-2  Constel-2  Status-2  Zij-2  SVID-3  Constel-3  Status-3  Zij-3  ...
10102        1       G          P         0      2       G          P         0.3    3       G          A         --
...
Apart from the UTC Time column, the other columns may increase or decrease depending upon the test condition or the number of satellites I use.
If an extra satellite is added or removed, the corresponding SVID, Constel, Status, and Zij columns will or will not be there.
I would like to know whether it is possible to create a runtime variable for each column without looking into the csv file header beforehand.
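One possible approach in pandas, sketched under the assumption that the file is comma-separated and named log.csv (a placeholder name): let read_csv discover the header at run time, then hold the columns in a dict instead of fixed variable names.
import pandas as pd

# Let pandas read whatever header this particular test run produced.
df = pd.read_csv('log.csv')  # 'log.csv' is a placeholder

# One "runtime variable" per column, keyed by the header text.
columns = {name: df[name] for name in df.columns}

# e.g. collect every SVID-* column, however many satellites were logged:
svid_cols = [name for name in df.columns if name.startswith('SVID-')]
print(svid_cols)
print(columns['UTC Time(s)'].head())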

SSIS Export all data from one table into multiple files

I have a table called customers which contains around 1,000,000 records. I need to transfer all the records to 8 different flat files whose filenames increment a number, e.g. cust01, cust02, cust03, cust04, etc.
I've been told this can be done using a for loop in SSIS. Please can someone give me a guide to help me accomplish this.
The logic behind this should be something like "count number of rows", "divide by 8", "export that amount of rows to each of the 8 files".
To me, it will be more complex to create a package that loops through and calculates the amount of data and then queries the top N segments or whatever.
Instead, I'd just create a package with 9 total connection managers. One to your Data Database (Source) and then 8 identical Flat File Connection managers but using the patterns of FileName1, Filename2 etc. After defining the first FFCM, just copy, paste and edit the actual file name.
Drag a Data Flow Task onto your Control Flow and wire it up as an OLE/ADO/ODBC source. Use a query, don't select the table as you'll need something to partition the data on. I'm assuming your underlying RDBMS supports the concept of a ROW_NUMBER() function. Your source query will be
SELECT
MT.*
, (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) % 8 AS bucket
FROM
MyTable AS MT;
That query will pull back all of your data plus assign a monotonically increasing number from 1 to ROWCOUNT, to which we then apply the modulo (remainder after dividing) operator. Modding the generated value by 8 guarantees that we will only get values from 0 to 7, endpoints inclusive.
You might start to get twitchy about the different number bases (base 0, base 1) being used here; I know I am.
Connect your source to a Conditional Split. Use the bucket column to segment your data into different streams. I would propose that you map bucket 1 to File 1, bucket 2 to File 2... finally with bucket 0 to file 8. That way, instead of everything being a stair step off, I only have to deal with end point alignment.
Connect each stream to a Flat File Destination and boom goes the dynamite.
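For concreteness, the split conditions under that mapping could look like this (SSIS expression syntax; the output and file names are illustrative):
Case 1 -> cust01 : bucket == 1
Case 2 -> cust02 : bucket == 2
...
Case 7 -> cust07 : bucket == 7
Default output -> cust08 (the remaining rows, where bucket == 0)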
You could create a rownumber with a Script Component (don't worry, it's very easy): http://microsoft-ssis.blogspot.com/2010/01/create-row-id.html
or you could use a rownumber component like http://microsoft-ssis.blogspot.com/2012/03/custom-ssis-component-rownumber.html or http://www.sqlis.com/post/Row-Number-Transformation.aspx
For dividing it into 8 files you could use the Balanced Data Distributor or the Conditional Split with a modulo expression (using your new rownumber column).

Stata Multiple Imputation output file

I'm new to Stata. I need to implement multiple imputation in Stata, but I have a problem when using it. I do everything as instructed, with the following code:
use http://www.stata-press.com/data/r11/mheart1s20
mi describe
mi impute regress bmi attack smokes age female hsgrad, add(20)
Then I get everything as in the instructions. However, I want to find the output file (the completed data).
There is no separate output data file; the 'completed' data is in memory. If you do mi describe again, you'll see that the dataset in memory now contains M = 40 imputations, whereas the output from the previous mi describe showed it contained M = 20 imputations. So you have added 20 imputations to the dataset, as specified by the add(20) option to your mi impute command.
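If a file on disk is what you are after, a minimal sketch (the filenames are placeholders; mi extract is standard Stata):
* Save the whole mi dataset (original data plus all imputations) to disk.
save mheart1s20_completed, replace

* Or pull out one completed dataset, e.g. imputation 1, and save that.
mi extract 1, clear
save mheart1s20_m1, replace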
