Run TreSpEx analysis - loops

I am trying to run a TreSpEx analysis on a series of trees, which are saved in Newick format as .fasta.txt files in a folder.
I have a list of taxon names saved in a .txt file.
I enter:
perl TreSpEx.v1.pl -fun e -ipt *fasta.txt -tf Taxa_List.txt
But it won't run. I tried writing a loop over each file in the folder, but I am not very good with loops, and my line
for i in treefile/; do perl TreSpEx.v1.1.pl -fun e -ipt *.fasta.txt -tf Taxa_List.txt; done
won't work because -ipt apparently needs a name that starts with a letter or number.

Your second example actually does the same thing as the first (but possibly several times).
I'm not familiar with TreSpEx, and I don't know Bash very well either (which you seem to be using), but you might try something like the following.
for i in treefile/*.fasta.txt ; do
perl TreSpEx.v1.1.pl -fun e -ipt "$i" -tf Taxa_List.txt
done
Basically, you need to use the loop variable (i) to pass the name of each file to the command.
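If you would rather drive the same loop from Python, the idea can be sketched as below. This is only a sketch under assumptions: the function names build_commands and run_all are made up for illustration, and the TreSpEx flags are copied from the question rather than checked against the TreSpEx documentation.

```python
import subprocess
from pathlib import Path


def build_commands(tree_dir, taxa_file="Taxa_List.txt", script="TreSpEx.v1.1.pl"):
    """Return one argument list per *.fasta.txt file found in tree_dir."""
    commands = []
    for tree in sorted(Path(tree_dir).glob("*.fasta.txt")):
        # Each command receives exactly one file name after -ipt.
        commands.append(
            ["perl", script, "-fun", "e", "-ipt", str(tree), "-tf", taxa_file]
        )
    return commands


def run_all(tree_dir):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in build_commands(tree_dir):
        subprocess.run(cmd, check=True)
```

Calling run_all("treefile") would then mirror the shell loop: one perl invocation per tree file.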

Related

error in looping over files, -fs- command

I'm trying to split some datasets in two parts, running a loop over files like this:
cd C:\Users\Macrina\Documents\exports
qui fs *
foreach f in `r(files)' {
use `r(files)'
keep id adv*
save adv_spa*.dta
clear
use `r(files)'
drop adv*
save fin_spa*.dta
}
I don't know whether what is inside the loop is correctly written but the point is that I get the error:
invalid '"e2.dta'
where e2.dta is the second file in the folder. Does this message refer to the loop or maybe what is inside the loop? Where is the mistake?
You want lines like
use "`f'"
not
use `r(files)'
given that fs (installed from SSC, as you should explain) returns r(files) as a list of all the files whereas you want to use each one in turn (not all at once).
The error message was informative: use is puzzled by the second filename it sees (as only one filename makes sense). The other filenames are ignored: use fails as soon as something is evidently wrong.
Incidentally, note that putting "" around filenames remains essential if any filename includes spaces.

I'm trying to use a list comprehension with numpy arrays, but it generates a single generator expression within a numpy array. Why?

So I'm writing some code that's going to be deployed on a Raspberry Pi. Because of the computational limitations of the Raspberry Pi, and the series of steps it takes for this particular use (image capturing and processing), I thought it would be best to use list comprehensions in place of for loops whenever possible. I wrote a statement that should build an array of files (names + paths), if it worked:
self.dark_file_names=np.array([(os.path.join(self.dark_frames_path,
line.strip("\n")) for root, dirs, files in os.walk(self.dark_frames_path) for line in files if line.endswith(".npy\n"))])
However, I tried a variation of this in IPython:
dark_file_names=np.array([(os.path.join(dark_frames_path, line.strip("\n")) for root, dirs, files in os.walk(dark_frames_path) for line in files if line.endswith(".py\n"))])
with this being the output:
array([<generator object <genexpr> at 0x7f14d4412888>], dtype=object)
with dark_frames_path being a local directory with a ton of Python files.
Unfortunately this isn't what I was hoping for. I also tried with a normal list, with a similar result. Why is it interpreting my statement as a generator expression instead of a list comprehension?
Also, I had this working when I was doing it over SSH with these commands in IPython:
stdin, stdout, stderr= ssh.exec_command('ls')
l=[line.strip('\n') for line in stdout if line.strip('\n').endswith(".py")]
with ssh being a paramiko.SSHClient() instance.
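The difference between the two forms can be reproduced without numpy or a directory tree. The following is a minimal sketch using a made-up list of names: an extra pair of parentheses inside the square brackets produces a one-element list that holds a generator, whereas the version without them is an ordinary list comprehension.

```python
names = ["a.py", "b.txt", "c.py"]  # hypothetical file names for illustration

# Parentheses inside the brackets: a list containing ONE generator object.
wrapped = [(n for n in names if n.endswith(".py"))]

# No inner parentheses: an ordinary list comprehension.
flat = [n for n in names if n.endswith(".py")]

print(len(wrapped), type(wrapped[0]).__name__)  # 1 generator
print(flat)  # ['a.py', 'c.py']
```

Note also that the names returned by os.walk carry no trailing newline, so a test such as endswith(".npy\n") would never match; the "\n" handling was only needed for the lines read from stdout in the SSH version.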

How do I run a C program several times and record the outputs?

So basically I have a C program which does a lot of computation based on an input .txt file and outputs a value. I want to run it 100 times and then work out the average; obviously this would be tedious to do by hand.
So I've tried to research a bit about scripting etc. and I've found things like this:
https://answers.yahoo.com/question/?qid=20091206100348AAaJPP8
Am I supposed to just do this in my command prompt? (I'm on Windows btw)
Thanks for any help :)
You're on Windows, so you can use a DOS batch script (.bat) to run your program N times using a loop (or N separate commands if that's easier for you). Use the >> operator at the end of the command to append the output to a file. See http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/redirection.mspx?mfr=true for more info on this, and search Google for "dos bat file" for help getting started with writing batch scripts.
Try this:
Have the program append its output to a text or CSV file, then write a second program that runs the first one a defined number of times. Use the function system(): it accepts a string as its argument and passes it to the command interpreter to be executed.
Hope that helps.
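The same "run it N times and average" idea can also be sketched in Python instead of a batch file. This assumes the program prints a single number on standard output; the helper name average_output is invented for this example, and a Python one-liner stands in for the real C executable.

```python
import subprocess
import sys


def average_output(cmd, runs=100):
    """Run cmd `runs` times, parse each stdout as a float, and return the mean."""
    values = []
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        values.append(float(result.stdout.strip()))
    return sum(values) / len(values)


# Stand-in for the real program: a one-liner that always prints 2.5.
demo_cmd = [sys.executable, "-c", "print(2.5)"]
print(average_output(demo_cmd, runs=3))  # 2.5
```

For the real case, demo_cmd would be replaced with the path to the compiled program and its input file, for example a list of the form [program_path, input_path].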

File pointers in an array

Very raw with C. I'm writing a program that takes files as its arguments, but this is rather annoying for debugging (GDB). Rather than re-type the file list each time I start in GDB, I'd rather store the file names in an array and modify the program to read this array rather than the argv[] values.
I started out with
FILE*[5] inpFiles;
inpFiles[0] = &file1.txt;
but this is all wrong. I need to get some sort of reference to each of the input files so that I can get its memory address.
How can I do this? Thanks.
You can define a GDB command in .gdbinit so you don't need to modify your production code.
For example, add the following lines in your ~/.gdbinit or .gdbinit in your working directory.
define do
run file1.txt file2.txt file3.txt file4.txt file5.txt
end
Then, in GDB, just type the command do, and GDB runs run file1.txt file2.txt file3.txt file4.txt file5.txt for you.
You can list your files in an input file and have the program read that list from standard input (file descriptor 0),
so that you could do this:
./your_program < your_input_file
and in gdb,
run/r < your_input_file
but if you want to keep your args method, you can also do this:
./your_program `cat your_input_file`
Hope this helps.
An array of FILE * would be written:
FILE *inpFiles[5];
This can store the values returned from fopen() or similar functions; it does not store file names.
You might store the file pointer into the structure that &file1 represents, or you might create a new structure that stores the name and the opened file pointer (though you may need to specify a mode; presumably "r" or "rb" by default).
So, clarify to yourself what exactly you want to do. You can create an array of file pointers, or an array of structures containing, amongst other things, a file pointer. But you have to decide how you're going to use it and what the semantics are going to be.
This presumes that modifying the program is a better idea than using GDB better. If you can learn to use the facilities of GDB more powerfully, then that's a better idea.
For example, can you make it easy to specify the files by using a metacharacter:
run debug?.txt
where your files are debug0.txt, debug1.txt, ...?
The other answers also suggest alternatives.

Executing nested loops+foreach+csh

It's been a while since I used csh formatting and I am having a little bit of trouble with a few things. Things seem so much easier to execute in Matlab, however I need to do this on the terminal because of the programs I am trying to interact with.
So here's what I want to do: I have a file del.txt that is structured like this
1
2
3
4
etc. So each value is in its own row and there's one column for all the data. I have a bunch of other files within my directory. I want to match up, say, value 1 (which in this case is 1) with file 1, value 2 with file 2, and so on. So here's what I did...
Code:
#!/bin/csh
foreach a (`cat del.txt`)
foreach sta (`ls *.HHZ`)
echo a is $a
echo $sta
cat <<END>>macro.m
r $a
r $sta
END
sac macro.m
rm macro.m
end
end
However, what I get is that it loops through all of the values in del.txt for each file, then moves on to the next file in my directory and loops through all of the values again. I'm having trouble figuring out the format this should be in to match up the correct values. I'm not doing much within the script yet, until I can get them to match up.
Can someone tell me what I'm doing wrong? I read that the foreach command will execute all the commands on each file, but I haven't been able to find a way around this. What I want it to do is take value 1 from del.txt and match it up with file 1 (sta) from the directory, finish the loop, then take value 2 from del.txt and match it up with file 2 from the directory (sta).
I've never done more than simple iterations with csh on one subset of files, and I am not sure how to reference the values to one another. I haven't found a simple way to do this without writing everything out. I looked at the 'for' and 'while' commands; if there is a simple way to do it, I'm not seeing it. Any help would be greatly appreciated.
Cheers,
K
If I am understanding correctly, you have a txt file with a list of strings and you want to match it against the files.
I am assuming that by this particular statement of yours:
I want to match up say value 1 (which in this case is 1) with file 1
you mean to match the string at position 1 with the name of file 1.
Here's a possible solution based on this assumption (it will help you with the loop anyway):
# Store the values in file.txt in an array
set file_var = `cat file.txt`
# Store the file list of my_dir in a variable
set my_dir = <your dir path>
set file_list_var = `ls $my_dir`
# Print "Match Found" for every match
foreach var1 ($file_var)
foreach var2 ($file_list_var)
if ("$var1" == "$var2") echo "$var1 = $var2 : Match Found."
end
end
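If the goal is instead to pair the values and files by position (value 1 with file 1, value 2 with file 2), the logic is a single pass over both lists in step rather than a nested loop, which visits every combination. A minimal Python sketch of that pairing follows, only to illustrate the logic; the helper name and the file names are hypothetical, and it assumes the files should be taken in sorted order.

```python
def pair_values_with_files(values, files):
    """Pair the i-th value with the i-th file name (files taken in sorted order)."""
    # zip walks both sequences in step and stops at the shorter one.
    return list(zip(values, sorted(files)))


values = ["1", "2", "3"]             # e.g. the lines of del.txt
files = ["c.HHZ", "a.HHZ", "b.HHZ"]  # e.g. the *.HHZ files in the directory

for value, sta in pair_values_with_files(values, files):
    print(value, sta)
```

In csh the same effect would need one loop with an index used to pick the matching element from each list, instead of one foreach inside another.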

Resources