I have one folder with about 1000 files and I want to group them according to their respective parent folders.
I ran ls -R > updated.txt to capture the original layout of folders and files.
The updated.txt file looks like this:
./Rhodococcus_RHA1:
NC_008268.fna
NC_008269.fna
NC_008270.fna
NC_008271.fna
./Rhodoferax_ferrireducens_T118:
NC_007901.fna
NC_007908.fna
./Rhodopseudomonas_palustris_BisA53:
NC_008435.fna
./Rhodopseudomonas_palustris_BisB18:
NC_007925.fna
./Rhodopseudomonas_palustris_BisB5:
NC_007958.fna
./Rhodopseudomonas_palustris_CGA009:
NC_005296.fna
NC_005297.fna
So, by looking at this file, I know what files go into what folder. The folder with all the 1000 files together looks like this:
results_NC_004193.fna.1.ebwt.map
results_NC_004307.fna.1.ebwt.map
results_NC_004310.fna.1.ebwt.map
results_NC_004311.fna.1.ebwt.map
results_NC_004337.fna.1.ebwt.map
results_NC_004342.fna.1.ebwt.map
results_NC_004343.fna.1.ebwt.map
results_NC_004344.fna.1.ebwt.map
and so on...
You can see that the names of all the 1000 files are derived from their original names in the folder setup (if that's a good way to explain it).
I want to move these results_XXXXXXXX files to folders (have to create new folders) with the original setup. So it should be something like this:
./Rhodococcus_RHA1: (this is a folder)
results_NC_008268.fna.1.ebwt.map
results_NC_008269.fna.1.ebwt.map
results_NC_008270.fna.1.ebwt.map
results_NC_008271.fna.1.ebwt.map
./Rhodoferax_ferrireducens_T118:
results_NC_007901.fna.1.ebwt.map
results_NC_007908.fna.1.ebwt.map
I don't really know how to do this... maybe some kind of mov command? I'd appreciate help with this problem.
Run the following command from the folder that holds those 1000 files. Here path/to/original/files is the path to the original folder tree (the one you ran ls -R on). The command only prints a list of mv commands; it does not move anything yet. Check several of them to confirm they are correct. If they are, append | sh to the command and rerun it to execute them. If some of the original files have no corresponding results file among the 1000, the matching mv commands will fail with a "file not found" error; those errors can be ignored or redirected to /dev/null. This assumes that every results file has a matching file in the original folders, which is how the command knows where to move it. Any results file without one will not be moved. As always, take a good backup before you do this.
find path/to/original/files -type f | awk -F"/" '{ path=$0; sub($NF, "", path); printf("mv results_%s.1.ebwt.map \"%s\"\n", $NF, path);}'
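If the original folder tree is no longer available and only updated.txt is left, the listing itself can drive the move. The following is a minimal sketch under that assumption, not a tested replacement for the command above; the function name regroup and its three arguments (listing file, folder with the results files, destination root) are made up for illustration. It creates each folder on demand and silently skips results files that do not exist.

```shell
#!/bin/sh
# regroup LISTING SRC DEST  (hypothetical helper, names are assumptions)
# LISTING is the output of "ls -R", SRC holds the results_* files,
# DEST is where the per-genome folders are recreated.
regroup() {
    listing=$1 src=$2 dest=$3
    dir=
    while IFS= read -r line; do
        case $line in
            ./*:)                       # folder header such as "./Rhodococcus_RHA1:"
                dir=${line#./}
                dir=${dir%:}
                ;;
            '')                         # blank separator line
                ;;
            *)                          # a file name such as "NC_008268.fna"
                f="$src/results_$line.1.ebwt.map"
                if [ -n "$dir" ] && [ -f "$f" ]; then
                    mkdir -p "$dest/$dir" && mv -- "$f" "$dest/$dir/"
                fi
                ;;
        esac
    done < "$listing"
}

# Example call, run from the folder with the 1000 files:
#   regroup updated.txt . ./grouped
```

Trying it on a copy of the data first is the safer route, for the same reason as the backup advice above.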
I have tried to find an answer to my question by looking at similar topics but didn't succeed. Maybe I have overlooked something. Any help is appreciated!
So, I have hundreds of folders in my current directory named from folder1000 to folder1500. In each folder, I have one .fastq file with a different name (Lib1.fastq, Lib2.fastq, etc). I want to process each of these files in one loop by running a shell script.
Here is my shell script (script.sh) for one file (it creates outputs which are then processed further), which I run in my Terminal:
#!/bin/sh
bowtie --threads 4 -v 2 -m 10 -a genome Lib1.fastq --sam > Lib1.sam
samtools view -h -o Lib1.sam Lib1.bam
sort -k 3,3 -k 4,4n Lib1.sam > Lib1.sam.sorted
# ...etc
Here is the loop I am trying to make as well in a shell script (here I have started only with a simple checking "head" command, and only with 5 first folders) which I run from my current directory where all folders are located:
#!/bin/sh
for file in ./folder{1000..1005}
do
head -10 *.fastq
done
But as a result I get:
head: *.fastq: No such file or directory
head: *.fastq: No such file or directory
head: *.fastq: No such file or directory
head: *.fastq: No such file or directory
head: *.fastq: No such file or directory
So, even a simple checking command does not work for me in a loop. Somehow I can not see the file. But if I run the command directly in one of the folders:
MacBook-Air-Maxim:folder1000 maxim$ head -10 *.fastq
then I get the correct result (the first 10 lines of the file displayed).
Could anyone suggest the way to process all files in the most convenient way?
Thanks a lot and very sorry, I am just learning.
Well, you are traversing through the folders using the variable $file, but you are not using this variable in the loop body. Just use it:
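#!/bin/bash
# {1000..1005} is a bash brace expansion, so run this with bash.
for file in ./folder{1000..1005}
do
    # Quote the variable so folder names with spaces still work.
    head -10 "$file"/*.fastq
done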
There are other issues in the overall problem, but this is the answer to the point that is stopping you. Let's tackle the problems one by one :-)
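One of those other issues is that the output names (Lib1.sam and so on) are hard-coded. A possible way around it, sketched here under assumptions, is to loop over the .fastq files themselves and derive every output name from the input name. The functions process_one and process_all are hypothetical, and process_one only runs head as a stand-in for the real bowtie/samtools/sort steps.

```shell
#!/bin/bash
# process_one FASTQ PREFIX  (hypothetical stand-in for the real pipeline)
# FASTQ is the input file, PREFIX is the same path without ".fastq".
process_one() {
    head -n 10 "$1" > "$2.head"
}

# process_all ROOT: run process_one on every ROOT/folder*/NAME.fastq
process_all() {
    root=$1
    for fastq in "$root"/folder*/*.fastq; do
        [ -e "$fastq" ] || continue          # the glob matched nothing
        process_one "$fastq" "${fastq%.fastq}"
    done
}

# Example call from the directory that contains folder1000 ... folder1500:
#   process_all .
```

Because the prefix is built with ${fastq%.fastq}, each output lands next to its input, for example folder1000/Lib1.head.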
I'm a near complete beginner to batch scripting.
I'm currently learning how to create batch files. My goal is to compress a folder using exclusively InfoZip, add the date to the file name, and have that file copied to a USB memory stick plugged in as H:\
The reason why I need to use InfoZip, even though it is a very old program, is that I need something that works even on Win95.
InfoZip is not installed; it is just unpacked in a folder and ready to use.
It is possible to download InfoZip 3.0 from here:
https://sourceforge.net/projects/infozip/
Anyway, so far, the only thing I could come up with is this...
--------------------------------------
Title : Your folder will be zipped into an archive that will be copied on the USB memory stick plugged on your computer. Please DO NOT remove the memory stick during the operation.
@ECHO OFF
call d:\infozip\wiz.exe
pause
--------------------------------------
It just brings up the InfoZip window on the screen, but then I have absolutely no idea how to make it zip a folder, add the date, and copy that zipped file to the USB stick.
All the regular commands meant for 7-zip or Winzip don't seem to work with InfoZip.
I could really use some help, please :)
Thanks!
Using the waybackmachine I was able to get the documentation for info-zip:
https://web.archive.org/web/20170829173722/http://www.info-zip.org/mans/zip.html#EXAMPLES
Contrary to what zip.exe shows, the syntax for zipping files is this:
zip -r zipfilename zipfilecontents
Example:
zip -r myzip.zip c:\myfolder\*.*
The -r parameter includes subfolders as well.
The problem is that the complete folder structure is included in the zip. I have not found a solution for this yet.
To solve the folder structure problem, first cd into the folder that contains your file or files, and only then run the zip command proposed by Martien de Jong above.
For example:
The path of my file is: C:\aa\B\file.txt
So the folder to target with cd is: C:\aa\B
cd C:\aa\B
zip -r myzip.zip *.*
*Remember that this will zip all the files in the B folder.
I have a folder with a few files in it; I like to keep my folder clean of any stray files that can end up in it. Such stray files may include automatically generated backup files or log files, but could be as simple as someone accidentally saving to the wrong folder (my folder).
Rather than having to pick through all this all the time, I would like to know if I can create a batch file that keeps only a number of specified files (by name and location) and deletes anything not on the "list".
[edit] Sorry when I first saw the question I read bash instead of batch. I don't delete the not so useful answer since as was pointed out in the comments it could be done with cygwin.
You can list the files, exclude the ones you want to keep with grep, and then pass the rest to rm.
If all the files are in one directory:
ls | grep -v -f ~/.list_of_files_to_exclude | xargs rm
or in a directory tree
find . -type f | grep -v -f ~/.list_of_files_to_exclude | xargs rm
where ~/.list_of_files_to_exclude is a file with the list of patterns to exclude (one per line)
Before testing it make a backup copy and substitute rm with echo to see if the output is really what you want.
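Note that grep -f treats each line of the list as a regular expression and matches substrings, so a pattern like a.txt also protects data.txt.bak. If the list holds exact relative paths instead of patterns, a stricter variant is possible. This is only a sketch under that assumption: list_strays and delete_strays are made-up names, the keep list must live outside the folder being cleaned, and file names containing newlines are not handled.

```shell
#!/bin/sh
# list_strays KEEP_LIST DIR  (hypothetical helper)
# Prints every file under DIR whose relative path is not an exact,
# whole-line match for an entry in KEEP_LIST (grep -x -F).
list_strays() {
    keep_list=$1 dir=$2
    ( cd "$dir" && find . -type f ) | sed 's|^\./||' | grep -vxFf "$keep_list"
}

# delete_strays KEEP_LIST DIR: remove what list_strays reports.
delete_strays() {
    list_strays "$1" "$2" | while IFS= read -r f; do
        rm -- "$2/$f"
    done
}

# Dry run first, then delete:
#   list_strays ~/.keep_list /path/to/folder
#   delete_strays ~/.keep_list /path/to/folder
```

Running list_strays on its own plays the same role as substituting echo for rm: it shows what would go before anything is removed.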
White lists for file survival is an incredibly dangerous concept. I would strongly suggest rethinking that.
If you must do it, might I suggest that you actually implement it thus:
Move ALL files to a backup area (one created per run such as a directory containing the current date and time).
Use your white list to copy back the files that you want to keep, such as with copy c:\backups\2011_04_07_11_52_04\*.cpp c:\original_dir.
That way, you keep all the non-white-listed files in case you screw up (and you will at some point, trust me), and you don't have to worry about negative logic in your batch file (remove all files that aren't of all these types); instead you use the simpler option (move back every file that is of each type).
I have a folder with 2508 files (jpg and pdf) in it on my drive. I have a list in a .txt file of about 1000 files which I want to remove from that folder - either by deleting to trash, or removing to another folder.
Is there a utility - or possibly commands I can put into the terminal - which I can use to do this, without manually moving files while looking through my list?
(Context: The list is a list of orphaned image files put out by Dreamweaver, which I want to remove from the images folder of a given site.)
Any help appreciated.
Thanks.
Put this in the terminal:
cat filename-of-list-with-files | xargs rm
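The one-liner above splits on whitespace, so it misbehaves for file names that contain spaces, and it deletes outright. A gentler variant, sketched here under assumptions, reads the list one line at a time and moves the files to another folder instead. The function move_listed is a made-up name, and the sketch assumes the list holds one file name per line, relative to the source folder, with Unix line endings.

```shell
#!/bin/sh
# move_listed LIST SRC DEST  (hypothetical helper)
# Moves every file named in LIST from SRC into DEST, one name per line.
move_listed() {
    list=$1 src=$2 dest=$3
    mkdir -p "$dest" || return 1
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue               # skip blank lines
        if [ -e "$src/$name" ]; then
            mv -- "$src/$name" "$dest/"
        else
            printf 'not found: %s\n' "$name" >&2
        fi
    done < "$list"
}

# Example call:
#   move_listed orphans.txt ./images ./images_removed
```

Once the destination folder has been checked, it can be deleted or dragged to the trash in one go.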
I have been cat'ing files in the Terminal until now, but that is time consuming when done a lot. What I want is something like this:
I have a folder with hundreds of files, and I want to effectively cat a few files together.
For example, is there a way to select (in the Finder) four split files;
file.txt.001, file.txt.002, file.txt.003, file.txt.004
.. and then right click on them in the Finder, and just click Merge?
I know that isn't possible out of the box of course, but with an Automator action, droplet or shell script, is something like that possible to do? Or maybe assigning that cat-action a keyboard shortcut, and when hit selected files in the Finder, will be automatically merged together to a new file AND placed in the same folder, WITH a name based on the original split files?
In this example file.001 through file.004 would magically appear in the same folder, as a file named fileMerged.txt ?
I have like a million of these kind of split files, so an efficient workflow for this would be a life saver. I'm working on an interactive book, and the publisher gave me this task..
cat * > output.file
works as a sh script. It concatenates the files and redirects the result into output.file.
* expands to all files in the directory.
Judging from your description of the file names you can automate that very easily with bash. e.g.
PREFIXES=$(ls -1 | grep -o "^.*\." | uniq)
for PREFIX in $PREFIXES; do cat "${PREFIX}"* > "${PREFIX}all"; done
This will merge all files in one directory that share the same prefix.
ls -1 lists all files in a directory (if the files span multiple directories, you can use find instead). grep -o "^.*\." matches everything up to the last dot in the file name (you could also use sed -e 's/\.[0-9]*$/./' to strip the trailing digits). uniq filters out the duplicates. You then have something like speech1.txt. sound1.txt. in the PREFIXES variable. The next line loops through those prefixes and merges each group of files using the * wildcard.
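For the naming asked about in the question (file.txt.001 and its siblings ending up as fileMerged.txt in the same folder), a variant along these lines might work. It is a sketch under assumptions: merge_parts is a made-up name, parts are expected to end in exactly three digits, and every group is expected to start at .001. File names with spaces are fine because nothing is parsed from ls output.

```shell
#!/bin/sh
# merge_parts DIR  (hypothetical helper)
# For every DIR/NAME.EXT.001, concatenates NAME.EXT.001, NAME.EXT.002, ...
# in order into DIR/NAMEMerged.EXT.
merge_parts() {
    dir=$1
    for first in "$dir"/*.001; do
        [ -e "$first" ] || continue              # no split files here
        base=${first%.001}                       # e.g. DIR/file.txt
        name=${base##*/}                         # e.g. file.txt
        case $name in
            *.*) out="$dir/${name%.*}Merged.${name##*.}" ;;
            *)   out="$dir/${name}Merged" ;;
        esac
        # The glob expands in sorted order, so 001 comes before 002, etc.
        cat "$base".[0-9][0-9][0-9] > "$out"
    done
}

# Example call:
#   merge_parts /path/to/folder
```

A script like this could also be wrapped in an Automator "Run Shell Script" action to get the right-click workflow, though that part is not shown here.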