I was trying to write a Perl script to rename files (hundreds of files with different names), but I have not had any success. I first need to find the unique file number and then rename the file to something more human-readable. The file names are not sequential, which makes this difficult.
Examples of file names (the number of interest comes after the index sequence):
# vv-- this number
lane8-s244-index--ATTACTCG-TATAGCCT-01_S244_L008_R1_001.fastq
lane8-s245-index--ATTACTCG-ATAGAGGC-02_S245_L008_R1_001.fastq
lane8-s246-index--TCCGGAGA-TATAGCCT-09_S246_L008_R1_001.fastq
lane8-s247-index--TCCGGAGA-ATAGAGGC-10_S247_L008_R1_001.fastq
lane8-s248-index--TCCGGAGA-CCTATCCT-11_S248_L008_R1_001.fastq
lane8-s249-index--TCCGGAGA-GGCTCTGA-12_S249_L008_R1_001.fastq
lane8-s250-index--TCCGGAGA-AGGCGAAG-13_S250_L008_R1_001.fastq
lane8-s251-index--TCCGGAGA-TAATCTTA-14_S251_L008_R1_001.fastq
lane7-s0007-index--ATTACTCG-TATAGCCT-193_S7_L007_R1_001.fastq
lane7-s0008-index--ATTACTCG-ATAGAGGC-105_S8_L007_R1_001.fastq
lane7-s0009-index--ATTACTCG-CCTATCCT-195_S9_L007_R1_001.fastq
lane7-s0010-index--ATTACTCG-GGCTCTGA-106_S10_L007_R1_001.fastq
lane7-s0011-index--ATTACTCG-AGGCGAAG-197_S11_L007_R1_001.fastq
lane7-s0096-index--AGCGATAG-CAGGACGT-287_S96_L007_R1_001.fastq
I have created a file called RENAMING_parse_data.sh that references RENAMING_parse_data.pl.
So in theory the idea is that it parses each file name to find the sample # in the middle of the name, takes that unique ID, and renames the file. But I don't think it's even going into the if block.
Any ideas?
HERE IS THE .sh file that calls the Perl script:
#!/bin/bash
#first part is the program
#second is the directory path
#third and fourth items are the names of the output files
#./parse_data.pl /ACTF/Course/PATHTODIRECTORY Tabsummary.txt Strucsummary.txt
#WHERE ./parse_data.pl = name of the program
#WHERE /ACTF/Course/PATHTODIRECTORY = directory path where your files are saved AND is referred to as $dir_in = $ARGV[0] in the perl script;
#new files you are creating with the extracted data AND is referred to as $indv_list = $ARGV[1];
./RENAMING_parse_data.pl ./Test/ FishList.txt
HERE IS THE PERL SCRIPT:
#!/usr/bin/perl
print (":)\n");
#Processing files in a directory
$dir_in = $ARGV[0];
$indv_list = $ARGV[1];
#open directory to access those files, the folder where you have the files
opendir(DIR, $dir_in) || die ("Cannot open $dir_in");
@files = readdir(DIR);
#set all variables = 0 to avoid chaos
$j=0;
#open output header line for output file and print header line for tab delimited file
open(OUTFILETAB, ">", $indv_list);
print(OUTFILETAB "\t Fish ID", "\t");
#open each file
foreach (@files){
#restart all arrays to avoid chaos
print("in loop [$j]");
@acc_ID=();
#find FISH name
#EXAMPLE FISH NAMES: (length of fishname varies)
#lane8-s251-index--TCCGGAGA-TAATCTTA-14_S251_L008_R1_001.fastq.gz
#lane7-s0096-index--AGCGATAG-CAGGACGT-287_S96_L007_R1_001.final.fastq
#NOTE: what is between () is the ID that is printed. NOTE that the value can be 2-3 digits long depending on Sample #
#Trials:
#lane[0-9]{1}-[a-z]{1}[0-9]{4}-index--[A-Z]{8}[A-Z]{8}-([0-9]{3})[a-z]{1}[0-9]{2}_[A-Z]{1}[0-9]{3}_[a-z]{1}[0-9]{1}_[0-9]{3}.fastq
#lane[0-9]{1}-[a-z]{1}[0-9]{4}-index--[A-Z]{8}[A-Z]{8}-([0-9]{3})*.fastq
#lane*([0-9]{3})*.fastq
#lane.*-([0-9]{2})_.*.fastq
#lane.*-([0-9]{2})_*.fastq
#lane[0-9]{1}-[a-z]{1}[0-9]{3}-index--[A-Z]{8}[A-Z]{8}-([0-9]{2})_[A-Z]{1}[0-9]{3}_L008_R1_001.fastq
$string_FISH = @files;
if ($string_FISH =~ /^lane[0-9]{1}-[a-z]{1}[0-9]{3}-index--[A-Z]{8}[A-Z]{8}-([0-9]{2})_[A-Z]{1}[0-9]{3}_L008_R1_001.fastq/){
$FISH_ID =$1;
#acc_ID[$j] = $FISH_ID;
#print ("FISH. = |$FISH_ID[$j]| \n");
rename($string_FISH, "FISH. = |$FISH_ID[$j]|");
#print ($acc_ID[$j], "\n");
print(OUTFILETAB "FISH. = |$FISH_ID[$j]| \n");
}
$j= $j+1;
}
IDEAL END RESULT
So in the end I would like it to take the file name, find the unique identifier and rename it
from :
lane8-s244-index--ATTACTCG-TATAGCCT-01_S244_L008_R1_001.fastq
lane7-s0007-index--ATTACTCG-TATAGCCT-193_S7_L007_R1_001.fastq
to:
Fish.01.fastq
Fish.193.fastq
Any ideas or suggestions on how to fix this, or on whether it needs to change completely, are greatly appreciated.
At the core of a Perl solution, you could use
s/^.*-(\d+)_[^-]+(?=\.fastq\z)/Fish.$1/sa
For example,
$ ls -1 *.fastq
lane8-s244-index--ATTACTCG-TATAGCCT-01_S244_L008_R1_001.fastq
lane8-s245-index--ATTACTCG-ATAGAGGC-02_S245_L008_R1_001.fastq
lane8-s246-index--TCCGGAGA-TATAGCCT-09_S246_L008_R1_001.fastq
lane8-s247-index--TCCGGAGA-ATAGAGGC-10_S247_L008_R1_001.fastq
lane8-s248-index--TCCGGAGA-CCTATCCT-11_S248_L008_R1_001.fastq
lane8-s249-index--TCCGGAGA-GGCTCTGA-12_S249_L008_R1_001.fastq
$ rename 's/^.*-(\d+)_[^-]+(?=\.fastq\z)/Fish.$1/sa' *.fastq
$ ls -1 *.fastq
Fish.01.fastq
Fish.02.fastq
Fish.09.fastq
Fish.10.fastq
Fish.11.fastq
Fish.12.fastq
(There are two similar tools named rename. This one is also known as prename.)
It's pretty simple to implement yourself:
#!/usr/bin/perl
use strict;
use warnings;
my $errors = 0;
for (@ARGV) {
   my $old = $_;
   s/^.*-(\d+)_[^-]+(?=\.fastq\z)/Fish.$1/sa;
   my $new = $_;
   next if $new eq $old;
   if ( -e $new ) {
      warn( "Can't rename \"$old\" to \"$new\": Already exists\n" );
      ++$errors;
   }
   elsif ( !rename( $old, $new ) ) {
      warn( "Can't rename \"$old\" to \"$new\": $!\n" );
      ++$errors;
   }
}
exit( !!$errors );
Provide the files to rename as arguments (e.g. using *.fastq from the shell).
$ ls -1 *.fastq
lane8-s244-index--ATTACTCG-TATAGCCT-01_S244_L008_R1_001.fastq
lane8-s245-index--ATTACTCG-ATAGAGGC-02_S245_L008_R1_001.fastq
lane8-s246-index--TCCGGAGA-TATAGCCT-09_S246_L008_R1_001.fastq
lane8-s247-index--TCCGGAGA-ATAGAGGC-10_S247_L008_R1_001.fastq
lane8-s248-index--TCCGGAGA-CCTATCCT-11_S248_L008_R1_001.fastq
lane8-s249-index--TCCGGAGA-GGCTCTGA-12_S249_L008_R1_001.fastq
$ ./a *.fastq
$ ls -1 *.fastq
Fish.01.fastq
Fish.02.fastq
Fish.09.fastq
Fish.10.fastq
Fish.11.fastq
Fish.12.fastq
The existence check (-e) is to prevent accidentally renaming a bunch of files to the same name and therefore losing all but one of them.
The above is a cleaned-up version of a one-liner pattern I often use.
dir /b ... | perl -nle"$o=$_; s/.../.../; $n=$_; rename$o,$n if!-e$n"
Adapted to sh:
\ls ... | perl -nle'$o=$_; s/.../.../; $n=$_; rename$o,$n if!-e$n'
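For comparison, the same rename-with-collision-check logic can be sketched in Python. This is only an illustration under a few assumptions: the function names new_name and rename_all are made up for this sketch, the pattern is a direct transliteration of the Perl substitution above, and the script is assumed to run in the directory that contains the files.

```python
import os
import re
import sys

# Transliteration of s/^.*-(\d+)_[^-]+(?=\.fastq\z)/Fish.$1/sa:
# DOTALL plays the role of /s, ASCII the role of /a, and \Z is Python's \z.
PATTERN = re.compile(r"^.*-(\d+)_[^-]+(?=\.fastq\Z)", re.DOTALL | re.ASCII)


def new_name(old):
    """Return the Fish.<number>.fastq name, or the input unchanged if it does not match."""
    return PATTERN.sub(r"Fish.\1", old)


def rename_all(paths):
    """Rename each path; refuse to overwrite an existing file. Returns the error count."""
    errors = 0
    for old in paths:
        new = new_name(old)
        if new == old:
            continue  # pattern did not match, nothing to do
        if os.path.exists(new):
            print(f"Can't rename {old!r} to {new!r}: already exists", file=sys.stderr)
            errors += 1
            continue
        try:
            os.rename(old, new)
        except OSError as exc:
            print(f"Can't rename {old!r} to {new!r}: {exc}", file=sys.stderr)
            errors += 1
    return errors


# Preview the mapping without touching the filesystem.
for example in (
    "lane8-s244-index--ATTACTCG-TATAGCCT-01_S244_L008_R1_001.fastq",
    "lane7-s0007-index--ATTACTCG-TATAGCCT-193_S7_L007_R1_001.fastq",
):
    print(example, "->", new_name(example))
```

Calling rename_all(sys.argv[1:]) from a script would mirror the Perl program, with the files given as arguments.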
So I have a few files with spaces in their names. I first remove the spaces and put the new names in a list, then I construct a command that takes the old file name (in quotes) and the new file name and renames the file using mv and subprocess.run, but Python is spitting out errors:
Here's the code:
import os
import subprocess
file_path = "/home/emil/import/"
files = os.listdir(file_path)
new_files = []
#Create new names
for each in files:
    print ("Taking file: ", each)
    new_name = each.replace(" ", "")
    final_name = file_path + new_name
    new_files.append(final_name)
    cmd = "mv \"" + file_path + each + "\" " + final_name
    print (cmd)
    subprocess.run([cmd])
And here is the output:
emil@TITAN:~/programmering$ ./call.py
Taking file: test file1.pdf
mv "/home/emil/import/test file1.pdf" /home/emil/import/testfile1.pdf
Traceback (most recent call last):
File "./call.py", line 22, in <module>
subprocess.run([cmd])
File "/usr/lib/python3.7/subprocess.py", line 472, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/lib/python3.7/subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.7/subprocess.py", line 1522, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mv "/home/emil/import/test file1.pdf" /home/emil/import/testfile1.pdf': 'mv "/home/emil/import/test file1.pdf" /home/emil/import/testfile1.pdf'
As you can see I print the variable cmd which I am sending to subprocess, and if I run that command individually it works great:
emil@TITAN:~/programmering$ mv "/home/emil/import/test file1.pdf" /home/emil/import/testfile1.pdf
emil@TITAN:~/programmering$ ls /home/emil/import/
'test file2.pdf' 'test file3.pdf' testfile1.pdf
So where am I going wrong?
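A likely explanation, offered as a sketch rather than a definitive answer: when subprocess.run is given a list, the first element is the program to execute and the remaining elements are its arguments. A one-element list holding the whole command line therefore asks for a program literally named `mv "/home/emil/import/test file1.pdf" ...`, which matches the FileNotFoundError in the traceback. Passing ["mv", src, dst] as separate items avoids that. The sketch below goes one step further and uses os.rename, so no external command or shell quoting is involved. The function name strip_spaces and the skip-on-collision policy are assumptions made for this example.

```python
import os


def strip_spaces(directory):
    """Rename every entry in `directory` whose name contains spaces.

    Returns a list of (old_name, new_name) pairs that were renamed.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        new_name = name.replace(" ", "")
        if new_name == name:
            continue  # no spaces, nothing to do
        src = os.path.join(directory, name)
        dst = os.path.join(directory, new_name)
        if os.path.exists(dst):
            # Assumption: skip rather than overwrite an existing file.
            print(f"Skipping {name!r}: {new_name!r} already exists")
            continue
        # Equivalent external call, if mv is really wanted:
        #   subprocess.run(["mv", src, dst], check=True)
        os.rename(src, dst)
        renamed.append((name, new_name))
    return renamed
```

With the directory from the question this would be called as strip_spaces("/home/emil/import/").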
I have 3000 files in c:\data\, and I need to replace a static string in each of them with the name of the file. For example, in the file 12345678.txt there will be some records along with the string 99999999, and I want to replace 99999999 with the filename 12345678.
How can I do this using a batch script?
Try this:
replace_string="99999999"
for f in *.txt; do
    sed -i "s/${replace_string}/${f%.*}/g" "$f";
done
Explanation:
for f in *.txt; do ... done: Loops through the files matching *.txt in the current directory.
sed -i ... file: Edits file in place (-i).
"s/pattern/replacement/g": Substitutes (s) pattern with replacement globally (g).
${f%.*}: The file name without its extension.
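Since the question mentions c:\data\ and the loop above needs a Unix-like shell, the same idea can also be sketched in Python, which runs on Windows as well. This is an illustration only: the function name replace_with_stem is made up, and the files are assumed to be small text files readable with the platform's default encoding.

```python
from pathlib import Path


def replace_with_stem(directory, placeholder="99999999"):
    """Replace `placeholder` in every *.txt file with that file's name minus extension.

    Returns the names of the files that were modified.
    """
    changed = []
    for path in sorted(Path(directory).glob("*.txt")):
        text = path.read_text()
        if placeholder not in text:
            continue  # leave untouched files alone
        # path.stem is the file name without its last extension,
        # comparable to ${f%.*} in the shell loop.
        path.write_text(text.replace(placeholder, path.stem))
        changed.append(path.name)
    return changed
```

For the layout in the question the call would be replace_with_stem(r"c:\data").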
With GNU tools:
find . -regex '.*/[0-9]+\.txt' -type f -exec gawk -i inplace '
BEGINFILE {f = FILENAME; sub(".*/", "", f); sub(/\..*/, "", f)}
{gsub(/\<99999999\>/, f); print}' {} +
I have 6 files in a directory; A.v, A.def, B.v, B.def, C.v, C.def.
I want to read the A.v and A.def files at once and go to B.v and B.def and so on. I am using the following snippet to carry out the above-said function but it is throwing errors.
foreach i [glob "./*.v"] {
read_verilog $i.v
read_def $i.def
}
I would like to set the variable to read just the name A, B, C etc.
You want to use the file rootname command here:
foreach file [glob -nocomplain "./*.v"] {
read_verilog $file
read_def [file rootname $file].def
}
I have a bunch of files named like so:
output_1.png
output_2.png
...
output_10.png
...
output_120.png
What is the easiest way of renaming those to match a convention, e.g. zero-padded to four digits, so that the files are named:
output_0001.png
output_0002.png
...
output_0010.png
output_0120.png
This should be easy in Unix/Linux/BSD, although I also have access to Windows. Any language is fine, but I'm interested in some really neat one-liners (if there are any?).
Python
import os
path = '/path/to/files/'
for filename in os.listdir(path):
    prefix, num = filename[:-4].split('_')
    num = num.zfill(4)
    new_filename = prefix + "_" + num + ".png"
    os.rename(os.path.join(path, filename), os.path.join(path, new_filename))
You could compile a list of valid filenames, assuming that all files that start with "output_" and end with ".png" are valid files:
l = [(x, "output_" + x[7:-4].zfill(4) + ".png") for x in os.listdir(path) if x.startswith("output_") and x.endswith(".png")]
for oldname, newname in l:
    os.rename(os.path.join(path,oldname), os.path.join(path,newname))
Bash
(from: http://www.walkingrandomly.com/?p=2850)
In other words I replace file1.png with file001.png and file20.png with file020.png and so on. Here’s how to do that in bash
#!/bin/bash
num=`expr match "$1" '[^0-9]*\([0-9]\+\).*'`
paddednum=`printf "%03d" $num`
echo ${1/$num/$paddednum}
Save the above to a file called zeropad.sh and then do the following command to make it executable
chmod +x ./zeropad.sh
You can then use the zeropad.sh script as follows
./zeropad.sh frame1.png
which will return the result
frame001.png
All that remains is to use this script to rename all of the .png files in the current directory such that they are zeropadded.
for i in *.png;do mv "$i" "`./zeropad.sh "$i"`"; done
Perl
(from: Zero pad rename e.g. Image (2).jpg -> Image (002).jpg)
use strict;
use warnings;
use File::Find;
sub pad_left {
my $num = shift;
if ($num < 10) {
$num = "00$num";
}
elsif ($num < 100) {
$num = "0$num";
}
return $num;
}
sub new_name {
if (/\.jpg$/) {
my $name = $File::Find::name;
my $new_name;
($new_name = $name) =~ s/^(.+\/[\w ]+\()(\d+)\)/$1 . &pad_left($2) .')'/e;
rename($name, $new_name);
print "$name --> $new_name\n";
}
}
chomp(my $localdir = `pwd`);# invoke the script in the parent-directory of the
# image-containing sub-directories
find(\&new_name, $localdir);
Rename
Also from above answer:
rename 's/\d+/sprintf("%04d",$&)/e' *.png
Fairly easy, although it combines a few features not immediately obvious:
@echo off
setlocal enableextensions enabledelayedexpansion
rem iterate over all PNG files:
for %%f in (*.png) do (
rem store file name without extension
set FileName=%%~nf
rem strip the "output_"
set FileName=!FileName:output_=!
rem Add leading zeroes:
set FileName=000!FileName!
rem Trim to only four digits, from the end
set FileName=!FileName:~-4!
rem Add "output_" and extension again
set FileName=output_!FileName!%%~xf
rem Rename the file
rename "%%f" "!FileName!"
)
Edit: Misread that you're not after a batch file but any solution in any language. Sorry for that. To make up for it, a PowerShell one-liner:
gci *.png|%{rni $_ ('output_{0:0000}.png' -f +($_.basename-split'_')[1])}
Stick a ?{$_.basename-match'_\d+'} in there if you have other files that do not follow that pattern.
I actually just needed to do this on OSX. Here's the script I created for it - single line!
for i in output_*.png;do mv $i `printf output_%04d.png $(echo $i | sed 's/[^0-9]*//g')`; done
For mass renaming the only safe solution is mmv—it checks for collisions and allows renaming in chains and cycles, something that is beyond most scripts. Unfortunately, zero padding it ain't too hot at. A flavour:
c:> mmv output_[0-9].png output_000#1.png
Here's one workaround:
c:> type file
mmv
[^0-9][0-9] #1\00#2
[^0-9][0-9][^0-9] #1\00#2#3
[^0-9][0-9][0-9] #1\0#2#3
[^0-9][0-9][0-9][^0-9] #1\0#2#3
c:> mmv <file
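To illustrate why chains and cycles are beyond a simple rename loop: swapping a.txt and b.txt one rename at a time overwrites one of them. A common workaround is to check all targets first and then go through temporary names. The sketch below shows that idea in Python. It is not how mmv is implemented, and the function name rename_batch and the temporary-name scheme are assumptions made for this example.

```python
import os
import uuid


def rename_batch(directory, mapping):
    """Apply {old_name: new_name} inside `directory`, tolerating chains and cycles.

    All checks happen before any file is touched.
    """
    targets = list(mapping.values())
    if len(set(targets)) != len(targets):
        raise ValueError("two sources map to the same target")
    for new in targets:
        # A target may exist only if it is itself a source that will be moved away.
        if new not in mapping and os.path.exists(os.path.join(directory, new)):
            raise FileExistsError(new)

    # Phase 1: move every source to a unique temporary name.
    staged = {}
    for old, new in mapping.items():
        tmp = ".rename-tmp-" + uuid.uuid4().hex
        os.rename(os.path.join(directory, old), os.path.join(directory, tmp))
        staged[tmp] = new

    # Phase 2: move the temporaries to their final names.
    for tmp, new in staged.items():
        os.rename(os.path.join(directory, tmp), os.path.join(directory, new))
```

For example, rename_batch(".", {"a.txt": "b.txt", "b.txt": "a.txt"}) swaps the two files, which a plain loop of renames cannot do safely.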
Here is a Python script I wrote that pads zeroes depending on the largest number present and ignores non-numbered files in the given directory. Usage:
python ensure_zero_padding_in_numbering_of_files.py /path/to/directory
Body of script:
import argparse
import os
import re
import sys
def main(cmdline):
    parser = argparse.ArgumentParser(
        description='Ensure zero padding in numbering of files.')
    parser.add_argument('path', type=str,
                        help='path to the directory containing the files')
    # Parse the arguments handed to main() instead of reading sys.argv implicitly
    args = parser.parse_args(cmdline)
    path = args.path
    # Lazy prefix, a run of digits directly before a dot, then the rest of the name
    numbered = re.compile(r'(.*?)(\d+)\.(.*)')
    numbered_fnames = [fname for fname in os.listdir(path)
                       if numbered.search(fname)]
    max_digits = max(len(numbered.search(fname).group(2))
                     for fname in numbered_fnames)
    for fname in numbered_fnames:
        _, prefix, num, ext, _ = numbered.split(fname, maxsplit=1)
        num = num.zfill(max_digits)
        new_fname = "{}{}.{}".format(prefix, num, ext)
        if fname != new_fname:
            os.rename(os.path.join(path, fname), os.path.join(path, new_fname))
            print("Renamed {} to {}".format(fname, new_fname))
        else:
            print("{} seems fine".format(fname))
if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
$rename output_ output_0 output_? # adding 1 zero to names ended in 1 digit
$rename output_ output_0 output_?? # adding 1 zero to names ended in 2 digits
$rename output_ output_0 output_??? # adding 1 zero to names ended in 3 digits
That's it!
with bash split,
linux
for f in *.png;do n=${f#*_};n=${n%.*};mv "$f" $(printf output_"%04d".png $n);done
windows(bash)
for f in *.png;do n=${f#*_};mv $f $(printf output_"%08s" $n);done
I'm following on from Adam's solution for OSX.
Some gotchas I encountered in my scenario were:
I had a set of .mp3 files, so the sed was catching the '3' in the '.mp3' suffix. (I used basename instead of echo to rectify this)
My .mp3's had spaces within their names, E.g., "audio track 1.mp3", this was causing basename+sed to screw up a little bit, so I had to quote the "$i" parameter.
In the end, my conversion line looked like this:
for i in *.mp3 ; do mv "$i" `printf "track_%02d.mp3\n" $(basename "$i" .mp3 | sed 's/[^0-9]*//g')` ; done
Using ls + awk + sh:
ls -1 | awk -F_ '{printf "%s%04d.png\n", "mv "$0" "$1"_", $2}' | sh
If you want to test the command before running it, just remove the | sh
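The same preview-then-run idea can be sketched in Python: build the list of (old, new) pairs first, print it, and only rename when explicitly asked. The names plan_renames and apply_plan, the dry_run flag, and the strict output_<number>.png pattern are assumptions made for this sketch.

```python
import os
import re

# Only names of the form output_<digits>.png are considered.
NUMBERED = re.compile(r"^(output_)(\d+)(\.png)$")


def plan_renames(names, width=4):
    """Return (old, new) pairs for names whose number needs zero padding."""
    plan = []
    for name in sorted(names):
        match = NUMBERED.match(name)
        if not match:
            continue  # ignore files that do not follow the pattern
        padded = match.group(1) + match.group(2).zfill(width) + match.group(3)
        if padded != name:
            plan.append((name, padded))
    return plan


def apply_plan(directory, plan, dry_run=True):
    """Print each planned rename; perform it only when dry_run is False."""
    for old, new in plan:
        print(f"mv {old} {new}")
        if dry_run:
            continue
        dst = os.path.join(directory, new)
        if os.path.exists(dst):
            raise FileExistsError(dst)  # never overwrite an existing file
        os.rename(os.path.join(directory, old), dst)


# Preview only: nothing is renamed here.
apply_plan(".", plan_renames(["output_1.png", "output_10.png", "notes.txt"]))
```

Running apply_plan(path, plan_renames(os.listdir(path)), dry_run=False) would then perform the renames after the preview looks right.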
I just wanted to make a time-lapse movie using
ffmpeg -pattern_type glob -i "*.jpg" -s:v 1920x1080 -c:v libx264 output.mp4
and ran into a similar problem.
[image2 @ 000000000039c300] Pattern type 'glob' was selected but globbing is not supported by this libavformat build
Globbing is not supported on Windows 7.
Also, if the file list looks like the one below and you use %2d.jpg or %02d.jpg:
1.jpg
2.jpg
...
10.jpg
11.jpg
...
[image2 @ 00000000005ea9c0] Could find no file with path '%2d.jpg' and index in the range 0-4
%2d.jpg: No such file or directory
[image2 @ 00000000005aa980] Could find no file with path '%02d.jpg' and index in the range 0-4
%02d.jpg: No such file or directory
Here is my batch script to rename the files:
@echo off
setlocal enabledelayedexpansion
set i=1000000
set X=1
for %%a in (*.jpg) do (
set /a i+=1
set "filename=!i:~%X%!"
echo ren "%%a" "!filename!%%~xa"
ren "%%a" "!filename!%%~xa"
)
After renaming 143,323 jpg files:
ffmpeg -i %6d.jpg -s:v 1920x1080 -c:v libx264 output.mp4