Reading multiple files with the Ruby gem 'Yomu' - arrays

I'm trying to download documents and strip out the document metadata with the Yomu gem, but I cannot find guidance for parsing multiple files. The semi-working code is below, and it should work if you put some PDF files in the same directory as the script.
require 'yomu'
dir = Dir.pwd
files = Dir["#{dir}/*.pdf"]
def allpdf(files)
  filearray = []
  files.each do |file|
    filearray << file
  end
  filearray
end
def metadata(dir, allfiles)
  array = []
  allfiles.each do |file|
    yomufile = Yomu.new file
    array << yomufile.metadata["Author"]
    puts array
  end
end
allfiles = allpdf(files)
metadata(dir, allfiles)
So when I 'puts array' it spits out what I would expect. But if I call 'array' outside of the loop, I get a single entry repeated over and over, so I can only assume that the array/Yomu hash is being overwritten. What is the best way to fix this so that I can return a full array for use elsewhere in the application?
Please note: I suspect this may be a more general Ruby error on my part, related to my lack of array skills, rather than a Yomu-specific issue. I'm not sure how else to phrase this question, however.

Jakub Pavlík was correct: the code was actually working as stated; it just wasn't displaying the output in the way I expected!
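For anyone landing here with the same confusion: 'puts array' inside the loop prints the whole accumulated array on every pass, and the metadata method as written returns the value of 'each' (the list of files) rather than the array it built. Here is a minimal sketch of one way to return the full array. The method name collect_authors and the sample data are made up for illustration, and the block stands in for the Yomu call from the question so the sketch runs without the gem installed.

```ruby
# Build one value per file and return the whole array from the method.
# The block stands in for the Yomu lookup from the question, e.g.
#   collect_authors(files) { |file| Yomu.new(file).metadata["Author"] }
def collect_authors(files)
  # map returns a new array, so nothing has to be accumulated by hand
  files.map { |file| yield(file) }
end

# Hypothetical stand-in data (not real PDF metadata).
fake_authors = { "a.pdf" => "Alice", "b.pdf" => "Bob" }

authors = collect_authors(fake_authors.keys) { |file| fake_authors[file] }

# Print once, after the loop has finished, instead of on every iteration.
puts authors.inspect
```

Because the method's last expression is the array itself, the caller can keep the result in a variable and use it elsewhere in the application.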

Related

writing filenames into an array in Python

I read the files in a directory using glob, and then I rename each file to something more legible for my purposes using os.rename.
for file_name in glob.glob(path+'*.txt'):
    newfilename = 'run'+str(i)+'.csv' # rename filenames to something more readable
    os.rename(file_name,path + newfilename) #put r before path if error ="(unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated \UXXXXXXXX escape"
When I try to write each newly created file name into an array (a list) previously initialized to:
filelist=[];
using
filelist.append(i)=newfilename
I get the following error: "SyntaxError: cannot assign to function call"
If I just try to add the file to the filelist array using indices, i.e., filelist[i]=newfilename, I then get an index out of range error.
How do I create this list of renamed filenames "on the fly"? Thank you.
OK, so I finally understood that append wants to "append the thing to the list", as opposed to "append the thing at index i to the list" like I was trying to do originally.
So the correct way to use append is:
filelist.append(newfilename)

Advice on reading multiple text files into an array with Ruby

I'm currently writing out a program in Ruby, which I'm fairly new at, and it requires multiple text files to be pushed into an array line by line.
I am currently unable to actually test my code since I'm at work and this is for personal use, but I'm seeking advice to see if my code is correct. I know how to read a file and push it to the array. If possible, can someone check it over and advise if I have the correct idea? I'm self-taught regarding Ruby and have no one to check my work.
I understand if this isn't the right place for trying to get this sort of advice and it's deleted/locked. Apologies if so.
contentsArray = []
Dir.glob('filepath').each do |filename|
  next if File.directory?(filename)
  r = File.open("#{path}#{filename}")
  r.each_line { |line| contentsArray.push line}
end
I'm hoping this snippet will take the lines from multiple files in the same directory and stick them in the array so I can later splice what's in there.
Thank you for the question.
First let's assume that 'filepath' is something like the target pattern you want to glob in Dir.glob('filepath') (I used Dir.glob('src/*.h').each do |filename| in my test).
After that, File.open("#{path}#{filename}") prepends another path to the already complete path you'll have in filename.
And lastly, although this is probably not the problem, the code opens the file and never closes it. The IO object provides a readlines method that takes care of opening and closing the file for you.
Here's some working code that you can adapt:
contentsArray = []
Dir.glob('filepath').each do |filename|
  next if File.directory?(filename)
  lines = IO.readlines(filename)
  contentsArray.concat(lines)
end
puts "#{contentsArray.length} LINES"
Here are references to the Ruby docs for the IO.readlines and Array#concat methods used:
https://ruby-doc.org/core-2.5.5/IO.html#method-i-readlines
https://ruby-doc.org/core-2.5.5/Array.html#method-i-concat
As an alternative to skipping directories with next (which moves on to the next iteration of the loop), the code could run only for regular files, like this:
if File.file?(filename)
  lines = IO.readlines(filename)
  contentsArray.concat(lines)
end
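Putting the points above together, here is a rough, self-contained sketch that can be run anywhere. The method name read_all_lines and the file names are made up for the demonstration, and it uses File.readlines with chomp: true (which assumes Ruby 2.4 or later) to drop the trailing newlines; leave that option off if you want to keep them.

```ruby
require 'tmpdir'

# Read every regular file matching a glob pattern into one flat array of lines.
def read_all_lines(pattern)
  Dir.glob(pattern).sort.select { |name| File.file?(name) }.flat_map do |name|
    # File.readlines opens the file, reads it, and closes it again
    File.readlines(name, chomp: true)
  end
end

# Demonstration against a throwaway directory (hypothetical file names).
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), "one\ntwo\n")
  File.write(File.join(dir, 'b.txt'), "three\n")
  Dir.mkdir(File.join(dir, 'sub.txt')) # a directory matching the pattern is skipped

  lines = read_all_lines(File.join(dir, '*.txt'))
  puts "#{lines.length} LINES"
end
```

Sorting the glob result is only there to make the order of the lines predictable; it can be dropped if the order of the files does not matter.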

How do I serialize a variable in VimScript?

I wish to save an arbitrary Vim dictionary, let's say:
let dico = {'a' : [[1,2], [3]], 'b' : {'in': "str", 'out' : 51}}
to a file. Is there a clever way to do this? Something I could use like:
call SaveVariable(dico, "safe.vimData")
let recover = ReadVariable("safe.vimData")
Or should I build something myself with only textfiles?
You can put the string() function to good use. Test these:
let g:dico = {'a' : [[1,2], [3]], 'b' : {'in': "str", 'out' : 51}}
let str_dico = 'let g:dico_copy = ' . string(dico)
echo str_dico
execute str_dico
echo g:dico_copy
... so you can save the str_dico string as a line of a vimscript file (e.g. using writefile()), and then source the vim file directly.
Thanks to VanLaser (cheers), I've been able to implement these functions using string, writefile and readfile. This is not binary serialization but it works well :)
function! SaveVariable(var, file)
    " turn the var to a string that vimscript understands
    let serialized = string(a:var)
    " dump this string to a file
    call writefile([serialized], a:file)
endfun
function! ReadVariable(file)
    " retrieve string from the file
    let serialized = readfile(a:file)[0]
    " turn it back to a vimscript variable
    execute "let result = " . serialized
    return result
endfun
Use them this way:
call SaveVariable(anyvar, "safe.vimData")
let restore = ReadVariable("safe.vimData")
Enjoy!
I used #iago-lito's answer in a script I wrote a few years ago. Yesterday I spent some time improving on it. The vim dictionary is very similar to a JSON object, but:
when I open the file and set filetype=json, the linter complains about the single quotes around the strings, and
the JSON formatter splits the text into multiple lines, and indents them to make a pretty file. As a result, reading only the 0'th line of text doesn't give a complete dictionary object.
Here are my modifications to fix both issues.
function! SaveVariable(var, file)
    " Change all single quotes to double quotes.
    " {'x':'O''Toole','y':['A',2,'d']} -> {"x":"O""Toole","y":["A",2,"d"]}
    let serialized = substitute(string(a:var),"'",'"','g')
    " Change all escaped double quotes back to apostrophes.
    " {"x":"O""Toole","y":["A",2,"d"]} -> {"x":"O'Toole","y":["A",2,"d"]}
    let serialized = substitute(serialized,'""', "'",'g')
    call writefile([serialized], a:file)
endfunction
function! ReadVariable(file)
    execute 'let result = [' . join(readfile(a:file),'') . ']'
    return result[0]
endfunction
This seems to work well for all kinds of data. I tested it with an object, a list, and number and string scalar values.
STRANGER, DANGER!
Here is a word of warning that goes along with this and any other dynamically-generated code. Both #iago-lito's and my solution are vulnerable to code injection, and if you are reading files that are out of your control, bad things can happen to your machine. For example, if someone sneaks this into the file:
42|call system('rmdir /s /q c:\')|call system('rm -rf /')
calling #iago-lito's ReadVariable() will return 42, but your computer will be toast, whether it's a Windows, Mac, or Linux machine. My version also fails, albeit with a more complex version of the statement:
42]|call system('rmdir /s /q c:\')|call system('rm -rf /')|let x=[
A proper solution would be to parse the text, looking for the end of the actual data, and discard everything after it. This means you lose the simplicity of this approach. Vim and Neovim have come a long way in recent years. Lua and Python are, from what I've read, easier than ever to integrate into vimscript. I wouldn't be surprised if either of those languages has a built-in answer to this question.

How to run same code on multiple files, or all files in directory

I am very new to coding and recently wrote a little program that involves R and sox. It looked like this:
file <- "test.mp3"
testSox = paste("sox ",file," -n spectrogram -o ",file,".png stats",sep='')
sox = system(testSox, intern = TRUE)
print(sox)
Now, instead of assigning the one file manually within the code, I would just like to have this code read through all the mp3s in a folder automatically. Is this possible? Any help would be greatly appreciated. Thanks!
EDIT: Actually, I should add that I tried list.files, but when it comes to running the system() command, I get
"Error in system(command, as.integer(flag), f, stdout, stderr) :
character string expected as first argument"
Here's the list.files code I tried:
> temp = list.files(path = ".", pattern=".mp3")
>
> file <- temp
>
> firstSox = paste("sox ",file," -n spectrogram -o ",file,".png stats",sep='')
> sox = system(firstSox, intern = TRUE)
Error in system(command, as.integer(flag), f, stdout, stderr) :
character string expected as first argument
> print(sox)
I'm guessing this is not the correct route to go? Because I basically need to replace 'file' in the firstSox line with each mp3 that's in the temp array. So instead of running:
file <- "test.mp3"
...I would just like to have it re-assigned each time for every file in the folder, so it runs through as test.mp3, then 1.mp3, then 2.mp3, then 3.mp3, etc.
I've scoured the net, and just feel like I've hit a brick wall. As stated in the comments, I've read up on loops, but for some reason I can't wrap my head around how to incorporate it into what I have written. I feel like I just need someone to show me at least the way, or maybe even write me an example so I can wrap my head around it. Would greatly appreciate help and any tips on what I'm doing wrong and could correct. Thanks.
Try the below code. I am using dir() instead of list.files, just because I find it easier. Remember there are many ways to do the same thing in R.
files <- dir(path = ".", pattern = "\\.mp3$") # Get all the mp3 files (pattern is a regular expression)
for (f in files) { # Loop over the mp3 files one at a time
  firstSox = paste("sox ", f, " -n spectrogram -o ", f, ".png stats", sep = '')
  sox = system(firstSox, intern = TRUE)
  print(sox)
}
Your firstSox variable will be a vector of commands to run (paste will generate a vector, one string for each element of file), which is why system() complains: it expects a single character string. So now you just need to run each command through system() one at a time.
One way to do this and capture the output is to use the lapply or sapply function:
sox <- lapply( firstSox, function(x) system(x, intern=TRUE) )
In this code lapply will run the function for each element of firstSox one at a time, the function just takes the current element (in x) and passes that to system. Then lapply gathers all the outputs together and combines them into a list that it puts into sox.
If the results of each run give the same shape of results (single number or vector of same length) then you can use sapply instead and it will simplify the return into a vector or matrix.

Create functions in matlab

How can I create a function with MATLAB so I can call it any where in my code?
I'm new to MATLAB so I will write a PHP example of the code I want to write in MATLAB!
function newmatlab($n){
    $n=$n+1;
    return $n;
}
$array=array('1','2','3','4');
foreach($array as $x){
    $result[]=newmatlab($x);
}
print_r($result);
So, in a nutshell, I need to loop over an array and apply a function to each item in this array.
Can someone show me the above function written in MATLAB so I can understand better?
Note: I need this because I wrote code that analyzes a video file and then plots data on a graph. I then save this graph to Excel and JPG. My problem is that I have more than 200 videos to analyze, so I need to automate this code to loop through folders and analyze each *.avi file inside.
As others have said, the documentation covers this pretty thoroughly, but perhaps we can help you understand.
There are a handful of ways that you can define functions in Matlab, but probably the most useful for you to get started is to define one in an m-file. I'll use your example code. You can do this by creating a file called newmatlab.m in your project's directory that looks something like this:
% newmatlab.m
function result = newmatlab(array)
    result = array + 1; % adds 1 to every element; the semicolon suppresses printing inside the function
Note that the function has the same name as the file and that there is no explicit return statement - it figures that out by what you've named the output parameter(s) (result in this case).
Then, in the same directory, you can create a script (or another function) that calls your newmatlab function by that name:
% main.m (or whatever)
a = [1 2 3 4];
b = newmatlab(a)
That's it! This is a simplified explanation, but hopefully enough to get you started and then the documentation can help more.
PS: There's no "include" in Matlab; any functions that are defined in m-files in the current path are visible. You can find out what's in the path by using the path command. Roughly, it's going to consist of:
Matlab's own directory
The MATLAB subdirectory of your Documents directory
The current working directory
