Recursing directories only goes one file deep - file

I have the following code:
find_info(File) ->
case file:read_file_info(File) of
{ok, Facts} ->
case Facts#file_info.type of
directory -> directory;
regular -> regular
end;
{error,Reason} -> exit(Reason)
end.
find_files(Dir,Flag,Ending,Acc) ->
case file:list_dir(Dir) of
{ok,A} -> find_files_helper(A,Dir,Flag,Acc,Ending);
{_,_} -> Acc
end.
find_files_helper([H|Tail],Dir,Flag,Acc,Ending) ->
A = find_info(filename:absname_join(Dir,H)),
case A of
directory ->
case Flag of
true ->
find_files(filename:absname_join(Dir,H),Flag,Ending,Acc ++ find_files_helper(Tail,Dir,Flag,Acc,Ending));
false -> find_files_helper(Tail,Dir,Flag,Acc,Ending)
end;
regular ->
case filename:extension(H) of
Ending -> find_files_helper(Tail,Dir,Flag,[to_md5_large(H)] ++ Acc, Ending);
_ -> find_files_helper(Tail,Dir,Flag,Acc,Ending)
end;
{error,Reason} -> exit(Reason)
end;
find_files_helper([],_,_,Acc,_) -> Acc.
However whenever I run the find_files/4 the program only goes one file deep before crashing.
Say I have the following directory
home/
a/
ser.erl
b/
c/
file.erl
file2.erl
When run I will get the md5 of file.erl of file2.erl and of ser.erl. However if the directory looks like this:
home/
a/
ser.erl
back.erl
b/
c/
file.erl
file2.erl
Then the whole program crashes. I have spent a good few hours looking for what I'm missing in my logic, but I have no idea.
The error message that I get is exception enoent in function p:to_md5_large/1.
In case the md5 is needed here it is:
to_md5_large(File) ->
case file:read_file(File) of
{ok, <<B/binary>>} -> md5_helper(B,erlang:md5_init());
{error,Reason} -> exit(Reason)
end.
md5_helper(<<A:4/binary,B>>,Acc) -> md5_helper(B,erlang:md5_update(Acc,A));
md5_helper(A,Acc) ->
B = erlang:md5_update(Acc,A),
erlang:md5_final(B).

You're getting enoent because you're passing a filename like back.erl to to_md5_large when you're not in the directory where back.erl is located. Try passing the full filename instead. You're already calling filename:absname_join(Dir,H) in find_files_helper, so just save that to a variable and pass that variable instead of H to to_md5_large.

There is a function that does this for you:
fold_files(Dir, RegExp, Recursive, Fun, AccIn) -> AccOut
in your case:
Result = filelib:fold_files(Dir, ".*\\.erl$", true, fun(X,Acc) -> {ok,B} = file:read_file(X), [erlang:md5(B)|Acc] end, []).
[edit]
@Bula:
I didn't answer your question directly, for two reasons:
The first one is that, at the time I was writing my answer, you hadn't provided the type of error you get. It is very important, in any language, to learn how to get information from error reports. In Erlang, most of the time you get the error type and the line where it occurs; looking at the documentation then gives you very helpful information about what went wrong. By the way, unless you want to handle errors, I discourage you from writing things like:
case file:read_file(File) of
{ok, <<B/binary>>} -> md5_helper(B,erlang:md5_init());
{error,Reason} -> exit(Reason)
end.
The following code does the same thing, is shorter, and gives you the exact line number where the problem occurred (it's not the best example in your code, but it's the shortest):
{ok, <<B/binary>>} = file:read_file(File),
md5_helper(B,erlang:md5_init()),
The second is that I find your code too big, with useless helper functions. I think it is important to try to write concise and readable code, and also to use the library functions in the right way. For example, you are using erlang:md5_init/0, erlang:md5_update/2 and erlang:md5_final/1, while a single call to erlang:md5/1 is enough in your case. The init/update/final functions exist so that you can calculate the md5 when you receive the data chunk by chunk, which is not your case, and the way you wrote the helper function does not allow you to use this feature anyway.
I don't understand why you want to have a "deployed" version of your code, but I propose another version in which I tried to follow my own advice (written directly in the shell, so it needs R17+ for the definition of recursive anonymous functions) :o)
1> F = fun F(X,D,Ending) ->
1> {ok,StartD} = file:get_cwd(), %% save current directory
1> ok = file:set_cwd(D), %% move to the directory to explore
1> R = case filelib:is_dir(X) of
1> true -> %% if the element to analyze is a directory
1> {ok,Files} = file:list_dir(X), %% get its content
1> [F(Y,X,Ending) || Y <- Files]; %% and recursively analyze all its elements
1> false ->
1> case filelib:is_regular(X) andalso (filename:extension(X) == Ending) of
1> true -> %% if it is a regular file with the right extension
1> {ok,B} = file:read_file(X), %% read it
1> [erlang:md5(B)]; %% and calculate the md5 (must be returned in a list
1> %% for consistency with directory results)
1> false ->
1> [] %% in other cases (symlink, ...) return empty
1> end
1> end,
1> ok = file:set_cwd(StartD), %% restore current directory
1> lists:flatten(R) %% flatten for nicer result
1> end.
#Fun<erl_eval.42.90072148>
2> Md5 = fun(D) -> F(D,D,".erl") end.
#Fun<erl_eval.6.90072148>
3> Md5("C:/My programs/erl6.2/lib/stdlib-2.2").
[<<150,238,21,49,189,164,184,32,42,239,200,52,135,78,12,
112>>,
<<226,53,12,102,125,107,137,149,116,47,50,30,37,13,211,243>>,
<<193,114,120,24,175,27,23,218,7,169,146,8,19,208,73,255>>,
<<227,219,237,12,103,218,175,238,194,103,52,180,132,113,
184,68>>,
<<6,16,213,41,39,138,161,36,184,86,17,183,125,233,20,125>>,
<<23,208,91,76,69,173,159,200,44,72,9,9,50,40,226,27>>,
<<92,8,168,124,230,1,167,199,6,150,239,62,146,119,83,36>>,
<<100,238,68,145,58,22,88,221,179,204,19,26,50,172,142,193>>,
<<253,79,101,49,78,235,151,104,188,223,55,228,163,25,16,
147>>,
<<243,189,25,98,170,97,88,90,174,178,162,19,249,141,94,60>>,
<<237,85,6,153,218,60,23,104,162,112,65,69,148,90,15,240>>,
<<225,48,238,193,120,43,124,63,156,207,11,4,254,96,250,204>>,
<<67,254,107,82,106,87,36,119,140,78,216,142,66,225,8,40>>,
<<185,246,227,162,211,133,212,10,174,21,204,75,128,125,
200,...>>,
<<234,191,210,59,62,148,130,187,60,0,187,124,150,213,...>>,
<<199,231,45,34,185,9,231,162,187,130,134,246,54,...>>,
<<157,226,127,87,191,151,81,50,19,116,96,121,...>>,
<<15,59,143,114,184,207,96,164,155,44,238,...>>,
<<176,139,190,30,114,248,0,144,201,14,...>>,
<<169,79,218,157,20,10,20,146,12,...>>,
<<131,25,76,110,14,183,5,103,...>>,
<<91,197,189,2,48,142,67,...>>,
<<94,202,72,164,129,237,...>>,
<<"^NQÙ¡8hÿèkàå"...>>,<<"ðÙ.Q"...>>,
<<150,101,76,...>>,
<<"A^ÏrÔ"...>>,<<"¹"...>>,<<...>>|...]
4>

Error loading physionet ECG database on MATLAB

I'm using this code to load the ECG-ID database into MATLAB:
%% Initialization
clear all; close all; clc
%% read files from folder A
% Specify the folder where the files live.
myFolder = 'Databases\ECG_ID';
% Check to make sure that folder actually exists. Warn user if it doesn't.
if ~isfolder(myFolder)
errorMessage = sprintf('Error: The following folder does not exist:\n%s\nPlease specify a new folder.', myFolder);
uiwait(warndlg(errorMessage));
myFolder = uigetdir(); % Ask for a new one.
if myFolder == 0
% User clicked Cancel
return;
end
end
% Get a list of all files in the folder with the desired file name pattern.
filePattern = fullfile(myFolder, '**/rec_*'); % Change to whatever pattern you need.
theFiles = dir(filePattern);
for k = 1 : length(theFiles)
baseFileName = theFiles(k).name;
fullFileName = fullfile(theFiles(k).folder, baseFileName);
fprintf(1, 'Now reading %s\n', fullFileName);
% Now do whatever you want with this file name,
% such as reading it in as an image array with imread()
[sig, Fs, tm] = rdsamp(fullFileName, [1],[],[],[],1);
end
But I keep getting this error message:
Now reading C:\Users\******\Documents\MATLAB\Databases\ECG_ID\Person_01\rec_1.atr
Error using rdsamp (line 203)
Could not find record: C:\Users\******\Documents\MATLAB\Databases\ECG_ID\Person_01\rec_1.atr. Search path is set to: '.
C:\Users\******\Documents\MATLAB\mcode\..\database\ http://physionet.org/physiobank/database/'
I can successfully load one signal at a time (but I can't load the entire database using the above code) using this command:
[sig, Fs, tm] = rdsamp('Databases\ECG_ID\Person_01\rec_1');
How do I solve this problem? How can I load all the files in MATLAB?
Thanks in advance.

How to modify and copy several files with Python 3.5?

I have this code portion for instance :
fichiers=glob.glob('/path/*.file')
for f in fichiers:
    if os.path.isfile(f):
        fichier = open(f,'r')
        for l in fichier:
            m = regex.match(l)
            if m:
                print('%s/ EMO /%s'%(m.group(1),m.group(3)))
                #here I want to write this modified line
            else:
                #write line non modified
                pass
        fichier.close()
Instead of printing the results in the shell, I would like to apply the substitution to every line of each file and write the result to copies of the files, either with new names or in a new directory (to be sure of not making mistakes).
Do you have an idea how I can do that, please?
It's really quite simple: all you need to do is define your output directory and open a new file in that directory to write to, every time you open a file that you read. Check this out:
import glob
import os
outdirpath = "/path/to/output/directory"
for fpath in glob.glob('/path/*.file'):
    if not os.path.isfile(fpath): continue
    with open(fpath) as fichier, open(os.path.join(outdirpath, os.path.basename(fpath)), 'w') as outfile:
        for line in fichier:
            m = regex.match(line)
            if m:
                outfile.write('%s/ EMO /%s'%(m.group(1),m.group(3)))
            else:
                outfile.write(line)
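For reference, here is a self-contained sketch of the same idea that can be run as-is. The regex in the question is never shown, so PATTERN below is a purely hypothetical stand-in with three groups, and the names rewrite_line and copy_with_substitution are made up for this example. One detail worth checking in your own version: unlike print, write does not add a newline, so the sketch puts it back explicitly after a substitution.

```python
import glob
import os
import re

# Hypothetical pattern with three groups; the question never shows its regex.
PATTERN = re.compile(r"^(\w+)/(\w+)/(\w+)$")


def rewrite_line(line, pattern=PATTERN):
    """Return the substituted line if it matches, else the line unchanged."""
    m = pattern.match(line.rstrip("\n"))
    if m is None:
        return line
    # write() adds no newline of its own (unlike print), so add it back here.
    return "%s/ EMO /%s\n" % (m.group(1), m.group(3))


def copy_with_substitution(src_glob, outdir, pattern=PATTERN):
    """Write a modified copy of every file matching src_glob into outdir."""
    os.makedirs(outdir, exist_ok=True)
    written = []
    for fpath in sorted(glob.glob(src_glob)):
        if not os.path.isfile(fpath):
            continue
        outpath = os.path.join(outdir, os.path.basename(fpath))
        # The originals are only read; all writing goes to the copies.
        with open(fpath) as src, open(outpath, "w") as dst:
            for line in src:
                dst.write(rewrite_line(line, pattern))
        written.append(outpath)
    return written
```

Calling copy_with_substitution('/path/*.file', '/path/to/output/directory') would then leave the original files untouched and return the list of copies it wrote.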

How do I read (and parse) a file and then append to the same file without getting an exception?

I am trying to read from a file correctly in Haskell but I seem to get this error.
*** Exception: neo.txt: openFile: resource busy (file is locked)
This is my code.
import Data.Char
import Prelude
import Data.List
import Text.Printf
import Data.Tuple
import Data.Ord
import Control.Monad
import Control.Applicative((<*))
import Text.Parsec
    ( Parsec, ParseError, parse -- Types and parser
    , between, noneOf, sepBy, many1 -- Combinators
    , char, spaces, digit, newline -- Simple parsers
    )
These are the movie fields.
type Title = String
type Director = String
type Year = Int
type UserRatings = (String,Int)
type Film = (Title, Director, Year , [UserRatings])
type Period = (Year, Year)
type Database = [Film]
This is the Parsing of all the types in order to read correctly from the file
-- Parse a string to a string
stringLit :: Parsec String u String
stringLit = between (char '"') (char '"') $ many1 $ noneOf "\"\n"
-- Parse a string to a list of strings
listOfStrings :: Parsec String u [String]
listOfStrings = stringLit `sepBy` (char ',' >> spaces)
-- Parse a string to an int
intLit :: Parsec String u Int
intLit = fmap read $ many1 digit
-- Or `read <$> many1 digit` with Control.Applicative
stringIntTuple :: Parsec String u (String , Int)
stringIntTuple = liftM2 (,) stringLit intLit
film :: Parsec String u Film
film = do
    -- alternatively `title <- stringLit <* newline` with Control.Applicative
    title <- stringLit
    newline
    director <- stringLit
    newline
    year <- intLit
    newline
    userRatings <- stringIntTuple
    newline
    return (title, director, year, [userRatings])
films :: Parsec String u [Film]
films = film `sepBy` newline
This is the main program (write "main" in winghci to start the program)
-- The Main
main :: IO ()
main = do
    putStr "Enter your Username: "
    name <- getLine
    filmsDatabase <- loadFile "neo.txt"
    appendFile "neo.txt" (show filmsDatabase)
    putStrLn "Your changes to the database have been successfully saved."
This is the loadFile function
loadFile :: FilePath -> IO (Either ParseError [Film])
loadFile filename = do
    database <- readFile filename
    return $ parse films "Films" database
the other txt file name is neo and includes some movies like this
"Blade Runner"
"Ridley Scott"
1982
("Amy",5), ("Bill",8), ("Ian",7), ("Kevin",9), ("Emma",4), ("Sam",7), ("Megan",4)
"The Fly"
"David Cronenberg"
1986
("Megan",4), ("Fred",7), ("Chris",5), ("Ian",0), ("Amy",6)
Just copy and paste everything, put a txt file in the same directory, and test it to see the error I described.
Whoopsy daisy, being lazy
tends to make file changes crazy.
File's not closed, as supposed
thus the error gets imposed.
This small guile, by loadFile
is what you must reconcile.
But don't fret, least not yet,
I will show you, let's get set.
Like many other functions that work with IO in System.IO, readFile doesn't actually consume any input right away. It's lazy. Therefore, the file doesn't get closed until all of its content has been consumed (until then, the handle stays semi-closed):
The file is read lazily, on demand, as with getContents.
We can demonstrate this on a shorter example:
main = do
    let filename = "/tmp/example"
    writeFile filename "Hello "
    contents <- readFile filename
    appendFile filename "world!" -- error here
This will fail, since we never actually checked contents (entirely). If you get all the content (for example with printing, length or similar), it won't fail anymore:
main = do
    let filename = "/tmp/example2"
    writeFile filename "Hello "
    content <- readFile filename
    putStrLn content
    appendFile filename "world!" -- no error
Therefore, we need either something that really closes the file, or we need to make sure that we've read all the contents before we try to append to the file.
For example, you can use withFile together with some "magic" function force that makes sure that the content really gets evaluated:
readFile' filename = withFile filename ReadMode $ \handle -> do
    theContent <- hGetContents handle
    force theContent
However, force is tricky to achieve. You could use bang patterns, but these evaluate the list only to WHNF (basically just the first character). You could use the functions from deepseq, but that adds another dependency and is probably not allowed in your assignment/exercise.
Or you could use any function that will somehow make sure that all elements are evaluated or sequenced. In this case, we can use a small trick and mapM return:
readFile' filename = withFile filename ReadMode $ \handle -> do
    theContent <- hGetContents handle
    mapM return theContent
It's good enough, but you would use something like pipes or conduit instead in production.
The other method is to make sure that we've really used all the contents. This can be done by using another parsec parser method instead, namely runParserT. We can combine this with our withFile approach from above:
parseFile :: ParsecT String () IO a -> FilePath -> IO (Either ParseError a)
parseFile p filename = withFile filename ReadMode $ \handle ->
    hGetContents handle >>= runParserT p () filename
Again, withFile makes sure that we close the file. We can use this now in your loadFile:
loadFile :: FilePath -> IO (Either ParseError [Film])
loadFile filename = parseFile films filename
This version of loadFile won't keep the file locked anymore.
The problem is that readFile doesn't actually read the entire file into memory immediately; it opens the file and instantly returns a string. As you "look at" the string, behind the scenes the file is being read. So when readFile returns, the file is still open for reading, and you can't do anything else with it. This is called "lazy I/O", and many people consider it to be "evil" precisely because it tends to cause problems like the one you currently have.
There are several ways you can go about fixing this. Probably the simplest is to just force the whole string into memory before continuing. Calculating the length of the string will do that — but only if you "use" the length for something, because the length itself is lazy. (See how this rapidly becomes messy? This is why people avoid lazy I/O.)
The simplest thing you could try is printing the number of films loaded right before you try to append to the database.
main = do
    putStr "Enter your Username: "
    name <- getLine
    filmsDatabase <- loadFile "neo.txt"
    putStrLn $ "Loaded " ++ show (length filmsDatabase) ++ " films."
    appendFile "neo.txt" (show filmsDatabase)
    putStrLn "Your changes to the database have been successfully saved."
It's kind of evil that what looks like a simple print message is actually fundamental to making the code work though!
The other alternative is to save the new database under a different name, and then delete the old file and rename the new one over the top of the old one. This does have the advantage that if the program were to crash half way through saving, you haven't just lost all your stuff.
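The save-then-rename idea is not specific to Haskell, so here is a minimal sketch of it in Python, just to show the order of the steps. The function name append_via_replace is made up for this example, and it assumes the file is small enough to hold in memory: read the old contents completely, write old plus new data to a temporary file in the same directory, then rename the temporary file over the original.

```python
import os
import tempfile


def append_via_replace(path, extra_text):
    """Append extra_text to path by writing a new file and renaming it over the old one."""
    # Read the old contents completely; the handle is closed when the block ends.
    with open(path) as src:
        old_text = src.read()

    # Create the temporary file next to the target so the rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(old_text)
            tmp.write(extra_text)
        # Swap the finished file into place; the old file stays intact until this point.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial temporary file and leave the original alone.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the program dies while the temporary file is being written, the original database is still there unchanged, which is the advantage described above.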

dbWriteTable in RMySQL error in name pasting

I have many data frames that I am trying to send to a MySQL database via RMySQL.
# Sends data frame to database without a problem
dbWriteTable(con3, name="SPY", value=SPY , append=T)
# stock1 contains a character vector of stock names...
stock1 <- c("SPY.A")
But when I try to loop it:
i= 1
while(i <= length(stock1)){
# converts "SPY.A" into SPY
name <- print(paste0(str_sub(stock1, start = 1, end = -3))[i], quote=F)
# sends data.frame to database
dbWriteTable(con3,paste0(str_sub(stock1, start = 1, end = -3))[i], value=name, append=T)
i <- 1+i
}
The following warning is returned & nothing was sent to database
In addition: Warning message:
In file(fn, open = "r") :
cannot open file './SPY': No such file or directory
However, I believe that the problem is with pasting value onto dbWriteTable() since writing dbWriteTable(con3, "SPY", SPY, append=T) works but dbWriteTable(con3, "SPY", name, append=T) will not...
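You are probably using a non-base package for str_sub, and I'm guessing you get the same behavior with substr. In your loop, name is a character string rather than a data frame, so dbWriteTable treats it as a file name, which is where the "cannot open file './SPY'" warning comes from; get() looks up the object that has that name. Does this succeed?
dbWriteTable(con3, substr(stock1, 1, 3), get(substr(stock1, 1, 3)), append=T)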

TypeError: invalid file: When trying to make a file name a variable

Hi, I am trying to represent a file location as a variable because the final script will be run on another machine. This is the code I have tried, followed by the error I get. It seems to me that somehow Python is adding "\" and that is causing the problem. If that is the case, how do I get it not to insert the "\"? Thank you
F = 'C:\Documents and Settings\myfile.txt','r'
f = open(F)
and the error
TypeError: invalid file: ('C:\\Documents and Settings\\myfile.txt', 'r')
From the docs:
http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files
Try this:
F = r'C:\Documents and Settings\myfile.txt'
f = open(F, 'r')
About the "double backslashes" - you need to escape backslashes in your strings or use r'string', see this:
http://docs.python.org/release/2.5.2/ref/strings.html
E.g. try this:
>>> a = 'a\nb'
>>> print a
a
b
To get what you expect, you need this:
>>> a = r'a\nb'
>>> print a
a\nb
or this:
>>> a = 'a\\nb'
>>> print a
a\nb
Try
f=open(r'C:\Documents and Settings\myfile.txt','r')
instead of using the variable F. The way you have it, F is a tuple that holds both the path and 'r', so open() receives 'r' as part of the file argument, which it is not.
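To see the tuple problem in isolation, here is a small sketch. It uses a temporary file as a stand-in for the path in the question so that it runs on any machine, and the exact wording of the TypeError message varies between Python versions.

```python
import os
import tempfile

# Stand-in for 'C:\Documents and Settings\myfile.txt': a temporary file,
# used here only so the sketch runs anywhere.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("hello\n")

F = path, "r"  # the comma makes this a 2-tuple, not a string
assert isinstance(F, tuple)

error_text = None
try:
    open(F)  # the whole tuple is passed as the file argument
except TypeError as exc:
    error_text = str(exc)  # message wording depends on the Python version

# Either keep the path and the mode as separate arguments ...
with open(path, "r") as f:
    first = f.read()

# ... or unpack the tuple into the two arguments of open().
with open(*F) as f:
    second = f.read()

os.remove(path)
```

Both working variants read the same content; the only difference is whether open() sees one argument (the tuple) or two (the path and the mode).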
Instead of writing / or \ yourself, you can build the path like this:
import os
F = os.path.join(
    "C:" + os.sep,  # a bare "C:" gets no separator after it on Windows
    "Documents and Settings",
    "myfile.txt"
)
f = open(F, 'r')
so that it uses / or \ according to your OS.
(Although if you write C:/ it must mean you want to run your code on Windows... Hum...)
