Atomically creating a file lock in MATLAB (file mutex)

I am looking for a simple, already-implemented solution for atomically creating a file lock in MATLAB.
Something like:
file_lock('create', 'mylockfile'); % this will block until it creates the lock file
file_lock('remove', 'mylockfile'); % this will remove the lock file
This question has already been asked several times, and some solution ideas have been proposed (such as using Java FileLock),
but I haven't found a simple, ready-made implementation.
Are you aware of such an implementation?
Notes:
locking file access OR exchanging messages between MATLAB instances
Thread Subject: Safe file mutex without race condition

I've settled on a pretty simple solution for combining error/logging messages from multiple worker threads into a single file. Every time I want to write to that file, I first write the output to the thread's own temporary file. Next, I append that temporary file to the "master" log file using flock. Skipping over some details here, the idea is:
fid=fopen(threadtemp, 'w');
fprintf(fid, 'Error message goes here');
fclose(fid);
runme = sprintf('flock -x %s -c ''cat %s >> %s''', LOGFILE, threadtemp, LOGFILE);
system(runme);
See the flock man page for details, but the call above acquires an exclusive lock (-x) on the log file, runs the provided command (-c) under that lock, and then releases it.
This obviously only works if you're on a system that has flock (Linux/OS X, and only on certain file systems at that) and you're doing something that can be done from the command line, but I'd bet it's a pretty common use case.
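If you use this in several places, the pattern can be wrapped in a small helper. A minimal sketch (the function name is a placeholder; it assumes flock is on the PATH):
function append_under_lock(logfile, threadtemp)
    % Append threadtemp to logfile while holding an exclusive flock on logfile.
    cmd = sprintf('flock -x %s -c ''cat %s >> %s''', logfile, threadtemp, logfile);
    status = system(cmd);
    if status ~= 0
        error('append_under_lock:failed', 'flock failed with status %d', status);
    end
end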

Depending on which Java version you're using, perhaps this will work (translated from: http://www.javabeat.net/2007/10/locking-files-using-java/)
classdef FileLock < handle
    properties (Access = private)
        fileLock = []
        file
    end
    methods
        function this = FileLock(filename)
            this.file = java.io.RandomAccessFile(filename, 'rw');
            fileChannel = this.file.getChannel();
            % tryLock is non-blocking: it returns an empty value if
            % another process already holds the lock
            this.fileLock = fileChannel.tryLock();
        end
        function val = hasLock(this)
            val = ~isempty(this.fileLock) && this.fileLock.isValid();
        end
        function delete(this)
            this.release();
        end
        function release(this)
            if this.hasLock
                this.fileLock.release();
            end
            this.file.close();
        end
    end
end
Usage would be:
lock = FileLock('my_lock_file');
if lock.hasLock
    %// do something here
else
    %// I guess not
end
%// Manually release the lock, or just delete it (or let MATLAB clean it up)
I like the object-wrapping pattern for I/O so that the release happens even when exceptions are thrown.
EDIT: The file reference must be kept around and closed manually, or you won't be able to edit the locked file afterwards. That means this code is only really useful for pure lock files, I think.
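For example, a minimal sketch of that pattern in use, assuming the FileLock class above is on the path (the function and lock-file names are placeholders):
function runProtected()
    lock = FileLock('my_lock_file');
    % Tie the release to scope exit, so it also runs if an error is thrown.
    guard = onCleanup(@() lock.release());
    if ~lock.hasLock()
        error('runProtected:noLock', 'could not acquire the file lock');
    end
    % ... do the work that needs the lock ...
end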

If you only need to run on OS X and Linux (not Windows), you can use the following:
pathLock = '/tmp/test.lock';
% Try to create and lock this file.
% In my case I use -r 0 to avoid retrying.
% You could use -r -1 to retry forever, or for a particular amount of time,
% etc.; see `man lockfile` for details.
if ~system(sprintf('lockfile -r 0 %s', pathLock))
    % We succeeded, so perform some task which needs to be serialized.
    % runSerializedTask()
    % Now remove the lockfile.
    system(sprintf('rm -f %s', pathLock));
end

Write to a new file, then rename it. Renaming is an atomic operation, and all the new content will become visible at once.
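For instance, a minimal MATLAB sketch of this pattern (the file names are placeholders; note that movefile is only atomic when source and target are on the same file system, so stage the temporary file next to the target rather than in tempdir):
target = 'shared_output.txt';
tmp = [target '.tmp'];
fid = fopen(tmp, 'w'); % write the complete new content in private
fprintf(fid, 'the complete new content\n');
fclose(fid);
movefile(tmp, target); % readers see either the old or the new file, never a mix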

In the end I wrote an implementation based on two consecutive tests (movefile, then verifying the contents of the moved file).
It is not very well written, but it works for me for now.
+++++ file_lock.m ++++++++++++++++++++++++
function file_lock(op, filename)
%this will block until it creates the lock file:
%file_lock('create', 'mylockfile')
%
%this will remove the lock file:
%file_lock('remove', 'mylockfile')
% todo: verify that there are no bugs

filename = [filename '.mat'];
if isequal(op, 'create')
    id = [tempname() '.mat'];
    while true
        save(id, 'id');
        success = fileattrib(id, '-w'); % make the temp file read-only
        if success == 0; error('fileattrib'); end
        while true
            if exist(filename, 'file') % first test
                fprintf('file lock exists(1). waiting...\n');
                pause(1);
                continue;
            end
            status = movefile(id, filename); % second test
            if status == 1; break; end
            fprintf('file lock exists(2). waiting...\n');
            pause(1);
        end
        temp = load(filename, 'id'); % third test
        if isequal(id, temp.id); break; end
        fprintf('file lock exists(3). waiting...\n');
        pause(1);
    end
elseif isequal(op, 'remove')
    %delete(filename);
    execute_rs(@() delete(filename)); % retry until the delete succeeds
else
    error('invalid op');
end

function execute_rs(f)
while true
    try
        lastwarn('');
        f();
        if ~isequal(lastwarn, ''); error(lastwarn); end % e.g. "Warning: File not found or permission denied"
        break;
    catch exception
        fprintf('Error: %s.\nRetrying...\n', exception.message);
        pause(.5);
    end
end
++++++++++++++++++++++++++++++++++++++++++
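Usage is then the pair of calls from the help text, for example:
file_lock('create', 'mylockfile'); % blocks until the lock file is created
% ... do the work that must not run concurrently ...
file_lock('remove', 'mylockfile');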

Related

Writing to output stream in chunks in Groovy

I'm getting Jenkins console logs and writing them into an output stream like this:
ByteArrayOutputStream stream = new ByteArrayOutputStream()
currentBuild.rawBuild.getLogText().writeLogTo(0, stream)
However, the downside of this approach is that the writeLogTo() method is limited to 10000 lines:
https://github.com/jenkinsci/stapler/blob/master/core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java#L572
In this case, if the Jenkins console log is more than 10000 lines, the data from line 10000 and up is lost and not written into the buffer.
I'm trying to rewrite the above approach in the simplest way possible, to account for cases when the log has more than 10000 lines.
I feel like my attempt is very complicated and error-prone. Is there an easier way to introduce the new logic?
Please note that the code below is not tested; it is just a draft of how I'm planning to implement it:
ByteArrayOutputStream stream = new ByteArrayOutputStream()
def log = currentBuild.rawBuild.getLogText()
def offset = 0
def maxNumOfLines = 10000
def numOfExecutions = 1
// get total number of lines in the log
// def totalLines = (still trying to figure out how to get it)
if (totalLines > maxNumOfLines) {
    numOfExecutions = Math.round(totalLines / maxNumOfLines)
}
for (int i = 0; i < numOfExecutions; i++) {
    log.writeLogTo(offset, stream)
    offset += maxNumOfLines
}
According to the comments, writeLogTo(long start, OutputStream out) returns the offset to start the next write operation.
So the code could be like this:
def logFile = currentBuild.rawBuild.getLogText()
def start = 0
while (logFile.length() > start) {
    start = logFile.writeLogTo(start, stream)
}
Here stream could be a FileOutputStream, to avoid reading the whole log into memory.
There is also a readAll() method.
So, the code to read the whole log as text could be as simple as:
def logText=currentBuild.rawBuild.getLogText().readAll().getText()
Or if you want to transfer it to a local file:
new File('path/to/file.log').withWriter('UTF-8') { w ->
    w << currentBuild.rawBuild.getLogText().readAll()
}

Groovy script to modify text file line by line

I have an input file from which a Groovy script reads input. Once a particular input is processed, the script should be able to comment out the input line it used and then move on.
File content:
1
2
3
When it processes line 1 and line 2, the input file will look as below:
'1
'2
3
This way, if I re-run the script, it will start from the line where it stopped last time. If an input was used but failed, that particular line shall not be commented out (') so that a retry can be attempted.
I'd appreciate any help drafting such a Groovy script.
Thanks
AFAIK, in Groovy you can only append text at the end of a file.
Hence, to add ' to each line as it is processed, you need to rewrite the entire file.
You can use the following approach, but I only recommend it for small files, since it loads all the lines into memory. In summary, an approach to your question could be:
// open the file
def file = new File('/path/to/sample.txt')
// get all lines
def lines = file.readLines()
try {
    // for each line
    lines.eachWithIndex { line, index ->
        // if the line does not start with your comment "'"
        if (!line.startsWith("'")) {
            // call your process and apply your logic...
            // but if it fails you have to throw an exception, since
            // you cannot use 'break' within a closure
            if (!yourProcess(line)) throw new Exception()
            // the line is processed, so add the "'"
            // to the current line
            lines.set(index, "'${line}")
        }
    }
} catch (Exception e) {
    // you have to catch the exception in order
    // to save the progress in the file
}
// join the lines and rewrite the file
file.text = lines.join(System.properties.'line.separator')

// define your process...
def yourProcess(line) {
    // a simple condition only to test...
    return line.size() != 3
}
A better approach for large files, which avoids loading all the lines into memory, is to read the file contents with a reader and write the result to a temporary file with a writer. An optimized version could be:
// open the file
def file = new File('/path/to/sample.txt')
// create the "processed" file
def resultFile = new File('/path/to/sampleProcessed.txt')
try {
    // use a writer to write the result
    resultFile.withWriter { writer ->
        // read the file using a reader
        file.withReader { reader ->
            def line
            // note the explicit null check: an empty line is "falsy" in Groovy
            // and would otherwise stop the loop early
            while ((line = reader.readLine()) != null) {
                // if the line does not start with your comment "'"
                if (!line.startsWith("'")) {
                    // call your process and apply your logic...
                    // but if it fails you have to throw an exception, since
                    // you cannot use 'break' within a closure
                    if (!yourProcess(line)) throw new Exception()
                    // the line is processed, so add the "'" to it
                    // and write it to the result file
                    writer << "'${line}" << System.properties.'line.separator'
                } else {
                    // copy already-processed lines through unchanged
                    writer << line << System.properties.'line.separator'
                }
            }
        }
    }
} catch (Exception e) {
    // you have to catch the exception in order
    // to save the progress in the file
}

// define your process...
def yourProcess(line) {
    // a simple condition only to test...
    return line.size() != 3
}

Using Parallel::ForkManager in foreach loop

I am just learning Perl as a fourth language.
My wish is to use Parallel::ForkManager to speed up a foreach loop over an array whose members are taken from a text file.
Basically I am testing a .txt file of URLs, and I wish to test multiple members of the array at once (five at a time in this instance), not one at a time, without spamming the same URL and inadvertently DoSing it.
Would something like this do the trick?
$limit = new Parallel::ForkManager(5);
foreach (@lines) {
    $limit->start and next;
    $lines = $_;
    ... do processing here ...
    $limit->finish;
}
or would it be the equivalent of running that loop five times, making a small multithreaded DoS script?
It isn't too clear from the documentation, but:
A call to start will block in the parent process until there are fewer children running than the limit specified. Then it will return the (non-zero) child PID in the parent, and zero in the child.
A child process can see all the data in the parent process as it was when start was called. The data is presumably copy-on-write, as the child may modify it, but the changes aren't reflected in any other process's workspace.
The $pm->start and next idiom may seem a little obscure. Essentially it skips the rest of the loop if the start method returns a true value. I prefer something like my $pid = $fm->start; next if $pid; or the if construct in the code below. Both do the same thing, but, I think, more legibly.
I recommend that you experiment with this simpler application, which uses a pool of five child processes to print the numbers from zero to nine.
use strict;
use warnings;

use Parallel::ForkManager;

STDOUT->autoflush;

my $fm = Parallel::ForkManager->new(5);

for my $i (0 .. 9) {
    my $pid = $fm->start;
    if ($pid == 0) {
        print "$i\n";
        sleep 2;
        $fm->finish;
    }
}
To test, use a safe local operation like print or write to avoid spamming the URLs. Here's a working snippet from a program I wrote that uses the fork manager.
my $pm = new Parallel::ForkManager(20);
foreach $add (@adds) {
    $pm->start and next;
    # if the email is invalid, move on
    if (!defined(Email::Valid::Loose->address($add))) {
        writeaddr(*BADADDR, $add); # address is bad
        $pm->finish;
    }
    # if the email is valid, get the domain name
    $is_valid = Email::Valid::Loose->address($add);
    if ($is_valid =~ m/\@(.*)$/) {
        $host = $1;
    }
    $is_valid = "";
    # perform a DNS lookup to check the domain
    @mx = mx($resolver, $host);
    if (@mx) {
        writeaddr(*GOODADDR, $add); # address is good
    } else {
        writeaddr(*BADADDR, $add); # address is bad
    }
    $pm->finish;
}

Need program to loop correctly

Got it.
while 1:
    line = sub.readline().split()
    if line == []:
        new = main
        break
    else:
        new = main.replace(line[0], line[1])
        main = new
This seems to work for me. Thanks for the help =).
Try another loop, and in that loop index the word swap that you need to occur:
I assume each line of sub.txt has the subs that you want. Read all the lines of sub.txt, storing each line in an indexable array. Set up a loop around your main code, and in that loop index through the array, referencing sequentially the line of sub.txt that you want each time.
As pointed out by Cameron, this method will overwrite the output file every time, thus recording only your last change.
The error in this block is the indentation:
while True:
    word = substitute.readline().split()
    print(word)
    if word == []:  # <-- this line was mis-indented
        break
    else:
        new = (main_story.read().replace(word[0], word[1]))
        new_story.write(new)
You need to read the complete file at once, make the changes, and write it back to the file.
Or you could read from the first file and then do subsequent reads/writes on the new file.

Is it safe to abort this file-searching thread?

First, the code:
lblFileNbr.Text = "?/?";
lblFileNbr.ToolTipText = "Searching for files...";
lock(_fileLock)
{
_dirFiles = new string[0];
_fileIndex = 0;
}
if(_fileThread != null && _fileThread.IsAlive)
{
_fileThread.Abort();
}
_fileThread = new Thread(() =>
{
string dir = Path.GetDirectoryName(fileName) ?? ".";
lock (_fileLock)
{
_dirFiles = GetImageFileExtensions().SelectMany(f => Directory.GetFiles(dir, f, _searchOption)).OrderBy(f => f).ToArray();
_fileIndex = Array.IndexOf(_dirFiles, fileName);
}
int totalFileCount = Directory.GetFiles(dir, "*.*", _searchOption).Length;
Invoke((MethodInvoker)delegate
{
lblFileNbr.Text = string.Format("{0}/{1}", NumberFormat(_fileIndex + 1), NumberFormat(_dirFiles.Length));
lblFileNbr.ToolTipText = string.Format("{0} ({1} files ignored)", dir, NumberFormat(totalFileCount - _dirFiles.Length));
});
});
_fileThread.Start();
I'm building a little image-viewing program. When you open an image, it lists the number of files in the same directory. I noticed when I open an image in a directory with a lot of other files (say 150K), it takes several seconds to build the file list. Thus, I'm delegating this task to another thread.
If, however, you open another image before it finishes searching for the files, that old count is no longer relevant, so I'm aborting the thread.
I'm locking _dirFiles and _fileIndex because I want to add some Left and Right key functionality to switch between photos, so I'll have to access those somewhere else (but in the UI thread).
Is this safe? There seem to be dozens of methods of dealing with threads in C# now; I just wanted something simple.
fileName is a local variable (which means it will be "copied" into the anonymous function, right?), and _searchOption is readonly, so I imagine those two are safe to access.
> Is it safe to abort this file-searching thread?
The short answer is NO!
It is almost never safe to abort a thread, and this advice applies even more when you might be executing native code.
If you can't cooperatively exit fast enough (because it is your call to Directory.GetFiles that takes the time), your best bet is to abandon the thread: let it finish cleanly, but ignore its results.
As always, I recommend reading Joe Albahari's free ebook.
It isn't safe to abort the thread using Thread.Abort(). But you could instead implement your own abort mechanism, which allows you to bring the thread to a close safely and in a controlled fashion.
If you use EnumerateFiles instead of GetFiles, you can loop through each file as you increment a counter to get the total number of files while checking a flag to see if the thread needs to abort.
Calling something such as this in place of your current GetFiles().Length:
private volatile bool AbortSearch = false; // volatile, so the searching thread sees the flag change

private int NumberOfFiles(string dir, string searchPattern, SearchOption searchOption)
{
    var files = Directory.EnumerateFiles(dir, searchPattern, searchOption);
    int numberOfFiles = 0;
    foreach (var file in files)
    {
        numberOfFiles++;
        if (AbortSearch)
        {
            break;
        }
    }
    return numberOfFiles;
}
You could then replace
_fileThread.Abort();
with
AbortSearch = true;
_fileThread.Join();
You'll achieve what you do now with Thread.Abort(), but all the threads will end cleanly when you want them to.
