Is it safe to abort this file-searching thread?

First, the code:
lblFileNbr.Text = "?/?";
lblFileNbr.ToolTipText = "Searching for files...";
lock(_fileLock)
{
_dirFiles = new string[0];
_fileIndex = 0;
}
if(_fileThread != null && _fileThread.IsAlive)
{
_fileThread.Abort();
}
_fileThread = new Thread(() =>
{
string dir = Path.GetDirectoryName(fileName) ?? ".";
lock (_fileLock)
{
_dirFiles = GetImageFileExtensions().SelectMany(f => Directory.GetFiles(dir, f, _searchOption)).OrderBy(f => f).ToArray();
_fileIndex = Array.IndexOf(_dirFiles, fileName);
}
int totalFileCount = Directory.GetFiles(dir, "*.*", _searchOption).Length;
Invoke((MethodInvoker)delegate
{
lblFileNbr.Text = string.Format("{0}/{1}", NumberFormat(_fileIndex + 1), NumberFormat(_dirFiles.Length));
lblFileNbr.ToolTipText = string.Format("{0} ({1} files ignored)", dir, NumberFormat(totalFileCount - _dirFiles.Length));
});
});
_fileThread.Start();
I'm building a little image-viewing program. When you open an image, it lists the number of files in the same directory. I noticed when I open an image in a directory with a lot of other files (say 150K), it takes several seconds to build the file list. Thus, I'm delegating this task to another thread.
If, however, you open another image before it finishes searching for the files, that old count is no longer relevant, so I'm aborting the thread.
I'm locking _dirFiles and _fileIndex because I want to add some Left and Right key functionality to switch between photos, so I'll have to access those somewhere else (but in the UI thread).
Is this safe? There seem to be dozens of ways of dealing with threads in C# now; I just wanted something simple.
fileName is a local variable (which means it will be "copied" into the anonymous function, right?), and _searchOption is readonly, so I imagine those two are safe to access.

> Is it safe to abort this file-searching thread?
The short answer is NO!
It is almost never safe to abort a thread, and this advice applies even more when you might be executing native code.
If you can't exit cooperatively fast enough (because it is your call to Directory.GetFiles that takes the time), your best bet is to abandon the thread: let it finish cleanly but ignore its results.
As always, I recommend reading Joe Albahari's free ebook.

It isn't safe to abort the thread using Thread.Abort(). But you could instead implement your own abort which could allow you to safely bring the thread to a close in a controlled fashion.
If you use EnumerateFiles instead of GetFiles, you can loop through each file as you increment a counter to get the total number of files while checking a flag to see if the thread needs to abort.
Calling something such as this in place of your current GetFiles().Length:
private volatile bool AbortSearch = false; // volatile so the worker thread sees the write made by the UI thread
private int NumberOfFiles(string dir, string searchPattern, SearchOption searchOption)
{
var files = Directory.EnumerateFiles(dir, searchPattern, searchOption);
int numberOfFiles = 0;
foreach (var file in files)
{
numberOfFiles++;
if (AbortSearch)
{
break;
}
}
return numberOfFiles;
}
You could then replace
_fileThread.Abort();
with
AbortSearch=true;
_fileThread.Join();
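You'll achieve the same result as with the current Thread.Abort(), but the thread ends cleanly at a point you choose. Remember to set AbortSearch back to false before starting the next search. Also note that Join() blocks the UI thread while the worker in the question calls Invoke, which waits for the UI thread; the worker should check AbortSearch and skip that Invoke call when it is set, otherwise the two threads can end up waiting on each other.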
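The cooperative-abort idea above is not specific to C#. Below is a minimal sketch of the same pattern in Rust, assuming a shared atomic flag that the worker polls once per item. The function name `count_items` and the endless iterator standing in for a huge directory listing are hypothetical, chosen only for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Counts items until the iterator ends or `cancel` is set.
/// Returns None when cancelled so the caller can ignore the partial count.
fn count_items(items: impl Iterator<Item = u32>, cancel: &AtomicBool) -> Option<usize> {
    let mut n = 0;
    for _ in items {
        if cancel.load(Ordering::SeqCst) {
            return None; // stale search: discard the partial result
        }
        n += 1;
    }
    Some(n)
}

fn main() {
    let cancel = Arc::new(AtomicBool::new(false));
    let worker_flag = Arc::clone(&cancel);

    // Hypothetical stand-in for enumerating a very large directory.
    let worker = thread::spawn(move || count_items(std::iter::repeat(0), &worker_flag));

    // Simulate the user opening another image shortly afterwards.
    thread::sleep(Duration::from_millis(50));
    cancel.store(true, Ordering::SeqCst);

    // The worker notices the flag on its next iteration and returns.
    let result = worker.join().unwrap();
    println!("cancelled search returned {:?}", result);
}
```

Returning `None` on cancellation, rather than a partial count, makes it hard for the caller to display a number that belongs to a search nobody is waiting for any more.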

Efficient reuse of previous hashmap entries (insert or modify if key exists)

I've written a bit of code that creates CSV files, but when an identical file already exists, I'd like to delete the older copy. I decided to do that using a hashmap, but I'm having a problem with the way hashmaps in Rust deal with existing vs. new entries. Namely, I'm trying to avoid hashing the key once to check whether the entry already exists, and then hashing it again to either retrieve and modify the existing entry or insert a new one.
Rust has a built in method for doing this naturally, but in at least some cases it doesn't work, and here's one of them:
let cwd = std::env::current_dir().unwrap();
let mut files = HashMap::with_capacity(5);
for dir_entry in cwd.read_dir()?.flatten() {
let fname = dir_entry.file_name();
let fntext = fname.to_string_lossy();
let md = dir_entry.metadata()?;
if md.is_file() && fntext.starts_with("test") && fntext.ends_with(".csv") {
let mut data = Vec::with_capacity(500_000);
let f = File::open(dir_entry.path())?;
let mut br = BufReader::new(f);
br.read_to_end(&mut data);
let hash = MeowHasher::hash(data.as_slice());
files.entry(hash.as_u128()).and_modify(|f: &mut std::fs::DirEntry| {
let md2 = f.metadata().unwrap();
if md2.modified().unwrap() > md.modified().unwrap() {
std::fs::remove_file(dir_entry.path()).unwrap();
} else {
std::fs::remove_file(f.path()).unwrap();
*f = dir_entry;
}
}).or_insert(dir_entry);
}
}
The problem here is that the DirEntry struct doesn't implement Clone. In real-world usage the clone isn't even needed, because the move inside and_modify only happens if the entry is already there, and in that case the or_insert clause never runs. So this code is perfectly sound; nonetheless, it will not compile as is.
I know of several other ways to do what I'm doing successfully, but that isn't the point of this question. The point is to figure out how to do an "insert or modify if key exists" operation on Rust hashmaps when the modify operation involves replacing the existing entry outright, only without needing to clone the replacement in order to satisfy the borrow checker.
Note in this particular case, the dir_entry objects representing files that weren't deleted will need to be reused later, so the solution can't discard them.

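Match hash_map::Entry directly. The borrow checker cannot reason about control flow that happens inside function calls such as and_modify and or_insert, but it can reason about the control flow of a match, where only one arm runs and so dir_entry is moved at most once: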
match files.entry(hash.as_u128()) {
hash_map::Entry::Occupied(mut entry) => {
let f = entry.get_mut();
let md2 = f.metadata().unwrap();
if md2.modified().unwrap() > md.modified().unwrap() {
std::fs::remove_file(dir_entry.path()).unwrap();
} else {
std::fs::remove_file(f.path()).unwrap();
*f = dir_entry;
}
}
hash_map::Entry::Vacant(entry) => {
entry.insert(dir_entry);
}
}
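For experimenting without touching the file system, here is a self-contained sketch of the same match-on-Entry technique. The `FileInfo` struct is a hypothetical stand-in for `std::fs::DirEntry` (it deliberately does not derive `Clone`), and `keep_newest` is a name invented for this example. Instead of deleting anything, it returns the displaced value so the caller can decide what to do with it.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Deliberately not Clone, like std::fs::DirEntry.
#[derive(Debug, PartialEq)]
struct FileInfo {
    name: String,
    modified: u64, // hypothetical timestamp
}

/// Keeps the newest FileInfo per key with a single hash lookup.
/// Returns whichever value lost out (the one the caller would delete).
fn keep_newest(
    map: &mut HashMap<u128, FileInfo>,
    key: u128,
    candidate: FileInfo,
) -> Option<FileInfo> {
    match map.entry(key) {
        Entry::Occupied(mut slot) => {
            if slot.get().modified > candidate.modified {
                Some(candidate) // existing entry is newer; candidate loses
            } else {
                Some(slot.insert(candidate)) // replace and hand back the old value
            }
        }
        Entry::Vacant(slot) => {
            slot.insert(candidate); // first time we see this key
            None
        }
    }
}

fn main() {
    let mut files = HashMap::new();
    let a = FileInfo { name: "test1.csv".into(), modified: 10 };
    let b = FileInfo { name: "test2.csv".into(), modified: 20 };
    assert!(keep_newest(&mut files, 1, a).is_none());
    let displaced = keep_newest(&mut files, 1, b);
    println!("would delete: {:?}", displaced);
}
```

Because `candidate` is moved in exactly one arm of the `match`, no clone is needed and the surviving values stay in the map for later use.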

Make a while loop, loop for 1 second in Dart/Flutter

I am trying to make a while loop repeat a statement for exactly one second and then stop. I have tried this in DartPad, but it crashes the browser window.
void main(){
var count = 0.0;
bool flag = true;
Future.delayed(Duration(seconds: 1), (){
flag = false;
});
while (flag){
count++;
}
print(count);
}
Am I doing something wrong?

I like how you are trying to figure Futures out. I was exactly where you were before I understood this stuff. It's kind of like threads, but quite different in some ways.
The Dart code that you wrote is single threaded. By writing Future.delayed, you did not start a job. Its execution won't happen unless you let go of the thread by returning from this main function.
Main does not have to return if it is marked with async.
Two actions have to run "concurrently" to be able to interact with each other like you are trying to do. The way to do it is to call Future.wait to get a future that depends on the two futures. Edit: Both of these actions have to let go of execution at every step so that the other can get control of the single thread. So, if you have a loop, you have to have some kind of await call in it to yield execution to other actions.
Here's a modified version of your code that counts up to about 215 for me:
Future main() async {
var count = 0.0;
bool flag = true;
var futureThatStopsIt = Future.delayed(Duration(seconds: 1), (){
flag = false;
});
var futureWithTheLoop = () async {
while (flag){
count++;
print("going on: $count");
await Future.delayed(Duration(seconds: 0));
}
}();
await Future.wait([futureThatStopsIt, futureWithTheLoop]);
print(count);
}

Using Parallel::ForkManager in foreach loop

I am just learning Perl as a fourth language.
My wish is to use Parallel::ForkManager to speed up a foreach loop using an array whose members are taken from a text file.
Basically I am testing a .txt file of URLs, and want to test several members of the array at once (five at a time in this instance) rather than one at a time, without hitting the same URL repeatedly and inadvertently DoSing it.
Would something like this do the trick?
$limit = new Parallel::ForkManager(5);
foreach (@lines) {
$limit->start and next;
$lines = $_;
... do processing here ...
$limit->finish;
}
or would it be the equivalent of running that loop 5 times making a small multithreaded DoS script?

It isn't too clear from the documentation, but:
A call to start will block in the parent process until there are fewer children running than the limit specified. Then it will return the (non-zero) child PID in the parent, and zero in the child.
A child process can see all the data in the parent process as it was when start was called. The data is presumably copy-on-write, as the child may modify it but the changes aren't reflected in any other process's workspace.
The $pm->start and next idiom may seem a little obscure. Essentially it skips the rest of the loop if the start method returns a true value. I prefer something like my $pid = $fm->start; next if $pid; or the if construct in the code below. Both do the same thing but, I think, more legibly.
I recommend that you experiment with this simpler application, which uses a pool of five child processes to print the numbers from zero to nine.
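use strict;
use warnings;
use Parallel::ForkManager;
STDOUT->autoflush;
my $fm = Parallel::ForkManager->new(5);
for my $i (0 .. 9) {
    my $pid = $fm->start;
    if ($pid == 0) {
        # Child process: do the work, then exit via finish
        print "$i\n";
        sleep 2;
        $fm->finish;
    }
}
# Parent: wait until every child has exited
$fm->wait_all_children;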

To test, use a safe local operation like print or write to avoid spamming the URLs. Here's a working snippet from a program I wrote that uses the fork manager.
my $pm=new Parallel::ForkManager(20);
foreach $add (@adds){
    $pm->start and next;
    #if email is invalid move on
    if (!defined(Email::Valid::Loose->address($add))){
        writeaddr(*BADADDR, $add); #address is bad
        $pm->finish;
    }
    #if email is valid get domain name
    $is_valid = Email::Valid::Loose->address($add);
    if ($is_valid =~ m/\@(.*)$/) {
        $host = $1;
    }
    $is_valid="";
    # perform DNS lookup to check domain
    @mx=mx($resolver, $host);
    if (@mx) {
        writeaddr(*GOODADDR, $add); #address is good
    }else{
        writeaddr(*BADADDR, $add); #address is bad
    }
    $pm->finish;
}

Why doesn't Process.WaitForInputIdle() work?

I am using Windows Automation to test my UI and am opening and closing processes. I want to have a valid WindowHandle, but Process.WaitForInputIdle() doesn't wait long enough. I have a workaround, but I don't understand why WaitForInputIdle() doesn't work.
Below is a small code snippet:
Process = new Process
{
StartInfo =
{
WorkingDirectory = directory,
FileName = EXECUTABLE_FILE_NAME
}
};
Process.Start();
//Process.WaitForInputIdle() doesn't work,
//so will use a while loop until MainWindowHandle isn't IntPtr.Zero anymore,
//or until 10 seconds have elapsed
int count = 0;
while (Process.MainWindowHandle == IntPtr.Zero && count<100)
{
count++;
Thread.Sleep(100);
}
AppElement = AutomationElement.FromHandle(Process.MainWindowHandle);

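As stated by Chaser324 in his comment, the answer to my question can be found here.
I basically need to add a call to Process.Refresh() inside my 'while' loop. The Process object caches MainWindowHandle, so without Refresh() the loop keeps reading the stale IntPtr.Zero value.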

atomically creating a file lock in MATLAB (file mutex)

I am looking for a simple already implemented solution for atomically creating a file lock in MATLAB.
Something like:
file_lock('create', 'mylockfile'); %this will block until it creates the lock file.
file_lock('remove', 'mylockfile'); %this will remove the lock file:
This question has already been asked several times, with some proposed solution ideas (such as using Java FileLock),
but I didn't find a simple already implemented solution.
Are you aware of such an implemented solution?
Notes:
locking file access OR exchanging messages bw Matlab Instances
Thread Subject: Safe file mutex without race condition

I've settled on a pretty simple solution for combining error/logging messages from multiple worker threads into a single file. Every time I want to write to that file, I first write the output to the thread's own temporary file. Next, I append that temporary file to the "master" log file using flock. Skipping over some details here, the idea is:
fid=fopen(threadtemp, 'w');
fprintf(fid, 'Error message goes here');
fclose(fid);
runme = sprintf('flock -x %s -c ''cat %s >> %s''', LOGFILE, threadtemp, LOGFILE);
system(runme);
See the flock man page for details, but the call above is acquiring an eXclusive lock on the logfile, running the provided Command under the lock, and then releasing it.
This obviously only works if you're on a system which has flock (Linux/OS X, and only certain types of file systems at that) and you're doing something that can be done from the command line, but I'd bet that it's a pretty common use-case.

Depending on which Java version you're using, perhaps this will work (translated from: http://www.javabeat.net/2007/10/locking-files-using-java/)
classdef FileLock < handle
properties (Access = private)
fileLock = []
file
end
methods
function this = FileLock(filename)
this.file = java.io.RandomAccessFile(filename,'rw');
fileChannel = this.file.getChannel();
this.fileLock = fileChannel.tryLock();
end
function val = hasLock(this)
if ~isempty(this.fileLock) && this.fileLock.isValid()
val = true;
else
val = false;
end
end
function delete(this)
this.release();
end
function release(this)
if this.hasLock
this.fileLock.release();
end
this.file.close
end
end
end
Usage would be:
lock = FileLock('my_lock_file');
if lock.hasLock
%// do something here
else
%// I guess not
end
%// Manually release the lock, or just delete (or let matlab clean it up)
I like the object-wrapping pattern for IO because the release happens even when an exception is thrown.
EDIT: The file reference must be kept around and closed manually, or you won't be able to edit the file. That means this code is only really useful for pure lock files, I think.

If you only need to run on OS X and Linux (not Windows), you can use the following:
pathLock='/tmp/test.lock'
% Try to create and lock this file.
% In my case I use -r 0 to avoid retrying
% You could use -r -1 to retry forever, or for a particular amount of time,
% etc, see `man lockfile` for details.
if ~system(sprintf('lockfile -r 0 %s',pathLock))
% We succeeded, so perform some task which needs to be serialized.
% runSerializedTask()
% Now remove the lockfile
system(sprintf('rm -f %s',pathLock));
end
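The step that matters in the `lockfile` approach is that creating the lock file either succeeds or fails as a single atomic operation. As a rough sketch of that primitive outside MATLAB, the Rust standard library offers `OpenOptions::create_new`, which fails if the file already exists. The `LockFile` type and its `try_acquire` method below are hypothetical names for this illustration, and the sketch makes no attempt to detect stale locks left by a crashed process.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Owns a lock file and removes it when dropped.
struct LockFile {
    path: PathBuf,
}

impl LockFile {
    /// Tries once to create the lock file.
    /// Ok(None) means another holder already created it.
    fn try_acquire(path: &Path) -> io::Result<Option<LockFile>> {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(_) => Ok(Some(LockFile { path: path.to_path_buf() })),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for LockFile {
    fn drop(&mut self) {
        // Best effort: ignore errors while releasing the lock.
        let _ = fs::remove_file(&self.path);
    }
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("demo.lock");
    let first = LockFile::try_acquire(&path)?;
    let second = LockFile::try_acquire(&path)?;
    println!("first: {}, second: {}", first.is_some(), second.is_some());
    drop(first); // releases the lock by deleting the file
    println!("after release: {}", LockFile::try_acquire(&path)?.is_some());
    Ok(())
}
```

A blocking `create` operation like the one asked for in the question could be built on top of this by retrying `try_acquire` in a loop with a short pause between attempts.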

Write to a new file, then rename it. Renaming is an atomic operation, and all the new content will become visible at once.
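A minimal sketch of the write-then-rename idea, written in Rust for illustration. The helper name `write_atomically` and the `.tmp` naming scheme are assumptions made for this example. The atomicity relies on both paths being on the same file system (as POSIX specifies for rename); behaviour on other platforms or on network file systems should be checked rather than assumed.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Writes `contents` to a temporary sibling file, then renames it over `target`.
/// Readers see either the old file or the complete new one.
fn write_atomically(target: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = target.with_extension("tmp"); // hypothetical temp-name scheme
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents)?;
        f.sync_all()?; // make sure the data is on disk before publishing it
    }
    fs::rename(&tmp, target)
}

fn main() -> std::io::Result<()> {
    let target = std::env::temp_dir().join("rename_demo.log");
    write_atomically(&target, b"first version\n")?;
    write_atomically(&target, b"second version\n")?;
    print!("{}", fs::read_to_string(&target)?);
    fs::remove_file(&target)
}
```

Note that this publishes complete content atomically; by itself it does not stop two writers from replacing each other's files, so it complements a lock rather than replacing one.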

In the end I wrote an implementation based on two consecutive tests (movefile, then verifying the contents of the moved file).
It is not very well written, but it works for me for now.
+++++ file_lock.m ++++++++++++++++++++++++
function file_lock(op, filename)
%this will block until it creates the lock file:
%file_lock('create', 'mylockfile')
%
%this will remove the lock file:
%file_lock('remove', 'mylockfile')
% todo: verify that there are no bugs
filename = [filename '.mat'];
if isequal(op, 'create')
id = [tempname() '.mat']
while true
save(id, 'id');
success = fileattrib(id, '-w');
if success == 0; error('fileattrib'); end
while true
if exist(filename, 'file'); %first test
fprintf('file lock exists(1). waiting...\n');
pause(1);
continue;
end
status = movefile(id, filename); %second test
if status == 1; break; end
fprintf('file lock exists(2). waiting...\n');
pause(1);
end
temp = load(filename, 'id'); % third test.
if isequal(id, temp.id); break; end
fprintf('file lock exists(3). waiting...\n');
pause(1)
end
elseif isequal(op, 'remove')
%delete(filename);
execute_rs(@() delete(filename));
else
error('invalid op');
end
function execute_rs(f)
while true
try
lastwarn('');
f();
if ~isequal(lastwarn, ''); error(lastwarn); end %such as: Warning: File not found or permission denied
break;
catch exception
fprintf('Error: %s\n.Retrying...\n', exception.message);
pause(.5);
end
end
++++++++++++++++++++++++++++++++++++++++++