Good practice to hold a file or channel in a class

In the following code, I am trying to make a class which can write something to a log file when asked via a method. I am wondering whether this is an idiomatic way to do it, or whether there is a more recommended way, e.g., holding a separate field of file type (for some reason). In other words, is there practically no problem if I hold only a channel type?
class Myclass {
  var logfile: channel;
  proc init() {
    writeln( "creating log.out" );
    logfile = openwriter( "log.out" );
  }
  proc log( x ) {
    logfile.writeln( x );
  }
}
proc main() {
  var a = new borrowed Myclass();
  a.log( 10 );
  a.log( "orange" );
}

I believe what you're doing here is reasonable. The distinction between files and channels in Chapel is primarily made in support of the language's parallel computing theme, in order to support having multiple tasks access a single logical file simultaneously using distinct channels (views into the file, essentially). In a case like yours, there is a file underlying the channel you've created, but there's no need to explicitly store it if you have no need to interact further with it.
So I believe there is no practical problem to simply storing a channel as you have here.

Is there a file object to get path or name of a file in Nim?

Let's say, I would like to use a single object to represent a file and I'd like to get the filename (or path) of it so that I can use the name to remove the file or for other standard library procedures. I'd like to have a single abstraction which can be used with all available file-related standard library procedures.
I've found FileInfo but in my research I didn't find a get-file-name-procedure. File and FileHandle are pretty useless from a software engineering point of view because they provide no convenient abstraction and don't have members.
Is there a file abstraction (object) in Nim, which provides fast access to FileInfo as well as the file name so that a file doesn't need more than one procedure parameter?
There is no such abstraction in Nim, or any other language, simply because you are asking for an impossible thing to do with most filesystems. Consider the FileInfo structure and its linkCount field which tells you the number of hard links the file object has. But there is no way to get-a-filename from one or all of those links short of building and updating yourself a database of the whole filesystem.
While most filesystems allow access to files through paths, there is rarely a filesystem that gives paths from files because they actually don't need one! An example would be a Unix filesystem where one process opens a file through a path, then removes the path without closing the file. While the process holding the file open is alive, that file won't actually disappear, so you would have the case of a file without path.
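The Unix case described above can be reproduced in a few lines of C. This is a minimal sketch assuming a POSIX system; the function name `use_file_without_path` and the scratch path passed to it are hypothetical, and only standard POSIX calls (`open`, `unlink`, `write`, `lseek`, `read`, `fstat`) are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a file at `path`, removes the path while the descriptor is still
 * open, then writes and reads back through the descriptor.
 * Returns the link count reported by fstat() after the unlink, or -1 on error.
 * `path` is a hypothetical scratch location chosen by the caller. */
long use_file_without_path(const char *path, char *out, size_t out_size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Remove the only name; the open descriptor keeps the file alive. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }

    const char msg[] = "still here";
    struct stat st;
    long links = -1;

    /* The file has no path any more, yet it can still be written and read. */
    if (out_size >= sizeof msg &&
        write(fd, msg, sizeof msg) == (ssize_t)sizeof msg &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, out, sizeof msg) == (ssize_t)sizeof msg &&
        fstat(fd, &st) == 0) {
        links = (long)st.st_nlink;
    }

    close(fd);   /* only now is the storage actually released */
    return links;
}
```

On typical local filesystems the link count returned here is 0, which is exactly the "file without a path" situation: there is a live file object and no name that could be handed back to the caller.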
The issue of handling paths, especially considering cross platform applications, involves its own can of worms: if you store paths as strings, what is the path separator and how do you escape it? Does your filesystem support volumes that require special case handling? What string encoding do paths use to satisfy all users? Just the encoding issue requires tons of tables and conversions which would bog down every other API wishing to get just a file like handle to read or write bytes.
A FileInfo is just a snapshot of the state of the file at a given time, a file handle is the live file object you can operate on, and a path (or many paths if your filesystem supports hard links) is just a convenience name for end users.
These are all very different things, which is why they are separate. Your app may need a more complex abstraction than other programmers are willing to tolerate, so create your own abstraction which holds together all the individual pieces you need. For instance, consider the following structure:
import os
type
  AppFileInfo = object
    fileInfo: FileInfo
    file: File
    oneOfMany: string
proc changeFileExt(appFileInfo: AppFileInfo, ext: string): string =
  changeFileExt(appFileInfo.oneOfMany, ext)
proc readAll(appFileInfo: AppFileInfo): string =
  readAll(appFileInfo.file)
Those procs simply mimic the respective standard library APIs but use your more complex structure as inputs and transform it as needed. If you are worried about this abstraction not being optimised due to the extra proc call you could use a template instead.
If you follow this route, however, at some point you will have to ask yourself what is the lifetime of an AppFileInfo object: do you create it with a path? Do you create it from a file handle? Is it safe to access the file field in parts of your code or has it not been initialised properly? Do you return errors or throw exceptions when something goes wrong? Maybe when you start to ask yourself these questions you'll realise they are very app specific and are very difficult to generalise for every use case. Therefore such a complex object doesn't make much sense in the language standard library.
I created the missing solution myself. I basically extended the File type using a global encapsulated table. Extending Types like this could be a useful idiom in Nim because of UFCS.
import tables
type FileObject = object
  file : File
  mode : FileMode
  path : string
proc initFileObject(name: string; mode: FileMode; bufsize = -1) : FileObject =
  result.file = open(name, mode, bufsize)
  result.path = name
  result.mode = mode
var g_fileObjects = initTable[File, FileObject]()
template get(this: File) : var FileObject = g_fileObjects[this]
proc openFile*(filepath: string; mode: FileMode = fmRead; bufsize = -1) : File =
  var fileObject = initFileObject(filepath, mode, bufsize)
  result = fileObject.file
  g_fileObjects[result] = fileObject
proc filePath*(this: File) : string {.raises: KeyError.} =
  return this.get.path
proc fileMode*(this: File) : FileMode {.raises: KeyError.} =
  return this.get.mode
from os import tryRemoveFile
proc closeOrDeleteFile[delete = false](this: File) : bool =
  result = g_fileObjects.hasKey(this)
  if result:
    when delete:
      result = this.filepath.tryRemoveFile()
    g_fileObjects.del(this)
    this.close()
proc closeFile*(this: File) : bool = this.closeOrDeleteFile[:false]
proc deleteFile*(this: File) : bool = this.closeOrDeleteFile[:true]
Now you can write
var f = openFile("myFile.txt", fmWrite)
var g = openFile("hello.txt", fmWrite)
echo f.filePath
echo f.deleteFile()
g.writeLine(g.filePath)
echo g.closeFile()

How to use Collections.binarySearch() in a CodenameOne project

I am used to being able to perform a binary search of a sorted list of, say, Strings or Integers, with code along the lines of:
Vector<String> vstr = new Vector<String>();
// etc...
int index = Collections.binarySearch (vstr, "abcd");
I'm not clear on how codenameone handles standard java methods and classes, but it looks like this could be fixed easily if classes like Integer and String (or the codenameone versions of these) implemented the Comparable interface.
Edit: I now see that code along the lines of the following will do the job.
int index = Collections.binarySearch(vstr, "abcd", new Comparator<String>() {
    @Override
    public int compare(String object1, String object2) {
        return object1.compareTo(object2);
    }
});
Adding the Comparable interface (to the various primitive "wrappers") would also make it easier to use Collections.sort (another very useful method :-))
You can also sort with a comparator, but I agree: this is one of the important enhancements we need to provide in the native VMs on the various platforms. Personally, this is my biggest peeve in our current VM.
Can you file an RFE on that and mention it as a comment in the Number issue?
If we are making that change, we might as well do both.

How to define thread safe array?

How can I define a thread safe global array with minimal modifications?
I want every access to it to go through a mutex and a synchronized block.
Something like this, where 'T' is some type (note that the 'sync' keyword is not currently defined AFAIK):
sync Array!(T) syncvar;
And every access to it will be similar to this:
Mutex __syncvar_mutex;
//some func scope....
synchronized(__syncvar_mutex) { /* edits 'syncvar' safely */ }
My naive attempt was to do something like this:
import std.typecons : Proxy;
synchronized class Array(T)
{
    static import std.array;
    private std.array.Array!T data;
    mixin Proxy!data;
}
Sadly, it doesn't work because of https://issues.dlang.org/show_bug.cgi?id=14509
Can't say I am very surprised though, as automagical handling of multi-threading via hidden mutexes is very unidiomatic in modern D and the very concept of synchronized classes is mostly a relic from D1 times.
You can implement the same solution manually, of course, by defining your own SharedArray class with all the necessary methods and adding locks inside the methods before calling the internal private plain Array methods. But I presume you want something that works more out of the box.
Can't invent anything better right here and now (will think about it more), but it is worth noting that in general it is encouraged in D to create data structures designed for handling shared access explicitly instead of just protecting normal data structures with mutexes. And, of course, the most encouraged approach is to not share data at all and to use message passing instead.
I will update the answer if anything better comes to my mind.
It is fairly easy to make a wrapper around array that will make it thread-safe. However, it is extremely difficult to make a thread-safe array that is not a concurrency bottleneck.
The closest thing that comes to mind is Java's CopyOnWriteArrayList class, but even that is not ideal...
You can wrap the array inside a struct that locks access to the array from the moment a thread acquires a token until it releases it.
The wrapper/locker:
acquire(): called in a loop by a thread. It returns a pointer, so the thread knows it holds the token when the method returns a non-null value.
release(): called by a thread after it has finished processing the data it acquired earlier.
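shared struct Locker(T)
{
private:
    T t;
    size_t token;
public:
    shared(T) * acquire()
    {
        // Note: the check and the increment below are two separate steps,
        // so two threads can both see token == 0 and both acquire.
        if (token) return null;
        else
        {
            import core.atomic;
            atomicOp!"+="(token, 1);
            return &t;
        }
    }
    void release()
    {
        import core.atomic;
        atomicOp!"-="(token, 1);
    }
}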
and a quick test:
alias LockedIntArray = Locker!(size_t[]);
shared LockedIntArray intArr;
void arrayTask(size_t cnt)
{
    import core.thread, std.random;
    // ensure the desynchronization of this job.
    Thread.sleep( dur!"msecs"(uniform(4, 20)));
    shared(size_t[])* arr = null;
    // wait for the token
    while(arr == null) {arr = intArr.acquire;}
    *arr ~= cnt;
    import std.stdio;
    writeln(*arr);
    // release the token for the waiting threads
    intArr.release;
}
void main(string[] args)
{
    import std.parallelism;
    foreach(immutable i; 0..16)
    {
        auto job = task(&arrayTask, i);
        job.executeInNewThread();
    }
}
The downside is that each block of operations on the array must be surrounded by an acquire/release pair.
You have the right idea. With an array, you need to be able to both edit and retrieve information. I suggest you take a look at the read-write mutex and atomic utilities provided by Phobos. A read operation is fairly simple:
synchronize on mutex.readLock
load (with atomicLoad)
copy the item out of the synchronize block
return the copied item
Writing should be almost exactly the same. Just synchronize on mutex.writeLock and do a cas or atomicOp operation.
Note that this will only work if you copy the elements in the array during a read. If you want to get a reference, you need to do additional synchronization on the element every time you access or modify it.
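The read and write steps described above can be sketched in C with a POSIX read-write lock standing in for the Phobos one. This is only an illustration of the pattern, not D code: the `shared_array` type, its functions and the fixed capacity are hypothetical, and no separate atomic load is used because the lock already orders the accesses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SHARED_CAP 64   /* hypothetical fixed capacity for the sketch */

typedef struct {
    pthread_rwlock_t lock;
    int items[SHARED_CAP];
    size_t len;
} shared_array;

/* Returns 0 on success, like pthread_rwlock_init(). */
int shared_array_init(shared_array *a)
{
    a->len = 0;
    return pthread_rwlock_init(&a->lock, NULL);
}

/* Read: take the read lock, copy the element out, release the lock.
 * The caller only ever sees the copy. Returns 1 if index i exists. */
int shared_array_get(shared_array *a, size_t i, int *out)
{
    int ok = 0;
    pthread_rwlock_rdlock(&a->lock);
    if (i < a->len) {
        *out = a->items[i];
        ok = 1;
    }
    pthread_rwlock_unlock(&a->lock);
    return ok;
}

/* Write: same shape, but with the exclusive write lock.
 * Returns 1 if the value was appended, 0 if the array is full. */
int shared_array_push(shared_array *a, int value)
{
    int ok = 0;
    pthread_rwlock_wrlock(&a->lock);
    if (a->len < SHARED_CAP) {
        a->items[a->len++] = value;
        ok = 1;
    }
    pthread_rwlock_unlock(&a->lock);
    return ok;
}

void shared_array_destroy(shared_array *a)
{
    pthread_rwlock_destroy(&a->lock);
}
```

Because `shared_array_get` hands back a copy, many readers can run at once and none of them holds a pointer into the array after the lock is released, which is the condition the note above depends on.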

multithreads process data from the same file

Can anyone in this forum give an example in C of how two threads can process data from one text file?
As an example, I have one text file that contains a paragraph. I have two threads that will process the data in the said file. One thread will count the number of lines in the paragraph. The second thread will count the numeric characters.
Thanks
If you had asked for C++ I could give you a code example, but I haven't done ANSI C in a very long time, so I will give you the design and pseudo code.
Please keep in mind this is really bad pseudo code that is meant to give an example. I'm not questioning WHY you would want to do this. For all I know it could be an exercise with threads or because you "feel like it".
Example 1
int integerCount = 0;
int lineCount = 0;
numericThread()
{
    // By flagging the file as readonly you should
    // be able to open it as many times as you wish
    handle h = openfile("textfile.txt", readonly);
    while (!eof(h)) {
        String word = readWord(h);
        int outInteger;
        if (stringToInteger(word, outInteger)) {
            ++integerCount;
        }
    }
}
lineThread()
{
    // By flagging the file as readonly you should
    // be able to open it as many times as you wish
    handle h = openfile("textfile.txt", readonly);
    while (!eof(h)) {
        String word = readWord(h);
        if (word.equals("\n")) {
            ++lineCount;
        }
    }
}
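For readers who want the pseudo code above in actual C, here is a minimal sketch assuming POSIX threads. The names `count_job`, `count_lines`, `count_digits` and `count_in_parallel` are hypothetical. Unlike the pseudo code, which counts words that parse as integers, this version counts individual digit characters, which is what the question asks for.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <pthread.h>
#include <stdio.h>

/* Each worker gets its own job record, so no result is shared between threads. */
typedef struct {
    const char *path;   /* file to scan */
    long count;         /* result; -1 if the file could not be opened */
} count_job;

/* Counts newline characters using a private read-only FILE handle. */
static void *count_lines(void *arg)
{
    count_job *job = arg;
    FILE *f = fopen(job->path, "r");
    if (!f) { job->count = -1; return NULL; }
    int c;
    job->count = 0;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            job->count++;
    fclose(f);
    return NULL;
}

/* Counts digit characters using a second, independent handle. */
static void *count_digits(void *arg)
{
    count_job *job = arg;
    FILE *f = fopen(job->path, "r");
    if (!f) { job->count = -1; return NULL; }
    int c;
    job->count = 0;
    while ((c = fgetc(f)) != EOF)
        if (isdigit(c))
            job->count++;
    fclose(f);
    return NULL;
}

/* Runs both counters in parallel; returns 0 on success, -1 on failure. */
int count_in_parallel(const char *path, long *lines, long *digits)
{
    count_job line_job = { path, 0 }, digit_job = { path, 0 };
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, count_lines, &line_job) != 0)
        return -1;
    if (pthread_create(&t2, NULL, count_digits, &digit_job) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);   /* joining makes each result visible here */
    pthread_join(t2, NULL);
    *lines = line_job.count;
    *digits = digit_job.count;
    return (line_job.count < 0 || digit_job.count < 0) ? -1 : 0;
}
```

Each thread writes only to its own `count_job`, and the caller reads the results only after `pthread_join`, so no mutex is needed in this variant.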
If for some reason you aren't able to open the file twice in readonly mode, you will need to maintain a queue for each thread, with the main thread putting words into each thread's queue. The threads will then pull from their queues.
Example 2
int integerCount = 0;
int lineCount = 0;
queue numericQueue;
queue lineQueue;
numericThread()
{
    while (!numericQueue.closed()) {
        String word = numericQueue.pop();
        int outInteger;
        if (stringToInteger(word, outInteger)) {
            ++integerCount;
        }
    }
}
lineThread()
{
    while (!lineQueue.closed()) {
        String word = lineQueue.pop();
        if (word.equals("\n")) {
            ++lineCount;
        }
    }
}
mainThread()
{
    handle h = openfile("textfile.txt", readonly);
    while (!eof(h)) {
        String word = readWord(h);
        numericQueue.push(word);
        lineQueue.push(word);
    }
    numericQueue.close();
    lineQueue.close();
}
There are lots of ways to do this. You can make different design decisions depending on how fast or simple or elegant or overengineered you want this to be. One way, as posted by Andrew Finnell, is to have each thread open the file and read it completely independently. In theory this isn't great because you are doing expensive IO twice, but in practice it's probably fine because the OS has likely cached the contents of whichever read executes first. Double IO is still more expensive than a single read because it involves a lot of needless system calls, but again in practice it will be irrelevant unless you have a very large file.
Another model of how to do this would be for each thread to have an input queue, or a shared global queue. The main thread reads the file and places each line in turn on the queue(s), and perhaps main doubles as one of your worker threads. This is more complicated because access to the queue(s) must be synchronized, or some lockless queue implementation must be used. In the case of a shared global queue, there is less duplication of data but now the lifecycle of that data is more complicated.
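The synchronized-queue model can be sketched in C roughly as follows, assuming POSIX threads. Everything here is hypothetical naming (`line_queue`, `queue_push`, `queue_pop`, `count_digits_via_queue`), the queue depth and line length are arbitrary, and only one worker with its own queue is shown; a second worker would get a second queue fed in the same loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define Q_SLOTS 8     /* hypothetical queue depth */
#define Q_LINE  256   /* hypothetical maximum chunk length, including '\0' */

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  not_empty, not_full;
    char   lines[Q_SLOTS][Q_LINE];
    size_t head, count;
    int    closed;
} line_queue;

static void queue_init(line_queue *q)
{
    pthread_mutex_init(&q->mu, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    q->head = q->count = 0;
    q->closed = 0;
}

/* Producer side: blocks while the queue is full, then copies the line in. */
static void queue_push(line_queue *q, const char *line)
{
    pthread_mutex_lock(&q->mu);
    while (q->count == Q_SLOTS)
        pthread_cond_wait(&q->not_full, &q->mu);
    char *slot = q->lines[(q->head + q->count) % Q_SLOTS];
    strncpy(slot, line, Q_LINE - 1);
    slot[Q_LINE - 1] = '\0';
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

/* Marks the end of input and wakes a consumer that may be waiting. */
static void queue_close(line_queue *q)
{
    pthread_mutex_lock(&q->mu);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

/* Consumer side: returns 1 with a line copied into out (Q_LINE bytes),
 * or 0 once the queue is closed and drained. */
static int queue_pop(line_queue *q, char *out)
{
    int got = 0;
    pthread_mutex_lock(&q->mu);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->mu);
    if (q->count > 0) {
        strcpy(out, q->lines[q->head]);
        q->head = (q->head + 1) % Q_SLOTS;
        q->count--;
        pthread_cond_signal(&q->not_full);
        got = 1;
    }
    pthread_mutex_unlock(&q->mu);
    return got;
}

typedef struct { line_queue q; long digits; } digit_worker;

/* Worker: pulls lines until the queue closes and counts digit characters. */
static void *digit_thread(void *arg)
{
    digit_worker *w = arg;
    char line[Q_LINE];
    while (queue_pop(&w->q, line))
        for (const char *p = line; *p; p++)
            if (*p >= '0' && *p <= '9')
                w->digits++;
    return NULL;
}

/* The calling thread is the reader: it feeds the queue from `in` and returns
 * the worker's digit count, or -1 if the worker thread could not be started. */
long count_digits_via_queue(FILE *in)
{
    digit_worker w;
    char buf[Q_LINE];
    pthread_t t;
    w.digits = 0;
    queue_init(&w.q);
    if (pthread_create(&t, NULL, digit_thread, &w) != 0)
        return -1;
    while (fgets(buf, sizeof buf, in))
        queue_push(&w.q, buf);
    queue_close(&w.q);
    pthread_join(t, NULL);
    return w.digits;
}
```

The queue copies each line into its own slot, which keeps the data lifecycle simple at the cost of the duplication mentioned above; passing pointers instead would avoid the copy but would require deciding who frees each line.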
Just to point out how many ways such a simple thing can be done, you could go the overengineering route and make each thread generic. Instead of placing data on the queue(s), you place both data (or pointers to data) and function pointers and let each thread execute the callback. This kind of model might make sense if you plan on adding many more kinds of things to compute but want to limit the number of threads you will use.
I don't think you will see much performance difference in using 2 threads over one. Either way, you don't want both threads to read the file. Read the file first, then pass a COPY of the stream to the methods you want and process both. The threads will not have access to the same stream of data at the same time so you'll need to use 2 copies of the textfile.
P.S. It's possible that, depending on the size of the file, you will actually lose performance using 2 threads.

Implementing Hierarchical State Machines in C

I'm a bit confused about how to implement my state machine.
I already know it's hierarchical since some states share the same action.
I determine what I need to do by these parameters:
Class (Values are: Base, Derived, Specific)
OpCode
Parameter 1 - optional
Parameter 2 - optional
My hierarchy is determined by the Class and the OpCode represents the action.
Derived can use the OpCodes of Base and Specific can use OpCodes of both Base and Derived.
The naive implementation is the following:
void (*const state_table[MAX_CLASSES][MAX_OPCODES])(state *) = {
    {base_state1, base_state2, NULL, NULL},
    {base_state1, base_state2, derived_state1, NULL},
    {base_state1, base_state2, derived_state1, specific_state3},
};
void dispatch(state *s)
{
    if (state_table[s->Class][s->OpCode] != NULL)
        state_table[s->Class][s->OpCode](s);
}
This will turn unmaintainable really quick.
Is there another way to map the state to a superclass?
EDIT:
Further calculation leads me to think that I'll probably use most if not all OpCodes, but I will not use all of the Classes available to me.
Another clarification:
Some OpCodes might be shared through multiple derived and base Classes.
For example:
I have a Class called Any, which is a Base class. It has the OpCodes: STATE_ON, STATE_OFF, STATE_SET.
I have another Class called MyGroup, which is a Derived class. It has the OpCodes: STATE_FLIP, STATE_FLOP.
The third Class is a Specific class called ThingInMyGroup, which has the OpCode: STATE_FLIP_FLOP_AND_FLOOP.
So a message with class Any is sent from the server, received in all clients and processed.
A message with class MyGroup is sent from the server, received in all clients and processed only on clients that belong to MyGroup; any OpCodes that are valid for the Any class are valid for the MyGroup class.
A message with class ThingInMyGroup is sent from the server, received in all clients and processed only on clients that belong to MyGroup and are a ThingInMyGroup; any OpCodes that are valid for the Any class and the MyGroup class are valid for the ThingInMyGroup class.
After a message is received the client will ACK/NACK accordingly.
I prefer not to use switch cases or const arrays as they will become unmaintainable when they get bigger.
I need a flexible design that allows me:
To specify which OpCodes are available for each Class.
To specify a superclass for each Class and through that specification to allow me to call the function pointer that is represented by the current OpCode.
There are several ways to deal with this. Here is one:
edit -- with general purpose hierarchy added
typedef unsigned op_code_type;
typedef struct hierarchy_stack hierarchy_stack;
/* Every "class" handler receives the state and the rest of the hierarchy. */
typedef void (*dispatch_type)(state *, hierarchy_stack *);
struct hierarchy_stack {
    dispatch_type func;
    hierarchy_stack *tail;
};
void dispatch(state *s, hierarchy_stack *stk) {
    if (!stk) {
        /* Ran out of classes: nobody handles this opcode. */
        printf("this shouldn't have happened\n");
    } else {
        stk->func(s, stk->tail);
    }
}
void Base(state *s, hierarchy_stack *stk) {
    switch (s->OpCode) {
    case bstate1:
        base_state1(s);
        break;
    case bstate2:
        base_state2(s);
        break;
    default:
        dispatch(s, stk);
    }
}
void Derived(state *s, hierarchy_stack *stk) {
    switch (s->OpCode) {
    case dstate1:
        derived_state1(s);
        break;
    default:
        dispatch(s, stk);
    }
}
...
NOTE : All function calls are tail calls.
This localizes your "class"es a good bit so that if you decide that Derived needs 100 more methods/opcodes then you only have to edit methods and the enum (or whatever) that you use to define opcodes.
Another, more dynamic way, to deal with this would be to have a parent pointer within each "class" that pointed to the "class" that would handle anything that it could not handle.
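A minimal sketch of that parent-pointer idea, using the class and OpCode names from the question. The `state` layout, the `msg_class` struct and the handler functions are hypothetical; real handlers would perform the action instead of only recognising the opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* OpCodes taken from the question's example. */
enum op_code {
    STATE_ON, STATE_OFF, STATE_SET,          /* Any */
    STATE_FLIP, STATE_FLOP,                  /* MyGroup */
    STATE_FLIP_FLOP_AND_FLOOP                /* ThingInMyGroup */
};

/* Hypothetical message layout; the question does not show `state`. */
typedef struct {
    enum op_code op_code;
    const char *handled_by;   /* filled in by dispatch() for illustration */
} state;

typedef struct msg_class msg_class;
struct msg_class {
    const char *name;
    const msg_class *parent;     /* NULL for the root class */
    int (*handle)(state *s);     /* returns 1 if the opcode belongs to it */
};

static int any_handle(state *s)
{
    return s->op_code == STATE_ON || s->op_code == STATE_OFF ||
           s->op_code == STATE_SET;
}

static int my_group_handle(state *s)
{
    return s->op_code == STATE_FLIP || s->op_code == STATE_FLOP;
}

static int thing_handle(state *s)
{
    return s->op_code == STATE_FLIP_FLOP_AND_FLOOP;
}

/* Each class names its superclass once; nothing else encodes the hierarchy. */
static const msg_class Any            = { "Any", NULL, any_handle };
static const msg_class MyGroup        = { "MyGroup", &Any, my_group_handle };
static const msg_class ThingInMyGroup = { "ThingInMyGroup", &MyGroup, thing_handle };

/* Walks from the most specific class up through its parents.
 * Returns 1 (ACK) if some class handled the opcode, 0 (NACK) otherwise. */
int dispatch(const msg_class *cls, state *s)
{
    for (; cls != NULL; cls = cls->parent) {
        if (cls->handle(s)) {
            s->handled_by = cls->name;
            return 1;
        }
    }
    return 0;
}
```

Adding an opcode then means touching one handler, and adding a class means adding one `msg_class` entry with its parent, so the table does not grow as classes times opcodes.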
The 2D table approach is fast and flexible (Derived could have a different handler than Base for opcode 0), but it grows fast.
I wrote a little tool that generates code similar to your naive implementation based on a mini-language. The language just specified the state-opcode-action relationships, all of the actions were just C functions conforming to a typedef.
It didn't handle the HSM aspect, but this would be relatively easy to add to a language.
I'd recommend taking this approach -- create a little language that gives you a clean way to describe the state machine, and then generate code based on that machine description. That way when you need to insert a new state a month from now, the whole thing isn't a tangled mess to edit.
Let me know if you want the code and I'll make sure it's still available somewhere.
