Dealing with database access in transformer stacks - database

This question is about groundhog or persistent, because I believe both share the same problem.
Say I have a transformer Tr m a that provides some functionality f :: Int -> Tr m (). This functionality requires database access. There are a few options I can use here, and none are satisfactory.
I could put a DbPersist transformer somewhere inside of Tr. Actually, I'd need to put it at the top because there are no PersistBackend instances for standard transformers AND I'd still need to write an instance for my Tr newtype. This already sucks because the class is far from minimal. I could also lift every db action I do.
The other option is changing the signature of f to PersistBackend m => Int -> Tr m (). This would again either require a PersistBackend instance on my Tr newtype, or lifting.
Now here's the real issue. How do I run Tr inside of a context that already has a PersistBackend constraint? There's no way to share it with Tr.
I can either do the first option and run the actual DbPersist transformer inside of Tr with some new connection pool (as far as I can tell there's no way to get the pool from the PersistBackend context I'm already in), or I can do the second option and have the run function be runTr :: PersistBackend m => Tr m a -> m a. The second option would actually be completely fine, but the problem is that the DbPersist, which will eventually have to be somewhere in the stack, is now under the Tr transformer, and there are no PersistBackend instances for the standard transformers that Tr is made of.
What's the correct approach here? At the moment it seems that the best option is to go with a separate ReaderT somewhere in the stack that provides me with the connection pool on request, and then do runDbConn with that pool everywhere I want to access the DB. Seeing how DbPersist basically already is just a ReaderT, I don't see the sense in having to do that.

groundhog
I recommend using the latest groundhog from its master branch. The change I'm about to describe appears to have been implemented in Sept. 2015, but no release containing it has made it to Hackage. Still, the authors seem to have tackled this very problem.
On tip, PersistBackend is now a much simpler class to implement, much reduced from the dozens-of-methods-long behemoth it once was:
class (Monad m, Applicative m, Functor m, MonadIO m, ConnectionManager (Conn m), PersistBackendConn (Conn m)) => PersistBackend m where
    type Conn m
    getConnection :: m (Conn m)
instance (Monad m, Applicative m, Functor m, MonadIO m, PersistBackendConn conn) => PersistBackend (ReaderT conn m) where
    type Conn (ReaderT conn m) = conn
    getConnection = ask
They wrote an instance for ReaderT conn m (DbPersist has been deprecated and aliased to ReaderT conn), and you could just as easily write one for Tr (ReaderT conn m) if you choose to go the route of putting ReaderT inside rather than outside. It's not quite an mtl-style monad transformer class, since you would have to write the instance for Tr m instead of Tr, but this and the associated type trick they're using should allow you to use a custom monad stack without too much fuss.
Either option you choose will probably require some lifting. In my personal opinion I would stick ReaderT conn on the very outside of the stack. That way, the mtl helpers can still lift through most of your stack and you can glue on an additional lift to take it home. And, if you were to stick with the version on Hackage, this seems to be the only reasonable option since otherwise you would have the (old) monolithic PersistBackend class.
persistent
Persistent is a little more straightforward: as long as the monad transformer stack contains ReaderT SqlBackend and terminates in IO, you can lift a call to runSqlPool :: MonadBaseControl IO m => ReaderT SqlBackend m a -> Pool SqlBackend -> m a. All Persistent operations are defined to return something of type ReaderT backend m a, so the design sort of just works out.


How do Pact interfaces provide abstraction similar to Haskell type classes?

The Pact documentation mentions that interfaces in Pact allow for abstraction similar to Haskell's type classes:
When a module issues an implements, then that module is said to ‘implement’ said interface, and must provide an implementation. This allows for abstraction in a similar sense to Java’s interfaces, Scala’s traits, Haskell’s typeclasses or OCaML’s signatures. Multiple interfaces may be implemented in a given module, allowing for an expressive layering of behaviors.
I understand that you can write an interface (this corresponds with declaring a class in Haskell), and then you can implement the interface for one or more modules (this corresponds with an instance in Haskell). For example:
-- This corresponds with declaring an interface
class Show a where
    show :: a -> String
-- This corresponds with implementing an interface
instance Show Bool where
    show True = "True"
    show False = "False"
However, the great value of a Haskell type class is that you can then abstract over it. You can write a function that takes any value so long as it is an instance of Show:
myFunction :: (Show a) => a -> ...
myFunction = ...
What is the corresponding concept in Pact? Is there a way to accept any module as a dependency, so long as it implements the interface? If not, how does this open up abstraction "in a similar sense to Haskell's type classes"?
I think your question may be conflating typeclasses with type variables and universal quantification. Typeclasses give you a common interface like show that can be used on any type (or in this case, module) that supports them. Universal quantification lets you write generic algorithms that work for any Show instance.
Pact provides the former, but not the latter. The main utility is in giving your module a template to work against, and anyone who knows the interface will be able to use your module. This makes "swapping implementations" possible, but doesn't open the door to "generic algorithms". For that we'd need some way of saying "For all modules that implement interface"...
UPDATE: As per Stuart Popejoy's comment, this sort of abstraction can indeed be achieved using modrefs. Here is an example of a module that implements a generic method over any module implementing a certain interface:
(interface iface
  (defun op:integer (arg:string)))
(module impl1 G
  (defcap G () true)
  (implements iface)
  (defun op:integer (arg:string) (length arg)))
(module impl2 G
  (defcap G () true)
  (implements iface)
  (defun op:integer (arg:string) -1))
(module app G
  (defcap G () true)
  (defun go:integer (m:module{iface} arg:string)
    (m::op arg)))
(expect "impl1" 5 (app.go impl1 "hello"))
(expect "impl2" -1 (app.go impl2 "hello"))
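For readers more familiar with mainstream languages, the same shape can be sketched in Python. This is only an analogy under stated assumptions: IFace, Impl1, Impl2 and go are illustrative names mirroring the Pact example, and a structural Protocol stands in for the Pact interface. The function go plays the role of app.go: it accepts any object that provides op, much like the m:module{iface} parameter accepts any module implementing iface.

```python
from typing import Protocol


class IFace(Protocol):
    """Stands in for the Pact interface `iface` (illustrative name)."""

    def op(self, arg: str) -> int: ...


class Impl1:
    """Counterpart of module impl1: returns the length of the argument."""

    def op(self, arg: str) -> int:
        return len(arg)


class Impl2:
    """Counterpart of module impl2: always returns -1."""

    def op(self, arg: str) -> int:
        return -1


def go(m: IFace, arg: str) -> int:
    # Generic over every implementation of IFace, like app.go over modrefs.
    return m.op(arg)
```

As in the Pact version, go(Impl1(), "hello") evaluates to 5 and go(Impl2(), "hello") evaluates to -1.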

Does blockwise allow iteration over out-of-core arrays?

The blockwise docs mention that with concatenate=False:
In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index.
My question then is whether or not there is a fundamental limitation that would prohibit this "iterable of blocks" from loading the blocks one at a time rather than keeping them all in a list (i.e. in memory). Is this possible? It does not look like blockwise works this way now, but I am wondering if it could:
import dask.array as da
import numpy as np
import operator
# Create an array and write to disk
x = da.random.random(size=(10, 6), chunks=(5, 3))
da.to_zarr(x, '/tmp/x.zarr', overwrite=True)
x = da.from_zarr('/tmp/x.zarr')
y = x.T
def fn(x, y):
    # With concatenate=False, x and y arrive as lists of blocks along the contracted index j
    print(type(x), type(x[0]))
    x = np.concatenate(x, axis=1)
    y = np.concatenate(y, axis=0)
    return np.matmul(x, y)
da.blockwise(fn, 'ik', x, 'ij', y, 'jk', concatenate=False, dtype='float').compute(scheduler='single-threaded')
# <class 'list'> <class 'numpy.ndarray'>
Is it possible for these lists to be generators instead?
This was true very early on in Dask, but we switched to concrete lists eventually. Today a task does not start until all of its dependency tasks are available in memory.
Given the context of your question I'm guessing that you're running up against memory issues with tensordot style applications. The memory use of tensordot style applications depends heavily on chunk structure. I encourage you to look at this issue, and especially at the talk referenced in the first post: https://github.com/dask/dask/issues/2225
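To make concrete what the question is asking for, here is a minimal pure-Python sketch of a contraction that pulls one pair of blocks at a time from an iterator instead of receiving full lists. It does not use Dask and does not reflect how blockwise works today (as noted above, a task only starts once all of its dependencies are in memory). The helpers matmul, add, contract and load_blocks are hypothetical names, and blocks are plain nested lists standing in for NumPy arrays.

```python
def matmul(a, b):
    """Multiply two small dense blocks given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped blocks."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def contract(x_blocks, y_blocks):
    """Sum x_j @ y_j over the contracted index j, one block pair at a time."""
    total = None
    for xb, yb in zip(x_blocks, y_blocks):
        part = matmul(xb, yb)
        total = part if total is None else add(total, part)
    return total


def load_blocks(blocks, log):
    """Hypothetical lazy loader: yields blocks one by one and records each load."""
    for i, block in enumerate(blocks):
        log.append(i)
        yield block
```

Because contract only iterates, it accepts lists and generators alike. Summing the per-block products gives the same result as concatenating the blocks first and multiplying once, which is what fn in the question does.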

Theano: Restoring broadcastable settings after dense -> sparse -> dense transformation

Background: I'm working on a project that historically has relied on sparse matrices for a lot of the math, and developing a plugin to outsource some of the heavy lifting to theano. Since theano's sparse support is limited, we're building a dense version first -- but hopefully that explains why we're interested in the approach below.
The task: apply some operator to only the nonzero values of a matrix.
The following subroutine works most of the time:
import theano.sparse as TS
import theano.sparse.basic as TSB
def _applyOpToNonzerosOfDense(self,op,expr):
    # Round-trip through CSR so that op only sees the stored (nonzero) values
    sparseExpr = TSB.clean(TSB.csr_from_dense(expr))
    newData = op(TSB.csm_data(sparseExpr)).flatten()
    newSparse = TS.CSR(newData, \
        TSB.csm_indices(sparseExpr), \
        TSB.csm_indptr(sparseExpr), \
        TSB.csm_shape(sparseExpr))
    ret = TSB.dense_from_sparse(newSparse)
    return ret
The problem comes when expr is not a canonical matrix tensor, but a row tensor (so, expr is 1xN and expr.broadcastable is (True, False)). When that happens, we need to be able to retain or restore the broadcast status in the returned tensor.
Some things I've tried that don't work:
dense_from_sparse doesn't support broadcastable settings
Theano 0.9 doesn't support assignment to ret.broadcastable
ret.dimshuffle( ('x',1) ) fails with "You cannot drop a non-broadcastable dimension."
ret has (ought to have) exactly the same shape as expr, so I wasn't expecting this to be hard. How do I get my broadcast settings back?
LOL, it's in the API: T.addbroadcast(x, *axes). For the row case above, T.addbroadcast(ret, 0) marks axis 0 of ret as broadcastable again.

Sqlite with OCaml

I'm sorry for my bad English; if something is not clear, please ask me and I will explain.
My goal is to build a back end in OCaml to start to "play seriously" with this language. I chose a back-end project because I also want to build a front end in React to improve my React skills (I use OCaml out of passion, and React for my job; I'm a web developer).
I chose SQLite (with this lib: http://mmottl.github.io/sqlite3-ocaml/api/Sqlite3.html) as the db to avoid db configuration.
My idea is to make a little wrapper for db calls (so if I choose to change the db type I only need to change the wrapper), with a function like this:
val exec_query : query -> 'a list Deferred.t = <fun>
but in the lib I see this signature for the exec function:
val exec : db -> ?cb:(row -> headers -> unit) -> string -> Rc.t = <fun>
The result is passed row by row to callback, but for my purpose I think I need to have some kind of object (list, array, etc.), but I have no idea how to make it from this function.
Can someone suggest how to proceed?
I guess you want val exec_query : query -> row list Deferred.t. Since Sqlite3 does not know about Async, you want to execute the call returning the list of rows in a separate system thread. The function In_thread.run : (unit -> 'a) -> 'a Deferred.t (optional args removed from signature) is the function to use for that. Thus you want to write (untested):
let exec_query db query =
  let rows_of_query () =
    let rows = ref [] in
    let rc = Sqlite3.exec_no_headers db query
               ~cb:(fun r -> rows := r :: !rows) in
    (* Note: you want to inspect rc to handle errors *)
    ignore rc;
    (* Rows were prepended, so reverse to restore query order *)
    List.rev !rows in
  In_thread.run rows_of_query
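The same pattern (collect callback rows into a list, and do it on a worker thread so the caller gets a future) can be sketched with Python's standard library. This is an analogy, not the OCaml API: exec_with_callback is a hypothetical stand-in for a callback-style exec, and a Future from concurrent.futures plays the role of Deferred.t.

```python
import sqlite3
from concurrent.futures import Future, ThreadPoolExecutor

# Single worker thread, standing in for Async's In_thread pool.
_executor = ThreadPoolExecutor(max_workers=1)


def exec_with_callback(conn, sql, cb):
    """Hypothetical callback-style exec: calls cb once per result row."""
    for row in conn.execute(sql):
        cb(row)


def exec_query(conn, sql) -> Future:
    """Run the query on the worker thread and resolve to the list of rows."""

    def rows_of_query():
        rows = []
        # The callback appends each row, turning the stream into a list.
        exec_with_callback(conn, sql, rows.append)
        return rows

    return _executor.submit(rows_of_query)
```

The sketch assumes the connection was opened with check_same_thread=False, because the query runs on a different thread from the one that created the connection. Calling .result() on the returned future blocks until the rows are available.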

How to cancel individual async computation, being run in parallel with others, from a button click event

I've prepared the following WinForms code to be as simple as possible to help answer my question. You can see I have a start button which sets up and runs 3 different async computations in parallel which each do some work and then update labels with a result. I have 3 cancel buttons corresponding to each async computation being run in parallel. How can I wire up these cancel buttons to cancel their corresponding async computations, while allowing the others to continue running in parallel? Thanks!
open System.Windows.Forms
type MyForm() as this =
    inherit Form()
    let lbl1 = new Label(AutoSize=true, Text="Press Start")
    let lbl2 = new Label(AutoSize=true, Text="Press Start")
    let lbl3 = new Label(AutoSize=true, Text="Press Start")
    let cancelBtn1 = new Button(AutoSize=true,Enabled=false, Text="Cancel")
    let cancelBtn2 = new Button(AutoSize=true,Enabled=false, Text="Cancel")
    let cancelBtn3 = new Button(AutoSize=true,Enabled=false, Text="Cancel")
    let startBtn = new Button(AutoSize=true,Text="Start")
    let panel = new FlowLayoutPanel(AutoSize=true, Dock=DockStyle.Fill, FlowDirection=FlowDirection.TopDown)
    do
        panel.Controls.AddRange [|startBtn; lbl1; cancelBtn1; lbl2; cancelBtn2; lbl3; cancelBtn3; |]
        this.Controls.Add(panel)
        startBtn.Click.Add <| fun _ ->
            startBtn.Enabled <- false
            [lbl1;lbl2;lbl3] |> List.iter (fun lbl -> lbl.Text <- "Loading...")
            [cancelBtn1;cancelBtn2;cancelBtn3] |> List.iter (fun cancelBtn -> cancelBtn.Enabled <- true)
            let guiContext = System.Threading.SynchronizationContext.Current
            let work (timeout:int) = //work is not aware it is being run within an async computation
                System.Threading.Thread.Sleep(timeout)
                System.DateTime.Now.Ticks |> string
            let asyncUpdate (lbl:Label) (cancelBtn:Button) timeout =
                async {
                    let result = work timeout //"cancelling" means forcibly aborting, since work may be stuck in an infinite loop
                    do! Async.SwitchToContext guiContext
                    cancelBtn.Enabled <- false
                    lbl.Text <- result
                }
            let parallelAsyncUpdates =
                [|asyncUpdate lbl1 cancelBtn1 3000; asyncUpdate lbl2 cancelBtn2 6000; asyncUpdate lbl3 cancelBtn3 9000;|]
                |> Async.Parallel
                |> Async.Ignore
            Async.StartWithContinuations(
                parallelAsyncUpdates,
                (fun _ -> startBtn.Enabled <- true),
                (fun _ -> ()),
                (fun _ -> ()))
Cancelling threads un-cooperatively is generally a bad practice, so I wouldn't recommend doing that. See for example this article. It can be done when you're programming with Thread directly (using Thread.Abort), but none of the modern parallel/asynchronous libraries for .NET (such as TPL or F# Async) use this. If that's really what you need, then you'll have to use threads explicitly.
A better option is to change the work function so that it can be cooperatively cancelled. In F#, this really just means wrapping it inside async and using let! or do!, because this automatically inserts support for cancellation. For example:
let work (timeout:int) = async {
    do! Async.Sleep(timeout)
    return System.DateTime.Now.Ticks |> string }
Without using async (e.g. if the function is written in C#), you could pass around a CancellationToken value and use it to check if cancellation was requested (by calling ThrowIfCancellationRequested). Then you can start the three computations using Async.Start (creating a new CancellationTokenSource for each of the computations).
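The idea of one cancellation handle per computation can be sketched outside .NET as well. The following Python analogy assumes a threading.Event as a stand-in for a CancellationToken; work and start_all are hypothetical names, and the short wait stands in for a slice of real work. Each job checks only its own flag, so cancelling one leaves the others running.

```python
import threading


def work(steps: int, cancel: threading.Event) -> str:
    """Cooperative worker: checks its own cancel flag between steps."""
    for _ in range(steps):
        if cancel.is_set():
            return "cancelled"
        cancel.wait(0.01)  # stand-in for a slice of real work
    return "done"


def start_all(step_counts):
    """Start one thread per job, each with its own cancel flag."""
    results = {}
    jobs = []
    for i, steps in enumerate(step_counts):
        cancel = threading.Event()

        # Default arguments bind the current loop values to this job.
        def run(i=i, steps=steps, cancel=cancel):
            results[i] = work(steps, cancel)

        thread = threading.Thread(target=run)
        thread.start()
        jobs.append((thread, cancel))
    return jobs, results
```

A cancel button for job 1 would simply call jobs[1][1].set(); jobs 0 and 2 are unaffected and run to completion.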
To do something once they all complete, I would probably create a simple agent (one that triggers some event once it has received a specified number of messages). I don't think there is any more direct way to do that (because Async.Parallel uses the same cancellation token for all workflows).
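Such a counting agent might look roughly like the following Python sketch. It is not an F# MailboxProcessor: CompletionAgent is a hypothetical class in which a queue plays the mailbox and a threading.Event plays the "all finished" event. Each computation would post one message when it finishes or is cancelled.

```python
import queue
import threading


class CompletionAgent:
    """Hypothetical agent: sets `all_done` after `expected` messages arrive."""

    def __init__(self, expected: int):
        self.expected = expected
        self.inbox = queue.Queue()
        self.all_done = threading.Event()
        self.received = []
        # Daemon thread so a never-completed agent does not block exit.
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def post(self, message):
        """Called by a computation when it finishes or is cancelled."""
        self.inbox.put(message)

    def _loop(self):
        # Process messages one at a time until the expected count is reached.
        while len(self.received) < self.expected:
            self.received.append(self.inbox.get())
        self.all_done.set()
```

The GUI code could then wait on all_done (or have the agent invoke a callback at that point) to re-enable the start button.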
So, I guess that the point of this answer is - if work is meant to be cancelled, then it should be aware of the situation, so that it can deal with it appropriately.
As Tomas mentioned, forcibly stopping a thread is a bad idea, and designing something that has to be stoppable without knowing it runs on a thread raises flags in my mind. That said, if you are doing a long calculation over some data structure, such as a 2D or 3D array, one option would be to set that structure to null from outside. This violates many concepts of F#, though: what your function works on should ideally be immutable, and if there is an array that is going to be changed, nothing else should be changing it.
For example, I once had to stop a thread that was processing a file. The file couldn't be deleted because it was open, but I was able to open it in Notepad, delete all the content, and save it, and the thread crashed.
So, you may want to do something like this in order to accomplish your goal, but I would suggest that you re-evaluate your design and see if there is a better way to do this.
