What ORMs work well with Scala? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm about to write a Scala command-line application that relies on a MySQL database. I've been looking around for ORMs, and am having trouble finding one that will work well.
The Lift ORM looks nice, but I'm not sure it can be decoupled from the entire Lift web framework. ActiveObjects also looks OK, but the author says that it may not work well with Scala.
I'm not coming to Scala from Java, so I don't know all the options. Has anyone used an ORM with Scala, and if so, what did you use and how well did it work?

There are several reasons why JPA-oriented frameworks (Hibernate, for instance) do not fit into idiomatic Scala applications elegantly:
there are no nested annotations, as the Scala 2.8 Preview states -- which means you cannot use annotations as mapping metadata for complex applications (even the simplest ones often need @JoinTable -> @JoinColumn);
inconsistencies between Scala and Java collections force developers to convert collections; there are also cases where it is impossible to map Scala collections to associations without implementing complex interfaces of the underlying framework (Hibernate's PersistentCollections, for example) -- see the sketch after this list;
some very common features, such as domain model validation, require JavaBeans conventions on persistent classes -- this is not quite the "Scala way" of doing things;
of course, the interop problems (like raw types or proxies) introduce a whole new level of issues that cannot be worked around easily.
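To illustrate the collections point: Hibernate manages java.util collections, so idiomatic Scala code ends up converting at every boundary. A minimal sketch (the Order and items names are made up for illustration):

import scala.collection.JavaConverters._

// a Hibernate-mapped entity has to expose Java collections for the ORM to manage:
class Order {
  var items: java.util.List[String] = new java.util.ArrayList[String]()
}

val order = new Order
// Scala code must convert explicitly in both directions:
val scalaItems: List[String] = order.items.asScala.toList
order.items = List("widget", "gadget").asJava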
There are more reasons, I'm sure. That's why we have started the Circumflex ORM project. This pure-Scala ORM tries its best to eliminate the nightmares of classic Java ORMs. Specifically, you define your entities in pretty much the same way you would with classic DDL statements:
class User extends Record[User] {
  val name = "name".TEXT.NOT_NULL
  val admin = "admin".BOOLEAN.NOT_NULL.DEFAULT(false)
}
object User extends Table[User] {
  def byName(n: String): Seq[User] = criteria.add(this.name LIKE n).list
}
// example with foreign keys:
class Account extends Record[Account] {
  val accountNumber = "acc_number".BIGINT.NOT_NULL
  val user = "user_id".REFERENCES(User).ON_DELETE(CASCADE)
  val amount = "amount".NUMERIC(10, 2).NOT_NULL
}

object Account extends Table[Account]
As you can see, these declarations are a bit more verbose than classic JPA POJOs. But in fact several concepts are assembled together:
the precise DDL for generating schema (you can easily add indexes, foreign keys and other stuff in the same DSL-like fashion);
all queries can be assembled inside that "table object" instead of being scattered around in DAOs; the queries themselves are very flexible -- you can store query objects, predicates, projections, subqueries and relation aliases in variables to reuse them, and even build batch update operations from existing queries (insert-select, for example); see the usage sketch after this list;
transparent navigation between associations (one-to-one, many-to-one, one-to-many and many-to-many-through-intermediate-relation) can be achieved either by lazy or by eager fetching strategies; in both cases the associations are established on top of the foreign keys of underlying relations;
validation is part of the framework;
there is also a Maven2 plugin that allows generating schema and importing initial data from handy XML formatted files.
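For instance, a finder assembled once inside the table object can be stored and reused anywhere. A minimal usage sketch, relying only on the byName method declared on the User object above:

// reusing the query defined inside the User table object:
val joes: Seq[User] = User.byName("jo%")
println(joes.size + " users match")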
The only things Circumflex ORM lacks are:
multi-column primary keys (it is possible to create multi-column foreign keys backed by multi-column unique constraints, but only for data integrity);
full-fledged documentation (although we are actively working on it);
success stories of ten-billion-dollar production systems that have Circumflex ORM as their core technology.
P.S. I hope this post will not be considered an advertisement. It really isn't -- I was trying to be as objective as possible.

I experimented with EclipseLink JPA and basic operations worked fine for me. JPA is a Java standard and there are other implementations that may also work (OpenJPA, etc). Here is an example of what a JPA class in Scala looks like:
import javax.persistence.Entity
import javax.persistence.GeneratedValue
import javax.persistence.Id

@Entity(name = "Users")
class User {
  @Id
  @GeneratedValue
  var userid: Long = _
  var login: String = _
  var password: String = _
  var firstName: String = _
  var lastName: String = _
}

Slick is a perfect match for a functional world. Traditional ORMs are not a perfect fit for Scala. Slick composes well and uses a DSL that mimics Scala collection classes and for comprehensions.
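For illustration, here is a minimal Slick sketch (Slick 3.x syntax; the users table and its columns are assumptions made up for this example):

import slick.jdbc.MySQLProfile.api._

// an illustrative table definition:
class Users(tag: Tag) extends Table[(Long, String, Boolean)](tag, "users") {
  def id    = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name  = column[String]("name")
  def admin = column[Boolean]("admin")
  def *     = (id, name, admin)
}
val users = TableQuery[Users]

// queries compose like Scala collections, including for comprehensions:
val adminNames = for (u <- users if u.admin) yield u.name

The query above is only a composable description of SQL; nothing runs until it is turned into an action and executed against a database.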

I am happy to announce the first release of a new ORM library for Scala. MapperDao maps domain classes to database tables. It currently supports MySQL and PostgreSQL (an Oracle driver will be available soon); one-to-one, many-to-one, one-to-many and many-to-many relationships; autogenerated keys; and transactions; and it optionally integrates nicely with the Spring framework. It allows freedom in the design of the domain classes, which are not affected by persistence details; it encourages immutability and is type safe. The library is not based on reflection but rather on good Scala design principles, and it contains a DSL to query data which closely resembles select queries. It doesn't require implementing equals() or hashCode() methods, which can be problematic for persisted entities. Mapping is done in type-safe Scala code.
Details and usage instructions can be found at the mapperdao's site:
http://code.google.com/p/mapperdao/
The library is available for download on the above site and also as a Maven dependency (the documentation contains details on how to use it via Maven).
Examples can be found at:
https://code.google.com/p/mapperdao-examples/
Very brief introduction of the library via code sample:
class Product(val name: String, val attributes: Set[Attribute])
class Attribute(val name: String, val value: String)
...
val product = new Product("blue jean", Set(new Attribute("colour", "blue"), new Attribute("size", "medium")))
val inserted = mapperDao.insert(ProductEntity, product)
// the persisted entity has an id property:
println("%d : %s".format(inserted.id,inserted))
Querying is very familiar:
val o = OrderEntity
import Query._
val orders = query(select from o where o.totalAmount >= 20.0 and o.totalAmount <= 30.0)
println(orders) // a list of orders
I encourage everybody to use the library and give feedback. The documentation is currently quite extensive, with setup and usage instructions. Please feel free to comment and get in touch with me at kostas dot kougios at googlemail dot com.
Thanks,
Kostantinos Kougios

Here's basically the same example with the @Column annotation:
/*
Corresponding table:

CREATE TABLE `users` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255) default NULL,
  `admin` tinyint(1) default '0',
  PRIMARY KEY (`id`)
)
*/
import _root_.javax.persistence._

@Entity
@Table(name = "users")
class User {
  @Id
  @Column(name = "id")
  var id: Long = _

  @Column(name = "name")
  var name: String = _

  @Column(name = "admin")
  var isAdmin: Boolean = _

  override def toString = "UserId: " + id + " isAdmin: " + isAdmin + " Name: " + name
}

Of course, any Java database access framework will work in Scala as well, with the usual issues you may encounter, such as collection conversions, etc. jOOQ, for instance, has been observed to work well in Scala. An example of jOOQ code in Scala is given in the manual:
object Test {
  def main(args: Array[String]): Unit = {
    val c = DriverManager.getConnection("jdbc:h2:~/test", "sa", "")
    val f = new Factory(c, SQLDialect.H2)
    val x = T_AUTHOR as "x"

    for (r <- f
        select (
          T_BOOK.ID * T_BOOK.AUTHOR_ID,
          T_BOOK.ID + T_BOOK.AUTHOR_ID * 3 + 4,
          T_BOOK.TITLE || " abc" || " xy"
        )
        from T_BOOK
        leftOuterJoin (
          f select (x.ID, x.YEAR_OF_BIRTH)
          from x
          limit 1
          asTable x.getName()
        )
        on T_BOOK.AUTHOR_ID === x.ID
        where (T_BOOK.ID <> 2)
        or (T_BOOK.TITLE in ("O Alquimista", "Brida"))
        fetch
    ) {
      println(r)
    }
  }
}
Taken from
http://www.jooq.org/doc/2.6/manual/getting-started/jooq-and-scala/

Related

CakePHP 3.3 tables dedicated for different data based on the selected language

I have a non-standard question about CakePHP 3.3. Imagine that in my database I have two tables: A and B (both are identical; the first holds data in the first language, the second holds data in the second language).
I have correctly coded the whole website against table A (table B is not yet in use). Additionally, I implemented the .po files mechanism to switch the language of the interface. The language of the interface switches correctly.
How can I easily plug in table B? I do not want to add IF-ELSE statements everywhere, because the website is getting big and there are already many operations on table A. Is there a way to make a simple mapping so that table A is used when en_US is selected and table B when pl_PL is selected (through the .po files)?
The simplest option that comes to my mind would be to inject the current locale into your existing table class, and have it set the database table name accordingly.
Let's assume your existing table class would be called SomeSharedTable, this could look something along the lines of:
// ...
class SomeSharedTable extends Table
{
    public function initialize(array $config)
    {
        if (!isset($config['locale'])) {
            throw new \InvalidArgumentException('The `locale` config key is missing');
        }

        $table = 'en_table';
        if ($config['locale'] === 'pl_PL') {
            $table = 'pl_table';
        }
        $this->table($table);

        // ...
    }

    // ...
}
And before your application code involves the model layer (and after it sets the locale, of course -- that might for example be in your bootstrap), configure the alias that you're using throughout your application (for this example we assume that the alias matches the table name):
\Cake\ORM\TableRegistry::config('SomeShared', [
    'locale' => \Cake\I18n\I18n::locale()
]);
Given that the locale might not make it into the class for whatever reason, you should implement some safety measures; I've added that basic isset() check for example purposes. And since a wrongly configured table class could cause quite some problems, you probably want checks that are a little more sophisticated.

Addressing a database as a multi-level associative array

I'm writing a web game, where it's very convenient to think about the complete game state as a single arbitrary-level hash.
A couple examples of the game state being updated:
// Defining a mission card that could be issued to a player
$game['consumables']['missions']['id04536']['name'] = "Build a Galleon for Blackbeard";
$game['consumables']['missions']['id04536']['requirements']['lumber'] = 20;
$game['consumables']['missions']['id04536']['requirements']['iron'] = 10;
$game['consumables']['missions']['id04536']['rewards']['blackbeard_standing'] = 5;
// When a player turns in this mission card with its requirements
$game['players']['id3214']['blackbeard_standing'] += 5;
This is a web game, so storing the information in a database makes sense. I need the features of a database: That the game state can be accessed from multiple browsers at the same time, that it's non-volatile and easy to back up, etc.
Essentially, I want syntax as easy as reading from and writing to an associative array of arbitrary depth. I need all the functionality of dealing with an associative array: not just simple reads and writes, but the ability to run foreach loops against it, and so on. And the effect must actually be to perform all reads and writes against a database, not volatile memory.
I'm personally fond of raw Ruby, but if there's a specific language or framework that gives me this one feature, it would make the rest of this project easy enough to be worth using.
Any language or framework? How about Python + SQLAlchemy + PostgreSQL:
Python, because you can easily create new types that behave for all the world like regular dicts;
PostgreSQL, because it has two particularly interesting types uncommon in other SQL databases -- we'll get to those in a moment;
SQLAlchemy, because it can do all of the dirty work of dealing with an RDBMS concisely.
Using a SQL database like this is awkward, because the normal 'key' you would want in a table has to be a fixed set of columns, so ideally you'd need a single column holding the whole "path" into the deeply nested mapping.
An even more irritating problem is that you seem to want to store a range of different types at the leaves. This is not ideal.
Fortunately, PostgreSQL can help us out with both issues. Using a TEXT[] column for the first, every entry can concisely represent the whole path, all the way down the tree. For the second, we can use JSON, which is exactly what it sounds like, permitting arbitrary JSON-encodable types -- significantly including both strings and numbers, as in your code example.
Because I am lazy, I'll use SQLAlchemy to do most of the work. First, we need a table using the above types:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects import postgres as pg

Base = declarative_base()

class AssocStorage(Base):
    __tablename__ = 'assoc_storage'
    key = Column(pg.ARRAY(pg.TEXT, as_tuple=True), primary_key=True)
    value = Column(pg.JSON)
That will give us a relational version of a single entry in a deeply nested mapping. We're most of the way there already:
>>> engine = create_engine('postgres:///database')
>>> Base.metadata.create_all(engine)  ## beware, extreme laziness
>>> session = Session(bind=engine)  ## also unusually lazy; real applications should use `sessionmaker`
>>> session.add(AssocStorage(key=('foo','bar'), value=5))
>>> session.commit()
>>> x = session.query(AssocStorage).get((('foo', 'bar'),))
>>> x.key
(u'foo', u'bar')
>>> x.value
5
Okay, not too bad, but this is a little annoying to use. Like I mentioned earlier, the Python type system is compliant enough to make this look like a normal dict; we need only write a class that implements the proper protocol:
import collections

class PersistentDictView(collections.MutableMapping):
    # a "production grade" version should actually implement these:
    __delitem__ = __iter__ = __len__ = NotImplemented

    def __init__(self, session):
        self.session = session

    def __getitem__(self, key):
        return self.session.query(AssocStorage).get((key, )).value

    def __setitem__(self, key, value):
        existing_item = self.session.query(AssocStorage).get((key, ))
        if existing_item is None:
            existing_item = AssocStorage(key=key)
            self.session.add(existing_item)
        existing_item.value = value
This is slightly different from the code you've posted: where you have x[a][b][c], this requires x[a, b, c].
>>> d = PersistentDictView(session)
>>> d['foo', 'bar', 'baz'] = 5
>>> d['foo', 'bar', 'baz']
5
>>> d['foo', 'bar', 'baz'] += 5
>>> session.commit()
>>> d['foo', 'bar', 'baz']
10
If you really need the nesting for some reason, you could get that behavior with a bit more work. Additionally, this totally punts on transaction management; notice the explicit session.commit() above.
MongoDB (plus its Ruby gem) was the best answer I found, and it is easy to install and learn.
It gets me 90% of the way to treating the entire game state as a multi-level hash. Close enough. I can treat the collections as the first hash level, and use them to store and retrieve multilevel hashes.

Does the NDB membership query ("IN" operation) performance degrade with lots of possible values?

The documentation for the IN query operation states that those queries are implemented as a big OR'ed equality query:
qry = Article.query(Article.tags.IN(['python', 'ruby', 'php']))
is equivalent to:
qry = Article.query(ndb.OR(Article.tags == 'python',
                           Article.tags == 'ruby',
                           Article.tags == 'php'))
I am currently modelling some entities for a GAE project and plan on using these membership queries with a lot of possible values:
qry = Player.query(Player.facebook_id.IN(list_of_facebook_ids))
where list_of_facebook_ids could have thousands of items.
Will this type of query perform well with thousands of possible values in the list? If not, what would be the recommended approach for modelling this?
This won't work with thousands of values (in fact, I bet it starts degrading with more than 10). The only alternative I can think of is some form of precomputation. You'll have to change your schema.
One way you can do it is to create a new model called FacebookPlayer, which acts as an index. It is keyed by facebook_id, and you update it whenever you add a new player. It looks something like this:
class FacebookPlayer(ndb.Model):
    player = ndb.KeyProperty(kind='Player', required=True)
Now you can avoid queries altogether. You can do this:
# Build keys from facebook ids.
facebook_id_keys = []
for facebook_id in list_of_facebook_ids:
    facebook_id_keys.append(ndb.Key('FacebookPlayer', facebook_id))

keysOfUsersMatchedByFacebookId = []
for facebook_player in ndb.get_multi(facebook_id_keys):
    if facebook_player:
        keysOfUsersMatchedByFacebookId.append(facebook_player.player)

usersMatchedByFacebookId = ndb.get_multi(keysOfUsersMatchedByFacebookId)
If list_of_facebook_ids is thousands of items, you should do this in batches.

Lightweight Groovy persistence

What are some lightweight options for persistence in Groovy? I've considered serialization and XML so far, but I want something a bit more robust than those, at least so I don't have to rewrite the entire file every time. Ideally, it would:
Require no JARs in classpath, using Grapes instead
Require no external processes, administration, or authentication (so all embedded)
Support locking
I plan on using it to cache some information between runs of a standalone Groovy script. I imagine responses will focus around SQL and NoSQL databases. Links to pages demonstrating this usage would be appreciated. Thanks!
Full SQL Database
The H2 in-process SQL database is very easy to use. It is the same database engine Grails uses by default, but it's simple to use from a Groovy script as well:
@GrabConfig(systemClassLoader=true)
@Grab(group='com.h2database', module='h2', version='1.3.167')
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:h2:hello", "sa", "sa", "org.h2.Driver")
sql.execute("create table test (id int, value text)")
sql.execute("insert into test values(:id, :value)", [id: 1, value: 'hello'])
println sql.rows("select * from test")
In this case the database will be saved to a file called hello.h2.db.
Simple Persistent Maps
Another alternative is jdbm, which provides disk-backed persistent maps. Internally, it uses Java's serialization. The programming interface is much simpler, but it's also much less powerful than a full-blown SQL db. There's no support for concurrent access, but it is synchronized and thread safe, which may be enough depending on your locking requirements. Here's a simple example:
@Grab(group='org.fusesource.jdbm', module='jdbm', version='2.0.1')
import jdbm.*
def recMan = RecordManagerFactory.createRecordManager('hello')
def treeMap = recMan.treeMap("test")
treeMap[1] = 'hello'
treeMap[100] = 'goodbye'
recMan.commit()
println treeMap
This will save the map to a set of files.
Just a little Groovy update on simple persistence using JDBM: concurrent access is now supported, and the name has changed from JDBM4 to MapDB.
@Grab(group='org.mapdb', module='mapdb', version='0.9.3')
import java.util.concurrent.ConcurrentNavigableMap
import org.mapdb.*

DB db = DBMaker.newFileDB(new File("myDB.file"))
        .closeOnJvmShutdown()
        .make()

ConcurrentNavigableMap<String,String> map = db.getTreeMap("myMap")
map.put("1", "one")
map.put("2", "two")
db.commit()

println "keySet " + map.keySet()
assert map.get("1") == "one"
assert map.get("2") == "two"
db.close()
Chronicle Map is a persisted ConcurrentMap implementation for the JVM.
Usage example:
import java.util.concurrent.ConcurrentMap
import net.openhft.chronicle.map.ChronicleMap

ConcurrentMap<String, String> store = ChronicleMap
    .of(String.class, String.class)
    .averageKey("cachedKey").averageValue("cachedValue")
    .entries(10_000)
    .createPersistedTo(new File("cacheFile"))

store.put("foo", "bar")
store.close()
I am a little late to the party, but for the sake of posterity, here is one more option:
gstorm
A simple ORM for databases and CSV files, intended to be used in Groovy scripts and small projects.
Disclosure: author here :)

Understanding ListProperty backend behavior in GAE

I'm trying to understand how you're supposed to access items in a GAE db.ListProperty(db.Key).
Example:
A Magazine db.Model entity has a db.ListProperty(db.Key) that contains 10 Article entities. I want to get the Magazine object and display the Article names and dates. Do I make 10 queries for the actual Article objects? Do I do a batch query? What if there are 50 articles? (Don't batch queries rely on the IN operator, which is limited to 30 or fewer elements?)
So you are describing something like this:
class Magazine(db.Model):
    ArticleList = db.ListProperty(db.Key)

class Article(db.Model):
    ArticleName = db.StringProperty()
    ArticleDate = db.DateProperty()
In this case the simplest way to grab the listed articles is to use the Model.get() class method, which accepts a list of keys:
m = Magazine.all().get()  # grab the first record
articles = Article.get(m.ArticleList)  # get Articles using the key list
for a in articles:
    name = a.ArticleName
    date = a.ArticleDate
    # do something with this data
Depending on how you plan on working with the data you may be better off adding a Magazine reference property to your Article entities instead.
You should read Modeling Entity Relationships, especially the part about one-to-many relationships.
