What are some lightweight options for persistence in Groovy? I've considered serialization and XML so far, but I want something a bit more robust than those, at least something that doesn't require rewriting the entire file every time. Ideally, it would:
Require no JARs in classpath, using Grapes instead
Require no external processes, administration, or authentication (so all embedded)
Support locking
I plan on using it to cache some information between runs of a standalone Groovy script. I imagine responses will focus on SQL and NoSQL databases. Links to pages demonstrating this usage would be appreciated. Thanks!
Full SQL Database
The H2 in-process SQL database is very easy to use. It is the same database engine Grails uses by default, but it's simple to use in a Groovy script as well:
@GrabConfig(systemClassLoader=true)
@Grab(group='com.h2database', module='h2', version='1.3.167')
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:h2:hello", "sa", "sa", "org.h2.Driver")
sql.execute("create table test (id int, value text)")
sql.execute("insert into test values(:id, :value)", [id: 1, value: 'hello'])
println sql.rows("select * from test")
In this case the database will be saved to a file called hello.h2.db.
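Because everything is stored in that one file, a later run of the script can simply reconnect and read the cached data back. A minimal sketch, reusing the table, credentials and Grape coordinates from the example above:
// second run: reopen hello.h2.db and read the cached row back
@GrabConfig(systemClassLoader=true)
@Grab(group='com.h2database', module='h2', version='1.3.167')
import groovy.sql.Sql

def sql = Sql.newInstance("jdbc:h2:hello", "sa", "sa", "org.h2.Driver")
println sql.firstRow("select value from test where id = ?", [1])?.value   // -> hello
sql.close()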
Simple Persistent Maps
Another alternative is jdbm, which provides disk-backed persistent maps. Internally, it uses Java's serialization. The programming interface is much simpler, but it's also much less powerful than a full-blown SQL db. There's no support for concurrent access, but it is synchronized and thread safe, which may be enough depending on your locking requirements. Here's a simple example:
@Grab(group='org.fusesource.jdbm', module='jdbm', version='2.0.1')
import jdbm.*
def recMan = RecordManagerFactory.createRecordManager('hello')
def treeMap = recMan.treeMap("test")
treeMap[1] = 'hello'
treeMap[100] = 'goodbye'
recMan.commit()
println treeMap
This will save the map to a set of files.
Just a little Groovy update on simple persistence using JDBM: concurrent access is now supported, and the project has been renamed from JDBM4 to MapDB.
@Grab(group='org.mapdb', module='mapdb', version='0.9.3')
import java.util.concurrent.ConcurrentNavigableMap
import org.mapdb.*
DB db = DBMaker.newFileDB( new File("myDB.file") )
.closeOnJvmShutdown()
.make()
ConcurrentNavigableMap<String,String> map = db.getTreeMap("myMap")
map.put("1", "one")
map.put("2", "two")
db.commit()
println "keySet "+map.keySet()
assert map.get("1") == "one"
assert map.get("2") == "two"
db.close()
Chronicle Map is a persisted ConcurrentMap implementation for the JVM.
Usage example:
// Coordinates on Maven Central are net.openhft:chronicle-map; grab a current release, e.g.:
// @Grab(group='net.openhft', module='chronicle-map', version='<current version>')
import net.openhft.chronicle.map.ChronicleMap
import java.util.concurrent.ConcurrentMap

ConcurrentMap<String, String> store = ChronicleMap
    .of(String.class, String.class)
    .averageKey("cachedKey").averageValue("cachedValue")
    .entries(10_000)
    .createPersistedTo(new File("cacheFile"))
store.put("foo", "bar")
store.close()
I am a little late to the party, but for the sake of posterity, here is one more option:
gstorm
A simple ORM for databases and CSV files, intended to be used in Groovy scripts and small projects.
disclosure: author here :)
Code migration due to performance issues:
SQL Server LIKE condition (before)
SQL Server Full Text Search --> CONTAINS (before)
Elastic Search (currently)
Achieved so far:
We have a web page created in ASP.NET Core which has an autocomplete drop-down of 2.5+ million companies indexed in Elastic Search: https://www.99corporates.com/
Due to performance issues we have successfully shifted our code from SQL Server Full Text Search to Elastic Search, and we are using NEST v7.2.1 and Elasticsearch.Net v7.2.1 in our .NET code.
Still looking for a solution:
If the user does not select a company from the autocomplete list and simply enters a few characters and clicks Go, then a list should be displayed, which we had done earlier using SQL Server Full Text Search (CONTAINS).
Can we call the ASP.NET web service which we have created, using SQL CLR, with code like SELECT * FROM dbo.Table WHERE Name IN( dbo.SQLWebRequest('') )?
[System.Web.Script.Services.ScriptMethod()]
[System.Web.Services.WebMethod]
public static List<string> SearchCompany(string prefixText, int count)
{
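    // queries Elastic Search (via NEST) for company names matching prefixText, returning up to count results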
}
Are there any better or alternative options?
While that solution (i.e. the SQL-APIConsumer SQLCLR project) "works", it is not scalable. It also requires setting the database to TRUSTWORTHY ON (a security risk), and it loads a few assemblies as UNSAFE, such as Json.NET. That is risky if any of them use static variables for caching and expect each caller to be isolated in its own App Domain, because SQLCLR is a single, shared App Domain: static variables are shared across all callers, and multiple concurrent threads can cause race conditions. (This is not to say that this is definitely happening, since I haven't seen the code, but if you haven't reviewed the code or tested with multiple concurrent threads, then it's a gamble with regard to stability and predictable, expected behavior.)
To a slight degree I am biased, given that I sell a SQLCLR library, SQL#, whose Full version contains a stored procedure that also does this, but a) handles security properly via signatures (it does not enable TRUSTWORTHY), b) allows for handling scalability, c) does not require any UNSAFE assemblies, and d) handles more scenarios (better header handling, etc.). It doesn't handle any JSON; it just returns the web service response, and you can unpack that using OPENJSON or something else if you prefer. (Yes, there is a Free version of SQL#, but it does not contain INET_GetWebPages.)
HOWEVER, I don't think SQLCLR is a good fit for this scenario in the first place. In your first two versions of this project (using LIKE and then CONTAINS) it made sense to send the user input directly into the query. But now that you are using a web service to get a list of matching values from that user input, you are no longer confined to that approach. You can, and should, handle the web service / Elastic Search portion of this separately, in the app layer.
Rather than passing the user input into the query, only to have the query pause to get that list of 0 or more matching values, you should do the following:
Before executing any query, get the list of matching values directly in the app layer.
If no matching values are returned, you can skip the database call entirely, as you already have your answer, and respond immediately to the user (much faster response time when there are no matches).
If there are matches, then execute the search stored procedure, sending that list of matches as-is via a Table-Valued Parameter (TVP), which becomes a table variable in the stored procedure. Use that table variable to INNER JOIN against the table rather than doing an IN list, since IN lists do not scale well. Also, be sure to send the TVP values to SQL Server using the IEnumerable<SqlDataRecord> method, not the DataTable approach, as that merely wastes CPU time and memory.
For example code on how to accomplish this correctly, please see my answer to Pass Dictionary to Stored Procedure T-SQL
In C#-style pseudo-code, this would be something along the lines of the following:
List<string> companies = SearchCompany(PrefixText, Count);

if (companies.Count == 0)
{
    Response.Write("Nope");
}
else
{
    using(SqlConnection db = new SqlConnection(connectionString))
    {
        using(SqlCommand batch = db.CreateCommand())
        {
            batch.CommandType = CommandType.StoredProcedure;
            batch.CommandText = "ProcName";

            SqlParameter tvp = new SqlParameter("ParamName", SqlDbType.Structured);
            tvp.Value = MethodThatYieldReturnsList(companies); // streams IEnumerable<SqlDataRecord>
            batch.Parameters.Add(tvp);

            db.Open();

            using(SqlDataReader results = batch.ExecuteReader())
            {
                if (results.HasRows)
                {
                    // deal with results
                    Response.Write(results....);
                }
            }
        }
    }
}
Done. Got the solution.
Used SQL CLR https://github.com/geral2/SQL-APIConsumer
exec [dbo].[APICaller_POST]
@URL = 'https://www.-----/SearchCompany'
,@JsonBody = '{"searchText":"GOOG","count":10}'
Let me know if there are any other / better options to achieve this.
I have an ODI 12c project with 30 mappings. I need to check whether every "Component context" on every datastore object (source or target) is set to "Execution context" (not forced).
Is there a way to achieve this by querying the ODI underlying database, so I don't have to do it manually and risk possible mistakes?
I have a list of ODI 12c repository tables and comments on table columns, which I got from the Oracle support website, and after hours of digging through the database I still can't see this information stored in any table.
My package is located in SNP_PACKAGE, SNP_MAPPING has info about mappings, and SNP_MAP_COMP describes the objects in a mapping.
I have searched through many different tables as well.
A bit late, but for anyone else looking:
Messing about with the tables is a no-no; the APIs are better, especially if you are going to modify anything.
https://docs.oracle.com/en/middleware/data-integrator/12.2.1.3/odija/index.html
Run the following Groovy script in ODI (Tools/Groovy/New Script). It should be simple enough to modify. Using the SDK gets a lot easier if you manage to set up a complete development environment in IntelliJ or another Java IDE. Groovy in ODI opens up a whole new world.
//Created by DI Studio
import oracle.odi.domain.mapping.Mapping
import oracle.odi.domain.mapping.finder.IMappingFinder
tme = odiInstance.getTransactionalEntityManager()
IMappingFinder mapf = (IMappingFinder) tme.getFinder(Mapping.class)
Collection<Mapping> mappings = mapf.findByProject("PROJECT","FOLDER")
println("Found ${mappings.size()} mappings")
mappings.each { map ->
map.physicalDesigns.each{ phys ->
phys.physicalNodes.each{ node ->
println("${map.project.name}...${map.parentFolder.parentFolder?.name}.${map.parentFolder.name}.${map.name}.${phys.name}.${node.name}.defaultContext=${(node.context.defaultContext) ? "default" : node.context.name}")
}
}
}
It prints either default or the set (forced) context. It seems forced contexts have been deprecated in 12c; node.context.defaultContext seems to mirror Component Context (Forced) in ODI Studio 12.2.1.3.
https://docs.oracle.com/en/middleware/data-integrator/12.2.1.3/odija/index.html
Update 2019-12-20 - including getExecutionContextName
The following script lists the same information in a hierarchical manner and may be easier to read. I'm not sure it gets exactly what you were originally after without having a mapping with your exact setup.
//Created by DI Studio
import oracle.odi.domain.mapping.Mapping
import oracle.odi.domain.mapping.finder.IMappingFinder
import oracle.odi.domain.mapping.component.DatastoreComponent
tme = odiInstance.getTransactionalEntityManager()
String project = "PROJECT"
String parentFolder = "PARENT_FOLDER"
IMappingFinder mapf = (IMappingFinder) tme.getFinder(Mapping.class)
Collection<Mapping> mappings = mapf.findByProject(project, parentFolder)
println("Found ${mappings.size()} mappings")
println "Project: ${project}"
mappings.each { map ->
println "\tMapping: ..${map.parentFolder.parentFolder?.name}/${map.parentFolder.name}/${map.name}"
map.physicalDesigns.each{ phys ->
println "\t\tPhysical: ${phys.name}"
phys.physicalNodes.each{ node ->
println "\t\t\tNode: ${node.name}"
println "\t\t\t\tdefaultContext: ${(node.context.defaultContext)}"
println "\t\t\t\tNode context name: ${node.context.name}"
println "\t\t\t\tDatastoreComponent ExecutionContextName: ${DatastoreComponent.getDatastoreComponent(node)?.getExecutionContextName(node).toString()}"
}
}
}
The tables and columns below might hold the value you are looking for. They are from ODI 12.1.2; depending on the exact ODI version you are using, the structure could be a little different.
Here is a query to retrieve this information directly from the database (adjust the ODIW12 schema prefix to your own work repository schema):
-- Forced Contexts on Datastores in Mapping
SELECT MAPP.NAME MAP_NAME, MAPP_COMP.NAME DATASTORE_NAME,
MAPP_REF.QUALIFIED_NAME FORCE_CONTEXT
FROM SNP_MAPPING MAPP
INNER JOIN SNP_MAP_REF MAPP_REF
ON MAPP_REF.I_OWNER_MAPPING = MAPP.I_MAPPING
INNER JOIN SNP_MAP_PROP MAPP_PROP
ON MAPP_REF.I_MAP_REF = MAPP_PROP.I_PROP_XREF_VALUE
INNER JOIN ODIW12.SNP_MAP_COMP MAPP_COMP
ON MAPP_COMP.I_MAP_COMP = MAPP_PROP.I_MAP_COMP
WHERE
MAPP_REF.ADAPTER_INTF_TYPE = 'IContext'
AND MAPP.NAME LIKE '%yourMapping%'
I am new to Cloudant and NoSQL databases (I have worked on MongoDB).
1) Is there any Cloudant UI for writing queries and seeing the result set while developing?
2) How do I create a map-reduce view in Cloudant?
Can you please reply or send me your thoughts?
The search indexes are written in JavaScript. (At the moment, Cloudant has launched their own "Cloudant Query", which promises to be easier to work with, but I haven't had the time to try it properly yet.)
Say you have documents in your DB which contain a field called "UserName" and you want to create a view on all of these. You could write a function like this:
function(doc) {
if ( typeof doc.UserName !== "undefined" ) {
emit([doc.UserName], doc._id);
}
}
This will output the user names and document ids.
If a given user name could be associated with multiple documents, you could do this, for example:
function(doc) {
if ( typeof doc.UserName !== "undefined" ) {
emit([doc.UserName,doc._id], 1);
}
}
and also use the built-in "count" or "sum" reduce functions that Cloudant provides to tally the number of documents a given user name is associated with, etc.
You can use the UI in the Cloudant DB dashboard to execute queries, or (as I personally favour) use a tool like Postman (https://www.getpostman.com/).
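If you prefer scripting the calls instead of clicking through a UI, a view is just an HTTP endpoint. A minimal Groovy sketch; the account, credentials, database, design-document and view names below are placeholders, and it assumes the view has the built-in _count reduce attached:
import groovy.json.JsonSlurper

// Placeholders -- substitute your own account, credentials, db, design doc and view names.
def account  = 'myaccount'
def password = 'mypassword'
def url = "https://${account}.cloudant.com/mydb/_design/users/_view/by_username?group=true"

def conn = new URL(url).openConnection()
conn.setRequestProperty('Authorization',
        'Basic ' + "${account}:${password}".getBytes('UTF-8').encodeBase64().toString())

new JsonSlurper().parse(conn.inputStream).rows.each { row ->
    println "${row.key} -> ${row.value}"   // user name -> number of documents
}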
One word of warning, though: error- and sanity-checking of your JavaScript code is pretty much non-existent, and you'll only know that something isn't working when you hit "save & build index", which can be a major pain if you're working on large databases (it can grind the whole thing to a halt). A pro tip, therefore, is to work out your indexes on smaller data sets in some safe little sandbox database before you let it loose on anything important...
All of this is supposedly going to be Much Better with Cloudant Query.
I'm about to write a Scala command-line application that relies on a MySQL database. I've been looking around for ORMs, and am having trouble finding one that will work well.
The Lift ORM looks nice, but I'm not sure it can be decoupled from the entire Lift web framework. ActiveObjects also looks OK, but the author says that it may not work well with Scala.
I'm not coming to Scala from Java, so I don't know all the options. Has anyone used an ORM with Scala, and if so, what did you use and how well did it work?
There are several reasons why JPA-oriented frameworks (Hibernate, for instance) do not fit into idiomatic Scala applications elegantly:
there are no nested annotations, as the Scala 2.8 Preview states -- that means you cannot use annotations as mapping metadata for complex applications (even the simplest ones often use @JoinTable -> @JoinColumn);
inconsistencies between Scala and Java collections make developers convert collections; there are also cases when it is impossible to map Scala collections to associations without implementing complex interfaces of the underlying framework (Hibernate's PersistentCollections, for example);
some very common features, such as domain model validation, require JavaBeans conventions on persistent classes -- this stuff is not quite the "Scala way" of doing things;
of course, the interop problems (like Raw Types or proxies) introduce a whole new level of issues that cannot be worked around easily.
There are more reasons, I'm sure. That's why we have started the Circumflex ORM project. This pure-Scala ORM tries its best to eliminate the nightmares of classic Java ORMs. Specifically, you define your entities in pretty much the same way you would do with classic DDL statements:
class User extends Record[User] {
val name = "name".TEXT.NOT_NULL
val admin = "admin".BOOLEAN.NOT_NULL.DEFAULT('false')
}
object User extends Table[User] {
def byName(n: String): Seq[User] = criteria.add(this.name LIKE n).list
}
// example with foreign keys:
class Account extends Record[Account] {
val accountNumber = "acc_number".BIGINT.NOT_NULL
val user = "user_id".REFERENCES(User).ON_DELETE(CASCADE)
val amount = "amount".NUMERIC(10,2).NOT_NULL
}
object Account extends Table[Account]
As you can see, these declarations are a bit more verbose than classic JPA POJOs. But in fact there are several concepts assembled together here:
the precise DDL for generating schema (you can easily add indexes, foreign keys and other stuff in the same DSL-like fashion);
all queries can be assembled inside that "table object" instead of being scattered around in DAO; the queries themselves are very flexible, you can store query objects, predicates, projections, subqueries and relation aliases in variables so you can reuse them, and even make batch update operations from existing queries (insert-select for example);
transparent navigation between associations (one-to-one, many-to-one, one-to-many and many-to-many-through-intermediate-relation) can be achieved either by lazy or by eager fetching strategies; in both cases the associations are established on top of the foreign keys of underlying relations;
validation is the part of framework;
there is also a Maven2 plugin that allows generating schema and importing initial data from handy XML formatted files.
The only things Circumflex ORM lacks are:
multi-column primary keys (although it is possible to create multi-column foreign keys backed by multi-column unique constraints, but it is only for data integrity);
full-fledged documentation (although we are actively working on it);
success stories of ten-billion-dollar production systems that have Circumflex ORM as its core technology.
P.S. I hope this post will not be considered an advertisement. It isn't so, really -- I was trying to be as objective as possible.
I experimented with EclipseLink JPA and basic operations worked fine for me. JPA is a Java standard and there are other implementations that may also work (OpenJPA, etc). Here is an example of what a JPA class in Scala looks like:
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
@Entity { val name = "Users" }
class User {
@Id
@GeneratedValue
var userid:Long = _
var login:String = _
var password:String = _
var firstName:String = _
var lastName:String = _
}
Slick is a perfect match for a functional world. Traditional ORMs are not a perfect fit for Scala. Slick composes well and uses a DSL that mimics Scala collection classes and for comprehensions.
I am happy to announce the 1st release of a new ORM library for Scala. MapperDao maps domain classes to database tables. It currently supports MySQL and PostgreSQL (an Oracle driver will be available soon), one-to-one, many-to-one, one-to-many and many-to-many relationships, autogenerated keys, and transactions, and it optionally integrates nicely with the Spring framework. It allows freedom in the design of the domain classes, which are not affected by persistence details, encourages immutability, and is type safe. The library is not based on reflection but rather on good Scala design principles, and it contains a DSL to query data which closely resembles select queries. It doesn't require implementing equals() or hashCode() methods, which can be problematic for persisted entities. Mapping is done using type-safe Scala code.
Details and usage instructions can be found at the mapperdao's site:
http://code.google.com/p/mapperdao/
The library is available for download on the above site and also as a maven dependency (documentation contains details on how to use it via maven)
Examples can be found at:
https://code.google.com/p/mapperdao-examples/
Very brief introduction of the library via code sample:
class Product(val name: String, val attributes: Set[Attribute])
class Attribute(val name: String, val value: String)
...
val product = new Product("blue jean", Set(new Attribute("colour", "blue"), new Attribute("size", "medium")))
val inserted = mapperDao.insert(ProductEntity, product)
// the persisted entity has an id property:
println("%d : %s".format(inserted.id,inserted))
Querying is very familiar:
val o=OrderEntity
import Query._
val orders = query(select from o where o.totalAmount >= 20.0 and o.totalAmount <= 30.0)
println(orders) // a list of orders
I encourage everybody to use the library and give feedback. The documentation is currently quite extensive, with setup and usage instructions. Please feel free to comment and get in touch with me at kostas dot kougios at googlemail dot com.
Thanks,
Kostantinos Kougios
Here's basically the same example with the @Column annotation:
/*
Corresponding table:
CREATE TABLE `users` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) default NULL,
`admin` tinyint(1) default '0',
PRIMARY KEY (`id`)
)
*/
import _root_.javax.persistence._
@Entity
@Table{val name="users"}
class User {
@Id
@Column{val name="id"}
var id: Long = _
@Column{val name="name"}
var name: String = _
@Column{val name="admin"}
var isAdmin: Boolean = _
override def toString = "UserId: " + id + " isAdmin: " + isAdmin + " Name: " + name
}
Of course, any Java database access framework will work in Scala as well, with the usual issues you may encounter, such as collection conversions, etc. jOOQ, for instance, has been observed to work well in Scala. An example of jOOQ code in Scala is given in the manual:
object Test {
def main(args: Array[String]): Unit = {
val c = DriverManager.getConnection("jdbc:h2:~/test", "sa", "");
val f = new Factory(c, SQLDialect.H2);
val x = T_AUTHOR as "x"
for (r <- f
select (
T_BOOK.ID * T_BOOK.AUTHOR_ID,
T_BOOK.ID + T_BOOK.AUTHOR_ID * 3 + 4,
T_BOOK.TITLE || " abc" || " xy"
)
from T_BOOK
leftOuterJoin (
f select (x.ID, x.YEAR_OF_BIRTH)
from x
limit 1
asTable x.getName()
)
on T_BOOK.AUTHOR_ID === x.ID
where (T_BOOK.ID <> 2)
or (T_BOOK.TITLE in ("O Alquimista", "Brida"))
fetch
) {
println(r)
}
}
}
Taken from
http://www.jooq.org/doc/2.6/manual/getting-started/jooq-and-scala/
Is GQL easy to learn for someone who knows SQL? How is Django/Python? Does App Engine really make scaling easy? Is there any built-in protection against "GQL Injections"? And so on...
I'd love to hear the not-so-obvious ups and downs of using app engine.
Cheers!
My experience with Google App Engine has been great, and the 1000 result limit has been removed. Here is a link to the release notes:
app-engine release notes
No more 1000 result limit - That's right: with addition of Cursors and the culmination of many smaller Datastore stability and performance improvements over the last few months, we're now confident enough to remove the maximum result limit altogether. Whether you're doing a fetch, iterating, or using a Cursor, there's no limits on the number of results.
The most glaring and frustrating issue is the datastore API, which looks great and is very well thought out and easy to work with if you are used to SQL, but has a 1000-row limit across all query result sets, and you can't access counts or offsets beyond that. I've run into weirder issues, like not actually being able to add or access data for a model once it goes beyond 1000 rows.
See the Stack Overflow discussion about the 1000 row limit
Aral Balkan wrote a really good summary of this and other problems
Having said that, App Engine is a really great tool to have at one's disposal, and I really enjoy working with it. It's perfect for deploying micro web services (e.g. JSON APIs) to use in other apps.
GQL is extremely simple - it's a subset of the SQL 'SELECT' statement, nothing more. It's only a convenience layer over the top of the lower-level APIs, though, and all the parsing is done in Python.
Instead, I recommend using the Query API, which is procedural, requires no run-time parsing, and makes 'GQL injection' vulnerabilities totally impossible (though they are impossible in properly written GQL anyway). The Query API is very simple: call .all() on a Model class, or call db.Query(modelname). The Query object has .filter(field_and_operator, value), .order(field_and_direction) and .ancestor(entity) methods, in addition to all the facilities GQL objects have (.get(), .fetch(), .count(), etc.). Each of the Query methods returns the Query object itself for convenience, so you can chain them:
results = MyModel.all().filter("foo =", 5).order("-bar").fetch(10)
Is equivalent to:
results = MyModel.gql("WHERE foo = 5 ORDER BY bar DESC LIMIT 10").fetch()
A major downside when working with App Engine was the 1k query limit, which has been mentioned in the comments already. What I haven't seen mentioned, though, is the fact that there is a built-in sortable order with which you can work around this issue.
From the App Engine cookbook:
def deepFetch(queryGen,key=None,batchSize = 100):
    """Iterator that yields an entity in batches.

    Args:
      queryGen: should return a Query object
      key: used to .filter() for __key__
      batchSize: how many entities to retrieve in one datastore call

    Retrieved from http://tinyurl.com/d887ll (AppEngine cookbook).
    """
    from google.appengine.ext import db

    # AppEngine will not fetch more than 1000 results
    batchSize = min(batchSize,1000)
    query = None
    done = False
    count = 0
    if key:
        key = db.Key(key)
    while not done:
        print count
        query = queryGen()
        if key:
            query.filter("__key__ > ",key)
        results = query.fetch(batchSize)
        for result in results:
            count += 1
            yield result
        if batchSize > len(results):
            done = True
        else:
            key = results[-1].key()
The above code together with Remote API (see this article) allows you to retrieve as many entities as you need.
You can use the above code like this:
def allMyModel():
    return MyModel.all()

myModels = deepFetch(allMyModel)
At first I had the same experience as others who transitioned from SQL to GQL -- kind of weird to not be able to do JOINs, count more than 1000 rows, etc. Now that I've worked with it for a few months, I absolutely love App Engine. I'm porting all of my old projects onto it.
I use it to host several high-traffic web applications (at peak time one of them gets 50k hits a minute).
Google App Engine doesn't use an actual database, and apparently uses some sort of distributed hash map. This will lend itself to some different behaviors that people who are accustomed to SQL just aren't going to see at first. So for example getting a COUNT of items in regular SQL is expected to be a fast operation, but with GQL it's just not going to work the same way.
Here are some more issues:
http://blog.burnayev.com/2008/04/gql-limitations.html
In my personal experience, it's an adjustment, but the learning curve is fine.