How to query Datomic for all datoms in a particular partition?

In Datomic query language how can I write a query to return all datoms in a particular partition? Is this even possible?

You generally can't use a Datalog query for that, because it would require traversing all the datoms of the database, which Datalog won't let you do.
Given any entity id, you can retrieve its partition by calling the part function of the Peer library.
You can then use a filter on your database to get a view of only those datoms. Here's a Clojure example:
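(require '[datomic.api :as d])
(import 'datomic.Datom)

(defn part-db
  "Given a db and a partition entity id, returns a view of the db
  containing only the datoms whose entities belong to that partition."
  [db part]
  (d/filter db (fn [_ ^Datom datom]
                 ;; .e is the datom's entity id; d/part extracts its partition id
                 (-> datom .e d/part (= part)))))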
To find out the entity id of your partition from its name (e.g. :my.partitions/part1), you can, for example, resolve it as an Entity:
(def my-part-id (:db/id (d/entity mydb :my.partitions/part1)))
From here, you can:
List all the datoms of your database through the index: (d/datoms (part-db mydb my-part-id) :eavt)
Query the filtered database using Datalog.
... whatever else you do with a database value!
Note that if you really want to get all the datoms, you may want to do this on a history database.

Related

Indexing EAV model using Solr

The database I have at hand uses the EAV model to describe all the objects one can find in a house. Whether that is good or bad isn't the question; there is no choice but to keep and use this model. 6,000+ items point to 3,000+ attributes and 150,000+ attribute values.
My task is to get this data into a Solr index for quick searching/sorting/faceting.
In Solr, using DIH, a regular SQL query is used to extract data. Each column name returned from the query is a 'field' (defined or not in a schema), and each row of the query's resultset is a 'document'.
Because the EAV model uses rows for attributes instead of columns, a simple query will not work: I need to flatten each item into a single row. What should my SQL query look like in order to extract all items from the DB? Is there a special Solr/DIH configuration I should consider?
There are some similar questions on SO, but none really helped.
Any pointers are much appreciated!
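One common way to flatten EAV rows is conditional aggregation: group by item and emit one column per attribute. Below is a minimal sketch using Python's sqlite3 module. The table and column names (item, attribute, item_value) are hypothetical, since the question does not show the real schema, and the attribute list is hard-coded for illustration.

```python
import sqlite3

# Hypothetical EAV schema; the real table and column names are not given in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_value (item_id INTEGER, attribute_id INTEGER, value TEXT);
    INSERT INTO item VALUES (1, 'sofa'), (2, 'lamp');
    INSERT INTO attribute VALUES (1, 'color'), (2, 'room');
    INSERT INTO item_value VALUES
        (1, 1, 'red'), (1, 2, 'living room'), (2, 1, 'white');
""")

# Conditional aggregation: one output row per item, one column per attribute.
# An attribute the item does not have comes back as NULL (None in Python).
flatten_sql = """
    SELECT i.id,
           i.name,
           MAX(CASE WHEN a.name = 'color' THEN v.value END) AS color,
           MAX(CASE WHEN a.name = 'room'  THEN v.value END) AS room
    FROM item i
    LEFT JOIN item_value v ON v.item_id = i.id
    LEFT JOIN attribute a ON a.id = v.attribute_id
    GROUP BY i.id, i.name
    ORDER BY i.id
"""
rows = conn.execute(flatten_sql).fetchall()
for row in rows:
    print(row)
```

With thousands of attributes, one column per attribute is probably impractical, so this shape likely only suits the handful of attributes you actually want to search, sort, or facet on.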

Difference between dba_segments and dba_users when trying to identify list of schema in Oracle DB

I am trying to get the list of schemas created in a database. I came across many answers, like this and this, which say to use either dba_segments or dba_users.
But when I run both against my database, the results differ substantially.
I am looking for answers explaining which one is correct (dba_segments or dba_users) and why, so please do not read my question as "how to get a list of all available schemas in a database".
dba_segments shows SEGMENTS, which are owned by schemas.
You can have a schema that has no segments. Objects that use segments can generally be thought of as tables or indexes; a user could, for example, own a synonym or a PL/SQL unit but have no segments.
Here's a list of segment types for my 12c system:
HR#orcl🍻🍺 >select distinct segment_type from dba_segments;
SEGMENT_TYPE
LOBINDEX
INDEX PARTITION
ROLLBACK
NESTED TABLE
TABLE PARTITION
LOB PARTITION
LOBSEGMENT
INDEX
TABLE
CLUSTER
dba_users will show you EVERY user in the database, whether they own 'data' or not.
Here's one way to find SCHEMAS with no segments:
HR#orcl🍻🍺 >select distinct username
from dba_users
minus
select distinct owner
from dba_segments;
USERNAME
ANONYMOUS
APEX_LISTENER
APEX_PUBLIC_USER
APEX_REST_PUBLIC_USER
APPQOSSYS
BASIC_PRIVS
BI...
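The set difference above can be reproduced outside Oracle as a rough illustration. This sketch uses Python's sqlite3 module with two small stand-in tables that merely borrow the names dba_users and dba_segments; the rows are made up, and SQLite spells the operator EXCEPT where Oracle uses MINUS.

```python
import sqlite3

# Stand-in tables with invented rows; these are NOT the real Oracle dictionary views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dba_users (username TEXT);
    CREATE TABLE dba_segments (owner TEXT, segment_name TEXT, segment_type TEXT);
    INSERT INTO dba_users VALUES ('HR'), ('ANONYMOUS'), ('APPQOSSYS');
    INSERT INTO dba_segments VALUES
        ('HR', 'EMPLOYEES', 'TABLE'),
        ('HR', 'EMP_EMAIL_UK', 'INDEX');
""")

# Every user, whether or not they own any segments.
all_users = {r[0] for r in conn.execute("SELECT username FROM dba_users")}

# Only the users that own at least one segment.
segment_owners = {r[0] for r in conn.execute("SELECT DISTINCT owner FROM dba_segments")}

# Users minus segment owners: schemas that own no segments.
no_segments = sorted(r[0] for r in conn.execute(
    "SELECT username FROM dba_users EXCEPT SELECT owner FROM dba_segments"))
print(no_segments)
```

In this toy data the segment owners are a subset of the users, which is why the two views give lists of different lengths.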

Appengine's Indexing order, cursors, and aggregation

I need to do some continuous aggregation on a data set. I am using App Engine's High Replication Datastore.
Let's say we have a simple object with a property that holds a string of the date when it was created. There are other fields associated with the object, but they are not important in this example.
Let's say I create and store some objects. Below is the date associated with each object. The objects are stored in the order below, and each is created in a separate transaction.
Obj1: 2012-11-11
Obj2: 2012-11-11
Obj3: 2012-11-12
Obj4: 2012-11-13
Obj5: 2012-11-14
The idea is to use a cursor to continually check for new indexed objects. Aggregation on the new indexed entities will be performed.
Here are the questions I have:
1) Are objects indexed in order? As in, is it possible for Obj4 to be indexed before Obj1, 2, and 3? This will be an issue if I use an ORDER BY query and a cursor to continue searching. Some entities will not be found if there is a delay in indexing.
2) If no ORDER BY is specified, what order are entities returned in a query?
3) How would I go about checking for new indexed entities? As in, grab all entities, storing the cursor, then later on checking if any new entities were indexed since the last query?
A little less important, but food for thought:
4) Are all fields indexed together? As in, if I have a date property and, let's say, a name property, will both properties appear to be indexed at the same time for a given object?
5) If multiple entities are written in the same transaction, are all entities in the transaction indexed at the same time?
6) If all entities belong to the same entity group, are all entities indexed at the same time?
Thanks for the responses.
1. All entities have default indexes for every property. If you use ORDER BY someProperty, you will get entities ordered by the values of that property. You are correct about index building: queries use indexes, and indexes are built asynchronously, so it's possible that a query will not find an entity immediately after it was added.
2. ORDER BY defaults to ASC, i.e. ascending order.
3. Add a created timestamp to your entity, then order by it and re-run the query from the cursor. See Cursors and Data Updates.
4. Indexes are built after the put() operation returns. They are also built in parallel, meaning that when you query, some indexes may be built and some not. See Life of a Datastore Write. Note that if you want to force "apply" on an entity, you can issue a get() after put(), which will force the changes to be applied (= indexes written).
5. and 6. All entities touched in the same transaction must be in the same entity group (= have a common parent). The transaction isolation docs state that a committed transaction may not have been applied yet, meaning that a query after put() may not find the new entities. Again, you can force an entity to be applied via a read or an ancestor query.
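The "order by a created timestamp and resume from the cursor" idea can be sketched without the Datastore at all. What follows is an in-memory simulation in plain Python, not the App Engine API: fetch_since, the integer created values, and the store list are all made up for illustration, and the cursor is just the sort key of the last entity seen.

```python
from collections import Counter

def fetch_since(entities, cursor, limit=100):
    """Return up to `limit` entities that sort after `cursor`, plus the new cursor.

    Entities are ordered by (created, key); the cursor is the sort key of the
    last entity returned so far, or None before the first fetch.
    """
    ordered = sorted(entities, key=lambda e: (e["created"], e["key"]))
    batch = [e for e in ordered
             if cursor is None or (e["created"], e["key"]) > cursor][:limit]
    new_cursor = (batch[-1]["created"], batch[-1]["key"]) if batch else cursor
    return batch, new_cursor

# Simulated store; "created" is a hypothetical monotonically increasing timestamp.
store = [
    {"key": "Obj1", "created": 1, "date": "2012-11-11"},
    {"key": "Obj2", "created": 2, "date": "2012-11-11"},
    {"key": "Obj3", "created": 3, "date": "2012-11-12"},
]

totals = Counter()  # running aggregate: number of objects per date
cursor = None

# First pass: aggregate everything visible so far.
batch, cursor = fetch_since(store, cursor)
totals.update(e["date"] for e in batch)

# A new object shows up later; the second pass only sees what is new.
store.append({"key": "Obj4", "created": 4, "date": "2012-11-13"})
batch, cursor = fetch_since(store, cursor)
totals.update(e["date"] for e in batch)

print(dict(totals), cursor)
```

Note the weakness this makes visible: if an entity with an earlier created value only becomes visible after the cursor has moved past it, this loop never picks it up, which is exactly the indexing-delay concern raised in question 1.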

Variable table name in Django

Can I use a variable table name for DB-mapped objects? For example, there are n objects of the same structure and I want to store them in different tables, to improve performance on some operations.
Let's say I've got class defined as:
from django.db import models

class Measurement(models.Model):
    slave_id = models.IntegerField()
    tag = models.CharField(max_length=40)
    value = models.CharField(max_length=16)
    timestamp = models.DateTimeField()

    class Meta:
        db_table = 'measurements'
Now all objects are stored in the table 'measurements'. I would like to make the table name dependent on the 'slave_id' value, for example to handle data from tables 'measurements_00001', 'measurements_00002', etc.
Is it possible to achieve this using the Django ORM, or is the only solution to drop down to the SQL level?
In the vast majority of cases, this shouldn't buy you any performance advantage. Any RDBMS worth its salt should handle immense tables effortlessly.
If it's needed, the table could be sharded. Again, this is managed by the DB server; at the SQL level (and in the ORM) it should still be seen as a single table. Ideally, the discrimination should be handled automatically; if not, most RDBMSs let you specify it at table definition time (or sometimes tune it with ALTER TABLE).
If you choose to define the sharding method yourself, each RDBMS has its own non-standard way of doing it. Best not to tie your Python code to that; do the tuning once on the DB server instead.
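One way to check the "single table is fine" claim for yourself is to index the discriminating column and look at the query plan. This is a minimal sketch with Python's sqlite3 module, not Django or a production RDBMS; the index name idx_measurements_slave and the generated rows are invented for the example. In Django, the rough equivalent of the index would be db_index=True on slave_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurements (
        id INTEGER PRIMARY KEY,
        slave_id INTEGER NOT NULL,
        tag TEXT,
        value TEXT,
        timestamp TEXT
    );
    -- Hypothetical index on the column the question wanted to split tables by.
    CREATE INDEX idx_measurements_slave ON measurements (slave_id, timestamp);
""")

# 5000 made-up rows spread over 50 slave ids.
conn.executemany(
    "INSERT INTO measurements (slave_id, tag, value, timestamp) VALUES (?, ?, ?, ?)",
    [(i % 50, "temp", str(i), "2024-01-01T00:00:%02d" % (i % 60)) for i in range(5000)],
)

# Ask SQLite how it would run a per-slave lookup; the last column is the plan text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM measurements WHERE slave_id = ?", (7,)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

count = conn.execute(
    "SELECT COUNT(*) FROM measurements WHERE slave_id = ?", (7,)
).fetchone()[0]
print(plan_text, count)
```

If the plan mentions the index, the lookup touches only the rows for that slave_id rather than scanning the whole table, which is most of what per-slave tables were meant to achieve.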

What is the difference between a schema and a table and a database?

This is probably a n00blike (or worse) question. But I've always viewed a schema as a table definition in a database. This is wrong or not entirely correct. I don't remember much from my database courses.
schema -> floor plan
database -> house
table -> room
A relation schema is the logical definition of a table - it defines what the name of the table is, and what the name and type of each column is. It's like a plan or a blueprint. A database schema is the collection of relation schemas for a whole database.
A table is a structure with a bunch of rows (aka "tuples"), each of which has the attributes defined by the schema. Tables might also have indexes on them to aid in looking up values on certain columns.
A database is, formally, any collection of data. In this context, the database would be a collection of tables. A DBMS (Database Management System) is the software (like MySQL, SQL Server, Oracle, etc) that manages and runs a database.
In a nutshell, a schema is the definition for the entire database, so it includes tables, views, stored procedures, indexes, primary and foreign keys, etc.
This particular posting has been shown to relate to Oracle only and the definition of Schema changes when in the context of another DB.
Probably the kinda thing to just google up but FYI terms do seem to vary in their definitions which is the most annoying thing :)
In Oracle a database is a database. In your head think of this as the data files and the redo logs and the actual physical presence on the disk of the database itself (i.e. not the instance)
A Schema is effectively a user. More specifically, it's a set of tables/procs/indexes etc. owned by a user. Another user has a different schema (the tables he/she owns); however, a user can also see any schemas they have select privileges on. So a database can consist of hundreds of schemas, and each schema of hundreds of tables. You can have tables with the same name in different schemas, which are in the same database.
A Table is a table, a set of rows and columns containing data and is contained in schemas.
Definitions may be different in SQL Server for instance. I'm not aware of this.
A schema seems to behave like a parent object in the OOP world, so it's not a database itself. Maybe this link is useful.
But in MySQL, the two are equivalent. The keyword DATABASE or DATABASES can be replaced with SCHEMA or SCHEMAS wherever it appears. Examples:
CREATE DATABASE <=> CREATE SCHEMA
SHOW DATABASES <=> SHOW SCHEMAS
Documentation of MySQL
SCHEMA & DATABASE terms are something DBMS dependent.
A Table is a set of data elements (values) organized using a model of vertical columns (which are identified by their name) and horizontal rows. A database contains one or (usually) more tables, and you store your data in these tables. The tables may be related to one another (see here).
As per https://www.informit.com/articles/article.aspx?p=30669
The names of all objects must be unique within some scope. Every
database must have a unique name; the name of a schema must be unique
within the scope of a single database, the name of a table must be
unique within the scope of a single schema, and column names must be
unique within a table. The name of an index must be unique within a
database.
From the PostgreSQL documentation:
A database contains one or more named schemas, which in turn contain tables. Schemas also contain other kinds of named objects, including data types, functions, and operators. The same object name can be used in different schemas without conflict; for example, both schema1 and myschema can contain tables named mytable. Unlike databases, schemas are not rigidly separated: a user can access objects in any of the schemas in the database he is connected to, if he has privileges to do so.
There are several reasons why one might want to use schemas:
To allow many users to use one database without interfering with each other.
To organize database objects into logical groups to make them more manageable.
Third-party applications can be put into separate schemas so they do not collide with the names of other objects.
Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.
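The "same object name in different schemas" point can be tried out in a few lines. This is only an analogy, sketched with Python's sqlite3 module rather than PostgreSQL: SQLite has no CREATE SCHEMA, but each attached database gets a schema name that qualifies table names in the same way. The schema name myschema and table name mytable are borrowed from the quoted docs; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# In SQLite the default schema is called "main"; attaching a second
# in-memory database gives us a second schema name to qualify tables with.
conn.execute("ATTACH DATABASE ':memory:' AS myschema")

# Two tables with the same name, one per schema, without conflict.
conn.execute("CREATE TABLE main.mytable (id INTEGER, note TEXT)")
conn.execute("CREATE TABLE myschema.mytable (id INTEGER, note TEXT)")

conn.execute("INSERT INTO main.mytable VALUES (1, 'from main')")
conn.execute("INSERT INTO myschema.mytable VALUES (1, 'from myschema')")

# The schema-qualified name picks out which table is meant.
main_rows = conn.execute("SELECT note FROM main.mytable").fetchall()
other_rows = conn.execute("SELECT note FROM myschema.mytable").fetchall()
print(main_rows, other_rows)
```

The two tables hold independent data even though they share a name, which is the namespace role the quoted passage describes.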
Contrary to some of the above answers, here is my understanding based on experience with each of them:
MySQL: database/schema :: table
SQL Server: database :: (schema/namespace ::) table
Oracle: database/schema/user :: (tablespace ::) table
Please correct me on whether tablespace is optional or not with Oracle, it's been a long time since I remember using them.
As MusiGenesis put so nicely, in most databases:
schema : database : table :: floor plan : house : room
But, in Oracle it may be easier to think of:
schema : database : table :: owner : house : room
More on schemas:
In SQL 2005 a schema is a way to group objects. It is a container you can put objects into. People can own this object. You can grant rights on the schema.
In 2000 a schema was equivalent to a user. Now it has broken free and is quite useful. You could throw all your user procs in a certain schema and your admin procs in another. Grant EXECUTE to the appropriate user/role and you're through with granting EXECUTE on specific procedures. Nice.
The dot notation would go like this:
Server.Database.Schema.Object
or
myserver01.Adventureworks.Accounting.Beans
A Schema is a collection of database objects, which includes logical structures too.
It has the name of the user who owns it.
A database can have any number of schemas.
A table of the same name can appear in two different schemas of the same database.
A user can view any schema for which they have been granted the select privilege.
I try answering based on my understanding of the following analogy:
A database is like the house
In the house there are several types of rooms. Assuming that you're living in a really big house. You really don't want your living rooms, bedrooms, bathrooms, mezzanines, treehouses, etc. to look the same. They each need a blueprint to tell how to build/use them. In other words, they each need a schema to tell how to build/use a bathroom, for example.
Of course, you may have several bedrooms, each looks slightly different. You and your wife/husband's bedroom is slightly different from your kids' bedroom. Each bedroom is analogous to a table in your database.
A DBMS is like a butler in the house. He manages literally everything.
In Oracle, a schema is one user under one database. For example, scott is one schema in the database orcl.
In one database we may have many schemas like scott.
Schemas contain databases.
Databases are part of a schema.
So, schemas > databases.
Schemas contain views, stored procedure(s), database(s), trigger(s), etc.
A schema is not a plan for the entire database. It is a plan/container for a subset of objects (e.g. tables) inside a database. That is to say, you can have multiple objects (e.g. tables) inside one database which don't necessarily fall under the same functional category, so you can group them under various schemas and give them different user access permissions. That said, I am unsure whether you can have one table under multiple schemas. The Management Studio UI gives a dropdown to assign a schema to a table, which makes it possible to choose only one schema. I guess if you do it with T-SQL, it might create two (or more) different objects with different object IDs.
A database schema is a way to logically group objects such as tables, views, stored procedures etc. Think of a schema as a container of objects.
And tables are collections of rows and columns.
The combination of all tables makes up a database.

Resources