I'm struggling with GAE for something that looks very simple to me.
Let me explain.
I have a table with data about a person, and I decided to use the email as the ID:
@Entity
public class Person {
    @Id
    @Column(name = "email")
    String email;
...
What I would like to accomplish is:
1. Create a table with two columns, both containing an email from Person, with the meaning "the email in column 1 has written to the email in column 2".
2. When I delete a row from Person, all the rows in the table from point 1 that contain this Person's email in column 1 or 2 should be deleted as a cascade effect.
3. Query my database so that, given an email address, I can join all the rows in the table from point 1 and extract all the data (name, phone...) of the Persons the given email has written to.
The trouble is that apparently in GAE I cannot use joins, and I simply can't understand how to create a join table with a cascade effect.
Any help is welcome.
Thanks in advance.
Datastore isn't a relational database, so you should familiarise yourself with the concepts before starting to design a solution. What you're trying to do is fit a square peg into a round hole: not only will you find you're missing joins, you will also have to implement your own cascade-on-delete (hint: you may not want to do this but if you did, and you have a lot of data, look at Task Queues).
You don't provide much in the way of code, and I don't know JPA (tip: look at Objectify, it's much more suitable for the non-relational Datastore) but you might want something like this (using Objectify annotations):
@Entity
public class Person {
    @Id
    String email;
    ...
}
Then I'm assuming you will have some kind of Message entity (what you refer to as a two-column table):
@Entity
public class Message {
    @Id
    Long msgId;
    @Index
    Ref<Person> from;
    @Index
    Ref<Person> to;
    ...
}
Depending on what queries you need to perform, you may need to create a custom index (read here). Remember, queries on Datastore are index scans.
But, say, you want to get messages sent from Person A to Person B, you can do something like:
Person a = ofy().load().type(Person.class).id("a@example.com").now();
Person b = ofy().load().type(Person.class).id("b@example.com").now();
...
List<Message> messages = ofy().load().type(Message.class).filter("from =", Ref.create(a)).filter("to =", Ref.create(b)).list();
Instead of using Ref<Person> (essentially a Key), you could of course use a String representing the email. You may also not want to use email as the @Id, as that would prevent a user from changing their email address.
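To make the "no joins, no cascade" point concrete, here is a minimal sketch in plain Python of doing both by hand. It uses a hypothetical in-memory `InMemoryStore` as a stand-in for Datastore; none of these names come from Objectify or the Datastore API, and a real implementation would issue indexed queries and batch gets instead of looping over dicts.

```python
from dataclasses import dataclass


@dataclass
class Person:
    email: str  # used as the entity ID, as in the question
    name: str


@dataclass
class Message:
    msg_id: int
    sender: str     # email of the Person who wrote
    recipient: str  # email of the Person written to


class InMemoryStore:
    """Hypothetical stand-in for Datastore: two dicts keyed by ID."""

    def __init__(self):
        self.people = {}
        self.messages = {}

    def put_person(self, person):
        self.people[person.email] = person

    def put_message(self, message):
        self.messages[message.msg_id] = message

    def recipients_of(self, sender_email):
        """Emulate the join: filter messages, then fetch each Person by ID."""
        emails = {m.recipient for m in self.messages.values()
                  if m.sender == sender_email}
        return [self.people[e] for e in sorted(emails) if e in self.people]

    def delete_person(self, email):
        """Manual cascade: remove the Person and every Message that mentions them."""
        self.people.pop(email, None)
        doomed = [m.msg_id for m in self.messages.values()
                  if email in (m.sender, m.recipient)]
        for msg_id in doomed:
            del self.messages[msg_id]
        return len(doomed)
```

With a lot of data, the loop in `delete_person` is the part you would hand off to a background job, as the Task Queues hint above suggests.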
Related
I have two models, using Go 1.19:
type User struct {
    Name string
    ID   int
}
type Order struct {
    ID   int
    Name string
    User *User
    // or
    UserID int
}
Of course, the database orders table has a foreign key to the users table via user_id.
Probably in different situations I have to use one of these models. When exactly?
Maybe only user_id in DTO models, and the full user in responses from the server?
I will be glad for any information :)
It depends on your purpose. As a rule, store only the ID when the referenced table holds full data about its own entity (such tables often have many rows and are heavy to load). If the referenced table merely describes some fields of the initial table, you can embed the full entity.
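One possible way to picture that split, sketched in Python rather than Go: a row model that carries only the foreign key, and a response model that embeds the related user. The names `OrderRow`, `OrderResponse` and `to_response` are hypothetical and only illustrate the idea; this is one convention, not the only reasonable one.

```python
from dataclasses import dataclass, asdict


@dataclass
class User:
    id: int
    name: str


@dataclass
class OrderRow:
    """Mirrors the orders table: only the foreign key is stored."""
    id: int
    name: str
    user_id: int


@dataclass
class OrderResponse:
    """Shape returned to a client: the related user is embedded."""
    id: int
    name: str
    user: User


def to_response(row, users_by_id):
    """Resolve the foreign key into a full User when building a response."""
    return OrderResponse(id=row.id, name=row.name, user=users_by_id[row.user_id])
```

The row model stays cheap to load in bulk, while the conversion step decides when the extra lookup for the user is worth paying for.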
On every request to my SessionBean I need to receive the last added instance of a JPA entity whose PK is declared with @Id @GeneratedValue(strategy=GenerationType.AUTO) Long id.
My current approach is to add ORDER BY e.id DESC to the query. Unfortunately I'm not sure whether generated ids are strictly increasing for subsequently persisted entities, and I can't seem to find any documentation on that topic. Can anyone help me with that?
JPA does not specify the order of id generation, so the provider is free to issue nonsequential ids.
If you want to rely on the entity insertion order, consider adding a temporal createdAt or modifiedAt field to your entity. This approach is used by some persistence frameworks, e.g. ActiveRecord.
You can leave the generation of this value to the provider by using a callback in a base entity class:
@PrePersist
void makeCreationTimestamp() {
    createdAt = System.currentTimeMillis();
}
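The idea can be sketched outside JPA as well. The following Python sketch stamps a creation time once, in the spirit of a pre-persist callback, and then picks the newest entity by that timestamp instead of by ID. The `Entity` class, `pre_persist`, `last_added` and the injected `clock` are all hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    id: int                          # generated IDs may arrive in any order
    created_at: Optional[int] = None


def pre_persist(entity, clock):
    """Stamp the creation time once, mimicking a pre-persist callback."""
    if entity.created_at is None:
        entity.created_at = clock()


def last_added(entities):
    """Pick the most recently created entity by timestamp, not by ID."""
    return max(entities, key=lambda e: e.created_at)
```

Note that a millisecond timestamp can tie when two entities are persisted in quick succession, so a tie-breaker may be needed if strict ordering matters.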
I am a beginner to Datastore and I am wondering how I should use it to achieve what I want to do.
For example, my app needs to keep track of customers and all their purchases.
Coming from relational database, I can achieve this by creating [Customers] and [Purchases] table.
In Datastore, I can make [Customers] and [Purchases] kinds.
Where I am struggling is the structure of the [Purchases] kind.
If I make [Purchases] as the child of [Customers] kind, would there be one entity in [Customers] and one entity in [Purchases] that share the same key? Does this mean inside of this [Purchases] entity, I would have a property that just keeps increasing for each purchase they make?
Or would I have one [Purchases] entity for each purchase they make and in each of these entities I would have a property that points to a entity in [Customers] kind?
How does Datastore perform in these scenarios?
Sounds like you don't fully understand ancestors. Let's go with the non-ancestor version first, which is a legitimate way to go:
from google.appengine.ext import ndb

class Customer(ndb.Model):
    # customer data fields
    name = ndb.StringProperty()

class Purchase(ndb.Model):
    customer = ndb.KeyProperty(kind=Customer)
    # purchase data fields
    price = ndb.IntegerProperty()
This is the basic way to go. You'll have one entity in the datastore for each customer. You'll have one entity in the datastore for each purchase, with a keyproperty that points to the customer.
If you have a purchase and need to find the associated customer, it's right there:
purchase_entity.customer.get()
If you have a Customer, you can issue a query to find all the purchases that belong to the customer:
Purchase.query(Purchase.customer == customer_entity.key).fetch()
In this case, whenever you write either a customer or purchase entity, the GAE datastore will write that entity to any one of the datastore machines running in the cloud that's not busy. You can have really high write throughput this way. However, when you query for all the purchases of a given customer, you just read back the most current data in the indexes. If a new purchase was added but the indexes have not been updated yet, then you may get stale data (eventual consistency). You're stuck with this behavior unless you use ancestors.
Now as for the ancestor version. The basic concept is essentially the same. You still have a customer entity, and separate entities for each purchase. The purchase is NOT part of the customer entity. However, when you create a purchase using a customer as an ancestor, it (roughly) means that the purchase is stored on the same machine in the datastore that the customer entity was stored on. In this case, your write performance is limited to the performance of that one machine, and is advertised as one write per second. As a benefit though, you can query that machine using an ancestor query and get an up-to-date list of all the purchases of a given customer.
The syntax for using ancestors is a bit different. The customer part is the same. However, when you create purchases, you'd create it as:
purchase1 = Purchase(parent=customer_entity.key)
purchase2 = Purchase(parent=customer_entity.key)
This example creates two separate purchase entities. Each purchase will have a different key, and the customer has its own key as well. However, each purchase key will have the customer_entity's key embedded in it. So you can think of the purchase key as being twice as long. However, you don't need to keep a separate KeyProperty() for the customer anymore, since you can find it in the purchase's key.
class Purchase(ndb.Model):
    # you don't need a KeyProperty for the customer anymore
    # purchase data fields
    price = ndb.IntegerProperty()

# look up the customer from a purchase via the parent of its key
purchase.key.parent().get()
And in order to query for all the purchases of a given customer:
Purchase.query(ancestor=customer_entity.key).fetch()
The actual structure of the entities doesn't change much, mostly the syntax. But the ancestor queries are fully consistent.
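The "customer key embedded in the purchase key" idea can be sketched without ndb. In this plain-Python illustration a key is modelled as a tuple of (kind, id) pairs, so a child key is simply its parent's path plus one more pair. The helpers `make_key`, `parent_of` and `ancestor_query` are hypothetical and only mimic the concept; they are not the ndb API and say nothing about where data is physically stored.

```python
def make_key(kind, ident, parent=None):
    """Model a key as a path of (kind, id) pairs; a child extends its parent's path."""
    return (parent or ()) + ((kind, ident),)


def parent_of(key):
    """Return the parent key, or None for a root entity."""
    return key[:-1] or None


def ancestor_query(store, kind, ancestor):
    """Return entities of `kind` whose key path starts with `ancestor`."""
    n = len(ancestor)
    return [entity for key, entity in store.items()
            if key[-1][0] == kind and key[:n] == ancestor]
```

Because the parent is recoverable from the key itself, the purchase needs no separate property pointing at the customer.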
The third option that you kinda describe is not recommended. I'm just including it for completeness. It's a bit confusing, and would go something like this:
class Purchase(ndb.Model):
    # purchase data fields
    price = ndb.IntegerProperty()

class Customer(ndb.Model):
    purchases = ndb.StructuredProperty(Purchase, repeated=True)
This is a special case which uses ndb.StructuredProperty. In this case, you will only have a single Customer entity in the datastore. While there's a class for purchases, your purchases won't get stored as separate entities - they'll just be stored as data within the Customer entity.
There may be a couple of reasons to do this. You're only dealing with one entity, so your data fetch will be fully consistent. You also have reduced write costs when you have to update a bunch of purchases, since you're only writing a single entity. And you can still query on the properties of the Purchase class. However, this was designed for only having a limited number of repeated objects, not hundreds or thousands. And each entity is limited to a total size of 1MB, so you'll eventually hit that and you won't be able to add more purchases.
(from your personal tags I assume you are a java guy, using GAE+java)
First, don't use the ancestor relationships - this has a special purpose to define the transaction scope (aka Entity Groups). It comes with several limitations and should not be used for normal relationships between entities.
Second, do use an ORM instead of low-level API: my personal favourite is objectify. GAE also offers JDO or JPA.
In GAE relations between entities are simply created by storing a reference (a Key) to an entity inside another entity.
In your case there are two possibilities to create a one-to-many relationship between a Customer and its Purchases.
public class Customer {
    @Id
    public Long customerId; // 'Long' identifiers are autogenerated
    // first option: parent-to-children references
    public List<Key<Purchase>> purchases; // one-to-many parent-to-child
}
public class Purchase {
    @Id
    public Long purchaseId;
    // option two: child-to-parent reference
    public Key<Customer> customer;
}
Whether you use option 1 or option 2 (or both) depends on how you plan to access the data. The difference is whether you use get or query. The difference between the two is in cost and speed, get being always faster and cheaper.
Note: references in GAE Datastore are manual, there is no referential integrity: deleting one part of a relationship will produce no warning/error from Datastore. When you remove entities it's up to your code to fix references - use transactions to update two entities consistently (hint: no need to use Entity Groups - to update two entities in a transaction you can use XG transactions, enabled by default in objectify).
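As a rough illustration of the two access paths and of the manual bookkeeping that the note describes, here is a plain-Python sketch over a dict. It keeps both reference directions at once; the helper names and the tuple-shaped keys are hypothetical, and a real implementation would do the delete inside a transaction.

```python
def purchases_via_get(store, customer_key):
    """Option 1: follow the keys stored on the customer with direct gets."""
    customer = store[customer_key]
    return [store[k] for k in customer["purchases"] if k in store]


def purchases_via_query(store, customer_key):
    """Option 2: filter purchases by their back-reference to the customer."""
    return [e for k, e in store.items()
            if k[0] == "Purchase" and e.get("customer") == customer_key]


def delete_purchase(store, purchase_key):
    """No referential integrity: the caller must also remove the stored reference."""
    purchase = store.pop(purchase_key)
    owner = store.get(purchase["customer"])
    if owner is not None:
        owner["purchases"].remove(purchase_key)
```

If `delete_purchase` skipped the second step, the customer would silently keep a dangling key, which is exactly the situation Datastore will not warn you about.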
I think the best approach in this specific case would be to use a parent structure.
class Customer(ndb.Model):
    pass

class Purchase(ndb.Model):
    pass

customer = Customer()
customer_key = customer.put()
purchase = Purchase(parent=customer_key)
You could then get all purchases of a customer using
purchases = Purchase.query(ancestor=customer_key)
or get the customer who bought the purchase using
customer = purchase.key.parent().get()
It might be a good idea to keep track of the purchase count indeed when you use that value a lot.
You could do that using a _pre_put_hook or _post_put_hook
class Customer(ndb.Model):
    # default to 0 so the first increment does not fail on None
    count = ndb.IntegerProperty(default=0)

class Purchase(ndb.Model):
    def _post_put_hook(self, future):
        # TODO check whether this is a new entity.
        customer = self.key.parent().get()
        customer.count += 1
        customer.put()
It would also be good practice to do this action in a transaction, so the count is rolled back when putting the purchase fails, and the other way around.
@ndb.transactional
def save_purchase(purchase):
    purchase.put()
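To show why the transaction matters for the counter, here is a small plain-Python sketch of all-or-nothing behaviour. It is not how ndb implements transactions; `run_in_transaction`, `FailedPut` and this version of `save_purchase` are hypothetical helpers that apply changes to a working copy and commit only when nothing fails.

```python
import copy


class FailedPut(Exception):
    """Raised to simulate a write that fails partway through."""


def run_in_transaction(store, operation):
    """Apply `operation` to a working copy and commit only if it succeeds."""
    working = copy.deepcopy(store)
    operation(working)  # any exception here leaves `store` untouched
    store.clear()
    store.update(working)


def save_purchase(purchase_key, purchase, customer_key, fail=False):
    """Build an operation that bumps the customer's count and stores the purchase."""
    def operation(working):
        working[customer_key]["count"] += 1
        if fail:
            raise FailedPut(purchase_key)
        working[purchase_key] = purchase
    return operation
```

Without the working copy, the failed write in the second case would leave the count incremented for a purchase that was never stored.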
Even after reading the documentation, I seem to have a fundamental misunderstanding about Google App Engine's entity groups. My goal is a trivial example of ORM: I've got some Employees assigned to Departments. An employee can only be assigned to one department, but a department can have many employees. It's your standard one-to-many relationship.
Given the employee's key (email) and a department name, I want to look up both the employee and department objects, and if they don't exist, to create them.
What follows is pseudocode, not meant to compile. If producing code that will compile would help you help me, I'd be happy to do so, but I think my problem is conceptual.
Data Objects:
@Entity
public class Department {
    private Key key;
    private String name;
    // getters and setters
}
@Entity
@NamedQuery(name="getEmployeesInDept", query="SELECT a from Employee a WHERE a.dept=:dept")
public class Employee {
    private Key key;
    private String firstName;
    @ManyToOne
    private Department dept;
    // getters and setters
}
Look Up or Create
Key employeeKey = KeyFactory.createKey("Employee", email);
Employee employee = entityManager.find(Employee.class, employeeKey);
if(employee == null)
{
    Key deptKey = KeyFactory.createKey("Department", deptName);
    Department dept = entityManager.find(Department.class, deptKey);
    if(dept == null)
    {
        dept = new Department();
        dept.setKey(deptKey);
        dept.setName(deptName);
        entityManager.persist(dept);
    }
    employee = new Employee();
    employee.setKey(employeeKey);
    employee.setFirstName(firstName);
    employee.setDept(dept);
    entityManager.persist(employee);
}
entityManager.close();
print("Found employee " + employee.getFirstName() + " from " + dept.getName() + " department!");
That's the logic that worked perfectly when I was using ye olde generic ORM before I tried migrating to Google App Engine.
However, on GAE, I get an exception like:
javax.persistence.PersistenceException: Detected attempt to establish
Employee("bob@mycompany.com") as the parent of Department(14) but the
entity identified by Department(14) has already been persisted without
a parent. A parent cannot be established or changed once an object has
been persisted.
While I understand that in order to get Employee and Department into the same entity group (which I would prefer), I have to make one of them the parent of the other, their relationship isn't really one that fits into the parent-child paradigm in my mind.
I have tried wrapping various parts between entityManager.getTransaction().begin() and entityManager.getTransaction().end(), but to no avail.
I can get around this by including Department's key as part of Employee's key (thus making Department the parent of Employee), but then I have no idea how to look up an Employee based on their email and figure out what department they're in, or, conversely, how to look up all the employees in a given department.
Does this make sense? How should I structure this relationship in GAE? Surely this is a very common pattern that has a simple solution that is just eluding me.
I'm convinced that there's some fundamental piece of this puzzle that I'm missing, because it seems rather ridiculous that a simple many-to-one foreign key cannot be easily represented in GAE's ORM.
Cheers!
So if it doesn't fit owned relationships then make it unowned, which is supported in v2.x of the GAE JPA plugin.
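Conceptually, an unowned relationship means the employee just stores the department's key instead of living under it as a child. The following plain-Python sketch shows the look-up-or-create flow from the question in that style. It is not the GAE JPA plugin API; the dict-based store, the tuple keys and both function names are hypothetical.

```python
def find_or_create_employee(store, email, first_name, dept_name):
    """Return the employee for `email`, creating it (and its department) if missing."""
    emp_key = ("Employee", email)
    if emp_key not in store:
        dept_key = ("Department", dept_name)
        store.setdefault(dept_key, {"name": dept_name})
        # Unowned: keep only the department's key; neither entity is the other's parent.
        store[emp_key] = {"first_name": first_name, "dept": dept_key}
    return store[emp_key]


def employees_in(store, dept_name):
    """Find employees by filtering on the stored department key."""
    dept_key = ("Department", dept_name)
    return sorted(key[1] for key, e in store.items()
                  if key[0] == "Employee" and e["dept"] == dept_key)
```

Both directions asked about in the question stay available: the employee's department is one lookup by key away, and the department's employees come from a filter on the stored key.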
I need people's advice as to whether this is the best way to achieve what I want. Apologies in advance if this is a little too subjective.
I want to use Entity Framework V.1 to create something similar to the following C# classes:
abstract class User
{
    public int UserId;
    public string TelephoneNumber;
}
class Teacher : User
{
    public string FavorateNewspaper;
}
class Pupil : User
{
    public string FavorateCartoon;
}
I need people's advice as to how to best to persist this information.
I plan to use SQL Server and the normal Membership Provider. It will create for me a table called aspnet_Users. There will be two roles: Teacher and Pupil.
I will add fields to the table aspnet_Users which are common to both roles. Then create tbl_Teachers and tbl_Pupils to hold information specific to one role.
So My database will look a bit like this:
aspnet_Users
    int UserId
    varchar TelephoneNumber
tbl_Teachers
    int UserId
    varchar FavorateNewspaper
tbl_Pupils
    int UserId
    varchar FavorateCartoon
The idea of course being that I can match up the data in aspnet_Users to that in either tbl_Teachers or tbl_Pupils by joining on UserId.
So to summarise, my questions are:
Is my database structure the best option to achieve these classes?
Should I try to wrap the Entities within my own POCO classes?
Should I change my database structure so that EF creates entities which are closer to the classes I want?
EDIT: I re-arranged my question to make it a bit clearer what I'm asking.
If you're using EF 1, then POCO can be a bit unpleasant. Unless there's a good reason not to, I'd just use normal EF entities. Your database model is fine, by the way, and is an example of TPT (Table Per Type) inheritance mapping. You could either use the wizard to create entities from the database, or create your entities and map them to the associated tables. If you do the former you'd initially end up with three unrelated entities. You'd then use the designer to tell EF that Pupil and Teacher inherit from User, and that User is abstract.
In general, one of the strengths of EF is that the entities don't have to match that closely to the tables that persist them. In this case though there's a natural mapping.
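To see what the Table Per Type layout implies at the SQL level, here is a small sketch using Python's built-in sqlite3 module rather than SQL Server and EF. The table and column names follow the question; the sample rows, the `load_users` helper and the assumption that every user has exactly one subtype row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aspnet_Users (UserId INTEGER PRIMARY KEY, TelephoneNumber TEXT);
    CREATE TABLE tbl_Teachers (UserId INTEGER PRIMARY KEY REFERENCES aspnet_Users(UserId),
                               FavorateNewspaper TEXT);
    CREATE TABLE tbl_Pupils   (UserId INTEGER PRIMARY KEY REFERENCES aspnet_Users(UserId),
                               FavorateCartoon TEXT);
    -- made-up sample rows
    INSERT INTO aspnet_Users VALUES (1, '555-0100'), (2, '555-0101');
    INSERT INTO tbl_Teachers VALUES (1, 'The Gazette');
    INSERT INTO tbl_Pupils   VALUES (2, 'Road Runner');
""")


def load_users(conn):
    """Join the base table to each subtype table and tag rows with their concrete type."""
    rows = conn.execute("""
        SELECT u.UserId, u.TelephoneNumber, t.FavorateNewspaper, p.FavorateCartoon,
               t.UserId IS NOT NULL AS is_teacher
        FROM aspnet_Users u
        LEFT JOIN tbl_Teachers t ON t.UserId = u.UserId
        LEFT JOIN tbl_Pupils   p ON p.UserId = u.UserId
        ORDER BY u.UserId
    """).fetchall()
    return [
        {"type": "Teacher" if is_teacher else "Pupil", "UserId": uid,
         "TelephoneNumber": phone,
         **({"FavorateNewspaper": paper} if is_teacher else {"FavorateCartoon": cartoon})}
        for uid, phone, paper, cartoon, is_teacher in rows
    ]
```

This join on UserId is roughly the work an inheritance mapping does for you when it materialises Teacher and Pupil objects from the three tables.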