Django database fields for better queries - database

Hello, I would like to ask a question about setting up some fields in my database. I have three models:
from django.db import models

class Customer(models.Model):
    # fields
    pass

class Entry(models.Model):
    # fields
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE, related_name='entries')

class EntryHistory(models.Model):
    # fields
    entry = models.OneToOneField(Entry, on_delete=models.CASCADE, related_name='history')
I want to get all entry histories for a specific customer. Isn't that something like this?
histories = []
for entry in customer.entries.all():
    try:
        histories.append(entry.history)
    except EntryHistory.DoesNotExist:
        pass  # or continue?
My thought is to go through all of the customer's entries and, if an entry has a history, add it to the list. Would it be better if I just added a customer field to EntryHistory? Would the query be quicker? This will be a frequent operation in my project. But I know that adding a customer to the EntryHistory model is redundant, since I already know who the customer is from the entry field. Should I be worried if, for example, I have a lot of customers and a lot of entries for each?

I want to get all Entry histories for a specific customer.
customer = Customer.objects.get(pk=1)
EntryHistory.objects.filter(entry__customer=customer)
Django provides both forward and reverse lookups across relationships, so a single filter can follow EntryHistory -> Entry -> Customer.
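To see why the single filter is preferable to the per-entry loop, it can help to look at the kind of SQL such a lookup amounts to. The sketch below is not Django code: it uses Python's built-in sqlite3 module with hypothetical table and column names (customer, entry, entry_history) standing in for the three models, and fetches the histories of one customer with one JOIN instead of one query per entry.

```python
import sqlite3

# Hypothetical schema standing in for Customer, Entry and EntryHistory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    );
    CREATE TABLE entry_history (
        id INTEGER PRIMARY KEY,
        entry_id INTEGER NOT NULL UNIQUE REFERENCES entry(id)
    );
    INSERT INTO customer (id) VALUES (1), (2);
    INSERT INTO entry (id, customer_id) VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO entry_history (id, entry_id) VALUES (100, 10), (101, 12);
""")


def histories_for_customer(customer_id):
    """Return the history ids of one customer using a single JOIN."""
    rows = conn.execute(
        """
        SELECT h.id
        FROM entry_history AS h
        JOIN entry AS e ON e.id = h.entry_id
        WHERE e.customer_id = ?
        ORDER BY h.id
        """,
        (customer_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(histories_for_customer(1))
```

Entries without a history simply drop out of the join, so no exception handling is needed, and no redundant customer column on the history table is required as long as the foreign key columns are indexed.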

Related

How do I avoid selecting the same rows twice across two Cassandra SELECT queries?

I am working on a social networking project with Cassandra. Users can subscribe to a profile and have access to the list of people who have subscribed to that same profile. My goal is to retrieve, from a table called users_follows, the list of people subscribed to a profile.
CREATE TABLE users_follows (to_id text, from_id text, followed_at timestamp, PRIMARY KEY(to_id, from_id));
The problem is that some profiles can have thousands of subscribers and I don't want to get them all at once. That's why I'd like to get the list in increments of 20, depending on how far down the user scrolls. My problem is that I can't see how to retrieve the later parts of the list after the first select, because Cassandra always returns the same users.
SELECT * FROM users_follows WHERE to_id = 'xxxxx';
A possible solution was to sort by a timestamp, but that would not work if I want to retrieve the list of people a user is subscribed to (the reverse query). One solution would be to use materialized views, but I'm not sure that would be efficient given the size of the table. Another would be to create two different tables, one user_follows and another user_followers, but I don't think this is really recommended...
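One approach that may fit the "increments of 20" requirement is keyset paging on the clustering column: remember the last from_id of the previous page and ask only for rows that sort after it. In CQL this would presumably look like SELECT * FROM users_follows WHERE to_id = ? AND from_id > ? LIMIT 20, which is allowed because from_id is the clustering column. The sketch below only illustrates the idea: it runs against SQLite as a stand-in, not against Cassandra, and fetch_page and PAGE_SIZE are hypothetical names.

```python
import sqlite3

PAGE_SIZE = 20

# SQLite stand-in for the users_follows table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users_follows ("
    "to_id TEXT, from_id TEXT, followed_at TEXT, "
    "PRIMARY KEY (to_id, from_id))"
)
conn.executemany(
    "INSERT INTO users_follows (to_id, from_id, followed_at) VALUES (?, ?, ?)",
    [("xxxxx", "user%03d" % i, "2020-01-01") for i in range(45)],
)


def fetch_page(to_id, last_from_id=""):
    """Return up to PAGE_SIZE follower ids that sort after last_from_id."""
    rows = conn.execute(
        "SELECT from_id FROM users_follows "
        "WHERE to_id = ? AND from_id > ? "
        "ORDER BY from_id LIMIT ?",
        (to_id, last_from_id, PAGE_SIZE),
    ).fetchall()
    return [row[0] for row in rows]


first = fetch_page("xxxxx")
second = fetch_page("xxxxx", last_from_id=first[-1])
third = fetch_page("xxxxx", last_from_id=second[-1])
print(len(first), len(second), len(third))
```

Each call starts where the previous page ended, so no row is returned twice. The reverse query (who a user follows) would still need its own table or view keyed by from_id.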

GAE: NDB query on list

Google App Engine NDB data model like so:
Users
    Username
    FirstName
    LastName
Posts
    PostID
    PosterUsername
SubscribedPosts
    PostID
    SubscriberUsername
For a specific user, I want to return all the Posts which the user is subscribed to and display them on the page.
Since the wonderful NDB doesn't support JOINs, we do two queries:
postIDList = SubscribedPosts.query(SubscribedPosts.SubscriberUsername == 'johndoe').fetch()
This gives us a list of SubscribedPosts. So how do I take my postIDList list and use it as a filter criteria for a Posts query?
Something like:
results = Posts.query(Post.PostID IN postIDList.PostID)
In a normal relational database, this would be a simple query using table joins. How is this done in Google's ndb?
You are going to run into a lot of bottlenecks if you design your datastore models the same way you would design tables in a relational database, as you have in this example.
Your comment points in one workable direction, although there are other solutions. Going that route, I would drop the SubscribedPosts model altogether and use a repeated KeyProperty in the User model to store the subscribed posts.
See this related post: One-To-Many Example in NDB
It seems you are looking to model a many-to-many relationship, not one-to-many. Read Modelling Entity Relationships (although it is written for the older db library rather than the newer ndb, it still gives the idea).
One of the two entities should maintain a list of keys (repeated=True) of the related entities of the other kind. Which entity should hold the list? Preferably the list should be on the side that usually has fewer relationships, so that the list of keys is smaller. Another consideration is which side is likely to have less contention for updates.
In your specific case, let's say that on average a user subscribes to 10 posts and each post has 100 users subscribed to it. In this case, we would want to put the list of keys on the Users side of the relation.
from google.appengine.ext import ndb

class Users(ndb.Model):
    user_name = ndb.StringProperty()
    first_name = ndb.StringProperty()
    last_name = ndb.StringProperty()
    # keys of the posts this user is subscribed to
    posts = ndb.KeyProperty(kind='Posts', repeated=True)

class Posts(ndb.Model):
    post_id = ndb.StringProperty()
    poster_user_name = ndb.StringProperty()
Establish the relationship by adding to the list in the Users instance:
current_user.posts.append(current_post.key)
For a given Users instance, getting all subscribed Posts is easy since the list of keys of the subscribed Posts is already within the given Users:
ndb.get_multi(given_user.posts)
For a given Posts instance, get all subscribing Users with a query on the repeated property:
query = Users.query(Users.posts == given_post.key)
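The two access paths above can be illustrated without App Engine. The following is a minimal in-memory sketch, not NDB code: plain dicts stand in for the datastore, string keys stand in for ndb.Key objects, and get_multi and subscribers_of are hypothetical helpers that loosely mirror ndb.get_multi and the query on the repeated property.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Post:
    key: str
    poster_user_name: str


@dataclass
class User:
    key: str
    user_name: str
    posts: List[str] = field(default_factory=list)  # keys of subscribed posts


# In-memory stand-ins for the datastore, indexed by entity key.
POSTS: Dict[str, Post] = {}
USERS: Dict[str, User] = {}


def get_multi(keys: List[str]) -> List[Optional[Post]]:
    """Batch lookup by key; a missing key yields None."""
    return [POSTS.get(key) for key in keys]


def subscribers_of(post_key: str) -> List[User]:
    """Reverse direction: users whose key list contains post_key."""
    return [user for user in USERS.values() if post_key in user.posts]


POSTS["p1"] = Post("p1", "alice")
POSTS["p2"] = Post("p2", "bob")
USERS["u1"] = User("u1", "johndoe", posts=["p1", "p2"])
USERS["u2"] = User("u2", "janedoe", posts=["p2"])

subscribed = get_multi(USERS["u1"].posts)
followers = subscribers_of("p2")
print([p.key for p in subscribed], [u.user_name for u in followers])
```

The forward direction is a direct key lookup, while the reverse direction has to search, which is why the list is best kept on the side that is read by key most often.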

FileMaker database relationships

I'm very new to FileMaker and am currently working on a Mac. I've been assigned a simple new system to complete, and I have run into some issues with database relationships. I have experience with PHP/MySQL database connections etc., but FileMaker seems to require a somewhat different mindset and approach.
I'll try to explain this as simply as I can.
Here are the table relationships in my database
What I'm trying to build is a list of "to-do" notes, an interactive menu where the user can add things that need to be done. I've done this with a portal on a layout based on the table "site". The portal is based on the table "todo_notes", which is connected to site through the "site_id".
Here's what it looks like in browse mode
What I'm having problems with is adding a relationship between the todo_notes and the contacts. The contacts are in two separate tables called "county_contacts" and "property_owner_contacts". What I want is for the user to be able to add a single contact from these two tables from a dropdown list. Preferably I'd like to merge these two tables into the same dropdown list.
Let me know if you need any other information or a better explanation of my issue. Any help is very welcome!
If you have a single contacts table with foreign keys for both county and property owner tables, that would let you have a single list for all contacts. From there you could also build a value list based on a relationship, for example to filter only contacts that belong to either county or property owners.
If you then need to further normalize the tables, fields that pertain to either relationship exclusively could be moved to another table from there, as a one to one relationship, if that is a concern.
The Short Answer
You need to create a Contacts table. FileMaker has no way of dynamically generating value lists; a value list can only be based on a field. The only way to generate a list of the contact names is therefore to have them all in the same table.
The Long Answer
Because FileMaker only allows us to use ONE field for a value list, we must create a new table for the contacts. I would recommend replacing the two contact tables with a single contact table (seeing as the fields look the same in both tables) and then adding a toggle on the contact for Owner or County. Alternatively, you could create a single contact table for all of the overlapping fields, with foreign keys to the owner and county tables.
You would then use the fullname field from the contact table and be good to go.
That assumes you do not want to filter the contacts, for example to show only those associated with this site.
To start with, I highly recommend using the anchor-buoy method for organizing the relationship graph. Here's an explanation of the anchor-buoy method: http://sixfriedrice.com/wp/six-fried-rice-methodology-part-2-anchor-buoy-and-data-structures/ . It's just a convention, but it will help you with the idea of context in FileMaker. It's widely accepted in the FileMaker community as the "right" way to organize a relationship graph. I will continue my explanation using this method.
Each Table Occurrence (TO, the boxes in the graph) represents a unique context from which you can view and edit information. In the anchor-buoy method, each table has only one "anchor" TO. I would recommend using only anchor TOs as the context of your layouts. Your portal, and any other corresponding information, will then be on your buoy TOs. Here is what your new portal relationship would look like. You would select fields from your buoy TOs to use in the portal.
The easiest way to filter your value list down to the contacts associated with this site would be to create a foreign key from the contact table to the site, and then add a TO for the contact table to the graph. You would then click the "Include only related values starting from" radio button and specify your new TO.

Does the class diagram need to change if the DB contains views on tables?

I'm a newbie at designing class diagrams. I follow the DAO pattern in my design.
In my project, one user can add multiple contacts, which can in turn be added to many groups.
Let's say 1 contact -> many groups.
As per my requirements, UserContact and UserGroup are the classes which manage contacts and groups.
In the DB, contacts and groups are stored in two different tables.
There is one use case that retrieves all contacts along with their groups. For that I would need to retrieve all contacts first and then, using each contactID, make another query to get its related groups. To avoid this, there is a VIEW on these two tables in the DB.
Now, with the VIEW, I need to make only one query to get a user's contacts and groups.
How can I add this method to the DAO?
How do I change my classes so that contacts and groups can be mapped to my objects?
The following are the classes involved in this use case.
If I understand the question correctly, you can create a GroupContactDAOImpl class that has only query (get) methods, e.g. getGroupContacts(groupId) or getAllGroupContacts(). Since "group contacts" is not an entity that has its own table, the UserContactDAOImpl.addContact and UserGroupDAOImpl.addGroup methods would be sufficient for adding new records.
If I need to get groups associated with one contact, how can I map contact information and groups to one object?
If you want to map contact info and groups to one object, you can add a UserGroup property to the UserContact class (as you would in Hibernate). GroupContactDAOImpl will then have a method like: UserContact getUserContactWithGroups(contactID).
Similarly, if you'd like to get a group's info with all the associated contacts, you can add a UserContact property to the UserGroup class. GroupContactDAOImpl will then have a method like: UserGroup getUserGroupWithContacts(groupID).
There might be other methods to do the same thing; hope this helps.
As I need to insert and update a contact using UserContact, it should contain only contactID, email and name. I use UserContact and UserGroup as DTOs. Can I create a new class that holds both UserContact and UserGroup and is returned by UserContactDAOImpl, and is that a preferable way?
Yes, you can go that way if you'd like to leave the UserContact class as it is. You'd have a new DTO (e.g. UserContactWithGroups) and return it from UserContactDAOImpl (e.g. from a getContactWithGroups method). You can also create a separate DAO class, but it doesn't matter much. It's best to adapt it to your own use case.
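The separate-DTO idea can be sketched as follows. The thread is about Java, but the shape is the same in any language; this is a minimal Python illustration under assumptions, using sqlite3 as the database. The table names (user_contacts, user_groups), the view name (contact_groups_view), its columns and the method name get_contacts_with_groups are all hypothetical, since the original schema is not shown.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserContact:
    contact_id: int
    email: str
    name: str


@dataclass
class UserGroup:
    group_id: int
    name: str


@dataclass
class UserContactWithGroups:
    """Read-only DTO combining one contact with its groups."""
    contact: UserContact
    groups: List[UserGroup] = field(default_factory=list)


class UserContactDAO:
    def __init__(self, conn):
        self.conn = conn

    def get_contacts_with_groups(self):
        """Read the view once and fold its rows into one DTO per contact."""
        rows = self.conn.execute(
            "SELECT contact_id, email, contact_name, group_id, group_name "
            "FROM contact_groups_view ORDER BY contact_id, group_id"
        )
        by_id = {}
        for contact_id, email, contact_name, group_id, group_name in rows:
            dto = by_id.setdefault(
                contact_id,
                UserContactWithGroups(UserContact(contact_id, email, contact_name)),
            )
            if group_id is not None:  # NULL means the contact has no groups
                dto.groups.append(UserGroup(group_id, group_name))
        return list(by_id.values())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_contacts (contact_id INTEGER PRIMARY KEY, email TEXT, name TEXT);
    CREATE TABLE user_groups (group_id INTEGER PRIMARY KEY, contact_id INTEGER, name TEXT);
    CREATE VIEW contact_groups_view AS
        SELECT c.contact_id, c.email, c.name AS contact_name,
               g.group_id, g.name AS group_name
        FROM user_contacts AS c
        LEFT JOIN user_groups AS g ON g.contact_id = c.contact_id;
    INSERT INTO user_contacts VALUES (1, 'ann@example.com', 'Ann'), (2, 'bob@example.com', 'Bob');
    INSERT INTO user_groups VALUES (10, 1, 'family'), (11, 1, 'work');
""")

result = UserContactDAO(conn).get_contacts_with_groups()
```

UserContact and UserGroup stay untouched and can still be used for inserts and updates, while the combined DTO exists only for the read path that goes through the view.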

How would I achieve this using Google App Engine Datastore?

I am a beginner to Datastore and I am wondering how I should use it to achieve what I want to do.
For example, my app needs to keep track of customers and all their purchases.
Coming from relational databases, I would achieve this by creating [Customers] and [Purchases] tables.
In Datastore, I can make [Customers] and [Purchases] kinds.
Where I am struggling is the structure of the [Purchases] kind.
If I make [Purchases] a child of the [Customers] kind, would there be one entity in [Customers] and one entity in [Purchases] that share the same key? Does this mean that inside this [Purchases] entity I would have a property that just keeps growing with each purchase they make?
Or would I have one [Purchases] entity for each purchase they make, and in each of these entities a property that points to an entity in the [Customers] kind?
How does Datastore perform in these scenarios?
Sounds like you don't fully understand ancestors. Let's go with the non-ancestor version first, which is a legitimate way to go:
from google.appengine.ext import ndb

class Customer(ndb.Model):
    # customer data fields
    name = ndb.StringProperty()

class Purchase(ndb.Model):
    # reference to the customer who made the purchase
    customer = ndb.KeyProperty(kind=Customer)
    # purchase data fields
    price = ndb.IntegerProperty()
This is the basic way to go. You'll have one entity in the datastore for each customer. You'll have one entity in the datastore for each purchase, with a KeyProperty that points to the customer.
If you have a purchase and need to find the associated customer, it's right there:
purchase_entity.customer.get()
If you have a Customer, you can issue a query to find all the purchases that belong to the customer:
Purchase.query(Purchase.customer == customer_entity.key).fetch()
In this case, whenever you write either a customer or a purchase entity, the GAE datastore will write that entity to any one of the datastore machines running in the cloud that's not busy. You can get really high write throughput this way. However, when you query for all the purchases of a given customer, you only read back whatever is currently in the indexes. If a new purchase was added but the indexes have not been updated yet, you may get stale data (eventual consistency). You're stuck with this behavior unless you use ancestors.
Now as for the ancestor version. The basic concept is essentially the same. You still have a customer entity, and separate entities for each purchase. The purchase is NOT part of the customer entity. However, when you create a purchase using a customer as an ancestor, it (roughly) means that the purchase is stored on the same machine in the datastore that the customer entity was stored on. In this case, your write performance is limited to the performance of that one machine, and is advertised as one write per second. The benefit, though, is that you can query that machine using an ancestor query and get an up-to-date list of all the purchases of a given customer.
The syntax for using ancestors is a bit different. The customer part is the same. However, when you create purchases, you'd create them as:
purchase1 = Purchase(parent=customer_entity.key)
purchase2 = Purchase(parent=customer_entity.key)
This example creates two separate purchase entities. Each purchase will have a different key, and the customer has its own key as well. However, each purchase key will have the customer_entity's key embedded in it, so you can think of the purchase key as being twice as long. You don't need to keep a separate KeyProperty() for the customer anymore, since you can find it in the purchase's key.
class Purchase(ndb.Model):
    # you don't need a KeyProperty for the customer anymore
    # purchase data fields
    price = ndb.IntegerProperty()

# look up the customer through the purchase's key
purchase.key.parent().get()
And in order to query for all the purchases of a given customer:
Purchase.query(ancestor=customer_entity.key).fetch()
The actual structure of the entities doesn't change much; it is mostly the syntax that differs. But the ancestor queries are fully consistent.
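The idea that a purchase key embeds its customer's key can be pictured with a small model. This is a sketch only, not the NDB API: Key, child_key and ancestor_query are hypothetical stand-ins, and a key is modelled as a tuple of (kind, id) pairs ending in the entity's own pair.

```python
from typing import List, NamedTuple, Optional, Tuple


class Key(NamedTuple):
    """A key modelled as a path of (kind, id) pairs, root first."""
    path: Tuple[Tuple[str, int], ...]

    def parent(self) -> Optional["Key"]:
        # Dropping the last pair yields the parent's key, if there is one.
        return Key(self.path[:-1]) if len(self.path) > 1 else None

    def kind(self) -> str:
        return self.path[-1][0]


def child_key(parent: Key, kind: str, entity_id: int) -> Key:
    """Build a key whose path starts with the parent's full path."""
    return Key(parent.path + ((kind, entity_id),))


def ancestor_query(keys: List[Key], ancestor: Key, kind: str) -> List[Key]:
    """Select keys of the given kind whose path begins with the ancestor's path."""
    n = len(ancestor.path)
    return [k for k in keys if k.kind() == kind and k.path[:n] == ancestor.path]


customer = Key((("Customer", 1),))
other_customer = Key((("Customer", 2),))
purchase1 = child_key(customer, "Purchase", 1)
purchase2 = child_key(customer, "Purchase", 2)
purchase3 = child_key(other_customer, "Purchase", 1)

all_keys = [customer, other_customer, purchase1, purchase2, purchase3]
print(ancestor_query(all_keys, customer, "Purchase"))
```

Because the customer's path is a prefix of every one of its purchase keys, finding the parent needs no extra property, and selecting "everything under this customer" becomes a prefix match.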
The third option that you kinda describe is not recommended. I'm just including it for completeness. It's a bit confusing, and would go something like this:
class Purchase(ndb.Model):
    # purchase data fields
    price = ndb.IntegerProperty()

class Customer(ndb.Model):
    purchases = ndb.StructuredProperty(Purchase, repeated=True)
This is a special case which uses ndb.StructuredProperty. In this case, you will only have a single Customer entity in the datastore. While there's a class for purchases, your purchases won't get stored as separate entities - they'll just be stored as data within the Customer entity.
There may be a couple of reasons to do this. You're only dealing with one entity, so your data fetch will be fully consistent. You also have reduced write costs when you have to update a bunch of purchases, since you're only writing a single entity. And you can still query on the properties of the Purchase class. However, this was designed for a limited number of repeated objects, not hundreds or thousands. And each entity is limited to a total size of 1MB, so you'll eventually hit that limit and won't be able to add more purchases.
(From your tags I assume you are a Java developer, using GAE with Java.)
First, don't use ancestor relationships for this: they have a special purpose, which is to define the transaction scope (aka Entity Groups). They come with several limitations and should not be used for normal relationships between entities.
Second, do use an ORM instead of the low-level API: my personal favourite is Objectify. GAE also offers JDO and JPA.
In GAE, relations between entities are simply created by storing a reference (a Key) to an entity inside another entity.
In your case there are two ways to create a one-to-many relationship between a Customer and its Purchases.
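import java.util.List;
import com.googlecode.objectify.Key;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
public class Customer {
    @Id
    public Long customerId; // 'Long' identifiers are autogenerated
    // first option: parent-to-children references
    public List<Key<Purchase>> purchases; // one-to-many parent-to-child
}

@Entity
public class Purchase {
    @Id
    public Long purchaseId;
    // option two: child-to-parent reference
    public Key<Customer> customer;
}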
Whether you use option 1 or option 2 (or both) depends on how you plan to access the data. The difference is whether you use get or query, which differ in cost and speed: get is always faster and cheaper.
Note: references in GAE Datastore are manual and there is no referential integrity: deleting one part of a relationship will produce no warning/error from Datastore. When you remove entities, it's up to your code to fix the references. Use transactions to update two entities consistently (hint: there is no need to use Entity Groups; to update two entities in a transaction you can use XG transactions, which are enabled by default in Objectify).
I think the best approach in this specific case would be to use a parent structure.
class Customer(ndb.Model):
    pass

class Purchase(ndb.Model):
    pass

customer = Customer()
customer_key = customer.put()
purchase = Purchase(parent=customer_key)
You could then get all purchases of a customer using
purchases = Purchase.query(ancestor=customer_key)
or get the customer who bought the purchase using
customer = purchase.key.parent().get()
It might indeed be a good idea to keep track of the purchase count if you use that value a lot.
You could do that using a _pre_put_hook or _post_put_hook:
class Customer(ndb.Model):
    # default=0 so that the first increment does not fail on None
    count = ndb.IntegerProperty(default=0)

class Purchase(ndb.Model):
    def _post_put_hook(self, future):
        # TODO check whether this is a new entity.
        customer = self.key.parent().get()
        customer.count += 1
        customer.put()
It would also be good practice to do this in a transaction, so that the count update is rolled back when putting the purchase fails, and the other way around.
@ndb.transactional
def save_purchase(purchase):
    purchase.put()
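The all-or-nothing behaviour described above can be shown outside App Engine. The following is a sketch under assumptions, not NDB code: it uses Python's built-in sqlite3 with hypothetical customer and purchase tables, and relies on the connection's context manager to commit both writes together or roll both back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        purchase_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        price INTEGER NOT NULL CHECK (price >= 0)
    );
    INSERT INTO customer (id) VALUES (1);
""")


def save_purchase(customer_id, price):
    """Increment the counter and store the purchase in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE customer SET purchase_count = purchase_count + 1 WHERE id = ?",
            (customer_id,),
        )
        conn.execute(
            "INSERT INTO purchase (customer_id, price) VALUES (?, ?)",
            (customer_id, price),
        )


def purchase_count(customer_id):
    row = conn.execute(
        "SELECT purchase_count FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0]


save_purchase(1, 500)
try:
    save_purchase(1, -1)  # violates the CHECK constraint, so nothing is kept
except sqlite3.IntegrityError:
    pass
print(purchase_count(1))
```

The failed second call leaves the counter where it was, which is the property the transactional decorator is meant to give the NDB version.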

Resources