Rails - Big vs Distributed Tables

Rails - Big vs Distributed Tables - database

I am relatively new to Rails.
I have a User model through Devise. I am wondering if it is more efficient to have all the additional fields i need for the user, in a separate Profile model.
I am coming across similar situations where I am considering creating a new model and using a has_one association to that model however it seems like maybe it will be cleaner if I had all the attributes belonging to a user within the User model. How do you deal with such situations? What effect will it have on application performance?
Can someone elaborate on the advantages and disadvantages of creating has_one relationship, especially in terms of performance.

I am also relatively new to Rails as well, but this is my take...
The benefits of associations in Rails are pretty obvious I think. In this specific case I think you are fine to go either way. Here are some things to consider...
If you use a has_one relationship, you must remember that when you are referencing the profile you will end up with something similar to this:
user = User.first
puts user.profile.first_name
puts user.profile.age
Simple enough, but if you wanted something like user.first_name you would need to delegate that method to the Profile model. This is all a matter of preference.

Related

MVC: Correct pattern to reference objects from a different model

I'm using CakePHP2.3 and my app has many associations between models. It's very common that a controller action will involve manipulating data from another model. So I start to write a method in the model class to keep the controllers skinny... But in these situations, I'm never sure which model the method should go in?
Here's an example. Say I have two models: Book and Author. Author hasMany Book. In the /books/add view I might want to show a drop-down list of popular authors for the user to select as associated with that book. So I need to write a method in one of the two models. Should I...
A. Write a method in the Author model class and call that method from inside the BooksController::add() action...
$this->Author->get_popular_authors()
B. Write a method in the Book model class that instantiates the other model and uses it's find functions... Ex:
//Inside Book::get_popular_authors()
$Author = new Author();
$populars = $Author->find('all', $options);
return $populars;
I think my question is the same as asking "what is the best practice for writing model methods that primarily deal with associations between another model?" How best to decide which model that method should belong to? Thanks in advance.
PS: I'm not interested in hearing whether you thinking CakePHP sucks or isn't "true" MVC. This question is about MVC design pattern, not framework(s).

IMHO the function should be in the model that most closely matches the data you're trying to retrieve. Models are the "data layer".
So if you're fetching "popular authors", the function should be in the Author model, and so on.
Sometimes a function won't fit any model "cleanly", so you just pick one and continue. There are much more productive design decisions to concern yourself with. :)
BTW, in Cake, related models can be accessed without fetching "other" the model object. So if Book is related to Author:
//BooksController
$this->Book->Author->get_popular_authors();
//Book Model
$this->Author->get_popular_authors();
ref: http://book.cakephp.org/2.0/en/models/associations-linking-models-together.html#relationship-types

Follow the coding standards: get_popular_authors() this should be camel cased getPopularAuthors().
My guess is further that you want to display a list of popular authors. I would implement this using an element and cache that element and fetching the data in that element using requestAction() to fetch the data from the Authors controller (the action calls the model method).
This way the code is in the "right" place, your element is cached (performance bonus) and reuseable within any place.
That brings me back to
"what is the best practice for writing model methods that primarily
deal with associations between another model?"
In theory you can stuff your code into any model and call it through the assocs. I would say common sense applies here: Your method should be implement in the model/controller it matches the most. Is it user related? User model/controller. Is it a book that belongs to an user? Book model/controller.
I would always try to keep the coupling low and put the code into a specific domain. See also separation of concerns.

I think the key point to answer your question is defined by your specifications: "... popular authors for the user to select as associated with that book.".
That, in addition to the fact that you fetch all the authors, makes me ask:
What is the criteria that you will use to determine which authors are popular?
I doubt it, but if that depends on the current book being added, or some previous fields the user entered, there's some sense in adopting solution B and write the logic inside the Book model.
More likely solution A is the correct one because your case needs the code to find popular authors only in the add action of the Book controller. It is a "feature" of the add action only and so it should be coded inside the Author model to retrieve the list and called by the add action when preparing the "empty" form to pass the list to the view.
Furthermore, it would make sense to write some similar code inside the Book model if you wanted, e.g., to display all the other books from the same author.
In this case you seem to want popular authors (those with more books ?), so this clearly is an "extra feature" of the Author model (That you could even code as a custom find method).
In any case, as stated by others as well, there's no need to re-load the Author model as it is automatically loaded via its association with Books.

Look out for Premature Optimization. Just build your project till it works. You can always optimize your code or mvc patterns after you do a review of your code. And most important after your project is done most of the time you will see a more clear or better way to do it faster/smarter and better than you did before.
You can't and never will build a perfect mvc or project in one time. You need to find yourself a way of working you like or prefer and in time you'll learn how to improve your coding.
See for more information about Premature Optimization

Backbone Relational and subviews, best "save" strategy

I'm using Backbone-relational like this:
class window.Car extends Backbone.RelationalModel
class window.Person extends Backbone.RelationalModel
relations: [{
type: Backbone.HasOne
key: 'car'
relatedModel: Car
}]
There is also a PersonView, which embeds a subview CarView.
Now my question is, what is the best strategy when the user clicks "Save" in the PersonView? The problem is that the save will happen in two steps, first the car then the person. But what if validation fails with the person? It will cancel the save, but the car will be already saved!
Maybe Backbone-relational is not the best option here? Any alternative?
More generally, I'm more and more frustrated with Backbone playing not very nice with deeply embedded documents (I'm using MongoDB). Yes, the Todo app is nice, but the real world is more complex! Any guidance or tutorial would be very much appreciated.

It’s difficult to answer without to know the details, but, are you sure that you need relational models in the browser side?
Backbone is designed for restful applications. Is your API in the server side restful?
In your case (and without really understanding the constraints you have) I can think of the following implementation.
In the server the following URIs API:
[…]/carType/{carType}
[…]/persons/{person}
[…]/cars/{car}
In this implementation, “car” represents an actual physical object where “carType” represents a class of car. The backbone model for “car” contains the ID for the “carType” and the ID for the “person”. There are also backbone models for “carType” and “person”.
In this way, when you want to associate a “person” and a “carType” you create a new “car” and make a POST to the server. As “car” is its own independent object (and has its own URL), you can operate in a transactional way with it (that is what, I think, you are asking).
I hope it helps and the answer its not very far of what you are actually trying to do.

The best save strategy would be to save the whole thing atomically (in one step). Otherwise, you're always going to have these type of problems where failing to save one object on the server means you're going to have to destroy other objects on both the server and the client.
To support that, Backbone-relational has excellent support for serializing and deserializing nested objects.

Self Tracking Entities Traffic Optimization

I'm working on a personal project using WPF with Entity Framework and Self Tracking Entities. I have a WCF web service which exposes some methods for the CRUD operations. Today I decided to do some tests and to see what actually travels over this service and even though I expected something like this, I got really disappointed. The problem is that for a simple update (or delete) operation for just one object - lets say Category I send to the server the whole object graph, including all of its parent categories, their items, child categories and their items, etc. I my case it was a 170 KB xml file on a really small database (2 main categories and about 20 total and about 60 items). I can't imagine what will happen if I have a really big database.
I tried to google for some articles concerning traffic optimization with STE, but with no success, so I decided to ask here if somebody has done something similar, knows some good practices, etc.
One of the possible ways I came out with is to get the data I need per object with more service calls:
return context.Categories.ToList();//only the categories
...
return context.Items.ToList();//only the items
Instead of:
return context.Categories.Include("Items").ToList();
This way the categories and the items will be separated and when making changes or deleting some objects the data sent over the wire will be less.
Has any of you faced a similar problem and how did you solve it or did you solve it?

We've encountered similiar challenges. First of all, as you already mentioned, is to keep the entities as small as possible (as dictated by the desired client functionality). And second, when sending entities back over the wire to be persisted: strip all navigation properties (nested objects) when they haven't changed. This sounds very simple but is not at all trivial. What we do is to recursively dig into the entities present in trackable collections of say the "topmost" entity (and their trackable collections, and theirs, and...) and remove them when their ChangeTracking state is "Unchanged". But be carefull with this, because in some cases you still need these entities because they have been removed or added to trackable collections of their parent entity (so then you shouldn't remove them).
This, what we call "StripEntity", is also mentioned (not with any code sample or whatsoever) in Julie Lerman's - Programming Entity Framework.
And although it might not be as efficient as a more purist kind of approach, the use of STE's saves a lot of code for queries against the database. We are not in need for optimal performance in a high traffic situation, so STE's suit our needs and takes away a lot of code to communicate with the database. You have to decide for your situation what the "best" solution is. Good luck!

You can find an Entity Framework project item at http://selftrackingentity.codeplex.com/. With version 0.9.8, I added a method called GetObjectGraphChanges() that returns an optimized entity object graph with only objects that have changes.
Also, there are two helper methods: EstimateObjectGraphSize() and EstimateObjectGraphChangeSize(). The first method returns the estimate size of the whole entity object along with its object graph; and the later returns the estimate size of the optimized entity object graph with only object that have changes. With these two helper methods, you can decide whether it makes sense to call GetObjectGraphChanges() or not.

Silverlight LINQtoSQL: one big dataclass, or several small ones?

I'm new to Silverlight, but being dumped right into the fray - good way to learn I suppose :o)
Anyway, the webapp I'm working on has a relatively complex database structure that represents various object types that are linked to each other, and I was wondering 2 things:
1- What is the recommended approach when it comes to dataclasses? Have just one big dataclass, or try and separate it into several smaller dataclasses, keeping in mind they will need to reference each other?
2- If the recommended approach is to have several dataclasses, how do you define the inter-dataclasses references?
I'm asking because I did a small test. In my DB (simplified here, real model is more complex but that's not important), I have a table "Orders" and a table "Parameters". "Orders" has a foreign key on "Parameters". What I did is create 2 dataclasses.
The first one, ParamClass, were I dropped the "Parameters" table only, so I can have a nice "parameter" class. I then created a simple service to add basic SELECT and INSERT functionality.
The second one, OrdersClass, where I dropped both tables, so that the relation between the tables would automatically create a "EntityRef<parameter>" variable inside the "order" class. I then removed the "parameters" class that was automatically created in the OrdersClass dataclass, since the class has already been declared in the ParamClass dataclass. Again I created a small service to test it.
So far so good, it builds happily. The problem is that when I try to handle things on the application code, I added service references for both dataclasses, but it is not happy doing something like:
OrdersServiceReference.order myOrder = new OrdersServiceReference.order();
myOrder.parameter = new ParamServiceReference.parameter(); //<-PROBLEM IS HERE
It comlpains that it cannot implicitly convert from type 'MytestDC.ParamServiceReference.parameter' to 'MytestDC.OrdersServiceReference.parameter'
Do I somehow need to declare some sort of reference to ParamClass from OrdersClass, or how do I "convert" one to the other?
Is this even a recommended and efficient way of doing this?
Since it's a team-project, I initially wanted to separate the dataclasses so that they (and their services) can be easily checked out by one member without checking out the whole entire dataclass.
Any help appreciated!
PS: using Silverlight 4, in case that's important

Based on the widely accepted Single Responsability Principle (SRP), a class should always be responsible for one task, and one task only.
That pretty much invalidates your "one big dataclass" approach.
I would always recommend smaller, more manageable bits that can be combined, instead of one humonguous class that does everything (except brew coffee for you).
Resources for the SRP:
Wikipedia on SRP
OODesign: Single Responsibility Principle
ObjectMentor: list of articles on good app design - which has a few links to PDF documents, like this one on SRP written by Robert C. Martin - the "guru" on proper OO design

OK, some more research let me to this: it is not simple to separate classes from a relational model using LINQtoSQL. I ended up switching to an Entity Framework approach, which itself doesn't deal with it gracefully (see here and there, for example), but at least it solved another major problem I had with LINQtoSQL.
There are other ORMs out there that are apparently much more capable at this (NHibernate comes up often in recommendations), unfortunately, I don't have time to investigate them now, being under such a tight deadline.
As for the referencing, it was quite simple, change the line to:
myOrder.parameter = new OrderServiceReference.parameter();
even though I removed the declaration from that dataclass.
Hope this helps someone!

Django Advantage forms.Form vs forms.ModelForm

There is a question very similar to this but I wanted to ask it in a different way.
I am a very customized guy, but I do like to take shortcuts at times. So here it goes.
I do find these two classes very similar although one "helps" the programmer to write code faster or have less code/repeating code. Connecting Models to Forms sounds like an obvious thing to do. One thing that is not particularly clear in the docs using a ModelForm. What happens if you need to add extra fields that are not in the Model or some way connected to another Model?
I guess you could subclass that out and make it work, but does that really help you save time than just manually doing it with a Form?
So next question may not have a definite answer if I do subclass it out, and use ModelForm. Is ModelForm particularly faster than Form? Does it still use the same Update techniques or is binding significantly faster in one or the other?

If you want a form across two models, you got a couple options:
1) create two modelforms, save each individually when posted, and if one of the two depends on the other (i.e. foreignkey), set that in your view before saving.
2) try Django's inline formset: http://docs.djangoproject.com/en/dev/topics/forms/modelforms/#using-an-inline-formset-in-a-view
3) Add non-model fields to your modelform. On a ModelForm, you can add fields that are not tied to your model. They are available in cleaned_data as any other field would, be but are simply ignored when the model is saved.
One advantage that ModelForm's have over Form's is you can specify the ordering of fields (searching for how to order Form fields brought to your post incidentally). Obvious other advantages are you don't have to rewrite your model saving code

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Rails - Big vs Distributed Tables - database

Related

MVC: Correct pattern to reference objects from a different model

Backbone Relational and subviews, best "save" strategy

Self Tracking Entities Traffic Optimization

Silverlight LINQtoSQL: one big dataclass, or several small ones?

Django Advantage forms.Form vs forms.ModelForm

Categories

Resources