Rails new migrations not updating old data

I have a model called Article. This model has an attribute called duration, which is a string. The duration can be morning, afternoon, evening or night.
For example, an article object looks like this:
#<Article id: 1, created_at: "2015-09-22 08:00:08", updated_at: "2015-09-22 08:00:08", duration: "morning">
Since the durations share similar properties, I created a base class called Duration and subclasses called Morning, Afternoon, Evening and Night.
A duration can have many articles, but a single article can have only one duration. So I have a has_many / belongs_to association:
app/model/duration.rb
class Duration < ActiveRecord::Base
  has_many :articles
end
app/model/article.rb
class Article < ActiveRecord::Base
  belongs_to :duration
end
Other inherited classes are:
app/model/duration/morning.rb
class Morning < Duration
end
and so on for afternoon.rb, evening.rb and night.rb.
I already have the migration to create the durations table. To add the type column to durations, I have a migration called add_type_to_duration.rb:
class AddTypeToDuration < ActiveRecord::Migration
  def change
    add_column :durations, :type, :string
  end
end
I have another migration file called add_duration_ref_to_articles.rb to add the reference:
class AddDurationRefToArticles < ActiveRecord::Migration
  def change
    add_reference :articles, :duration, index: true
  end
end
I have another migration, add_initial_durations.rb, to create the new durations:
class AddInitialDurations < ActiveRecord::Migration
  def change
    Morning.create
    Afternoon.create
    Evening.create
    Night.create
  end
end
Now I want to update the old data to match the new schema. So I have another migration called update_articles_to_have_duration.rb:
class UpdateArticlesToHaveDuration < ActiveRecord::Migration
  def change
    morning = Duration.find_by_type("Morning")
    Article.where(duration: "morning").find_each do |article|
      article.update!(duration_id: morning.id)
    end
  end
end
Now when I run the migrations, all the articles that had duration = "morning" end up with duration_id = nil. However, when I run the last migration again with rake db:migrate:redo STEP=1, the articles get the correct duration_id. I think somehow the migration is not run properly, yet I don't get any error while running the migrations. Can anyone please let me know what I am doing wrong here?
Thanks for the help.

Since duration_id is set properly when you run the migration a second time, the most likely reason it doesn't work on the first go is that the migrations are not running in the order you've shown.
Migration files have timestamps on them, and they run oldest-first when you do rake db:migrate.
Check the timestamps of the migration files to ensure they're ordered the way you want them to run:
❯ rake db:migrate:status
database: ~/blog/db/development.sqlite3
Status   Migration ID    Migration Name
--------------------------------------------------
   up    20150907015616  Create articles
  down   20150907031746  Create comments
  down   20150909034614  Devise create users
You can use the above command to see the migration IDs and then run the migrations one at a time, to verify my assumption, with commands like the following:
❯ bin/rake db:migrate VERSION=20150907015616
❯ bin/rake db:migrate VERSION=20150907031746
❯ bin/rake db:migrate VERSION=20150909034614
If the migration files do not follow the chronological order you intend, you'll have to re-order them (better: delete and recreate them) or swap their contents.
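For example, if you delete and recreate the data-backfill migration so that it gets a fresh (newest) timestamp, the regenerated file might look like the sketch below; the timestamp in the file name is made up, and the class, model and column names are the ones from your question:
# db/migrate/20150923120000_update_articles_to_have_duration.rb (timestamp assumed)
class UpdateArticlesToHaveDuration < ActiveRecord::Migration
  def up
    # This file now has the newest timestamp, so the durations rows and the
    # articles.duration_id column already exist when it runs.
    morning = Duration.find_by_type("Morning")
    Article.where(duration: "morning").find_each do |article|
      article.update!(duration_id: morning.id)
    end
  end

  def down
    Article.update_all(duration_id: nil)
  end
end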

Related

Rails - include? with object or object id

I have an object called Company which has a method called owners that returns an array of User objects. I want to filter those companies by checking whether the current user is in the owners array. First I did something like this (which works):
Company.select { |c| c.owners.include?(current_user) }
However, I figured a more efficient check would be to compare only the ids instead of whole objects:
Company.select { |c| c.owners.map(&:id).include?(current_user.id) }
Can anyone help me understand if there is a difference between the two options?
There's really no difference between the two. Both are very inefficient and won't work if your Company table is large: they load all the Company records into memory and, furthermore, call c.owners, which fires an additional query for each record. This is called an N+1 query. You can address the N+1 part by using Company.all.includes(:owners), but you'd still have the issue of loading everything into memory.
It's hard to give you exact code since you haven't shared your model definitions (specifically, how the Company#owners association is defined), but I'll assume you have a CompanysOwners join table. In that case I would recommend the following code:
companies = CompanysOwners.where(owner: current_user).includes(:company).map(&:company)
This only fires off a single query, and doesn't load more records into memory than are needed. You could get rid of the includes(:company) part and it would still work, but be slower.
If your Company#owners association is defined differently than this, feel free to leave a comment and I can show you how to modify this for your needs.
By the way, to address your original question more directly: under the hood, c.owners.include?(current_user) uses == to compare the records, and ActiveRecord::Core#== compares the IDs under the hood. You can see the source code for proof of that: https://api.rubyonrails.org/v6.1.3.1/classes/ActiveRecord/Core.html#method-i-3D-3D. So the two options really are doing the same thing.
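A tiny illustration of that equality rule, using the Company model from the question and assuming a record with id 1 exists:
a = Company.find(1)
b = Company.find(1)
a.equal?(b)  # => false, two distinct Ruby objects in memory
a == b       # => true, same class and same persisted id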
I have an object called Company which has a method called owners that returns an array of User objects.
Well that's your problem right there.
This should really be handled on the model layer by setting up an indirect association:
class User < ApplicationRecord
  has_many :company_ownerships,
           foreign_key: :owner_id,
           inverse_of: :owner
  has_many :companies, through: :company_ownerships
end

class CompanyOwnership < ApplicationRecord
  belongs_to :owner, class_name: 'User'
  belongs_to :company
end

class Company < ApplicationRecord
  has_many :company_ownerships
  has_many :owners, through: :company_ownerships
end
This will let you get the companies owned by a user by simply calling:
current_user.companies
This is vastly more efficient, as it does a single query to fetch the associated records instead of loading the entire table into memory. It also returns an ActiveRecord::Relation instead of an array, which lets you add additional scopes if needed:
current_user.companies.where(bankrupt: false)
It's also superior from a code design standpoint, as the business logic is encapsulated in your models instead of leaking implementation details all over the place.
It will also let you use includes/preload/eager_load to avoid N+1 queries:
@users = User.includes(:companies).all

@users.each do |user|
  # this loads all the companies in one single query instead
  # of one query per user
  user.companies.each do |company|
    puts company.name
  end
end
If you ever, for some reason, need to check whether two records are related, you want to use a join and a where clause:
@companies = Company.joins(:owners)
                    .where(users: { id: current_user.id })

Instantiate new data on migration

Suppose I have a model Person. Now I create a new model:
class Ranking(models.Model):
    person = models.ForeignKey(Person)
    score = models.IntegerField(null=False, default=100)
    date_created = models.DateTimeField(auto_now_add=True)
The thing is, I want each Person to have at least one Ranking, so on creation of new Person objects I can just create a new Ranking for each of them.
What I don't know is how to create a default Ranking instance for each of the existing Person objects in the db.
In a Django script it would be as simple as something like:
for person in people:
    Ranking(person=person).save()
Is there a way to add that code to the South forwards migration file? Is there a better way of solving this problem? Any ideas?
First, auto-generate the migration: python manage.py schemamigration myapp --auto.
Then find something that resembles the following in the migration file (presumably myapp/migrations/00xx_auto__add__ranking.py):
def forwards(self, orm):
    # Adding model 'Ranking'
    db.create_table(u'myapp_ranking', (
        (u'id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
        ('person', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['myapp.Person'])),
        ('score', self.gf('django.db.models.fields.IntegerField')(default=100)),
        ('date_created', self.gf('django.db.models.fields.DateTimeField')(auto_now_add=True, blank=True)),
    ))
    db.send_create_signal(u'myapp', ['Ranking'])
After this, insert something like the following:
# Create 'blank' Ranking entries for all extant Person objects
for person in orm.Person.objects.all():
    orm.Ranking.objects.create(person=person)
Other approaches include splitting this into three migrations (this is better if, say, you have a large dataset in a production environment):
1. add the model, with the person field not required
2. add a separate data migration (python manage.py datamigration myapp), and insert into it code to do what I suggested above (a sketch follows after this list)
3. run the two migrations above (allow this to take time if necessary)
4. change the person field to be required once everything is populated, and run this final migration
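For step 2, the generated data migration might end up looking roughly like the following sketch; the file name, the app label myapp and the backwards behaviour are assumptions:
# myapp/migrations/00xx_populate_rankings.py (name assumed)
# (the generated file will also contain the frozen `models` dict; omitted here)
from south.v2 import DataMigration


class Migration(DataMigration):

    def forwards(self, orm):
        # Give every existing Person a default Ranking (score falls back to 100).
        for person in orm['myapp.Person'].objects.all():
            orm['myapp.Ranking'].objects.create(person=person)

    def backwards(self, orm):
        # Reversing a data backfill is rarely meaningful; adjust to your needs.
        pass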
The South docs have something along these lines. There's also a similar question here that might give insight.

Rails 3.2.8 - Multiple database access within one single Rails App

I'm building a webapp that has its own database, called 'products_db'. But my app will have to read reviews, which live in the database 'reviews_db', a legacy db used by another system that I can't change because the client wants it that way.
Luckily, both dbs are located on the same SQL Server (MSSQL) instance. I already have the 'activerecord-sqlserver-adapter' working, but I need to figure out a way to access 'reviews_db' from my webapp.
reviews_db doesn't follow any Rails conventions because it's a legacy system.
So, my class Product:
class Product < ActiveRecord::Base
  attr_accessible :name, :description, :price
  has_many :reviews
end
And my class Review:
class Review < ActiveRecord::Base
  # THIS CLASS DOESN'T FOLLOW RAILS CONVENTION
  # HOW DO I SET AND MANAGE LEGACY PRIMARY KEY?
  # HOW DO I CONNECT THIS MODEL TO THE OTHER DATABASE?
  # HOW DO I CONNECT THIS MODEL TO THE RIGHT TABLE NAME?
  attr_accessible :rv_tbl_title, :rv_tbl_id, :rv_tbl_text, :rv_tbl_author, :rv_tbl_ref_prod
  has_one :Product, foreign_key: :rv_tbl_author
end
Is there a gem for this? What's the solution to the questions in the Review class above?
I'm not sure if this first part is necessary or not, but in your database.yml file, make a new connection by adding something like this to the end:
review:
  adapter: sqlserver
  database: reviews_db
  .... put your other configuration info here
Then in your review model review.rb:
class Review < ActiveRecord::Base
  establish_connection :review
  self.table_name = "review_table"
  self.primary_key = "review_id"
end
Change the table name to the correct table, and the primary key to the correct column name.
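Once that's in place, ordinary ActiveRecord calls on Review run against reviews_db while Product keeps using products_db. A quick illustration, using the rv_tbl_* column names from the question and assuming product is a Product record:
# Reads go to reviews_db; no special syntax needed beyond the model setup above.
reviews = Review.where(rv_tbl_ref_prod: product.id)
reviews.each { |r| puts r.rv_tbl_title }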
Then create a new table/model for the sole purpose of having a local reference to a review. You could call it ReviewReference:
class ReviewReference < ActiveRecord::Base
  belongs_to :review
  has_one :product
end
And change your Product model to:
class Product < ActiveRecord::Base
  has_many :reviews, class_name: "ReviewReference"
end
This should get you a long way toward your goal. You might end up having to do a lot of
@reviews = Review.where("some_column = ?", some_value)
or
@reviews = Review.find_by_sql("Some SQL here")
if you're doing more complex queries.
Sorry my answer isn't more concrete, I've only done this once. Props to Chad Fowler's Rails Recipes book for the concept.

Django profile id may not be null using get_or_create - how does it relate to the db?

I followed, more or less, the steps in Django User Profiles - Simple yet powerful.
Not quite the same, because I am in the middle of developing the idea.
From that site I used, in particular, this line:
User.profile = property(lambda u: UserProfile.objects.get_or_create(user=u)[0])
I was always getting an error message on creating the object, typically "XX" may not be null. I solved part of the problems by playing with the models and (in my present case) sqliteman, until I got the same message on the id: "xxx.id may not be null".
On the net I found a description of a possible solution which involved resetting the database, which I was not that happy to do, in particular because, depending on the solution, it might have meant resetting the application db. But because the UserProfile model was fairly new and so far empty, I worked on the DB directly, did a hand-made drop of the table and asked syncdb to rebuild it (kinda risky, though).
Now this is the diff of the sqlite dump:
294,298c290,294
< CREATE TABLE "myt_userdata" (
<     "id" integer NOT NULL PRIMARY KEY,
<     "user_id" integer NOT NULL UNIQUE REFERENCES "auth_user" ("id"),
<     "url" varchar(200),
<     "birthday" datetime
---
> CREATE TABLE myt_userdata (
>     "id" INTEGER NOT NULL,
>     "user_id" INTEGER NOT NULL,
>     "url" VARCHAR(200),
>     "birthday" DATETIME
Please note that both versions were generated by Django. The ">" version was generated from a simple model definition which did have the connection to the user table via:
user = models.ForeignKey(User, unique=True)
The new "<" version has much more information and it is working.
My question:
Why does Django complain that myt_userdata.id may not be null?
The subsidiary question:
Does Django try to relate to the underlying db structure, and how?
(for example, does the not-NULL message come from the model or from the DB?)
The additional question:
I have been a bit reluctant to use South: too complicated, an additional module which I might have to take care of between devel and production, and maybe not that easy if I want to switch DB engine (I am using sqlite only at the devel stage; I plan to move to mysql).
South would probably have worked in this case. Would it? Would you suggest using it anyway?
Edited, FYI:
This is my latest model (the working one):
class UserData(models.Model):
    user = models.ForeignKey(User, unique=True)
    url = models.URLField("Website", blank=True, null=True)
    birthday = models.DateTimeField('Birthday', blank=True, null=True)

    def __unicode__(self):
        return self.user.username

User.profile = property(lambda u: UserData.objects.get_or_create(user=u, defaults={'birthday': '1970-01-01 00:00:00'})[0])
Why does Django complain that myt_userdata.id may not be null?
Because in that table id is not a primary key, so it is not populated automatically. And since you don't provide it on model creation, the DB does not know what to put there.
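For context, Django adds an auto-incrementing primary key called id to every model that does not declare one itself, so the working model is effectively equivalent to this (illustrative only; you don't write the id field yourself):
from django.contrib.auth.models import User
from django.db import models


class UserData(models.Model):
    # Added implicitly by Django when no field sets primary_key=True.
    # In the broken table, id was a plain NOT NULL integer without PRIMARY KEY,
    # so SQLite never auto-filled it and inserts failed with "id may not be null".
    id = models.AutoField(primary_key=True)
    user = models.ForeignKey(User, unique=True)
    url = models.URLField("Website", blank=True, null=True)
    birthday = models.DateTimeField('Birthday', blank=True, null=True)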
Does Django try to relate to the underlying db structure, and how? (for example, does the not-NULL message come from the model or from the DB?)
It's an error from the DB, not from Django.
You can use the sql management command to see exactly what SQL is executed on syncdb. The variant above seems to be a correct table definition made from a correct Django model, and I have no idea how you got the variant below. Write a correct and clear model, and you'll get a correct and working table schema after syncdb.

django-haystack won't index my data

I'm following the instructions in the Haystack documentation.
I'm getting no results for SearchQuerySet().all().
I think the problem is here:
$ ./manage.py rebuild_index
WARNING: This will irreparably remove EVERYTHING from your search index in connection 'default'.
Your choices after this are to restore from backups or rebuild via the `rebuild_index` command.
Are you sure you wish to continue? [y/N] y
Removing all documents from your index because you said so.
All documents removed.
Indexing 0 notes. // <-- here 0 notes!
mysite/note/search_indexes.py looks like
import datetime

import haystack
from haystack import indexes
from note.models import Note


class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

    def get_model(self):
        return Note

    def index_queryset(self):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(pub_date__lte=datetime.datetime.now())
and I have mysite/note/templates/search/indexes/note/Note_text.txt
{{ object.title }}
{{ object.user.get_full_name }}
{{ object.body }}
The "Debugging Haystack" document mentions:
Do you have a search_sites.py that runs haystack.autodiscover?
Have you registered your models with the main haystack.site (usually
within your search_indexes.py)?
But none of search_sites.py, haystack.autodiscover, or haystack.site is mentioned in the first article.
I'm so confused. Are the docs dealing with different Haystack versions?
My setup is:
haystack version 2.0.0.beta
django 1.3.1
solr 3.6.0
sqlite 3
def index_queryset(self):
    """Used when the entire index for model is updated."""
    return self.get_model().objects.filter(pub_date__lte=datetime.datetime.now())
was the culprit.
I don't know why, but commenting it out fixes the problem.
I guess 'time' in my system is somehow messed up.
It should be...
def index_queryset(self, using=None):
I don't know if this will fix your issue or not, but that is the correct signature for the method.
Removing def index_queryset(self) makes sense. It builds a regular Django ORM QuerySet, which decides which objects get put into the full-text index, and your sample index_queryset limits the objects to past timestamps only (before now).
So you really have a datetime handling problem. Check your SQL database's timezone and how it stores times.
A timestamp in UTC is about 5 hours ahead of New York and most of the USA. SQLite caused the same problem for me by choosing UTC times that were in the future.
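Putting the two answers together, the index might end up looking like the sketch below (assuming Haystack 2.x as in the question; whether you keep or drop the pub_date filter depends on how your timestamps are stored):
import datetime

from haystack import indexes
from note.models import Note


class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

    def get_model(self):
        return Note

    def index_queryset(self, using=None):
        # Haystack 2.x passes the `using` argument, so accept it here.
        # The filter only keeps rows whose pub_date is not in the future
        # relative to now(); if the DB stores UTC and now() is local time,
        # everything can look "future" and 0 notes get indexed.
        return self.get_model().objects.filter(
            pub_date__lte=datetime.datetime.now()
        )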
