pull Drupal field values with db_query() or db_select() - database

I've created a content type in Drupal 7 with 5 or 6 fields. Now I want to use a function to query them in a hook_view callback. I thought I would query the node table, but all I get back are the nid and title. How do I get back the values for my created fields using the database abstraction API?

Drupal stores the fields in other tables and can automatically join them in. The storage varies depending on how the field is configured, so the easiest way to access them is by using an EntityFieldQuery. It'll handle the complexity of joining all your fields in. There are some good examples of how to use it here: http://drupal.org/node/1343708
But if you're working in hook_view, you should already be able to access the values; they're loaded into the $node object that's passed in as a parameter. Try running:
debug($node);
in your hook and you should see all the properties.

If you already know the IDs (nid) of the nodes you want to load, you should use node_load_multiple() to load them. This will load the complete nodes with all field values. To search for node IDs, EntityFieldQuery is the recommended way, but it has some limitations. You can also use the database API to query the node table for the nid (and revision ID, vid) of your nodes, then load them using node_load_multiple().
Loading complete nodes can have a performance impact since it loads far more data than you need. If this proves to be an issue, you can try to access the field storage tables directly (if your field values are stored in your SQL database). The schema of these tables is built dynamically depending on the field types, cardinality and other settings. You will have to dig into your database schema to figure it out, and it will probably change as soon as you change something on your fields.
Another solution is to build stub node entities and use field_attach_load() with an $options['field_id'] value to load only the values of a specific field. But this requires a good knowledge and understanding of the Field API.

See How to use EntityFieldQuery article in Drupal Community Documentation.
Creating A Query
Here is a basic query looking for all articles with a photo that are
tagged as a particular faculty member and published this year. In the
last 5 lines of the code below, the $result variable is populated with
an associative array with the first key being the entity type and the
second key being the entity id (e.g., $result['node'][12322] = partial
node data). Note the $result won't have the 'node' key when it's
empty, thus the check using isset, this is explained here.
Example:
<?php
$query = new EntityFieldQuery();
$query->entityCondition('entity_type', 'node')
  ->entityCondition('bundle', 'article')
  ->propertyCondition('status', 1)
  ->fieldCondition('field_news_types', 'value', 'spotlight', '=')
  ->fieldCondition('field_photo', 'fid', 'NULL', '!=')
  ->fieldCondition('field_faculty_tag', 'tid', $value)
  ->fieldCondition('field_news_publishdate', 'value', $year . '%', 'like')
  ->fieldOrderBy('field_photo', 'fid', 'DESC')
  ->range(0, 10)
  ->addMetaData('account', user_load(1)); // Run the query as user 1.
$result = $query->execute();
if (isset($result['node'])) {
  $news_items_nids = array_keys($result['node']);
  $news_items = entity_load('node', $news_items_nids);
}
?>
Other resources
EntityFieldQuery on api.drupal.org
Building Energy.gov without Views

Related

Cakephp contains association when using matching()

In my project I have the following tables: Messages, Recipients, Groups and Users. A Message has many Recipients, and a Recipient has one Group and one User.
In my RecipientsTable::beforeFind I have some code to automatically contain Groups and Users for Recipient finds, since I always need to access those associations.
public function beforeFind($event, $query, $options, $primary) {
    return $query->contain([
        'Groups',
        'Users',
    ]);
}
I don't know if this is a bad design decision but it has worked for me so far.
The problem has come now that I'm trying to filter messages by group, and I tried doing so by using the matching function:
$possible_groups = [1, 2, 3]; // just an example
$query->matching('Recipients', function ($q) use ($possible_groups) {
    return $q->where(['Recipients.group_id IN' => $possible_groups]);
});
When I execute the query, I get the following error:
Messages is not associated with Groups
Is there any way to keep my beforeFind like that and be able to use matching? Or, is there a better way to automatically load associations without using beforeFind?
TL;DR: A hasMany B, B hasOne C. If a query on table A uses matching on table B and table B's beforeFind uses contain to load C, C ends up contained onto the original query (of A) and the execution fails since A is not associated with C.
Use the table classes to load table associations automatically, by declaring them with hasMany(), hasOne(), belongsTo() and belongsToMany() in the table classes that need them.
Official documentation:
https://book.cakephp.org/3.0/en/orm/associations.html
Table files should be in the Table folder inside the Model folder like src\Model\Table.

CakePHP 2 - Deriving table name of model within Behavior

What's the usual practice for getting a behavior (linked to multiple models) to build filters for SQL queries, and then read the table that belongs to that model?
I have a Behavior function which is meant to do a database query with certain SQL conditions. I currently pass in the $this->request->data.
I have issues building the SQL conditions because I'm not sure how to derive the name of the table (that corresponds to the model). See below for an example: I want to change "BillingCenterDetail", which is the table name (and also the model name), to something generic I can use across different models. I want this table name to be derived automatically based on the model name. I'm not sure if I can use the $model reference for that.
public function saveWithTimeConstraintCheck(Model $model, $data) {
    // FIND ALL RECORDS THAT OVERLAP
    $overlapfilter = array(
        'BillingCenterDetail.billing_center_id =' => $data['BillingCenterDetail']['billing_center_id'],
        'BillingCenterDetail.startdate <=' => $data['BillingCenterDetail']['enddate'],
        'BillingCenterDetail.enddate >=' => $data['BillingCenterDetail']['startdate']
    );
... after building the filter, I can use $model->find to execute the query. This should be OK because it's generic.
$overlapresults = $model->find('all', array('conditions' => $overlapfilter));
I've answered my own question.
Actually, to build the filter conditions I needed the name of the model, not the name of the table, because the table name is the plural form with an "s" at the end.
I used
$Model->name
From:
CakePHP: get current model name in a controller
For table names I found out you can also use
$this->Model->table
cakephp - get table names and its column details

Django Query Optimisation

I am currently working on a telecom analytics project and am new to query optimisation. Showing the result in the browser takes a full minute, even though only 45,000 records are accessed. Could you please suggest ways to reduce the time it takes to show the results?
I wrote the following query to find the call duration for people in an age group:
sigma=0
popn=len(Demo.objects.filter(age_group=age))
card_list=[Demo.objects.filter(age_group=age)[i].card_no
           for i in range(popn)]
for card in card_list:
    dic=Fact_table.objects.filter(card_no=card).aggregate(Sum('duration'))
    sigma+=dic['duration__sum']
avgDur=sigma/popn
The above code is inside a for loop that iterates over the age groups.
The model is as follows:
class Demo(models.Model):
    card_no=models.CharField(max_length=20,primary_key=True)
    gender=models.IntegerField()
    age=models.IntegerField()
    age_group=models.IntegerField()
class Fact_table(models.Model):
    pri_key=models.BigIntegerField(primary_key=True)
    card_no=models.CharField(max_length=20)
    duration=models.IntegerField()
    time_8bit=models.CharField(max_length=8)
    time_of_day=models.IntegerField()
    isBusinessHr=models.IntegerField()
    Day_of_week=models.IntegerField()
    Day=models.IntegerField()
Thanks
Try that:
sigma=0
demo_by_age = Demo.objects.filter(age_group=age)
popn = demo_by_age.count()  # One
card_list = demo_by_age.values_list('card_no', flat=True)  # Two
dic = Fact_table.objects.filter(card_no__in=card_list).aggregate(Sum('duration'))  # Three
sigma = dic['duration__sum']
avgDur = sigma/popn
A statement like card_list=[Demo.objects.filter(age_group=age)[i].card_no for i in range(popn)] will generate popn separate queries and database hits. The query in the for-loop will also hit the database popn times. As a general rule, you should try to minimize the number of queries you use, and you should only select the records you need.
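To make the query-count point concrete, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for the Django ORM. The table names (demo, fact), the sample rows, and both function names are made up for illustration; only the two query patterns are taken from the discussion above.

```python
import sqlite3

# In-memory stand-in for the two tables; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo (card_no TEXT PRIMARY KEY, age_group INTEGER);
    CREATE TABLE fact (card_no TEXT, duration INTEGER);
    INSERT INTO demo VALUES ('a', 1), ('b', 1), ('c', 2);
    INSERT INTO fact VALUES ('a', 10), ('a', 20), ('b', 30), ('c', 99);
""")


def total_per_card(age_group):
    """One extra query per card: the pattern used in the question."""
    cards = conn.execute(
        "SELECT card_no FROM demo WHERE age_group = ?", (age_group,)
    ).fetchall()
    queries = 1  # the SELECT that fetched the cards
    total = 0
    for (card_no,) in cards:
        row = conn.execute(
            "SELECT COALESCE(SUM(duration), 0) FROM fact WHERE card_no = ?",
            (card_no,),
        ).fetchone()
        queries += 1
        total += row[0]
    return total, queries


def total_single_query(age_group):
    """The same total from one aggregate query with an IN subquery."""
    row = conn.execute(
        "SELECT COALESCE(SUM(duration), 0) FROM fact "
        "WHERE card_no IN (SELECT card_no FROM demo WHERE age_group = ?)",
        (age_group,),
    ).fetchone()
    return row[0], 1


looped = total_per_card(1)
single = total_single_query(1)
```

Both functions return the same total, but the looped version issues one query per card on top of the initial lookup, which is what adds up at 45,000 records.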
With a few adjustments to your code this can be done in just one query.
There's generally no need to manually specify a primary_key, and in all but some very specific cases it's even better not to define any. Django automatically adds an indexed, auto-incremental primary key field. If you need the card_no field as a unique field, and you need to find rows based on this field, use this:
class Demo(models.Model):
card_no = models.SlugField(max_length=20, unique=True)
...
SlugField automatically adds a database index to the column, essentially making selections by this field as fast as when it is a primary key. This still allows other ways to access the table, e.g. foreign keys (as I'll explain in my next point), to use the (slightly) faster integer field specified by Django, and will ease the use of the model in Django.
If you need to relate an object to an object in another table, use models.ForeignKey. Django gives you a whole set of new functionality that not only makes it easier to use the models, it also makes a lot of queries faster by using JOIN clauses in the SQL query. So for your example:
class Fact_table(models.Model):
    card = models.ForeignKey(Demo, related_name='facts')
    ...
The related_name argument allows you to access all Fact_table objects related to a Demo instance by using instance.facts in Django. (See https://docs.djangoproject.com/en/dev/ref/models/fields/#module-django.db.models.fields.related)
With these two changes, your query (including the loop over the different age_groups) can be changed into a blazing-fast one-hit query giving you the average duration of calls made by each age_group:
from django.db.models import Avg

age_groups = Demo.objects.values('age_group').annotate(duration_avg=Avg('facts__duration'))
for group in age_groups:
    print "Age group: %s - Average duration: %s" % (group['age_group'], group['duration_avg'])
.values('age_group') selects just the age_group field from the Demo's database table. .annotate(duration_avg=Avg('facts__duration')) takes every unique result from values (thus each unique age_group), and for each unique result will fetch all Fact_table objects related to any Demo object within that age_group, and calculate the average of all the duration fields - all in a single query.
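As a rough illustration of what that single query amounts to, the sketch below runs a comparable JOIN with GROUP BY through sqlite3. The SQL is an assumption about the general shape of the query, not the exact statement Django generates, and the table names, sample rows, and function name are invented for the example.

```python
import sqlite3


def average_duration_by_age_group(conn):
    """Return {age_group: average call duration} using one grouped query."""
    rows = conn.execute(
        "SELECT d.age_group, AVG(f.duration) "
        "FROM demo AS d "
        "LEFT JOIN fact AS f ON f.card_no = d.card_no "
        "GROUP BY d.age_group "
        "ORDER BY d.age_group"
    ).fetchall()
    return dict(rows)


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo (card_no TEXT PRIMARY KEY, age_group INTEGER);
    CREATE TABLE fact (card_no TEXT, duration INTEGER);
    INSERT INTO demo VALUES ('a', 1), ('b', 1), ('c', 2);
    INSERT INTO fact VALUES ('a', 10), ('a', 20), ('b', 30), ('c', 100);
""")

averages = average_duration_by_age_group(conn)
```

Note that this averages over call records, whereas the code in the question divides the total duration by the number of people in the age group. If the per-person figure is what you need, sum the durations and divide by the person count instead.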

How to use indexed properties of NodeModels in cypher queries of Neo4django?

I'm a newbie to Django as well as neo4j. I'm using Django 1.4.5, neo4j 1.9.2 and neo4django 0.1.8
I've created NodeModel for a person node and indexed it on 'owner' and 'name' properties. Here is my models.py:
from neo4django.db import models as models2

class person_conns(models2.NodeModel):
    owner = models2.StringProperty(max_length=30,indexed=True)
    name = models2.StringProperty(max_length=30,indexed=True)
    gender = models2.StringProperty(max_length=1)
    parent = models2.Relationship('self',rel_type='parent_of',related_name='parents')
    child = models2.Relationship('self',rel_type='child_of',related_name='children')

    def __unicode__(self):
        return self.name
Before I connected to the Neo4j server, I set auto indexing to true and gave the indexable keys in the conf/neo4j.properties file as follows:
# Autoindexing
# Enable auto-indexing for nodes, default is false
node_auto_indexing=true
# The node property keys to be auto-indexed, if enabled
node_keys_indexable=owner,name
# Enable auto-indexing for relationships, default is false
relationship_auto_indexing=true
# The relationship property keys to be auto-indexed, if enabled
relationship_keys_indexable=child_of,parent_of
I followed Neo4j: Step by Step to create an automatic index to update above file and manually create node_auto_index on neo4j server.
Below are the indexes created on the neo4j server after running Django's syncdb against the neo4j database and manually creating the auto indexes:
graph-person_conns lucene
{"to_lower_case":"true", "_blueprints:type":"MANUAL","type":"fulltext"}
node_auto_index lucene
{"_blueprints:type":"MANUAL", "type":"exact"}
As suggested in https://github.com/scholrly/neo4django/issues/123 I used connection.cypher(queries) to query the neo4j database
For Example:
listpar = connection.cypher("START no=node(*) RETURN no.owner?, no.name?",raw=True)
Above returns the owner and name of all nodes correctly. But when I try to query on indexed properties instead of 'number' or '*', as in case of:
listpar = connection.cypher("START no=node:node_auto_index(name='s2') RETURN no.owner?, no.name?",raw=True)
Above gives 0 rows.
listpar = connection.cypher("START no=node:graph-person_conns(name='s2') RETURN no.owner?, no.name?",raw=True)
Above gives
Exception Value:
Error [400]: Bad Request. Bad request syntax or unsupported method.
Invalid data sent: `(' expected but `-' found after graph
I tried other strings like name and person_conns instead of graph-person_conns, but each time it gives an error that the particular index does not exist. Am I making a mistake while adding the indexes?
My project mainly depends on filtering the nodes based on properties, so this part is really essential. Any pointers or suggestions would be appreciated. Thank you.
This is my first post on stackoverflow. So in case of any missing information or confusing statements please be patient. Thank you.
UPDATE:
Thank you for the help. For the benefit of others I would like to give example of how to use cypher queries to traverse/find shortest path between two nodes.
from neo4django.db import connection
results = connection.cypher("START source=node:`graph-person_conns`(person_name='s2sp1'),dest=node:`graph-person_conns`(person_name='s2c1') MATCH p=ShortestPath(source-[*]->dest) RETURN extract(i in nodes(p) : i.person_name), extract(j in rels(p) : type(j))")
This is to find shortest path between nodes named s2sp1 and s2c1 on the graph. Cypher queries are really cool and help traverse nodes limiting the hops, types of relations etc.
Can someone comment on the performance of this method? Also please suggest if there are any other efficient methods to access Neo4j from Django. Thank You :)
Hm, why are you using Cypher? neo4django QuerySets work just fine for the above if you set the properties to indexed=True (or not, it'll just be slower for those).
people = person_conns.objects.filter(name='n2')
The neo4django docs have some other querying examples, as do the Django docs. Neo4django executes those queries as Cypher on the backend- you really shouldn't need to drop down to writing the Cypher yourself unless you have a very particular traversal pattern or a performance issue.
Anyway, to more directly tackle your question- the last example you used needs backticks to escape the index name, like
listpar = connection.cypher("START no=node:`graph-person_conns`(name='s2') RETURN no.owner?, no.name?",raw=True)
The first example should work. One thought- did you flip the autoindexing on before or after saving the nodes you're searching for? If after, note that you'll have to manually reindex the nodes either using the Java API or by re-setting properties on the node, since it won't have been autoindexed.
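If you end up building such index lookups in several places, a small helper keeps the backtick quoting in one spot. This is only a sketch: index_start_clause is a hypothetical function, not part of neo4django, and it simply rejects inputs it does not know how to escape rather than guessing at Cypher's escaping rules.

```python
def index_start_clause(identifier, index_name, key, value):
    """Build a START clause for an index lookup, backtick-quoting the index name.

    Hypothetical helper: refuses names or values that would need escaping.
    """
    if "`" in index_name:
        raise ValueError("index name must not contain a backtick")
    if "'" in value or "\\" in value:
        raise ValueError("value must not contain quotes or backslashes")
    return "START %s=node:`%s`(%s='%s')" % (identifier, index_name, key, value)


query = (
    index_start_clause("no", "graph-person_conns", "name", "s2")
    + " RETURN no.owner?, no.name?"
)
```

The resulting string can then be passed to connection.cypher() the same way as the hand-written query above.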
HTH, and welcome to StackOverflow!

Rename field using Objectify and Google App Engine

I am trying out a case where we changed a field name in our entity. We have something like this, for example:
class Person {
    String name; // The original declaration was "String fullName"
}
According to the Objectify documentation, you have to use the annotation @AlsoLoad(""). This is OK and it works: Google Datastore doesn't actually delete the old data but creates a new field, so the annotation acts as a mapping between the old field and the new one. There is no problem when you are reading the whole table.
The problem arises when you apply a filter on your query (suppose you made 5 objects with the old name and 5 with the new name). The result of your query depends on whether you used the old variable name or the new one (it returns only 5, never all 10). It won't fetch both of them and map them. Any suggestions for this problem? I hope I explained it clearly.
Thanks in advance
The simplest, most straightforward solution: fetch all the data with the @AlsoLoad annotation in place, then store it again. This way the entities are saved under the new field name, and the old field no longer exists, or at least no longer contains any data. It is like migrating the data from the old name to the new name. Does anyone have a better suggestion?
If you've changed the name of your field, you need to load and re-put all your data (using the mapreduce API would be one option here). There's no magic way around this - the data you've stored exists with two different names on disk.
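The effect can be modelled with plain dictionaries. The sketch below is not Objectify or Datastore code; the store layout and both function names are invented to show why a filter on the new name misses entities saved under the old one until every entity has been loaded and re-saved.

```python
def query_by_field(store, field, value):
    """Return the keys of entities whose stored property `field` equals `value`."""
    return [key for key, props in sorted(store.items()) if props.get(field) == value]


def migrate_field(store, old_name, new_name):
    """Rewrite every entity so the value lives under `new_name`; return the count."""
    migrated = 0
    for props in store.values():
        if old_name in props:
            value = props.pop(old_name)
            # Keep an existing new-style value if the entity already has one.
            props.setdefault(new_name, value)
            migrated += 1
    return migrated


# Entities 1 and 3 were saved before the rename, entity 2 after it.
store = {
    1: {"fullName": "Ann"},
    2: {"name": "Ann"},
    3: {"fullName": "Bob"},
}

before = query_by_field(store, "name", "Ann")
migrated = migrate_field(store, "fullName", "name")
after = query_by_field(store, "name", "Ann")
```

Before the migration the filter only finds the entity saved under the new name; afterwards it finds both.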
You can use @OldName
http://www.mail-archive.com/google-appengine-java@googlegroups.com/msg05586.html

Resources