Fetch field of another Doc in CouchDB? - database

I'm very new to CouchDB and I have a simple task for which I have not been able to find a straight answer.
I have two related docs: order and invoice
invoice = {
  id: "invoice_id",
  order: "order_id",
  values: [...]
}
order = {
  id: "order_id",
  order_number: 12345
}
I have defined a map function that selects the unfulfilled invoices. Now I need the order_number, which is in the order doc; both docs belong to the same transaction. How do I fetch the order_number from the order when I get my invoices?
I've looked around and I'm getting so many answers like: view collation, linked documents, include_docs=true, structure docs to have both...
I'm just looking for the simplest way with a clear explanation. I appreciate any help.
p.s.
Since I'm new, I'm finding CouchDB development very involved. I have map functions, but do they need to be pushed to the CouchDB instance, or do I edit them in Futon? Are there better ways to develop against CouchDB? I see there's CouchApp, but the docs are sparse and the project hasn't been updated in a while.
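Of the options listed in the question, linked documents combined with include_docs=true is probably the least invasive: when the value emitted by a view is an object containing an `_id`, CouchDB puts the document with that id into the row's `doc` field instead of the document that emitted the row. The following is a minimal Python sketch of a design document holding such a view, not a definitive implementation. The `type` and `fulfilled` fields, the design document name `invoices`, and the view name `unfulfilled_with_order` are assumptions made for illustration; only the `order` field comes from the invoice above, and CouchDB's own id field is `_id` rather than `id`.

```python
import json

# JavaScript map function, stored as a string inside the design document.
# Assumption: invoices carry hypothetical `type` and `fulfilled` fields.
MAP_UNFULFILLED = """
function (doc) {
  if (doc.type === "invoice" && !doc.fulfilled) {
    // Emitting {_id: ...} makes include_docs=true return the linked order.
    emit(doc._id, { _id: doc.order });
  }
}
""".strip()


def build_design_doc():
    """Return the design document as a plain dict, ready to be sent as JSON."""
    return {
        "_id": "_design/invoices",  # hypothetical design document name
        "views": {
            "unfulfilled_with_order": {"map": MAP_UNFULFILLED},
        },
    }


def view_path(db_name):
    """Path of the view query that resolves the linked order documents."""
    return (
        f"/{db_name}/_design/invoices/_view/unfulfilled_with_order"
        "?include_docs=true"
    )


design_doc = build_design_doc()
body = json.dumps(design_doc)  # JSON body for a PUT to /<db>/_design/invoices
```

Under these assumptions, each row of the view response has the invoice id as `id` and the linked order under `doc`, so the order_number can be read from `doc.order_number`. The sketch also touches on the p.s.: a view lives in a design document, which is an ordinary JSON document stored in the database like any other.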

Related

get the top 10 javascript/opensource repositories ranked by star using GitHub GraphQL Api

I would like to get the top 10 JavaScript open source repositories ranked by stars (and some related information) using the GitHub GraphQL API in a Python project. I have this query so far:
query {
  search(type: REPOSITORY, query: "language:javascript", first: 10) {
    userCount
    edges {
      node {
        ... on Repository {
          name
          url
          stargazers {
            totalCount
          }
          owner {
            login
          }
        }
      }
    }
  }
}
The problem is that it does not always return the same result: each query returns 10 seemingly random repositories ordered by star count rather than the absolute top 10.
On top of that, I'd like to get only the ones that are open source.
I use the query
query{
licenses{name}
}
to get a list of licences, but I don't know if this list is exhaustive (it seems to be missing some licences, like MIT). According to the docs, this field will
Return a list of known open source licenses.
How can I get an exhaustive list of the licences and add it to my main query above to make my search more precise?
I can't seem to find clear answers, as the documentation for the GitHub GraphQL API is scarce and quite vague.
Thanks
I got a partial explanation from GitHub Support of why the results are inconsistent: there is a timeout when queries run for too long.
Some queries are computationally expensive for our search infrastructure to execute. To keep search fast for everyone, we limit how long any individual query can run. In rare situations when a query exceeds the time limit, search returns all matches that were found prior to the timeout and informs you that a timeout occurred.
Reaching a timeout does not necessarily mean that search results are incomplete. It just means that the query was discontinued before it searched through all possible data.
Our team wrote about this here:
https://help.github.com/articles/troubleshooting-search-queries/#potential-timeouts
Given this reality, these timeouts may cause inconsistencies while paging through the results. We see how this could be improved in future iterations of search, so we've let our team know so they're aware though we can't make any promises on specific changes.
Edit: GitHub Support suggested adding a star threshold to the search string, i.e. query: "language:javascript stars:>1600". This consistently returns the top 10 repos ordered by stars. (1600 is roughly the minimum star count of the top 3000 repos; the threshold needs to be high enough to narrow the search.)
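Since the question mentions a Python project, here is a minimal sketch of how the request body for the narrowed search could be assembled. It only builds the JSON payload; sending it (a POST to GitHub's GraphQL endpoint with an access token) is left out. The function name `build_payload` and the variable names `$searchQuery` and `$count` are made up for illustration, and the selected fields are the ones from the question's query.

```python
import json

# GraphQL document with variables, so the search string is not hard-coded.
SEARCH_QUERY = """
query ($searchQuery: String!, $count: Int!) {
  search(type: REPOSITORY, query: $searchQuery, first: $count) {
    edges {
      node {
        ... on Repository {
          name
          url
          stargazers { totalCount }
          owner { login }
        }
      }
    }
  }
}
""".strip()


def build_payload(language="javascript", min_stars=1600, count=10):
    """Build the JSON-serialisable body for the narrowed repository search."""
    # The stars:> qualifier is the threshold suggested by GitHub Support.
    search_query = f"language:{language} stars:>{min_stars}"
    return {
        "query": SEARCH_QUERY,
        "variables": {"searchQuery": search_query, "count": count},
    }


payload = build_payload()
request_body = json.dumps(payload)  # string to send as the POST body
```

Keeping the threshold as a parameter makes it easy to raise or lower it if the search starts timing out again.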

Query by key in Datastore with Dart

I have a List<Key> for which I would like to retrieve the full data records, but with additional filtering applied.
I can retrieve them via dbService.lookup(Project, keys) but lookup doesn't allow me to apply additional filtering.
This is essentially what I want to do:
dbService.query(Project)
..filter('__key__ IN', keys)
..filter('acl_read IN', roles)
..run();
but since __key__ is not supported in Google Cloud's Dart implementation, I cannot run this query.
I could do:
projects = dbService.lookup(keys);
projects.removeWhere((project) => (project.acl_read.fold(false, (result, key) => result || members.contains(key))));
but this does not seem like the right way of achieving this.
So what's the right way of doing this?
There isn't a server-side method to do what you're looking to do, so post-filtering on the client side, as in your example, is how you'd do it.
Alternatively, if you know that querying with your filter alone returns a smaller set of keys than what you have in your List, run that query first and then take the intersection of its results and the List.
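Both variants boil down to a set intersection on the client. The sketch below illustrates them in Python with plain dictionaries standing in for Datastore entities; it does not use the Dart gcloud API, and the function names and sample data are made up for illustration.

```python
def filter_readable(projects, roles):
    """Keep only projects whose acl_read list shares at least one entry with roles."""
    allowed = set(roles)
    return [p for p in projects if allowed.intersection(p["acl_read"])]


def intersect_with_keys(query_results, keys):
    """Alternative: run the filtered query first, then keep only the wanted keys."""
    wanted = set(keys)
    return [p for p in query_results if p["key"] in wanted]


# Hypothetical entities, as they might come back from a lookup by key.
projects = [
    {"key": "p1", "acl_read": ["admin"]},
    {"key": "p2", "acl_read": ["guest", "editor"]},
    {"key": "p3", "acl_read": []},
]

# Variant 1: look up by key, then post-filter on the ACL.
readable = filter_readable(projects, roles=["editor", "admin"])

# Variant 2: start from ACL-filtered results, then intersect with the key list.
narrowed = intersect_with_keys(readable, keys=["p2", "p3"])
```

Which variant is cheaper depends on which set is smaller: the key list or the result of the ACL filter.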

Firebase + AngularFire -> States?

I'd like to know how I would deal with object states in a FireBase environment.
What do I mean by states? Well, let's say you have an app with which you organize order lists. Each list consists of a bunch of orders, so it can be considered a hierarchical data structure. Furthermore each list has a state which might be one of the following:
deferred
open
closed
sent
acknowledged
ware completely received
ware partially received
something else
On the visual (HTML) side the lists shall be distinguished by their state. Each state shall be presented to the client in its own, say, div-element, listing all the related orders beneath.
So the question is, how do I deal with this state in FireBase (or any other document based database)?
structure
Do I...
... (option 1) use a state-field for each orderlist and filter on the clientside by using if or something similar:
orderlist1.state = open
  order1
  order2
orderlist2.state = open
  order1
orderlist3.state = closed
orderlist4.state = deferred
... (option 2) use the hierarchy of FireBase to classify the orderlists like so:
open
  orderlist1
    order1
    order2
  orderlist2
    order1
closed
  orderlist3
deferred
  orderlist4
... (option 3) take a totally different approach?
So, what's the royal road here?
retrieval, processing & visual output of option 2
Since for option 1 the answer to this question is apparently pretty straightforward (if state == ...), I'll continue with option 2: how do I retrieve the data in option 2? Do I use a Firebase object for each state, like so:
var closedRef = new Firebase("https://xxx.firebaseio.com/closed");
var openRef = new Firebase("https://xxx.firebaseio.com/open");
var deferredRef = new Firebase("https://xxx.firebaseio.com/deferred");
var somethingRef = new Firebase("https://xxx.firebaseio.com/something");
Or what's considered the best approach to deal with that sort of data/structure?
There is no universal answer to this question. The "best approach" is going to depend on the particulars of your use case, which you haven't provided here. Specifically, how you will be reading and manipulating the data.
Data architecture in NoSQL is all about working hard on writes to make reads easy. It's all about how you plan to use the data. (It's also enough material for a chapter in a book.)
The advantage to "option 1" is that you can easily iterate the entire list. Great if your list is measured in hundreds. This is a great approach if you want to fetch the list and manipulate it on the fly on the client side.
The advantage to "option 2" is that you can easily grab a subset of the list. Great if your list is measured in thousands and you will typically be fetching open issues only rather than closed ones. This is great for archiving/new/old lists like yours.
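To make the trade-off concrete, the sketch below models the two layouts from the question as plain Python dictionaries and converts option 1 into option 2. It does not touch Firebase itself; the `orders` field name and the helper `group_by_state` are assumptions for illustration.

```python
from collections import defaultdict


def group_by_state(orderlists):
    """Turn the flat option 1 layout into the state-keyed option 2 layout."""
    grouped = defaultdict(dict)
    for list_id, orderlist in orderlists.items():
        grouped[orderlist["state"]][list_id] = orderlist["orders"]
    return dict(grouped)


# Option 1: every order list carries its own state field.
option1 = {
    "orderlist1": {"state": "open", "orders": ["order1", "order2"]},
    "orderlist2": {"state": "open", "orders": ["order1"]},
    "orderlist3": {"state": "closed", "orders": []},
    "orderlist4": {"state": "deferred", "orders": []},
}

# Option 2: the state is the first level of the hierarchy.
option2 = group_by_state(option1)

# With option 2, reading one state is a single key access.
open_lists = option2["open"]
```

With option 1 the grouping happens on every read; with option 2 it is paid once at write time, at the cost of having to move a list to another branch whenever its state changes.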
There are other options as well.
Sorted Data using Priorities
Perhaps the most universal approach is to use ordered data. This allows you to query a subset of your records using something like:
new Firebase(URL).startAt('open').endAt('open').limit(10);
This is sufficient in most cases where you have only one criterion, or when you can create a unique identifier from multiple criteria (e.g. 'open:marketing') without difficulty. Examples are scoreboards, state lists like yours, and data ordered by timestamps.
Using an index
You can also create custom subsets of your data by creating an index of keys and using that to fetch the others.
This is most useful when there is no identifiable characteristic of your subsets. For example, if I pick them from a list and store my favorites.
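The index approach can be sketched with plain Python dictionaries: one structure holds only the keys of the chosen records, and those keys are then used to fetch the full records. This is an illustration of the idea, not Firebase client code; the `{key: True}` shape of the index and the helper name `fetch_by_index` are assumptions.

```python
def fetch_by_index(index, records):
    """Return the records whose keys are listed in the index, in index order."""
    return {key: records[key] for key in index if key in records}


# Full data set, keyed by record id.
orderlists = {
    "orderlist1": {"state": "open"},
    "orderlist2": {"state": "open"},
    "orderlist3": {"state": "closed"},
    "orderlist4": {"state": "deferred"},
}

# Hypothetical index: only the keys of the records a user marked as favorites.
favorites_index = {"orderlist2": True, "orderlist4": True}

favorites = fetch_by_index(favorites_index, orderlists)
```

The index stays small because it stores keys only, and a key whose record has been deleted in the meantime is simply skipped.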
I think this plnkr of mine can help you with this.
There, click on edit/add and look at the dependent Country (order in your case) and State (state in your case) dropdowns; they may be what you want. The one thing you may still need to add is filtering.
They both are different tables in db.
You can also get it from git.

How to use indexed properties of NodeModels in cypher queries of Neo4django?

I'm a newbie to Django as well as neo4j. I'm using Django 1.4.5, neo4j 1.9.2 and neo4django 0.1.8
I've created NodeModel for a person node and indexed it on 'owner' and 'name' properties. Here is my models.py:
from neo4django.db import models as models2

class person_conns(models2.NodeModel):
    owner = models2.StringProperty(max_length=30,indexed=True)
    name = models2.StringProperty(max_length=30,indexed=True)
    gender = models2.StringProperty(max_length=1)
    parent = models2.Relationship('self',rel_type='parent_of',related_name='parents')
    child = models2.Relationship('self',rel_type='child_of',related_name='children')

    def __unicode__(self):
        return self.name
Before I connected to the Neo4j server, I enabled auto indexing and listed the indexable keys in the conf/neo4j.properties file as follows:
# Autoindexing
# Enable auto-indexing for nodes, default is false
node_auto_indexing=true
# The node property keys to be auto-indexed, if enabled
node_keys_indexable=owner,name
# Enable auto-indexing for relationships, default is false
relationship_auto_indexing=true
# The relationship property keys to be auto-indexed, if enabled
relationship_keys_indexable=child_of,parent_of
I followed Neo4j: Step by Step to create an automatic index to update above file and manually create node_auto_index on neo4j server.
Below are the indexes on the Neo4j server after running Django's syncdb against the Neo4j database and manually creating the auto indexes:
graph-person_conns lucene
{"to_lower_case":"true", "_blueprints:type":"MANUAL","type":"fulltext"}
node_auto_index lucene
{"_blueprints:type":"MANUAL", "type":"exact"}
As suggested in https://github.com/scholrly/neo4django/issues/123 I used connection.cypher(queries) to query the neo4j database
For Example:
listpar = connection.cypher("START no=node(*) RETURN no.owner?, no.name?",raw=True)
Above returns the owner and name of all nodes correctly. But when I try to query on indexed properties instead of 'number' or '*', as in case of:
listpar = connection.cypher("START no=node:node_auto_index(name='s2') RETURN no.owner?, no.name?",raw=True)
Above gives 0 rows.
listpar = connection.cypher("START no=node:graph-person_conns(name='s2') RETURN no.owner?, no.name?",raw=True)
Above gives
Exception Value:
Error [400]: Bad Request. Bad request syntax or unsupported method.
Invalid data sent: (' expected but-' found after graph
I tried other strings like name and person_conns instead of graph-person_conns, but each time I get an error saying that the particular index does not exist. Am I making a mistake while adding the indexes?
My project mainly depends on filtering the nodes based on properties, so this part is really essential. Any pointers or suggestions would be appreciated. Thank you.
This is my first post on stackoverflow. So in case of any missing information or confusing statements please be patient. Thank you.
UPDATE:
Thank you for the help. For the benefit of others I would like to give example of how to use cypher queries to traverse/find shortest path between two nodes.
from neo4django.db import connection
results = connection.cypher("START source=node:`graph-person_conns`(person_name='s2sp1'),dest=node:`graph-person_conns`(person_name='s2c1') MATCH p=ShortestPath(source-[*]->dest) RETURN extract(i in nodes(p) : i.person_name), extract(j in rels(p) : type(j))")
This is to find shortest path between nodes named s2sp1 and s2c1 on the graph. Cypher queries are really cool and help traverse nodes limiting the hops, types of relations etc.
Can someone comment on the performance of this method? Also please suggest if there are any other efficient methods to access Neo4j from Django. Thank You :)
Hm, why are you using Cypher? neo4django QuerySets work just fine for the above if you set the properties to indexed=True (or not, it'll just be slower for those).
people = person_conns.objects.filter(name='n2')
The neo4django docs have some other querying examples, as do the Django docs. Neo4django executes those queries as Cypher on the backend- you really shouldn't need to drop down to writing the Cypher yourself unless you have a very particular traversal pattern or a performance issue.
Anyway, to more directly tackle your question- the last example you used needs backticks to escape the index name, like
listpar = connection.cypher("START no=node:`graph-person_conns`(name='s2') RETURN no.owner?, no.name?",raw=True)
The first example should work. One thought- did you flip the autoindexing on before or after saving the nodes you're searching for? If after, note that you'll have to manually reindex the nodes either using the Java API or by re-setting properties on the node, since it won't have been autoindexed.
HTH, and welcome to StackOverflow!
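The backtick escaping from the answer can be wrapped in a small helper so that index names containing characters such as '-' are always quoted. The following is a minimal sketch in plain Python; the function names `quote_index_name` and `start_by_index` are hypothetical and not part of neo4django, and the sketch only builds the query string for the START syntax used in this thread.

```python
def quote_index_name(index_name):
    """Wrap an index name in backticks so characters like '-' are accepted."""
    if "`" in index_name:
        raise ValueError("index name must not contain a backtick")
    return f"`{index_name}`"


def start_by_index(index_name, key, value, identifier="no"):
    """Build a START clause for an exact index lookup on a single key."""
    if "'" in value:
        # Kept deliberately strict: quoting rules for values are not covered here.
        raise ValueError("value must not contain a single quote")
    return f"START {identifier}=node:{quote_index_name(index_name)}({key}='{value}')"


# Reproduces the corrected query from the answer above.
query = start_by_index("graph-person_conns", "name", "s2") + " RETURN no.owner?, no.name?"
```

The resulting string can be passed to connection.cypher(query, raw=True) exactly as in the examples above.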

Soccer and CouchDb (noob pining for sql and joins)

This has kept me awake until these wee hours.
I want a db to keep track of a soccer tournament.
Each match has two teams, home and away.
Each team can be the home or the away of many matches.
I've got one db, and two document types, "match" (contains: home: teamId and away: teamId) and team (contains: teamId, teamName, etc).
I've managed to write a working view, but it would imply adding to each team the id of every match it is involved in, which doesn't make much logical sense. It's such a hack.
Any idea on how this view should be written? I am nearly tempted to just throw the sponge in and use postgres instead.
EDIT: what I want is to have the team info for both the home and away teams, given the id of a match. Pretty easy to do with two calls, but I don't want to make two calls.
Just emit two values in map for each match like this:
function (doc) {
  if (!doc.home || !doc.away) return;
  emit([doc._id, "home"], { _id: doc.home });
  emit([doc._id, "away"], { _id: doc.away });
}
After querying the view for the match id MATCHID with:
curl 'http://localhost:5984/yourdb/_design/yourpp/_view/yourview?startkey=\["MATCHID"\]&endkey=\["MATCHID",\{\}\]&include_docs=true'
you should get both teams' documents in the doc field of each result row, like below:
{"total_rows":2,"offset":0,"rows":[
{"id":"MATCHID","key":["MATCHID","home"],"value":{"_id":"first_team_id"},"doc":{...full doc of the first team...}},
{"id":"MATCHID","key":["MATCHID","away"],"value":{"_id":"second_team_id"},"doc":{...full doc of the second team...}}
]}
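The same query can be built programmatically, which avoids the shell escaping needed in the curl call because the keys are JSON-encoded and then URL-encoded. A minimal Python sketch follows; the function names are made up for illustration, the database, design document, and view names are the placeholders from the curl example, and no HTTP request is made.

```python
import json
from urllib.parse import urlencode


def match_teams_url(base_url, db, design, view, match_id):
    """Build the view URL that returns both team documents of one match."""
    params = {
        # Keys are JSON arrays; {} sorts after any string, so the range
        # [match_id] .. [match_id, {}] covers the "home" and "away" rows.
        "startkey": json.dumps([match_id], separators=(",", ":")),
        "endkey": json.dumps([match_id, {}], separators=(",", ":")),
        "include_docs": "true",
    }
    return f"{base_url}/{db}/_design/{design}/_view/{view}?{urlencode(params)}"


def teams_from_rows(response):
    """Map "home" and "away" to the team documents of a decoded view response."""
    return {row["key"][1]: row["doc"] for row in response["rows"]}


url = match_teams_url("http://localhost:5984", "yourdb", "yourpp", "yourview", "MATCHID")

# Hypothetical decoded response, shaped like the example output above.
sample = {
    "rows": [
        {"id": "MATCHID", "key": ["MATCHID", "home"], "doc": {"_id": "first_team_id"}},
        {"id": "MATCHID", "key": ["MATCHID", "away"], "doc": {"_id": "second_team_id"}},
    ]
}
teams = teams_from_rows(sample)
```

Using the second element of the key as the dictionary key gives direct access to teams["home"] and teams["away"] after a single request.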
Check the CouchDB Book: http://guide.couchdb.org/editions/1/en/why.html
It's free and includes answers for a beginner :)
If you like it, consider buying it. Even though Chris and Jan are so awesome that they put their book out there for free, you should still support the great work they did on it.

Resources