Count states and cities with one gremlin query - graph-databases

I use OrientDB with TinkerPop 3 support, and the data looks like this:
One country has multiple states, and these states have multiple cities (one exception for my example is that not every state has cities).
I would like to count the states and the cities for one specific country in one Gremlin query.
ArrayList list = new ArrayList();
g.V().has("key", GERMANY_KEY)
.repeat(__.in())
.until(__.hasLabel("state"))
.as("states")
.repeat(__.in())
.until(__.hasLabel("city"))
.as("cities")
.select("states", "cities")
.fill(list);
This is what I have so far, but I don't know how I can count them together.
A possible answer from this could be
Germany has 16 states and 1000 cities.
Is this possible or do I need to make two queries?
Thanks a lot.

I found a solution:
g.V().has("key", GERMANY_KEY)
.repeat(__.in()).emit()
.groupCount().by(__.label())
.forEachRemaining(e -> logger.info("Data: {}", e));
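To make the idea behind the traversal above easier to follow, here is a minimal plain-Python sketch of the same logic: follow incoming edges repeatedly from the country, emit every vertex reached, and count the results by label. It does not use Gremlin or OrientDB; the toy graph, the vertex names and the helper count_labels_upstream are assumptions made up for illustration, and the graph is assumed to be a tree.

```python
from collections import Counter, deque

# Hypothetical toy data: every vertex has a label.
labels = {
    "germany": "country",
    "bavaria": "state",
    "berlin": "state",
    "munich": "city",
    "nuremberg": "city",
}

# incoming[v] lists the vertices that have an edge pointing at v
# (cities point at their state, states point at their country).
incoming = {
    "germany": ["bavaria", "berlin"],
    "bavaria": ["munich", "nuremberg"],
}


def count_labels_upstream(start):
    """Follow incoming edges repeatedly from start and count each vertex reached, by label."""
    counts = Counter()
    queue = deque(incoming.get(start, []))
    while queue:
        vertex = queue.popleft()
        counts[labels[vertex]] += 1  # "emit" the vertex and group it by label
        queue.extend(incoming.get(vertex, []))
    return counts


counts = count_labels_upstream("germany")
print("Germany has %d states and %d cities" % (counts["state"], counts["city"]))
```

The start vertex itself is not counted, and a state without cities (berlin in this toy data) simply contributes nothing to the city count.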

Related

How to get the count of property in a Kind?

I have a Kind Student which stores the favorite color of each student. Students are allowed to pick their favorite color from a set of three colors: {Red, Blue, Green}
Let us assume there are 100 students. My code looks like this for every student:
Entity arya = new Entity("Student","Arya");
arya.setProperty("Color","Red");
Entity robb = new Entity("Student","Robb");
robb.setProperty("Color","Green");
..
..
Entity jon = new Entity("Student","Jon");
jon.setProperty("Color","Blue");
How can I find out how many students picked a particular color (say Red) in this Student Kind? What query should I write to fetch the count?
Thanks in advance
The number you seek would be the number of items in the result of a query with an equality filter on the Color property.
You could use a keys-only query (a special kind of projection query) for this purpose, which is faster and less expensive:
Keys-only queries
A keys-only query (which is a type of projection query) returns just
the keys of the result entities instead of the entities themselves, at
lower latency and cost than retrieving entire entities.
...
A keys-only query is a small operation and counts as only a single
entity read for the query itself.
Something along these lines (but note that I'm not a Java user; the snippet is based only on the documentation examples):
Query<Key> query = Query.newKeyQueryBuilder()
    .setKind("Student")
    .setFilter(PropertyFilter.eq("Color", "Red"))
    .build();
I agree with Dan Cornilescu's answer. Here is a direct Datastore API usage: I have prepared the request body for your use case, and you can run it by just adding your project ID. It returns the entities that match the filter, and you can then count them.

Google Datastore - Search Optimization Technique

I am dealing with a real-estate app. A Home will have typical properties like Price, Bedrooms, Bathrooms, SqFt, Lot size, etc. Users will search for Homes, and such a query will require multiple inequality filters like: Price between x and y, rooms greater than z, bathrooms more than p, etc.
I know that multiple inequality filters are not allowed. I also do not want to perform any filtering in my code, because I want to be able to use Cursors.
So I have come up with two solutions. I am not sure whether these are right, so I wonder if the gurus can shed some light.
Solution 1: I will discretize the values of each attribute and save them in a list field, then use IN. For example: if there are 3 bedrooms, instead of storing beds=3, I will store beds = [1,2,3]. Now if a user searches for homes with, say, at least two bedrooms, then instead of writing the filter as beds>=2, I will write the filter as "beds IN [2]", and my home above [1,2,3] will qualify. So will any home with 2 beds [1,2] or 4 beds [1,2,3,4], and so on.
Solution 2: It is similar to the first one, but instead of creating a list property, I will actually add attributes (columns) to the home. So a home with 3 bedrooms will have the following attributes/columns/properties: col-bed-1:true, col-bed-2:true, col-bed-3:true. Now if a user searches for homes with, say, at least two bedrooms, then instead of writing the filter as beds>=2, I will write the filter as "col-bed-2 = true", and my home will qualify. So will any home with 2 beds, 3 beds, 4 beds, and so on.
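The discretization trick behind both solutions can be sketched in plain Python without any Datastore calls. The home names and the helpers bed_thresholds and at_least are hypothetical; the point is only that storing [1, ..., n] turns "at least k bedrooms" into a membership (equality) test.

```python
def bed_thresholds(beds):
    """Return [1, 2, ..., beds], the list stored instead of the single number."""
    return list(range(1, beds + 1))


# Hypothetical homes; only the discretized "beds" list matters here.
homes = {
    "home_a": bed_thresholds(1),
    "home_b": bed_thresholds(2),
    "home_c": bed_thresholds(3),
    "home_d": bed_thresholds(4),
}


def at_least(homes, min_beds):
    """Express 'beds >= min_beds' as a membership test on the stored list."""
    return sorted(name for name, beds in homes.items() if min_beds in beds)


matching = at_least(homes, 2)
print(matching)
```

Here every home with two or more bedrooms matches, while home_a (one bedroom) does not, without any inequality comparison being evaluated.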
I know both solutions will work, but I want to know:
1. Which one is better, both from a performance and a Google pricing perspective?
2. Is there a better solution to do this?
I handle almost exactly your use case with a Python GAE app that lists posts with housing advertisements (similar to craigslist). Searching with a filter works and is straightforward.
You should choose a language (Python, Java or Go), then use the Google Search API (which has built-in filtering for equalities and inequalities) and build datastore indexes that you can query using the Search API.
For instance, you can use a Python class like the following to populate the datastore and then use the Search API.
from google.appengine.ext import db

class Home(db.Model):
    address = db.StringProperty(verbose_name='address')
    number_of_rooms = db.IntegerProperty()
    size = db.FloatProperty()
    added = db.DateTimeProperty(verbose_name='added', auto_now_add=True)  # set once on creation, read-only
    last_modified = db.DateTimeProperty(required=True, auto_now=True)
    timestamp = db.DateTimeProperty(auto_now=True)  # refreshed on every put()
    image_url = db.URLProperty()
I definitely think that you should avoid storing permutations, for several reasons: permutations can explode in size and make the code difficult to read. Instead, you should do as I did and find examples where someone else has already solved the same or a similar problem.
This appengine demo might help you.

Firebase + AngularFire -> States?

I'd like to know how I would deal with object states in a FireBase environment.
What do I mean by states? Well, let's say you have an app with which you organize order lists. Each list consists of a bunch of orders, so it can be considered a hierarchical data structure. Furthermore each list has a state which might be one of the following:
deferred
open
closed
sent
acknowledged
ware completely received
ware partially received
something else
On the visual (HTML) side the lists shall be distinguished by their state. Each state shall be presented to the client in its own, say, div-element, listing all the related orders beneath.
So the question is, how do I deal with this state in FireBase (or any other document based database)?
structure
Do I...
... (option 1) use a state-field for each orderlist and filter on the clientside by using if or something similar:
orderlist1.state = open
    order1
    order2
orderlist2.state = open
    order1
orderlist3.state = closed
orderlist4.state = deferred
... (option 2) use the hierarchy of FireBase to classify the orderlists like so:
open
    orderlist1
        order1
        order2
    orderlist2
        order1
closed
    orderlist3
deferred
    orderlist4
... (option 3) take a totally different approach?
So, what's the royal road here?
retrieval, processing & visual output of option 2
Since the answer to this question is apparently pretty straightforward for option 1 (if state == ...), I will continue with option 2: how do I retrieve the data in option 2? Do I use a Firebase object for each state, like so:
var closedRef = new Firebase("https://xxx.firebaseio.com/closed");
var openRef = new Firebase("https://xxx.firebaseio.com/open");
var deferredRef = new Firebase("https://xxx.firebaseio.com/deferred");
var somethingRef = new Firebase("https://xxx.firebaseio.com/something");
Or what's considered the best approach to deal with that sort of data/structure?
There is no universal answer to this question. The "best approach" is going to depend on the particulars of your use case, which you haven't provided here. Specifically, how you will be reading and manipulating the data.
Data architecture in NoSQL is all about working hard on writes to make reads easy. It's all about how you plan to use the data. (It's also enough material for a chapter in a book.)
The advantage to "option 1" is that you can easily iterate the entire list. Great if your list is measured in hundreds. This is a great approach if you want to fetch the list and manipulate it on the fly on the client side.
The advantage to "option 2" is that you can easily grab a subset of the list. Great if your list is measured in thousands and you will typically be fetching open issues only rather than closed ones. This is great for archiving/new/old lists like yours.
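The difference between the two layouts can be sketched with plain Python dictionaries standing in for the stored JSON tree. No Firebase API is used; the data mirrors the question's example, and group_by_state is a hypothetical helper showing what the client has to do under option 1 to get what option 2 stores directly.

```python
from collections import defaultdict

# Option 1: every order list carries its own state field.
order_lists = {
    "orderlist1": {"state": "open", "orders": ["order1", "order2"]},
    "orderlist2": {"state": "open", "orders": ["order1"]},
    "orderlist3": {"state": "closed", "orders": []},
    "orderlist4": {"state": "deferred", "orders": []},
}


def group_by_state(lists):
    """Client-side grouping: build the option 2 shape (state -> order lists) from option 1."""
    grouped = defaultdict(dict)
    for name, data in lists.items():
        grouped[data["state"]][name] = data["orders"]
    return dict(grouped)


# Option 2: the same data keyed by state first, so one state is a single subtree.
by_state = group_by_state(order_lists)
print(by_state["open"])
```

With option 1 the whole collection has to be fetched before grouping; with option 2 a reader interested only in open lists can read that one branch.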
There are other options as well.
Sorted Data using Priorities
Perhaps the most universal approach is to use ordered data. This allows you to query a subset of your records using something like:
new Firebase(URL).startAt('open').endAt('open').limit(10);
This is sufficient in most cases where you have only one criterion, or when you can create a unique identifier from multiple criteria (e.g. 'open:marketing') without difficulty. Examples are scoreboards, state lists like yours, and data ordered by timestamps.
Using an index
You can also create custom subsets of your data by creating an index of keys and using that to fetch the others.
This is most useful when there is no identifiable characteristic of your subsets. For example, if I pick them from a list and store my favorites.
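As a rough illustration of the index idea, the sketch below keeps a separate set of keys and uses it to look up the full records. It is plain Python with dictionaries standing in for the stored data; the record contents, favorites_index and fetch_indexed are all hypothetical names, not part of any Firebase API.

```python
# Hypothetical master collection of records, keyed by ID.
records = {
    "orderlist1": {"state": "open"},
    "orderlist2": {"state": "open"},
    "orderlist3": {"state": "closed"},
    "orderlist4": {"state": "deferred"},
}

# The index stores only keys; the value True is just a placeholder.
favorites_index = {"orderlist1": True, "orderlist4": True}


def fetch_indexed(index, records):
    """Use the keys in the index to fetch the corresponding full records."""
    return {key: records[key] for key in index if key in records}


favorites = fetch_indexed(favorites_index, records)
print(sorted(favorites))
```

The subset here has no shared property to filter on, which is exactly the case where an explicit index of keys helps.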
I think this plnkr of mine can help you with this.
There, click on edit/add and look at the Country (order in your case) and State (state in your case) dependent dropdown; it may be what you want. The one thing you may need to add is filtering.
Both are separate tables in the DB.
You can also get it from git.

Google App Engine query optimization

I have a Google App Engine datastore that could have several million records in it, and I'm trying to figure out the best way to do a query where I need to get records back that match several Strings.
For example, say I have the following Model:
String name
String level
Int score
I need to return all the records for a given "level" that also match a list of "names". There might be only 1 or 2 names in the name list, but there could be 100.
It's basically a list of high scores ("score") for players ("name") for a given level ("level"). I want to find all the scores for a given "level" for a list of players by "name" to build a high score list that include just your friends.
I could just loop over the list of "names" and do a query for each of their high scores for that level, but I don't know if this is the best way. In SQL I could construct a single (complex) query to do this.
Given the size of the datastore, I want to make sure I'm not wasting time running Python code that should be done by the query, or vice versa.
The "level" needs to be a String, not an Int, since they are not numbered levels but rather level names, but I don't know if that matters.
You can use the IN filter operator to match a property against a list of values (user names):
scores = Scores.all().filter('level ==', level).filter('user IN', user_list)
Note that under the hood this performs as many queries as there are users in user_list.
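The fan-out described above can be illustrated with a small plain-Python simulation. This is not the App Engine implementation; the in-memory scores list and the helpers query_equal and query_in are assumptions that only model "one equality query per value, results merged".

```python
# Hypothetical in-memory stand-in for the stored score records.
scores = [
    {"name": "ann", "level": "forest", "score": 120},
    {"name": "bob", "level": "forest", "score": 95},
    {"name": "cara", "level": "forest", "score": 140},
    {"name": "ann", "level": "cave", "score": 80},
]


def query_equal(level, name):
    """Stands in for a single query with two equality filters."""
    return [s for s in scores if s["level"] == level and s["name"] == name]


def query_in(level, names):
    """Model an IN filter as one equality query per name, with merged results."""
    results = []
    queries_run = 0
    for name in names:
        results.extend(query_equal(level, name))
        queries_run += 1
    return results, queries_run


results, queries_run = query_in("forest", ["ann", "cara", "dave"])
print(queries_run, [r["name"] for r in results])
```

A list of 100 friends would therefore cost roughly 100 underlying queries in this model, which is worth keeping in mind when sizing the friends list.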
players = Player.all().filter('level =', level).order('score')
names = [name1, name2, name3, ...]
players = [p for p in players if p.name in names]
for player in players:
    print player.name, player.score
Is this what you want?
...or am I simplifying too much?
No, you cannot do that in one pass.
You will have to either query the friends for the level one by one
or
make a friends-score entity for each level. Each time a score changes, check which friends lists the player belongs to and update all of them. Then it's just a matter of retrieving that list.
The first one will be slow and the second costly unless optimized.

Predicate to zip two collections and use as binding source

I've got two collections, an ObservableCollection<Lap> and an ObservableCollection<Racer>, where Lap holds the lap data of a car race and Racer, you guessed it, the racer's data. Both objects know the racerId.
Is there a way I can come up with a predicate to use as a Zip func to zip those two collections together? The reason I want to do that is to bind them to a DataGrid.
I had seen this, but can't quite see how to use it with a predicate.
I came up with that:
laps.Zip(participants, (lap, racer) => lap.EnrollmentId == racer.EnrollmentId);
But how would I map that to the DataGridColumns?
I think you are looking for a Join instead, since you do want to combine the properties of both based on a matching Id. For Zip() to work both collections must have the same number of entries in the same matching order already.
var results = from racer in participants
              join l in laps
              on racer.EnrollmentId equals l.EnrollmentId
              select new
              {
                  // select the properties you are interested in here
                  // or just use both:
                  Racer = racer,
                  Lap = l
              };
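The difference between zipping and joining can be shown with a short language-neutral sketch, here in Python rather than C#. The sample racers and laps and the field names enrollment_id, name and time are made up for illustration; only the pairing behavior matters.

```python
# Hypothetical sample data: two racers, three laps in arbitrary order.
racers = [
    {"enrollment_id": 1, "name": "Ada"},
    {"enrollment_id": 2, "name": "Bo"},
]
laps = [
    {"enrollment_id": 2, "time": 61.2},
    {"enrollment_id": 1, "time": 59.8},
    {"enrollment_id": 2, "time": 60.5},
]

# zip pairs elements purely by position and stops at the shorter collection.
zipped = list(zip(laps, racers))

# A join pairs elements by matching key, regardless of order or count.
racers_by_id = {racer["enrollment_id"]: racer for racer in racers}
joined = [
    {"racer": racers_by_id[lap["enrollment_id"]]["name"], "time": lap["time"]}
    for lap in laps
    if lap["enrollment_id"] in racers_by_id
]

print(len(zipped), joined)
```

In this data the zip pairs the first lap with the wrong racer and silently drops the third lap, while the join produces one correctly matched row per lap, which is the shape a grid binding needs.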
