couchdb map / reduce multiple keys filtering by date - arrays

I have a view setup with a map reduce. Right now this code works great:
function(doc) {
  if (doc.type == 'test') {
    if (doc.trash != 1) {
      // Emit one row per item, keyed by item id and name
      for (var id in doc.items) {
        emit([id, doc.items[id].name], 1);
      }
    }
  }
}

function(keys, values) {
  // Sum the emitted 1s to get a count per key
  return sum(values);
}
I get a return and when using the group parameter, it condenses everything just fine.
My question: I want to add a third key, the date, so that I can reduce only records from certain dates. For example:
function(doc) {
  if (doc.type == 'test') {
    if (doc.trash != 1) {
      for (var id in doc.items) {
        emit([date, id, doc.items[id].name], 1);
      }
    }
  }
}
My issue is that since the date is at the beginning of the array, the reduce groups by date first, then id, and so on. I know I can use group_level to take just the first key or the first two keys from the array, but that doesn't help either because, as far as I know, group_level works from left to right in the array. I could put the date at the end of the emit array, but that doesn't help because I need values at the beginning of my startkey and endkey to search on.
Here is an example of the output of data:
{"key":["2012-03-13","356752b8a5f6871f3","Apple"],"value":1},
{"key":["2012-03-20","123752b8a76986857","Pear"],"value":1},
{"key":["2012-04-12","3013531de05871194","Grapefruit"],"value":1},
{"key":["2012-04-12","356752b8a5f6871f3","Apple"],"value":1},
I want Apple to be added up in one row, but here the apples are added up by date first. I was able to add up all the apples when I removed the date as the first key in the array, but then I can't search by date range.
Any ideas on how to accomplish this?

If I understand correctly what you want to do, then you'd want to put the date as the first element of your array, and use group_level together with startkey and endkey.
E.g. startkey=[1, "someid"] endkey=[1,"someid",{}] group_level=2
This will get you all items from date 1 (obviously choose your own format here), with id "someid" and any name. It seems odd that you emit ids before names, and without more information about what you're actually trying to accomplish, it's hard to advise on your general data model. If the ID is a "type" id, meaning that many items share the same ID, then this makes sense. If the ID is unique per item, then it does not. In that case, you'd want to emit "name" before ID.
Edit 1
As per your comment, to do a range of dates you do this:
startkey=[1] endkey=[5,{}] group_level=2
You will get everything from date 1 to date 5, with one row per date and id (apples, oranges, etc.). I use this exact technique in a very large scale production application. I actually formatted the dates as easily human-readable integers of the form yyyymmdd, such as 20140624, so that they sort in chronological order. If I want everything from the start of the month until now grouped by my group ids, I call
startkey=[20140601] endkey=[20140624,{}] group_level=2
It works perfectly and as far as I can tell that's what you're looking to do. I also have a third key layer "detail" which allows me to provide a deeper level of grouping for items that need it. I can then call
startkey=[20140601, "someid"] endkey=[20140624, "someid",{}] group_level=3
This drills down to the detail level for a particular id; alternatively, use the previous query with group_level=3 to get the details for every id. I'm certain you can make this work: I've solved this exact problem in a production application using the techniques described.
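To make the range-plus-group_level behaviour concrete, here is a small in-memory sketch. It is not CouchDB's implementation: `queryView`, `compareKeys`, the sample rows and the ids "a1"/"p1" are all made up for illustration, and the collation is a simplified stand-in (numbers before strings before objects) for CouchDB's real rules. It shows that `{}` as the last endkey element acts as a "sorts after everything" sentinel, and that group_level=2 returns one row per distinct [date, id] prefix inside the range.

```javascript
// Rows as a map function would emit them: [date, id, name] keys with value 1.
const rows = [
  { key: [20140601, "a1", "Apple"], value: 1 },
  { key: [20140610, "p1", "Pear"], value: 1 },
  { key: [20140610, "a1", "Apple"], value: 1 },
  { key: [20140610, "a1", "Apple"], value: 1 },
  { key: [20140701, "a1", "Apple"], value: 1 },
];

// Simplified collation rank: numbers < strings < objects such as {}.
function rank(v) {
  if (typeof v === "number") return 1;
  if (typeof v === "string") return 2;
  return 3;
}

// Compare two array keys element by element; a shorter key sorts before
// a longer key that starts with the same elements.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const ra = rank(a[i]);
    const rb = rank(b[i]);
    if (ra !== rb) return ra - rb;
    if (ra !== 3 && a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

// Keep rows inside [startkey, endkey], then sum values per key prefix.
function queryView(allRows, { startkey, endkey, group_level }) {
  const totals = new Map();
  allRows
    .filter((r) => compareKeys(r.key, startkey) >= 0 && compareKeys(r.key, endkey) <= 0)
    .sort((x, y) => compareKeys(x.key, y.key))
    .forEach((r) => {
      const groupKey = JSON.stringify(r.key.slice(0, group_level));
      totals.set(groupKey, (totals.get(groupKey) || 0) + r.value); // same effect as _sum
    });
  return [...totals].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

const result = queryView(rows, {
  startkey: [20140601],
  endkey: [20140624, {}],
  group_level: 2,
});
console.log(JSON.stringify(result));
```

The row dated 20140701 falls outside the range and is dropped; the two Apple rows on 20140610 collapse into a single row with value 2.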
Edit 2
If you want to group all apples regardless of date, then the item needs to be the first element in the key, with the date after it. You can then get all apples over all time as a single row in the view result using group_level=1, and apples over a date range by adding the date bounds to startkey and endkey (group_level=1 then returns one total for the range, group_level=2 one row per date). The difference here is that you can only run the date-range query on a single item type at a time. If you want the best of both worlds, you unfortunately need to make 2 views. That's just how key ordering works. If you need fast response times for both types of queries (all item types over a date range, and all of a particular item not grouped by date), I believe 2 views is the only way to achieve that.
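A minimal sketch of what the two views' map functions could look like. The field `doc.date` is an assumption (the question never shows where the date lives), and the local `emit` is only a stand-in so the functions can run outside CouchDB. Because the question's item takes two key elements (id and name), the item-first view gives one row per item at group_level=2 and one row per item and date at group_level=3.

```javascript
// Stand-in for CouchDB's emit so the map functions can be tried locally.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// View 1: date first, for "all items within a date range".
function mapByDate(doc) {
  if (doc.type === "test" && doc.trash != 1) {
    for (const id in doc.items) {
      emit([doc.date, id, doc.items[id].name], 1); // doc.date is an assumed field
    }
  }
}

// View 2: item first, for "one item summed across dates".
function mapByItem(doc) {
  if (doc.type === "test" && doc.trash != 1) {
    for (const id in doc.items) {
      emit([id, doc.items[id].name, doc.date], 1);
    }
  }
}

// Sample document shaped like the question's data.
const sampleDoc = {
  type: "test",
  date: "2012-04-12",
  items: { "356752b8a5f6871f3": { name: "Apple" } },
};
mapByDate(sampleDoc);
mapByItem(sampleDoc);
console.log(JSON.stringify(emitted));
```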
Note
Another thing to note concerns your reduce function. Wherever possible, it is highly recommended that you use the built-in reduce functions. They're implemented in Erlang and are highly optimized compared to custom JavaScript reduce functions.
In your case, just replace your reduce function with this:
_sum
Easy hey?
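For orientation, this is roughly where the built-in reducer sits in a design document, written here as a JavaScript object. The names `_design/items` and `by_date` are placeholders, and the map source is abbreviated to a comment.

```javascript
// Hypothetical design document; "items" and "by_date" are placeholder names.
const designDoc = {
  _id: "_design/items",
  views: {
    by_date: {
      // The map function is stored as a string of JavaScript source.
      map: "function (doc) { /* map function from the question */ }",
      // The built-in reducer is referenced by name instead of JavaScript source.
      reduce: "_sum",
    },
  },
};
console.log(JSON.stringify(designDoc, null, 2));
```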
If you post more info about your application, data model etc. then I'd be happy to help out more with your database design.


arrayformula + sorting (googlesheets) sorting dynamic data

I'm using a Google Form to collect info about game players and their scores for different games, and am trying to create a leaderboard.
I've used an ARRAYFORMULA in col A to bring back unique values of each Player, then another in col B to SUMIF their Scores. Since it's a leaderboard, I want it to autosort by score, descending.
I've tried using various scripts, but it seems that even just using the Sort function doesn't work. It sorts for a second, then resets to the order in which the Players appear in the Form Responses. I'm taking this to mean I can't sort dynamic data in this way.
Any ideas on how I can autosort this, so even when more Players are added it will always act as a leaderboard?
EDITED TO ADD LINK:
https://docs.google.com/spreadsheets/d/1oitJrH-TdeRFfHCCTf9XO2ma4qwN571cPkYJhhPKKi8/edit?usp=sharing
try:
=QUERY(CombinedScores!A2:B, "order by B desc", )

Can you create dynamic formulas in Google Sheets?

So I'm just starting out creating a portfolio tracker within Google Sheets. I'm using the Google Finance functions to get each stock's name and all the relevant data that I need. The only issue is that I can't figure out how to populate the specific data I need without having to manually type out the same formulas for each stock I want data for.
For example... Each row in the first column would contain the ticker symbol for that specific stock. If I bought a new stock, I would just type in the ticker symbol in cell A1 and this would populate the necessary fields such as price and so on. If I bought another stock I would essentially do the same thing but now in A2.
I know that you can get the price of a stock by doing
=GOOGLEFINANCE(A1, "price")
but is there any way to make it dynamic? something like:
=GOOGLEFINANCE(A(Row(ref)), "price")?
Any suggestions would be helpful. Maybe there's even an addon that makes this process simpler, but I'm not sure.
try:
=ARRAYFORMULA(IFERROR(GOOGLEFINANCE(A1:A10, "price")))
You just have to write the function for A1:
=GOOGLEFINANCE(A1, "price")
And then drag the little square on the cell down. It will automatically pick up the corresponding row number in the A column.
You can set up your sheet to have, say, 100 rows prepared, and when you add a ticker it will automatically calculate its price.
If you don't want the #N/A to show, you can do it like this:
=IFERROR(GOOGLEFINANCE(A1, "price"))

ValueFilter for DateTime Attributes

I'm working with the Blog app and I see how to filter the Blog posts by year using the Visual Query Designer. I use the query string value that holds the year in the ValueFilter, and my properties are as follows:
Attribute: PublicationMoment
Value: [QueryString:year]-01-01 and [QueryString:year]-12-31
Operation: between
How would I get the posts from a specific month and year, if those values are passed via query string parameters? Because the months of the year have a varying number of days, I'm not sure how you would accomplish this in the Value field of the ValueFilter. Currently I'm passing the 2-digit month as the parameter.
I tried something like: [QueryString:year]-[Querystring:month]
Operation: contains
but the above operation doesn't really work because the datatype is a DateTime object.
I could do it in the razor view but I'm afraid that the paging datasource would have too many pages in it since it would be based on the larger subset of posts for the given year that was passed in the querystring parameter.
Is there any way to do this with the filter?
Basically dates are not perfectly handled yet, but there are a few ways to do it using the visual query:
Use the correct dates in the query, like between [QueryString:Start] and [QueryString:End], and calculate those dates where you generate the links.
Since your main problem with the "between" filter is that it would include the last day too, you could also use two filters: one with >= the first date and another with < the second date. The first date would be day 1 of the requested year and month; the second would be day 1 of the following month.
Last but not least: if you do it with Razor and LINQ you shouldn't run into any performance issues. It's technically the same thing the pipeline does, and it's been tested to perform well with tens of thousands of records.
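The first-of-month arithmetic behind the two-filter approach can be sketched as follows. `monthRange` is a hypothetical helper, not part of the Blog app or the query designer, and how the two values are passed into the filters is left open here. It sidesteps the varying month lengths by using a half-open range that ends on day 1 of the next month.

```javascript
// Returns the first day of the given month and the first day of the
// following month as yyyy-mm-dd strings. Use them as >= start and < end.
function monthRange(year, month) {
  const y = Number(year);   // query string values arrive as text, e.g. "2016"
  const m = Number(month);  // e.g. "02"
  const pad = (n) => String(n).padStart(2, "0");
  const nextYear = m === 12 ? y + 1 : y;   // December rolls over to next year
  const nextMonth = m === 12 ? 1 : m + 1;
  return {
    start: `${y}-${pad(m)}-01`,
    end: `${nextYear}-${pad(nextMonth)}-01`,
  };
}

const feb = monthRange("2016", "02");
const dec = monthRange(2016, 12);
console.log(feb, dec);
```

Because the end bound is exclusive, February in a leap year and December both work without knowing how many days the month has.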

MS Access: Search through Table for the same part of the String

I am trying to loop through Column Values inside my Table.
I have a registration form, which provides the user with a unique ID based upon their information.
For example:
Country = Austria
Each user who selects the country Austria will get the same country value for that match (let's say 00).
Account ID look like this:
XXXX00UNIQUECODE
Each country has its own unique value: (AT = 00, DE = 01, etc.)
Now, I want to generate a UNIQUE CODE for each user, that will be just an increment (+1) value of the previous UC value stored in the table, for the same country!
In order to do that, I need to somehow loop through the Column, where the Account IDs are stored and search for the match.
The thing is, when a user tries to generate the UNIQUE CODE, he does not have it yet, so he has only:
XXXX00
Now I need to find all the XXXX00 strings in my AccountID Column, and store them in an Array - then find the Max Value of those and increment it.
But I don't know how to search for a part of the string inside a column of the table.
Just the XXXX00 part, not the entire Account ID XXXX00UNIQUECODE.
I hope you can understand me. It's quite complicated, I know, but I'm really stuck here. Hopefully someone will know what I mean and maybe even find a smoother solution for this.
Thanks in advance!
You're pounding a square peg into a round hole. Why not just create a new column called UserID and then you can do:
SELECT Max(UserID) FROM MyTable WHERE Mid(AccountID, 5, 2) = "00"
and increment it by 1.
Better yet, store CountryCode, UserID and the XXXX part in separate fields, and index them. It'll save time when you search or filter, which I assume you're going to be doing.
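The "find the max for this country and add 1" logic, sketched outside of Access. The sample IDs, the 4-character prefix and the purely numeric code are assumptions for illustration; `nextUniqueCode` is a hypothetical helper that mirrors the Mid(AccountID, 5, 2) filter above.

```javascript
// Hypothetical stand-in for the AccountID column: 4-character prefix,
// 2-digit country value, then a numeric unique code.
const accountIds = ["ABCD000001", "ABCD000002", "ABCD010001", "WXYZ000007"];

// Finds the highest numeric code among IDs whose country part
// (characters 5 and 6) matches, and returns that value plus one.
// Returns 1 when no ID exists for the country yet.
function nextUniqueCode(ids, countryCode) {
  const codes = ids
    .filter((id) => id.slice(4, 6) === countryCode)
    .map((id) => parseInt(id.slice(6), 10));
  return codes.length === 0 ? 1 : Math.max(...codes) + 1;
}

console.log(nextUniqueCode(accountIds, "00"));
console.log(nextUniqueCode(accountIds, "01"));
```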

Creating custom rollups with SSAS

I am currently working on a requirement as follows and would appreciate some help in figuring out a way to configure the aggregation of my measure:
I have a fact table that contains the following columns: Item ID, DateID, StoreID, ReceivedComments. The way received comments work is that, on a daily basis, a new record is created that adds to the value of received comments. For example, if Item 5 in Store 5 had 23 Received Comments on 1 Jan and it received 5 comments the following day, the row for Jan 2 would be Item 5, Store 5, Jan 2, 28.
We created a measure using MAX and it works fine whenever Item ID is used in the query. When we start moving to a higher level the max produces wrong results. Our requirement is to setup the measure to be as follows:
If the member selected is on the Item Level then MAX, if it's on any other level (Date or Store) then the measure should aggregate the Max of all Items under this date or store.
Due to the business rules and structure of the database Store and Item are different dimensions so I can not include them in 1 Hierarchy.
We have been playing around with Custom RollUps but so far haven't been able to get it to work.
Thanks
I would solve this by using a more traditional approach to your fact table. Instead of keeping a cumulative count in the ReceivedComments column, I would keep only the number of comments received THAT DAY.
That way, instead of using MAX, you can create your measure using SUM, and it will automatically rollup when you go to higher levels.
The only disadvantage I can see to this approach is that you will need to use a range of dates, instead of only the most recent date, to get a full total of all the comments for a given item/store/date. But that's a very small change to your MDX.
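A small sketch of the difference, using the numbers from the question (23 on Jan 1, 28 on Jan 2). The row shape is made up for illustration; in practice the conversion would happen when loading the fact table.

```javascript
// Cumulative ReceivedComments per day for one item in one store.
const cumulative = [
  { date: "Jan 1", received: 23 },
  { date: "Jan 2", received: 28 },
];

// Convert running totals into per-day counts:
// each day's value minus the previous day's running total.
const daily = cumulative.map((row, i) => ({
  date: row.date,
  received: row.received - (i === 0 ? 0 : cumulative[i - 1].received),
}));

// Summing the per-day counts reproduces the latest cumulative value,
// which is why a plain SUM measure can roll up at any level.
const total = daily.reduce((sum, row) => sum + row.received, 0);
console.log(daily, total);
```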
Someone suggested using ISLEAF to determine the level. Instead of ISLEAF, I went with AS CASE WHEN [Item].[ItemID].CURRENTMEMBER.LEVEL IS [Item].[ItemID].[(All)], so I don't have to account for other dimensions such as Date and Store, as I have several other dimensions that all behave the same way.
I then went with this formula to determine the sum of the max of the items in a particular store:
SUM({[Item].[Item ID].children},[Measures].[ReceivedComments])
I expect some performance issues with this measure, but we are currently running some tests to see whether it will be reliable on actual data.
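What "sum of the max of the items" computes can be illustrated outside the cube. The fact rows below are invented (only the Item 5 values come from the question), and `sumOfItemMax` is a hypothetical helper, not MDX.

```javascript
// Hypothetical fact rows holding cumulative ReceivedComments.
const facts = [
  { item: 5, store: 5, date: "Jan 1", received: 23 },
  { item: 5, store: 5, date: "Jan 2", received: 28 },
  { item: 6, store: 5, date: "Jan 1", received: 4 },
  { item: 6, store: 5, date: "Jan 2", received: 9 },
];

// Take the MAX per item first, then SUM those maxima.
function sumOfItemMax(rows) {
  const maxByItem = new Map();
  for (const r of rows) {
    const current = maxByItem.has(r.item) ? maxByItem.get(r.item) : r.received;
    maxByItem.set(r.item, Math.max(current, r.received));
  }
  let total = 0;
  for (const m of maxByItem.values()) total += m;
  return total;
}

// Store level: the max of item 5 plus the max of item 6.
const storeTotal = sumOfItemMax(facts.filter((r) => r.store === 5));
console.log(storeTotal);
```

A plain MAX over the same store rows would return 28 and ignore item 6 entirely, which is the wrong result described in the question.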
