How do I count events with multiple boolean variables in Keen IO? - analytics

Say one of my Keen IO event properties is an object of booleans: { "is_a": true, "is_b": true, "is_c": false, ... }.
How would I get a count of how many events have each boolean set to true?
i.e. I'd like to get a result that tells me that in the last week there were:
100 events with is_a true
60 events where is_b was true
70 events where is_c was true
Is there any way to do this without making a separate call for each of is_a/b/c?

It's probably easiest to simply run 3 counts for this query, each with a single filter.
However, there is a way you could do it in a single query.
Run a count and group_by all three properties:
var count = new Keen.Query("count", {
  event_collection: "purchases",
  group_by: ["is_a", "is_b", "is_c"]
});
This will count all of the true and false values for all of the combinations of these properties, and you'd have to parse them to pick out the individual cases.
It's less complicated to run the count 3 times.
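Picking out the individual cases from a grouped result comes down to summing, for each property, the counts of every combination in which that property is true. Here is a minimal sketch in Python, assuming the response is a list of objects that carry the three group_by properties plus a "result" count. The sample numbers and the helper name count_true are made up for illustration.

```python
# Hypothetical group_by response: one entry per combination of the three flags.
grouped = [
    {"is_a": True,  "is_b": True,  "is_c": False, "result": 40},
    {"is_a": True,  "is_b": False, "is_c": True,  "result": 60},
    {"is_a": False, "is_b": True,  "is_c": True,  "result": 10},
    {"is_a": False, "is_b": True,  "is_c": False, "result": 10},
    {"is_a": False, "is_b": False, "is_c": False, "result": 25},
]


def count_true(rows, flags=("is_a", "is_b", "is_c")):
    """Sum the per-combination counts into one total per flag."""
    totals = {flag: 0 for flag in flags}
    for row in rows:
        for flag in flags:
            # Only combinations where this flag is true contribute to its total.
            if row.get(flag) is True:
                totals[flag] += row["result"]
    return totals


totals = count_true(grouped)
print(totals)  # {'is_a': 100, 'is_b': 60, 'is_c': 70}
```

The number of combinations doubles with every extra boolean, which is part of why separate filtered counts tend to be easier to reason about.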

Related

For bounded data, how do I get Flink to "trigger" once flatmap has finished outputting all its data

I've explicitly set "batch mode" in Flink's StreamExecutionEnvironment settings, as I'm working with bounded data.
The bounded data passes through a flatmap, and the flatmap is windowed using GlobalWindows. Since the data is bounded, there is a FINITE (though initially unknown) number of elements that will be emitted by the out.collect() calls in the FlatMap. I'd like to trigger a Reduce() function. However, I can't figure out how to tell Flink the following: once the FlatMap has finished outputting all its elements, proceed with the remainder of the code, e.g. do the reduce. (From the documentation, GlobalWindows always use the NeverTrigger, so I presume I need to explicitly set a trigger.) (Note: I believe the CountTrigger won't work, since I don't know a priori the number of elements that the flatmap will output.)
Bonus: Technically, the reduce operation can start as soon as the flatmap starts producing output. I'm not sure exactly how Flink works, but ideally the reduce starts right away and only "completes" after the window closes. (And the window should close, in the case of bounded data, once the flatmap stops producing output.)
===
Edit #1:
Per @kkrugler, here's the skeleton code:
sosCleavedFeaturesEtc
    .flatMap((Tuple4<Float2FloatAVLTreeMap, List<ImmutableFeatureV2>, List<ImmutableFeatureV2>, Integer> tuple4, Collector<Tuple4<Float2FloatAVLTreeMap, List<ImmutableFeatureV2>, Integer, Integer>> out) -> {
        ...
        IntStream.range(0, numBlocksForClustering + 1)
            .forEach(blockIdx -> out.collect(Tuple4.of(rtMapper, unmodifiableLstCleavedFeatures, diaWindowNum, blockIdx)));
    })
    .flatMap((Tuple4<Float2FloatAVLTreeMap, List<ImmutableFeatureV2>, Integer, Integer> tuple4, Collector<Tuple2<Float2FloatAVLTreeMap, Cluster>> out) -> {
        ...
        setClusters
            .stream()
            .filter(cluster -> cluster.getClusterSize() >= minFeaturesInCluster)
            .forEach(e -> out.collect(Tuple2.of(rtMapper, e)));
    })
    .map(tuple -> {
        ...
    })
    .filter(repFeature -> {
        ...
    })
    .windowAll(GlobalWindows.create())
    ...trigger??...
    .aggregate(...);

Collecting elements in a list while iterating a collection in a thread-safe way

I set Salesforce fetchSize=100, but it does not fetch elements in sets of 100 for my query. Therefore I want to collect the single results from the ConsumerIterator into a list, to be handed off to a batch process in sets of 100. I would like to process all the ConsumerIterator elements in batches of 50. If the last batch has fewer than 50 elements, I would still like to process it. Is the code below a correct way to do it? I would appreciate any suggestions on how to do it correctly. My attempt:
ConsumerIterator<HashMap<String,Object>> iter =
    (ConsumerIterator<HashMap<String,Object>>) obj;
List<HashMap<String,Object>> l = new CopyOnWriteArrayList<>();
while (iter.hasNext()) {
    Object payload = iter.next();
    if (l.size() < 50) {
        l.add((HashMap<String,Object>) payload);
    } else {
        write(l);
    }
}
public int[] write(List<HashMap<String,Object>> list)
{
    synchronized (list)
    {
        ArrayList newList = copy(list);
        save(newList);
    }
}
In a Salesforce query, you can append "LIMIT 100" at the end of the query to get at most 100 elements in a list.
I solved the problem by using a fetch size of 100 and then using the resulting ConsumerIterator to aggregate the elements.
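The batching the question describes (full batches of 50, plus a final shorter batch) can be sketched independently of Salesforce. This is a minimal illustration in Python rather than the Java of the question. The generator name batches and the stand-in records are assumptions for the example, not part of any connector API.

```python
from itertools import islice


def batches(iterable, size=50):
    """Yield lists of up to `size` items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        # Pull at most `size` items; an empty list means the source is exhausted.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for the records coming out of the iterator.
records = ({"id": i} for i in range(120))

batch_sizes = [len(batch) for batch in batches(records)]
print(batch_sizes)  # [50, 50, 20]
```

Each batch here is a fresh list, so every element lands in exactly one batch and the remainder is flushed at the end. If only one thread reads the iterator, handing off a fresh list per batch should avoid the need for a copy-on-write list or a synchronized block.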

How do I turn arguments 'integer<120 boolean boolean boolean boolean' into the most compact package possible

I've seen this but considering the extra argument that's 60+ times the information, I've become confused.
I plan on writing a converter for end users that will take arguments,
$genmask int>5<120b[bool(0,1)][bool(0,1)][bool(0,1)][bool(0,1)]
and convert that to a mask that the script can then read. But what if I want them to be able to leave out an option and still have it generate correctly?
Currently, I'm trying things out like
120.to_s(36) + [1.to_s(36), 2.to_s(36), 4.to_s(36), 8.to_s(36)].join
# => "3c1248"
but that isn't quite what I'm looking for. I'm looking more for direct addition, like Linux permissions, or a form where the whole thing can be written in 1-3 characters.
I may be making it too complicated, I don't know.
Four Boolean arguments correspond to a four-bit number, which can be represented as a single hexadecimal character. So if you think your users can do hexadecimal addition, you can use that.
flags = [true, true, true, true]
flags.map { |b| b ? 1 : 0 }.join.to_i(2).to_s(16)
# "f"
"5".to_i(16).to_s(2).rjust(4, ?0).each_char.map { |c| c == ?1 }
# [ false, true, false, true]
And since you can represent 120 in two characters with base 36, that gets all your arguments in 3 characters as desired.
But to be honest that sounds like an awfully confusing interface. chmod's octal arguments are barely understandable and this looks way more confusing.
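The three-character scheme described above (two base-36 characters for the integer, one hexadecimal character for the four flags) can be sketched end to end. This sketch is in Python rather than Ruby, and the function names pack and unpack are hypothetical.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def pack(number, flags):
    """Encode an integer in 0..1295 and four booleans as three characters."""
    if not 0 <= number < 36 * 36:
        raise ValueError("number must fit in two base-36 digits")
    if len(flags) != 4:
        raise ValueError("expected exactly four flags")
    bits = 0
    for flag in flags:
        # The first flag becomes the most significant bit.
        bits = (bits << 1) | int(bool(flag))
    return DIGITS[number // 36] + DIGITS[number % 36] + DIGITS[bits]


def unpack(mask):
    """Reverse pack(): return the integer and the list of four booleans."""
    number = int(mask[:2], 36)
    bits = int(mask[2], 16)
    flags = [bool(bits & (1 << shift)) for shift in (3, 2, 1, 0)]
    return number, flags


mask = pack(119, [False, True, False, True])
print(mask)          # 3b5
print(unpack(mask))  # (119, [False, True, False, True])
```

Keeping the integer and the flags in fixed positions means an omitted option can simply default to false without changing the length of the mask.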

Use a boolean expression with ng-pluralize

Using ng-pluralize with this template:
Your subscription <span ng-pluralize count="::vm.account.subscription.expirationDays"
when="{ '-1': 'has expired!',
'0': 'expires today!',
'one': 'expires tomorrow.',
'other': 'expires in {} days.'}"></span>
Yields the following result:
Expiration Days   Label
-1                Your subscription has expired!
0                 Your subscription expires today!
1                 Your subscription expires tomorrow.
X                 Your subscription expires in X days.
However, this breaks as soon as a subscription expired 2 or more days ago.
Is it possible to define a boolean expression as a when clause, so that
vm.account.subscription.expirationDays < 0 === 'has expired!'
Currently I have to handle expired labels in a different element, which rather defeats the purpose of using ng-pluralize.
It looks like your scenario is, albeit perhaps a common one, too complex for ngPluralize. I also doubt it will change, because ngPluralize is based on "plural categories":
http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html
The problem being that en-US, Angular's default locale, defines only the categories "one" and "other". Anything that doesn't fall into those categories is explicitly defined (or inferred by $locale.pluralCat).
The three best options for your scenario that immediately come to me are:
1) Simplest would be to have two objects:
when="count >=0 ? positivePlurals : negativePlurals"
where, of course $scope.count = vm.account.subscription.expirationDays, positivePlurals is your positive phrases and negativePlurals is your negative phrases.
2) Wrap a localization library that supports many-or-custom plural rules (such as i18next) in a directive, and use that instead. I'm not very familiar with the popular angular-translate, but at first glance it doesn't seem to support custom pluralization rules. It does, however, allow logic in interpolation, so you might get away with that.
3) Write a directive similar to ngPluralize that supports ("-other", "x", "other"). The source for ngPluralize is available here. It would probably be as simple as modifying the statement at L211 in a way similar to:
var countIsNaN = isNaN(count);
var countIsNegative = count < 0;
if (!countIsNaN && !(count in whens)) {
  // If an explicit number rule such as 1, 2, 3... is defined, just use it.
  // Otherwise, check it against pluralization rules in $locale service.
  count = $locale.pluralCat(count - offset);
  if (countIsNegative) {
    count = '-' + count; // "-one", "-other"
  }
}

Django - would these query sets be cached?

from django.db import models

class UnassignedThread(models.Manager):
    def get_queryset(self):
        return super(UnassignedThread, self).get_queryset().filter(
            _irc_name__isnull=True)
Would results = ThreadVault.unassigned_threads.all() be cached? I am not certain whether __isnull=True counts as evaluating the queryset (since evaluation is what populates the cache).
Also, if I have a model called ThreadVault and I want to look up whether threads #777 and #888 exist in the database, which way makes the best use of the cache for the lookup?
ThreadVault.objects.get(thread_id="777")
ThreadVault.objects.get(thread_id="888")
or
results = ThreadVault.objects.all()
for ticket in results:
    if ticket.thread_id == "777" or ticket.thread_id == "888":
        pass  # do something
No. Querysets are lazy until they are evaluated, for example by iterating over them or slicing with a step. filter simply adds conditions to the query and does not evaluate it.
For your second question, neither of these is great, although the first is vastly preferable to the second (which involves loading and iterating through every object in the table). Instead, you should use exists() in conjunction with an __in filter:
ThreadVault.objects.filter(thread_id__in=["777", "888"]).exists()
Note that this returns True if at least one of the two threads exists.
Neither of these questions has anything to do with caching.
th_ids = ["777","888"]
ThreadVault.objects.filter(thread_id__in=th_ids).exists()
To cache your view:
from django.views.decorators.cache import cache_page
@cache_page(60 * 15)
def my_view(request):
    ...

Resources