Getting max value in CouchDB

I am trying to use CouchDB as my database.
I have a problem when I want to get the max value from 50,000 records.
What is the best solution to find the max value?
I use this reduce function:
function (key, values, rereduce) {
    var largest = Math.max.apply(Math, key);
    return largest;
}
And this is my map function:
function(doc) {
    if (doc.TYPE == "customer") {
        emit(doc._id.substring(0, 8), doc._id);
    }
}
Thank you

When writing reduce functions, you should not forget to account for the rereduce parameter. However, you don't really need a reduce here, for two reasons:
1) You can just use descending=true when you query the view and take the first row to get the largest item (so you wouldn't need a reduce step at all).
2) You can use the built-in reduce function _stats and read its max value without writing your own reduce function.
The advantage of using only the map function, without a reduce, is that you get not just the largest value but also the document that contains it.
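For example, here is a minimal sketch of the map-only approach; the database name, design document, and view name below are assumptions for illustration, not from the question:
// Read the view in descending key order and take the first row; with no
// reduce step, that row holds the largest key and (via include_docs) the
// document it came from.
const url = 'http://localhost:5984/mydb/_design/customers/_view/by-prefix'
    + '?descending=true&limit=1&include_docs=true';

fetch(url)
    .then(res => res.json())
    .then(body => {
        const top = body.rows[0];       // row with the largest key
        console.log(top.key, top.doc);  // the max value and its document
    });
Note that _stats only works when the emitted value is numeric, so with the string values emitted by the map function above, the descending query is the more direct option.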

Related

Is Slice<Results> still lazy loaded in Realm?

I'm trying to limit the results of my Realm query. If I have a million records and I call Swift's prefix function, does it touch all million records?
Here's what I'm trying to do:
let objects = realm.objects(BookRealmObject.self)
    .sorted(byKeyPath: "createdAt", ascending: false)
let items: [BookType] = {
    guard let limit = request.limit, limit > 0 else {
        return objects.map { Book(from: $0) }
    }
    return objects.prefix(limit).map { Book(from: $0) }
}()
The type returned from prefix is Slice<Results<Element>>. Whether or not the caller requests a limit, I need to convert the result to plain objects to pass to different threads.
Is this the proper way to handle this, or is there a more optimized, concise way to do this?
As we can find in the docs:
Since queries in Realm are lazy, performing this sort of paginating behavior isn’t necessary at all, as Realm will only load objects from the results of the query once they are explicitly accessed.
So when you take the prefix of objects it should still be lazy, but once you access the objects with map you lose that laziness.

Collecting elements in a list while iterating a collection in a thread-safe way

I set Salesforce fetchSize=100, but it does not fetch elements in sets of 100 for my query. I therefore want to collect the results from the ConsumerIterator into a list, to be handed off to a batch process in sets of 100. Is this a correct way to do it? I would appreciate any suggestions on how to do it correctly. I would like to process all the ConsumerIterator elements in batches of 50; if a batch has fewer than 50 elements, I would like to process that batch as well. My attempt is below:
ConsumerIterator<HashMap<String, Object>> iter =
        (ConsumerIterator<HashMap<String, Object>>) obj;
List<HashMap<String, Object>> l = new CopyOnWriteArrayList<>();
while (iter.hasNext()) {
    Object payload = iter.next();
    if (l.size() < 50) {
        l.add((HashMap<String, Object>) payload);
    } else {
        write(l);
    }
}

public int[] write(List<HashMap<String, Object>> list)
{
    synchronized (list)
    {
        ArrayList newList = copy(list);
        save(newList);
    }
}
In a Salesforce query, you can append "LIMIT 100" to the end of the query to get only 100 elements in the list.
I solved the problem by using a fetch size of 100 and then using the resulting ConsumerIterator to aggregate the elements.

Compare array with sorted Array, pick first element

The setup is the following:
targets = ['green','orange','red']; //targets are in order of priority
sources = ['redalert','blackadder','greenlantern'];
I am trying to make a function that returns the one source element which contains the highest priority target string. In this case, it would be 'greenlantern', as it contains the string 'green', which has higher priority than 'red' found in 'redalert'.
I have done it already using for loops and temp arrays, but I know these manipulations aren't my forte, and my real-life arrays are way larger, so I'd like to optimize execution. I have tried with Lodash too, but can't figure out how to do it all in one step. Is it possible?
The way I see it, it has to:
for each target, loop through the sources; if a source element contains the target, break and return (a plain sketch of that idea is below).
But I'm sure there's a better way.
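For reference, here is a minimal plain-JavaScript sketch of that nested-loop idea; the function name findByPriority is made up for illustration:
function findByPriority(targets, sources) {
    // Walk targets in priority order; the first source that contains the
    // current target wins, so lower-priority targets are never reached.
    for (var i = 0; i < targets.length; i++) {
        for (var j = 0; j < sources.length; j++) {
            if (sources[j].indexOf(targets[i]) > -1) {
                return sources[j];
            }
        }
    }
    return null; // no source matches any target
}

findByPriority(['green', 'orange', 'red'],
               ['redalert', 'blackadder', 'greenlantern']); // 'greenlantern'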
Here's another lodash approach that uses reduce() instead of sortBy():
_.reduce(targets, function(result, target) {
    return result.concat(_.filter(sources, function(source) {
        return _.includes(source, target);
    }));
}, []);
Since targets is already in priority order, you can iterate over it and build the result in the same order. You use reduce() because you're building the result iteratively, rather than as a direct mapping.
Inside the reduce callback, you can concat() results by using filter() and includes() to find the appropriate sources.
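As a usage sketch, assuming the sample arrays from the question and lodash installed as a Node dependency, this is what the reduce produces:
var _ = require('lodash');

var targets = ['green', 'orange', 'red'];
var sources = ['redalert', 'blackadder', 'greenlantern'];

var ranked = _.reduce(targets, function(result, target) {
    return result.concat(_.filter(sources, function(source) {
        return _.includes(source, target);
    }));
}, []);

console.log(ranked);    // ['greenlantern', 'redalert'] -- no source contains 'orange'
console.log(ranked[0]); // 'greenlantern', the highest-priority match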
This gets you the sorted array, but it's also doing a lot of unnecessary work if you only want the first source that corresponds to the first target:
_.find(sources, _.ary(_.partialRight(_.includes, _.first(targets)), 1));
Or, if you prefer not to compose callback functions:
_.find(sources, function(item) {
    return _.includes(item, _.first(targets));
});
Essentially, find() will only iterate over the sources collection till there's a match. The first() function gets you the first target to look for.
Keeping it very simple:
var sortedSources = _.sortBy(sources, function(source) {
    var rank = 0;
    while (rank < targets.length) {
        if (source.indexOf(targets[rank]) > -1) {
            break;
        } else {
            rank++;
        }
    }
    return rank;
});
Sources are now sorted by target priority, thus sortedSources[0] is your man.

System.TypeException: Cannot have more than 10 chunks in a single operation

I have this very weird error, "System.TypeException: Cannot have more than 10 chunks in a single operation". Has anyone seen or encountered this before? Please can you guide me if you know how to solve it?
I am trying to insert different types of sObjects together in one List<sObject>. The list never has more than 10 rows.
This post here:
https://developer.salesforce.com/forums/ForumsMain?id=906F000000090nUIAQ
suggests that it is not the number of different sObjects, but the order of the objects, that causes this chunk limit to be exceeded. In other words, "1,1,1,2,2,2" has two chunks (a single switch from "1" to "2"), while "1,2,3,4,5,6" has six chunks, even though the number of elements is the same (a small illustration follows below). Putting the objects into the list sorted by object type is the suggested solution.
Is it possible for you to create a reasonable test case with only 2 or 3 rows?
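To make the chunk definition concrete, here is a small JavaScript illustration (not Apex, purely to show the counting rule described above); a chunk is a run of consecutive records of the same sObject type, and a mixed DML list may contain at most 10 such chunks:
function countChunks(types) {
    // Count runs of consecutive equal types; each run is one chunk.
    var chunks = 0;
    for (var i = 0; i < types.length; i++) {
        if (i === 0 || types[i] !== types[i - 1]) {
            chunks++;
        }
    }
    return chunks;
}

countChunks(['1', '1', '1', '2', '2', '2']); // 2 chunks, one type switch
countChunks(['1', '2', '3', '4', '5', '6']); // 6 chunks, same element count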
There are two possible explanations for this issue:
As Jagular noted, you did not order the sObjects you tried to insert, so there are more than 10 'chunks' in the list.
You tried to insert more than 2,000 records spanning more than one sObject type. This one seems like a Salesforce bug, since the error message doesn't match the issue.
Scenario 1 and its solution
When you have a hybrid list, make sure that the objects are not scattered without any order, for example A,B,A,B,A,B,A,B…. Salesforce has trouble switching sObject types more than 10 times; it calls this switching limit the chunking limit. So if you sort this hybrid list before passing it to DML, Salesforce will be much happier, for example A,A,A,A,B,B,B,B…. In this case Salesforce only has to switch once (that is, read all A objects → switch → read all B objects). The maximum chunk limit is 10, so here we are safe.
listToUpdate.sort();
UPDATE listToUpdate;
Scenario 2 and its solution
Another point to bear in mind is that if the hybrid list contains a large number of objects of one type, we can still run into the TypeException. For example, if the list contains 1001 objects of type A and 1001 objects of type B, the total is 2002 objects. The maximum number of chunks allowed is 10, so by simple math each chunk would need to hold about 2002/10 ≈ 200 objects, and Salesforce also enforces the rule that a single chunk cannot contain more than 200 records. In this case we have to foresee how many objects can enter this code and write it so that lists of a safe size are passed to DML each time.
Scenario 3 and its solution
The third scenario is when the hybrid list contains objects of more than 10 types: even if the list is very small, a switch happens every time Salesforce reads a different sObject type. So in this case we have to allot a separate list for each sObject type and pass each one on for DML. Doing this in an Apex trigger or Apex class can cause trouble, since multiple DMLs are initiated in one context. Passing these per-type sObject lists to DML in a different context really eases the load you put on the platform, so consider doing this kind of logic in a Batch Apex job rather than in an Apex trigger or class.
Hope this helps.
Below is code that should cover all 3 scenarios from Arpit Sethi.
It's a piece of code I took from this topic: https://developer.salesforce.com/forums/?id=906F000000090nUIAQ.
and modified to cover Scenario 2.
private static void saveSobjectSet(List<Sobject> listToUpdate) {
    Integer SFDC_CHUNK_LIMIT = 10;
    // Developed this part due to System.TypeException: Cannot have more than 10 chunks in a single operation
    Map<String, List<Sobject>> sortedMapPerObjectType = new Map<String, List<Sobject>>();
    Map<String, Integer> numberOf200ChunkPerObject = new Map<String, Integer>();
    for (Sobject obj : listToUpdate) {
        String objTypeREAL = String.valueOf(obj.getSObjectType());
        if (!numberOf200ChunkPerObject.containsKey(objTypeREAL)) {
            numberOf200ChunkPerObject.put(objTypeREAL, 1);
        }
        // Number of 200-record chunks so far for this object type
        Integer numberOf200Record = numberOf200ChunkPerObject.get(objTypeREAL);
        // Map key: object type + index of its 200-record chunk
        String objTypeCURRENT = objTypeREAL + String.valueOf(numberOf200Record);
        // Current list for that key
        List<Sobject> currentList = sortedMapPerObjectType.get(objTypeCURRENT);
        if (currentList == null || currentList.size() > 199) {
            if (currentList != null && currentList.size() > 199) {
                // Current chunk is full: start a new 200-record chunk for this type
                numberOf200ChunkPerObject.put(objTypeREAL, numberOf200Record + 1);
                objTypeCURRENT = objTypeREAL + String.valueOf(numberOf200Record + 1);
            }
            sortedMapPerObjectType.put(objTypeCURRENT, new List<Sobject>());
        }
        sortedMapPerObjectType.get(objTypeCURRENT).add(obj);
    }
    while (sortedMapPerObjectType.size() > 0) {
        // Build a new list that contains at most SFDC_CHUNK_LIMIT type chunks,
        // grouped and ordered by type, so the update never exceeds the chunk limit
        List<Sobject> safeListForChunking = new List<Sobject>();
        List<String> keyListSobjectType = new List<String>(sortedMapPerObjectType.keySet());
        for (Integer i = 0; i < SFDC_CHUNK_LIMIT && !sortedMapPerObjectType.isEmpty(); i++) {
            List<Sobject> listSobjectOfOneType = sortedMapPerObjectType.remove(keyListSobjectType.remove(0));
            safeListForChunking.addAll(listSobjectOfOneType);
        }
        update safeListForChunking;
    }
}
Hope it helps,
Bye
Hi, I kind of devised a simple way to sort a list of different sObject types:
public List<Sobject> SortRecordsByType(List<Sobject> records) {
    List<Sobject> response = new List<Sobject>();
    Map<String, List<Sobject>> sortDictionary = new Map<String, List<Sobject>>();
    for (Sobject record : records) {
        String objectTypeName = record.getSobjectType().getDescribe().getName();
        if (sortDictionary.containsKey(objectTypeName)) {
            sortDictionary.get(objectTypeName).add(record);
        } else {
            sortDictionary.put(objectTypeName, new List<Sobject>{record});
        }
    }
    // arrange in order, one contiguous group per type
    for (String objectName : sortDictionary.keySet()) {
        response.addAll(sortDictionary.get(objectName));
    }
    return response;
}
Hopefully this solves your problem.

Using Active Record pattern in CakePHP, and avoiding passing arrays around

As my CakePHP 2.4 app gets bigger, I'm noticing I'm passing a lot of arrays around in the model layer. Cake has kind of led me down this path because it returns arrays, not objects, from its find calls. But more and more, it feels like terrible practice.
For example, in my Job model, I've got a method like this:
public function durationInSeconds($job) {
    return $job['Job']['estimated_hours'] * 3600; // convert to seconds
}
Whereas I imagine that using the Active Record pattern, it should look more like this:
public function durationInSeconds() {
    return $this->data['Job']['estimated_hours'] * 3600; // convert to seconds
}
(i.e., take no parameter, and assume the current instance represents the Job you want to work with)
Is that second way better?
And if so, how do I use it when, for example, I'm looping through the results of a find('all') call? Cake returns an array - do I loop through that array and do a read for every single row? (seems a waste to re-fetch the info from the database)
Or should I implement a kind of setActiveRecord method that emulates read, like this:
function setActiveRecord($row) {
    $this->id = $row['Job']['id'];
    $this->data = $row;
}
Or is there a better way?
EDIT: The durationInSeconds method was just a simplest possible example. I know for that particular case, I could use virtual fields. But in other cases I've got methods that are somewhat complex, where virtual fields won't do.
The best solution depends on the issue you need to solve. But if you have to call a function for each result row, perhaps you need to redesign the query so that it fetches all the necessary data.
In the case you have shown, you can simply use a virtual field on the Job model:
$this->virtualFields = array(
    'duration_in_seconds' => 'Job.estimated_hours * 3600',
);
...and/or you can use a method like this:
public function durationInSeconds($id = null) {
    if (!empty($id)) {
        $this->id = $id;
    }
    return $this->field('estimated_hours') * 3600; // convert to seconds
}
