Is there a limit on the number of entities you can query from the GAE datastore? - google-app-engine

My GCM Endpoint is derived from the code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main. Each Android client device
registers with the endpoint. A message can be sent to the first 10 registered devices using this code:
#Api(name = "messaging", version = "v1", namespace = #ApiNamespace(ownerDomain = "${endpointOwnerDomain}", ownerName = "${endpointOwnerDomain}", packagePath="${endpointPackagePath}"))
public class MessagingEndpoint {
private static final Logger log = Logger.getLogger(MessagingEndpoint.class.getName());
/** Api Keys can be obtained from the google cloud console */
private static final String API_KEY = System.getProperty("gcm.api.key");
/**
* Send to the first 10 devices (You can modify this to send to any number of devices or a specific device)
*
* #param message The message to send
*/
public void sendMessage(#Named("message") String message) throws IOException {
if(message == null || message.trim().length() == 0) {
log.warning("Not sending message because it is empty");
return;
}
// crop longer messages
if (message.length() > 1000) {
message = message.substring(0, 1000) + "[...]";
}
Sender sender = new Sender(API_KEY);
Message msg = new Message.Builder().addData("message", message).build();
List<RegistrationRecord> records = ofy().load().type(RegistrationRecord.class).limit(10).list();
for(RegistrationRecord record : records) {
Result result = sender.send(msg, record.getRegId(), 5);
if (result.getMessageId() != null) {
log.info("Message sent to " + record.getRegId());
String canonicalRegId = result.getCanonicalRegistrationId();
if (canonicalRegId != null) {
// if the regId changed, we have to update the datastore
log.info("Registration Id changed for " + record.getRegId() + " updating to " + canonicalRegId);
record.setRegId(canonicalRegId);
ofy().save().entity(record).now();
}
} else {
String error = result.getErrorCodeName();
if (error.equals(Constants.ERROR_NOT_REGISTERED)) {
log.warning("Registration Id " + record.getRegId() + " no longer registered with GCM, removing from datastore");
// if the device is no longer registered with Gcm, remove it from the datastore
ofy().delete().entity(record).now();
}
else {
log.warning("Error when sending message : " + error);
}
}
}
}
}
The above code sends to the first 10 registered devices. I would like to send to all registered clients. According to http://objectify-appengine.googlecode.com/svn/branches/allow-parent-filtering/javadoc/com/googlecode/objectify/cmd/Query.html#limit(int) setting limit(0) accomplishes this. But I'm not convinced there will not be a problem for very large numbers of registered clients due to memory constraints or the time it takes to execute the query. https://code.google.com/p/objectify-appengine/source/browse/Queries.wiki?repo=wiki states "Cursors let you take a "checkpoint" in a query result set, store the checkpoint elsewhere, and then resume from where you left off later. This is often used in combination with the Task Queue API to iterate through large datasets that cannot be processed in the 60s limit of a single request".
Note the comment about the 60s limit of a single request.
So my question - if I modified the sample code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main to request all objects from the datastore, by replacing limit(10) with limit(0), will this ever fail for a large number of objects? And if it will fail, at roughly what number of objects?

This is a poor pattern, even with cursors. At the very least, you'll hit the hard 60s limit for a single request. And since you're doing updates on the RegistrationRecord, you need a transaction, which will slow down the process even more.
This is exactly what the task queue is for. The best way is to do it in two tasks:
Your api endpoint enqueues "send message to everyone" and returns immediately.
That first task is the "mapper" which iterates the RegistrationRecords with a keys-only query. For each key, enqueue a "reducer" task for "send X message to this record".
The reducer task sends the message and (in a transaction) performs your record update.
Using Deferred this actually isn't much code at all.
The first task frees your client immediately and gives you 10 minutes to iterate RegistrationRecord keys rather than the 60s limit of a normal request. If you have your chunking right and batch your queue submissions, you should be able to generate thousands of reducer tasks per second.
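A minimal sketch of that two-task split, assuming App Engine's deferred task queue API and the RegistrationRecord entity from the question; the class names and the elided send logic are hypothetical placeholders, not the template's own code:

import static com.googlecode.objectify.ObjectifyService.ofy;

import java.util.List;
import com.google.appengine.api.taskqueue.DeferredTask;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;
import com.googlecode.objectify.Key;

// "Mapper": iterate registration keys (keys-only query) and fan out one task per device.
class BroadcastMessageTask implements DeferredTask {
    private final String message;

    BroadcastMessageTask(String message) {
        this.message = message;
    }

    @Override
    public void run() {
        // Running inside a task gives this loop the 10-minute deadline instead of 60s.
        Queue queue = QueueFactory.getDefaultQueue();
        List<Key<RegistrationRecord>> keys =
                ofy().load().type(RegistrationRecord.class).keys().list();
        for (Key<RegistrationRecord> key : keys) {
            // For higher throughput, collect TaskOptions and add them in batches.
            queue.add(TaskOptions.Builder.withPayload(new SendMessageTask(key, message)));
        }
    }
}

// "Reducer": send to a single device and update/delete its record, as in sendMessage().
class SendMessageTask implements DeferredTask {
    private final Key<RegistrationRecord> recordKey;
    private final String message;

    SendMessageTask(Key<RegistrationRecord> recordKey, String message) {
        this.recordKey = recordKey;
        this.message = message;
    }

    @Override
    public void run() {
        RegistrationRecord record = ofy().load().key(recordKey).now();
        if (record == null) {
            return;
        }
        // ... build the Message, call sender.send(...), and handle the Result exactly as in
        // sendMessage(), but for this one record (wrap the datastore update in a transaction) ...
    }
}

The API endpoint then just calls QueueFactory.getDefaultQueue().add(TaskOptions.Builder.withPayload(new BroadcastMessageTask(message))) and returns.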
This will effortlessly scale to hundreds of thousands of users, and might get you into millions. If you need to scale higher, you can apply a map/reduce approach to parallelize the mapping. Then it's just a question of how many instances you want to throw at the problem.
I have used this approach to great effect in the past sending out millions of apple push notifications at a time. The task queue is your friend, use it heavily.

Your query will time out if you try to retrieve too many entities. You will need to use cursors in your loop.
No one can say how many entities can be retrieved before this timeout - it depends on the size of your entities, complexity of your query, and, most importantly, what else happens in your loop. For example, in your case you can dramatically speed up your loop (and thus retrieve many more entities before a timeout) by creating tasks instead of building and sending messages within the loop itself.
Note that by default a query returns entities in chunks of 20 - you will need to increase the chunk size if you have a large number of entities.
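A rough sketch of that cursor loop with Objectify, assuming the RegistrationRecord entity from the question; the batch size of 500 is arbitrary:

import static com.googlecode.objectify.ObjectifyService.ofy;

import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;

// Walks all RegistrationRecords in batches, checkpointing with a cursor between batches.
void forEachRegistration(String message) {
    Cursor cursor = null;
    boolean fetchedAnything = true;
    while (fetchedAnything) {
        Query<RegistrationRecord> query = ofy().load().type(RegistrationRecord.class)
                .limit(500)   // process a manageable batch per iteration
                .chunk(500);  // raise the fetch chunk size from the default of 20
        if (cursor != null) {
            query = query.startAt(cursor);
        }
        QueryResultIterator<RegistrationRecord> it = query.iterator();
        fetchedAnything = false;
        while (it.hasNext()) {
            RegistrationRecord record = it.next();
            // handle the record here - e.g. enqueue a send task rather than
            // building and sending the GCM message inside this loop
            fetchedAnything = true;
        }
        cursor = it.getCursor(); // checkpoint: can be handed to a follow-up task to resume later
    }
}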

Related

OptaPlanner: timetable resuming generates a wrong solution, even if the generation starts from where it stopped

I am using Java Spring Boot and OptaPlanner to generate a timetable with almost 20 constraints. At the initial generation, everything works fine: the score shown by the OptaPlanner logging messages matches the solution received. But when I want to resume the generation, the solution contains a lot of problems (the constraints are not respected anymore), even though the generation starts from where it stopped and continues initializing or searching for a best solution.
My project is divided into two microservices: one that communicates with the UI and keeps the database, and another that receives data from the first when a request to start/resume the generation is made and generates the schedule using OptaPlanner. I use the same request for starting and resuming the generation.
This is how my project works: the UI makes the requests for starting, resuming, stopping the generation and getting the timetable. These requests are handled by the first microservice, which uses WebClient to send new requests to the second microservice. There, the timetable is generated after asking for some data from the database.
Here is the method for starting/resuming the generation from the second microservice:
@PostMapping("startSolver")
public ResponseEntity<?> startSolver(@PathVariable String organizationId) {
    try {
        SolverConfig solverConfig = SolverConfig.createFromXmlResource("solver/timeTableSolverConfig.xml");
        SolverFactory<TimeTable> solverFactory = new DefaultSolverFactory<>(solverConfig);
        this.solverManager = SolverManager.create(solverFactory);
        this.solverManager.solveAndListen(TimeTableService.SINGLETON_TIME_TABLE_ID,
                id -> timeTableService.findById(id, UUID.fromString(organizationId)),
                timeTable -> timeTableService.updateModifiedLessons(timeTable, organizationId));
        return new ResponseEntity<>("Solving has successfully started", HttpStatus.OK);
    } catch (OptaPlannerException exception) {
        System.out.println("OptaPlanner exception - " + exception.getMessage());
        return utils.generateResponse(exception.getMessage(), HttpStatus.CONFLICT);
    }
}
-> the findById(...) method makes a request to the first microservice, expecting to receive all the data needed by the constraints for the generation (lists of planning entities, planning variables and all other useful data)
public TimeTable findById(Long id, UUID organizationId) {
    SolverDataDTO solverDataDTO = webClient.get()
            .uri("http://localhost:8080/smart-planner/org/{organizationId}/optaplanner-solver/getSolverData",
                    organizationId)
            .retrieve()
            .onStatus(HttpStatus::isError, error -> {
                LOGGER.error(extractExceptionMessage("findById.fetchFails", "findById()"));
                return Mono.error(new OptaPlannerException(
                        extractExceptionMessage("findById.fetchFails", "")));
            })
            .bodyToMono(SolverDataDTO.class)
            .block();
    TimeTable timeTable = new TimeTable();
    /* ... populating all lists from TimeTable with the ones received in solverDataDTO ... */
    return timeTable;
}
-> the updateModifiedLessons(...) method sends the first microservice the list of all generated planning entities with the corresponding planning variables assigned
public void updateModifiedLessons(TimeTable timeTable, String organizationId) {
    List<ScheduleSlot> slots = new ArrayList<>(timeTable.getScheduleSlotList());
    List<SolverScheduleSlotDTO> solverScheduleSlotDTOs =
            scheduleSlotConverter.convertModelsToSolverDTOs(slots);
    String executionMessage = webClient.post()
            .uri("http://localhost:8080/smart-planner/org/{organizationId}/optaplanner-solver/saveTimeTable",
                    organizationId)
            .header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
            .body(Mono.just(solverScheduleSlotDTOs), SolverScheduleSlotDTO.class)
            .retrieve()
            .onStatus(HttpStatus::isError, error -> {
                LOGGER.error(extractExceptionMessage("saveSlots.savingFails", "updateModifiedLessons()"));
                return Mono.error(new OptaPlannerException(
                        extractExceptionMessage("saveSlots.savingFails", "")));
            })
            .bodyToMono(String.class)
            .block();
}
I would probably start by making sure that the solution you save to the DB after the first run of startSolver() is the same (in terms of Java equality), including the assignments of planning variables to values, as the solution you retrieve via findById() at the beginning of the second run.
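For instance, a rough consistency check along these lines (a sketch only, assuming TimeTable exposes getScheduleSlotList() and that ScheduleSlot implements equals()/hashCode() over its planning facts and variables; the helper name and logger are illustrative) could confirm that what was persisted is what the solver is handed on resume:

// Hypothetical helper; run it after updateModifiedLessons(...) has persisted the best
// solution and before the second startSolver() call. Uses java.util.HashSet.
private boolean isConsistentAcrossRestart(TimeTable persistedBestSolution, String organizationId) {
    // Reload through exactly the same path the solver will use when resuming
    TimeTable reloaded = timeTableService.findById(
            TimeTableService.SINGLETON_TIME_TABLE_ID, UUID.fromString(organizationId));

    // Compare slot assignments as sets; this relies on equals()/hashCode() covering the
    // planning variables - otherwise compare the relevant fields explicitly instead.
    boolean sameAssignments = new HashSet<>(persistedBestSolution.getScheduleSlotList())
            .equals(new HashSet<>(reloaded.getScheduleSlotList()));
    if (!sameAssignments) {
        LOGGER.warn("Timetable retrieved by findById() differs from the last persisted best solution");
    }
    return sameAssignments;
}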

Salesforce trigger/outbound api call does not seem to be working with batches of donations

The problem I am running into is that I cannot get the API call to our servers to run when someone enters donations with batches. If the donation is entered individually, it works perfectly fine. But when a batch is used, an error is returned that says a future method cannot be called from a future or batch method. So I commented out the "@future (callout=true)" above the method, and then the record is created, but the API call is never made. But if I leave it uncommented, we get the error.
Basically this is how the structure of the code is set up:
Apex trigger class triggers when a new donation is added (Pulls data relating to donation)
The trigger calls a class that has a method that will pull more data about the account and contact from the data pulled from the trigger.
That method then calls another method (in the same class) that makes an API call to send the data to our servers for us to process.
Here is the code:
Trigger:
trigger NewDonor on Donation (after insert) {
    for (Donation d : Trigger.new) {
        if (d.Stage == 'Closed Won' || d.Stage == 'Received') {
            //Get contacts account id
            string accountId = d.AccountId;
            decimal money = d.Amount;
            string donation_amount = money.toPlainString();
            string donation_id = d.Id;
            ProcessDonor.SerializeJson(accountId, donation_amount, donation_id);
        }
    }
}
Class:
public class ProcessDonor {
    String account;
    String JSONString;

    //Method to make http request
    public static String jsonHandler(String json1, String json2, String account, String donoAmount, String donoId) {
        String JSONString1 = json1;
        String JSONString2 = json2;
        String accountId = account;
        String donation_amount = donoAmount;
        String donation_id = donoId;
        Http http = new Http();
        HttpRequest request = new HttpRequest();
        request.setEndpoint('https://urlToTerver');
        request.setMethod('POST');
        //Set headers
        request.setHeader('Content-Type', 'application/json;charset=UTF-8');
        request.setHeader('accountDataFields', JSONString1);
        request.setHeader('contactDataFields', JSONString2);
        request.setHeader('accountId', accountId);
        request.setHeader('donationAmount', donation_amount);
        request.setHeader('donationId', donation_id);
        // Set the body as a JSON object
        HttpResponse response = http.send(request);
        // Parse the JSON response
        if (response.getStatusCode() != 200) {
            System.debug('The status code returned was not expected: ' +
                response.getStatusCode() + ' ' + response.getBody());
            return 'The status code returned was not expected: ' + response.getStatusCode() + ' ' + response.getStatus();
        } else {
            System.debug(response.getBody());
            return response.getBody();
        }
    }

    //Main method 2
    //@future (callout=true)
    public static void SerializeJson(String account, String amount, String donationId) {
        String accountId = account;
        String donationAmount = amount;
        String donoId = donationId;
        List<Account> accountQuery = Database.query('SELECT Id, Name, FirstGiftDate__c, LatestGiftDate__c, LifetimeDonationCount__c FROM Account WHERE Id = :accountId');
        Decimal donationCount = accountQuery[0].LifetimeDonationCount__c;
        Date latestGiftDate = accountQuery[0].LatestGiftDate__c;
        Date firstGiftDate = accountQuery[0].FirstGiftDate__c;
        if (donationCount == 1.00) {
            System.debug(accountId);
            System.debug(donationAmount);
            //Make database query and set result to json
            String JSONString1 = JSON.serialize(Database.query('SELECT Id, Contact__c, Name, Billing_Street, Billing_City, Billing_State, Billing_Postal_Code, EnvelopeSalutation__c, FirstGiftDate__c FROM Account WHERE Id = :accountId limit 1'));
            String JSONString2 = JSON.serialize(Database.query('SELECT AccountId, Addressee__c, salutation FROM Contact WHERE AccountId = :accountId limit 1'));
            String response = jsonHandler(JSONString1, JSONString2, accountId, donationAmount, donoId);
        }
    }
}
Maybe I just don't understand how batches fully work, but I am not sure how else to make this work.
Any help on this would be greatly appreciated!
The issue is that you are looping in the trigger and trying to make multiple future calls. Instead, move the loop into the future method: collect the records in the trigger, then loop through them and make the callouts there. But be aware, there is a limit on how many callouts you can make. If the calls to your server don't have to happen in real time, you might want to consider writing a batch class that processes all the donations at night.
Are you using Non-Profit Starter Pack (NPSP)? There's no standard object called Donation; NPSP renames Opportunities to Donations... That trigger looks suspicious (if it's Opportunities, it should say StageName, not Stage) - are you sure it compiled? And the way it's written it'll fire every time somebody edits a closed won opportunity. You probably want to run it only once, when the stage changes, or when some "sent ok" field is not set yet (and on a successful callout you'd set that field?).
"a future method cannot called from a future or batch method" - this suggests there's something more at play. For example a batch job (apex class with implements database.batchable in it) that inserts multiple donations -> calls trigger -> this calls #future -> problem. You shouldn't get that error from "normal" operation, even if you'd use Data Loader for bulk load -> trigger...
Or - do you update some field on the donation on a successful call? At that point you're in @future, it will enter your trigger again, schedule another @future - boom, headshot. It's a bit crude, but SF protects you from recursion. Did you post the complete code? Do you think you have recursion in there?
The reason calls to external servers have to be asynchronous (@future etc.) is that in a trigger you have a lock on the database table. SF can't block write access to that table for up to 2 minutes depending on the whim of a 3rd party server. It has to be detached from the main execution.
You can make 50 calls to @future methods in a single execution, and each can make 100 HTTP callouts (unless they hit the 120s combined max timeout), so this should be plenty of room to do what you need.
Assuming the worst case scenario - that you really have a batch job that inserts donations and this has the side effect of making callouts - you won't be able to do it like that. You have a few options to detach it.
Option 1: In the trigger you can check System.isBatch() || System.isFuture() || System.isQueueable() and run the current code only if they're all false. If we're in "normal" mode, not already asynchronous, you should be able to call the @future method and send your donation. This would cover situations where a user creates one manually or you have a "normal" load with Data Loader, for example.
If one of these returns true - don't run the code. Instead have some other batch job that runs every hour (or every 5 minutes?) and "cleans up": it picks donations that are closed won/received but don't have some hidden "sent ok" checkbox set, and sends just those.
Option 2: Similar, but using Platform Events. It's a bit fancier. Your trigger would raise an event (a notification that can be read by other apex code and flows, but also by 3rd party systems), you'd write a trigger on the events and send your callouts from that trigger. It'd be properly detached, and it shouldn't matter whether it's called from the UI, Data Loader or an apex batch job. https://trailhead.salesforce.com/en/content/learn/modules/platform_events_basics/platform_events_subscribe might help

Google Cloud PubSub sends the message to more than one consumer (in the same subscription)

I have a Java Spring Boot 2 application (app1) that sends messages to a Google Cloud PubSub topic (it is the publisher).
Another Java Spring Boot 2 application (app2) is subscribed to a subscription to receive those messages. But in this case, I have more than one instance (k8s auto-scaling is enabled), so I have more than one pod for this app consuming messages from PubSub.
Some messages are consumed by one instance of app2, but many others are sent to more than one app2 instance, so the processing is duplicated for those messages.
Here is the code of the consumer (app2):
private final static int ACK_DEAD_LINE_IN_SECONDS = 30;
private static final long POLLING_PERIOD_MS = 250L;
private static final int WINDOW_MAX_SIZE = 1000;
private static final Duration WINDOW_MAX_TIME = Duration.ofSeconds(1L);

@Autowired
private PubSubAdmin pubSubAdmin;

@Bean
public ApplicationRunner runner(PubSubReactiveFactory reactiveFactory) {
    return args -> {
        createSubscription("subscription-id", "topic-id", ACK_DEAD_LINE_IN_SECONDS);
        reactiveFactory.poll(subscription, POLLING_PERIOD_MS) // Poll the PubSub periodically
                .map(msg -> Pair.of(msg, getMessageValue(msg))) // Extract the message as a pair
                .bufferTimeout(WINDOW_MAX_SIZE, WINDOW_MAX_TIME) // Create a buffer of messages to bulk process
                .flatMap(this::processBuffer) // Process the buffer
                .doOnError(e -> log.error("Error processing event window", e))
                .retry()
                .subscribe();
    };
}

private void createSubscription(String subscriptionName, String topicName, int ackDeadline) {
    pubSubAdmin.createTopic(topicName);
    try {
        pubSubAdmin.createSubscription(subscriptionName, topicName, ackDeadline);
    } catch (AlreadyExistsException e) {
        log.info("Pubsub subscription '{}' already configured for topic '{}': {}", subscriptionName, topicName, e.getMessage());
    }
}
private Flux<Void> processBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> msgsWindow) {
    return Flux.fromStream(
            msgsWindow.stream()
                    .collect(Collectors.groupingBy(msg -> msg.getRight().getData())) // Group the messages by same data
                    .values()
                    .stream()
    )
            .flatMap(this::processDataBuffer);
}

private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
    return processData(
            dataMsgsWindow.get(0).getRight().getData(),
            dataMsgsWindow.stream()
                    .map(Pair::getRight)
                    .map(PreparedRecordEvent::getRecord)
                    .collect(Collectors.toSet())
    )
            .doOnSuccess(it ->
                    dataMsgsWindow.forEach(msg -> {
                        log.info("Mark msg ACK");
                        msg.getLeft().ack();
                    })
            )
            .doOnError(e -> {
                log.error("Error on PreparedRecordEvent event", e);
                dataMsgsWindow.forEach(msg -> {
                    log.error("Mark msg NACK");
                    msg.getLeft().nack();
                });
            })
            .retry();
}
private Mono<Void> processData(Data data, Set<Record> records) {
    // For each message, make calculations over the records associated to the data
    final DataQuality calculated = calculatorService.calculateDataQualityFor(data, records); // Arithmetic calculations
    return this.daasClient.updateMetrics(calculated) // Update DB record with a DaaS to wrap DB access
            .flatMap(it -> {
                if (it.getProcessedRows() >= it.getValidRows()) {
                    return finish(data);
                }
                return Mono.just(data);
            })
            .then();
}

private Mono<Data> finish(Data data) {
    return dataClient.updateStatus(data.getId(), DataStatus.DONE) // Update DB record with a DaaS to wrap DB access
            .doOnSuccess(updatedData -> pubSubClient.publish(
                    new Qa0DonedataEvent(updatedData) // Publish a new event to another topic
            ))
            .doOnError(err -> {
                log.error("Error finishing data");
            })
            .onErrorReturn(data);
}
I need each message to be consumed by one and only one app2 instance. Does anybody know if this is possible? Any idea how to achieve this?
Maybe the right way is to create one subscription for each app2 instance and configure the topic to send each message to exactly one subscription instead of to every one. Is that possible?
According to the official documentation, once a message is sent to a subscriber, Pub/Sub tries not to deliver it to any other subscriber on the same subscription (the app2 instances are subscribers of the same subscription):
Once a message is sent to a subscriber, the subscriber should acknowledge the message. A message is considered outstanding once it has been sent out for delivery and before a subscriber acknowledges it. Pub/Sub will repeatedly attempt to deliver any message that has not been acknowledged. While a message is outstanding to a subscriber, however, Pub/Sub tries not to deliver it to any other subscriber on the same subscription. The subscriber has a configurable, limited amount of time -- known as the ackDeadline -- to acknowledge the outstanding message. Once the deadline passes, the message is no longer considered outstanding, and Pub/Sub will attempt to redeliver the message.
In general, Cloud Pub/Sub has at-least-once delivery semantics. That means it is possible for messages that have already been acked to be redelivered, and for multiple subscribers on the same subscription to receive the same message. These two cases should be relatively rare for a well-behaved subscriber, but without keeping track of the IDs of all messages delivered across all subscribers, it is not possible to guarantee that there won't be duplicates.
If it is happening with some frequency, it would be good to check if your messages are getting acknowledged within the ack deadline. You are buffering messages for 1s, which should be relatively small compared to your ack deadline of 30s, but it also depends on how long the messages ultimately take to process. For example, if the buffer is being processed in sequential order, it could be that the later messages in your 1000-message buffer aren't being processed in time. You could look at the subscription/expired_ack_deadlines_count metric in Cloud Monitoring to determine if it is indeed the case that your acks for messages are late. Note that late acks for even a small number of messages could result in more duplicates. See the "Message Redelivery & Duplication Rate" section of the Fine-tuning Pub/Sub performance with batch and flow control settings post.
Ok, after doing tests, reading the documentation and reviewing the code, I found a "small" error in it.
We had a stray "retry" on the "processDataBuffer" method, so when an error happened the messages in the buffer were marked as NACK and delivered to another instance, but because of the retry they were also executed again, this time correctly, so the messages were marked as ACK as well.
Because of this, some of them were processed twice.
private Mono<Void> processDataBuffer(List<Pair<AcknowledgeablePubsubMessage, PreparedRecordEvent>> dataMsgsWindow) {
    return processData(
            dataMsgsWindow.get(0).getRight().getData(),
            dataMsgsWindow.stream()
                    .map(Pair::getRight)
                    .map(PreparedRecordEvent::getRecord)
                    .collect(Collectors.toSet())
    )
            .doOnSuccess(it ->
                    dataMsgsWindow.forEach(msg -> {
                        log.info("Mark msg ACK");
                        msg.getLeft().ack();
                    })
            )
            .doOnError(e -> {
                log.error("Error on PreparedRecordEvent event", e);
                dataMsgsWindow.forEach(msg -> {
                    log.error("Mark msg NACK");
                    msg.getLeft().nack();
                });
            })
            .retry(); // this retry has been deleted
}
My question is resolved.
Even after correcting the mentioned bug, I still receive duplicated messages. It is accepted that Google Cloud Pub/Sub does not guarantee exactly-once delivery when you use buffers or windows. This is exactly my scenario, so I have to implement a mechanism to remove duplicates based on the message id.
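A minimal sketch of that de-duplication idea, assuming the Spring Cloud GCP AcknowledgeablePubsubMessage type from the code above and an in-memory Caffeine cache (an added, hypothetical dependency). Note a per-instance cache only filters duplicates redelivered to the same pod; duplicates landing on a different instance need a shared store, for example a database unique constraint on the message id:

import java.time.Duration;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
// AcknowledgeablePubsubMessage comes from Spring Cloud GCP (its package differs by version)

// Remembers recently seen Pub/Sub message ids; expiry should comfortably exceed the ack deadline.
private final Cache<String, Boolean> seenMessageIds = Caffeine.newBuilder()
        .expireAfterWrite(Duration.ofMinutes(10))
        .maximumSize(100_000)
        .build();

// Returns true if this message id was already processed by this instance.
private boolean isDuplicate(AcknowledgeablePubsubMessage msg) {
    String id = msg.getPubsubMessage().getMessageId();
    // putIfAbsent returns null only the first time an id is stored
    return seenMessageIds.asMap().putIfAbsent(id, Boolean.TRUE) != null;
}

In the reactive pipeline above this could slot in as a .filter(msg -> !isDuplicate(msg)) step right after poll(...).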

What is the maximum number of records JpaRepository saveAll can handle at a time?

In my Spring Boot application I am running a scheduled job which takes records from one table and saves them into another as an archive. I just ran into a problem: the records I have selected to be saved are now around 400,000 plus, and when the job executes it just keeps on going. Can JPA handle saving this much data at once? Below are my configuration and the method.
#--Additional Settings for Batch Processing --#
spring.jpa.properties.hibernate.jdbc.batch_size=5000
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.generate_statistics=true
spring.jpa.properties.hibernate.jdbc.batch_versioned_data = true
rewriteBatchedStatements=true
cachePrepStmts=true
The agentAuditTrailArchiveList object has a size of 400,000 plus.
private void archiveAgentAuditTrail(int archiveTimePeriod) {
    List<AgentAuditTrail> archivableAgentAuditTrails = agentAuditTrailRepository
            .fecthArchivableData(LocalDateTime.now().minusDays(archiveTimePeriod));
    List<AgentAuditTrailArchive> agentAuditTrailArchiveList = new ArrayList<>();
    archivableAgentAuditTrails
            .forEach(agentAuditTrail -> agentAuditTrailArchiveMapper(agentAuditTrailArchiveList, agentAuditTrail));
    System.out.println(" agentAuditTrailArchiveList = " + agentAuditTrailArchiveList.size());
    System.out.println("archivableAgentAuditTrails = " + archivableAgentAuditTrails.size());
    agentAuditTrailArchiveRepository.saveAll(agentAuditTrailArchiveList);
    agentAuditTrailRepository.deleteInBatch(archivableAgentAuditTrails);
}
Assuming you are actually doing this as a Spring Batch job, this is already handled. You just need to set the chunk size within the batch job step config so it will only process a set number of records at a time. See the batch step config example below.
You never want to try to do the entire job as one commit unless it is always going to be really small. Doing it all in one commit leaves you open to a long list of failure conditions and data integrity problems.
@Bean
public Step someStepInMyProcess(FlatFileItemReader<MyDataObject> reader,
        MyProcesssor processor, ItemWriter<MyDataObject> writer) {
    return stepBuilders.get("SomeStepName")
            .<MyDataObject, MyDataObject>chunk(50)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .transactionManager(transactionManager)
            .build();
}
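If the job stays a plain scheduled method rather than a Spring Batch step, the same chunking idea can be applied directly to the repository calls. A rough sketch, assuming the repositories above extend JpaRepository; the helper name is illustrative and the chunk size of 5000 simply mirrors the hibernate.jdbc.batch_size property in the question:

// Hypothetical replacement for the single saveAll()/deleteInBatch() pair: work through the
// lists in fixed-size slices so no single flush has to hold 400,000+ entities.
private void archiveInChunks(List<AgentAuditTrailArchive> archives,
        List<AgentAuditTrail> originals) {
    final int chunkSize = 5000; // mirrors hibernate.jdbc.batch_size
    for (int from = 0; from < archives.size(); from += chunkSize) {
        int to = Math.min(from + chunkSize, archives.size());
        agentAuditTrailArchiveRepository.saveAll(archives.subList(from, to));
        agentAuditTrailArchiveRepository.flush(); // push this batch of inserts to the database
    }
    for (int from = 0; from < originals.size(); from += chunkSize) {
        int to = Math.min(from + chunkSize, originals.size());
        agentAuditTrailRepository.deleteInBatch(originals.subList(from, to));
    }
}

Clearing the persistence context between chunks (for example via an injected EntityManager's clear()) also keeps memory usage flat during the run.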

Using/Searching AsyncDataProvider with Objectify / Google App Engine

I currently have an application which uses the activities/places and an AsyncDataProvider.
Right now, every time the activity loads up it uses the request factory to retrieve the data (currently not a lot, but it will get very large soon) and passes it to the View to update the DataGrid. Before it is updated it is filtered based on a search box.
Right now - I have implemented updating the DataGrid as follows: (this code isn't the prettiest)
private void updateData() {
    final AsyncDataProvider<EquipmentTypeProxy> provider = new AsyncDataProvider<EquipmentTypeProxy>() {
        @Override
        protected void onRangeChanged(HasData<EquipmentTypeProxy> display) {
            int start = display.getVisibleRange().getStart();
            int end = start + display.getVisibleRange().getLength();
            final List<EquipmentTypeProxy> subList = getSubList(start, end);
            end = (end >= subList.size()) ? subList.size() : end;
            if (subList.size() < DATAGRID_PAGE_SIZE) {
                updateRowCount(subList.size(), true);
            } else {
                updateRowCount(data.size(), true);
            }
            updateRowData(start, subList);
        }

        private List<EquipmentTypeProxy> getSubList(int start, int end) {
            final List<EquipmentTypeProxy> filteredEquipment;
            if (searchString == null || searchString.equals("")) {
                if (data.isEmpty() == false && data.size() > (end - start)) {
                    filteredEquipment = data.subList(start, end);
                } else {
                    filteredEquipment = data;
                }
            } else {
                filteredEquipment = new ArrayList<EquipmentTypeProxy>();
                for (final EquipmentTypeProxy equipmentType : data) {
                    if (equipmentType.getName().contains(searchString)) {
                        filteredEquipment.add(equipmentType);
                    }
                }
            }
            return filteredEquipment;
        }
    };
    provider.addDataDisplay(dataGrid);
}
Ultimately - what I would like to do is only load up the necessary data at first (the default page size in this application is 25).
Unfortunately, to my current understanding, with Google App Engine there is no order to any of the IDs (one entry has an ID of 3, the next has an ID of 4203).
What I'm wondering is: what is the best way to go about retrieving a subset of data from Google App Engine when using Objectify?
I was looking into using Offset and limit but another stack overflow post (http://stackoverflow.com/questions/9726232/achieve-good-paging-using-objectify) basically said this is inefficient.
The best information I've found is the following link (http://stackoverflow.com/questions/7027202/objectify-paging-with-cursors). The answer here says to use Cursors but also says this is inefficient. I'm also using Request Factory so I will have to store the Cursor in my user Session (if that is incorrect please let me know).
Currently, since there isn't likely to be a lot of data (maybe 200 rows total for the next few months), I am just pulling the entire set back to the client as a temporary hack - I know this is the worst way to do it, but I would like input on the best way before wasting my time implementing another hack solution. I am worried because every post I've read on this makes it seem like there isn't really a solid way to do it.
What I am also thinking about: currently my searching / page loading is lightning fast because all the data is already on the client side. I use a KeyUpEvent handler in the search box to filter the data - I don't think there is any way to keep this speed while making a call to the server - is there any accepted solution to this problem?
Thank you very much
Go with Cursors. They are as efficient as it gets - a cursor stores the point where the last query ended and continues from there. The answer you linked actually does not discuss the efficiency of cursors vs offset (there is a comment there that is wrong).
You can use limit with Cursors - it does not affect efficiency.
Also, Cursors can be serialized via cursor.toWebSafeString() and sent to client via RPC. This way you do not need to save them in session. Actually you can also use them as fragment identifier (aka history token in GWT parlance) - this way a certain "page" of your result set can be bookmarked.
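A minimal sketch of that server-side paging, assuming the entity behind EquipmentTypeProxy is an Objectify entity named EquipmentType; the page class, method name and page size are illustrative:

import static com.googlecode.objectify.ObjectifyService.ofy;

import java.util.ArrayList;
import java.util.List;
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;

// Returns one page of results plus the web-safe cursor for the next page.
// The client (or RequestFactory service) passes the cursor string back on the next call.
public class EquipmentPage {
    public List<EquipmentType> items = new ArrayList<>();
    public String nextCursor; // null when there are no more results

    public static EquipmentPage load(String webSafeCursor, int pageSize) {
        Query<EquipmentType> query = ofy().load().type(EquipmentType.class).limit(pageSize);
        if (webSafeCursor != null) {
            query = query.startAt(Cursor.fromWebSafeString(webSafeCursor));
        }
        EquipmentPage page = new EquipmentPage();
        QueryResultIterator<EquipmentType> it = query.iterator();
        while (it.hasNext()) {
            page.items.add(it.next());
        }
        if (page.items.size() == pageSize) {
            page.nextCursor = it.getCursor().toWebSafeString(); // resume point for the next page
        }
        return page;
    }
}

The nextCursor string can then travel over RequestFactory and double as the history token / bookmarkable fragment mentioned above.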
(Offset is "inefficient" because it actually loads, and charges you, for all entities upto offset+limit, bit it only returns limit entities)
OTOH, if you already know the query parameters when the page is loaded, then just do the query at page generation time, instead of invoking it via RPC. Also, if you have a small set of data (<1000) you could just preload all entity IDs as part of the page HTML.
