Using/Searching AsyncDataProvider with Objectify / Google App Engine

I currently have an application which uses the activities/places and an AsyncDataProvider.
Right now, every time the activity loads up, it uses the request factory to retrieve the data (currently not a lot, but it will get very large soon) and passes it to the View to update the DataGrid. Before the DataGrid is updated, the data is filtered based on a search box.
Right now I have implemented updating the DataGrid as follows (this code isn't the prettiest):
private void updateData() {
    final AsyncDataProvider<EquipmentTypeProxy> provider = new AsyncDataProvider<EquipmentTypeProxy>() {
        @Override
        protected void onRangeChanged(HasData<EquipmentTypeProxy> display) {
            int start = display.getVisibleRange().getStart();
            int end = start + display.getVisibleRange().getLength();
            final List<EquipmentTypeProxy> subList = getSubList(start, end);
            end = (end >= subList.size()) ? subList.size() : end;
            if (subList.size() < DATAGRID_PAGE_SIZE) {
                updateRowCount(subList.size(), true);
            } else {
                updateRowCount(data.size(), true);
            }
            updateRowData(start, subList);
        }

        private List<EquipmentTypeProxy> getSubList(int start, int end) {
            final List<EquipmentTypeProxy> filteredEquipment;
            if (searchString == null || searchString.equals("")) {
                if (data.isEmpty() == false && data.size() > (end - start)) {
                    filteredEquipment = data.subList(start, end);
                } else {
                    filteredEquipment = data;
                }
            } else {
                filteredEquipment = new ArrayList<EquipmentTypeProxy>();
                for (final EquipmentTypeProxy equipmentType : data) {
                    if (equipmentType.getName().contains(searchString)) {
                        filteredEquipment.add(equipmentType);
                    }
                }
            }
            return filteredEquipment;
        }
    };
    provider.addDataDisplay(dataGrid);
}
Ultimately, what I would like to do is load only the necessary data at first (the default page size in this application is 25).
Unfortunately, as I currently understand it, there is no order to any of the IDs in Google App Engine (one entry has an ID of 3, the next an ID of 4203).
What I'm wondering is: what is the best way to retrieve a subset of data from Google App Engine when using Objectify?
I was looking into using offset and limit, but another Stack Overflow post (http://stackoverflow.com/questions/9726232/achieve-good-paging-using-objectify) basically said this is inefficient.
The best information I've found is the following link (http://stackoverflow.com/questions/7027202/objectify-paging-with-cursors). The answer there says to use cursors but also says this is inefficient. I'm also using RequestFactory, so I will have to store the cursor in my user session (if that is incorrect, please let me know).
Currently, since there isn't likely to be a lot of data (maybe 200 rows total for the next few months), I am just pulling the entire set back to the client as a temporary hack. I know this is the worst way to do it, but I would like input on the best approach before wasting time implementing another hack. I am worried because every post I've read on this makes it seem like there isn't really a solid way to do it.
One more consideration: currently my searching/page loading is lightning fast because all the data is already on the client side. I use a KeyUpEvent handler in the search box to filter the data. I don't think there is any way to keep this speed if every keystroke results in a call to the server; is there any accepted solution to this problem?
Thank you very much

Go with cursors. They are as efficient as it gets: a cursor stores the point where the last query ended and continues from there. The answer you linked actually does not discuss the efficiency of cursors vs. offset (there is a comment there that is wrong).
You can use limit with cursors; it does not affect efficiency.
Also, cursors can be serialized via cursor.toWebSafeString() and sent to the client via RPC. This way you do not need to save them in the session. You can also use them as a fragment identifier (a.k.a. history token in GWT parlance); this way a certain "page" of your result set can be bookmarked.
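For illustration, a minimal server-side sketch of cursor paging with Objectify might look like this (EquipmentType, EquipmentPage, and fetchPage are hypothetical names, not from your code):
import static com.googlecode.objectify.ObjectifyService.ofy;

import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;

import java.util.ArrayList;
import java.util.List;

// Hypothetical service method: returns one page of entities plus the cursor for the next page.
public EquipmentPage fetchPage(String webSafeCursor, int pageSize) {
    Query<EquipmentType> query = ofy().load().type(EquipmentType.class).limit(pageSize);
    if (webSafeCursor != null) {
        // Resume exactly where the previous page ended.
        query = query.startAt(Cursor.fromWebSafeString(webSafeCursor));
    }
    QueryResultIterator<EquipmentType> iterator = query.iterator();
    List<EquipmentType> page = new ArrayList<EquipmentType>();
    while (iterator.hasNext()) {
        page.add(iterator.next());
    }
    // The web-safe string can be returned to the client and passed back for the next page
    // (or used as a GWT history token so a particular page can be bookmarked).
    String nextCursor = iterator.getCursor().toWebSafeString();
    return new EquipmentPage(page, nextCursor);
}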
(Offset is "inefficient" because it actually loads, and charges you, for all entities upto offset+limit, bit it only returns limit entities)
OTOH, if you already know the query parameters when the page is loaded, then just do the query at page generation time, instead invoking it via RPC. Also, if you have a small set of data (<1000) you could just preload all entity IDs s part of page html.

Related

OptaPlanner: timetable resuming generates a wrong solution, even if the generation starts from where it stopped

I am using Java Spring Boot and OptaPlanner to generate a timetable with almost 20 constraints. On the initial generation, everything works fine: the score shown in the OptaPlanner log messages matches the solution received. But when I want to resume the generation, the solution contains a lot of problems (for example, constraints are no longer respected), even though the generation starts from where it stopped and continues initializing or finding a best solution.
My project is divided into two microservices: one that communicates with the UI and keeps the database, and another that receives data from the first when a request to start/resume the generation is made and generates the schedule using OptaPlanner. I use the same request for starting and resuming the generation.
This is how my project works: the UI makes the requests for starting, resuming, and stopping the generation and for getting the timetable. These requests are handled by the first microservice, which uses WebClient to send new requests to the second microservice. There, the timetable is generated after asking the first microservice for some data from the database.
Here is the method for starting/resuming the generation from the second microservice:
#PostMapping("startSolver")
public ResponseEntity<?> startSolver(#PathVariable String organizationId) {
try {
SolverConfig solverConfig = SolverConfig.createFromXmlResource("solver/timeTableSolverConfig.xml");
SolverFactory<TimeTable> solverFactory = new DefaultSolverFactory<>(solverConfig);
this.solverManager = SolverManager.create(solverFactory);
this.solverManager.solveAndListen(TimeTableService.SINGLETON_TIME_TABLE_ID,
id -> timeTableService.findById(id, UUID.fromString(organizationId)),
timeTable -> timeTableService.updateModifiedLessons(timeTable, organizationId));
return new ResponseEntity<>("Solving has successfully started", HttpStatus.OK);
} catch(OptaPlannerException exception) {
System.out.println("OptaPlanner exception - " + exception.getMessage());
return utils.generateResponse(exception.getMessage(), HttpStatus.CONFLICT);
}
}
-> findById(...) makes a request to the first microservice, expecting to receive all the data needed by the constraints for generation (lists of planning entities, planning variables, and other useful data):
public TimeTable findById(Long id, UUID organizationId) {
    SolverDataDTO solverDataDTO = webClient.get()
            .uri("http://localhost:8080/smart-planner/org/{organizationId}/optaplanner-solver/getSolverData",
                    organizationId)
            .retrieve()
            .onStatus(HttpStatus::isError, error -> {
                LOGGER.error(extractExceptionMessage("findById.fetchFails", "findById()"));
                return Mono.error(new OptaPlannerException(
                        extractExceptionMessage("findById.fetchFails", "")));
            })
            .bodyToMono(SolverDataDTO.class)
            .block();
    TimeTable timeTable = new TimeTable();
    // ... populate all lists in TimeTable with the ones received in solverDataDTO ...
    return timeTable;
}
-> updateModifiedLessons(...) sends to the first microservice the list of all generated planning entities with the corresponding planning variables assigned:
public void updateModifiedLessons(TimeTable timeTable, String organizationId) {
    List<ScheduleSlot> slots = new ArrayList<>(timeTable.getScheduleSlotList());
    List<SolverScheduleSlotDTO> solverScheduleSlotDTOs =
            scheduleSlotConverter.convertModelsToSolverDTOs(slots);
    String executionMessage = webClient.post()
            .uri("http://localhost:8080/smart-planner/org/{organizationId}/optaplanner-solver/saveTimeTable",
                    organizationId)
            .header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
            .body(Mono.just(solverScheduleSlotDTOs), SolverScheduleSlotDTO.class)
            .retrieve()
            .onStatus(HttpStatus::isError, error -> {
                LOGGER.error(extractExceptionMessage("saveSlots.savingFails", "updateModifiedLessons()"));
                return Mono.error(new OptaPlannerException(
                        extractExceptionMessage("saveSlots.savingFails", "")));
            })
            .bodyToMono(String.class)
            .block();
}
I would probably start by making sure that the solution you save to the DB after the first run of startSolver() is the same (in terms of Java equality), including the assignments of planning variables to values, as the solution you retrieve via findById() at the beginning of the second run.
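As a quick sanity check, something like the following sketch could diff the planning variable assignments before and after the save/load round trip (the getTimeslot()/getRoom()/getId() accessors on ScheduleSlot are assumptions about your model, not taken from your code):
// Illustrative debugging aid, not part of the original code.
private void compareSolutions(TimeTable saved, TimeTable reloaded) {
    Map<Long, ScheduleSlot> savedById = saved.getScheduleSlotList().stream()
            .collect(Collectors.toMap(ScheduleSlot::getId, s -> s));
    for (ScheduleSlot after : reloaded.getScheduleSlotList()) {
        ScheduleSlot before = savedById.get(after.getId());
        if (before == null
                || !Objects.equals(before.getTimeslot(), after.getTimeslot())
                || !Objects.equals(before.getRoom(), after.getRoom())) {
            LOGGER.error("Slot {} does not match the previously saved assignment", after.getId());
        }
    }
}
If the reloaded solution differs from what was saved, the problem is in the save or fetch path rather than in OptaPlanner itself.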

PlaybackNearlyFinished and PlaybackFinished occur almost at the same time?

It turns out that sometimes (not always, but very frequently) PlaybackNearlyFinished and PlaybackFinished occur almost at the same time. What is also confusing is that both events convey exactly the same offset, representing the very end of the stream.
When this happens, the next stream scheduled in PlaybackNearlyFinished does not play; the playback just finishes.
Unless this is a bug in Alexa or its infrastructure, I cannot figure out how to implement playback for a playlist; there seems to be no way to reliably schedule the upcoming track.
Is there anything I could do in my code to make it work well?
I am using an Echo Dot (2nd gen.) physically located in Europe, with the Java SDK, AWS Lambda, and DynamoDB.
It looks like it is figured out now: in order to queue the next stream properly, one needs to use truly unique stream tokens. This means that even the same file/URL should be enqueued under a unique token.
In my example above, I used the index of the track in the playlist as the token. Once I changed it to the approach below, everything started to work like a charm:
import org.apache.commons.lang3.RandomStringUtils;

public class TokenService {

    public String createToken(int playbackPosition) {
        String suffix = RandomStringUtils.randomAlphanumeric(16);
        return String.valueOf(playbackPosition) + ":" + suffix;
    }

    public int tokenToPlaybackIndex(String token) {
        String positionStr = token.split(":")[0];
        return Integer.valueOf(positionStr);
    }
}
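Usage is then straightforward; even when the same playlist index is enqueued twice, the token differs (illustrative snippet):
TokenService tokenService = new TokenService();
String token = tokenService.createToken(3);                   // e.g. "3:aZ81kQ0pX2mN7rTe"
int playbackIndex = tokenService.tokenToPlaybackIndex(token); // back to 3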
Hope it helps somebody!

Is there a limit on the number of entities you can query from the GAE datastore?

My GCM Endpoint is derived from the code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main. Each Android client device registers with the endpoint. A message can be sent to the first 10 registered devices using this code:
#Api(name = "messaging", version = "v1", namespace = #ApiNamespace(ownerDomain = "${endpointOwnerDomain}", ownerName = "${endpointOwnerDomain}", packagePath="${endpointPackagePath}"))
public class MessagingEndpoint {
private static final Logger log = Logger.getLogger(MessagingEndpoint.class.getName());
/** Api Keys can be obtained from the google cloud console */
private static final String API_KEY = System.getProperty("gcm.api.key");
/**
* Send to the first 10 devices (You can modify this to send to any number of devices or a specific device)
*
* #param message The message to send
*/
public void sendMessage(#Named("message") String message) throws IOException {
if(message == null || message.trim().length() == 0) {
log.warning("Not sending message because it is empty");
return;
}
// crop longer messages
if (message.length() > 1000) {
message = message.substring(0, 1000) + "[...]";
}
Sender sender = new Sender(API_KEY);
Message msg = new Message.Builder().addData("message", message).build();
List<RegistrationRecord> records = ofy().load().type(RegistrationRecord.class).limit(10).list();
for(RegistrationRecord record : records) {
Result result = sender.send(msg, record.getRegId(), 5);
if (result.getMessageId() != null) {
log.info("Message sent to " + record.getRegId());
String canonicalRegId = result.getCanonicalRegistrationId();
if (canonicalRegId != null) {
// if the regId changed, we have to update the datastore
log.info("Registration Id changed for " + record.getRegId() + " updating to " + canonicalRegId);
record.setRegId(canonicalRegId);
ofy().save().entity(record).now();
}
} else {
String error = result.getErrorCodeName();
if (error.equals(Constants.ERROR_NOT_REGISTERED)) {
log.warning("Registration Id " + record.getRegId() + " no longer registered with GCM, removing from datastore");
// if the device is no longer registered with Gcm, remove it from the datastore
ofy().delete().entity(record).now();
}
else {
log.warning("Error when sending message : " + error);
}
}
}
}
}
The above code sends to the first 10 registered devices. I would like to send to all registered clients. According to http://objectify-appengine.googlecode.com/svn/branches/allow-parent-filtering/javadoc/com/googlecode/objectify/cmd/Query.html#limit(int) setting limit(0) accomplishes this. But I'm not convinced there will not be a problem for very large numbers of registered clients due to memory constraints or the time it takes to execute the query. https://code.google.com/p/objectify-appengine/source/browse/Queries.wiki?repo=wiki states "Cursors let you take a "checkpoint" in a query result set, store the checkpoint elsewhere, and then resume from where you left off later. This is often used in combination with the Task Queue API to iterate through large datasets that cannot be processed in the 60s limit of a single request".
Note the comment about the 60s limit of a single request.
So my question: if I modified the sample code at /github.com/GoogleCloudPlatform/gradle-appengine-templates/tree/master/GcmEndpoints/root/src/main to request all objects from the datastore by replacing limit(10) with limit(0), will this ever fail for a large number of objects? And if it will fail, at roughly what number of objects?
This is a poor pattern, even with cursors. At the very least, you'll hit the hard 60s limit for a single request. And since you're doing updates on the RegistrationRecord, you need a transaction, which will slow down the process even more.
This is exactly what the task queue is for. The best way is to do it in two tasks:
Your api endpoint enqueues "send message to everyone" and returns immediately.
That first task is the "mapper" which iterates the RegistrationRecords with a keys-only query. For each key, enqueue a "reducer" task for "send X message to this record".
The reducer task sends the message and (in a transaction) performs your record update.
Using Deferred this actually isn't much code at all.
The first task frees your client immediately and gives you 10 minutes to iterate RegistrationRecord keys rather than the 60s limit of a normal request. If you get your chunking right and batch the queue submissions, you should be able to generate thousands of reducer tasks per second.
This will effortlessly scale to hundreds of thousands of users, and might get you into millions. If you need to scale higher, you can apply a map/reduce approach to parallelize the mapping. Then it's just a question of how many instances you want to throw at the problem.
I have used this approach to great effect in the past sending out millions of apple push notifications at a time. The task queue is your friend, use it heavily.
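As a rough sketch of that split using DeferredTask (class names and the reducer body are illustrative; chunking, batching of queue submissions, and error handling are omitted):
import static com.googlecode.objectify.ObjectifyService.ofy;

import com.google.appengine.api.taskqueue.DeferredTask;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;
import com.googlecode.objectify.Key;

// "Mapper" task: iterate keys only (cheap) and fan out one reducer task per record.
class BroadcastMessageTask implements DeferredTask {
    private final String message;

    BroadcastMessageTask(String message) {
        this.message = message;
    }

    @Override
    public void run() {
        for (Key<RegistrationRecord> key : ofy().load().type(RegistrationRecord.class).keys()) {
            QueueFactory.getDefaultQueue().add(
                    TaskOptions.Builder.withPayload(new SendToDeviceTask(key, message)));
        }
    }
}

// "Reducer" task: send to a single device and update/delete its record, ideally in a transaction.
class SendToDeviceTask implements DeferredTask {
    private final Key<RegistrationRecord> recordKey;
    private final String message;

    SendToDeviceTask(Key<RegistrationRecord> recordKey, String message) {
        this.recordKey = recordKey;
        this.message = message;
    }

    @Override
    public void run() {
        // Load the record by key, call Sender.send(...) as in the endpoint above,
        // and save or delete the record inside ofy().transact(...).
    }
}
The sendMessage endpoint then only needs to enqueue a single BroadcastMessageTask and return immediately.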
Your query will time out if you try to retrieve too many entities. You will need to use cursors in your loop.
No one can say how many entities can be retrieved before this timeout - it depends on the size of your entities, complexity of your query, and, most importantly, what else happens in your loop. For example, in your case you can dramatically speed up your loop (and thus retrieve many more entities before a timeout) by creating tasks instead of building and sending messages within the loop itself.
Note that by default a query returns entities in chunks of 20; you will need to increase the chunk size if you have a large number of entities.
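With Objectify, for example, the chunk size can be set on the query itself (a sketch; 500 is an arbitrary value):
for (RegistrationRecord record : ofy().load().type(RegistrationRecord.class).chunk(500)) {
    // enqueue a task for this record rather than sending the message inline
}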

Issue with getting database via Sitecore API

We noticed a slight oddity in our Sitecore API code (included below for reference). The code tries to get a database by doing new Database(database), but it was failing randomly.
This code worked for a while with Database db = new Database(database); but started failing randomly yesterday. When we changed the code to Database db = Database.GetDatabase(database);, it started working again. What is the difference between the two approaches, and which is recommended by Sitecore?
I've seen this happen a number of times now - multiple times in production and a couple of times in my development environment.
public static void DeleteItem(string id, string database)
{
    // get the database
    Database db = new Database(database);
    // get the item
    Item item = db.GetItem(new ID(id));
    if (item != null)
    {
        using (new Sitecore.SecurityModel.SecurityDisabler())
        {
            // delete the item
            item.Delete();
        }
    }
}
A common way you will see people get a specific database is:
Sitecore.Data.Database master = Sitecore.Configuration.Factory.GetDatabase("master");
This is equivalent to Sitecore.Data.Database.GetDatabase("master").
When you call either of these methods it will first check the cache for the database. If not found it will build up the database with all of the configuration values within the config file via reflection. Once the database is created it will be placed in the cache for future use.
When you use the constructor on the database, it simply creates a rather empty database object. I am rather surprised to hear it was working at all when you used this method.
The proper approach to get a specific database would be to use:
Sitecore.Configuration.Factory.GetDatabase("master");
// or
Sitecore.Data.Database.GetDatabase("master");
If you are looking to get the database used with the current request (aka context database) you can use Sitecore.Context.Database. You can also use Sitecore.Context.ContentDatabase.

Provide a database packaged with the .APK file or host it separately on a website?

Here is some background about my app:
I am developing an Android app that will display a random quote or verse to the user. For this I am using an SQLite database. The size of the DB would be approximately 5K to 10K records, possibly increasing to up to 1M in later versions as new quotes and verses are added. Thus the user would need to update the DB as and when newer versions of the app or DB are released.
After reading through some forums online, there seem to be two feasible ways I could provide the DB:
1. Bundle it along with the .APK file of the app, or
2. Upload it to my app's website from where users will have to download it
I want to know which method would be better (if there is yet another approach other than these, please do let me know).
After pondering this problem for some time, I have these thoughts regarding the above approaches:
Approach 1:
Users will obtain the DB along with the app, and won't have to download it separately. Installation would thereby be easier. But users will have to reinstall the app every time there is a new version of the DB. Also, if the DB is large, it will make the installer too cumbersome.
Approach 2:
Users will have to download the full DB from the website (although I can provide a small, sample version of the DB via Approach 1). But, the installer will be simpler and smaller in size. Also, I would be able to provide future versions of the DB easily for those who might not want newer versions of the app.
Could you please tell me from a technical and an administrative standpoint which approach would be the better one and why?
If there is a third or fourth approach better than either of these, please let me know.
Thank you!
Andruid
I built a similar app for Android which gets periodic updates with data from a government agency. It's fairly easy to build an Android-compatible db off the device using Perl or similar and download it to the phone from a website; this works rather well, and the user gets current data whenever they download the app. It's also supposed to be possible to put the data on the SD card if you want to avoid using primary data storage space, which is a bigger concern for my app, which has a ~6 MB database.
In order to make Android happy with the DB, I believe you have to do the following (I build my DB using Perl):
$st = $db->prepare( "CREATE TABLE \"android_metadata\" (\"locale\" TEXT DEFAULT 'en_US')");
$st->execute();
$st = $db->prepare( "INSERT INTO \"android_metadata\" VALUES ('en_US')");
$st->execute();
I have an update activity which checks whether updates are available and, if so, presents an "update now" screen. The download process looks like this and lives in a DatabaseHelperClass.
public void downloadUpdate(final Handler handler, final UpdateActivity updateActivity) {
    URL url;
    try {
        close();
        File f = new File(getDatabasePath());
        if (f.exists()) {
            f.delete();
        }
        getReadableDatabase();
        close();
        url = new URL("http://yourserver.com/" + currentDbVersion + ".sqlite");
        URLConnection urlconn = url.openConnection();
        final int contentLength = urlconn.getContentLength();
        Log.i(TAG, String.format("Download size %d", contentLength));
        handler.post(new Runnable() {
            public void run() {
                updateActivity.setProgressMax(contentLength);
            }
        });
        InputStream is = urlconn.getInputStream();
        // Open the empty db as the output stream
        OutputStream os = new FileOutputStream(f);
        // transfer bytes from the input file to the output file
        byte[] buffer = new byte[1024 * 1000];
        int written = 0;
        int length = 0;
        while (written < contentLength) {
            length = is.read(buffer);
            if (length == -1) {
                // stream ended early; stop rather than loop forever
                break;
            }
            os.write(buffer, 0, length);
            written += length;
            final int currentprogress = written;
            handler.post(new Runnable() {
                public void run() {
                    Log.i(TAG, String.format("progress %d", currentprogress));
                    updateActivity.setCurrentProgress(currentprogress);
                }
            });
        }
        // Close the streams
        os.flush();
        os.close();
        is.close();
        Log.i(TAG, "Download complete");
        openDatabase();
    } catch (Exception e) {
        Log.e(TAG, "bad things", e);
    }
    handler.post(new Runnable() {
        public void run() {
            updateActivity.refreshState(true);
        }
    });
}
Also note that I keep a version number in the filename of the db files, and a pointer to the current one in a text file on the server.
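For example, the update activity's check against that pointer file might look roughly like this (the URL and plain-text file format are assumptions, not taken from the code above):
// Returns true if the server advertises a newer DB version than the one on the device.
private boolean updateAvailable(int localDbVersion) {
    try {
        URL url = new URL("http://yourserver.com/current-db-version.txt");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        int remoteVersion = Integer.parseInt(reader.readLine().trim());
        reader.close();
        conn.disconnect();
        return remoteVersion > localDbVersion;
    } catch (Exception e) {
        Log.e(TAG, "version check failed", e);
        return false; // fall back to the bundled or previously downloaded DB
    }
}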
It sounds like your app and your db are tightly bound -- that is, the app is useless without the database and the database is useless without the app, so I'd say go ahead and put them both in the same .apk.
That being said, if you expect the db to change very slowly over time, but the app to change quicker, and you don't want your users to have to download the db with each new app revision, then you might want to unbundle them. To make this work, you can do one of two things:
Install them as separate applications, but make sure they share the same userID using the sharedUserId tag in the AndroidManifest.xml file.
Install them as separate applications, and create a ContentProvider for the database. This way other apps could make use of your database as well (if that is useful).
If you are going to store the db on your website, then I would recommend that you just make RPC calls to your web server and get the data that way, so the device never has to deal with a local database. Using a cache manager to avoid repeated lookups will help as well, so pages will not have to look up data every time they reload. Also, if you need to update the data, you do not have to ship a new app every time. Using HttpClient is pretty straightforward; if you need any examples, please let me know.
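As an illustration of that approach (the answer mentions HttpClient; this sketch uses plain HttpURLConnection, and the endpoint URL is invented):
// Fetch a quote from the web service instead of reading it from a local database.
private String fetchRandomQuote() throws IOException {
    URL url = new URL("http://yourserver.com/api/quotes/random");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
    StringBuilder body = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
        body.append(line);
    }
    in.close();
    conn.disconnect();
    // Cache the result so a page reload does not trigger another network call.
    return body.toString();
}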
