AppEngine full text search cursors broken in dev/unit test environment

I've noticed an inconsistency in the behavior of App Engine text search cursors in the devserver or unit test environment vs. production environments. The dev and unit test environments appear to exhibit a bug for cursors used in combination with sort expressions. Consider the following unit test code:
@Test
public void testQueryCursor() throws Exception
{
    testQueryCursor("id_%02d"); // works
    testQueryCursor("id_%d"); // fails
}

private void testQueryCursor(final String idFmt) throws Exception
{
    final int TEST_COUNT = 12;
    final Index index =
        SearchServiceFactory.getSearchService().getIndex(IndexSpec.newBuilder().setName("MY_TEST_IDX").build());
    final List<String> docIds = new ArrayList<String>(TEST_COUNT);
    try {
        // populate some test data into an index
        for (int i = 0; i < TEST_COUNT; i++) {
            final String docId = String.format(idFmt, i);
            final Document.Builder builder = Document.newBuilder().setId(docId);
            builder.addField(Field.newBuilder().setName("some_field").setText("str1 " + docId)); // include varied docId in field for sorting
            index.put(builder.build());
            docIds.add(docId);
        }
        // for comparison to sorted search results
        Collections.sort(docIds);
        // define query options
        final QueryOptions.Builder optionsBuilder =
            QueryOptions
                .newBuilder()
                .setReturningIdsOnly(true)
                .setLimit(10)
                .setSortOptions(
                    SortOptions
                        .newBuilder()
                        .setLimit(20)
                        .addSortExpression(
                            SortExpression.newBuilder().setExpression("some_field")
                                .setDirection(SortDirection.ASCENDING).setDefaultValue("")));
        // see https://developers.google.com/appengine/docs/java/search/results#Java_Using_cursors
        // create an initial per-query cursor
        Cursor cursor = Cursor.newBuilder().build();
        final Iterator<String> idIter = docIds.iterator();
        int batchIdx = 0;
        do {
            // build options and query
            final QueryOptions options = optionsBuilder.setCursor(cursor).build();
            final Query query = Query.newBuilder().setOptions(options).build("some_field : str1");
            // search at least once
            final Results<ScoredDocument> results = index.search(query);
            int batchCount = 0;
            for (final ScoredDocument match : results) {
                batchCount++;
                assertTrue(idIter.hasNext());
                assertEquals(idIter.next(), match.getId());
                System.out.println("Document " + match.getId() + " matched.");
            }
            System.out.println("Read " + batchCount + " results from batch " + ++batchIdx);
            cursor = results.getCursor();
        } while (cursor != null);
    } finally {
        index.delete(docIds);
    }
}
If the assertEquals(idIter.next(), match.getId()); line is commented out, the full output of the previously failing call, testQueryCursor("id_%d"), can be observed, and the results do not follow the expected sort order. What's more, the last search performed with the cursor repeats the last two elements retrieved by the previous search call. Since these two elements should be the last two returned by the search, this behavior may simply be an artifact of the flaw that causes the improper sort.
This code can be run as a unit test, as shown here, or from a JSP on the devserver, and the behavior is the same in both. When run as a JSP on a production instance of App Engine, the search returns correctly ordered results in all cases. It would be nice if the devserver and the unit test tools were fixed to behave consistently with production.
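For reference, the two id formats differ in how they sort as plain strings, and that string order is what Collections.sort(docIds) in the test expects of the search results. The following standalone sketch prints the expected order for each format. It does not touch the Search API, and the class name DocIdOrder and the helper sortedIds are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DocIdOrder {

    // Builds ids for 0..count-1 with the given format and sorts them as plain
    // strings, mirroring the Collections.sort(docIds) call in the test above.
    static List<String> sortedIds(String idFmt, int count) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ids.add(String.format(idFmt, i));
        }
        Collections.sort(ids);
        return ids;
    }

    public static void main(String[] args) {
        // Zero-padded ids: string order and numeric order coincide.
        System.out.println(sortedIds("id_%02d", 12));
        // Unpadded ids: id_10 and id_11 sort between id_1 and id_2.
        System.out.println(sortedIds("id_%d", 12));
    }
}
```

With id_%02d the lexicographic and numeric orders are the same, whereas with id_%d the expected order starts id_0, id_1, id_10, id_11, id_2. This only shows what the test expects. It does not by itself explain what the devserver does differently.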

Related

Pagination in Google cloud endpoints + Datastore + Objectify

I want to return a List of "Posts" from an endpoint with optional pagination.
I need 100 results per query.
The code I have written is as follows; it doesn't seem to work.
I am referring to an example on the Objectify wiki.
Another option I know of is using query.offset(100);
But I read somewhere that this just loads the entire table and then ignores the first 100 entries, which is not optimal.
I guess this must be a common use case, so an optimal solution should be available.
public CollectionResponse<Post> getPosts(@Nullable @Named("cursor") String cursor, User auth) throws OAuthRequestException {
    if (auth != null) {
        Query<Post> query = ofy().load().type(Post.class).filter("isReviewed", true).order("-timeStamp").limit(100);
        if (cursor != null) {
            query.startAt(Cursor.fromWebSafeString(cursor));
            log.info("Cursor received :" + Cursor.fromWebSafeString(cursor));
        } else {
            log.info("Cursor received : null");
        }
        QueryResultIterator<Post> iterator = query.iterator();
        for (int i = 1; i <= 100; i++) {
            if (iterator.hasNext()) iterator.next();
            else break;
        }
        log.info("Cursor generated :" + iterator.getCursor());
        return CollectionResponse.<Post>builder().setItems(query.list()).setNextPageToken(iterator.getCursor().toWebSafeString()).build();
    } else throw new OAuthRequestException("Login please.");
}
This is the code using offsets, which seems to work fine.
@ApiMethod(
    name = "getPosts",
    httpMethod = ApiMethod.HttpMethod.GET
)
public CollectionResponse<Post> getPosts(@Nullable @Named("offset") Integer offset, User auth) throws OAuthRequestException {
    if (auth != null) {
        if (offset == null) offset = 0;
        Query<Post> query = ofy().load().type(Post.class).filter("isReviewed", true).order("-timeStamp").offset(offset).limit(LIMIT);
        log.info("Offset received :" + offset);
        log.info("Offset generated :" + (LIMIT + offset));
        return CollectionResponse.<Post>builder().setItems(query.list()).setNextPageToken(String.valueOf(LIMIT + offset)).build();
    } else throw new OAuthRequestException("Login please.");
}

Be sure to assign the query:
query = query.startAt(cursor);
Objectify's API uses a functional style. startAt() does not mutate the object.
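To illustrate the point about functional style, here is a minimal, self-contained sketch. ImmutableQuery is a hypothetical stand-in written for this example and is not Objectify's Query class; it only mimics the "return a new object instead of mutating" behavior described above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical immutable query over an in-memory list (not Objectify's API).
final class ImmutableQuery {
    private final int startIndex;
    private final int limit;

    ImmutableQuery(int startIndex, int limit) {
        this.startIndex = startIndex;
        this.limit = limit;
    }

    // Returns a NEW query with the given start; the receiver is left unchanged.
    ImmutableQuery startAt(int newStart) {
        return new ImmutableQuery(newStart, limit);
    }

    // Returns up to 'limit' elements of source, beginning at startIndex.
    List<Integer> list(List<Integer> source) {
        int from = Math.min(startIndex, source.size());
        int to = Math.min(from + limit, source.size());
        return new ArrayList<>(source.subList(from, to));
    }
}

class FunctionalStyleDemo {
    static final List<Integer> DATA = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);

    // Mirrors the question's code: the result of startAt() is thrown away.
    static List<Integer> forgotToAssign() {
        ImmutableQuery query = new ImmutableQuery(0, 3);
        query.startAt(5);
        return query.list(DATA);
    }

    // Mirrors the fix: the returned query replaces the old reference.
    static List<Integer> assigned() {
        ImmutableQuery query = new ImmutableQuery(0, 3);
        query = query.startAt(5);
        return query.list(DATA);
    }

    public static void main(String[] args) {
        System.out.println(forgotToAssign()); // still starts at the beginning
        System.out.println(assigned());       // starts at index 5
    }
}
```

In the unassigned variant the query keeps returning the first page, which matches the symptom of a cursor that appears to be ignored.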

Try the following:
Remove your for loop, which only advances the iterator and discards the results. Instead, iterate through the results once and build the list of items that you want to send back as you go. You should stick to the iterator and not force it through 100 items in a counting loop.
Next, once you have iterated through it, use iterator.getCursor() as the value of the cursor.
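The shape of that loop can be sketched without the datastore. In the example below, PositionIterator, Page and fetchPage are hypothetical names invented for illustration. The "cursor" is simply the position reached in an in-memory list, whereas a real datastore cursor is an opaque token. The point is only the order of operations: collect items while iterating, then read the cursor from the same iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CursorPagingSketch {

    // Hypothetical stand-in for a result iterator that can report a cursor.
    static final class PositionIterator<T> implements Iterator<T> {
        private final List<T> source;
        private final int end;
        private int position;

        PositionIterator(List<T> source, int start, int limit) {
            this.source = source;
            this.position = Math.min(start, source.size());
            this.end = Math.min(this.position + limit, source.size());
        }

        @Override
        public boolean hasNext() {
            return position < end;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return source.get(position++);
        }

        // The cursor reflects how far iteration has progressed so far.
        String getCursor() {
            return Integer.toString(position);
        }
    }

    static final class Page<T> {
        final List<T> items;
        final String nextCursor;

        Page(List<T> items, String nextCursor) {
            this.items = items;
            this.nextCursor = nextCursor;
        }
    }

    // Collects one page while iterating, then takes the cursor from the iterator.
    static <T> Page<T> fetchPage(List<T> source, String cursor, int limit) {
        int start = (cursor == null) ? 0 : Integer.parseInt(cursor);
        PositionIterator<T> it = new PositionIterator<>(source, start, limit);
        List<T> items = new ArrayList<>();
        while (it.hasNext()) {
            items.add(it.next());
        }
        return new Page<>(items, it.getCursor());
    }

    public static void main(String[] args) {
        List<String> posts = Arrays.asList("a", "b", "c", "d", "e");
        String cursor = null;
        Page<String> page;
        do {
            page = fetchPage(posts, cursor, 2);
            System.out.println(page.items + " next=" + page.nextCursor);
            cursor = page.nextCursor;
        } while (!page.items.isEmpty());
    }
}
```

Because the items are gathered during the single pass, there is no need to run the query a second time just to build the response list.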

FOR statement to query a SQL Server

I want to create a query against a database with a for statement (in C#)
something like this:
List<object> data = new List<object>();
for(int i = 0; i < executeScalar("SELECT COUNT(*) FROM mytable"); i++)
{
List[i] = executeRead("SELECT rownumber(i) From mytable");
// or
executeUpdate("UPDATE mytable SET ... inrownumber(i)",List[i])
}
and the question is: is there any function to use for this "rownumber(i)" and "inrownumber(i)"?
I know I can do it like this
List[i] = executeRead("SELECT * From mytable WHERE ROW_NUMBER() = " + i);
and
executeUpdate("UPDATE mytable SET ... WHERE ROW_NUMBER() = " + i,List[i])
but if I do that, the database will scan the whole table each time to find one item, so if I have 100 items, the database will go through 10,000 rows. I want the database to go directly to the row each time, so that it touches only 100 rows over the whole for statement.
Do you know any way to do it?
(I need it because in my program the developer assumed that all the data is in the list; he accesses it with a for statement and by index, and does "Add" and "Insert" and so on, and I don't want to change the whole program.)
Thanks

Assuming you have your data stored in a generic list called places then
using (SqlConnection cn = GetMyDbConnectionHere())
{
    using (SqlCommand cmd = new SqlCommand("dbo.UpdatePlace", cn))
    {
        // Assumes dbo.UpdatePlace is a stored procedure
        cmd.CommandType = CommandType.StoredProcedure;
        // Create your parameters for the command here - e.g. p_PlaceName
        foreach (Place place in places)
        {
            // Only send rows that have actually changed
            if (place.HasChanged)
            {
                p_PrimaryKey.Value = place.primaryKey;
                p_PlaceName.Value = place.placeName;
                p_PlaceLat.Value = place.lat;
                // And so on and so forth
                cmd.ExecuteNonQuery();
            }
        }
    }
}
All this code is straight off the top of my head and typed directly into SO on the web page - so I make no guarantee as to it being fully functional - but it should at least get you going... In addition there's zero error handling here - also a major no-no.

Sitecore Solr Search result items matching count

I am using Sitecore Solr search to search by a keyword string. Is there a way to know the number of matches for each of the returned result items?
The following is the code I am using:
using (var context = Index.CreateSearchContext())
{
    List<Item> ResultList = new List<Item>();
    var contentPredicate = PredicateBuilder.True<customSearchResultItem>();
    contentPredicate = contentPredicate.And(p => p.Content.Contains(SearchKey));
    contentPredicate = contentPredicate.And(p => p.Name != "__Standard Values");
    var languagePredicate = PredicateBuilder.True<customSearchResultItem>();
    languagePredicate = languagePredicate.And(p => p.Language == Context.Language.Name);
    var CombinPredicates = PredicateBuilder.True<customSearchResultItem>();
    CombinPredicates = CombinPredicates.And(languagePredicate);
    CombinPredicates = CombinPredicates.And(contentPredicate);
    // execute the search
    IQueryable<customSearchResultItem> query = context.GetQueryable<customSearchResultItem>().Where(CombinPredicates);
    var hits = query.GetResults().Hits;
}

From what I know, you cannot get the number of matches for each result item based on the keyword used for the search. What you can get is a score value from Solr.
var hits = query.GetResults().Hits;
foreach (var hit in hits)
{
    var score = hit.Score;
}
This value is for the whole query, so in your case it includes all predicates: the language, the "__Standard Values" exclusion and the keyword.
Remember that this value can differ between Solr and Lucene, since it depends on their internal calculations.

I solved this by adding a boost value to each predicate. Then, for each result item, I took the score and divided it by .59, which in my case is the maximum value, reached when all predicates are satisfied. The code in detail can be found in the following blog post:
http://sitecoreinfo.blogspot.com/2015/10/sitecore-solr-search-result-items.html
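The division step on its own is simple arithmetic. A rough sketch follows; the class and method names are invented here, and .59 is just the maximum quoted above for that particular set of boosts, not a general constant:

```java
class ScoreNormalizer {

    // Expresses a raw score as a fraction of the maximum attainable score,
    // clamped to the range [0, 1].
    static double relativeScore(double score, double maxScore) {
        if (maxScore <= 0) {
            throw new IllegalArgumentException("maxScore must be positive");
        }
        return Math.max(0.0, Math.min(1.0, score / maxScore));
    }

    public static void main(String[] args) {
        double maxScore = 0.59; // maximum from the answer above; specific to that setup
        System.out.println(relativeScore(0.59, maxScore));
        System.out.println(relativeScore(0.295, maxScore));
    }
}
```

A result near 1 would then mean that nearly all boosted predicates matched. The value is still a relevance ratio rather than a literal count of keyword occurrences.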

Caching of Function Results

I essentially want to write a bunch of commonly used queries in a web application of this format:
SELECT *
FROM secure_table
WHERE security_function(value 1, value 2) = true;
Value 1 and value 2 will have a limited enough range of values for the idea of caching the result of the security function to be potentially very useful in improving application performance. We would also need to be able to trigger a reset of the cache at will since some conditions would render the cached values out of date.
Is there an out of the box way of doing this with SQL Server (I believe we will be using the 2012 version)? I've had a google around and seen nothing concrete, some references to ASP.NET state but nothing concrete about what that actually involves, and some references to memcached, but that wouldn't seem to go down to function level, so doesn't seem suitable either.
EDIT:
So I would like the function to work something like this:
function security_function(val1, val2) {
result = getFromCache(val1, val2)
if result is empty then
result = //do big complicated query
addToCache(val1, val2, result)
end
return result
}

If you are using ASP.NET, you can use the Cache object to store the results of the query.
In C#:
Results GetResults(string value1, string value2)
{
    string cacheItemName = "cacheItem-" + value1 + "-" + value2;
    // Read the cache once, so the entry cannot expire between the check and the return
    var cached = Cache[cacheItemName] as Results;
    if (cached != null)
    {
        return cached;
    }
    else
    {
        var result = // do big complicated query;
        Cache.Insert(cacheItemName, result,
            null, DateTime.Now.AddMinutes(15d), // Expire after 15 minutes
            System.Web.Caching.Cache.NoSlidingExpiration);
        return result;
    }
}
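The same get-or-compute pattern, including the "reset at will" requirement from the question, can be sketched in plain Java. This is only an illustration under assumptions: SecurityCheckCache and its methods are hypothetical names, the expensive check is passed in as a function, and the 15-minute expiry from the C# version is not reproduced.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

// Hypothetical in-process cache for a two-argument security check.
final class SecurityCheckCache {
    private final Map<List<String>, Boolean> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, String, Boolean> expensiveCheck;

    SecurityCheckCache(BiFunction<String, String, Boolean> expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    // Returns the cached result for (value1, value2), computing it on first use.
    boolean isAllowed(String value1, String value2) {
        List<String> key = Arrays.asList(value1, value2);
        return cache.computeIfAbsent(key, k -> expensiveCheck.apply(value1, value2));
    }

    // Drops every cached result, e.g. after permissions have changed.
    void reset() {
        cache.clear();
    }
}

class SecurityCheckCacheDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        SecurityCheckCache checks = new SecurityCheckCache((a, b) -> {
            calls.incrementAndGet(); // stands in for the big complicated query
            return a.equals(b);
        });

        checks.isAllowed("x", "x");
        checks.isAllowed("x", "x"); // served from the cache
        System.out.println("expensive calls: " + calls.get());

        checks.reset();
        checks.isAllowed("x", "x"); // recomputed after the reset
        System.out.println("expensive calls: " + calls.get());
    }
}
```

Using the pair of values as a list key avoids the collisions that a concatenated string key can produce when a value itself contains the separator.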

Dapper return result fails after enumeration

I have a Dapper QueryMultiple call that populates a number of different lists, except for the very first one. While debugging I discovered that when the code gets to the following line in Dapper the results disappear:
public IEnumerable<T> Read<T>....
var result = ReadDeferred<T>(gridIndex, deserializer.Func, typedIdentity); //result has correct db values here
return buffered ? result.ToList() : result; //result = Enumeration yielded no results
The ReadDeferred function does not process any code in the try or finally clause. Why is the value of result being lost in enumeration?
Here is my code that calls dapper:
var results = con.QueryMultiple("GetInspections", p, commandType: CommandType.StoredProcedure, commandTimeout: 5000);
var inspectionDetails = new Inspection
{
    InspectionDetailList = results.Read<Inspection>().ToList(), // this one does not populate
    SOHList = results.Read<SOHPrograms>().ToList(),
    BuildingList = results.Read<Building>().ToList(),
    AdministratorList = results.Read<Employee>().ToList(),
    NotAdminList = results.Read<Employee>().ToList(),
    InspectionList = results.Read<InspectionList>().ToList()
};
return inspectionDetails;
I have verified that there are result sets being returned for each list from the sql query.

This problem had a two-part answer, because I had two errors. The first was that I was declaring InspectionDetailList as a list inside the Inspection object, which I removed. The second was to change the code that calls Dapper to use a using statement and read the pieces individually. Thanks go to a friend and to one of the overflow posts found here.
using (var results = con.QueryMultiple("GetInspections", p, commandType: CommandType.StoredProcedure, commandTimeout: 5000))
{
    var inspectionDetails = results.Read<Inspection>().First();
    inspectionDetails.OshList = results.Read<SOHPrograms>().ToList();
    inspectionDetails.BuildingList = results.Read<Building>().ToList();
}
