Create Index for Full Text Search in Google App Engine - google-app-engine

I'm reading the documentation on the Full Text Search API (Java) in Google App Engine at https://developers.google.com/appengine/docs/java/search/overview. It has an example of getting an index:
public Index getIndex() {
    IndexSpec indexSpec = IndexSpec.newBuilder()
        .setName("myindex")
        .setConsistency(Consistency.PER_DOCUMENT)
        .build();
    return SearchServiceFactory.getSearchService().getIndex(indexSpec);
}
What about creating an index? How do I create one?
Thanks

You just did. You just created one.
public class IndexSpec
Represents information about an index. This class is used to fully specify the index you want to retrieve from the SearchService. To build an instance use the newBuilder() method and set all required parameters, plus optional values different than the defaults.
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/IndexSpec
You can confirm this by looking at the SearchService
SearchService is also responsible for creating new indexes. For example:
SearchService searchService = SearchServiceFactory.getSearchService();
index = searchService.getIndex(IndexSpec.newBuilder().setName("myindex"));
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/SearchService
Anyway, it seems your code will create a new index if it doesn't exist. That's what the docs suggest:
// Get the index. If not yet created, create it.
Index index = searchService.getIndex(
    IndexSpec.newBuilder()
        .setIndexName("indexName")
        .setConsistency(Consistency.PER_DOCUMENT));
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/search/Index
Now, what happens if you run the code again and change the Consistency? Do you have the same index with a different consistency? Is the index overwritten? I don't know. I would use the SearchService to look up existing indexes instead of using code that might create them, to avoid getting an index in my code while inadvertently changing its spec.

An Index is implicitly created when a document is written. Consistency is an attribute of the index, i.e. you can't have two indexes of the same name with different consistencies.


How can we initialize DataChangeDetectionPolicy using .netsdk?

I have created a new index that is populated using an indexer. The indexer's datasource is a SQL view that has a Timestamp column of type datetime. Since we don't want a full reindexing each time the indexer runs, this column should be used to determine which data have changed since the last indexer run.
According to the documentation, we need to create or update the datasource by setting the HighWatermarkColumnName and ODataType on the DataChangeDetectionPolicy object. The example in the documentation uses the REST API, and there is also a way to do it directly in the Azure Search portal.
However, I want to do it using the .NET SDK, and so far I haven't been able to. I am using Azure.Search.Documents (11.2.0-beta.2). Here is the part of the code I use to create the datasource:
SearchIndexerDataSourceConnection CreateIndexerDataSource()
{
    var ds = new SearchIndexerDataSourceConnection(DATASOURCE,
        SearchIndexerDataSourceType.AzureSql,
        this._datasourceConStringMaxEvents,
        new SearchIndexerDataContainer(SQLVIEW));
    //ds.DataChangeDetectionPolicy = new DataChangeDetectionPolicy();
    return ds;
}
The commented code is what I tried to do to initialize the DataChangeDetectionPolicy but there is no ctor exposed. Am I missing something?
Thanks in advance.
Instead of using DataChangeDetectionPolicy, you will need to use HighWaterMarkChangeDetectionPolicy which is derived from DataChangeDetectionPolicy.
So your code would be something like:
ds.DataChangeDetectionPolicy = new HighWaterMarkChangeDetectionPolicy("Timestamp");
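To illustrate what a high-water-mark policy does conceptually, here is a minimal, language-neutral sketch in Python. It is not the Azure SDK and not how the indexer is implemented; the function name rows_changed_since and the sample rows are hypothetical. The only idea taken from the question is that a "Timestamp" column decides which rows changed since the last run.

```python
from datetime import datetime


def rows_changed_since(rows, high_water_mark, column="Timestamp"):
    """Return (changed_rows, new_mark) for rows newer than the stored mark.

    A mark of None means no previous run, so every row counts as changed.
    """
    changed = [
        r for r in rows
        if high_water_mark is None or r[column] > high_water_mark
    ]
    # Process in column order so the last row carries the new mark.
    changed.sort(key=lambda r: r[column])
    new_mark = changed[-1][column] if changed else high_water_mark
    return changed, new_mark


# Hypothetical view contents.
rows = [
    {"Id": 1, "Timestamp": datetime(2021, 1, 1)},
    {"Id": 2, "Timestamp": datetime(2021, 1, 2)},
]

# First run: no mark yet, so everything is picked up.
first_run, mark = rows_changed_since(rows, None)

# A new row arrives; the second run only sees that row.
rows.append({"Id": 3, "Timestamp": datetime(2021, 1, 3)})
second_run, mark = rows_changed_since(rows, mark)
```

This also shows why the column has to increase whenever a row is inserted or updated: a row whose value does not move past the stored mark is never picked up again.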

Rails update remove number from an array attribute?

Is there a way to remove a number from an attribute array in an update? For example, if I want to update all of an alchy's booze stashes if he runs out of a particular type of booze:
Alchy has_many :stashes
Stash.available_booze_types = [] (filled with booze.ids)
Booze is also a class
@booze.id = 7
if @booze.is_all_gone
  @alchy.stashes.update(available_booze_types: "remove @booze.id")
end
update: @booze.id may or may not be present in the available_booze_types array
... so if @booze.id was in any of the Alchy.stash instances (in the available_booze_types attribute array), it would be removed.
I think you can do what you want in the following way:
if @booze.is_all_gone
  @alchy.stashes.each do |stash|
    stash.available_booze_types.delete(@booze.id)
  end
end
However, it looks to me like there are better ways to do what you are trying to do. Rails gives you something like that array through relations. Also, the data in the array will be lost if you restart the app (if, as I understand it, available_booze_types is an attribute that is not stored in a database). If your application is set up correctly (a stash has many boozes), a scope like the following in the Stash class seems to me like the right approach:
scope :available_boozes, -> { joins(:boozes).where("number > ?", 0) }
You can use it in the following way:
@alchy.stashes.available_boozes
which would only return the ones that are available.

How to dynamically set Azure Search's returned document size?

I know that by default Azure Search returns 50 rows, and that it can return at most 1000 in one request. After that I need to use the continueToken to get the rest.
However, when I use SearchServiceClient and SearchParameters to run a query with the SDK, it seems I can't pass a parameter to say how many rows I want returned in one request. Did I miss something? Here is my simple code, which just returns everything.
(What I want is, in certain scenarios, to return at most 50 rows per request, and in other scenarios to return 1000 rows per request and loop to get the rest.)
var _searchClient = new SearchServiceClient(searchServiceName, new SearchCredentials(apiKey));
var _indexClient = _searchClient.Indexes.GetClient("unit");
SearchParameters sp = new SearchParameters() { SearchMode = SearchMode.All};
var result= _indexClient.Documents.Search(null , sp);
Azure Cognitive Search provides a model class SearchParameters in Microsoft.Azure.Search.Models namespace for the .NET SDK. You can set the Top and Skip properties to control the number of returned docs.
Refer to the docs for SearchParameters for more properties. The following article gives examples of using these parameters with search: How to use Azure Cognitive Search from a .NET Application
In Azure Cognitive Search, you use the $top and $skip parameters to page through search results and $count to return the total number of hits. The following example shows a sample request for total hits on an index called "online-catalog", returned as @odata.count:
GET /indexes/online-catalog/docs?search=*&$top=15&$skip=0&$count=true
For more details, you could refer to this article.
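The looping described in the question can be sketched as follows. This is a minimal Python illustration of top/skip paging over an in-memory list, not the Azure SDK; the search function here is a hypothetical stand-in that returns one page plus the total count, in the way $top, $skip and $count are described above.

```python
def search(docs, top=50, skip=0):
    """Hypothetical stand-in for a search call: one page plus the total count."""
    return docs[skip:skip + top], len(docs)


def fetch_all(docs, page_size):
    """Collect every result by requesting page_size documents at a time."""
    results = []
    skip = 0
    while True:
        page, total = search(docs, top=page_size, skip=skip)
        results.extend(page)
        skip += len(page)
        # Stop on an empty page or once everything has been read.
        if not page or skip >= total:
            break
    return results


docs = list(range(2500))

# One scenario: a single small page.
first_page, total = search(docs, top=50, skip=0)

# Another scenario: large pages, looping until everything is read.
all_docs = fetch_all(docs, page_size=1000)
```

The page size is just an argument here, which is the role the Top property plays in the answer: the caller picks 50 or 1000 per request depending on the scenario.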

GoogleAppEngine - query with some custom filter

I am quite new to App Engine and Objectify. I need to fetch a single row from the db to get a value from it. I tried to fetch the element with ofy().load().type(Branch.class).filter("parent_branch_id", 0).first(), but the result is FirstRef(null). However, when I run the following loop:
for (Branch b : ofy().load().type(Branch.class).list()) {
    System.out.println(b.id + ". " + b.tree_label + " - parent is " + b.parent_branch_id);
}
What am I doing wrong?
[edit]
Of course Branch is a database entity; if it matters, parent_branch_id is of type long.
If you want a Branch as the result of your request, I think you miss a .now():
Branch branch = ofy().load().type(Branch.class).filter("parent_branch_id", 0).first().now();
It sounds like you don't have an @Index annotation on your parent_branch_id property. When you do ofy().load().type(Branch.class).list(), Objectify is effectively doing a batch get by kind (like doing Query("Branch") with the low-level API) so it doesn't need the property indexes. As soon as you add a filter(), it uses a query.
Assuming you are using Objectify 4, properties are not indexed by default. You can index all the properties in your entity by adding an @Index annotation to the class. The annotation reference provides useful info.
Example from the Objectify API reference:
LoadResult<Thing> th = ofy.load().type(Thing.class).filter("foo", foo).first();
Thing th = ofy.load().type(Thing.class).filter("foo", foo).first().now();
So you need to make sure the member "foo" has an @Index annotation and call now() to fetch the first element. This returns null if no element is found.
Also, "parent_branch_id" in your case may be a long, in which case the value must be 0L and not 0.

Implementing get_multi on app engine memcache

I was wondering if somebody could help. I'm using the blobcache module outlined in this post here.
This works fine, but I'm looking to speed up retrieval from memcache by using the get_multi() function. My current code cannot find the keys when using get_multi.
My current get function looks like this:
def get(key):
    chunk_keys = memcache.get(key)
    if chunk_keys is None:
        return None
    chunk_keys = ",".join(chunk_keys)
    str(chunk_keys)
    chunk = memcache.get_multi(chunk_keys)
    if chunk is None:
        return None
    try:
        return chunk
    except Exception:
        return None
My understanding from the documentation is that you only need to pass a string of keys to get_multi.
However, this is not returning anything at the moment.
Can someone point out what I'm doing wrong here?
Pass it a list of strings (keys) instead of a single string with commas in it.
get_multi(keys, key_prefix='', namespace=None, for_cas=False)
keys = List of keys to look up. A Key can be a string or a tuple of
(hash_value, string), where the hash_value, normally used for sharding
onto a memcache instance, is instead ignored, as Google App Engine
deals with the sharding transparently.
Multi Get Documentation
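A minimal sketch of the fix, under some stated assumptions: FakeMemcache is a hypothetical in-memory stand-in so the example runs without the App Engine SDK, and the layout (a list of chunk keys stored under the main key, with each chunk stored as bytes) is a guess at how the blobcache module works. The one behavior borrowed from the real API is that get_multi takes a list of keys and returns a dict containing only the keys it found.

```python
class FakeMemcache:
    """Hypothetical in-memory stand-in for the memcache client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def get_multi(self, keys):
        # Returns a dict holding only the keys that were found.
        return {k: self._data[k] for k in keys if k in self._data}


def get(cache, key):
    """Reassemble a chunked value, or return None if any part is missing."""
    chunk_keys = cache.get(key)
    if chunk_keys is None:
        return None
    # Pass the list itself; do not join it into one comma-separated string.
    chunks = cache.get_multi(chunk_keys)
    if len(chunks) != len(chunk_keys):
        return None  # at least one chunk is missing
    return b"".join(chunks[k] for k in chunk_keys)


cache = FakeMemcache()
cache.set("blob", ["blob.0", "blob.1"])
cache.set("blob.0", b"hello ")
cache.set("blob.1", b"world")
value = get(cache, "blob")
```

Because get_multi silently omits missing keys, comparing the number of returned entries with the number of requested keys is one way to detect that a chunk has been evicted.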
