How to effectively delete Google App Engine Search API index - google-app-engine

I have found some similar questions here, but no solid answer:
How to delete or reset a search index in Appengine
how to delete search index in GAE Search API
How to delete a search index on the App Engine using Go?
How to delete a search Index itself
I have seen a Googler suggest that:
You can effectively delete an index by first using index.delete() to remove all of the documents from an index, and then using index.delete_schema() to remove the type mappings from the index.
Unfortunately, the Go SDK does not have an index.delete_schema() API.
I can only delete documents one by one, fetching the document ID list from the index. And we got a surprising billing figure in the dashboard:
Resource                     Usage               Billable     Price             Cost
Search API Simple Searches   214,748.49 10K Ops  214,748.39   $0.625 / 10K Ops  $134,217.74
Can someone tell me how to effectively delete a Google App Engine Search API index without it costing so much?

Unfortunately there is no simple operation that lets you delete an entire large search index without incurring substantial cost, short of deleting the entire app (which, in some circumstances, can actually be an effective approach).

The short answer is NO.
There is no perfectly efficient way on GCP to drop a full search index in one go.
The only efficient way they themselves suggest in their "Best Practices" is to delete in batches of 200 documents per index.delete() call (in the Java and Python App Engine SDKs).
To add to the disappointment, the Go SDK does not even support this and allows only one document deletion per call. What miserable support from GCP!
So if your indexes have grown to several GB, you are forced to spend your dollars and days to clean up the mess left by GCP, at your own cost. Mind you, it costs a lot with a giant index (>10 GB).
Now, how to do it in the Go runtime?
Don't do it with the Go runtime.
Instead, write a micro-service in Java or Python under the same projectId and use those runtimes with their SDKs/client libraries to delete the index the only efficient way GCP offers (200 documents per call). A very limited and essentially cost-bearing solution with App Engine. You have to live with it, dear :)
PS: I created a bug/issue a year back regarding this. No action taken yet :)

As you mentioned, deleting an index itself is only available for Java 8 at the moment.
Since you're using Go, there is currently no way to delete an index, but you can remove the documents that are part of it to reduce the cost.
To delete documents from an index, you can follow this example:
import (
	"fmt"
	"net/http"

	"google.golang.org/appengine"
	"google.golang.org/appengine/search"
)

func deleteHandler(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	index, err := search.Open("users")
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	id := "PA6-5000"
	err = index.Delete(ctx, id)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprint(w, "Deleted document: ", id)
}
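If you need to clear out the whole index rather than one hard-coded document, you can combine this with the search package's List iterator to walk every document ID and delete them one at a time. A minimal sketch along those lines, assuming the google.golang.org/appengine/search API used above (deleteAllHandler and the "users" index name are just placeholders):

func deleteAllHandler(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	index, err := search.Open("users")
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	deleted := 0
	// List with IDsOnly walks every document ID in the index.
	it := index.List(ctx, &search.ListOptions{IDsOnly: true})
	for {
		// With IDsOnly set there is no document to load, so pass nil.
		id, err := it.Next(nil)
		if err == search.Done {
			break
		}
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		// The Go SDK only exposes per-document deletion, so each pass
		// through the loop is one billed operation.
		if err := index.Delete(ctx, id); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		deleted++
	}
	fmt.Fprint(w, "Deleted documents: ", deleted)
}

Note that this still issues one Delete call per document, so it automates the cleanup but does not avoid the per-operation cost raised in the question.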

Related

How to use an array to work around google scripts "service invoked too many times for one day" error

I wrote a simple script to fetch the distance between two locations, each in a different cell in Google Sheets (below). My sheet has one set of 65 locations in the top row and a second set of 6,000 locations listed in the first column. I want to find the distance between each location in the top row and each location in the first column.
Given the size of my data set, I'm running into the "Service invoked too many times for one day: route" error message. I found this post suggesting that one could create an array to execute calculations for the whole spreadsheet at once, rather than cell by cell. Would this be a suitable solution for my current problem? If so, how would I go about writing the script? Here's my current code:
function GOOGLEMAPS(start_address, end_address) {
  Utilities.sleep(1000);
  var mapObj = Maps.newDirectionFinder();
  mapObj.setOrigin(start_address);
  mapObj.setDestination(end_address);
  var directions = mapObj.getDirections();
  var meters = directions["routes"][0]["legs"][0]["distance"]["value"];
  var distance = meters * 0.000621371; // meters to miles
  //Logger.log(distance)
  return distance;
}
If you want to overcome the daily limit imposed by Google Apps Script, you should initialize Maps with Google Maps Premium plan credentials. This means you have to contact Google and purchase a Google Maps Platform Premium plan license in order to get additional quota allowances.
There is a Maps.setAuthentication(clientId, signingKey); method for this purpose.
Enables the use of an externally established Maps API for Business account, to leverage additional quota allowances. Your client ID and signing key can be obtained from the Google Enterprise Support Portal. Set these values to null to go back to using the default quota allowances.
source: https://developers.google.com/apps-script/reference/maps/maps#setAuthentication(String,String)
I hope this answer clarifies your doubt.

Cloudant Database Map Reduce

I am new to Cloudant and NoSQL databases (I had worked on MongoDB).
1) Is there any Cloudant UI to write queries against and inspect result sets during development?
2) How do I create a map-reduce view in Cloudant?
Can you please reply with your thoughts?
The search indexes are written in JavaScript. (At the moment; Cloudant has launched their own "Cloudant Query", which promises to be easier to work with, but I haven't had the time to try it properly yet.)
Say you have documents in your DB which contain a field called "UserName" and you want to create a view on all of these. You could write a function like this:
function(doc) {
  if ( typeof doc.UserName !== "undefined" ) {
    emit([doc.UserName], doc._id);
  }
}

This will output the user names and document IDs.
If a given user name could be associated with multiple documents, you could do this instead:
function(doc) {
  if ( typeof doc.UserName !== "undefined" ) {
    emit([doc.UserName, doc._id], 1);
  }
}
and also use the built-in "count" or "sum" reduce functions that Cloudant provides to tally the number of documents a given user name is associated with, etc.
You can use the UI in the Cloudant DB dashboard to execute queries, or (as I personally favour) use a tool like Postman (https://www.getpostman.com/).
One word of warning though: error- and sanity-checking of your JavaScript code is pretty much non-existent, and you'll only know that something isn't working when you hit "save & build index", which can be a major pain if you're working on large databases (it can grind the whole thing to a halt). A pro tip, therefore, is to work out your indexes on smaller data sets in some safe little sandbox database before you let them loose on anything important.
All of this is supposedly going to be much better with Cloudant Query.

What ways does Go have to easily convert data into bytes or strings

I've been developing a couple apps using Google App Engine Go SDK that use Memcache as a buffer for loading data from the Datastore. As the Memcache is only able to store data as []byte, I often find myself creating functions to encode the various structures as strings, and also functions to reverse the process. Needless to say, it is quite tedious when I need to do this sort of thing 5 times over.
Is there an easy way to convert any arbitrary structure that can be stored in Datastore into []byte in order to store it in Memcache and then load it back without having to create custom code for various structures in GAE Golang?
http://golang.org/pkg/encoding/gob or http://golang.org/pkg/encoding/json can turn arbitrary datatypes into []byte slices, provided certain rules apply to the data structures being encoded. You probably want one of these two: gob encodes to smaller sizes, but JSON is more easily shareable with other languages if that is a requirement.
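As a rough illustration of the gob option, a pair of generic helpers can serve every datatype at once, so there is no need for per-struct conversion functions. A minimal sketch using only the standard library (the names toBytes and fromBytes are just placeholders):

import (
	"bytes"
	"encoding/gob"
)

// toBytes gob-encodes any value whose exported fields gob can handle.
func toBytes(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// fromBytes decodes data produced by toBytes into the value pointed to by v.
func fromBytes(data []byte, v interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

The same shape works with encoding/json by swapping in json.NewEncoder and json.NewDecoder.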
I found myself needing the same thing. So I created a package called:
AEGo/ds
Documentation | Source
go get github.com/scotch/aego/ds
It uses the same API as "appengine/datastore", so it will work as a drop-in replacement.
import "github.com/scotch/aego/v1/ds"
u = &User{Name: "Bob"}
key := datastore.NewKey(c, "User", "bob", 0, nil)
key, err := ds.Put(c, key, u)
u = new(User)
err = ds.Get(c, key, u)
By default it will cache all Puts and Gets to memcache, but you can modify this behavior by calling the ds.Register method:
ds.Register("User", true, false, false)
The Register method takes a string representing the Kind and 3 bools: useDatastore, useMemcache, useMemory. Passing a true value will cause AEGo/ds to persist the record to that store. The memory store is useful for records that you do not expect to change, but could contain stale data if you have more than one instance running.
Supported methods are:
Put
PutMulti
Get
GetMulti
Delete
DeleteMulti
AllocateIDs
Note: Currently caching only occurs with Get. GetMulti pulls from the datastore.
AEGo/ds is a work in progress, but the code is well tested. Any feedback would be appreciated.
And to answer your question, here's how I serialized the entities to gob for the memcache persistence.
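One way to do this without hand-rolling the encoder: the App Engine Go SDK's memcache package ships a ready-made Gob codec that handles the gob round-trip for you. A minimal sketch of that approach, not the original answer's code, with the User type and key as assumptions:

import (
	"appengine"
	"appengine/memcache"
)

type User struct {
	Name string
}

// cacheUser stores u in memcache, gob-encoded via the memcache.Gob codec.
func cacheUser(c appengine.Context, key string, u *User) error {
	return memcache.Gob.Set(c, &memcache.Item{Key: key, Object: u})
}

// cachedUser loads and gob-decodes a User previously stored under key.
func cachedUser(c appengine.Context, key string) (*User, error) {
	u := new(User)
	if _, err := memcache.Gob.Get(c, key, u); err != nil {
		return nil, err
	}
	return u, nil
}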

GAE Go size of datastore

Is there some function one can call to get the number of entities in the GAE Go Datastore of an app, without querying for the whole database and counting the output?
c := appengine.NewContext(r)

var result struct {
	Bytes     int64          `datastore:"bytes"`
	Count     int64          `datastore:"count"`
	Timestamp datastore.Time `datastore:"timestamp"`
}

// __Stat_Total__ is the built-in datastore statistics entity.
datastore.NewQuery("__Stat_Total__").Run(c).Next(&result)
c.Infof("count: %d", result.Count)
You can view the size of all entities in the admin console under Data > Datastore Statistics.
These stats can be queried programmatically from Python or Java; I couldn't find a documented equivalent for Go.
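If you need per-kind rather than app-wide numbers, the datastore statistics also include __Stat_Kind__ entities with a kind_name property. Assuming they can be read from Go the same way as __Stat_Total__ in the snippet above (an extrapolation; this is not documented for Go), a sketch reusing the same context c:

// kindStat mirrors the documented properties of the __Stat_Kind__ entities.
type kindStat struct {
	KindName string `datastore:"kind_name"`
	Count    int64  `datastore:"count"`
	Bytes    int64  `datastore:"bytes"`
}

// One __Stat_Kind__ entity exists per datastore kind.
for t := datastore.NewQuery("__Stat_Kind__").Run(c); ; {
	var s kindStat
	_, err := t.Next(&s)
	if err == datastore.Done {
		break
	}
	if err != nil {
		c.Errorf("stat query failed: %v", err)
		break
	}
	c.Infof("%s: %d entities, %d bytes", s.KindName, s.Count, s.Bytes)
}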

What's your experience developing on Google App Engine?

Is GQL easy to learn for someone who knows SQL? How is Django/Python? Does App Engine really make scaling easy? Is there any built-in protection against "GQL Injections"? And so on...
I'd love to hear the not-so-obvious ups and downs of using app engine.
Cheers!
My experience with Google App Engine has been great, and the 1000-result limit has been removed; here is a link to the release notes:
app-engine release notes
No more 1000 result limit - That's right: with addition of Cursors and the culmination of many smaller Datastore stability and performance improvements over the last few months, we're now confident enough to remove the maximum result limit altogether. Whether you're doing a fetch, iterating, or using a Cursor, there's no limits on the number of results.
The most glaring and frustrating issue is the datastore API, which looks great and is very well thought out and easy to work with if you are used to SQL, but has a 1000-row limit across all query result sets, and you can't access counts or offsets beyond that. I've also run into weirder issues, with not actually being able to add or access data for a model once it goes beyond 1000 rows.
See the Stack Overflow discussion about the 1000 row limit
Aral Balkan wrote a really good summary of this and other problems
Having said that, App Engine is a really great tool to have at one's disposal, and I really enjoy working with it. It's perfect for deploying micro web services (e.g. JSON APIs) to use in other apps.
GQL is extremely simple - it's a subset of the SQL 'SELECT' statement, nothing more. It's only a convenience layer over the top of the lower-level APIs, though, and all the parsing is done in Python.
Instead, I recommend using the Query API, which is procedural, requires no run-time parsing, and makes 'GQL injection' vulnerabilities totally impossible (though they are impossible in properly written GQL anyway). The Query API is very simple: call .all() on a Model class, or call db.Query(modelname). The Query object has .filter(field_and_operator, value), .order(field_and_direction) and .ancestor(entity) methods, in addition to all the facilities GQL objects have (.get(), .fetch(), .count(), etc.). Each of the Query methods returns the Query object itself for convenience, so you can chain them:
results = MyModel.all().filter("foo =", 5).order("-bar").fetch(10)
is equivalent to:
results = MyModel.gql("WHERE foo = 5 ORDER BY bar DESC LIMIT 10").fetch()
A major downside when working with App Engine was the 1k query limit, which has been mentioned in the comments already. What I haven't seen mentioned, though, is the fact that there is a built-in sortable order with which you can work around this issue.
From the App Engine cookbook:
def deepFetch(queryGen, key=None, batchSize=100):
  """Iterator that yields an entity in batches.

  Args:
    queryGen: should return a Query object
    key: used to .filter() for __key__
    batchSize: how many entities to retrieve in one datastore call

  Retrieved from http://tinyurl.com/d887ll (AppEngine cookbook).
  """
  from google.appengine.ext import db

  # AppEngine will not fetch more than 1000 results
  batchSize = min(batchSize, 1000)
  query = None
  done = False
  count = 0

  if key:
    key = db.Key(key)

  while not done:
    print count
    query = queryGen()
    if key:
      query.filter("__key__ > ", key)
    results = query.fetch(batchSize)
    for result in results:
      count += 1
      yield result
    if batchSize > len(results):
      done = True
    else:
      key = results[-1].key()
The above code together with Remote API (see this article) allows you to retrieve as many entities as you need.
You can use the above code like this:
def allMyModel():
  return MyModel.all()

myModels = deepFetch(allMyModel)
At first I had the same experience as others who transitioned from SQL to GQL -- kind of weird to not be able to do JOINs, count more than 1000 rows, etc. Now that I've worked with it for a few months I absolutely love App Engine. I'm porting all of my old projects onto it.
I use it to host several high-traffic web applications (at peak time one of them gets 50k hits a minute.)
Google App Engine doesn't use an actual database, and apparently uses some sort of distributed hash map. This will lend itself to some different behaviors that people who are accustomed to SQL just aren't going to see at first. So for example getting a COUNT of items in regular SQL is expected to be a fast operation, but with GQL it's just not going to work the same way.
Here are some more issues:
http://blog.burnayev.com/2008/04/gql-limitations.html
In my personal experience, it's an adjustment, but the learning curve is fine.
