I am using the following Go package: https://godoc.org/cloud.google.com/go/bigquery
My app runs in Google App Engine.
If I have understood the documentation correctly, it should be possible to extract the result of a query to Google Cloud Storage using a job. I don't find the documentation very clear on this, and I was wondering if anyone has example code or other help.
TL;DR
How do I get access to the temporary table when using Go instead of the command line?
How do I extract the result of my BigQuery query to GCS?
** EDIT **
Solution I used:
I created a temporary table, set it as the Dst (destination) for the query result, and created an export job with it.
dataset_result.Table(table_name).Create(ctx, bigquery.TableExpiration(time.Now().Add(1*time.Hour)))
Update 2018:
https://github.com/GoogleCloudPlatform/google-cloud-go/issues/547
To get the table name:
q := client.Query(...)
job, err := q.Run(ctx)
// handle err
config, err := job.Config()
// handle err
tempTable := config.(*bigquery.QueryConfig).Dst
How do I extract the result of my BigQuery to GCS
You cannot directly write the results of a query to GCS. You first need to run the query, save the results to a permanent table, and then kick off an export job to GCS.
https://cloud.google.com/bigquery/docs/exporting-data
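Putting the two steps together, a Go sketch might look like the following. The project, dataset, table, and bucket names are placeholders, and depending on the client-library version, Table.Create may take a *TableMetadata instead of a list of options:

```go
package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/bigquery"
)

func main() {
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, "my-project") // placeholder project ID
	if err != nil {
		log.Fatal(err)
	}

	// Step 1: run the query into an explicit destination table
	// that expires automatically in an hour.
	table := client.Dataset("my_dataset").Table("tmp_results")
	if err := table.Create(ctx, bigquery.TableExpiration(time.Now().Add(1*time.Hour))); err != nil {
		log.Fatal(err)
	}
	q := client.Query("SELECT name FROM my_dataset.my_table")
	q.Dst = table
	job, err := q.Run(ctx)
	if err != nil {
		log.Fatal(err)
	}
	status, err := job.Wait(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if status.Err() != nil {
		log.Fatal(status.Err())
	}

	// Step 2: export the destination table to GCS.
	gcs := bigquery.NewGCSReference("gs://my-bucket/results-*.csv")
	extractJob, err := table.ExtractorTo(gcs).Run(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := extractJob.Wait(ctx); err != nil {
		log.Fatal(err)
	}
}
```

The wildcard (*) in the GCS URI lets BigQuery shard large results into multiple files.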
How do I get access to the temporary table when using Go instead of the command line?
You can use the jobs API, or look in the query history if using the web UI. See the documentation below.
https://cloud.google.com/bigquery/querying-data#temporary_and_permanent_tables
Related
I have found some similar questions here, but no solid answer.
How to delete or reset a search index in Appengine
how to delete search index in GAE Search API
How to delete a search index on the App Engine using Go?
How to delete a search Index itself
I see some googler suggest that
You can effectively delete an index by first using index.delete() to remove all of the documents from an index, and then using index.delete_schema() to remove the type mappings from the index.
Unfortunately, the Go SDK does not have an index.delete_schema() API.
I can only delete documents one by one, by getting the item ID list from the index. And we got a surprising billing statement in the dashboard:
Resource                    Usage               Billable    Price             Cost
Search API Simple Searches  214,748.49 10K Ops  214,748.39  $0.625 / 10K Ops  $134,217.74
Can someone tell me how to effectively delete a Google App Engine Search API index without it costing so much?
Unfortunately there is no simple operation that allows you to delete an entire large search index without incurring substantial cost, short of deleting the entire app (which, actually, could be an effective approach in certain circumstances).
The short answer is no.
There is no perfectly efficient way in GCP to drop a full search index in one go.
The only efficient way they themselves suggest in their "Best Practices" is to delete in batches of 200 documents per index.delete() method call (in the Java and Python App Engine SDKs).
To add to the disappointment, the Go SDK does not support even this, and allows only one document deletion per call. What miserable support from GCP!
So if your indexes have grown to several GBs, you are forced to spend your dollars and days cleaning up the mess, left by GCP, at your own cost. Mind you, it costs a lot with a giant index (>10 GB).
Now, how do you do it in the Go runtime?
Don't do it with the Go runtime.
Better: write a micro-service in Java or Python under the same project ID and use those runtimes, with their SDKs/client libraries, to delete the index the only efficient way GCP offers (200 documents per call). So it is a very limited and essentially cost-bearing solution with App Engine. You have to live with it, dear :)
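The batching itself is just chunking the ID list. For illustration, here is a small Go helper (the name chunkIDs is mine) that splits document IDs into groups of at most 200, one group per batch-delete call in the Java/Python SDKs:

```go
package main

import "fmt"

// chunkIDs splits ids into batches of at most size elements, mirroring
// the 200-documents-per-call limit of the Java/Python Search APIs.
func chunkIDs(ids []string, size int) [][]string {
	var batches [][]string
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	return batches
}

func main() {
	ids := make([]string, 450)
	for i := range ids {
		ids[i] = fmt.Sprintf("doc-%d", i)
	}
	batches := chunkIDs(ids, 200)
	fmt.Println(len(batches))    // 3
	fmt.Println(len(batches[0])) // 200
	fmt.Println(len(batches[2])) // 50
}
```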
PS. I created a bug/issue about this a year back. No action taken yet :)
As you mentioned, deleting an index itself is only available for Java 8 at the moment.
Since you're using Go, there is currently no way to delete an index, but you can remove the documents that are part of it to reduce the cost.
To delete documents from an index, you can follow this example:
import (
	"fmt"
	"net/http"

	"google.golang.org/appengine"
	"google.golang.org/appengine/search"
)

func deleteHandler(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	index, err := search.Open("users")
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	id := "PA6-5000"
	if err := index.Delete(ctx, id); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprint(w, "Deleted document: ", id)
}
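If you need to clear a whole index from Go, the closest you can get is listing the document IDs and deleting them one at a time (still one billed operation per document). A sketch using the google.golang.org/appengine/search package (the function name clearIndex is mine):

```go
// clearIndex deletes every document in the named index, one at a time.
// Each deletion is a separate billed Search API operation.
func clearIndex(ctx context.Context, name string) error {
	index, err := search.Open(name)
	if err != nil {
		return err
	}
	// List only IDs to avoid loading full documents.
	it := index.List(ctx, &search.ListOptions{IDsOnly: true})
	for {
		id, err := it.Next(nil)
		if err == search.Done {
			return nil
		}
		if err != nil {
			return err
		}
		if err := index.Delete(ctx, id); err != nil {
			return err
		}
	}
}
```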
The index.yaml file of my GAE app is no longer updated by the development server.
I have recently added a new kind to my app and a handler that queries this kind like so:
from google.appengine.ext import ndb

class MyKind(ndb.Model):
    thing = ndb.TextProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)
and in the handler I have a query
query = MyKind.query()
query.order(-MyKind.timestamp)
logging.info(query.iter().index_list())
entities = query.fetch(100)
for entity in entities:
    # do something
AFAIK, the development server should create an index for this query and update index.yaml accordingly. However, it doesn't. It just looks like this:
indexes:
# AUTOGENERATED
The logging.info(query.iter().index_list()) should output the index used for the query, it just says 'None'. Also, the SDK console says 'Datastore contains no indexes.'
Running the query returns the entities unsorted. I have two questions:
Is there some syntax error in my code that causes the query results to be unsorted, or is it the missing index?
If it's the missing index, is there a way to manually force the dev server to update index.yaml? Other suggestions?
Thank you
Your call to order returns a new query:
query = MyKind.query()
query = query.order(-MyKind.timestamp)
To clarify: query.order(-MyKind.timestamp) does not change the query in place; it returns a new one, so you need to use the query returned by that method. As written, the query.order(-MyKind.timestamp) in your code does nothing.
I have 1.6 Million entities in a Google App Engine app that I would like to download. I tried using the built in bulkloader mechanism but found that it is terribly slow. While I can only download ~30 entities/second via the bulkloader, I can do ~500 entities/second by querying the datastore via a backend. A backend is necessary to circumvent the 60 second request limit. In addition, datastore queries can only live for up to 30 seconds so you need to break up your fetches across multiple queries using query cursors.
The code on the server side fetches 1000 entities and returns a query cursor:
cursor = request.get('cursor')
devices = Pushdev.all()
if cursor and cursor != '':
    devices.with_cursor(cursor)
next1000 = devices.fetch(1000)
for d in next1000:
    t = int(time.mktime(d.created.timetuple()))
    response.out.write('%s/%s/%d\n' % (d.name, d.alias, t))
response.out.write(devices.cursor())
On the client side, I have a loop that invokes the handler on the server with a null cursor to begin with and then starts to pass the cursor received by the previous invocation. It terminates when it gets an empty result.
PROBLEM: I am only able to fetch a fraction - ~20% of the entities using this method. I get a response with empty data even though the full set of entities has not been traversed. Why does this method not fetch everything comprehensively?
I couldn't find anything to confirm or deny this in the docs, but my guess is that all() has a non-deterministic ordering, such that eventually one of your fetch(1000) calls will hit the "last element" and devices.cursor() will return nothing.
Try this:
devices = Pushdev.all().order('__key__')
Is there some function one can call to get the amount of entries in the GAE Go Datastore of an app, without querying it for the whole database and counting the output?
c := appengine.NewContext(r)

var result struct {
	Bytes     int64          `datastore:"bytes"`
	Count     int64          `datastore:"count"`
	Timestamp datastore.Time `datastore:"timestamp"`
}
// __Stat_Total__ holds datastore-wide statistics. Note that the stats
// are updated periodically, so the count may lag behind the live data.
if _, err := datastore.NewQuery("__Stat_Total__").Run(c).Next(&result); err != nil {
	// handle err
}
c.Infof("count: %d", result.Count)
You can view the size of all entities in the admin console under Data > Datastore Statistics.
These stats can be queried programmatically from Python or Java; I couldn't find a documented equivalent for Go.
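The same pattern should work for the other statistics kinds. For example, a sketch (the function name kindCounts is mine) that collects per-kind entity counts from __Stat_Kind__ using the classic appengine SDK, with the same caveat that the statistics lag behind live data:

```go
// kindCounts returns a map from kind name to entity count, read from
// the periodically updated __Stat_Kind__ statistics entities.
func kindCounts(c appengine.Context) (map[string]int64, error) {
	counts := make(map[string]int64)
	var stat struct {
		KindName string `datastore:"kind_name"`
		Count    int64  `datastore:"count"`
	}
	it := datastore.NewQuery("__Stat_Kind__").Run(c)
	for {
		_, err := it.Next(&stat)
		if err == datastore.Done {
			return counts, nil
		}
		if err != nil {
			return nil, err
		}
		counts[stat.KindName] = stat.Count
	}
}
```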
I don't know what can be wrong here:
SELECT * FROM RGUser
WHERE isGuest = FALSE AND created < DATE('2011-09-01')
ORDER BY created
screenshot http://my.jetscreenshot.com/3910/20110904-pwms-18kb.jpg
There's nothing wrong with your query; it's just that the Datastore Viewer does not handle modifying a query after an error well. To work around it, copy the query, click Datastore Viewer again to reload the page, paste the query, and run it again.