How to download the local datastore into a CSV file? - google-app-engine

I would like to download my local fixture data into a CSV file to keep it safe.
I tried something similar six months ago, the only difference being that I downloaded the data from a deployed App Engine project rather than from my local datastore.
How can I download it from my local datastore instead? This is the command I ran:
appcfg.py download_data --url=http://localhost:8080/_ah/remote_api/ --filename=data.csv --email=x@w.eu --application="dev~myApp"
I get the following error message:
03:37 PM Downloading data records.
[INFO ] Logging to bulkloader-log-20140626.153708
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20140626.153708.sql3
[INFO ] Opening database: bulkloader-results-20140626.153708.sql3
Password for x@w.eu:
[INFO ] Connecting to localhost:8080/_ah/remote_api/
[ERROR ] Unable to download kind stats for all-kinds download.
[ERROR ] Kind stats are generated periodically by the appserver
[ERROR ] Kind stats are not available on dev_appserver.
Any idea what I am doing wrong?

Go to the dev admin console (usually at http://localhost:8000).
Go to Datastore Stats.
Click Generate Stats.
This error occurs because the development server (dev_appserver) does not generate datastore statistics automatically. You have to generate them manually with the steps above before running the download again.
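The error text refers to an "all-kinds download", so besides generating the stats it may be worth requesting a single kind with --kind, as other commands on this page do. Whether that avoids the stats lookup on dev_appserver is an assumption here, not something the answer confirms. A minimal Python sketch that assembles such a command, where build_download_command is a hypothetical helper and MyKind is a placeholder kind name:

```python
import shlex


def build_download_command(url, filename, application=None, kind=None):
    """Assemble an appcfg.py download_data argument list.

    Hypothetical helper: it only uses flags that appear in the question.
    """
    args = ["appcfg.py", "download_data", "--url=" + url, "--filename=" + filename]
    if application:
        args.append("--application=" + application)
    if kind:
        # Assumption: naming one kind avoids the all-kinds stats lookup.
        args.append("--kind=" + kind)
    return args


cmd = build_download_command(
    "http://localhost:8080/_ah/remote_api/",
    "data.csv",
    application="dev~myApp",
    kind="MyKind",  # placeholder kind name
)
# Print a shell-ready command line instead of running it.
print(" ".join(shlex.quote(a) for a in cmd))
```

The helper returns a list rather than a string so it can be passed to subprocess.run directly without shell quoting issues.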

Related

appcfg.py upload_data won't ask for authentication locally

In the local development environment, the upload_data command does not launch a browser for OAuth. Why is that?
The code works perfectly fine on App Engine but not in the local development environment. Is there a trick to using the remote API with the dev environment?
Here is the command I run and its output:
appcfg.py upload_data --config_file=bulkloader.yaml --filename=./stops.txt --kind=StopLocationLoader --url=http://localhost:8082/_ah/remote_api
10:39 PM Uploading data records.
[INFO ] Logging to bulkloader-log-20161017.223916
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
Error 401: --- begin server output ---
You must be logged in as an administrator to access this.
--- end server output ---
This is a bug: https://code.google.com/p/googleappengine/issues/detail?id=12445
It links to a workaround posted for another question:
gcloud auth login
gcloud auth print-access-token
appcfg.py upload_data --oauth2_access_token=<oauth2_access_token> --config_file bulkloader.yaml --url=http://<yourproject>.appspot.com/remote_api --filename places.csv --kind=Place --email=<you@gmail.com>
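The workaround can also be scripted. A minimal Python sketch, assuming gcloud and appcfg.py are on the PATH and gcloud auth login has already been run. The names fetch_access_token, build_upload_command, and upload are hypothetical helpers, and the URL, file, and kind are whatever you would put in the command above:

```python
import subprocess


def fetch_access_token():
    """Return the token printed by `gcloud auth print-access-token`."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_upload_command(token, url, filename, kind, config_file, email=None):
    """Assemble the upload_data call from the workaround as an argument list."""
    cmd = [
        "appcfg.py", "upload_data",
        "--oauth2_access_token=" + token,
        "--config_file=" + config_file,
        "--url=" + url,
        "--filename=" + filename,
        "--kind=" + kind,
    ]
    if email:
        cmd.append("--email=" + email)
    return cmd


def upload(url, filename, kind, config_file="bulkloader.yaml", email=None):
    """Fetch a fresh token, then run the upload; raises if either step fails."""
    token = fetch_access_token()
    cmd = build_upload_command(token, url, filename, kind, config_file, email)
    subprocess.run(cmd, check=True)
```

Fetching the token immediately before each upload keeps a short-lived token from expiring between the two steps.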

Why can't I download GAE data? A 302 error is returned

I used the following command to download my GAE data:
appcfg.py download_data --log_file=bulkloader.log --kind=MyKind --url=http://myappid.appspot.com/rmt_api --filename=myfilename --db_filename=MyKind_db.sql3 --result_db_filename=MyKind_result_db.sql3 --config_file=bulkloader.yaml
It worked well a few days ago, but yesterday it reported that the access_token had expired. I manually deleted the GAE OAuth files in /Users/myuser/ (I am on OS X), but that didn't help. Now I get the following output:
09:51 PM Downloading data records.
[INFO ] Logging to bulkloader.log
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
Error 302: --- begin server output ---
--- end server output ---
I tried adding the --verbose and --noisy parameters, but nothing changed.
As a result, I cannot download the data. Uploading the application with appcfg.py update works fine.
It looks like a GAE issue, but all such issue reports have been rejected.
It appears that the thread became confused by users posting potentially-unrelated material. You should open a new ticket linking to that one, and we'll be happy to take a look. We monitor the relevant tags here on Stack Overflow and Serverfault, along with several Google Groups forums and code.google.com Public Issue Trackers (along with the various GitHub projects), so you'll be sure to get a quick response.

Broken Pipe error on appcfg upload_data

I have successfully downloaded my data from the datastore on Mac OS X 10.8 (SDK 1.6). However, when I use appcfg upload_data I get the following error. Any idea?
appcfg.py upload_data --url=http://localhost:8081/_ah/remote_api --filename=db_backup_2012_Nov_3
ERROR:
[INFO ] Connecting to localhost:8081/_ah/remote_api
[INFO ] Starting import; maximum 10 entities per post
...........[INFO ] Unexpected thread death: WorkerThread-2
[INFO ] An error occurred. Shutting down...
........[ERROR ] Error in WorkerThread-2:
[ERROR ] Error in WorkerThread-7:
[INFO ] 350 entities total, 0 previously transferred
[INFO ] 190 entities (179958 bytes) transferred in 15.2 seconds
[INFO ] Some entities not successfully transferred
I solved the problem by limiting the number of threads to 1 when running appcfg upload_data:
appcfg.py upload_data --url=http://localhost:8081/_ah/remote_api --filename=db_backup_2012_Nov_3 --num_threads=1
The dev app server seems to be dropping threads, perhaps because of "threadsafe: no" in app.yaml.
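To decide whether a rerun with --num_threads=1 is needed, the summary lines in the output can be checked. A minimal Python sketch, assuming the log format matches the lines shown in the question. The name parse_transfer_summary is a hypothetical helper, not part of the SDK:

```python
import re

# Patterns follow the summary lines quoted in the question (assumed stable).
TOTAL_RE = re.compile(r"(\d+) entities total")
DONE_RE = re.compile(r"(\d+) entities \(\d+ bytes\) transferred")


def parse_transfer_summary(log_text):
    """Return (total, transferred) from bulkloader output, or None if absent."""
    total = TOTAL_RE.search(log_text)
    done = DONE_RE.search(log_text)
    if not total or not done:
        return None
    return int(total.group(1)), int(done.group(1))


log = (
    "[INFO ] 350 entities total, 0 previously transferred\n"
    "[INFO ] 190 entities (179958 bytes) transferred in 15.2 seconds\n"
    "[INFO ] Some entities not successfully transferred\n"
)
total, transferred = parse_transfer_summary(log)
if transferred < total:
    print("Incomplete: %d of %d entities; retry with --num_threads=1"
          % (transferred, total))
```

The sketch ignores the "previously transferred" count, so it only fits a first run like the one in the question.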

Google App Engine download_data authentication error

I have read the numerous questions on this, but found no solution that works :(
$ appcfg.py download_data --url=http://THING.appspot.com/_ah/remote_api --filename=backup1 .
08:47 PM Application: THING
08:47 PM Downloading data records.
[INFO ] Logging to bulkloader-log-20120910.204726
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20120910.204726.sql3
[INFO ] Opening database: bulkloader-results-20120910.204726.sql3
[INFO ] Connecting to THING.appspot.com/_ah/remote_api
[INFO ] Authentication Failed
So I have several questions about what's going on:
Why does this not ask me for my password, when in almost every other question I've seen it does? Is it because I've already uploaded a new version of my app and signed in?
Why do some people have to put --application='s~THING' in their command line? (It does not help me.)
I'm using a Gmail address as my admin, so presumably that means it's not related to any of the OpenID bugs given as answers to other similar questions?
I have builtins: - remote_api: on in my app.yaml (which is in this directory, hence the ".", right?). Do I need to add a handler as well?
The request for /_ah/remote_api goes to my main ("/.*") handler! Is that the cause of the problem?
How can I fix any of these things?
Edit:
Sebastian kindly pointed me in the right direction, but I now have this error:
$ appcfg.py download_data --application='s~THING' --url=http://THING.appspot.com/_ah/remote_api --filename=backup1 --kind=Article .
09:47 PM Application: s~THING (was: THING)
09:47 PM Downloading data records.
[INFO ] Logging to bulkloader-log-20120910.214744
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20120910.214744.sql3
[INFO ] Opening database: bulkloader-results-20120910.214744.sql3
[INFO ] Connecting to THING.appspot.com/_ah/remote_api
[INFO ] Downloading kinds: ['Article']
.[ERROR ] [WorkerThread-1] WorkerThread:
Traceback (most recent call last):
File "/home/me/google_appengine/google/appengine/tools/adaptive_thread_pool.py", line 176, in WorkOnItems
status, instruction = item.PerformWork(self.__thread_pool)
File "/home/me/google_appengine/google/appengine/tools/bulkloader.py", line 764, in PerformWork
transfer_time = self._TransferItem(thread_pool)
File "/home/me/google_appengine/google/appengine/tools/bulkloader.py", line 1170, in _TransferItem
self, retry_parallel=self.first)
File "/home/me/google_appengine/google/appengine/tools/bulkloader.py", line 1471, in GetEntities
results = self._QueryForPbs(query)
File "/home/me/google_appengine/google/appengine/tools/bulkloader.py", line 1442, in _QueryForPbs
raise datastore._ToDatastoreError(e)
Error: API error 4 (datastore_v3: NEED_INDEX): no matching index found.
[INFO ] An error occurred. Shutting down...
[ERROR ] Error in WorkerThread-1: API error 4 (datastore_v3: NEED_INDEX): no matching index found.
[INFO ] Have 10 entities, 0 previously transferred
[INFO ] 10 entities (12985 bytes) transferred in 1.6 seconds
I am still left with errors (see above) but the basics appear to be covered by https://developers.google.com/appengine/docs/go/tools/uploadingdata
as recommended in a comment. Thanks for that.
If I fix the other errors I will update this.
There's a small bug in the Go remote_api support. To work around it you can either add the relevant index, or use a dummy Python version to download the data. It should be fixed in a future release.

appcfg.py create_bulkloader_config returns "Authentication Failed" without even prompting me for authentication

I've got a Java appengine app with remote_api installed as per http://ikaisays.com/2010/06/10/using-the-bulkloader-with-java-app-engine/
<servlet>
  <servlet-name>RemoteApi</servlet-name>
  <servlet-class>com.google.apphosting.utils.remoteapi.RemoteApiServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>RemoteApi</servlet-name>
  <url-pattern>/remote_api</url-pattern>
</servlet-mapping>
When I go to myapp.appspot.com/remote_api with a web browser, I see the message "This request did not contain a necessary header," which I understand is expected.
But when I run appcfg.py create_bulkloader_config --url=http://APPID.appspot.com/remote_api --application=APPID --filename=config.yml from my command line (with the proper APPID) I get
C:\ag\dev>appcfg.py create_bulkloader_config --url=https://correctid.appspot.com/remote_api --application=correctid --filename=config.yml
Creating bulkloader configuration.
[INFO ] Logging to bulkloader-log-20101114.081901
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20101114.081901.sql3
[INFO ] Opening database: bulkloader-results-20101114.081901.sql3
[INFO ] Connecting to correctid.appspot.com/remote_api
[INFO ] Authentication Failed
C:\ag\dev>
I've already tried the no_cookies option, which didn't help. I also tried passing -e correctadminmail@gmail.com. Neither changed the output at all.
How can I specify my authentication parameters?
This might happen if your app is configured to use OpenID for logins - OpenID isn't compatible with remote_api.
This blog post describes the problem and a solution:
http://blog.notdot.net/2010/06/Using-remote-api-with-OpenID-authentication
The solution is in Python though - you might be able to do something equivalent in Java (or upload some Python code to a different version of your app, just for the remote api).
