Saving Trained AI models in Google Colab - artificial-intelligence

After training a Twin Delayed DDPG (TD3) agent in Google Colab for 10 hours, I downloaded the Python notebook file to continue the work on another platform. The problem, however, is that the trained model is not included when I save the notebook file, so the training was lost. How can I save the trained agent and move it to, for example, a Unity 3D environment without losing the training, so that I don't have to re-train the agent?
I sincerely appreciate any answers, comments, thoughts etc!

Store files you want to persist across sessions in Google Drive.
Here's a snippet showing how to mount your Google Drive as a FUSE filesystem in Colab:
https://colab.research.google.com/notebooks/io.ipynb#scrollTo=u22w3BFiOveA
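For the TD3 agent specifically, here is a minimal sketch of what that looks like, assuming a PyTorch implementation (the placeholder network and file names below are illustrative only):

from google.colab import drive
import torch
import torch.nn as nn

# Mount Google Drive; anything written under /content/drive survives the Colab VM being recycled.
drive.mount('/content/drive')

# Placeholder network standing in for the TD3 actor/critic; swap in your own modules.
actor = nn.Linear(8, 2)
torch.save(actor.state_dict(), '/content/drive/My Drive/td3_actor.pth')

# In a later session (or on another machine) the weights can be restored the same way:
actor.load_state_dict(torch.load('/content/drive/My Drive/td3_actor.pth'))

To get the policy into Unity you would typically export it to a portable format such as ONNX rather than move the notebook itself, but the Drive step above is what keeps the trained weights from being lost with the Colab session.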

Related

Heroku: where can I save files?

I have a Telegram bot that saves users' audio messages and photos in the repository, with only the path stored in the DB. I deployed it on PythonAnywhere and everything works.
But before that, I tried to deploy it on Heroku and ran into the problem that you can't store files there; everything has to be done through databases.
Do I understand correctly that you need to create a field in the database that stores the file itself, or are there other ways?
You may use, for example, Cloudinary. They provide 25 GB of bandwidth for free. The service is intended for pictures but works well with other files too, and it has a good API for many programming languages (not sponsored).
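If you go the Cloudinary route, the upload itself is only a few lines with their Python SDK; a rough sketch (the credentials and file name are placeholders):

import cloudinary
import cloudinary.uploader

# Placeholder credentials from the Cloudinary dashboard.
cloudinary.config(cloud_name='your-cloud', api_key='your-key', api_secret='your-secret')

# resource_type='auto' lets Cloudinary accept audio and other non-image files as well.
result = cloudinary.uploader.upload('voice_message.ogg', resource_type='auto')

# Store the returned URL (or public_id) in your database instead of the file itself.
print(result['secure_url'])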

How to deploy a web app that needs regular access to large data files

I am trying to deploy a web app I have written, but I am stuck on one element. The bulk of it is just an Angular application that interacts with a MongoDB database; that's all fine. Where I am stuck is that I need local read access to around 10 GB of files (GeoTIFF digital elevation models). These don't change and are broken down into 500 or so files. Each time my app needs geographic elevations, it needs to find the right file, read the right part of it, and return the data, the quicker the better. To reiterate, I am not serving these files, just reading data from them.
In development these files are on my machine and I have no problems, but the files seem to be too large to bundle with the Angular app (it runs out of memory) and too large to include in any backend assets folder. I've looked at two serverless cloud hosting platforms (GCP and Heroku), both of which limit the size of the deployed files to around 1 GB (if I remember right). I have considered using cloud storage for the files, but I'm worried about a performance hit, since each time I need a file it would have to be downloaded from the cloud to the application. The only solution I can think of is to use a VM-based service like Google Compute Engine with an API service that receives requests from the app and delivers back the required data, but I had hoped it could be more co-located (not least because that solution costs more $$)...
I'm new to deployment so any advice welcome.
Load your data into a GIS database, like PostGIS, and have your app query that database instead of the local raster files.
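As a rough illustration (in Python for brevity, since the question doesn't name a backend language), this is what an elevation lookup could look like once the GeoTIFFs have been imported into a PostGIS raster table; the table name, column name, and connection details below are made up:

import psycopg2

# Hypothetical setup: the GeoTIFF tiles were loaded (e.g. with raster2pgsql) into a table
# called "elevation" with a raster column "rast" in EPSG:4326.
conn = psycopg2.connect('dbname=gis user=app password=secret host=localhost')

def elevation_at(lon, lat):
    """Return the elevation value of the raster tile covering the given point."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT ST_Value(rast, ST_SetSRID(ST_MakePoint(%s, %s), 4326))
            FROM elevation
            WHERE ST_Intersects(rast, ST_SetSRID(ST_MakePoint(%s, %s), 4326))
            """,
            (lon, lat, lon, lat),
        )
        row = cur.fetchone()
        return row[0] if row else None

print(elevation_at(-3.19, 55.95))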

Google Cloud Platform Tensorboard - No dashboards are currently active

I was working with the TensorFlow Object Detection API. I managed to train it locally on my computer and got decent results. However, when I tried to replicate the same thing on GCP, I ran into several errors. Basically, I followed the official TensorFlow "running on cloud" documentation.
So this is how the bucket is laid out:
Bucket
weeddetectin-data
Train-packages
This is how I ran the training and evaluation job:
Running a multiworker training job
Running an evaluation job on cloud
I then used the following command to monitor it on TensorBoard:
tensorboard --logdir=gs://weeddetection --port=8080
I opened the dashboard using the preview feature in the console. But it says no dashboards are active for the current data set.
So, I checked on my activity page to really see if the training and evaluation job were submitted:
Training Job
Evaluation Job
It seems as if no event files are being written to your bucket.
The root cause could be that the guide you are using refers to an old version of the TensorFlow models repository.
Please try and change
--train_dir=gs:...
to
--model_dir=gs://${YOUR_BUCKET_NAME}/model
Then resend the job. Once the job is running, check the model_dir in the bucket to see whether the files are being written there.
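A quick way to check that from Python (the bucket path below is a guess based on the question; adjust it to your own model_dir):

import tensorflow as tf

# Lists whatever the training job has written so far; you should start to see
# events.out.tfevents.* files and checkpoints once the job is producing output.
# (On older TensorFlow 1.x releases the equivalent call is tf.gfile.ListDirectory.)
print(tf.io.gfile.listdir('gs://weeddetection/model'))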
Check out the gcloud ml-engine jobs documentation for additional reading.
Hope it helps!

A plea for a basic Notebook example getting data into and out of Google Cloud Datalab

I have started trying to use Google Cloud Datalab. While I understand it is a beta product, I find the docs very frustrating, to say the least.
The questions here and the lack of responses, as well as the lack of new revisions or docs over the several months the project has been available, make me wonder whether there is any commitment to the product.
A good start would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort, and it would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces together.
In addition, a better explanation of the GitHub connection process would help (see my prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large scale data transfers; these are for small scale transfers that can fit in memory in the Datalab VM).
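If you prefer plain Python over the Datalab-specific commands shown in those notebooks, the standard Google Cloud client libraries also work from a Datalab notebook for small transfers; a minimal sketch (the bucket, dataset, and file names are placeholders):

from google.cloud import bigquery, storage

# Upload a local CSV to Cloud Storage (placeholder bucket and object names).
gcs = storage.Client()
gcs.bucket('my-bucket').blob('incoming/data.csv').upload_from_filename('data.csv')

# Load the same CSV from GCS into a BigQuery table (placeholder dataset.table).
bq = bigquery.Client()
load_job = bq.load_table_from_uri(
    'gs://my-bucket/incoming/data.csv',
    'my_dataset.my_table',
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        autodetect=True,
        skip_leading_rows=1,
    ),
)
load_job.result()  # wait for the load job to finish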
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with Github, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine. Just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution - you'd be moving data from the local machine to a VM which is running the Datalab Python code (and this VM has limited storage space), and then moving it again to the real destination. Instead, you should use the cloud console or (preferably) gcloud command line on your local machine for this.

local GAE datastore does not keep data after computer shuts down

On my local machine (i.e. http://localhost:8080/), I have entered data into my GAE datastore for some entity called Article. After turning off my computer and restarting it the next day, I find the datastore empty: no entities. Is there a way to prevent this in the future?
How do I make a copy of the data in my local datastore? Also, will I be able to upload said data later into both localhost and production?
My model is ndb.
I am using Mac OS X and Python 2.7, if these matter.
I have experienced the same problem. Declaring the datastore path when running dev_appserver.py should fix it. These are the arguments I use when starting the dev_appserver:
python dev_appserver.py --high_replication --use_sqlite --datastore_path=myapp.datastore --blobstore_path=myapp_blobs
This will use sqlite and save the data in the file myapp.datastore. If you want to save it in a different directory, use --datastore_path=/path/to/myapp/myapp.datastore
I also use --blobstore_path to save my blobs in a specific directory. I have found that it is more reliable to declare which directory to save my blobs. Again, that is --blobstore_path=/path/to/myapp/blobs or whatever you would like.
Since declaring blob and datastore paths, I haven't lost any data locally. More info can be found in the App Engine documentation here:
https://developers.google.com/appengine/docs/python/tools/devserver#Using_the_Datastore
Data in the local datastore is preserved unless you start it with the -c flag to clear it, at least on the PC. You therefore probably have a different issue with temp files or permissions or something.
The local data is stored using a different method to the actual production servers, so not sure if you can make a direct backup as such. If you want to upload data to both the local and deployed servers you can use the Upload tool suite: uploading data
The bulk loader tool can upload and download data to and from your application's datastore. With just a little bit of setup, you can upload new datastore entities from CSV and XML files, and download entity data into CSV, XML, and text files. Most spreadsheet applications can export CSV files, making it easy for non-developers and other applications to produce data that can be imported into your app. You can customize the upload and download logic to use different kinds of files, or do other data processing.
So you can 'backup' by downloading the data to a file.
To load/pull data into the local development server just give it the local URL.
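As an alternative to the bulk loader CLI, for small amounts of data you can also copy entities programmatically over the same remote API endpoint. A rough sketch against the local dev server (this assumes the remote_api builtin is enabled in app.yaml, and the Article model below is a placeholder for your real one):

from google.appengine.ext import ndb
from google.appengine.ext.remote_api import remote_api_stub

class Article(ndb.Model):
    # Placeholder model; replace with your actual Article definition.
    title = ndb.StringProperty()

# Point the datastore stubs at the local dev server's /_ah/remote_api endpoint.
# The dev server does not check credentials, so a dummy auth function is enough.
remote_api_stub.ConfigureRemoteApi(None, '/_ah/remote_api',
                                   lambda: ('test@example.com', ''),
                                   'localhost:8080')

# Entities fetched here can be written to a file, or re-put against another server.
for article in Article.query():
    print(article)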
The datastore typically saves to disk when you shut down. If you turned off your computer without shutting down the server, I could see this happening.

Resources