I was working with the TensorFlow Object Detection API. I managed to train a model locally on my computer and got decent results. However, when I tried to replicate the same setup on GCP, I ran into several errors. I followed the official TensorFlow "running on cloud" documentation.
So this is how the bucket is laid out:
Bucket
weeddetectin-data
Train-packages
This is how I ran the training and evaluation job:
Running a multiworker training job
Running an evaluation job on cloud
I then used the following command to monitor the jobs in TensorBoard:
tensorboard --logdir=gs://weeddetection --port=8080
I opened the dashboard using the preview feature in the console, but it says that no dashboards are active for the current data set.
No Dashboards are active
So I checked my activity page to see whether the training and evaluation jobs had actually been submitted:
Training Job
Evaluation Job
It seems that no event files are being written to your bucket.
The root cause could be that the manual you are using refers to an old version of the TensorFlow models.
Please try changing
--train_dir=gs:...
to
--model_dir=gs://${YOUR_BUCKET_NAME}/model
Then resubmit the job. Once the job is running, check the model_dir in the bucket to see whether the files are being written there.
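A minimal sketch of that check, in Python: TensorBoard reads files whose names contain "events.out.tfevents", so filtering a bucket listing for that marker shows whether anything usable has been written. The function name and the sample object names are hypothetical; the listing itself could come from something like gsutil ls -r on the model directory.

```python
def find_event_files(object_names):
    """Return the object names that look like TensorBoard event files."""
    marker = "events.out.tfevents"
    # Only the final path component is checked, so a directory that merely
    # contains the marker in its name is not counted.
    return [name for name in object_names if marker in name.rsplit("/", 1)[-1]]


# Hypothetical listing of gs://${YOUR_BUCKET_NAME}/model
listing = [
    "model/checkpoint",
    "model/events.out.tfevents.1550000000.worker-0",
    "model/model.ckpt-0.index",
    "model/eval_0/events.out.tfevents.1550000100.eval",
]

for name in find_event_files(listing):
    print(name)
```

If the result is empty, TensorBoard has nothing to display. If it is not empty, --logdir should point at the directory that contains those files (or at one of its parents).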
See the gcloud ml-engine jobs documentation for further reading.
Hope it helps!
This could entirely be a case of me misunderstanding how Azurite works, but I can't seem to find the answers by searching.
I've installed Azurite through the VS Code extension and uploaded some data to a local blob source on my hard drive using Windows Storage Explorer; that data is now visible in the Azurite __blobstorage__ folder. I've tried initialising a new function to search over the data, but the project I'm working on specifically phrased the task as:
"Set-up local version of Storage and Cognitive Search and index a sample set of documents"
Is this possible and I'm just missing something somewhere? Or have I misunderstood the task, and you can't actually run Cognitive Search locally without at some stage attaching it to a subscription? I'm waiting for the PM to get back from annual leave, so I thought I'd carry on trying to find the answer whilst I wait, hoping someone here might be able to help me out!
I've tried hunting through both the Microsoft VS Code local development how-to guide and the Git repository for Azurite, so I'm not sure whether I'm just reading the information wrong or whether it's just not there to find.
Azure Search does not currently offer a localhost emulator. Azurite is for localhost storage emulation. It is not possible for an Azure Search Indexer to index data from a local emulator, but you can write data to Azure Search directly via the Index Docs REST APIs. You would need to write a script to read from your local storage and make an API call to index the data into a Search instance in Azure.
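As a rough sketch of the last step, the request to the Index Docs REST endpoint could be built like this in Python. The service name, index name, key and document fields are placeholders, and the api-version value is an assumption that should be checked against the version your Search service supports. Reading the documents out of Azurite is left out here; the function simply takes a list of dictionaries.

```python
import json
import urllib.request

API_VERSION = "2020-06-30"  # assumption: use the version your service supports


def build_index_request(service_name, index_name, api_key, documents):
    """Build (but do not send) a POST request that uploads documents to an index."""
    url = "https://{}.search.windows.net/indexes/{}/docs/index?api-version={}".format(
        service_name, index_name, API_VERSION
    )
    # Each document in the batch carries an action; "upload" inserts or replaces it.
    batch = {"value": [{**doc, "@search.action": "upload"} for doc in documents]}
    return urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Placeholder values; the documents must match the fields defined in your index.
request = build_index_request(
    "my-service", "docs-index", "ADMIN-KEY", [{"id": "1", "content": "hello"}]
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The index itself has to exist in the Azure Search instance before documents are pushed to it.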
I am connecting via SSH to one of my App Engine Flex instances, which runs a .NET Core application, and I get this:
Where does that ruby process (with 24% CPU usage) come from? Is it some internal Google service?
The running Ruby process is /usr/sbin/google-fluentd. This package contains the logging agent that is the basis of Stackdriver Logging, and it is written in Ruby and distributed as a gem, as explained in this document. In short, the Ruby process is using CPU because of the application's logging.
As an aside, I noticed that the screenshot you uploaded contains your account ID and project ID. I strongly suggest that you re-upload the picture without this information for security and privacy reasons.
I have started trying to use Google Cloud Datalab. While I understand it is a beta product, I find the docs very frustrating, to say the least.
The questions here and lack of responses as well as lack of new revisions or docs over the several months the project has been available make me wonder if there is any commitment to the product?
A good start would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort? It would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces and get them working together.
A better explanation of the GitHub connection process would also help (see my prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large scale data transfers; these are for small scale transfers that can fit in memory in the Datalab VM).
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with Github, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine. Just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution - you'd be moving data from the local machine to a VM which is running the Datalab Python code (and this VM has limited storage space), and then moving it again to the real destination. Instead, you should use the cloud console or (preferably) gcloud command line on your local machine for this.
I have developed an app in Twilio which I would like to run from the cloud. I tried learning about AWS and Google App Engine but am quite confused at this stage:
I have 2 questions which I hope to get your help on:
1) How can I store my scripts and database in the cloud? Right now, everything is running out of my local machine but I would like to transfer the scripts and db to another server and run my app at a predetermined time of day. What would be the best way to do this?
2) How can I write a batch file to run my app at a predetermined time of day in the cloud?
I understand this question does not include code, but I really hope someone can point me in the right direction. I have spent a lot of time trying to understand this myself but am still unsure. Thanks in advance.
Update: The application is a Twilio app that makes calls to people, the script simply applies an algorithm to make calls in a certain fashion and the database is a mysql db that provides the details of people to be called.
It is quite difficult to give an exact answer without understanding what the application is, what the DB is, or what script you wish to run.
I can give you a couple of ideas that might be helpful in such cases.
OpsWorks (http://aws.amazon.com/opsworks/) is a managed service for managing applications. You can define your stack (multiple layers such as web, workers, DB...) and the Chef recipes that should run at various points in the life of the instances in each layer (startup, shutdown, app deployment or stack modification). You can then use the ability to add instances to each layer on specific days and at specific hours to implement the functionality of running at predetermined times, as you requested.
In such a solution you can either keep some of your instances (like the DB) always on, or bootstrap them every day using the Chef recipes, restoring from a snapshot on start and creating a snapshot on shutdown.
Another AWS service that you can use is Data Pipeline (http://aws.amazon.com/datapipeline/). It is designed to move data periodically between data sources, for example from a MySQL database to Amazon Redshift, the data warehouse service. But you can also use it to trigger and run arbitrary shell scripts (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-shellcommandactivity.html), and schedule them to run under various conditions, such as every hour or day, or at specific times (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-schedules.html).
A simple path here would be just to create an EC2 instance in AWS, and put the components needed to run your app there. A thorough walk through is here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/get-set-up-for-amazon-ec2.html
Essentially you will create an EC2 virtual machine, which you can for most purposes treat just like any other Linux server. You can install MySQL on it, copy your script there, and run it. Of course whatever container or support libraries your code requires will need to be installed as well.
You don't say what OS you are using locally, but if it is Mac or Linux, you should be able to follow almost the same process to get your script running on an EC2 instance that you used on your local machine.
As you get to know AWS, there are sophisticated services you can use for deployment, infrastructure orchestration, database services, and so on. But just to get started running a script from a virtual machine should be pretty straightforward.
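On a Linux instance the usual way to run a script at a fixed time of day is a cron entry. If you would rather keep the scheduling inside the script itself, the core of it is just working out how long to wait until the next run. A minimal Python sketch of that calculation follows; the function name and the 09:30 run time are made up for illustration.

```python
from datetime import datetime, timedelta


def seconds_until_next_run(hour, minute, now):
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Example: a daily run at 09:30, measured from the current server time.
delay = seconds_until_next_run(9, 30, datetime.now())
print("next run in %.0f seconds" % delay)
```

A long-running script could sleep for that many seconds, make its calls, and repeat. Note that the calculation uses the server's local clock, so check the instance's time zone.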
I recently developed a Twilio application using Ruby on Rails for the backend and found Heroku extremely simple to set up and launch. While Heroku does cost more than AWS, I found that the time I saved using Heroku more than made up for this. As an early-stage startup, we wanted to spend our time developing important features, not "wasting" time optimizing our AWS cloud.
However, while I believe Heroku is ideal for early-stage websites/startups I do believe hosting should be reevaluated once a company reaches a certain size. At some point it becomes economically viable to devote resources into optimizing an AWS cloud solution because it will be cheaper than Heroku in the long run.
I'm running several Python (2.7) applications and constantly hit one problem: log search (from the dashboard / admin console) is not reliable. It works fine when I search for recent log entries (they are normally found), but after some period (one day, for instance) it is no longer possible to find the same record with the same search query. I just get "no results". The admin console shows that I have 1 GB of logs spanning 10-12 days, so old records should still be there to find; retention or log size limits are not the reason for this.
Specifically, I have a "cron" request that writes stats to the log every day (that's enough for me), and searching for this request always gives me only the last entry, not one entry per day of the retention period as expected.
Is this expected behaviour (I do not see clear statements about log storage behaviour in the docs, for example), or is there something to tune? For example, would it help to log less per request? Or maybe there is a more advanced use of the query language.
Please advise.
This is a known issue that has already been reported on the googleappengine issue tracker.
As an alternative you can consider reading your application logs programmatically using the Log Service API to ingest them in BigQuery, or build your own search index.
Google App Engine Developer Relations delivered a codelab at Google I/O 2012 about ingesting App Engine logs into BigQuery.
And Streak released a tool called Mache and a Chrome extension to automate this use case.
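If you go the programmatic route yourself, one piece of it is turning log records into newline-delimited JSON, which is a format BigQuery can load. The sketch below is written in Python 3 and is only an illustration: the record keys (start_time, resource, message) are assumptions about what you would extract from the Log Service API, not its actual field names, and fetching the records is not shown.

```python
import json
from datetime import datetime, timezone


def to_ndjson(records):
    """Serialize log records (dicts) as newline-delimited JSON rows."""
    lines = []
    for record in records:
        row = {
            # Epoch seconds converted to an ISO 8601 UTC timestamp string.
            "timestamp": datetime.fromtimestamp(
                record["start_time"], tz=timezone.utc
            ).isoformat(),
            "resource": record["resource"],
            "message": record["message"],
        }
        lines.append(json.dumps(row, sort_keys=True))
    return "\n".join(lines)


# Hypothetical records for the daily cron request.
sample = [
    {"start_time": 0, "resource": "/cron/stats", "message": "stats written"},
    {"start_time": 86400, "resource": "/cron/stats", "message": "stats written"},
]
print(to_ndjson(sample))
```

Once the rows are in BigQuery, finding one entry per day for the cron request becomes an ordinary query on the resource and timestamp columns.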