Is it possible to use Cognitive Search in a local Azurite environment? - azure-cognitive-search

This could entirely be a case of me misunderstanding how Azurite works, but I can't seem to find the answers by searching.
I've downloaded Azurite through the VS Code extension, and uploaded some data to a local blob source on my hard drive using Azure Storage Explorer; that's now visible in the Azurite __blobstorage__ folder. I've tried initialising a new function to try to search over the data, but the project I'm working on specifically phrased it as:
"Set-up local version of Storage and Cognitive Search and index a sample set of documents"
Is this possible and I'm just missing something somewhere? Or have I misunderstood the task, and you can't actually run Cognitive Search locally without at some stage attaching to the subscription? I'm waiting for the PM to get back from annual leave, so I thought I'd carry on trying to find the answer whilst I wait, hoping someone here might be able to help me out!
I've tried hunting through both the Microsoft VS Code Local Development How-to Guide and the Git repository for Azurite, so I'm not sure if I'm just reading the information wrong or if it's just not there to find.

Azure Cognitive Search does not currently offer a localhost emulator; Azurite is for localhost storage emulation only. An Azure Search indexer cannot index data from a local emulator, but you can write data to Azure Search directly via the Index Documents REST API. You would need to write a script that reads from your local storage and calls the API to index the data into a Search instance in Azure.
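For example, here is a minimal sketch of such a script, assuming the azure-storage-blob and azure-search-documents Python packages, a container named "docs" in Azurite, and an index named "sample-index" that already exists in your Azure Cognitive Search service (the service name, admin key, container name, and field names are placeholders):
# Read documents from local Azurite blob storage and push them into an
# Azure Cognitive Search index (the SDK wraps the Index Documents REST API).
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.storage.blob import BlobServiceClient

# Azurite's well-known development-storage connection string
AZURITE_CONN = (
    "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;"
    "AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;"
    "BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;"
)

blob_service = BlobServiceClient.from_connection_string(AZURITE_CONN)
container = blob_service.get_container_client("docs")  # placeholder container name

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # real cloud service
    index_name="sample-index",                             # placeholder index name
    credential=AzureKeyCredential("<admin-api-key>"),
)

# Build one search document per blob; the document key must only use safe characters.
batch = []
for blob in container.list_blobs():
    text = container.download_blob(blob.name).readall().decode("utf-8")
    batch.append({"id": blob.name.replace("/", "_").replace(".", "_"), "content": text})

results = search_client.upload_documents(documents=batch)
print([r.succeeded for r in results])
The index itself (with at least an "id" key field and a searchable "content" field in this sketch) still has to be created in the Azure service first, for example through the portal or the Create Index REST call.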

Related

How can the Google Cloud Datastore Emulator verify our datastore-index.xml?

We use the Google Cloud Datastore Emulator. It autogenerates indexes.yaml. But as we did with the old Google Plugin for Eclipse, we want to get missing-index messages in the local development environment, and not later at cloud deployment. So we want the Emulator to use our manually maintained datastore-indexes.xml.
How do we configure the use of a specific datastore-indexes.xml in the Google Cloud Datastore Emulator? I don't see any relevant command-line switches in the help text.
EDIT:
My answer was based on the old dev_appserver emulator, not the current one. After running some tests, it appears that the current emulator only has endpoints for a subset of the Datastore API methods, and neither index building nor export/import is available.
Leaving my previous answer to avoid repeated answers with the same wrong info:
_________
According to the docs, if autoGenerate="false" is in your datastore-indexes.xml, the development server should ignore the contents of WEB-INF/appengine-generated/datastore-indexes-auto.xml.
I think this might be what you're looking for, although I have not yet tested it.
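For reference, the documented shape of that file is along these lines (a sketch; the kind and property names are placeholders):
<?xml version="1.0" encoding="utf-8"?>
<!-- WEB-INF/datastore-indexes.xml: with autoGenerate="false" the development
     server uses only the indexes listed here and ignores the contents of
     WEB-INF/appengine-generated/datastore-indexes-auto.xml -->
<datastore-indexes autoGenerate="false">
  <datastore-index kind="Employee" ancestor="false">
    <property name="lastName" direction="asc" />
    <property name="hireDate" direction="desc" />
  </datastore-index>
</datastore-indexes>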

How do I delete/update files in Google Compute Engine

I have everything set up so that my Discord bot runs on Google Cloud, but my only problem is that I can't figure out how to update or delete files on the cloud disk drive. I've searched everywhere and can't seem to find an answer. This bugs me because now I have to completely rename my bot every time I upload it, or else I can't run it. This issue is really hurting my coding because I want to move forward, but I am very thorough and this issue will haunt me if I just leave it be.
I finally found the answer: to edit files you have to use vi, vim, or nano in the terminal, and to remove files you have to use rm. I think Google should provide a better way to access the directory and edit my disk without having to use the command line, but I doubt they will.
Please note that GCP only offers virtual infrastructure (for Compute Engine). The VM still runs a regular operating system (with certain files pre-packaged in to make sure it works with the cloud environment). Management of the operating system is still up to the user.
You can use something like App Engine if you only want to manage and update the code for your application.
Alternatively, you can also use gcloud compute scp to copy files from a local system directly to the VM.
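For example (the instance name, zone, and file names below are placeholders):
# copy an updated script from your machine to the VM
gcloud compute scp ./bot.py my-instance:~/bot.py --zone=us-central1-a
# remove an old file on the VM without opening an interactive shell
gcloud compute ssh my-instance --zone=us-central1-a --command="rm ~/old_bot.py"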

A plea for a basic Notebook example getting data into and out of Google Cloud Datalab

I have started to try to use Google Cloud Datalab. While I understand it is a beta product, I find the docs very frustrating, to say the least.
The questions here and the lack of responses, as well as the lack of new revisions or docs over the several months the project has been available, make me wonder whether there is any commitment to the product.
A good beginning would be a notebook that shows data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort, and it would get me (and others) out of this mess of trying to link the various terse docs from various products and workspaces up and working together.
In addition, a better explanation of the GitHub connection process would help (prior question).
For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb
For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb
Those are the only two storage options currently supported in Datalab (which should not be used in any event for large-scale data transfers; these are for small-scale transfers that can fit in memory in the Datalab VM).
For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with GitHub, however.
As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.
Datalab isn't running on your local machine; just the presentation part is in your browser. So if you mean the browser client machine, that wouldn't be a good solution: you'd be moving data from the local machine to the VM that runs the Datalab Python code (and that VM has limited storage space), and then moving it again to the real destination. Instead, you should use the cloud console or (preferably) the gcloud command line on your local machine for this.
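For example, from your local machine (gsutil ships with the Cloud SDK; the bucket and file names are placeholders):
# push local data up to Cloud Storage, where a Datalab notebook can read it
gsutil cp ./mydata.csv gs://my-bucket/mydata.csv
# pull results back down from Cloud Storage to your local machine
gsutil cp gs://my-bucket/results.csv ./results.csv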

Restore app-engine entities locally

Hi guys, I've dumped (made a backup of) my App Engine datastore entities following this tutorial. Now I wonder if there is a way to restore the data locally, so I can do some testing and debugging.
In Windows, the datastore is in the directory
C:\Users\UserName\AppData\Local\Temp\AppName
In OS X, this question can help you.
In this directory is stored datastore.db (the local storage); rename it (the app should not be running, and if the file is locked, kill all the Python processes).
Now go to the App Engine dashboard:
click on your app link
click on Blob Viewer (I'm assuming that you did the backup into a blobstore)
click on the file name
click Download
rename the file to datastore.db
copy it to the previous path
start the app
Remote API (as koma mentions) is the main GAE-documented approach, and it's a good approach. Alternatively, you can download the entities using the cloud download tool, write your own store reader/deserializer, and execute it within your dev server local instance: http://gbayer.com/big-data/app-engine-datastore-how-to-efficiently-export-your-data. Read the part about the New Approach...
While these options are not automatic and require engineering, I really wanted to point out the side effect of doing this: We have been facing performance issues in the local development server for months now, specifically when the datastore has more than 1,000 entities with over 50 indexes. Just search for "require_indexes slow" and you'll see what I'm talking about.
I'm sure you have a solid reason to import lots of data locally for testing and debugging; I just wanted to let you know that your application will perform extremely slowly, and debug mode will be impossibly slow; we can't even use debug mode with our setup anymore.
If you want to get some test data into your local db, you could copy some using the Remote API.
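A rough sketch of that route, using the bulkloader over the Remote API and assuming the remote_api handler is enabled in your app and the local dev server is listening on port 8080 (the app id and filename are placeholders):
# pull entities from the deployed app
appcfg.py download_data --url=http://your-app-id.appspot.com/_ah/remote_api --filename=entities.dump
# push them into the local development datastore
appcfg.py upload_data --url=http://localhost:8080/_ah/remote_api --filename=entities.dump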

My GAE python development datastore is never persisted to a file

I have just started using GAE (Python 2.7, SDK 1.6.4). I have set up a simple test project using PyDev (latest version) in Eclipse (Indigo) on Windows XP (SP3).
It all works fine: my app can record data in the datastore and the blobstore and then retrieve it, but when I stop the development server and start it again, the data in the datastore is lost. This is not the case for the blobstore, which retains blobs fine, and I can see the blobstore folder that gets created in C:\Temp.
I did the sensible thing and looked back through old posts, and found that most people who have this problem solve it by changing the location of the datastore file, so I used the following parameters:
--datastore_path="${workspace_loc}/myproject/datastore"
--blobstore_path="${workspace_loc}/myproject/blobstore"
"${workspace_loc}/myproject/src"
I moved the blobstore at the same time, as you can see. The blobstore still works, and the blobstore folder is now created in the myproject folder as expected. The datastore file is still not created, however, and when I stop and restart the development server the data is still lost.
The dev server startup logs include the following entry
WARNING 2012-04-20 10:49:04,513 datastore_file_stub.py:513] Could not
read datastore data from C:\myworkspace\myproject\datastore
So I know it is trying to create the datastore in the correct place.
Finally, I lifted the whole Eclipse workspace folder and copied it to another computer with exactly the same setup, except that it is running Windows 7 instead of Windows XP. Everything works fine there: both the datastore file and the blobstore folder are now created where I expect them to be.
I have set up Eclipse, Python, GAE, my project, and my Eclipse launch file in exactly the same way on two computers; it works on one and not the other. Maybe XP has something to do with it, but to be honest I think that's unlikely.
The only other clue I have come up with is that a recent change to the GAE development server stopped writing to the datastore file after every change, and it now only flushes on exit; this problem may be closely related to mine:
App Engine local datastore content does not persist
However, adding the following to my code did not help at all.
from google.appengine.tools import dev_appserver
import atexit
atexit.register(dev_appserver.TearDownStubs)
So it's not down to an incorrect termination sequence either, as far as I can tell, although it may be that I just added it in the wrong place (I'm new to Python).
Anyway, I am stumped and would be really grateful for any suggestions you guys can come up with.
It's probably http://code.google.com/p/googleappengine/issues/detail?id=7244, which is a bug. Hopefully a fix will be available soon.
Did you try:
--storage_path=...
Path at which all local files (such as the Datastore, Blobstore files, Google Cloud Storage files, logs, etc.) will be stored, unless overridden by --datastore_path, --blobstore_path, --logs_path, etc.
Found at https://developers.google.com/appengine/docs/python/tools/devserver?csw=1
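For example, a single flag in place of the separate datastore and blobstore paths (the directories below are placeholders):
dev_appserver.py --storage_path=c:\gae_storage c:\myworkspace\myproject\src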
