post_import_function in App Engine bulkuploader yaml - google-app-engine

I am trying to use post_import_function while uploading data using bulkloader.yaml. As per this link, Using post_import_function in App Engine bulkuploader yaml , I am using type google.appengine.api.datastore.Entity for entity operations. As in the link, this is a subclass of 'dict'. However I am not sure how do I apply methods to this entity.
My code looks like (I am using Geomodel):
def post_import_restaurants(input_dict, instance, bulkload_state_copy):
lat = instance['lat']
lng = instance['lng']
latlng = "%s,%s" % (lat,lng)
instance['location'] = db.GeoPt(latlng)
instance.update_location()
return instance
instance.update_location(), is where I am having issues. And I am not sure how to write this statement.

You can't add methods to an Entity. Just inline the code, or write it as a separate function that you pass the entity to.

Related

How to reference custom Java class in Fusion Solr Javascript Index Stage?

In Fusion's Javascript Indexing Stage, we can import Java classes and run them in the javascript such as this :
var imports = new JavaImporter(java.lang.String);
with (imports) {
var name = new String("foo"); ...
}
If we have customized complex Java classes, how to include the compile jar with Fusion so that the class can be imported in Javascript Indexing Stages for use?
And where can we store configuration values for the Javascript Indexing Stage to look up and how to retrieve them?
I'm thinking of something like this:
var imports = new JavaImporter(mycompany.com.custompkg.SomeParser);
with (imports) {
var some_config = ResourceManager.GetString("key");
var sp = new SomeParser(some_config); ...
}
Regards,
Kelvin
Starting in Fusion 4.x The API and Connectors started using a common location for jars i.e. apps/libs . This is a reasonable place to put custom jars but the services must be told about the new jars as well. That's done in two places
/jetty/connectors-classic/webapps/connectors-extra-classpath.txt
./jetty/api/webapps/api-extra-classpath.txt
Also, index documents can get processed by the api service so even if the jar is only used for indexing, register with both classpaths. Finally, bounce the services.
Put the Java class file, as a jar file, in $FUSION_HOME/apps/jetty/api/webapps/api/WEB-INF/lib/.
I used this to access my custom class.
var SomeParser = Java.type('mycompany.com.custompkg.SomeParser');

Parsing Swagger JSON data and storing it in .net class

I want to parse Swagger data from the JSON I get from {service}/swagger/docs/v1 into dynamically generated .NET class.
The problem I am facing is that different APIs can have different number of parameters and operations. How do I dynamically parse Swagger JSON data for different services?
My end result should be list of all APIs and it's operations in a variable on which I can perform search easily.
Did you ever find an answer for this? Today I wanted to do the same thing, so I used the AutoRest open source project from MSFT, https://github.com/Azure/autorest. While it looks like it's designed for generating client code (code to consume the API documented by your swagger document), at some point on the way producing this code it had to of done exactly what you asked in your question - parse the Swagger file and understand the operations, inputs and outputs the API supports.
In fact we can get at this information - AutoRest publically exposes this information.
So use nuget to install AutoRest. Then add a reference to AutoRest.core and AutoRest.Model.Swagger. So far I've just simply gone for:
using Microsoft.Rest.Generator;
using Microsoft.Rest.Generator.Utilities;
using System.IO;
...
var settings = new Settings();
settings.Modeler = "Swagger";
var mfs = new MemoryFileSystem();
mfs.WriteFile("AutoRest.json", File.ReadAllText("AutoRest.json"));
mfs.WriteFile("Swagger.json", File.ReadAllText("Swagger.json"));
settings.FileSystem = mfs;
var b = System.IO.File.Exists("AutoRest.json");
settings.Input = "Swagger.json";
Modeler modeler = Microsoft.Rest.Generator.Extensibility.ExtensionsLoader.GetModeler(settings);
Microsoft.Rest.Generator.ClientModel.ServiceClient serviceClient;
try
{
serviceClient = modeler.Build();
}
catch (Exception exception)
{
throw new Exception(String.Format("Something nasty hit the fan: {0}", exception.Message));
}
The swagger document you want to parse is called Swagger.json and is in your bin directory. The AutoRest.json file you can grab from their GitHub (https://github.com/Azure/autorest/tree/master/AutoRest/AutoRest.Core.Tests/Resource). I'm not 100% sure how it's used, but it seems it's needed to inform the tool about what is supports. Both JSON files need to be in your bin.
The serviceClient object is what you want. It will contain information about the methods, model types, method groups
Let me know if this works. You can try it with their resource files. I used their ExtensionLoaderTests for reference when I was playing around(https://github.com/Azure/autorest/blob/master/AutoRest/AutoRest.Core.Tests/ExtensionsLoaderTests.cs).
(Also thank you to the Denis, an author of AutoRest)
If still a question you can use Swagger Parser library:
https://github.com/swagger-api/swagger-parser
as simple as:
// parse a swagger description from the petstore and get the result
SwaggerParseResult result = new OpenAPIParser().readLocation("https://petstore3.swagger.io/api/v3/openapi.json", null, null);

Salesforce Metadata apis

I want to rerieve list of Metadata Component's like ApexClass using Salesforce Metadata API's.
I'm getting list of all the Apex Classes(total no is 2246) that are on the Salesforce using the following Code and its taking too much time to retrieve these file names:
ListMetadataQuery query = new ListMetadataQuery();
query.type = "ApexClass";
double asOfVersion = 23.0;
// Assume that the SOAP binding has already been established.
FileProperties[] lmr = metadataService.listMetadata(
new ListMetadataQuery[] { query }, asOfVersion);
if (lmr != null)
{
foreach(FileProperties n in lmr)
{
string filename = n.fileName;
}
}
My requirement is to get list of Metadata Components(Apex Classes) which are developed by my organizasion only so that i can get the Salesforce Metadata Components which are relevant to me and possibly can save my time by not getting all the classes.
How can I Achieve this?
Reply as soon as possible.
Thanks in advance.
I've not used the meta-data API directly, but I'd suggest either trying to filter on the created by field, or use a prefixed name on your classes so you can filter on that.
Not sure if filters are possible though! As for speed, my experience of using the Meta-Data API via Eclipse is that it's always pretty slow and there's not much you can do about it!

How to pass a google mapreduce parameter to done_callback

I'm having trouble setting a parameter when kicking off a mapreduce via start_map so I can access it in done_callback. Numerous things I've read imply that it's possible, but somehow I've not got the earth-moon-stars properly aligned. Ultimately, what I'm trying to accomplish is to delete the temporary blob I created for the mapreduce job.
Here's how I kick it off:
mrID = control.start_map(
"Find friends",
"findfriendshandler.findFriendHandler",
"mapreduce.input_readers.BlobstoreLineInputReader",
{"blob_keys": blobKey},
shard_count=7,
mapreduce_parameters={'done_callback': '/fnfrdone','blobKey': blobKey})
In done_callback, the context object isn't available:
class FindFriendsDoneHandler(webapp.RequestHandler):
def post(self):
ctx = context.get()
if ctx is not None:
params = ctx.mapreduce_spec.mapper.params
try:
blobKey = params['blobKey']
logging.info(['BLOBKEY ' + blobKey])
except KeyError:
logging.info('blobKey key not found in params')
else:
logging.info('context.get did not work') #THIS IS WHAT GETS OUTPUT
Thanks!
EDIT: It seems like there may be more than one MR library, so I wanted to include my various imports:
from mapreduce import control
from mapreduce import operation as op
from mapreduce import context
from mapreduce import model
Below is the code I used in my done_callback handler to retrieve my blobKey user parameter:
class FindFriendsDoneHandler(webapp.RequestHandler):
mrID = self.request.headers['Mapreduce-Id']
try:
mapreduceState = MapreduceState.get_by_key_name(mrID)
mrSpec = mapreduceState.mapreduce_spec
jsonSpec = mrSpec.to_json()
jsonParams = jsonSpec['params']
blobKey = jsonParams['blobKey']
blobInfo = BlobInfo.get(blobKey)
blobInfo.delete()
logging.info('Temp blob deleted successfully for mapreduce:' + mrID)
except:
logging.warning('Unable to delete temp blob for mapreduce:' + mrID)
This uses the mapreduce ID passed into done callback via the header to retrieve the mapreduce state model object from the mapreduce state table. The model stores any user params sent via start_map in a mapreduce_spec property which is in json format.
Note that MR, itself, actually stores the blob_key elsewhere in mapreduce_spec.
Thanks again to #Nick for pointing me to the model.py source file.
I'd love to hear if there's a simpler way to get at MR user params...
Context is only available to mappers/reducers - it's largely concerned with things that don't make sense outside the context of one. As you can see from the source, however, the "Mapreduce-Id" header is set, from which you can get the ID of the mapreduce job.
You shouldn't have to do your own cleanup, though - mapreduce has a handler that does it for you.

Google AppEngine datastore config: reusable?

The documentation about datastore config objects confuses me:
"A configuration object can be used any number of times. You must create a separate configuration object for each datastore call that uses it."
(from AppEngine doc)
So can I do something like this:
config = db.create_config(deadline=5)
db.put(someModels, config=config)
db.delete(someKeys, config=config)
Or do I have to do something like this:
config = db.create_config(deadline=5)
db.put(someModels, config=config)
config = db.create_config(deadline=5)
db.delete(someKeys, config=config)
?
Thanks
That is a left-over from when config options were changed by creating a RPC. Each RPC could be used only once. The new datastore Configuration objects can be used multiple times; parameters are now read from them and passed on.
For reference, when settings were passed by creating RPC objects the docs read:
An RPC object can only be used once. You must create a separate RPC object for each datastore call that uses it.

Resources