How to send custom DocumentOperation to DocumentProcessing pipeline from a Processor? - vespa

Scenario: I've been stuck on this for way to long and I think solution might be easy but I just can't see it, this is the scenario:
cURL POST to http://localhost:8080/my_imports (raw JSON data on body)
->
MyImportsCustomHandler (extends ThreadedHttpRequestHandler [Validations]
->
MyObjectProcessor (extends Processor) [JSON deserialize and data massage]
->
MyFirstDocumentProcessor (extends DocumentProcessor) [Set some fields and save]
Problem is that execution never reaches MyFirstDocumentProcessor, likely because request didn't started from the document_api endpoints (intentionaly).
There are no errors thrown, just processing route never reaches the document processor chain, I think it should because on MyObjectProcessor I'm doing:
DocumentType type =
localDocHandler.getDocumentTypeManager().getDocumentType("my_doc");
DocumentId id = new DocumentId("id:default:my_doc::2");
Document document = new Document(type, id);
DocumentPut docPut = new DocumentPut(document);
Processing proc = com.yahoo.docproc.Processing.of(docPut);
I got this idea from here: https://github.com/vespa-engine/vespa/blob/master/docproc/src/test/java/com/yahoo/docproc/util/SplitterJoinerTestCase.java
but on that test I see this line splitter.process(p);, which I'm not able to find a suitable replacement that works inside a Processor, in that context I only have the Request, Execution and DocumentProcessingHandler
I hope somebody versed on Vespa con shine some light on this, is just the last hop on the processing chain that I can't bridge :|

To write documents from Java code, you need to use the Document Access API:
http://docs.vespa.ai/documentation/document-api-guide.html#document-access
A working solution is in https://github.com/vespa-engine/sample-apps/pull/44

Related

How to use sagemaker java API to invoke a endpoint?

I was trying to run this example: tensorflow_abalone_age_predictor_using_layers
, in which abalone_predictor.predict(tensor_proto) is used to call the endpoint and make the prediction. I was trying to use the java API AmazonSageMakerRuntime to achieve the same effect, but I don't know how to specify the body and contentType for the InvokeEndPointRequest. The document is not in detailed abou the format of the request. Greatly appreciate any piece of help!
I have not tried the specific example but the below snippet should help you to invoke the endpoint for predictions
InvokeEndpointRequest invokeEndpointRequest = new InvokeEndpointRequest();
invokeEndpointRequest.setContentType("application/x-image");
ByteBuffer buf = ByteBuffer.wrap(image);
invokeEndpointRequest.setBody(buf);
invokeEndpointRequest.setEndpointName(endpointName);
invokeEndpointRequest.setAccept("application/json");
AmazonSageMakerRuntime amazonSageMaker = AmazonSageMakerRuntimeClientBuilder.defaultClient();
InvokeEndpointResult invokeEndpointResult = amazonSageMaker.invokeEndpoint(invokeEndpointRequest);
I see the example you are trying creates a TensorProto and passes to the endpoint request. You can try to create a TensorProto of your invoke request and set as the body
Just figured I can override the input_fn to convert the request body string to something can be fed to the model, in this case a TensorProto object.

Make a solr query from Geotools through geoserver

I come here because I am searching (like the title mentionned) to do a query from geotools (through geoserver) to get feature from a solr index.
To be more precise :
I saw on geoserver user manual that i can do query on solr like this in http :
http://localhost:8080/geoserver/wfs?service=WFS&version=1.1.0&request=GetFeature
&typeName=mySolrLayer
&format="xxx"
&viewparams=q:"mySolrQuery"
The important part on this URL is the viewparams that I want to use directly from geotools.
I have already test this case (this is a part of my code):
url = new URL(
"http://localhost:8080/geoserver/wfs?request=GetCapabilities&VERSION=1.1.0";
);
Map<String, String> param = new HashMap();
params.put(WFSDataStoreFactory.URL.key, url);
param.put("viewparams","q:myquery");
Hints hints = new Hints();
hints.put(Hints.VIRTUAL_TABLE_PARAMETERS, viewParams);
query.setHints(hints);
...
featureSource.getFeatures(query);
But here, it seems to doesn't work, the url send to geoserver is a normal "GET FEATURE" request without the viewparams parameter.
I tried this with geotools-12.2 ; geotools-13.2 and geotools-15-SNAPSHOT but I didn't succeed to pass the query, geoserver send me all the feature in my database and doesn't take "viewparams" as a param.
I need to do it like this because actually the query come from another program and I would easily communicate this query to another part of the project...
If someone can help me ?
There doesn't currently seem to be a way to do this in the GeoTool's WFSDatastore implementations as the GetFeature request is constructed from the URL provided by the getCapabilities document. This is as the standard requires but it may be worth making a feature enhancement request to allow clients to override this string (as QGIS does for example) which would let you specify the additional parameter in your base URL which would then be passed to the server as you need.
Unfortunately the WFS module lives in Unsupported land at present so unless you have resources to work on this issue yourself and can provide a PR to implement it there is not a great chance of it being implemented.

Parsing Swagger JSON data and storing it in .net class

I want to parse Swagger data from the JSON I get from {service}/swagger/docs/v1 into dynamically generated .NET class.
The problem I am facing is that different APIs can have different number of parameters and operations. How do I dynamically parse Swagger JSON data for different services?
My end result should be list of all APIs and it's operations in a variable on which I can perform search easily.
Did you ever find an answer for this? Today I wanted to do the same thing, so I used the AutoRest open source project from MSFT, https://github.com/Azure/autorest. While it looks like it's designed for generating client code (code to consume the API documented by your swagger document), at some point on the way producing this code it had to of done exactly what you asked in your question - parse the Swagger file and understand the operations, inputs and outputs the API supports.
In fact we can get at this information - AutoRest publically exposes this information.
So use nuget to install AutoRest. Then add a reference to AutoRest.core and AutoRest.Model.Swagger. So far I've just simply gone for:
using Microsoft.Rest.Generator;
using Microsoft.Rest.Generator.Utilities;
using System.IO;
...
var settings = new Settings();
settings.Modeler = "Swagger";
var mfs = new MemoryFileSystem();
mfs.WriteFile("AutoRest.json", File.ReadAllText("AutoRest.json"));
mfs.WriteFile("Swagger.json", File.ReadAllText("Swagger.json"));
settings.FileSystem = mfs;
var b = System.IO.File.Exists("AutoRest.json");
settings.Input = "Swagger.json";
Modeler modeler = Microsoft.Rest.Generator.Extensibility.ExtensionsLoader.GetModeler(settings);
Microsoft.Rest.Generator.ClientModel.ServiceClient serviceClient;
try
{
serviceClient = modeler.Build();
}
catch (Exception exception)
{
throw new Exception(String.Format("Something nasty hit the fan: {0}", exception.Message));
}
The swagger document you want to parse is called Swagger.json and is in your bin directory. The AutoRest.json file you can grab from their GitHub (https://github.com/Azure/autorest/tree/master/AutoRest/AutoRest.Core.Tests/Resource). I'm not 100% sure how it's used, but it seems it's needed to inform the tool about what is supports. Both JSON files need to be in your bin.
The serviceClient object is what you want. It will contain information about the methods, model types, method groups
Let me know if this works. You can try it with their resource files. I used their ExtensionLoaderTests for reference when I was playing around(https://github.com/Azure/autorest/blob/master/AutoRest/AutoRest.Core.Tests/ExtensionsLoaderTests.cs).
(Also thank you to the Denis, an author of AutoRest)
If still a question you can use Swagger Parser library:
https://github.com/swagger-api/swagger-parser
as simple as:
// parse a swagger description from the petstore and get the result
SwaggerParseResult result = new OpenAPIParser().readLocation("https://petstore3.swagger.io/api/v3/openapi.json", null, null);

Jmeter: How to create an array in bean shell post processor and make it available in other thread groups?

Does anyone knows how to create an array in bean shell post processor and make it available in other thread groups?
I've been searching for a while and i'm not managing to solve this.
Thanks
There is no need to do it through writing and reading files. Beanshell extension mechanism is smart enough to handle it without interim 3rd party entities.
Short answer: bsh.shared namespace
Long answer:
assuming following Test Plan Structure:
Thread Group 1
Beanshell Sampler 1
Thread Group 2
Beahshell Sampler 2
Put following Beanshell code to Beanshell Sampler 1
Map map = new HashMap();
map.put("somekey","somevalue");
bsh.shared.my_cool_map = map;
And the following to Beanshell Sampler 2
Map map = bsh.shared.my_cool_map;
log.info(map.get("somekey"));
Run it and look into jmeter.log file. You should see something like
2014/01/04 10:32:09 INFO - jmeter.util.BeanShellTestElement: somevalue
Voila.
References:
Sharing variables (from JMeter Best Practices)
How to use BeanShell: JMeter's favorite built-in component guide
Following some advice, here's how i did it:
The HTTP request has a Regular Expressions Extractor to extract the XPTO variable from the request. Then, a BeanShell PostProcessor saves data to a CSV file:
String xpto_str = vars.get("XPTO");
log.info("Variable is: " + xpto_str);
f = new FileOutputStream("/tmp/xptos.csv", true);
p = new PrintStream(f);
this.interpreter.setOut(p);
print(xpto_str + ",");
f.close();
Then, in second thread group, i added a CSV Data Set Config, in which i read the variable from the file. This is really easy, just read the guide (http://jmeter.apache.org/usermanual/component_reference.html#CSV_Data_Set_Config).
Thanks

AppEngine - Optimize read/write count on POST request

I need to optimize the read/write count for a POST request that I'm using.
Some info about the request:
The user sends a JSON array of ~100 items
The servlet needs to check if any of the received items is newer then its counterpart in the datastore using a single long attribute
I'm using JDO
what i currently do is (pseudo code):
foreach(item : json.items) {
storedItem = persistenceManager.getObjectById(item.key);
if(item.long > storedItem.long) {
// Update storedItem
}
}
Which obviously results in ~100 read requests per request.
What is the best way to reduce the read count for this logic? Using JDO Query? I read that using "IN"-Queries simply results in multiple queries executed after another, so I don't think that would help me :(
There also is PersistenceManager.getObjectsById(Collection). Does that help in any way? Can't find any documentation of how many requests this will issue.
I think you can use below call to do a batch get:
Query q = pm.newQuery("select from " + Content.class.getName() + " where contentKey == :contentKeys");
Something like above query would return all objects you need.
And you can handle all the rest from here.
Best bet is
pm.getObjectsById(ids);
since that is intended for getting multiple objects in a call (particularly since you have the ids, hence keys). Certainly current code (2.0.1 and later) ought to do a single datastore call for getEntities(). See this issue

Resources