Dynamic file in request body - file

I am playing with Karate to test one resource that accepts a date that cannot be in the past.
Scenario: Schedule one
Given path '/schedules'
And request read('today_at_19h30.json')
When method post
Then status 201
I have created some JSON files (mostly duplicates but with subtle changes) for each scenario. But I cannot seriously imagine changing the date in all of them each time I want to run my tests.
Using a date in the far future is not a good idea because there is some manual verification, and a far-future date would force us to click "next" too many times.
Is there a way to include a variable or an expression in a file?
Thanks

There are multiple ways to "clobber" JSON data in Karate. One way is to just use a JS expression. For example:
* def foo = { a: 1 }
* foo.a = 2
* match foo == { a: 2 }
For your specific use case, I suspect embedded expressions will be the more elegant way to do it. The great thing about embedded expressions is that they work in combination with the read() API.
For example, where the contents of the file test.json are { "today": "#(today)" }:
Background:
* def getToday =
  """
  function() {
    var SimpleDateFormat = Java.type('java.text.SimpleDateFormat');
    var sdf = new SimpleDateFormat('yyyy/MM/dd');
    var date = new java.util.Date();
    return sdf.format(date);
  }
  """

Scenario:
* def today = getToday()
* def foo = read('test.json')
* print foo
Which results in:
Running com.intuit.karate.junit4.dev.TestRunner
20:19:20.957 [main] INFO com.intuit.karate - [print] {
"today": "2020/01/22"
}
By the way, if the getToday function has been defined, you can even do this: { "today": "#(getToday())" }, which may give you more ideas.
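For instance, building on the example above (a small sketch; it assumes getToday is still defined in the Background), test.json could contain { "today": "#(getToday())" } and the Scenario would shrink to:

Scenario:
* def foo = read('test.json')
* print foo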

Related

Karate: can we make multiple calls from scenario outline using an array variable? [duplicate]

I currently use junit5, wiremock and restassured for my integration tests. Karate looks very promising, yet I am struggling a bit with the setup of data-driven tests, as I need to prepare nested data structures which, in the current setup, look like the following:
abstract class StationRequests(val stations: Collection<String>): ArgumentsProvider {
    override fun provideArguments(context: ExtensionContext): java.util.stream.Stream<out Arguments> {
        val now = LocalDateTime.now()
        val samples = mutableListOf<Arguments>()
        stations.forEach { station ->
            Subscription.values().forEach { subscription ->
                listOf(*Device.values(), null).forEach { device ->
                    Stream.Protocol.values().forEach { protocol ->
                        listOf(null, now.minusMinutes(5), now.minusHours(2), now.minusDays(1)).forEach { startTime ->
                            samples.add(Arguments.of(subscription, device, station, protocol, startTime))
                        }
                    }
                }
            }
        }
        return java.util.stream.Stream.of(*samples.toTypedArray())
    }
}
Is there any preferred way to set up such nested data structures with Karate? I initially thought about defining 5 different arrays with sample values for subscription, device, station, protocol and startTime, and then combining and merging them into a single array to be used in the Examples: section.
I have not succeeded so far, though, and I am wondering if there is a better way to prepare such nested data-driven tests.
I don't recommend nesting unless absolutely necessary. You may be able to "flatten" your permutations into a single table, something like this: https://github.com/intuit/karate/issues/661#issue-402624580
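For example, a flattened Examples table (just a sketch; the column names mirror the parameters in the question and the values are invented) could look like this:

Scenario Outline: one permutation per row
* print 'testing', '<station>', '<subscription>', '<device>', '<protocol>', '<startTime>'

Examples:
| station | subscription | device | protocol | startTime           |
| alpha   | FREE         | IOS    | HLS      |                     |
| alpha   | FREE         | IOS    | DASH     | 2021-06-01T10:00:00 |
| alpha   | PREMIUM      |        | DASH     | 2021-05-31T12:00:00 |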
That said, look out for the alternate option to Examples: which just might work for your case: https://github.com/intuit/karate#data-driven-features
EDIT: In version 1.3.0, a new @setup life cycle was introduced that changes the example below a bit.
Here's a simple example:
Feature:
Scenario:
* def data = [{ rows: [{a: 1},{a: 2}] }, { rows: [{a: 3},{a: 4}] }]
* call read('called.feature@one') data
and this is called.feature:

@ignore
Feature:

@one
Scenario:
* print 'one:', __loop
* call read('called.feature@two') rows

@two
Scenario:
* print 'two:', __loop
* print 'value of a:', a
This is how it looks in the new HTML report (which is in 0.9.6.RC2 and may need more fine-tuning), and it shows off how Karate can support "nesting" even in the report, which Cucumber cannot do. Maybe you can provide feedback and let us know if it is ready for release :)

Karate callonce: don't want to outsource init vars - getting endless loop [duplicate]

I need variables to be re-used (shared) across scenarios in the same feature file.
Please find below the working way that I'm currently using.
The problem here is that I have to outsource the shared variables to another feature file, which seems quite cumbersome for such a simple task.
I was wondering if I could define the re-usable variables in an ignored scenario in the same feature file and callonce it from "myself" (the same feature file), as follows:
File my.feature:
Feature: My
Background:
* url myUrl
# call once explicitly the scenario tagged with '@init'
* def vars = callonce read('my.feature@init')

@ignore @init
Scenario: Return shared variables for all scenarios
* def id = uuid()
# the non-ignored scenarios follow below this line...
Problem: unfortunately this leads to an endless loop with many errors. It seems that calling back into the same file that invokes the callonce runs the Background, including the callonce, again.
Is the idea shown above possible, and if yes, where is my mistake?
Or could you callonce without processing the Background again? Something like adding an argument to callonce, or using karate.callSingle(file, dontProcessBackground=true)?
Many thanks.
--
The following works (but is cumbersome):
File my.feature:
Feature: My
Background:
* url myUrl
* def vars = callonce read('my.init.feature')
@one
Scenario: One
* def payload = `{ "id" : "${vars.id}" }`
* request payload
* method post
* status 200
* match $.value == 'one'
@two
Scenario: Two
* def payload = `{ "id" : "${vars.id}" }`
* request payload
* method post
* status 200
* match $.value == 'two'
File my.init.feature:
#ignore
Feature: Create variables to be used across multiple scenarios
Scenario: Return shared variables for all scenarios
* def id = uuid()
... where uuid() is shared in karate-config.js:
function fn() {
    var uuid = () => { return String(java.util.UUID.randomUUID().toString()) };
    // ...
    var config = { uuid: uuid };
    return config;
}
I have to outsource the shared variables to another feature file
There is nothing wrong with using a second file for re-usable stuff. All programming languages work this way.
If this is such an inconvenience, kindly contribute code to Karate; it is an open-source project.
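As an aside, since karate.callSingle() came up in the question: it can also be invoked from karate-config.js itself, which runs the init feature exactly once for the entire test run and sidesteps the Background re-entry problem. A rough sketch (the classpath: location of the init feature is an assumption):

function fn() {
    var uuid = () => { return String(java.util.UUID.randomUUID().toString()) };
    var config = { uuid: uuid };
    // my.init.feature is executed only once for the whole run; passing config
    // as the call argument makes uuid() visible inside it, and whatever it
    // defines (e.g. id) comes back as config.vars
    config.vars = karate.callSingle('classpath:my.init.feature', config);
    return config;
}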

Google Sheet Coinmarketcap requesting 1 single importxml instead of 1 request for each coin

I am developing a Google Sheets spreadsheet that pulls data from CoinMarketCap with a script I've been trying to write.
I am a complete noob at coding.
I use the IMPORTXML function (I need to refresh the latest price for each coin, about 100 coins) in this script:
function CryptoRefresher() {
    var spreadsheet = SpreadsheetApp.getActive();
    var queryString = Math.random();
    var link1 = "C";
    var xpath = "D";
    var destination = "E";
    var Direction = SpreadsheetApp.Direction;
    var NumeroRighe = spreadsheet.getRange("B" + (spreadsheet.getLastRow() + 1)).getNextDataCell(Direction.UP).getRow();
    for (var i = 2; i <= NumeroRighe; i++) {
        var cellFunction1 = '=IMPORTXML("' + SpreadsheetApp.getActiveSheet().getRange(link1 + i).getValue() + '?' + queryString + '", "' + SpreadsheetApp.getActiveSheet().getRange(xpath + i).getValue() + '")';
        SpreadsheetApp.getActiveSheet().getRange(destination + i).setValue(cellFunction1);
    }
}
Example data:
Cell B2 = "bitcoin"
Cell C2 = "https://coinmarketcap.com/currencies/Bitcoin"
Cell D2 = "//div[@class='priceValue___11gHJ']"
Cell E2 is the destination and will receive the Bitcoin price
The problem is that it's really slow because it requests one coin at a time.
Question: is there a way to send ALL THE COIN REQUESTS in one single IMPORTXML call?
I'd like to collect all the coin names in column C (each cell of column C has one different and unique coin name) that I am watching, and make one single call to speed up the process.
(Is there a way to create an array, a list of the coin names, and do one single call to CoinMarketCap?)
I really can't figure that out and I hope what I'm asking is clear!
Thank you!
Alessandro
Given the structure of the webpage, it is currently not possible to pull multiple arbitrary currencies from the CoinMarketCap site with a single IMPORTXML call.
However, they have a convenient API that can do exactly that; please see the references below:
CoinMarketCap API / Cryptocurrency
And this should get you started in pulling information from the API:
Pulling Currency Data to Google Sheets
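To give an idea of what that looks like, here is a minimal Apps Script sketch (the CMC_API_KEY script property and the symbol list are placeholders you would replace; the quotes/latest endpoint is the one documented by CoinMarketCap) that fetches all coins in one request:

function fetchCoinPrices() {
    var symbols = ['BTC', 'ETH', 'ADA']; // build this list from your sheet instead
    var apiKey = PropertiesService.getScriptProperties().getProperty('CMC_API_KEY');
    var url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/quotes/latest'
            + '?symbol=' + symbols.join(',');
    var response = UrlFetchApp.fetch(url, {
        headers: { 'X-CMC_PRO_API_KEY': apiKey }
    });
    var data = JSON.parse(response.getContentText()).data;
    // one request returns every symbol; write each price back to the sheet as needed
    symbols.forEach(function (s) {
        Logger.log(s + ': ' + data[s].quote.USD.price);
    });
}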
I'd suggest using a dedicated service to retrieve the data; for instance, this request will give you the data you need without any parsing or signing up for third-party services:
=IMPORTDATA("https://cryptoprices.cc/BTC")
Trying to parse a complex web page under active development is just prone to fail at some point.
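If you keep the ticker symbols in a column of their own (say column F, which is an assumption since the sheet currently stores names and URLs), you can also drag a single formula down instead of editing each row:

=IMPORTDATA("https://cryptoprices.cc/"&F2)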
As an alternative, go straight to the source by signing up for the CoinMarketCap API (already mentioned above) to get more up-to-date data. You can sign up for the free tier (333 requests/day) at https://pro.coinmarketcap.com/signup/
Wow! Cryptoprices.cc is a great service.
But some cryptocurrencies are not read correctly.
If you change the formula and add a trailing /, the price is updated and is correct.

Best practices to execute faster a CasperJS script that scrapes thousands of pages

I've written a CasperJS script that works very well except that it takes a (very very) long time to scrape pages.
In a nutshell, here's the pseudo code:
my functions to scrape the elements
my casper.start() to start the navigation and log in
casper.then() where I loop through an array and store my links
casper.thenOpen() to open each link and call my functions to scrape.
It works perfectly (and fast enough) for scraping a bunch of links. But when it comes to thousands (right now I'm running the script with an array of 100K links), the execution time is endless: the first 10K links were scraped in 3h54m10s and the following 10K in 2h18m27s.
I can explain the difference between the two 10K batches a little: the first includes the looping & storage of the array with the 100K links. From that point on, the script only opens pages to scrape them. However, I noticed the array was ready to go after roughly 30 minutes, so it doesn't fully explain the time gap.
I've placed my casper.thenOpen() in the for loop, hoping that the scraping would happen after each new link is built and stored in the array. I'm sure I've failed at this, but will it change anything in terms of performance?
That's the only lead I have in mind right now, and I'd be very thankful if anyone is willing to share best practices to significantly reduce the script's running time (shouldn't be hard!).
EDIT #1
Here's my code below:
var casper = require('casper').create();
var fs = require('fs');

// This array maintains a list of links to each HOL profile
// Example of a valid URL: https://myurl.com/list/74832
var root = 'https://myurl.com/list/';
var end = 0;
var limit = 100000;
var scrapedRows = [];

// Returns the selector element property if the selector exists, otherwise returns defaultValue
function querySelectorGet(selector, property, defaultValue) {
    var item = document.querySelector(selector);
    item = item ? item[property] : defaultValue;
    return item;
}

// Scraping function
function scrapDetails(querySelectorGet) {
    var info1 = querySelectorGet("div.classA h1", 'innerHTML', 'N/A').trim();
    var info2 = querySelectorGet("a.classB span", 'innerHTML', 'N/A').trim();
    var info3 = querySelectorGet("a.classC span", 'innerHTML', 'N/A').trim();

    // For scraping different texts of the same kind (i.e. comments from users)
    var commentsTags = document.querySelectorAll('div.classComments');
    var comments = Array.prototype.map.call(commentsTags, function(e) {
        return e.innerText;
    });

    // Return all the rest of the information as a JSON string
    return {
        info1: info1,
        info2: info2,
        info3: info3,
        // There is no fixed number of comments & answers so we join them with a semicolon
        comments: comments.join(' ; ')
    };
}

casper.start('http://myurl.com/login', function() {
    this.sendKeys('#username', 'username', {keepFocus: true});
    this.sendKeys('#password', 'password', {keepFocus: true});
    this.sendKeys('#password', casper.page.event.key.Enter, {keepFocus: true});
    // Logged in
    this.wait(3000, function() {
        // Verify connection by printing the welcome page's title
        this.echo('Opened main site titled: ' + this.getTitle());
    });
});

casper.then(function() {
    // Quick summary
    this.echo('# of links : ' + limit);
    this.echo('scraping links ...');
    for (var i = 0; i < limit; i++) {
        // Building the urls to visit
        var link = root + end;
        // Visiting pages...
        casper.thenOpen(link).then(function() {
            // We pass the querySelectorGet method to use it within the webpage context
            var row = this.evaluate(scrapDetails, querySelectorGet);
            scrapedRows.push(row);
            // Stats display
            this.echo('Scraped row ' + scrapedRows.length + ' of ' + limit);
        });
        end++;
    }
});

casper.then(function() {
    fs.write('infos.json', JSON.stringify(scrapedRows), 'w');
});

casper.run(function() {
    casper.exit();
});
At this point I probably have more questions than answers but let's try.
Is there a particular reason why you're using CasperJS and not, for example, curl? I can understand the need for CasperJS if you are going to scrape a site that uses JavaScript, or if you want to take screenshots. Otherwise I would probably use curl along with a scripting language like PHP or Python and take advantage of the built-in DOM parsing functions.
And you can of course use dedicated scraping tools like Scrapy. There are quite a few tools available.
Then the 'obvious' question: do you really need arrays that large? What you are trying to achieve is not clear; I am assuming you will want to store the extracted links in a database or something. Isn't it possible to split the process into small batches?
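For example, a very rough batching sketch reusing the variables from the code above (the helper name and batch size are made up; each batch is flushed to disk so the in-memory array stays small):

var BATCH_SIZE = 1000;

function scheduleBatch(batchIndex) {
    var start = batchIndex * BATCH_SIZE;
    if (start >= limit) { return; }
    var batchRows = [];
    for (var i = start; i < Math.min(start + BATCH_SIZE, limit); i++) {
        casper.thenOpen(root + i, function () {
            batchRows.push(this.evaluate(scrapDetails, querySelectorGet));
        });
    }
    casper.then(function () {
        // write this batch out and only then queue the next one
        fs.write('infos-' + batchIndex + '.json', JSON.stringify(batchRows), 'w');
        scheduleBatch(batchIndex + 1);
    });
}

casper.then(function () {
    scheduleBatch(0);
});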
One thing that should help is to allocate sufficient memory up front by declaring a fixed-size array, i.e.:
var theArray = new Array(1000);
Resizing the array constantly is bound to cause performance issues. Every time new items are added to the array, expensive memory allocation operations must take place in the background, and are repeated as the loop is being run.
Since you are not showing any code, we cannot suggest meaningful improvements, just generalities.

parallel code execution python2.7 ndb

In my app, for one of the handlers, I need to get a bunch of entities and execute a function for each one of them.
I have the keys of all the entities I need. After fetching them I need to execute 1 or 2 instance methods for each of them, and this slows my app down quite a bit: doing this for 100 entities takes around 10 seconds, which is way too slow.
I'm trying to find a way to get the entities and execute those functions in parallel to save time, but I'm not really sure which way is best.
I tried the _post_get_hook, but there I have a Future object and need to call get_result() and execute the function in the hook. This works kind of OK in the SDK but produces a lot of 'maximum recursion depth exceeded while calling a Python object' errors, and I can't really understand why; the error message is not very elaborate.
Is the Pipeline API or ndb.Tasklets what I'm searching for?
At the moment I'm going by trial and error, but I would be happy if someone could point me in the right direction.
EDIT
My code is something similar to a filesystem: every folder contains other folders and files. The path of a Collection is set on another entity, so to serialize a collection entity I need to get the referenced entity and read its path. On a Collection, the serialized_assets() function gets slower the more entities it contains. If I could execute a serialize function for each contained asset side by side, it would speed things up quite a bit.
class Index(ndb.Model):
    path = ndb.StringProperty()


class Folder(ndb.Model):
    label = ndb.StringProperty()
    index = ndb.KeyProperty()
    # contents is a list of keys of contained Folders and Files
    contents = ndb.StringProperty(repeated=True)

    def serialized_assets(self):
        assets = ndb.get_multi(self.contents)
        serialized_assets = []
        for a in assets:
            kind = a._get_kind()
            assetdict = a.to_dict()
            if kind == 'Collection':
                assetdict['path'] = a.path
                # other operations ...
            elif kind == 'File':
                assetdict['another_prop'] = a.another_property
                # ...
            serialized_assets.append(assetdict)
        return serialized_assets

    @property
    def path(self):
        return self.index.get().path


class File(ndb.Model):
    filename = ndb.StringProperty()
    # other properties....

    @property
    def another_property(self):
        # compute something here
        return computed_property
EDIT 2:

@ndb.tasklet
def serialized_assets(self, keys=None):
    assets = yield ndb.get_multi_async(keys)
    raise ndb.Return([asset.serialized for asset in assets])

Is this tasklet code OK?
Since most of the execution time of your functions is spent waiting for RPCs, NDB's async and tasklet support is your best bet. That's described in some detail here. The simplest usage for your requirements is probably to use the ndb.map function, like this (from the docs):
@ndb.tasklet
def callback(msg):
    acct = yield msg.author.get_async()
    raise ndb.Return('On %s, %s wrote:\n%s' % (msg.when, acct.nick, msg.body))

qry = Message.query().order(-Message.when)
outputs = qry.map(callback, limit=20)
for output in outputs:
    print output
The callback function is called for each entity returned by the query, and it can do whatever operations it needs (using _async methods and yield to do them asynchronously), returning the result when it's done. Because the callback is a tasklet, and uses yield to make the asynchronous calls, NDB can run multiple instances of it in parallel, and even batch up some operations.
The pipeline API is overkill for what you want to do. Is there any reason why you couldn't just use a taskqueue?
Use the initial request to get all of the entity keys, and then enqueue a task for each key, having the task execute the two functions for that entity. The concurrency will then be based on the number of concurrent requests configured for that task queue.
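A very rough sketch of the fan-out (Python 2.7 task queue API; the worker URL and queue name are invented for illustration, and the worker handler itself is not shown):

from google.appengine.api import taskqueue

def enqueue_serialization(keys):
    # one task per entity; the handler mapped to /tasks/serialize_asset
    # fetches the entity and runs the 1-2 instance methods
    for key in keys:
        taskqueue.add(
            url='/tasks/serialize_asset',
            params={'key': key.urlsafe()},
            queue_name='serialize-queue')  # concurrency is set in queue.yaml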
