Increase load test over time - google-app-engine

I would like to test the load of my App Engine App.
From the load test google recommendation. Query per second should increase gradually.
So I would like to add 1 connection every second to my load test.
How can I do that? I search for AB (Apache Benchmark) and JMeter without success.
Maybe my question is very basic, but as I'm not use to load testing I don't google it properly.
Thanks.

If you want one connection every seconds, your ramp up period (in seconds) should be equal to no. of users you choose in the thread group settings.
Refer to following tutorials to understand how to use jmeter for load testing.
http://www.javaworld.com/javaworld/jw-07-2005/jw-0711-jmeter.html
http://nico.vahlas.eu/2010/03/17/some-thoughts-on-stress-testing-web-applications-with-jmeter-part-1/
http://nico.vahlas.eu/2010/03/30/some-thoughts-on-stress-testing-web-applications-with-jmeter-part-2/

Due to my leak of vocabulary I couldn't Google it.
It's call a "ramp-up". The word was actually in Google slide.
It can be done with JMeter under the "Thread Group" element.

Related

Selenium Webdriver + Jmeter + StormRunner for Performance test

I wanted to try out integration of Selenium Jmeter and StormRunner. My end goal is to do Load testing with 'n' number of users on StormRunner
What ? - For e.g. I have Selenium Script, convert it in to Jmeter (I can get this information from many sources)
Then my Jmeter script should get ready
Then upload Jmeter script in to StormRunner and pass the necessary parameter through Jenkins and run the load test.
I really want the opinion here about feasibility and whether it is in right direction or not.
Idea here is that Automated Load/Performance test
Selenium is a browser automation framework and JMeter acts on HTTP protocol level so your "Automated" requirement might not be fulfilled especially if your tests are relying on client-side checks like sorting or waiting for element to appear.
Theoretically given you properly configure JMeter it can behave like a real browser, but it still not be executing client-side JavaScript.
If you're fine with this constraint - your approach is valid, if not and the "automated functional test" requirement is the must - consider migrating to TruClient Protocol instead
Why wouldn't you covert your script to a native Loadrunner/Stormrunner form of virtual user?
You should look at the value of what you are trying to achieve. The end value of a performance test is in analysis. Analysis simply takes the timing records and the resource measurements produced during the test, bringing them together on a common timestamp, and then allowing you to analyze what resource "X" is being impinged when timing record "Y" is too long. This then points to some configuration or code which locks up on resource, "X."
What is your path to value in your model? You speak about converting a functional test script to a performance one. Realistically, you should already know that your code, "works for one," before you get to asking, "Does it work for many?" There is a change in script definitions which typically accompanies this understanding.
Where are your collection of resources noted? Which Resources? On which Hosts? This is on the "path to value" problem where you need to have the resource measurements to diagnose root cause of poor performance.

will gatling actually perform the operation or will it check only the urls' response time?

I have a gatling test for an application that will answer a survey and upon answering this survey, the application will identify possible answers that may pose a risk and create what we call riskareas. These riskareas are normally created in the background as soon as the survey answering is finished. My question is I have a gatling test with ten users who will go and answer the survey and logout, I used recorder to record the test; now after these ten users are finished I do not see any riskareas being created in the application. Am I missing something--should the survey be really answered by gatling (like it does in selenium) user or is it just the urls that the gatling test will touch ?
I am new to gatling please help.
Gatling should be indistinguishable from a user in a web browser (or Selenium) as far as the server is concerned, so the end result should be exactly the same as if you'd gone through the process yourself. However, writing a Gatling script is a little more work than writing a Selenium script.
For performance reasons, Gatling operates at a lower level than Selenium. Gatling works with the actual data that is sent and received from the server (i.e, the actual GETs and POSTs sent to the server), rather than with user-level interactions (such as clicking links and filling forms).
The recorder will generally produce a relaitvely "dumb" script. It records the exact data that was sent to the server, and makes no attempt to account for things that may change from run to run. For example, the web application you are testing might have hidden form fields that contain session information, or the link addresses might contain a unique identifier or a session id.
This means that your script may not be doing what you think it's doing.
To debug the script, the first thing to do is to add checks on each of the requests, to validate that you are getting the response you expect (for example, check that when you submit page 1 of the survey, you are taken to page 2 - check for something that you'd only expect to find on page 2, like a specific question).
Once you know which requests are failing, look at what data was sent with the request, and try to figure out where it came from. You will probably find that there are session ids, view state, or similar, that must be extracted from the previous page.
It will help to enable request and response logging, as per the documentation.
To simplify testing of web apps, we wrote some helper functions to allow tests to be written in a more Selenium-like way. Once you understand what your application is doing, you may find that it simplifies scripting for you too. However, understanding why your current script doesn't work the way you expect should be your first step.

How do I execute code on App Engine without using servlets?

My goal is to receive updates for some service (using http request-response) all the time, and when I get a specific information, I want to push it to the users. This code must run all the time (let's say every 5 seconds).
How can I run some code doing this when the server is up (means, not by an http request that initiates the execution of this code) ?
I'm using Java.
Thanks
You need to use
Scheduled Tasks With Cron for Java
You can set your own schedule (e.g. every minute), and it will call a specified handler for you.
You may also want to look at
App Engine Modules in Java
before you implement your code. You may separate your user-facing and backend code into different modules with different scaling options.
UPDATE:
A great comment from #tx802:
I think 1 minute is the highest frequency you can achieve on App Engine, but you can use cron to put 12 tasks on a Push Queue, either with delays of 5s, 10s, ... 55s using TaskOptions.countdownMillis() or with a processing rate of 1/5 sec.

How to ensure that a bot/scraper does not get blocked

I coded a simple scraper , who's job is to go on several different pages of a site. Do some parsing , call some URL's that are otherwise called via AJAX , and store the data in a database.
Trouble is , that sometimes my ip is blocked after my scraper executes. What steps can I take so that my ip does not get blocked? Are there any recommended practices? I have added a 5 second gap between requests to almost no effect. The site is medium-big(need to scrape several URLs)and my internet connection slow, so the script runs for over an hour. Would being on a faster net connection(like on a hosting service) help ?
Basically I want to code a well behaved bot.
lastly I am not POST'ing or spamming .
Edit: I think I'll break my script into 4-5 parts and run them at different times of the day.
You could use rotating proxies, but that wouldn't be a very well behaved bot. Have you looked at the site's robots.txt?
Write your bot so that it is more polite, i.e. don't sequentially fetch everything, but add delays in strategic places.
Following guidelines set in robots.txt is a good first step. There are tools such as import.io and morph.io. There are also packages/ plugins for servers. For example x-ray; a node.js which have options to assist in quickly writing responsible scrapers e.g. throttle, delays, max connections etc.

Consecutive XML HTTP Requests seem to block on Google App Engine

I am working on an application on Google App Engine. Roughly this is what I do:
The user screen is split into 2 parts (actually 3, but lets leave that out for now). The left part (this takes upto 75% of the screen) has a document with some words highlighted. When one of these highlighted words are clicked the right part displays various meanings of it, example usage etc. The way this works is clicking the word send an XML HTTP Request to the server, where the sample usage(s)/meaning(s) are retrieved from the datastore. This data is returned and displayed.
My problem:
After I click on a few words consecutively, the application seems to "hang" - say, I click on 5 words in quick succession, clicking on the 6th word (or any word after that) doesn't replace the info regarding the 5th word on my right panel.
Since some data store columns (at least single valued properties) are indexed by default I'm guessing retrieval is not the bottleneck here. It is probably the requests.
Is such an issue known with the GAE? Any workarounds possible?
Kind of in a soup with this - the application was supposed to go live today. Urgent help required!
Thanks! :)
You're probably being limited to two simultaneous requests by your browser - not by appengine. If you click on a third link before the first two have had a chance to return, make sure your app can deal with requests returning for links that should no longer be displayed.
If you were hitting a limit on appengine, you'd see exceptions in your server logs. If you're not seeing those exceptions, it's probably a client-side issue.
Sorry for the late ack (for some reason I received a notification for the responses a day late, by which we had managed to fix a few things). It does look like the problem was at the data end - our code was doing some inserts, and it turns out you can't do too many of them quickly - the logs reported a transaction time-out error. The reason we couldn't spot it earlier in the logs was we were writing simply too much info out and this was buried in somewhere.
The clicks on the user-side were pulling data from this table.
Unfortunately, the GAE simulator doesn't simulate any timeout error - so even though we had tested with comparable volumes of data before deployment this error never happened during development.
Thanks again for your responses!
And yet again, I apologize for responding late.

Resources