I have been looking around for tools that can help me do load testing and benchmarking. I found a couple:
https://github.com/wg/wrk
http://www.joedog.org/siege-home/
https://github.com/rakyll/boom
I'm wondering if anyone has experience with these tools and any feedback on their pros and cons. My load tests will include different test cases using the DELETE, PUT, GET, POST, etc. HTTP methods.
Load testing and benchmarking tools
Listed in alphabetical order.
ab: slow and single threaded, written in C
apib: most of the features of ApacheBench (ab), also designed as a more modern replacement, written in C
baloo: expressive end-to-end HTTP API testing made easy, written in Go (golang)
baton: HTTP load testing, written in Go (golang)
bombardier: fast cross-platform HTTP benchmarking tool, written in Go (golang)
curl-loader: performance loading of various application services and traffic generation, written in C
drill: HTTP load testing application inspired by Ansible syntax, written in Rust
fasthttploader: benchmark (kind of like ab) with auto-adjustment and charts, based on the fasthttp library, written in Go (golang)
fortio: load testing library, command-line tool and web UI. Allows specifying a set queries-per-second load and records latency histograms and other useful stats, written in Go (golang)
gatling: high-performance load testing framework based on Scala, Akka and Netty, written in Scala
go-wrk: HTTP benchmarking tool based in spirit on the excellent wrk tool (wg/wrk), written in Go (golang)
goad: AWS Lambda powered, highly distributed load testing tool, written in Go (golang)
gobench: HTTP/HTTPS load testing and benchmarking tool, written in Go (golang)
gohttpbench: ab-like benchmark tool run on multi-core CPUs, written in Go (golang)
hey: HTTP(S) load generator, ApacheBench (ab) replacement, formerly known as rakyll/boom, written in Go (golang)
htstress: multithreaded benchmarking of high-load services (>5K rps), written in C/Linux
httperf: difficult configuration, slow and single threaded, written in C
inundator: simple and high-throughput HTTP flood program, written in C/Linux
jmeter: Apache JMeter™, pure Java application designed to load test performance of both static and dynamic resources, written in Java
k6: modern load testing tool scriptable in ES6 JS with support for HTTP/1.1, HTTP/2.0 and WebSocket, written in Go (golang)
locust: easy-to-use, distributed load testing tool with real-time web UI. Simulates a swarm of concurrent users, the behavior of each defined by your Python code. Written in Python
mgun: modern tool for load testing HTTP servers, written in Go (golang)
pounce: evented, but results fluctuate, it's sometimes faster than htstress, written in C
siege: slow and single threaded, written in C
slapper: simple load testing tool with real-time updated histogram of request timings, written in Go (golang)
slow_cooker: load tester focused on lifecycle issues and long-running tests, service with a predictable load and concurrency level for a long period of time, written in Go (golang)
sniper: powerful & high-performance http load tester, written in Go (golang)
tsung: simulates stressing users in order to test the scalability and performance of IP-based client/server applications; supports HTTP, WebDAV, SOAP, PostgreSQL, MySQL, LDAP and Jabber/XMPP servers, written in Erlang
vegeta: HTTP load testing tool and library, written in Go (golang)
weighttp: multithreaded, but slower than htstress without keepalive, written in C
wrk: multithreaded, written in C/Lua
wrk2: constant throughput, correct latency recording variant of wrk, written in C/Lua
yandex-tank: load and performance benchmark tool, written in Python/C|C++|Asm (phantom)
Descriptions are from here.
I've used wrk and siege. siege is a really easy-to-use tool, but I'm not sure whether you can test DELETE or PUT with it.
wrk can use a provided Lua script to generate requests, so DELETE and PUT won't be a problem. And wrk is a tool that can overpower an NGINX static file server, so I think it's fast enough for general-purpose load testing.
I've never used boom or Yandex.tank, suggested by @Direvius, basically because wrk is simple enough and fits our needs. But JMeter is too complex for me.
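If you want to sanity-check non-GET methods before committing to a dedicated tool, a tiny script using only the Python standard library can do it. This is a minimal sketch, not a replacement for wrk: the endpoint URL and request plan below are placeholders you would swap for your own.

```python
import concurrent.futures
import time
import urllib.request


def fire(method, url, body=None):
    """Send one request with the given HTTP method; return latency in seconds."""
    req = urllib.request.Request(url, data=body, method=method)
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start


def run(plan, workers=20):
    """Execute a list of (method, url, body) tuples concurrently; return latencies."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: fire(*p), plan))


def summarize(latencies):
    """Reduce a list of latencies to min/mean/p95, all in seconds."""
    ordered = sorted(latencies)
    p95 = ordered[int(0.95 * (len(ordered) - 1))]
    return {"min": ordered[0], "mean": sum(ordered) / len(ordered), "p95": p95}


# Example plan against a placeholder endpoint (not executed here):
# run([("GET", "http://localhost:8080/item/1", None),
#      ("PUT", "http://localhost:8080/item/1", b"{}"),
#      ("DELETE", "http://localhost:8080/item/1", None)] * 50)
print(summarize([0.10, 0.12, 0.15, 0.40]))  # synthetic latencies for illustration
```

The GIL is not a problem here because the worker threads spend their time blocked on network I/O, but a script like this will not push anywhere near the request rates of wrk or hey.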
I've never used any of these, but I've heard some positive opinions about wrk.
I think you should also try JMeter, which is very popular, and maybe Yandex.tank, which is the tool we use at our LT department for most of our web services.
We have a web service that needs a somewhat POSIX-compatible shared filesystem for the application servers (multiple redundant systems running in parallel behind redundant load balancers). We're currently running GlusterFS as the shared filesystem for the application servers, but I'm not happy with its performance. Compared to the actual raw performance of the storage servers running GlusterFS, it starts to look more sensible to run DRBD and a single NFS server, with all the other GlusterFS servers (currently 3) waiting in a hot-standby role.
Our workload is highly read-oriented and usually deals with small files, and I'd be happy to use an "eventually consistent" system as long as a client can request a sync for a single file if needed (that is, the client is prepared to wait until the file has been successfully stored in the backend storage). I'd even accept a system where such a "sync" requires querying the state of the file via some way other than POSIX fdatasync(). File metadata such as modification times is not important; only the filename and the contents matter.
I'm aware of the following candidates and the problems each one currently has:
GlusterFS: overall performance is pretty poor in practice, performance goes down while adding new servers/bricks.
Ceph: highly complex to configure/administrate, POSIX compatibility sacrifices performance a lot as far as I know.
MooseFS: partially obfuscated open source (huge dumps of internally written code published infrequently, with patch history intentionally lost); documentation leaves a lot to be desired.
SeaweedFS: pretty simple design and supposedly high performance. The future of this project is unclear because pretty much all the code is written and maintained by Chris Lu; what happens if he no longer writes any code? It's also unclear whether the "Filer" component supports having no single point of failure.
I know that the CAP theorem prevents ever having a truly consistent and always-available system. Is there any good distributed filesystem where writes must be durable, but read performance is really good, and the system has no single point of failure?
I am Chris Lu, working on SeaweedFS. There are plans to commercialize it (by adding more advanced features).
The filer does not have a single point of failure; you can have multiple filer instances. The filer store can be any key-value store. If you need no SPOF, you can use Cassandra, Redis Cluster, CockroachDB, TiDB, or etcd, or you can add your own key-value store option, which is pretty easy.
I'm trying to benchmark a real-time planning algorithm but can't seem to find out how to do it. Is this supported in OptaPlanner?
I've successfully run a benchmark using an offline version of my problem: I've implemented a SolutionFileIO that reads my problem instances and converts them to a solution. I've read the docs and watched the video on benchmarking, but couldn't find what I'm looking for.
Alternatively, I can run the real-time algorithms using my own framework, but that would require me to manually define all the OptaPlanner heuristics that I want to run (which is quite cumbersome with a matrix setup). Is there a way to instantiate the solvers (in Java) based on the benchmark XML definition? This would allow me to run my own real-time benchmark while still using the OptaPlanner benchmark definition.
A benchmark config that also fires ProblemFactChange events (= real-time planning) is not yet supported; vote for this jira. How would you like the benchmark config to look?
To HACK reusing the solvers from a benchmark configuration, cast PlannerBenchmark to PlannerBenchmarkRunner and use getPlannerBenchmarkResult().getSolverBenchmarkResultList(), but that gives up on a bunch of orchestration (including the report). If instead you can succeed in overriding SubSingleBenchmarkResult, you wouldn't lose that orchestration (but your hacks would be even deeper).
Whatever you end up doing, do share how you'd like the benchmark config to look, as this will give us inspiration when we implement it for a future OptaPlanner version.
I run a data-intensive site with a CMS built by an outsourcing partner. Now, as the number of users grows, it is getting slower, and it goes down during major product launches. What services can I turn to, to analyze the site's bottlenecks, its SQL queries, etc.? Services that can provide solutions like splitting load between redundant servers, configuring master-slave replication for the database, and so on? I am new to this.
If you have a custom-built CMS, you're going to need software engineers to analyze the problems and propose solutions; there's no off-the-shelf solution for that. The engineers you're looking for would need to understand the programming language your CMS is built in, and have experience in web scalability. Obviously, your outsource partner would be a good candidate here...
If the CMS is an off the shelf solution, the vendor might be able to recommend specialist service providers, independent of the outsource partner.
In general, the performance / scalability process is:
understand the targets, ideally represented as page generations per second, subdivided into page types if necessary (e.g. "50 CMS product pages / second, 20 logins / second"). Establish the maximum acceptable response time (e.g. 1 second average, 4 seconds max).
create a test environment which you understand completely, and which you can relate to the target production system. The test environment should be easy to work with and accessible to the team; typically I recommend using a developer workstation, or a low-powered VM. The purpose is to bring bottlenecks to light, not to handle huge amounts of traffic.
establish test targets for your test environment - e.g. if "production" needs to handle 100 page generations / second, your test environment might only need to handle 20 page generations / second.
deploy your application to the test environment, and set up a way of collecting performance information, e.g. CPU, memory, disk, network, etc.
run load tests on the test environment; increase load until you exceed the response time targets. Use the monitoring tools to identify the bottleneck.
fix bottleneck
rinse & repeat until you hit your performance targets
deploy application to production-like environment - similar capacity and architecture - if you're lucky enough to have one. Set up monitoring and performance capture tools.
run load test, increasing load until you exceed your performance targets
if you've met your goals, congratulations!
if you haven't met your goals, it suggests your production environment has a different bottleneck than your test environment (often the database). Find out what the bottleneck is; try to replicate on your test environment.
restart testing on test environment.
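The target-setting step at the top of this process has a handy sanity check in Little's Law: the number of requests in flight equals arrival rate times time in the system, so target throughput times acceptable response time tells you the concurrency your environment must sustain. A quick sketch, using the example figures above:

```python
def required_concurrency(throughput_per_s, avg_response_s):
    """Little's Law: requests in flight = arrival rate x time in system."""
    return throughput_per_s * avg_response_s


# 50 CMS product pages/s + 20 logins/s at a 1-second average response time
# means roughly 70 requests in flight at any moment; your load generator
# and server worker pools both need to be sized with that in mind.
print(required_concurrency(50 + 20, 1.0))
```

The same arithmetic works in reverse: if your test rig can only hold 20 connections open, it cannot validate a 70-concurrent-request target, no matter how long it runs.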
Quite often, you will find you have to make architectural or infrastructure changes to reach your targets; I've used the following:
- run the solution on bigger hardware.
- introduce a CDN to offload traffic - some CMS-driven sites can be cached almost entirely on a CDN
- introduce caching into the application, ideally at the page-generation level, but a typical web app has many places where caching can help
- add more front-end web servers (assuming application was built with load balancing in mind)
- add more database servers (this is nearly always a major intervention unless the app was built with this in mind)
Load testing tools are available as services you can hire (Keynote is one I've used), or tools you can run yourself (JMeter is my favourite).
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am developing high-volume processing systems: mathematical models that calculate various parameters based on millions of records, derived fields computed over millions of records, huge files of transactions to process, etc.
I am well aware of unit testing methodologies, and if my code is in C# I have no problem unit testing it. The problem is that I often have code in T-SQL, C# code that is a SQL stored assembly, and SSIS workflows with a good amount of logic (and outcomes etc.), or some SAS process.
What approach do you use when developing such systems? I usually develop several tests as stored procedures in a dedicated schema (TEST) and then automatically run them overnight and check the results. But this only covers T-SQL; the problem is with testing SSIS packages. How do you test those? What is your preferred approach for stubbing data into tables (especially when you need a lot of data initialization)? I have some approaches derived over the years, but maybe I am just not reading enough articles.
So, banking, telecom, and risk developers out there: how do you test your mission-critical apps that process millions of records at end of day, month end, etc.? What frameworks do you use? How do you validate that your SSIS package is correct (as you develop it)? How do you achieve continuous integration in such an environment (personally, I never got there)? I hope this is not too open-ended a question. How do you test your map-reduce jobs, for example (I do not use Hadoop, but this is quite similar)?
luke
Hope that this is not too open-ended.
Firstly, build logging, monitoring and double-entry systems into what you're building.
Ensure that even with these systems switched on, performance is acceptable: benchmark and profile them, and ensure the hardware is appropriate for the entire system.
Split each system into sub-systems which can be tested independently, so try to ensure systems are designed to be quite loosely coupled.
Also ensure each sub-system validates its inputs before processing further; this stops erroneous data before it becomes a bigger problem.
By using logging, you can test a variety of systems in a similar way.
For any system which doesn't have unit test frameworks available, use logging, and then test the logs generated.
This should allow you to test SSIS processes, workflows, or assemblies.
Monitoring and double-entry systems will flag up errors and process problems, so you can identify and ideally resolve them in a timely fashion.
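One way to make the log-based approach concrete is to have every step emit one structured line per batch, then assert on the parsed log rather than on the engine itself. Below is a minimal sketch assuming a hypothetical `step|rows_in|rows_out|errors` line format; the field layout and the reconciliation rule are illustrative, not any SSIS standard:

```python
def parse_log(lines):
    """Parse hypothetical 'step|rows_in|rows_out|errors' lines into dicts."""
    records = []
    for line in lines:
        step, rows_in, rows_out, errors = line.strip().split("|")
        records.append({"step": step, "in": int(rows_in),
                        "out": int(rows_out), "errors": int(errors)})
    return records


def check_no_row_loss(records):
    """Every row entering a step must come out or be counted as an error."""
    return all(r["in"] == r["out"] + r["errors"] for r in records)


# Example log from a hypothetical three-step package run:
log = ["extract|1000000|1000000|0",
       "transform|1000000|999998|2",
       "load|999998|999998|0"]
print(check_no_row_loss(parse_log(log)))
```

The same harness works for any black-box process (SSIS, SAS, a stored assembly) as long as it can be made to write the log, which is exactly why building the logging in from the start matters.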
Finally, when systems go live, don't switch logging off entirely.
If necessary, reduce its verbosity, but ensure it can be switched back on to debug processes, as problems will still occur in the live environment and you will need to resolve them.
Ensure you use live data, and edge cases, for automated testing.
Use code reviews or pair programming to ensure the code is optimal.
Ensure you use expert QA staff to think of use cases you won't think of.
Ensure you have an excellent project manager, who can manage you, your team, the related teams, the end users, and your bosses, and ensure everyone is communicating appropriately.
You won't be able to achieve well tested processes without a well run team.
Using some of the above, has allowed us to develop well tested processes, which handles billions of pounds worth of transactions annually, so we must be doing something right.
Automated regression testing, not unit testing. Custom tools to compare input and expected output. Performance over everything. Performance tests. Test using pre-loaded systems. Try on x64, x32 etc. Custom tools to synthesise data based on business cases. Modular dtsx. One dev per dtsx. List goes on.
What alternatives are there to GAE, given that I already have a good bit of working code that I would like to keep? In other words, I'm digging Python. However, my use case is more of a low-number-of-requests, higher-CPU-usage type of use case, and I'm worried that I may not be able to stay with App Engine forever. I have heard a lot of people talking about Amazon Web Services and other cloud providers, but I am having a hard time seeing where most of these other offerings provide the range of services (data querying, user authentication, automatic scaling) that App Engine provides. What are my options here?
AppScale
AppScale is a platform that allows users to deploy and host their own Google App Engine applications. It executes automatically over Amazon EC2 and Eucalyptus as well as Xen and KVM. It has been developed and is maintained by AppScale Systems. It supports the Python, Go, PHP, and Java Google App Engine platforms.
http://github.com/AppScale/appscale
In the meantime...
...it is almost 2015 and it seems that containers are the way forward. Alternatives to GAE are emerging:
Google has released Kubernetes, container-scheduling software developed by them to manage GCE containers, but it can be used on other clusters as well.
There are some upcoming PaaS offerings on Docker, such as
http://deis.io/
http://www.tsuru.io/
even Appscale themselves are supporting Docker
Interesting stuff to keep an eye on.
I don't think there is another alternative (with regard to code portability) to GAE right now, since GAE is in a class of its own. Sure, GAE is cloud computing, but I see GAE as a subset of cloud computing. Amazon's EC2 is also cloud computing (as are Joyent Accelerators and Slicehost Slices), but obviously they are two different beasts. So right now you're in a situation that requires rethinking your architecture depending on your needs.
The immediate benefit of GAE is that it's essentially maintenance-free as far as infrastructure is concerned (scalable web serving and database administration). GAE is more tailored to developers who only want to focus on their applications and not the underlying system; in a way, you can consider that developer-friendly. It should also be said that these other cloud computing solutions try to let you worry only about your application, as much as you like, by providing VM images/templates. Ultimately, your needs will dictate the approach you should take.
With all this in mind, we can also construct hybrid solutions and workarounds that might fulfill those needs. For example, GAE doesn't seem directly suited to the specific app needs you describe: it offers a relatively high number of requests but a low number of CPU cycles per request (not sure if the paid version will be any different).
However, one way to tackle this challenge is to build a customized solution with GAE as the front end and Amazon AWS (EC2, S3, and SQS) as the back end. Some will say you might as well build your entire stack on AWS, but that may involve rewriting lots of existing code. Furthermore, as a workaround, a previous Stack Overflow post describes a method of simulating background tasks in GAE, and you can also look into HTTP map/reduce to distribute the workload.
As of 2016, if you're willing to lump PaaS (platform as a service) and FaaS (function as a service) in the same serverless computing category, then you have a few FaaS options.
Proprietary
AWS Lambda
AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service - all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.
AWS Step Functions complements AWS Lambda.
AWS Step Functions makes it easy to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function lets you scale and change applications quickly. Step Functions is a reliable way to coordinate components and step through the functions of your application. Step Functions provides a graphical console to arrange and visualize the components of your application as a series of steps. This makes it simple to build and run multi-step applications. Step Functions automatically triggers and tracks each step, and retries when there are errors, so your application executes in order and as expected. Step Functions logs the state of each step, so when things do go wrong, you can diagnose and debug problems quickly. You can change and add steps without even writing code.
Google Cloud Functions
As of 2016, it is in alpha.
Google Cloud Functions is a lightweight, event-based, asynchronous compute solution that allows you to create small, single-purpose functions that respond to cloud events without the need to manage a server or a runtime environment. Events from Google Cloud Storage and Google Cloud Pub/Sub can trigger Cloud Functions asynchronously, or you can use HTTP invocation for synchronous execution.
Azure Functions
An event-based serverless compute experience to accelerate your development. It can scale based on demand and you pay only for the resources you consume.
Open
Serverless
The Serverless Framework allows you to deploy auto-scaling, pay-per-execution, event-driven functions to any cloud. We currently support Amazon Web Service's Lambda, and are expanding to support other cloud providers.
IronFunctions
IronFunctions is an open source serverless computing platform for any cloud - private, public, or hybrid.
It remains to be seen how well FaaS competes with CaaS (container as a service). The former seems more lightweight; both seem suited to microservices architectures.
I anticipate that functions (as in FaaS) are not the end of the line, and that years from now we'll see further service abstractions, e.g. test-only development, followed by plain-language scenarios.
Alternatives:
1. AppScale
2. Heroku.
Ref: Alternative for Google AppEngine?
Amazon's Elastic Compute Cloud, or EC2, is a good option. You basically run Linux VMs on their servers that you can control via a web interface (for powering up and down) and, of course, access via SSH or whatever you normally set up...
And as it's a Linux install that you control, you can of course run Python if you wish.
Microsoft Windows Azure might be worth considering. I'm afraid I haven't used it, so I can't say whether it's any good, and you should bear in mind that it's a CTP at the moment.
Check it out here.
A bit late, but I would give Heroku a go:
Heroku is a polyglot cloud application platform. With Heroku, you don't need to think about servers at all. You can write apps using modern development practices in the programming language of your choice, back it with add-on resources such as SQL and NoSQL databases, Memcached, and many others. You manage your app using the Heroku command-line tool and you deploy code using the Git revision control system, all running on the Heroku infrastructure.
https://www.heroku.com/about
You may also want to take a look at AWS Elastic Beanstalk: it is closer in functionality to GAE, in that it is designed to be a PaaS rather than an IaaS (i.e. EC2).
If you're interested in the cloud, and maybe want to create your own for production and/or testing, you have to look at Eucalyptus. It's allegedly code-compatible with EC2, but open source.
I'd be more interested in seeing how App Engine can be easily coupled with another server used for CPU intensive requests.
TyphoonAE is trying to do this. I haven't tested it, but while it is still in beta, it looks like it's at least in active development.
The shift to cloud computing is happening so rapidly that you have no time to waste testing different platforms.
I suggest you trying out Jelastic if you are interested in Java as well.
One of the greatest things about Jelastic is that you do not need to make any changes to your application's code, except changes for your application's functionality, not changes the chosen platform demands; so you do not actually waste your time. The deployment process is flawless, and you can deploy your .war file anywhere else later. Using GAE requires you to modify the app around their system's needs. If you happen to be working with Java and start looking for a more flexible platform, Jelastic is a compatible alternative.
You can also use Red Hat's CapeDwarf project to run GAE apps on top of the WildFly application server (previously JBoss) without modification.
You can check it out here:
http://capedwarf.org/