Outcome field in LoadTest2010's LoadTestTestResults table - sql-server

I'm running load tests in Visual Studio 2010 Ultimate, and I'm trying to build some custom reporting tools. In the LoadTestTestResults table, there's a column labeled Outcome. I've seen it have the values 0, 1, 3, and (mostly) 10. But I can't find anything that explains what the different values mean.
I think that 10 is a success outcome, according to a comment in Prc_GetUserTestDetail. No clue on the others -- they don't seem to match up with any numbers in the VS summary.
What do these outcome codes mean?

I contacted a Microsoft developer from the MSDN blog on VS load testing and asked about this. Here's the information I got back, in case anybody else needs it:
The Outcome field is an enum that stores the status of an individual test case within a load test run. It can have values from 0 - 13.
0 - Error: There was a system error while we were trying to execute a test.
1 - Failed: Test was executed, but there were issues. Issues may involve exceptions or failed assertions.
2 - Timeout: The test timed out.
3 - Aborted: Test was aborted. This was not caused by a user gesture, but rather by a framework decision.
4 - Inconclusive: Test has completed, but we can't say if it passed or failed. May be used for aborted tests...
5 - PassedButRunAborted: Test was executed w/o any issues, but run was aborted.
6 - NotRunnable: Test had its chance for been executed but was not, as ITestElement.IsRunnable == false.
7 - NotExecuted: Test was not executed. This was caused by a user gesture - e.g. user hit stop button.
8 - Disconnected: Test run was disconnected before it finished running.
9 - Warning: To be used by Run level results. This is not a failure.
10 - Passed: Test was executed w/o any issues.
11 - Completed: Test has completed, but there is no qualitative measure of completeness.
12 - InProgress: Test is currently executing.
13 - Pending: Test is in the execution queue, was not started yet.

Related

batch query is not allowed to request data from "".""

I'm getting started with Kapacitor and have been trying to run the first guide in the Kapacitor documentation, but with data I already have. I managed to define a task, but I can neither enable it nor can I run a backfill. I came across this question, which is similar to my problem, but the answer there didn't help. In contrast to the error message there I get empty strings for database, retention policy, and/or measurement.
In Kapacitor config I set an InfluxDB connection to the local host instance with the name localhost (which has a database mydb and the measurements weather.current.clouds and weather.current.visibility with default retention policy autogen) and created the following weathertest.tick script:
dbrp "mydb"."autogen"
var clouds = batch
|query('select mean(value) / 100.0 as val from "mydb"."autogen"."weather.current.clouds"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
var vis = batch
|query('select mean(value) / 10000.0 as val from "mydb"."autogen"."weather.current.visibility"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
clouds
|join(vis)
.as('c', 'v')
|eval(lambda: 100 * (1 - "c.val") * "v.val")
.as('pcent')
|influxDBOut()
.cluster('localhost')
.database('mydb')
.retentionPolicy('autogen')
.measurement('testmetric')
.tag('host', 'myhost.local')
.tag('key', 'weather.current.lightidx')
This is what I came up with after hours of trial and (especially) error. As given in the title, when I try to enable my task with kapacitor enable weathertest, I get the error message enabling task weathertest: batch query is not allowed to request data from ""."". Same thing when I try to record as in the "Backfill" example. Also, in that example there is a start and a stop date for limiting the time frame. The time format given there is wrong and is not understood by Kapacitor. Instead of e. g. 2015-10-01 I have to put in 2015-10-01T00:00Z to make it at least pass the error message regarding time format error.
In the Kapacitor logs there is not a single line regarding these errors, only when I try to remove a record, I get something like remove /var/lib/kapacitor/replay/1f5...750.brpl: no such file or directory and this can be found in the logs. There are lots of info lines in the logs showing successful POSTs to/from InfluxDB for the _internal database with HTTP response result 204.
Has anyone an Idea what I may be doing wrong?
OK, after the weekend I tried again. Without any change it accepted my script now in the failing steps, however, now I was able to find error messages in the log. The node mentioned there was the eval node and pointed towards a type mismatch. When I changed the line
|eval(lambda: 100 * (1 - "c.val") * "v.val")
to
|eval(lambda: 100.0 * (1.0 - "c.val") * "v.val")
the error messages were gone and the command kapacitor show weathertest showed a rather sane content now.
Furthermore, I redefined, recorded, replayed and deleted the tasks and recordings during my tests over and over again and I may have forgotten to redefine tasks after making changes to the tick script (not really sure). After changing the above, redefining the task and replaying it I finally found the expected data in the InfluxDB instance.

Wrk vs Gatling benchmark test comparison

With wrk, I runt the following command :
wrk -t10 -c10 -d30s http://localhost:8080/myService --latency -H "Accept-Encoding: gzip"
As a result, I obtain Requests/sec: 15000 and no error
I am trying to reproduce the same kind of test with Gatling. So I have tried the following :
scn.inject(
rampUsersPerSec(1) to 15000 during (30 seconds)
)
But as a result, I obtain errors :
---- Errors --------------------------------------------------------------------
i.n.c.AbstractChannel$AnnotatedSocketException: Can't assign r 573 (42,44%)
equested address: localhost/127.0.0.1:8080
i.n.c.AbstractChannel$AnnotatedSocketException: Resource tempo 530 (39,26%)
rarily unavailable: localhost/0:0:0:0:0:0:0:1:8080
j.i.IOException: Premature close 247 (18,30%)
From wrk, I believe my server can handle 15000 request/s but with Gatling it seems not the case. Do you have an idea why such a difference ?
Disclaimer: Gatling's creator here
You're comparing apples and oranges.
With wrk, you're opening 10 connections and looping as fast as possible during 30s.
With your current Gatling set up, you're spawning 225,015 virtual users ((1 + 15,000) / 2 * 30), each one trying to open its own connection.
I recommend you reading this article about picking injection profiles that make sense for your use case.
If you really want to do the same thing as wrk here, you need to wrap your scenario in a during(30) loop and change your injection profile to atOnceUsers(10).
You also have the option of using a shared connection pool.
Then, you can't expect any other to load test tool to be as fast as wrk for this kind of logicless, static test.
Also note that:
there was a mistake in Gatling's JVM configuration that was fixed in Gatling 3.4.0 that hurt performance in this kind of minimalistic
super high throughput tests, see issue
Gatling runs on a JVM, hence with a runtime, so it needs to warm up, boot throughput will be lower than the warm one

SpecFlow and Selenium-Share data between different step definitions or classes

I just started using SpecFlow along with Selenium & N Unit.
I have a basic question, maybe I know the answer to it, but want to get it confirmed.
Consider, there are two features - Register and Add new transaction. Two separate features with their respective step definitions. How do I share the IWebDriver element across both the features?
I do not want to launch a new browser again and add a transaction. I want to execute both as a flow.
My thoughts are this functionality is not allowed using SpecFlow as basic use of feature-based testing is violated as trying to run two features in the same session. Will context injection assist help in this case?
What you want to do is a bad idea. you should start a new browser session for each feature, IMHO.
There is no guarantee in what order your tests will execute, this will be decided by the test runner, so you might get Feature2 running before Feature1.
In fact your scenarios should be independent as well.
you can share the webdriver instance between steps as in this answer but you should use specflows other features like scenario Background to do setup, or definign steps which do your comman setup.
EDIT
We have a similar issue with some of our tests and this is what we do:
We create a sceanrio for the first step
Feature: thing 1 is done
Scenario: Do step 1
Given we set things for step 1 up
When we execute step 1
Then some result of step one should be verified
Then we do one for step 2 (which lets assume relies on step 1)
Feature: thing 2 is processed
Scenario: Do step 2
Given we have done step 1
And we set things for step 2 up
When we execute step 2
Then some result of step 2 should be verified
This first step Given we have done step 1
is a step that calls all the steps of feature 1:
[Given("we have done step 1")]
public void GivenWeHaveDoneStep1()
{
Given("we set things for step 1 up");
When("we execute step 1");
Then("some result of step one should be verified");
}
Then if we have step 3 we do this:
Feature: thing 3 happens
Scenario: Do step 3
Given we have done step 2
And we set things for step 3 up
When we execute step 3
Then some result of step 3 should be verified
Again the Given we have done step 2 is a composite step that calls all the steps in the scenarion for step 2 (and hence all the steps for step 1)
[Given("we have done step 2")]
public void GivenWeHaveDoneStep2()
{
Given("we have done step 1");
Given("we set things for step 2 up");
When("we execute step 2");
Then("some result of step 2 should be verified");
}
We repeat this process so that when we get to step 5, it is running all the steps in the correct order. Sometimes one we get to step 5 we #ignore the previous 4 steps as they will all be called by step 5 anyway.

Error during node startup: Unable to start DSE server / Plugin activation failed / Cannot find core

I've been having these issues for quite a while already but I ignored them initially because I can still start my nodes. However, one of these issues became more serious recently that it now takes me a lot of tries in order to successfully start a node.
Issue #1: Unable to start DSE server / Plugin activation failed / Cannot find core
ERROR [main] 2015-01-28 03:30:40,058 DseDaemon.java (line 492) Unable to start DSE server.
java.lang.RuntimeException: com.datastax.bdp.plugin.PluginManager$PluginActivationException: Plugin activation failed
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:135)
at com.datastax.bdp.server.DseDaemon.start(DseDaemon.java:480)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:509)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659)
Caused by: com.datastax.bdp.plugin.PluginManager$PluginActivationException: Plugin activation failed
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:284)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:128)
... 3 more
Caused by: java.lang.IllegalStateException: Cannot find core: myks.mycf
at com.datastax.bdp.search.solr.core.SolrCoreResourceManager.doWaitForCore(SolrCoreResourceManager.java:742)
at com.datastax.bdp.search.solr.core.SolrCoreResourceManager.waitForCore(SolrCoreResourceManager.java:478)
at com.datastax.bdp.plugin.SolrContainerPlugin.waitForSecondaryIndexesLoading(SolrContainerPlugin.java:237)
at com.datastax.bdp.plugin.SolrContainerPlugin.onActivate(SolrContainerPlugin.java:98)
at com.datastax.bdp.plugin.PluginManager.initialize(PluginManager.java:334)
at com.datastax.bdp.plugin.PluginManager.activate(PluginManager.java:263)
... 4 more
INFO [Thread-3] 2015-01-28 03:30:40,059 DseDaemon.java (line 505) DSE shutting down...
INFO [StorageServiceShutdownHook] 2015-01-28 03:30:40,164 Gossiper.java (line 1307) Announcing shutdown
INFO [Thread-3] 2015-01-28 03:30:40,620 PluginManager.java (line 356) All plugins are stopped.
INFO [Thread-3] 2015-01-28 03:30:40,620 CassandraDaemon.java (line 463) Cassandra shutting down...
INFO [StorageServiceShutdownHook] 2015-01-28 03:30:42,165 MessagingService.java (line 701) Waiting for messaging service to quiesce
INFO [ACCEPT-/144.76.201.233] 2015-01-28 03:30:42,814 MessagingService.java (line 941) MessagingService has terminated the accept() thread
This exception started as a "mild" issue - mild because although it prevents a node from starting up when it happens, it usually takes me 1 more try to successfully start the affected node. However, about two weeks ago, after having not restarted any of my nodes for quite a while, I discovered that I now need a lot more attempts (20+) in order to start a node.
From the stack trace, it looks like a timeout issue (in doWaitForCore()); but I cannot find a setting to increase the amount of time that DSE would wait for a core to load during startup before giving up. The core that is mentioned in the stack trace is always the same, and I assume that this is because it is my biggest core (~1.4 billions records) and it takes the longest time to load. But when I manage to start the node successfully, there are no signs of errors - I can query the core like any other core.
--
There are two other issues that may or may not be related to the one above. Both of them always appear during startup; and unlike the first one, they do not cause a startup failure (i.e. they also appear when a node starts successfully)
Issue #2: Invalid Number: static
ERROR [searcherExecutor-67-thread-1] 2015-01-28 04:26:49,691 SolrException.java (line 124) org.apache.solr.common.SolrException: Invalid Number: static
at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:396)
at org.apache.solr.schema.FieldType.getFieldQuery(FieldType.java:697)
at org.apache.solr.schema.TrieField.getFieldQuery(TrieField.java:343)
at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:741)
at org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:545)
at org.apache.solr.parser.QueryParser.Term(QueryParser.java:300)
at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:186)
at org.apache.solr.parser.QueryParser.Query(QueryParser.java:108)
at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:97)
at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:153)
at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50)
at org.apache.solr.search.QParser.getQuery(QParser.java:143)
at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:135)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:183)
I looked at the data that I imported and I couldn't find a supposedly-numeric value that was incorrectly supplied as "static". In the java application that I wrote to convert CSVs to SSTables, I cast all numeric values to int/long/double depending on the field type so I honestly don't think that it has something to do with my data.
Issue #3: Could not getStatistics on info bean com.datastax.bdp.search.solr.FilterCacheMBean
WARN [SolrSecondaryIndex myks.mycf2 index initializer.] 2015-01-28 04:26:51,770 JmxMonitoredMap.java (line 256) Could not getStatistics on info bean com.datastax.bdp.search.solr.FilterCacheMBean
java.lang.RuntimeException: java.lang.ClassCastException: org.apache.lucene.search.FieldCache$CreationPlaceholder cannot be cast to org.apache.solr.search.SolrCache
at com.datastax.bdp.search.solr.FilterCacheMBean.getStatistics(FilterCacheMBean.java:185)
at org.apache.solr.core.JmxMonitoredMap$SolrDynamicMBean.getMBeanInfo(JmxMonitoredMap.java:236)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:140)
at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.registerExtraMBeans(CassandraCoreContainer.java:679)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.register(CassandraCoreContainer.java:427)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.doLoad(CassandraCoreContainer.java:757)
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.load(CassandraCoreContainer.java:162)
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$2.run(AbstractSolrSecondaryIndex.java:882)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: org.apache.lucene.search.FieldCache$CreationPlaceholder cannot be cast to org.apache.solr.search.SolrCache
at com.datastax.bdp.search.solr.FilterCacheMBean.getStatistics(FilterCacheMBean.java:174)
... 16 more
I have absolutely no idea what this is.
--
Has anyone encountered these errors/exceptions/warnings before? What did you do?
Issue #1: The max waiting time to load a core was hard-coded at 1 min. So, your assumption is right: a very large core or hundreds of cores could prevent the node starting due to the excessive time to load this particular core. In the next patch release (4.5.6, 4.6.1) we address this issue by creating a new option load_max_time_per_core in dse.yaml. This option allows you to increase the max waiting time for core loading, starting at 1 min. For 500 cores you would need to increase load_max_time_per_core to about 3 minutes, for example.
Issue #2: Unfortunately, I don't know what could be causing this. We would need further info about this to see why it's happening.
Issue #3: We have currently investigating what this can be.
Regarding issue #2, are you sure you don't have a QuerySenderListener with a wrong warmup query in your solrconfig?

How to debug long-running mapreduce jobs with millions of writes?

I am using the simple control.start_map() function of the appengine-mapreduce library to start a mapreduce job. This job successfully completes and shows ~43M mapper-calls on the resulting /mapreduce/detail?mapreduce_id=<my_id> page. However, this page makes no mention of the reduce step or any of the underlying appengine-pipeline processes that I believe are still running. Is there some way to return the pipeline ID that this calls makes so I can look at the underlying pipelines to help debug this long-running job? I would like to retrieve enough information to pull up this page: /mapreduce/pipeline/status?root=<guid>
Here is an example of the code I am using to start up my mapreduce job originally:
from third_party.mapreduce import control
mapreduce_id = control.start_map(
name="Backfill",
handler_spec="mark_tos_accepted",
reader_spec=(
"third_party.mapreduce.input_readers.DatastoreInputReader"),
mapper_parameters={
"input_reader": {
"entity_kind": "ModelX"
},
},
shard_count=64,
queue_name="backfill-mapreduce-queue",
)
Here is the mapping function:
# This is where we keep our copy of appengine-mapreduce
from third_party.mapreduce import operation as op
def mark_tos_accepted(modelx):
# Skip users who have already been marked
if (not modelx
or modelx.tos_accepted == myglobals.LAST_MATERIAL_CHANGE_TO_TOS):
return
modelx.tos_accepted = user_models.LAST_MATERIAL_CHANGE_TO_TOS
yield op.db.Put(modelx)
Here are the relevant portions of the ModelX:
class BackupModel(db.Model):
backup_timestamp = db.DateTimeProperty(indexed=True, auto_now=True)
class ModelX(BackupModel):
tos_accepted = db.IntegerProperty(indexed=False, default=0)
For more context, I am trying to debug a problem I am seeing with writes showing up in our data warehouse.
On 3/23/2013, we launched a MapReduce job (let's call it A) over a db.Model (let's call it ModelX) with ~43M entities. 7 hours later, the job "finished" and the /mapreduce/detail page showed that we had successfully mapped over all of the entities, as shown below.
mapper-calls: 43613334 (1747.47/sec avg.)
On 3/31/2013, we launched another MapReduce job (let's call it B) over ModelX. 12 hours later, the job finished with status Success and the /mapreduce/detail page showed that we had successfully mapped over all of the entities, as shown below.
mapper-calls: 43803632 (964.24/sec avg.)
I know that MR job A wrote to all ModelX entities, since we introduced a new property that none of the entities contained before. The ModelX contains an auto_add property like so.
backup_timestamp = ndb.DateTimeProperty(indexed=True, auto_now=True)
Our data warehousing process runs a query over ModelX to find those entities that changed on a certain day and then downloads those entities and stores them in a separate (AWS) database so that we can run analysis over them. An example of this query is:
db.GqlQuery('select * from ModelX where backup_timestamp >= DATETIME(2013, 4, 10, 0, 0, 0) and backup_timestamp < DATETIME(2013, 4, 11, 0, 0, 0) order by backup_timestamp')
I would expect that our data warehouse would have ~43M entities on each of the days that the MR jobs completed, but it is actually more like ~3M, with each subsequent day showing an increase, as shown in this progression:
3/16/13 230751
3/17/13 193316
3/18/13 344114
3/19/13 437790
3/20/13 443850
3/21/13 640560
3/22/13 612143
3/23/13 547817
3/24/13 2317784 // Why isn't this ~43M ?
3/25/13 3701792 // Why didn't this go down to ~500K again?
3/26/13 4166678
3/27/13 3513732
3/28/13 3652571
This makes me think that although the op.db.Put() calls issued by the mapreduce job are still running in some pipeline or queue and causing this trickle effect.
Furthermore, if I query for entities with an old backup_timestamp, I can go back pretty far and still get plenty of entities, but I would expect all of these queries to return 0:
In [4]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2013,2,23,1,1,1)').count()
Out[4]: 1000L
In [5]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2013,1,23,1,1,1)').count()
Out[5]: 1000L
In [6]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2012,1,23,1,1,1)').count()
Out[6]: 1000L
However, there is this strange behavior where the query returns entities that it should not:
In [8]: old = ModelX.all().filter('backup_timestamp <', 'DATETIME(2012,1,1,1,1,1)')
In [9]: paste
for o in old[1:100]:
print o.backup_timestamp
## -- End pasted text --
2013-03-22 22:56:03.877840
2013-03-22 22:56:18.149020
2013-03-22 22:56:19.288400
2013-03-22 22:56:31.412290
2013-03-22 22:58:37.710790
2013-03-22 22:59:14.144200
2013-03-22 22:59:41.396550
2013-03-22 22:59:46.482890
2013-03-22 22:59:46.703210
2013-03-22 22:59:57.525220
2013-03-22 23:00:03.864200
2013-03-22 23:00:18.040840
2013-03-22 23:00:39.636020
Which makes me think that the index is just taking a long time to be updated.
I have also graphed the number of entities that our data warehousing downloads and am noticing some cliff-like drops that makes me think that there is some behind-the-scenes throttling going on somewhere that I cannot see with any of the diagnostic tools exposed on the appengine dashboard. For example, this graph shows a fairly large spike on 3/23, when we started the mapreduce job, but then a dramatic fall shortly thereafter.
This graph shows the count of entities returned by the BackupTimestamp GqlQuery for each 10-minute interval for each day. Note that the purple line shows a huge spike as the MapReduce job spins up, and then a dramatic fall ~1hr later as the throttling kicks in. This graph also shows that there seems to be some time-based throttling going on.
I don't think you'll have any reducer functions there, because all you've done is start a mapper. To do a complete mapreduce, you have to explicitly instantiate a MapReducePipeline and call start on it. As a bonus, that answers your question, as it returns the pipeline ID which you can then use in the status URL.
Just trying to understand the specific problem. Is it that you are expecting a bigger number of entities in your AWS database? I would suspect that the problem lies with the process that downloads your old ModelX entities into an AWS database, that it's somehow not catching all the updated entities.
Is the AWS-downloading process modifying ModelX in any way? If not, then why would you be surprised at finding entities with an old modified time stamp? modified would only be updated on writes, not on read operations.
Kind of unrelated - with respect to throttling I've usually found a throttled task queue to be the problem, so maybe check how old your tasks in there are or if your app is being throttled due to a large amount of errors incurred somewhere else.
control.start_map doesn't use pipeline and has no shuffle/reduce step. When the mapreduce status page shows its finished, all mapreduce related taskqueue tasks should have finished. You can examine your queue or even pause it.
I suspect there are problems related to old indexes for the old Model or to eventual consistency. To debug MR, it is useful to filter your warnings/errors log and search by the mr id. To help with your particular case, it might be useful to see your Map handler.

Resources