I have a simulation with a step that allows me to publish to different endpoints.
class MySimulation extends Simulation {
// some init code
var testTitle = this.getClass.getSimpleName
val myscenario = scenario("Scn Description")
.exec(PublishMessageRandom(pConfigTest, testTitle + "-" + numProducers, numProducers))
if (testMode == "debug") {
setUp(
myscenario.inject(
atOnceUsers(1)
)
).protocols(httpConf)
} else if (testMode == "open") {
setUp(
myscenario.inject(
rampConcurrentUsers(concurrentUserMin) to (concurrentUserMax) during (durationInMinutes minutes),
)
).protocols(httpConf)
}
}
Now here is my PublishMessageRandom definition
def PublishMessageRandom(producerConfig : ProducerConfig, testTitle : String, numberOfProducers : Int ) = {
val jsonBody = producerConfig.asJson
val valuedJsonBody = Printer.noSpaces.copy(dropNullValues = true).print(jsonBody)
println(valuedJsonBody)
val nodes : Array[String] = endpoints.split(endpointDelimiter)
val rnd = scala.util.Random
val rndIndex = rnd.nextInt(numberOfProducers)
var endpoint = "http://" + nodes(rndIndex) + perfEndpoint
println("endpoint:" + endpoint)
exec(http(testTitle)
.post(endpoint)
.header(HttpHeaderNames.ContentType, HttpHeaderValues.ApplicationJson)
.body(StringBody(valuedJsonBody))
.check(status.is(200))
.check(bodyString.saveAs("serverResponse"))
)
// the below is only useful in debug mode. Comment it out for longer tests
/*.exec { session =>
println("server_response: " + session("serverResponse").as[String])
println("endpoint:" + endpoint)
session */
}
}
As you can see, it simply picks one of the endpoints at random. Unfortunately I see the println("endpoint:" + endpoint) output only once, and it looks like a single endpoint is picked randomly at startup and then hit for the entire run, instead of the desired behavior of hitting a random endpoint on each request.
Can someone explain that behavior? Is Gatling caching the step, and how do I work around that?
Quoting the official documentation:
Warning
Gatling DSL components are immutable ActionBuilder(s) that have to be
chained altogether and are only built once on startup. The result is
a workflow chain of Action(s). These builders don’t do anything by
themselves, they don’t trigger any side effect, they are just
definitions. As a result, creating such DSL components at runtime in
functions is completely meaningless.
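In other words, anything that should vary per request has to be resolved at run time, through the Session/Expression mechanism or a feeder, not while the scenario chain is being built. Purely as an illustration (this is not the approach I ended up using, which follows below; it assumes the same nodes, perfEndpoint and numberOfProducers values as in the code above, and the exact imports may differ between Gatling versions), the endpoint could also be computed per request with a session function:
import io.gatling.commons.validation._
import io.gatling.core.session.Expression

// evaluated every time the request fires, so each call can hit a different node
val randomEndpoint: Expression[String] = session =>
  ("http://" + nodes(scala.util.Random.nextInt(numberOfProducers)) + perfEndpoint).success

// used as: http(testTitle).post(randomEndpoint)...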
I had to use a feeder to solve the problem; the feeder supplies a random endpoint.
// feeder is random endpoint as per number of producers
val endpointFeeder = GetEndpoints(numProducers).random
val myscenario = scenario("Vary number of producers hitting Kafka cluster")
.feed(endpointFeeder)
.exec(PublishMessageRandom(pConfigTest, testTitle + "-" + numProducers))
and PublishMessageRandom now looks like this:
def PublishMessageRandom(producerConfig : ProducerConfig, testTitle : String ) = {
val jsonBody = producerConfig.asJson
val valuedJsonBody = Printer.noSpaces.copy(dropNullValues = true).print(jsonBody)
println(valuedJsonBody)
exec(http(testTitle)
.post("${endpoint}")
.header(HttpHeaderNames.ContentType, HttpHeaderValues.ApplicationJson)
.body(StringBody(valuedJsonBody))
.check(status.is(200))
.check(bodyString.saveAs("serverResponse"))
)
}
As you can see, the line .post("${endpoint}") will hit the endpoint supplied by the feeder.
The feeder function GetEndpoints is defined as follows; it creates an array of single-entry maps, each keyed by "endpoint".
def GetEndpoints(numberOfProducers : Int ) : Array[Map[String,String]] = {
val nodes : Array[String] = endpoints.split(endpointDelimiter)
var result : Array[Map[String,String]] = Array()
for( elt <- 1 to numberOfProducers ) {
var endpoint = "http://" + nodes(elt-1) + perfEndpoint
var m : Map[String, String] = Map()
m += ("endpoint" -> endpoint )
result = result :+ m
println("map:" + m)
}
result
}
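For reference, the same feeder data can be built without the mutable intermediates (just a sketch, reusing the endpoints, endpointDelimiter and perfEndpoint values from above):
def GetEndpoints(numberOfProducers: Int): Array[Map[String, String]] =
  endpoints.split(endpointDelimiter)
    .take(numberOfProducers)
    .map(node => Map("endpoint" -> ("http://" + node + perfEndpoint)))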
I am creating an external stage, and I want it to be based on 2 URLs.
PROBLEM 1
url1 = s3://bucket1/f1/2022/2/
url2 = s3://bucket1/f3/2022/2/
create or replace stage ext_stage url = ??????????
file_format=data_format
storage_integration=s3_integration;
How can I give 2 URLs in the external stage command? Is it possible?
PROBLEM 2
Also, I need to form the URLs.
I am thinking of using a procedure for it.
CREATE OR REPLACE PROCEDURE get_url()
RETURNS STRING
LANGUAGE SCALA
RUNTIME_VERSION = '2.12'
HANDLER = 'C.run'
PACKAGES = ('com.snowflake:snowpark:latest')
AS
$$
import java.util.Calendar
object C{
def run(session: com.snowflake.snowpark.Session): String = {
try {
val cal = Calendar.getInstance()
val year =cal.get(Calendar.YEAR)
var month =cal.get(Calendar.MONTH) + 1
return "s3://bucket1/folder1/" + year + "/"+ month + "/"
}
catch {
case e: Throwable => println("Not able to form url")
return "Failed"
}
}
}
$$;
create or replace stage ext_stage url = call get_url()
file_format=data_format
storage_integration=s3_integration;
It is failing and I cannot call the procedure. How can I call it?
I am executing 2 consecutive scenarios. I have a requirement where I need to record the current time before the start of the 1st scenario and then pass that time value to the next scenario. Can someone please suggest how this can be implemented? Please check my code below.
def fileUpload() = foreach("${datasetIdList}","datasetId"){
println("File Upload Start Time::::"+Calendar.getInstance().getTime+" for datasetId ::: ${datasetId}")
exec(http("file upload").post("/datasets/${datasetId}/uploadFile")
.formUpload("File","./src/test/resources/data/Scan_good.csv")
.header("content-type","multipart/form-data")
.check(status is 200).check(status.saveAs("uploadStatus")))
.exec(session => {
if(session("uploadStatus").as[Int] == 200)
counter +=1
session
})
}
def getDataSetId() = foreach("${datasetIdList}","datasetId"){
exec(http("get datasetId")
.get("/datasets/${datasetId}")
.header("content-type","application/json")
.check(status is 200)
)
}
I need to record the upload start time for each iteration of datasetIdList, pass that value to the next scenario, and print it for each datasetId. Can someone please suggest how this can be implemented?
You may try using the before section:
package load
import io.gatling.core.Predef._
import io.gatling.http.Predef._
class TransferTimeSimulation extends Simulation {
var beforeScn1Start: Long = 0L
before {
println("Simulation is about to start!")
beforeScn1Start = System.currentTimeMillis()
}
after {
println("Simulation is finished!")
}
val scn1 = scenario("Scenario 1").exec(
http("get google")
.get("http://google.com")
.check(status.is(200))
)
val scn2 = scenario("Scenario 2")
.exec { session =>
println("beforeScn1Start = " + beforeScn1Start)
session
}
setUp(
scn1.inject(atOnceUsers(1))
.andThen(scn2.inject(atOnceUsers(1)))
)
.protocols(http)
.maxDuration(10)
.assertions(
forAll.failedRequests.count.is(0),
)
}
For more flexibility you may also consider using lazy val initialization:
https://www.baeldung.com/scala/lazy-val
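For example (just a sketch of the lazy-initialization idea, not code from the answer above):
// with lazy val the timestamp is captured on first access,
// not when the Simulation class is constructed
lazy val scn1StartTime: Long = System.currentTimeMillis()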
I hope someone can point me in the right direction!
I am trying to run one scenario with several steps that have to be executed in order, each with the same user session, for it to work properly. The code below works fine with one user but fails if I use 2 or more users...
What am I doing wrong?
val headers = Map(
Constants.TENANT_HEADER -> tenant
)
val httpConf = http
.baseURL(baseUrl)
.headers(headers)
val scen = scenario("Default Order Process Perf Test")
.exec(OAuth.getOAuthToken(clientId))
.exec(session => OAuth.createAuthHHeader(session, clientId))
.exec(RegisterCustomer.registerCustomer(customerMail, customerPassword,
tenant))
.exec(SSO.doLogin(clientId, customerMail, customerPassword, tenant))
.exec(session => OAuth.upDateAuthToken(session, clientId))
.exec(session =>
UpdateCustomerBillingAddr.prepareBillingAddrRequestBody(session))
.exec(UpdateCustomerBillingAddr.updateCustomerBillingAddr(tenant))
.exec(RegisterSepa.startRegisterProcess(tenant))
.exec(session => RegisterSepa.prepareRegisterRequestBody(session))
.exec(RegisterSepa.doRegisterSepa(tenant))
setUp(
scen
.inject(atOnceUsers(2))
.protocols(httpConf))
object OAuth {
private val OBJECT_MAPPER = new ObjectMapper()
def getOAuthToken(clientId: String) = {
val authCode = PropertyUtil.getAuthCode
val encryptedAuthCode = new
Crypto().rsaServerKeyEncrypt(authCode)
http("oauthTokenRequest")
.post("/oauth/token")
.formParam("refresh_token", "")
.formParam("code", encryptedAuthCode)
.formParam("grant_type", "authorization_code")
.formParam("client_id", clientId)
.check(jsonPath("$").saveAs("oauthToken"))
.check(status.is(200))
}
def createAuthHHeader(session: Session, clientId: String) = {
val tokenString = session.get("oauthToken").as[String]
val tokenDto = OBJECT_MAPPER.readValue(tokenString,
classOf[TokenDto])
val session2 = session.set(Constants.TOKEN_DTO_KEY, tokenDto)
val authHeader = AuthCommons.createAuthHeader(tokenDto,
clientId, new util.HashMap[String, String]())
session2.set(Constants.AUTH_HEADER_KEY, authHeader)
}
def upDateAuthToken(session: Session, clientId: String) = {
val ssoToken = session.get(Constants.SSO_TOKEN_KEY).as[String]
val oAuthDto = session.get(Constants.TOKEN_DTO_KEY).as[TokenDto]
val params = new util.HashMap[String, String]
params.put("sso_token", ssoToken)
val updatedAuthHeader = AuthCommons.createAuthHeader(oAuthDto,
clientId, params)
session.set(Constants.AUTH_HEADER_KEY, updatedAuthHeader)
}
}
I added above the two methods that don't work as expected. In the first part I try to fetch a token and store it in the session via check(jsonPath("$").saveAs("oauthToken")), and in the second call I try to read that token with val tokenString = session.get("oauthToken").as[String], which fails with an exception saying that there is no entry for that key in the session...
I've copied it, removed/mocked any missing code references, and switched to one of my app's auth URLs, and it seems to work - at least the first 2 steps do.
One thing that seems weird is jsonPath("$").saveAs("oauthToken"), which saves the whole JSON (not a single field) as a session attribute - is that really what you want to do? And are you sure that getOAuthToken is working properly?
You said that it works for 1 user but fails for 2 - aren't there any more errors? For debugging I suggest changing the logging level to TRACE, or adding exec(session => {println(session); session}) before the second step, to verify whether the token is properly saved to the session. I think something is wrong with the authorization request (or with building that request) and it somehow fails or throws an exception. I would comment out all steps except the 1st and focus on checking whether that first request is executed properly and adds the expected attribute to the session.
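For reference, that debug step on its own (just a sketch; it only dumps the virtual user's session so you can see whether oauthToken was saved):
exec { session =>
  println(session) // prints every attribute currently held by this virtual user
  session
}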
I think your parentheses are not placed correctly. Change them to this:
setUp(
scn.inject(atOnceUsers(2))
).protocols(httpConf)
I wrote a Spark program in Scala, of which the main code is:
val centers:Array[(Vector,Double)] = initCenters(k)
val sumsMap: Map[Int, (Vector, Int)] = data.mapPartitions{
***
}.reduceByKey(***).collectAsMap()
sumsMap.foreach { case (index, (sum, count)) =>
  sum /= count
  centers(index) = (sum, sum.norm2())
}
The original code is:
val centers = initCenters.getOrElse(initCenter(data))
val br_centers = data.sparkContext.broadcast(centers)
val trainData = data.map(e => (e._2, e._2.norm2)).cache()
val squareStopBound = stopBound * stopBound
var isConvergence = false
var i = 0
val costs = data.sparkContext.doubleAccumulator
while (!isConvergence && i < maxIters) {
costs.reset()
val res = trainData.mapPartitions { iter =>
val counts = new Array[Int](k)
util.Arrays.fill(counts, 0)
val partSum = (0 until k).map(e => new DenseVector(br_centers.value(0)._1.size))
iter.foreach { e =>
val (index, cost) = KMeans.findNearest(e, br_centers.value)
costs.add(cost)
counts(index) += 1
partSum(index) += e._1
}
counts.indices.filter(j => counts(j) > 0).map(j => (j -> (partSum(j), counts(j)))).iterator
}.reduceByKey { case ((s1, c1), (s2, c2)) =>
(s1 += s2, c1 + c2)
}.collectAsMap()
br_centers.unpersist(false)
println(s"cost at iter: $i is: ${costs.value}")
isConvergence = true
res.foreach { case (index, (sum, count)) =>
sum /= count
val sumNorm2 = sum.norm2()
val squareDist = math.pow(centers(index)._2, 2.0) + math.pow(sumNorm2, 2.0) - 2 * (centers(index)._1 * sum)
if (squareDist >= squareStopBound) {
isConvergence = false
}
centers.update(index,(sum, sumNorm2))
}
i += 1
}
When this runs in pseudo-distributed mode in IDEA, the centers get updated, but when I run it on a Spark cluster, the centers do not get updated.
LostInOverflow's answer is correct, but not especially descriptive as to what's going on.
Here are some important properties of your code:
declare an array centers
broadcast this array as br_centers
update centers iteratively
So how is this going wrong? Well, broadcasts are static. If I write:
val a = Array(1,2,3)
val aBc = sc.broadcast(a)
a(0) = 67
and access aBc.value(0), I'm going to get different results depending on whether this code was run on the driver JVM or not. Broadcasting takes an object, torrents it across the network to each node, and creates a new reference in each JVM. This reference exists as it did when the base object was broadcasted, and it is NOT updated in real time as you mutate the base object.
What's the solution? I think moving the broadcast inside the while loop so that you broadcast the updated centers should work:
while (!isConvergence && i < maxIters) {
val br_centers = data.sparkContext.broadcast(centers)
...
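With the broadcast created inside the loop, the existing br_centers.unpersist(false) call after collectAsMap() then releases each iteration's copy before the next one is broadcast.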
Please check the Understanding closures section in the programming guide.
Spark is a distributed system, and the behavior of the code you've shown is simply undefined. It works in local mode only by accident, because everything executes in a single JVM.
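A minimal sketch of the pitfall that section describes (the standard counter example from the Spark docs, not code from the question): mutating a driver-side variable inside an RDD closure only mutates the executors' deserialized copies.
var counter = 0
val rdd = sc.parallelize(1 to 100)
rdd.foreach(x => counter += x) // each executor increments its own copy of the closure
println(counter)               // still 0 in cluster mode; the driver's counter is untouched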
I wrote a simple stream using the akka-streams API, assuming it would handle my source, but unfortunately it doesn't. I am sure I am doing something wrong in my source. I simply created an iterator which generates a very large number of elements, assuming it wouldn't matter because the akka-streams API would take care of backpressure. What am I doing wrong? This is my iterator:
def createData(args: Array[String]): Iterator[TimeSeriesValue] = {
var data = new ListBuffer[TimeSeriesValue]()
for (i <- 1 to range) {
sessionId = UUID.randomUUID()
for (j <- 1 to countersPerSession) {
time = DateTime.now()
keyName = s"Encoder-${sessionId.toString}-Controller.CaptureFrameCount.$j"
for (k <- 1 to snapShotCount) {
time = time.plusSeconds(2)
fValue = new Random().nextLong()
data += TimeSeriesValue(sessionId, keyName, time, fValue)
totalRows += 1
}
}
}
data.iterator
}
The problem is primarily in the line
data += TimeSeriesValue(sessionId, keyName, time, fValue)
You are continuously adding to the ListBuffer with a "very large number of elements". This is chewing up all of your RAM. The data.iterator line simply wraps the massive ListBuffer blob inside of an iterator to provide each element one at a time; it's basically just a cast.
Your assumption that "it won't matter because ... of backpressure" is only partially true: the akka Stream will process the TimeSeriesValue values reactively, but you are creating all of them even before you get to the Source constructor.
If you want this iterator to be "lazy", i.e. only produce values when needed and not consume memory, then make the following modifications (note: I broke apart the code to make it more readable):
def createTimeSeries(startTime: DateTime, snapShotCount : Int, sessionId : UUID, keyName : String) =
Iterator.range(1, snapShotCount)
.map(_ * 2)
.map(startTime plusSeconds _)
.map(t => TimeSeriesValue(sessionId, keyName, t, ThreadLocalRandom.current().nextLong()))
def sessionGenerator(countersPerSession : Int, sessionID : UUID) =
Iterator.range(1, countersPerSession)
.map(j => s"Encoder-${sessionId.toString}-Controller.CaptureFrameCount.$j")
.flatMap { keyName =>
createTimeSeries(DateTime.now(), snapShotCount, sessionID, keyName)
}
object UUIDIterator extends Iterator[UUID] {
def hasNext : Boolean = true
def next() : UUID = UUID.randomUUID()
}
def iterateOverIDs(range : Int) =
UUIDIterator.take(range)
.flatMap(sessionID => sessionGenerator(countersPerSession, sessionID))
Each one of the above functions returns an Iterator. Therefore, calling iterateOverIDs should be instantaneous, because no work is done up front and de minimis memory is consumed. This iterator can then be passed into your Stream...
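For example (a sketch only; the downstream stages and materializer setup are assumed to match your existing stream):
import akka.NotUsed
import akka.stream.scaladsl.Source

val source: Source[TimeSeriesValue, NotUsed] =
  Source.fromIterator(() => iterateOverIDs(range))
// elements are now generated one at a time as downstream demand arrives,
// so backpressure also limits how fast TimeSeriesValue instances are created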