I have defined a route in a backend module this way:
(r'/_ah/start', 'ConversionTaskQueueWorker'),
My backend module is terminated like so:
But if I put a log statement in the handler's loop:
class ConversionTaskQueueWorker(webapp2.RequestHandler):
    def get(self):
        """Indefinitely fetch tasks and update the datastore."""
        q = taskqueue.Queue(TASK_QUEUE_NAME)
        while True:
            LOG.info('Keeping it alive')
            ...
It stays up and running.
Why is that?
I don't want to flood my log with this message. Is there any alternative way to keep the backend module running?
Background story: I need to obtain the handles of the tagged Twitter users from an attached Twitter media. There's no current API method to do that unfortunately (see https://twittercommunity.com/t/how-to-get-tags-of-a-media-in-a-tweet/185614 and https://github.com/twitterdev/open-evolution/issues/34).
I have no other choice but to scrape. This is an example URL: https://twitter.com/justinwood_/status/1626275168157851650/media_tags. It is the page that pops up when you click the tags link under the media of the parent Tweet: https://twitter.com/justinwood_/status/1626275168157851650/
The React-generated DOM is deep and ugly but would be scrapeable. However, I do not want to log in with any account and risk getting it banned. Unfortunately, when you visit https://twitter.com/justinwood_/status/1626275168157851650/media_tags in an Incognito window, the popup shows up completely empty. When I dig into the network requests, though, the response of the /TweetDetail GraphQL endpoint is full of messages about the anonymous page visit, yet it still contains the list of handles I need.
So what I need to have is a scraper which is able to process JavaScript, and capture the response for that specific GraphQL call. Selenium uses a headless Chrome under the hood, so it is able to process JavaScript, and Selenium-Wire offers the ability to capture the response.
Unfortunately, my Selenium-Wire script only captures the TweetResultByRestId and UsersByRestId GraphQL requests; TweetDetail is missing. I don't know what to tweak to make all the requests happen. I have iterated over a ton of Chrome options. Here is one variation of my script:
from seleniumwire import webdriver
from selenium.webdriver.chrome.service import Service
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--headless") # for Jenkins
chrome_options.add_argument("--disable-dev-shm-usage") # Jenkins
chrome_options.add_argument('--start-maximized')
chrome_options.add_argument('--window-size=1900,1080')
chrome_options.add_argument('--ignore-certificate-errors-spki-list')
chrome_options.add_argument('--ignore-ssl-errors')
selenium_options = {
    'request_storage_base_dir': '/tmp',  # Use /tmp to store captured data
    'exclude_hosts': ''
}
ser = Service('/usr/bin/chromedriver')
ser.service_args=["--verbose", "--log-path=test.log"]
driver = webdriver.Chrome(service=ser, options=chrome_options, seleniumwire_options=selenium_options)
tweet_id = "1626275168157851650"
twitter_media_url = f"https://twitter.com/justinwood_/status/{tweet_id}/media_tags"
driver.get(twitter_media_url)
driver.wait_for_request("/TweetDetail", timeout=10)
Any ideas?
It looks like I need to scrape the parent Tweet URL https://twitter.com/justinwood_/status/1626275168157851650/ instead: with that URL, the TweetDetail GraphQL call I was after seems to happen. I probably got confused while trying 100 combinations.
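Once the request is captured, picking it out of the recorded traffic and parsing it could look roughly like the sketch below. It is a minimal illustration rather than a tested scraper: `find_graphql_response` and `parse_json_body` are hypothetical helper names, the first helper only relies on the `url` and `response` attributes that selenium-wire exposes on captured requests, and the second assumes the body has already been decompressed (selenium-wire ships `seleniumwire.utils.decode` for that). The layout of the TweetDetail JSON is undocumented, so extracting the handles from it is left out.

```python
import json


def find_graphql_response(requests, operation):
    """Return the first captured request for `operation` that has a response.

    `requests` is any iterable of objects with `url` and `response`
    attributes, for example `driver.requests` from selenium-wire.
    Returns None when nothing matches.
    """
    marker = "/" + operation
    for request in requests:
        if marker in request.url and request.response is not None:
            return request
    return None


def parse_json_body(body):
    """Parse an already-decompressed response body (bytes) as JSON."""
    return json.loads(body.decode("utf-8"))


# Assumed usage with the driver from the script above (not executed here):
#   from seleniumwire.utils import decode
#   driver.get(parent_tweet_url)
#   driver.wait_for_request("/TweetDetail", timeout=10)
#   hit = find_graphql_response(driver.requests, "TweetDetail")
#   raw = decode(hit.response.body,
#                hit.response.headers.get("Content-Encoding", "identity"))
#   data = parse_json_body(raw)
```

Matching on "/TweetDetail" rather than on "Tweet" keeps TweetResultByRestId out of the result.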
In the UnetStack shell we can delete a route, but how do we delete a route in Groovy code without using the Agent?
Refer to the documentation for the Routing service in the Unet handbook for how to do this. Essentially, you need to create an EditRouteReq using the deleteRoute() constructor method and send it to the agent providing the ROUTING service.
For example, if you want to delete all routes to node 10, this will look something like:
def router = agentForService(Services.ROUTING)
def req = EditRouteReq.deleteRoute()
req.to = 10
def rsp = router << req
// rsp will have performative AGREE if the request was successful
The code:
package simulations
import io.gatling.core.Predef._
import io.gatling.http.Predef._
class StarWarsBasicExample extends Simulation
{
  // 1 Http Conf
  val httpConf = http.baseUrl("https://swapi.dev/api/films/")
  // 2 Scenario Definition
  val scn = scenario("Star Wars API")
    .exec(http("Get Number")
      .get("4")
      .check(jsonPath("$.episode_id")
        .saveAs("episodeId"))
    )
    .exec(session => {
      val movie = session("episodeId").as[String]
      session.set("episode", movie)
    }).pause(4)
    .exec(http("$episode")
      .get("$episode"))
  // 3 Load Scenario
  setUp(
    scn.inject(atOnceUsers(1)))
    .protocols(httpConf)
}
I am trying to grab a variable from the first GET request and inject it into a second request, but I am unable to do so despite following the documentation. There might be something I'm not understanding.
When I set breakpoints and step through the process, it appears that the session function executes AFTER both requests have completed, by which time it is too late. I can't seem to make the session function run between the two requests.
Already answered on Gatling's community mailing list.
"$episode" is not correct Gatling Expression Language syntax. "${episode}" is correct.
I am using the App Engine Connected Android plugin support and customized the sample project shown at Google I/O. It ran successfully, and I successfully wrote some tasks from the Android device to the cloud using this code:
CloudTasksRequestFactory factory = (CloudTasksRequestFactory) Util
        .getRequestFactory(CloudTasksActivity.this,
                CloudTasksRequestFactory.class);
TaskRequest request = factory.taskRequest();
TaskProxy task = request.create(TaskProxy.class);
task.setName(taskName);
task.setNote(taskDetails);
task.setDueDate(dueDate);
request.updateTask(task).fire();
This works well; I have tested it.
What I am trying to do now is this: I have an array String[][] addArrayServer and want to write all of its elements to the server. The approach I am using is:
NoteSyncDemoRequestFactory factory = Util.getRequestFactory(activity,NoteSyncDemoRequestFactory.class);
NoteSyncDemoRequest request = factory.taskRequest();
TaskProxy task;
for(int ik=0;ik<addArrayServer.length;ik++) {
    task = request.create(TaskProxy.class);
    Log.d(TAG,"inside uploading task:"+ik+":"+addArrayServer[ik][1]);
    task.setTitle(addArrayServer[ik][1]);
    task.setNote(addArrayServer[ik][2]);
    task.setCreatedDate(addArrayServer[ik][3]);
    // made one task
    request.updateTask(task).fire();
}
One task is uploaded for sure: the first element of the array. But it hangs when making a new instance of task. I am pretty new to Google App Engine. What is the right way to call the RPC to upload multiple entities really fast?
Thanks.
Well, the answer to this question is that request.fire() can be called only once for a request object, but I was calling it on every iteration of the loop, so it would update only once. The simple solution is to call fire() outside the loop.
NoteSyncDemoRequestFactory factory = Util.getRequestFactory(activity,NoteSyncDemoRequestFactory.class);
NoteSyncDemoRequest request = factory.taskRequest();
TaskProxy task;
for(int ik=0;ik<addArrayServer.length;ik++) {
    task = request.create(TaskProxy.class);
    Log.d(TAG,"inside uploading task:"+ik+":"+addArrayServer[ik][1]);
    task.setTitle(addArrayServer[ik][1]);
    task.setNote(addArrayServer[ik][2]);
    task.setCreatedDate(addArrayServer[ik][3]);
    // made one task
    request.updateTask(task);
}
request.fire(); //call fire only once after all the actions are done...
For more info, check out this question: GWT RequestFactory and multiple requests
What I'm trying to do is load different applications (webapp2.WSGIApplication) depending on the request domain.
For example, www.domain_1.com should load the application in app1.main.application, while www.domain_2.com should load app2.main.application.
Of course, I'm on the same GAE app ID, and I'm using namespaces to separate the apps' data.
This works pretty well with 'threadsafe: false' and a runner.py file in which a function determines which application to return.
It seems that with 'threadsafe: true' the first request loads the WSGIApplication into the instance, and further requests no longer execute the application-dispatching logic, so the request gets a response from the wrong app.
I'm using Python 2.7 and webapp2.
What is the best way to do this?
Edit:
A very simplified version of my runner.py:
def main():
    if domain == 'www.mydomain_1.com':
        from app_1 import application
        namespace = 'app_1'
    elif domain == 'www.domain_2.com':
        from app_2 import application
        namespace = 'app_2'
    namespace_manager.set_namespace(namespace)
    return wsgiref.handlers.CGIHandler().run(application)
if __name__ == '__main__':
    main()
and in app.yaml
- url: /.*
  script: app-runner.py
Your runner script is a CGI script. The full behavior of a CGI script with multithreading turned on is not documented, and the way the docs are written I'm guessing this won't be supported fully. Instead, the docs say you must refer to the WSGI application object directly from app.yaml, using the module path to a global variable containing the object, when multithreading is turned on. (CGI scripts retain their old behavior in Python 2.7 with multithreading turned off.)
The behavior you're seeing is explained by your use of imports. Within a single instance, each import statement only has an effect the first time it is encountered. After that, the module is assumed to be imported and the import statement has no effect on subsequent requests. You can import both values into separate names, then call run() with the appropriate value.
But if you want to enable multithreading (and that's a good idea), your dispatcher should be a WSGI application itself, stored in a module global referred to by app.yaml. I don't know offhand how to dispatch a request to another WSGI application from within a WSGI application, but that might be a reasonable thing to do. Alternatively, you might consider using or building a layer above WSGI to do this dispatch.
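One way such a dispatch could look is a small WSGI callable that inspects the Host header and forwards the call to the matching application. The following is a minimal sketch under stated assumptions, not a tested App Engine setup: `HostDispatcher`, `make_app` and the two stand-in apps are hypothetical names, the host names are taken from the question, and on App Engine the dispatcher would presumably also have to set the namespace (for example with `namespace_manager.set_namespace`) before forwarding.

```python
def make_app(name):
    """Hypothetical stand-in for a real webapp2.WSGIApplication."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("hello from " + name).encode("utf-8")]
    return app


class HostDispatcher(object):
    """WSGI application that forwards each request based on its Host header."""

    def __init__(self, apps_by_host, default=None):
        self.apps_by_host = apps_by_host
        self.default = default

    def __call__(self, environ, start_response):
        # HTTP_HOST may carry a port ("www.domain_1.com:8080"); drop it.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        app = self.apps_by_host.get(host, self.default)
        if app is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"unknown host"]
        return app(environ, start_response)


# Module-level global that app.yaml could refer to. The module path
# "runner.application" is an assumption made for this sketch.
application = HostDispatcher({
    "www.domain_1.com": make_app("app_1"),
    "www.domain_2.com": make_app("app_2"),
})
```

Because both applications are created when the module is first loaded, the lookup runs on every request and no longer depends on which import statement happened to execute first.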
I made it happen by subclassing webapp2.WSGIApplication and overriding __call__(), which is called before dispatching to a RequestHandler.
I also prefix the routes with the namespace (and remove the prefix in the handler's initialize) and nest the config per namespace, so that the apps can share the instance memory.
class CustomWSGIApplication(webapp2.WSGIApplication):
    def __call__(self, environ, start_response):
        routes, settings, ns = get_app(environ)
        namespace_manager.set_namespace(ns)
        environ['PATH_INFO'] = '/%s%s' % (ns, environ.get('PATH_INFO'))
        for route in routes:
            r, h = route  # returns a tuple with mapping and handler
            newroute = ('/%s%s' % (ns, r), h,)
            self.router.add(newroute)
        if settings:
            self.config[ns] = settings
        self.debug = debug
        with self.request_context_class(self, environ) as (request, response):
            try:
                if request.method not in self.allowed_methods:
                    # 501 Not Implemented.
                    raise exc.HTTPNotImplemented()
                rv = self.router.dispatch(request, response)
                if rv is not None:
                    response = rv
            except Exception, e:
                try:
                    # Try to handle it with a custom error handler.
                    rv = self.handle_exception(request, response, e)
                    if rv is not None:
                        response = rv
                except HTTPException, e:
                    # Use the HTTP exception as response.
                    response = e
                except Exception, e:
                    # Error wasn't handled so we have nothing else to do.
                    response = self._internal_error(e)
            try:
                return response(environ, start_response)
            except Exception, e:
                return self._internal_error(e)(environ, start_response)