Python Flask web application with multi-language support by host and URL prefix - url-routing

I have one server running a single Flask application instance, and several domains are mapped to this server by DNS.
My site must support several languages, selected by host and URL prefix:
mysite.com - English
mysite.com/fr - French
mysite.ru - Russian
mysite.ru/by - Belarusian
localhost or any other unknown host without a language prefix - default language (English)
I implemented this by registering every route twice, as /endpoint and /<lang>/endpoint, and by overriding the url_for function. That works, but now I must implement custom error pages for the abort function:
mysite.com/wrong-url-there - mysite.com/404.html (English)
mysite.com/fr/wrong-url-there - mysite.com/fr/404.html (French)
mysite.ru/wrong-url-there - mysite.ru/404.html (Russian)
mysite.ru/by/wrong-url-there - mysite.ru/by/404.html (Belarusian)
I don't see a solution for this.
I think my implementation is bad and needs improving. I think I must either create one application instance per site language root, each with a predefined language, or use blueprints, but I haven't found a solution that works for me yet.
Can anybody advise me how to implement this multi-language URL support with Flask, WSGI, or nginx?

This is covered in the official documentation: http://flask.pocoo.org/docs/patterns/urlprocessors/ (This is basically the same answer as Matthew Scragg's.)

I worked on something similar a few months back. I modified it a bit and pushed it to GitHub.
You can do what codegeek suggested if you are unable to make your templates language-neutral. With this method you can cut down on the number of template files needed.
https://github.com/scragg0x/Flask-Localisation-Example
mysite.py
from flask import Flask, Blueprint, g, redirect, request
app = Flask(__name__)
mod = Blueprint('mysite', __name__, url_prefix='/<lang_code>')
sites = {
    'mysite.com': 'en',
    'myothersite.com': 'fr'
}
@app.url_defaults
def add_language_code(endpoint, values):
    values.setdefault('lang_code', g.lang_code)
@app.url_value_preprocessor
def pull_lang_code(endpoint, values):
    url = request.url.split('/', 3)
    g.lang_code = sites[url[2]]
@mod.url_defaults
def add_language_code(endpoint, values):
    values.setdefault('lang_code', g.lang_code)
@mod.url_value_preprocessor
def pull_lang_code(endpoint, values):
    g.lang_code = values.pop('lang_code')
@app.route('/')
@mod.route('/')
def index():
    # Use g.lang_code to pull localized data for template
    return 'lang = %s' % g.lang_code
app.register_blueprint(mod)
tests.py
import os
import unittest
import re
import requests
import urllib2
import json
from mysite import app
class MySiteTestCase(unittest.TestCase):
    def setUp(self):
        app.config['TESTING'] = True
        app.config['SERVER_NAME'] = 'mysite.com'
        self.domain = 'http://mysite.com/'
        self.app = app.test_client()
    def tearDown(self):
        pass
    def test_en_index(self):
        rv = self.app.get('/en/', self.domain)
        self.assertEqual(rv.data, 'lang = en')
        print self.domain, rv.data
    def test_fr_index(self):
        rv = self.app.get('/fr/', self.domain)
        self.assertEqual(rv.data, 'lang = fr')
        print self.domain, rv.data
    def test_default(self):
        rv = self.app.get('/', self.domain)
        self.assertEqual(rv.data, 'lang = en')
        print self.domain, rv.data
class MyOtherSiteTestCase(unittest.TestCase):
    def setUp(self):
        app.config['TESTING'] = True
        app.config['SERVER_NAME'] = 'myothersite.com'
        self.domain = 'http://myothersite.com/'
        self.app = app.test_client()
    def tearDown(self):
        pass
    def test_en_index(self):
        rv = self.app.get('/en/', self.domain)
        self.assertEqual(rv.data, 'lang = en')
        print self.domain, rv.data
    def test_fr_index(self):
        rv = self.app.get('/fr/', self.domain)
        self.assertEqual(rv.data, 'lang = fr')
        print self.domain, rv.data
    def test_default(self):
        rv = self.app.get('/', self.domain)
        self.assertEqual(rv.data, 'lang = fr')
        print self.domain, rv.data
if __name__ == '__main__':
    unittest.main()

Disclaimer: This code is not tested. I am just giving you a ballpark idea of how to approach this.
I suggest you use blueprints in combination with an extension like Flask-Babel. For example, you can do something like:
views.py
mysitebp = Blueprint('mysitebp',__name__)
Then in your application package (usually __init__.py) , you can do:
__init__.py
from mysite.views import mysitebp
app = Flask(__name__)
app.register_blueprint(mysitebp,url_prefix='/en/',template_folder='en')
app.register_blueprint(mysitebp,url_prefix='/fr',template_folder='fr')
..and so on
Your directory structure could look like:
mysite/
    __init__.py
    views.py
    templates/
        base.html
        404.html
        en/
            en.html
        fr/
            french.html
Flask-Babel would help you translate the 404.html etc.

My own solution:
from flask import Flask, g, render_template, redirect, request
app = Flask(__name__)
default_language = 'en'
language_urls = {
    'en': 'mysite.com',
    'fr': 'mysite.com/fr',
    'ru': 'mysite.ru',
    'by': 'mysite.ru/by',
}
languages = ','.join(language_urls.keys())
def get_language_by_request(request_host, request_path):
    '''
    Looks bad, but it works.
    I can't use request.view_args here,
    because it can't detect the language for 404 pages
    like mysite.com/fr/unknown-page
    '''
    request_host_path = request_host + request_path
    request_paths = request_host_path.split('/', 2)
    if (len(request_paths) > 1 and request_paths[1] in language_urls.keys()):
        request_language_prefix = request_paths[1]
        return request_language_prefix
    for language, url in language_urls.items():
        host_prefix = url.split('/')
        if len(host_prefix) == 1:
            host, = host_prefix
            if request_host == host:
                return language
    return default_language
def get_language_url_parameter_value(language, request_host):
    host_prefix = language_urls[language]
    if host_prefix == request_host:
        return None
    return language
def get_redirection_url_by_request(request_host, request_path, request_url):
    '''
    Looks bad, but it works.
    I can't use request.view_args here,
    because it can't detect the language for 404 pages
    like mysite.com/fr/unknown-page
    '''
    request_host_path = request_host + request_path
    request_paths = request_host_path.split('/', 2)
    request_language_prefix = None
    if (len(request_paths) > 1 and request_paths[1] in language_urls.keys()):
        request_language_prefix = request_paths[1]
    hosts = []
    for language, url in language_urls.items():
        host_prefix = url.split('/')
        if len(host_prefix) == 1:
            host, = host_prefix
            language_prefix = None
        else:
            host, language_prefix = host_prefix
        if request_host == host and request_language_prefix == language_prefix:
            return None
        hosts.append(host)
    if request_host not in hosts:
        return None
    if request_language_prefix:
        request_host_prefix = request_host + '/' + request_language_prefix
        host_prefix = language_urls[request_language_prefix]
        return request_url.replace(request_host_prefix, host_prefix)
    return None
@app.url_defaults
def set_language_in_url(endpoint, values):
    if '_lang' not in values and hasattr(g, 'language_url_value'):
        values['_lang'] = g.language_url_value
@app.url_value_preprocessor
def get_language_from_url(endpoint, values):
    g.language = get_language_by_request(request.host, request.path)
    g.language_url_value = get_language_url_parameter_value(g.language, request.host)
    if values and '_lang' in values:
        del values['_lang']
@app.before_request
def check_language_redirection():
    redirection_url = get_redirection_url_by_request(request.host, request.path, request.url)
    return redirect(redirection_url) if redirection_url else None
@app.route('/')
@app.route('/<any(%s):_lang>/' % languages)
def home():
    return render_template('home.html')
@app.route('/other/')
@app.route('/<any(%s):_lang>/other/' % languages)
def other():
    return render_template('other.html')
I don't use blueprints here because I also use Flask-Login, and I can't set up several login pages in different languages, one per blueprint. For example, if a page requires login, Flask redirects me to the login page, and I must update the language for that page. Also, login pages can't live at mysite.com/login, mysite.com/fr/login, etc. without several redirections.
UPD: I can't use request.view_args to detect the language or the redirection, because in that case I can't detect the language for error pages such as mysite.com/fr/wrong-page-there (the endpoint and view_args can't be determined). To avoid this problem I can use a hack: add a URL rule such as /<lang_code>/<path:path> and raise a 404 error there.
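The host-and-prefix lookup discussed in this thread can also be kept separate from Flask, as a plain function that works for unknown URLs too, since it only needs the request host and path. The following is a minimal sketch, not the code from any of the answers: the LANGUAGES table, detect_language and error_page_path are hypothetical names, and unlike the solution above it does not redirect a prefix that belongs to another host; it simply falls back to the host's root language.

```python
# Hypothetical lookup table: (host, URL prefix) -> language code.
# An empty prefix means "no language prefix in the path".
LANGUAGES = {
    ("mysite.com", ""): "en",
    ("mysite.com", "fr"): "fr",
    ("mysite.ru", ""): "ru",
    ("mysite.ru", "by"): "by",
}
DEFAULT_LANGUAGE = "en"


def detect_language(host, path):
    """Return (language, prefix) for a request host and path."""
    # First path segment, e.g. "fr" for "/fr/wrong-url-there".
    first_segment = path.lstrip("/").split("/", 1)[0]
    if (host, first_segment) in LANGUAGES:
        return LANGUAGES[(host, first_segment)], first_segment
    # No recognised prefix: fall back to the host's root language,
    # or to the default language for unknown hosts such as localhost.
    return LANGUAGES.get((host, ""), DEFAULT_LANGUAGE), ""


def error_page_path(host, path, status=404):
    """Return the localized error page path, e.g. "/fr/404.html"."""
    _, prefix = detect_language(host, path)
    if prefix:
        return "/%s/%d.html" % (prefix, status)
    return "/%d.html" % status


if __name__ == "__main__":
    for host, path in [
        ("mysite.com", "/wrong-url-there"),
        ("mysite.com", "/fr/wrong-url-there"),
        ("mysite.ru", "/by/wrong-url-there"),
        ("localhost", "/"),
    ]:
        print(host + path, detect_language(host, path)[0], error_page_path(host, path))
```

In a Flask application, a function like this could be called with request.host and request.path from an error handler, where request.view_args is not available.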

Related

Custom Locust User for SageMaker Endpoint Keeps running after time limit is reached

I have been trying to build a SagemakerUser from the base User class in the Locust library. The issue though is when I use it with a timed shape test, when said test ends (you can see a message: Shape test stopping) the load test shrugs it off and continues. Below is the script I have written to this end. My question is how is this behaviour explained?
import pandas as pd
from locust import HttpUser, User, task, TaskSet, events, LoadTestShape
from sagemaker.serializers import JSONSerializer
from sagemaker.session import Session
import sagemaker
import time
import sys
import math
import pdb
df = "some df to load samples from"
endpoint = "sage maker end point name"
class SagemakerClient(sagemaker.predictor.Predictor):
    def predictEx(self, data):
        start_time = time.time()
        start_perf_counter = time.perf_counter()
        name = 'predictEx'
        try:
            result = self.predict(data)
        except:
            total_time = int((time.perf_counter() - start_perf_counter) * 1000)
            events.request_failure.fire(request_type="sagemaker", name=name, response_time=total_time, exception=sys.exc_info(), response_length=0)
        else:
            total_time = int((time.perf_counter() - start_perf_counter) * 1000)
            events.request_success.fire(request_type="sagemaker", name=name, response_time=total_time, response_length=sys.getsizeof(result))
class SagemakerLocust(User):
    abstract = True
    def __init__(self, *args, **kwargs):
        super(SagemakerLocust, self).__init__(*args, **kwargs)
        self.client = SagemakerClient(
            sagemaker_session = Session(),
            endpoint_name = "sagemaker-test",
            serializer = JSONSerializer())
class APIUser(SagemakerLocust):
    @task
    def call(self):
        request = df.text.sample(1, weights=df.length).iloc[0]
        self.client.predictEx(request)
class StepLoadShape(LoadTestShape):
    """
    A step load shape
    Keyword arguments:
    step_time -- Time between steps
    step_load -- User increase amount at each step
    spawn_rate -- Users to stop/start per second at every step
    time_limit -- Time limit in seconds
    """
    step_time = 30#3600
    step_load = 1
    spawn_rate = 1
    time_limit =2#3600*6
    #pdb.set_trace()
    def tick(self):
        run_time = self.get_run_time()
        if run_time > self.time_limit:
            return None
        current_step = math.floor(run_time / self.step_time) + 1
        return (current_step * self.step_load, self.spawn_rate)
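To make the shape's behaviour easier to inspect in isolation, the tick() logic can be copied into a plain function that takes the run time as an argument. This is only a sketch for checking what the shape asks for at each moment; step_tick is a hypothetical name, and its defaults are the values from the script above. It does not explain why the users keep running after the shape has stopped.

```python
import math


def step_tick(run_time, step_time=30, step_load=1, spawn_rate=1, time_limit=2):
    """Mirror of StepLoadShape.tick() with run_time passed in explicitly."""
    if run_time > time_limit:
        # Returning None is how a load shape signals that the test should stop.
        return None
    current_step = math.floor(run_time / step_time) + 1
    return (current_step * step_load, spawn_rate)


if __name__ == "__main__":
    # With time_limit = 2 the shape requests one user for the first two
    # seconds and then asks the runner to stop.
    for seconds in (0, 1, 2, 3):
        print(seconds, step_tick(seconds))
```

With the commented-out production values (step_time=3600, time_limit=3600*6) the same function would add one user per hour for six hours.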

Flink runner not splitting tasks when parallelism is turned on in BEAM python pipeline

I have a Beam pipeline written in Python that, when deployed to a Flink runner, doesn't make use of parallelism correctly.
Unbounded data comes in through a Kafka connector, and I want the reads to be split and run in parallel.
My understanding is that the runner should split up the tasks, but as shown in the image, only one parallel subtask is used: the other 5 subtasks finish instantly, leaving the one still running to do all the work.
The pipeline settings are:
options = PipelineOptions([
    "--runner=PortableRunner",
    "--sdk_worker_parallelism=3",
    "--artifact_endpoint=localhost:8098",
    "--job_endpoint=localhost:8099",
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",
    "--checkpointing_interval=30000",
])
options._all_options['parallelism'] = 3
Is this a missing config on the Flink runner or something that can be configured in the BEAM pipeline?
The full pipeline:
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
options = PipelineOptions([
    "--runner=PortableRunner",
    "--sdk_worker_parallelism=3",
    "--artifact_endpoint=localhost:8098",
    "--job_endpoint=localhost:8099",
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",
    "--checkpointing_interval=30000",
])
options._all_options['parallelism'] = 3
class CountProvider(beam.RestrictionProvider):
    def __init__(self, initial_split_size=5):
        self._initial_split_size = initial_split_size
        self.OffsetRestrictionTracker = None
    def imports(self):
        if self.OffsetRestrictionTracker is not None: return
        from apache_beam.io.restriction_trackers import OffsetRestrictionTracker, OffsetRange
        self.OffsetRestrictionTracker = OffsetRestrictionTracker
        self.OffsetRange = OffsetRange
    def initial_restriction(self, element):
        self.imports()
        return self.OffsetRange(0, 10)
    def create_tracker(self, restriction):
        self.imports()
        return self.OffsetRestrictionTracker(restriction)
    def restriction_size(self, element, restriction):
        return restriction.size()*100_000
    def split(self, element, restriction):
        self.imports()
        if restriction.start + 1 >= restriction.stop:
            yield self.OffsetRange(restriction.start, restriction.stop)
        else:
            last_val = restriction.start
            for i in range(1, self._initial_split_size):
                next_stop = i * (restriction.start + restriction.stop) // self._initial_split_size
                yield self.OffsetRange(last_val, next_stop)
                last_val = next_stop
            yield self.OffsetRange(last_val, restriction.stop)
class CountFn(beam.DoFn):
    def setup(self):
        print("setup")
    def process(self, element, tracker=beam.DoFn.RestrictionParam(CountProvider())):
        res = tracker.current_restriction()
        print(f"Current Restriction {res.start}, {res.stop}")
        for i in range(res.start, res.stop):
            if not tracker.try_claim(i):
                return
            for j in range(10_000):
                yield i, j
    def get_initial_restriction(self, filename):
        return (0, 10)
    def teardown(self):
        print("Teardown")
p = beam.Pipeline(options=options)
out = (p | f'Create' >> beam.Create([tuple()])
       | f'Gen Data' >> beam.ParDo(CountFn())
       | beam.Map(print)
       )
result = p.run()
result.wait_until_finish()
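The initial split performed by CountProvider.split() can be checked without Beam by writing the same arithmetic over plain (start, stop) tuples. This is a sketch under stated assumptions: split_range is a hypothetical helper, and it measures offsets from start, which matches the pipeline's formula for the OffsetRange(0, 10) used above but is a variation for ranges that do not begin at 0.

```python
def split_range(start, stop, parts=5):
    """Split the half-open range [start, stop) into `parts` consecutive
    sub-ranges (some may be empty when the range is shorter than `parts`)."""
    if start + 1 >= stop:
        # Nothing to split: return the range unchanged.
        return [(start, stop)]
    ranges = []
    last = start
    for i in range(1, parts):
        # Offsets are measured from `start`, so non-zero starts also work.
        next_stop = start + i * (stop - start) // parts
        ranges.append((last, next_stop))
        last = next_stop
    ranges.append((last, stop))
    return ranges


if __name__ == "__main__":
    print(split_range(0, 10))
    print(split_range(10, 20))
```

For the restriction in the question this produces five sub-ranges of two offsets each, which is the unit of work the runner could in principle distribute across parallel subtasks.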

Implement FileSystem

A company gave me an assignment to implement a FileSystem class that runs shell-style commands in Python without using any libraries. Does anyone have any suggestions on how to get started? I'm not quite sure how to tackle this problem.
Problem:
Implement a FileSystem class using Python
Root path is '/'.
Path separator is '/'.
Parent directory is addressable as '..'.
Directory names consist only of English alphabet letters (A-Z and a-z).
All functions should support both relative and absolute paths.
All function parameters are the minimum required/recommended parameters.
Any additional class/function can be added.
What I've worked on so far:
class Path:
    def __init__(self, path):
        self.current_path = path.split("/")
    def cd(self, new_path):
        new_split = new_path.split("/")
        for i in new_split:
            if i == "..":
                new_split.pop(0)
                self.current_path = self.current_path[:-1]
        self.current_path += new_split
    def getString(self):
        return "/".join(self.current_path)
    def pwd(self, path):
        return self.current_path
    def mkdir():
        pass
    def rmdir():
        pass
#driver code
fs = Path()
fs.mkdir('usr')
fs.cd('usr')
fs.mkdir('local')
fs.cd('local')
return fs.pwd()
So, this is what I came up with. I know I need to clean it up.
class Path:
    dir_stack = []
    def __init__(self):
        print("started")
        main_dir = {'/': {}}
        self.dir_stack.insert( len(self.dir_stack), main_dir)
    def getCurrentMap():
        global current_Level
        current_Level = self.dir_stack[len(self.dir_stack) - 1]
    def cd(self, folder):
        if(folder == '../'):
            self.dir_stack.pop()
        current_Level = self.dir_stack[len(self.dir_stack) - 1]
        current_Map = current_Level[(list(current_Level.keys())[0])]
        print('lev', current_Map)
        if folder in current_Map:
            print('here')
            self.dir_stack.insert(len(self.dir_stack), current_Map)
        else:
            print ("no existing folder")
    def pwd(self):
        path = ''
        print(self.dir_stack)
        for x in self.dir_stack:
            path += (list(x.keys())[0]) + '/'
        print(path)
    def ls(self):
        current_Level = self.dir_stack[len(self.dir_stack) - 1]
        current_Map = current_Level[(list(current_Level.keys())[0])]
        print(current_Map)
    def mkdir(self, folder_Name):
        current_Level = self.dir_stack[len(self.dir_stack) - 1]
        newDir = {folder_Name: {}}
        current_Map = current_Level[(list(current_Level.keys())[0])]
        if folder_Name in current_Map:
            warning = folder_Name + ' already exists in directory'
            print(warning)
        else:
            current_Map.update(newDir)
    def rmdir(self, folder_Name):
        current_Level = self.dir_stack[len(self.dir_stack) - 1]
        #make global var current_Map
        current_Map = current_Level[(list(current_Level.keys())[0])]
        if folder_Name in current_Map:
            del current_Map[folder_Name]
        else:
            print('folder doesnt exist')
# driver code
fs = Path()
fs.mkdir('usr')
fs.mkdir('new')
fs.mkdir('files')
fs.cd('usr')
fs.mkdir('local')
fs.cd('new')
fs.pwd()
fs.cd('../')
fs.ls()
# fs.mkdir('local')
# fs.cd('local')
fs.pwd()
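One possible way to meet the requirements listed in the problem statement is to store the tree as nested dicts and to resolve every path, relative or absolute, into a list of directory names before acting on it. The sketch below is an illustration under assumptions, not a reference solution: the method set (mkdir, rmdir, cd, ls, pwd), the exception types, and the choice that rmdir removes a directory together with its children are all guesses, since the assignment text does not specify them.

```python
class FileSystem:
    """In-memory directory tree; each directory is a dict of child directories."""

    def __init__(self):
        self.root = {}
        self.cwd = []  # current directory as a list of names; [] is the root

    def _resolve(self, path):
        """Turn a relative or absolute path into a list of directory names."""
        parts = [] if path.startswith("/") else list(self.cwd)
        for name in path.split("/"):
            if name == "":
                continue  # empty pieces come from leading, trailing or doubled '/'
            if name == "..":
                if parts:
                    parts.pop()  # the parent of the root is the root itself
            elif name.isalpha() and name.isascii():
                parts.append(name)
            else:
                raise ValueError("invalid directory name: %r" % name)
        return parts

    def _node(self, parts):
        """Return the dict for the directory at `parts`, or raise if missing."""
        node = self.root
        for name in parts:
            if name not in node:
                raise FileNotFoundError("/" + "/".join(parts))
            node = node[name]
        return node

    def mkdir(self, path):
        parts = self._resolve(path)
        if not parts:
            raise FileExistsError("/")
        parent = self._node(parts[:-1])
        if parts[-1] in parent:
            raise FileExistsError("/" + "/".join(parts))
        parent[parts[-1]] = {}

    def rmdir(self, path):
        parts = self._resolve(path)
        if not parts:
            raise ValueError("cannot remove the root directory")
        self._node(parts)  # raises if the directory does not exist
        if self.cwd[:len(parts)] == parts:
            raise ValueError("cannot remove the current directory or its ancestor")
        del self._node(parts[:-1])[parts[-1]]

    def cd(self, path):
        parts = self._resolve(path)
        self._node(parts)  # raises if the target does not exist
        self.cwd = parts

    def ls(self, path=""):
        return sorted(self._node(self._resolve(path)))

    def pwd(self):
        return "/" + "/".join(self.cwd)


if __name__ == "__main__":
    fs = FileSystem()
    fs.mkdir("usr")
    fs.cd("usr")
    fs.mkdir("local")
    fs.cd("local")
    print(fs.pwd())
    fs.cd("../..")
    fs.mkdir("/usr/bin")
    print(fs.ls("usr"))
```

Resolving paths in one helper keeps cd, mkdir and rmdir short, and storing the current directory as a list of names rather than a stack of dicts makes pwd a simple join.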

WatsonApiException: Error: invalid-api-key, Code: 401

I can't find the Alchemy Language API in IBM Watson.
Can I do this with the Natural Language Understanding service, and if so, how?
When I add
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 \
    import Features, EntitiesOptions, KeywordsOptions
it shows an error with the combined keyword.
# -*- coding: utf-8 -*-
import tweepy
import re
import time
import math
import pandas as pd
from watson_developer_cloud import AlchemyLanguageV1
def initAlchemy():
    al = AlchemyLanguageV1(api_key='YOUR_ALCHEMY_API_KEY')
    return al
def initTwitterApi():
    consumer_key = 'YOUR_CONSUMER_KEY'
    consumer_secret = 'YOUR_CONSUMER_SECRET'
    access_token = 'YOUR_ACCESS_TOKEN'
    access_token_secret = 'YOUR_ACCESS_TOKEN_SECRET'
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    return api
'''This function is implemented to handle tweepy exception errors
because search is rate limited at 180 queries per 15 minute window by twitter'''
def limit(cursor):
    while True:
        try:
            yield cursor.next()
        except tweepy.TweepError as error:
            print(repr(error))
            print("Twitter Request limit error reached sleeping for 15 minutes")
            time.sleep(16*60)
        except tweepy.RateLimitError:
            print("Rate Limit Error occurred Sleeping for 16 minutes")
            time.sleep(16*60)
def retrieveTweets(api, search, lim):
    if(lim == ""):
        lim = math.inf
    else:
        lim = int(lim)
    text = []
    for tweet in limit(tweepy.Cursor(api.search, q=search).items(limit = lim)):
        t = re.sub(r'\s+', ' ', tweet.text)
        text.append(t)
    data = {"Tweet":text,
            "Sentiment":"",
            "Score":""}
    dataFrame = pd.DataFrame(data, columns=["Tweet","Sentiment","Score"])
    return dataFrame
def analyze(al,dataFrame):
    sentiment = []
    score = []
    for i in range(0, dataFrame["Tweet"].__len__()):
        res = al.combined(text=dataFrame["Tweet"][i],
                          extract="doc-sentiment",
                          sentiment=1)
        sentiment.append(res["docSentiment"]["type"])
        if(res["docSentiment"]["type"] == "neutral"):
            score.append(0)
        else:
            score.append(res["docSentiment"]["score"])
    dataFrame["Sentiment"] = sentiment
    dataFrame["Score"] = score
    return dataFrame
def main():
    #Initialise Twitter Api
    api = initTwitterApi()
    #Retrieve tweets
    dataFrame = retrieveTweets(api,input("Enter the search query (e.g. #hillaryclinton ) : "), input("Enter limit for number of tweets to be searched or else just hit enter : "))
    #Initialise IBM Watson Alchemy Language Api
    al = initAlchemy()
    #Do Document Sentiment analysis
    dataFrame = analyze(al, dataFrame)
    #Save tweets, sentiment, and score data frame in csv file
    dataFrame.to_csv(input("Enter the name of the file (with .csv extension) : "))
if __name__ == '__main__':
    main()
The Watson Natural Language Understanding service has only a single combined call. Because it is the only call, it isn't named combined; it's actually called analyze. The best place to go for details is the API documentation: https://www.ibm.com/watson/developercloud/natural-language-understanding/api/v1/?python#post-analyze

'self' not defined, jinja2, appengine

Error:
self.response.out.write(template.render(template_values)) NameError:
name 'self' is not defined
pertains to lines marked # ERROR, with other notes:
#!/usr/bin/env python27
import cgi
import webapp2
import jinja2
import time
import datetime
import urllib
#import cgitb; cgitb.enable()
import os
from google.appengine.ext import db
from google.appengine.api import users
from google.appengine.api import memcache
jinja_environment = jinja2.Environment(autoescape=True,
loader=jinja2.FileSystemLoader(os.path.join(os.path.dirname(__file__), 'templates')))
class Visitor(db.Model): # I still need this with jinja2, yes?
name = db.StringProperty(required=1)
mood = db.StringProperty(choices=["good","bad","fair"])
date = db.DateTimeProperty(auto_now_add=True)
class MainPage(webapp2.RequestHandler):
def get(self): # ERROR HERE
visitor_query = Visitor.all().order('-date') #not sure about query...need to get curent visitor's submitted form values (name, mood). no log-in in app.
visitor = visitor_query.fetch(1)
template_values = {
'visitor': visitor,
'url': url, #not sure how this applies, just following tutorial
'url_linktext': url_linktext,
}
localtime = time.localtime(time.time())
mon = localtime[1] # MONTH
h = localtime[3] # HOUR
span = "morning" if h == range(5,14) else "afternoon" if h == range(17,7) else "evening"
if mon <= 3:
var1 = "winter"
# more variables in if/elif statement here...I call these variables from index.html...
# name = self.request.get("name") # not sure if I need to define these variables here using jinja2...tutorial does not define entity properties in example.
# name = name.capitalize()
# mood = self.request.get("mood")
template = jinja_environment.get_template('index.html')
self.response.out.write(template.render(template_values)) # ERROR HERE
class Process(webapp2.RequestHandler):
def post(self):
name = self.request.get("name")
name = name.capitalize()
mood = self.request.get("mood")
message = Visitor(name=name, mood=mood)
if users.get_current_user():
message.name = users.get_current_user() #not sure if I need users.get_current...no log-in required
message.mood = self.request.get("mood")
message.put()
self.redirect("/")
app = webapp2.WSGIApplication([('/', MainPage)],
debug=True)
app.yaml:
application: emot
version: 1
runtime: python27
api_version: 1
threadsafe: true
handlers:
#- url: /stylesheets/ # I read no static files allowed with jinja2...not sure how I'll handle CSS...
#  static_dir: stylesheets
- url: /.*
  script: main.app
libraries:
- name: jinja2
  version: latest
index.yaml (all of this works without jinja2...)
indexes:
- kind: Visitor
  ancestor: yes
  properties:
  - name: name
  - name: mood
  - name: date
    direction: desc
Also, I have alternately copied (not cut) jinja2 folder from g00gle_appengine/lib directory to my app directory folder, including just copying the "jinja" folder (as similar method worked using gdata atom & src...) I have also installed python-jinja2, which is located at: /usr/share/doc/python-jinja2
My index.html is in directory "templates" in my app directory. Thanks in advance for getting me going.
From the code you've posted, it looks like the erroring line of code (and the preceding few) aren't indented far enough.
The get method should be aligned as follows:
    def get(self): # ERROR HERE
        visitor_query = Visitor.all().order('-date') #not sure about query...need to get curent visitor's submitted form values (name, mood). no log-in in app.
        visitor = visitor_query.fetch(1)
        template_values = {
            'visitor': visitor,
            'url': url, #not sure how this applies, just following tutorial
            'url_linktext': url_linktext,
        }
        localtime = time.localtime(time.time())
        mon = localtime[1] # MONTH
        h = localtime[3] # HOUR
        span = "morning" if h == range(5,14) else "afternoon" if h == range(17,7) else "evening"
        if mon <= 3:
            var1 = "winter"
        # more variables in if/elif statement here...I call these variables from index.html...
        # name = self.request.get("name") # not sure if I need to define these variables here using jinja2...tutorial does not define entity properties in example.
        # name = name.capitalize()
        # mood = self.request.get("mood")
        template = jinja_environment.get_template('index.html')
        self.response.out.write(template.render(template_values)) # ERROR HERE
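The effect of the indentation on this error can be reproduced without App Engine. In the sketch below, MainPage is a stripped-down hypothetical stand-in for the handler, and try_define is a helper introduced only for the demonstration. A line that is dedented out of get() becomes part of the class body, which runs when the class is defined, and at that point no name self exists.

```python
# Hypothetical handler, reduced to show the effect of indentation only.
BROKEN = '''
class MainPage:
    def get(self):
        template_values = {}
    result = self.render(template_values)  # dedented: now in the class body
'''

FIXED = '''
class MainPage:
    def get(self):
        template_values = {}
        result = self.render(template_values)  # inside get(): self is a parameter
'''


def try_define(source):
    """Execute a class definition and report any NameError it raises."""
    try:
        exec(source, {})
    except NameError as error:
        return str(error)
    return "ok"


if __name__ == "__main__":
    print(try_define(BROKEN))
    print(try_define(FIXED))
```

The first definition fails with the same NameError as in the question, while the second only defines the method, so self is looked up later, when get() is actually called on an instance.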
