AWS: Why does my RDS instance keep starting after I turned it off?

I have an RDS database instance on AWS and have turned it off for now. However, every few days it starts up on its own. I don't have any other services running right now.
There is this event in my RDS log:
"DB instance is being started due to it exceeding the maximum allowed time being stopped."
Why is there a limit to how long my RDS instance can be stopped? I just want to put my project on hold for a few weeks, but AWS won't let me turn off my DB? It costs $12.50/mo to have it sit idle, so I don't want to pay for this, and I certainly don't want AWS starting an instance for me that does not get used.
Please help!

That's a limitation of this new feature.
You can stop an instance for up to 7 days at a time. After 7 days, it will be automatically started. For more details on stopping and starting a database instance, please refer to Stopping and Starting a DB Instance in the Amazon RDS User Guide.
You can set up a cron job to stop the instance again after 7 days. You can also change to a smaller instance size to save money.
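For example, a weekly cron entry could call a small boto3 script along these lines (a sketch only; the region and the mydb identifier are placeholders for your own values):
# stop_db.py - re-stop the RDS instance
# note: stop_db_instance raises InvalidDBInstanceState if the instance is already stopped
import boto3

rds = boto3.client("rds", region_name="us-east-1")  # placeholder region
rds.stop_db_instance(DBInstanceIdentifier="mydb")   # placeholder identifier
A crontab entry such as 0 3 * * 0 python stop_db.py would then re-stop the database every Sunday, i.e. within the 7-day window.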
Another option is the upcoming Aurora Serverless, which stops and starts for you automatically. It might be more expensive than a dedicated instance when running 24/7.
Finally, there is always Heroku which gives you a free database instance that starts and stops itself with some limitations.
You can also try saving the following CloudFormation template as KeepDbStopped.yml and then deploying it with this command:
aws cloudformation deploy --template-file KeepDbStopped.yml --stack-name stop-db --capabilities CAPABILITY_IAM --parameter-overrides DB=arn:aws:rds:us-east-1:XXX:db:XXX
Make sure to change arn:aws:rds:us-east-1:XXX:db:XXX to your RDS ARN.
Description: Automatically stop RDS instance every time it turns on due to exceeding the maximum allowed time being stopped
Parameters:
  DB:
    Description: ARN of database that needs to be stopped
    Type: String
    AllowedPattern: arn:aws:rds:[a-z0-9\-]+:[0-9]+:db:[^:]*
Resources:
  DatabaseStopperFunction:
    Type: AWS::Lambda::Function
    Properties:
      Role: !GetAtt DatabaseStopperRole.Arn
      Runtime: python3.6
      Handler: index.handler
      Timeout: 20
      Code:
        ZipFile:
          Fn::Sub: |
            import boto3
            import time

            def handler(event, context):
                print("got", event)
                db = event["detail"]["SourceArn"]
                id = event["detail"]["SourceIdentifier"]
                message = event["detail"]["Message"]
                region = event["region"]
                rds = boto3.client("rds", region_name=region)
                if message == "DB instance is being started due to it exceeding the maximum allowed time being stopped.":
                    print("database turned on automatically, setting last seen tag...")
                    last_seen = int(time.time())
                    rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": str(last_seen)}])
                elif message == "DB instance started":
                    print("database started (and sort of available?)")
                    last_seen = 0
                    for t in rds.list_tags_for_resource(ResourceName=db)["TagList"]:
                        if t["Key"] == "DbStopperLastSeen":
                            last_seen = int(t["Value"])
                    if time.time() < last_seen + (60 * 20):
                        print("database was automatically started in the last 20 minutes, turning off...")
                        time.sleep(10)  # even waiting for the "started" event is not enough, so add some wait
                        rds.stop_db_instance(DBInstanceIdentifier=id)
                        print("success! removing auto-start tag...")
                        rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": "0"}])
                    else:
                        print("ignoring manual database start")
                else:
                    print("error: unknown database event!")
  DatabaseStopperRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: Notify
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Action:
                  - rds:StopDBInstance
                Effect: Allow
                Resource: !Ref DB
              - Action:
                  - rds:AddTagsToResource
                  - rds:ListTagsForResource
                  - rds:RemoveTagsFromResource
                Effect: Allow
                Resource: !Ref DB
                Condition:
                  ForAllValues:StringEquals:
                    aws:TagKeys:
                      - DbStopperLastSeen
  DatabaseStopperPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt DatabaseStopperFunction.Arn
      Principal: events.amazonaws.com
      SourceArn: !GetAtt DatabaseStopperRule.Arn
  DatabaseStopperRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.rds
        detail-type:
          - "RDS DB Instance Event"
        resources:
          - !Ref DB
        detail:
          Message:
            - "DB instance is being started due to it exceeding the maximum allowed time being stopped."
            - "DB instance started"
      Targets:
        - Arn: !GetAtt DatabaseStopperFunction.Arn
          Id: DatabaseStopperLambda
It has worked for at least one person. If you have issues, please report them here.

Related

In Python, how do I construct a URL to test if my SQL Server instance is running?

I'm using Python 3.8 with the pytest-docker-compose plugin (https://pypi.org/project/pytest-docker-compose/). Does anyone know how to write a URL that would eventually tell me if my SQL Server is running?
I have this docker-compose.yml file:
version: "3.2"
services:
sql-server-db:
build: ./
container_name: sql-server-db
image: microsoft/mssql-server-linux:2017-latest
ports:
- "1433:1433"
environment:
SA_PASSWORD: "password"
ACCEPT_EULA: "Y"
but I don't know what URL to pass to my Retry object to test that the server is running. This fails ...
import pytest
import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter
...

@pytest.fixture(scope="function")
def wait_for_api(function_scoped_container_getter):
    """Wait for sql server to become responsive"""
    request_session = requests.Session()
    retries = Retry(total=5,
                    backoff_factor=0.1,
                    status_forcelist=[500, 502, 503, 504])
    request_session.mount('http://', HTTPAdapter(max_retries=retries))
    service = function_scoped_container_getter.get("sql-server-db").network_info[0]
    api_url = "http://%s:%s/" % (service.hostname, service.host_port)
    assert request_session.get(api_url)
    return request_session, api_url
with this exception
raise ConnectionError(e, request=request)
E requests.exceptions.ConnectionError: HTTPConnectionPool(host='0.0.0.0', port=1433): Max retries exceeded with url: / (Caused by ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))
You could use something like this (with mysql-connector-python) and it would output whether it was connected:
import mysql.connector

# adjust host/user/password to match your server
connection = mysql.connector.connect(host="localhost", user="root", password="password")

if connection.is_connected():
    db_Info = connection.get_server_info()
    print("Connected to MySQL Server version ", db_Info)
    cursor = connection.cursor()
    cursor.execute("select database();")
    record = cursor.fetchone()
    print("You're connected to database: ", record)
Here is a sample function that will retry to connect to the DB and won't return until it has successfully connected or the defined maxAttempts is reached:
import time
import traceback

import pyodbc


def waitDb(server, database, username, password, maxAttempts, waitBetweenAttemptsSeconds):
    """
    Returns True if the connection is successfully established before the maxAttempts number is reached.
    Conversely, returns False.
    pyodbc.connect has a built-in timeout. Use a waitBetweenAttemptsSeconds greater than zero to add a delay on top of this timeout.
    """
    for attemptNumber in range(maxAttempts):
        cnxn = None
        try:
            cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=' + server + ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
            cursor = cnxn.cursor()
        except Exception as e:
            print(traceback.format_exc())
        finally:
            if cnxn:
                print("The DB is up and running: ")
                return True
            else:
                print("DB not running yet on attempt number " + str(attemptNumber))
                time.sleep(waitBetweenAttemptsSeconds)
    print("Max attempts waiting for DB to come online exceeded")
    return False
I wrote a minimal example here: https://github.com/claudiumocanu/flow-pytest-docker-compose-wait-mssql.
I included the three actions that can be executed independently, but you can jump to the last step specifically for what you asked:
1. Connected from Python to the MSSQL instance launched by the compose file:
For me it was quite annoying to find and install the appropriate ODBC driver and its dependencies - ODBC Driver 17 for SQL Server worked the best for me on Ubuntu 18.
To perform only this step, docker-compose up the docker-compose.yml in my example, then run example-connect.py.
2. Created a function that attempts to connect to the DB with a maxAttempts limit and a delay between retries:
Just run example-waitDb.py. You can play with the maxAttempts and delayBetweenAttempts values, then bring up the database at a random point, to test it.
3. Put everything together in the test_db.py test suite (a minimal sketch follows below):
- the waitDb function described above;
- the same wrapper and annotations that you provided in your example to spin up the resources defined in the compose file;
- a dummy integration test that will not be executed before waitDb returns (if you want to block these tests completely, you can throw instead of returning False from the waitDb function).
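A minimal sketch of such a test_db.py, assuming the waitDb helper above is importable, the sa/password credentials from the compose file above, and pytest-docker-compose's function_scoped_container_getter fixture (the module and test names here are hypothetical):
import pytest

from waitdb import waitDb  # hypothetical module holding the waitDb helper above


@pytest.fixture(scope="function")
def wait_for_db(function_scoped_container_getter):
    """Spin up the compose services and block until SQL Server accepts connections."""
    function_scoped_container_getter.get("sql-server-db")
    # "localhost,1433" is pyodbc's host,port syntax; sa/password come from docker-compose.yml
    assert waitDb("localhost,1433", "master", "sa", "password",
                  maxAttempts=30, waitBetweenAttemptsSeconds=2)


def test_db_is_reachable(wait_for_db):
    # dummy integration test: only runs once waitDb has reported the DB as up
    assert True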
PS: Keep using environment variables, a vault, etc. rather than storing real passwords like I did for the dummy example.

How to use Google App Engine Cron jobs in a cheaper way?

I have a Twitter bot which tweets out content a few times a day. Here is my cron.yaml file:
cron:
- description: "twitter instagram scraper"
  url: /scrape/twitter_intra
  schedule: every 24 hours
  target: scraper
- description: "USD to LKR scraper"
  url: /scrape/exRates
  schedule: every 1 hours
  target: scraper
- description: "last 24 hour weather"
  url: /scrape/weather_last24hours
  schedule: every day 12:00
  target: scraper
- description: "tweet out last 24 hour weather"
  url: /tweet/weather_last24hours
  schedule: every day 13:00
  target: twitter
- description: "tweet out exchange Rate USD to LKR"
  url: /tweet/exRates
  schedule: every day 7:00
  target: twitter
Here is an example of one request handler:
app.get(`/tweet/weather_last24hours`, async (req, res, next) => {
  console.log(`Tweet!! last 24 hours`);
  try {
    //await tweetText('This is a test');
    const report = await getWeatherLast24Hours();
    const content = makeTweetLast24HourWeather(report);
    console.log(content);
    await tweetText(content);
    res.status(200)
      .set('Content-Type', 'text/plain')
      .send(`Completed Successfully...!`)
      .end();
  } catch (error) {
    next(error);
  }
});
Right now it's working fine, except it costs me $2.20 a day because I have to keep an instance up all day by doing the following:
const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`App listening on port ${PORT}`);
  console.log('Press Ctrl+C to quit.');
});
I tried removing this part of the code, assuming that a GCP cron job can bring up an instance on its own and run the task. Even though the cron job itself was able to bring up the instance, the task failed with a 500 error code and the following message:
This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application.
This appears as an error message in the logs. I tried it a few times and got the same result.
Is there a solution to this, such as starting an instance just before the cron job executes?
Can the newly introduced Cloud Scheduler help?
Am I doing something wrong?
Thank you in advance.
Your 500 is a warning to tell you something you already know: that bootup time will be longer due to needing to warm up a new instance. This is not necessarily a problem in itself, unless you notice that your tasks are not running properly.
To use Cloud Scheduler, consider decomposing your application into Cloud Functions. You can build a separate function for each of your 5 endpoints, plus a sixth that contains the shared logic (e.g. getWeatherLast24Hours()). You can then invoke them on a schedule using Cloud Scheduler.
The cost of running with Cloud Functions + Scheduler will be near-zero, so you'll want to do your own ROI evaluation to determine whether the development effort is worth the savings.
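For illustration, once an endpoint is deployed as an HTTP-triggered Cloud Function, a Cloud Scheduler job can hit it on the same schedule as the old cron.yaml entry. A sketch of the CLI call (the job name, function URL, and time zone below are placeholders):
gcloud scheduler jobs create http tweet-weather-last24hours \
    --schedule="0 13 * * *" \
    --uri="https://REGION-PROJECT_ID.cloudfunctions.net/tweetWeatherLast24Hours" \
    --http-method=GET \
    --time-zone="Asia/Colombo"
This mirrors the "schedule: every day 13:00" entry from cron.yaml without keeping an App Engine instance alive.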

Using JanusGraph with Solr

Setting up JanusGraph, I noticed the following in the console:
09:04:12,175 INFO ReflectiveConfigOptionLoader:173 - Loaded and initialized config classes: 10 OK out of 12 attempts in PT0.023S
09:04:12,230 INFO Reflections:224 - Reflections took 28 ms to scan 1 urls, producing 2 keys and 2 values
09:04:12,291 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.index-name=entity (Type: GLOBAL_OFFLINE) is overridden by globally managed value (janusgraph). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,294 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.backend=solr (Type: GLOBAL_OFFLINE) is overridden by globally managed value (elasticsearch). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,300 INFO CassandraThriftStoreManager:628 - Closed Thrift connection pooler.
and then I see the following:
Exception in thread "main" java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
How do I stop using Elasticsearch and switch to Solr?
My properties file is as follows:
index.search.backend=solr
index.search.directory=/path/to/directory/for/solr/index/something
index.search.index-name=something
index.search.solr.mode=http
index.search.solr.http-urls=http://127.0.0.1:8983/solr
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
The answer to this is basically the same as this one for Titan (JanusGraph was forked from Titan).
You are probably trying to connect to an existing graph that was previously configured to use Elasticsearch. By default, the keyspace is named janusgraph.
1) You could connect to a different keyspace by updating conf/janusgraph-cassandra.properties
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
storage.cassandra.keyspace=mygraph
2) You could drop the existing keyspace. If you used bin/janusgraph.sh start from the quick start directions (which starts a single node Cassandra and a single node Elasticsearch),
bin/janusgraph.sh clean
Or if you have a standalone Cassandra installation:
$CASSANDRA_HOME/bin/cqlsh -e 'drop keyspace if exists janusgraph'
Then you would be able to connect with the default conf/janusgraph-cassandra.properties.

GeoNetwork database with LDAP connection error

I'm trying to connect my LDAP with the GeoNetwork database, but every time I log in it doesn't show the administrator button. Then I check the database and it is empty. I am using geOrchestra 13.09 in a localhost environment; GeoServer and Mapfishapp are running well and they log in without a problem.
My config-security.properties is:
# Core security properties
logout.success.url=/index.html
passwordSalt=secret-hash-salt=
# LDAP Connection Settings
ldap.base.provider.url=ldap://localhost:389
ldap.base.dn=dc=geobolivia,dc=gob,dc=bo
ldap.security.principal=cn=admin,dc=geobolivia,dc=gob,dc=bo
ldap.security.credentials=geobolivia
ldap.base.search.base=ou=users
ldap.base.dn.pattern=uid={0},${ldap.base.search.base}
#ldap.base.dn.pattern=mail={0},${ldap.base.search.base}
# Define if groups and profile information are imported from LDAP. If not, local database is used.
# When a new user connect first, the default profile is assigned. A user administrator can update
# privilege information.
ldap.privilege.import=true
ldap.privilege.export=true
ldap.privilege.create.nonexisting.groups=false
# Define the way to extract profiles and privileges from the LDAP
# 1. Define one attribute for the profile and one for groups in config-security-overrides.properties
# 2. Define one attribute for the privilege and define a custom pattern (use LDAPUserDetailsContextMapperWithPa$
ldap.privilege.pattern=
#ldap.privilege.pattern=CAT_(.*)_(.*)
ldap.privilege.pattern.idx.group=1
ldap.privilege.pattern.idx.profil=2
# 3. Define custom location for extracting group and role (no support for group/role combination) (use LDAPUser$
#ldap.privilege.search.group.attribute=cn
#ldap.privilege.search.group.object=ou=groups
#ldap.privilege.search.group.query=(&(objectClass=posixGroup)(memberUid={0})(cn=EL_*))
#ldap.privilege.search.group.pattern=EL_(.*)
#ldap.privilege.search.privilege.attribute=cn
#ldap.privilege.search.privilege.object=ou=groups
#ldap.privilege.search.privilege.query=(&(objectClass=posixGroup)(memberUid={0})(cn=SV_*))
#ldap.privilege.search.privilege.pattern=SV_(.*)
ldap.privilege.search.group.attribute=cn
ldap.privilege.search.group.object=ou=groups
ldap.privilege.search.group.query=(&(objectClass=posixGroup)(memberUid={1})(cn=EL_*))
ldap.privilege.search.group.pattern=EL_(.*)
ldap.privilege.search.privilege.attribute=cn
ldap.privilege.search.privilege.object=ou=groups
ldap.privilege.search.privilege.query=(&(objectClass=posixGroup)(memberUid={1})(cn=SV_ADMIN))
ldap.privilege.search.privilege.pattern=SV_(.*)
# Run LDAP sync every day at 23:30
#ldap.sync.cron=0 30 23 * * ?
ldap.sync.cron=0 * * * * ?
#ldap.sync.cron=0 0/1 * 1/1 * ? *
ldap.sync.startDelay=60000
ldap.sync.user.search.base=${ldap.base.search.base}
ldap.sync.user.search.filter=(&(objectClass=*)(mail=*#*)(givenName=*))
ldap.sync.user.search.attribute=uid
ldap.sync.group.search.base=ou=groups
ldap.sync.group.search.filter=(&(objectClass=posixGroup)(cn=EL_*))
ldap.sync.group.search.attribute=cn
ldap.sync.group.search.pattern=EL_(.*)
# CAS properties
cas.baseURL=https://localhost:8443/cas
cas.ticket.validator.url=${cas.baseURL}
cas.login.url=${cas.baseURL}/login
cas.logout.url=${cas.baseURL}/logout?url=${geonetwork.https.url}/
<import resource="config-security-cas.xml"/>
<import resource="config-security-cas-ldap.xml"/>
# either the hardcoded url to the server
# or if has the form it will be replaced with
# the server details from the server configuration
geonetwork.https.url=https://localhost/geonetwork-private/
#geonetwork.https.url=https://geobolivia.gob.bo:443
#geonetwork.https.url=https://localhost:443
The geonetwork.log shows these results:
2014-03-11 13:41:00,004 DEBUG [geonetwork.ldap] - LDAPSynchronizerJob starting ...
2014-03-11 13:41:00,006 DEBUG [org.springframework.ldap.core.support.AbstractContextSource] - Got Ldap context on server 'ldap://localhost:389/dc=geobolivia,dc=gob,dc=bo'
2014-03-11 13:41:00,008 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Returning cached instance of singleton bean 'resourceManager'
2014-03-11 13:41:00,026 DEBUG [geonetwork.ldap] - LDAPSynchronizerJob done.
2014-03-11 13:41:26,429 INFO [geonetwork.lucene] - Done running PurgeExpiredSearchersTask. 0 versions still cached.
2014-03-11 13:41:56,430 INFO [geonetwork.lucene] - Done running PurgeExpiredSearchersTask. 0 versions still cached.
and the error that appears in the geonetwork.log is:
2014-03-11 13:44:06,426 INFO [jeeves.service] - Dispatching : xml.search.keywords
2014-03-11 13:44:06,427 ERROR [jeeves.service] - Exception when executing service
2014-03-11 13:44:06,427 ERROR [jeeves.service] - (C) Exc : java.lang.IllegalArgumentException: The thesaurus external.theme.inspire-service-taxonomy does not exist, there for the query cannot be excuted: 'Query [query=SELECT DISTINCT id,uppc,lowc,broader,spa_prefLabel,spa_note FROM {id} rdf:type {skos:Concept},[{id} gml:BoundedBy {} gml:upperCorner {uppc}],[{id} gml:BoundedBy {} gml:lowerCorner {lowc}],[{id} skos:broader {broader}],[{id} skos:prefLabel {spa_prefLabel} WHERE lang(spa_prefLabel) LIKE "es" IGNORE CASE],[{id} skos:scopeNote {spa_note} WHERE lang(spa_note) LIKE "es" IGNORE CASE] WHERE (spa_prefLabel LIKE "***" IGNORE CASE OR id LIKE "*") LIMIT 35 USING NAMESPACE skos=<http://www.w3.org/2004/02/skos/core#>,gml=<http://www.opengis.net/gml#>, interpreter=KeywordResultInterpreter]'
The version of GeoNetwork currently used in geOrchestra does not show the "administration" button on its first page. You have to fire a search; then, in the "other actions" menu at the top right, you should be able to get to the administration interface. We know that it is not very intuitive, but it should change in the coming months (we recently planned an upgrade of GeoNetwork before the end of the year).
Did you solve it? I think that in your config-security.properties, at this line:
ldap.base.dn.pattern=uid={0},${ldap.base.search.base}
you need to replace {0} with the username typed in the sign-in screen of GeoNetwork.
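For illustration (with a hypothetical username jdoe), the settings above would make that pattern resolve to a bind DN of:
uid=jdoe,ou=users,dc=geobolivia,dc=gob,dc=bo
so it is worth checking that your user entries really sit under ou=users and carry a uid attribute.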

Solr times out when I try to rebuild the index for Django

I am trying to build my Solr index for Django on Ubuntu for the first time with ./manage.py rebuild_index and I get the following error:
Removing all documents from your index because you said so.
Failed to clear Solr index: Connection to server 'http://localhost:8983/solr/update/?commit=true' timed out: HTTPConnectionPool(host='localhost', port=8983): Request timed out. (timeout=10)
All documents removed.
Indexing 4 dishess
Failed to add documents to Solr: Connection to server 'http://localhost:8983/solr/update/?commit=true' timed out: HTTPConnectionPool(host='localhost', port=8983): Request timed out. (timeout=10)
I have access to localhost:8983/solr/ and localhost:8983/solr/admin via my web browser
You can bump up the TIMEOUT in settings.py.
For example
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8080/solr/default',
        'INCLUDE_SPELLING': True,
        'TIMEOUT': 60 * 5,
    },
}
The important thing here is that you shouldn't increase the default timeout, because it could possibly block all your workers, as Haystack works synchronously.
The best way to avoid this is to define multiple connections for reads and writes with different timeouts:
http://django-haystack.readthedocs.org/en/latest/settings.html#haystack-connections
And use routers for read and write separation: http://django-haystack.readthedocs.org/en/v2.4.0/multiple_index.html#automatic-routing
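For example, a sketch of what that could look like for this setup, assuming django-haystack 2.x, a Solr core named default on port 8983, and a hypothetical myapp.routers module path:
# settings.py (sketch)
HAYSTACK_CONNECTIONS = {
    'default': {  # used for reads/searches - keep the timeout short
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8983/solr/default',
        'TIMEOUT': 10,
    },
    'write': {  # used for (re)indexing - can afford the long timeout
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8983/solr/default',
        'TIMEOUT': 60 * 5,
    },
}
HAYSTACK_ROUTERS = ['myapp.routers.ReadWriteRouter', 'haystack.routers.DefaultRouter']

# myapp/routers.py (sketch)
from haystack import routers


class ReadWriteRouter(routers.BaseRouter):
    def for_read(self, **hints):
        return 'default'

    def for_write(self, **hints):
        return 'write'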
