What to do about "transaction nonce too high" errors in RSK? - web3js

I have a decentralised application deployed on RSK, which has been working for several months. Everything works correctly using the public node;
however, infrequently, we start getting a seemingly random error:
Unknown Error: {
    "jsonrpc": "2.0",
    "id": 2978041344968143,
    "error": {
        "code": -32010,
        "message": "transaction nonce too high"
    }
}
There is no information about “too high” nonces, but many threads about “too low”. I’m using web3.Contract.method.send().

In MetaMask, ensure you are on your dev/test account, then:
1. Click on the avatar circle in the top right.
2. In the menu, choose Settings.
3. Click Advanced.
4. Scroll down a bit, confirm again that you are on your testnet account, and click Reset Account.

There is a limit on the number of transactions that the same address can have in the transaction pool.
This limit is 4 for RSK,
and is defined within TxValidatorNonceRangeValidator
within the rskj code base:
BigInteger maxNumberOfTxsPerAddress = BigInteger.valueOf(4);
Note that Ethereum has a similar limit,
but the limit that is configured in geth is 10.
So if we have already sent 4 transactions that have not been mined yet, and we send a 5th before the next block is mined, we get the "nonce too high" error. If a block is mined containing, say, all 4 of those transactions, we can then queue up to 4 more transactions for the next block.
Workarounds
(1) Send no more than 4 transactions from an address, until there is a new block.
(2) Aggregate all of the calls and then use a contract that executes them in a single go.
An example of this is seen in
RNS Batch Client ExecuteRegistrations.
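Workaround (1) can be sketched as a simple guard: compare the address's "pending" nonce with its "latest" (mined) nonce before sending. The sketch below is in Python, with web3.py standing in for web3.js; the node URL, contract call, and function names are illustrative assumptions, not part of the question:

```python
# Sketch of workaround (1): hold off on sending when 4 transactions from
# this address are already waiting in the pool. Pool occupancy is the gap
# between the "pending" nonce and the "latest" (mined) nonce.
RSK_TX_POOL_LIMIT = 4  # maxNumberOfTxsPerAddress in rskj

def can_send(pending_nonce, latest_nonce, limit=RSK_TX_POOL_LIMIT):
    """Return True if another transaction can join the pool without
    exceeding the per-address limit."""
    return pending_nonce - latest_nonce < limit

# Hypothetical usage against a live node (web3.py):
#   w3 = Web3(Web3.HTTPProvider('https://public-node.rsk.co'))
#   pending = w3.eth.get_transaction_count(addr, 'pending')
#   latest = w3.eth.get_transaction_count(addr, 'latest')
#   if can_send(pending, latest):
#       contract.functions.someMethod().transact({'from': addr})
```

If the guard returns False, wait for the next block (i.e. for the latest nonce to catch up) before sending the 5th transaction.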

For me it happened when I restarted the node; the following instructions fixed it:
Open up your MetaMask window and click on the icon in the top right to
display accounts. Go to Settings, then Advanced and hit Reset Account.

Related

Can I send an alert when a message is published to a pubsub topic?

We are using pubsub & a cloud function to process a stream of incoming data. I am setting up a dead letter topic to handle cases where a message cannot be processed, as described at Cloud Pub/Sub > Guides > Handling message failures.
I've configured a subscription on the dead-letter topic to retain messages for 7 days; we're doing this using Terraform:
resource "google_pubsub_subscription" "dead_letter_monitoring" {
    project                    = var.project_id
    name                       = var.dead_letter_sub_name
    topic                      = google_pubsub_topic.dead_letter.name
    expiration_policy { ttl = "" }
    message_retention_duration = "604800s" # 7 days
    retain_acked_messages      = true
    ack_deadline_seconds       = 600
}
We've tested our cloud function robustly, and consequently we expect messages to appear on this dead-letter topic very rarely, perhaps never. Nevertheless, we're putting it in place to make sure that we catch any anomalies.
Given the rarity with which we expect messages to appear on the dead-letter topic, we need to set up an alert that sends an email when such a message appears. Is it possible to do this? I've looked through the alerts one can create at https://console.cloud.google.com/monitoring/alerting/policies/create, but I didn't see anything that could accomplish this.
I know that I could write a cloud function to consume a message from the subscription and act upon it accordingly, but I'd rather not have to do that; a monitoring alert feels like a much more elegant way of achieving this.
Is this possible?
Yes, you can use Cloud Monitoring for that. Create a new alerting policy with the following configuration:
Select the Pub/Sub Topic resource and the Published messages metric. Observe the value every minute and count it (set the aligner in the advanced options). Then, in the condition config, raise the alert when the most recent value is above 0.
To restrict the alert to your topic, add a filter on topic_id with your topic name.
Finally, configure your alert to send an email. It should work!
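For reference, the condition's filter (in the advanced filter field) would look roughly like this; my-dead-letter-topic is a placeholder for your topic name, and the metric type shown is an assumption about which metric backs the "Published messages" chart:

```
metric.type = "pubsub.googleapis.com/topic/send_message_operation_count"
resource.type = "pubsub_topic"
resource.label.topic_id = "my-dead-letter-topic"
```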

Undefined step error even though step is defined

I have the following BDD Given step in my landing.py file:
use_step_matcher("re")
@given(u'that I land on the (page_name) landing page with the correct offer id (?:start network traffic)')
My feature file is as follows:
Scenario: DR lil page end2end test starting with utm parameters
Given that I land on the "google page" landing page with the correct offer id start network traffic
When I try to run it, behave says the step is undefined.
I am using the step matcher "re", and the text "start network traffic" in my Given statement is meant to be optional. Why is it failing?
Try:
@given(u'that I land on the {page_name} landing page with the correct offer id (?:start network traffic)')
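For what it's worth, with use_step_matcher("re") the step text is treated as a plain regular expression: (page_name) matches the literal text page_name rather than capturing the quoted name from the feature file, and (?:start network traffic) is not optional without a trailing ?. A sketch of a pattern that does match the feature line, checked here with the re module directly (the group name page_name comes from the question; everything else is illustrative):

```python
import re

# A named group captures the quoted page name, and the trailing "?"
# makes the "start network traffic" suffix genuinely optional.
STEP_PATTERN = re.compile(
    r'that I land on the "(?P<page_name>[^"]+)" landing page '
    r'with the correct offer id(?: start network traffic)?'
)

m = STEP_PATTERN.match(
    'that I land on the "google page" landing page '
    'with the correct offer id start network traffic'
)
print(m.group('page_name'))  # → google page
```

The same pattern string would then go inside the @given(...) decorator.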

Exceeded soft memory limit of 512 MB with 532 MB after servicing 3 requests total. Consider setting a larger instance class in app.yaml

We are on the Google App Engine standard environment, F2 instance (generation 1, Python 2.7). We have a reporting module that follows this flow.
A worker task is initiated in a queue:
task = taskqueue.add(
    url='/backendreport',
    target='worker',
    queue_name='generate-reports',
    params={
        "task_data": task_data
    })
In the worker class, we query Google Datastore and write the data to a Google Sheet. We paginate through the records to find additional report elements. When we find an additional page, we call the same task again to spawn another write, so it can fetch the next set of report elements and write them to the Google Sheet.
In backendreport.py we have the following code.
class BackendReport():
    # Query Google Datastore to find the records (paginated)
    result = self.service.spreadsheets().values().update(
        spreadsheetId=spreadsheet_Id,
        range=range_name,
        valueInputOption=value_input_option,
        body=resource_body).execute()
    # If pagination finds additional records
    task = taskqueue.add(
        url='/backendreport',
        target='worker',
        queue_name='generate-reports',
        params={
            "task_data": task_data
        })
We run the same BackendReport (with pagination) as a front-end job (not as a task). The pagination works without any error, meaning we fetch each page of records and display it to the front end. But when we execute the tasks iteratively, it fails with the soft memory limit issue. We were under the impression that every time a task is called (for each pagination step) it should act independently and there shouldn't be any memory constraints. What are we doing wrong here?
Why doesn't GCP spin up a different instance automatically when the soft memory limit is reached (our instance class is F2)?
The error message says the soft memory limit of 512 MB was reached after servicing 3 requests total. Does this mean that the backendreport module spun up 3 requests, i.e. that there were 3 task calls (/backendreport)?
Why doesn't GCP spin a different instance when the soft memory limit is reached
One of the primary mechanisms App Engine uses to decide when to spin up a new instance is max_concurrent_requests. You can check out all of the automatic_scaling params you can configure here:
https://cloud.google.com/appengine/docs/standard/python/config/appref#scaling_elements
does this mean that the backendreport module spun up 3 requests - does it mean there were 3 task calls (/backendreport)?
I think so. To be sure, you can open up the Logs Viewer, find the log entry where this was printed, and filter your logs by that instance id to see all the requests it handled leading up to that point.
You're creating multiple tasks in the queue, but there's no limit on dispatching from that queue, so as the queue dispatches multiple tasks at the same time, the instance reaches its memory limit. The limit you want to set is indeed max_concurrent_requests, but not for the instances in app.yaml; it should be set for queue dispatching in queue.yaml, so that only one task at a time is dispatched:
queue:
- name: generate-reports
  rate: 1/s
  max_concurrent_requests: 1

How to ensure all users are being sent only one daily message using GAE and deferred task queues

I am using the deferred task queues library with GAE. Every day I need to send a piece of text to all users connected to a certain page in my app. My app has multiple pages connected, so for each page, I want to go over all users, and send them a daily message. I am using a cursor to iterate over the table of Users in batches of 800. If there are more than 800 users, I want to remember where the cursor left off, and start another task with the other users.
I just want to make sure that with my algorithm I am going to send all users only one message. I want to make sure I won't miss any users, and that no user will receive the same message twice.
Does this look like the proper algorithm to handle my situation?
def send_news(page_cursor=None, page_batch_size=1,
              user_cursor=None, user_batch_size=800):
    p_query = PageProfile.query(PageProfile.subscribed == True)
    all_pages, next_page_cursor, page_more = p_query.fetch_page(
        page_batch_size, start_cursor=page_cursor)
    for page in all_pages:
        if page.page_news_url and page.subscribed:
            query = User.query(User.subscribed == True,
                               User.page_id == page.page_id)
            all_users, next_user_cursor, user_more = query.fetch_page(
                user_batch_size, start_cursor=user_cursor)
            for user in all_users:
                user.sendNews()
            # If there are more users on this page, remember the cursor
            # and get the next 800 users on this same page
            if user_more:
                deferred.defer(send_news, page_cursor=page_cursor,
                               user_cursor=next_user_cursor)
    # If there are more pages left, use another deferred task to
    # send the daily news to users on that page
    if page_more:
        deferred.defer(send_news, page_cursor=next_page_cursor)
    return "OK"
You could wrap your user.sendNews() call in another deferred task with a specific name, which ensures it is created only once.
interval_num = int(time.time()) / (60 * 60 * 24)
args = ('positional_arguments_for_object',)
kwargs = {'param': 'value'}
task_name = '_'.join([
    'user_name',
    'page_name',
    str(interval_num)
])
# With the interval in the name, the task name for the same page and
# same user stays the same for 24 hours
try:
    deferred.defer(obj, _name=task_name, _queue='my-queue',
                   _url='/_ah/queue/deferred', *args, **kwargs)
except taskqueue.TaskAlreadyExistsError:
    # A task with this name already exists; it likely wasn't executed yet
    pass
except taskqueue.TombstonedTaskError:
    # A task with this name was created recently and the name isn't
    # available to reuse; this should reset after a week or so
    pass
Note that, as far as I remember, App Engine does not guarantee that a task will be executed exactly once; in some edge cases it could be executed twice or more, so ideally tasks should be idempotent. If such edge cases matter to you, you could transactionally read/write a flag entity in the datastore for each task, and before executing the task, check whether that entity exists and cancel the execution if it does.
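The flag-entity idea in that last sentence can be sketched as follows; here an in-memory set stands in for the datastore, and run_idempotent and the key format are illustrative names only (on App Engine, the membership check and the write would be a transactional get/put of an entity keyed by the task name):

```python
# In-memory stand-in for the datastore "flag" entity. On App Engine the
# check-and-set below would be a transactional get/put keyed by task_name.
_completed_tasks = set()

def run_idempotent(task_name, action):
    """Execute action() at most once per task_name, even if the task
    is dispatched more than once."""
    if task_name in _completed_tasks:
        return False  # duplicate dispatch: skip the send
    _completed_tasks.add(task_name)
    action()
    return True
```

Real race safety requires the datastore transaction; the in-memory version only illustrates the control flow.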

LDAP query only returning 1000 users... yes, I am using paging

I have a simple GetStaff function that should retrieve all users from Active Directory. We have over 1,000 users, so the directory searcher uses paging because the default AD MaxPageSize is 1000.
Currently the search works "sometimes" when I build, sending back all 1054 users; other times it only sends back 1000. If it works once, it works every time; if it fails once, it fails every time. I have wrapped everything in using statements to make sure the objects are disposed, but it still doesn't always seem to respect the PageSize attribute. By default, if the PageSize attribute is set, the searcher should use a SizeLimit of 0. I have tried leaving the size limit out, setting it to 0, and setting it to 100000, and the unstable result is the same. I have also tried lowering the PageSize to 250 with the same unstable results. I then tried changing the LDAP policy on the server to a MaxPageSize of 10000, and I still receive 1000 users even with the searcher PageSize set to 10000. Not sure what I am missing here; any help or direction would be appreciated.
public IEnumerable<StaffInfo> GetStaff(string userId)
{
    try
    {
        var userList = new List<StaffInfo>();
        using (var directoryEntry = new DirectoryEntry("LDAP://" + _adPath + _adContainer, _quarcAdminUserName, _quarcAdminPassword))
        {
            using (var de = new DirectorySearcher(directoryEntry)
            {
                Filter = GetDirectorySearcherFilter(LdapFilterOptions.AllUsers),
                PageSize = 1000,
                SizeLimit = 0
            })
            {
                foreach (SearchResult sr in de.FindAll())
                {
                    try
                    {
                        var userObj = sr.GetDirectoryEntry();
                        var staffInfo = new StaffInfo(userObj);
                        userList.Add(staffInfo);
                    }
                    catch (Exception ex)
                    {
                        Log.Error("AD Search result loop Error", ex);
                    }
                }
            }
        }
        return userList;
    }
    catch (Exception ex)
    {
        Log.Error("AD get staff try Error", ex);
        return Enumerable.Empty<StaffInfo>();
    }
}
A friend got back to me with the below response that helped me out, so I thought I would share it and hope it helps anyone else with the same issue.
The first thing I think of is "Are you using the domain name, e.g. foo.com as the _adpath?"
If so, then I have a pretty good idea. A DNS query for foo.com will return a random list of up to 25 DCs in the domain. If the first DC in that random list is unresponsive or firewalled off, and you get that DC from DNS, then you will experience the behavior you describe. Since DNS is cached on the local machine, you will see it happen consistently one day, then not the next. That's infuriating behavior. :/
You can verify this with a network trace to see if this is happening.
So how do you workaround it? A couple of options.
Query DNS -> create a list of the hosts returned -> try the first one. If it fails, try the next one. If you hit the bottom of the list, fail. If you do this, log each individual failure noisily so the admins don't blame you.
Even better would be to ask the AD administrators for a list of ldap servers and use that with the approach described above.
80% of administrators will tell you just to use the domain name. This is good because deploying a new domain will "just work" with no reconfiguration required.
15% of administrators will want to specify a couple of DCs that are network closest to the application. This is good for performance, but bad if they forget about this application when the time comes for them to upgrade their domain.
The other 5% doesn't really matter. :)
The next point that I see is that you are using LDAP, not LDAPs. That is fine, but there is a risk that you will use "Basic" binds. With "Basic" binds, joe hacker can steal your account credentials using a network sniffer. There are a couple of possible workarounds.
1. There is another DirectoryEntry constructor that will let you specify "Secure" as the auth method.
2. Ask your admins if you can use LdapS. (more portable, in case you need to talk to an LDAP server other than Active Directory)
The last piece is regarding Page Size. 1,000 should be fine universally. Don't use any value > 5,000 or you can expect some fidgety behavior; that is higher than the default limit under Windows 2003, and in Windows 2008 the page size is hard-limited to 5,000 unless it's been overridden using a rather obscure bit in AD called dsHeuristics. http://support.microsoft.com/kb/2009267
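The "query DNS, build a host list, try each in turn" workaround from the first option above is language-agnostic; here is a minimal sketch in Python (the domain name and port are placeholders, and a C# version would use Dns.GetHostAddresses instead):

```python
import socket

def dc_candidates(domain, port=389):
    """Resolve the domain name to the full list of DC addresses so the
    caller can try each one in turn instead of trusting the first."""
    infos = socket.getaddrinfo(domain, port, socket.AF_INET, socket.SOCK_STREAM)
    seen, hosts = set(), []
    for info in infos:
        addr = info[4][0]
        if addr not in seen:  # keep DNS order, drop duplicates
            seen.add(addr)
            hosts.append(addr)
    return hosts

# Each candidate would then be tried in order, logging every failure
# noisily before falling through to the next host.
```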
LDAP is configured, by default, to return a maximum of 1000 results. You can change this setting on the domain you're requesting from.
