How many levels are allowed in AWS IoT-Core Topics? Is it possible to increase it? - aws-iot

I am working on a project where the topics have more than 8 levels. I just realized that when I try to publish a 9-level topic, T0/T1/T2/T3/T4/T5/T6/T7/T8, the AWS IoT-Core test console throws an error: Error Code 8 - AMQJS0008I Socket closed.
But if I reduce it to an 8-level topic (T0-T7), everything works fine. I am assuming 8 levels is the limit.
The official AWS documentation doesn't say how many topic levels this broker allows, or how to increase the limit if that is possible.
I couldn't find such a limit even for a regular MQTT broker, which could have given me a clue about AWS.
So, does anyone know how many levels are allowed, or how to increase this limit in AWS, if possible?

There is a limit of 8 topic levels, i.e. 7 forward slashes. This is documented under the protocol limits at https://docs.aws.amazon.com/general/latest/gr/iot-core.html#limits_iot
Maximum number of slashes in topic and topic filter
A topic in a publish or subscribe request can have no more than 7 forward slashes (/). This excludes the first 3 slashes in the mandatory segments for Basic Ingest topics ($AWS/rules/rule-name/).
I have not seen any documentation that states that this limit can be increased.
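If you want to catch this before the broker closes the socket, a minimal sketch of a pre-publish check is below. It assumes the boto3 IoT data-plane client (boto3.client("iot-data")) with valid credentials; the topic and payload are only examples.

import boto3

MAX_SLASHES = 7  # documented AWS IoT Core limit: at most 7 forward slashes per topic

def publish_checked(topic, payload, qos=0):
    # Refuse locally instead of letting the broker drop the connection.
    if topic.count("/") > MAX_SLASHES:
        raise ValueError("Topic %r has %d slashes; AWS IoT Core allows at most %d"
                         % (topic, topic.count("/"), MAX_SLASHES))
    client = boto3.client("iot-data")
    return client.publish(topic=topic, qos=qos, payload=payload)

# T0/.../T7 (8 levels, 7 slashes) is accepted; T0/.../T8 (9 levels) is rejected locally.
publish_checked("T0/T1/T2/T3/T4/T5/T6/T7", b'{"ok": true}')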

Related

Equal connection distribution is not happening in Aurora autoscaled instances

We are running a REST API based spring boot application using AWS Aurora as Database. Our application connects to read-only Aurora MySQL RDS instances.
We are doing load testing on it. Initially we have one database and we have autoscaling in place, which is triggered on high CPU.
Now we expect that if we get some X throughput with one DB instance, we should get approximately 1.8X when autoscaling happens, and connections should be distributed equally among the newly created database instances.
But that is not happening; instead, DB connections go up and down on both database instances erratically. Because of this our load is not distributed equally and we are not getting the desired throughput. Sometimes one database is running at 100% CPU while the other is still at 20%, and after a few minutes it reverses.
Below is the database connection configuration:
Driver - com.mysql.jdbc.driver
Maximum active connections=100
Max age = 300000
Initial pool size = 10
Tomcat jdbc pool is used for connection pooling
NOTE:
1) We have also disabled JVM network DNS caching.
2) We also tried refreshing the database connections every 5 minutes, even the active ones.
3) We have tried everything suggested by AWS but nothing is working.
4) We have even written a Lambda function to update Route 53 when a new DB instance comes up, to avoid cluster endpoint caching, but we still see the same issue.
Can anyone please advise on the best practice here? Currently we cannot take this into production.
This is not a great answer, but since you haven't gotten any replies yet, here are some thoughts.
1) The behavior you are seeing replicates the bad routing logic of load balancers
This is no surprise to you, but this used to be much more common with small web server deployments, especially with long-running queries. With connection pooling, you mirror this situation.
2) Taking this assumption forward, we need to guess at how Amazon chooses to balance traffic to read-only replicas.
Even in their white paper, they don't mention how they are doing routing: https://www.allthingsdistributed.com/files/p1041-verbitski.pdf
Likely options are route53 or an NLB.
My best guess would be that they are using an NLB. NLBs only became generally available in Q3 2017 and Aurora launched about two years earlier, but it is still a reasonable guess.
NLBs would let us balance based on least connections (far better than round robin).
3) Validating assumptions
If Route 53 is being used, then we should be able to use DNS to find out.
I ran dig against the Route 53 reader endpoint and it gave me an answer:
dig +nocmd +noall +answer zzz-databasecluster-xxx.cluster-ro-yyy.us-east-1.rds.amazonaws.com
zzz-databasecluster-xxx.cluster-ro-yyy.us-east-1.rds.amazonaws.com. 1 IN CNAME zzz-0.yyy.us-east-1.rds.amazonaws.com.
zzz-0.yyy.us-east-1.rds.amazonaws.com. 5 IN A 10.32.8.33
I did it again and got a different answer:
dig +nocmd +noall +answer zzz-databasecluster-xxx.cluster-ro-yyy.us-east-1.rds.amazonaws.com
zzz-databasecluster-xxx.cluster-ro-yyy.us-east-1.rds.amazonaws.com. 1 IN CNAME zzz-2.yyy.us-east-1.rds.amazonaws.com.
zzz-2.yyy.us-east-1.rds.amazonaws.com. 5 IN A 10.32.7.97
What you can see is that the read-only endpoint gives me a CNAME pointing at one of the replica instance endpoints.
zzz is the name of my cluster, xxx came from my CloudFormation stack, and yyy comes from Amazon.
Note: zzz-0 and zzz-2 are the two read-only replicas.
What we can see here is that Route 53 is doing our load balancing.
4) Route53 Load Balancing
They are likely setting up Route 53 with round robin across all healthy read-only replicas.
The TTL is likely 5s.
Unhealthy nodes will get removed, but there is no balancing based on load or connection count.
5) Ramifications
A) Using the read-only endpoint can only balance traffic away from unhealthy instances
B) DB pools keep connections for a long time, which means that new read replicas won't be touched
If we have a small number of servers, we will be unbalanced, which we can't do much about.
6) Thoughts on what you can do
A) Verify for yourself with dig that you are getting correct DNS resolution that keeps rotating between replicas every 5s.
If you don't, this is something you need to fix.
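If you prefer to script that check, here is a small sketch using dnspython (version 2.x assumed); the reader endpoint shown is the placeholder hostname from the dig output above, so substitute your own -ro- endpoint:

import time
import dns.resolver  # pip install dnspython

# Placeholder reader endpoint from the dig example above.
READER_ENDPOINT = "zzz-databasecluster-xxx.cluster-ro-yyy.us-east-1.rds.amazonaws.com"

resolver = dns.resolver.Resolver()
for _ in range(10):
    answer = resolver.resolve(READER_ENDPOINT, "A")
    print(time.strftime("%H:%M:%S"),
          [rr.address for rr in answer],
          "ttl=%d" % answer.rrset.ttl)
    time.sleep(5)  # the reader endpoint's records have a TTL of only a few seconds

You should see the A record rotate between the replica instances on successive lookups.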
B) Periodically recycle DB Clients
New replicas will get used, and while you will still be unbalanced, this helps by keeping the distribution changing.
What is critical, though, is that you MUST NOT have all your clients recycle at the same time; otherwise you run the risk of them all reconnecting to the same instance at the same time. I would suggest a random TTL per client (within a min/max).
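To illustrate the randomized TTL idea: your app uses the Tomcat JDBC pool, so this is only a sketch of the concept, with pymysql and the TTL bounds as stand-ins.

import random
import time
import pymysql  # stand-in driver; the real app would do this inside the Tomcat JDBC pool

MIN_TTL, MAX_TTL = 180, 420  # seconds; spread recycling so clients don't sync up

class RecyclingConnection:
    # Reconnect (and therefore re-resolve the -ro- endpoint) after a randomized age.
    def __init__(self, host, **kwargs):
        self._host, self._kwargs = host, kwargs
        self._conn, self._expires_at = None, 0.0

    def connection(self):
        now = time.time()
        if self._conn is None or now >= self._expires_at:
            if self._conn is not None:
                self._conn.close()
            self._conn = pymysql.connect(host=self._host, **self._kwargs)
            self._expires_at = now + random.uniform(MIN_TTL, MAX_TTL)
        return self._conn

Because each client draws its own TTL, reconnects are spread out instead of happening in lockstep.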
C) Manage it yourself
Summary: When you connect, connect directly to the read replica with least connection/lowest CPU.
How you do this is not entirely simple. I would suggest a Lambda function that keeps this connection string in a queryable location and updates it at some frequency. I would say the frequency of updating the preferred DB should be about one tenth of the frequency at which you recycle the DB connections. You could add logic so that if the DBs are running similarly you hand out the read-only endpoint, and only hand out an explicit instance when there is significant inequity.
I would caution that when a new instance comes up, you want to be careful that all your clients don't pile onto it at once.
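A rough sketch of such a Lambda, assuming boto3, a hypothetical cluster identifier and parameter name, SSM Parameter Store as the queryable location, and CloudWatch's CPUUtilization metric as the "least loaded" signal:

import datetime
import boto3

CLUSTER_ID = "zzz-databasecluster-xxx"          # hypothetical cluster identifier
PARAM_NAME = "/myapp/preferred-reader-endpoint"  # hypothetical SSM parameter

rds = boto3.client("rds")
cloudwatch = boto3.client("cloudwatch")
ssm = boto3.client("ssm")

def average_cpu(instance_id, minutes=5):
    end = datetime.datetime.utcnow()
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        StartTime=end - datetime.timedelta(minutes=minutes),
        EndTime=end,
        Period=60,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    return sum(p["Average"] for p in points) / len(points) if points else 0.0

def handler(event, context):
    members = rds.describe_db_clusters(DBClusterIdentifier=CLUSTER_ID)[
        "DBClusters"][0]["DBClusterMembers"]
    readers = [m["DBInstanceIdentifier"] for m in members if not m["IsClusterWriter"]]
    # Pick the reader with the lowest recent CPU and publish its endpoint.
    best = min(readers, key=average_cpu)
    endpoint = rds.describe_db_instances(DBInstanceIdentifier=best)[
        "DBInstances"][0]["Endpoint"]["Address"]
    ssm.put_parameter(Name=PARAM_NAME, Value=endpoint, Type="String", Overwrite=True)
    return endpoint

Clients would read the parameter when they (re)connect, falling back to the read-only endpoint if the parameter is missing or stale.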
D) Increase number of clients or number of read only copies
Both of these would decrease the chance that two boxes end up with significantly different load.

Approaches to scan and fetch all DNS entries

I'd like to conduct an experiment and I need a full database of all DNS entries on the Internet.
Is it practical to scan the Internet and fetch all DNS entries?
What is the limitation: storage, time or network bandwidth?
Any good approaches to start with?
(I can always brute-force scan the IP space and do a reverse DNS lookup, but I guess that's not an efficient way to do it.)
Downloading databases like RIPE's or ARIN's will not get you the reverse DNS entries you want. In fact, you'll only get the Autonomous Systems and the DNS servers resolving these ranges. Nothing else. Check this one: ftp://ftp.ripe.net/ripe/dbase/ripe.db.gz
Reverse DNS queries will get you only a fraction of all the DNS entries. In fact, no one can have them all, as most domains don't accept AXFR requests, and it could be considered illegal in some countries. To get access to the complete list of .com/.net/.org domain names you would need to be ICANN or maybe an ICANN reseller, and you'll never get the other TLDs which aren't publicly available (several countries).
So the best possible approach is to brute-force all reverse IP resolution + become an Internet giant like Google and run your own public DNS servers + try to perform an AXFR request against every domain name you're able to detect.
Mixing all these options is the only way to get a significant portion of all the DNS entries, but never 100%, and probably not more than 5 to 10%. Forget about brute-forcing whois servers to get the list of domain names; it's forbidden by their terms and conditions.
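For completeness, the AXFR attempt looks roughly like this with dnspython (2.x assumed); zonetransfer.me is a domain intentionally left open for testing, and almost every real domain will refuse the transfer:

import dns.query
import dns.resolver
import dns.zone

def try_axfr(domain):
    # Ask each authoritative nameserver for a zone transfer; most will refuse.
    for ns in dns.resolver.resolve(domain, "NS"):
        try:
            addr = dns.resolver.resolve(str(ns.target), "A")[0].address
            zone = dns.zone.from_xfr(dns.query.xfr(addr, domain))
            return sorted(str(name) for name in zone.nodes)
        except Exception:
            continue  # transfer refused or timed out: the usual case
    return []

print(try_axfr("zonetransfer.me"))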
We're brute-forcing reverse IPv4 resolution right now, because it's the only legal way to do it without being Google. We started 2 weeks ago.
After two weeks of tuning, we've completed about 20% of the Internet. We've developed a Python script launching thousands of threads, scanning /24 ranges in parallel from several different nodes.
It's way faster than nmap -sL, however it's not as reliable as nmap, so we'll need a "second pass" to fill in the gaps (around 85% of the IPs got resolved on the first attempt). Regular rescanning must be performed to obtain a complete and consistent database.
Right now we have several servers running at 2 Mbps of DNS queries on every node (from 300 to 4,000 queries/second per node, mostly depending on the RTT between our servers and the remote DNS servers).
We expect to complete the first pass over all the IPv4 space in around 30 days.
The text files where we store the preliminary results average 3M entries for every class A range (i.e. 111.0.0.0/8). These files are just "IP\tname\n", and we only store resolved IPs.
We needed to run a DNS resolver on every server, because we were affecting the DNS service of our provider and it blocked us. In fact we did a bit of benchmarking of different DNS servers: forget about BIND, it's too heavy and you'll hardly get more than 300 resolutions/second.
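The per-/24 worker can be as simple as the following sketch (the thread count and output format are arbitrary; a production scanner would add timeouts, retries and its own resolver):

import socket
from concurrent.futures import ThreadPoolExecutor

def resolve_ptr(ip):
    # Return "ip\tname" for IPs that have a PTR record, or None.
    try:
        name, _, _ = socket.gethostbyaddr(ip)
        return "%s\t%s" % (ip, name)
    except OSError:
        return None

def scan_24(prefix, workers=256):
    # Reverse-resolve a /24 such as "111.0.0" and yield tab-separated lines.
    ips = ["%s.%d" % (prefix, host) for host in range(256)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in pool.map(resolve_ptr, ips):
            if line:
                yield line

if __name__ == "__main__":
    for line in scan_24("8.8.8"):
        print(line)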
Once we finish the scan we'll publish an article and share the database :)
Follow me on Twitter: #kaperuzito
One conclusion we have already reached is that people might want to think twice about the names they put in their DNS PTR entries. You shouldn't name an IP "payroll", "ldap", "intranet", "test", "sql", "VPN" and so on... and there are millions of those :(

Why am I hitting the datastore read operation quota?

I was somewhere without Internet access for 3 weeks and just came back to find out that since January 18 one of my apps has been hitting a quota limit (Datastore Read Operations) after around 18 hours.
I don't see any increase in traffic from either users or crawlers.
This is the error in the logs:
"The API call datastore_v3.RunQuery() required more quota than is available."
It seems very strange, since this application has been running for some years and I'm memcaching most of the datastore requests.
Please help - This is affecting my bottom line!
Thanks.
I found a subset of pages on the site that suddenly got attention from several crawlers, and some of the requests those pages made to the Datastore were not being memcached, so that was it... problem solved.
Thanks.
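For anyone hitting the same quota, the fix boils down to wrapping those queries in memcache on the Python runtime. A minimal sketch of that wrapper; the cache key, TTL and the Article model are hypothetical:

from google.appengine.api import memcache

CACHE_SECONDS = 600  # hypothetical TTL; tune to how fresh the pages need to be

def cached_fetch(cache_key, query, limit=20):
    # Serve repeated crawler hits from memcache instead of datastore_v3.RunQuery.
    results = memcache.get(cache_key)
    if results is None:
        results = query.fetch(limit)          # the only call that costs read ops
        memcache.set(cache_key, results, CACHE_SECONDS)
    return results

# Hypothetical usage for a page crawlers keep requesting:
# articles = cached_fetch("recent-articles", Article.all().order("-created"))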

App Engine server + Android multiplayer game

Greetings,
I'm creating a multi-player Android game and thought it would be an interesting idea to have App Engine handle the server work.
The game consists of 4 players, and each phone requests an update every 0.5 seconds.
These requests are very simple and lightweight, so I shouldn't be exceeding any free quotas.
The problem I found is that App Engine only handles about 500 requests per second. With 4 players each polling twice a second, a session generates roughly 8 requests per second, so I would only be able to have around 60 game sessions active (500 / 8 ≈ 62) before App Engine starts ignoring new requests?
"App Engine's quota system allows for efficient applications with billing enabled to scale to around 500 queries per second (qps) or more than 40 million queries per day."
Or should I just not use this platform because it is not made for this kind of usage?
I sent this same question to the discussion groups on Google, but after 4 hours it hadn't been posted and there was no response on whether it was a bad question or anything. Hopefully someone here can give me some advice.
Thank you kindly, I'm looking forward to an answer and/or advice.
Greetings,
Rohan C
That's an interesting question, considering the only page where I can find that quote contains the answer in the same paragraph.
http://code.google.com/support/bin/request.py?contact_type=AppEngineCPURequest
App Engine's quota system allows for efficient applications with billing enabled to scale to around 500 queries per second (qps) or more than 40 million queries per day. This is a substantial amount of traffic and should easily suffice for even the heaviest of Slashdottings. But if you expect your application will need to handle even higher qps, please complete this form so we can assist you.

How is MapReduce a good method to analyse http server logs?

I've been looking at MapReduce for a while, and it seems to be a very good way to implement fault-tolerant distributed computing. I read a lot of papers and articles on that topic, installed Hadoop on an array of virtual machines, and did some very interesting tests. I really think I understand the Map and Reduce steps.
But here is my problem: I can't figure out how it can help with HTTP server log analysis.
My understanding is that big companies (Facebook for instance) use MapReduce to process their HTTP logs in order to speed up extracting audience statistics from them. The company I work for, while smaller than Facebook, has a big volume of web logs to process every day (100 GB, growing between 5 and 10 percent every month). Right now we process these logs on a single server, and it works just fine. But distributing the computing jobs instantly comes to mind as a soon-to-be-useful optimization.
Here are the questions I can't answer right now; any help would be greatly appreciated:
Can the MapReduce concept really be applied to weblog analysis?
Is MapReduce the most clever way of doing it?
How would you split the web log files between the various computing instances?
Thank you.
Nicolas
Can the MapReduce concept really be applied to weblog analysis?
Yes.
You can split your huge logfile into chunks of, say, 10,000 or 1,000,000 lines (whatever is a good chunk size for your type of logfile; for Apache logfiles I'd go for a larger number), feed them to some mappers that extract something specific (like browser, IP address, ..., username, ...) from each log line, then reduce by counting the number of times each one appeared (simplified):
192.168.1.1,FireFox x.x,username1
192.168.1.1,FireFox x.x,username1
192.168.1.2,FireFox y.y,username1
192.168.1.7,IE 7.0,username1
You can extract browsers, ignoring version, using a map operation to get this list:
FireFox
FireFox
FireFox
IE
Then reduce to get this:
FireFox,3
IE,1
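As a concrete sketch, here are a mapper and a reducer for Hadoop Streaming that implement exactly that browser count; the comma-separated field layout is the simplified one shown above, so the parsing would need adjusting for a real Apache log line.

#!/usr/bin/env python
# mapper.py - emit "browser\t1" for every log line.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split(",")
    if len(fields) >= 2:
        browser = fields[1].split()[0]   # "FireFox x.x" -> "FireFox"
        print("%s\t1" % browser)

#!/usr/bin/env python
# reducer.py - sum the counts per browser (Hadoop sorts the mapper output by key).
import sys

current, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current:
        if current is not None:
            print("%s\t%d" % (current, count))
        current, count = key, 0
    count += int(value)
if current is not None:
    print("%s\t%d" % (current, count))

Pass the two scripts to Hadoop Streaming as the mapper and reducer, and Hadoop takes care of splitting the input, sorting the intermediate keys, and distributing the work.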
Is MapReduce the most clever way of doing it?
It's clever, but you would need to be very big in order to gain any benefit, i.e. splitting petabytes of logs.
To do this kind of thing, I would prefer to use message queues and a consistent storage engine (like a database), with processing clients that pull work from the queues, perform the job, and push results to another queue; jobs that are not completed within some timeframe are made available again for others to process. These clients would be small programs that do something specific.
You could start with 1 client and expand to 1,000... You could even have a client that runs as a screensaver on all the PCs on a LAN, and run 8 clients on your 8-core servers, 2 on your dual-core PCs...
With pull: you could have 10 or 100 clients working, multicore machines could run multiple clients, and whatever a client finishes is available for the next step. You don't need to do any hashing or assignment for the work to be done; it's 100% dynamic.
http://img355.imageshack.us/img355/7355/mqlogs.png
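A toy, single-machine sketch of that pull model using Python's multiprocessing queues; the chunking and parsing are placeholders, and in practice the queues would live in an external message broker:

import multiprocessing as mp

def worker(jobs, results):
    # Each client pulls whatever chunk is available next: no hashing or static
    # assignment, and a fast machine simply pulls more often.
    for chunk in iter(jobs.get, None):          # None is the shutdown signal
        counts = {}
        for line in chunk:
            browser = line.split(",")[1].split()[0]
            counts[browser] = counts.get(browser, 0) + 1
        results.put(counts)

if __name__ == "__main__":
    jobs, results = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=worker, args=(jobs, results)) for _ in range(4)]
    for w in workers:
        w.start()
    # Hypothetical input: chunks of already-split log lines.
    chunks = [["192.168.1.1,FireFox x.x,username1"], ["192.168.1.7,IE 7.0,username1"]]
    for chunk in chunks:
        jobs.put(chunk)
    for _ in workers:
        jobs.put(None)
    totals = {}
    for _ in chunks:
        for browser, n in results.get().items():
            totals[browser] = totals.get(browser, 0) + n
    for w in workers:
        w.join()
    print(totals)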
How would you split the web log files between the various computing instances?
By number of elements or lines if it's a text-based logfile.
In order to test MapReduce, I'd like to suggest that you play with Hadoop.
Can the MapReduce concept really be applied to weblog analysis?
Sure. What sort of data are you storing?
Is MapReduce the most clever way of doing it?
It would allow you to query across many commodity machines at once, so yes it can be useful. Alternatively, you could try Sharding.
How would you split the web log files between the various computing instances?
Generally you would distribute your data using a consistent hashing algorithm, so you can easily add more instances later. You should hash by whatever would be your primary key in an ordinary database. It could be a user ID, an IP address, a referrer, a page, an advert; whatever the subject of your logging is.
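To make that concrete, here is a minimal consistent hash ring sketch; the worker names and the choice of client IP as the key are just examples:

import bisect
import hashlib

class ConsistentHashRing:
    # Minimal consistent hash ring: most keys keep their node when instances are added.

    def __init__(self, nodes, replicas=100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node, replicas)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node, replicas=100):
        for i in range(replicas):            # virtual nodes smooth the distribution
            self._ring.append((self._hash("%s#%d" % (node, i)), node))
        self._ring.sort()

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[idx][1]

# Hypothetical usage: route each log line to a worker by its "primary key".
ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3"])
print(ring.node_for("192.168.1.1"))          # e.g. hash by client IP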
