How do Transactions Per Second work on Hedera - hedera-hashgraph

Suppose I send 1000 transactions per second and finality is 2-3 seconds. Will it take 2-3 seconds to finalize all 1000 of those transactions? Or will there be a difference in finalization delay between the first and the last transaction?

Each individual transaction will finalise within a few seconds, so yes: if the consensus latency is, say, 4s, each transaction will reach consensus 4s after it was received by a node. Since it will take you 1s to send all 1000, it will take 5s overall for all 1000 to reach consensus.
The analogy I sometimes use to distinguish throughput and latency is the difference between the speed of light and the speed of sound.
The time it takes for a flash of light or a sound to reach someone is latency.
How often you flash or sound, whether once a second or 100 times a second, is throughput. The fact that it takes sound longer to travel to its destination (latency) doesn't prevent 100 sounds from being sent in one second.
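As a back-of-the-envelope sketch of that arithmetic (plain Python, using only the example figures from the answer above), the total time is simply the time to submit everything plus the per-transaction consensus latency:

def total_time_to_consensus(num_txns, send_rate_per_sec, consensus_latency_sec):
    # Rough model: transactions are submitted at a steady rate, and each one
    # reaches consensus a fixed latency after a node receives it.
    time_to_send_all = num_txns / send_rate_per_sec       # last transaction is submitted here
    return time_to_send_all + consensus_latency_sec       # ...and finalises one latency later

# 1000 transactions at 1000 tx/s with ~4 s consensus latency:
print(total_time_to_consensus(1000, 1000, 4.0))   # -> 5.0 seconds overall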

Related

Flink input rate control

We are using Flink 1.9.1.
We have a source, a process function, and a sink. The application consumes and produces to kinesis.
The input rate (produced by a simulator) is 20 events per second. The per-second output rate for the process function shows 14 per second. The back pressure metric for the source is shown as OK (green). The event count (number of events sent by the source) and the number of events received by the process function also match, with very little delay.
But this count does not match the event count pushed by the simulator; it matches the 14-per-second rate instead.
Now my question is, does Flink regulate the input rate automatically?
In my case, how is the input rate controlled at 14 per second?
If it is not, is there any other metric that I should be looking at that I'm missing?
It's not possible to force a Flink pipeline to consume events at a particular rate. By design, there is limited buffering in the network stack, and the slowest task in the execution graph will dictate the rate at which the pipeline will consume and process events.
The back pressure monitoring (that green OK signal) is not a definitive guide to whether back pressure is occurring. So long as the job is able to make steady forward progress, it probably won't indicate that there's a problem. You could examine some of the network queue metrics to get more insight: e.g., inPoolUsage, outPoolUsage, inputQueueLength. See Flink Network Stack Vol. 2: Monitoring, Metrics, and that Backpressure Thing for a lot more on this topic.
20 events per second seems very slow, so I am a bit surprised that something can't keep up with that rate, but that appears to be what's happening.
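If you want to pull those network queue metrics programmatically rather than through the web UI, a rough sketch against Flink's monitoring REST API could look like the following. The host/port and job id are placeholders, and the exact metric names vary by version, so the sketch first asks each vertex which metric ids it exposes and then queries the pool-usage and queue-length ones it finds:

import requests  # assumes the requests library is available

FLINK = "http://localhost:8081"   # assumed JobManager REST address

def print_queue_metrics(job_id):
    # Walk the job's vertices and report their buffer/queue related metrics.
    job = requests.get("%s/jobs/%s" % (FLINK, job_id)).json()
    for vertex in job["vertices"]:
        vid = vertex["id"]
        # List the metric ids this vertex exposes (names differ between Flink
        # versions), then query only the buffer-pool and input-queue ones.
        available = requests.get("%s/jobs/%s/vertices/%s/metrics" % (FLINK, job_id, vid)).json()
        wanted = [m["id"] for m in available
                  if "PoolUsage" in m["id"] or "QueueLength" in m["id"]]
        if not wanted:
            continue
        values = requests.get("%s/jobs/%s/vertices/%s/metrics" % (FLINK, job_id, vid),
                              params={"get": ",".join(wanted)}).json()
        print(vertex["name"], {m["id"]: m["value"] for m in values})

# Usage, with the job id taken from the web UI or GET /jobs:
# print_queue_metrics("<job-id>")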

What happens if Flink's keyBy operator is given a distinct key followed by a tumbling window

My Flink job has a keyBy operator which takes date~clientId (date as yyyymmddhhMM, with MM being the minutes, which change every 5 minutes) as the key. This operator is followed by a tumbling window of 5 minutes. We have Kafka input of 3 million events/min on average and around 20 million events/min at peak times. The checkpointing duration and the minimum pause between two checkpoints are both 3 minutes.
Now here are my doubts :
1) Is the state created by keyBy persisted forever, or is it evicted after 5 minutes?
2) What changes are required if I change this window to 30 minutes?
3) How is the checkpointing time affected by the window size?
4) What will be the effect in a scenario where the number of distinct clients in any 5 minutes grows 5-10 times? Will this create data skew? 1-2 sub-tasks in my job always take around 1-2 minutes, compared to the other 800 sub-tasks, which complete in 10-15 seconds.
5) I am getting an exception once every 5-6 hours which restarts the Flink job: TimerException{java.nio.channels.ClosedByInterruptException} at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask. What might be the probable reason?
A few points:
keyBy is not an operator, and has no state. keyBy is simply a declaration of how the stream is to be repartitioned. The tumbling window that follows the keyBy does have state, which is purged once the window is complete. You can see how much state each subtask has if you look at the breakdown in the checkpoint stats part of the web UI.
Regarding the question about skew:
"What will be the effect in a scenario where the number of distinct clients in any 5 minutes grows 5-10 times? Will this create data skew? 1-2 sub-tasks in my job always take around 1-2 minutes, compared to the other 800 sub-tasks, which complete in 10-15 seconds."
Perhaps you have one or a few clients with many more events than the rest?
It would be interesting to understand why you are doing event-time-based keying followed by processing time windows, rather than using event time windows. (I assume you are using processing time windows, correct me if I'm wrong.)
Do you have any idea how many different timeframes are active at once? E.g., the window for 12:00-12:05 will receive many events with timestamps in the range of 12:00-12:05, plus some events for 11:55-12:00 that didn't arrive by 12:00. And possibly events for earlier timeframes, if that much delay is possible. It's hard to think about key skew without understanding what the active keyspace looks like.
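To make the skew point concrete, here is a toy simulation (plain Python, not Flink; hashing the key modulo the parallelism is a simplification of Flink's key-group assignment, and the traffic shape is an assumption) showing how a few heavy clients keyed by date~clientId concentrate load on a handful of sub-tasks:

import random
from collections import Counter

PARALLELISM = 800          # number of sub-tasks, as in the question
window = "202301011205"    # one 5-minute timeframe, yyyymmddhhMM

def subtask_for(key, parallelism=PARALLELISM):
    # Simplification: Flink hashes keys into key groups and maps key groups to
    # sub-tasks, but plain hash-modulo shows the same effect.
    return hash(key) % parallelism

loads = Counter()
for _ in range(1000000):
    # Assumed skew: 30% of all events come from just 3 heavy clients.
    if random.random() < 0.3:
        client = random.choice(["heavy1", "heavy2", "heavy3"])
    else:
        client = "client%d" % random.randrange(100000)
    loads[subtask_for("%s~%s" % (window, client))] += 1

print("busiest sub-tasks:", loads.most_common(3))
print("median sub-task load:", sorted(loads.values())[len(loads) // 2])

The sub-tasks that happen to receive the heavy clients end up with on the order of 100x the median load, which matches the pattern of a couple of slow sub-tasks among 800.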

NTP and RTC HW Clock weird results

In an attempt to make the system time on an ODroid as close to real time as possible, I've tried adding a real-time clock to the ODroid. The RTC has an accuracy of +/- 4 ppm.
Without the real-time clock, I would get results like this (synced with the NTP server every 60 seconds). The blue is an Orange Pi for comparison. The x-axis is the sample, and the y-axis is the offset reported by the NTP server in ms.
So what I tried was the same thing (more samples, but the same interval), except that instead of just syncing with the NTP server, I did the following:
Set the system time to the HW-clock time.
Sync with the NTP server to update the system time, and record the offset given by the server.
Update the HW-clock to the system time, since it has just been synced to real time.
Then I wait 60 seconds and repeat.
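In script form, those steps amount to roughly the following (a minimal sketch, not the exact script used on the ODroid; it assumes the standard hwclock and ntpdate tools and parses the offset from ntpdate's output, whose format can vary):

import re
import subprocess
import time

NTP_SERVER = "pool.ntp.org"   # placeholder NTP server

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

while True:
    run(["hwclock", "--hctosys"])              # 1. set system time from the RTC
    out = run(["ntpdate", NTP_SERVER])         # 2. sync with NTP and capture its output
    match = re.search(r"offset ([-0-9.]+) sec", out)
    if match:
        print(time.time(), float(match.group(1)) * 1000, "ms")   # record the offset in ms
    run(["hwclock", "--systohc"])              # 3. write the freshly synced time back to the RTC
    time.sleep(60)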
I didn't expect it to be perfect, but what I got shocked me a little bit.
What in the world am I looking at? The jitter becomes less and less and follows an almost straight line, but when it reaches the perfect time (about 410 minutes in...), it then seems to continue and lets the jitter and offset grow again.
Can anyone explain this, or maybe tell me what I'm doing wrong?
This is weird!
So you are plotting the difference between your RTC time and the NTP server time. Where is the NTP server located? In the second plot you are working in a range of a couple hundred ms. NTP has accuracy limitations. From Wikipedia:
https://en.wikipedia.org/wiki/Network_Time_Protocol
NTP can usually maintain time to within tens of milliseconds over the public Internet, and can achieve better than one millisecond accuracy in local area networks under ideal conditions. Asymmetric routes and network congestion can cause errors of 100 ms or more.
Your data is a bit weird looking though.

SQL Server 2008 Activity Monitor Resource Wait Category: Does Latch include CPU or just disk IO?

In SQL Server 2008 Activity Monitor, I see Wait Time on Wait Category "Latch" (not Buffer Latch) spike above 10,000ms/sec at times. Average Waiter Count is under 10, but this is by far the highest area of waits in a very busy system. Disk IO is almost zero and page life expectancy is over 80,000, so I know it's not slowed down by disk hardware and assume it's not even touching SAN cache. Does this mean SQL Server is waiting on CPU (i.e. resolving a bajillion locks) or waiting to transfer data from the local server's cache memory for processing?
Background: System is a 48-core running SQL Server 2008 Enterprise w/ 64GB of RAM. Queries are under 100ms in response time - for now - but I'm trying to understand the bottlenecks before they get to 100x that level.
Class                          Count        Sum Time (ms)   Max Time (ms)
ACCESS_METHODS_DATASET_PARENT  649629086    3683117221      45600
BUFFER                         20280535     23445826        8860
NESTING_TRANSACTION_READONLY   22309954     102483312       187
NESTING_TRANSACTION_FULL       7447169      123234478       265
Some latches are IO, some are CPU, some are other resources. It really depends on which particular latch type you're seeing. sys.dm_os_latch_stats will show which latches are hot in your deployment.
I wouldn't worry about the last three items. The two NESTING_TRANSACTION ones look very healthy (low average, low max). BUFFER is also OK, more or less, although the 8s max time is a bit high.
The AM_DS_PARENT latch is related to parallel queries/parallel scans. Its average is OK, but the max of 45s is rather high. Without going into too much detail, I can tell you that long wait times on this latch type indicate that your IO subsystem can encounter spikes (and the 8s max BUFFER latch waits corroborate this).
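If you want to pull that per-latch-class breakdown from sys.dm_os_latch_stats yourself rather than eyeballing Activity Monitor, a quick sketch (assuming pyodbc and a trusted connection; the connection string is a placeholder) would be:

import pyodbc  # assumes the Microsoft ODBC driver and pyodbc are installed

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=YOURSERVER;"
    "DATABASE=master;Trusted_Connection=yes"   # placeholder connection details
)

# sys.dm_os_latch_stats is cumulative since the last service restart
# (or since DBCC SQLPERF('sys.dm_os_latch_stats', CLEAR) was run).
rows = conn.execute("""
    SELECT TOP 10 latch_class,
           waiting_requests_count,
           wait_time_ms,
           max_wait_time_ms
    FROM sys.dm_os_latch_stats
    ORDER BY wait_time_ms DESC
""").fetchall()

for r in rows:
    avg_ms = float(r.wait_time_ms) / r.waiting_requests_count if r.waiting_requests_count else 0.0
    print(r.latch_class, r.waiting_requests_count, r.wait_time_ms, r.max_wait_time_ms,
          "avg_ms=%.2f" % avg_ms)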

How to handle daily/weekly/monthly boards on the App Engine datastore?

I'm developing a high score web service for my game, and it's running on Google App Engine.
My game has 5 difficulties, so I originally had 5 boards with entries for each (player_login, score and time). If a player submitted a lower score than their previous one, it was dismissed, so only the highest score is kept for each player.
But to add more fun to this, I decided to include daily/weekly/monthly/yearly high score tables. So I created 5 boards for each difficulty, making it 25 boards. When a score is submitted, it's saved into each board, and the boards are supposed to be cleared every day/week/month/year.
This happens via a cron job that deletes all entries from the specific board.
Here comes the problem: it looks like deleting entries from the datastore is slow. From my test daily cleanups, it looks like deleting a single entry takes around 200 ms.
In the worst-case scenario, if the game becomes quite popular and has, say, 100 000 players, each with an entry in the yearly board, then even at 0.12 seconds per delete it would take 100 000 * 0.12 seconds = 12 000 seconds (over 3 hours!) to clear that board. I think we are allowed jobs of up to 30 seconds on App Engine, so this wouldn't work.
I'm deleting with the following code (thanks to Nick Johnson):
# keys-only query so only keys are fetched, then deleted in batches of 500
q = Score.all(keys_only=True).filter('b =', boardToClear)
results = q.fetch(500)
while results:
    self.response.out.write("deleting one batch;")
    db.delete(results)
    # resume where the previous batch left off using the query cursor
    q = Score.all(keys_only=True).filter('b =', boardToClear).with_cursor(q.cursor())
    results = q.fetch(500)
What do you recommend I do about this problem?
One approach that comes to mind is to use a task queue and delete scores that are older than permitted for each board, i.e. ones that have expired, but in smaller quantities. This way I wouldn't hit the CPU limit for one task, but the cleanup would not be (nearly) instantaneous, so my 12 000-second cleanup would be split into 1 200 tasks, each roughly 10 seconds long.
But I think there is something I'm doing wrong; this kind of operation would be a lot faster in a relational database. Possibly something is wrong with my approach to the datastore and scoring because I'm stuck in an RDBMS mindset.
First, a couple of small suggestions:
Does deletion take 200 ms per item even when you delete items in a batch? The fastest way to delete should be to do a keys_only query and then call db.delete() on the entire list of keys at once.
The 30-second limit was recently relaxed to 10 minutes for background work (like the cron jobs or queue tasks that you're contemplating) as of 1.4.0.
These may not fundamentally address your problem, though. I think there's no way to get around the fact that deleting a large number of records (hundreds of thousands, say) will take some time. I'm not sure this is as big a problem for your use case, though, as I can see a couple of techniques that would help.
As you suggest, use a task queue to split a long-running task into several smaller tasks. Your use case (deleting a huge number of items that match a particular query) is ideal for a map-reduce job. Nick Johnson's blog post on the Mapper API may be very helpful here (so that you don't have to write all of that task management code on your own).
Do you need to delete all the out-of-date board entries immediately? If you had a field that listed which week, month, or year that a particular entry counted for, you could index on that field and then only display entries from the current month on the visible leaderboard. (Disk space is cheap, after all.) And then if you wanted to slowly (over hours, say, instead of milliseconds) remove the out-of-date data, you could do that in the background without ever having incorrect data on your leaderboards.
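To sketch that idea (the model fields and the period encoding here are hypothetical, not your actual schema): store the period each score counts toward and filter on it when rendering the leaderboard, instead of deleting anything on a schedule:

from google.appengine.ext import db

class Score(db.Model):
    player = db.StringProperty()
    points = db.IntegerProperty()
    b = db.IntegerProperty()        # board/difficulty, matching your filter('b =', ...)
    period = db.StringProperty()    # hypothetical, e.g. 'd20110131', 'w201104', 'm201101', 'y2011'

def top_scores(board, period, limit=20):
    # Only entries tagged with the current period appear on the board; stale rows
    # are simply filtered out, so they can be cleaned up lazily (or never).
    # Requires a composite index on (b, period, -points) in index.yaml.
    return (Score.all()
                 .filter('b =', board)
                 .filter('period =', period)
                 .order('-points')
                 .fetch(limit))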
Delete entities in batches. Although a single delete takes a noticeable amount of time (though 200ms seems very high), batch deletes take no longer, as they delete all the entities in parallel. Task Queue and cron jobs can now run for up to 10 minutes, so timeouts should not be an issue.
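And if you still want to purge expired entries eventually, a sketch of chaining batch deletes through the task queue (the handler URL, batch size, and the reuse of the Score model are assumptions) could look like this:

from google.appengine.api import taskqueue
from google.appengine.ext import db, webapp

BATCH = 500

class CleanupBoard(webapp.RequestHandler):
    # Hypothetical handler mapped to /tasks/cleanup; 'Score' is the same model
    # used in the deletion snippet from the question.
    def post(self):
        board = self.request.get('board')    # cast to whatever type your 'b' property uses
        cursor = self.request.get('cursor')
        q = Score.all(keys_only=True).filter('b =', board)
        if cursor:
            q.with_cursor(cursor)
        keys = q.fetch(BATCH)
        if keys:
            db.delete(keys)                  # one parallel batch delete
            # Re-enqueue until the board is empty; each task stays well under
            # the 10-minute task queue deadline.
            taskqueue.add(url='/tasks/cleanup',
                          params={'board': board, 'cursor': q.cursor()})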
