Multiple submit keys for a single topic in the Hedera Consensus Service - hedera-hashgraph

When creating a new Topic with the Hedera Consensus Service, is there a way to have more than one submit key for a single Topic?

I run through a number of key-signing scenarios in my H18 talk. You can mix and match these strategies to model varying business scenarios, including this one.
https://youtu.be/OI7oT9Knnlc?t=1038

Yes. All transactions support key lists, thresholds, and nested key structures.
Docs for Keys: https://docs.hedera.com/guides/core-concepts/keys-and-signatures
So on your topicCreate you'd use a threshold key as the submit key.
Create topic doc: https://docs.hedera.com/guides/docs/sdks/consensus/create-a-topic
Note that if you specify a list, all private keys corresponding to the public keys have to sign.
If you specify a threshold key, only the required number has to sign.
You could, for example, set up a 1-of-5 threshold, meaning any one of the 5 can sign, or a 2-of-5, meaning at least 2 of the 5 have to sign.
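As a concrete illustration, here is a rough sketch with the Hedera Java SDK v2 (untested; the operator account is a placeholder you'd replace with a funded account and its key):

import com.hedera.hashgraph.sdk.*;

public class ThresholdTopicExample {
    public static void main(String[] args) throws Exception {
        // Placeholder operator - substitute a funded account and its key.
        Client client = Client.forTestnet()
                .setOperator(AccountId.fromString("0.0.1234"), PrivateKey.generateED25519());

        // Three independent parties; any 2 of the 3 must sign submissions.
        PrivateKey k1 = PrivateKey.generateED25519();
        PrivateKey k2 = PrivateKey.generateED25519();
        PrivateKey k3 = PrivateKey.generateED25519();

        KeyList submitKey = KeyList.withThreshold(2);
        submitKey.add(k1.getPublicKey());
        submitKey.add(k2.getPublicKey());
        submitKey.add(k3.getPublicKey());

        TopicId topicId = new TopicCreateTransaction()
                .setSubmitKey(submitKey)
                .execute(client)
                .getReceipt(client)
                .topicId;

        // A message submission now needs signatures from at least 2 of the 3 keys.
        new TopicMessageSubmitTransaction()
                .setTopicId(topicId)
                .setMessage("hello from a 2-of-3 threshold")
                .freezeWith(client)
                .sign(k1)
                .sign(k2)
                .execute(client)
                .getReceipt(client);
    }
}

And since key structures nest, any entry in the KeyList could itself be another list or threshold key, which is how the more elaborate scenarios from the talk are modelled.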

Related

Efficiently modelling a Feed schema on Google Cloud Datastore?

I'm using GCP/App Engine to build a Feed that returns posts for a given user in descending order of the post's score (a modified timestamp). Posts that are not 'seen' are returned first, followed by posts where 'seen' = true.
When a user creates a post, a Feed entity is created for each one of their followers (i.e. a fan-out inbox model)
Will my current index model result in an exploding index and/or contention on the 'score' index if many users load their feed simultaneously?
index.yaml
indexes:
- kind: "Feed"
  properties:
  - name: "seen"   # Boolean
  - name: "uid"    # The user this feed belongs to
  - name: "score"  # Int timestamp
    direction: desc
# Other entity fields include: authorUid, postId, postType
A user's feed is fetched by:
SELECT postId FROM Feed WHERE uid = abc123 AND seen = false ORDER BY score DESC
Would I be better off prefixing the 'score' with the user id? Would this improve the performance of the score index? e.g. score="{alphanumeric user id}-{unix timestamp}"
From the docs:
You can improve performance with "sharded queries", that prepend a
fixed length string to the expiration timestamp. The index is sorted
on the full string, so that entities at the same timestamp will be
located throughout the key range of the index. You run multiple
queries in parallel to fetch results from each shard.
With just 4 entities I'm seeing 44 indexes, which seems excessive.
You do not have an exploding indexes problem; that problem is specific to queries on entities with repeated properties (i.e. properties with multiple values) when those properties are used in composite indexes. From Index limits:
The situation becomes worse in the case of entities with multiple
properties, each of which can take on multiple values. To accommodate
such an entity, the index must include an entry for every possible
combination of property values. Custom indexes that refer to multiple properties, each with multiple values, can "explode"
combinatorially, requiring large numbers of entries for an entity with
only a relatively small number of possible property values. Such
exploding indexes can dramatically increase the storage size of an entity in Cloud Datastore, because of the large number of index
entries that must be stored. Exploding indexes also can easily cause
the entity to exceed the index entry count or size limit.
The 44 built-in indexes are nothing more than the index entries created for the multiple indexed properties of your 4 entities (probably your entity model has about 11 indexed properties), which is normal. You can reduce the number by scrubbing your model usage and marking as unindexed all properties which you do not plan to use in queries.
You do, however, have the problem of a potentially high number of index updates in a short time - when a user with many followers creates a post, all those index updates fall in a narrow range - hotspots, which the article you referenced applies to. Prepending the score with the follower user ID (not the post creator ID, which won't help, as the same number of updates on the same index range will happen for one user's posting event regardless of whether sharding is used) should help. The impact of followers reading the post (when the score property is updated) is smaller, since it's unlikely for all followers to read the post at exactly the same time.
Unfortunately, prepending the follower ID doesn't help with the query you intend to run, as the results will be sorted by follower ID first, not by timestamp.
What I'd do:
combine the functionality of the seen and score properties into one: a score value of 0 can be used to indicate that a post was not yet seen, and any other value would indicate the timestamp when it was seen. Fewer indexes, fewer index updates, less storage space (see the sketch after this list).
I wouldn't bother with sharding in this particular case:
reading a post takes a bit of time, and one follower reading multiple posts won't typically happen fast enough for the index updates for that particular follower to be a serious problem. In the rare worst case an already-read post may appear as unread - IMHO not bad enough to justify sharding
delays in updating the indexes for all followers are, again, IMHO not a big problem - it may just take a bit longer for the post to appear in a follower's feed
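A minimal sketch of that combined property with the low-level App Engine Java Datastore API (untested; kind and property names are the ones from the question):

import com.google.appengine.api.datastore.*;

public class FeedExample {
    static final DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

    // Fan-out: one Feed entity per follower; score == 0 means "not yet seen",
    // so the separate 'seen' property and its index entries disappear.
    static void fanOut(String followerUid, long postId) {
        Entity feed = new Entity("Feed");
        feed.setProperty("uid", followerUid);
        feed.setProperty("postId", postId);
        feed.setProperty("score", 0L); // 0 = unseen
        ds.put(feed);
    }

    // Unseen posts for a user: two equality filters, no index on 'seen' needed.
    static Query unseenFeed(String uid) {
        return new Query("Feed").setFilter(Query.CompositeFilterOperator.and(
                new Query.FilterPredicate("uid", Query.FilterOperator.EQUAL, uid),
                new Query.FilterPredicate("score", Query.FilterOperator.EQUAL, 0L)));
    }
}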

Google App Engine (datastore) - will a deleted key regenerate?

I've got a simple question about datastore keys. If I delete an entity, is there any possibility that the key will be created again? Or is each key unique and generated only once?
Thanks.
It is definitely possible to re-use keys.
Easy to test, for example using the datastore admin page:
create an entity for one of your entity models using a custom/specified key name and some property values
delete the entity
create another one using the same key name and different property values...
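The same behaviour can be shown in code - a minimal sketch with the low-level App Engine Java Datastore API (the Book kind and its property are made up):

import com.google.appengine.api.datastore.*;

public class KeyReuseExample {
    public static void main(String[] args) {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

        Entity first = new Entity("Book", "my-key-name"); // custom key name
        first.setProperty("title", "first version");
        ds.put(first);

        ds.delete(first.getKey());

        // The same key name is accepted again after the delete.
        Entity second = new Entity("Book", "my-key-name");
        second.setProperty("title", "second version");
        ds.put(second);
    }
}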
As for keys with auto-generated IDs, it is theoretically possible, but I'd guess rather unlikely due to the high number of possibilities. From Assigning identifiers:
Cloud Datastore can be configured to generate auto IDs using two
different auto id policies:
The default policy generates a random sequence of unused IDs that are approximately uniformly distributed. Each ID can be up to 16
decimal digits long.
The legacy policy creates a sequence of non-consecutive smaller integer IDs.

AppEngine, DataStore: Preallocating normally-distributed IDs (*not* monotonically incrementing)

There are three schemes to set IDs on datastore entities:
Provide your own string or int64 ID.
Don't provide them and let AE allocate int64 IDs for you.
Pre-allocate a block of int64 IDs.
The documentation has this to say about ID generation:
This (1):
Cloud Datastore can be configured to generate auto IDs using two
different auto id policies:
The default policy generates a random sequence of unused IDs that are approximately uniformly distributed. Each ID can be up to 16
decimal digits long.
The legacy policy creates a sequence of non-consecutive smaller integer IDs.
If you want to display the entity IDs to the user, and/or depend upon
their order, the best thing to do is use manual allocation.
and this (2):
Note: Instead of using key name strings or generating numeric IDs
automatically, advanced applications may sometimes wish to assign
their own numeric IDs manually to the entities they create. Be aware,
however, that there is nothing to prevent Cloud Datastore from
assigning one of your manual numeric IDs to another entity. The only
way to avoid such conflicts is to have your application obtain a block
of IDs with the datastore.AllocateIDs function. Cloud Datastore's
automatic ID generator will keep track of IDs that have been allocated
with this function and will avoid reusing them for another entity, so
you can safely use such IDs without conflict.
and this (3):
Cloud Datastore generates a random sequence of unused IDs that are
approximately uniformly distributed. Each ID can be up to 16 decimal
digits long.
System-allocated ID values are guaranteed unique to the entity group.
If you copy an entity from one entity group or namespace to another
and wish to preserve the ID part of the key, be sure to allocate the
ID first to prevent Cloud Datastore from selecting that ID for a
future assignment.
I have a particular entity-type that is stored with an ancestor. However, I'd like to have globally-unique IDs and AE's IDs (allocated via datastore.AllocateIDs with Go) will not be globally unique when stored under an ancestor (in an entity-group). So, pre-allocation would solve this (they're ancestor-agnostic). However, you are obviously given an interval in response... a continuous range of IDs that have been reserved.
Isn't there some way to preallocate those nice, opaque, uniformly-distributed IDs?
While we're on the subject, I had assumed that the opaque IDs from AE were the result of some pseudorandom number generator with a persisted state for each entity-type, but the word "track" in (2) seems to imply that there is a cost to optimistically generating and buffering IDs that might not be used. It'd be great if someone could clarify this.
The simple solution is to do the following:
When trying to allocate a new ID for an entity:
Repeat the following:
Generate a random K bit integer. Use it for the entity ID field. [Use a uniform random distribution].
Create a Cloud Datastore transaction.
Insert the new entity. [If the transaction aborts because the entity already exists try again with a new random number].
If you make K big enough (for example 128) and have a properly seeded random number generator, then an ID collision is statistically impossible for all practical purposes and you can remove the retry loop.
If you make K that big, stop using the integer ID field in the entity key and instead use the string one: Base64-URL-encode the random number as a string.
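A minimal sketch of that last variant (plain JDK plus the App Engine KeyFactory; the Item kind is a placeholder):

import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import java.security.SecureRandom;
import java.util.Base64;

public class RandomIdExample {
    private static final SecureRandom RNG = new SecureRandom();

    // K = 128 bits, Base64-URL encoded into a 22-character key name.
    static String randomKeyName() {
        byte[] bits = new byte[16];
        RNG.nextBytes(bits);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bits);
    }

    public static void main(String[] args) {
        Key key = KeyFactory.createKey("Item", randomKeyName());
        System.out.println(key);
    }
}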

Getting values out of DynamoDB

I've just started looking into Amazon's DynamoDB. Obviously the scalability appeals, but I'm trying to get my head out of SQL mode and into no-sql mode. Can this be done (with all the scalability advantages of dynamodb):
Have a load of entries (say 5 - 10 million) indexed by some number. One of the fields in each entry will be a creation date. Is there an effective way for dynamo db to give my web app all the entries created between two dates?
A simpler question - can DynamoDB give me all entries in which a field matches a certain number? That is, there'll be another field that is a number, for argument's sake let's say between 0 and 10. Can I ask DynamoDB to give me all the entries which have value e.g. 6?
Do both of these queries need a scan of the entire dataset (which I assume is a problem given the dataset size?)
many thanks
Is there an effective way for dynamo db to give my web app all the
entries created between two dates?
Yup, please have a look at the Primary Key concept within the Amazon DynamoDB Data Model, specifically the Hash and Range Type Primary Key:
In this case, the primary key is made of two attributes. The first
attribute is the hash attribute and the second one is the range
attribute. Amazon DynamoDB builds an unordered hash index on the hash
primary key attribute and a sorted range index on the range primary
key attribute. [...]
The listed samples feature your use case exactly: the Reply ( Id, ReplyDateTime, ... ) table uses a primary key of type Hash and Range, with hash attribute Id and range attribute ReplyDateTime.
You'll use this via the Query API, see RangeKeyCondition for details and Querying Tables in Amazon DynamoDB for respective examples.
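A sketch of such a query with the AWS SDK for Java v2, assuming the Reply ( Id, ReplyDateTime, ... ) table from the docs sample (untested; the key values are placeholders):

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.QueryRequest;
import java.util.Map;

public class QueryByDateRange {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        // The hash key Id pins the item collection; the range key
        // ReplyDateTime is filtered with BETWEEN, so only that range is read.
        QueryRequest request = QueryRequest.builder()
                .tableName("Reply")
                .keyConditionExpression("Id = :id AND ReplyDateTime BETWEEN :from AND :to")
                .expressionAttributeValues(Map.of(
                        ":id", AttributeValue.builder().s("Thread 1").build(),
                        ":from", AttributeValue.builder().s("2012-01-01T00:00:00Z").build(),
                        ":to", AttributeValue.builder().s("2012-12-31T23:59:59Z").build()))
                .build();

        ddb.query(request).items().forEach(System.out::println);
    }
}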
can dynamo db give me all entries in which a field matches a certain
number. [...] Can I ask dynamodb to give
me all the entries which have value e.g. 6?
This is possible as well, albeit by means of the Scan API only (i.e. it does require reading every item in the table); see ScanFilter for details and Scanning Tables in Amazon DynamoDB for respective examples.
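A corresponding Scan sketch under the same assumptions (the numeric rating attribute is made up); note that every item is still read and billed - the filter only trims what is returned:

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ScanRequest;
import java.util.Map;

public class ScanByValue {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        // Full-table scan; only items with rating == 6 come back.
        ScanRequest request = ScanRequest.builder()
                .tableName("Reply")
                .filterExpression("rating = :v")
                .expressionAttributeValues(
                        Map.of(":v", AttributeValue.builder().n("6").build()))
                .build();

        ddb.scan(request).items().forEach(System.out::println);
    }
}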
Do both of these queries need a scan of the entire dataset (which I
assume is a problem given the dataset size?)
As mentioned, the first approach works with a Query while the second requires a Scan, and generally, a query operation is more efficient than a scan operation - good advice to get started, though the details are more complex and depend on your use case; see section Scan and Query Performance within the Query and Scan in Amazon DynamoDB overview:
For quicker response times, design your tables in a way that can use
the Query, Get, or BatchGetItem APIs, instead. Or, design your
application to use scan operations in a way that minimizes the impact
on your table's request rate. For more information, see Provisioned Throughput Guidelines in Amazon DynamoDB.
So, as usual when applying NoSQL solutions, you might need to adjust your architecture to accommodate these constraints.

Establishing entity groups while maintaining access to Long ids

I'm using the appengine datastore, and all of my entities have Long ids as their PrimaryKey. I use those ids to communicate with the client, since the full-fledged Keys take much more bandwidth to transmit.
Now, I want to form entity groups so that I can do complex operations within transactions, and it seems from http://code.google.com/appengine/docs/java/datastore/transactions.html#Entity_Groups that I need to use Keys or String encoded keys - the simple Longs are not an option.
I don't mind refactoring a little to use Keys, but I still want to avoid sending the behemoth things over the wire. How can I get a unique (per kind) Long identifier for an entity whose primary key is a Key?
You do not have to use names (strings). All of the KeyBuilder methods that take names also have counterparts that take ids (longs).
For transmission, you simply need the name or id part of a Key. Once you know the id or name, you can reconstruct the Key server side. If it is a child entity, you'll need to know both the parent and the child's names or ids.
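A minimal sketch of the reconstruction (App Engine Java API; the kind names and ids are placeholders):

import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

public class KeyRebuildExample {
    // The client only ever sees the two longs; the full Key is rebuilt here.
    static Key rebuild(long parentId, long childId) {
        return new KeyFactory.Builder("Parent", parentId)
                .addChild("Child", childId)
                .getKey();
    }
}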
