How to think about token economics for my application? - cryptocurrency

Tokens are a powerful instrument for creating an economy and incentivizing the users of an application.
What are the main principles when launching a new token for a product or application? How do you ensure that the token will have value and sustain it in the long term, and what are the different mechanics that can be used?

This is an area of active research and experimentation. The benefit that blockchain gives us is the ability to experiment with such applications relatively easily. Tokens are a fairly universal instrument and can represent anything from a share of a company, to an internal unit of account for a marketplace, to ownership of a physical object, to loyalty points.
Let's classify tokens into two classes:
Tokens that are used to incentivize and bootstrap a multi-role interaction, like a marketplace, community-owned DAO or social network.
Tokens that are used for single role interactions, like loyalty points.
The second case is usually more straightforward, as it aligns with how existing businesses already use loyalty points to incentivize their customers. Interestingly, tokenizing loyalty points may allow a single-role use case to grow into a multi-role use case over time.
In the first case of multi-role interaction, there are a few principles to follow:
Mint or give tokens to participants for actions that increase the value of your network for other participants. For example, in the case of a marketplace you can incentivize listing new items, or in a social network, writing novel content.
Coordinate this operation either with an algorithm or with existing token holders. The algorithm should be evaluated to make sure there are no ways to game it (i.e. gaming it should cost more than the amount that will be allocated).
For example, the reward for listing an item in a marketplace gets allocated only after the item is bought, and the amount is smaller than the marketplace fee (if it were larger, people would list items and buy them themselves to collect the reward).
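A minimal sketch of this rule in Python (the numbers and names are illustrative, not from any specific protocol): the reward is paid only on a completed sale and is kept strictly below the fee, so self-buying always loses money.

    REWARD_SHARE_OF_FEE = 0.5  # assumption: pay out half of the collected fee as the reward

    def listing_reward(sale_price, fee_rate):
        """Reward for a listing, paid only once the item has actually sold."""
        fee = sale_price * fee_rate
        reward = fee * REWARD_SHARE_OF_FEE
        assert reward < fee  # gaming the incentive costs more than it pays
        return reward

    # Example: a 100-token sale with a 2% marketplace fee pays a 1-token reward,
    # so listing and buying your own item loses 1 token net.
    print(listing_reward(100, 0.02))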
Another approach is to use people, i.e. existing token holders. This is more complex, as it requires careful analysis, but a good example is a grants program: you want existing token holders to "vote" for interesting projects by giving tokens to them, and then allocate some of the additional incentive to those projects.
Tokens must capture some external or internal value generated by the product. If there is no value created by the system, this is just circular token generation. The value can be in the future, e.g. when the product grows in usage thanks to all the incentives and starts generating more substantial revenue.
For example, the product captures some form of revenue from providing its data or liquidity, serving ads to users, taking a marketplace fee, selling sticker packs to users, etc.
A token can capture this value in two different ways:
Users who hold the token (or, more specifically, lock it) are eligible for a revenue share.
Revenue is used to buy the token back and burn it. Tokens usually have a fixed supply, which means the overall token supply decreases and the token becomes scarcer.
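A minimal sketch of the buy-back-and-burn mechanic (the numbers and class here are illustrative, not any specific token contract): periodic revenue buys tokens on the market and removes them from a fixed supply.

    class Token:
        def __init__(self, total_supply):
            self.total_supply = total_supply

        def burn(self, amount):
            self.total_supply -= amount

    def buyback_and_burn(token, revenue, market_price):
        bought = revenue / market_price  # tokens bought back with this period's revenue
        token.burn(bought)               # supply shrinks, remaining tokens become scarcer
        return bought

    token = Token(total_supply=1_000_000)
    buyback_and_burn(token, revenue=50_000, market_price=2.0)  # burns 25,000 tokens
    print(token.total_supply)  # 975000.0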
Note that there are many aspects of this that require careful engineering, including regulatory ones.

Related

Infinite scrolling for personalized recommendations

I've recently been researching solutions that would allow me to display a personalized ranking of products in an online e-commerce store.
A natural solution for this problem would be to use a managed ML service such as AWS Personalize.
Based on my understanding, it can be implemented in terms of 2 service calls:
Recommendations - return up to ~500 products based on the user's profile
Ranking - based on product IDs (up to ~500), reorder the list according to the user's profile
I was wondering if there exists an implementation / indexing strategy that would allow displaying the entire product catalog (let's assume 10k products)?
I can imagine an implementation that would:
Return 50 products per page
Call Recommendations to grab the first 500 products
Alternatively, pick the top 500 products platform-wide and rerank them according to the user.
For the remaining results, i.e. pages 11 to N, a database query would be executed, excluding those 500 products by ID. The recommended ordering wouldn't be as relevant anymore, as the top recommendations have already been listed and the user is less likely to encounter relevant results on the 11th page. As a downside, such a query would need a relatively large array to be included as part of the query.
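Roughly, the paging logic I have in mind would look like this (a Python sketch; get_recommendations and query_catalog_excluding are just placeholders for the Personalize call and the database query):

    PAGE_SIZE = 50
    RECS_LIMIT = 500  # Personalize returns at most ~500 items

    def get_page(user_id, page, get_recommendations, query_catalog_excluding):
        recommended_ids = get_recommendations(user_id, limit=RECS_LIMIT)
        start = (page - 1) * PAGE_SIZE  # pages are 1-indexed
        if start < len(recommended_ids):
            # Pages 1-10 come straight from the personalized ranking.
            return recommended_ids[start:start + PAGE_SIZE]
        # Pages 11..N fall back to a plain catalog query that skips the
        # already-shown recommended items (the large exclusion list mentioned above).
        offset = start - len(recommended_ids)
        return query_catalog_excluding(exclude_ids=recommended_ids,
                                       offset=offset, limit=PAGE_SIZE)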
Is there a better solution for this? I've seen many ecommerce platforms offering a "Recommended" sort option for their product listing that allows infinite scrolling. Would that mean that such a store is typically using predefined ranks, i.e. an arbitrary rank assigned to each product by the content manager that is exactly the same for every user on the platform?
I don't think I've ever seen an ecommerce site that shows me 10K products without any differentiation. Most ecommerce sites use a process called "merchandising" to decide which product to show, to which customer, in which position/treatment, at which time, and in which context.
Personalized recommendations may be part of that process, but they are generally only a part. For instance, if you're browsing "Books about architecture", the fact the recommendation engine thinks you're really interested in CDs by Duran Duran is not super useful. And even within "books about architecture", there may be other attributes that are more likely to drive your buying behaviour.
Most ecommerce sites use a range of attributes to decide product ranking in a product listing page, for example "relevance", "price", "is the product on special offer?", "is the product in stock?", "do we make a big margin on this product?", "is this a best seller?", "is the supplier reliable?". Personalized recommendations are sprinkled into these factors, but the weightings are very much specific to the vendor.
Most recommendation engines will provide a "relevance score" or similar indicator. In my experience, this has a long-tail distribution: a handful of products will score highly, and the rest will have very low relevancy scores. The ecommerce businesses I have worked with have a cut-off point - a score of less than x means we'll ignore the recommendation in favour of other criteria.
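To illustrate, a toy blend of merchandising attributes with a personalization score that is only trusted above a cut-off (the weights and cut-off value here are made up; real ones are very much vendor-specific):

    RELEVANCE_CUTOFF = 0.3  # below this we ignore the recommendation entirely

    def merchandising_score(product, personal_relevance):
        # product fields are booleans or numbers supplied by the catalog
        score = (2.0 * product["margin"]
                 + 1.5 * product["on_special_offer"]
                 + 1.0 * product["in_stock"]
                 + 1.0 * product["is_best_seller"])
        if personal_relevance >= RELEVANCE_CUTOFF:
            score += 3.0 * personal_relevance  # sprinkle personalization in
        return score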
Again, in my experience, personalized recommendations are useful for squeezing out the last few percentage points of conversion, but nowhere near as valuable as robust search, intuitive categorization, and rich metadata.
I'd suggest that if your customer has scrolled through 500 recommended products, it's unlikely the next 9,500 need to be in "recommended" order - by that point your recommendation scores are probably not statistically significant.

Auto incrementing from two different points on the same table?

Each customer has an ID. For every new customer it's incremented by 1.
My company wants to start selling to a new type of customer (let's call it customer B) and have their IDs start at 30k.
Is this a poor design? Our system has an option to set the customer type, so there is no need for this. We're not expected to reach that number for our other type of customer until 2050 (it's currently at 10k).
I'm trying to convince them that this is a bad idea... It also doesn't help that the third party implementing this is OK with it.
Confusing technical and functional keys is the really bad idea here. Technical keys are designed to remain unchanged, to maintain referential integrity and to speed up joins. Technical keys should also not be exposed at the application level (i.e. in screen forms).
Functional keys are visible to application users and may be subject to change according to new requirements.
Another bad idea is to use parts of a number to separate data partitions (customer groups in your case). A character code with a composite structure, like a social security number, is a stronger and more scalable solution (although it is a sort of denormalization, too).
The recommended solution is to keep the technical ID as an auto-incrementing number but also add a functional customer code.
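A minimal sketch of that split (SQLite via Python just for illustration; the column names are made up): a surrogate technical key that is never shown to users, plus a separate functional customer code.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customer (
            id            INTEGER PRIMARY KEY AUTOINCREMENT,  -- technical key, never exposed
            customer_code TEXT NOT NULL UNIQUE,               -- functional key, e.g. 'B-000001'
            customer_type TEXT NOT NULL,
            name          TEXT NOT NULL
        )
    """)
    conn.execute(
        "INSERT INTO customer (customer_code, customer_type, name) VALUES (?, ?, ?)",
        ("B-000001", "B", "Acme Ltd"),
    )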
If the identifier is customer-facing or business-user-facing, then it makes sense to design it with readability and usability in mind. Most people find it very natural to work with identification schemes that have information encoded within them: think of phone numbers, vehicle licence plate numbers, or airline flight numbers. There is evidence that a well-structured identification scheme can improve usability and reduce the rate of errors in day-to-day use. You could expect that a well-designed "meaningful" identification scheme will result in fewer errors than an arbitrary incrementing number with no identifiable features.
There's nothing inherently wrong with doing this. Whether it makes sense in your business context depends a lot on how the number will be used. Technically speaking, it isn't difficult to build multiple independent sequences, e.g. using the SEQUENCE feature in SQL. You may want to consider including other features as well, such as check digits.
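For example, a minimal sketch of two independent sequences (this assumes PostgreSQL and psycopg2; the sequence names, start values and connection string are illustrative):

    import psycopg2

    conn = psycopg2.connect("dbname=crm")  # hypothetical connection string
    cur = conn.cursor()

    # One sequence per customer type; customer B IDs start at 30000.
    cur.execute("CREATE SEQUENCE IF NOT EXISTS customer_a_id_seq START 10001")
    cur.execute("CREATE SEQUENCE IF NOT EXISTS customer_b_id_seq START 30000")

    def next_customer_id(cur, customer_type):
        seq = "customer_b_id_seq" if customer_type == "B" else "customer_a_id_seq"
        cur.execute("SELECT nextval(%s::regclass)", (seq,))
        return cur.fetchone()[0]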

How can I get number of API transactions used by Watson NLU?

AlchemyLanguage used to return the number of API transactions that took place during any call, which was particularly useful when making a combined call.
I do not see the equivalent way to get those results per REST call.
Is there any way to track or calculate this? I am concerned about sub-requests: for example, when you ask for sentiment on entities, does that count as two transactions, or as one plus an additional call per recognized entity?
There's currently no way to track the transactions from the API itself. To track this (particularly for cost estimates), you'll have to go to the usage dashboard in Bluemix. To find it: sign in to Bluemix, click Manage, then select Billing and Usage, and finally select Usage. At the bottom of the page you'll see a list of all your credentialed services. Expanding any of those will show the usage plus total charges for the month.
As far as how the NLU service is billed, it's not necessarily per API request as you mentioned. The service is billed in "units", and from the pricing page (https://console.ng.bluemix.net/catalog/services/natural-language-understanding):
A NLU item is based on the number of data units enriched and the number of enrichment features applied. A data unit is 10,000 characters or less. For example: extracting Entities and Sentiment from 15,000 characters of text is (2 Data Units * 2 Enrichment Features) = 4 NLU Items.
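That formula is easy to reproduce for rough estimates (a quick Python sketch; the function name is just illustrative):

    import math

    def nlu_items(num_characters, num_enrichment_features):
        """Estimate billable NLU items per the quoted pricing formula."""
        data_units = max(1, math.ceil(num_characters / 10000))
        return data_units * num_enrichment_features

    # The example from the pricing page: Entities + Sentiment on 15,000 characters.
    print(nlu_items(15000, 2))  # -> 4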
So overall, the best way to understand your transaction usage would be to run a few test requests and then check the Bluemix usage dashboard.
I was able to do a simple test, making calls to a set of high-level features with sub-features included, and it appeared to register calls only for the high-level features.

Couchbase, two user registering with same username but different datacenters?

Let's say I have two users, Alice in North America and Bob in Europe. Both want to register a new account with the same username, at the same time, on different datacenters. The datacenters are configured to replicate between each other using eventual consistency.
How can I make sure only one of them succeeds at registering the username? Keep in mind that the connection between the datacenters might even be offline at the time (worst case, but a daily occurrence on Spotify's Cassandra setup).
EDIT:
I do realize that key uniqueness is the big problem here. The thing is that I need all usernames to be unique. Imagine using Twitter if you couldn't tag a specific person, but had to tag everyone with the same username.
With any eventual consistency system, and particularly in the presence of a network partition, you essentially have two choices:
Accept collisions, and pick a winner later.
Ensure you never have a collision.
In the case of Couchbase:
For (1), that means letting two users register with the same username in both NA and EU, and then later picking one as the "winner" (when the network link is restored) - not a very desirable outcome for something like a user account. A slight variation on this would be something like @Robert's suggestion of putting them in a staging area (which means the account cannot be made "active" until the partition is resolved), and then telling the "winning" user they have successfully registered and the "loser" that the name is taken and to try again.
For (2), this means making the users unique even though they pick the same username - for example adding a NA:: / EU:: prefix to their username document. When they log in, the application would need some logic to try looking up both document variations - likely trying the prefix for the local region first. (This is essentially the same idea as the "realms" or "servers" that many MMO games use.)
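A minimal sketch of option (2) in Python (this assumes the key-value insert/get calls of a recent Couchbase Python SDK; the key format and regions are just illustrative):

    from couchbase.exceptions import DocumentExistsException, DocumentNotFoundException

    LOCAL_REGION = "EU"           # the region this application instance runs in
    LOOKUP_ORDER = ["EU", "NA"]   # try the local prefix first on login

    def register(collection, username, profile):
        key = f"{LOCAL_REGION}::user::{username}"
        try:
            collection.insert(key, profile)  # fails if the key already exists locally
            return True
        except DocumentExistsException:
            return False

    def find_user(collection, username):
        for region in LOOKUP_ORDER:
            try:
                return collection.get(f"{region}::user::{username}")
            except DocumentNotFoundException:
                continue
        return None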
There are variations on both of these, but ultimately, given an AP-type system (which Couchbase across XDCR is), you've essentially chosen Availability & Partition-Tolerance over Consistency, and hence need to reconcile that at the application layer.
Put the user name registrations into a staging table until you can perform a replication to determine if the name already exists in one of the other data centers.
You tagged Couchbase, so I will answer about that.
As long as the key for each object is different, you should be fine with Couchbase. It is the keys that would be unique and work great with XDCR. Another solution would be to have a concatenated key made up of the username and other values (company name, etc) if that suits your use case, again giving you a unique key for the object. Yet another would be to have a key/value in a JSON document that is the username.
It's not clear to me whether you're using Cassandra or Couchbase.
As far as Cassandra is concerned, since version 2.0 you can use Lightweight Transactions, which were created for exactly this purpose. Serial consistency was introduced just to achieve what you need. At the above link you can read the following:
For example, suppose that I have an application that allows users to register new accounts. Without linearizable consistency, I have no way to make sure I allow exactly one user to claim a given account — I have a race condition analogous to two threads attempting to insert into a [non-concurrent] Map: even if I check for existence before performing the insert in one thread, I can't guarantee that no other thread inserts it after the check but before I do.
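In practice the registration becomes a conditional insert (a minimal sketch using the DataStax Python driver; the keyspace, table and contact point are illustrative):

    import uuid
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("accounts")

    result = session.execute(
        "INSERT INTO users (username, user_id) VALUES (%s, %s) IF NOT EXISTS",
        ("alice", uuid.uuid4()),
    )
    if result.was_applied:
        print("username claimed")
    else:
        print("username already taken")  # the Paxos round found an existing row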
As far as a missing connection between two or more clusters is concerned, it's your choice how to handle it. If you can't guarantee uniqueness at insert time, you can either refuse the registration or deal with it by accepting and apologizing later.
HTH, Carlo

How do I structure multiple Identity Data in a database

I am designing a database for a credit bureau and am seeking some guidance.
The data they receive from banks, MFIs, Saccos, utility companies, etc. comes with various types of IDs. For example, it is perfectly legal to open a bank account with a National ID and also with a Passport. Scenario one that has my head banging is that Customer1 will take a credit facility (call it a loan for now) in Bank1 with their passport, then go to Bank2 and take another loan with their NationalID, and to Bank3 with their MilitaryID. Eventually, when this data comes from the banks to the bureau, it would be seen as 3 different people, while we know that it's actually 1 person. At this point, there is nothing we can do as a bureau.
However, one way out (for now) is using the government registry, which provides a repository that holds both passports and national IDs. So once we query for this information and get a response, how do I show in the DB that Passport_X is related to NationalID_Y and MilitaryNumber_Z?
Again, a person's name could be captured in various orders. Bank1 could capture FName, LName, OName while Bank3 captures LName, FName only. How do I store these names?
Even for one ID type, e.g. NationalID, you will often find misspelt or missing names. So one NationalID in our database could end up with about 6 different names, because the person's name was captured differently by the various banks where he has transacted.
And that is just the tip of the iceberg. We have issues with addresses, telephone numbers, etc etc.
Do you have any insight as to how I'd structure my database to ensure we capture all data from all banks and provide the most accurate information possible about an individual? Better yet, do you have experience with this type of setup?
Thanks.
how do I show in the DB that Passport_X is related to NationalID_Y and MilitaryNumber_Z?
Trivial.
You have an identity table that has an AlternateId field for when the identity is linked to another one. Use the first identity you created as the master; any alternate will have AlternateId pointing to it.
You need to separate the identity from the data in it, so you can have alternate versions of it, possibly with an origin and a timestamp. You likely need to fully support versioning and tying different identities to each other as alternates, including generating a "master identity", possibly by algorithm, holding the "official" (i.e. consolidated) version of your data.
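A rough sketch of that shape (SQLite DDL via Python; the table and column names are purely illustrative): one row per reported identity linked to a master, the ID documents kept separately, and the raw attribute values versioned by source and timestamp.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE identity (
            id        INTEGER PRIMARY KEY,
            master_id INTEGER REFERENCES identity(id)   -- NULL for the master itself
        );

        CREATE TABLE identity_document (
            identity_id INTEGER NOT NULL REFERENCES identity(id),
            doc_type    TEXT NOT NULL,    -- 'NATIONAL_ID', 'PASSPORT', 'MILITARY_ID'
            doc_number  TEXT NOT NULL,
            UNIQUE (doc_type, doc_number)
        );

        CREATE TABLE identity_attribute (
            identity_id INTEGER NOT NULL REFERENCES identity(id),
            attribute   TEXT NOT NULL,    -- 'first_name', 'last_name', 'address', ...
            value       TEXT NOT NULL,
            source      TEXT NOT NULL,    -- which bank or registry reported it
            reported_at TEXT NOT NULL     -- when it was reported
        );
    """)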
The details are complex - mostly you have to make a LOT of compromises without killing performance, so in the end, HIRE A SPECIALIST. There is a reason there are people working as senior database designers or architects with 20+ years of experience finding the optimal solution given constraints you may not even be aware of (application-wise).
Better yet, do you have experience with this type of setup?
Yes. Try financial information. Stock symbols / feeds / definitions are not necessarily compatible and vary by whom you get them from. Any non-trivial setup has different data feeds that may show the same item slightly differently, sometimes in error. Different names, sometimes different prices (example: ES, CME Group, is 50 USD per point, but on TT Fix it is 5 - to make up for it, the price is multiplied by 10, so instead of 1000.25 you get 10002.5). This is the same kind of consolidation, and it STINKS.
Tons of code, tons of proper database design, redoing it half a dozen times to get proper performance. This is tricky, sadly.
