Insert or update data in aerospike

I want to insert a record into Aerospike; if the record already exists, I only want to update it.
Currently I am using this call (to insert):
client.put(wPolicy, key, bin1, bin2)
Can someone please tell me how to insert or update depending on whether the record already exists?

Use the default write policy, which does the following:
(1) If the specified bins do not yet exist, they will be inserted; and
(2) If the specified bins exist and have values, those values will be replaced.
To use the default write policy, if you're using the Java client, just pass in null to the writePolicy parameter. I suspect other clients will be similar.
If there are more sub-parts to your question, you can add details to your question and I'll revisit later.

As Aaron mentioned, the default write policy for existence is AS_POLICY_EXISTS_IGNORE, which means "Write the record, regardless of existence. (i.e. create or update.)". Therefore, you don't have to set the existence policy explicitly, as it already does what you expect.
You can choose a more SQL-like behavior with AS_POLICY_EXISTS_CREATE (the write fails if the record already exists), AS_POLICY_EXISTS_UPDATE (the write fails if the record doesn't already exist), AS_POLICY_EXISTS_REPLACE (the write fails if the record doesn't exist, and what you write always replaces the previous version completely), or AS_POLICY_EXISTS_CREATE_OR_REPLACE (which either creates a new record if none exists, or completely overwrites the record if it does).
In the Python client you would set one of these alternative existence write policies on the aerospike.Client.put():
from __future__ import print_function
import sys

import aerospike
from aerospike.exception import RecordError

config = {
    'hosts': [ ('127.0.0.1', 3000) ],
    'timeout': 1500
}
client = aerospike.client(config).connect()

try:
    key = ('test', 'users', 1)
    bins = {
        'username': 'ninjastar',
        'age': 47,
        'hp': 1234
    }
    # POLICY_EXISTS_CREATE makes the write fail if the record already exists
    client.put(key, bins,
               policy={'exists': aerospike.POLICY_EXISTS_CREATE},
               meta={'ttl': 3600})
except RecordError as e:
    print("The user record already exists: {0} [{1}]".format(e.msg, e.code))
    sys.exit(1)
finally:
    client.close()
The possible values for exists are aerospike.POLICY_EXISTS_*.
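To make the difference between the existence policies concrete, here is a toy in-memory model of the behavior described above. It is only a sketch: the `put` function, the constants, and the exception classes are made up for illustration and are not part of the Aerospike client, and a plain dict stands in for the database.

```python
# Toy in-memory model of the existence policies; NOT the Aerospike client API.
IGNORE, CREATE, UPDATE, REPLACE, CREATE_OR_REPLACE = range(5)


class RecordExistsError(Exception):
    pass


class RecordNotFoundError(Exception):
    pass


def put(store, key, bins, exists=IGNORE):
    """Write `bins` under `key` in the dict `store` according to `exists`."""
    present = key in store
    if exists == CREATE and present:
        raise RecordExistsError(key)
    if exists in (UPDATE, REPLACE) and not present:
        raise RecordNotFoundError(key)
    if exists in (REPLACE, CREATE_OR_REPLACE) or not present:
        # Replace semantics: bins that are not written again are dropped.
        store[key] = dict(bins)
    else:
        # Update semantics: merge the new bins into the existing record.
        store[key].update(bins)


store = {}
key = ("test", "users", 1)
put(store, key, {"username": "ninjastar", "age": 47})    # default: creates the record
put(store, key, {"age": 48})                             # default: updates one bin, keeps the rest
print(store[key])
put(store, key, {"hp": 1234}, exists=CREATE_OR_REPLACE)  # overwrites the whole record
print(store[key])
```

With the default (ignore) policy the second write only touches the `age` bin, which is the insert-or-update behavior the question asks for.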

Related

How to limit amount of associations in Elixir Ecto

I have this app where there is a Games table and a Players table, and they share an n:n association.
This association is mapped in Phoenix through a GamesPlayers schema.
What I'm wondering how to do is actually quite simple: I'd like there to be an adjustable limit of how many players are allowed per game.
If you need more details, carry on reading, but if you already know an answer feel free to skip the rest!
What I've Tried
I've taken a look at adding check constraints, but without much success. Here's what the check constraint would have to look something like:
create constraint("games_players", :limit_players, check: "count(players) <= player_limit")
Problem here is, the check syntax is very much invalid and I don't think there actually is a valid way to achieve this using this call.
I've also looked into adding a trigger to the Postgres database directly in order to enforce this (something very similar to what this answer proposes), but I am very wary of fiddling with the DB directly, since I should only be using Ecto's interface.
Table Schemas
For the purposes of this question, let's assume this is what the tables look like:
Games
Property     | Type
id           | integer
player_limit | integer

Players
Property     | Type
id           | integer

GamesPlayers
Property     | Type
game_id      | references(Games)
player_id    | references(Players)

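As I mentioned in my comment, I think the cleanest way to enforce this is via business logic inside the code, not via a database constraint. I would approach this using a database transaction, which Ecto supports via Ecto.Repo.transaction/2. Running the count and the insert in one transaction helps guard against race conditions, although under PostgreSQL's default READ COMMITTED isolation level two concurrent transactions can still both pass the count check, so you may also need to lock the game row or use a stricter isolation level.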
In this case I would do something like the following:
1. begin the transaction
2. perform a SELECT query counting the number of players in the given game; if the game is already full, abort the transaction, otherwise continue
3. perform an INSERT query to add the player to the game
4. complete the transaction
In code, this would boil down to something like this (untested):
# The module name is illustrative; put this wherever your game logic lives.
defmodule MyApp.Games do
  import Ecto.Query

  alias MyApp.Repo
  alias MyApp.GamesPlayers

  @max_allowed_players 10

  def add_player_to_game(player_id, game_id, opts \\ []) do
    max_allowed_players = Keyword.get(opts, :max_allowed_players, @max_allowed_players)

    # Run the count and the insert inside a single transaction
    Repo.transaction(fn ->
      case is_game_full?(game_id, max_allowed_players) do
        false ->
          %GamesPlayers{
            game_id: game_id,
            player_id: player_id
          }
          |> Repo.insert!()

        # Raising an error causes the transaction to fail
        true ->
          raise "Game #{inspect(game_id)} full; cannot add player #{inspect(player_id)}"
      end
    end)
  end

  defp is_game_full?(game_id, max_allowed_players) do
    current_players =
      from(r in GamesPlayers,
        # External values must be pinned with ^ inside Ecto queries
        where: r.game_id == ^game_id,
        select: count(r.player_id)
      )
      |> Repo.one()

    current_players >= max_allowed_players
  end
end
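The same count-then-insert flow can be tried out without a Phoenix app. The following is a minimal sketch in Python using the standard-library sqlite3 module rather than Ecto and Postgres; the table and column names follow the schemas in the question, while `GameFullError` and the sample data are made up for illustration. Here the limit is read from the game's `player_limit` column instead of being passed in.

```python
import sqlite3


class GameFullError(Exception):
    pass


def add_player_to_game(conn, game_id, player_id):
    """Insert a games_players row unless the game has reached its player_limit."""
    # BEGIN IMMEDIATE takes SQLite's write lock up front, so no other writer
    # can slip in between the count and the insert.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (limit,) = conn.execute(
            "SELECT player_limit FROM games WHERE id = ?", (game_id,)
        ).fetchone()
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM games_players WHERE game_id = ?", (game_id,)
        ).fetchone()
        if count >= limit:
            raise GameFullError(f"game {game_id} is full")
        conn.execute(
            "INSERT INTO games_players (game_id, player_id) VALUES (?, ?)",
            (game_id, player_id),
        )
        conn.execute("COMMIT")
    except Exception:
        # Any failure, including a full game, aborts the transaction.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None means autocommit, so the explicit BEGIN/COMMIT above
# are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY, player_limit INTEGER)")
conn.execute("CREATE TABLE games_players (game_id INTEGER, player_id INTEGER)")
conn.execute("INSERT INTO games VALUES (1, 2)")

add_player_to_game(conn, 1, 10)
add_player_to_game(conn, 1, 11)
try:
    add_player_to_game(conn, 1, 12)  # third player exceeds the limit of 2
except GameFullError as exc:
    print(exc)
```

SQLite serializes writers, which makes the locking trivial here; on Postgres the equivalent would be to lock the game row before counting.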

Trigger to restrict duplicate record for a particular type

I have a custom object consent and preferences which is child to account.
Requirement is to restrict duplicate record based on channel field.
For example, if I have created a consent with channel Email, it should throw an error when I try to create a second record with Email as its channel.
Below is the code I have written, but it lets me create only one record; for the second record it throws the error irrespective of the channel:
Trigger code:
set<string> newChannelSet = new set<string>();
set<string> dbChannelSet = new set<string>();
for(PE_ConsentPreferences__c newCon : trigger.new){
    newChannelSet.add(newCon.PE_Channel__c);
}
for(PE_ConsentPreferences__c dbcon : [select id, PE_Channel__c from PE_ConsentPreferences__c where PE_Channel__c IN: newChannelSet]){
    dbChannelSet.add(dbcon.PE_Channel__c);
}
for(PE_ConsentPreferences__c newConsent : trigger.new){
    if(dbChannelSet.contains(newConsent.PE_Channel__c))
        newConsent.addError('You are inserting Duplicate record');
}

Your trigger blocks you because you didn't filter by Account in the query. So it'll let you add 1 record of each channel type and that's all.
I recommend not doing it with code. It is going to get crazier than you think really fast.
You need to stop inserts. To do that you need to compare against values already in the database (fine) but also you should protect against mass loading with Data Loader for example. So you need to compare against other records in trigger.new. You can kind of simplify it if you move logic from before insert to after insert, you can then query everything from DB... But it's weak, it's a validation that should prevent save, it logically belongs in before. It'll waste account id, maybe some autonumbers... Not elegant.
On update you should handle update of Channel but also of Account Id (reparenting to another record!). Otherwise I'll create consent with acc1 and move it to acc2.
What about undelete scenario? I create 1 consent, delete it, create identical one and restore 1st one from Recycle Bin. If you didn't cover after undelete - boom, headshot.
Instead go with pure config route (or simple trigger), let the database handle that for you.
Make a helper text field, mark it unique.
Write a workflow / process builder / simple trigger (before insert, before update) that writes to this field combination of Account__c + ' ' + PE_Channel__c. Condition could be ISNEW() || ISCHANGED(Account__c) || ISCHANGED(PE_Channel__c)
Optionally prepare data fix to update existing records.
Job done, you can't break it now. And if you ever need to allow more combinations (3rd field) it's easy for admin to extend it. As long as you keep under 255 chars total.
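The idea behind the unique helper field is independent of Salesforce: write the account and channel into one column and let the database reject duplicates. Below is a minimal sketch of that idea in Python with an in-memory SQLite table; the table, column, and function names are made up for illustration and do not correspond to any Salesforce API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consents ("
    " id INTEGER PRIMARY KEY,"
    " account_id TEXT NOT NULL,"
    " channel TEXT NOT NULL,"
    " unique_key TEXT NOT NULL UNIQUE)"  # plays the role of the unique helper field
)


def add_consent(conn, account_id, channel):
    """Insert a consent; return False if the account already has that channel."""
    # Same combination the workflow / trigger would write into the helper field.
    unique_key = f"{account_id} {channel}"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO consents (account_id, channel, unique_key) "
                "VALUES (?, ?, ?)",
                (account_id, channel, unique_key),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate account + channel pair.
        return False


print(add_consent(conn, "acc1", "Email"))
print(add_consent(conn, "acc1", "SMS"))
print(add_consent(conn, "acc1", "Email"))
print(add_consent(conn, "acc2", "Email"))
```

Because the check lives in the constraint, it also covers bulk loads, reparenting, and undelete, which are the cases the hand-written trigger would have to handle one by one.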
Or (even better) there are duplicate matching rules ;) give them a go before you do anything custom? Maybe check https://trailhead.salesforce.com/en/content/learn/modules/sales_admin_duplicate_management out.

Common strategy in handling concurrent global 'inventory' updates

To give a simplified example:
I have a database with one table: names, which has 1 million records each containing a common boy or girl's name, and more added every day.
I have an application server that takes as input an HTTP request from parents using my website 'Name Chooser'. With each request, I need to pick a name from the db and return it, and then NOT give that name to another parent. The server is concurrent, so it can handle a high volume of requests, yet it has to respect "unique name per request" and still be highly available.
What are the major components and strategies for an architecture of this use case?

From what I understand, you have two operations: Adding a name and Choosing a name.
I have a couple of questions:
Question 1: Do parents only choose names, or do they also add names?
Question 2: If they add names, does that mean that when a name is added it should also be marked as already chosen?
Assuming that you don't want to make all name selection requests wait for one another (by locking or queueing them):
One solution to resolve concurrency when only choosing a name is to use an Optimistic Offline Lock.
The most common implementation is to add a version field to your table and increment this version when you mark a name as chosen. You will need DB support for this, but most databases offer a mechanism for it. In the MongoDB ecosystem, some libraries (Mongoose, for example) add a version field to documents by default. For an RDBMS (a SQL database) you have to add this field yourself.
You haven't specified what technology you are using, so I will give an example using pseudocode for a SQL DB. For MongoDB you can check how your library makes these checks for you.
NameRecord {
    id,
    name,
    parentID,
    version,
    isChosen,

    function chooseForParent(parentID) {
        if(this.isChosen){
            throw Error/Exception;
        }
        this.parentID = parentID
        this.isChosen = true;
        this.version++;
    }
}

NameRecordRepository {
    function getByName(name) { ... }

    function save(record) {
        var oldVersion = record.version - 1;
        var query = "UPDATE records SET .....
                     WHERE id = {record.id} AND version = {oldVersion}";
        var rowsCount = db.execute(query);
        if(rowsCount == 0) {
            throw ConcurrencyViolation
        }
    }
}

// somewhere else in an object or module or whatever...
function chooseName(parentID, name) {
    var record = NameRecordRepository.getByName(name);
    record.chooseForParent(parentID);
    NameRecordRepository.save(record);
}
Before this object is saved to the DB, a version comparison must be performed. SQL provides a way to execute a query based on some condition and return the count of affected rows. In our case we check whether the version in the database is the same as the old one before updating. If it's not, that means that someone else has updated the record.
In this simple case you can even remove the version field and use the isChosen flag in your SQL query like this:
var query = "UPDATE records SET .....
WHERE id = {record.id} AND isChosen = false";
When adding a new name to the database you will need a unique constraint, which will solve the concurrency issues.
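For readers who want something runnable, here is a minimal sketch of the version-check update in Python with an in-memory SQLite database. The `names` table layout, the `choose_name` function, and the `ConcurrencyViolation` exception are assumptions made for this example; the point is only the `WHERE ... AND version = ?` clause combined with the affected-row count.

```python
import sqlite3


class ConcurrencyViolation(Exception):
    pass


def choose_name(conn, name, parent_id):
    """Mark `name` as chosen for `parent_id`, failing if someone else got there first."""
    row = conn.execute(
        "SELECT id, version, is_chosen FROM names WHERE name = ?", (name,)
    ).fetchone()
    if row is None or row[2]:
        raise ValueError(f"{name!r} is not available")
    record_id, version, _ = row
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            # The update only matches if nobody bumped the version since we read it.
            "UPDATE names SET parent_id = ?, is_chosen = 1, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (parent_id, record_id, version),
        )
    if cur.rowcount == 0:
        raise ConcurrencyViolation(name)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT UNIQUE, "
    "parent_id INTEGER, version INTEGER NOT NULL DEFAULT 0, "
    "is_chosen INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO names (name) VALUES ('Alice')")
conn.commit()

choose_name(conn, "Alice", parent_id=1)
```

A caller that receives `ConcurrencyViolation` would typically retry with a different name. The `UNIQUE` constraint on `name` covers the case of two requests adding the same name at once.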

SQL - trying to find a set of data which only has a certain set of values but not anything else in a few columns

Hopefully an easy problem for an experienced SQL person. I have an application which uses SQL Server, and I cannot perform this query in the application, so I'm hoping to back-door it, but I need help.
I have a table with a large list of emails and all its metadata. I'm trying to find email that is only between parties of this one company and flag them.
What I did was search where companyName.com is in To and From and marked a TagField as 1 (I did this through my application's front end).
Now what I need to do is search where any other possible values, ignoring companyName.com exist in To and From where I've already flagged them as 1 in TagField. From will usually just have one value, but To could have multiple, all formatted differently, but all separated by a semi-colon (I will probably have to apply this same search to CC and BCC columns, too).
Any thoughts?

Replace the ; with the empty string. Then check to see if the length changed. If there's one email address, there shouldn't be a ';'. You could also use the same technique to replace the company name with the empty string. Anything left would be the other companies.
select email_id, to_email
from yourtable
where TagField = 1 and len(to_email) <> len(replace(to_email,';',''))
This solution is based on the following thread
Number of times a particular character appears in a string
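The length-comparison trick is easy to check outside the database. This is a small Python sketch of it; the function names and the sample addresses are made up, and `other_parties` splits the field rather than using replace, which is a variation on the suggestion above, not the same technique.

```python
def count_char(text, char):
    """Count occurrences of `char` by comparing lengths before and after removing it."""
    return len(text) - len(text.replace(char, ""))


def other_parties(to_field, company_domain):
    """Return the addresses in a ';'-separated field that are outside `company_domain`."""
    addresses = [a.strip() for a in to_field.split(";") if a.strip()]
    return [a for a in addresses if company_domain.lower() not in a.lower()]


to_field = "Ann <ann@companyName.com>; bob@other.org; carl@companyname.com"
print(count_char(to_field, ";"))
print(other_parties(to_field, "companyName.com"))
```

A non-zero count means the field holds more than one address, and a non-empty list from `other_parties` means at least one recipient is outside the company.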

So I went an entirely different route: I exported my data to a CSV and used Python to get to where I needed. Here's the code I used in case anybody needs it. What this returned for me was a list of DocIDs (unique identifiers that were in the CSV) wherever there was an email address in the To field that wasn't from one specific domain. I went into the original CSV and made sure all instances of this domain name were in all lowercase, too.
import csv
import tkinter as tk
from tkinter import filedialog

# Ask the user to pick the exported CSV file
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()

sub = "domainname"  # the domain to ignore (lowercase)

def findMultipleTo(reader):
    for row in reader:
        if row['To'].find(";") != -1:
            # Keep only the recipients that are not from the domain
            toArray = row['To'].split(';')
            newTo = [s for s in toArray if sub not in s]
            row['To'] = newTo
        else:
            # Rows with a single recipient are skipped
            row['To'] = 'empty'
        # Append the DocID and remaining recipients to the output file
        with open('location\\newCSV-BCCFieldSearch.csv', 'a') as f:
            if row['To'] != "empty" and row['To'] != []:
                print(row['DocID'], row['To'], file=f)

with open(file_path, newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    findMultipleTo(reader)

Trigger Duplicate CSV

I am trying to upload a CSV file / insert a bulk of records using the import wizard. In short, I would like to keep the latest record if duplicates are found. Duplicate records are identified by a combination of first name, last name and title.
For example, if my CSV file looks like the following:
James,Wistler,34,New York,Married
James,Wistler,34,London,Married
....
....
James,Wistler,34,New York,Divorced
This should only keep in my org: James,Wistler,34,New York,Divorced
I have been trying to write a trigger before update / insert, but so far with no success. Here is my trigger code (it is not finished yet: it only filters on first name, and I am having a problem deleting the duplicates found in my CSV). Any hints? Thanks for reading!
trigger CheckDuplicateInsert on Customer__c(before insert,before update){
    Map <String, Customer__c> customerFirstName = new Map<String,Customer__c>();
    list <Customer__c> CustomerList = Trigger.new;
    for (Customer__c newCustomer : CustomerList)
    {
        if ((newCustomer.First_Name__c != null) && System.Trigger.isInsert )
        {
            if (customerFirstName.containsKey(newCustomer.First_Name__c) )
                //remove the duplicate from the map
                customerFirstName.remove(newCustomer.First_Name__c);
            //end of the if clause
            // add this stage we dont have any duplicate, so lets add a new customer
            customerFirstName.put(newCustomer.First_Name__c , newCustomer);
        }
        else if ((System.Trigger.oldMap.get(newCustomer.id)!= null)&&newCustomer.First_Name__c !=System.Trigger.oldMap.get(newCustomer.id).First_Name__c )
        {//field is being updated, lets mark it with UPDATED for tracking
            newCustomer.First_Name__c=newCustomer.First_Name__c+'UPDATED';
            customerFirstName.put(newCustomer.First_Name__c , newCustomer);
        }
    }
    for (Customer__c customer : [SELECT First_Name__c FROM Customer__c WHERE First_Name__c IN :customerFirstName.KeySet()])
    {
        if (customer.First_Name__c!=null)
        {
            Customer__c newCustomer=customerFirstName.get(customer.First_Name__c);
            newCustomer.First_Name__c=Customer.First_Name__c+'EXIST_DB';
        }
    }
}

Purely non-SF solution would be to sort them & deduplicate in Excel for example ;)
Good news - you don't need a trigger. Bad news - you might have to ditch the import wizard and start using Data Loader. The solution is pretty long and looks scary but once you get the hang of it it should start to make more sense and be easier to maintain in future than writing code.
You can download the Data Loader in setup area of your Production org and here's some basic info about the tool.
Anyway.
I'd make a new text field on your Contact, call it "unique key" or something and mark it as External Id. If you have never used ext. ids - Jeff Douglas has a good post about them.
You might have to populate the field on your existing data before proceeding. Easiest would be to export all Contacts where it's blank (from a report for example), fill it in with some Excel formulas and import back.
If you want, you can even write a workflow rule to handle the generation of the unique key. This might help you when Mrs. Jane Doe gets married and becomes Jane Bloggs and also will make previous point easier (you'd just import Contacts without changes, just "touching" them and the workflow will fire). Something like
condition: ISBLANK(Unique_key__c) || ISCHANGED(FirstName) || ISCHANGED(LastName) || ISCHANGED(Title)
new value: Title + FirstName + ' ' + LastName
Almost there. Fire Data Loader and prepare an upsert job (because we want to insert some records and when duplicate is found - update them instead).
My only concern is what happens when what's effectively same row will appear more than once in 1 "batch" of records sent to SF like in your example. Upsert will not know which value is valid (it's like setting x = 7; and x = 5; in same save to DB) and will decide to fail these rows. So you might have to tweak the amount of records in a batch in Data Loader's settings.
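One way around the same-batch problem is to collapse duplicates in the CSV before handing it to Data Loader, keeping only the last row for each key. The following Python sketch shows that pre-processing step; the column names and sample rows are assumptions based on the example in the question, and nothing here talks to Salesforce.

```python
import csv
import io


def keep_last_per_key(rows, key_fields):
    """Return rows with duplicates (same key_fields) collapsed to the last occurrence."""
    latest = {}
    for row in rows:
        key = tuple(row[f].strip().lower() for f in key_fields)
        latest.pop(key, None)  # re-insert so the survivor keeps its final position
        latest[key] = row
    return list(latest.values())


# Column names are assumptions; the question's CSV has no header row.
sample = io.StringIO(
    "FirstName,LastName,Title,City,Status\n"
    "James,Wistler,34,New York,Married\n"
    "James,Wistler,34,London,Married\n"
    "Anna,Smith,28,Paris,Single\n"
    "James,Wistler,34,New York,Divorced\n"
)
rows = list(csv.DictReader(sample))
deduped = keep_last_per_key(rows, ["FirstName", "LastName", "Title"])
for row in deduped:
    print(row)
```

After this step each unique key appears at most once in the file, so the upsert never sees two competing values for the same record in one batch.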
