Can you please explain how we can bring managed currencies down to record level? The rate changes daily, so can you please explain how it is changed and how it is used at record level?
Your question is poorly worded. An opportunity that's worth €1000 and that nobody edits will still be worth €1000 tomorrow. What can change daily is the currency exchange rate to $, £...
You can enable currency management (if not done already), which causes the CurrencyIsoCode picklist to appear in every table. And then - if you really need to keep a record of what the price was at a specific point in time - you can add support for "dated currency rates" on top of that. One consideration: out of the box, the exchange rate recalculates only on Opportunity and related objects. For your custom objects you'll need to write a nightly (hourly?) batch, for example.
Once you're sure that dated rates are what you need (because maybe you just need the basic version with one current exchange rate), look into an integration that would periodically contact some currency exchange info server and write to either CurrencyType or DatedConversionRate.
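As a rough illustration only, a minimal sketch of such a job using the python simple-salesforce library; the rates endpoint, its response shape and the credentials are all invented:

# Hedged sketch, not a definitive implementation: pull fresh rates from an
# invented endpoint and write them to DatedConversionRate.
import requests
from simple_salesforce import Salesforce

sf = Salesforce(username="admin@example.com", password="...",
                security_token="...")  # placeholder credentials

# Invented rates service; assumed response shape: {"EUR": 0.92, "GBP": 0.79}
rates = requests.get("https://rates.example.com/latest?base=USD").json()

for iso_code, rate in rates.items():
    # Find the dated rate record that starts today for this currency
    found = sf.query("SELECT Id FROM DatedConversionRate "
                     f"WHERE IsoCode = '{iso_code}' AND StartDate = TODAY")
    for record in found["records"]:
        sf.DatedConversionRate.update(record["Id"], {"ConversionRate": rate})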
Salesforce provides the CaseMilestone table. Each time I call the API to get the same object, I notice that the TimeRemainingInMins field has a different value, so I guessed this field is auto-calculated each time I call the API.
Is there a way to know which fields in a table are auto-calculated?
Note: I am using the python simple-salesforce library.
Case milestone is special because it's used as a countdown to service level agreement (SLA) violation and drives some escalation rules. Depending on how the admin configured the clock, you may notice it stops for weekends and bank holidays, or maybe counts only Mon-Fri 9-17...
Out of the box, another place that may have similar functionality is the OpportunityHistory table. I don't remember exactly, but SF uses it for duration reporting: how long an opportunity spent in each stage.
That's standard. For custom fields that change every time you read them even though nothing actually changed the record (LastModifiedDate staying the same) - your admin could have created formula fields based on NOW() or TODAY(), and these would also recalculate every time you read them. You'd need some "describe" calls to get the field types and the formula itself.
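Since you're on simple-salesforce, a quick sketch of such a describe call (Case is just an example object; credentials are placeholders):

# Sketch: use a describe call to list auto-calculated (formula) fields.
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="...",
                security_token="...")  # placeholder credentials

for field in sf.Case.describe()["fields"]:
    if field.get("calculated"):
        # calculatedFormula holds the formula text for custom formula fields;
        # it may be empty for standard system-computed fields.
        print(field["name"], field.get("calculatedFormula"))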
We have a booking system where tens of thousands of reservations are made every day. Because a customer can create a reservation without being logged in, a new customer id/row is created for every reservation, even if the very same customer has already reserved in the system before. That results in a lot of customer duplicates.
The engineering team has decided that, in order to deduplicate the customers, they will run a nightly script, every day, which checks for these duplicates based on some business rules (email, address, etc.). The logic for the deduplication is then:
If a new reservation is created, check if the (newly created) customer for this reservation already has an old customer id (by comparing email and other aspects).
If it has one or more old reservations, detach that reservation from the old customer id, and link it to a new customer id. Literally by changing the customer ID of that old reservation to the newly created customer.
I don't have a very strong technical background, but to me this smells like terrible design. As we have several operational applications relying on that data, this creates a massive sync issue. Besides that, I was hoping to understand why exactly, in terms of application architecture, this is bad design, and what would be a better solution for this problem of deduplication (if it even has to be solved in "this" application domain).
I would appreciate very much any help so I can drive the engineering team to the right direction.
In General
What's the problem you're trying to solve? Freeing up disk space, getting accurate analytics of user behavior, or being more user friendly?
It feels a bit risky, and depends on how critical it is that you get the re-matching 100% correct. You need to ask "what's the worst that can happen?" and "does this open the system to abuse" - not because you should be paranoid, but because to not think that through feels a bit negligent. E.g. if you were a govt department matching private citizen records then that approach would be way too cavalier.
If the worst that can happen is not so bad, and the 80% you get right gets you the outcome you need, then maybe it's ok.
If there's not a process for validating the identity of the user then by definition your customer id/row is storing sessions, not Customers.
In terms of the nightly job: if your backend system is an old legacy system then I can appreciate why a nightly batch job might be the easiest option; that said, if done correctly and with the right architecture, you should be able to do that check on the fly, as needed.
Specifics
...check if the (newly created) customer for this reservation already has an old customer id (by comparing email...
Are you validating the email - e.g. by getting users to confirm it through a confirmation email mechanism? If yes, and if email is a mandatory field, then this feels ok, and you could probably use the email exclusively.
... and other aspects.
What are those? Sometimes getting more data just makes it harder unless there's good data hygiene in place. E.g. what happens if you're checking phone numbers (and other data) and someone makes a typo in the phone number which matches some other customer - so you simultaneously match more than one customer?
If it has one or more old reservations, detach that reservation from the old customer id, and link it to a new customer id. Literally by changing the customer ID of that old reservation to the newly created customer.
Feels dangerous. What happens if the detaching process screws up? I've seen situations where instead of updating the delta, the system did a total purge then full re-import... when the second part fails the entire system is blank. It's not your exact situation but you are creating the possibility for similar types of issue.
As we have several operational applications relying on that data, this creates a massive sync issue.
...case in point.
In your case, doing the swap in a transaction would be wise. You may want to consider tracking all Cust ID swaps so that you can revert if something goes wrong.
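For example, a minimal sketch of that idea, assuming a relational store; the customer_id_swap audit table and all names are invented:

# Sketch: do the id swap in one transaction and keep an audit trail so it
# can be reverted. All table/column names are invented for illustration.
import sqlite3  # stand-in for your real database driver

def swap_customer_id(conn, old_id, new_id):
    with conn:  # commits on success, rolls back on any exception
        conn.execute(
            "INSERT INTO customer_id_swap "
            "(reservation_id, old_customer_id, new_customer_id, swapped_at) "
            "SELECT id, customer_id, ?, CURRENT_TIMESTAMP "
            "FROM reservation WHERE customer_id = ?",
            (new_id, old_id))
        conn.execute("UPDATE reservation SET customer_id = ? "
                     "WHERE customer_id = ?", (new_id, old_id))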
Option - Phased Introduction Based on Testing
You could try this:
Keep the system as-is for now.
Add the logic which does the checks you are proposing, but have it create trial data on the side - i.e. don't change the real records, just make a copy that is what the new data would be. Do this in production - you'll get a way better sample of data.
Run extensive tests over the trial data, looking for instances where you got it wrong. What's more likely, and what you could consider building, is a "scoring" algorithm. If you are checking more than one piece of data then you'll get different combinations with different likelihoods of accuracy. You can use this to gauge how good your matching is, and then decide in which circumstances it's safe to do the ID switch and when it's not (see the sketch after this list).
Once you're happy, implement as you see fit - either just the algorithm & result, or the scoring harness as well so you can observe its performance over time - especially if you introduce changes.
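A toy sketch of such a scoring function; the weights and thresholds are invented and would be tuned against the trial data:

# Combine several match signals into one score; only auto-swap above a
# high threshold, queue borderline cases for manual review.
def match_score(new_cust, old_cust):
    score = 0.0
    if new_cust.get("email") and new_cust["email"] == old_cust.get("email"):
        score += 0.7  # a confirmed email is the strongest signal
    if new_cust.get("phone") and new_cust["phone"] == old_cust.get("phone"):
        score += 0.2
    if new_cust.get("postcode") and new_cust["postcode"] == old_cust.get("postcode"):
        score += 0.1
    return score

AUTO_SWAP_THRESHOLD = 0.8  # above this, switch the id automatically
REVIEW_THRESHOLD = 0.5     # between the two, queue for a human to look at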
Alternative Customer/Session Approach
Treat all bookings (excluding personal details) as bookings, with customers (little c, i.e. Sessions) but without Customers.
Allow users to optionally be validated as "Customers" (big C).
Bookings created by a validated Customer then link to each other. All bookings relate to a customer (session) which never changes, so you have traceability.
I can tweak the answer once I know more about what problem it is you are trying to solve - i.e. what your motivations are.
I wouldn't say that's a terrible design; it's just a simple approach to solving this particular problem, with some room for improvement. It's not optimal because the runtime of that job depends on the number of new bookings received during the day, which may vary from day to day, so other workflows that depend on it will be impacted.
This approach can be improved by processing new bookings in parallel, and by using an index to get a fast lookup when checking whether a new e-mail already exists.
You can also check out Bloom filters - an efficient data structure that can tell you when an element is definitely not in a given set.
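For a feel of how that works, a toy sketch; a real filter would size its bit array and hash count from the expected number of emails and the acceptable false-positive rate:

import hashlib

class BloomFilter:
    def __init__(self, size_bits=1_000_000, num_hashes=5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means "maybe present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

emails = BloomFilter()
emails.add("jane@example.com")
print(emails.might_contain("jane@example.com"))  # True
print(emails.might_contain("john@example.com"))  # False (almost certainly)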
The way I would do it is to store the bookings in a NoSQL DB table keyed off the user email. You get the user email in both situations - when the customer has an account and when they make a booking without one - so you just have to do a lookup to get the bookings by email, which makes the deduplication job redundant.
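As an illustration only - a sketch with DynamoDB via boto3, where the table ("bookings", partition key "email", sort key "booking_id") is an invented design, not a prescription:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("bookings")

# Every booking is written under the customer's email, logged in or not.
table.put_item(Item={"email": "jane@example.com", "booking_id": "b-123",
                     "status": "confirmed"})

# All bookings for a customer are then one query away - no dedup job needed.
bookings = table.query(
    KeyConditionExpression=Key("email").eq("jane@example.com"))["Items"]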
I'm looking for some thoughts about how to design a digest email feature. I'm not concerned about the actual business code; instead I'd like to focus on the gist of it.
Let's tackle this with a known example: articles. Here's a general overview of some important features:
The user is able to choose the digest frequency (e.g. daily or weekly);
The digest only contains new articles;
"New articles" are to be considered relative to the previous digest that was sent to a specific user;
I've been thinking about the following:
Introduce per-user tracking of articles previously included in a digest and filter those out?
Requires a new database table;
Could become expensive when the table contains millions of rows;
What to do in case of including multiple types of models in the digest? Multiple tracking tables? Polymorphic table? ...?
Use article creation dates to include articles created between the current date and the chosen digest frequency?
Uses current date and information already present in the database, so no new tables required;
What happens when a user changes from daily to weekly emails? He could receive the same article again in the weekly digest. Should this edge case be considered? If so, how to mitigate?
If for some reason the creation date of an article is updated to today, the date comparison would wrongly trigger again. Should this edge case be considered? If so, how to mitigate?
Or can you think of other ways to implement this feature?
I'm eager to learn your insights.
You can add a table that holds information about each user's digest subscription. This keeps the database design cleaner and more universal, because mailing is a separate logical module. Aside from that, the additional table makes it easy to expand the stored subscription data in the future. For example:
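A minimal sketch of such a table, with columns inferred from the queries below:

CREATE TABLE digest_subscription (
    user_id                INT NOT NULL PRIMARY KEY,
    interval_type          VARCHAR(10) NOT NULL,  -- 'daily' or 'weekly'
    last_date_distribution DATETIME NOT NULL
);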
With this table you can manage the data easily. For example, you can select all recipients of the daily digest:
SELECT *
FROM digest_subscription
WHERE interval_type = 'daily'
AND last_date_distribution <= NOW()
or select all recipients of the weekly digest
SELECT *
FROM digest_subscription
WHERE interval_type = 'weekly'
AND last_date_distribution <= NOW() - INTERVAL 7 DAY
Filtering by interval type and comparing the last distribution date with "equal or less" avoids problems with untimely sending of emails (for example, after technical failures on a server, etc.).
Also, you can build the correct article list using the last distribution date. Using the last distribution date avoids the problems of interval changes. For example:
SELECT *
FROM articles
WHERE created_at >= <the last date distribution of the user>
Of course, this doesn't avoid the problem of an updated creation date, but you should minimize the reasons for that happening: your code can update the modification date, but it shouldn't modify the creation date.
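To tie it together, a rough sketch of the job in Python; table and column names follow the examples above, the mailer is a stub, and SQLite stands in for your real database:

import sqlite3
from datetime import datetime, timedelta

def send_email(user_id, articles):
    print(f"user {user_id}: sending {len(articles)} new articles")  # stub

def run_digest(conn, interval_type, min_age):
    now = datetime.now()
    due = conn.execute(
        "SELECT user_id, last_date_distribution FROM digest_subscription "
        "WHERE interval_type = ? AND last_date_distribution <= ?",
        (interval_type, (now - min_age).isoformat(" "))).fetchall()
    for user_id, last_sent in due:
        articles = conn.execute(
            "SELECT * FROM articles WHERE created_at >= ?",
            (last_sent,)).fetchall()
        if articles:
            send_email(user_id, articles)
        conn.execute(
            "UPDATE digest_subscription SET last_date_distribution = ? "
            "WHERE user_id = ?", (now.isoformat(" "), user_id))
    conn.commit()

# run_digest(conn, "daily", timedelta(0))
# run_digest(conn, "weekly", timedelta(days=7))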
Let's say I want to model a graph with sales people. They belong to an organisation, have a manager, etc. They are assigned to specific territories and/or client accounts. Your company may work with external partners, which must be managed, and so on. A nice, non-trivial graph.
Elements in this graph keep changing all the time: sales people come and go, or move within the organisation and thus change responsibilities; customers sign contracts or cancel them; ...
In my specific use cases, the point in time is very important. What did the graph look like at the end of last month? At the end of the last fiscal year? Last Monday when we ran job ABC? E.g. what was the manager hierarchy at the end of last month? Which clients did the salesperson manage at the end of last month? And so on.
In our use cases, DELETE doesn't delete anything; instead, some sort of end_date gets updated. UPDATE doesn't update anything; instead, a new version of the record is created.
I'm sure I can add CREATED and START_/END_DATE properties to nodes as well as relations, and of course I can also write the queries. But these queries are a pain to write and almost unreadable, with tons of repeated WHERE clauses everywhere.
I wish graph databases (and their graphical query builders) would allow me to travel in time more easily, e.g. by setting a session variable to a point in time, so that the WHERE clauses are automatically added for all nodes and relationships that have the start/end date properties. The algorithm should not fail for objects that don't have these properties, but consider the condition met.
What are your thoughts about this use case, and what help does Memgraph provide for it?
thanks a lot
Juergen
As far as I am aware, there isn't any graph database that directly supports the type of functionality you are asking about, although as @buda points out you can model and query against time series data. I agree with @buda that the way in which you would like this to work seems a bit undefined and very application-specific, so I would not expect it to be a feature of any database.
The closest thing I can think of to out-of-the-box support for something like this would be to use a TinkerPop-enabled database with a PartitionStrategy or SubgraphStrategy to create a subgraph of only the time window you want, and then query against that. Another option would be creating a domain-specific language to minimize the number of times you need to repeat code in your queries.
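For a rough idea, a sketch with gremlinpython against a hypothetical TinkerPop-enabled store (Memgraph itself speaks Cypher, so this is not Memgraph-specific); the property names, endpoint and sample query are all invented:

from datetime import datetime
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import SubgraphStrategy
from gremlin_python.process.traversal import P
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

as_of = datetime(2023, 12, 31)

# "Valid at as_of": elements lacking start/end properties pass the filter,
# which matches the behaviour you describe.
in_window = __.and_(
    __.or_(__.hasNot("start_date"), __.has("start_date", P.lte(as_of))),
    __.or_(__.hasNot("end_date"), __.has("end_date", P.gte(as_of))))

g = traversal().withRemote(
    DriverRemoteConnection("ws://localhost:8182/gremlin", "g")).withStrategies(
    SubgraphStrategy(vertices=in_window, edges=in_window))

# Every traversal on g now sees only the graph as of `as_of`:
managers = g.V().has("name", "Alice").out("reports_to").values("name").toList()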
Further reading: PartitionStrategy, SubgraphStrategy, Domain Specific Languages.
Last year we launched http://tweetMp.org.au - a site dedicated to Australian politics and twitter.
Late last year our politician schema needed to be adjusted because some politicians retired and new politicians came in.
Changing our db required manual (SQL) change, so I was considering implementing a CMS for our admins to make these changes in the future.
There are also many other government/politics sites out there for Australia that manage their own politician data.
I'd like to come up with a centralized way of doing this.
After thinking about it for a while, maybe the best approach is not to model the current view of the politician data and how it relates to the political system, but to model the transactions instead, such that the current view is the projection of all the transactions/changes that happened in the past.
Using this approach, other sites could "subscribe" to changes (à la PubSubHubbub) and submit changes, and just integrate these change items into their schemas.
Without this approach, most sites would have to tear down the entire db, and repopulate it, so any associated records would need to be reassociated. Managing data this way is pretty annoying, and severely impedes mashups of this data for the public good.
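This is essentially what's commonly called event sourcing. A toy sketch of the projection idea, with event shapes invented for illustration:

events = [
    {"type": "politician_added", "id": 1, "name": "Jane Doe", "seat": "Higgins"},
    {"type": "politician_retired", "id": 1},
    {"type": "politician_added", "id": 2, "name": "John Roe", "seat": "Higgins"},
]

def project(log):
    # The current view is a fold over the append-only change log.
    current = {}
    for e in log:
        if e["type"] == "politician_added":
            current[e["id"]] = {"name": e["name"], "seat": e["seat"]}
        elif e["type"] == "politician_retired":
            current.pop(e["id"], None)
    return current

print(project(events))  # {2: {'name': 'John Roe', 'seat': 'Higgins'}}
# Subscribers replay the same log to keep their own schemas in sync.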
I've noticed some things work this way - source version control, banking records, stackoverflow points system and many other examples.
Of course, the immediate challenges and design issues with this approach include:
is the current view cached and repersisted? how often is it updated?
what base entities must exist that never change?
probably heaps more I can't think of right now...
Is there any notable literature on this subject that anyone could recommend?
Also, any patterns or practices for data modelling like this that could be useful?
Any help is greatly appreciated.
-CV
This is a fairly common problem in data modelling. Basically it comes down to this:
Are you interested in the view now, the view at a point in time, or both?
For example, if you have a service that models subscriptions you need to know:
What services someone had at a point in time: this is needed to work out how much to charge, to see a history of the account and so forth; and
What services someone has now: what can they access on the Website?
The starting point for this kind of problem is to have a history table, such as:
Service history: id, userid, serviceid, start_date, end_date
Chain together the service histories for a user and you have their history. So how do you model what they have now? The easiest (and most normalized) view is to say the last record, or the record with a NULL end date or a present or future end date, is what they have now.
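For example, a sketch of deriving "what they have now" straight from the history table (SQLite as a stand-in; column names as in the answer above, treating a NULL or future end_date as active):

import sqlite3

conn = sqlite3.connect("subscriptions.db")
current = conn.execute(
    "SELECT userid, serviceid FROM service_history "
    "WHERE start_date <= CURRENT_TIMESTAMP "
    "AND (end_date IS NULL OR end_date >= CURRENT_TIMESTAMP)").fetchall()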
As you can imagine this can lead to some gnarly SQL, so this is selectively denormalized: you have a Services table and another table for history. Each time Services is changed, a history record is created or updated. This kind of approach makes the history table more of an audit table (another term you'll see bandied about).
This is analogous to your problem. You need to know:
Who is the current MP for each seat in the House of Representatives;
Who is the current Senator for each seat;
Who is the current Minister for each department;
Who is the Prime Minister.
But you also need to know who was each of those things at a point in time so you need a history for all those things.
So if, on the 20th of August 2003, Peter Costello made a press release, you would need to know that at that time he was:
The Member for Higgins;
The Treasurer; and
The Deputy Prime Minister.
because conceivably someone could be interested in finding all press releases by Peter Costello or by the Treasurer, which should lead to the same press release but would be impossible to trace without the history.
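For instance, a sketch of the point-in-time lookup, assuming an invented role_history(person, role, start_date, end_date) table:

import sqlite3

conn = sqlite3.connect("politics.db")
# "Who was the Treasurer on 2003-08-20?"
row = conn.execute(
    "SELECT person FROM role_history "
    "WHERE role = 'Treasurer' AND start_date <= :d "
    "AND (end_date IS NULL OR end_date >= :d)",
    {"d": "2003-08-20"}).fetchone()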
Additionally you might need to know which seats are in which states, possibly the geographical boundaries and so on.
None of this should require a schema change as the schema should be able to handle it.