I wish to create a generic component that can save the object name and field names, with old and new values, in a big object.
The brute-force approach would be: on every update of each object, get the field API names using describe and compare the old and new values of those fields. If a field was modified, insert a record into the new big object.
But this will consume a lot of CPU time, and I am looking for a more efficient way to handle it.
Any suggestions are appreciated.
Well, do you have any code written already? Maybe benchmark it and then see what you can optimise, instead of overdesigning it from the start... Keep it simple, write a test harness, and then try to optimise (without breaking the unit tests).
Couple random ideas:
You'd be doing that in a trigger? Then your "describe" could happen only once. You don't need to describe every single field; you need only one describe call, outside the trigger's main loop.
Set<String> fieldNames = Account.sObjectType.getDescribe().fields.getMap().keySet();
System.debug(fieldNames);
This will get you "only" the field names, but that's enough. You don't care whether they're picklists or dates or what. Use that with a generic sObject.get('fieldNameHere') and it's a good start (see the sketch below).
Or maybe skip describe altogether: sObject's getPopulatedFieldsAsMap() will give you a Map which you can easily iterate and compare.
Or JSON.serialize the old and new versions of the object, and if the strings aren't identical, you know what to do. No idea if they'll always serialise with the same field order, though, so checking whether the maps are identical might be better.
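Putting the first two ideas together, a minimal sketch. FieldChange__b, its fields, and the FieldChangeWriter class are hypothetical names; substitute your big object's API names. Note that big-object DML can't be mixed with ordinary sObject DML in the same transaction, so the insert is handed off to a Queueable.

trigger AccountAudit on Account (after update) {
    // One describe call, outside the main loop.
    Set<String> fieldNames =
        Account.sObjectType.getDescribe().fields.getMap().keySet();
    List<FieldChange__b> changes = new List<FieldChange__b>();
    for (Account acc : Trigger.new) {
        Account oldAcc = (Account) Trigger.oldMap.get(acc.Id);
        for (String f : fieldNames) {
            Object oldVal = oldAcc.get(f);
            Object newVal = acc.get(f);
            if (oldVal != newVal) {
                changes.add(new FieldChange__b(
                    RecordId__c  = acc.Id,
                    FieldName__c = f,
                    OldValue__c  = String.valueOf(oldVal),
                    NewValue__c  = String.valueOf(newVal)));
            }
        }
    }
    if (!changes.isEmpty()) {
        // Hypothetical Queueable whose execute() calls Database.insertImmediate(changes).
        System.enqueueJob(new FieldChangeWriter(changes));
    }
}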
Do you really need to hand-craft field history tracking like that? You have 1M records of free storage, but it could explode really easily in a busier SF org, especially if you have workflows, processes, or other triggers that translate to multiple updates (= multiple trigger runs) in the same transaction. Perhaps normal field history tracking + Chatter feed tracking + even Salesforce Shield (it comes with 60 more tracked fields, I think) would be more sensible for your business needs.
[Note: there is a Teacher object with fields such as Teacher Name and DateofJoining, and also a formula field called Experience.]
My task was to create a Public Group containing another user,
and this user should only see teachers with more than 2 years of experience.
But when I create a criteria-based sharing rule, the Experience field doesn't show up, as it is a formula field.
So I had the idea of creating a new field (maybe a text or number data type) that would hold the value of Experience. (But I have no idea how to implement this.)
Is there a way to implement this?
Any other solution is also appreciated!
Hard to say.
The normal trick would be to create a helper field (text, number, whatever) and have a piece of functionality that populates it. An "early flow" or a "before insert, before update" trigger, ideally; worst case a normal flow, Process Builder, or "after insert, after update" trigger. Something like "if Experience__c != 'your formula here' then Experience__c = 'your formula here'". Consult the normal SF help and Trailhead if you've never used early flows.
You'd make a one-off data fix to populate existing records and job done; a normal field should be selectable as sharing rule criteria.
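For the trigger route, a minimal sketch, assuming the object is Teacher__c with a date field DateofJoining__c and a new number field Experience__c (adjust the API names to yours):

trigger SetExperience on Teacher__c (before insert, before update) {
    for (Teacher__c t : Trigger.new) {
        if (t.DateofJoining__c != null) {
            // In a "before" context the assignment is saved with the record; no extra DML needed.
            t.Experience__c = t.DateofJoining__c.daysBetween(Date.today()) / 365.0;
        }
    }
}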
=====
But I smell trouble with your formula. What exactly do you have there - something like Experience__c = (TODAY() - DateofJoining__c) / 365? That's a bit evil. Formulas with TODAY(), NOW() or anything with $ (roughly speaking, things about who's looking at the data - the user's name, profile, role... not what's actually on the record itself) are "nondeterministic". Unpredictable.
A "today()" changes just like that, without updating the record. Sure, when you view the record a fresh value will be calculated, but other than that LastModifiedDate doesn't change, and there's no magical trigger running at midnight that rechecks sharing (especially since there's no single midnight; you could have users in multiple timezones). SF just doesn't allow nondeterministic fields in many places; see https://salesforce.stackexchange.com/q/32122/799
So if you do rely on TODAY() in your formula, you might have to make a "scheduled flow" or read about Schedulable, Batchable Apex: create a nightly job that runs and recalculates your helper field with the right experience. You'd probably even need both solutions - a "before save" flow for new data created today, and a nightly job to advance the clock on existing old data...
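A hedged sketch of that nightly job, reusing the same assumed API names as above - one class that is both Schedulable and Batchable:

public class RecalcExperienceJob implements Schedulable, Database.Batchable<SObject> {
    public void execute(SchedulableContext sc) {
        Database.executeBatch(this);
    }
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, DateofJoining__c FROM Teacher__c WHERE DateofJoining__c != null');
    }
    public void execute(Database.BatchableContext bc, List<Teacher__c> scope) {
        for (Teacher__c t : scope) {
            // Recalculate the helper field so sharing catches up with the calendar.
            t.Experience__c = t.DateofJoining__c.daysBetween(Date.today()) / 365.0;
        }
        update scope;
    }
    public void finish(Database.BatchableContext bc) {}
}
// Schedule it to run at 1 AM every day:
// System.schedule('Recalc Experience', '0 0 1 * * ?', new RecalcExperienceJob());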
Just a question regarding NoSQL DBs. As far as I know, operations are done by the app/website outside the DB. For instance, if I need to add a value to a list, I need to:
download the initial list,
add the new value to the list on my device,
upload the whole updated list.
In the end, a lot of data travels (twice the initial list) with no added value.
Is there any way to ask the DB directly to do simple operations like this?
db.collection("collection_key").document("document_key").add("mylist", value)
Or simply increment a field?
Same for knowing the number of documents in a collection: is it necessary to download the whole set of documents just to get the count?
Couple different answers:
In Firestore, many intrinsic operations can be done with FieldValues, such as increment/decrement (by a supplied value, so really add/subtract), as well as array unions, field deletes, etc. Just search the documentation for FieldValue. Whether this is true of NoSQL in general, I can't say.
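For instance, with the Web SDK (the modular v9+ API exposes the FieldValue helpers as standalone functions; the collection, document, and field names below just mirror the question):

import { getFirestore, doc, updateDoc, arrayUnion, increment } from "firebase/firestore";

const db = getFirestore();
const ref = doc(db, "collection_key", "document_key");

// Append a value to an array field server-side -- no download/modify/upload round trip.
await updateDoc(ref, { mylist: arrayUnion("value") });

// Atomically increment a numeric field on the server.
await updateDoc(ref, { counter: increment(1) });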
Knowing the number of documents, on the other hand, is not trivially done in Firestore - but frankly, I can't think of any situations other than artificially contrived examples where you would need to know. It's easy enough to set up ways to "count" documents as you create/delete them, and keep that separately, if for some reason you find yourself needing it.
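One hedged way to do that counting (the stats document path is made up): bump a counter document in the same batch as the write, so the two can't drift apart.

import { writeBatch, collection, doc, increment } from "firebase/firestore";

async function createWithCount(db, data) {
  const batch = writeBatch(db);
  batch.set(doc(collection(db, "collection_key")), data);                    // the new document (auto ID)
  batch.update(doc(db, "stats", "collection_key"), { count: increment(1) }); // the running count
  await batch.commit();                                                      // both writes succeed or fail together
}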
Or were you just trying to generically put down NoSQL as a concept?
I've been searching for a solution for recurring events; so far I've found two approaches:
First approach:
Create an instance for each event, so if the user has a daily event for one year, 365 rows would be necessary in the table.
It sounds plausible for a fixed time frame, but how do you deal with events that have no end date?
Second approach:
Create a recurring-pattern table that creates future events at runtime using some kind of temporal expression (Martin Fowler).
Is there any reason not to choose the first approach over the second one?
The first approach is going to overpopulate the database and maybe affect performance, right?!
There's a quote about approach number 1 that says:
"Storing recurring events as individual rows is a recipe for disaster." (https://github.com/bmoeskau/Extensible/blob/master/recurrence-overview.md)
What do you guys think about it? I would like some insight into why that would be a disaster.
I appreciate your help.
The proper answer is really both, not either/or.
Setting aside for a moment the issue of no end date for recurrence: what you want is a header that contains recurrence rules for the whole pattern. That way if you need to change the pattern, you've captured that pattern in a single record that can be edited without risking update anomalies.
Now, joining against some kind of recurrence pattern in SQL is going to be a great big pain in the neck. Furthermore, what if your rules allow you to tweak (edit, or even delete) specific instances of this recurrence pattern?
How do you handle this? You create an instance table with one row per recurring instance and a link (foreign key) back to the single rule that was used to create it. This lets you modify an individual child without losing sight of where it came from, in case you need to edit (or delete) the entire pattern.
Consider a calendaring tool like Outlook or Google Calendar. These applications use this approach. You can move or edit an instance. You can also change the whole series. The apps ask you which you mean to do whenever you go into an editing mode.
There are some limitations to this. For example, if you edit an instance and then edit the pattern, you need to have a rule that says either (a) new parent wins or (b) modified children always win. I think Outlook and Google Calendar use approach (a).
As for why recording each instance explicitly would be a disaster: the only disastrous thing I can think of is that if you didn't have the link back to the original recurrence pattern, you would have a heck of a time cancelling the whole series in one action.
Back to no end date - this might be a case of discretion being the better part of valour: use some kind of rule of thumb that imposes a practical limit on how far into the future you extend such a series, or alternatively just don't allow that kind of rule in a pattern. Force an end to the pattern and let the rule's creator worry about extending it at whatever future point that becomes necessary.
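To make the header/instance split concrete, here is one possible shape (MySQL-flavoured; table and column names are illustrative, not prescriptive):

CREATE TABLE recurrence_rule (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    title   VARCHAR(255) NOT NULL,
    rrule   VARCHAR(255) NOT NULL,   -- e.g. an iCalendar RRULE string
    dtstart DATETIME     NOT NULL,
    until   DATETIME     NULL        -- NULL = open-ended (see the caveat above)
);

CREATE TABLE event_instance (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    rule_id      INT NULL,           -- NULL for one-off events
    starts_at    DATETIME NOT NULL,
    ends_at      DATETIME NOT NULL,
    is_exception BOOLEAN NOT NULL DEFAULT FALSE,  -- set when an instance is edited on its own
    FOREIGN KEY (rule_id) REFERENCES recurrence_rule (id)
);

-- Cancelling a whole series then becomes a single action:
-- DELETE FROM event_instance WHERE rule_id = ?;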
Store the calendar's events as rules rather than just as materialized events.
Storing recurring events materialized as rows is a recipe for disaster for the obvious reason that the materialization would ideally be of infinite length. Since an endless table is not possible, the developer will try to mimic that behaviour with some clever, incomprehensible trick - resulting in erratic application behaviour.
My suggestion: store the rules, and materialize them into rows only when queried - leading to a hybrid approach.
So you will have two tables storing your information: the first for storing rules, the second for storing rows materialized from any rule in the rules table.
The general guidelines can be:
For a one-time event, add a row to the second table.
For a recurring event, add a row to the first table and materialize some of its occurrences into the second table.
For a query about a future date, materialize the rules and save the resulting rows in the second table.
For a modification of a specific instance of a recurring event, materialize the event up to the instance you want to modify, then modify that last instance and store it.
Further, if an event is too far in the future, do not materialize it. Instead, keep it as a rule and expand it later, when the time arrives.
Plain tables will not be enough to store what you are trying to save. This kind of information is best maintained in the database with stored procedures for access and modification.
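As a sketch of the materialization step - assuming the two-table layout sketched earlier, a simple daily rule with id = 1, and MySQL 8's recursive CTEs (real code would parse the stored rule instead of hard-coding the interval):

INSERT INTO event_instance (rule_id, starts_at, ends_at)
WITH RECURSIVE occ (starts_at) AS (
    SELECT dtstart FROM recurrence_rule WHERE id = 1
    UNION ALL
    SELECT starts_at + INTERVAL 1 DAY FROM occ
    WHERE starts_at + INTERVAL 1 DAY < NOW() + INTERVAL 90 DAY  -- materialization horizon
)
SELECT 1, starts_at, starts_at + INTERVAL 1 HOUR FROM occ;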
From the answers in the blog post and the answers here, materializing every occurrence as a row will:
1- eat DB storage and memory with these recurrences (needlessly), with the extreme case of "no end date"
2- impact performance (for query / join / update / ...)
3- in case of an update (or generally any case where you need to handle the recurrence set as a set, not as individual occurrences), require updating all rows
What is the proper way to keep data slices refreshed? Imagine I have a table with various columns, but importantly a DATE_CREATED and a DATE_MODIFIED column.
If my data slicing strategy is based on DATE_CREATED, I could periodically reprocess old slices. This follows the ADF guidance of "repeatability". I don't think ADF has a way of doing this automatically, but I could externally trigger the refresh via the API (I'm guessing). This seems like perhaps the most correct way, but given that ADF doesn't seem to support it as a feature, it makes me feel like there's a better way of doing it... it also seems mildly wasteful.
If my data slicing strategy is based on DATE_MODIFIED, I run into issues with the ADF activities not being repeatable. An old slice, when refreshed, would give different results because rows that were within the window may have moved to a different window. On the other hand, the latest slice would always catch rows that have changed. The other issue is preventing row duplication. The pre-activity clean up actions would need to somehow be able to clean up records in the destination table prior to the copy. Or some type of UPSERT method must be employed.
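(For example, the pre-copy step could be something like this T-SQL sketch, with made-up table and key names: land the slice in a staging table, then replace matching keys in the destination.)

-- Remove destination rows that are about to be re-inserted, then insert the
-- fresh copies; re-running the same slice yields the same end state.
DELETE d
FROM   dbo.DestTable AS d
JOIN   dbo.StagingTable AS s ON s.Id = d.Id;

INSERT INTO dbo.DestTable (Id, Payload, DATE_CREATED, DATE_MODIFIED)
SELECT Id, Payload, DATE_CREATED, DATE_MODIFIED
FROM   dbo.StagingTable;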
The final option is to TRUNCATE the destination table every day. This is fine for smaller tables but has its own downsides: (1) we're not really "slicing" at all anymore - this is just scorched earth; (2) any time any slice is being processed, all downstream slices from all dates are in danger of failing, due to the table being blown away; (3) it's practically impossible if your table has any respectable amount of data in it.
No option seems excellent, but the first option seems better. I'm looking for advice from someone who has solved this problem or is experienced with ADF.
First of all, I'd like to tell you that you're a terrific audience.
I'm making an application where I have a model Foo with table Foos, and I'd like to give Foo another parameter - a HABTM parameter, let's say Bar. But I'd rather not create a table for Bar, because Bar will have about 5 entries at the start, and in 5 years it may grow to maybe 7, or not at all. So I don't see the need to create another table and make CakePHP look into it with another SELECT. Does anyone have an idea how this can be achieved?
One solution I can think of is making a fixture for the Bars table and adding only the Bars_Foos table for real (it won't be big anyway). But I can't find a way to use test fixtures in a normal controller.
A second solution is to save JSON or a serialized array in one Foo field and move the logic to the model, but I don't know if that is the best solution. Something like a virtual field.
Real life example:
So I have Bikes, and every Bike has a main_type, which is for now {"MTB","Road","Trekking","City","Downhill"}. I know this list won't grow much over time - maybe 2 to 5 entries in a few years. Still, it will stay relatively short.
(For those who would say there may be hundreds of specialized bike types: I have another column for that, specialized_type.)
It needs to be a HABTM relation, but the main_types table will be very small, so I'd like to avoid creating it and find a simpler solution.
Because:
It bothers MySQL for such a small amount of data
It complicates MySQL queries
I have to make an additional model for MainType
I have more models to unbind when I don't need most of the data and would like to use recursive finds
Insert here anything you'd like...
Judging from your real-life example, I'd say you're on the wrong track. The queries won't be complicated: CakePHP uses additional queries for HABTM relations, so it would be just one additional query, which shouldn't be very costly; it's also very easy to sparse it out using the Containable behaviour. And if you really need to use recursive finds only (for whatever reason), then it's just one single additional model to unbind, which doesn't seem like overkill to me.
This might not be what you wanted to hear, but I really think a proper database solution is better than trying to hack in "virtual data". Also note that fixtures, as used in tests, only define data which is written to the database on the fly when running the test, so that would definitely be more costly than using data that already exists in the database.
Maybe you'll get a small performance boost for selects that don't query the main type when using an additional column to store the data, but you'll definitely lose all the flexibility that the RDBMS has to offer, including faster selects using proper indexing, affecting multiple records by updating a single related value, etc. That doesn't sound like a good trade-off to me. Think about it: how would you select all Downhill or Trekking bikes when this information is stored as a string in a single column? You would probably end up using ugly LIKE selects (see the comparison below).
Now wait, there's a SET data type in MySQL that can hold multiple values. Right, and it looks easier and less complex. Right, but in the background it isn't: while a complex-looking join query can be pretty fast with proper indexing, a query on the SET type will have to scan every single row, since the data stored in the column cannot be indexed appropriately to make more specific selects.
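To make that concrete, compare the two selects (assuming the conventional CakePHP tables bikes, main_types, and the bikes_main_types join table):

-- HABTM version: can use indexes on the join table and on main_types.name.
SELECT b.*
FROM   bikes b
JOIN   bikes_main_types bmt ON bmt.bike_id = b.id
JOIN   main_types mt        ON mt.id = bmt.main_type_id
WHERE  mt.name IN ('Downhill', 'Trekking');

-- Single-column (string or SET) version: full table scan plus fragile string matching.
SELECT *
FROM   bikes
WHERE  main_type LIKE '%Downhill%' OR main_type LIKE '%Trekking%';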
In the end it probably depends on your data, so I'd suggest testing both methods in your specific environment and see how they compare under workload.