Salesforce composite capable - salesforce

Can anyone please help me? Is the Salesforce Composite API capable of a rollback strategy? If I am updating 3 records on a Salesforce object and the last record fails,
will it be able to revert/roll back, so that I can reprocess the same records without duplicating them?

Related

How to mark particular records or ranges as deleted in Timestream

I have a stream processing application that constantly ingests data into AWS Timestream.
I am trying to come up with an approach for the case where a particular range of data has been processed incorrectly, so that I need to re-ingest it and mark the records that were already written as deleted.
What is the best approach to do that?
Thanks in advance.
You can use the Version field when re-ingesting the data.
When a Version is provided, Timestream overwrites (upserts) the existing record if the incoming record has a higher Version.
Ref: Timestream Developer Guide
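A minimal sketch of this with boto3 follows, for illustration only. The `device_id` dimension, the tuple shape of `points`, the use of the current epoch time in milliseconds as the new Version, and the batch size of 100 records per call are all assumptions for the example, not details from the question. Note that this overwrites the earlier values rather than marking them as deleted.

```python
import time


def build_records(points, version):
    """Build Timestream records that all carry the same Version.

    `points` is a list of (device_id, measure_name, value, time_ms) tuples;
    this shape and the "device_id" dimension are assumptions for the example.
    """
    return [
        {
            "Dimensions": [{"Name": "device_id", "Value": device_id}],
            "MeasureName": measure_name,
            "MeasureValue": str(value),
            "MeasureValueType": "DOUBLE",
            "Time": str(time_ms),
            "TimeUnit": "MILLISECONDS",
            # A higher Version than the stored one makes the write an upsert.
            "Version": version,
        }
        for device_id, measure_name, value, time_ms in points
    ]


def reingest(points, database, table, version=None, batch_size=100):
    """Rewrite a range of points with a higher Version (requires AWS access)."""
    import boto3  # imported lazily so build_records works without boto3

    if version is None:
        # Assumption: epoch milliseconds is higher than any Version used before.
        version = int(time.time() * 1000)
    client = boto3.client("timestream-write")
    records = build_records(points, version)
    for start in range(0, len(records), batch_size):
        client.write_records(
            DatabaseName=database,
            TableName=table,
            Records=records[start:start + batch_size],
        )
```

The corrected points must use the same dimensions, measure name, and timestamp as the originals for the upsert to replace them.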

How can I poll the Salesforce API to find records that meet criteria and have not been seen by my app before?

I am working on a Salesforce integration for a high-traffic app where we want to automate the process of importing records from Salesforce into our app. To be clear, I am not working from the Salesforce side (i.e. Apex), but rather using the Salesforce REST API from within the other app.
The first idea was to use a cutoff time based on when the record was created, advancing that cutoff on each poll to the creation time of the last applicant seen in the previous poll. We quickly realized this wouldn't work. There can be other filters in the query, such as a status field in Salesforce, where the record should only be imported after a certain status is set. This would make checking creation time or anything like it unreliable, since an older record could later become relevant to our auto-importing.
My next idea was to poll the Salesforce API to find records every few hours. In order to avoid importing the same record twice, the only way I could think to do this is by keeping track of the IDs we already attempted to import and using these to do a NOT IN condition:
SELECT #{columns} FROM #{sobject_name}
WHERE Id NOT IN #{ids_we_already_imported} AND #{other_filters}
My big concern at this point was whether or not Salesforce had a limitation on the length of the WHERE clause. Through some research I see there are actually several limitations:
https://developer.salesforce.com/docs/atlas.en-us.salesforce_app_limits_cheatsheet.meta/salesforce_app_limits_cheatsheet/salesforce_app_limits_platform_soslsoql.htm
The next thing I considered was querying for all of the IDs in Salesforce that meet the conditions of the other filters, without checking the ID itself. Then we could take that list of IDs and remove the ones we already tracked on our end, leaving a smaller IN condition we could use to fetch all of the data for the records we actually need.
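As a rough sketch of that two-step idea in Python (the function names and the chunk size of 200 are hypothetical, chosen only to keep each filter short):

```python
def ids_to_import(matching_ids, already_imported):
    """Return IDs that match the filters but were not imported yet, in order."""
    seen = set(already_imported)
    return [record_id for record_id in matching_ids if record_id not in seen]


def chunked_in_clauses(ids, chunk_size=200):
    """Yield `Id IN (...)` fragments, each covering at most chunk_size IDs."""
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        quoted = ", ".join("'{}'".format(record_id) for record_id in chunk)
        yield "Id IN ({})".format(quoted)
```

Each fragment would be combined with the column list in its own query, so no single WHERE clause grows with the total number of imported records.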
This still doesn't seem completely reliable though. I see a single query can only return 2000 rows and only have an offset up to 2000. If we already imported 2000 records the first query might not have any necessary rows we'd want to import, but we can't offset it to get the relevant rows because of these limitations.
With these limitations I can't figure out a reliable way to find the relevant records to import as the number of records we already imported grows. I feel like this would be common usage of a Salesforce integration, but I can't find anything on this. How can I do this without having to worry about issues when we reach a high volume?
Not sure what all of your requirements are or whether the solution needs to be generic, but you could do a few things:
Flag records that have been imported and modify your query to exclude flagged records. This means making a call back to Salesforce to update the records, but that can be bulkified to reduce the number of calls.
Reverse the way you get the data from pull to push: have Salesforce push records to your app whenever a record meets the criteria, using workflow and outbound messages.
Use the Streaming API to set up a PushTopic that your app can subscribe to, so it gets notified when a record meets the criteria.
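The first option could look roughly like the Python sketch below. `Imported__c` is a hypothetical custom checkbox field, and `get_json` is a hypothetical callable that performs an authenticated GET against the REST API and returns the parsed JSON body. The loop relies on the `done` and `nextRecordsUrl` fields of the REST query response to walk through large result sets instead of using OFFSET.

```python
def build_query(columns, sobject_name, other_filters, flag_field="Imported__c"):
    """Build a SOQL query that skips records already flagged as imported.

    `flag_field` is a hypothetical custom checkbox field on the object.
    """
    return "SELECT {} FROM {} WHERE {} = false AND {}".format(
        ", ".join(columns), sobject_name, flag_field, other_filters
    )


def fetch_all(get_json, query_path):
    """Collect every record of a query by following nextRecordsUrl.

    `get_json` is a hypothetical callable: path -> parsed JSON response.
    """
    records = []
    path = query_path
    while path:
        page = get_json(path)
        records.extend(page["records"])
        # When done is false, the response carries the path of the next batch.
        path = None if page.get("done", True) else page.get("nextRecordsUrl")
    return records
```

After a successful import, the app would set the flag on those records in a batched update, so the next poll no longer returns them.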

cloudant dashdb sync issue

We have created a warehouse with a Cloudant database as the source.
We initially ran the schema discovery process on about 40,000 records. Our Cloudant database contains around 2 million records.
The issue we are facing is that many records have ended up in the _OVERFLOW table in dashDB (meaning they were rejected) with an error like "[column does not exist in the discovered schema. Document has not been imported.]"
It seems to me the issue is that the Cloudant database, which is actually the result of a dbcopy, contains partials in the docs. Those partials are created internally by Cloudant, with values in the dd such as "40000000-5fffffff" that we can only know after the partials have been created. They were not discovered by the schema discovery process, and now all docs that have undiscovered partials are being rejected by the Cloudant-dashDB sync.
Does anyone have any idea how to resolve this?
The best option for you to resolve this is with a simple trick: feed the schema discovery algorithm exactly one document with the structure you want to create in your dashDB target.
If you can build such a "template" document ahead of time, have the algorithm discover that one and load it into dashDB. With the continuous replication from Cloudant to dashDB you can then have dbcopy load your actual documents into the database that serves as source for your cloudant-dashdb sync.
We had ran schema discovery process on near about 40,000 records initially.
Our database consist of around 2 millions records
Do all of these 2 million documents share the same schema? I believe not.
"[column does not exist in the discovered schema. Document has not been imported.]"
It means that during the initial scan of 40'000 records, the application didn't find any document with that field.
Let's say the sequence of documents in your Cloudant db is:
500'000 docs that match schema A
800'000 docs that match schema B
700'000 docs that match schema C
And your discovery process checked just the first 40'000. It never got to schemas B and C.
I would recommend re-running the discovery process over all 2 million records. It will take time, but it will guarantee that all fields are discovered.
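The effect can be illustrated with a toy model in Python. This is not how the Cloudant-dashDB discovery is implemented; it only assumes that discovery collects the field names seen in the scanned documents and that a document with an unknown field is rejected. The document counts are scaled down from the example above.

```python
from itertools import islice


def discover_fields(docs, sample_size=None):
    """Collect top-level field names from the first sample_size docs (all if None)."""
    fields = set()
    for doc in islice(docs, sample_size):
        fields.update(doc.keys())
    return fields


def rejected(docs, schema):
    """Return the docs that contain a field missing from the discovered schema."""
    return [doc for doc in docs if not set(doc) <= schema]


# Scaled-down stand-in for the sequence above: schema A, then B, then C.
docs = [{"a": 1}] * 5 + [{"a": 1, "b": 2}] * 8 + [{"a": 1, "c": 3}] * 7

partial_schema = discover_fields(docs, sample_size=4)  # only sees schema A
full_schema = discover_fields(docs)                    # scans every document

print(len(rejected(docs, partial_schema)))  # docs from schemas B and C
print(len(rejected(docs, full_schema)))     # nothing is rejected
```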

ADF Calendar performance leak

I am using JDeveloper 11.1.2.3.0
I have implemented af:calendar functionality in my application. My calendar is based on a ViewObject that queries a database table with a large number of records (500-1000). Running the select query against the database table is very fast, only a few ms. The problem is that the load time of my af:calendar is too long: it takes more than 5 seconds. Even if I just want to change the month or the calendar view, I have to wait approximately that long. I searched a lot online but found no explanation for this. Can anyone please explain why it takes so long? Has anyone else faced this issue?
PS: I have also tested with JDeveloper 12 and the problem is exactly the same.
You should look into the viewobject tuning properties to see how many records you fetch in a single network access, and do the same check for the executable that populates your calendar.
Also try using the HTTP Analyzer to see what network traffic is going on and the ADF Logger to check what SQL is being sent to the DB.
https://blogs.oracle.com/shay/entry/monitoring_adf_pages_round_trips

Using Task Queues in GAE to insert bulk data

I am using Google App Engine to create a web application. The app has an entity whose records will be inserted by the user through an upload facility. The user may select up to 5K rows (objects) of data. I am using the DataNucleus project as the JDO implementation. Here is the approach I am taking for inserting the data into the datastore:
Data is read from the CSV and converted to entity objects and stored in a list.
The list is divided into smaller groups of objects say around 300/group.
Each group is serialized and stored in cache using memcache using a unique id as the key.
For each group, a task is created and inserted into the Queue along with the key. Each task calls a servlet which takes this key as the input parameter, reads the data from memory and inserts this to the data store and deletes the data from memory.
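The steps above can be sketched in a language-neutral way as follows (Python here; the application itself is Java with JDO). A plain dict stands in for memcache, a list for the task queue, and another list for the datastore, so all three are simplifying assumptions. Returning the number of persisted objects from the worker makes it easy to compare against the expected group size.

```python
import pickle
import uuid


def split_into_groups(entities, group_size=300):
    """Divide the entity list into consecutive groups of at most group_size."""
    return [entities[i:i + group_size] for i in range(0, len(entities), group_size)]


def stage_groups(entities, cache, queue, group_size=300):
    """Serialize each group into the cache under a unique key and enqueue the key."""
    for group in split_into_groups(entities, group_size):
        key = uuid.uuid4().hex
        cache[key] = pickle.dumps(group)
        queue.append(key)


def run_task(key, cache, datastore):
    """Worker step: load one group, persist it, then remove it from the cache."""
    group = pickle.loads(cache[key])
    datastore.extend(group)
    del cache[key]
    return len(group)  # number of objects persisted by this task
```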
The queue has a maximum rate of 2/min and a bucket size of 1. The problem I am facing is that a task is not able to insert all 300 records into the datastore. Out of 300, the most that get inserted is around 50. I have validated the data once it is read from memcache, and I am able to get all of the stored data back from memory. I am using the makePersistent method of the PersistenceManager to save the data to the datastore. Can someone please tell me what the issue could be?
I would also like to know whether there is a better way of handling bulk insert/update of records. I have used the BulkInsert tool, but in cases like these it will not satisfy the requirement.
This is a perfect use-case for App Engine mapreduce. Mapreduce can read lines of text from a blob as input, and it will shard your input for you and execute it on the taskqueue.
When you say that the bulkloader "will not satisfy the requirement", it would help if you say what requirement you have that it doesn't satisfy, though - I presume in this case, the issue is that you need non-admin users to upload data.

Resources