Salesforce Bulk API Lock Issue?

I'm trying to mass-update child__c records linked to a parent__c object using the Salesforce Bulk API. I will not modify any relationships, so in my opinion there is no need to lock the parent__c object.
I do not know the parent__c.id, so I cannot order the records by the parent key to avoid "object already locked" errors.
Will Salesforce allow me to do the mass update, or how am I supposed to upload the data efficiently?

Related

Azure Search Hard Delete Policy for Indexer

It has been almost three years since feedback was submitted asking for a Hard-Delete policy for Azure Search indexers: https://feedback.azure.com/forums/263029-azure-search/suggestions/33939013-adding-hard-delete-policy-to-all-of-indexer Has there been any progress on this? I am sure this is something a lot of users will ask for. If we can't have this feature anytime soon, and if we can't implement soft delete in our applications, are there any alternatives/ideas we can try?
There's currently no hard-delete policy for indexers. For blob indexers, you may be able to use the native blob soft-delete policy, which requires enabling "Blob soft delete" on the storage account, so you don't have to manage the soft-delete metadata yourself.
For other data source types, one alternative is to use the REST API to remove documents from the index directly when they are removed from the data source, and keep the two in sync yourself.

How to get a list of files names from Snowflake external S3 stage?

I am looking for the best way to automatically detect new files in an S3 bucket and then load the data into a Snowflake table.
I know this can be achieved using Snowpipe with SNS/SQS notifications set up in AWS, but I would like a self-contained solution within Snowflake that can be used for multiple data sources.
I want to have a table which is updated with the file names from an S3 bucket, and then to load the files which have not already been loaded from S3 into Snowflake.
The only way I have found to automatically detect new files from an external S3 stage in Snowflake so far is to use the code below and a task on a set schedule. This lists the file names and then uses result_scan to display the last query as a table.
list @STAGE_NAME;
set qid=last_query_id();
select "name" from table(result_scan($qid));
Does anyone know a better way to automatically detect new files in an external stage from Snowflake? Any help is much appreciated.
Not necessarily better than the way you've already found, but there is an alternative approach to listing the files in an S3 bucket.
If you create an EXTERNAL TABLE over the data in S3, you can then use the METADATA$FILENAME property in a query. If you have a record of which files have already been loaded into Snowflake then you can compare and select the names of the new files and process them.
e.g.
ALTER EXTERNAL TABLE MYSCHEMA.MYEXTERNALTABLE REFRESH;
SELECT DISTINCT
    METADATA$FILENAME AS filename
FROM
    MYSCHEMA.MYEXTERNALTABLE;
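To pick up only the files that have not already been loaded, you could extend that query by filtering against a bookkeeping table that you maintain yourself. A minimal sketch, assuming a hypothetical MYSCHEMA.LOADED_FILES table with a FILENAME column that you append to after each successful load:
-- Stage files that are not yet recorded as loaded
SELECT DISTINCT
    METADATA$FILENAME AS filename
FROM MYSCHEMA.MYEXTERNALTABLE
WHERE METADATA$FILENAME NOT IN (SELECT FILENAME FROM MYSCHEMA.LOADED_FILES);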
Short Run:
Your approach
You've already found a viable solution, and your concern about the reliability of the last query id function is understandable. Procedures' sessions are isolated and so the last_query_id() function will be isolated to only the statements executed within that procedure. It might be unnecessary to use a procedure, but I personally like that they let you create reusable abstractions.
Another approach
An alternative, if you don't like the approach you're using, would be to create a single table with one VARIANT data column plus the stage metadata columns, maintained by a single giant pipe, and then maintain a set of materialized views over that table which filter, convert variant fields to columns, and sanitize as appropriate (see the sketch after the list below).
There are some benefits:
simpler: integrating new prefixes for a stage requires only an additional materialized view, not an additional pipe + task
more control: you'd be able to operate directly and automatically on the data in raw form, rather than needing to load into a table and then check it. This means you can perform data quality checks, metadata checks, and sanitization.
maintainable: the use of materialized views over an immutable source means you can at any time change the logic and perform a full backfill with little effort.
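A minimal sketch of what that shape could look like; all object names here are assumptions, and the stage and pipe definitions that feed the landing table are omitted:
-- Single landing table fed by one pipe: raw payload plus stage metadata
CREATE TABLE MYSCHEMA.RAW_LANDING (
    RAW          VARIANT,
    SOURCE_FILE  VARCHAR,                                  -- populated from METADATA$FILENAME in the COPY
    LOADED_AT    TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);

-- One materialized view per logical data source / prefix
CREATE MATERIALIZED VIEW MYSCHEMA.ORDERS_MV AS
SELECT
    RAW:order_id::NUMBER    AS order_id,
    RAW:customer_id::NUMBER AS customer_id,
    RAW:amount::FLOAT       AS amount,
    SOURCE_FILE
FROM MYSCHEMA.RAW_LANDING
WHERE SOURCE_FILE LIKE 'orders/%';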
Long Run:
Notification integrations enable Snowflake to listen (and possibly notify in the future, roadmap-gods willing) to external messaging systems. At the moment only Azure is supported, so it won't work for your case, but keep an eye out over the next few months -- I think it's safe to speculate that this feature will grow to support AWS, and a more direct and concise way to implement your original solution will eventually become available.

Database concurrency - handling out of sync objects

I am using a DAL service to retrieve data from the database.
Let's consider the simplest case, where I retrieve one object from the database.
After retrieving that object I make some changes to its properties according to some business logic,
and then I want to update the object in the persistent database.
However, some other client (maybe even one I am not aware exists) has changed the state of the underlying object in the database, and I only notice this when I try to update.
What should I do in this case?
Should I throw an exception?
Should I try to update only the fields that I changed?
Should I lock that table for writing while I am performing business logic based on the persistent data?
Guy
I think what you should do depends on what you are trying to achieve.
Your main options as I see it:
lock beforehand - main pros & cons: you occupy the database until you commit, but it is much simpler.
don't lock beforehand and merge in case someone else updated it - main disadvantage: merging can be very complex.
I would go with the first one, but I would try to minimize the locking time (i.e. I would figure out all the changes I want to make prior to locking the object); see the sketch below.
Anyway, I don't think this is an exceptional case, so I wouldn't go with throwing an exception.
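A minimal sketch of the first option in plain SQL; the table and column names are made up, and the exact row-locking syntax varies by database (this uses the common SELECT ... FOR UPDATE form):
BEGIN;  -- start a transaction (START TRANSACTION in some databases)

-- Lock the row so no other client can change it until we commit
SELECT id, status, price
FROM orders
WHERE id = 42
FOR UPDATE;

-- ... apply the business logic to the values read above ...

UPDATE orders
SET status = 'approved', price = 99.00
WHERE id = 42;

COMMIT;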
This is very subjective and it depends on what exactly you are trying to do.
Should I throw an exception?
You should if you are not expecting the update by another user. For instance, if your software is trying to book a seat which has already been booked by somebody else, you would throw, say, a SeatAlreadyBookedException and handle it appropriately by logging or showing a proper message.
Should I try to update only the fields that I changed?
You can do that if you have not used the existing state to make the update, or if you want your changes to be the final ones, overriding any changes already made by other users. For instance, if you want to set a new deadline for a project.
Should I lock that table for writing while I am performing business logic based on the persistent data?
Locking a table will affect the overall throughput. Your application logic should take care of these transactions and maintain data integrity.
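If you prefer not to lock at all, a common middle ground (and one way to notice that another user already changed the row, as in the seat-booking example) is optimistic concurrency with a version column. A minimal sketch with made-up names:
-- Read the row together with its current version
SELECT id, seat_no, booked_by, version
FROM seats
WHERE id = 17;

-- Later, update only if the version is still the one we read (3 here).
-- If zero rows are affected, someone else changed the row first, and the
-- application can retry, merge, or raise something like SeatAlreadyBookedException.
UPDATE seats
SET booked_by = 'guy', version = version + 1
WHERE id = 17
  AND version = 3;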

Standard practice/API for sharing database data without giving direct database access

We would like to give some of our customers the option to read data from our central database. The data is live and new records are being added every few seconds. Our database is MySQL running on Amazon RDS.
I was wondering what is the common practice for doing so.
One option would be to give them SELECT rights on specific tables, but in that case they would be able to access other customers' data as well.
I have tried searching with database, interface, and API keywords, and some others, but I couldn't find a good answer.
Thanks!
Expose specific tables through a REST API that performs the CRUD operations. You can control access on it too.
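If direct database access ends up being granted anyway, the usual way in MySQL to keep a customer from seeing other customers' rows is to grant SELECT on a per-customer view rather than on the base table. A minimal sketch with made-up names (this is separate from the REST option above):
-- A view exposing only one customer's rows
CREATE VIEW mydb.readings_customer_42 AS
SELECT id, recorded_at, value
FROM mydb.readings
WHERE customer_id = 42;

-- A dedicated account for that customer, limited to the view
CREATE USER 'customer42'@'%' IDENTIFIED BY 'replace-me';
GRANT SELECT ON mydb.readings_customer_42 TO 'customer42'@'%';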

Proper change-logging impossible with Entity Framework?

I'd like to log all changes made to an SQL Azure database using Entity Framework 4.
However, I have failed to find a proper solution so far.
So far I can track the entities themselves by overriding SaveChanges() and using the ObjectStateManager to retrieve all added, modified and deleted entities. This works fine. Unfortunately, I don't seem to be able to retrieve any useful information out of RelationshipEntries. Our database model has some many-to-many relationships, where I want to log new / modified / deleted entries too.
I want to store all changes in an Azure Storage, to be able to follow changes made by a user and perhaps roll back to a previous version of an entity.
Is there any good way to accomplish this?
Edit:
Our scenario is that we're hosting a RESTful WebService that contains all the business logic and stores the data in the Azure SQL database. A client must be authenticated as a user with the WebService, and I need to store information about which user changed the data.
See FrameLog, an Entity Framework logging library that I wrote for this purpose. It is open-source, including for commercial use.
Even if you don't want to use the library, you can look at the code to see one way of handling logging relationships. It handles all relationship multiplicities.
In particular, see the code for the private methods logRelationshipChange and logForeignKeyChange in the ChangeLogger class.
You can do it with a tracing provider.
You may want to consider just using a database trigger for this. Whenever a value in a table is changed, copy the row to another Archive table. It has worked pretty well for me.
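A minimal sketch of that trigger idea in T-SQL (the dialect SQL Azure uses); all table and column names here are made up, and note that a trigger only sees the database login, not the authenticated WebService user:
-- Archive table holding a copy of each changed row
CREATE TABLE dbo.CustomerArchive (
    CustomerId  INT,
    Name        NVARCHAR(200),
    Email       NVARCHAR(200),
    ChangedAt   DATETIME2     DEFAULT SYSUTCDATETIME(),
    ChangedBy   NVARCHAR(128) DEFAULT SUSER_SNAME()  -- database login, not the application user
);
GO

-- Copy the previous state of every updated or deleted row into the archive
CREATE TRIGGER dbo.trg_Customer_Audit
ON dbo.Customer
AFTER UPDATE, DELETE
AS
BEGIN
    INSERT INTO dbo.CustomerArchive (CustomerId, Name, Email)
    SELECT CustomerId, Name, Email
    FROM deleted;
END;
GO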
