dbt Snapshot Failing (ERROR: 100090 (42P18): Duplicate row detected during DML action)

So we have a table built by dim_merchant.sql and a snapshot of this table called dim_merchant_snapshot:
{% snapshot dim_merchant_snapshot %}
{{
    config(
      target_schema='snapshots',
      unique_key='id',
      strategy='check',
      check_cols='all'
    )
}}
select * from {{ ref('dim_merchant') }}
{% endsnapshot %}
We never had any trouble with it, but since yesterday the snapshot run has been failing with the following error message:
Database Error in snapshot dim_merchant_snapshot (snapshots/dim_merchant_snapshot.sql)
100090 (42P18): Duplicate row detected during DML action
The error is happening during this step of the snapshot:
On snapshot.analytics.dim_merchant_snapshot: merge into "X"."SNAPSHOTS"."DIM_MERCHANT_SNAPSHOT" as DBT_INTERNAL_DEST
using "X"."SNAPSHOTS"."DIM_MERCHANT_SNAPSHOT__dbt_tmp" as DBT_INTERNAL_SOURCE
on DBT_INTERNAL_SOURCE.dbt_scd_id = DBT_INTERNAL_DEST.dbt_scd_id
when matched
and DBT_INTERNAL_DEST.dbt_valid_to is null
and DBT_INTERNAL_SOURCE.dbt_change_type in ('update', 'delete')
then update
set dbt_valid_to = DBT_INTERNAL_SOURCE.dbt_valid_to
when not matched
and DBT_INTERNAL_SOURCE.dbt_change_type = 'insert'
We realized that some values have been inserted and updated twice in the snapshot (since yesterday), and that is what causes the snapshot to fail, but we are not sure why.
Note that the id key on dim_merchant is tested for uniqueness and has no duplicates. Meanwhile, the snapshot table contains duplicates after our first snapshot run (which does not fail), but subsequent runs against the snapshot table containing those duplicates do fail.
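To illustrate, the duplicates show up with a query along these lines (dbt_scd_id is the surrogate key dbt generates for every snapshot row, so it should never repeat):
select dbt_scd_id, count(*) as row_count
from snapshots.dim_merchant_snapshot
group by dbt_scd_id
having count(*) > 1;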
We just recently updated dbt from 0.20.0 to 1.0.3, but we didn't find any change in the snapshot definition between these versions.
SETUP:
dbt-core==1.0.3,
dbt-snowflake==1.0.0,
dbt-extractor==0.4.0,
Snowflake version: 6.7.1
Thanks!

I know it's been a while since this was posted, but I wanted to report that I'm seeing this weird behavior as well. I have a table that I'm snapshotting with the timestamp strategy, and the unique_key is made up of several columns. The intention is that this is a full snapshot each time the model is run. The table being snapshotted has all unique rows, meaning dbt_scd_id should be a unique key. I resolved the issue by adding the updated_at column to the unique_key config. In theory this shouldn't matter, since dbt_scd_id is already derived from the unique_key and updated_at. Regardless, it has resolved the issue.
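For anyone wanting to try the same workaround, here is a minimal sketch of the config change; the snapshot name and columns are hypothetical placeholders, not taken from the post above:
{% snapshot my_table_snapshot %}

{{
    config(
      target_schema='snapshots',
      strategy='timestamp',
      updated_at='updated_at',
      unique_key="account_id || '-' || region_id || '-' || updated_at"
    )
}}

-- my_table, account_id, and region_id are hypothetical placeholders
select * from {{ ref('my_table') }}

{% endsnapshot %}
dbt hashes the unique_key together with updated_at into dbt_scd_id, so widening the key this way changes which rows the merge treats as the same record.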

Related

Error in Salesforce flow: INSUFFICIENT_ACCESS_ON_CROSS_REFERENCE_ENTITY

Error detected during debug:
Create one OpportunityLineItem record where
OpportunityId = {!$Record.ConvertedOpportunityId} (null)
Product2Id = {!get_related_Product.Id} (01t5j000000I80nAAC)
Quantity = {!DefaulQuantityOfProduct} (1)
TotalPrice = {!TotalPriceDefaultForNewOpp} ($637.54)
Result
Info
Failed to create record.
Info
Error Occurred:
This error occurred when the flow tried to create records: INSUFFICIENT_ACCESS_ON_CROSS_REFERENCE_ENTITY: insufficient access rights on cross-reference id. You can look up ExceptionCode values in the SOAP API Developer Guide.
Manually, or using Apex, I can add a record. The fields that I set (OpportunityId, Quantity, TotalPrice, Product2Id) are not read-only. I need a detailed answer. Am I missing some permission?
1. Flow schema
2. Trigger condition
3. Get related record
4. Create new record
OpportunityId = {!$Record.ConvertedOpportunityId} (null)
This looks like the opportunity doesn't exist yet (any errors when it's created? Do you always create one, or only when a certain condition is met? Did that step fail?). Or it exists but has only just been inserted, and Salesforce hasn't managed to update the Lead with the Id yet. Maybe add entry criteria that check whether the field is populated.
Lead conversion is a bit of a nasty topic, see https://salesforce.stackexchange.com/questions/3956/lead-conversion-trigger-order-of-execution
Actually, you might be better off on Salesforce Stack Exchange; there are more admins over there, and flows are a no-code solution, so not a great fit for Stack Overflow.

NO_SQL_DATA error when deleting a record, firedac, delphi 10.3.1

Using SQL Server, Delphi 10.3.1, and FireDAC.
I am using cached updates, with autocommit on.
I keep managing to get my data into a state where the record has been deleted from the database, and I have also deleted that record in the dataset.
Then, when it attempts to commit the change to the database (where the data no longer exists), I get an error:
[my application] raised exception class EMSSQLNativeException with message [FireDAC][Phys][ODBC][sqlncli11.dll] SQL_NO_DATA
And then I can't clear the cached-updates flag on the dataset, because there is stuff 'sitting' there.
My question: how can I get it to NOT return that error? Because it's really not an error; it's trying to delete a record that no longer exists. I am not finding ANY documentation on the update options on a query, so is there a flag there I need to set?
You can handle update errors in OnUpdateError and perform any additional checks before deciding how to proceed. Blindly pretending all deletes worked would be something like:
procedure TForm1.FDQuery1UpdateError(ASender: TDataSet; AException: EFDException;
  ARow: TFDDatSRow; ARequest: TFDUpdateRequest; var AAction: TFDErrorAction);
begin
  // treat any failed delete as applied; the row is already gone from the database
  if ARequest = arDelete then
    AAction := eaApplied;
end;
Read the online help for OnUpdateError for more information.

Apex Trigger in Salesforce using 2 objects

I am new to Salesforce. The project requires that I keep track of the last_id used.
I created two SF objects: one holds the last_id, the other holds the total number of ids to assign. I would like the user to enter a number, which will be added to the last_id, and the result will be stored in the tracking object.
Code:
tracking_next_id__c[] btnext = [SELECT last_end_id__c FROM tracking_next_id__c];
for (tracking__c updatedAccount : Trigger.new) {
    updatedAccount.next_id__c = btnext[0].last_end_id__c + updatedAccount.total_account__c;
}
When I run the trigger, I get this error:
Invalid Data.
Review all error messages below to correct your data.
Apex trigger getNextId caused an unexpected exception, contact your administrator: getNextId: execution of AfterUpdate caused by: System.FinalException: Record is read-only: Trigger.getNextId: line 11, column 1
After much research, I found that if I changed my trigger to before update instead of after update, it worked.
Right, you cannot edit the records in Trigger.new in an after trigger; they are read-only at that point. In a before trigger, field changes on Trigger.new are saved automatically without an extra DML call.

Solr deletions with custom full import

I'm trying to use the DataImportHandler to keep my index in sync with a SQL database (what I would think is a pretty vanilla thing to do). Since my database will be pretty large, I want to use incremental imports using this method http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport so the calls are of the form http://localhost:8983/solr/Items/dataimport?command=full-import&clean=false. This works perfectly well for adding items.
I have a separate DeletedItems table in my database which contains the primary keys of the items that have been deleted from the Items table, along with when they were deleted. As part of the DataImport call, I had hoped to be able to delete the relevant items from my index based on a query along the lines of
SELECT Id FROM DeletedItems WHERE WasDeletedOn > '${dataimporter.last_index_time}'
but I can't figure out how to do this. The link above alludes to it with the cryptic
In this case it means obviously that in case you also want to use deletedPkQuery then when running the delta-import command is still necessary.
but setting deletedPkQuery to the above SQL query doesn't seem to work. I then read that deletedPkQuery only works with delta-imports, so I am forced to make two requests to my solr server as part of the sync process? This doesn't seem right as the operations are parameterized by the dataimporter.last_index_time property, which changes. Both steps would need to be done in one "atomic" action, surely? Any ideas?
You must use the import handler special commands
https://wiki.apache.org/solr/DataImportHandler#Special_Commands
With these commands you can alter the boost of, or delete, a document coming from the recordset of the full-import query. Be aware that you must use the $skipDoc field so the document does not get indexed again, and that you must repeat the id in the $deleteDocById field.
You can use a union query:
select
    id,
    text,
    'false' as [$deleteDocById],
    'false' as [$skipDoc]
from [rows to update or add]
union
select
    id,
    '' as text,
    id as [$deleteDocById],
    'true' as [$skipDoc]
from [rows to delete]  -- placeholder for the deleted-rows source, e.g. your DeletedItems table
or a CASE expression:
select
    id,
    text,
    case when deleted = 1 then id else 'false' end as [$deleteDocById],
    case when deleted = 1 then 'true' else 'false' end as [$skipDoc]
from [rows with a deleted flag]  -- placeholder source exposing a deleted marker column
where updated > '${dih.last_index_time}'
The deletedPkQuery is run as part of the regular call to delta-import, so you don't have to run anything twice (and when doing a full-import there's no need to run deletedPkQuery, since the whole index is cleared before importing anyway).
The deletedPkQuery should be configured on the same <entity> element as your main query. Be sure to match the field names exactly as well, and make sure the id produced by your deletedPkQuery matches the one provided by the main query.
There's a minimal example on solr.pl for importing and deleting documents using the same deleted-entries table structure as you have here:
<entity
name="album"
query="SELECT * from albums"
deletedPkQuery="SELECT deleted_id as id FROM deletes WHERE deleted_at > '${dataimporter.last_index_time}'"
>
Also make sure that the format of the deleted_at-field is comparable against the value produced by last_index_time. The default is yyyy-MM-dd HH:mm:ss.
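If the types don't line up, one hedged option (assuming SQL Server and a DATETIME column, reusing the table and column names from the question) is to convert the property explicitly in the deletedPkQuery SQL:
-- sketch only: style 120 parses the 'yyyy-MM-dd HH:mm:ss' string that last_index_time produces
SELECT Id
FROM DeletedItems
WHERE WasDeletedOn > CONVERT(DATETIME, '${dataimporter.last_index_time}', 120)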
.. and lastly, remember that the last_index_time property isn't available before the second time the task is run, since there's no "previous index time" the first time an index is being populated (but the deletedPkQuery shouldn't run before that anyway).

CFINDEX throwing attribute validation error exception

I am upgrading to ColdFusion 11 from ColdFusion 8, so I need to rebuild my search indices to work with Solr instead of Verity. I cannot find any reliable way to import my old Verity collections, so I'm attempting to build the new indices from scratch. I am using the following code to index some items along with their corresponding documents, which are located on the server:
<cfsetting requesttimeout="3600" />
<cfquery name="qDocuments" datasource="#APPLICATION.DataSource#">
    SELECT DISTINCT
        ID,
        Status,
        'C:\Documents\' CONCAT ID CONCAT '.PDF' AS File
    FROM tblDocuments
</cfquery>
<cfindex
    query="qDocuments"
    collection="solrdocuments"
    action="fullimport"
    type="file"
    key="document_file"
    custom1="ID"
    custom2="Status" />
A very similar setup was used with Verity for years without a problem.
When I run the above code, I get the following exception:
Attribute validation error for CFINDEX.
The value of the FULLIMPORT attribute is invalid.
Valid values are: UPDATE, DELETE, PURGE, REFRESH, FULL-IMPORT,
DELTA-IMPORT,STATUS, ABORT.
This makes absolutely no sense, since there is no "FULLIMPORT" attribute for CFINDEX.
I am running ColdFusion 11 Update 3 with Java 1.8.0_25 on Windows Server 2008R2/IIS7.5.
You should believe the error message. Try this:
<cfindex
    query="qDocuments"
    collection="solrdocuments"
    action="FULL-IMPORT"
    type="file"
    key="document_file"
    custom1="ID"
    custom2="Status" />
It's referring to the value of the attribute action.
This is definitely a bug. In the ColdFusion documentation, fullimport is not an attribute of cfindex.
I know this is an old thread, but in case anyone else has the same question: it's just a poor description in the documentation. The action "fullimport" is only available when using type="dih" (i.e. the Data Import Handler). When using the query attribute, use action="refresh" instead.
Source: CFIndex Documentation:
...
When type="dih", these actions are used:
abort: Aborts an ongoing indexing task.
deltaimport: For partial indexing. For instance, for any updates in the database, instead of a full import, you can perform a delta import to update your collection.
fullimport: To index the full database. For instance, when you index the database for the first time.
status: Provides the status of indexing, such as the total number of documents processed and status such as idle or running.
