DBT: Invalid incremental strategy provided: append on Snowflake?

I am getting the following error
Invalid incremental strategy provided: append
Expected one of: 'merge', 'delete+insert'
with:
{{
config(
materialized='incremental'
, incremental_strategy='append'
)
}}
Please advise

{{
config(
materialized='incremental',
unique_key='uniqueCol'
)
}}
Use incremental with unique_key instead. As the name suggests, the unique_key column must contain unique values.

Related

How to use jinja template with mix of params and airflow inbuilts

I am trying to create a SQL template for Snowflake that loads an S3 file using SnowflakeOperator, where the S3 file name is provided as an XCom value from an upstream task.
Here is an example template for SQL
create or replace temp table {{ params.raw_target_table }}_tmp
as
select *
from '@{{ params.stage_name }}/{{ params.get_s3_file }}'
file_format => '{{ params.file_format }}'
;
params.get_s3_file is set to use ti like {{{{ti.xcom_pull(task_ids="foo", key="file_uploaded_to_s3")}}}}
I understand that this works when the ti.xcom_pull call is written directly in the template rather than coming from params, but I want it to be configurable so I can reuse it across multiple DAGs/tasks.
Ideally I want this to work
create or replace temp table {{ params.raw_target_table }}_tmp
as
select *
from '@{{ params.stage_name }}/{{ti.xcom_pull(task_ids="{{params.previous_task}}", key="file_uploaded_to_s3")}}'
file_format => '{{ params.file_format }}'. --note the nested structure
;
So it resolves params.previous_task first and then gets the XCom value. I'm not sure how to instruct it to do that.
When you write {{ <some code> }}, Jinja evaluates that code at runtime, so what is inside the braces is ordinary Python code, not a nested template.
{{ti.xcom_pull(task_ids="{{params.previous_task}}", key="file_uploaded_to_s3")}} will try to pull the XCom with key file_uploaded_to_s3 from a task literally named {{params.previous_task}}, which doesn't exist. Instead of passing a string as task_ids, pass params.previous_task directly, and Jinja will substitute the value of previous_task from the params dict:
create or replace temp table {{ params.raw_target_table }}_tmp
as
select *
from '@{{ params.stage_name }}/{{ti.xcom_pull(task_ids=params.previous_task, key="file_uploaded_to_s3")}}'
file_format => '{{ params.file_format }}'
;
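The difference between the two forms can be reproduced outside Airflow with the jinja2 package alone. This is a minimal sketch, not Airflow's own rendering code: FakeTaskInstance is a hypothetical stand-in for ti that simply echoes its arguments, and the task name "foo" is taken from the question.

```python
from jinja2 import Template


class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's `ti`; it only echoes what it is asked for."""

    def xcom_pull(self, task_ids, key):
        # Return the arguments so the rendered text shows which task was looked up.
        return f"<xcom of {task_ids}:{key}>"


context = {"ti": FakeTaskInstance(), "params": {"previous_task": "foo"}}

# Braces inside a string literal are just characters; they are not rendered again.
nested = Template(
    '{{ ti.xcom_pull(task_ids="{{params.previous_task}}", key="file_uploaded_to_s3") }}'
).render(**context)

# Passing the variable itself lets Jinja resolve it before the call is made.
direct = Template(
    '{{ ti.xcom_pull(task_ids=params.previous_task, key="file_uploaded_to_s3") }}'
).render(**context)

print(nested)
print(direct)
```

In the first render the lookup is made against the literal text {{params.previous_task}}; in the second it is made against foo.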

dbt query to Snowflake resulting in an "invalid identifier" error for a column that exists

I've been pulling my hair out for several hours trying to understand what's going on, to no avail so far.
I've got this query on dbt:
{{
config(
materialized='incremental',
unique_key='event_ID'
)
}}
SELECT
{{ dbt_utils.star(from=ref('staging_pg_ahoy_events'), relation_alias='events', prefix='event_') }},
{{ dbt_utils.star(from=ref('staging_pg_ahoy_visits'), relation_alias='visits', prefix='visit_') }}
FROM
{{ ref('staging_pg_ahoy_events') }} AS events
LEFT JOIN {{ ref('staging_pg_ahoy_visits') }} AS visits ON events.visit_id = visits.id
{% if is_incremental() %}
WHERE "events"."event_ID" >= (SELECT max("events"."event_ID") FROM {{ this }})
{% endif %}
Along with this config:
version: 2
models:
  - name: facts_ahoy_events
    columns:
      - name: event_ID
        quote: true
        tests:
          - unique
          - not_null
dbt run -m facts_ahoy_events --full-refresh runs successfully, however when I try an incremental backup by dropping the --full-refresh flag, the following error ensues:
10:35:51 1 of 1 START incremental model DBT_PCOISNE.facts_ahoy_events.................... [RUN]
10:35:52 1 of 1 ERROR creating incremental model DBT_PCOISNE.facts_ahoy_events........... [ERROR in 0.88s]
10:35:52
10:35:52 Finished running 1 incremental model in 3.01s.
10:35:52
10:35:52 Completed with 1 error and 0 warnings:
10:35:52
10:35:52 Database Error in model facts_ahoy_events (models/marts/facts/facts_ahoy_events.sql)
10:35:52 000904 (42000): SQL compilation error: error line 41 at position 10
10:35:52 invalid identifier '"events"."event_ID"'
I've gotten used to the case-sensitive column names on Snowflake, but I can't for the life of me figure out what's going on, since the following query run directly on Snowflake, completes:
select "event_ID" from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;
Whereas this one expectedly fails:
select event_ID from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;
I think I've tried every combination of upper, lower, and mixed casing, each with and without quoting, but all my attempts have failed.
Any help or insight would be greatly appreciated!
Thank you
Most probably your column event_ID was created with "" around it, which makes it a case-sensitive quoted identifier. Referencing it then also requires "", because Snowflake folds every unquoted column name to upper case.
The solution is either to put "" around the column name wherever you use it, or to rename it to an unquoted name using an ALTER so that it is matched case-insensitively.
For DBT you can read more here
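
The folding rule described in this answer can be modelled in a few lines of Python. This is only a toy sketch of the behaviour, not Snowflake's parser, and resolve_identifier is a hypothetical helper name. It also suggests that a quoted alias such as "events" may not match an alias declared without quotes, which could be part of the reported error.

```python
def resolve_identifier(token: str) -> str:
    """Return the name that would be looked up for one identifier token."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Quoted identifier: case is preserved exactly ("" unescapes to ").
        return token[1:-1].replace('""', '"')
    # Unquoted identifier: folded to upper case.
    return token.upper()


# Names as stored: the column was created with quotes, the alias without.
stored_column = resolve_identifier('"event_ID"')
stored_alias = resolve_identifier("events")

for reference in ['event_ID', '"event_ID"', '"EVENT_ID"']:
    matches = resolve_identifier(reference) == stored_column
    print(f"column reference {reference}: {'matches' if matches else 'no match'}")

for reference in ['events', '"events"', '"EVENTS"']:
    matches = resolve_identifier(reference) == stored_alias
    print(f"alias reference {reference}: {'matches' if matches else 'no match'}")
```

Only the quoted, exact-case reference finds the column, while the alias declared as AS events is found by events or "EVENTS" but not by "events".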

Using variable arrays in models

Is it possible to define an array in the vars section and use it inside the SQL syntax of a model?
Something like this
dbt_project.yml:
vars:
  active_country_codes: ['it','ge']
model.sql:
SELECT ...
FROM TABLE WHERE country_code IN ('{{ var("active_country_codes") }}')
I've tried with a single value, i.e. ['it'], and it works, but if I add another it starts failing.
I am using the SQL Server Data connector.
The query you are writing is correct. You just need to pass the variable as a single string in which the separating comma and its surrounding quotes are part of the string.
vars:
  active_country_codes: 'it'',''ge'
You can do something like this:
SELECT ...
FROM TABLE WHERE country_code IN ('{{ var("active_country_codes") }}')
And it will create query for you like this:
SELECT ...
FROM TABLE WHERE country_code IN ('it','ge')
I have tested this and it's working fine. I'm using Bigquery Connection but it shouldn't matter as it's dbt generation.
My educated guess is that {{ var("active_country_codes") }} inserts a comma-separated string. In that case, you'll need a string-splitting function. If you have SQL Server 2016 or later you can use string_split; otherwise you will have to roll your own if you haven't already. Below is code using string_split. I use the exists approach rather than in for performance reasons.
select ...
from table t
where exists (
select 0
from string_split('{{ var("active_country_codes") }}', ',') ss
where t.country_code = ss.value
)
I would use:
vars:
  var_name: "'one','two','three'"
where field_name in ({{ var("var_name") }})
Looks a little bit clearer than:
active_country_codes: 'it'',''ge'
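
The string-building behind these answers can be sketched in plain Python. Jinja prints a list using its Python string form, which is why interpolating the array directly does not yield a valid IN list. The helper name sql_in_list is hypothetical, and the quoting here is a simple illustration rather than a complete SQL escaping routine.

```python
def sql_in_list(values):
    """Quote each value and join with commas for use inside IN (...)."""
    # Double any embedded single quote so each literal stays valid SQL.
    return ", ".join("'" + str(v).replace("'", "''") + "'" for v in values)


codes = ["it", "ge"]

# Interpolating the list itself produces its Python repr, brackets included.
naive = f"country_code IN ('{codes}')"

# Quoting and joining each element produces a usable IN list.
joined = f"country_code IN ({sql_in_list(codes)})"

print(naive)
print(joined)
```

Inside a dbt model, a comparable result can likely be had by keeping the var as a real list and applying Jinja's join filter, for example IN ('{{ var("active_country_codes") | join("', '") }}'), though that variant is untested here.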

Get a Sql Server error on Order By - Symfony2

I'm using Symfony with SQL Server and, from what I've read, the database connection does not seem to be stable.
As soon as I use the orderBy method I get an error.
Here's an example:
$qStores =
$this->getManager()
->createQueryBuilder()
->select('rpdv')
->from('MainBundle:PointDeVenteReference', 'rpdv')
->andWhere( 'rpdv.partenaireClient = :id_partner ' )
->setParameter( 'id_partner', $this->getUser()->getPartenaire()->getIdPartenaire() )
->orderBy( 'rpdv.idPointDeVenteReference' , 'DESC' )
->setFirstResult( 0 )
->setMaxResults( 30 );
$stores = new Paginator( $qStores, FALSE );
And the error:
An exception has been thrown during the rendering of a template ("An exception occurred while executing
'SELECT DISTINCT TOP 30 id_point_de_vente_reference0
FROM ( SELECT p0_.id_point_de_vente_reference AS id_point_de_vente_reference0,
p0_.reference AS reference1,
p0_.date_derniere_modification AS date_derniere_modification2,
p0_.blocage AS blocage3
FROM point_de_vente_reference p0_
WHERE p0_.id_partenaire_client = ?
ORDER BY p0_.id_point_de_vente_reference DESC ) dctrn_result
ORDER BY id_point_de_vente_reference0 DESC'
with params [2829]:SQLSTATE[42000]:
[Microsoft][SQL Server Native Client 11.0][SQL Server]
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions,
unless TOP, OFFSET or FOR XML is also specified.") in MainBundle:Default:store/list.html.twig at line 79.
I tried to change the class SQLServerPlatform with corrections found on the net, without success.
Do you have any idea?
Thx !
Edit:
This bug appears to be related to the Paginator when its second parameter is set to true. When I pass false, I get no error.
dctrn_result is a derived table, and as the error message says, SQL Server does not allow ORDER BY inside a derived table unless TOP, OFFSET or FOR XML is also specified. I do not know Symfony2, but the SQL being sent to the database engine is invalid.
Craftydba

'do_replace()' not working?

While trying ATK4 I found a problem:
$this->api->db->dsql()->table('person')->set('id', 1)->set('name', 'Test user')->do_replace();
This is not working. Then I looked a little bit deeper in ATK4 source and found in /opt/ipism/www/atk4/lib/DB/dsql.php the lines
public $sql_templates=array(
'select'=>"select [options] [field] [from] [table] [join] [where] [group] [having] [order] [limit]",
'insert'=>"insert [options_insert] into [table_noalias] ([set_fields]) values ([set_values])",
'replace'=>"replace [options_replace] into [table_noalias] ([set_fields]) values ([set_values])",
'update'=>"update [table_noalias] set [set] [where]",
'delete'=>"delete from [table_noalias] [where]",
'truncate'=>'truncate table [table_noalias]',
'describe'=>'desc [table_noalias]',
);
After changing the 'replace'-line into
'replace'=>"replace into [table_noalias] ([set_fields]) values ([set_values])",
it worked for me (removing options_replace and appending an 's' to set_value). I'm using the latest version from git with a MySQL database connection.
But I'm not sure whether I'm using do_replace() in the wrong way.
ByE...
By the way: Is there a way to send fixes, without creating an account on GitHub or somewhere?
Edit: Here is the output if the options_replace isn't removed from the template:
replace [options_replace] into `person` (`id`,`name`) values ("1","John Doe") [:a_2, :a]Application Error: Database Query Failed
Exception_DB, code: 0Additional information: pdo_error: SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[options_replace] into `person` (`id`,`name`) values ('1' at line 1 mode: replace params: :a: 1 :a_2: John Doe query: replace [options_replace] into `person` (`id`,`name`) values (:a,:a_2) template: replace [options_replace] into [table_noalias] ([set_fields]) values ([set_values])/opt/ipism/www/atk4/lib/DB/dsql.php:1519
Stack trace:
File Object NameStack Trace/opt/ipism/www/atk4/lib/BaseException.php:63 Exception_DBException_DB->collectBasicData(Null)
/opt/ipism/www/atk4/lib/AbstractObject.php:545 Exception_DBException_DB->__construct("Database Query Failed", Null)
/opt/ipism/www/atk4/lib/DB/dsql.php:1519 sample_project_db_db_dsql_mysqlDB_dsql_mysql->exception("Database Query Failed")
/opt/ipism/www/atk4/lib/DB/dsql.php:1586 sample_project_db_db_dsql_mysqlDB_dsql_mysql->execute()
/opt/ipism/www/atk4/lib/DB/dsql.php:1624 sample_project_db_db_dsql_mysqlDB_dsql_mysql->replace()
/opt/ipism/www/page/test.php:40 sample_project_db_db_dsql_mysqlDB_dsql_mysql->do_replace()
/opt/ipism/www/atk4/lib/AbstractObject.php:306 sample_project_testpage_test->init()
/opt/ipism/www/atk4/lib/ApiFrontend.php:130 sample_projectFrontend->add("page_test", "test", "Content")
/opt/ipism/www/atk4/lib/ApiWeb.php:428 sample_projectFrontend->layout_Content()
/opt/ipism/www/atk4/lib/ApiFrontend.php:39 sample_projectFrontend->addLayout("Content")
/opt/ipism/www/atk4/lib/ApiWeb.php:275 sample_projectFrontend->initLayout()
/opt/ipism/www/index.php:15 sample_projectFrontend->main()
Note: To hide this information from your users, add $config['logger']['web_output']=false to your config.php file. Refer to documentation on 'Logger' for alternative logging options
Replace is similar to "insert" by its nature, but instead of failing when the primary key is duplicated, it replaces the existing row.
Please add ->debug() to your line before do_replace and give me the output, which would help me understand why that parameter needs removing.
set_value seems to be a typo, I have changed and committed it into master: https://github.com/atk4/atk4/commit/24b20865b9e3345a8e7504dfb68b7ef96335009e
The best way to submit changes is by creating a pull request. The best way to report issues is currently through "issues" on GitHub.
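
The debug output in the question shows the literal text [options_replace] reaching MySQL. The toy renderer below reproduces that symptom. It is not ATK4's actual code: render and the renderers dict are hypothetical, and the sketch simply assumes that a tag with no renderer is left in place.

```python
import re


def render(template, renderers):
    """Replace each [tag] using renderers[tag](); unknown tags are left as-is."""

    def substitute(match):
        fn = renderers.get(match.group(1))
        return fn() if fn else match.group(0)

    return re.sub(r"\[(\w+)\]", substitute, template)


template = "replace [options_replace] into [table_noalias] ([set_fields]) values ([set_values])"

renderers = {
    "table_noalias": lambda: "`person`",
    "set_fields": lambda: "`id`,`name`",
    "set_values": lambda: ":a,:a_2",
}

# No renderer for options_replace: the raw tag survives into the SQL.
broken = render(template, renderers)

# With a renderer that yields an empty string, the tag disappears.
fixed = render(template, {**renderers, "options_replace": lambda: ""})

print(broken)
print(fixed)
```

The first result matches the failing query from the error output, which is consistent with the tag simply having nothing to render it.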
