How to snapshot schema and data from a different database and different schema in dbt - snowflake-cloud-data-platform

When I run dbt against my sources, it returns the following error:
18:32:41 Encountered an error:
Compilation Error in snapshot customer_snapshot (snapshots\customer.sql)
Snapshot 'snapshot.myproject._customer_snapshot' (snapshots\customer.sql) depends on a source named 'myschema.customer' which was not found
The message is correct: there is no such schema in this database. The schema, tables, and data are present in another database.
Assuming I'm in a sandbox, how would I snapshot schema and data from a different database and a different schema using dbt?

Note that this only works because Snowflake databases are just logical databases that share a connection. This will not work on other dbt adapters.
First you configure a source that specifies the database:
# models/sources.yml
version: 2

sources:
  - name: myschema  # dbt also uses this as the schema name unless you provide a schema property
    database: other_db
    tables:
      - name: customer
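If the name you want to use in source() differs from the actual schema, you can set the schema property explicitly. A minimal variant sketch (the name legacy here is hypothetical, purely for illustration):

# models/sources.yml (variant with an explicit schema property)
version: 2

sources:
  - name: legacy            # the name you pass to source()
    database: other_db
    schema: myschema        # the actual schema in other_db
    tables:
      - name: customer

The rest of this answer uses the first definition.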
Then in your snapshot, you can reference this source:
{% snapshot customer_snapshot %}

{{
    config(
      target_database='analytics',
      target_schema='snapshots',
      unique_key='id',
      strategy='timestamp',
      updated_at='updated_at',
    )
}}

select * from {{ source('myschema', 'customer') }}

{% endsnapshot %}
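To build the snapshot, assuming your profile targets the sandbox, you can then run it by name:

dbt snapshot --select customer_snapshot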

Related

Errors in the high-level relational engine (SQL Server)

I made changes to a SQL Server table and when processing my cube I get these errors:
Errors in the high-level relational engine. A connection could not be made to the data source with the DataSourceID of 'Channel Final', Name of 'Channel Final'
My question is: if I change the column names of a table, how can I update those names in the cube so that I don't get this error?
First, try updating your data source view (DSV). Then remove the column from the affected dimension and add a new one with the updated name. After that, run a Full Process on the dimension and then on the cube, in that order.
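If you prefer to script the Full Process step rather than click through SSMS, a rough XMLA sketch would look like the following (the DatabaseID and DimensionID values are placeholders, not taken from the question):

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>MyOlapDatabase</DatabaseID>
      <DimensionID>MyDimension</DimensionID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
</Batch>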

ansible cannot execute SQL CREATE DATABASE: "CREATE DATABASE cannot run inside a transaction block"

There are several entries about "CREATE DATABASE cannot run inside a transaction block" which give the answer that autocommit needs to be on. However, they do not reference Ansible, which is what I was looking for.
There is a specific postgresql_db module that will take care of your db creation (or removal/dump/restoration) and will manage idempotency out of the box.
- name: Create database if needed
  postgresql_db:
    name: "{{ dbname }}"
    state: present
  become: yes
  become_user: postgres
I found in the Ansible documentation there is a way to turn autocommit on, such as:
- name: Create database if needed
  postgresql_query:
    autocommit: yes
    query: |
      CREATE DATABASE {{ dbname }};
  become: yes
  become_user: postgres
I thought this would be helpful for people like me who tend to look at Stack Overflow first when searching for help. Note: {{ dbname }} is a variable; you could also use a literal.
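As a usage sketch (the playbook filename create_db.yml is hypothetical), the variable can be supplied on the command line:

ansible-playbook create_db.yml -e dbname=mydb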

h2 database unit test across multiple schemas

I am trying to use unit tests along with an H2 database. My application uses an MSSQL database, and below are the two tables I am using:
SchemaA.dbo.Table1
SchemaB.dbo.table2
@Entity
@Table(name = "SchemaB..table")
class A {
    private Long id;
    // ...
}
I am trying to write a unit test for the persistence of the above class, but the H2 database does not recognise this table-name syntax:
SchemaB..table
Note the two dots between the schema name and the table name.
Any suggestion would be greatly appreciated.
You may want to use the schema attribute of the Table JPA annotation.
For example:
@Entity(name = "Foo")
@Table(name = "TABLE_FOO", schema = "bar")
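For context, a minimal sketch of a complete entity carrying those annotations (the @Id field is my assumption, not part of the original answer):

@Entity(name = "Foo")
@Table(name = "TABLE_FOO", schema = "bar")
public class Foo {
    @Id
    private Long id;
}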
If you have a single data source that connects to your H2 instance as user A, then in order to access schema 'bar' you may want to tell H2 to create schema 'bar' automatically on connect.
jdbc:h2:mem:play;MODE=MySQL;INIT=RUNSCRIPT FROM 'test/init.sql'
The final part of the JDBC URL, test/init.sql, points to an SQL file with the following content.
CREATE SCHEMA IF NOT EXISTS bar
H2 will execute the sql and create the schema on connect.
I've created a demo project on GitHub.
The project has an init.sql file that creates two schemas, foo and bar.
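Based on that description, the init.sql in the demo presumably contains something like:

CREATE SCHEMA IF NOT EXISTS foo;
CREATE SCHEMA IF NOT EXISTS bar;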
Two model classes, foo.A and bar.B, use @Table(schema = "foo", name = "A") to specify the schema accordingly; see app/models.
The test case uses the Play framework, so the built-in evolutions tool is applied every time the test cases are executed, but it should be fine to use a setUp method to apply your own SQL script before executing the test cases. See the test folder for the sample test case (it is actually ScalaTest, but it follows the same idea as JUnit).

Oracle + dbunit throws AmbiguousTableNameException

I'm using DBUnit to populate the database so that its content is known during testing.
The schema I'm working on is in an Oracle 11g instance that also hosts other schemas. In some of those schemas, a table has been defined, associated with a public synonym, and granted SELECT rights.
When I run the XML that defines how the database must be populated, DBUnit throws AmbiguousTableNameException on that table, even though the XML file doesn't reference the table that is defined in several schemas.
I found that there are three solutions to this behavior:
1. Use a database connection credential that has access to only one database schema.
2. Specify a schema name to the DatabaseConnection or DatabaseDataSourceConnection constructor.
3. Enable qualified table name support (see the How-To documentation).
In my case I can only apply solution 1, but even when I adopt it, I get the same exception.
The table that gives me problems is defined in three schemas, and I have no way to act on it.
Could someone please help me?
I found the solution: I specified the schema in the table names and set the property http://www.dbunit.org/features/qualifiedTableNames (corresponding to org.dbunit.database.DatabaseConfig.FEATURE_QUALIFIED_TABLE_NAMES) to true.
This way, my XML code to populate the tables looks like:
<?xml version='1.0' encoding='UTF-8'?>
<dataset>
    <SCHEMA.TABLE ID_FIELD="1" />
</dataset>
where SCHEMA is the schema name and TABLE is the table name.
To set the property I used the following code:
DatabaseConfig dBConfig = dBConn.getConfig(); // dBConn is an IDatabaseConnection
dBConfig.setProperty(DatabaseConfig.FEATURE_QUALIFIED_TABLE_NAMES, true);
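Alternatively (solution 2 from the list above), a minimal sketch that passes the schema to the constructor instead, assuming jdbcConn is an existing java.sql.Connection and "MYSCHEMA" is a placeholder:

// the schema name should match how Oracle stores it (usually upper case)
IDatabaseConnection dBConn = new DatabaseConnection(jdbcConn, "MYSCHEMA");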
In my case, I had granted the dba role to the user, which caused DBUnit to throw AmbiguousTableNameException. After I revoked the dba role from the user, the problem was solved:
SQL> revoke dba from username;
I had the same AmbiguousTableNameException while executing DBUnit tests against an Oracle DB. It had been working fine and started throwing the error one day.
Root cause: a stored procedure call had been modified to lower case by mistake. When it was changed back to upper case, it started working.
I could also solve this by setting the schema name on the IDatabaseTester, e.g. iDatabaseTester.setSchema("SCHEMANAMEINCAPS").
Thanks, Smitha
I was using Spring JDBC along with MySQL Connector/J (v8.0.17). Following the two steps explained in this answer alone did not help.
First, I had to set the schema on the Spring data source.
Then I also had to set the property "databaseTerm" to "schema"; by default it is set to "catalog", as explained here.
We must set this property because, in Spring's implementation of javax.sql.DataSource, if it is not set (i.e. it defaults to "catalog"), the connection returned by dataSource.getConnection() will not have the schema set on it, even if we set it on the dataSource.
@Bean
public DriverManagerDataSource cloudmcDataSource() {
    DriverManagerDataSource dataSource = new DriverManagerDataSource();
    dataSource.setDriverClassName("<driver>");
    dataSource.setUrl("<url>");
    dataSource.setUsername("<uname>");
    dataSource.setPassword("<password>");
    dataSource.setSchema("<schema_name>");

    Properties props = new Properties();
    // the following key-value pair is a constant; must be set as is
    props.setProperty("databaseTerm", "schema");
    dataSource.setConnectionProperties(props);
    return dataSource;
}
Don't forget to make the changes explained in the answer here.
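As an additional sketch (my assumption, not from the answer above): Connector/J also accepts databaseTerm directly as a connection-URL parameter, so the same setting can ride along on the JDBC URL:

jdbc:mysql://localhost:3306/mydb?databaseTerm=SCHEMA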

Bucardo add sync to replicate data

I am using Bucardo to replicate data in a database. I have one database, called mydb, and another called mydb2. They both contain identical tables, called "data" in both cases. Following the steps on this website, I have installed Bucardo and added the two databases:
bucardo_ctl add database mydb
bucardo_ctl add database mydb2
and added the tables:
bucardo_ctl add all tables
Now when I try to add a sync using the following command:
bucardo_ctl add sync testfc source=mydb targetdb=mydb2 type=pushdelta tables=data
I get the following error:
DBD::Pg::st execute failed: ERROR: error from Perl function "herdcheck": Cannot have goats from different databases in the same herd (1) at line 17. at /usr/bin/bucardo_ctl line 3346.
Anyone have any suggestions? Any would be appreciated.
So, in the source option you should put the name of the herd (which, as far as I know, is the list of tables).
Then, instead of:
bucardo_ctl add all tables
use
bucardo_ctl add all tables --herd=foobar
And instead of using
bucardo_ctl add sync testfc source=mydb targetdb=mydb2 type=pushdelta tables=data
use
bucardo_ctl add sync testfc source=foobar targetdb=mydb2 type=pushdelta tables=data
The thing is that the source option is not a place where you put the source database, but the herd of tables.
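Putting the corrected steps from this answer together, the full sequence would be:

bucardo_ctl add database mydb
bucardo_ctl add database mydb2
bucardo_ctl add all tables --herd=foobar
bucardo_ctl add sync testfc source=foobar targetdb=mydb2 type=pushdelta tables=data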
Remember that pushdelta syncs are for tables with primary keys, while fullcopy syncs work whether or not the tables have a primary key.
Hope that helps.
