Weird behavior with web2py DAL on table definition

Hi, I'm getting behaviour I don't understand with web2py:
In [50]: db = DAL('sqlite://deposit/sample.sqlite')
In [51]: db.define_table('customer',Field('name','string',required=True),
Field('nric','string',required=True),
Field('address','string'),
Field('phone','integer'),
primarykey=['name'])
Out[51]: <Table customer (name,nric,address,phone)>
This works as expected. I then do:
In [53]: db.define_table('check',
Field('nric', db.customer.nric, required=True),
Field('clear','string'))
which gets me the message
AttributeError: 'DAL' object has no attribute 'customer.nric'
Thinking this may be an issue of not having committed customer to the database, I do a db.commit() and then try again:
In [56]: db.define_table('check',Field('nric', db.customer.nric, required=True), Field('clear','string'))
File "<string>", line unknown
SyntaxError: table already defined: check
Not sure why, but anyway I try to drop the table:
In [59]: db['check'].drop()
and get the following weird traceback
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-59-998297b798f5> in <module>()
----> 1 db['check'].drop()
/home/tahnoon/.dropbox-cyn/Dropbox (TIS Group)/Cynopsis/Builds/Apollo/Code Src/web2py/gluon/dal.pyc in drop(self, mode)
9225
9226 def drop(self, mode=''):
-> 9227 return self._db._adapter.drop(self, mode)
9228
9229 def _listify(self, fields, update=False):
/home/tahnoon/.dropbox-cyn/Dropbox (TIS Group)/Cynopsis/Builds/Apollo/Code Src/web2py/gluon/dal.pyc in drop(self, table, mode)
1328 queries = self._drop(table, mode)
1329 for query in queries:
-> 1330 if table._dbt:
1331 self.log(query + '\n', table)
1332 self.execute(query)
/home/tahnoon/.dropbox-cyn/Dropbox (TIS Group)/Cynopsis/Builds/Apollo/Code Src/web2py/gluon/dal.pyc in __getitem__(self, key)
9108 return self._db(self._id == key).select(limitby=(0, 1), orderby_on_limitby=False).first()
9109 elif key:
-> 9110 return ogetattr(self, str(key))
9111
9112 def __call__(self, key=DEFAULT, **kwargs):
AttributeError: 'Table' object has no attribute '_dbt'
Checking tables shows
In [61]: db.tables()
Out[61]: ['customer']
Is this expected behaviour? If so, how do I drop/create a table after a syntax error? Thanks.

Since db.customer is a keyed table (i.e., you have defined a primarykey attribute rather than relying on the default autoincrement integer ID field as the primary key), it can only be referenced by other keyed tables.
Also, when creating reference fields for keyed tables, use the following syntax:
Field('nric', 'reference customer.nric', required=True)
However, I don't think keyed tables are supported for SQLite (the docs say only DB2, MS-SQL, Ingres, and Informix are supported). Anyway, if you are creating a new table in SQLite, there is no reason to use a keyed table (that functionality was added primarily to enable access to legacy databases that lack autoincrement integer primary key fields).
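For a new table in SQLite, a plain (non-keyed) definition with the DAL's default auto-increment id and a standard reference field would look something like the sketch below; the customer_id field name is my own illustration, not from your schema:
db.define_table('customer',
    Field('name', 'string', required=True),
    Field('nric', 'string', required=True),
    Field('address', 'string'),
    Field('phone', 'integer'))

db.define_table('check',
    Field('customer_id', 'reference customer', required=True),  # FK to the auto 'id' of customer
    Field('clear', 'string'))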
Finally, dropping a table does not remove the model from the db DAL instance -- rather, that operation drops the table from the database itself. If you want to redefine a model within a shell session, you should use the "redefine" argument:
db.define_table(..., redefine=True)
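For example, to replace the half-defined 'check' model in the same shell session (a sketch; if you keep the keyed-table reference syntax from above, the SQLite caveat still applies):
db.define_table('check',
    Field('nric', 'reference customer.nric', required=True),
    Field('clear', 'string'),
    redefine=True)  # replaces the in-memory definition left over from the failed attempt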

Related

Finding missing records from 2 data sources with Flink

I have two data sources: an S3 bucket and a Postgres database table. Both sources have records in the same format with a unique identifier of type uuid. Some of the records present in the S3 bucket are not part of the Postgres table, and the intent is to find those missing records. The data is bounded, as it is partitioned by day in the S3 bucket.
Reading the s3-source (I believe this operation is reading the data in batch mode since I am not providing the monitorContinuously() argument) -
final FileSource<GenericRecord> source = FileSource.forRecordStreamFormat(
AvroParquetReaders.forGenericRecord(schema), path).build();
final DataStream<GenericRecord> avroStream = env.fromSource(
source, WatermarkStrategy.noWatermarks(), "s3-source");
DataStream<Row> s3Stream = avroStream.map(x -> Row.of(x.get("uuid").toString()))
.returns(Types.ROW_NAMED(new String[] {"uuid"}, Types.STRING));
Table s3table = tableEnv.fromDataStream(s3Stream);
tableEnv.createTemporaryView("s3table", s3table);
For reading from Postgres, I created a postgres catalog -
PostgresCatalog postgresCatalog = (PostgresCatalog) JdbcCatalogUtils.createCatalog(
catalogName,
defaultDatabase,
username,
pwd,
baseUrl);
tableEnv.registerCatalog(postgresCatalog.getName(), postgresCatalog);
tableEnv.useCatalog(postgresCatalog.getName());
Table dbtable = tableEnv.sqlQuery("select cast(uuid as varchar) from `localschema.table`");
tableEnv.createTemporaryView("dbtable", dbtable);
My intention was to simply perform left join and find the missing records from the dbtable. Something like this -
Table resultTable = tableEnv.sqlQuery("SELECT * FROM s3table LEFT JOIN dbtable ON s3table.uuid = dbtable.uuid where dbtable.uuid is null");
DataStream<Row> resultStream = tableEnv.toDataStream(resultTable);
resultStream.print();
However, it seems that the UUID column type is not supported just yet because I get the following exception.
Caused by: java.lang.UnsupportedOperationException: Doesn't support Postgres type 'uuid' yet
at org.apache.flink.connector.jdbc.dialect.psql.PostgresTypeMapper.mapping(PostgresTypeMapper.java:171)
As an alternative, I tried to read the database table as follows -
TypeInformation<?>[] fieldTypes = new TypeInformation<?>[] {
BasicTypeInfo.of(String.class)
};
RowTypeInfo rowTypeInfo = new RowTypeInfo(fieldTypes);
JdbcInputFormat jdbcInputFormat = JdbcInputFormat.buildJdbcInputFormat()
.setDrivername("org.postgresql.Driver")
.setDBUrl("jdbc:postgresql://127.0.0.1:5432/localdatabase")
.setQuery("select cast(uuid as varchar) from localschema.table")
.setUsername("postgres")
.setPassword("postgres")
.setRowTypeInfo(rowTypeInfo)
.finish();
DataStream<Row> dbStream = env.createInput(jdbcInputFormat);
Table dbtable = tableEnv.fromDataStream(dbStream).as("uuid");
tableEnv.createTemporaryView("dbtable", dbtable);
Only this time, I get the following exception on performing the left join (as above) -
Exception in thread "main" org.apache.flink.table.api.TableException: Table sink '*anonymous_datastream_sink$3*' doesn't support consuming update and delete changes which is produced by node Join(joinType=[LeftOuterJoin]
It works if I tweak the resultStream to publish the changeLogStream -
Table resultTable = tableEnv.sqlQuery("SELECT * FROM s3table LEFT JOIN dbtable ON s3table.uuid = dbtable.uuid where dbtable.uuid is null");
DataStream<Row> resultStream = tableEnv.toChangelogStream(resultTable);
resultStream.print();
Sample output:
+I[9cc38226-bcce-47ce-befc-3576195a0933, null]
+I[a24bf933-1bb7-425f-b1a7-588fb175fa11, null]
+I[da6f57c8-3ad1-4df5-9636-c6b36df2695f, null]
+I[2f3845c1-6444-44b6-b1e8-c694eee63403, null]
-D[9cc38226-bcce-47ce-befc-3576195a0933, null]
-D[a24bf933-1bb7-425f-b1a7-588fb175fa11, null]
However, I do not want the sink to receive inserts and deletes as separate records; I want just the final list of missing UUIDs. I guess it happens because my Postgres source created with DataStream<Row> dbStream = env.createInput(jdbcInputFormat); is a streaming source. If I try to execute the whole application in BATCH mode, I get the following exception -
org.apache.flink.table.api.ValidationException: Querying an unbounded table '*anonymous_datastream_source$2*' in batch mode is not allowed. The table source is unbounded.
Is it possible to have a bounded JDBC source? If not, how can I achieve this using the streaming API? (Using Flink version 1.15.2.)
I believe this kind of case would be a common use case that can be implemented with Flink, but clearly I'm missing something. Any leads would be appreciated.
For now, a common approach would be to sink the resultStream to a table. You can then schedule a job that truncates the table and executes the Apache Flink job, and afterwards read the results from that table.
I also noticed that Apache Flink Table Store 0.3.0 was just released, and they have materialized views on the roadmap for 0.4.0. That might be a solution too. Very exciting, imho.

Change sa.Column nullable property with Alembic in Batch mode

I have a database upgrade migration I want to apply to a database column:
def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    with op.batch_alter_table('details') as batch_op:
        batch_op.alter_column('details', 'non_essential_cookies',
                              existing_type=sa.BOOLEAN(),
                              nullable=False)
    # ### end Alembic commands ###
I am using batch mode since ALTER is unsupported, and previously I received this error: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near "ALTER". I hoped batch mode would work, but now I receive this new error instead:
TypeError: <flask_script.commands.Command object at 0x1149bb278>: alter_column() got multiple values for argument 'nullable'.
I only have one tuple in the table and the relevant attribute is not NULL, so the database migration is valid. I just don't understand why there are multiple values.
From the docs:
The method is used as a context manager, which returns an instance of
BatchOperations; this object is the same as Operations except that
table names and schema names are omitted.
The key point here is that you don't have to provide the table name when calling operations on the BatchOperations instance.
The signature for alter_column is:
alter_column(table_name, column_name, nullable=None, server_default=False, new_column_name=None, type_=None, existing_type=None, existing_server_default=False, existing_nullable=None, schema=None, **kw)
So from your code:
with op.batch_alter_table('details') as batch_op:
    batch_op.alter_column('details', 'non_essential_cookies',
                          existing_type=sa.BOOLEAN(),
                          nullable=False)
'details' is being passed to column_name, and 'non_essential_cookies' is being passed to nullable as a positional argument. The issue surfaces later, when you specify the value of nullable again with the keyword argument nullable=False.
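So the fix is to drop the table name and pass only the column name to the batch operation. A minimal sketch of the corrected migration, based on your snippet:
def upgrade():
    with op.batch_alter_table('details') as batch_op:
        batch_op.alter_column('non_essential_cookies',
                              existing_type=sa.BOOLEAN(),
                              nullable=False)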

Is it possible to change the name of a system table

I want to change the name of a system table in my database. Is it possible? Probably I shouldn't change it, but I'm curious.
When I execute sp_rename I get the following error:
Msg 15001, Level 16, State 1, Procedure sp_rename, Line 404
Object 'cdc.[dbo_CdcTest_CT]' does not exist or is not a valid object for this operation.
Edit:
I want to change the name of the tables created by Change Data Capture because I want to disable the CDC mechanism for a table and still keep the data. I know that I can create an additional table and move the data from the CDC table there, but it's easier to rename the CDC table and then disable CDC for the specified table.
No, you cannot change the name of system tables. However, you can refer to them by a different name.
You can use synonyms for that:
CREATE SYNONYM [ schema_name_1. ] synonym_name FOR <object>
<object> :: =
{
[ server_name.[ database_name ] . [ schema_name_2 ].| database_name . [ schema_name_2 ].| schema_name_2. ] object_name
}
Also note that, according to the docs, sp_rename:
Changes the name of a user-created object in the current database.
This object can be a table, index, column, alias data type, or
Microsoft .NET Framework common language runtime

Cayenne, Postgres: primary key generation

I'm using Cayenne 3.2M1 and Postgres 9.0.1 to create a database. Right now I'm having problems with Cayenne's primary key generation, since I have tables with more than one primary key column and, as far as I've read, Cayenne can't generate more than one primary key per table. So I want Postgres to do that work.
I have this table:
CREATE TABLE telefonocliente
(
cod_cliente integer NOT NULL DEFAULT currval('cliente_serial'::regclass),
cod_telefono integer NOT NULL DEFAULT nextval('telefonocliente_serial'::regclass),
fijo integer,
CONSTRAINT telefonocliente_pkey PRIMARY KEY (cod_cliente, cod_telefono)
)
WITH (
OIDS=FALSE
);
and this is the code I run:
TelefonoCliente telefono = context.newObject(TelefonoCliente.class);
telefono.setFijo(4999000);
context.commitChanges();
and this is the error I get:
INFO: --- transaction started.
19/11/2013 22:46:17 org.apache.cayenne.access.dbsync.CreateIfNoSchemaStrategy processSchemaUpdate
INFO: Full or partial schema detected, skipping tables creation
19/11/2013 22:46:17 org.apache.cayenne.log.CommonsJdbcEventLogger logQuery
INFO: SELECT nextval('pk_telefonocliente')
Exception in thread "main" org.apache.cayenne.CayenneRuntimeException: [v.3.2M1 Jul 07 2013 16:23:58] Commit Exception
at org.apache.cayenne.access.DataContext.flushToParent(DataContext.java:759)
at org.apache.cayenne.access.DataContext.commitChanges(DataContext.java:676)
at org.example.cayenne.Main.main(Main.java:45)
Caused by: org.postgresql.util.PSQLException: ERROR: no existe la relación «pk_telefonocliente» (i.e., relation "pk_telefonocliente" does not exist)
Position: 16
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:374)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:254)
at org.apache.cayenne.dba.postgres.PostgresPkGenerator.longPkFromDatabase(PostgresPkGenerator.java:79)
at org.apache.cayenne.dba.JdbcPkGenerator.generatePk(JdbcPkGenerator.java:272)
at org.apache.cayenne.access.DataDomainInsertBucket.createPermIds(DataDomainInsertBucket.java:171)
at org.apache.cayenne.access.DataDomainInsertBucket.appendQueriesInternal(DataDomainInsertBucket.java:76)
at org.apache.cayenne.access.DataDomainSyncBucket.appendQueries(DataDomainSyncBucket.java:78)
at org.apache.cayenne.access.DataDomainFlushAction.preprocess(DataDomainFlushAction.java:188)
at org.apache.cayenne.access.DataDomainFlushAction.flush(DataDomainFlushAction.java:144)
at org.apache.cayenne.access.DataDomain.onSyncFlush(DataDomain.java:685)
at org.apache.cayenne.access.DataDomain$2.transform(DataDomain.java:651)
at org.apache.cayenne.access.DataDomain.runInTransaction(DataDomain.java:712)
at org.apache.cayenne.access.DataDomain.onSyncNoFilters(DataDomain.java:648)
at org.apache.cayenne.access.DataDomain$DataDomainSyncFilterChain.onSync(DataDomain.java:852)
at org.apache.cayenne.access.DataDomain.onSync(DataDomain.java:629)
at org.apache.cayenne.access.DataContext.flushToParent(DataContext.java:727)
... 2 more
I've been trying the suggestions in the Cayenne tutorial ("generated columns", "primary key support"), but I seem to always get some error.
INFO: SELECT nextval('pk_telefonocliente')
Exception in thread "main" org.apache.cayenne.CayenneRuntimeException: [v.3.2M1 Jul 07 2013 16:23:58] Primary Key autogeneration only works for a single attribute.
I want to know how to solve this.
Thanks in advance
From your description in the comments, of the 2 columns comprising the PK of 'telefonocliente', only one is truly independent: 'cod_telefono'. This is what Cayenne will generate. In the case of PostgreSQL, you will need the following sequence in the DB for this to happen:
CREATE SEQUENCE pk_telefonocliente INCREMENT 20 START 200;
Now, where does the second PK column, 'cod_cliente', come from? Since it is also an FK to another table, it is a "dependent" PK and must come from a relationship. So first you need to map a many-to-one relationship between 'telefonocliente' and 'cliente'. Check the "To Dep Pk" checkbox on the 'telefonocliente' side. Generate a matching ObjRelationship for your Java objects. Now you can use it in your code:
Cliente c = .. // get a hold of this object somehow
TelefonoCliente telefono = context.newObject(TelefonoCliente.class);
telefono.setFijo(4999000);
telefono.setCliente(c); // this line is what will populate 'cod_cliente' PK/FK
That should be it.
Only one primary key is allowed per table! In your case you created a primary key on two columns; that is fine, it is defined in the SQL standard and Postgres supports it well.
However, there is a note in the Cayenne documentation:
Cayenne only supports automatic PK generation for a single column per table.
see http://cayenne.apache.org/docs/3.0/primary-key-generation.html at the bottom of the page.
Perhaps they will fix it in a newer version, or you can submit a request to the Cayenne community.

PostgreSQL JPA cascade partially working

Again, we probably have a very simple problem. Our database looks like this:
CREATE TABLE Question (
idQuestion SERIAL,
questionContent VARCHAR,
CONSTRAINT Question_idQuestion_PK PRIMARY KEY (idQuestion)
);
CREATE TABLE Answer (
idAnswer SERIAL,
answerContent VARCHAR,
idQuestion INTEGER,
CONSTRAINT Answer_idAnswer_PK PRIMARY KEY (idAnswer),
CONSTRAINT Answer_idQuestion_FK FOREIGN KEY (idQuestion) REFERENCES Question(idQuestion)
);
So a Question have many Answers.
In the entity generated by NetBeans 7.1.2 we have the following field:
@OneToMany(mappedBy = "idquestion", orphanRemoval = true, cascade = CascadeType.ALL, fetch = FetchType.EAGER)
private Collection<Answer> answerCollection;
As you can see, I've already added all possible orphan removal and cascade instructions for cascade removal of the collection. And it's working fine except for one thing:
You can delete a Question and its connected Answers only if they were created in a previous 'instance' of our application. If I first create a new Question and even one Answer and then go straight to deleting it, we get an error like:
root cause
Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.2.0.v20110202-r8913): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.postgresql.util.PSQLException: ERROR: update or delete on table "question" violates foreign key constraint "answer_idquestion_fk" on table "answer"
Detail: Key (idquestion)=(30) is still referenced from table "answer".
Error Code: 0
Call: DELETE FROM question WHERE ((idquestion = ?) AND (version = ?))
bind => [2 parameters bound]
Query: DeleteObjectQuery(com.accenture.androidwebapp.entities.Question[ idquestion=30 ])
root cause
org.postgresql.util.PSQLException: ERROR: update or delete on table "question" violates foreign key constraint "answer_idquestion_fk" on table "answer"
Detail: Key (idquestion)=(30) is still referenced from table "answer".
If I restart (rebuild, redeploy) the application, it works though... why? Thanks!
Try adding the EclipseLink @PrivateOwned extension annotation to your collection mapping.
As for the issue with deletion not working until you restart the app, two things come to mind:
You might be using very long EntityManager sessions where everything stays attached to the EntityManager. If that's the case, it's the change of EntityManager session forced by a restart that's helping. Consider using shorter EntityManager sessions by calling EntityManager.close() when you're done with a session. Work with detached entities and merge state back in with EntityManager.merge() when you want to modify it.
The second level cache, if any, will be cleared by a redeploy. Try disabling the second level cache and see if that helps.
