Disable ClickHouse logs in the system database

In ClickHouse there is a database called system where logs are stored.
My problem is that since installing ClickHouse, the system database has been growing every day (I attached a screenshot of it). At this rate, after just 30 days I would have to allocate nearly 30 GB of space on the server for the system database alone, which gets expensive.
The trace_log and part_log tables in particular take up a lot of space.
How can I disable the logs in the system database?
I have already seen the link below and did everything in it, but it didn't work (link).
The following command does not stop the system database logging:
set log_queries = 0;
And the following config also does not work for me:
cat /etc/clickhouse-server/users.d/log_queries.xml
<?xml version="1.0" ?>
<yandex>
    <users>
        <default>
            <log_queries>0</log_queries>
        </default>
    </users>
</yandex>
I even edited /etc/clickhouse-server/config.xml (sudo nano /etc/clickhouse-server/config.xml) and entered the following values, but it didn't work:
<logger>
    <level>none</level>
    <output>null</output>
</logger>
In addition, I restarted ClickHouse after every change to apply it.
Interestingly, even when my code inserts no data at all into my own database, the system database still grows in size.
I have searched a lot and run many tests, but got no results. Thank you for your guidance.

https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-kb-system-tables-eat-my-disk/
You can disable all or any of them.
Do not create the log tables at all (a restart is needed for these changes to take effect):
$ cat /etc/clickhouse-server/config.d/z_log_disable.xml
<?xml version="1.0"?>
<clickhouse>
    <asynchronous_metric_log remove="1"/>
    <metric_log remove="1"/>
    <query_thread_log remove="1"/>
    <query_log remove="1"/>
    <query_views_log remove="1"/>
    <part_log remove="1"/>
    <session_log remove="1"/>
    <text_log remove="1"/>
    <trace_log remove="1"/>
    <crash_log remove="1"/>
    <opentelemetry_span_log remove="1"/>
    <zookeeper_log remove="1"/>
</clickhouse>
And you need to drop the existing tables:
drop table system.trace_log;
...
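For completeness, a sketch of the corresponding DROP statements for every table disabled in the config above; DROP TABLE IF EXISTS is safe to run even for tables that were never created:

DROP TABLE IF EXISTS system.asynchronous_metric_log;
DROP TABLE IF EXISTS system.metric_log;
DROP TABLE IF EXISTS system.query_thread_log;
DROP TABLE IF EXISTS system.query_log;
DROP TABLE IF EXISTS system.query_views_log;
DROP TABLE IF EXISTS system.part_log;
DROP TABLE IF EXISTS system.session_log;
DROP TABLE IF EXISTS system.text_log;
DROP TABLE IF EXISTS system.trace_log;
DROP TABLE IF EXISTS system.crash_log;
DROP TABLE IF EXISTS system.opentelemetry_span_log;
DROP TABLE IF EXISTS system.zookeeper_log;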

The settings you referenced control the query_log table; more details are available here:
https://clickhouse.com/docs/en/operations/system-tables/query_log/
Note that turning off the query_log is not recommended, because the information in this table is important for troubleshooting.
The trace_log and part_log tables are different: they are enabled by their own blocks in config.xml, which you can locate and comment out:
<trace_log>
    <database>system</database>
    <table>trace_log</table>
    <partition_by>toYYYYMM(event_date)</partition_by>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</trace_log>
and
<part_log>
    <database>system</database>
    <table>part_log</table>
    <partition_by>toMonday(event_date)</partition_by>
    <flush_interval_milliseconds>7500</flush_interval_milliseconds>
</part_log>
References:
https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#server_configuration_parameters-trace_log
https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings/#server_configuration_parameters-part-log
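If you would rather keep recent history but cap the disk usage, a common alternative to disabling these tables is to add a TTL so old rows expire automatically; a minimal sketch (the 7-day interval is an arbitrary example):

-- expire old rows instead of disabling the tables entirely
ALTER TABLE system.trace_log MODIFY TTL event_date + INTERVAL 7 DAY;
ALTER TABLE system.part_log MODIFY TTL event_date + INTERVAL 7 DAY;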

Related

Debezium error, schema isn't known to this connector

I have a project using Debezium, mostly based on this example, which is then connected to Apache Pulsar.
I have changed a few configurations. The file now looks like this:
database.history=io.debezium.relational.history.MemoryDatabaseHistory
connector.class=io.debezium.connector.mysql.MySqlConnector
offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
offset.storage.file.filename=offset.dat
offset.flush.interval.ms=5000
name=mysql-dbz-connector
database.hostname={ip}
database.port=3308
database.user={user}
database.password={pass}
database.dbname=database
database.server.name=test
table.whitelist=database.history_table,database.project_table
snapshot.mode=schema_only
schemas.enable=false
include.schema.changes=false
pulsar.topic=persistent://public/default/{0}
pulsar.broker.address=pulsar://{ip}:6650
database.history=io.debezium.relational.history.MemoryDatabaseHistory
As you may understand, what I'm trying to do is monitor the database for modifications to history_table and project_table and then write payloads onto an Apache Pulsar topic.
My problem is as follows: whatever snapshot mode I use, once an offset has been written I can't restart Debezium without getting an error on the next database update:
Encountered change event for table database.history_table whose schema isn't known to this connector
It only happens with an existing offset.dat file. I think this is because the schema is null within the offset.dat file. Take this one for example:
¨Ìsrjava.util.HashMap⁄¡√`—F
loadFactorI thresholdxp?#wur[B¨Û¯T‡xpG{"schema":null,"payload":["mysql-dbz-connector",{"server":"test"}]}uq~U{"ts_sec":1563802215,"file":"database-bin.000005","pos":79574,"server_id":1,"event":1}x
I first suspected the schemas.enable=false or the include.schema.changes=false parameters, which I used to make the JSON more concise, but changing their values doesn't change anything in the offset.dat file.
The problem lies in the line database.history=io.debezium.relational.history.MemoryDatabaseHistory. The in-memory history will not survive a restart. You should use FileDatabaseHistory instead.
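A minimal sketch of the fix, assuming the connector's working directory is writable (the file name dbhistory.dat is an arbitrary example):

database.history=io.debezium.relational.history.FileDatabaseHistory
database.history.file.filename=dbhistory.dat

FileDatabaseHistory persists the schema history to disk, so after a restart the connector can rebuild the table schemas instead of failing on the first change event.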

Dynamic variables in Database Project views

I have a Database Project with some views.
The views should behave differently depending on the environment they are published to.
When published to the development environment, the INNER JOINs should use one prefix for the target schema name, and a different prefix in the test environment.
Is it possible to achieve this? In the code snippet below, I'd like to use Hub when developing locally and when publishing to the dev environment, and ISA when publishing to test.
Example:
CREATE VIEW [ISA].[v_CoveredRisk]
AS SELECT
CR.Bkey_CoveredRisk_Unique
,CO.Bkey_Coverage_Unique
,CO.Name
,PO.EKey_Policy
,CoObj.Bkey_CoveredObject
,CoObj.BKey_Building
,CoObj.Bkey_Home
,CoObj.BKey_Object
,CoObj.BKey_Person
,CoObj.BKey_Pet
,CoObj.BKey_Vehicle
,Risk_Excess
,Risk_Sum
,CAST(CurrentYearPremiumAmount AS float) AS CurrentYearPremiumAmount
,IsActive
,PO.BKey_Policy
,CR.Record_Timestamp
FROM Hub.[CoveredRisk] CR
INNER JOIN Hub.Coverage CO ON CR.EKey_Coverage = CO.EKey_Coverage
INNER JOIN Hub.CoveredObject CoObj ON CR.EKey_CoveredObject = CoObj.EKey_CoveredObject
INNER JOIN Hub.[Policy] PO ON CR.EKey_Policy = PO.Ekey_Policy
The first thing to say is that you should remove this requirement and keep the code the same in all your databases. You are almost 100% guaranteed to make a mistake at some point and deploy something that doesn't work in a different environment.
If you do want to do this, you can do it with synonyms: in your view, reference a synonym and have it point to the respective schema. A synonym can't point to a schema directly, but it can point to objects within a schema, so if you have:
a dev table devSchema.table
a prod table prodSchema.table
then in dev, create a synonym like:
create synonym Hub.table for devSchema.table
and have your view reference Hub.table; it will resolve to the dev table.
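In the production database the same synonym name would point at the prod schema instead, so the view definition never changes between environments; a sketch:

create synonym Hub.table for prodSchema.table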
Alternatively, have a T4 template that generates your SQL script.
In your template, find out what environment you are running against and create the output accordingly.
To find out which environment to generate output for, have a look at publishing profiles and configuration-specific variables (a.k.a. "conditional compilation symbols" in your project's build properties); you can probe those in the T4 template.
You can't use variables in schema or object names. If you really want to achieve what you describe, I can suggest two ways.
The first: you don't control it with variables but with release configurations. You can use conditional statements in the .sqlproj file. So, I've created two views:
CREATE VIEW [HUB].[View1]
AS SELECT 1 as one;
CREATE VIEW [ISA].[View1]
AS SELECT 1 as one;
Then in the .sqlproj file I do the following:
<Build Include="View1.sql" />
<None Include="View1.sql" Condition=" '$(Configuration)' == 'Debug'" />
<None Include="View1_1.sql" />
<Build Include="View1_1.sql" Condition=" '$(Configuration)' == 'Release' " />
Then just pick the right release configuration when you deploy.
NOTE: You need to include and exclude the same file from the build to achieve this.
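A sketch of how the configuration gets picked at deploy time, assuming the standard SSDT tooling (the file, server, and database names here are examples):

msbuild MyDatabase.sqlproj /p:Configuration=Release
SqlPackage /Action:Publish /SourceFile:bin\Release\MyDatabase.dacpac /TargetServerName:myServer /TargetDatabaseName:MyDatabase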
The second is a much simpler approach: always create the view with the same name, and move it to the proper schema in the publish script. For example:
IF DB_NAME() = 'Dev' EXEC sp_rename ....
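One caveat worth noting: sp_rename cannot move an object between schemas, so the actual statement would use ALTER SCHEMA ... TRANSFER. A sketch of such a publish-script step, reusing the names from the question:

IF DB_NAME() = 'Dev'
    ALTER SCHEMA Hub TRANSFER ISA.v_CoveredRisk;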

Where is the data from Datatable items saved in the ECM eRoom database?

I am trying to retrieve data out of the ECM eRoom database (which isn't documented, as far as I know).
I have an eRoom with a custom "Database" and some fields.
When I query the Objects table, I find the "Database" row:
select * from [dbo].[Objects] where internalId = 1234567
and the rows for the entries:
select top 10 * from [dbo].[Objects] where parentInternalId = 1234567
but I don't find any field with the values of the entries, only a column called NonSearchableProperties that contains nothing but hex data.
My questions:
How could I retrieve the values?
Is it possible to retrieve them with mere SQL?
What is the simplest way?
This is not a silver bullet, but it is okay for my use case.
After long googling and a lot of test scripts I found some answers, but since the system is nearing end-of-life and the documentation is not easy to read, here are my findings.
Is it possible to retrieve them with mere SQL?
As far as I could find out, no! (Please correct me if I'm wrong.)
How could I retrieve the values?
With the eRoom API (on the server there are some sample programs that query the data/objects under <installation-path>\eRoom Server\Toolkit\Samples, in C++, VB, VBScript, ... all a bit too much overhead), or with the eRoom XML Query Language (ExQL) over SOAP calls.
What is the simplest way?
After a lot of tests, searching in forums and many experiments with SoapUI, I found that ExQL queries seem to be the simplest way to retrieve data, once you understand the structure.
Here are some resources that were helpful:
(very) basic info on ExQL from the manufacturer: https://eroom.asce.org/eRoomHelp/en/XML_Help/Introduction.htm
(disclaimer: I didn't find it very helpful, but it at least shows some basics)
a short 9-page developer guide: https://developer-content.emc.com/developer/downloads/eRoomXMLCapabilitiesUseCaseProgramDashboard.pdf (the last example on page 8 helped me understand how to set up the query, with a lot of imagination)
But for this to work, don't forget to activate Allow XML queries and commands from external applications in the Site Settings.
TIP 1:
You can always go deeper; you just need to know the right XML element to nest under. <Database>, <Cells> and <DBCell> can help you go deeper.
TIP 2:
Don't query too much data at once, since the query will likely run into timeouts.
Update 1:
Just to save time for anyone who is looking, this query returns all rows (properties) for the database(s) created in an eRoom root.
(Don't forget to set the facility and room in the URL, e.g. http://server/eroomxml/facilities/TEST/Rooms/TestRoom, although it could also be set in the query.)
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:er="http://www.eroom.com/eRoomXML/2003/700">
    <soapenv:Header/>
    <soapenv:Body>
        <er:ExecuteXMLCommand>
            <er:eRoomXML>
                <er:command er:select="HomePage/Items">
                    <er:getproperties>
                        <er:Item>
                            <Database>
                                <Rows>
                                    <Item>
                                        <Cells>
                                            <DBCell>
                                                <Content>
                                                </Content>
                                            </DBCell>
                                        </Cells>
                                    </Item>
                                </Rows>
                            </Database>
                        </er:Item>
                    </er:getproperties>
                </er:command>
            </er:eRoomXML>
        </er:ExecuteXMLCommand>
    </soapenv:Body>
</soapenv:Envelope>

Catalog 'myDB' not found in database 'db'

I was trying to run Benerator to populate a database (the shop demo, which fills database schemas based on a setup file). While running it,
I got the error below:
15:25:50,232 INFO (main) [DefaultDBSystem] Fetching table details and ordering tables by dependency
15:25:50,554 ERROR (main) [DescriptorRunner] Error in Benerator execution
org.databene.commons.ConfigurationError: Catalog 'myDB' not found in database 'db'
at org.databene.platform.db.DBSystem.findTableInConfiguredCatalogAndSchema(DBSystem.java:819)
at org.databene.platform.db.DBSystem.getTable(DBSystem.java:791)
at org.databene.platform.db.DBSystem.getWriteColumnInfos(DBSystem.java:744)
at org.databene.platform.db.DBSystem.persistOrUpdate(DBSystem.java:831)
at org.databene.platform.db.DBSystem.store(DBSystem.java:360)
at org.databene.benerator.storage.StorageSystemInserter.startProductConsumption(StorageSystemInserter.java:53)
at org.databene.benerator.consumer.AbstractConsumer.startConsuming(AbstractConsumer.java:47)
at org.databene.benerator.consumer.ConsumerProxy.startConsuming(ConsumerProxy.java:62)
at org.databene.benerator.engine.statement.ConsumptionStatement.execute(ConsumptionStatement.java:53)
at org.databene.benerator.engine.statement.GenerateAndConsumeTask.execute(GenerateAndConsumeTask.java:159)
at org.databene.task.TaskProxy.execute(TaskProxy.java:59)
at org.databene.task.StateTrackingTaskProxy.execute(StateTrackingTaskProxy.java:52)
at org.databene.task.TaskExecutor.runWithoutPage(TaskExecutor.java:136)
at org.databene.task.TaskExecutor.runPage(TaskExecutor.java:126)
at org.databene.task.TaskExecutor.run(TaskExecutor.java:101)
at org.databene.task.TaskExecutor.run(TaskExecutor.java:77)
at org.databene.task.TaskExecutor.execute(TaskExecutor.java:71)
at org.databene.benerator.engine.statement.GenerateOrIterateStatement.executeTask(GenerateOrIterateStatement.java:156)
at org.databene.benerator.engine.statement.GenerateOrIterateStatement.execute(GenerateOrIterateStatement.java:99)
at org.databene.benerator.engine.statement.LazyStatement.execute(LazyStatement.java:58)
at org.databene.benerator.engine.statement.StatementProxy.execute(StatementProxy.java:46)
at org.databene.benerator.engine.statement.TimedGeneratorStatement.execute(TimedGeneratorStatement.java:70)
at org.databene.benerator.engine.statement.SequentialStatement.executeSubStatements(SequentialStatement.java:52)
at org.databene.benerator.engine.statement.SequentialStatement.execute(SequentialStatement.java:47)
at org.databene.benerator.engine.BeneratorRootStatement.execute(BeneratorRootStatement.java:63)
at org.databene.benerator.engine.DescriptorRunner.execute(DescriptorRunner.java:127)
at org.databene.benerator.engine.DescriptorRunner.runWithoutShutdownHook(DescriptorRunner.java:109)
at org.databene.benerator.engine.DescriptorRunner.run(DescriptorRunner.java:102)
at org.databene.benerator.main.Benerator.runFile(Benerator.java:94)
at org.databene.benerator.main.Benerator.runFromCommandLine(Benerator.java:75)
at org.databene.benerator.main.Benerator.main(Benerator.java:68)
15:25:50,611 INFO (main) [CachingDBImporter] Exporting Database meta data of ___temp to cache file
15:25:50,635 INFO (main) [CONFIG] Max. committed heap size: 15 MB
Inside my 'db' folder I have the file user.ben.xml, which starts with:
<database id="db" url="jdbc:oracle:thin:@localhost:1521:mirev" driver="oracle.jdbc.driver.OracleDriver" user="myDB" tableFilter="DB_.*" />
I am new to Benerator. Could anyone please tell me why this error is thrown?
By default, Oracle DB does not support catalogs. Make sure your DB has a catalog enabled and defined; if not, remove the catalog from your configuration.
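A sketch of what the adjusted descriptor could look like, assuming the tables live in the MYDB schema (the schema attribute exists on Benerator's <database> element, but the value here is an assumption):

<database id="db"
          url="jdbc:oracle:thin:@localhost:1521:mirev"
          driver="oracle.jdbc.driver.OracleDriver"
          user="myDB"
          schema="MYDB"
          tableFilter="DB_.*" />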
I tried the same today...
It seems the Oracle user/schema (= catalog in JDBC terms) needs to be alphabetically first for the example to work. I created a user 'A1000' and the example worked.

Solr 3.5 indexing taking very long

We recently migrated from Solr 3.1 to Solr 3.5, with one master and one slave configured. The master has two cores:
1) Core1 – 44555972 documents
2) Core2 – 29419244 documents
We commit every 5000 documents, but lately commits are taking very long, 15 minutes plus in some cases. What could have caused this? I have checked the logs, and the only warning I can see is:
“WARNING: Use of deprecated update request parameter update.processor detected. Please use the new parameter update.chain instead, as support for update.processor will be removed in a later version.”
Memory details:
export JAVA_OPTS="$JAVA_OPTS -Xms6g -Xmx36g -XX:MaxPermSize=5g"
Solr Config:
<useCompoundFile>false</useCompoundFile>
<mergeFactor>10</mergeFactor>
<ramBufferSizeMB>32</ramBufferSizeMB>
<!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
<maxFieldLength>10000</maxFieldLength>
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>
I also noticed that the top command shows almost 350 GB of virtual memory usage.
What could be causing this, given that everything was running fine a few days ago?
Do you have a large search-warming query? Our commits take up to 2 minutes because of the search warming we have in place; I wonder if that is the case here.
The large virtual memory usage would fit this explanation.
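For reference, warming is configured in solrconfig.xml via searcher listeners; a minimal sketch of the kind of block to look for (the query shown is only an example):

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- each entry is replayed against the new searcher after every commit;
         expensive queries here (or large autowarmCount values on the caches)
         make commits appear slow -->
    <lst>
      <str name="q">*:*</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>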
