Using Volttron aggregation agent - volttron

I'm trying to get the aggregate historian agent to work with TimescaleDB.
https://volttron.readthedocs.io/en/main/developing-volttron/developing-agents/specifications/aggregate.html
I have a file aggregation.config:
{
    "connection": {
        "type": "postgresql",
        "params": {
            "dbname": "volttrondb",
            "host": "127.0.0.1",
            "port": 5432,
            "user": "user",
            "password": "password",
            "timescale_dialect": true
        }
    },
    "aggregations": [
        # list of aggregation groups each with unique aggregation_period and
        # list of points that needs to be collected
        {
            "aggregation_period": "1h",
            "use_calendar_time_periods": true,
            "utc_collection_start_time": "2016-03-01T01:15:01.000000",
            "points": [
                {
                    "topic_names": ["campus/building/fake/EKG_Cos", "campus/building/fake/EKG_Sin"],
                    "aggregation_topic_name": "campus/building/fake/avg_of_EKG_Cos_EKG_Sin",
                    "aggregation_type": "avg",
                    "min_count": 2
                }
            ]
        }
    ]
}
Then I run the following command:
vctl install services/core/SQLAggregateHistorian/ --tag aggregate-historian -c config/aggregation.config --start
It starts correctly: vctl status shows it running and there are no errors in the log.
However, I do not see the aggregation topic campus/building/fake/avg_of_EKG_Cos_EKG_Sin in the topics table.
Any suggestions?
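For reference, this is roughly how I'm checking the database. It's a minimal sketch on my side, not part of VOLTTRON; the aggregate_topics table name is an assumption based on the aggregate historian spec, and both names may differ if your historian config uses a table_prefix.
import psycopg2

# Dump any row mentioning the aggregate topic from both the regular "topics"
# table and the "aggregate_topics" table the SQL aggregate historian is
# expected to populate. Table names are assumptions; adjust as needed.
conn = psycopg2.connect(
    dbname="volttrondb", host="127.0.0.1", port=5432,
    user="user", password="password",
)
with conn, conn.cursor() as cur:
    for table in ("topics", "aggregate_topics"):
        try:
            cur.execute("SELECT * FROM " + table)
            matches = [row for row in cur.fetchall()
                       if "avg_of_EKG_Cos_EKG_Sin" in str(row)]
            print(table, "->", matches)
        except psycopg2.Error as exc:
            conn.rollback()
            print(table, "-> query failed:", exc)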

Related

Starting and stopping services in Vespa

The benchmarking page https://docs.vespa.ai/en/performance/vespa-benchmarking.html says that we need to restart the services after increasing the per-search threads, using the commands vespa-stop-services and vespa-start-services.
Could you tell us if we need to do this on all the content nodes or just the config nodes?
When deploying a change that requires a restart, the deploy command will list the actions you need to take. For example, here is the output when changing the global per-search threads setting from 2 to 5:
curl --header Content-Type:application/zip --data-binary @target/application.zip localhost:19071/application/v2/tenant/default/prepareandactivate | jq .
{
    "log": [
        {
            "time": 1645036778830,
            "level": "WARNING",
            "message": "Change(s) between active and new application that require restart:\nIn cluster 'mycluster' of type 'search':\n Restart services of type 'searchnode' because:\n 1) # Number of threads used per search\nproton.numthreadspersearch has changed from 2 to 5\n"
        }
    ],
    "tenant": "default",
    "url": "http://localhost:19071/application/v2/tenant/default/application/default/environment/prod/region/default/instance/default",
    "message": "Session 8 for tenant 'default' prepared and activated.",
    "configChangeActions": {
        "restart": [
            {
                "clusterName": "mycluster",
                "clusterType": "search",
                "serviceType": "searchnode",
                "messages": [
                    "# Number of threads used per search\nproton.numthreadspersearch has changed from 2 to 5"
                ],
                "services": [
                    {
                        "serviceName": "searchnode",
                        "serviceType": "searchnode",
                        "configId": "mycluster/search/cluster.mycluster/0",
                        "hostName": "vespa-container"
                    }
                ]
            }
        ],
        "refeed": [],
        "reindex": []
    }
}
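As a rough sketch (not Vespa tooling, just a convenience script assuming the response above has been saved to a file named prepare_response.json), you can pull the affected services and hosts out of configChangeActions and restart only those nodes:
import json

# List which services on which hosts the prepare/activate response says need a
# restart. The filename is an assumption for this sketch.
with open("prepare_response.json") as f:
    response = json.load(f)

for action in response.get("configChangeActions", {}).get("restart", []):
    for service in action.get("services", []):
        print(f"Run vespa-stop-services / vespa-start-services for "
              f"{service['serviceType']} on host {service['hostName']}")
In the example above only searchnode services are listed, which suggests the restart applies to the content nodes running those searchnodes rather than the config servers.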

80 Postgres databases in Azure Data Factory to copy into

I'm using Azure Data Factory, with SQL Server as source and Postgres as target. The goal is to copy 30 tables from SQL Server, with transformation, to 30 tables in Postgres. The hard part is that I have 80 databases on both sides, all with the exact same layout but different data. It's one database per customer, so 80 customers each with their own database.
Linked Services don't allow parameters for Postgres.
I have one dataset per source and target, using parameters for schema and table names.
I have one pipeline per table, with a SQL Server source and a Postgres target.
I can parameterize the SQL Server source in the linked service, but not Postgres.
The problem is: how can I copy 80 source databases to 80 target databases without adding 80 target linked services and 80 target datasets? Plus I'd have to repeat all 30 pipelines per target database.
BTW I'm only familiar with the UI, but anything else that does the job is acceptable.
Any help would be appreciated.
There is a simple way to implement this. Essentially you need a single Linked Service that reads its connection string out of Key Vault. You can then parameterize source and target as Key Vault secret names and switch between data sources by just changing the secret name. This relies on all connection-related information being enclosed within a single connection string.
I will provide a simple overview for PostgreSQL, but the same logic applies to MSSQL servers as a source.
Implement a Linked Service for Azure Key Vault.
Add a Linked Service for Azure PostgreSQL that uses Key Vault to store the connection string in the format: Server=your_server_name.postgres.database.azure.com;Database=your_database_name;Port=5432;UID=your_user_name;Password=your_password;SSL Mode=Require;Keepalive=600; (I advise using the server name as the secret name.)
Pass this parameter, which is essentially the correct secret name, in the pipeline (you can also implement a loop that accepts an array of secret names and fans them out n at a time into a separate pipeline; a rough sketch of driving this from outside ADF appears after the pipeline definition below).
Linked Service Definition for KeyVault:
{
    "name": "your_keyvault_name",
    "properties": {
        "description": "KeyVault",
        "annotations": [],
        "type": "AzureKeyVault",
        "typeProperties": {
            "baseUrl": "https://your_keyvault_name.vault.azure.net/"
        }
    }
}
Linked Service Definition for Postgresql:
{ "name": "generic_postgres_service".
"properties": {
"type": "AzurePostgreSql",
"parameters": {
"pg_database": {
"type": "string",
"defaultValue": "your_database_name"
}
},
"annotations": [],
"typeProperties": {
"connectionString": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "KeyVaultName",
"type": "LinkedServiceReference"
},
"secretName": "#linkedService().secret_name_for_server"
}
},
"connectVia": {
"referenceName": "AutoResolveIntegrationRuntime",
"type": "IntegrationRuntimeReference"
}
}
}
Dataset Definition for Postgresql:
{
    "name": "your_postgresql_dataset",
    "properties": {
        "linkedServiceName": {
            "referenceName": "generic_postgres_service",
            "type": "LinkedServiceReference",
            "parameters": {
                "secret_name_for_server": {
                    "value": "@dataset().secret_name_for_server",
                    "type": "Expression"
                }
            }
        },
        "parameters": {
            "secret_name_for_server": {
                "type": "string"
            },
            "schema_name": {
                "type": "string"
            },
            "table_name": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzurePostgreSqlTable",
        "schema": [],
        "typeProperties": {
            "schema": {
                "value": "@dataset().schema_name",
                "type": "Expression"
            },
            "table": {
                "value": "@dataset().table_name",
                "type": "Expression"
            }
        }
    }
}
Pipeline Definition for Postgresql:
{
    "name": "your_postgres_pipeline",
    "properties": {
        "activities": [
            {
                "name": "Copy_Activity_1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                ...
                ... i skipped definition
                ...
                "inputs": [
                    {
                        "referenceName": "your_postgresql_dataset",
                        "type": "DatasetReference",
                        "parameters": {
                            "secret_name_for_server": "secret_name"
                        }
                    }
                ]
            }
        ],
        "annotations": []
    }
}
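As a rough sketch of the loop mentioned earlier (my own orchestration idea, not an ADF feature), you can trigger the same pipeline once per customer database via the Data Factory createRun REST API, passing a different Key Vault secret name each time. This assumes the pipeline exposes a secret_name_for_server parameter that it forwards to the dataset; the subscription, resource group, factory name and secret naming scheme below are placeholders.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY = "<data-factory-name>"
PIPELINE = "your_postgres_pipeline"

# Acquire an ARM token with whatever identity DefaultAzureCredential resolves to.
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

# 80 secret names, one per customer server (naming scheme is an assumption).
secret_names = [f"customer-{i:03d}-server" for i in range(1, 81)]

for secret_name in secret_names:
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.DataFactory"
        f"/factories/{FACTORY}/pipelines/{PIPELINE}/createRun?api-version=2018-06-01"
    )
    run = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json={"secret_name_for_server": secret_name},  # pipeline parameters
    )
    run.raise_for_status()
    print(secret_name, "->", run.json().get("runId"))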

How to get the table-name and database-name in the CDC event received from debezium kafka connect

Setup: I have CDC enabled on MS SQL Server and the CDC events are fed to Kafka using Debezium Kafka Connect (source). Also, CDC events from more than one table are routed to a single Kafka topic.
Question: Since I have data from more than one table in the Kafka topic, I would like to have the table name and the database name in the CDC data.
I am getting the table name and database name in the MySQL CDC events but not in the MS SQL ones.
Below is the Debezium source connector configuration for SQL Server:
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
    "name": "cdc-user_profile-connector",
    "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "tasks.max": "1",
        "database.hostname": "<<hostname>>",
        "database.port": "<<port>>",
        "database.user": "<<username>>",
        "database.password": "<<password>>",
        "database.server.name": "test",
        "database.dbname": "testDb",
        "table.whitelist": "schema01.table1,schema01.table2",
        "database.history.kafka.bootstrap.servers": "broker:9092",
        "database.history.kafka.topic": "digital.user_profile.schema.audit",
        "database.history.store.only.monitored.tables.ddl": true,
        "include.schema.changes": false,
        "event.deserialization.failure.handling.mode": "fail",
        "snapshot.mode": "initial_schema_only",
        "snapshot.locking.mode": "none",
        "transforms": "addStaticField,topicRoute",
        "transforms.addStaticField.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.addStaticField.static.field": "source_system",
        "transforms.addStaticField.static.value": "source_system_1",
        "transforms.topicRoute.type": "org.apache.kafka.connect.transforms.RegexRouter",
        "transforms.topicRoute.regex": "(.*)",
        "transforms.topicRoute.replacement": "digital.user_profile",
        "errors.tolerance": "none",
        "errors.log.enable": true,
        "errors.log.include.messages": true,
        "errors.retry.delay.max.ms": 60000,
        "errors.retry.timeout": 300000
    }
}'
I am getting the below output (demo data):
{
    "before": {
        "profile_id": 147,
        "email_address": "test@gmail.com"
    },
    "after": {
        "profile_id": 147,
        "email_address": "test_modified@gmail.com"
    },
    "source": {
        "version": "0.9.4.Final",
        "connector": "sqlserver",
        "name": "test",
        "ts_ms": 1556723528917,
        "change_lsn": "0007cbe5:0000b98c:0002",
        "commit_lsn": "0007cbe5:0000b98c:0003",
        "snapshot": false
    },
    "op": "u",
    "ts_ms": 1556748731417,
    "source_system": "source_system_1"
}
My requirement is to get the following:
{
    "before": {
        "profile_id": 147,
        "email_address": "test@gmail.com"
    },
    "after": {
        "profile_id": 147,
        "email_address": "test_modified@gmail.com"
    },
    "source": {
        "version": "0.9.4.Final",
        "connector": "sqlserver",
        "name": "test",
        "ts_ms": 1556723528917,
        "change_lsn": "0007cbe5:0000b98c:0002",
        "commit_lsn": "0007cbe5:0000b98c:0003",
        "snapshot": false,
        "db": "testDb",
        "table": "table1/table2"
    },
    "op": "u",
    "ts_ms": 1556748731417,
    "source_system": "source_system_1"
}
This is planned as part of the https://issues.jboss.org/browse/DBZ-875 issue.
Debezium Kafka Connect generally puts data from each table in a separate topic, and the topic name is of the format hostname.database.table. We generally use the topic name to distinguish between the source table and database.
If you are routing the data from all the tables into one topic yourself, then you might have to add the table and database name yourself as well.
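As a minimal sketch of the first point (kafka-python assumed to be available, JSON converter assumed), the source table can normally be recovered from Debezium's default topic naming before any rerouting; once the RegexRouter above rewrites every topic to digital.user_profile, that information is gone from the topic name, which is why it has to be injected into the payload instead. The topic name used below is an assumed example.
import json
from kafka import KafkaConsumer

# Consume from a default (un-routed) Debezium topic and derive the origin of
# each event from the topic name segments.
consumer = KafkaConsumer(
    "test.schema01.table1",            # assumed default topic, before rerouting
    bootstrap_servers="broker:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")) if v else None,
)

for record in consumer:
    server_name, schema_name, table_name = record.topic.split(".", 2)
    print(f"server={server_name} schema={schema_name} table={table_name}")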

IBM Cloud Functions action took too long to respond in IBM Watson chat dialog

Hi, I am creating a chatbot. I developed an IBM Cloud Function (action) in IBM Cloud.
This is the action code:
{
    "context": {
        "my_creds": {
            "user": "ssssssssssssssssss",
            "password": "sssssssssssssssssssssss"
        }
    },
    "output": {
        "generic": [
            {
                "values": [
                    {
                        "text": ""
                    }
                ],
                "response_type": "text",
                "selection_policy": "sequential"
            }
        ]
    },
    "actions": [
        {
            "name": "ssssssssssss/user-detail",
            "type": "server",
            "parameters": {
                "name": "<?input.text?>",
                "lastname": "<?input.text?>"
            },
            "credentials": "$my_creds",
            "result_variable": "$my_result"
        }
    ]
}
Now my user-detail action gives a response when I invoke it directly.
But when I check the output with my chatbot, I get an error that execution of the Cloud Functions action took too long.
There is currently a 5 second limitation on processing time for a cloud function being called from a dialog node. If your process will need longer than this, you'll need to do it client side through your application layer.
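As a rough sketch of that client-side approach (my own example, not an official pattern): the application invokes the Cloud Functions action itself, where no 5-second dialog limit applies, and then feeds the result back to the assistant as context on the next message. The region, namespace, action path and credentials below are placeholders.
import requests

# Invoke the action directly through the Cloud Functions (OpenWhisk) REST API.
FUNCTIONS_URL = (
    "https://us-south.functions.cloud.ibm.com/api/v1"
    "/namespaces/your_namespace/actions/user-detail"   # assumed action path
)

resp = requests.post(
    FUNCTIONS_URL,
    params={"blocking": "true", "result": "true"},
    auth=("your_api_key_user", "your_api_key_password"),  # Cloud Functions API key
    json={"name": "John", "lastname": "Doe"},
    timeout=120,  # the application layer can wait as long as it needs
)
resp.raise_for_status()
user_detail = resp.json()
print(user_detail)
# user_detail can then be passed back to Watson Assistant as context on the
# next message call instead of calling the action from the dialog node.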

Create Solr readonly user

I want to create a read-only user for my Solr Cloud cluster. For this I created a new security.json file and uploaded it to my ZooKeeper server. However, the solr user can do selects and inserts, and the SOLRREAD user can still insert. I want the SOLRREAD user to only be able to read a collection, not write to it.
Solr 5.5.0
Do you know what's wrong?
/usr/iop/4.2.0.0/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost bdmstd001.zit.com:2181 -cmd put /solr/security.json '
{
    "authentication": {
        "blockUnknown": true,
        "class": "solr.BasicAuthPlugin",
        "credentials": {
            "solr": "Some hash",
            "SOLRREAD": "Some hash"
        }
    },
    "": {"v": 3},
    "authorization": {
        "class": "solr.RuleBasedAuthorizationPlugin",
        "user-role": {
            "solr": "admin",
            "SOLRREAD": "dev"
        },
        "permissions": [
            {
                "role": "dev",
                "name": "collection-admin-read"
            },
            {
                "role": "admin",
                "name": "collection-admin-edit"
            }
        ]
    }
}
'
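For reference, this is roughly how I'm testing what each user is allowed to do. It's a quick sketch on my side; the collection name, host and passwords are placeholders.
import requests

# Try a select and an update as each user and print the HTTP status codes.
SOLR = "http://localhost:8983/solr/mycollection"   # placeholder collection
USERS = {"solr": "solr_password", "SOLRREAD": "read_password"}

for user, password in USERS.items():
    select = requests.get(f"{SOLR}/select", params={"q": "*:*"}, auth=(user, password))
    update = requests.post(
        f"{SOLR}/update",
        params={"commit": "true"},
        json=[{"id": "permission-test-doc"}],
        auth=(user, password),
    )
    print(f"{user}: select={select.status_code} update={update.status_code}")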
