How do you load a CSV stored in the config store from a VOLTTRON agent?

I have a CSV/Raw file with series data in it that I would like my agent to read from the config store when it starts.
Steps I am following:
Store the config:
volttron-ctl config store myagent mycsv.csv -c mycsvfile.csv --csv
I can then get the contents:
volttron-ctl config get myagent mycsv.csv
In my agent config I specify:
{
    "mycsv": "config://myagent/mycsv.csv"
}
In my agent, I try to get the stored config:
def myagent(config_path, **kwargs):
    try:
        config = utils.load_config(config_path)
    except StandardError:
        config = {}

    if not config:
        _log.info("Using Agent defaults for starting configuration.")

    mycsv = config.get('mycsv', '')
mycsv always returns the string "config://myagent/mycsv.csv".

One thing you can try is "subscribing" to changes in the config store. As far as I can tell, utils.load_config just parses the JSON file on disk, so the config:// reference comes back as a literal string; the reference is only resolved when the platform delivers the configuration to your agent through the config subsystem.
For example, if you stored your config with:
volttron-ctl config store myagent data/mydata.csv -c mydata.csv --csv
Then you can add a callback hook like this:
def __init__(self, **kwargs):
    ...
    self.vip.config.subscribe(self.read_data, actions=["NEW"], pattern="data/mydata.csv")

def read_data(self, config_path, action, contents):
    # Do stuff
    pass
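For reference, here is a minimal sketch of how that callback could sit inside an agent. It assumes the CSV was stored under the name data/mydata.csv and that a file stored with --csv is handed to the callback already parsed as a list of row dictionaries; MyAgent, read_data and self.rows are illustrative names, not from the original post.
from volttron.platform.vip.agent import Agent

class MyAgent(Agent):
    def __init__(self, config_path, **kwargs):
        super(MyAgent, self).__init__(**kwargs)
        self.rows = []
        # Fire read_data when the stored CSV is added or updated
        self.vip.config.subscribe(self.read_data,
                                  actions=["NEW", "UPDATE"],
                                  pattern="data/mydata.csv")

    def read_data(self, config_name, action, contents):
        # For a file stored with --csv, contents should arrive already parsed
        # as a list of dictionaries, one per CSV row
        self.rows = contents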

Related

Zeppelin Python Flink cannot print to console

I'm using Kinesis Data Analytics Studio which provides a Zeppelin environment.
Very simple code:
%flink.pyflink
from pyflink.common.serialization import JsonRowDeserializationSchema
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaConsumer
# create env: use Zeppelin's s_env if it is defined, otherwise create a local environment
env = s_env or StreamExecutionEnvironment.get_execution_environment()
env.add_jars("file:///home/ec2-user/flink-sql-connector-kafka_2.12-1.13.5.jar")

# create a kafka consumer
deserialization_schema = JsonRowDeserializationSchema.builder() \
    .type_info(type_info=Types.ROW_NAMED(
        ['id', 'name'],
        [Types.INT(), Types.STRING()])
    ).build()

kafka_consumer = FlinkKafkaConsumer(
    topics='nihao',
    deserialization_schema=deserialization_schema,
    properties={
        'bootstrap.servers': 'kakfa-brokers:9092',
        'group.id': 'group1'
    })
kafka_consumer.set_start_from_earliest()

ds = env.add_source(kafka_consumer)
ds.print()
env.execute('job1')
I can get this working locally and can see change logs being produced to the console. However, I cannot get the same results in Zeppelin.
I also checked STDOUT in the Flink web console task managers; nothing is there either.
Am I missing something? I've searched for days and could not find anything on it.
I'm not 100% sure, but I think you may need a sink to begin pulling data through the datastream; you could potentially use the included PrintSinkFunction.
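If the console output simply is not surfacing in Zeppelin, one way to confirm data is actually flowing is to attach an explicit sink. A rough sketch, assuming PyFlink 1.13 and reusing ds and env from the snippet above; the output path /tmp/flink-out and the string conversion are illustrative, not from the original notebook:
from pyflink.common.serialization import Encoder
from pyflink.common.typeinfo import Types
from pyflink.datastream.connectors import StreamingFileSink

# Write each record out as a string so the data flow can be verified on disk
file_sink = StreamingFileSink \
    .for_row_format('/tmp/flink-out', Encoder.simple_string_encoder()) \
    .build()

ds.map(lambda row: str(row), output_type=Types.STRING()) \
    .add_sink(file_sink)

env.execute('job1')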

Is it possible to set options in a SnowSQL config file?

I know I can do:
snowsql --connection my_connection --option friendly=false
But I'd like to do:
[connections.my_connection]
accountname = aa12345.us-central1.gcp
username = my_username
password = my_password
warehouse = my_warehouse
role = my_role
option = friendly=false
The above yields:
Error parsing /home/me/.snowsql/config Recovering partially parsed config values: cause: Parsing failed with several errors.
Yes. In your home directory, find the .snowsql sub-directory; inside is a config file where you can set exactly that.
https://docs.snowflake.com/en/user-guide/snowsql-config.html#snowsql-config-file
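Concretely, per that page, connection parameters stay under the [connections.*] section and general options such as friendly go in a separate [options] section, so the config would look roughly like this:
[connections.my_connection]
accountname = aa12345.us-central1.gcp
username = my_username
password = my_password
warehouse = my_warehouse
role = my_role

[options]
friendly = False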

Stuck at: Could not find a suitable table factory

While playing around with Flink, I have been trying to upsert data into Elasticsearch. I'm getting this error in my STDOUT:
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSinkFactory' in
the classpath.
Reason: Required context properties mismatch.
The following properties are requested:
connector.hosts=http://elasticsearch-elasticsearch-coordinating-only.default.svc.cluster.local:9200
connector.index=transfers-sum
connector.key-null-literal=n/a
connector.property-version=1
connector.type=elasticsearch
connector.version=6
format.json-schema={ \"curr_careUnit\": {\"type\": \"text\"}, \"sum\": {\"type\": \"float\"} }
format.property-version=1
format.type=json
schema.0.data-type=VARCHAR(2147483647)
schema.0.name=curr_careUnit
schema.1.data-type=FLOAT
schema.1.name=sum
update-mode=upsert
The following factories have been considered:
org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
at org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:322)
...
Here is what I have in my Scala Flink code:
def main(args: Array[String]) {
  // Create streaming execution environment
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  env.setParallelism(1)

  // Set properties per KafkaConsumer API
  val properties = new Properties()
  properties.setProperty("bootstrap.servers", "kafka.kafka:9092")
  properties.setProperty("group.id", "test")

  // Add Kafka source to environment
  val myKConsumer = new FlinkKafkaConsumer010[String]("raw.data4", new SimpleStringSchema(), properties)
  // Read from beginning of topic
  myKConsumer.setStartFromEarliest()

  val streamSource = env.addSource(myKConsumer)

  // Transform CSV (with a header row) per Kafka event into a Transfers object
  val streamTransfers = streamSource.map(new TransfersMapper())

  // create a TableEnvironment
  val tEnv = StreamTableEnvironment.create(env)

  // register a Table
  val tblTransfers: Table = tEnv.fromDataStream(streamTransfers)
  tEnv.createTemporaryView("transfers", tblTransfers)

  tEnv.connect(
    new Elasticsearch()
      .version("6")
      .host("elasticsearch-elasticsearch-coordinating-only.default.svc.cluster.local", 9200, "http")
      .index("transfers-sum")
      .keyNullLiteral("n/a"))
    .withFormat(new Json().jsonSchema("{ \"curr_careUnit\": {\"type\": \"text\"}, \"sum\": {\"type\": \"float\"} }"))
    .withSchema(new Schema()
      .field("curr_careUnit", DataTypes.STRING())
      .field("sum", DataTypes.FLOAT()))
    .inUpsertMode()
    .createTemporaryTable("transfersSum")

  val result = tEnv.sqlQuery(
    """
      |SELECT curr_careUnit, sum(los)
      |FROM transfers
      |GROUP BY curr_careUnit
      |""".stripMargin)

  result.insertInto("transfersSum")
  env.execute("Flink Streaming Demo Dump to Elasticsearch")
}
I am creating a fat jar and uploading it to my remote Flink instance. Here are my build.gradle dependencies:
compile 'org.scala-lang:scala-library:2.11.12'
compile 'org.apache.flink:flink-scala_2.11:1.10.0'
compile 'org.apache.flink:flink-streaming-scala_2.11:1.10.0'
compile 'org.apache.flink:flink-connector-kafka-0.10_2.11:1.10.0'
compile 'org.apache.flink:flink-table-api-scala-bridge_2.11:1.10.0'
compile 'org.apache.flink:flink-connector-elasticsearch6_2.11:1.10.0'
compile 'org.apache.flink:flink-json:1.10.0'
compile 'com.fasterxml.jackson.core:jackson-core:2.10.1'
compile 'com.fasterxml.jackson.module:jackson-module-scala_2.11:2.10.1'
compile 'org.json4s:json4s-jackson_2.11:3.7.0-M1'
Here is how the fatJar task is defined in Gradle:
jar {
    from {
        (configurations.compile).collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
    manifest {
        attributes("Main-Class": "main")
    }
}

task fatJar(type: Jar) {
    zip64 true
    manifest {
        attributes 'Main-Class': "flinkNamePull.Demo"
    }
    baseName = "${rootProject.name}"
    from { configurations.compile.collect { it.isDirectory() ? it : zipTree(it) } }
    with jar
}
Could anybody please help me see what I am missing? I'm fairly new to Flink and data streaming in general.
Thank you in advance!
Is the list under "The following factories have been considered:" complete? Does it contain Elasticsearch6UpsertTableSinkFactory? If not, then as far as I can tell there is a problem with the service discovery dependencies.
How do you submit your job? Can you check if you have a file META-INF/services/org.apache.flink.table.factories.TableFactory in the uber jar with an entry for Elasticsearch6UpsertTableSinkFactory?
When using maven you have to add a transformer to properly merge service files:
<!-- The service transformer is needed to merge META-INF/services files -->
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
I don't know how you do it in Gradle.
EDIT:
Thanks to Arvid Heise: in Gradle, when using the shadowJar plugin, you can merge service files via:
// Merging Service Files
shadowJar {
    mergeServiceFiles()
}
You should use the shadow plugin to create the fat jar instead of doing it manually.
In particular, you want to merge service descriptors.
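For example, a rough build.gradle sketch using the com.github.johnrengelman.shadow plugin; the plugin version shown is an assumption, so pick one compatible with your Gradle version:
plugins {
    id 'scala'
    id 'com.github.johnrengelman.shadow' version '5.2.0' // version is an assumption
}

shadowJar {
    zip64 true
    manifest {
        attributes 'Main-Class': 'flinkNamePull.Demo'
    }
    // Merge META-INF/services files so factories like
    // Elasticsearch6UpsertTableSinkFactory remain discoverable
    mergeServiceFiles()
}
The shadowJar task then produces the fat jar under build/libs, which you can submit instead of the manually assembled one.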

Prototyping: Simplest HTTP server with URL routing (to use w/ Backbone.Router)?

We're working on a Backbone.js application, and the fact that we can start an HTTP server by typing python -m SimpleHTTPServer is brilliant.
We'd like the ability to route any URL (e.g. localhost:8000/path/to/something) to our index.html so that we can test Backbone.Router with HTML5 pushState.
What is the most painless way to accomplish that? (For the purpose of quick prototyping)
Just use the built-in Python functionality in BaseHTTPServer:
import BaseHTTPServer

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        self.wfile.write(open('index.html').read())

httpd = BaseHTTPServer.HTTPServer(('127.0.0.1', 8000), Handler)
httpd.serve_forever()
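Note that BaseHTTPServer is Python 2 only; on Python 3 the same idea uses the http.server module. A minimal sketch of the equivalent handler, keeping the file name and port from the snippet above:
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Always return index.html so Backbone.Router can handle the path client-side
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        with open('index.html', 'rb') as f:
            self.wfile.write(f.read())

httpd = HTTPServer(('127.0.0.1', 8000), Handler)
httpd.serve_forever()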
Download and install CherryPy
Create the following Python script (call it always_index.py or something like that) and replace 'c:\index.html' with the path of the file you actually want to serve:
import cherrypy

class Root:
    def __init__(self, content):
        self.content = content

    def default(self, *args):
        return self.content
    default.exposed = True

cherrypy.quickstart(Root(open('c:\index.html', 'r').read()))
Run python <path\to\always_index.py>
Point your browser at http://localhost:8080 and no matter what url you request, you always get the same content.

Grails - logging from Quartz job and Filter

I would like to store my logs in files.
Here is how I declare my appenders in Config.groovy:
log4j = {
    appenders {
        // console name:'stdout', layout:pattern(conversionPattern: '%c{2} %m%n')
        file name: "scraperServiceDetailedLogger",
             file: "target/scraperServiceDetailed.log"
        file name: "scraperServiceLogger",
             file: "target/scraperService.log"
        file name: "filterLogger",
             file: "target/filter.log"
    }

    error 'org.codehaus.groovy.grails.web.servlet',        // controllers
          'org.codehaus.groovy.grails.web.pages',          // GSP
          'org.codehaus.groovy.grails.web.sitemesh',       // layouts
          'org.codehaus.groovy.grails.web.mapping.filter', // URL mapping
          'org.codehaus.groovy.grails.web.mapping',        // URL mapping
          'org.codehaus.groovy.grails.commons',            // core / classloading
          'org.codehaus.groovy.grails.plugins',            // plugins
          'org.codehaus.groovy.grails.orm.hibernate',      // hibernate integration
          'org.springframework',
          'org.hibernate',
          'net.sf.ehcache.hibernate'

    error scraperServiceDetailedLogger: "grails.app.service.personalcreditcomparator.ScraperService"
    info scraperServiceLogger: "grails.app.jobs.personalcreditcomparator.ScraperJob"
    info filterLogger: "grails.app.conf.personalcreditcomparator.AdministratorInterfaceProtectorFilters"
}
The three files are created properly, but only scraperServiceDetailedLogger stores the logs as expected; the other two files remain empty.
The logging level is respected when the log calls are made.
What am I missing?
Thank you for any help provided.
For Quartz jobs, try the logger prefix 'grails.app.task':
info scraperServiceLogger: "grails.app.task.personalcreditcomparator.ScraperJob"
And for filters, try the logger prefix 'grails.app.filters':
info filterLogger: "grails.app.filters.personalcreditcomparator.AdministratorInterfaceProtectorFilters"
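Put together (package names as in the question, with the appenders left as declared above), the logger lines in Config.groovy would then read:
error scraperServiceDetailedLogger: "grails.app.service.personalcreditcomparator.ScraperService"
info scraperServiceLogger: "grails.app.task.personalcreditcomparator.ScraperJob"
info filterLogger: "grails.app.filters.personalcreditcomparator.AdministratorInterfaceProtectorFilters"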
