Unable to push vespa metrics to cloudwatch - vespa

Basically, I need to monitor Vespa metrics, and for that I am trying to set up pushing metrics to CloudWatch.
This is the document I am referring to: https://docs.vespa.ai/documentation/monitoring.html
I have added the credentials file and the putMetricData permission to the attached IAM role. The services.xml file that I am using in my code looks like this:
<admin version="2.0">
  <adminserver hostalias="admin0"/>
  <configservers>
    <configserver hostalias="admin0"/>
  </configservers>
  <monitoring>
  </monitoring>
  <metrics>
    <consumer id="my-cloudwatch">
      <metric-set id="vespa" />
      <cloudwatch region="ap-south-1" namespace="vespa">
        <shared-credentials file="~/.aws/credentials" profile="default" />
      </cloudwatch>
    </consumer>
  </metrics>
</admin>
I have deployed the code using vespa-deploy prepare application.zip && vespa-deploy activate, but I am still not seeing any metrics in CloudWatch.
Also, I have tried to add:
<monitoring>
  <interval>1</interval>
  <systemname>vespa</systemname>
</monitoring>
But I get this error when deploying:
Request failed. HTTP status code: 400
Invalid application package: default.default: Error loading model: XML error in services.xml: element "interval" not allowed here; expected the element end-tag [9:16], input:
How can I fix this issue, or at least debug it?

I suggest using an absolute path to the credentials file, as the ~ may not resolve to the directory you intended at runtime.
A couple more things:
I recommend using the default metric set, as vespa contains a lot of metrics, which will drive your CloudWatch costs higher. If you need additional metrics, you can add them with the metric tag inside consumer (see the sketch below).
The monitoring element doesn't do anything useful in this context, so you should just drop it.
If you still don't see any metrics, check for warnings or errors in the Vespa log file (use vespa-logfmt) and in the Telegraf log file: /opt/vespa/logs/telegraf/telegraf.log. (Vespa uses Telegraf internally to emit metrics to CloudWatch.)
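Putting those suggestions together, a minimal services.xml sketch could look like the following. The extra metric id and the credentials path are placeholders for illustration; substitute your own:
<admin version="2.0">
  <adminserver hostalias="admin0"/>
  <configservers>
    <configserver hostalias="admin0"/>
  </configservers>
  <metrics>
    <consumer id="my-cloudwatch">
      <!-- the smaller default set instead of the full vespa set -->
      <metric-set id="default"/>
      <!-- hypothetical extra metric; add only what you need -->
      <metric id="content.proton.documentdb.documents.total.last"/>
      <cloudwatch region="ap-south-1" namespace="vespa">
        <!-- absolute path instead of ~ -->
        <shared-credentials file="/home/vespa/.aws/credentials" profile="default"/>
      </cloudwatch>
    </consumer>
  </metrics>
</admin>
To scan the log for problems, something like vespa-logfmt -l warning,error should surface any CloudWatch-related failures.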

Related

Google AppEngine application log assigned to the wrong request log

When I look at the logs in the Google Log Viewer for my GAE project, I see that often the logs that I write myself in the code are assigned to the wrong request. Most of the time the log is assigned to the request directly after the request that produced the log entry.
Since the root of every application log in GAE must be a request, this means that the wrong request is sometimes marked as an error: an earlier request produced the error, but the log entry is somehow assigned to the request after it.
I don't really do anything special; I use Ktor as my server framework (running as a servlet) and have an interceptor that writes a log entry when an exception occurs, before returning status 500.
I use Java logging via SLF4J with the Google Cloud logging handler, but before that I used Logback via SLF4J and had the same problem.
The content of the logs themselves is also correct: the returned status of the request, the level of the log entry, the message, everything is OK.
I thought it might be because I use Kotlin and switch coroutine contexts during a single request, but in some cases the point where I write the log and the point where I send the response are right next to each other, so I'm not sure Kotlin has anything to do with it.
My logging.properties:
# To use this configuration, add to system properties : -Djava.util.logging.config.file="/path/to/file"
#
.level = INFO
# it is recommended that io.grpc and sun.net logging level is kept at INFO level,
# as both these packages are used by Stackdriver internals and can result in verbose / initialization problems.
io.grpc.netty.level=INFO
sun.net.level=INFO
handlers=com.google.cloud.logging.LoggingHandler
# default : java.log
com.google.cloud.logging.LoggingHandler.log=custom_log
# default : INFO
com.google.cloud.logging.LoggingHandler.level=INFO
# default : ERROR
com.google.cloud.logging.LoggingHandler.flushLevel=WARNING
# default : auto-detected, fallback "global"
#com.google.cloud.logging.LoggingHandler.resourceType=container
# custom formatter
com.google.cloud.logging.LoggingHandler.formatter=java.util.logging.SimpleFormatter
java.util.logging.SimpleFormatter.format=%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS %4$-6s %2$s %5$s%6$s%n
#optional enhancers (to add additional fields, labels)
#com.google.cloud.logging.LoggingHandler.enhancers=com.example.logging.jul.enhancers.ExampleEnhancer
My logging-relevant dependencies:
implementation "org.slf4j:slf4j-jdk14:1.7.30"
implementation "com.google.cloud:google-cloud-logging:1.100.0"
An example logging call:
exception<Throwable> { e ->
    logger().error("Error", e)
    call.respondText(e.message ?: "", ContentType.Text.Plain, HttpStatusCode.InternalServerError)
}
with logger() being:
import org.slf4j.Logger
import org.slf4j.LoggerFactory
inline fun <reified T : Any> T.logger(): Logger = LoggerFactory.getLogger(T::class.java)
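For illustration, a hypothetical call site, assuming the logger() extension above is in scope:
class UserService {
    fun load(id: String) {
        // T is reified to UserService, so the SLF4J logger is named after this class
        logger().info("loading user {}", id)
    }
}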
Edit:
An example of the log in Google Cloud: the first request has the query parameter GAID=cdda802e-fb9c-47ad-0794d394c913, but as you can see, the error log for that request is attached to the one below it, marked in red.

How to log as jsonPayload to stackdriver from google app engine using logback?

My Spring Boot app uses Logback to log messages in JSON format. The app is configured to use a console log appender (stdout). When the logs appear in Stackdriver, they appear as textPayload instead of jsonPayload. Is it possible to write the message to the jsonPayload field in Stackdriver using Logback? If not, what are my options for logging in JSON format?
Based on this GitHub link, it seems the issue is that all log entries are seen as textPayload. It has been added as a feature request, but we do not have an ETA on when it will be available.
I'm not entirely sure an alternative exists, as Logback seems to give extensive log information, but if you are able to use the Stackdriver Logging client instead, you could format the entry in order to get your object as a JsonPayload, although you will have to specify most of the log categories yourself, which can be an extra amount of work.
The easy way to do this is to implement the transformation of the textPayload (JSON format) to jsonPayload in a LoggingEnhancer.
Check this answer How to use Stackdriver Structured Logging in App Engine Flex Java environment
It is possible via the google-cloud-logging-logback library.
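If you use Gradle, the dependency would look something like this (the version is illustrative; check Maven Central for a current one):
implementation "com.google.cloud:google-cloud-logging-logback:0.116.0-alpha"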
However, please note the following (from https://cloud.google.com/logging/docs/structured-logging):
Note: message is saved as textPayload if it is the only field remaining
after the Logging agent moves the other special-purpose fields and
detect_json wasn't enabled; otherwise message remains in jsonPayload.
detect_json is not applicable to managed logging environments like
Google Kubernetes Engine.
To add more data to the JSON payload, add an enhancer. Example:
import ch.qos.logback.classic.spi.ILoggingEvent;
import com.google.cloud.logging.LogEntry;
import com.google.cloud.logging.Payload;
import com.google.cloud.logging.logback.LoggingEventEnhancer;
import java.util.HashMap;

public class EventEnhancer implements LoggingEventEnhancer {
    @Override
    public void enhanceLogEntry(LogEntry.Builder logEntry, ILoggingEvent e) {
        // extra fields to merge into the JSON payload
        HashMap<String, Object> map = new HashMap<>();
        map.put("thread", e.getThreadName());
        map.put("context", e.getLoggerContextVO().getName());
        map.put("logger", e.getLoggerName());
        // keep whatever the appender already put into the payload
        Payload.JsonPayload payload = logEntry.build().getPayload();
        map.putAll(payload.getDataAsMap());
        logEntry.setPayload(Payload.JsonPayload.of(map));
    }
}
Configuration:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE configuration>
<configuration scan="true">
  <appender name="CLOUD" class="com.google.cloud.logging.logback.LoggingAppender">
    <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
      <level>INFO</level>
    </filter>
    <log>application.log</log>
    <redirectToStdout>true</redirectToStdout>
    <resourceType>gae_app</resourceType>
    <loggingEventEnhancer>EventEnhancer</loggingEventEnhancer>
    <flushLevel>INFO</flushLevel>
  </appender>
  <root level="INFO">
    <appender-ref ref="CLOUD"/>
  </root>
</configuration>

Solr: where to find the Luke request handler

I'm trying to get a list of all the fields, both static and dynamic, in my Solr index. Another SO answer suggested using the Luke Request Handler for this.
It suggests finding the handler at this URL:
http://solr:8983/solr/admin/luke?numTerms=0
When I try this URL on my server, however, I get a 404 error.
The admin page for my core is at http://solr:8983/solr/#/mycore, so I also tried http://solr:8983/solr/#/mycore/admin/luke. That gave me another 404.
Does anyone know what I'm doing wrong? Which URL should I be using?
First of all, you have to enable the Luke Request Handler. Note that if you started from the example solrconfig.xml you probably don't need to enable it explicitly, because
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />
does it for you.
Then, if you need to access the data programmatically, you have to make an HTTP GET request to http://solr:8983/solr/mycore/admin/luke (no hash mark!). The response is in XML, but by specifying the wt parameter you can obtain other formats (e.g. http://solr:8983/solr/mycore/admin/luke?wt=json).
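For example, a quick way to fetch the field list from the command line (assuming a core named mycore, as above):
curl "http://solr:8983/solr/mycore/admin/luke?wt=json&numTerms=0"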
If you only want to see the fields in the Solr web interface, select your core from the drop-down menu and then click on "Schema Browser".
In Solr 6, the solr.admin.AdminHandlers has been removed. If your solrconfig.xml has the line <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />, it will fail to load. You will see errors in the log telling you it failed to load the class org.apache.solr.handler.admin.AdminHandlers.
You must include in your solrconfig.xml the line
<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />
but the URL is core-specific, i.e. http://your_server.com:8983/solr/your_core_name/admin/luke
And you can specify the parameters fl, numTerms, id, and docId as follows:
/admin/luke
/admin/luke?fl=cat
/admin/luke?fl=id&numTerms=50
/admin/luke?id=SOLR1000
/admin/luke?docId=2
You can use the Luke tool, which allows you to explore the Lucene index directly.
You can also use the Solr admin page:
http://localhost:8983/solr/#/core/schema-browser

How can I handle errors I get from the liquibase updateDatabase ant task

I'm currently working on some Ant scripts for applying Liquibase changes to databases.
I'd like to be able to handle errors that I get in Ant from the Liquibase updateDatabase task. Here is what I have right now in my build file (bear in mind that what I have now works fine; I just need to be able to handle errors I might get from running Liquibase).
<target name="update_db" depends="prepare">
  <taskdef resource="liquibasetasks.properties">
    <classpath refid="classpath"/>
  </taskdef>
  <updateDatabase
    changeLogFile="${db.changelog.file}"
    driver="${database.driver}"
    url="jdbc:mysql://localhost/${db.name}"
    username="${user}"
    password="${password}"
    promptOnNonLocalDatabase="not local database"
    dropFirst="false"
    classpathref="classpath"
  />
</target>
Currently, when I get an error, I get something similar to this (from a situation I created to demonstrate):
BUILD FAILED
MYPATH\build.xml:15: The following error occurred while executing this line:
MYPATH\\build.xml:117: liquibase.exception.MigrationFailedException: Migration failed
for change set PATH/2.20.9/tables.xml::FFP-1384::AUSER:
Reason: liquibase.exception.DatabaseException: Error executing SQL ALTER TABLE
test.widget ADD full_screen BIT(1) DEFAULT 0: Duplicate column name 'full_screen'
.............. and the wall of text continues
Ideally I would like to get the return code (rather than this block of text) from Liquibase into Ant and then, based on that, do something such as:
<echo message="this failed because ${reason}"/>
but not limited to that.
Is there some way for me to obtain the return code from Liquibase? My best guess is that, similar to the Ant exec task, the return code is ignored by default, and I'm hoping there is some way to get at it. Any suggestions welcome.
edit: Vaguely similar question https://stackoverflow.com/questions/17856564/liquibase-3-0-2-logging-to-error-console
The ant-contrib trycatch task enabled me to handle the error, which turned out to be a suitable fix, since the stack trace is actually useful to see anyway.
http://ant-contrib.sourceforge.net/tasks/tasks/trycatch.html
<trycatch property="foo" reference="bar">
  <try>
    <fail>Tada!</fail>
  </try>
  <catch>
    <echo>In &lt;catch&gt;.</echo>
  </catch>
  <finally>
    <echo>In &lt;finally&gt;.</echo>
  </finally>
</trycatch>
You need to download the ant-contrib jar, and if you don't want to place it in your ANT_HOME you can use:
<taskdef resource="net/sf/antcontrib/antcontrib.properties">
  <classpath>
    <pathelement location="PATH TO JAR"/>
  </classpath>
</taskdef>
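Applied to the build file from the question, a sketch could look like this. The property name liquibase.error is arbitrary, and the updateDatabase attributes are abbreviated from the question:
<target name="update_db" depends="prepare">
  <taskdef resource="liquibasetasks.properties">
    <classpath refid="classpath"/>
  </taskdef>
  <trycatch property="liquibase.error">
    <try>
      <updateDatabase
        changeLogFile="${db.changelog.file}"
        driver="${database.driver}"
        url="jdbc:mysql://localhost/${db.name}"
        username="${user}"
        password="${password}"
        classpathref="classpath"
      />
    </try>
    <catch>
      <!-- trycatch stores the exception message in the property set above -->
      <echo message="this failed because ${liquibase.error}"/>
    </catch>
  </trycatch>
</target>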

How can I get AppEngine to log info level only for my app?

So I've tried configuring App Engine logging according to this guide, ensuring I've configured the logging.properties file to be used in web.xml. I've configured logging.properties the following way:
.level = WARNING
nilsnett.chinese.backend.level = INFO
The package name of my logging wrapper is nilsnett.chinese.backend. The problem is that even with this configuration, info-level log output from my app is filtered. Evidence:
I've also tried the following config, which yielded the same result (including the logger class name at the end of the package name):
.level = WARNING
nilsnett.chinese.backend.JavaUtilLogger.level = INFO
To demonstrate that the logging.properties file is actually read, and that I actually do write info-level logging data to App Engine in this service call, let me show you what happens when I set .level = INFO:
So my desired result is to have INFO and higher-level log output from my packages, while other packages, like org.datanucleus, only show output at WARNING or more severe. In the example above, I want only the two lines marked with the purple star. Am I doing anything wrong?
change your config to:
.level = WARNING
# Set the default logging level for the datanucleus loggers
DataNucleus.JDO.level=WARNING
DataNucleus.Persistence.level=WARNING
DataNucleus.Cache.level=WARNING
DataNucleus.MetaData.level=WARNING
DataNucleus.General.level=WARNING
DataNucleus.Utility.level=WARNING
DataNucleus.Transaction.level=WARNING
DataNucleus.Datastore.level=WARNING
DataNucleus.ClassLoading.level=WARNING
DataNucleus.Plugin.level=WARNING
DataNucleus.ValueGeneration.level=WARNING
DataNucleus.Enhancer.level=WARNING
DataNucleus.SchemaTool.level=WARNING
# FinalizableReferenceQueue tries to spin up a thread and fails. This
# is inconsequential, so don't scare the user.
com.google.common.base.FinalizableReferenceQueue.level=WARNING
com.google.appengine.repackaged.com.google.common.base.FinalizableReferenceQueue.level=WARNING
These entries come from the logging config template, so to set DataNucleus to WARNING you have to do it as in this template:
https://developers.google.com/appengine/docs/java/#Logging
and then just add your own logging config:
nilsnett.chinese.backend.level = INFO
This should solve it.
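As a quick illustration of why this works: java.util.logging resolves a logger's effective level by walking up its dot-separated name, so every logger under nilsnett.chinese.backend inherits INFO, while everything else falls back to the root WARNING level. A minimal sketch, assuming the logging.properties above is installed via -Djava.util.logging.config.file (the class and logger names are hypothetical):
import java.util.logging.Logger;

public class LevelDemo {
    public static void main(String[] args) {
        // inherits INFO from the "nilsnett.chinese.backend" entry
        Logger app = Logger.getLogger("nilsnett.chinese.backend.LevelDemo");
        app.info("visible: package level is INFO");
        app.fine("hidden: FINE is below INFO");

        // no matching entry, so this falls back to the root level (WARNING)
        Logger other = Logger.getLogger("org.datanucleus.Something");
        other.info("hidden: root level is WARNING");
    }
}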
