batch query is not allowed to request data from ""."" - ubuntu-18.04

I'm getting started with Kapacitor and have been trying to run the first guide in the Kapacitor documentation, but with data I already have. I managed to define a task, but I can neither enable it nor can I run a backfill. I came across this question, which is similar to my problem, but the answer there didn't help. In contrast to the error message there I get empty strings for database, retention policy, and/or measurement.
In Kapacitor config I set an InfluxDB connection to the local host instance with the name localhost (which has a database mydb and the measurements weather.current.clouds and weather.current.visibility with default retention policy autogen) and created the following weathertest.tick script:
dbrp "mydb"."autogen"
var clouds = batch
|query('select mean(value) / 100.0 as val from "mydb"."autogen"."weather.current.clouds"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
var vis = batch
|query('select mean(value) / 10000.0 as val from "mydb"."autogen"."weather.current.visibility"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
clouds
|join(vis)
.as('c', 'v')
|eval(lambda: 100 * (1 - "c.val") * "v.val")
.as('pcent')
|influxDBOut()
.cluster('localhost')
.database('mydb')
.retentionPolicy('autogen')
.measurement('testmetric')
.tag('host', 'myhost.local')
.tag('key', 'weather.current.lightidx')
This is what I came up with after hours of trial and (especially) error. As given in the title, when I try to enable my task with kapacitor enable weathertest, I get the error message enabling task weathertest: batch query is not allowed to request data from ""."". Same thing when I try to record as in the "Backfill" example. Also, in that example there is a start and a stop date for limiting the time frame. The time format given there is wrong and is not understood by Kapacitor. Instead of e. g. 2015-10-01 I have to put in 2015-10-01T00:00Z to make it at least pass the error message regarding time format error.
In the Kapacitor logs there is not a single line regarding these errors, only when I try to remove a record, I get something like remove /var/lib/kapacitor/replay/1f5...750.brpl: no such file or directory and this can be found in the logs. There are lots of info lines in the logs showing successful POSTs to/from InfluxDB for the _internal database with HTTP response result 204.
Has anyone an Idea what I may be doing wrong?

OK, after the weekend I tried again. Without any change it accepted my script now in the failing steps, however, now I was able to find error messages in the log. The node mentioned there was the eval node and pointed towards a type mismatch. When I changed the line
|eval(lambda: 100 * (1 - "c.val") * "v.val")
to
|eval(lambda: 100.0 * (1.0 - "c.val") * "v.val")
the error messages were gone and the command kapacitor show weathertest showed a rather sane content now.
Furthermore, I redefined, recorded, replayed and deleted the tasks and recordings during my tests over and over again and I may have forgotten to redefine tasks after making changes to the tick script (not really sure). After changing the above, redefining the task and replaying it I finally found the expected data in the InfluxDB instance.

Related

Azure SQL DB, Extended Events using Ring Buffer stop on their own after a while - why?

I run a fairly basic Xevent on Azure SQL DB, using a Ring Buffer target, that looks for Severity errors over 10, based on a post Brent Ozar made. However, the session stops on its own, and I'm not sure why. I suspect it's filling up, even though all the documentation says it's FIFO and will drop the oldest events. Am I missing something? Do I need to set something differently? Thanks!
update: weirdly, I made a much smaller test, with max_memory = 200 and a much bigger amount of data in the failing code, to try and force it to stop/die, but I show it's looping as I expected. So I'm still confused why it's stopping, but it doesn't seem to be because it's filling up.
CREATE EVENT SESSION
severity_10plus_errors_XE
ON database
ADD EVENT sqlserver.error_reported
(
ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_id,sqlserver.sql_text,sqlserver.tsql_stack,sqlserver.username)
--ACTION (sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.database_id, sqlserver.username)
WHERE ([severity]> 10)
)
ADD TARGET
package0.ring_buffer
(SET
max_memory = 10000 ) -- Units of KB.
WITH (MAX_DISPATCH_LATENCY = 60SECONDS)
GO
ALTER EVENT SESSION severity_10plus_errors_XE
ON DATABASE
STATE = START;
GO

Trying to obtain specific data from an API with different ID's but nothing logs to the console

I'm trying to access data from an API that has 10 different ID's (1 - 10) with each ID containing bits of mock information for a client. The base URL is
https://europe-west2-mpx-tools-internal.cloudfunctions.net/frontend-mock-api/clients
Adding the {clientId} to the end of the URL return the last 30 days of data for a specific client for instance: https://europe-west2-mpx-tools-internal.cloudfunctions.net/frontend-mock-api/clients/1 returns data like so:
But when I try to console.log data containing the (date, cost, impressions, clicks, conversions), nothing appears, I'm not getting an error
Here's my code from both files:
I'm essentially wondering how best to ensure I'm able to access the specific bits of data for each day and for each client from 1 - 10.
I'm not getting an error but believe I'm going wrong somewhere if anyone can help.
Thanks in advance and sorry if I've missed key info that's needed as I'm still learning!

index method is failing while executing indexClient.Documents.Index(batch)

I am adding 1000 documents to the IndexBatch and calling index() method some thing like below
var batch = IndexBatch.New(actions);
indexClient.Documents.Index(batch);
I kept this code in a loop where I need to upload around 50 Million documents to Azure Search. After it is executing around somewhere 15 to 20 times (15k to 20k documents) in the loop, it is failing and throwing exception which says below
"The request is invalid. Details: actions: No indexing actions found in the request. Please include between 1 and 32000 indexing actions in your request."
Why I am getting this exception randomly.
Can you suggest the better approach to handle below scenarios
How to make sure previous batch of documents got indexed before attempting to load another batch (Since I am running these statements at least 50K times in a loop)
any errors caused because of the load of service.
Is it possible to add a try / catch to this code so that you can validate that there are actually items in the batch? In the catch maybe add a breakpoint to investigate what is in the batch to see if that helps clear up what might be the issue?

Debezium error, schema isn't known to this connector

I have a project using Debezium, mostly based on this example, which is then connected to an Apache Pulsar.
I have changed a few configurations. The file now looks like this:
database.history=io.debezium.relational.history.MemoryDatabaseHistory
connector.class=io.debezium.connector.mysql.MySqlConnector
offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
offset.storage.file.filename=offset.dat
offset.flush.interval.ms=5000
name=mysql-dbz-connector
database.hostname={ip}
database.port=3308
database.user={user}
database.password={pass}
database.dbname=database
database.server.name=test
table.whitelist=database.history_table,database.project_table
snapshot.mode=schema_only
schemas.enable=false
include.schema.changes=false
pulsar.topic=persistent://public/default/{0}
pulsar.broker.address=pulsar://{ip}:6650
database.history=io.debezium.relational.history.MemoryDatabaseHistory
As you may understand, what I'm trying to do is to monitor the history_table and the project_table modifications from the database and then write payloads onto an Apache Pulsar.
My problem is as follows. In whatever snapshot mode I use, when an offset has been written, I can't restart the Debezium without getting an error on the next database update.
Encountered change event for table database.history_table whose schema isn't known to this connector
It only happens with an existing offset.dat file. I think this is because the schema is null within the offset.dat file. Take this one for example:
¨Ìsrjava.util.HashMap⁄¡√`—F
loadFactorI thresholdxp?#wur[B¨Û¯T‡xpG{"schema":null,"payload":["mysql-dbz-connector",{"server":"test"}]}uq~U{"ts_sec":1563802215,"file":"database-bin.000005","pos":79574,"server_id":1,"event":1}x
I first suspected the schemas.enable=false or the include.schema.changes=false parameters that I used to make the JSON more concise, but their values don't change anything in the offset.dat file.
The problem lies in line database.history=io.debezium.relational.history.MemoryDatabaseHistory. The history will not survive restart. You should use FileDatabaseHistory instead of it.

How to debug long-running mapreduce jobs with millions of writes?

I am using the simple control.start_map() function of the appengine-mapreduce library to start a mapreduce job. This job successfully completes and shows ~43M mapper-calls on the resulting /mapreduce/detail?mapreduce_id=<my_id> page. However, this page makes no mention of the reduce step or any of the underlying appengine-pipeline processes that I believe are still running. Is there some way to return the pipeline ID that this calls makes so I can look at the underlying pipelines to help debug this long-running job? I would like to retrieve enough information to pull up this page: /mapreduce/pipeline/status?root=<guid>
Here is an example of the code I am using to start up my mapreduce job originally:
from third_party.mapreduce import control
mapreduce_id = control.start_map(
name="Backfill",
handler_spec="mark_tos_accepted",
reader_spec=(
"third_party.mapreduce.input_readers.DatastoreInputReader"),
mapper_parameters={
"input_reader": {
"entity_kind": "ModelX"
},
},
shard_count=64,
queue_name="backfill-mapreduce-queue",
)
Here is the mapping function:
# This is where we keep our copy of appengine-mapreduce
from third_party.mapreduce import operation as op
def mark_tos_accepted(modelx):
# Skip users who have already been marked
if (not modelx
or modelx.tos_accepted == myglobals.LAST_MATERIAL_CHANGE_TO_TOS):
return
modelx.tos_accepted = user_models.LAST_MATERIAL_CHANGE_TO_TOS
yield op.db.Put(modelx)
Here are the relevant portions of the ModelX:
class BackupModel(db.Model):
backup_timestamp = db.DateTimeProperty(indexed=True, auto_now=True)
class ModelX(BackupModel):
tos_accepted = db.IntegerProperty(indexed=False, default=0)
For more context, I am trying to debug a problem I am seeing with writes showing up in our data warehouse.
On 3/23/2013, we launched a MapReduce job (let's call it A) over a db.Model (let's call it ModelX) with ~43M entities. 7 hours later, the job "finished" and the /mapreduce/detail page showed that we had successfully mapped over all of the entities, as shown below.
mapper-calls: 43613334 (1747.47/sec avg.)
On 3/31/2013, we launched another MapReduce job (let's call it B) over ModelX. 12 hours later, the job finished with status Success and the /mapreduce/detail page showed that we had successfully mapped over all of the entities, as shown below.
mapper-calls: 43803632 (964.24/sec avg.)
I know that MR job A wrote to all ModelX entities, since we introduced a new property that none of the entities contained before. The ModelX contains an auto_add property like so.
backup_timestamp = ndb.DateTimeProperty(indexed=True, auto_now=True)
Our data warehousing process runs a query over ModelX to find those entities that changed on a certain day and then downloads those entities and stores them in a separate (AWS) database so that we can run analysis over them. An example of this query is:
db.GqlQuery('select * from ModelX where backup_timestamp >= DATETIME(2013, 4, 10, 0, 0, 0) and backup_timestamp < DATETIME(2013, 4, 11, 0, 0, 0) order by backup_timestamp')
I would expect that our data warehouse would have ~43M entities on each of the days that the MR jobs completed, but it is actually more like ~3M, with each subsequent day showing an increase, as shown in this progression:
3/16/13 230751
3/17/13 193316
3/18/13 344114
3/19/13 437790
3/20/13 443850
3/21/13 640560
3/22/13 612143
3/23/13 547817
3/24/13 2317784 // Why isn't this ~43M ?
3/25/13 3701792 // Why didn't this go down to ~500K again?
3/26/13 4166678
3/27/13 3513732
3/28/13 3652571
This makes me think that although the op.db.Put() calls issued by the mapreduce job are still running in some pipeline or queue and causing this trickle effect.
Furthermore, if I query for entities with an old backup_timestamp, I can go back pretty far and still get plenty of entities, but I would expect all of these queries to return 0:
In [4]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2013,2,23,1,1,1)').count()
Out[4]: 1000L
In [5]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2013,1,23,1,1,1)').count()
Out[5]: 1000L
In [6]: ModelX.all().filter('backup_timestamp <', 'DATETIME(2012,1,23,1,1,1)').count()
Out[6]: 1000L
However, there is this strange behavior where the query returns entities that it should not:
In [8]: old = ModelX.all().filter('backup_timestamp <', 'DATETIME(2012,1,1,1,1,1)')
In [9]: paste
for o in old[1:100]:
print o.backup_timestamp
## -- End pasted text --
2013-03-22 22:56:03.877840
2013-03-22 22:56:18.149020
2013-03-22 22:56:19.288400
2013-03-22 22:56:31.412290
2013-03-22 22:58:37.710790
2013-03-22 22:59:14.144200
2013-03-22 22:59:41.396550
2013-03-22 22:59:46.482890
2013-03-22 22:59:46.703210
2013-03-22 22:59:57.525220
2013-03-22 23:00:03.864200
2013-03-22 23:00:18.040840
2013-03-22 23:00:39.636020
Which makes me think that the index is just taking a long time to be updated.
I have also graphed the number of entities that our data warehousing downloads and am noticing some cliff-like drops that makes me think that there is some behind-the-scenes throttling going on somewhere that I cannot see with any of the diagnostic tools exposed on the appengine dashboard. For example, this graph shows a fairly large spike on 3/23, when we started the mapreduce job, but then a dramatic fall shortly thereafter.
This graph shows the count of entities returned by the BackupTimestamp GqlQuery for each 10-minute interval for each day. Note that the purple line shows a huge spike as the MapReduce job spins up, and then a dramatic fall ~1hr later as the throttling kicks in. This graph also shows that there seems to be some time-based throttling going on.
I don't think you'll have any reducer functions there, because all you've done is start a mapper. To do a complete mapreduce, you have to explicitly instantiate a MapReducePipeline and call start on it. As a bonus, that answers your question, as it returns the pipeline ID which you can then use in the status URL.
Just trying to understand the specific problem. Is it that you are expecting a bigger number of entities in your AWS database? I would suspect that the problem lies with the process that downloads your old ModelX entities into an AWS database, that it's somehow not catching all the updated entities.
Is the AWS-downloading process modifying ModelX in any way? If not, then why would you be surprised at finding entities with an old modified time stamp? modified would only be updated on writes, not on read operations.
Kind of unrelated - with respect to throttling I've usually found a throttled task queue to be the problem, so maybe check how old your tasks in there are or if your app is being throttled due to a large amount of errors incurred somewhere else.
control.start_map doesn't use pipeline and has no shuffle/reduce step. When the mapreduce status page shows its finished, all mapreduce related taskqueue tasks should have finished. You can examine your queue or even pause it.
I suspect there are problems related to old indexes for the old Model or to eventual consistency. To debug MR, it is useful to filter your warnings/errors log and search by the mr id. To help with your particular case, it might be useful to see your Map handler.

Resources