How can I debug problems with warehouse creation? - cloudant

When trying to create a warehouse from the Cloudant dashboard, sometimes the process fails with an error dialog. Other times, the warehouse extraction stays in the "triggered" state even after hours.
How can I debug this? For example, is there an API I can call to see what is going on?

Take a look at the document in the _warehouser database and look for the warehouser_error_message element. For example:
"warehouser_error_message": "Exception occurred while creating table.
[SQL0670N The statement failed because the row size of the
resulting table would have exceeded the row size limit. Row size
limit: \"\". Table space name: \"\". Resulting row size: \"\".
com.ibm.db2.jcc.am.SqlException: DB2 SQL Error: SQLCODE=-670,
SQLSTATE=54010, SQLERRMC=32677;;34593, DRIVER=4.18.60]"
The warehouser error message usually gives you enough information to debug the problem.
You can view the _warehouser document in the Cloudant dashboard or fetch it via the API, for example:
export cl_username='<your_cloudant_account>'
curl -s -u "$cl_username" \
  "https://$cl_username.cloudant.com/_warehouser/_all_docs?include_docs=true" \
  | jq '.rows[].doc.warehouser_error_message'
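The same check can be scripted. Here is a minimal sketch using Python and the requests library; the account name and password are placeholders, and the field name follows the warehouser_error_message element shown above:

import requests

account = "<your_cloudant_account>"
password = "<your_password>"

# Fetch every document in the _warehouser database, including the bodies.
resp = requests.get(
    "https://%s.cloudant.com/_warehouser/_all_docs" % account,
    params={"include_docs": "true"},
    auth=(account, password),
)
resp.raise_for_status()

# Print any warehouser_error_message found in the documents.
for row in resp.json().get("rows", []):
    message = row.get("doc", {}).get("warehouser_error_message")
    if message:
        print(row["id"], message)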

Related

<MSDialect_pyodbc, TIMESTAMP> error when comparing tables in DVT - BigQuery and SQL Server

I'm trying to compare table data between two databases (on-prem SQL Server and BigQuery). I'm currently using the Data Validation Tool (DVT) for that.
Using the instructions from GitHub (link: https://github.com/GoogleCloudPlatform/professional-services-data-validator/tree/develop/docs), I installed DVT and created connections for it.
When I compare two tables from a single source, it works fine and produces correct output. But when I check tables across the two different sources, it returns a dtype: <MSDialect_pyodbc, TIMESTAMP> error.
Details:
I tried a column-level count validation, but got the same error. Command:
data-validation validate column -sc wc_sql_conn -tc my_bq_conn -tbls EDW.dbo.TableName=project_id.dataset_name.TableName --primary-keys PK_Col_Name --count '*'
The same happens when I check schema-level validation:
data-validation validate schema -sc wc_sql_conn -tc my_bq_conn -tbls EDW.dbo.TableName=project_id.dataset_name.TableName
I also tried including only INT/STRING columns in a custom query and comparing those.
Custom query:
SELECT PK_Col_Name, Category FROM EDW.dbo.TableName
A similar custom query was prepared for BigQuery; the commands for the custom-query comparisons:
data-validation validate custom-query -sc wc_sql_conn -tc my_bq_conn -cqt 'column' -sqf sql_query.txt -tqf bq_query.txt -pk PK_Col_Name --count '*'
data-validation validate custom-query -sc wc_sql_conn -tc my_bq_conn -cqt 'row' -sqf sql_query.txt -tqf bq_query.txt -pk PK_Col_Name --count '*'
Even with these multiple approaches, and although the scenario doesn't involve a datetime/timestamp column, I only ever get this one error:
NotImplementedError: Could not find signature for dtype: <MSDialect_pyodbc, TIMESTAMP>
I tried googling the error, but had no luck. Could someone please help me identify the cause?
Additionally, there are no data-validation-tool or google-pso-data-validator tags available. If someone could add them, they could be used in the future to reach the right people.

Azure Databricks Spark DataFrame fails to insert into MS SQL Server using the MS Spark JDBC connector when executor tries fewer than 4,096 records

That's a title and a half, but it pretty much summarises my "problem".
I have an Azure Databricks workspace and an Azure Virtual Machine running SQL Server 2019 Developer. They're on the same VNET and they communicate nicely with each other. I can select rows very happily from the SQL Server, and some instances of inserts work really nicely too.
My scenario:
I have a spark table foo, containing any number of rows. Could be 1, could be 20m.
foo contains 19 fields.
The contents of foo need to be inserted into a table on the SQL Server, also called foo, in a database called bar, meaning my destination is bar.dbo.foo.
I've got the com.microsoft.sqlserver.jdbc.spark connector configured on the cluster, and I connect using an IP, port, username and password.
My notebook cell of relevance:
df = spark.table("foo")

try:
    url = "jdbc:sqlserver://ip:port"
    table_name = "bar.dbo.foo"
    username = "user"
    password = "password"

    df.write \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .mode("append") \
        .option("truncate", True) \
        .option("url", url) \
        .option("dbtable", table_name) \
        .option("user", username) \
        .option("password", password) \
        .option("queryTimeout", 120) \
        .option("tableLock", True) \
        .option("numPartitions", 1) \
        .save()
except ValueError as error:
    print("Connector write failed", error)
If I prepare foo to contain 10,000 rows, I can run this script time and time again, and it succeeds every time.
As the row counts drop, the Executor occasionally tries to process 4,096 rows in a task. As soon as it tries to do 4,096 in a task, weird things happen.
For example, having created foo to contain 5,000 rows and executing the code, this is the task information:
Index Task Id Attempt Status Executor ID Host Duration Input Size/Records Errors
0 660 0 FAILED 0 10.139.64.6 40s 261.3 KiB / 4096 com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
0 661 1 FAILED 3 10.139.64.8 40s 261.3 KiB / 4096 com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
0 662 2 FAILED 3 10.139.64.8 40s 261.3 KiB / 4096 com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
0 663 3 SUCCESS 1 10.139.64.5 0.4s 261.3 KiB / 5000
I don't fully understand why it fails after 40 seconds. Our timeouts are set to 600 seconds on the SQL box, and the query timeout in the script is 120 seconds.
Every time the Executor does more than 4,096 rows, it succeeds. This is true regardless of the size of the dataset. Sometimes it tries to do 4,096 rows out of a 100k row set, fails, then retries with all 100k records in the task and immediately succeeds.
When the set is smaller than 4,096, the execution will typically generate one message:
com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed
and then immediately work successfully, having moved on to the next executor.
On the SQL Server itself, I see ASYNC_NETWORK_IO as the wait, using Adam Machanic's sp_whoisactive. This wait persists for the full duration of the 40s attempt. It looks like at 40s there's an immediate abandonment of the attempt, and a new connection is created - consistent with the messages I see in the task information.
Additionally, when looking at the statements, I note that it's doing ROWS_PER_BATCH = 1000 regardless of the original number of rows. I can't see any way of changing that in the docs; I tried rowsPerBatch in the options for the DataFrame, but it didn't appear to make a difference - the value still shows as 1000.
I've been running this with lots of different row counts in foo, and my testing suggests that when the total is greater than 4,096, the Spark executor succeeds whenever it tries a number of records that exceeds 4,096. If I remove numPartitions, there are more attempts of 4,096 records, and so I see more failures.
Weirdly, if I cancel a query that appears to be running for longer than 10s, and immediately retry it - if the number of rows in foo is != 4,096, it seems to succeed every time. My sample is obviously pretty small - tens of attempts.
Is there a limitation I'm not familiar with here? What's the magic of 4,096?
In discussing this with my friend, we're wondering whether there is some form of implicit type conversion happening in the arrays when they're <4,096 records, which causes delays somehow.
I'm at quite a loss on this one, and I'm wondering whether I just need to check the length of the DataFrame before attempting the transfer - using an iterative cursor in pyodbc for fewer rows and sticking to the JDBC connector for larger numbers of rows, as sketched below. It seems like it shouldn't be needed!
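A minimal sketch of that size-based fallback, assuming a hypothetical ODBC connection string, a matching column order in bar.dbo.foo, and 4,096 as the cut-off purely for illustration:

import pyodbc

df = spark.table("foo")  # spark is the Databricks-provided SparkSession

if df.count() < 4096:
    # Small DataFrame: collect to the driver and insert via a pyodbc cursor.
    rows = [tuple(r) for r in df.collect()]
    placeholders = ", ".join(["?"] * len(df.columns))
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=ip,port;DATABASE=bar;UID=user;PWD=password"
    )
    cursor = conn.cursor()
    cursor.fast_executemany = True  # send the parameterised inserts in batches
    # Assumes the DataFrame columns line up with the table's column order.
    cursor.executemany("INSERT INTO dbo.foo VALUES (%s)" % placeholders, rows)
    conn.commit()
    conn.close()
else:
    # Larger DataFrame: keep using the MS Spark JDBC connector as in the cell above.
    (df.write
        .format("com.microsoft.sqlserver.jdbc.spark")
        .mode("append")
        .option("url", "jdbc:sqlserver://ip:port")
        .option("dbtable", "bar.dbo.foo")
        .option("user", "user")
        .option("password", "password")
        .save())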
Many thanks,
Johan

batch query is not allowed to request data from "".""

I'm getting started with Kapacitor and have been trying to run the first guide in the Kapacitor documentation, but with data I already have. I managed to define a task, but I can neither enable it nor run a backfill. I came across this question, which is similar to my problem, but the answer there didn't help. In contrast to the error message there, I get empty strings for the database, retention policy, and/or measurement.
In the Kapacitor config I set up an InfluxDB connection to the local instance under the name localhost (it has a database mydb and the measurements weather.current.clouds and weather.current.visibility with the default retention policy autogen) and created the following weathertest.tick script:
dbrp "mydb"."autogen"
var clouds = batch
|query('select mean(value) / 100.0 as val from "mydb"."autogen"."weather.current.clouds"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
var vis = batch
|query('select mean(value) / 10000.0 as val from "mydb"."autogen"."weather.current.visibility"')
.period(1h)
.every(1h)
.groupBy(time(1m), *)
.fill(0)
clouds
|join(vis)
.as('c', 'v')
|eval(lambda: 100 * (1 - "c.val") * "v.val")
.as('pcent')
|influxDBOut()
.cluster('localhost')
.database('mydb')
.retentionPolicy('autogen')
.measurement('testmetric')
.tag('host', 'myhost.local')
.tag('key', 'weather.current.lightidx')
This is what I came up with after hours of trial and (especially) error. As given in the title, when I try to enable my task with kapacitor enable weathertest, I get the error message enabling task weathertest: batch query is not allowed to request data from ""."". The same thing happens when I try to record as in the "Backfill" example. Also, in that example there is a start and a stop date for limiting the time frame, but the time format given there is wrong and not understood by Kapacitor. Instead of e.g. 2015-10-01 I have to put in 2015-10-01T00:00Z to at least get past the time format error.
In the Kapacitor logs there is not a single line regarding these errors. Only when I try to remove a recording do I get something like remove /var/lib/kapacitor/replay/1f5...750.brpl: no such file or directory, and that does appear in the logs. There are also lots of info lines showing successful POSTs to/from InfluxDB for the _internal database with HTTP response 204.
Does anyone have an idea what I may be doing wrong?
OK, after the weekend I tried again. Without any change, the previously failing steps now accepted my script; however, this time I was able to find error messages in the log. The node mentioned there was the eval node, and the messages pointed towards a type mismatch. When I changed the line
|eval(lambda: 100 * (1 - "c.val") * "v.val")
to
|eval(lambda: 100.0 * (1.0 - "c.val") * "v.val")
the error messages were gone and the command kapacitor show weathertest now showed rather sane content.
Furthermore, I redefined, recorded, replayed and deleted the tasks and recordings over and over again during my tests, and I may have forgotten to redefine the task after making changes to the TICK script (not really sure). After changing the above, redefining the task and replaying it, I finally found the expected data in the InfluxDB instance.

WSO2 Message Broker Error while adding Queue - Invalid Object Name

I have just set up a WSO2 Message Broker 3.0.0 connecting to a SQL Server DB.
The DB for the Carbon MB component has been created successfully as well.
The DB for the Message Broker Data store is created and contains the table MB_QUEUE_MAPPING.
However, when adding a queue via the MB UI, I see the following error in the stack trace:
[2015-12-16 15:00:41,472] ERROR {org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl} - Error occurred while retrieving destination queue id for destination queue TestQ
java.sql.SQLException: Invalid object name 'MB_QUEUE_MAPPING'.
at net.sourceforge.jtds.jdbc.SQLDiagnostic.addDiagnostic(SQLDiagnostic.java:372)
at net.sourceforge.jtds.jdbc.TdsCore.tdsErrorToken(TdsCore.java:2988)
at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2421)
at net.sourceforge.jtds.jdbc.TdsCore.getMoreResults(TdsCore.java:671)
at net.sourceforge.jtds.jdbc.JtdsStatement.executeSQLQuery(JtdsStatement.java:505)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.executeQuery(JtdsPreparedStatement.java:1029)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.getQueueID(RDBMSMessageStoreImpl.java:1324)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.getCachedQueueID(RDBMSMessageStoreImpl.java:1298)
at org.wso2.andes.store.rdbms.RDBMSMessageStoreImpl.addQueue(RDBMSMessageStoreImpl.java:1634)
at org.wso2.andes.store.FailureObservingMessageStore.addQueue(FailureObservingMessageStore.java:445)
at org.wso2.andes.kernel.AMQPConstructStore.addQueue(AMQPConstructStore.java:116)
at org.wso2.andes.kernel.AndesContextInformationManager.createQueue(AndesContextInformationManager.java:154)
at org.wso2.andes.kernel.disruptor.inbound.InboundQueueEvent.updateState(InboundQueueEvent.java:151)
at org.wso2.andes.kernel.disruptor.inbound.InboundEventContainer.updateState(InboundEventContainer.java:167)
at org.wso2.andes.kernel.disruptor.inbound.StateEventHandler.onEvent(StateEventHandler.java:67)
at org.wso2.andes.kernel.disruptor.inbound.StateEventHandler.onEvent(StateEventHandler.java:41)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
The "Add Queue" screen does not go away however the Queue does get added to the MB_QUEUE table just fine in the DB. Both tables MB_QUEUE_MAPPING & MB_QUEUE_COUNTER are blank.
The "List Queues" screen does blank despite a number of Queues in the MB_QUEUE table. Stack trace also shows errors but is not included as its not relevant to the error above.
I can create a Topic just fine however.
I want to know why MB would say the table MB_QUEUE_MAPPING is an "Invalid object name" when the table clearly exists?
I suspect the way you have configured the MySQL database is incorrect, so you could try one of the two scenarios below to verify this:
1) Start the server for the first time with the -Dsetup parameter, or
2) refer to the "Configuring MySQL" documentation (https://docs.wso2.com/display/MB300/Configuring+MySQL) and follow the step-by-step instructions in order.
I tried out the second scenario and did not get any exception when adding a queue. The document I mentioned should be updated as below.
You can see this command in step 3:
mysql -u <db_user_name> -p -D <database_name> < '<WSO2MB_HOME>/dbscripts/mb-store/mysql-mb.sql'
db_user_name - the username of the DB user.
database_name - the database name that you created in step 1.
WSO2MB_HOME - the home directory path for MB.
Hope this helps you resolve the issue.
It seems the user connecting to the MSSQL database does not have the correct permissions - most probably the SELECT permission. The reason I say this is that when you add a queue, it does get added, which means the user has INSERT permission. Once the queue is added, the page redirects to the Queue List page, and the user must have SELECT permission to retrieve the queue list. Topics are not added to the database; they are kept in the registry. You can verify which user connects to MSSQL from the configuration in wso2mb-3.0.0/repository/conf/datasources/master-datasources.xml, like below:
<datasource>
   <name>WSO2_MB_STORE_DB</name>
   <jndiConfig>
       <name>WSO2MBStoreDB</name>
   </jndiConfig>
   <definition type="RDBMS">
         <configuration>
                    <url>jdbc:jtds:sqlserver://localhost:1433/wso2_mb</url>
                    <username>sa</username>
                    <password>sa</password>
                    <driverClassName>net.sourceforge.jtds.jdbc.Driver</driverClassName>
                    <maxActive>200</maxActive>
                    <maxWait>60000</maxWait>
                    <minIdle>5</minIdle>
                    <testOnBorrow>true</testOnBorrow>
                    <validationQuery>SELECT 1</validationQuery>
                    <validationInterval>30000</validationInterval>
                    <defaultAutoCommit>false</defaultAutoCommit>
         </configuration>
     </definition>
</datasource>

Rexster/Rexpro : RexProScriptException: .. java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space

I am using Titan 0.4.3 and Rexster 2.4 over Cassandra & Elasticsearch.
I am calling RexPro from Python. In a single gremlin request, I am trying to add 100 vertices and commit. I am able to successfully add 40,000+ vertices in 400+ gremlin requests. However, after that I get an exception:
Encountered a RexProScriptException: An error occurred while processing the script for language [groovy]. All transactions across all graphs in the session have been concluded with failure: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: PermGen space
Rexster.sh [JVM heap size]
I tried to increase the heap memory, but it still throws the exception after inserting a few more batches of vertices.
# Set Java options
if [ "$JAVA_OPTIONS" = "" ] ; then
    JAVA_OPTIONS="-Xms256m -Xmx1024m"
fi
Please advise.
Just a guess based on the information you provided, but.....PermGen errors usually show up in Rexster if you are not parameterizing the scripts you are sending. Most of the python libraries out there that I know of support that feature. You can read more about this issue here:
https://github.com/tinkerpop/rexster/issues/143
and other places in the gremlin users mailing list if you search around. If for some reason you can't parameterize then you can alter this JVM setting:
-XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512M
but I'd consider that a last resort. Parameterization should not only get rid of your problem but will also greatly speed up your data loading process.
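For reference, a parameterised request from Python might look like the sketch below. It assumes the rexpro-python client (RexProConnection and its execute(script, params) call) and an illustrative graph name, so adjust it to the client you are actually using:

from rexpro import RexProConnection

# Host, RexPro port and graph name are placeholders.
conn = RexProConnection('localhost', 8184, 'graph')

# The script text stays constant and the values travel as bindings, so Rexster
# can reuse the compiled script instead of compiling a new class per request
# (repeated compilation is what tends to exhaust PermGen).
script = "v = g.addVertex([name: vertex_name]); g.commit(); v"
for i in range(100):
    conn.execute(script, {'vertex_name': 'vertex-%d' % i})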
