Apache Drill 1.2 and SQL Server JDBC

Apache Drill 1.2 adds the exciting feature of including JDBC relational sources in your query, and I would like to include Microsoft SQL Server.
So, following the docs, I copied the SQL Server driver jar sqljdbc42.jar (the most recent MS JDBC driver) into the proper 3rd-party directory.
I successfully added the storage.
The configuration is:
{
  "type": "jdbc",
  "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  "url": "jdbc:sqlserver://myservername",
  "username": "myusername",
  "password": "mypassword",
  "enabled": true
}
as "mysqlserverstorage"
However, running queries fails. I've tried:
select * from mysqlserverstorage.databasename.schemaname.tablename
(of course, I've used real existing tables instead of the placeholders here)
Error:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 2, column 6 to line 2, column 17: Table 'mysqlserverstorage.databasename.schemaname.tablename' not found [Error Id: f5b68a73-973f-4292-bdbf-54c2b6d5d21e on PC1234:31010]
and
select * from mysqlserverstorage.`databasename.schemaname.tablename`
Error:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: Exception while reading tables [Error Id: 213772b8-0bc7-4426-93d5-d9fcdd60ace8 on PC1234:31010]
Has anyone had success in configuring and using this new feature?

Success has been reported using a storage plugin configuration, such as
{
  type: "jdbc",
  enabled: true,
  driver: "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  url: "jdbc:sqlserver://172.31.36.88:1433;databaseName=msdb",
  username: "root",
  password: "<password>"
}
on pre-release Drill 1.3 and using sqljdbc41.4.2.6420.100.jar.

Construct your query as:
select * from storagename.schemaname.tablename
This works with sqljdbc4.x, as it does for me.
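As a concrete illustration, here is a hedged sketch only: it assumes the plugin is registered as mysqlserverstorage (as in the question), that its URL includes a databaseName=... parameter (as in the working configuration above) so the database level is implied, and that the table lives in the dbo schema; tablename is a placeholder.
select * from mysqlserverstorage.dbo.tablename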

Related

MongoDB — Which Authentication Database should I use when creating users

In the current Mongo instance, following MongoDB best practices, users are created in the system database (admin) rather than in their respective databases, and the admin database is used as the authentication database.
Creating users in the system database (admin) works fine when tested standalone, but when checked with code (Docker) we get exceptions.
Also, before creating the user, we switched to the admin database with:
use admin
connection string used:
mongodb:// :#xxxxxxx:27017/admin
Caused by: com.mongodb.MongoCommandException: Command failed with
error 263 (OperationNotSupportedInTransaction): 'Cannot run command
against the 'admin' database in a transaction.' on server xxxxxxxxxxx.
The full response is {"operationTime": {"$timestamp": {"t":
1649307185, "i": 1}}, "ok": 0.0, "errmsg": "Cannot run command against
the 'admin' database in a transaction.", "code": 263, "codeName":
"OperationNotSupportedInTransaction", "$clusterTime": {"clusterTime":
{"$timestamp": {"t": 1649307185, "i": 1}}, "signature": {"hash":
{"$binary": {"base64": "AAAAAAAAAAAAAAAAAAAAAAAAAAA=", "subType":
"00"}}, "keyId": 0}}}

Azure Stream Analytics output to Azure Cosmos DB

A Stream Analytics job (IoT Hub to Cosmos DB output) "Start" command is failing with the following error.
[12:49:30 PM] Source 'cosmosiot' had 1 occurrences of kind
'OutputDataConversionError.RequiredColumnMissing' between processing
times '2019-04-17T02:49:30.2736530Z' and
'2019-04-17T02:49:30.2736530Z'.
I followed the instructions and am not sure what is causing this error.
Any suggestions, please? Here is the Stream Analytics query:
SELECT
[bearings temperature],
[windings temperature],
[tower sway],
[position sensor],
[blade strain gauge],
[main shaft strain gauge],
[shroud accelerometer],
[gearbox fluid levels],
[power generation],
[EventProcessedUtcTime],
[EventEnqueuedUtcTime],
[IoTHub].[CorrelationId],
[IoTHub].[ConnectionDeviceId]
INTO
cosmosiot
FROM
TurbineData
If you're specifying fields in your query (i.e. Select Name, ModelNumber ...) rather than just using Select * ..., the field names are converted to lowercase by default when using Compatibility Level 1.0, which throws off Cosmos DB. In the portal, open your Stream Analytics job, go to 'Compatibility level' under the 'Configure' section, and select v1.1 or higher; that should fix the issue. You can read more about compatibility levels in the Stream Analytics documentation here: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-compatibility-level

Getting an error while ingesting nested Avro into an MS SQL table using kafka-connect-jdbc in Kafka

As part of a POC, I am trying to ingest Avro messages (with Schema Registry enabled) from Kafka topics into a JDBC sink (MS SQL database), but I am facing some issues while ingesting nested Avro data into an MS SQL table. I am using the kafka-connect-jdbc sink to ingest Avro data into an MS SQL table from the Kafka Avro console producer.
Details mentioned below
Kafka Avro Producer CLI Command
kafka-avro-console-producer --broker-list server1:9092,server2:9092,server3:9092 --topic testing25 --property schema.registry.url=http://server3:8081 --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"tradeid","type":"int"},{"name":"tradedate", "type": "string"}, {"name":"frontofficetradeid", "type": "int"}, {"name":"brokerorderid","type": "int"}, {"name":"accountid","type": "int"}, {"name": "productcode", "type": "string"}, {"name": "amount", "type": "float"}, {"name": "trademessage", "type": { "type": "array", "items": "string"}}]}'
JDBC-Sink.properties
name=test-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=testing25
connection.url=jdbc:sqlserver://testing;DatabaseName=testkafkasink;user=Testkafka
insert.mode=upsert
pk.mode=record_value
pk.fields=tradeid
auto.create=true
tranforms=FlattenValueRecords
transforms.FlattenValueRecords.type=org.apache.kafka.connect.transforms.Flatten$Value
transforms.FlattenValueRecords.field=trademessage
connect-avro-standalone.properties
bootstrap.servers=server1:9092,server2:9092,server3:9092
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://server3:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://server3:8081
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/tmp/connect.offsets
plugin.path=/usr/share/java
After running the JDBC sink and the producer, when I try to insert data from the CLI I get this error:
ERROR WorkerSinkTask{id=test-sink-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask:584)
org.apache.kafka.connect.errors.ConnectException: null (ARRAY) type doesn't have a mapping to the SQL database column type
I understand that it is failing on the array data type, as SQL Server has no such data type. So I researched and found that Kafka Connect SMTs (Single Message Transforms) offer a Flatten transformation to flatten nested values.
But this does not seem to be working in my case; the transform values passed to the JDBC sink are doing nothing. In fact, I tested other transformations as well, such as InsertField$Value and InsertField$Key, but none of them are working. Please let me know if I am doing anything wrong in running these transformations in Kafka Connect.
Any help would be appreciated.
Thanks
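For comparison, here is a minimal sketch of how a Flatten SMT is typically wired into sink connector properties. The alias is taken from the question, the delimiter value is an assumption, and note that the property that registers the SMT alias is spelled transforms:
# Register the transform under an alias, then configure it via that alias.
transforms=FlattenValueRecords
transforms.FlattenValueRecords.type=org.apache.kafka.connect.transforms.Flatten$Value
# Flatten's documented option is the delimiter used when concatenating nested field names.
transforms.FlattenValueRecords.delimiter=_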

Error processing tabular model - From SQL Server Agent

We have a tabular cube. Processing the database (full) in SSMS works fine, but when processing from SQL Server Agent it throws the following error.
<return xmlns="urn:schemas-microsoft-com:xml-analysis">
  <root xmlns="urn:schemas-microsoft-com:xml-analysis:empty">
    <Messages xmlns="urn:schemas-microsoft-com:xml-analysis:exception">
      <Warning WarningCode="1092550744" Description="Cannot order ''[] by [] because at least one value in [] has multiple distinct values in []. For example, you can sort [City] by [Region] because there is only one region for each city, but you cannot sort [Region] by [City] because there are multiple cities for each region." Source="Microsoft SQL Server 2016 Analysis Services Managed Code Module" HelpFile="" />
    </Messages>
  </root>
</return>
Here is the script used from SQL Server Agent.
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "DBName"
      }
    ]
  }
}
Can anyone suggest how to eliminate this error or ignore this error/warning?
Thanks,
I had the same issue: tabular model in VS 2015, cube in SSAS. It builds fine when I process the database, but the SQL Server Agent was bringing up this error. A couple of forums had some mention of the error but no steps for deeper investigation and resolution, which is particularly difficult when the 'Cannot order' attribute is blank. I opened the model in VS, selected every column in turn, and looked for any sorting operation in either the filter or the 'Sort by Column' button, which is easy to miss. I removed all the sorts and it built fine. Take a note of the ones removed, as you may have a data issue.
Use SQL Server Integration Services (SSIS) for processing. Just create a package with an "Analysis Services Processing Task". This task processes the model like SSMS.
The error message correctly explains the problem but unhelpfully doesn't tell which attribute is the offending one. I was sorting account names by account number but because there were a few accounts with the same name but different number, I got this same error. Setting keepUniqueRows didn't help.
Removing the offending sortBy fixes the problem when processing with an SQL Server Agent. What's interesting is that when the sortBy is in place and I processed the model with SSMS the accounts were sorted as expected. This led me to think this is because SQL Agent Job interprets the warning as an error and does a rollback but SSMS ignores it. The SSIS task probably ignores the warning just like SSMS and processing succeeds.
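For illustration, the sort lives on the column definition in the model's JSON (model.bim / TMSL) as a sortByColumn property; the column names below are hypothetical, and removing the sortByColumn entry (or fixing the data so the sort column is unique per value) is what the answers above describe.
{
  "name": "Account Name",
  "dataType": "string",
  "sourceColumn": "AccountName",
  "sortByColumn": "Account Number"
}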
Try this,
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Type>ProcessFull</Type>
  <Object>
    <DatabaseID>DBName</DatabaseID>
  </Object>
</Process>
I also faced the same problem. I changed the type from "full" to "automatic" and it started working.
{
  "refresh": {
    "type": "automatic",
    "objects": [
      {
        "database": "AU MY Model"
      }
    ]
  }
}

DocumentDB SQL Injection?

I'm trying to offload some client-specific query building to the client. I don't think I'm in danger of SQL injection with DocumentDB, since it doesn't have UPDATE or DELETE statements, but I'm not positive. Additionally, I don't know whether these will be added in the future.
Here is an example of my problem.
IceCreamApp wants to find all flavors where the name is like "choco". A flavor document looks like this-
{
  "name": "Chocolate",
  "price": 1.50
}
The API knows about the DocumentDB instance and knows how to request data from it, but it doesn't know the entity structure of any of the clients' entities. So doing this on the API-
_documentClient.CreateDocumentQuery("...")
.Where((d) => d.name.Contains(query));
Would throw an error (d is dynamic and name isn't necessarily a common property).
I could build this on the client and send it.
Client search request-
{
  "page": 1,
  "pageSize": 10,
  "query": "CONTAINS(name, 'choco')"
}
Without sanitization this would be a big no-no for SQL. But does it / will it ever matter for DocumentDB? How safe am I to run unsanitized client queries?
As the official document Announcing SQL Parameterization in DocumentDB states:
Using this feature, you can now write parameterized SQL queries. Parameterized SQL provides robust handling and escaping of user input, preventing accidental exposure of data through “SQL injection”. Let's take a look at a sample using the .NET SDK; in addition to plain SQL strings and LINQ expressions, we've added a new SqlQuerySpec class that can be used to build parameterized queries.
DocumentDB is not susceptible to the most common kinds of injection attacks that lead to “elevation of privileges” because queries are strictly read-only operations. However, it might be possible for a user to gain access to data they shouldn’t be accessing within the same collection by crafting malicious SQL queries. SQL parameterization support helps prevent these sort of attacks.
Here's an official sample that queries a "Books" collection with a single user-supplied parameter for the author name:
POST https://contosomarketing.documents.azure.com/dbs/XP0mAA==/colls/XP0mAJ3H-AA=/docs HTTP/1.1
x-ms-documentdb-isquery: True
x-ms-date: Mon, 18 Aug 2014 13:05:49 GMT
authorization: type%3dmaster%26ver%3d1.0%26sig%3dkOU%2bBn2vkvIlHypfE8AA5fulpn8zKjLwdrxBqyg0YGQ%3d
x-ms-version: 2014-08-21
Accept: application/json
Content-Type: application/query+json
Host: contosomarketing.documents.azure.com
Content-Length: 50
{
  "query": "SELECT * FROM books b WHERE (b.Author.Name = @name)",
  "parameters": [
    { "name": "@name", "value": "Herman Melville" }
  ]
}
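And a corresponding .NET SDK sketch, assuming the DocumentDB client library (Microsoft.Azure.Documents / Microsoft.Azure.Documents.Client); the database and collection ids are placeholders, and _documentClient is the client instance from the question:
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Build a parameterized query: the user-supplied value goes in as a parameter,
// never concatenated into the query text.
var querySpec = new SqlQuerySpec
{
    QueryText = "SELECT * FROM books b WHERE (b.Author.Name = @name)",
    Parameters = new SqlParameterCollection
    {
        new SqlParameter("@name", "Herman Melville")
    }
};

var books = _documentClient.CreateDocumentQuery<dynamic>(
    UriFactory.CreateDocumentCollectionUri("mydb", "books"),  // placeholder database/collection ids
    querySpec);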
