Streaming MSSQL CDC to AWS MSK with Debezium

I'm new to Kafka and currently learning how to stream changed data from MSSQL to Amazon MSK using the Debezium connector.
I already have an MS SQL Server with CDC enabled and an MSK cluster that I can connect to, create topics on, and produce and consume data from manually through an EC2 client. Now I'm setting up MSK Connect with the Debezium SQL Server connector as a custom plugin. Here is my MSK connector configuration:
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
tasks.max=1
database.hostname=xxx
database.port=xxx
database.user=xxx
database.password=xxx
database.dbname=dbName
database.server.name=serverName
table.include.list=dbo.tableName
database.history.kafka.bootstrap.servers=xxx
database.history.kafka.topic=xxx
But my MSK connector keeps returning status Failed. I have searched Google, but there seems to be no instruction or guide related to this setup.
That makes me wonder whether my approach is even possible. Could someone please shed some light and point me in the right direction?
Edit: some logs I got from CloudWatch:
ERROR [AdminClient clientId=adminclient-1] Connection to node -2 () failed authentication due to: []: Access denied (org.apache.kafka.clients.NetworkClient:771)
INFO App info kafka.admin.client for adminclient-1 unregistered (org.apache.kafka.common.utils.AppInfoParser:83)
INFO [AdminClient clientId=adminclient-1] Metadata update failed (org.apache.kafka.clients.admin.internals.AdminMetadataManager:235)
org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
Caused by: org.apache.kafka.common.errors.SaslAuthenticationException: [4f91d358-fb7b-4f3b-8930-1b4aefce6d0b]: Access denied
[Worker-08134a52fe88cdc49] MSK Connect encountered errors and failed.
Many thanks,

If you are using IAM role-based auth for your MSK cluster, your bootstrap server port will be 9098 (a quick way to confirm this is shown in the sketch after the reference link below).
Along with all the other properties, you also have to send these properties in your MSK Connect config:
database.history.consumer.security.protocol=SASL_SSL
database.history.consumer.sasl.mechanism=AWS_MSK_IAM
database.history.consumer.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
database.history.consumer.sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
database.history.producer.security.protocol=SASL_SSL
database.history.producer.sasl.mechanism=AWS_MSK_IAM
database.history.producer.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
database.history.producer.sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
Refer: https://aws.amazon.com/blogs/aws/introducing-amazon-msk-connect-stream-data-to-and-from-your-apache-kafka-clusters-using-managed-connectors/
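To confirm the IAM bootstrap string (and that it really uses port 9098), you can ask MSK for it directly. A minimal boto3 sketch; the region and cluster ARN are placeholders, not values from the question:
import boto3

# Cluster ARN and region are placeholders -- substitute your own.
kafka = boto3.client("kafka", region_name="us-east-1")
brokers = kafka.get_bootstrap_brokers(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abcd-1234"
)
# With IAM auth, every broker in this string should be listening on port 9098.
print(brokers["BootstrapBrokerStringSaslIam"])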

Related

AWS MSK Connect w/ MSSQL Debezium connector fails with disconnect

I am trying to set up a MSSQL Debezium connector with AWS MSK Connect but keep getting the following error messages:
Connector error log:
[Worker-0a949760f6b805d4f] [2023-02-15 19:57:56,122] WARN [src-connector-014|task-0] [Consumer clientId=dlp.compcare.ccdemo-schemahistory, groupId=dlp.compcare.ccdemo-schemahistory] Bootstrap broker b-3.stuff.morestuff.c7.kafka.us-east-1.amazonaws.com:9098 (id: -2 rack: null) disconnected (org.apache.kafka.clients.NetworkClient:1079)
This error happens continuously for a bit then I see this error:
org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
In the cluster logs I see a corresponding error when I get the disconnect error:
[2023-02-15 20:08:21,627] INFO [SocketServer listenerType=ZK_BROKER, nodeId=3] Failed authentication with /172.32.34.126 (SSL handshake failed) (org.apache.kafka.common.network.Selector)
I have an EC2 client that I've set up to connect to my cluster, and I am able to connect and run commands against the cluster using IAM auth. I have set up a topic and produced to and consumed from it using the console producer/consumer. I've also verified that when the connector starts up it creates the __amazon_msk_connect_status_* and __amazon_msk_connect_offsets_* topics.
I've verified that the IP in the logs is the IP assigned to my connector by checking the Elastic Network Interface it is attached to.
Also, for testing purposes, I've opened up all traffic from 0.0.0.0/0 on the security group they are running in, and made sure the IAM role has msk*, msk-connect*, kafka*, and s3* permissions.
I've also verified CDC is enabled on the RDS instance and that it is working properly. I see changes being picked up and added to the CDC tables.
I believe the issue is related to IAM auth still but am not certain.
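For completeness, this is roughly how the cluster's client-auth modes can be double-checked with boto3 (a sketch; the cluster ARN and region are placeholders):
import boto3

# Cluster ARN and region are placeholders.
kafka = boto3.client("kafka", region_name="us-east-1")
cluster = kafka.describe_cluster(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abcd-1234"
)["ClusterInfo"]
# Shows whether SASL/IAM (and any other modes) are enabled on the cluster listeners.
print(cluster["ClientAuthentication"])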
Cluster Config:
auto.create.topics.enable=true
delete.topic.enable=true
Worker Config:
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
config.providers.secretManager.class=com.github.jcustenborder.kafka.config.aws.SecretsManagerConfigProvider
config.providers=secretManager
config.providers.secretManager.param.aws.region=us-east-1
request.timeout.ms=90000
errors.log.enable=true
errors.log.include.messages=true
Connector Config:
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
tasks.max=1
database.history.consumer.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
schema.include.list=dbo
database.history.producer.sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
database.history.consumer.sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
database.history.consumer.security.protocol=SASL_SSL
database.instance=MSSQLSERVER
topic.prefix=dlp.compcare.ccdemo
schema.history.internal.kafka.topic=dlp.compcare.ccdemo.history
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
database.history.sasl.mechanism=AWS_MSK_IAM
database.encrypt=false
database.history.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
database.history.producer.sasl.mechanism=AWS_MSK_IAM
database.history.producer.sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
database.user=debezium
database.names=Intermodal_CCDEMO
database.history.producer.security.protocol=SASL_SSL
database.server.name=ccdemo_1
schema.history.internal.kafka.bootstrap.servers=b-1:9098
database.port=1433
database.hostname=my-mssql-rds.rds.amazonaws.com
database.history.sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
database.password=${secretManager:dlp-compcare:dbpassword}
table.include.list=dbo.EquipmentSetup
database.history.security.protocol=SASL_SSL
database.history.consumer.sasl.mechanism=AWS_MSK_IAM
I was able to follow this same process with a Postgres RDS instance with no issues.
I've tried everything I can think of, so any and all help would be greatly appreciated!
I also referenced the following when setting up the cluster/connector:
https://catalog.workshops.aws/msk-labs/en-US/mskconnect/source-connector-setup
https://thedataguy.in/debezium-with-aws-msk-iam-authentication/
https://debezium.io/documentation/reference/stable/connectors/sqlserver.html#sqlserver-connector-properties
Streaming MSSQL CDC to AWS MSK with Debezium
https://docs.aws.amazon.com/msk/latest/developerguide/mkc-debeziumsource-connector-example.html

Debezium SQL Server Connector - "Couldn't obtain database name"

I'm trying to set up a Debezium SQL Server Connector against a SQL Server instance that is controlled by DBAs at my workplace. I've been able to start up Zookeeper and Kafka Server without issue, and Kafka Connect itself works with sample Connectors, but when attempting to start a Debezium SQL Server Connector instance I've been getting the error "Couldn't obtain database name".
[2022-07-12 16:36:04,269] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:117)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Unable to connect. Check this and other connection properties. Error: Couldn't obtain database name
Here is my debezium config:
name=Dbz-SqlServer-connector
connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.hostname=MyDbHost
database.port=1433
database.user=MyUsername
database.password=MyPassword
database.dbname=MyDatabase
database.server.name=MyDbHost
table.include.list=dbo.CdcTest
database.history.kafka.bootstrap.servers=localhost:9092
database.history.kafka.topic=dbhistory.CdcTest
I've tried this in a .properties file passed to a standalone Connect instance, and as a JSON POST to a distributed Connect instance. I have tried all of the same steps on both my local Windows machine as well as on a linux VM, with the same results.
Confluent and Docker are not options for me in this situation.
For SQL Server login credentials, I am using a local account on the SQL Server instance that does have access to the database in question. I found the source code for Debezium's connectors on their GitHub and was able to locate that specific error message within the code:
private static final String GET_DATABASE_NAME = "SELECT name FROM sys.databases WHERE name = ?";
...
public String retrieveRealDatabaseName(String databaseName) {
    try {
        return prepareQueryAndMap(GET_DATABASE_NAME,
                ps -> ps.setString(1, databaseName),
                singleResultMapper(rs -> rs.getString(1), "Could not retrieve exactly one database name"));
    }
    catch (SQLException e) {
        throw new RuntimeException("Couldn't obtain database name", e);
    }
}
I'm not completely familiar with Java, but it appears that something is going wrong when the connector tries to run "SELECT name FROM sys.databases WHERE name = 'MyDatabase'". When I run this against the database myself, logged in with the same account, it works just fine, so I'm really not sure where to go from here. It is fair to say that, since I'm not in full control of the SQL Server environment I'm using, there may be permissions issues I'm not aware of, but from what I'm able to test it seems like it should be working.
I would greatly appreciate any help at all, whether just suggestions on settings/configs to check or a full-blown solution.
Thank you!
Update: I've built a simple console app to run that sys.databases query against MyDbHost (master database) as the relevant account, and it works just fine, so I feel that confirms my connection info and account permissions are correct. It seems like this is an issue within the Debezium connector.
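For reference, the same check in Python with pyodbc looks roughly like this (a sketch; the ODBC driver name, host, and credentials are placeholders for the values in the connector config):
import pyodbc

# Driver name, host, and credentials are placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=MyDbHost,1433;DATABASE=master;UID=MyUsername;PWD=MyPassword"
)
cursor = conn.cursor()
cursor.execute("SELECT name FROM sys.databases WHERE name = ?", "MyDatabase")
print(cursor.fetchall())  # Debezium expects exactly one row here; zero rows would explain the error
conn.close()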
It turned out that my problem was a mistake in the connector's config settings. I misunderstood which specific pieces of data to put into database.hostname and database.server.name, and once I corrected those fields the connector works.

Terraform InvalidParameterCombination with AWS Microsoft SQL

I'm having issues when trying to create an AWS RDS instance with Terraform. I have gone through the AWS and Terraform documentation and I just cannot see why this would be an invalid combination. I'm trying to create a free-tier DB for testing:
resource "aws_db_instance" "rds-mssql" {
allocated_storage = 20
engine = "sqlserver-ee"
engine_version = "14.00.3356.20.v1"
instance_class = "db.t2.micro"
name = "mydbtest"
username = "usernameGoesHere"
password = "passwordGoesHere"
license_model = "license-included"
}
Getting the following error:
Error: Error creating DB Instance: InvalidParameterCombination: RDS does not support creating a DB instance with the following combination: DBInstanceClass=db.t2.micro, Engine=sqlserver-ee, EngineVersion=14.00.3356.20.v1, LicenseModel=license-included. For supported combinations of instance class and database engine version, see the documentation
I have followed documentation from here: https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_CreateDBInstance.html
Also, I have manually created a DB with those specs from the AWS console with no problem.
Thanks in advance.
sqlserver-ee is for Enterprise Edition, which does not support db.t2.micro. I guess you want sqlserver-ex (Express Edition).
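If you want to confirm a supported combination before re-running terraform apply, you can ask the RDS API which instance classes are orderable for a given engine and version. A rough boto3 sketch (the engine version and license model are the question's values; the region is a placeholder):
import boto3

rds = boto3.client("rds", region_name="us-east-1")
resp = rds.describe_orderable_db_instance_options(
    Engine="sqlserver-ex",
    EngineVersion="14.00.3356.20.v1",
    LicenseModel="license-included",
)
# Print the instance classes RDS will actually accept for this engine/version.
print(sorted({o["DBInstanceClass"] for o in resp["OrderableDBInstanceOptions"]}))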

Why do I get this error when I connect Snowflake and Python?

This is the error I get when I connect to Snowflake via Python:
OperationalError: 250003: Failed to execute request: ("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])",)
I connect using:
ctx = snowflake.connector.connect(
    user='JoeBloggs',
    password='pwd',
    account='JoeBloggs',
    database='DEV_DATA'
)
Do I need to feed in other parameters such as port, host, etc.? How do I find what these are?
I think your value for 'account' needs to be modified. It looks like you're using your username there, but it should be the Snowflake account. This should be the portion of the URL that you connect directly to that precedes the snowflakecomputing.com portion. For example, 'xy12345.east-us-2.azure'.
My initial thoughts are that the error indicates a firewall or proxy issue. In particular, a proxy might intercept Snowflake's SSL certificate and replace it with their own. The best way to resolve this is to ensure the certificate is trusted in the proxy and the proxy is configured as per Snowflake's documentation so that the Snowflake certificate can pass through.
The documentation below has more information on using a proxy with SnowSQL. You can pass along the error with issuer details to your network engineer and can request to whitelist the required URLs (documentation also below outlining the whitelisting requirements). You can use the SYSTEM$WHITELIST function to get all the URLs to whitelist in a proxy or firewall for your account.
https://docs.snowflake.net/manuals/user-guide/snowsql-start.html#using-a-proxy-server
https://docs.snowflake.net/manuals/user-guide/hostname-whitelist.html
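Once you have any working connection (SnowSQL or the web UI will do), SYSTEM$WHITELIST can be queried programmatically as well; a rough Python sketch with placeholder credentials and account locator:
import snowflake.connector

# Credentials and account locator are placeholders.
con = snowflake.connector.connect(
    user="JoeBloggs",
    password="pwd",
    account="xy12345.east-us-2.azure",
)
try:
    cur = con.cursor()
    cur.execute("select system$whitelist()")
    print(cur.fetchone()[0])  # JSON array of hosts/ports to allow through the proxy or firewall
finally:
    con.close()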
First, install the Snowflake Python connector: pip3 install snowflake-connector-python
Can you try the code below?
import snowflake.connector

PASSWORD = '*****'
USER = '<UNAME>'
ACCOUNT = '<ACCNTNAME>'
WAREHOUSE = '<WHNAME>'
DATABASE = '<DBNAME>'
SCHEMA = 'PUBLIC'

print("Connecting...")
con = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
    warehouse=WAREHOUSE,
    database=DATABASE,
    schema=SCHEMA
)
con.cursor().execute("USE WAREHOUSE " + WAREHOUSE)
con.cursor().execute("USE DATABASE " + DATABASE)
try:
    # <TABLENAME> is a placeholder for the table you want to query
    result = con.cursor().execute("Select * from <TABLENAME>")
    result_list = result.fetchall()
    print(result_list)
finally:
    # close the connection when done
    con.close()

Failure trying to get a pooled connection. java.sql.SQLException: Protocol violation

I'm administering a web-based application that is set up to pull in data from a variety of databases: SQL, Oracle, Mainframe, etc.
I was given credentials to access an Oracle DB, and am establishing a connection through the web-based app server via JDBC. The JDBC connection requires me to provide a Database URL and JDBC driver for the connection. I also built in a SQL statement to pull only the information I needed from the Oracle DB into my web-based app.
Things were running smoothly with this set-up, until just recently. I now receive the following error when trying to establish a connection from my web-based app to the Oracle DB:
Failure trying to get a pooled connection to
[jdbc:oracle:thin:#<SERVER NAME>:1521:cqdb]java.sql.SQLException: Protocol violation
The Oracle DBA I work with has not been very helpful in helping me troubleshoot this issue. Without his help, I really don't even know where to start.
Any suggestions on where to start? I can provide additional information if needed.
*Additional information. This is what is in my STDOUT file in relation to the error. I can keep digging as well:
07:31:08,565 WARN QuartzScheduler_Worker-5 WEB-APP.api.Aggregator:979 - Exception during aggregation. Reason: Failure trying to get a pooled connection to [jdbc:oracle:thin:#SERVER-NAME:1521:cqdb]java.sql.SQLException: Protocol violation
WEB-APP.tools.GeneralException: Failure trying to get a pooled connection to [jdbc:oracle:thin:#SERVER-NAME:1521:cqdb]java.sql.SQLException: Protocol violation
at WEB-APP.api.Aggregator.aggregateAccounts(Aggregator.java:1897)
at WEB-APP.api.Aggregator.execute(Aggregator.java:1222)
at WEB-APP.task.ResourceIdentityScan.execute(ResourceIdentityScan.java:76)
at WEB-APP.api.TaskManager.runSync(TaskManager.java:643)
at WEB-APP.scheduler.JobAdapter.execute(JobAdapter.java:116)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)
Caused by: WEB-APP.connector.ConnectorException: Failure trying to get a pooled connection to [jdbc:oracle:thin:#SERVER-NAME:1521:cqdb]java.sql.SQLException: Protocol violation
at WEB-APP.connector.JDBCConnector.getConnection(JDBCConnector.java:833)
at WEB-APP.connector.JDBCConnector.iterateObjects(JDBCConnector.java:649)
at WEB-APP.connector.JDBCConnector.iterateObjects(JDBCConnector.java:90)
at WEB-APP.connector.ConnectorProxy.iterateObjects(ConnectorProxy.java:109)
at WEB-APP.api.Aggregator.iterateObjects(Aggregator.java:2673)
at WEB-APP.api.Aggregator.aggregateAccounts(Aggregator.java:1818)
... 6 more
Caused by: WEB-APP.tools.GeneralException: Failure trying to get a pooled connection to [jdbc:oracle:thin:#SERVER-NAME:1521:cqdb]java.sql.SQLException: Protocol violation
at WEB-APP.tools.JdbcUtil.getPooledConnection(JdbcUtil.java:1178)
at WEB-APP.tools.JdbcUtil.getConnection(JdbcUtil.java:823)
at WEB-APP.connector.JDBCConnector.getConnection(JDBCConnector.java:830)
... 11 more
07:33:03,983 ERROR http-8080-2 WEB-APP.server.Authenticator:229 - WEB-APP.connector.AuthenticationFailedException: [LDAP: error code 49 - 80090308: LdapErr: DSID-0C090334, comment: AcceptSecurityContext error, data 52e, vece
**Additional Information:
Oracle JDBC Driver version 14, JRE ver 1.6.0_23-b05. I don't have the Oracle DB version. Awaiting response from our Oracle DBA.
***Additional Information:
This issue was resolved. Our Oracle DBA did something on his end to correct the connection issue. He hasn't explained what he did, yet. Thanks for your help. Sorry for not getting you all the info you needed up front.
