Time lag while connecting from Talend-SalesForce using tSalesForceConnection component - salesforce

I am trying to connect to Salesforce using the Salesforce connector in Talend. When I send the connection request, the connection is established only after ~2 seconds, but once it is established, responses from the Salesforce URL come back in milliseconds. I am using tSalesforceConnection to establish the connection from Talend, and the time lag is in that step.
I am using the OAuth authentication type to authenticate with the Salesforce server.
For example, in the tStatCatcher logs below, the connection component takes about 1.5 seconds between begin and end:
2020-10-22 13:52:38|CgqWpf|CgqWpf|CgqWpf|16220|sample|samplelog|_PF7PIBPKEeuGlI4APhARAg|0.1|Default||begin||
2020-10-22 13:52:38|CgqWpf|CgqWpf|CgqWpf|16220|sample|samplelog|_PF7PIBPKEeuGlI4APhARAg|0.1|Default|tSalesforceConnection_2|begin||
2020-10-22 13:52:39|CgqWpf|CgqWpf|CgqWpf|16220|sample|samplelog|_PF7PIBPKEeuGlI4APhARAg|0.1|Default|tSalesforceConnection_2|end|success|1478
1515 milliseconds
2020-10-22 13:52:39|CgqWpf|CgqWpf|CgqWpf|16220|sample|samplelog|_PF7PIBPKEeuGlI4APhARAg|0.1|Default||end|success|1515
Does anyone have any idea why there is a time lag in establishing the connection via Talend, and whether there are ways to improve it?
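To check how much of the lag is the OAuth token exchange itself (TLS setup plus the token POST), a minimal standalone sketch like the one below can time just that round trip outside of Talend. The endpoint and the grant_type=password form are the standard Salesforce username-password flow; all credentials are placeholders.

// Times a bare Salesforce OAuth token request, independent of Talend.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OAuthTiming {
    public static void main(String[] args) throws Exception {
        String body = "grant_type=password"
                + "&client_id=YOUR_CLIENT_ID"         // placeholder
                + "&client_secret=YOUR_CLIENT_SECRET" // placeholder
                + "&username=YOUR_USER"               // placeholder
                + "&password=YOUR_PASSWORD_PLUS_SECURITY_TOKEN"; // placeholder
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("https://login.salesforce.com/services/oauth2/token"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpClient client = HttpClient.newHttpClient();
        long start = System.nanoTime();
        HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Token exchange: HTTP " + resp.statusCode() + " in " + ms + " ms");
    }
}

If this standalone call also takes ~1.5-2 seconds, the lag is the OAuth handshake itself rather than Talend overhead.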

Related

Query triangulation

I have the following usage pattern and I'm wondering if there's a known way to deal with it.
Let's say I have a website where a user can build a query to run against a remote database. The remote database is secure and the user will not have access to it. Therefore, the query, which will be something like SELECT * FROM myTable, will be sent to our web server, and our web server will query the remote DB on another server, receive the results, and pass them back in the HTTP response. So, the flow is:
Location1 (Europe): User/browser submits HTTP POST containing the SQL Query.
Location2 (US): HTTP Server receives request, runs SQL against database.
Location3 (Asia): Database runs query, returns data
Location2 (US): HTTP Server receives SQL resultset back. Sends response.
Location1 (Europe): User/browser receives the data back in the rendered webpage.
Supposing that I don't have control of the three locations, we can see that there may be a lot of data transfer latency if the size of the resultset is large. I was wondering if there is any way to do something like the following instead, and if so how it could be done:
Location1 (Europe): User/browser submits HTTP POST containing the SQL Query.
Location2 (US): HTTP Server receives request, sends back QueryID immediately, runs SQL against database, asynchronously.
Location3 (Asia): Database runs query.
Location1 (Europe): User/browser receives response from database. (How? It cannot pull directly from DB)
To summarize, if we imagine the resultset is 50MB in size, in the first case, the 50MB would go from:
Asia (DB) -> US (Server) -> Europe (Client)
and in the second case it would go from:
Asia (DB) -> Europe (Client)
You can decouple authentication from authorization to allow more flexible connections between all three entities: browser, HTTP server, and DB.
To make your second example work you could do:
The HTTP server (US) asynchronously submits the query to the DB (Asia) and requests an auth token for it.
The HTTP server (US) sends the auth token back to the browser (Europe), while the query is now running.
The browser (Europe) now initiates a second HTTP call against the DB (Asia) using the auth token, and maybe the queryID as well.
The DB will probably need to implement a simple token auth protocol. It should:
Authenticate the incoming auth token.
Retrieve the session.
Start streaming the query result set back to the caller.
For the DB server, there are plenty of out-of-the-box slim Docker images you can spin up in seconds that implement an authorization server and can listen to the browser behind nginx.
As you can see, the architecture can be worked out. However, the DB server in Asia will need to be revamped to implement some kind of token authorization. The simplest and most widespread strategy is OAuth2, which is all the rage nowadays.
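As a rough illustration of those three steps, here is a minimal sketch of a token-checking front for the DB side using the JDK's built-in HTTP server; the token map and the in-memory result map are stand-ins for whatever the real authorization server and async query runner provide.

// Minimal token-auth gateway on the DB (Asia) side: authenticate the token,
// resolve the session/queryId, stream the stored result back to the browser.
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DbTokenGateway {
    // Populated by the async query runner (not shown): queryId -> result bytes.
    static final Map<String, byte[]> results = new ConcurrentHashMap<>();
    // Populated when the US server requests a token: token -> queryId.
    static final Map<String, String> tokens = new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8443), 0);
        server.createContext("/result", exchange -> {
            String auth = exchange.getRequestHeaders().getFirst("Authorization");
            String token = auth == null ? null : auth.replaceFirst("Bearer ", "");
            String queryId = token == null ? null : tokens.get(token);   // 1. authenticate
            byte[] payload = queryId == null ? null : results.get(queryId); // 2. session
            if (payload == null) {
                exchange.sendResponseHeaders(401, -1);
                return;
            }
            exchange.sendResponseHeaders(200, payload.length);            // 3. stream result
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(payload);
            }
        });
        server.start();
    }
}

A real deployment would stream rows incrementally instead of buffering the whole payload, but the three steps (authenticate, resolve the session, stream) stay the same.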
Building on @TheImpaler's answer:
How about adding another table to your remote DB that is just for retrieving query results?
When client asks the backend service for database query, the backend service will generate a UUID or other secure token and tell the DB to run the query and store it under the given UUID. The backend service also returns the UUID to the client who can then retrieve the associated data from the DB directly.
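A sketch of that hand-off in JDBC terms; the query_results table, its columns, and the json_agg aggregation are illustrative (the aggregation is PostgreSQL-style), not a specific product's API.

// Backend service side: run the user's query inside the remote DB and park
// the result under a fresh UUID, so the large payload never travels through
// the US server.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.UUID;

public class QueryHandoff {
    static String submit(Connection remoteDb, String userQuery) throws Exception {
        String token = UUID.randomUUID().toString();
        // INSERT ... SELECT keeps the data inside the DB in Asia.
        // userQuery is user-supplied by design in this scenario.
        String sql = "INSERT INTO query_results (token, payload) "
                   + "SELECT ?, json_agg(t) FROM (" + userQuery + ") t";
        try (PreparedStatement ps = remoteDb.prepareStatement(sql)) {
            ps.setString(1, token);
            ps.executeUpdate();
        }
        return token; // handed to the client, who fetches the row directly from the DB side
    }
}

The client then issues a single SELECT payload FROM query_results WHERE token = ? against the DB-side endpoint, and a cleanup job can delete rows older than some TTL.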
TLDR:
Europe (Client) -> US (Server) -> Asia (Server) -> Asia (DB)
Open an HTTP server in Asia (if you don't have access to the same DC/server, rent a different one), then redirect requests from the US HTTP server to the Asia one, which will connect to the local DB and stream the response.
The redirect can be either a public one (302) or private proxying over VPN if you care about latency and have that option.
The frontend talking to the DB directly is not a very good pattern, because you can't do any of the middleware operations that you'll need in the long term (breaking changes, analytics, authorization, redirects, rate-limiting, scalability...).
If your SQL is very heavy and you can't do synchronous requests over long-lived TCP connections, set up streaming over a websocket server (also in Asia).
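For the public-redirect variant, the US server only needs to answer with a Location header; a minimal sketch with a placeholder Asia hostname:

// US server: bounce the client's query request to the Asia-side HTTP server.
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;

public class UsRedirector {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/query", exchange -> {
            // asia.example.com is a placeholder for the server colocated with the DB.
            String target = "https://asia.example.com" + exchange.getRequestURI();
            exchange.getResponseHeaders().add("Location", target);
            exchange.sendResponseHeaders(302, -1); // client re-issues the request in Asia
            exchange.close();
        });
        server.start();
    }
}

Note that after a 302 most clients re-issue a POST as a GET; use 307 if the method and body must be preserved.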

Openliberty datasource always uses 1 database connection

I have configured Open Liberty (version 21) with a database (Oracle) connection as follows in the server.xml:
<dataSource jndiName="jdbc/myds" transactional="true">
    <connectionManager maxPoolSize="20" minPoolSize="5" agedTimeout="120s" connectionTimeout="10s"/>
    <jdbcDriver libraryRef="jdbcLib" />
    <properties.oracle URL="jdbc:oracle:thin:@..." user="..." password="..."/>
</dataSource>
The server starts and I can make queries to the database via my REST API, but I have noticed that only one database connection is actively used, and parallel HTTP requests result in database queries queuing over that one connection.
I have verified this by monitoring the active open database connections in combination with slow queries (I make several REST calls in parallel). Only one connection is opened, and one query is processed after the other. How do I get a connection pool with, for example, 5-20 connections for parallel operation?
Based on your described usage, the connection pool should be creating connections as requests come in if there are no connections available in the free pool.
Your connectionTimeout is configured to be 10 seconds. One way to ensure that your test really runs in parallel is to make two simultaneous requests to the server, where each request makes the server create a connection, use it, wait 11 seconds, and then close the connection.
If your requests are NOT running in parallel, you will not get any exception, since the second request won't start until after the first one has finished; that would be an issue with your test procedure.
If your requests are running in parallel and you do not get any exception output from Liberty, then Liberty is likely creating multiple connections; that can be confirmed by enabling J2C trace.
See: https://openliberty.io/docs/21.0.0.9/log-trace-configuration.html
Enable: J2C=ALL
If your requests are running in parallel and no more than one connection is being created, then you will get a ConnectionWaitTimeoutException. This could be caused by the driver not being able to create more than one connection, incorrect use of Oracle Universal Connection Pool (UCP), or a number of other factors. I would need more information to debug that issue.
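A minimal sketch of such a parallel test, assuming a JAX-RS endpoint that injects the dataSource by its JNDI name (the path and class names are illustrative):

// Each request takes a connection from the Liberty pool and holds it for
// longer than connectionTimeout="10s" before releasing it.
import javax.annotation.Resource;
import javax.sql.DataSource;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import java.sql.Connection;

@Path("/pooltest")
public class PoolTestResource {

    @Resource(lookup = "jdbc/myds")
    DataSource ds;

    @GET
    public String hold() throws Exception {
        try (Connection con = ds.getConnection()) { // taken from the pool
            Thread.sleep(11_000);                   // hold past connectionTimeout
            return "done on " + Thread.currentThread().getName();
        }
    }
}

Fire two of these requests at the same time (for example, curl in two shells). If both return, the pool created two connections; if the second request fails after ~10 seconds with a ConnectionWaitTimeoutException, only one connection is being created.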

Connection timeout with SlashDB and Azure SQL DB

I've just installed SlashDB and connected to an Azure SQL DB successfully. Querying works and everything is fine. However, after a while, if I retry my previously working query, I get an error from SlashDB:
500 Internal Server Error (pyodbc.OperationalError) ('08S01', u'[08S01] [FreeTDS][SQL Server]Write to the server failed (20006) (SQLExecDirectW)')
I'm not writing anything to the server, if that matters. But if I retry the query immediately, it works. My deep analysis (= guess) is that SQL Server terminates the idle connection. Now, I'd like SlashDB to retry when the query fails, instead of returning an error to the client. Is this possible?
Apparently Azure SQL DB could be breaking connections due to either its redundancy limitations or the 30-minute idle timeout.
https://azure.microsoft.com/en-us/blog/connections-and-sql-azure/
For performance reasons SlashDB does not establish a new connection for every request, but instead maintains a pool of connections.
MySQL has a similar, widely known behavior (a 60-minute idle timeout), and SlashDB actually has logic to attempt a reconnect for it. It should implement the same for all database types, but I need to confirm that with the development team (and fix it if that is not the case).
In the meantime, you could either retry on the client side or send a periodic request to avoid the idle timeout.
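A client-side retry can be as small as re-issuing the request once on a 5xx response, since the second attempt gets a fresh connection from SlashDB's pool; a sketch using Java's built-in HTTP client (the URL is a placeholder):

// Re-issues a SlashDB query once if the first attempt hits a stale pooled
// connection and comes back as a 5xx.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RetryingQuery {
    static HttpResponse<String> get(HttpClient client, String url) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(url)).build();
        HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
        if (resp.statusCode() >= 500) { // stale connection: retry once immediately
            resp = client.send(req, HttpResponse.BodyHandlers.ofString());
        }
        return resp;
    }
}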

Multi connection to SQL Server database

I have been facing an issue since yesterday. I have a SQL Server database located at a remote hosting provider. The provider claims there can be at most 5 connections at the same time.
I also have my own app that I developed. Until now there was only one user using the application, and there was no problem connecting to the database.
Now we have an additional user who will be working with this app from another place. The problem is that when the first user is logged into the program and using it, the second user gets a message on the login form that the app cannot connect to the SQL Server, and the application hangs for around 30 seconds before that message appears.
It seems there can be only one connection at a time, not 5. Can you advise anything, or is there a test I can run to check this and find out where I stand?
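A simple way to check the real limit is to keep opening plain JDBC connections until one fails and count how far you get; the connection string below is a placeholder for the provider's server:

// Opens connections one by one and reports how many succeed before a failure.
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.List;

public class MaxConnTest {
    public static void main(String[] args) {
        List<Connection> open = new ArrayList<>();
        String url = "jdbc:sqlserver://host;databaseName=db;user=u;password=p"; // placeholder
        try {
            for (int i = 1; i <= 10; i++) {
                open.add(DriverManager.getConnection(url));
                System.out.println("opened connection #" + i);
            }
        } catch (Exception e) {
            System.out.println("failed after " + open.size() + " connections: " + e.getMessage());
        } finally {
            for (Connection c : open) try { c.close(); } catch (Exception ignored) {}
        }
    }
}

If this fails after the first connection, the limit is on the provider's side; if it reaches 5, the problem is more likely that the application holds its single connection in a way that blocks the second user.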

Reconnect to database isn't happening after Mule starts

I am trying the Reconnection configuration in Mule for database connection. My goal is to make Mule retry the connection if the database is down, or at least keep the message somewhere so that I can post the message again later.
So I configure the Reconnection in the jdbc connector as follows:
<jdbc-ee:connector name="MyDatabase" dataSource-ref="DB2_Data_Source" validateConnections="true" queryTimeout="-1" pollingFrequency="0" doc:name="Database">
    <reconnect frequency="6000" count="8"/>
</jdbc-ee:connector>
It works if I start Mule while the database is down. I can see in the log that Mule retries the connection.
The problem comes when the database is down after Mule has already started. I tried by shutting down my database after I ran the flow. I don't see Mule trying to reconnect to the database. When I post a message that hits the database endpoint, it just throws a SQLException and doesn't try to reconnect. Thus my message is lost.
Is there a way to make Mule retry the connection if database is down after Mule starts and log the error somewhere if the retry fails?
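The behaviour I'm after amounts to retry-on-failure around the query itself. As a sketch of the idea in plain Java, mirroring the frequency and count semantics of my <reconnect> element (the DataSource and SQL here are placeholders):

// Generic retry-on-SQLException: try the statement, and on failure wait
// frequency ms and retry up to count times before giving up.
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public class RetryingDb {
    static void runWithRetry(DataSource ds, String sql) throws Exception {
        SQLException last = null;
        for (int attempt = 0; attempt < 8; attempt++) {      // count="8"
            try (Connection con = ds.getConnection();
                 Statement st = con.createStatement()) {
                st.execute(sql);
                return;                                      // success
            } catch (SQLException e) {
                last = e;                                    // DB likely down
                Thread.sleep(6000);                          // frequency="6000"
            }
        }
        throw last; // all retries failed: log and park the message for replay
    }
}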
