Timeout waiting for connection from pool - despite single SolrServer - solr

We are having problems with our solrServer client's connection pool running out of connections in no time, even when using a pool of several hundred (we've tried 1024, just for good measure).
From what I've read, the following exception can be caused by not using a singleton HttpSolrServer object. However, see our XML config below, as well:
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232)
at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:455)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
XML Config:
<solr:solr-server id="solrServer" url="http://solr.url.domain/"/>
<solr:repositories base-package="de.ourpackage.data.solr" multicore-support="true"/>
At this point, we are at a loss. We are running a web application on a tomcat7. Whenever a user requests a new website, we send one or more request to the Solr Server, requesting whatever we need, which are usually single entries or page of 20 (using Spring Data).
As for the rest of our implementation, we are using an abstract SolrOperationsrepository class, which is extended by each of our repositories (one repository for each core).
The following is how we set our solrServer. I suspect we are doing something fundamentally wrong here, which is why our connections are overflowing. According to the logs, they are always being returned into the pool, btw.
private SolrOperations solrOperations;
#SuppressWarnings("unchecked")
public final Class<T> getEntityClass() {
return (Class<T>)((ParameterizedType)getClass().getGenericSuperclass()).getActualTypeArguments()[0];
}
public final SolrOperations getSolrOperations() {
/*HttpSolrServer solrServer = (HttpSolrServer)solrOperations.getSolrServer();
solrServer.getHttpClient().getConnectionManager().closeIdleConnections(500, TimeUnit.MILLISECONDS);*/
logger.info("solrOperations: " + solrOperations);
return solrOperations;
}
#Autowired
public final void setSolrServer(SolrServer solrServer) {
try {
String core = SolrServerUtils.resolveSolrCoreName(getEntityClass());
SolrTemplate template = templateHolder.get(core);
/*solrServer.setConnectionTimeout(500);
solrServer.setMaxTotalConnections(2048);
solrServer.setDefaultMaxConnectionsPerHost(2048);
solrServer.getHttpClient().getConnectionManager().closeIdleConnections(500, TimeUnit.MILLISECONDS);*/
if ( template == null ) {
template = new SolrTemplate(new MulticoreSolrServerFactory(solrServer));
template.setSolrCore(core);
template.afterPropertiesSet();
logger.debug("Creating new SolrTemplate for core '" + core + "'");
templateHolder.put(core, template);
}
logger.debug("setting SolrServer " + template);
this.solrOperations = template;
} catch (Exception e) {
logger.error("cannot set solrServer...", e);
}
}
The code that is commented out has been mostly used for testing purposes. I also read somewhere else that you cannot manipulate the solrServer object on-the-fly. Which begs the question, how do I set a timeout/poolsize in the XML config?
The implementation of a repository looks like this:
#Repository(value="stellenanzeigenSolrRepository")
public class StellenanzeigenSolrRepositoryImpl extends SolrOperationsRepository<Stellenanzeige> implements StellenanzeigenSolrRepositoryCustom {
...
public Query createQuery(Criteria criteria, Sort sort, Pageable pageable) {
Query resultQuery = new SimpleQuery(criteria);
if ( pageable != null ) resultQuery.setPageRequest(pageable);
if ( sort != null ) resultQuery.addSort(sort);
return resultQuery;
}
public Page<Stellenanzeige> findBySearchtext(String searchtext, Pageable pageable) {
Criteria searchtextCriteria = createSearchtextCriteria(searchtext);
Query query = createQuery(searchtextCriteria, null, pageable);
return getSolrOperations().queryForPage(query, getEntityClass());
}
...
}
Can any of you point to mistakes that we've made, that could possibly lead to this issue? Like I said, we are at a loss. Thanks in advance, and I will, of course update the question as we make progress or you request more information.

The MulticoreServerFactory always returns an object of HttpClient, that only ever allows 2 concurrent connections to the same host, thus causing the above problem.
This seems to be a bug with spring-data-solr that can be worked around by creating a custom factory and overriding a few methods.
Edit: The clone method in MultiCoreSolrServerFactory is broken. This hasn't been corrected yet. As some of my colleagues have run into this issue recently, I will post a workaround here - create your own class and override one method.
public class CustomMulticoreSolrServerFactory extends MulticoreSolrServerFactory {
public CustomMulticoreSolrServerFactory(final SolrServer solrServer) {
super(solrServer);
}
#Override
protected SolrServer createServerForCore(final SolrServer reference, final String core) {
// There is a bug in the original SolrServerUtils.cloneHttpSolrServer()
// method
// that doesn't clone the ConnectionManager and always returns the
// default
// PoolingClientConnectionManager with a maximum of 2 connections per
// host
if (StringUtils.hasText(core) && reference instanceof HttpSolrServer) {
HttpClient client = ((HttpSolrServer) reference).getHttpClient();
String baseURL = ((HttpSolrServer) reference).getBaseURL();
baseURL = SolrServerUtils.appendCoreToBaseUrl(baseURL, core);
return new HttpSolrServer(baseURL, client);
}
return reference;
}
}

Related

.NET 7 Distributed Transactions issues

I am developing small POC application to test .NET7 support for distributed transactions since this is pretty important aspect in our workflow.
So far I've been unable to make it work and I'm not sure why. It seems to me either some kind of bug in .NET7 or im missing something.
In short POC is pretty simple, it runs WorkerService which does two things:
Saves into "bussiness database"
Publishes a message on NServiceBus queue which uses MSSQL Transport.
Without Transaction Scope this works fine however, when adding transaction scope I'm asked to turn on support for distributed transactions using:
TransactionManager.ImplicitDistributedTransactions = true;
Executable code in Worker service is as follows:
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
int number = 0;
try
{
while (!stoppingToken.IsCancellationRequested)
{
number = number + 1;
using var transactionScope = TransactionUtils.CreateTransactionScope();
await SaveDummyDataIntoTable2Dapper($"saved {number}").ConfigureAwait(false);
await messageSession.Publish(new MyMessage { Number = number }, stoppingToken)
.ConfigureAwait(false);
_logger.LogInformation("Publishing message {number}", number);
_logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);
transactionScope.Complete();
_logger.LogInformation("Transaction complete");
await Task.Delay(1000, stoppingToken);
}
}
catch (Exception e)
{
_logger.LogError("Exception: {ex}", e);
throw;
}
}
Transaction scope is created with the following parameters:
public class TransactionUtils
{
public static TransactionScope CreateTransactionScope()
{
var transactionOptions = new TransactionOptions();
transactionOptions.IsolationLevel = IsolationLevel.ReadCommitted;
transactionOptions.Timeout = TransactionManager.MaximumTimeout;
return new TransactionScope(TransactionScopeOption.Required, transactionOptions,TransactionScopeAsyncFlowOption.Enabled);
}
}
Code for saving into database uses simple dapper GenericRepository library:
private async Task SaveDummyDataIntoTable2Dapper(string data)
{
using var scope = ServiceProvider.CreateScope();
var mainTableRepository =
scope.ServiceProvider
.GetRequiredService<MainTableRepository>();
await mainTableRepository.InsertAsync(new MainTable()
{
Data = data,
UpdatedDate = DateTime.Now
});
}
I had to use scope here since repository is scoped and worker is singleton so It cannot be injected directly.
I've tried persistence with EF Core as well same results:
Transaction.Complete() line passes and then when trying to dispose of transaction scope it hangs(sometimes it manages to insert couple of rows then hangs).
Without transaction scope everything works fine
I'm not sure what(if anything) I'm missing here or simply this still does not work in .NET7?
Note that I have MSDTC enable on my machine and im executing this on Windows 10
We've been able to solve this by using the following code.
With this modification DTC is actually invoked correctly and works from within .NET7.
using var transactionScope = TransactionUtils.CreateTransactionScope().EnsureDistributed();
Extension method EnsureDistributed implementation is as follows:
public static TransactionScope EnsureDistributed(this TransactionScope ts)
{
Transaction.Current?.EnlistDurable(DummyEnlistmentNotification.Id, new DummyEnlistmentNotification(),
EnlistmentOptions.None);
return ts;
}
internal class DummyEnlistmentNotification : IEnlistmentNotification
{
internal static readonly Guid Id = new("8d952615-7f67-4579-94fa-5c36f0c61478");
public void Prepare(PreparingEnlistment preparingEnlistment)
{
preparingEnlistment.Prepared();
}
public void Commit(Enlistment enlistment)
{
enlistment.Done();
}
public void Rollback(Enlistment enlistment)
{
enlistment.Done();
}
public void InDoubt(Enlistment enlistment)
{
enlistment.Done();
}
This is 10year old code snippet yet it works(im guessing because .NET Core merely copied and refactored the code from .NET for DistributedTransactions, which also copied bugs).
What it does it creates Distributed transaction right away rather than creating LTM transaction then promoting it to DTC if required.
More details explanation can be found here:
https://www.davidboike.dev/2010/04/forcibly-creating-a-distributed-net-transaction/
https://github.com/davybrion/companysite-dotnet/blob/master/content/blog/2010-03-msdtc-woes-with-nservicebus-and-nhibernate.md
Ensure you're using Microsoft.Data.SqlClient +v5.1
Replace all "usings" System.Data.SqlClient > Microsoft.Data.SqlClient
Ensure ImplicitDistributedTransactions is set True:
TransactionManager.ImplicitDistributedTransactions = true;
using (var ts = new TransactionScope(your options))
{
TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);
... your code ..
ts.Complete();
}

Spring boot Oracle JPA set QueryTimeout

I'm using Spring Boot with Ojdbc8 18.3.0.0.0
With Hikari Datasource and JPA, all query work fine.
But now I need to set Query timeout for all database query
I was try many way:
javax.persistence.query.timeout=1000
spring.transaction.default-timeout=1
spring.jdbc.template.query-timeout=1
spring.jpa.properties.hibernate.c3p0.timeout=1
spring.jpa.properties.javax.persistence.query.timeout=1
Config Class:
#Configuration
public class JPAQueryTimeout {
#Value("${spring.jpa.properties.javax.persistence.query.timeout}")
private int queryTimeout;
#Bean
public PlatformTransactionManager transactionManager() throws Exception {
JpaTransactionManager txManager = new JpaTransactionManager();
txManager.setDefaultTimeout(queryTimeout); //Put 1 seconds timeout
return txManager;
}
}
Query:
List<Integer> llll = manager.createNativeQuery("select test_sleep(5) from dual")
.setHint("javax.persistence.query.timeout", 1).getResultList();
The database task take 5 second before return value, but in all of case, no error occor.
Could anyone tell me how to set query timeout?
You can try using the simplest solution, that is using the timeout value within #Transactional;
#Transactional(timeout = 1) // in seconds (so that is 1 second timeout)
public Foo runQuery(String id) {
String result = repo.findById(id);
// other logic
}
Be aware that the method annotated with #Transactional must be public for it to work properly

Hystrix Javanica : Call always returning result from fallback method.(java web app without spring)

I am trying to integrate Hystrix javanica into my existing java EJB web application and facing 2 issues with running it.
When I try to invoke following service it always returns response from fallback method and I see that the Throwable object in fallback method has "com.netflix.hystrix.exception.HystrixTimeoutException" exception.
Each time this service is triggered, HystrixCommad and fallback methods are called multiple times around 50 times.
Can anyone suggest me with any inputs? Am I missing any configuration?
I am including following libraries in my project.
project libraries
I have setup my aspect file as follows:
<aspectj>
<weaver options="-verbose -showWeaveInfo"></weaver>
<aspects>
<aspect name="com.netflix.hystrix.contrib.javanica.aop.aspectj.HystrixCommandAspect"/>
</aspects>
</aspectj>
Here is my config.properties file in META-INF/config.properties
hystrix.command.default.execution.timeout.enabled=false
Here is my rest service file
#Path("/hystrix")
public class HystrixService {
#GET
#Path("clusterName")
#Produces({ MediaType.APPLICATION_JSON })
public Response getClusterName(#QueryParam("id") int id) {
ClusterCmdBean clusterCmdBean = new ClusterCmdBean();
String result = clusterCmdBean.getClusterNameForId(id);
return Response.ok(result).build();
}
}
Here is my bean class
public class ClusterCmdBean {
#HystrixCommand(groupKey = "ClusterCmdBeanGroup", commandKey = "getClusterNameForId", fallbackMethod = "defaultClusterName")
public String getClusterNameForId(int id) {
if (id > 0) {
return "cluster"+id;
} else {
throw new RuntimeException("command failed");
}
}
public String defaultClusterName(int id, Throwable e) {
return "No cluster - returned from fallback:" + e.getMessage();
}
}
Thanks for the help.
If you want to ensure you are setting the property, you can do that explicitly in the circuit annotation itself:
#HystrixCommand(commandProperties = {
#HystrixProperty(name = "execution.timeout.enabled", value = "false")
})
I would only recommend this for debugging purposes though.
Something that jumps out to me is that Javanica uses AspectJ AOP, which I have never seen work with new MyBean() before. I've always have to use #Autowired with Spring or similar to allow proxying. This could well just be something that is new to me though.
If you set a breakpoint inside the getClusterNameForId can you see in the stack trace that its being called via reflection (which it should be AFAIK)?
Note you can remove commandKey as this will default to the method name. Personally I would also remove groupKey and let it default to the class name.

Exception (cause) is always null in Hystrix feign fallback

I have an issue finding the exception cause in FallBackFactory, mine is a old application, so i can not use spring cloud approach (with annotations etc..)
I found the below solution, but still not working for me:
Issue in getting cause in HystrixFeign client fallback
Here is the code i have:
public static class ProfileFallbackFactory implements ProfileProxy, FallbackFactory<ProfileFallbackFactory> {
final Throwable cause;
public ProfileFallbackFactory() {
this(null);
}
ProfileFallbackFactory(Throwable cause) {
this.cause = cause;
}
#Override
public ProfileFallbackFactory create(Throwable cause) {
LOG.info("Profile fallback create "+cause);
return new ProfileFallbackFactory(cause);
}
public Profile getProfile(String id) {
}
instance creation:
profileProxy = new HystrixFeign.Builder().setterFactory(new CustomSetterFactory())
.decode404()
.decoder(new GsonDecoder(gsonWithDateFormat))
.encoder(new GsonEncoder(gsonWithDateFormat))
.errorDecoder(new profileProxyErrorDecoder())
.target(ProfileProxy.class,profileServiceUrl, (FallbackFactory<ProfileFallbackFactory>)new ProfileFallbackFactory());
There is a logger added in ProfileProxyErrorDecoder class, but this logger is not found in logs. I can see com.netflix.hystrix.exception.HystrixRuntimeException in server logs
Can someone please point me where i am going wrong?

DataContract doesn't work after publish into web site

I tried to solve by myself, but... Looks like I need help from people.
I have Business Silverlight application with WCF RIA and EntityFramework. Access to Database I get via LinqToEntites.
Common loading data from database I making by this:
return DbContext.Customers
This code returns full Customers table from DataBase. But sometimes I do not need to show all data. Easy way is use linq filters in client side by next code:
public LoadInfo()
{
...
var LO1 = PublicDomainContext.Load(PublicDomainContext.GetCustomersQuery());
LO1.Completed += LO1Completed;
...
}
private void LO1Completed(object sender, EventArgs eventArgs)
{
...
DatatViewGrid.ItemsSource = null;
DatatViewGrid.ItemsSource = loadOperation.Entities.Where(c=>c ...filtering...);
//or PublicDomainContext.Customers.Where(c=>c ...filtering...)
...
}
However this way has very and very important flaw: all data passing from server to client side via DomainService may be viewed by applications like Fiddler. So I need to come up with another way.
Task: filter recieving data in server side and return this data.
Way #1: LinqToEntites has a beautiful projection method:
//MSDN Example
var query =
contacts.SelectMany(
contact => orders.Where(order =>
(contact.ContactID == order.Contact.ContactID)
&& order.TotalDue < totalDue)
.Select(order => new
{
ContactID = contact.ContactID,
LastName = contact.LastName,
FirstName = contact.FirstName,
OrderID = order.SalesOrderID,
Total = order.TotalDue
}));
But, unfortunately, DomainServices cannot return undefined types, so this way won't work.
Way #2: I found next solution - make separate DTO classes (DataTransferObject). I just read some samples and made on the server side next class:
[DataContract]
public partial class CustomerDTO
{
[DataMember]
public int ISN { get; set; }
[DataMember]
public string FIO { get; set; }
[DataMember]
public string Listeners { get; set; }
}
And based this class I made a row of methods which return filtered data:
[OperationContract]
public List<CustomerDTO> Customers_Common()
{
return DbContext.Customers....Select(c => new CustomerDTO { ISN = c.ISN, FIO = c.FIO, Listeners = c.Listeners }).ToList();
}
And this works fine, all good...
But, there is strange problem: running application locally does not affect any troubles, but after publishing project on the Web Site, DomainService returns per each method HTTP 500 Error ("Not Found" exception). Of course, I cannot even LogIn into my application. DomainService is dead. If I delete last class and new methods from application and republish - all works fine, but without speacial filtering...
The Question: what I do wrong, why Service is dying with new classes, or tell me another way to solve my trouble. Please.
U P D A T E :
Hey, finally I solved this!
There is an answer: Dynamic query with WCF RIA Services
Your best shot is to find out what is causing the error. For that, override the OnError method on the DomainService like this:
protected override void OnError(DomainServiceErrorInfo errorInfo)
{
/* Log the error info to a file. Don't forget inner exceptions.
*/
base.OnError(errorInfo);
}
This is useful, because only two exceptions will be passed to the client, so if there are a lot of nested inner exceptions, you should still be able to see what actually causes the error.
In addition, you can inspect the error by attaching the debugger to the browser instance you are opening the site with. In VS2010 this is done by doing [Debug] -> [Attach to Process] in the menu-bar.

Resources