SQL Server 2022 in LXC crashes and won't get back up - sql-server

I have a serious problem with SQL Server 2022 in LXC containers on Proxmox VE (PVE).
SQL Server 2019 works fine in the same environment, but SQL Server 2022 crashes after a few months of running well. After the crash it cannot be restarted; the only solution I have found is to reinstall the container.
Feb 19 04:30:58 sqll sqlservr[4857]: This program has encountered a fatal error and cannot continue running at Sun Feb 19 04:30:58 2023
Feb 19 04:30:58 sqll sqlservr[4857]: The following diagnostic information is available:
Feb 19 04:30:58 sqll sqlservr[4857]: Reason: 0x00000001
Feb 19 04:30:58 sqll sqlservr[4857]: Signal: SIGABRT - Aborted (6)
Feb 19 04:30:58 sqll sqlservr[4857]: Stack:
Feb 19 04:30:58 sqll sqlservr[4857]: IP Function
Feb 19 04:30:58 sqll sqlservr[4857]: ---------------- --------------------------------------
Feb 19 04:30:58 sqll sqlservr[4857]: 0000562a9b7549d2 <unknown>
Feb 19 04:30:58 sqll sqlservr[4857]: 0000562a9b7543f0 <unknown>
Feb 19 04:30:58 sqll sqlservr[4857]: 0000562a9b753a8f <unknown>
Feb 19 04:30:58 sqll sqlservr[4857]: 00007f093a98f090 killpg+0x40
Feb 19 04:30:58 sqll sqlservr[4857]: 00007f093a98f00b gsignal+0xcb
Feb 19 04:30:58 sqll sqlservr[4857]: 00007f093a96e859 abort+0x12b
Feb 19 04:30:58 sqll sqlservr[4857]: 0000562a9b70ec26 <unknown>
Feb 19 04:30:58 sqll sqlservr[4857]: 0000562a9b78576a <unknown>
Feb 19 04:30:58 sqll sqlservr[4857]: Process: 4857 - sqlservr
Feb 19 04:30:58 sqll sqlservr[4857]: Thread: 4862 (application thread 0x8)
Feb 19 04:30:58 sqll sqlservr[4857]: Instance Id: 1f101619-527d-4094-ac00-c2b857339367
Feb 19 04:30:58 sqll sqlservr[4857]: Crash Id: 8b1a4608-d26c-421a-beff-e83d1f30b91f
Feb 19 04:30:58 sqll sqlservr[4857]: Build stamp: 7381104a7baabc096c65c6cf9b3c3c2c36e1f155ac27f817e87fab585602cb5f
Feb 19 04:30:58 sqll sqlservr[4857]: Distribution: Ubuntu 20.04.5 LTS
Feb 19 04:30:58 sqll sqlservr[4857]: Processors: 2
Feb 19 04:30:58 sqll sqlservr[4857]: Total Memory: 67350208512 bytes
Feb 19 04:30:58 sqll sqlservr[4857]: Timestamp: Sun Feb 19 04:30:58 2023
Feb 19 04:30:58 sqll sqlservr[4857]: Last errno: 2
Feb 19 04:30:58 sqll sqlservr[4857]: Last errno text: No such file or directory
Feb 19 04:30:58 sqll sqlservr[4855]: Capturing a dump of 4857
Feb 19 04:30:58 sqll sqlservr[4855]: FAILED to capture a dump. Details in paldumper log.
Feb 19 04:30:58 sqll sqlservr[4855]: Executing: /opt/mssql/bin/handle-crash.sh with parameters
Feb 19 04:30:58 sqll sqlservr[4855]: handle-crash.sh
Feb 19 04:30:58 sqll sqlservr[4855]: /opt/mssql/bin/sqlservr
Feb 19 04:30:58 sqll sqlservr[4855]: 4857
Feb 19 04:30:58 sqll sqlservr[4855]: /opt/mssql/bin
Feb 19 04:30:58 sqll sqlservr[4855]: /var/opt/mssql/log/
Feb 19 04:30:58 sqll sqlservr[4855]:
Feb 19 04:30:58 sqll sqlservr[4855]: 1f101619-527d-4094-ac00-c2b857339367
Feb 19 04:30:58 sqll sqlservr[4855]: 8b1a4608-d26c-421a-beff-e83d1f30b91f
Feb 19 04:30:58 sqll sqlservr[4855]:
Feb 19 04:30:58 sqll sqlservr[4855]:
Feb 19 04:30:58 sqll sqlservr[4855]: Ubuntu 20.04.5 LTS
This probably needs attention from somebody at Microsoft.
Related reports:
https://github.com/MicrosoftDocs/sql-docs/issues/8525
https://forum.proxmox.com/threads/sqlserver-2022-lxc.120898/
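When reporting this upstream, the identifiers in the crash banner (Crash Id, Instance Id) are what Microsoft can match against their symbols. A small sketch for pulling them out of a saved journal snippet; the helper names are mine, and the `sed` patterns simply match the log format shown above.

```shell
# Helpers (hypothetical) to extract the identifiers from a saved log file;
# the patterns match the "Crash Id:" / "Instance Id:" lines printed above.
crash_id()    { sed -n 's/.*Crash Id: //p'    "$1"; }
instance_id() { sed -n 's/.*Instance Id: //p' "$1"; }

# e.g.:  journalctl -u mssql-server -b > crash.log && crash_id crash.log
```

Also grab whatever landed in /var/opt/mssql/log/ (the paldumper log mentioned in the "FAILED to capture a dump" line) before rebuilding the container, since the reinstall destroys the evidence.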

Related

Apache2 Error AH00072 Reliably determine the server

I want to create a database, and I'm used to phpMyAdmin. I wanted to use phpMyAdmin
for it, but after installing PHP I got this error. Any clues?
● apache2.service - The Apache HTTP Server
Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2023-01-29 00:16:49 UTC; 6min ago
Docs: https://httpd.apache.org/docs/2.4/
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260775]: AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260775]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260775]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260775]: no listening sockets available, shutting down
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260775]: AH00015: Unable to open logs
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260772]: Action 'start' failed.
Jan 29 00:16:49 ubuntu-4gb-hel1-2 apachectl[260772]: The Apache error log may have more information.
Jan 29 00:16:49 ubuntu-4gb-hel1-2 systemd[1]: apache2.service: Control process exited, code=exited, status=1/FAILURE
Jan 29 00:16:49 ubuntu-4gb-hel1-2 systemd[1]: apache2.service: Failed with result 'exit-code'.
Jan 29 00:16:49 ubuntu-4gb-hel1-2 systemd[1]: Failed to start The Apache HTTP Server.
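The decisive line here is AH00072: something else is already bound to port 80 (often a stray apache2 process or nginx); `sudo ss -ltnp 'sport = :80'` shows the owner, which you can then stop or reconfigure. The AH00558 ServerName message is only a warning. As a sketch, the failure lines can even be classified mechanically; the patterns below match the messages in the log above.

```shell
# Classify an apachectl failure line (sketch; patterns match the log above).
# AH00072 = port already taken (the real failure); AH00558 = cosmetic warning.
apache_fail_reason() {
  case "$1" in
    *AH00072*) echo "port already in use - find the owner: ss -ltnp" ;;
    *AH00558*) echo "ServerName unset - warning only, not fatal" ;;
    *)         echo "unknown - check the Apache error log" ;;
  esac
}

apache_fail_reason "(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80"
```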

MongoDB lost all data after several days on an AWS server

I'm using a MongoDB database linked to an AngularJS/NodeJS website on an Amazon server running Ubuntu 14.04.
For the past month, every 5 or 6 days my data becomes unreachable and I can't log in to the website with email/password. I have to shut down the database and relaunch it to make it work again, and all the stored data is lost.
I don't understand why, and the logfile looks normal.
Here is the full log from when the bug appears:
Thu Aug 17 00:38:52.947 [initandlisten] MongoDB starting : pid=1714 port=27017 dbpath=/home/ubuntu/data/db 64-bit host=ip-myIp
Thu Aug 17 00:38:52.947 [initandlisten] db version v2.4.9
Thu Aug 17 00:38:52.947 [initandlisten] git version: nogitversion
Thu Aug 17 00:38:52.947 [initandlisten] build info: Linux orlo 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_54
Thu Aug 17 00:38:52.947 [initandlisten] allocator: tcmalloc
Thu Aug 17 00:38:52.947 [initandlisten] options: { dbpath: "/home/ubuntu/data/db", fork: true, logpath: "/var/log/mongod.log" }
Thu Aug 17 00:38:52.952 [initandlisten] journal dir=/home/ubuntu/data/db/journal
Thu Aug 17 00:38:52.952 [initandlisten] recover : no journal files present, no recovery needed
Thu Aug 17 00:38:52.978 [initandlisten] waiting for connections on port 27017
Thu Aug 17 00:38:52.978 [websvr] admin web console waiting for connections on port 28017
Thu Aug 17 00:40:23.927 [initandlisten] connection accepted from 127.0.0.1:52328 #1 (1 connection now open)
Thu Aug 17 00:42:43.295 [conn1] end connection 127.0.0.1:52328 (0 connections now open)
Thu Aug 17 00:43:17.159 [initandlisten] connection accepted from 127.0.0.1:52329 #2 (1 connection now open)
Thu Aug 17 00:47:13.931 [initandlisten] connection accepted from 127.0.0.1:52330 #3 (2 connections now open)
Thu Aug 17 02:53:56.046 [initandlisten] connection accepted from 62.210.127.77:35059 #4 (3 connections now open)
Thu Aug 17 02:53:57.064 [conn4] end connection 62.210.127.77:35059 (2 connections now open)
Thu Aug 17 02:53:57.096 [initandlisten] connection accepted from 62.210.127.17:51812 #5 (3 connections now open)
Thu Aug 17 02:53:57.125 [initandlisten] connection accepted from 62.210.127.17:51816 #6 (4 connections now open)
Thu Aug 17 02:53:57.532 [conn5] end connection 62.210.127.17:51812 (3 connections now open)
Thu Aug 17 02:53:57.532 [conn6] end connection 62.210.127.17:51816 (2 connections now open)
Thu Aug 17 03:23:44.832 [initandlisten] connection accepted from 74.82.47.5:35734 #7 (3 connections now open)
Thu Aug 17 03:23:44.976 [conn7] end connection 74.82.47.5:35734 (2 connections now open)
Thu Aug 17 03:23:57.019 [initandlisten] connection accepted from 74.82.47.5:41550 #8 (3 connections now open)
Thu Aug 17 03:23:57.172 [conn8] end connection 74.82.47.5:41550 (2 connections now open)
Thu Aug 17 05:45:19.925 [initandlisten] connection accepted from 220.181.159.73:40602 #9 (3 connections now open)
Thu Aug 17 05:45:22.925 [conn9] end connection 220.181.159.73:40602 (2 connections now open)
Thu Aug 17 05:45:23.168 [initandlisten] connection accepted from 220.181.159.73:49766 #10 (3 connections now open)
Thu Aug 17 05:45:25.929 [conn10] end connection 220.181.159.73:49766 (2 connections now open)
Thu Aug 17 05:45:26.159 [initandlisten] connection accepted from 220.181.159.73:58268 #11 (3 connections now open)
Thu Aug 17 05:45:26.159 [conn11] end connection 220.181.159.73:58268 (2 connections now open)
Fri Aug 18 03:01:37.788 [initandlisten] connection accepted from 184.105.247.196:61094 #12 (3 connections now open)
Fri Aug 18 03:01:37.931 [conn12] end connection 184.105.247.196:61094 (2 connections now open)
Fri Aug 18 03:01:51.123 [initandlisten] connection accepted from 184.105.247.196:3532 #13 (3 connections now open)
Fri Aug 18 03:01:51.267 [conn13] end connection 184.105.247.196:3532 (2 connections now open)
Sat Aug 19 00:21:23.527 [initandlisten] connection accepted from 45.55.29.41:43416 #14 (3 connections now open)
Sat Aug 19 00:21:33.361 [conn14] end connection 45.55.29.41:43416 (2 connections now open)
Sat Aug 19 03:17:28.802 [initandlisten] connection accepted from 184.105.247.195:42566 #15 (3 connections now open)
Sat Aug 19 03:17:29.028 [conn15] end connection 184.105.247.195:42566 (2 connections now open)
Sat Aug 19 03:17:41.312 [initandlisten] connection accepted from 184.105.247.195:61782 #16 (3 connections now open)
Sat Aug 19 03:17:41.456 [conn16] end connection 184.105.247.195:61782 (2 connections now open)
Sat Aug 19 11:24:28.098 [initandlisten] connection accepted from 168.1.128.35:10000 #17 (3 connections now open)
Sat Aug 19 11:24:31.686 [conn17] end connection 168.1.128.35:10000 (2 connections now open)
Sun Aug 20 03:17:03.998 [initandlisten] connection accepted from 184.105.247.252:57362 #18 (3 connections now open)
Sun Aug 20 03:17:04.298 [conn18] end connection 184.105.247.252:57362 (2 connections now open)
Sun Aug 20 03:17:16.801 [initandlisten] connection accepted from 184.105.247.252:11208 #19 (3 connections now open)
Sun Aug 20 03:17:16.945 [conn19] end connection 184.105.247.252:11208 (2 connections now open)
Sun Aug 20 19:07:53.815 [initandlisten] connection accepted from 106.2.120.103:49396 #20 (3 connections now open)
Sun Aug 20 19:08:03.825 [conn20] end connection 106.2.120.103:49396 (2 connections now open)
Sun Aug 20 23:08:15.624 [initandlisten] connection accepted from 106.2.120.103:48933 #21 (3 connections now open)
Sun Aug 20 23:08:16.383 [conn21] end connection 106.2.120.103:48933 (2 connections now open)
Mon Aug 21 12:38:02.076 [initandlisten] connection accepted from 207.226.141.36:41710 #22 (3 connections now open)
Mon Aug 21 12:38:03.379 [conn22] end connection 207.226.141.36:41710 (2 connections now open)
Mon Aug 21 12:38:03.706 [initandlisten] connection accepted from 207.226.141.36:42522 #23 (3 connections now open)
Mon Aug 21 12:38:04.499 [conn23] dropDatabase BACKUP_DB starting
Mon Aug 21 12:38:04.500 [conn23] removeJournalFiles
Mon Aug 21 12:38:04.507 [conn23] dropDatabase BACKUP_DB finished
Mon Aug 21 12:38:05.037 [conn23] end connection 207.226.141.36:42522 (2 connections now open)
Mon Aug 21 12:38:05.361 [initandlisten] connection accepted from 207.226.141.36:43398 #24 (3 connections now open)
Mon Aug 21 12:38:06.166 [conn24] dropDatabase morethanwinebo starting
Mon Aug 21 12:38:06.166 [conn24] removeJournalFiles
Mon Aug 21 12:38:06.170 [conn24] dropDatabase morethanwinebo finished
Mon Aug 21 12:38:06.708 [conn24] end connection 207.226.141.36:43398 (2 connections now open)
Mon Aug 21 12:38:07.042 [initandlisten] connection accepted from 207.226.141.36:44336 #25 (3 connections now open)
Mon Aug 21 12:38:08.154 [FileAllocator] allocating new datafile /home/ubuntu/data/db/Warning.ns, filling with zeroes...
Mon Aug 21 12:38:08.154 [FileAllocator] creating directory /home/ubuntu/data/db/_tmp
Mon Aug 21 12:38:08.158 [FileAllocator] done allocating datafile /home/ubuntu/data/db/Warning.ns, size: 16MB, took 0.001 secs
Mon Aug 21 12:38:08.158 [FileAllocator] allocating new datafile /home/ubuntu/data/db/Warning.0, filling with zeroes...
Mon Aug 21 12:38:08.161 [FileAllocator] done allocating datafile /home/ubuntu/data/db/Warning.0, size: 64MB, took 0.002 secs
Mon Aug 21 12:38:08.161 [FileAllocator] allocating new datafile /home/ubuntu/data/db/Warning.1, filling with zeroes...
Mon Aug 21 12:38:08.163 [FileAllocator] done allocating datafile /home/ubuntu/data/db/Warning.1, size: 128MB, took 0.001 secs
Mon Aug 21 12:38:08.165 [conn25] build index Warning.Readme { _id: 1 }
Mon Aug 21 12:38:08.166 [conn25] build index done. scanned 0 total records. 0.001 secs
Mon Aug 21 12:38:08.724 [conn25] end connection 207.226.141.36:44336 (2 connections now open)
Mon Aug 21 12:53:15.501 [FileAllocator] allocating new datafile /home/ubuntu/data/db/morethanwinebo.ns, filling with zeroes...
Mon Aug 21 12:53:15.503 [FileAllocator] done allocating datafile /home/ubuntu/data/db/morethanwinebo.ns, size: 16MB, took 0.001 secs
Mon Aug 21 12:53:15.503 [FileAllocator] allocating new datafile /home/ubuntu/data/db/morethanwinebo.0, filling with zeroes...
Mon Aug 21 12:53:15.505 [FileAllocator] done allocating datafile /home/ubuntu/data/db/morethanwinebo.0, size: 64MB, took 0.001 secs
Mon Aug 21 12:53:15.508 [conn3] build index morethanwinebo.sessions { _id: 1 }
Mon Aug 21 12:53:15.508 [conn3] build index done. scanned 0 total records. 0 secs
Mon Aug 21 12:53:15.508 [FileAllocator] allocating new datafile /home/ubuntu/data/db/morethanwinebo.1, filling with zeroes...
Mon Aug 21 12:53:15.510 [FileAllocator] done allocating datafile /home/ubuntu/data/db/morethanwinebo.1, size: 128MB, took 0.001 secs
Tue Aug 22 03:05:13.792 [initandlisten] connection accepted from 74.82.47.2:27720 #26 (3 connections now open)
Tue Aug 22 03:05:14.026 [conn26] end connection 74.82.47.2:27720 (2 connections now open)
Tue Aug 22 03:05:27.955 [initandlisten] connection accepted from 74.82.47.2:52792 #27 (3 connections now open)
Tue Aug 22 03:05:28.099 [conn27] end connection 74.82.47.2:52792 (2 connections now open)
When launching the database, I use this command:
sudo mongod --fork --logpath /var/log/mongod.log --dbpath /home/ubuntu/data/db
Is it because I'm using sudo to launch the database? Maybe mongod needs some read/write permission to do certain things and can't get it, so the bug appears?
My first thought was that the server was too small, so I increased it from 8 GB to 16 GB, but that changed nothing and the bug appeared again yesterday.
Have a look at the activity of IP address 207.226.141.36 on Monday, August 21. You have had an unsolicited visitor:
https://www.abuseipdb.com/check/207.226.141.36
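Beyond restoring from a backup, close the door: a mongod launched as in the question listens on all interfaces with no authentication, so anyone on the internet can connect and drop databases (the dropDatabase lines from 207.226.141.36 above are exactly that). A sketch of a safer launch; --bind_ip and --auth are standard mongod options, and ports 27017/28017 should additionally be firewalled.

```shell
# Safer relaunch (sketch): bind to localhost only and require authentication.
# --bind_ip and --auth are standard mongod flags; also firewall 27017/28017
# at the AWS security-group level so only the app host can reach them.
CMD="mongod --fork --auth --bind_ip 127.0.0.1 --logpath /var/log/mongod.log --dbpath /home/ubuntu/data/db"
echo "$CMD"
```

Note that --auth only helps once an admin user has actually been created; until then, create one via the localhost exception.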

Apache 2 configtest failed: File name too long

I run an Apache 2 web server locally on a Raspberry Pi with the Raspbian distro.
It worked without a problem until suddenly it couldn't be started anymore with
sudo /etc/init.d/apache2 start, which yields the following error:
[....] Starting apache2 (via systemctl): apache2.service
Job for apache2.service failed. See 'systemctl status apache2.service' and 'journalctl -xn' for details.
failed!
Removing and reinstalling with apt-get doesn't solve it. systemctl status shows the following entries:
> Apr 22 11:27:16 raspberrypi apache2[18234]: [344B blob data]
> Apr 22 11:27:16 raspberrypi apache2[18234]: [293B blob data]
> Apr 22 11:27:16 raspberrypi apache2[18234]: [293B blob data]
> Apr 22 11:27:16 raspberrypi apache2[18234]: [293B blob data]
> Apr 22 11:27:16 raspberrypi apache2[18234]: [293B blob data]
> Apr 22 11:27:16 raspberrypi apache2[18234]: Action 'configtest' failed.
> Apr 22 11:27:16 raspberrypi apache2[18234]: The Apache error log may have more information.
> Apr 22 11:27:16 raspberrypi systemd[1]: apache2.service: control process exited, code=exited status=1
> Apr 22 11:27:16 raspberrypi systemd[1]: Failed to start LSB: Apache2 web server.
> Apr 22 11:27:16 raspberrypi systemd[1]: Unit apache2.service entered failed state.
journalctl -xn yields:
> -- Logs begin at Sun 2017-04-23 07:06:12 UTC, end at Sun 2017-04-23 12:38:48 UTC. --
> Apr 23 12:38:30 raspberrypi apache2[3292]: [2.0K blob data]
> Apr 23 12:38:30 raspberrypi apache2[3292]: [1.7K blob data]
> Apr 23 12:38:30 raspberrypi apache2[3292]: Action 'configtest' failed.
> Apr 23 12:38:30 raspberrypi apache2[3292]: The Apache error log may have more information.
> Apr 23 12:38:31 raspberrypi systemd[1]: apache2.service: control process exited, code=exited status=1
> Apr 23 12:38:31 raspberrypi systemd[1]: Failed to start LSB: Apache2 web server.
> -- Subject: Unit apache2.service has failed
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit apache2.service has failed.
> --
> -- The result is failed.
> Apr 23 12:38:31 raspberrypi systemd[1]: Unit apache2.service entered failed state.
> Apr 23 12:38:31 raspberrypi sudo[3278]: pam_unix(sudo:session): session closed for user root
> Apr 23 12:38:48 raspberrypi sudo[3345]: pi : TTY=pts/0 ; PWD=/etc/apache2/conf-enabled ; USER=root ; COMMAND=/bin/journalctl -xn
> Apr 23 12:38:48 raspberrypi sudo[3345]: pam_unix(sudo:session): session opened for user root by pi(uid=0)
Unfortunately the Apache error.log contains no useful information whatsoever. But if I run apache2 configtest I get a "File name too long" error.
Strangely, even after formatting my SD card, putting a fresh copy of the distribution image on it, and reinstalling apache2, the error remains.
What can I do?
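"File name too long" from configtest generally means Apache is trying to open an entry whose name exceeds the filesystem's limit, typically something pulled in by an Include (note the shell in the journal above was sitting in /etc/apache2/conf-enabled). A sketch for hunting such entries; the 100-character threshold is arbitrary, just for flagging outliers, not Apache's actual limit.

```shell
# List entries under a directory whose final path component is suspiciously
# long (sketch; threshold of 100 chars is arbitrary, chosen to flag outliers).
long_names() { find "$1" 2>/dev/null | awk -F/ 'length($NF) > 100'; }

long_names /etc/apache2
```

If the error survives a fresh image, also check any external storage or symlinked DocumentRoot that carried over from the old setup.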

How to fix a custom FastCGI program written in C

This is my program:
#include "fcgi_stdio.h"  /* FCGI_Accept, FCGI_printf */
#include <stdlib.h>

int main(void)
{
    while (FCGI_Accept() >= 0) {
        FCGI_printf("Content-Type:text/html\r\n\r\n");
        FCGI_printf("<h1>Test</h1>\n");
    }
    return 0;
}
These are the options I added to the apache config for the virtual host I'm working with:
SetHandler fastcgi-script
Options +ExecCGI
I also have the following line in the same apache config file:
LoadModule fastcgi_module modules/mod_fastcgi.so
I followed directions at FastCGI script can't find libfcgi.so.0 in Apache 2.4.6 and mod_fastcgi to compile my program.
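For reference, a typical build line for a program like this; this is a sketch, since the -I/-L paths and the source file name (`fcgi_test.c`) are assumptions about where the FastCGI development kit was installed. Given the linked libfcgi.so.0 question, it is also worth confirming which libfcgi the binary resolves at run time, since a mismatched or stale copy is a plausible source of the immediate segfault.

```shell
# Typical build line for the example above (sketch; -I/-L paths and the
# source file name are assumptions, adjust to your fcgi installation):
cc -I/usr/local/include -L/usr/local/lib -o a.out fcgi_test.c -lfcgi
# Confirm which libfcgi the runtime linker actually picks up:
ldd a.out | grep libfcgi
```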
If I execute the compiled program directly, I receive the familiar "segmentation fault" on screen. When I try executing it via the server at http://127.0.0.1/a.out, I get an internal server error and the following in the error_log:
[Tue Nov 17 00:48:10 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" started (pid 9331)
[Tue Nov 17 00:48:10 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" (pid 9331) terminated due to uncaught signal '11' (Segmentation fault)
[Tue Nov 17 00:48:15 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" restarted (pid 9333)
[Tue Nov 17 00:48:15 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" (pid 9333) terminated due to uncaught signal '11' (Segmentation fault)
[Tue Nov 17 00:48:20 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" restarted (pid 9334)
[Tue Nov 17 00:48:20 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" (pid 9334) terminated due to uncaught signal '11' (Segmentation fault)
[Tue Nov 17 00:48:25 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" restarted (pid 9335)
[Tue Nov 17 00:48:25 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" (pid 9335) terminated due to uncaught signal '11' (Segmentation fault)
[Tue Nov 17 00:48:25 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:26 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:29 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:32 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:35 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:38 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:41 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:44 2015] [error] [client 127.0.0.1] FastCGI: comm with (dynamic) server "/usr/local/apache2/virt1/a.out" aborted: (first read) idle timeout (30 sec)
[Tue Nov 17 00:48:44 2015] [error] [client 127.0.0.1] FastCGI: incomplete headers (0 bytes) received from server "/usr/local/apache2/virt1/a.out"
[Tue Nov 17 00:48:44 2015] [warn] FastCGI: (dynamic) server "/usr/local/apache2/virt1/a.out" has failed to remain running for 30 seconds given 3 attempts, its restart interval has been backed off to 600 seconds
[Tue Nov 17 00:48:44 2015] [error] [client 127.0.0.1] (2)No such file or directory: FastCGI: stat() of "/usr/local/apache2/virt1/favicon.ico" failed
It's basically telling me that Apache attempts to start the program a few times, but the program keeps exiting with a segmentation fault.
The result I was expecting was the word Test in bold in a web browser.
How do I fix my program to make it compatible with fast-cgi so that I can execute it via a web browser?
I don't want to resort to the slow CGI interface which is why I'm trying to use functions beginning with FCGI.

solr ReplicationHandler - SnapPull failed to download files

We are continuously getting this exception during replication from master to slave.
Our index size is 9.7 GB and we are trying to replicate a slave from scratch.
30 Oct 2013 18:22:16,996 [explicit-fetchindex-cmd] ERROR ReplicationHandler - SnapPull failed :org.apache.solr.common.SolrException: Unable to download _41c_Lucene41_0.doc completely. Downloaded 0!=107464871
at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1266)
at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1146)
at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:741)
at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:405)
at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:319)
at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:220)
I read in some thread that there was a related bug in Solr 4.1, but we are using Solr 4.3 and tried 4.5.1 as well.
It seems that DirectoryFileFetcher sometimes cannot download a file; the file arrives on the slave with size zero.
this is the master setup:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
<lst name="master">
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
<str name="commitReserveDuration">00:00:50</str>
</lst>
</requestHandler>
and the slave setup:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
<lst name="master">
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
<str name="commitReserveDuration">00:00:50</str>
</lst>
</requestHandler>
The problem appeared to be with httpclient.
I turned on debug logging for all libraries and saw a message "Garbage in response" coming from httpclient just before the failure.
This is a log snippet:
31 Oct 2013 18:10:40,360 [explicit-fetchindex-cmd] DEBUG DefaultClientConnection - Sending request: GET /solr-master/replication?comman
d=filecontent&generation=6814&qt=%2Freplication&file=_aa7_Lucene41_0.pos&checksum=true&wt=filestream HTTP/1.1
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - >> "GET /solr-master/replication?command=filecontent&generation=6814&qt
=%2Freplication&file=_aa7_Lucene41_0.pos&checksum=true&wt=filestream HTTP/1.1[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - >> "User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer]
1.0[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - >> "Host: solr-master.saltdev.sealdoc.com:8081[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - >> "Connection: Keep-Alive[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - >> "[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG headers - >> GET /solr-master/replication?command=filecontent&generation=6814&
qt=%2Freplication&file=_aa7_Lucene41_0.pos&checksum=true&wt=filestream HTTP/1.1
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG headers - >> User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer
] 1.0
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG headers - >> Host: solr-master.saltdev.sealdoc.com:8081
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG headers - >> Connection: Keep-Alive
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - << "[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG DefaultHttpResponseParser - Garbage in response:
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - << "4[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG DefaultHttpResponseParser - Garbage in response: 4
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - << "[0x0][0x0][0x0][0x0][\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG DefaultHttpResponseParser - Garbage in response: ^#^#^#^#
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - << "0[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG DefaultHttpResponseParser - Garbage in response: 0
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG wire - << "[\r][\n]"
31 Oct 2013 18:10:40,361 [explicit-fetchindex-cmd] DEBUG DefaultHttpResponseParser - Garbage in response:
31 Oct 2013 18:10:40,398 [explicit-fetchindex-cmd] DEBUG DefaultClientConnection - Connection 0.0.0.0:55266<->172.16.77.121:8081 closed
31 Oct 2013 18:10:40,398 [explicit-fetchindex-cmd] DEBUG DefaultClientConnection - Connection 0.0.0.0:55266<->172.16.77.121:8081 shut down
31 Oct 2013 18:10:40,398 [explicit-fetchindex-cmd] DEBUG DefaultClientConnection - Connection 0.0.0.0:55266<->172.16.77.121:8081 closed
31 Oct 2013 18:10:40,398 [explicit-fetchindex-cmd] DEBUG PoolingClientConnectionManager - Connection released: [id: 0][route: {}->http://solr-master.saltdev.sealdoc.com:8081][total kept alive: 1; route allocated: 1 of 10000; total allocated: 1 of 10000]
31 Oct 2013 18:10:40,425 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Releasing directory: /opt/watchdox/solr-slave/data/index 2 false
31 Oct 2013 18:10:40,425 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Reusing cached directory: CachedDir<>
31 Oct 2013 18:10:40,425 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Releasing directory: /opt/watchdox/solr-slave/data 0 false
31 Oct 2013 18:10:40,425 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Reusing cached directory: CachedDir<>
31 Oct 2013 18:10:40,427 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Releasing directory: /opt/watchdox/solr-slave/data 0 false
31 Oct 2013 18:10:40,428 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Done with dir: CachedDir<>
31 Oct 2013 18:10:40,428 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Releasing directory: /opt/watchdox/solr-slave/data/index.20131031180837277 0 true
31 Oct 2013 18:10:40,428 [explicit-fetchindex-cmd] INFO CachingDirectoryFactory - looking to close /opt/watchdox/solr-slave/data/index.20131031180837277 [CachedDir<>]
31 Oct 2013 18:10:40,428 [explicit-fetchindex-cmd] INFO CachingDirectoryFactory - Closing directory: /opt/watchdox/solr-slave/data/index.20131031180837277
31 Oct 2013 18:10:40,428 [explicit-fetchindex-cmd] INFO CachingDirectoryFactory - Removing directory before core close: /opt/watchdox/solr-slave/data/index.20131031180837277
31 Oct 2013 18:10:40,878 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Removing from cache: CachedDir<>
31 Oct 2013 18:10:40,878 [explicit-fetchindex-cmd] DEBUG CachingDirectoryFactory - Releasing directory: /opt/watchdox/solr-slave/data/index 1 false
31 Oct 2013 18:10:40,879 [explicit-fetchindex-cmd] ERROR ReplicationHandler - SnapPull failed :org.apache.solr.common.SolrException: Unable to download _aa7_Lucene41_0.pos completely. Downloaded 0!=1081710
at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1212)
at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1092)
at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:719)
at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:397)
at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:317)
at org.apache.solr.handler.ReplicationHandler$1.run(ReplicationHandler.java:218)
31 Oct 2013 18:10:40,910 [http-bio-8080-exec-8] DEBUG CachingDirectoryFactory - Reusing cached directory: CachedDir<>
So I upgraded the httpcomponents jars to their latest 4.3.x version and the problem disappeared.
The httpcomponents jars that solrj depends on were at 4.2.x; I upgraded to httpclient-4.3.1, httpcore-4.3 and httpmime-4.3.1.
I have run the replication a few times now with no problem at all; it is working as expected.
It seems the upgrade is only necessary on the slave side, but I'm going to upgrade the master too.
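To check which versions a deployment actually ships before and after such an upgrade, a small sketch; the helper name is mine, and the WEB-INF/lib path is an assumption about a standard webapp layout, so adjust it to your install.

```shell
# List the httpcomponents jars in a webapp's lib directory so the versions
# can be audited before/after the upgrade described above (sketch).
httpjars() { ls "$1" 2>/dev/null | grep -E '^http(client|core|mime)-'; }

httpjars /path/to/solr/WEB-INF/lib || true   # path is an assumption; adjust
```

If both 4.2.x and 4.3.x copies show up, remove the old ones; duplicate versions on the classpath can reintroduce the original failure.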
