Joomla SEF URL Issues on several sites - joomla3.0

I am running into an issue on several sites where the URLs are "nesting" other links in the main URL structure.
For example:
We are running K2, RS SEO, and JAmp. I have K2 set to "Enable advanced SEF for K2 URLs".
These URLs only show up during an auto crawl from RS SEO, which creates more than 1000 links! They are unpublished but still added to the sitemap. Is there a way to fix or prevent this?
.htaccess file contents, with extra comments removed:
IndexIgnore *
Options +FollowSymlinks
Options -Indexes
RewriteEngine On
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_HOST} ^www\.(.*)
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
# Block any script trying to base64_encode data within the URL.
RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR]
# Block any script that includes a <script> tag in URL.
RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR]
# Block any script trying to set a PHP GLOBALS variable via URL.
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
# Block any script trying to modify a _REQUEST variable via URL.
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
# Return 403 Forbidden header and show the content of the root home page
RewriteRule .* index.php [F]
## Begin - Joomla! core SEF Section.
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
# If the requested path and file is not /index.php and the request
# has not already been internally rewritten to the index.php script
RewriteCond %{REQUEST_URI} !^/index\.php
# and the requested path and file doesn't directly match a physical file
RewriteCond %{REQUEST_FILENAME} !-f
# and the requested path and file doesn't directly match a physical folder
RewriteCond %{REQUEST_FILENAME} !-d
# internally rewrite the request to the index.php script
RewriteRule .* index.php [L]
## End - Joomla! core SEF Section.
<IfModule mod_expires.c>
ExpiresActive on
# Perhaps better to whitelist expires rules? Perhaps.
ExpiresDefault "access plus 1 month"
# cache.appcache needs re-requests in FF 3.6 (thanks Remy ~Introducing HTML5)
ExpiresByType text/cache-manifest "access plus 0 seconds"
# Your document html
ExpiresByType text/html "access plus 0 seconds"
# Data
ExpiresByType text/xml "access plus 0 seconds"
ExpiresByType application/xml "access plus 0 seconds"
ExpiresByType application/json "access plus 0 seconds"
# Feed
ExpiresByType application/rss+xml "access plus 1 hour"
ExpiresByType application/atom+xml "access plus 1 hour"
# Favicon (cannot be renamed)
ExpiresByType image/x-icon "access plus 1 week"
# Media: images, video, audio
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType video/ogg "access plus 1 month"
ExpiresByType audio/ogg "access plus 1 month"
ExpiresByType video/mp4 "access plus 1 month"
ExpiresByType video/webm "access plus 1 month"
# HTC files (css3pie)
ExpiresByType text/x-component "access plus 1 month"
# Webfonts
ExpiresByType application/font-ttf "access plus 1 month"
ExpiresByType font/opentype "access plus 1 month"
ExpiresByType application/font-woff "access plus 1 month"
ExpiresByType application/font-woff2 "access plus 1 month"
ExpiresByType image/svg+xml "access plus 1 month"
ExpiresByType application/ "access plus 1 month"
# CSS and JavaScript
ExpiresByType text/css "access plus 1 year"
ExpiresByType text/javascript "access plus 1 year"
ExpiresByType application/javascript "access plus 1 year"
</IfModule>
<IfModule mod_headers.c>
Header append Cache-Control "public"
</IfModule>
<IfModule mod_deflate.c>
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/javascript
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE image/x-icon
AddOutputFilterByType DEFLATE image/svg+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/font
AddOutputFilterByType DEFLATE application/font-truetype
AddOutputFilterByType DEFLATE application/font-ttf
AddOutputFilterByType DEFLATE application/font-otf
AddOutputFilterByType DEFLATE application/font-opentype
AddOutputFilterByType DEFLATE application/font-woff
AddOutputFilterByType DEFLATE application/font-woff2
AddOutputFilterByType DEFLATE application/
AddOutputFilterByType DEFLATE font/ttf
AddOutputFilterByType DEFLATE font/otf
AddOutputFilterByType DEFLATE font/opentype
AddOutputFilterByType DEFLATE font/woff
AddOutputFilterByType DEFLATE font/woff2
# For older browsers which can't handle compression
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
</IfModule>
# SEO Fixes
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_HOST} ^www\.(.*)
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
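The query-string checks in the .htaccess above can be tried outside Apache. Below is a minimal sketch that translates the four RewriteCond patterns into Python regular expressions so you can see which requests would get the 403. The names BLOCKED_PATTERNS and is_blocked are made up for this illustration, and the sketch assumes Python's re module treats these particular patterns the same way Apache's PCRE does.

```python
import re

# Python translations of the four RewriteCond %{QUERY_STRING} patterns.
# BLOCKED_PATTERNS and is_blocked are hypothetical names for this sketch.
BLOCKED_PATTERNS = [
    re.compile(r"base64_encode[^(]*\([^)]*\)"),
    re.compile(r"(<|%3C)([^s]*s)+cript.*(>|%3E)", re.IGNORECASE),  # [NC] flag
    re.compile(r"GLOBALS(=|\[|\%[0-9A-Z]{0,2})"),
    re.compile(r"_REQUEST(=|\[|\%[0-9A-Z]{0,2})"),
]


def is_blocked(query_string):
    """Return True if any pattern matches, mirroring the [OR] chain."""
    return any(p.search(query_string) for p in BLOCKED_PATTERNS)
```

Note that these rules only look at the query string, so they do not affect how SEF paths are built or crawled.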

HTTP header for docx file from a cgi-bin script

I'm trying to write a cgi-bin script that, given the path to a .docx file, adds a watermark to indicate it is a copy and presents it. I used LibreOffice to create the .docx file. My starting script just takes the path and tries to present the file, to make sure the HTTP header is formed correctly. Good thing I started out with baby steps, because things do not work. When I run it under my Apache2 server, I get this error in the error.log file:
[Mon Jan 23 11:06:45.883760 2023] [cgid:error] [pid 1191] [client] malformed header from script 'copydocx': Bad header: PK\x03\x04\x14, referer: http://robert-linux.local/builtsys/
and the file is not presented.
So what am I doing wrong?
Here are my scripts.
# -*-sh-*-
# copydocx - Present a .docx file with a copy watermark for a browser with a CGI script
# Arguments
# docx the path to a docx file
# Bring in URL forms arguments. Get from: /urlgetopt
eval `./urlgetopt -l "${QUERY_STRING}" 2>/dev/null`
# If an error present it and exit
present_error() {
cat <<EOF
Content-type: text/html

<style>red { color: red }</style>
EOF
echo '<table width=100%><tr><td><form><input type=button value=Back onclick="history.back()"></form></td>'
echo "<td valign=top><font size=+2><red>ERROR: copydocx: $*</red></font></td></tr></table>"
exit 1
}
# A terse set of error checks
test "${docx}" = "" && present_error No docx argument.
test ! -f "${docx}" && present_error File not found. "${docx}"
test "${docx##*.}" != "docx" && present_error File "${docx}" does not have a .docx suffix.
test ! -f && present_error File not found.
# Now get a copy of the docx file and present it
import cgi
import sys
import os
# python3 script
# - Present a .docx file with a copy watermark for a browser
# Arguments
# docx the path to a docx file
def present_docx(filename):
    # This if is so that 'python3' runs
    if filename != "":
        statinfo = os.stat(filename)
        size = statinfo.st_size
    else:
        size = 1234
        filename = "notexisting.docx"
    print("""Content-Type: application/octet-stream
Content-Disposition: attachment; filename="%s"
Content-Transfer-Encoding: binary
Content-Length: %d
""" % (filename, size))
    os.system("cat %s" % filename)

form = cgi.FieldStorage()
docx = ""
if "docx" in form:
    docx = form["docx"].value
# For now just present input to make sure we can present it at all.
present_docx(docx)
I hoped that this would work.
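The "Bad header: PK\x03\x04" message suggests that the file bytes reached Apache before the header lines did. One plausible cause, assuming stdout is block-buffered when the script runs under CGI, is that the print() output is still sitting in Python's buffer when os.system("cat ...") writes straight to the same file descriptor. A minimal sketch that avoids the ordering problem writes headers and body through the same binary stream. The function name send_docx and its out parameter are made up for this illustration.

```python
import os
import sys


def send_docx(path, out=None):
    """Write CGI headers followed by the raw file bytes to one binary stream."""
    # Default to the binary layer of stdout so text and bytes share one buffer.
    out = out if out is not None else sys.stdout.buffer
    with open(path, "rb") as fh:
        data = fh.read()
    name = os.path.basename(path)
    headers = (
        "Content-Type: application/vnd.openxmlformats-officedocument"
        ".wordprocessingml.document\r\n"
        f'Content-Disposition: attachment; filename="{name}"\r\n'
        f"Content-Length: {len(data)}\r\n"
        "\r\n"  # blank line ends the header block
    )
    out.write(headers.encode("ascii"))
    out.write(data)
    out.flush()
```

If you prefer to keep os.system("cat ..."), calling sys.stdout.flush() right after the print() should have a similar effect, since it pushes the headers out before cat starts writing.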

cURL Mail Attachment (Images, .exe file, .rar/.zip files)

I was wondering if I can send different types of files to an email address as attachments. I only know how to send a text file using cURL. Could someone give me some examples of how I can accomplish my goal?
This is what I have so far:
curl --url "smtps://" --mail-from "" --mail-rcpt "" --ssl --user "" --upload-file "C:\Folder\File.txt"
Thank you for all the effort!
You can use multipart/mixed content to transmit your text body and each of your binary attachments.
Here is a template of the file you could upload, with an inline text part and 2 binary attachments:
From: Some Name <>
To: Some Name <>
Subject: example of mail
Reply-To: Some Name <>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MULTIPART-MIXED-BOUNDARY"

--MULTIPART-MIXED-BOUNDARY
Content-Type: multipart/alternative; boundary="MULTIPART-ALTERNATIVE-BOUNDARY"

--MULTIPART-ALTERNATIVE-BOUNDARY
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

This is an email example. This is text/plain content inside the mail.
--MULTIPART-ALTERNATIVE-BOUNDARY--
--MULTIPART-MIXED-BOUNDARY
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="file.rar"

<base64-encoded content of file.rar>
--MULTIPART-MIXED-BOUNDARY
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=""

<base64-encoded content of the second file>
--MULTIPART-MIXED-BOUNDARY--
Note that the binary files are encoded in base64 and are transmitted as attachments.
Here is an example of building this file and sending the email with a bash script:
# rtmp_url, rtmp_from, rtmp_to, rtmp_credentials and file_upload must be set before this point
echo "From: Some Name <$rtmp_from>
To: Some Name <$rtmp_to>
Subject: example of mail
Reply-To: Some Name <$rtmp_from>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=\"MULTIPART-MIXED-BOUNDARY\"

--MULTIPART-MIXED-BOUNDARY
Content-Type: multipart/alternative; boundary=\"MULTIPART-ALTERNATIVE-BOUNDARY\"

--MULTIPART-ALTERNATIVE-BOUNDARY
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

This is an email example. This is text/plain content inside the mail.
--MULTIPART-ALTERNATIVE-BOUNDARY--
--MULTIPART-MIXED-BOUNDARY
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=\"file.rar\"
" > "$file_upload"
# convert file.rar to base64 and append to the upload file
cat file.rar | base64 >> "$file_upload"
echo "--MULTIPART-MIXED-BOUNDARY
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=\"\"
" >> "$file_upload"
# convert to base64 and append to the upload file
cat | base64 >> "$file_upload"
# end of uploaded file
echo "--MULTIPART-MIXED-BOUNDARY--" >> "$file_upload"
# send email
echo "sending ...."
curl -s "$rtmp_url" \
--mail-from "$rtmp_from" \
--mail-rcpt "$rtmp_to" \
--ssl -u "$rtmp_credentials" \
-T "$file_upload" -k --anyauth
res=$?
if test "$res" != "0"; then
echo "sending failed with: $res"
else
echo "OK"
fi
The received email will contain the inline text/plain part and both attachments of type application/octet-stream.
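If writing the MIME boundaries by hand gets error-prone, the same multipart/mixed layout can be generated with Python's standard email package. This is only a sketch: build_message is a name made up here, and the addresses and attachment names in any call are placeholders, not values from the question.

```python
from email.message import EmailMessage


def build_message(sender, recipient, body, attachments):
    """Build a multipart/mixed mail; attachments maps file name -> bytes."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "example of mail"
    msg.set_content(body)  # inline text/plain part
    for name, data in attachments.items():
        # Bytes payloads are base64-encoded by the library.
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=name,
        )
    return msg
```

The serialized message from msg.as_bytes() can be written to a file and passed to curl with -T, in place of the hand-built upload file.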

Is there such a thing as HeaderMatch for apache?

I am "managing" the versions of some files by setting their Last-Modified date manually when serving them with Apache, like so:
<Directory />
Header set Last-Modified "Tue, 01 Jan 2013 00:00:00 GMT"
</Directory>
This already works fine.
If the client honors the caching standard, it should send the If-Modified-Since header in its next request, and I would then like to return 304 instead of 200.
Is there any way to accomplish the following without too much hassle? (I don't need fancy processing or anything, the most hardcoded way will work ok for me)
I need something like this (it's obviously in pseudo code):
<HeaderMatch If-Modified-Since>
If Equals "Tue, 01 Jan 2013 00:00:00 GMT"
Header set Code 304
// Process Directory section
Any ideas/workarounds?
Well, it was easy.
Using mod_rewrite, I ended up with this
RewriteEngine on
RewriteCond %{HTTP:if-modified-since} "Tue, 01 Jan 2013 00:00:00 GMT" [NC]
RewriteRule .* . [R=304,L]
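For comparison, here is the decision that rule encodes, written as a small Python sketch. It is not what Apache does internally: the rule above matches the header against one hardcoded string, while this sketch parses both dates and compares them. The names LAST_MODIFIED and status_for are made up for this illustration.

```python
from email.utils import parsedate_to_datetime

# The same hardcoded date used in the Header directive (assumed fixed).
LAST_MODIFIED = "Tue, 01 Jan 2013 00:00:00 GMT"


def status_for(if_modified_since):
    """Return 304 if the client's copy is current, otherwise 200."""
    if not if_modified_since:
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
        if parsedate_to_datetime(LAST_MODIFIED) <= client_time:
            return 304
    except (TypeError, ValueError):
        # Unparseable or timezone-less date: fall back to a full response.
        return 200
    return 200
```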

htaccess directories, files and variables

I am trying to get something similar to $_SERVER["PATH_INFO"] but weird server issues are preventing me from using it...
In my application, the links can look like or
With .htaccess, I am trying to get to the proper pages, and not to redirect to the index.php or similar.
So far, I have this, which is not working :)
RewriteRule ^(.+)$ /$1.php # page only
RewriteRule ^(.+)/(.+)$ /$1.php?x=$2 # page + variable
RewriteRule ^(.+)/(.+)$ /$1/$2.php # folder / page
RewriteRule ^(.+)/(.+)/(.+)$ /$1/$2.php?x=$3 # folder / page + variable
I am sure I need to use RewriteCond %{REQUEST_FILENAME} -f to check if the request is a filename, or directory... but I was unable to make it work...
Variables can contain all sorts of weird characters, which is why I am matching with a dot. Maybe I should match file and folder names with a-z only (since I do not think they will ever contain anything but a-z, _ or -).
Any help is greatly appreciated, since it's been almost two days of agony now :)
Reverse the order of the rewrite rules so that the most specific one comes first.
RewriteEngine On
RewriteRule ^(.+)/(.+)/(.+)$ /$1/$2.php?x=$3 [L]
# Check here that the file exists
RewriteCond %{REQUEST_URI} ^(.+)/(.+)$
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_URI}.php -f
RewriteRule ^(.+)/(.+)$ /$1/$2.php [L]
# If the file does not exist, pass the second segment as the variable instead
RewriteRule ^(.+)/(.+)$ /$1.php?x=$2 [L]
RewriteRule ^(.+)$ /$1.php [L]
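The ambiguity these rules resolve (is "a/b" a folder plus page, or a page plus variable?) can be sketched in Python to check the intended mapping. This is a simplification under stated assumptions: it splits on "/" instead of using greedy (.+) groups, and resolve and document_root are names made up for the illustration.

```python
import os


def resolve(document_root, path):
    """Map a request path to a PHP target, most specific case first."""
    parts = path.strip("/").split("/")
    if len(parts) == 3:
        # folder / page + variable
        return f"/{parts[0]}/{parts[1]}.php?x={parts[2]}"
    if len(parts) == 2:
        # Equivalent of the "-f" test: does folder/page.php exist?
        candidate = os.path.join(document_root, parts[0], parts[1] + ".php")
        if os.path.isfile(candidate):
            return f"/{parts[0]}/{parts[1]}.php"
        # Otherwise treat the second segment as the variable.
        return f"/{parts[0]}.php?x={parts[1]}"
    # page only
    return f"/{parts[0]}.php"
```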
Most people/frameworks pass everything that does not have an extension specified to a single PHP front controller, which then works out what to do. I think one reason most people go this route is simply how complex mod_rewrite is!
Thanks to @LazyOne's hint, I was able to solve this.
The .htaccess file now looks something like this:
RewriteRule ^([-a-zA-Z0-9_]+)/([-a-zA-Z0-9_]+)/([-a-zA-Z0-9_]+)$ /$1/$2.php?x=$3
RewriteRule ^([-a-zA-Z0-9_]+)/([-a-zA-Z0-9_]+)/$ /$1/$2.php
RewriteRule ^([-a-zA-Z0-9_]+)/([-a-zA-Z0-9_]+)$ /$1.php?x=$2
RewriteRule ^([-a-zA-Z0-9_]+)/$ /$1.php
All folder or file paths must end with "/", while variables must not. This is not a problem in my framework, but it might not be useful for others.
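To check what the four rules above do with a given path, they can be mirrored in Python. This is a sketch, not a mod_rewrite emulator: it stops at the first matching rule and ignores Apache's re-processing of rewritten URLs. SEG, RULES and rewrite are names made up for the illustration.

```python
import re

SEG = r"[-a-zA-Z0-9_]+"  # same character class as in the rules above

# (pattern, substitution) pairs in the same order as the .htaccess rules.
RULES = [
    (re.compile(rf"^({SEG})/({SEG})/({SEG})$"), r"/\1/\2.php?x=\3"),
    (re.compile(rf"^({SEG})/({SEG})/$"), r"/\1/\2.php"),
    (re.compile(rf"^({SEG})/({SEG})$"), r"/\1.php?x=\2"),
    (re.compile(rf"^({SEG})/$"), r"/\1.php"),
]


def rewrite(path):
    """Return the target for the first matching rule, or None."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            return match.expand(target)
    return None
```

Because the trailing slash is what separates a page from a variable, a bare segment with no slash matches none of the rules.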

"Internal dummy connection" in log, MaxClient reached, server crashes. Opinions?

I am trying to streamline a client's server. After downloading the access_log files, I noticed that there were an awful lot of entries that looked like this:
::1 - - [11/May/2009:23:21:16 +0100] "GET / HTTP/1.0" 403 5043 "-" "Apache/2.2.3 (CentOS) (internal dummy connection)"
I have also checked the httpd.conf file, and I have seen the following settings:
# ServerLimit: maximum value for MaxClients for the lifetime of the server
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule prefork.c>
StartServers 8
MinSpareServers 8
MaxSpareServers 13
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 50
</IfModule>
# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule worker.c>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
I've been reading that I need to set MaxSpareServers to a value greater than MinSpareServers. Opinions are greatly appreciated.
Kindest Regards.
As far as I know, it's nothing to worry about. If you want, you can stop these requests from getting into the log by using the info from the link already given by Andri:
If you wish to exclude them from your log, you can use normal conditional-logging techniques. For example, to omit all requests from the loopback interface from your logs, you can use
SetEnvIf Remote_Addr "127\.0\.0\.1" loopback
and then add env=!loopback to the end of your CustomLog directive.
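For the log files that have already been downloaded, the same filtering can be done after the fact. Here is a minimal sketch, assuming the log uses the format shown in the question. It also treats ::1 as loopback, because the sample entry in the question comes from the IPv6 loopback address, which the 127.0.0.1 pattern above would not match. LOOPBACK and drop_internal are names made up for this illustration.

```python
import re

# Loopback client address at the start of a log line (IPv4 or IPv6).
LOOPBACK = re.compile(r"^(127\.0\.0\.1|::1)\s")
MARKER = "(internal dummy connection)"


def drop_internal(lines):
    """Return the log lines that are not internal dummy connections."""
    return [
        line
        for line in lines
        if not (LOOPBACK.match(line) and MARKER in line)
    ]
```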
