Keep source IP when clients send requests to a pod on Kubernetes - database

I want to prevent brute-force attacks against my services on Kubernetes. My solution is to ban IPs that have many failed attempts, but every request is SNATed (source NAT), so I don't know what I can do. Is there any proxy I can use for my TCP requests so that I can ban IPs afterwards?

If you are using a database for your service and are facing a brute-force attack with many failed attempts, you can block the offending IPs or users for some time and then release them. A database-persisted short lockout period for the given account (1-5 minutes) is a simple and effective way to handle this. Each user ID in your database gets a timeOfLastFailedLogin and a numberOfFailedAttempts. When numberOfFailedAttempts > X, you lock the account out for a few minutes.
Refer to this SO answer and the documentation for more information.
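A minimal sketch of the lockout bookkeeping described above. It assumes an in-memory dict stands in for the database table; the class name LoginGuard, the threshold, and the lockout duration are illustrative choices, not part of the original answer.

```python
import time

MAX_FAILED_ATTEMPTS = 5      # the "X" from the text; illustrative value
LOCKOUT_SECONDS = 3 * 60     # somewhere in the 1-5 minute range


class LoginGuard:
    """Tracks failed logins per user ID; a dict stands in for the database row."""

    def __init__(self, clock=time.time):
        self._clock = clock
        # user_id -> {"numberOfFailedAttempts": int, "timeOfLastFailedLogin": float}
        self._rows = {}

    def is_locked(self, user_id):
        row = self._rows.get(user_id)
        if row is None or row["numberOfFailedAttempts"] <= MAX_FAILED_ATTEMPTS:
            return False
        if self._clock() - row["timeOfLastFailedLogin"] >= LOCKOUT_SECONDS:
            # Lockout period is over: release the account.
            del self._rows[user_id]
            return False
        return True

    def record_failure(self, user_id):
        row = self._rows.setdefault(
            user_id, {"numberOfFailedAttempts": 0, "timeOfLastFailedLogin": 0.0}
        )
        row["numberOfFailedAttempts"] += 1
        row["timeOfLastFailedLogin"] = self._clock()

    def record_success(self, user_id):
        # A successful login clears the counters.
        self._rows.pop(user_id, None)
```

The login handler would call is_locked before checking the password, then record_failure or record_success depending on the result. Because the counters are keyed by account rather than by IP, this works even when the source address is hidden by SNAT.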

Related

Apache2: how to log rejected connections and client timeout

I am doing some load testing on a service run with Apache2 and my load testing tool has a default timeout of 30 seconds. When I run the tool for a minute with 1 request per second load, it reports that 40 succeeded with 200 OK response and 20 requests were cancelled because client timeout exceeded while awaiting headers.
Now I am trying to spot this on the server side. I can't see the timeouts logged in either the Apache access logs or the gunicorn access logs. Note that I am interested in connections that weren't accepted as well as those that were accepted and then timed out.
I have some experience working on similar services on Windows. The http.sys error logs would show connection dropped errors and we would know if our server was dropping connections.
When a client times out, all the server knows is that the client has aborted the connection. In mod_log_config, the %X format specifier logs the status of the client connection when the response is completed, which is exactly what you want to know in this case.
Configure your log format to include %X, and look for the X character in the log lines: X means the connection was aborted before the response completed, + means it may be kept alive, and - means it will be closed after the response is sent.
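As a rough sketch of the "look for the X" step, the helper below filters access-log lines for aborted connections. It assumes %X was appended as the last space-separated field of the log format; that placement and the function name aborted_requests are assumptions for illustration, so adjust the parsing to your own LogFormat.

```python
def aborted_requests(log_lines):
    """Yield log lines whose final field (%X) is 'X'.

    Assumes %X is the last space-separated field of each line.
    """
    for line in log_lines:
        stripped = line.rstrip()
        fields = stripped.rsplit(" ", 1)
        if len(fields) == 2 and fields[1] == "X":
            yield stripped


if __name__ == "__main__":
    import sys

    # Usage: python find_aborted.py < access.log
    for hit in aborted_requests(sys.stdin):
        print(hit)
```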
Bonus: I even found the discussion about this feature in apache's dev forum, from 20 years ago
Update:
Regarding refused connections: these cannot be logged by Apache. Connection refusal is done by the kernel, in the TCP stack, not by Apache. The closest Apache-only solution I can think of is keeping track of the number of open connections (using mod_status). If it reaches the maximum, you know you might be refusing connections. Otherwise, you'd need to set up some monitoring solution to track TCP resets sent by the kernel.

Webapp server data storage: Memory vs database

We are making a web application in Go with a MySQL database. Our users are allowed to have only one active client at a time, much like Spotify allows you to listen to music on only one device at a time. To do this I made a map with the user IDs as keys and references to their active websocket connections as values. Based on the websocket ID that the client has to send in the header of the request, we can identify whether the request comes from their active session.
My question is whether it's good practice to store data (in this case the map with the user IDs and websockets) in a global space, or whether it is better to store it in the database.
We don't expect to reach over 10000 simultaneously active clients. Average is probably gonna be around 1000.
If you only run one instance of the websocket server, storing it in memory should be sufficient: if the server goes down or restarts for some reason, all the connections are lost anyway and all the clients have to create them again (and hence the list of connections will once again be populated by all the clients who want to use the service).
However, if you plan on scaling it horizontally so that you have multiple websocket services behind a load balancer, then the connections may need to be stored in a database of some sort. This is not because the data necessarily needs to be more persistent, but because you need to be able to check the request against all the services' connections.
It is also possible to have a separate service that handles the incoming request and asks all the websocket services whether any of them holds the connection specified in the request. This could be done with a pub/sub queue: every websocket service subscribes to a channel for each of its websocket IDs, the service that receives the request publishes to the channel for the websocket ID in question, and the websocket service that holds that connection replies on a separate channel. You must decide how to handle the case where no one responds (no websocket service has the websocket ID): either the channel does not exist, or you expect the answer within a specific time. Alternatively, you could publish the question on a general topic and expect all the websocket services to reply (yes or no).
Whether you need to scale it depends mostly, I guess, on the underlying server you're running the service on. If I understand correctly, the websocket service will basically not do anything apart from keeping track of its connections (you should add some ping/pong to discover whether connections are lost). Your limitation should then mainly be how many file descriptors your system can handle at once. If that limit is much larger than your expected maximum number of users, then running only one server and storing everything in memory might be an OK solution!
Finally, if you're in the business of having a websocket open for all users, why not do all the "other" communication over that websocket connection instead of having them send HTTP requests with their websocket id? Perhaps HTTP fits better for your use case but could be something to think about :)
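For the single-instance case, the in-memory map could look roughly like the sketch below. It is written in Python rather than the asker's Go, and the class name ActiveSessions and the "newest connection wins" policy are assumptions for illustration only.

```python
import threading


class ActiveSessions:
    """In-memory registry mapping each user ID to its one active websocket ID."""

    def __init__(self):
        self._lock = threading.Lock()   # handlers may run concurrently
        self._by_user = {}              # user_id -> websocket_id

    def activate(self, user_id, websocket_id):
        """Make websocket_id the active session.

        Returns the previously active websocket ID (so the caller can
        close it), or None if there was none.
        """
        with self._lock:
            previous = self._by_user.get(user_id)
            self._by_user[user_id] = websocket_id
            return previous if previous != websocket_id else None

    def is_active(self, user_id, websocket_id):
        """Check whether a request's websocket ID belongs to the active session."""
        with self._lock:
            return self._by_user.get(user_id) == websocket_id

    def deactivate(self, user_id, websocket_id):
        """Remove the entry on disconnect, unless a newer session replaced it."""
        with self._lock:
            if self._by_user.get(user_id) == websocket_id:
                del self._by_user[user_id]
```

With 1,000 to 10,000 entries this map is tiny; the lock matters more than the size, because connects, disconnects, and request checks can interleave.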

Limit the usage of Kerberos TGTs

I'm pretty new to Kerberos. I'm testing the single sign-on feature using Kerberos. The environment: Windows clients (with Active Directory authentication) connecting to an Apache server running on a Linux machine. The called CGI script (in Perl) connects to a DB server using the forwarded user TGT. Everything works fine (I have the principals, the keytab files, config files and the result from the DB server :) ). So, if I launch my CGI request as win_usr_a on the Windows side, the CGI script connects to the remote DB, runs select user from dual, and gets back win_usr_a@EXAMPLE.COM.
I have only one issue I'd like to solve. Currently the credential cache is stored as FILE:.... On the intermediate Apache server, the user running Apache gets the forwarded TGTs of all authenticated users (as it can see all the credential caches), and as long as the TGTs have not expired it can request tickets for any service principal on behalf of those users.
I know that hosts are considered trusted in Kerberos by definition, but I would be happy if I could limit the usability of the forwarded TGTs. For example, can I configure Active Directory so that the forwarded TGT is valid only for requesting a given service principal? And/or is there a way to make the forwarded TGT usable only once, so that it becomes invalid after any service principal has been requested? Or is there a way for the CGI script to detect whether the forwarded TGT was used by someone else (maybe by checking a usage counter)?
For now I have only one solution. I can set the lifetime of the forwarded TGT to 2 seconds and run kdestroy in the CGI script after the DB connection is established (I set things up so that the apache user can execute the CGI script but cannot modify its code). Can I do a bit more?
The credential caches should be hidden somehow. I think defining the credential cache as API: would be nice, but this is only defined for Windows. On Linux, maybe KEYRING:process:name or MEMORY: could be a better solution, as these are local to the current process and destroyed when the process exits. As far as I know, Apache creates a new process for a new connection, so this may work. Maybe KEYRING:thread:name is the solution? But, according to the thread-keyring(7) man page, it is not inherited across clone and is cleared by the execve system call. So if, for example, Perl is called by execve, it will not get the credential cache. Maybe mod_perl plus KEYRING:thread:name?
Any idea would be appreciated! Thanks in advance!
The short answer is that Kerberos itself does not provide any mechanism to limit the scope of who can use it if the client happens to have all the necessary bits at a given point in time. Once you have a usable TGT, you have a usable TGT, and can do with it what you like. This is a fundamentally flawed design as far as security concerns go.
Windows refers to this as unconstrained delegation, and specifically has a solution for this through a Kerberos extension called [MS-SFU] which is more broadly referred to as Constrained Delegation.
The gist of the protocol is that you send a regular service ticket (without an attached TGT) to the server (Apache), and the server is enlightened enough to know that it can ask Active Directory to exchange that service ticket to itself for a service ticket to a delegated server (DB). The server then uses the new service ticket to authenticate to the DB, and the DB sees that it's a service ticket for win_usr_a despite being sent by Apache.
The trick of course is that enlightenment bit. Without knowing more about the specifics of how the authentication is happening in your CGI, it's impossible to say whether whatever you're doing supports [MS-SFU].
Quoting a previous answer of mine (to a different question, focused on "race conditions" when updating the cache):
If multiple processes create tickets independently, then they have no
reason to use the same credentials cache. In the worst case they would
even use different principals, and the side effects would be...
interesting.
Solution: change the environment of each process so that KRB5CCNAME
points to a specific file -- and preferably, in an
application-specific directory.
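A minimal sketch of that suggestion, assuming a launcher process prepares the environment for each worker it spawns. The helper name private_ccache_env and the directory layout are hypothetical; only the KRB5CCNAME variable and the FILE: cache type come from Kerberos itself.

```python
import os
import tempfile


def private_ccache_env(app_dir, base_env=None):
    """Return an environment whose KRB5CCNAME points at a private file cache.

    Each call creates a fresh directory (mode 0700) under app_dir, so two
    processes never share a credentials cache.
    """
    os.makedirs(app_dir, mode=0o700, exist_ok=True)
    cache_dir = tempfile.mkdtemp(prefix="krb5cc_", dir=app_dir)
    env = dict(os.environ if base_env is None else base_env)
    env["KRB5CCNAME"] = "FILE:" + os.path.join(cache_dir, "ccache")
    return env
```

The returned dict can be passed as the env argument of subprocess.run or subprocess.Popen when starting the process that obtains and uses the tickets. Note that this separates caches from each other; it does not hide them from the OS user that owns the directory.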
If your focus is on securing the credentials, then go one step further and don't use a cache. Modify your client app so that it creates the TGT and service tickets on the fly and keeps them private.
Note that Java never publishes anything to the Kerberos cache; it may either read from the cache or bypass it altogether, depending on the JAAS config. Too bad the Java implementation of Kerberos is limited and rather brittle, cf. https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/jdk_versions.html and https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/jaas.html

Why is the same IP as an App Engine cron job used by malicious requests?

I noticed this:
(Note that I deployed this service just 2 days ago and no one knows about it!!)
The only valid request here is the one to "/extract" at 4:10, because it is correct and it is at the scheduled time (of the cron job).
(You can also notice that the user agent is "AppEngine")
I created some routing for the "hacking try" paths to reply with a "F*** Y**".
Then I decided to use a dos.yaml file to define a blacklist.
I was taking note of the IPs and I noticed that the IP of the valid request is also used in this request (and probably other "malicious" ones):
I noticed that every execution of my scheduled cron job (automated or triggered with "Run now") has a different IP. All the malicious requests also have different IPs, but you can see that some of them are executed programmatically:
How is it possible?
My explanation is that Google Cloud Platform is hosting a lot of malicious services.
Also, the documentation says that cron jobs are issued by the IP 0.1.0.1.
Can someone explain this unexpected behavior to me, or point out the flaw in my argument?
Alex

Using 1 instance of google-app-engine to monitor external service

I am planning to create a NodeJS program that runs 24/7 and pings and makes requests to an external server (outside of Google Cloud) every minute, just to check that the external services are alive.
If there is any error, it will notify me by SMS and email.
I don't need any front end for this app, and no one needs to connect to it. It is just a simple NodeJS program.
Monitoring and configuration will be done through text files.
Now the questions:
It looks like it will cost me just $1.64. It sounds very cheap. Am I missing something?
It needs to work around the clock. I will start it once, and it needs to keep running (using setInterval). Will it be aborted?
What exactly does buying 1 instance mean? What can an instance do? Respond to only one request, or what?
I tried searching Google for "appengine timeout", but didn't find anything that helps.
Free Quota
If you write your application in Python, PHP, Go or Java, it can fit within the free usage quota:
https://cloud.google.com/appengine/docs/quotas
So there will be absolutely no costs to run it on Google App Engine platform.
There is a limit of 657,000 UrlFetch API calls per day (more than 450 calls per minute in 24/7 mode) for free apps. 4 GB of traffic may also be sufficient for this kind of work.
Keep in mind that Google App Engine provides no SMS sending service, so you will need to spend additional UrlFetch API calls to use an external SMS service.
Email sending is also limited to 100 emails per day (or 5,000 emails to admin addresses), so try not to send repeated notifications about the same monitored server every minute, or you'll deplete your email quota in about 100 minutes.
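The quota arithmetic above, plus one possible way to avoid burning the email quota, can be sketched as follows. The AlertThrottle class and its cooldown value are hypothetical; the quota figures are the ones quoted in this answer.

```python
URLFETCH_CALLS_PER_DAY = 657_000
EMAILS_PER_DAY = 100
MINUTES_PER_DAY = 24 * 60

# Average UrlFetch budget when running around the clock.
calls_per_minute = URLFETCH_CALLS_PER_DAY / MINUTES_PER_DAY

# Minutes until the email quota is gone if one alert is sent every minute.
minutes_until_email_quota_gone = EMAILS_PER_DAY / 1


class AlertThrottle:
    """Suppress repeated alerts about the same server within a cooldown window."""

    def __init__(self, cooldown_seconds):
        self.cooldown = cooldown_seconds
        self._last_sent = {}   # server name -> timestamp of the last alert

    def should_send(self, server, now):
        last = self._last_sent.get(server)
        if last is not None and now - last < self.cooldown:
            return False
        self._last_sent[server] = now
        return True
```

On App Engine the last-sent timestamps would have to live in persistent storage rather than in process memory, since instances can be stopped between cron invocations.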
Scheduled Tasks
There is no way to run a single process indefinitely without interruption on App Engine. But you don't have to!
You'll need to encapsulate all the work you plan to do in each iteration into a single task and then schedule it to run every minute with cron. See this documentation for Python: https://cloud.google.com/appengine/docs/python/config/cron
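One way to shape that single task is sketched below, under the assumption that the cron-triggered request handler simply calls it once per invocation. The function name run_checks and the injected fetch and notify callables are hypothetical placeholders for whatever HTTP client and notification service you use.

```python
def run_checks(targets, fetch, notify):
    """One cron-triggered iteration: probe every target once, report failures.

    fetch(url) is expected to return an HTTP status code or raise on a
    network error; notify(url, reason) delivers the alert.
    """
    failures = []
    for url in targets:
        try:
            status = fetch(url)
        except Exception as exc:   # network errors count as failures too
            failures.append((url, repr(exc)))
            continue
        if status != 200:
            failures.append((url, "HTTP %d" % status))
    for url, reason in failures:
        notify(url, reason)
    return failures
```

There is no loop or timer here: the scheduler provides the "every minute" part, and each invocation starts from scratch and finishes quickly.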
It is recommended to have a configuration page where you can change internal settings or view monitoring statistics, or at least manage a flag to temporarily pause task execution without redeploying your app.

Resources