Why is appcfg.sh only downloading the last 100 lines of my logs - google-app-engine

I am attempting to download my Google App Engine logs using the appcfg.sh client utility, but no matter what I do I get exactly 100 log lines. I have tried --num_days with a few days and with 0, which per the docs should retrieve everything available, but it has no effect. My logs are not particularly large; the 100 lines cover only a few hours and total about 40 kB. And of course if I view the logs in the web console I can see many weeks (or months) worth of logs just fine.
I've been trying variations on the following command:
appcfg.sh --num_days=0 --include_all -A <<my app name>> request_logs <<path to my app>> api_2017_04_10.log
and the output I get is:
Reading application configuration data...
Apr 10, 2017 1:12:41 PM com.google.apphosting.utils.config.IndexesXmlReader readConfigXml
INFO: Successfully processed <<my app path>>/WEB-INF/datastore-indexes.xml
Beginning interaction for module <<my module name>>...
0% Beginning to retrieve log records...
25% Received 100 log records...
Success.
Cleaning up temporary files for module <<my module name>>...
Note that it always ends at "25%" and "100 log records"... and 100 lines is nowhere near 25% of the total I'd expect, regardless.

After a week of intermittently messing with this and always getting that same result, this evening I ran the exact same script again and to my surprise got 400 log lines instead of 100. I ran it again immediately and it chugged along for several minutes, reporting "97%" finished the whole time while continuing to count off thousands of additional log lines. However, it was not actually writing any data to the log file at that point (I think it wants to buffer all of the data first)... So I backed it down to --num_days=7, and that appears to have worked.
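For reference, the run that finally worked was the same command with the smaller day count (same placeholders as above):
appcfg.sh --num_days=7 --include_all -A <<my app name>> request_logs <<path to my app>> api_2017_04_10.log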
I think the client or the API is just very buggy.

Related

Host server bloating with numerous fake session files created in hundreds every minute on php(7.3) website

I'm experiencing an unusual problem with my PHP (7.3) website: it creates a huge number of unwanted session files on the server every minute (around 50 to 100 files), and I noticed all of them have a fixed size of 125K or 0K in cPanel's file manager. The inode count spirals out of control into the thousands within hours and past a hundred thousand in a day, whereas my website really has small traffic of less than 3K visits a day, with the Google crawler on top of it. I'm denying all bad bots in .htaccess.
I'm able to control the situation with the help of a cron command that runs every six hours and cleans all session files older than 12 hours from /tmp (a sketch of such a job is below). However, this isn't an ideal solution, as the fake session files keep getting created in great numbers, eating my server resources: RAM, processor and, most importantly, storage keep getting bloated, which impacts overall site performance.
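For illustration, a cleanup job along those lines might look like this in crontab; the sess_* pattern and the /tmp path are assumptions based on PHP's default file-based session handling:
0 */6 * * * find /tmp -type f -name 'sess_*' -mmin +720 -delete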
I opened many of these files to examine them but found they are not associated with any valid user; I add the user id, name and email to the session upon successful authentication. Even assuming a session is created for every visitor (without an account/login), it shouldn't go beyond 3K in a day, but the session count goes as high as 125,000+ in a single day. I couldn't figure out the glitch.
I've gone through relevant posts and made checks like adding the IP & User-Agent to sessions to track suspicious server monitoring, bot crawling, or overwhelming proxy activity, but with no luck! I can also confirm, by watching their timestamps, that there is no human or crawler activity taking place when they are being created. I can see files being created every single minute, without any break, throughout the day!
I haven't found any clue yet to the root cause behind this weird behavior and would highly appreciate any sort of help troubleshooting it! Unfortunately the server team was unable to help much, beyond adding the clean-up cron. Pasting the content of example session files below:
0K Sized> favourites|a:0:{}LAST_ACTIVITY|i:1608871384
125K Sized> favourites|a:0:{}LAST_ACTIVITY|i:1608871395;empcontact|s:0:"";encryptedToken|s:40:"b881239480a324f621948029c0c02dc45ab4262a";
Valid Ex.File1> favourites|a:0:{}LAST_ACTIVITY|i:1608870991;applicant_email|s:26:"raju.mallxxxxx#gmail.com";applicant_phone|s:11:"09701300000";applicant|1;applicant_name|s:4:Raju;
Valid Ex.File2> favourites|a:0:{}LAST_ACTIVITY|i:1608919741;applicant_email|s:26:"raju.mallxxxxx#gmail.com";applicant_phone|s:11:"09701300000";IP|s:13:"13.126.144.95";UA|s:92:"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:82.0) Gecko/20100101 Firefox/82.0 X-Middleton/1";applicant|N;applicant_name|N;
We found that the issue was triggered by the hosting server's PHP version change from 5.6 to 7.3. However, we noticed the unwanted flood of session files is not created on PHP 7.0! It's the same code base tested against all three versions. Posting this as it may help others facing a similar issue after a PHP version change.

Run shiny app permanently on server

I have developed a Shiny app that first has to run SQL queries, which take around 5-10 minutes. Building the plots afterwards is quite fast.
My idea was to run the queries once per day (with invalidateLater) before shinyServer(). This worked well.
Now I have access to a Shiny server. I can save my app in ~/ShinyApps/APPNAME/ and access it at http://SERVERNAME.com:3838/USER/APPNAME/. But if I open the app while it is not open in some other browser, it takes 5-10 minutes to start. If I open it while it is also open on another computer, it starts fast.
I have no experience with servers, but I conclude that my server only keeps the app running as long as someone is accessing it. In my case, though, it should run permanently, so that it always starts fast and can update the data (using the SQL queries) once per day.
I looked through the documentation, since I guess it is some settings problem.
To keep the app running:
Brute force: you can have a server/computer keep a view of your app open all the time so it does not drop from Shiny Server's memory, but that won't load new data.
Server settings: you can set the idle timeout of your server to a large interval, meaning it will wait that interval before dropping your app from memory. This is done in the shiny-server.conf file with e.g. app_idle_timeout 3600 (a placement sketch is shown after this answer).
To have daily updates:
Crontab:
Set up a crontab job in your SSH client, e.g. PuTTY:
$ crontab -e
like this (read more: https://en.wikipedia.org/wiki/Cron):
00 00 * * * Rscript /Location/YourDailyScript.R
YourDailyScript.R:
1. setwd(location) # remember that!
2. [Your awesome 5 minute query]
3. Save the result as .csv or whatever.
and then have the app just load that result.
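Regarding the app_idle_timeout option above, here is a rough sketch of where it sits in shiny-server.conf; the surrounding directives are assumptions about a default install, not part of the original answer:

# /etc/shiny-server/shiny-server.conf
run_as shiny;
server {
  listen 3838;
  location / {
    site_dir /srv/shiny-server;
    log_dir /var/log/shiny-server;
    # keep the R process alive for an hour after the last visitor disconnects
    app_idle_timeout 3600;
  }
}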

VB6+Access(ADO/JET) Randomly getting "Disk or Network Error" or "Cannot find the input table or query X on Database Y"

We have a Practice Management system that is about 15 years old. I've been working on it for about 12, and I had never encountered this problem until just recently; we can't figure it out.
It is written in VB6 which uses ADO/JET to access an Access .mdb file on the network. The application opens the connection when it starts, keeps it open while it's open, and closes it when it exits. It does a LOT of stuff with the DB - the system deals with Patient Accounts, Charges, Payments, Scheduling Appointments, and about a million other things. We have dozens of clients that use this program, each with their own DB and most of them 'offsite', where they have their own server, some number of workstations, between 1 and 20 users, pounding away at the system 24 hours a day, 7 days a week, and except for the occasional DB field bug or having to compact/repair DBs, it's pretty stable.
About 3 weeks ago, we started seeing a problem that we hadn't seen before.
We have an 'in-house' system setup for us to use: the DB is on our Server which houses maybe 10 other DBs, and only a couple of people connect in to this system.
We started noticing that if someone logged into the system, went straight to our Scheduler screen, and then sat idle for about 5-10 minutes, they might get "Disk or Network Error", or "Cannot find the input table or query X on database Y".
What's strange is that it SEEMS to happen only when 2 or more people are logged into that DB from different computers, and then one of them will get the error (Randomly?) but the other 2 users will be fine.
There is a Timer on the 'main' MDI parent form of the system which wakes up about every minute (there are some things which change the interval to a shorter one, and some things which disable the timer, but we don't think either is happening in this situation). It performs a pretty basic SQL query on the DB: SELECT loggedin FROM Users WHERE UserId = 'DBUPDATER'. THIS is the SQL query that always seems to be the one triggering one of those errors.
There is also a timer that runs about every 2 minutes while a user is logged in to check for emails and a few other things that would be running during this time.
And since they are on the Scheduler screen, there's a timer on there that runs every 30 seconds or so which checks the DB to see whether any changes have been made to the scheduler and whether it needs to refresh the screen.
There are some other strange things:
The DB and the Users table seem to be perfectly fine. When the person who gets the error logged into the system - usually only 5 minutes earlier - the system HAD to look at the Users table to mark them as logged in, and I'm almost 100% sure that the query it's dying on had been run at least once - probably 4-5 times - before it died.
Once a user gets this error, if they leave the error's message box on the screen (i.e. don't quit the .exe) and then try to access that DB file in any way, they get "Disk or Network Error" - this includes trying to open that DB in Access. HOWEVER, they can still get to the network just fine, and can even open other .mdb files IN THE SAME FOLDER as the one they can't open. OTHER PEOPLE on other computers can open that .mdb without any errors. Once the user acknowledges the error and allows the exe to close, they can open that DB file again just fine.
I'm told that this is not in any way a network issue. Our IT guy says he's run pings and traces and all sorts of tests to ensure that the network connection is not getting dropped.
I've also run some checks on the DB to make sure it's not corrupted, and it seems to be fine - and we can get the error to happen on other DBs as well.
If anyone has seen anything like this and knows a possible fix, I would greatly appreciate it! Thanks!
New Information (9/5/13 - 10:30am)
We noticed that at the same time we get this error, the event log on the computer that gets the error shows a Warning event that says:
{Delayed Write Failed} Windows was unable to save all the data for the file \MEDTECHSERVER\MEDTECH DATA\VB\SCHEDTEST2\MEDTECH.MDB; the data has been lost. This error was returned by the server on which the file exists. Please try to save this file elsewhere.

google app engine deployment files limit per directory

I got the following error message while deploying my app:
...
07:45 AM Scanned 5500 files.
...
Error 400: --- begin server output ---
Exceeded the limit of 1000 for allowable files per directory within gaelibs/romn/cscd/
--- end server output ---
but in the document it says:
Maximum total number of files (app files and static files): 10,000 per directory; 10,000 total
Could somebody tell me what's wrong with this? Thanks.
Check issue 9256, opened by me. The Google guys have now changed the limit from 10,000 to 1,000.
Check that you're running the latest version of the SDK. I was not aware of this change; I thought the limit was still 1,000. So if the docs say 10,000, the limit was raised, and the change may have been introduced in a later version of the SDK than the one you're using.
I discovered that you can get around the 1,000-file limit by creating subdirectories. For example, if you have 1,500 files in one directory, Google will complain. If you divvy those files up into, say, 500 files each across 3 new subdirectories, Google won't complain. You still have 1,500 files under one directory, sort of (a sketch of such a split is below).
This was a lot easier for me than migrating everything to GCS.
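For what it's worth, a small script along these lines can do the split; the chunk size, subdirectory names and example path are illustrative, not part of the original answer:

import os
import shutil

def split_directory(src_dir, chunk_size=500):
    # Move the files of one flat directory into numbered subdirectories of at
    # most chunk_size files each, so no single directory trips the limit.
    files = sorted(name for name in os.listdir(src_dir)
                   if os.path.isfile(os.path.join(src_dir, name)))
    for i, name in enumerate(files):
        subdir = os.path.join(src_dir, 'part%03d' % (i // chunk_size))
        if not os.path.isdir(subdir):
            os.makedirs(subdir)
        shutil.move(os.path.join(src_dir, name), os.path.join(subdir, name))

# e.g. split_directory('gaelibs/romn/cscd')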

Backends logs depth

I have a long-running process in a backend and I have seen that the log only stores the last 1000 logging calls per request.
While this might not be an issue for a frontend handler, I find it very inconvenient for a backend, where a process might be running indefinitely.
I have tried flushing the logs to see if it creates a new logging entry, but it didn't. This seems so wrong that I'm sure there must be a simple solution. Please, help!
Thanks stackoverflowians!
Update: Someone already asked about this in the App Engine Google group, but there was no answer...
Edit: The 'depth' I am concerned with is not the total number of RequestLogs, which is fine, but the number of AppLogs in a RequestLog (which is limited to 1000).
Edit 2: I did the following test to try David Pope's suggestions:
def test_backends(self):
    launched = self.request.get('launched')
    if launched:
        # Do the job, we are running in the backend
        logging.info('There we go!')
        from google.appengine.api.logservice import logservice
        for i in range(1500):
            if i == 500:
                logservice.flush()
                logging.info('flushhhhh')
            logging.info('Call number %s' % i)
    else:
        # Launch the task in the backend
        from google.appengine.api import taskqueue
        tq_params = {'url': self.uri_for('backend.test_backends'),
                     'params': {'launched': True},
                     }
        if not DEBUG:
            tq_params['target'] = 'crawler'
        taskqueue.add(**tq_params)
Basically, this creates a backend task that logs 1500 lines, flushing at line number 500. I would expect to see two RequestLogs, the first one with 500 lines in it and the second one with 1000.
The results are the following:
I didn't get the result the documentation suggests: manually flushing the logs doesn't create a new log entry, and I still have one single RequestLog with 1000 lines in it. I had already seen this part of the docs some time ago and got this same result, so I thought I wasn't understanding what the docs were saying. Anyway, at the time I left a logservice.flush() call in my backend code, and the problem wasn't solved.
I downloaded the logs with appcfg.py, and guess what?... all the AppLogs are there! I usually browse the logs in the web UI, and I'm not sure I could get a comfortable workflow viewing the logs this way... The ideal solution for me would be the one described in the docs.
My app's autoflush settings are set to the default. I played with them at some point, but I saw that the problem persisted, so I left them unset.
I'm using python ;)
The Google docs suggest that flushing should do exactly what you want. If your flushing is working correctly, you will see "partial" request logs tagged with "flush" and the start time of the originating request.
A couple of things to check:
Can you post your code that flushes the logs? It might not be working.
Are you using the GAE web console to view the logs? It's possible that the limit is just a web UI limit, and that if you actually fetch the logs via the API then all the data will be there. (This should only be an issue if flushing isn't working correctly.)
Check your application's autoflush settings.
I assume there are corresponding links for Java, if that's what you're using; you didn't say.
All I can think of that might help is to use a timed/cron script like the following, run every hour or so from your workstation/server:
appcfg.py --oauth2 request_logs appname/ output.log --append
This should give you a complete log - I haven't tested it myself
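To run it on a schedule, a crontab entry along these lines should do it; the paths are placeholders:
0 * * * * appcfg.py --oauth2 request_logs /path/to/appname/ /path/to/output.log --append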
I did some more reading and it seems CRON is already part of appcfg
https://developers.google.com/appengine/docs/python/tools/uploadinganapp#oauth
appcfg.py [options] cron_info <app-directory>
Displays information about the scheduled task (cron) configuration, including the expected times of the next few executions. By default, displays the times of the next 5 runs. You can modify the number of future run times displayed with the --num_runs=... option.
Based on your comment, I would try:
1) Write your own logger class (a minimal sketch is below)
2) Use more than one version
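As a rough illustration of option 1, a custom handler could mirror the backend's log lines somewhere that isn't capped, for example the datastore; the model and handler names here are made up for the sketch:

import logging
from google.appengine.ext import ndb

class BackendLogLine(ndb.Model):
    # one entity per log record written by the backend
    created = ndb.DateTimeProperty(auto_now_add=True)
    level = ndb.StringProperty()
    message = ndb.TextProperty()

class DatastoreLogHandler(logging.Handler):
    # copies every record it receives into the datastore, so the output is not
    # limited to 1000 AppLogs per RequestLog (note: one datastore write per line)
    def emit(self, record):
        BackendLogLine(level=record.levelname,
                       message=self.format(record)).put()

# Attach it once, alongside the default handler, in the backend code:
# logging.getLogger().addHandler(DatastoreLogHandler())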
