How do I dynamically generate a sitemap with Google App Engine - google-app-engine

My website changes every day - I run a news website with new stories every day. I want Google to index my site as often as possible and want/need to autogenerate the sitemap.
I use Google App Engine (with Node.js) to run my site. With GAE - I do not have write-access to the root directory. To post the site map - I need to re-deploy my whole site after generating the map. That is an unnecessarily complex step.
I have searched far and wide and cannot see how to save my sitemap. So - I considered using a static one with a dynamically generated child that I store in another location where I have write access. Google says it wants all linked sitemaps in the same directory. So that appears to be a dead-end.
Can I use "App Deploy" in such a way that only the sitemap is uploaded? Any other possibilities? Appreciate any and all suggestions. It seems unlikely that Google didn't provide some way to solve this.

For a site where new URLs are being created regularly (like a news, blog site, etc), don't 'store' your sitemap. It should be generated on demand i.e. your App should include code to generate the content when the link <your_website>/sitemap.xml is loaded.
Separately, you should note that gcloud app deploy doesn't always deploys all your files. It usually deploys only files that have changed. You can easily confirm this by running the deploy command, changing a single file and then running the deploy command again. You will see that the logs will say something like - Uploading 1 files to Google Cloud Storage and the deploy will be faster. You can change X number of files, deploy again and the message will be updated to indicate it is only deploying x files.
However, I'm not sure what it uses to compute the diff. Maybe it compares it to the files currently in your staging bucket and if the files in the staging bucket have been deleted (they have a default life span of 15 days) it will deploy all the files again (but as I said, I'm not sure of this)

Related

AWS CloudFront how to update existing S3 file?

I use CloudFront from AWS and I have an S3 static website.
I use ReactJs and I changed some texts on most pages.
The problem I have now is that I use
npm run build
to produce the production application. I want to update the content on AWS in the S3 bucket (I previously uploaded the same files) however two things happen:
-When I access in incognito, everything works fine, I am given the updated version of the website
-When I access in normal mode with web browsers that I used to access the website before, I am still given the old version of files.
I accessed AWS documentation and I have two solutions:
-Wait 24 hours for CloudFront to cache files in edge locations
-Use version name for the files (for example, change the name of image.jpg to image_1.jpg; image_2.jpg etc.)
I would definitely go with the second option which indeed is time consuming but for sure less than 24 hours. Should I change the name of EACH file that I have in build or just those in static?
Any other solutions?
Something I haven't tried is, before uploading to AWS S3, create a folder such as V1 and upload my react files. When I make a change I call the folder V2 and so on.
Using version names is the most robust method. It gives you full control on the cache behavior without messing around with CloudFront.
So yes, each time there's a new version update your file names.
Btw, if you bootstrapped your react app using create-react-app then the build process does that by default. It will name each bundle with a unique hash every time the bundle changes. This way you can utilize long-term cache in the browser and CF for your files.
You'd probably still have to invalidate the root index.html each deployment as its name doesn't change between versions.

Downloading App Engine source code

So it seems from a few SO questions I've seen that this is a problem among other users. Recently one of our head dev's left and I inherited a lot of his projects. One of which, is a website that what seems like lives on an app engine from google cloud platforms. From the App Engine documentation, to download source code you use the appcfg.py download_app command. Which I did, however the only results I get back from that call is:
Fetching file list...
Fetching files...
And then it just ends. No error message or any kind of message at all, and of course, it did not download the source code into the output dir I specified.
Scratching my head and looking at various SO posts, someone mentioned something about going into the google cloud vm directly and doing the same command, and to my surprise finding the same exact behavior that I did in my local terminal.
This made me realize it must be something else at play. I took a look at my versions tab in the App Engine dashboard on GCP. I see my instance running, it correctly says Serving and if I click the link it brings me to the website which loads fine. However, under Size it says 0 B which made me think perhaps this is why the download_app isn't downloading anything, because the version is 0 B?
What I'm trying to figure out is why it says 0 B for the version, when clearly the site runs fine and how I can get the source code for this. Here's a screenshot for reference
And screenshot of my terminal (local). Obviously I omitted the -A and -V flags, but they are correctly set and if I purposely make them incorrect I do indeed get an error message.
EDIT
Just so everyone is aware, I also made sure my user had the correct permissions. Owner, App Engine Owner... and some others. I don't think that's the problem.
When you deploy an App Engine Flexible application, the source code is uploaded to Cloud Storage on your project in a bucket named staging.<project-id>.appspot.com. You can navigate in this bucket and download the source code for a specific version as a .tar file.
Alternatively, you can find the exact Cloud Storage URL for your source code by going to Dev Console > Container Registry > Build History and select the build for your version. You'll find the link to your source code under Build Information.
One thing to note however is that the staging... bucket is created by default with a Lifecycle rule that deletes files older than 15 days automatically. You can delete this rule if you want so that all versions' source code is kept indefinitely.
In your case I believe that may not have helped since files may have been deleted already but it's worth knowing you can get the source code from there (source code isn't pushed to Source Repository by default, your developer had to configure it manually).
Posting this since none of the listed methods on the web didn't take me to the code (by June 2021)
Note: appcfg.py is deprecated by Google
You could try accessing your source code through;
Google Cloud Platform > Debugger > choosing the version of the
Application from combo at top.
This will list the files of that version on the left pane. There is no way to download code automatically but you can copy-paste the code.
Advice: Push your code to a Git repository to avoid this hassle next time.
Hope you will find this helpful.
In the developer console you can select the respective project and check:
on the Services page - which services, AKA modules - as they used to be (and still are) called in various places, you app has deployed
on the Versions page - which versions for each of the services are deployed
This information is what appcfg.py download_app expects. See also:
the various appcfg.py options using its --help flag
How do I download a specific service's source code off of AppEngine?
You can also access the deployed source code live (if everything else fails it could still be a last resort method to get the code, but tedious), see my answer to Google Cloud DataStore automatic indexing
Update:
I just now noticed in your screenshot that it's a flexible environment app. The appcfg.py docs are in the standard environment section, I suspect it's not applicable to the flexible environment, for which what's deployed is actually a docker image built during the deployment operation. From Deploying your application:
Deploy your app to App Engine using the gcloud app deploy
command. This command automatically builds a container image by using
the Container Builder service and then deploys that image to the
App Engine flexible environment. The container will include any local
modifications that you've made to the runtime image.
It might be possible to access the code on the actual GCE instance running the app, by connecting to the running instance and starting a shell in your app container, see Connecting to the instance

Cleanup Google Mobile Backend Starter

So I started to play with Google Mobile Backend Starter.
Now I want to clean this instance and anything that this starter thingy created into the project (e.g. task queues, data store, etc...)
How do we achieve this?
Is this done through some command line somewhere along the lines described in this page?
appengine-java-sdk/bin
EDIT: I should have made it clearer, I don't intend to delete the project. I just want to "clean" it and replace with my own application. Anyway, I ended up using the appengine SDK tools mentioned above (Updating and Managing a Java App). It was a long process, and tedious. It could be improved.
What I did:
Using the appengine SDK tool, downloaded the application first. It prompted me for a password. I had to create a new "App Password" entry in my Google account, since it didn't accept my "usual" Google password (e.g. GMAIL)
To clean the Scheduled Tasks, edit cron.xml so that no entries are left. A sample empty cron.xml file is shown in the documentation page. Run the update procedure of the SDK tool for cron jobs
To clean the Queues, use the same approach (queue.xml)
Clean DataStore by going to DataStore Admin page (if applicable)
Upload your new application (either thru SDK, or thru Android/Eclipse AppEngine plugin)
There will now be 2 (or more) versions of your AppEngine. If necessary, make your newly uploaded application, the default version. This is done in the Developer Console.
Check the instances as well. Remove if necessary

GWT how to store information on google App Engine?

In my GWT application, a 'root' user upload a specific text file with data and that data should be available to anyone who have access to the app (using GAE).
What's the classic way to store a data that will be available to all users? I don't want to use any database (objectify!?) since this is a relatively small amount of information and it changes from time to time by root.
I was wondering if there was such static MAP on the 'engine level' (not user's session) that this info can be stored (and if the server is down - no bigi, root will upload again)
Thanks
You have three primary options:
Add this file to your /war/ directory and deploy with the app. This is what we typically do with all static files that rarely change (like .css file, images, etc.) This file will be available to all users, whether they are authenticated or not.
Add this file to your /war/WEB-INF/ directory and deploy with the app. This file will be available to your server-side code, so you can read it on the server-side and show to a user. This way you can decide which users can see this file and which users should not have access to it.
Upload this file to Google Cloud Storage. You can do it through an app, or you can simply upload it manually to a bucket using a GCS console or gsutil command-line tool. Then you simply provide a link to your users. The advantage of this option is that you do not have to redeploy your app when a file changes.
The only reason to go with the first two options is to have this file under version control. If you don't need that, I would recommend going with the GCS option.

How to create static pages programmatically from app on App Engine?

GAE is great. Everything becomes easy developing on it.
But recently I have found the need to create pages that can be served statically without changing my app.yaml and re deploying. Is this possible through the API?
No. To change anything in the app, you have to re-deploy (which you could do programmatically).
Your options are serving dynamic contents, or putting the contents in the blob store (or some other CDS like Amazon S3), and serving it from there.

Resources