Nginx config for serving snapshots to the Google bot - angularjs

I have an AngularJS app which I'd like to get indexed properly on Google.
I wrote a client that scrapes the sites for links and then downloads the pages with Phantomjs making snapshots. This all works fine. What I'm having a problem with is serving those snapshots to the Google bot.
For some reason, the Google bot appends ?_escaped_fragment_= to my URLs. As an example, http://me.com/about gets changed to http://me.com/about?_escaped_fragment_=. I've verified this in the access logs.
I'm trying to catch this request and serve the Google bot the snapshot with this config:
location / {
if ($args ~ "_escaped_fragment_=") {
rewrite ^ /snapshots/$1;
}
}
However, requesting this URL: http://me.com/about?_escaped_fragment_= always results in a 404. Same with the other pages.
The snapshots are stored in /snapshots, relative to the root of the website. They're named after their pages, following directory structure, so http://me.com/business/register has a snapshot in /snapshots/business/register.html.
What can I do to get these snapshots to work?
Thanks.

First, let me explain why Google uses ?_escaped_fragment_. It exists for websites that rely on AJAX and mark their pages with hashes. For example, say you have http://example.com/gallery/#!image1, and each time the user moves to the next image you update the hash to image2, image3, and so on. If the user goes directly to http://example.com/gallery/#!image50, your JavaScript uses that hash to load the 50th image directly instead of image1 (servers can't see the hash part, only JavaScript can).
So Google uses _escaped_fragment_ to tell the server which page it's trying to cache.
Google's documentation on making AJAX applications crawlable explains this in more detail.
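As a rough illustration of that mapping (a sketch, not Google's actual implementation), the function below turns a "pretty" URL into the form the crawler requests. The name toEscapedFragmentUrl is made up for this example, and the exact character escaping the crawler applies may differ from what URLSearchParams produces. The second call is the case from the question, where the page has no hashbang at all.

```javascript
// Hypothetical helper: maps a "pretty" URL to the URL a crawler following
// the AJAX crawling scheme would request instead.
function toEscapedFragmentUrl(prettyUrl) {
  const url = new URL(prettyUrl);
  // Everything after "#!" becomes the parameter value; a URL without a
  // hashbang gets an empty value.
  const fragment = url.hash.startsWith("#!") ? url.hash.slice(2) : "";
  url.hash = "";
  url.searchParams.append("_escaped_fragment_", fragment);
  return url.toString();
}

console.log(toEscapedFragmentUrl("http://example.com/gallery/#!image50"));
console.log(toEscapedFragmentUrl("http://me.com/about"));
```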
As for why you get a 404: I think it's because you used $1 without a capturing group. The right rule would be something like this:
location / {
    if ($args ~ "_escaped_fragment_=(.*)") {
        rewrite ^ /snapshots/$1;
    }
}
But I don't think this will fix your problem, because according to your example you didn't use hashes; you used the URI of the page. So I would rewrite the rule to something like this:
location / {
    # Try the snapshot first; if it is not found, try the file itself.
    # $uri is used rather than $request_uri, because $request_uri includes the query string.
    try_files /snapshots$uri.html $uri =404;
}
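To make the intended mapping concrete, here is a small sketch of the lookup the server is expected to perform, based on the directory layout described in the question. The function name snapshotPathFor and the choice of "index" for the site root are assumptions made for illustration; nothing here is nginx behaviour.

```javascript
// Hypothetical helper mirroring the layout from the question:
// /business/register -> /snapshots/business/register.html
function snapshotPathFor(requestUrl) {
  // The base is only needed so that URL can parse a path-only input.
  const url = new URL(requestUrl, "http://localhost");
  // Only requests carrying _escaped_fragment_ should get a snapshot.
  if (!url.searchParams.has("_escaped_fragment_")) return null;
  // Drop trailing slashes; treat the site root as "index" (an assumption).
  const page = url.pathname.replace(/\/+$/, "") || "/index";
  return "/snapshots" + page + ".html";
}

console.log(snapshotPathFor("/business/register?_escaped_fragment_="));
console.log(snapshotPathFor("/about"));
```

A request without the parameter returns null here, which corresponds to serving the normal AngularJS page.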

Here is what I have in nginx, and it works fine. You might need to add a special case for index.html (i.e. when accessing the root of your website).
if ($args ~ "_escaped_fragment_=/(.+)/?") {
    set $path $1;
    rewrite ^ /snapshots/$path.html;
    break;
}
location /snapshots/ {
    internal;
    alias /var/www/snapshots/;
}
So http://me.com/?_escaped_fragment_=/about will access /var/www/snapshots/about.html
Don't forget this meta tag in your page as well if you use HTML5 pushState instead of hashbangs (shown here in Jade syntax):
meta(name="fragment", content="!")

Related

Is there a way to rename automatically generated routes JSON file in Next.js?

I have a problem, when I click to go to the /analytics page on my site, adblockers block the analytics.json file that's being requested by Next.js as they think it's an analytics tracker (it's not, it's a page listing analytics products).
Is there a way to rename the route files Next.js uses when navigating to server-side rendered pages on the client-side?
I want to either obfuscate the names so they're not machine readable, or have a way to rename them all.
Any help appreciated.
With thanks to @gaston-flores I've managed to get something working.
In my case /analytics is a dynamic page for a category, so I moved my pages/[category]/index.tsx file to pages/[category]/category.tsx and added the following rewrite:
// next.config.js
module.exports = {
  async rewrites() {
    return [
      {
        source: "/:category",
        destination: "/:category/category",
      },
    ];
  },
};
This now fetches the category.json file rather than analytics.json, which passes the adblockers' checks and renders as expected.
Note that because I also had a dynamic file in the pages/[category] directory (pages/[category]/[product].tsx), I had to move it to pages/[category]/product/[product].tsx; without this tweak the /analytics page was redirected to /analytics/category for some reason.
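To see which paths the rewrite rule touches, here is a tiny stand-in for the pattern. It is only an illustration: Next.js does its own pattern matching internally, and the function name applyCategoryRewrite is made up.

```javascript
// Hypothetical stand-in for the "/:category" -> "/:category/category" rule.
function applyCategoryRewrite(pathname) {
  // "/:category" matches exactly one path segment.
  const match = /^\/([^/]+)$/.exec(pathname);
  return match ? `/${match[1]}/category` : pathname;
}

console.log(applyCategoryRewrite("/analytics"));
console.log(applyCategoryRewrite("/analytics/product/widget"));
```

Only single-segment paths are rewritten, so deeper paths such as the product pages are left alone.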

Authentication to serve static files on Next.js?

So, I looked for a few authentication options for Next.js that wouldn't require any work on the server side of things. My goal was to block users from entering the website without a password.
I've set up a few tests with NextAuth (after a few other tries) and apparently I can block pages with sessions and cookies, but after a few hours of research I still can't find how I would go about blocking assets (e.g. /image.png from the /public folder) from non-authenticated requests.
Is that even possible without a custom server? Am I missing some core understanding here?
Thanks in advance.
I stumbled upon this problem too. It took me a while, but I figured it out in the end.
As you said, for auth you can use whatever you like, such as NextAuth.
For file serving, I set up a new API endpoint that reads the file with Node.js and pipes it to the response. It's pretty similar to what you would do in Express. Don't forget to set the proper headers in your response.
Here is a little snippet to demonstrate (TypeScript version):
import { NextApiRequest, NextApiResponse } from 'next'
import { stat } from 'fs/promises'
import { createReadStream, existsSync } from 'fs'
import path from 'path'
import mime from 'mime'

// Basic Next.js API route
export default async function getFile(req: NextApiRequest, res: NextApiResponse) {
  // Don't forget to authenticate the request first!
  // The file lives in a "private" folder in the project root (at the same level as the normal Next.js "public" folder)
  const someFilePath = path.resolve('./private/somefile.png')
  // If the file is not in that folder, end the response with a 404
  if (!existsSync(someFilePath)) return res.status(404).end()
  // Cache header (optional). 'private' means the response is intended for a single user and must not be stored by a shared cache.
  // More in https://stackoverflow.com/questions/12908766/what-is-cache-control-private#answer-49637255
  res.setHeader('Cache-Control', 'private, max-age=5000')
  // Size header, so the browser knows how large the file really is.
  // Native fs/promises stat is enough here; no external packages needed
  const stats = await stat(someFilePath)
  res.setHeader('Content-Length', stats.size)
  // MIME type, in case the browser can't determine what file it is getting.
  // getType returns null for unknown extensions, so fall back to a generic type
  const mimetype = mime.getType(someFilePath) ?? 'application/octet-stream'
  res.setHeader('Content-Type', mimetype)
  // Create a read stream from the path and pipe it to the client
  createReadStream(someFilePath).pipe(res)
}
Cheers
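The snippet above serves one fixed file. If the endpoint were extended to serve a file chosen by the request, the name would need checking so that it cannot point outside the private folder. A minimal sketch of such a check, assuming a hypothetical helper called resolvePrivateFile (not part of Next.js or NextAuth):

```javascript
const path = require("path");

// Hypothetical helper: resolve a requested file name inside a private
// directory, returning null if the result would escape that directory.
function resolvePrivateFile(baseDir, requestedName) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, requestedName);
  const relative = path.relative(base, target);
  const escapes =
    relative === "" ||                       // the directory itself
    relative.split(path.sep)[0] === ".." ||  // climbs above the base
    path.isAbsolute(relative);               // e.g. another drive on Windows
  return escapes ? null : target;
}

console.log(resolvePrivateFile("./private", "somefile.png"));
console.log(resolvePrivateFile("./private", "../secret.txt"));
```

The second call prints null, which the API route could answer with a 404.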

Next.js: How can we have dynamic routing redirect to static pages?

Using Next.js, I currently have an app with a single entry point in the form of /pages/[...slug]/index.ts
It contains a getServerSideProps function which analyses the slug and decides whether to redirect.
In some cases a redirection is needed, but it will always be towards a page that can be statically rendered. Example: redirect /fr/uid towards /fr/blog/uid, which can be static.
In other cases the slug is already the URL of a page that can be static.
How can I mix this dynamic element with a static generation of all pages?
Thanks a lot for your help!
If I understood your problem correctly, you cannot use getServerSideProps if you are going to export a static site.
You have two solutions:
Configure your redirection rules in your web hosting solution (e.g. Amazon S3/CloudFront).
Create client-side redirects (when _app.tsx mounts, you can check whether router.asPath matches any of the redirections you would like to have configured).
Please remember that the first solution is more correct for SEO purposes, since the server answers with a real 301 redirect.
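For the client-side option, the lookup itself can be a small pure function. This is a sketch under assumptions: the redirect table and the name findRedirect are hypothetical, and the single entry is the example from the question.

```javascript
// Hypothetical redirect table; the single entry is the example from the question.
const redirects = new Map([["/fr/uid", "/fr/blog/uid"]]);

// Returns the redirect target for a router.asPath-style string, or null.
function findRedirect(asPath) {
  // asPath can include a query string and a hash; compare on the path only.
  const pathOnly = asPath.split(/[?#]/)[0];
  return redirects.get(pathOnly) ?? null;
}

console.log(findRedirect("/fr/uid?ref=mail"));
console.log(findRedirect("/fr/blog/uid"));
```

In _app.tsx this could be called from an effect that runs on mount, passing router.asPath and calling router.replace with the result when it is not null.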
EDIT: @juliomalves rightly pointed out that the OP is looking at two different things: redirection and hybrid builds.
However, the question should be clarified a bit more to really be able to solve the problem.
Because you will need to host a web server for SSR, you can leverage the built-in redirection system in Next.js 9.5 to have permanent server-side redirects.
When it comes to SSR vs SSG, Next.js allows a hybrid approach by letting you choose which data fetching strategy to use for each page.
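A minimal sketch of that built-in redirection, using the redirects key in next.config.js. The paths are placeholders borrowed from the CloudFront example further down in this thread, not something the OP specified. As far as I know, permanent: true makes Next.js answer with a 308 status rather than a 301.

```javascript
// next.config.js
module.exports = {
  async redirects() {
    return [
      {
        // Placeholder paths, for illustration only.
        source: "/products/decode/app.html",
        destination: "/products/decode.html",
        permanent: true,
      },
    ];
  },
};
```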
In case you are using AWS CloudFront, then you can redirect with CloudFront Functions.
CloudFront Functions is ideal for lightweight, short-running functions for use cases like the following:
URL redirects or rewrites – You can redirect viewers to other pages based on information in the request, or rewrite all requests from one path to another.
Here is what we are using to redirect clients (e.g. Native App, Google search index, etc.) to new location when NextJS page was moved or removed.
// NOTE: Choose "viewer request" for event trigger when you associate this function with CloudFront distribution.
function makeRedirectResponse(location) {
    var response = {
        statusCode: 301,
        statusDescription: 'Moved Permanently',
        headers: {
            'location': { value: location }
        }
    };
    return response;
}

function handler(event) {
    var mappings = [
        { from: "/products/decode/app.html", to: '/products/decode.html' },
        { from: "/products/decode/privacy/2021_01_25.html", to: '/products/decode/privacy.html' }
    ];
    var request = event.request;
    var uri = request.uri;
    for (var i = 0; i < mappings.length; i++) {
        var mapping = mappings[i];
        if (mapping.from === uri) {
            return makeRedirectResponse(mapping.to);
        }
    }
    return request;
}
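For local experimentation, the same idea can be exercised outside CloudFront. The sketch below is a variant of the handler that uses an object as a lookup table, followed by a call with a hand-written event. The sampleEvent object is a simplified stand-in; a real viewer-request event carries more fields.

```javascript
// Variant of the handler above, using an object as the lookup table.
var redirectTable = {
  "/products/decode/app.html": "/products/decode.html",
  "/products/decode/privacy/2021_01_25.html": "/products/decode/privacy.html"
};

function handler(event) {
  var request = event.request;
  var known = Object.prototype.hasOwnProperty.call(redirectTable, request.uri);
  if (!known) {
    return request; // no mapping: let the request through unchanged
  }
  return {
    statusCode: 301,
    statusDescription: "Moved Permanently",
    headers: { location: { value: redirectTable[request.uri] } }
  };
}

// Simplified stand-in for a viewer-request event (an assumption).
var sampleEvent = { request: { method: "GET", uri: "/products/decode/app.html", headers: {} } };
console.log(handler(sampleEvent).headers.location.value);
```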

Laravel returning a 404 on an image

This should be fairly simple though it is completely stumping me.
I have a backend Laravel installation running on localhost:8000
I have a front end Angular app running on localhost:9001.
I have some 'static' images I have included in my seed data (e.g. '1', 'user.png'); these images are being rendered perfectly in my front end (they are also served from the exact place my image uploads are going).
The URL I am currently serving images from is http://localhost:8000/images/{filename}
I can upload images from the front to the back end and they appear in the DB and the image is being put in the filesystem, I'm populating the correct URL in my front end (as evidenced by the previous URL).
My uploaded images are not being shown.
In my logs I am getting:
[2015-01-20 18:13:49] local.ERROR: NotFoundHttpException Route: http://localhost:8000/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg [] []
[2015-01-20 18:13:49] local.ERROR: exception 'Symfony\Component\HttpKernel\Exception\NotFoundHttpException'
I tried adding a method in my routes file, but I couldn't see why that would be needed when some images are already being served.
I have also set all permissions to 755 on my /images folder.
Any ideas?
I'm not sure I follow every bit of multi-system interaction you have going on, but I'd drop back to first HTTP principles.
Try accessing the image URL directly.
http://localhost:8000/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg
If the error in your browser (or your logs, if you're not developing with debug set to true) is
local.ERROR: NotFoundHttpException Route: http://localhost:8000/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg
This means your web server couldn't find a file at images/j249ae747ce28c317e02f1fb6d0a10c3.jpg, and handed the request to Laravel. This means you need to figure out why your webserver can't see the file.
Assuming you're serving index.php from the public folder
Do you have a public/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg file?
Are you sure? Copy and paste the path into a terminal and do an ls public/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg to make sure your brain isn't missing some subtle case issue
Are any errors showing up in your web server's logs (not Laravel's)?
Can you create a text/html file in the images folder and serve it? If not, then you may not be pointing your web server at the folder you think you are.
Something like
http://localhost:8000/images/test.txt
http://localhost:8000/images/test.html
Some first principles debugging like that should point you in the right direction.
rm public/storage
php artisan optimize:clear
php artisan storage:link
This worked for me.
The problem is that you haven't generated a URL for your uploaded image.
Try accessing your URL like this:
http://localhost:8000/storage/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg
To generate the above URL, call \Storage::disk('public')->url() in your controller. This method uses the public disk configuration found in config/filesystems.php and generates a URL in the following format:
http://localhost:8000/storage/images/j249ae747ce28c317e02f1fb6d0a10c3.jpg
For example, the method below stores the image in the images folder and generates the URL of the image path.
public function uploadImage(Request $request)
{
    $request->validate(['image' => 'file|image|max:5000']);
    $imageProfile = new ImageProfile();
    if ($request->hasFile('image') && $request->file('image')->isValid()) {
        // Store on the "public" disk so the stored path matches the URL generated below
        $image = $request->file('image')->store('images', 'public');
        $imageProfile->image_profile_url = \Storage::disk('public')->url($image);
        $imageProfile->save();
    }
    return response()->json($imageProfile, 200);
}
The code returns the JSON response below:
{
    "id": 13,
    "image_profile_url": "http://127.0.0.1:8000/storage/images/cxlogqdI8aodERsmw74nmEx7BkxkWrnyJLMH7sFj.jpeg",
    "updated_at": "2020-01-13 16:27:37",
    "created_at": "2020-01-13 16:27:37"
}
Try to copy the url and test it in postman.
Visit the link to learn more about Laravel file storage
Laravel File Storage
Hope it helps.
Laravel 8
Controller function
public function store(Request $request)
{
    $this->validate($request, [
        'site_title' => 'required',
        'logo_image' => 'required|image|mimes:jpeg,png,jpg,gif,svg|max:2048',
    ]);
    // Name the file after the upload time and move it into public/images
    $input['logo_image'] = time().'.'.$request->logo_image->getClientOriginalExtension();
    $request->logo_image->move(public_path('images'), $input['logo_image']);
    $input['site_title'] = $request->site_title;
    //dd($input);
    Site_settings::create($input);
    return back()->with('success','Image Uploaded successfully.');
}
Blade view
<td>
    <img src="{{ url('/images/'.($site_settings->logo_image ?? '')) }}" alt="" width="250px" height="auto">
</td>

get domain name (base url without protocol)

How to get app domain name? (I mean base url without protocol http:// or https://)
So if app is installed on 'http://sub.example.com/app', I want to get 'sub.example.com'.
There is the PHP global $_SERVER['HTTP_HOST'] - this will return the domain without protocol, but won't give you a sub-directory, e.g. if your app is at www.domain.com/myapp/app.
Assuming you are referring to a CakePHP app (given the tag on the question), you can use the following constant up to version 2.4:
FULL_BASE_URL
From version 2.4 you can use:
Router::fullBaseUrl()
This returns the base URL with http:// or https://. You can then do a regex replace to get rid of that part.
Try:
function replace_http($url) {
    // preg_* patterns need delimiters; '#' avoids having to escape the slashes
    $pattern = '#^https?://#';
    return preg_replace($pattern, '', $url);
}
$baseUrl = replace_http(Router::fullBaseUrl());
I use this in my config.php/bootstrap.php which is located in config folder to get url app domain with folder
$_SERVER['HTTP_HOST'].dirname(dirname(dirname($_SERVER['PHP_SELF'])));
