htaccess url rewrite (no 301!), but duplicate content - file

I have a file called "abc.html". In the past this was called "abc.cfm". From an outsider's point of view I want it to look like abc.cfm still exists (and it should be the content of abc.html).
Currently I have this in my htaccess:
RewriteRule ^abc.cfm$ abc.html [L]
This works perfectly. Whenever you go to abc.cfm, it shows the content of abc.html, without any redirect (from the user's POV).
The problem is that I can also reach abc.html now, and that's duplicate content. I can solve this by adding a canonical link saying that abc.cfm is the original URL. I just wondered if it's possible to have a 301 from the html to the cfm file (which internally calls for the html again). Without getting into an infinite loop of course :-)
I am also open to other solutions (but I can't change the links pointing to the abc.cfm file and I don't want a 301 redirect to the abc.html file).

Redirect 301 /abc.html /abc.cfm
Put this line before your RewriteEngine on directive.
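If you want to do it in mod_rewrite, you could use the following instead:
RewriteCond %{THE_REQUEST} \s/abc\.html[?\s]
RewriteRule ^abc\.html$ /abc.cfm [R=301,L]
This will force the user's browser to only be able to request the .cfm page, while internally the server knows to grab the content from the .html page. The flags must be written without a space ([R=301,L]). The RewriteCond tests the original request line, so the redirect fires only when the browser itself asked for abc.html and not when abc.html is reached through your internal rewrite. Without it the two rules can chase each other in a loop, because rules in a .htaccess file are applied again after an internal rewrite; the plain Redirect line above is likely to hit the same loop, since it cannot tell the two cases apart. This rule needs to be before the internal rewrite rule you have listed above, but after the RewriteEngine on directive.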
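To see why the 301 needs care, here is a small Python model (not Apache itself) of how rules in a .htaccess file are applied again after an internal rewrite. The function name, its parameters and the pass limit are made up for illustration. The model only assumes that a redirect rule which looks at the current, already rewritten path fires on the second pass, whereas one that looks at what the browser originally asked for does not.

```python
def serve(requested, guard_on_original_request, max_passes=5):
    """Return ("301", target) or ("200", file_served) for a requested path.

    Simplified model: after an internal rewrite the rule set runs again on
    the new path, as it does for rules placed in a .htaccess file.
    """
    current = requested
    for _ in range(max_passes):
        # Rule 1: external redirect from the .html URL to the .cfm URL.
        tested = requested if guard_on_original_request else current
        if tested == "/abc.html":
            return ("301", "/abc.cfm")
        # Rule 2: internal rewrite, invisible to the browser.
        if current == "/abc.cfm":
            current = "/abc.html"
            continue  # the rules are applied again to the rewritten path
        return ("200", current)
    return ("500", "too many internal passes")


# Without the guard, /abc.cfm is rewritten and then redirected straight
# back to /abc.cfm, so the browser would go round in circles.
print(serve("/abc.cfm", guard_on_original_request=False))   # ('301', '/abc.cfm')
# With the guard, /abc.cfm serves abc.html and /abc.html is redirected once.
print(serve("/abc.cfm", guard_on_original_request=True))    # ('200', '/abc.html')
print(serve("/abc.html", guard_on_original_request=True))   # ('301', '/abc.cfm')
```

In Apache terms, the guard corresponds to testing the original request (for example via THE_REQUEST) rather than the current path.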

Related

What is the unusual thing that I sometimes see in website urls?

Sorry, I don't really know how to word the title but this has been bothering me for ages. I have always wondered how on websites sometimes you see a weird url structure. Take youtube for example:
youtube.com/c/this_persons_channel
I just don't understand how it works!
I understand that the / means it is a folder within the website, but not how the url ends.
Is it a file? and if so, where is the file type?
And is this file made from code? Because it is for a specific user, and it can't have been there when they made the site.
I'm sorry if this is a stupid question but I have no idea what to search for to find out more about this.
I think what you're talking about is rewriting the url. This can be done in the .htaccess file of your project.
The end of the url is not a file or a folder. It is a value that the server hands to the code as if it were a query string, so it can be used for custom pages. Take the url for this question:
https://stackoverflow.com/questions/69684211/what-is-the-unusual-thing-that-i-sometimes-see-in-website-urls
That can also be represented as:
https://stackoverflow.com/questions?id=69684211&name=what-is-the-unusual-thing-that-i-sometimes-see-in-website-urls
Note: These are probably not the actual names for the query strings, just an example.
By writing:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^index/([0-9]+) index.php?id=$1 [NC,L]
In the .htaccess file, you can turn the url
http://localhost/index.php?id=5
Into
http://localhost/index/5
Note: This will not simply change the url when you refresh the page. But if I were to type in 'http://localhost/index/5' it will give me the same page as if I typed in 'http://localhost/index.php?id=5'. So you will have to change all links to that page to the preferred url. All this code does is make the new url work instead of sending you to a 404 error page.
If you want, you can probably write some php code that automatically redirects you if you are on the original url.
Websites do this for a couple of reasons:
Makes the url more memorable and easier to write,
Makes it easier for search engines to find the website.
Explaining the .htaccess code
Once you understand what the code means it is actually fairly simple.
Line 1:
RewriteEngine on
By setting this to on, it allows you to rewrite and make your own rules for the url.
What is a rewrite engine?
Line 2:
RewriteCond %{REQUEST_FILENAME} !-d
This line is a condition, kind of like an if statement, where it is saying if the Requested File name is not a directory, then continue with code. This is to prevent the issue where a file has the same name as a folder and it can get confusing in the url.
Line 3:
RewriteCond %{REQUEST_FILENAME}\.php -f
This line checks whether a file with the requested name plus the extension '.php' exists. It doesn't have to be php; it can be whatever extension you use.
For more information on .htaccess rules and conditions: .htaccess rewrite rules
Line 4:
RewriteRule ^index/([0-9a-zA-Z_-]+) index.php?id=$1 [NC,L]
Ok so this is the most confusing line. It is basically describing the rule for what to change the url to.
The '^' right at the beginning anchors the match to the start of the path, so the rule only applies when the path begins with 'index'. In a .htaccess file that path is relative to the directory the file is in and has no leading slash.
Now you want to set the rules for what characters are allowed between index/ and the next '/' in the url. If you only have one parameter in the url then don't worry about adding another '/' at the end. But if you also have, let's say, a ?name parameter, the url will be https://localhost/index/5/charlie for example. Because there are now two parameters, you have to add another capture group, and the last line becomes RewriteRule ^index/([0-9a-zA-Z_-]+)/([0-9a-zA-Z_-]+) index.php?id=$1&name=$2 [NC,L]
In this example it allows all characters from 0-9, a-z (lowercase and capitals), and underscores and hyphens. So it will rewrite the url only if it contains those characters. If it contained a dollar sign $, for example, then it wouldn't rewrite.
Make sure you include the + after the closing square bracket! Without the + at the end, the group would capture only one character. So ?id=5 would work, but index/52 would be rewritten as ?id=5 instead of ?id=52. With the + there it will be fine.
Now, the next part of the line tells the server which file should actually handle the request. In this case that is index.php, so you write that.
You also want to include the parameters so you can actually use them in your code. So now add on the query string. But we don't know in advance what the ?id parameter is, so we need to put a placeholder there: type in ?id=$1. The 1 is not a name you choose; $1 stands for whatever the first pair of parentheses in the pattern captured.
If you were to have more than one parameter, you would simply write &param2=$2 then &param3=$3 and so on.
The final thing you need to add is [NC,L]
NC means 'no case': the pattern is matched case-insensitively, so if someone were to write Index with a capital I, it would not matter. L means 'last': if this rule matches, no later rules in the file are applied in this pass. (The Rewrite Conditions from earlier apply only to the rule that immediately follows them, whether or not L is used.)
If you didn't understand my explanation watch Dani Krossing's video on it, he explains it very well: https://www.youtube.com/watch?v=zJxCq6D14eM
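If you want to try the matching behaviour outside Apache, the following Python sketch mimics it with the same character class. It is only an approximation: re.IGNORECASE stands in for the NC flag, the rewrite function is a made-up helper, and the two-parameter variant assumes the second segment is meant to be passed on as a name parameter. Paths are written without a leading slash, as a .htaccess rule would see them.

```python
import re

# Same pattern as the one-parameter rule; re.IGNORECASE plays the role of NC.
ONE_PARAM = re.compile(r"^index/([0-9a-zA-Z_-]+)", re.IGNORECASE)
# Two captured segments, handed on as id and name (name is an assumption).
TWO_PARAMS = re.compile(r"^index/([0-9a-zA-Z_-]+)/([0-9a-zA-Z_-]+)", re.IGNORECASE)


def rewrite(path):
    """Return the internal target for a path, or None if no rule matches."""
    m = TWO_PARAMS.match(path)
    if m:
        return f"index.php?id={m.group(1)}&name={m.group(2)}"
    m = ONE_PARAM.match(path)
    if m:
        return f"index.php?id={m.group(1)}"
    return None


print(rewrite("index/52"))         # index.php?id=52
print(rewrite("index/5/charlie"))  # index.php?id=5&name=charlie
print(rewrite("INDEX/5"))          # index.php?id=5
print(rewrite("blog/index/5"))     # None, because ^ anchors to the start
print(rewrite("index/$"))          # None, because $ is not in the allowed set
```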

Clean the URL like of Google+

Sorry for so brief title. I am wondering how Google+ makes user URLs so clean. I mean like :
https://plus.google.com/+PuruVijay
would take me to my page. I want to know how the + after the / was put there and how it loads the corresponding page. I want to use a database to look up the URL. The URL actually should have been like
plus.google.com/user?id=134566
Looking for a good answer, please help.
Edit:
An example is of this page's URL
You can also do it like this: you just need to create a folder with that name,
e.g. http://yoursite.com/PuruVijay
Here PuruVijay is a folder you need to create in your website directory, with an index file inside it.
In a comment you say you are using an Apache server. The typical way to handle URL manipulations like this is the module mod_rewrite, which you can find documentation on here. This uses regular expressions to match URLs and direct to another. For example, a rule for /~user to /u/user is
RewriteRule ^/~([^/]+)/?(.*) /u/$1/$2
For the Google+ example, you say you want to translate from /+PuruVijay to /user?id=134566. This will be a little more complicated because the URL as given does not include the User ID, so you will have to retrieve the number some other way. You can however use mod_rewrite to rewrite to /user?name=PuruVijay. This would look something like (not tested!)
RewriteRule ^/\+(.*) /user?name=$1
Then, your user page can get the name parameter and look it up in the database to display the correct page, while allowing the user to type in an easy-to-remember URL. If the rule lives in a .htaccess file rather than the server config, the pattern most likely needs to be written without the leading slash (^\+(.*)), because the path is matched relative to the directory.
As far as mapping PuruVijay to 134566, Google+ requires the custom URLs to be unique, so there is a 1-1 correspondence between the handle PuruVijay and the user ID number 134566. Otherwise it would be impossible to look up a specific ID number given a custom URL. Your site would have to place a similar restriction if you decide to allow custom handles like this.
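The lookup side of this could be sketched as follows, here in Python with an in-memory SQLite database. The table name, column names and the user_id_for helper are all hypothetical; the point is only that a UNIQUE constraint on the handle is what keeps the handle-to-ID mapping 1-1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UNIQUE on handle enforces the one-handle-per-user restriction.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, handle TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO users (id, handle) VALUES (?, ?)", (134566, "PuruVijay"))


def user_id_for(handle):
    """Return the numeric ID for a handle, or None if nobody has claimed it."""
    row = conn.execute(
        "SELECT id FROM users WHERE handle = ?", (handle,)
    ).fetchone()
    return row[0] if row else None


print(user_id_for("PuruVijay"))  # 134566
print(user_id_for("Nobody"))     # None

# A second user cannot claim the same handle.
try:
    conn.execute("INSERT INTO users (id, handle) VALUES (?, ?)", (2, "PuruVijay"))
except sqlite3.IntegrityError:
    print("handle already taken")
```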

.htaccess rewriterule pattern Hash twitter

I want to catch a link containing #.
Consider:
RewriteEngine On
RewriteRule pattern target [flags]
The question here is the pattern portion, since \# isn't working. Is there any other way to read/catch a hash in the incoming link?
Just like twitter does: http://twitter/#chakku will redirect you to tweets containing #chakku.
The # and everything after it is the URL Fragment, it will never get sent to the server. Therefore, neither apache nor mod_rewrite will even know it exists.
Fragments are primarily used on the client/browser side, so you'll probably need something like javascript in order to trap it.
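As an illustration in Python, the standard library's URL parser shows the split: the fragment is a separate component and is not part of what a client puts on the request line. The split_for_request helper is a made-up name for this sketch.

```python
from urllib.parse import urlsplit


def split_for_request(url):
    """Return (request_target, fragment) for a URL.

    request_target is the path plus query string, which is what the server
    gets to see; the fragment stays on the client.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, parts.fragment


print(split_for_request("http://twitter/#chakku"))  # ('/', 'chakku')
```

So a rewrite rule on the server only ever has '/' to match against here; the 'chakku' part has to be read in the browser.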

How to setup a general 301 redirect that will ignore the ending url string

Is there a way to make a redirect that is based on the start of the URL? For example, I would like to redirect this URL and all others like it (the /57/UserId/25516 part always has different numbers).
http://www.example.com/site/ActivityFeed/MyProfile/tabid/57/UserId/25516/Default.aspx
The redirect should be based on this prefix (regardless of the following part of the URL string):
http://www.example.com/site/ActivityFeed/MyProfile/*will redirect no matter what else the string contains/
This would be amazingly helpful if someone knows the answer. Thanks a lot guys.
I'm having some trouble understanding where you want to redirect it to. The following should redirect to http://www.example.com/site/. I think it might be what you want:
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^site/ActivityFeed/MyProfile site/ [R=301,L,NC]
Try it out and let me know
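The reason the rest of the URL is ignored is that the pattern is anchored only at the start and has no $ at the end. A quick Python check of the same pattern (re.IGNORECASE standing in for NC, redirect_target being a made-up helper) shows this, along with one side effect worth knowing about:

```python
import re

# Anchored at the start only, so anything may follow the prefix.
PREFIX = re.compile(r"^site/ActivityFeed/MyProfile", re.IGNORECASE)


def redirect_target(path):
    """Return the redirect target for a path, or None if the rule does not apply."""
    return "/site/" if PREFIX.match(path) else None


print(redirect_target("site/ActivityFeed/MyProfile/tabid/57/UserId/25516/Default.aspx"))  # /site/
print(redirect_target("site/activityfeed/myprofile/tabid/9/UserId/1/Default.aspx"))       # /site/
print(redirect_target("site/ActivityFeed/Other"))                                         # None
# Side effect: a longer name that merely starts with MyProfile also matches.
print(redirect_target("site/ActivityFeed/MyProfileSettings"))                             # /site/
```

If that last case matters, ending the pattern with (/|$) would limit it to MyProfile itself and anything below it.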

Nginx fallback for file not found to different directory with same path

Short Version: Is there any easy way to automatically redirect a path like /templates/my_child_theme/main/sidebar/user_nav.html to /templates/default/main/sidebar/user_nav.html whenever a 404 is encountered?
Question: Is there an easy way to use something like try_files within nginx to change a filepath when not found to a different folder with the same file path at the end? I'm currently using a client-side framework (AngularJS) and dealing with checking for the existence of files can get fairly expensive as I am literally performing an AJAX call for each file and looking for a 404 before performing the logic to swap out the path in Javascript. I've seen similar solutions for fallback images but haven't gotten a good solution yet. Here's the gist of what I'm looking to do:
Look for file at /templates/$1/$2.
On HTTP 404 instead return /templates/default/$2.
This only really needs to happen in the /templates/ location for now.
It's possible that these files could be nested several layers deep so I need something like /templates/my_child_theme/main/sidebar/user_nav.html to redirect to /templates/default/main/sidebar/user_nav.html
Backstory:
I've been building a site in AngularJS that has a fairly complex templating setup. I am utilizing the awesome ui-router module from Angular UI for deeply nested views and have written a few custom functions to implement child themes similar to Magento's fallback system. Here's what currently happens:
AngularJS requests a template path using a getTemplate() function I wrote which first looks for the file at /templates/child_theme_name_here/filepath by performing an XMLHttpRequest, checking for a status code of 404 (file not found), and then either returning that path or (in the case of a 404) returning /templates/default/filepath instead.
This way I can override specific templates without needing to copy the entire theme each time, making development easier (we have 3 major corporate clients which will each have their own branded child theme) by not making me keep up with each change across multiple themes.
If there is a better way to do this within AngularJS I am open to that as well; it just seemed to me that Nginx would be the most logical place to perform such an action due to its low-level integration with the filesystem.
Solved.
Had to teach myself a bit on regular expressions, but finally got it working.
Here's what worked:
location ~* ^\/templates\/([^\\\/]+)(.*)$ {
try_files /templates/$1$2 /templates/default$2 =404;
}
Regex Explanation
~* means case-insensitive matching (not really regex, just nginx syntax)
^ means start of a string
\/ means match a forward slash (the backslash only escapes it)
templates means literally match the word templates
\/ means match a forward slash again
( means start capturing the following match as a group for later use
[^\\\/] means match anything that's not a backslash or forward slash
+ means the previous set of characters can be matched multiple times (i.e. keep matching anything that isn't a slash)
) means stop capturing characters for this group. We have now defined the string that represents the first folder after /templates/
(.*) means match any other character as many times as needed (match everything that isn't a line feed in other words)
$ means match the end of the string
try_files then tries each URL in order
/templates/$1$2 means try /templates/(everything in capture group 1 above, which holds the folder we captured)(then add everything from capture group 2, which holds the forward slash and anything after it until the end of the url)
/templates/default$2 is very similar, except instead of using the text from capture group 1 ($1, the folder name we matched) we use the text "default" and then add everything from the second capture group to the end like before
=404 means that if neither of those worked return a 404 error
I'm seeing a significant speed improvement by moving this fallback mechanism into the server versus all of the extraneous calls I was forced to do before on the client.
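For anyone who wants to check the fallback logic without touching a server, here is a rough Python model of it. It is not how nginx works internally; the resolve function is made up, and existing_files is a plain set standing in for the filesystem.

```python
import re

# Same shape as the location regex: first folder, then the rest of the path.
TEMPLATE_RE = re.compile(r"^/templates/([^/\\]+)(.*)$", re.IGNORECASE)


def resolve(uri, existing_files):
    """Return the path that would be served, or None for a 404."""
    m = TEMPLATE_RE.match(uri)
    if not m:
        return None
    theme, rest = m.group(1), m.group(2)
    # Try the requested theme first, then the default theme, like try_files.
    for candidate in (f"/templates/{theme}{rest}", f"/templates/default{rest}"):
        if candidate in existing_files:
            return candidate
    return None


files = {
    "/templates/default/main/sidebar/user_nav.html",
    "/templates/my_child_theme/main/header.html",
}
# Missing in the child theme, so the default copy is used.
print(resolve("/templates/my_child_theme/main/sidebar/user_nav.html", files))
# Present in the child theme, so it wins.
print(resolve("/templates/my_child_theme/main/header.html", files))
# Present in neither place.
print(resolve("/templates/my_child_theme/main/footer.html", files))
```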
