Be Google’s friend: Make your URLs canonical with .htaccess

July 18, 2008 by Gary Illyes  
Filed under .htaccess, Apache, Server Management

This subject is… is… well :|
Every second site on the net has at least one article about this subject. But to be honest, it’s good to have so many articles about this, in a way. At least people recognize they should use it. Or not.
So, what’s the fuss about URL canonicalization? One thing only: search engines and their hatred of duplicate content. If your website is reachable at both www.example.com and plain example.com, the search engines will index both. They think you duplicated your content to get more positions in the search results, so they penalize your domain. Weird. They should know it’s the same website, or at least the coders should teach them that www is the same as non-www. At least on well-configured servers.
So here Apache pops in with a solution for the issue: the mod_rewrite engine, again. You need mod_rewrite available in your Apache build and working correctly.
As always, here’s the code for those who just want to copy&paste and then the explanation for all the lines.


RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

That’s it. Placed in a .htaccess file, it redirects with status code [301: Moved Permanently] every request sent to www.example.com to example.com. Now let’s explain it line by line:

  1. We switch on the rewrite engine, telling Apache we want to work with mod_rewrite.
  2. This is the condition: apply the rule only if the requested hostname starts with www.example.com. The [NC] flag makes the match case-insensitive.
  3. This last line is the rule that runs when the condition matches the HTTP request. In our case it does a 301 redirect to the same path on the non-www version of the site.
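
If you want to see what the condition and the rule do together without touching a server, here is a minimal Python sketch that mimics the logic. It is only a model under my own assumptions, not how mod_rewrite is implemented; the function name canonical_redirect is made up for illustration.

```python
import re

# Same pattern as the RewriteCond; the [NC] flag becomes re.IGNORECASE.
WWW_HOST = re.compile(r"^www\.example\.com", re.IGNORECASE)


def canonical_redirect(host, path):
    """Return (status, location) for a www request, or None if no redirect applies.

    `path` is the request path without its leading slash, which is how a
    RewriteRule inside a .htaccess file sees it.
    """
    if WWW_HOST.search(host):
        # Equivalent of the [R=301] redirect to the non-www host.
        return 301, "http://example.com/" + path
    return None


if __name__ == "__main__":
    print(canonical_redirect("WWW.Example.com", "blog/post.html"))
    print(canonical_redirect("example.com", "blog/post.html"))
```

The first call yields a redirect to the non-www URL, the second yields None because the host is already canonical.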

That’s all. Search engines are now happy, world saved again.
As always, if something is unclear, drop a comment and I’ll answer as soon as possible.

Hotlink Protection using .htaccess made easy

July 18, 2008 by Gary Illyes  
Filed under .htaccess, Apache, Server Management

This is one of the tricks most used by webmasters who care about their allocated bandwidth. The code that controls on which domains your images can show up is very short: 4 lines, that is.
As always, I provide the full code, then below it I explain everything.
To use this code, you have to have an Apache web-server with mod_rewrite correctly installed.
So, let’s see the code for those who don’t want the explanations:


RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?example\.com [NC]
RewriteRule \.(jpg|jpeg|png|gif)$ - [NC,F,L]

Now some explanations:
The first line,

RewriteEngine on

simply tells Apache that we will do something with mod_rewrite, so turn it on. This line is optional if you already turned it on earlier in the same .htaccess file where you put the above code.
The second line,

RewriteCond %{HTTP_REFERER} !^$

is trickier. Basically, if there is no referrer, let the image be displayed. I guess this needs a bit of explanation. When you navigate from one site to another, the browser normally sends a “Referer” header (yes, the header name is spelled with a single r) to the host you are accessing. So, for example, if you are currently on http://www.google.com and you follow a link to http://yahoo.com, the browser will send Yahoo the following: “Referer: http://www.google.com/”. This header is what we use in our .htaccess to prevent hotlinking, BUT! Some antivirus programs and firewalls clear this header on the client’s side, so there is no referrer at all and we can’t tell whether the user is browsing our site or viewing our image hotlinked on another site. So we just let the image be displayed if there is no referrer.

RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?example\.com [NC]

If the referrer is our own domain, display the image: the condition is negated with !, so the rule only fires for referrers that do not match. The http(s)?://(www\.)? prefix makes the condition cover HTTP and HTTPS, and both our www and non-www hostnames.

And the last step is to tell Apache which files to protect:

RewriteRule \.(jpg|jpeg|png|gif)$ - [NC,F,L]

In the above case the jpg, jpeg, png and gif images will be protected: matching requests get a 403 Forbidden response (the F flag). If you want to protect your Flash files as well, add swf to the list and your movies will not display when embedded on remote sites.
On our domain the PHP files are protected as well, because the Imagick examples are generated by PHP scripts.
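
To make the decision logic of the two conditions and the rule easier to follow, here is a small Python sketch of it. This is just a model, assuming the patterns shown above with the dot in the domain escaped; the helper name is_blocked and the sample URLs are hypothetical.

```python
import re

# Patterns modeled on the RewriteCond and RewriteRule lines; [NC] becomes re.IGNORECASE.
OWN_SITE = re.compile(r"^http(s)?://(www\.)?example\.com", re.IGNORECASE)
PROTECTED = re.compile(r"\.(jpg|jpeg|png|gif)$", re.IGNORECASE)


def is_blocked(referer, path):
    """Return True when the request would be answered with 403 Forbidden."""
    if not PROTECTED.search(path):
        return False  # rule pattern does not match: not a protected file
    if referer == "":
        return False  # first condition fails: an empty referrer is allowed
    if OWN_SITE.search(referer):
        return False  # second condition fails: the request comes from our own pages
    return True       # both conditions hold: forbid the request


if __name__ == "__main__":
    samples = [
        ("", "images/cat.jpg"),
        ("http://www.example.com/gallery.html", "images/cat.jpg"),
        ("http://other.example.net/forum", "images/cat.jpg"),
        ("http://other.example.net/forum", "index.html"),
    ]
    for referer, path in samples:
        print(repr(referer), path, "blocked" if is_blocked(referer, path) else "allowed")
```

Only the third sample, an image requested from a page on a foreign site, ends up blocked.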

I hope the above example was somewhat useful. If you need help with it, just describe your problem below and I’ll answer as soon as possible.

Use IPTables to reroute or just annoy your visitors

July 15, 2008 by Gary Illyes  
Filed under Linux, Server Management

:D

Yeah, I know, I’m an idiot.

So, what I wanted to do was annoy one of my friends in a way that he would never notice I did something. He is a frequent visitor of one of the sites I manage, and as I couldn’t find a better way, I decided to do something with the site. The site itself I couldn’t alter, as it’s too popular. But I knew his IP from the logs, so I decided to redirect him each time he tries to access the site.

Since PHP had the header() function disabled, redirecting him via IP matching in a script didn’t work, so I had to use something else. Meta refresh is no good, and neither is JavaScript, as he has it disabled all the time.

IPTables! Godly touch …


iptables -t nat -A PREROUTING -s HIS.IP.ADDRESS.01/255.255.255.0 -p tcp -j DNAT --to-destination 64.233.167.99

Every request from him to the server will be forwarded to Google search. Nice. The problem was that I had to listen to his theory of how Google bought his favorite site. :|
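
One thing worth knowing about the -s argument: with a 255.255.255.0 mask the rule matches every address in the same /24 block, not just one machine. The Python sketch below shows that matching with the standard ipaddress module. The function name matches_source and the 203.0.113.x addresses (a range reserved for documentation) are placeholders of mine, not the real values.

```python
import ipaddress


def matches_source(rule_source, client_ip):
    """Return True if client_ip falls inside the address/netmask given as rule_source."""
    # strict=False lets us pass a host address with a mask, as in the iptables rule.
    network = ipaddress.ip_network(rule_source, strict=False)
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    source = "203.0.113.1/255.255.255.0"  # placeholder for HIS.IP.ADDRESS.01/255.255.255.0
    for ip in ("203.0.113.1", "203.0.113.77", "203.0.114.1"):
        print(ip, matches_source(source, ip))
```

So if you only want to catch a single visitor, a 255.255.255.255 mask would be the narrower choice.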

Why is this more effective than any other script-based method?
Well, it’s obvious why it is better than the client-side methods: it can’t be overridden. As for why it is better than server-side code, a valid reason would be that you simply cannot use header-setting methods such as header().

The website which made me WOW!

July 13, 2008 by Gary Illyes  
Filed under Bulk

My partner (an excellent web developer) had/has a client who asked him to develop a layout that would make him go WOW. He couldn’t; nothing he showed him was enough.

The other day I was wandering around the net without any purpose, just clicking links and buttons, when I arrived at a page which, well, if I call it the most interesting page, I haven’t said enough.
It’s pure Flash, that’s its only con, but otherwise… WOW.

Check it out yourself, then let me know what you think about it: The WOW site
That design would probably have WOWed my partner’s client.

Filter your variables easily but like a pro!

July 13, 2008 by Gary Illyes  
Filed under Development, PHP

How painful input validation is! Think about all the possible threats, combinations of threats… think with the users’ mind. It’s a pain. And whoever can write scripts that filter user input effectively is usually considered a pro, without hesitation. Just because it’s hard to do.
Take the following scenario: you have a text field which accepts a user comment. You don’t want to let the user use HTML in the comment box, and you definitely don’t want to allow JavaScript in the comment.
So how do you sanitize the string you get? It’s a long and hard way. You would use regular expressions to exclude some entities, then some built-in PHP functions to encode the rest or, even better, to strip the tags.
I’ll show you an easier way:

filter_var("<script>alert('Hello');</script>", FILTER_SANITIZE_STRING);

Done, the <script> tags will be stripped, so the string will arrive in the database as alert(&#39;Hello&#39;); (by default the filter also encodes the quotes).
There are many available filters, just to mention the most interesting ones:

  • FILTER_SANITIZE_EMAIL: sanitizes an email address by stripping characters that do not conform to the applicable RFC (link)
  • FILTER_SANITIZE_URL: sanitizes a URL by stripping characters that are not allowed by the applicable RFC (link)
  • FILTER_VALIDATE_IP: checks whether the input is a valid IP address
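
filter_var() itself is PHP-only, but the idea behind a validating filter such as FILTER_VALIDATE_IP is easy to sketch elsewhere. Below is a rough Python analog built on the standard ipaddress module. The helper validate_ip is hypothetical, and I am not claiming it accepts exactly the same inputs as the PHP filter; it only mirrors the "return the value or false" behavior.

```python
import ipaddress


def validate_ip(value):
    """Return the input unchanged if it parses as an IPv4/IPv6 address, else False."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return value


if __name__ == "__main__":
    for candidate in ("192.168.0.1", "::1", "999.1.1.1", "not an ip"):
        print(candidate, "->", validate_ip(candidate))
```

The point is the same as in PHP: let a well-tested library routine decide what is valid instead of hand-writing a regular expression.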

I recommend using the filter_var() function and its filters for two obvious reasons: it saves you a lot of headaches and it saves you time. Even though filter_var() was only introduced in PHP 5.2, the function is extremely useful and gives you one more reason to upgrade to PHP 5 ;)

For a complete reference please check php.net.
