PingBin

where all the used ping's go…

HowTo: PHP MemCache Guide


What is MemCache?

Some of you have probably just stumbled upon this post without actually knowing what MemCache does, so here is a bit of technical context before I dive into the implementation methods, problems and of course solutions.

In its most basic form, MemCache is a normal program that runs on top of your operating system. Typically you run it on something like your web server as a separate service, just as you would Apache, NTP, MySQL, etc.

The service has a very simple aim: you provide it with a key and some corresponding data, this is saved in memory by the MemCache process, and you can access it again later. You benefit in two ways: firstly because the data is held in RAM, which is much faster than other storage options such as disk, and secondly because the data is typically already in a usable format and therefore requires little extra processing by your application, saving time and physical resources.

How do I use it?

Well, I did start off by writing an explanation, but here is an example instead (with some inline comments):

// We will start by making a new MemCache instance, and assign it to the variable $mc
$mc = new Memcache;

// Using the $mc instance we will connect to our server, normally this will be on port 11211 (localhost)
$mc->connect('127.0.0.1', 11211) or die ("Could not connect");

// Memcache will save data to a specific unique key, you can set this to almost anything
$thekey = "PostCount";

// Now we are going to define the data, this will be saved to our $thekey
$thedata = "12340";

// Now to actually pass the key and data to memcache. Note that 'false' means no flags (such as compression) are set, and '0' means the entry will not expire after a specified amount of time.
$mc->set($thekey, $thedata, false, 0);

// Using $thekey we will get the memcache data back, this will be saved into our variable $result
$result = $mc->get($thekey);

echo $result; // prints 12340

Problems?

The majority of today’s applications are built on top of relational databases such as MySQL, so adding a basic key/data technology on top of them for caching can be hell for programmers. With key-based data you have to know the exact key, otherwise you will get nothing returned. MySQL, however, lets you search the saved data by almost any method you can think of.

Another problem is building a way to define keys that’s going to be accurate, scalable and easy to implement. For example, I could save web user information to memcache using a key based on the numeric user ID (e.g. Key = 412, UserName = Bob). However, I would then need to go through all of my code and perform a memcache lookup before every SQL lookup, and that’s going to be a headache. Plus some things, such as the login process, might want to convert ‘bob’ into the user ID ‘412’. That’s not possible, because ‘bob’ is my memcache data, which you can’t search by; you can only request data by using the key.

Solution, Hash Key

The solution is fairly simple to implement: you leave almost all of your code as it is and go to your database class (I am assuming that you have one, as most developers do). Below is an example extract of what yours could look like, before we start changing it.

// Function: pass some SQL and it will return the result
// (Insecure, just shown as an example)
function sql_search($sql){

  // Connect to my database
  mysql_connect("localhost", "root", "password") or die(mysql_error());
  mysql_select_db("testdb") or die(mysql_error());

  // Run the SQL query
  $result = mysql_query($sql) or die(mysql_error());
  return mysql_fetch_array( $result );

}

Solution: you accept the $sql string, run it through the md5() hash function, and check whether memcache holds a result under that hash as the key. If it does, skip the SQL; if it doesn’t, run the SQL and then add the result to memcache.

// Function: pass some SQL and it will return the result
// (Insecure, just shown as an example)
function sql_search($sql){

  // Create the hash key
  $thekey = md5($sql);

  // Connect to memcache server
  $mc = new Memcache;
  $mc->connect('127.0.0.1', 11211) or die ("Could not connect");
  $result = $mc->get($thekey);

  // If there was a memcache hit, return the cache result
  if(!empty($result)) return $result;

  // Connect to my database, there was no cache hit
  mysql_connect("localhost", "root", "password") or die(mysql_error());
  mysql_select_db("testdb") or die(mysql_error());

  // Run the SQL query and fetch the row
  $result = mysql_query($sql) or die(mysql_error());
  $row = mysql_fetch_array( $result );

  // Save the fetched row to memcache (the raw query result is a resource and can't be cached)
  $mc->set($thekey, $row, false, 0);

  return $row;

}
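
To see the hash key idea working without a database or a memcached server, here is a rough, self-contained sketch. The ArrayCache class is a hypothetical in-memory stand-in for Memcache (it only mimics get() and set()), and cached_search() and $fakeQuery are names invented for this illustration, not part of any library.

```php
<?php
// Hypothetical in-memory stand-in for a Memcache connection, so the
// sketch runs without a memcached server. Only get() and set() are mimicked.
class ArrayCache {
    private $items = array();

    // Returns the stored value, or false on a miss (as Memcache::get does).
    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value) {
        $this->items[$key] = $value;
        return true;
    }
}

// Look up $sql by its md5() hash; only call $runQuery on a cache miss.
// $runQuery is an assumed callable that returns the fetched row.
function cached_search($cache, $sql, $runQuery) {
    $key = md5($sql);
    $cached = $cache->get($key);
    if ($cached !== false) {
        return $cached;
    }
    $row = $runQuery($sql);
    $cache->set($key, $row);
    return $row;
}

$cache = new ArrayCache();
$queryCount = 0;

// Pretend database call that counts how often it is actually run.
$fakeQuery = function ($sql) use (&$queryCount) {
    $queryCount++;
    return array('UserName' => 'Bob');
};

$sql = "SELECT UserName FROM users WHERE id = 412";
$first  = cached_search($cache, $sql, $fakeQuery); // miss: runs the query
$second = cached_search($cache, $sql, $fakeQuery); // hit: served from cache

echo $queryCount; // prints 1
```

Because the key is derived from the full SQL string, two queries that differ by even one character get separate cache entries.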

Overview

As you can see, it’s a simple solution. No matter what application you’re using, if there is a SQL backend and a SQL class/function, memcache can be added with very little work. Here are a few last tips:

  • Build a variable into your SQL function to disable/enable the cache; I have a few applications that only use it at times of high server load.
  • Consider building a function to clear the cache for some of your variables; for example, if you add a blog comment you might want to clear the blog comment cache.
  • Sometimes you can use memcache on its own if the data isn’t valuable. I have done this with stats information before, and just poll/clear memcache every few hours.
  • Don’t forget that cached data will be deleted after some time, so make sure your application doesn’t error if this happens.
  • Build some checks into the returned memcache data; don’t just assume that it has worked.
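
A couple of these tips can be sketched in a few lines. This is only an illustration under assumptions: ArrayCache is a hypothetical in-memory stand-in for Memcache (mimicking get(), set() and delete()), and the $useCache parameter, clear_cached_search() and the comments query are made up for the example.

```php
<?php
// Hypothetical in-memory stand-in for Memcache so the sketch runs without
// a server; it mimics only get(), set() and delete().
class ArrayCache {
    private $items = array();

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value) {
        $this->items[$key] = $value;
        return true;
    }

    public function delete($key) {
        unset($this->items[$key]);
        return true;
    }
}

// $useCache is the on/off switch from the first tip.
function cached_search($cache, $sql, $runQuery, $useCache = true) {
    if (!$useCache) {
        return $runQuery($sql);
    }
    $key = md5($sql);
    $cached = $cache->get($key);
    // Check the cached data before trusting it: this sketch expects an array.
    if (is_array($cached)) {
        return $cached;
    }
    $rows = $runQuery($sql);
    $cache->set($key, $rows);
    return $rows;
}

// Drop the cached result for one query, e.g. after a new comment is added.
function clear_cached_search($cache, $sql) {
    return $cache->delete(md5($sql));
}

$cache = new ArrayCache();
$comments = array('First!');
$loadComments = function ($sql) use (&$comments) {
    return $comments; // pretend database read
};
$sql = "SELECT body FROM comments WHERE post_id = 1";

$before = cached_search($cache, $sql, $loadComments); // miss, result is cached
$comments[] = 'Nice post';                            // a new comment arrives
$stale  = cached_search($cache, $sql, $loadComments); // hit, still one comment
clear_cached_search($cache, $sql);
$fresh  = cached_search($cache, $sql, $loadComments); // miss again, two comments

echo count($stale) . ' ' . count($fresh); // prints "1 2"
```

The stale read in the middle is the reason a clear function is worth having: without it the new comment would stay invisible until the entry expired.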

Written by Tom

January 22nd, 2012 at 2:18 pm

Posted in Hosting,PHP

HowTo: Rsync to a Backup FTP server

My hosting company is great: with all of their dedicated servers, even the low cost ones, users get 100GB of free FTP backup space. At first I didn’t really see the point in this, not because backups aren’t important but because I like to use rsync for my backups, and it has no support for FTP connections.

However, there is a neat trick you can use to get FTP support with rsync. It’s fairly simple really: you install some extra software that lets you mount an FTP service just as if it was a hard drive, USB pen or network share. Then you’re able to back up straight onto this mount with rsync, without worrying about FTP support.

So you first need to get this FTP mounting software installed.

apt-get install curlftpfs

Now create a folder somewhere on your machine; this is where the FTP server is going to be mounted. Its position is completely up to you, this is just an example.

mkdir /mnt/backup_server

The configuration for this is fairly simple and is done in the normal fstab file, which is great because you can then use the mount and umount commands just like any other drive on your system.

#Open and edit your fstab
nano -w /etc/fstab

#At the end of the file enter the following (change to match your setup)
#curlftpfs#user:pass@ip-address /local/directory fuse rw,allow_other,uid=user-id 0 0

curlftpfs#pingbin:1234@10.0.0.1 /mnt/backup_server fuse rw,allow_other,uid=1000 0 0

#Save and close the file, then mount your directory

mount /mnt/backup_server

Cool, you should be ready to go. I would suggest doing the following just to ensure it’s working.

#Go in the mounted directory, check nothing's there.
cd /mnt/backup_server
ls -la

#Create a file
touch ./test.txt

#Now FTP to the server like normal and check if it's there

root@fr:/mnt/ftpbackup# ftp 10.0.0.1
Connected to ftpback
220-Welcome to Pure-FTPd.
Name (ftpback):
OK. Password required
Password:
ftp> ls
200 PORT command successful
-rw-r--r-- 1 100 100 0 Jan 1 17:54 test.txt

As you can see it worked, so you should be a bit more confident that your backups are going to save correctly.

Finally I enter this into my crontab to back up daily, weekly and monthly

crontab -e

#Daily at 01:01
1 1 * * * rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/daily
#Weekly, Sundays at 02:02
2 2 * * 0 rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/weekly
#Monthly, on the 1st at 03:03
3 3 1 * * rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/monthly

Written by Tom

January 1st, 2012 at 5:25 pm

Posted in Servers

Nginx PHP REMOTE_ADDR with Proxy (Varnish Cache)

Typically your public web server will consist of two principal pieces of software: the web server, such as Nginx (Apache is more commonly used), and some server side scripting technology such as PHP (in my case php-fpm).

However, many people are now installing either cache engines (such as Varnish) or load balancers in front of their web server to increase performance and possibly availability. This can play havoc with server side scripts that use the client information.

Your HTTP requests are now handled by the cache (Varnish), so PHP variables such as $_SERVER['REMOTE_ADDR'] will no longer hold the client IP. Instead they will hold your Varnish cache IP, which is 127.0.0.1 if it’s on the same physical server.

This poses a problem because many scripts use the client IP address, whether for spam protection, session tracking/security, or just statistics logging.

Some may say you should use another variable in your script, such as $_SERVER['HTTP_X_FORWARDED_FOR'], so that your script knows both the client IP and the proxy IP. While the theory might have some benefits, most people don’t write their own code, and chances are it’s configured to use REMOTE_ADDR.

So the fix is to have Varnish assign the real user’s IP address to a request header called “X-Forwarded-For”, as below:

#Backup the configuration
cp /etc/varnish/default.vcl /etc/varnish/default.vcl.backup

#Open your varnish configuration
nano -w /etc/varnish/default.vcl

#Add the following lines to sub vcl_recv
sub vcl_recv {
        remove req.http.X-Forwarded-For;
        #client.ip is the address Varnish received the request from
        set    req.http.X-Forwarded-For = client.ip;
}

#Save and exit, then restart the varnish service
service varnish restart

So your cache engine is now passing the IP back to Nginx; now we just need to configure Nginx to capture this header and replace REMOTE_ADDR with it. To do this you need an Nginx module called http_realip_module, so run the following command and look out for “--with-http_realip_module”:

nginx -V
nginx version: nginx/0.7.67
TLS SNI support enabled
configure arguments: --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-log-path=/var/log/nginx/access.log --http-proxy-temp-path=/var/lib/nginx/proxy --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --with-debug --with-http_dav_module --with-http_flv_module --with-http_geoip_module --with-http_gzip_static_module --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --with-http_sub_module --with-ipv6 --with-mail --with-mail_ssl_module --add-module=/tmp/buildd/nginx-0.7.67/modules/nginx-upstream-fair

As you can see, I have it installed already, as you hopefully will too. Now I just need to use the following configuration to perform the variable replacement.

#Create a backup of the configuration
cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.backup

#Edit the configuration file
nano -w /etc/nginx/nginx.conf

#Add the following lines to your http{} block
set_real_ip_from   127.0.0.1;
real_ip_header     X-Forwarded-For;

#Save, close and restart the nginx service
service nginx restart

OK, just to explain what that configuration did: “set_real_ip_from” is the IP address of your cache/load balancer, basically the IP you’re currently seeing all of the proxied requests come from. In my case this was 127.0.0.1, as the cache and Nginx are on the same server.

And “real_ip_header” defines which header actually holds the correct value for “REMOTE_ADDR”. As you will see, this matches the header we set up Varnish to pass back.

You should be able to check that this works using the test script below.

echo $_SERVER["REMOTE_ADDR"];
echo "\n" . time() . "\n";
echo $_SERVER["HTTP_X_FORWARDED_FOR"];

Echoing the time just makes sure you’re not getting a Varnish cache hit.
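
As a rough illustration of what “set_real_ip_from” and “real_ip_header” do together, the same rule can be sketched in PHP. This is an approximation for explanation only, not how Nginx implements it: real_client_ip() and its parameters are hypothetical names, and the sketch simply takes the last address in the header when the request comes from a trusted proxy.

```php
<?php
// Approximate the realip rule: only trust X-Forwarded-For when the request
// arrived from a known proxy address, otherwise keep REMOTE_ADDR as it is.
// real_client_ip() is a hypothetical helper written for this illustration.
function real_client_ip(array $server, array $trustedProxies) {
    $remote = $server['REMOTE_ADDR'];

    // Request did not come through a trusted proxy, or no header was sent.
    if (!in_array($remote, $trustedProxies, true)
        || empty($server['HTTP_X_FORWARDED_FOR'])) {
        return $remote;
    }

    // The header may hold a comma separated list; use the last entry.
    $parts = array_map('trim', explode(',', $server['HTTP_X_FORWARDED_FOR']));
    $candidate = end($parts);

    // Fall back to REMOTE_ADDR if the header value is not a valid IP.
    return filter_var($candidate, FILTER_VALIDATE_IP) !== false ? $candidate : $remote;
}

// Request that came through the local cache (example addresses only).
$viaProxy = array('REMOTE_ADDR' => '127.0.0.1', 'HTTP_X_FORWARDED_FOR' => '203.0.113.7');
// Request that hit the web server directly but sent a forged header.
$direct = array('REMOTE_ADDR' => '198.51.100.9', 'HTTP_X_FORWARDED_FOR' => '203.0.113.7');

$ipViaProxy = real_client_ip($viaProxy, array('127.0.0.1'));
$ipDirect   = real_client_ip($direct, array('127.0.0.1'));

echo $ipViaProxy . "\n"; // prints 203.0.113.7
echo $ipDirect . "\n";   // prints 198.51.100.9
```

The second case shows why the trusted address matters: anyone can send an X-Forwarded-For header, so it should only be believed when it comes from your own cache.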

Hope that helps, please leave any questions you still have below…

Written by Tom

January 1st, 2012 at 4:01 pm

Posted in Servers

iptables blocking outbound traceroute

I have seen a few people having problems getting outbound traceroutes to work, normally just after they have set up a new rule base for iptables, a Linux software firewall that’s installed on most systems.

Basically they have allowed the traceroute in the outbound rules, normally by permitting all outbound traffic. I wouldn’t recommend that, but that’s a post for another day.

However, they are blocking the inbound responses, meaning that traceroute never gets any of the data back from the hops.

This fault is actually a sign of a much larger issue, as many services work in the same way: you have forgotten to track ‘related and established’ sessions and permit their inbound traffic.

The command to fix this is below, assuming your inbound chain is called ‘INPUT’ and your internet interface is eth0.

iptables -A INPUT -m state -i eth0 --state ESTABLISHED,RELATED -j ACCEPT

 

Written by Tom

January 1st, 2012 at 2:17 pm

Posted in Servers

My 2011 insights.

I failed in my attempt at writing an interesting first paragraph, so here it is. It’s New Year’s Day at the moment and here is my 2011 year in review, aka what I learnt in 2011, mostly in relation to blogging and websites.

Write what you know and do.

My job is a technical one. I fundamentally work in telecommunication/ISP networks, however I also do a lot of work with servers (at least that was true in my last role). We all get to that stage in troubleshooting where you give something a quick ‘Google’ because you’re either out of ideas or just want to confirm a theory, before you possibly break it even further. So my aim was simple: over 2 months I would make note of my searches, document the solutions to problems and post them in blog form. I am no SEO expert, but the thinking was that other people must be having the same problems.

In short, it worked; my most visited page was one of the first that I wrote with this idea. This blog now gets over 100 people on most days, and it’s had a few peaks over 500, which I don’t think is too bad. Another advantage is that you can look back when you’re faced with the same problems again, as I am sure you will be :)

Create rules and follow them

Another problem I hit was actually finishing a post. I either got bored or just didn’t think the quality was good enough, and the post then got deleted or left in the WordPress drafts section. So there were two simple rules:

Use WordPress - OK, so a bit of background: I would write a post in something like Microsoft Word and then copy/paste it over to WordPress when it was finished. I am not really sure why I did this, but it was certainly a bad habit. I would just close Word, lose all the content and be no further in progressing my blog. By writing everything in WordPress, it’s always in the draft section, staring at you to do something about it.

No Drafts - So the second rule: you’re never allowed to create a new post if there is a draft in the queue, ‘first come, first served’. Finish the post and get the content out there for people to see; you will probably get some search traffic from it! Plus it gives you some motivation to write about more stuff when the drafts bin is clear.

Projects cover costs.

If you enjoy building stuff then start a project and build something useful (route.im and whatportis.com are examples of mine). Once you have created it, post the new service/idea on a few social sites like Hacker News, just make sure you put all your effort into creating it!

Throughout 2011 my server bills have all been paid for by the first 3 days of 4 projects that I launched. Cool, isn’t it? You’re left with something useful that can continue to grow, with no costs or worries about costs for the remaining year!

Finally, even if you’re not ready to ‘show off’ the service to a forum/social website, build it and keep it sitting there to improve on later. Chances are some people will find it (using search engines etc.) and start to use the service. You can improve it when you’re ready, and you will already have some decent ‘testing’ done.

Links aren’t bouncy.

I am the first to acknowledge that this blog’s structure isn’t great, however I blame that on not having enough content in all the areas yet, something that I will hopefully address in 2012. But you can certainly help out your visitors: in the blog post text, add some links to other relevant posts that you have written. Chances are they will be interested in the content and click through, therefore browsing instead of bouncing.

Written by Tom

January 1st, 2012 at 1:17 pm

Posted in General & News