Posts from ‘Servers’

Jan
22


What is MemCache?

Some of you have probably just stumbled across this post without actually knowing what MemCache does, so here is a bit of technical context before I dive into the implementation methods, problems and, of course, solutions.

In its most basic form MemCache is a normal program that runs on top of your operating system. You typically run it on something like your web server as a separate service, just like you would Apache, NTP, MySQL and so on.

The service has a very simple aim: you provide it with a key and some corresponding data, the MemCache process saves it in memory, and you can fetch it again later. The benefit comes in two ways: firstly, the data is held in RAM, which is much faster than other storage options; secondly, the data is typically already in a usable format and therefore requires little extra processing by your application, saving time and physical resources.

How do I use it?

Well, I did start off by writing an explanation, however here is an example instead (with some inline comments):

// We will start by making a new MemCache instance, and assign it to the variable $mc
$mc = new Memcache;

// Using the $mc instance we will connect to our server, normally this will be on port 11211 (localhost)
$mc->connect('127.0.0.1', 11211) or die ("Could not connect");

// Memcache will save data to a specific unique key, you can set this to almost anything
$thekey = "PostCount";

// Now we are going to define the data, this will be saved to our $thekey
$thedata = "12340";

// Now to actually pass the key and data to memcache. 'false' means no compression, and '0' means the item never expires.
$mc->set($thekey, $thedata, false, 0);

// Using $thekey we will get the memcache data back, this will be saved into our variable $result
$result = $mc->get($thekey);

echo $result; //will return 12340

Problems?

The majority of today's applications are built on top of relational databases such as MySQL, so adding a basic key/data technology on top of them for caching can be hell for programmers. With key-based data you have to know the exact key, otherwise you will get nothing returned; MySQL, however, allows you to search the saved data by almost any method you can think of.

Another problem is building a scheme for defining keys that is accurate, scalable and easy to implement. For example, I could save web-based user information to memcache using a key based on the numeric user ID (e.g. Key = 412, UserName = Bob), but I would then need to go through all of my code and perform a memcache lookup before every SQL lookup, which is going to be a headache. Plus, some things such as the login process might want to convert 'bob' into the user ID '412'; that's not possible, as 'bob' is my memcache data, which you can't search by. You can only request data by using the key.

Solution: Hash Key

The solution is fairly simple to implement: you leave almost all of your code as it is and go to your database class (I am assuming that you have one; most developers do). Below is an example extract of what yours could look like before we start changing it.

// Function: pass some SQL and it will return the result
// (Insecure, just shown as an example)
function sql_search($sql){

  // Connect to my database
  mysql_connect("localhost", "root", "password") or die(mysql_error());
  mysql_select_db("testdb") or die(mysql_error());

  // Run the SQL query
  $result = mysql_query($sql) or die(mysql_error());
  return mysql_fetch_array( $result );

}

Solution: you accept the $sql string, run it through the md5() hash function, and check whether there is a memcache result using that hash as your key. If there is, skip the SQL; if there isn't, run the SQL and then add the result to memcache.

// Function: pass some SQL and it will return the result
// (Insecure, just shown as an example)
function sql_search($sql){

  // Create the hash key
  $thekey = md5($sql);

  // Connect to memcache server
  $mc = new Memcache;
  $mc->connect('127.0.0.1', 11211) or die ("Could not connect");
  $result = $mc->get($thekey);

  // If there was a memcache hit, return the cache result
  if(!empty($result)) return $result;

  // Connect to my database, there was no cache hit
  mysql_connect("localhost", "root", "password") or die(mysql_error());
  mysql_select_db("testdb") or die(mysql_error());

  // Run the SQL query and fetch the result row, there was no cache hit
  $result = mysql_query($sql) or die(mysql_error());
  $row = mysql_fetch_array( $result );

  // Save the fetched row to memcache (the raw mysql result resource can't be cached)
  $mc->set($thekey, $row, false, 0);

  return $row;

}

Overview

As you can see it's a simple solution: no matter what application you're using, if there is a SQL backend and a SQL class/function, memcache can be added in with very minimal work. Here are a few last tips:

  • Build a variable into your SQL function to enable/disable the cache; I have a few applications that only use it at times of high server load.
  • Consider building a function to clear the cache for specific keys; for example, if you add a blog comment you might want to clear the blog comment cache (see the sketch after this list).
  • Sometimes just use memcache alone if the data isn't valuable; I have done this with stats information before, and just poll/clear memcache every few hours.
  • Don't forget cached data can expire or be evicted over time, so make sure your application doesn't error if this happens.
  • Build some checks into returned memcache data; don't just assume that it has worked.
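
To illustrate the cache-clearing idea, here is a minimal sketch assuming the same md5-of-SQL key scheme used above; clear_query_cache() is just a hypothetical helper name, not part of any existing library:

// Drop the cached copy of one query so the next sql_search() call hits MySQL again
function clear_query_cache($sql){
  $mc = new Memcache;
  $mc->connect('127.0.0.1', 11211) or die ("Could not connect");
  $mc->delete(md5($sql));
}

// Example: a new blog comment was just inserted, so clear the stale comment query
clear_query_cache("SELECT * FROM comments WHERE post_id = 42");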
Jan
01

My hosting company is great: with all of their dedicated servers, even the low cost ones, users get 100GB of free FTP backup space. At first I didn't really see the point in this, not because backups aren't important but because I like to use rsync for my backups, and it has no support for FTP connections.

However, there is a neat trick to get FTP support with rsync, and it's fairly simple really: you install some extra software that lets you mount an FTP server just as if it were a hard drive, USB pen or network share. Then you're able to back up straight onto this mount with rsync, without worrying about FTP support at all.

So first you need to get this FTP mounting software installed.

apt-get install curlftpfs

Now create a folder somewhere on your machine; this is where the FTP server is going to be mounted. Its location is completely up to you, this is just an example:

mkdir /mnt/backup_server

The configuration for this is fairly simple and is done in the normal fstab file, which is great because you can then use the mount and umount commands just like with any other drive on your system.

#Open and edit your fstab
nano -w /etc/fstab

#At the end of the file enter the following (change to match your setup)
#curlftpfs#user:pass@ip-address /local/directory fuse rw,allow_other,uid=user-id 0 0

curlftpfs#pingbin:1234@10.0.0.1 /mnt/backup_server fuse rw,allow_other,uid=1000 0 0

#Save and close the file, then mount your directory

mount /mnt/backup_server

Cool, you should be ready to go. I would suggest doing the following just to ensure it's working:

#Go into the mounted directory, check nothing's there.
cd /mnt/backup_server
ls -la

#Create a file
touch ./test.txt

#Now FTP to the server like normal and check if it's there

root@fr:/mnt/ftpbackup# ftp 10.0.0.1
Connected to ftpback
220-Welcome to Pure-FTPd.
Name (ftpback):
OK. Password required
Password:
ftp> ls
200 PORT command successful
-rw-r--r-- 1 100 100 0 Jan 1 17:54 test.txt

As you can see it worked, so you can be a bit more confident that your backups are going to be saved correctly.

Finally, I enter the following into my crontab to back up daily, weekly and monthly:

crontab -e

#Daily at 01:01
1 1 * * * rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/daily
#Weekly on Sunday at 02:02
2 2 * * 0 rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/weekly
#Monthly on the 1st at 03:03
3 3 1 * * rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/monthly
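
One extra safety net worth adding (my own suggestion, not part of the original setup): if the FTP mount ever drops, rsync will silently write into the empty local directory rather than the backup space. Wrapping each job in a mountpoint check avoids that, for example:

#Only run the backup when /mnt/backup_server is actually mounted
1 1 * * * mountpoint -q /mnt/backup_server && rsync -asvz --no-owner --no-group /webserver /mnt/backup_server/daily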
Jan
01

Typically your public web server will consist of two principal pieces of software: the web server itself, such as Nginx (Apache is more commonly used), and some server-side scripting technology such as PHP (in my case php-fpm).

However, many people are now installing cache engines (such as Varnish) or load balancers in front of their web server to increase performance and possibly availability, which can play havoc with server-side scripts that use client information.

Your HTTP requests are now handled by the cache (Varnish), so PHP variables such as $_SERVER['REMOTE_ADDR'] will no longer hold the client IP; instead they will hold your Varnish cache IP, 127.0.0.1 if it's on the same physical server.

This poses a problem because many scripts use the client IP address, whether for spam protection, session tracking/security, or just statistics logging.

Some may say you should use another variable in your script, such as $_SERVER['HTTP_X_FORWARDED_FOR']; that way your script can know the client IP and also the proxy IP. While the theory has some benefits, most people don't write their own code, and chances are the code they run is written to use REMOTE_ADDR.
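
For code you do control, here is a minimal sketch of that approach (my own illustration, not part of the original setup). X-Forwarded-For can hold a comma-separated chain of addresses, so the first entry is taken, and remember the header can be spoofed unless a trusted proxy overwrites it:

// Prefer the forwarded header when a proxy/cache has set it, otherwise fall back
if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
  $parts = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
  $client_ip = trim($parts[0]); // first entry is the original client
} else {
  $client_ip = $_SERVER['REMOTE_ADDR'];
}
echo $client_ip;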

So the fix is to have Varnish assign the real user's IP address to a header called "X-Forwarded-For"; this is done as below:

#Back up the configuration
cp /etc/varnish/default.vcl /etc/varnish/default.vcl.backup

#Open your varnish configuration
nano -w /etc/varnish/default.vcl

#Add the following lines to sub vcl_recv
sub vcl_recv {  
        remove req.http.X-Forwarded-For;
        set    req.http.X-Forwarded-For = req.http.rlnclientipaddr;
}
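
A quick note on req.http.rlnclientipaddr: that header only exists if something in front of Varnish supplies the client address. If Varnish receives the client connection directly (an assumption about your setup, not something covered above), a common variant of the same block uses Varnish's built-in client.ip instead:

sub vcl_recv {
        remove req.http.X-Forwarded-For;
        set    req.http.X-Forwarded-For = client.ip;
}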

#Save and exit, then restart the varnish service
service varnish restart

So your cache engine is now passing the client IP back to Nginx; now we just need to configure Nginx to capture this header and replace REMOTE_ADDR with it. To do this you need an Nginx module called http_realip_module, so run the following command and look out for "--with-http_realip_module":

nginx -V
nginx version: nginx/0.7.67
TLS SNI support enabled
configure arguments: --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-log-path=/var/log/nginx/access.log --http-proxy-temp-path=/var/lib/nginx/proxy --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --with-debug --with-http_dav_module --with-http_flv_module --with-http_geoip_module --with-http_gzip_static_module --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --with-http_sub_module --with-ipv6 --with-mail --with-mail_ssl_module --add-module=/tmp/buildd/nginx-0.7.67/modules/nginx-upstream-fair

As you can see I have it installed already, as hopefully you do too. Now I just need to use the following configuration to perform the replacement.

#Create a backup of the configuration
cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.backup

#Edit the configuration file
nano -w /etc/nginx/nginx.conf

#Add the following lines to your http{} statement
set_real_ip_from   127.0.0.1;
real_ip_header     X-Forwarded-For;

#Save, close and restart the nginx service
service nginx restart

OK, just to explain what that configuration did: "set_real_ip_from" is the IP address of your cache/load balancer, basically the IP you're currently seeing all of the proxied requests come from; in my case this was 127.0.0.1, as the cache and Nginx are on the same server.

And "real_ip_header" defines which header actually holds the correct value for "REMOTE_ADDR"; as you will see, this matches the header we set up Varnish to pass back.

You should be able to test if this works using the below test script.

echo $_SERVER["REMOTE_ADDR"];
echo "<br />" . time() . "<br />";
echo $_SERVER["HTTP_X_FORWARDED_FOR"];

Echoing the time just makes sure you're not getting a Varnish cache hit.

Hope that helps, please leave any questions you still have below…

Jan
01

I have seen a few people having problems getting outbound traceroutes to work, normally just after they have set up a new rule base for iptables, the Linux software firewall that's installed on most systems.

Basically they have allowed the traceroute in the outbound rules, normally by permitting all outbound traffic. I wouldn't recommend that, but that's a post for another day.

However, they are blocking the inbound responses, meaning that traceroute never gets any data back from the hops.

This fault is actually a sign of a much larger issue, as many services work in the same way: basically, you have forgotten to track 'related and established' sessions and permit their inbound traffic.

The command to fix it is below, assuming your inbound chain is called 'INPUT' and your internet interface is eth0.

iptables -A INPUT -m state -i eth0 --state ESTABLISHED,RELATED -j ACCEPT
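
One caveat (my addition, not from the original post): -A appends the rule to the end of the chain, so if your INPUT chain already finishes with a DROP or REJECT rule the new line will never be reached. In that case insert it at the top of the chain instead:

iptables -I INPUT 1 -i eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT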

 

Dec
28

So during my misspent Christmas break I decided to make a new website, and I wanted to make it a success with some meaningful traffic. I decided to see how high I could get on Hacker News, and how much traffic I could actually generate from there to my site (the advertising revenue would also be interesting).

From my experience of reading HN (Hacker News), one of the best things to do is present the community with a recent project that you have completed by hacking stuff together; they generally offer some great feedback and you get quite a few hits (I assume).

Obviously I also wanted to make something useful, so I had a look at the domains I already owned. route.im and traceroute.im sprang to mind, as I had purchased them a few months before with the intention of creating a simple web interface to traceroute an IP address or hostname.

Build Time

The Design

I started with a fairly simple CSS template from another unfinished project and used some very basic PHP scripting to perform a simple traceroute and display the results. I was fairly happy with the product, so I uploaded it to my web server (also used for this blog) and did some final testing; everything looked good to go!

The DNS for the domains got updated; route.im was actually already pointing at the right server, so I could launch it even faster than normal. traceroute.im was now updated too, however that would take some time to propagate, but nothing that should stop me…

Time to go public

Route.im was now live, so I passed the URL around to a few people on Twitter, MSN and Facebook to have a look at. Beta test completed, and finally it was time to let the public have at it, or at least I hoped it was.

Picking the title wasn't too hard: "My 2 hour morning project, route.im" was short and to the point. Surely that's going to drag a few people in to have a look?

That's it, we are live; time to keep an eye on my HN points and comments. When you first post on HN you are placed in the 'newest' section, where only a few people go through and upvote content, giving you points. The aim is to get enough points to reach the top 30 stories, as you're then on the home page! I am not sure of their formula, however you basically need a constant stream of points to get onto the home page and then stay there.

Welcome to the home page

Thankfully enough people found my story interesting, and with only 10 votes in a short space of time I got to number 3 on the home page of HN (I think this was the peak position, although I got too busy to keep checking). A quick look at my server load showed it didn't seem high, however looking at the Nginx logs there were a lot of people on the site!

Then I quickly decided I needed some better logging on the site, so I made a MySQL table and started to log all of the searches being made along with timestamps. Wow, there were a lot! I also created a page so visitors could see this.

Server died…

One of the best things about HN is the comments you get back from people using your site, so always have a window open and keep refreshing it. Almost instantly there were 3 posts from people saying it wasn't working…

It turns out the server had died. Previously I only had 4 PHP processes, as the PHP code executed quickly and didn't hog resources. However, while one process was waiting for a traceroute to complete it couldn't serve any other users, and there were a lot more than 4 requests waiting for a process. So I quickly bumped the PHP processes up to 20, did the same for Nginx, and restarted the services. Disaster averted and users were 'happy' again. Here's a look at some of the traffic figures, and this was after the first 'peak'.

Live Traffic View

Firefighting, is fun!

Here are a few bugs found with the script, which I tried to fix ASAP (a small sketch of the input-check fix follows the list):

  • So many users caused the site to crash – Not enough PHP processes, added more and restarted PHP server.
  • Couldn’t trace IPv6 via IP but could via Hostname – Regular expression filtered out the colon ‘:’, simple fix.
  • Couldn’t trace a domain with ‘-‘ included – Regular expression filtered out the ‘-‘, simple fix
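
For anyone curious, here is a minimal sketch of the sort of input check involved (my own illustration, not the site's actual code), allowing the colon and hyphen characters that tripped up the original regular expression:

// Example target typed by a visitor (an IPv6 address, which contains colons)
$input = '2001:4860:4860::8888';

// Accept hostnames and IPv4/IPv6 addresses: letters, digits, dots, colons and hyphens only
if (preg_match('/^[A-Za-z0-9.:\-]{1,255}$/', $input)) {
  // Escape the value anyway before handing it to the traceroute binary
  $output = shell_exec('traceroute ' . escapeshellarg($input));
  echo nl2br(htmlspecialchars($output));
}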

Over time I did drift down the HN home page, however I am still there while posting this, and there are quite a few comments!

That's me at 9th

Feature Requests

With the bug reports and comments came a load of feature requests:

  • Show IP geographic location
  • Show IP route on a Map
  • Show ASN for the route
  • Change URL format so you can enter target there, e.g. route.im/8.8.8.8
  • Trace route from multiple locations to a single destination
  • Auto fill the user field with their IP address

Revenue

I wasn't going to post about revenue, however it did seem a bit dishonest to intentionally miss it out. So yes, I have made some money from being featured on HN; below is a screenshot of my AdSense revenue. I expect this will probably die off to less than that within a month, however hopefully with some new features we can keep people coming back to the site and sustain some of it.

I am probably going to do some kind of contest on the website over the next few weeks with the generated revenue, just to say thanks 🙂

Some tips

  • Make sure your app is secure; judging from my logs, people on HN like looking for SQL injection 🙂
  • Decent logs are always a good thing to have
  • Pay attention to the feedback, bad feedback is the best type!
  • Don't be afraid of putting something out there to get some feedback
  • Make sure your server is ready for the traffic, and if it's not, be ready to react
  • Hack together bug fixes quickly if you can
  • Reply to all comments, you will probably get more people commenting that way
  • If you don't get to the front page, change something before posting again.
  • Have a look at other HN stories to get ideas.
  • Please go into the 'new' section of HN and upvote good stories, it really helps people!

A final little graph of my Varnish cache from the initial burst:

Varnish Cache is working!

 

Sorry for the quick post, lack of proofreading and probably vast number of typos. I just wanted to get my thoughts out there; I am off now to do some more coding to improve the site!

Thanks, please leave your comments below!
