TheRealAndyCook said: So let's say we have 3 datacenters, each named after its physical location.
Canada.site.com
Usa.site.com
Uk.site.com
When a user connects to site.com, it tries to forward them to the closest site possible.
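The routing idea above can be sketched as a simple lookup from the visitor's country to the nearest datacenter. This is a minimal illustration, not a real GeoIP implementation; the country codes, hostnames, and fallback choice are all assumptions.

```python
# Minimal sketch: map a visitor's country code to the nearest datacenter.
# The country->DC table and the default are illustrative assumptions; a real
# setup would do this in DNS (e.g. GeoIP-aware BIND), not in application code.
DC_BY_COUNTRY = {
    "CA": "Canada.site.com",
    "US": "Usa.site.com",
    "GB": "Uk.site.com",
}
DEFAULT_DC = "Usa.site.com"  # assumed fallback for unmapped countries

def pick_datacenter(country_code: str) -> str:
    """Return the hostname of the closest datacenter for a country code."""
    return DC_BY_COUNTRY.get(country_code.upper(), DEFAULT_DC)
```

In practice this decision happens at DNS resolution time, so the browser never sees the redirect.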
So Andy, who is Canadian, lands on Canada.site.com.
His request for the index goes to Apache; Apache then queries the SQL server at sql.canada.site.com, and the SQL data goes back through Apache and is visible in Andy's browser.
So Jeff, who is American, lands on Usa.site.com.
Like Andy, his request for the index goes to Apache, then to sql.usa.site.com, and the SQL data goes back through Apache and is visible in Jeff's browser.
Now John has been using the site and is about to make an update.
John's data gets sent to the Apache server at Uk.site.com; Apache tells SQL at sql.uk.site.com to update with the new data, and then John can see the data when he views the page.
HOWEVER, when Andy or Jeff visit the page, they will not see John's update: because they are not on the same servers, they will not be loading from the same SQL server.
Similarly, if Andy or Jeff makes an update, the others will not see it.
This is where we implement the syncing script. When John made his update, the Apache server at Uk.site.com also sent out a "warning" to Canada.site.com and Usa.site.com that a change was made, along with where the change is located.
Canada.site.com and Usa.site.com then go and get the new data and update their respective SQL servers.
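The notify-then-fetch flow described above can be sketched like this. It is simulated entirely in memory; the DC names, record keys, and the idea of pulling the fresh value directly from the origin DC are illustrative assumptions (a real version would send the "warning" over HTTP and fetch via SQL).

```python
# Sketch of the notify-then-fetch sync, simulated in memory.
# Each DC's SQL data is modeled as a dict keyed by record id.
databases = {
    "uk":     {"post:42": "old text"},
    "canada": {"post:42": "old text"},
    "usa":    {"post:42": "old text"},
}

def notify_peers(origin, record_id):
    """The origin DC 'warns' the others which record changed;
    each peer then pulls the fresh value from the origin DC."""
    for peer in databases:
        if peer != origin:
            databases[peer][record_id] = databases[origin][record_id]

def apply_update(dc, record_id, value):
    """A user's write lands at their local DC first, then propagates."""
    databases[dc][record_id] = value
    notify_peers(dc, record_id)

# John updates post 42 from the UK; Canada and USA get warned and re-fetch.
apply_update("uk", "post:42", "John's new text")
```

The important property is that only the changed record travels between DCs, not a full dump.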
The whole update takes approximately 2 times the total latency from server to server, plus the latency from SQL to Apache. Since the databases are typically located on fibre backbones, the update MAY propagate faster than the original user gets the original page back.
(Thus, taking that math into effect, a fast user in the States might actually see the change before the user making the change in the UK does.)
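That arithmetic can be made concrete with some example numbers. These figures are assumptions chosen only to illustrate the point, not measurements.

```python
# Illustrative numbers only: the propagation-time arithmetic from above,
# i.e. 2x the inter-DC latency plus the SQL-to-Apache hop inside a DC.
inter_dc_ms = 40       # assumed one-way UK <-> US latency over the backbone
sql_to_apache_ms = 1   # assumed LAN hop between SQL and Apache in a DC

sync_time_ms = 2 * inter_dc_ms + sql_to_apache_ms   # 2*40 + 1 = 81 ms

user_round_trip_ms = 120  # assumed John's browser <-> Uk.site.com round trip

# With these numbers the cross-DC sync finishes before John's own page
# response returns, so a fast user in the States could see the change first.
print(sync_time_ms < user_round_trip_ms)  # True with these assumptions
```

Change the assumed latencies and the ordering can flip; the point is only that sync time scales with backbone latency, while the user's experience scales with their last-mile round trip.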
I assume you would be using GeoIP BIND for proper resolution. Depending on the frequency and size of database changes, your SQL connections could be set up using MySQL Cluster, where each DC would have a pair of MySQL data nodes that replicate across the wire as changes happen. You would also have a cluster management server in each of the data centers, which would talk to all the data nodes and to each of the other cluster managers. The data nodes in a MySQL Cluster do most of the heavy lifting, while the actual MySQL nodes process requests through the data nodes; the MySQL node could reside on the LAMP server. In addition, the MySQL nodes in each data center would communicate with the MySQL data nodes in that data center, with the ability to fail over to one of the other DCs to provide reduced-performance access for the DC with the failed node. All of this is of course dependent on the quality and speed of your links between the data centers.
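The failover order described there (local data node first, remote DCs as a degraded fallback) can be sketched as a preference list per DC. The hostnames and the probe callback are illustrative assumptions; a real setup would let the MySQL Cluster management layer handle this.

```python
# Sketch of local-first SQL node selection with cross-DC fallback.
# Hostnames are assumptions; `is_up` stands in for a real reachability
# probe (e.g. a TCP connect check) and is passed in so this is testable.
SQL_NODES = {
    "uk":     ["sql.uk.site.com", "sql.canada.site.com", "sql.usa.site.com"],
    "canada": ["sql.canada.site.com", "sql.uk.site.com", "sql.usa.site.com"],
    "usa":    ["sql.usa.site.com", "sql.canada.site.com", "sql.uk.site.com"],
}

def pick_sql_node(local_dc, is_up):
    """Return the first reachable SQL node, preferring the local DC.
    Remote nodes work but give reduced performance across the WAN."""
    for host in SQL_NODES[local_dc]:
        if is_up(host):
            return host
    raise RuntimeError("no SQL node reachable from " + local_dc)
```

The fallback list is ordered by assumed proximity, which is why a UK node failure degrades to Canada before the USA in this sketch.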
This leaves the data on the web servers. From the sounds of it, you are copying data from one DC to the other to keep them in sync. Here you could set up file servers in each data center and set up replication between them. It's a bit tricky since you have 3 DCs to replicate; if you only had two, you could use something like DRBD. Again, all of this depends on the quality and speed of the link you would dedicate between the DCs.
From my experience, file replication across DCs can work well with minimal latency if you have a high-quality dedicated link. However, with a multi-primary clustered file system like the one being described here, you would need to be able to quickly detect and restore a failed file server to mitigate possible split-brain damage.
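One cheap way to catch split brain early is to periodically compare file checksums across the DCs and flag any path where they disagree. This is a sketch under assumptions: each DC publishes a manifest mapping path to checksum, a format invented here for illustration.

```python
# Sketch of a cross-DC divergence check to catch split brain early.
# `manifests` maps dc_name -> {path: checksum}; the manifest format is an
# illustrative assumption (e.g. built with find + md5sum on each DC).
def find_divergent(manifests):
    """Return the set of paths whose checksums are not identical
    across every DC (a missing file also counts as divergence)."""
    all_paths = set().union(*(m.keys() for m in manifests.values()))
    divergent = set()
    for path in all_paths:
        sums = {m.get(path) for m in manifests.values()}
        if len(sums) > 1:  # mismatched content, or absent in some DC
            divergent.add(path)
    return divergent
```

Anything this flags needs a human (or a well-tested policy) to decide which copy wins before replication resumes.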
With that said, if I had to set this up myself, depending on the actual content type I would do something like this.
A central location houses the core file server and core SQL server. The US and UK locations would each house a small Varnish/memcached cluster, which would provide caching at those locations for the most-used content, including DB content; you would then only be traveling back to the central location for the obscure information. This assumes the sites are mostly dynamic PHP sites backed by MySQL. The Varnish/memcached layer would then be controlled by the main location: upon an update, only the affected section of the Varnish cache would be cleared, and the same for memcached. For example, a typical WordPress site, once configured with Varnish and memcached, never hits the Apache server unless the cache is flushed. MySQL is hit slightly more often due to the cache expiring, but this is manageable.
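The targeted invalidation described there looks like this in miniature. Both edge caches are simulated as dicts; in a real deployment the main location would issue a Varnish PURGE/BAN for the page and a memcached delete for the DB entry, and the key names here are illustrative assumptions.

```python
# Sketch of targeted edge-cache invalidation from the core location.
# Each edge DC's Varnish+memcached contents are simulated as one dict;
# key names ("page:..." for Varnish, "db:..." for memcached) are assumed.
edge_caches = {
    "us": {"page:/blog/42": "<html>old</html>",
           "db:post:42": "old row",
           "page:/about": "<html>about</html>"},
    "uk": {"page:/blog/42": "<html>old</html>",
           "db:post:42": "old row",
           "page:/about": "<html>about</html>"},
}

def invalidate(keys):
    """Clear only the updated entries at every edge; the rest of the
    cache stays warm, so most traffic never reaches the core DC."""
    for cache in edge_caches.values():
        for key in keys:
            cache.pop(key, None)

# The core DC updates post 42, so only that page and its DB row are purged.
invalidate(["page:/blog/42", "db:post:42"])
```

The next request for /blog/42 at an edge misses, travels back to the core once, and repopulates the cache; /about is never disturbed.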
However, I'm looking for something to actually be useful; this isn't a science fair project.