<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>HostingFu &#187; cluster</title>
	<atom:link href="http://hostingfu.com/tag/cluster/feed" rel="self" type="application/rss+xml" />
	<link>http://hostingfu.com</link>
	<description>Web Hosting Blog by a Software Developer</description>
	<lastBuildDate>Mon, 19 Jul 2010 09:27:08 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Setting Up Part-Time Web Cluster with Amazon&#8217;s EC2</title>
		<link>http://hostingfu.com/article/setting-up-part-time-web-cluster-with-amazons-ec2</link>
		<comments>http://hostingfu.com/article/setting-up-part-time-web-cluster-with-amazons-ec2#comments</comments>
		<pubDate>Thu, 18 Jan 2007 05:34:40 +0000</pubDate>
		<dc:creator>scotty</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[ec2]]></category>

		<guid isPermaLink="false">http://hostingfu.com/?p=83</guid>
		<description><![CDATA[What do you do when you have a regular traffic spike? Say that once a month, traffic increases threefold for 12 hours after your company sends out its monthly newsletter. Your current web server barely copes with the regular load. Do you go out and buy 2 more dedicated servers just for those 12 hours [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://hostingfu.com/files/images/amazon-aws.png" alt="Amazon Web Services" width="170" height="69" style="float:right;margin:0 0 1ex 1ex"/> What do you do when you have a regular traffic spike? Say that once a month, traffic increases threefold for 12 hours after your company sends out its monthly newsletter. Your current web server barely copes with the regular load. Do you go out and buy 2 more dedicated servers just for those 12 hours in a month? Paying for 2 extra servers that sit idle most of the time wouldn&#8217;t be very economical, would it?</p>
<p><a href="http://www.zeroflux.org/">Judd Vinet</a> of <a href="http://www.archlinux.org/">ArchLinux</a> (one of my favourites, btw) has recently written an article that tackles this very issue, <a href="http://www.zeroflux.org/blog/post/view?id=223">Web clustering with Amazon EC2</a>, where extra servers are hired from <a href="http://aws.amazon.com/ec2">Amazon EC2</a> on a part-time basis to serve a surge in traffic. The article describes a semi-automated system built to make the task of &#8220;summoning new servers&#8221; much easier.</p>
<p><span id="more-83"></span></p>
<p>It&#8217;s quite a detailed write-up with a lot of good information. Basically, this is what Judd is trying to achieve:</p>
<ul>
<li>An existing cluster of 2 web servers using <a href="http://www.linuxvirtualserver.org/">LVS</a>, with <a href="http://www.apsis.ch/pound/">Pound</a> as the web load balancer.</li>
<li>Start EC2 instances; on start-up, each instance phones home to register itself.</li>
<li>The master server (where Pound runs) adds the new EC2 instances to its pool and grants database permissions to allow connections from EC2.</li>
</ul>
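<p>The phone-home flow above could be sketched roughly as follows. This is not Judd&#8217;s actual code: the <code>BackendRegistry</code> class and the addresses are hypothetical, and the rendered text only imitates the general shape of a Pound <code>BackEnd</code> block, so check it against the Pound manual before relying on it. Granting database permissions is left out.</p>

```python
class BackendRegistry:
    """Hypothetical registry kept on the master (load balancer) host."""

    def __init__(self, permanent):
        # Backends that are always part of the cluster.
        self.permanent = list(permanent)
        # Part-time EC2 instances that have phoned home.
        self.part_time = []

    def register(self, address):
        """Called when a new instance phones home after booting."""
        if address not in self.part_time and address not in self.permanent:
            self.part_time.append(address)

    def deregister(self, address):
        """Called before an instance is shut down."""
        if address in self.part_time:
            self.part_time.remove(address)

    def backends(self):
        return self.permanent + self.part_time

    def render(self, port=80):
        """Render one BackEnd block per server (assumed Pound-like syntax)."""
        lines = []
        for address in self.backends():
            lines += ["BackEnd", f"    Address {address}", f"    Port {port}", "End"]
        return "\n".join(lines)


registry = BackendRegistry(["10.0.0.1", "10.0.0.2"])
registry.register("10.1.0.7")    # a new EC2 instance phones home
print(registry.render())
registry.deregister("10.1.0.7")  # the storm has passed
```

<p>After each change the master would write the rendered blocks into the balancer&#8217;s configuration and restart or reload it, which is where the restart delay discussed further down comes in.</p>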
<p>This seems to be quite a solid approach to scaling up your web site instantly. Fire up a few EC2 instances before an expected traffic surge, and shut them down after the storm. You only pay for the time and bandwidth you have used!</p>
<p>At the end of the article Judd mentioned that a fully-automated system will be their next step, so they don&#8217;t need to manually activate/deactivate EC2 instances &#8212; they should just start/stop whenever the traffic builds up/slows down. Nice for unexpected traffic like Slashdotting or Digging.</p>
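<p>The decision such an automated system has to make can be sketched in a few lines. This is only an illustration: the function name and all capacity figures are made up, and it says nothing about how Judd plans to build it.</p>

```python
import math


def desired_instances(requests_per_sec, base_capacity, per_instance_capacity,
                      max_instances):
    """Return how many part-time instances should be running.

    base_capacity: requests/sec the permanent servers can absorb.
    per_instance_capacity: requests/sec one extra instance can absorb.
    """
    overflow = requests_per_sec - base_capacity
    if overflow > 0:
        needed = math.ceil(overflow / per_instance_capacity)
    else:
        needed = 0
    return min(needed, max_instances)


# Hypothetical numbers: the permanent servers handle 100 req/s,
# each extra instance adds 50 req/s, and we cap at 10 instances.
for load in (80, 120, 300, 2000):
    print(load, desired_instances(load, 100, 50, 10))
```

<p>A real controller would probably also wait a while before scaling down, so that instances are not started and stopped on every small fluctuation.</p>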
<p>Who said the dedicated server vendors shouldn&#8217;t be worried?</p>
<p>However, a few problems were also encountered during this exercise: <strong>web traffic load balancing</strong> and <strong>database access</strong>. Pound was used in the example, but I think other &#8220;web fronts&#8221; like Nginx should work just as well. I have never used Pound, but from the example given, a 2 second delay is needed to cleanly stop the old process before a new one can be started. I guess that would be unnecessary with Nginx or Lighttpd, where you can gracefully reload the server without causing any delay.</p>
<p>However, the biggest problem I have with the proxy approach, which is actually an issue with EC2 itself, is that you are effectively paying double for data transfer. You are paying Amazon $0.20/GB for data transfer coming into or going out of EC2. At the same time you are paying whoever hosts your reverse-proxy servers for the traffic going out to your customers. The problem wouldn&#8217;t exist if you could run your reverse proxy right inside EC2, as traffic between EC2 instances is free. Except that is not trivial. Not until persistence and static IPs are implemented, anyway.</p>
<p>Database access over the public Internet is also not trivial to get right. For a busy website, even 20ms of latency between the web server and the DB server can cause a noticeable degradation in performance. A locally replicated DB would definitely be much better. But somehow &#8220;part-time&#8221;, &#8220;on-demand&#8221; and &#8220;MySQL replication&#8221; together sound like an oxymoron to me.</p>
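<p>To put a number on the double billing, here is a small calculation. The $0.20 per gigabyte EC2 rate is the one quoted above; the proxy host&#8217;s rate and the monthly volume are invented purely for illustration.</p>

```python
def monthly_transfer_cost(gb_served, ec2_rate_per_gb=0.20, proxy_rate_per_gb=0.10):
    """Cost of serving gb_served to visitors through an external reverse proxy.

    Every gigabyte leaves EC2 once (billed by Amazon) and then leaves the
    proxy's host once (billed by that host), so it is paid for twice.
    proxy_rate_per_gb is a hypothetical figure.
    """
    ec2_cost = gb_served * ec2_rate_per_gb
    proxy_cost = gb_served * proxy_rate_per_gb
    return ec2_cost, proxy_cost, ec2_cost + proxy_cost


ec2, proxy, total = monthly_transfer_cost(500)
print(f"EC2: ${ec2:.2f}, proxy host: ${proxy:.2f}, total: ${total:.2f}")
```

<p>The same traffic served straight from a box that already faces the public would only incur the second charge.</p>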
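<p>A back-of-the-envelope sketch shows why 20ms matters. It assumes a page issues its queries one after another, and the query count and the local round-trip time are hypothetical.</p>

```python
def added_page_time_ms(queries_per_page, round_trip_ms):
    """Extra time per page when queries run one after another.

    Assumes each query waits for the previous one, so every query
    pays the full network round trip to the database.
    """
    return queries_per_page * round_trip_ms


# Hypothetical page that issues 30 sequential queries.
local = added_page_time_ms(30, 0.5)   # DB on the same LAN (assumed 0.5 ms)
remote = added_page_time_ms(30, 20)   # DB across the Internet (20 ms)
print(f"local: {local} ms, remote: {remote} ms")
```

<p>Under these assumptions the remote database adds more than half a second to every page before any real work is done.</p>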
<p>Conclusion? I guess that&#8217;s why EC2 is still in beta, and is still pretty much &#8220;developers only&#8221;. But no doubt it is full of potential.</p>
]]></content:encoded>
			<wfw:commentRss>http://hostingfu.com/article/setting-up-part-time-web-cluster-with-amazons-ec2/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Media Temple&#8217;s Grid-Server</title>
		<link>http://hostingfu.com/article/media-temples-grid-server</link>
		<comments>http://hostingfu.com/article/media-temples-grid-server#comments</comments>
		<pubDate>Wed, 18 Oct 2006 05:15:02 +0000</pubDate>
		<dc:creator>scotty</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[mediatemple]]></category>

		<guid isPermaLink="false">http://hostingfu.com/?p=59</guid>
		<description><![CDATA[Via TechCrunch, MediaTemple has introduced its Grid-Server hosting product, superseding their old shared hosting products. It is basically a cluster of web servers/application servers connecting to a fast array of storage (SAN?), and load balancers or routers placed in front of the web servers to evenly distribute the requests across individual physical servers in the [...]]]></description>
			<content:encoded><![CDATA[<p>Via <a href="http://www.techcrunch.com/2006/10/17/media-temple-crushes-shared-hosting/">TechCrunch</a>, <a href="http://www.mediatemple.com/">MediaTemple</a> has introduced its <a href="http://www.mediatemple.net/webhosting/gs/">Grid-Server hosting product</a>, superseding their old shared hosting products.</p>
<p>It is basically a cluster of web servers/application servers connecting to a fast array of storage (SAN?), and load balancers or routers placed in front of the web servers to evenly distribute the requests across individual physical servers in the grid. A new measurement, <a href="http://www.mediatemple.net/webhosting/gs/faq/grid_performance_unit-faq.htm">Grid Performance Unit</a>, is introduced to measure how much CPU time your account has used.</p>
<p>Michael Arrington has also <a href="http://www.talkcrunch.com/2006/10/17/mediatemple-launches-grid-server/">interviewed the MediaTemple guys</a> on TalkCrunch, talking about the ideas behind their latest offerings.</p>
<p><span id="more-59"></span></p>
<p><img src="http://hostingfu.com/files/images/mediatemple-gridserver.jpg" width="300" height="319" alt="MediaTemple Grid-Server website" style="float:right;border:#888 solid 1px;margin:0 0 5px 12px"/> It is definitely an interesting product, targeted at the scalability issues of the many growing web sites that have exceeded the capacity of single-box shared hosting, and it comes at a very attractive price point: <strong>$20 a month</strong> for 1,000 GPUs, 100GB of &#8220;premium&#8221; storage and 1TB of monthly data transfer. It also has one of the most flexible PHP configurations you can get with shared hosting: a choice between PHP4 and PHP5, no stupid safe-mode that limits you in every possible way, and the option to use your own <code>php.ini</code>.</p>
<p>Another innovation is their <a href="http://www.mediatemple.net/webhosting/gs/faq/grid_performance_unit-faq.htm">Ruby on Rails Containers</a> technology. Not to be confused with <a href="http://en.wikipedia.org/wiki/Solaris_Containers">Solaris Containers</a>, MediaTemple&#8217;s RoR containers look like a jailed execution environment for your Ruby on Rails applications. Memory is limited to 64MB, and you can run <a href="http://www.techcrunch.com/2006/10/17/media-temple-crushes-shared-hosting/#comment-273022">multiple rails apps</a>. There are also options for you to purchase more memory for your container. It sounds like a light-weight VPS optimised for RoR applications, without the overhead of management.</p>
<p>Here are some of my random thoughts.</p>
<h3 id="toc-cluster-hosting-is-not-new">Cluster Hosting is Not New</h3>
<p>For one thing, MediaTemple is definitely not the industry first. Cluster hosting has been around for a while. TechCrunch mentioned Rackspace&#8217;s <a href="http://www.mosso.com/">Mosso</a>. Even in the land of Down Under, <a href="http://www.ilisys.com.au/">Ilisys</a> has been offering cluster web hosting for years.</p>
<p>What sets MediaTemple apart? <strong>Buzz</strong>. Being on TechCrunch and interviewed by Michael Arrington certainly helps.</p>
<h3 id="toc-very-competitive-priced">Very Competitively Priced</h3>
<p>Another thing that differentiates MediaTemple is their pricing model. I have no idea how much processing power 1,000 GPUs amounts to, but according to MediaTemple&#8217;s description it will be sufficient for most of their customers. That means $20/month for up to 100 sites on clouds of servers with huge storage and bandwidth, which is very attractive. It certainly compares very well with Mosso&#8217;s $100/month with 80GB storage and 2TB data transfer.</p>
<p>Both are potential replacements for low-end dedicated servers. However, I think Mosso is geared more towards web hosting resellers. Mosso comes with add-on <a href="http://www.mosso.com/clientservices.jsp">client services</a> that include billing and support. And with an infrastructure that can utilise both Linux and Windows application platforms, Mosso is best for those starting out in reselling web hosting who want to host both Linux and Windows applications. Grid-Server, by contrast, is Linux-only with no reselling options.</p>
<p>Still, it&#8217;s only <strong>$20 a month</strong> for all your sites. I am looking forward to other cluster hosting services catching up and dropping their prices.</p>
<h3 id="toc-clusters-are-good-for-customers">Clusters are Good for Customers</h3>
<p>There are many reasons why clustered shared hosting is way better than the single-box model that most hosting companies are running today. These are not specific to MediaTemple&#8217;s Grid-Server.</p>
<ul>
<li><strong>Centralised storage on fast SAN</strong>. Because all cluster nodes have access to the same pool of data, hosting companies need to deploy high-end storage solutions that are faster and more reliable. In the end, the customers benefit.</li>
<li><strong>Higher uptime guarantee</strong>. Server hardware failure no longer means hour-long downtime, and from <a href="http://www.dreamhoststatus.com/">DreamHost status</a> you&#8217;ll learn that hardware does fail, and quite often too!</li>
<li><strong>Scalable performance</strong>. What happens if your site gets slashdotted or dugg and needs the horsepower of more than one physical server for 12 hours? A cluster will automatically load balance the requests across other servers. And if your neighbour gets a traffic spike, you don&#8217;t have to suffer with him, as other cluster nodes will take care of the traffic.</li>
</ul>
<p>I think the days of single-box shared hosting, aka the cpanel model, are numbered.</p>
<h3 id="toc-clusters-are-good-for-hosting-providers">Clusters are Good for Hosting Providers</h3>
<p>While developing this technology might be costly, I think the benefit should easily pay off the cost in the long run, as clustering is just so obviously the better solution.</p>
<ul>
<li><strong>Overselling CPU time</strong>. Already overselling storage and bandwidth? Why not CPU time as well? I think clustering actually makes it easier, according to the <a href="http://en.wikipedia.org/wiki/Law_of_large_numbers">law of large numbers</a>. It is much easier to oversell from a big pool of clustered CPUs than separate units of individual boxes &#8212; just like you can oversell from your pool of gigabit bandwidth and network attached terabyte storage. End result? More customers on less hardware, which leads to <strong>more profit</strong>.</li>
<li><strong>Easier traffic spike management</strong>. If you are the host you&#8217;ll hate clients getting slashdotted. You have to find spare servers and migrate the accounts around. That is no longer the case with a clustered backend. Again, the law of large numbers works to the advantage of the hosting providers here.</li>
<li><strong>Easier server management</strong>. All the servers in the clusters can be provisioned and managed with the same configuration. No account specific info needs to be stored on these servers. You can even buy cheap hardware for your front-end computational boxes, aka the Google data centre style, as long as your storage is high quality and scalable.</li>
</ul>
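<p>The law-of-large-numbers argument can be made concrete with a short calculation. It is a sketch under strong assumptions: every site&#8217;s load is independent and has the same mean and standard deviation, and the figures are invented.</p>

```python
import math


def relative_spread(sites, mean_load, std_load):
    """Standard deviation of total load divided by its mean.

    Assumes every site's load is independent with the same mean and
    standard deviation, so variances add up across sites.
    """
    total_mean = sites * mean_load
    total_std = math.sqrt(sites) * std_load
    return total_std / total_mean


# Hypothetical sites: average 1% of a CPU each, with a spread of 3%.
for sites in (50, 500, 5000):
    print(sites, round(relative_spread(sites, 0.01, 0.03), 3))
```

<p>The ratio shrinks with the square root of the number of sites, so a bigger pool needs proportionally less headroom. The argument weakens when sites spike together, since the loads are then no longer independent.</p>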
<p>I am sure there are others. In the end, all of these lead to happier customers, which leads to longer stays, more referrals and more profit.</p>
<p>Smart move, MediaTemple! Have I said that the days of single-box shared hosting are numbered?</p>
<h3 id="toc-grid-server-is-not-competing-against-amazon-ec2">Grid Server is Not Competing Against Amazon EC2</h3>
<p>Even the MediaTemple guys said that (gs) and <a href="http://hostingfu.com/article/amazon-announced-elastic-compute-cloud-ec2">EC2</a> are very different. One is a <em>fully managed shared hosting platform</em>, and the other is an <em>unmanaged Xen virtual dedicated server</em>. They serve different purposes, even though many people have confused the two.</p>
<p>I&#8217;ll even go a step further and say that MediaTemple&#8217;s Grid-Server product is not replacing unmanaged VPS either. It does not give you root access and does not let you run things like VPN tunnels, Jabber servers, VoIP software, etc. I&#8217;d say a VPS is still a better <strong>development platform</strong>, whereas (gs) is a great <strong>deployment platform</strong> for 95% of cases.</p>
<h3 id="toc-grid-server-is-competing-against-managed-vpslow-end-dedicated">Grid Server is Competing Against Managed VPS/Low-End Dedicated</h3>
<p>Besides traditional shared hosting, I think managed VPS/low-end dedicated is where (gs) will steal most of the customers from. Many people buy low-end managed solutions because:</p>
<ul>
<li>They have outgrown their managed shared hosting account.</li>
<li>They can&#8217;t manage the servers themselves.</li>
</ul>
<p>Scalable managed shared hosting on clustered servers is a much better solution for those customers. It will be cheaper, more scalable, and much better looked-after than a managed VPS.</p>
<p>I am hoping that (gs) will also attract those who are supposed to get managed VPS, but were too cheap and went for unmanaged solution. After all, running a Linux server, having it all patched up, setting up custom applications on it, etc &#8212; these are not for everyone. A mis-configured unmanaged server is actually more harmful on the Internet, if it has been exploited.</p>
<p>If you don&#8217;t have the time to keep the software up to date, don&#8217;t have the ability to, or simply don&#8217;t want to, please go for a managed service. If a standard LAMP stack fits the bill, then MediaTemple&#8217;s Grid-Server might actually be a better choice.</p>
<h3 id="toc-software-compatibility-might-be-an-issue">Software Compatibility Might Be An Issue</h3>
<p>If you have been in software development for clustered applications, you know scaling the system up is <em>much more</em> than just adding more parallel hardware into the grid. As HTTP itself is stateless, you need to somehow keep the system state between individual requests (data, session, etc). Some software packages might not expect different requests to be handled by different hardware nodes. Anything related to persistence of state warrants a second look.</p>
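<p>A toy example of the session problem, with plain dictionaries standing in for real session storage. The <code>Node</code> class is hypothetical and has nothing to do with how (gs) is actually built.</p>

```python
class Node:
    """A web node that keeps sessions in whatever store it is given."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, session_id, user):
        self.store[session_id] = user

    def whoami(self, session_id):
        return self.store.get(session_id, "anonymous")


# Each node has its own private store: state is lost across nodes.
a, b = Node("a", {}), Node("b", {})
a.login("s1", "scotty")
print(b.whoami("s1"))  # the second request lands on node b

# Both nodes point at one shared store (a stand-in for a database or
# network cache): any node can answer.
shared = {}
a, b = Node("a", shared), Node("b", shared)
a.login("s1", "scotty")
print(b.whoami("s1"))
```

<p>An application that keeps sessions in local files or local memory behaves like the first pair of nodes, and that is the kind of software that deserves the second look.</p>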
<p>PHP, due to its <a href="http://www.zefhemel.com/archives/2004/09/01/the-share-nothing-architecture">share nothing</a> architecture, makes it easier to scale, provided you can also scale the backend database. Does (gs) give you clustered MySQL/PostgreSQL servers with replication? Not from what I can see. The database itself might become the bottleneck when the site tries to scale up.</p>
<p>MediaTemple keeps a list of <a href="http://www.mediatemple.net/webhosting/gs/faq/grid_compatible_applications-faq.htm">applications known to be grid compatible</a>. However, taking WordPress or Drupal as an example, can you be 100% sure that every single module or plugin is compatible? I&#8217;d still suggest testing, testing and <strong>more testing</strong>.</p>
<p>Localised optimisation is also difficult to do with clusters, where adjacent requests can be served by completely different nodes. For example, your PHP intermediate code cache (like <a href="http://pecl.php.net/package/APC">APC</a>) might get a lot of cache misses if requests to your sites are handled all over the place and caches are thrashed by requests from other sites.</p>
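<p>The cache effect can be illustrated with a small simulation. It uses a generic LRU cache as a stand-in and does not model how APC really works; the node count, cache size and traffic pattern are all made up.</p>

```python
import random
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, cache the key."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        return False


def hit_rate(route, nodes=4, capacity=50, sites=40, scripts=10,
             requests=20000, seed=1):
    """Fraction of requests served from some node's cache."""
    rng = random.Random(seed)
    caches = [LRUCache(capacity) for _ in range(nodes)]
    hits = 0
    for _ in range(requests):
        site, script = rng.randrange(sites), rng.randrange(scripts)
        node = route(site, rng, nodes)
        hits += caches[node].access((site, script))
    return hits / requests


def spread(site, rng, nodes):
    return rng.randrange(nodes)  # any node may serve any site


def pinned(site, rng, nodes):
    return site % nodes          # each site always hits the same node


print("spread:", hit_rate(spread))
print("pinned:", hit_rate(pinned))
```

<p>When any node may serve any site, every cache has to hold a slice of everything, so the hit rate comes out well below the pinned case.</p>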
<h3 id="toc-conclusion">Conclusion</h3>
<p>Clustering is nothing new, but with MediaTemple being today&#8217;s web hosting buzz, hopefully more people will become aware of the benefits of grid computing. At this price point I think it would be great for those who want a scalable and reliable website.</p>
<p>As web hosting has become a commodity, it is great to see MediaTemple, a low-budget shared hosting company, introduce something that sets them apart from the rest. Hopefully we will see more clustered shared hosting at affordable prices once this market starts gaining momentum.</p>
<p><b>Update</b>: <a href="http://timdorr.com/">Tim Dorr</a> of A Small Orange <a href="http://www.webhostingtalk.com/showpost.php?p=4164013&amp;postcount=28">reported some interesting finds</a> on the (gs) service, including:</p>
<ul>
<li>A site with 10% discount.</li>
<li>Load balancing server structure.</li>
<li>MySQL servers layout.</li>
<li>Foundry Networks ServerIron is used for load balancing/app switching.</li>
</ul>
<p><b>Update 2</b>: <a href="http://www.cuddletech.com/">Ben Rockwood</a> from <a href="http://joyent.com/">Joyent</a> blogged about <a href="http://www.cuddletech.com/blog/pivot/entry.php?id=764">Grid-Server internals</a>, after buying an account and trying to figure it all out himself.</p>
]]></content:encoded>
			<wfw:commentRss>http://hostingfu.com/article/media-temples-grid-server/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Web Hosting Clusters</title>
		<link>http://hostingfu.com/article/web-hosting-clusters</link>
		<comments>http://hostingfu.com/article/web-hosting-clusters#comments</comments>
		<pubDate>Thu, 06 Jul 2006 05:05:14 +0000</pubDate>
		<dc:creator>scotty</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cluster]]></category>
		<category><![CDATA[shared hosting]]></category>

		<guid isPermaLink="false">http://hostingfu.com/?p=29</guid>
		<description><![CDATA[An interesting article written by Isabel Wang, The Future of Dedicated Servers, where she discussed whether the rise of Google and Microsoft Live and their abundant resources might make the dedicated server market obsolete, as people switch to on-demand hosting. As Google and Microsoft continue to invest in their Internet infrastructure, might they ever [...]]]></description>
			<content:encoded><![CDATA[<p>An interesting article written by Isabel Wang, <a href="http://isabelwang.typepad.com/blog/2006/07/the_future_of_d.html">The Future of Dedicated Servers</a>, where she discussed whether the rise of Google and Microsoft Live and their abundant resources might make the dedicated server market obsolete, as people switch to on-demand hosting.</p>
<p><span id="more-29"></span></p>
<blockquote><p>
  As Google and Microsoft continue to invest in their Internet infrastructure, might they ever reach an economy of scale where it&#8217;d make sense to &#8220;lower the barrier of entry&#8221; by giving away not just APIs, but hosting resources on which third party developers could build and run complex, high traffic mashups?
</p></blockquote>
<p>Imagine this: instead of renting a $99/month Intel/AMD dedicated box, you rent CPU slices in Google&#8217;s 400,000+ cluster/server-farm, which provides high availability (when was the last time you couldn&#8217;t access google.com?), great parallel computing technology and almost limitless bandwidth. Whether it runs a single-threaded PHP script for your WordPress blog is another thing&#8230;</p>
<p>Well, I will not say that a big clustering account would replace the dedicated server market today, as there are many reasons people opt for dedicated servers. It is like reseller accounts versus virtual private servers: those who pick reseller accounts do so for the bigger storage, more bandwidth, more CPU resources and an easy-to-use control panel, so they can just focus on content creation. VPS customers (or, when they graduate, dedicated server or co-location customers) want their root (or Administrator) access, flexibility of software installation, custom configuration, etc.</p>
<p><a href="http://www.mosso.com/">Mosso</a> is an interesting hosting product. 80GB storage + 2TB bandwidth. Moreover, your site sits inside a farm of application servers (PHP, ASP, JSP, MySQL, etc) behind a cluster of load balancers. 100% SLA on network and site availability. Redundant everything. You don&#8217;t know which server serves your web page, just that it will be the cluster member with the least load and fastest response time. Even if your neighbour gets slashdotted or dugg, it is not going to affect your sites&#8217; performance much. Great infrastructure for $100 a month.</p>
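<p>Picking the cluster member with the least load can be sketched in a few lines. This is only the general idea, not Mosso&#8217;s actual algorithm: the node names and load figures are invented, and response time is ignored.</p>

```python
def pick_node(loads):
    """Return the name of the node with the lowest current load."""
    return min(loads, key=loads.get)


def dispatch(loads, cost=1):
    """Send one request to the least-loaded node and record its cost."""
    node = pick_node(loads)
    loads[node] += cost
    return node


loads = {"node1": 3, "node2": 1, "node3": 2}
print([dispatch(loads) for _ in range(4)])
print(loads)
```

<p>Because every request goes to whichever node is least busy at that moment, the loads even out after a handful of requests.</p>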
<p>It&#8217;s good for the host as well. Overselling CPU time is now closer to reality. I imagine the CPU utilisation would be much better distributed, if you have 500 websites over a 10 node cluster, than 500 websites over 10 dedicated servers.</p>
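<p>A quick simulation of the 500 websites comparison is below. The traffic model is invented (each site is independently busy 2% of the time and needs one unit of CPU when busy), so treat the numbers as an illustration only.</p>

```python
import random


def simulate(sites=500, boxes=10, ticks=1000, busy_chance=0.02, seed=7):
    """Compare the worst per-box load with the worst pooled load.

    Each tick, every site is independently "busy" (needs 1 unit of CPU)
    with probability busy_chance. Sites are pinned to boxes round-robin.
    """
    rng = random.Random(seed)
    worst_box = 0
    worst_total = 0
    for _ in range(ticks):
        per_box = [0] * boxes
        for site in range(sites):
            if rng.random() < busy_chance:
                per_box[site % boxes] += 1
        worst_box = max(worst_box, max(per_box))
        worst_total = max(worst_total, sum(per_box))
    return worst_box, worst_total


worst_box, worst_total = simulate()
# Capacity needed if every box must survive its own worst moment,
# versus capacity needed if the whole cluster shares the load.
print("dedicated boxes:", worst_box * 10, "units")
print("shared cluster: ", worst_total, "units")
```

<p>The shared figure can never exceed the dedicated one, and with independent sites it is usually much lower, because the boxes rarely hit their worst moments at the same time.</p>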
<p>Maybe in 3 years, all major shared hosting companies will operate this way. We&#8217;ll see.</p>
<p>However, it is also known that <em>not all web applications work in a cluster</em>. What if the PHP app uses a different way to track sessions and cannot afford to have different requests served by different cluster nodes? How many support load-balanced MySQL servers? However, when there&#8217;s demand, we&#8217;ll also see more web-based apps designed for this type of environment.</p>
<p>I am also looking forward to a cluster-ready Linux distribution for web hosting companies: boot straight from the SAN, configure itself from the master nodes of the cluster, and be ready to serve requests. Hmmm.</p>
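<p>One common workaround for apps that cannot share session state is to pin each session to a single node. Here is a minimal sketch; the function name is made up, and it assumes the load balancer can see a session ID for every request.</p>

```python
import hashlib


def sticky_node(session_id, nodes):
    """Map a session ID to one node so its requests stay together."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node1", "node2", "node3"]
for session_id in ("alice-session", "bob-session", "alice-session"):
    print(session_id, sticky_node(session_id, nodes))
```

<p>The trade-off is that adding or removing a node reshuffles most sessions, and a pinned session cannot be moved away from a node that is overloaded.</p>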
]]></content:encoded>
			<wfw:commentRss>http://hostingfu.com/article/web-hosting-clusters/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
