TheRegister: Lightning strikes Amazon cloud (honest)
by Cade Metz
Amazon's cloud was struck by lightning earlier this week. And that's the truth.
On Wednesday evening at about 6:30pm Pacific time, some Amazon cloud sitters saw their floating servers disappear - and yes, the company blamed the temporary outage on a lightning strike.
According to a web post from the company, the strike zapped a power distribution unit in one of its data centers, taking out server instances in one - and only one - Availability Zone. Amazon's Elastic Compute Cloud (EC2) serves up on-demand processing power from two separate geographic locations - the US and Europe - and each geographic region is split into multiple zones designed never to vanish at the same time.
"A lightning storm caused damage to a single Power Distribution Unit (PDU) in a single Availability Zone," the company said in a web post at 7:33pm. "While most instances were unaffected, a set of racks does not currently have power, so the instances on those racks are down."
At 9:26pm, Amazon said power had been restored and the affected server instances were beginning to recover. By 1:20am, the company said the problem had been fully resolved.
While Amazon was correcting the problem, it told customers they had the option of launching new server instances to replace those that went down. But customers were also able to wait for their original instances to come back up after power was restored to the hardware in question.
This was a relatively minor issue compared to the two major outages Amazon's cloud suffered in October 2007 and February 2008. And it's nowhere near as amusing as the time an engineer accidentally deleted Flexiscale's infrastructure cloud. Well, not nearly as amusing except for the lightning bit. ®
InfoWorld: Google tests 'revolutionary' cloud-based database
by Juan Carlos Perez
Google has released an early version of a new type of database whose approach to data management will be revolutionary, according to an analyst who has studied the technology behind it.
On Tuesday, Google quietly announced in its research team blog a new online database called Fusion Tables designed to sidestep the limitations of conventional relational databases.
Specifically, Fusion Tables has been built to simplify a number of operations that are notoriously difficult in relational databases, including the integration of data from multiple, heterogeneous sources and the ability to collaborate on large data sets, according to Google.
"Without an easy way to offer all the collaborators access to the same server, data sets get copied, emailed and ftp'd – resulting in multiple versions that get out of sync very quickly," reads the Google announcement, which has been largely overlooked, probably because it was made on the same day the company held a high-profile press event to launch its Google Apps Sync for Microsoft Outlook.
Under the hood of Fusion Tables is data-spaces technology, which will make conventional databases go the way of the rotary phone, according to Stephen E. Arnold, a technology and financial analyst who is president of Arnold Information Technology.
Data spaces as a concept has been around since the early 1990s, and Google, realizing its potential, has been developing it since it acquired Transformic, a pioneer of the technology, in 2005, Arnold said.
Data-spaces technology seeks to solve the problem of the multiple data types and data formats that reside in organizations, which have to scrub the data and make it uniform, often at great cost and effort, in order to store and analyze it in conventional databases.
Data spaces envisions a system that creates an index that provides access to data in its disparate formats and types, solving what Arnold calls the "Tower of Babel" problem.
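To make the indexing idea concrete, here is a minimal Python sketch. It is an illustration only, not Google's implementation: the two sample sources, their field names, and the functions `iter_csv`, `iter_json`, `build_index` and `lookup` are all hypothetical. It shows one index built over a CSV source and a JSON source that are left in their original formats, with each index entry pointing back to where the matching record lives.

```python
import csv
import io
import json
from collections import defaultdict

# Two hypothetical sources kept in their native formats (invented sample data).
CSV_SOURCE = "sku,name\n100,rotary phone\n200,cloud server\n"
JSON_SOURCE = (
    '[{"id": "r1", "text": "Great rotary phone"},'
    ' {"id": "r2", "text": "Server was slow"}]'
)


def iter_csv(text):
    """Yield (locator, searchable text) for each CSV row, without converting it."""
    for row_number, row in enumerate(csv.DictReader(io.StringIO(text))):
        yield ("csv", row_number), " ".join(row.values())


def iter_json(text):
    """Yield (locator, searchable text) for each JSON record."""
    for item in json.loads(text):
        yield ("json", item["id"]), item["text"]


def build_index(sources):
    """Map each lowercased word to the set of locators where it appears."""
    index = defaultdict(set)
    for records in sources:
        for locator, text in records:
            for token in text.lower().split():
                index[token].add(locator)
    return index


def lookup(index, term):
    """Return the sorted locators for a term, across every source format."""
    return sorted(index.get(term.lower(), set()))


index = build_index([iter_csv(CSV_SOURCE), iter_json(JSON_SOURCE)])
print(lookup(index, "phone"))
```

A single query for "phone" finds both the product row in the CSV source and the review in the JSON source, even though neither was scrubbed into a shared schema first. A real data-space system would have to handle far more than word matching, so treat this as a toy model of the concept Arnold describes.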
In the case of Fusion Tables, the technology should allow Google to add to the conventional two-dimensional database tables a third coordinate with elements like product reviews, blog posts, Twitter messages and the like, as well as a fourth dimension of real-time updates, he said.
"So now we have an n-cube, a four-dimensional space, and in that space we can now do new kinds of queries which create new kinds of products and new market opportunities," said Arnold, whose research about this topic includes a study done for IDC last August.
"If you're IBM, Microsoft, and Oracle, your worst nightmare is now visible. Google is going to automatically construct data spaces and implement new types of queries," he said. "Those guys are going to be blindsided."
Fusion Tables is an early version of the product, as evidenced by its "Labs" label, which means Google considers it an experimental product. "As usual with first releases, we realize there is much missing, and we look forward to hearing your feedback," Google's blog post reads.
CNet: The more Hadoop grows, the better Cloudera looks
by Matt Asay
The Internet largely abolishes scarcity in digital goods, shifting competitive advantage to those that can profit from abundance, not scarcity, like Red Hat, Google, and Facebook. For this reason, the more Hadoop grows as a community, the better the business opportunity for Cloudera, the start-up that distributes a commercial version of Hadoop.
Let me explain.
As CNET's Tom Krazit explains, "Hadoop is essentially an open-source version of the software Google uses to run its Web indexing servers." Yahoo also uses it internally for roughly the same reason, and has released its own open-source version of Hadoop to nudge adoption by other firms and to encourage contributions to the Hadoop project.
As Savio Rodrigues points out, however, Hadoop is already getting significant contributions from outside Yahoo. While the project was initially dominated by Yahoo employees, Rodrigues points to recent data indicating that 70 percent of Hadoop's community isn't employed by Yahoo.
That's great progress for Hadoop, and it's also great for Cloudera, the company that aims to make Hadoop relevant and useful for companies that lack the scale of a Google or Yahoo. Cloudera actively contributes to the Hadoop project, but perhaps its greatest contribution is in providing a commercial distribution of Hadoop.
The more contributors to Hadoop and the more complex it becomes, the greater the need for a Cloudera to provide a conservative, trusted distribution of Hadoop for enterprise customers. In other words, the greater the abundance of community around Hadoop, the more enterprises need scarcity: one throat to choke for their Hadoop deployments, not many.
As Yahoo and others contribute heavily to Hadoop, in short, they're also contributing to the likelihood of Cloudera's success.