Map-reduce for dummies
While browsing around, I came across an article by Joel Spolsky that introduces, simply and step by step, the principles of the map-reduce pattern that underlies a large part of Google's infrastructure. A must-read!
We use map-reduce at Joost to process usage data (roughly equivalent to the log files on a web server) and extract lots of useful information about the usage of the platform. This is built on Apache Hadoop, an open source implementation of map-reduce.
Given the still limited number of users we have, map-reduce isn't strictly necessary, and a SQL database could have done the trick. But with the huge user base expected once Joost becomes generally available, a solution that scales mostly by throwing in more machines is a must-have.
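To give a feel for what this kind of processing looks like, here is a minimal single-machine sketch of the map-reduce pattern in Python. It is not our actual code and does not use the Hadoop API; the record format (`user channel seconds`) and all function names are made up for illustration. It only shows the three steps (map, group by key, reduce) on a few fake usage records.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Turn one usage record into (key, value) pairs.

    The 'user channel seconds' format is a hypothetical example,
    not a real log format.
    """
    user, channel, seconds = line.split()
    yield channel, int(seconds)


def shuffle(pairs):
    """Group all values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


records = ["alice news 120", "bob music 300", "alice music 45"]
totals = map_reduce(records)  # total seconds watched per channel
```

Because each record is mapped independently and each key is reduced independently, both steps can be spread over many machines, which is what makes the approach scale by adding hardware.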
Comments
Do you have any extra invites?
Posted by: Raymond Wade | May 11, 2007 01:18 AM
can you invite me to joost please ??
i really want it.
thank you
Celia
Posted by: Celia | May 11, 2007 03:03 AM
hi? can someone invite me?
Posted by: noël | May 12, 2007 03:17 AM
Please invite me to participate in the beta testing of Joost.
Posted by: Cliff | May 15, 2007 01:14 PM