Map-reduce for dummies
While browsing around, I came across an article by Joel Spolsky that introduces, simply and progressively, the principles of the map-reduce pattern underlying a big part of Google's infrastructure. A must-read!
We use map-reduce at Joost to process usage data (roughly equivalent to the log files on a web server) and extract lots of useful information about how the platform is used. This is built on Apache Hadoop, an open-source implementation of map-reduce.
Given the still limited number of users we have, map-reduce isn't strictly necessary and a SQL database could have done the trick. But with the huge user base expected once Joost becomes generally available, a solution that scales mostly by throwing in more machines is a must-have.
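To make the idea concrete, here is a minimal single-machine sketch of the map-reduce pattern in Python, applied to usage records. It is only an illustration: the record format ("user channel seconds") and the names `map_record`, `shuffle`, `reduce_channel` and `run_job` are made up for this example, and it uses neither our actual pipeline nor the Hadoop API.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical usage records: "user_id channel seconds_watched"
RECORDS = [
    "alice news 120",
    "bob music 300",
    "alice music 60",
    "carol news 45",
]

def map_record(line):
    """Map step: turn one record into (channel, seconds) pairs."""
    _user, channel, seconds = line.split()
    yield channel, int(seconds)

def shuffle(pairs):
    """Group values by key, the job a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_channel(channel, values):
    """Reduce step: total viewing time for one channel."""
    return channel, sum(values)

def run_job(records):
    """Run map, shuffle and reduce in sequence over all records."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    return dict(reduce_channel(k, v) for k, v in shuffle(mapped).items())

totals = run_job(RECORDS)
print(totals)
```

Each call to `map_record` looks at a single record and each call to `reduce_channel` at a single key, so nothing stops a framework from spreading those calls over many machines. That independence is what lets this kind of job scale by adding hardware.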
