The problem
Build a system and programming model for processing large
datasets.
Take away the complexities of dealing with large data.
Exceptions will occur.
Components (both hard and soft) will fail.
Data structures will exceed available memory.
Approximate or incomplete results are usually good enough.