
MapReduce collate() method

uint64_t MapReduce::collate(int (*myhash)(char *, int)) 

This calls the collate() method of a MapReduce object, which aggregates a KeyValue object across processors and converts it into a KeyMultiValue object. Calling this method is equivalent to calling aggregate() followed by convert(). The method returns the total number of unique key/multi-value pairs (one per unique key) in the KeyMultiValue object.
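The two-step behavior can be illustrated with a small stand-alone model. This is only a sketch, not the library's implementation: the "processors" are plain vectors, and the names toy_hash, toy_aggregate, toy_convert, and toy_collate are hypothetical helpers invented for this example. No MPI or MapReduce-MPI calls are used.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for the library's objects: one container per simulated processor.
using KV  = std::vector<std::pair<std::string, int>>;  // key/value pairs
using KMV = std::map<std::string, std::vector<int>>;   // key -> multi-value

// Hypothetical hash used only by this sketch (sum of the key's bytes).
inline std::size_t toy_hash(const std::string &key) {
  std::size_t h = 0;
  for (unsigned char c : key) h += c;
  return h;
}

// Models aggregate(): move every pair to the processor chosen by its key's
// hash, so all pairs sharing a key end up on the same processor.
inline std::vector<KV> toy_aggregate(const std::vector<KV> &in) {
  std::vector<KV> out(in.size());
  for (const KV &local : in)
    for (const auto &kv : local)
      out[toy_hash(kv.first) % in.size()].push_back(kv);
  return out;
}

// Models convert(): on each processor, group the values of duplicate keys.
inline std::vector<KMV> toy_convert(const std::vector<KV> &in) {
  std::vector<KMV> out(in.size());
  for (std::size_t p = 0; p < in.size(); ++p)
    for (const auto &kv : in[p]) out[p][kv.first].push_back(kv.second);
  return out;
}

// Models collate(): aggregate, then convert. Returns the number of unique
// keys summed over all simulated processors.
inline std::uint64_t toy_collate(const std::vector<KV> &in,
                                 std::vector<KMV> &out) {
  out = toy_convert(toy_aggregate(in));
  std::uint64_t total = 0;
  for (const KMV &kmv : out) total += kmv.size();
  return total;
}
```

In this model, pairs with the same key that start on different processors end up as a single key/multi-value entry on one processor, which is why the returned count equals the number of unique keys.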

The myhash argument is used by the aggregate() portion of the operation and can be specified as NULL. See the aggregate() doc page for details.
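As an illustration of a non-NULL hash, here is a minimal sketch of a function with the signature collate() expects. The function name int_key_hash and the assumption that every key is a single native int are hypothetical choices for this example. How the library turns the returned integer into a processor ID is described on the aggregate() doc page and is not assumed here.

```cpp
#include <cstring>

// Hypothetical user-supplied hash matching int (*myhash)(char *, int).
// Assumption for this sketch: each key is one native int (e.g. an ID).
int int_key_hash(char *key, int keybytes) {
  int id = 0;
  // Copy the bytes out rather than casting, to avoid alignment problems.
  if (keybytes >= static_cast<int>(sizeof(int)))
    std::memcpy(&id, key, sizeof(int));
  // Clear the sign bit so the result is never negative.
  return id & 0x7fffffff;
}
```

A function like this would be passed by address as the argument to collate(), for example collate(&int_key_hash) on an existing MapReduce object.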

Note that if your map operation does not produce duplicate keys, you typically do not need to perform a collate(). Instead you can convert a KeyValue object into a KeyMultiValue object directly via the clone() method, which requires no communication. Or you can pass the KeyValue object directly to another map() operation. One exception is a map operation that produces a KeyValue object that is highly imbalanced across processors. In that case, the aggregate() portion of collate() should redistribute the key/value pairs more evenly.
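The load-balancing effect in the imbalanced case can be sketched with a toy count. This is a model under stated assumptions, not library code: keys are unsigned integers, the owner of a key is taken to be key % nprocs (the library's default hash is different), and the function name pairs_per_proc_after_aggregate is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: keys_per_proc[p] lists the keys initially held by
// simulated processor p. Returns how many pairs each processor would own
// after redistributing every key to processor (key % nprocs).
inline std::vector<std::size_t> pairs_per_proc_after_aggregate(
    const std::vector<std::vector<unsigned>> &keys_per_proc) {
  const std::size_t nprocs = keys_per_proc.size();
  std::vector<std::size_t> counts(nprocs, 0);
  for (const auto &local : keys_per_proc)
    for (unsigned key : local) ++counts[key % nprocs];
  return counts;
}
```

For instance, if one of four simulated processors starts with keys 0 through 99 and the others start empty, this model assigns 25 keys to each processor after redistribution.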

This method is a parallel operation (aggregate()), followed by an on-processor operation (convert()).


Related methods: aggregate(), clone(), collapse(), compress(), convert()