3. Adding Callback Functions to OINK

In the oink directory, the files map_*.cpp, reduce_*.cpp, compare_*.cpp, hash_*.cpp, and scan_*.cpp each contain one or more functions that can be passed as callbacks to MR-MPI library calls, such as the map() and reduce() operations. This can be done either in named commands that you write, as described in this section of the documentation, or in MR-MPI library commands invoked directly from an OINK input script.

Together, these files and callback functions are effectively a library of tools that new named commands or your input script can use to speed the development of new MapReduce algorithms and workflows. Over time, we intend to add new callback functions to OINK, and we also invite users to send their own functions to the developers for inclusion in OINK.

The map(), reduce(), and scan() callback functions take a "void *ptr" as a final argument, which the caller can pass to the callback. This is typically done so that the callback function can access additional parameters stored by the caller. When doing this with functions listed in the map_*.cpp, reduce_*.cpp, and scan_*.cpp files in OINK, you will want to make the data these pointers point to "portable", so that any named command can use it. Thus you should not typically encode class-specific or command-specific data in the structure pointed to. Instead, your caller should create the minimal data structure that the callback function needs to operate, and define the structure in a map_*.h file that corresponds to the specific map_*.cpp file that contains the function (or reduce_*.h or scan_*.h). See the file oink/map_rmat_generate.h as an example. It contains the definition of an RMAT_params structure, which is used by both the rmat command and the map() methods it uses, listed in map_rmat_generate.cpp. Both the rmat.h and map_rmat_generate.cpp files include the map_rmat_generate.h header file to accomplish this. Other commands or callback functions could use the same data structure by including that header file.
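The pattern described above can be sketched as follows. This is a minimal illustration, not OINK source: Scale_params, scale_emit, the map_scale_emit.h/.cpp file names, and the small KeyValue stand-in are hypothetical names introduced here so the example compiles without the MR-MPI library. Only the callback shape (a task index, a KeyValue pointer, and a trailing void *ptr) is meant to follow the MR-MPI map() convention.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the MR-MPI KeyValue class so this sketch compiles on its own.
// Assumption: the real class offers add(key, keybytes, value, valuebytes).
class KeyValue {
 public:
  std::vector<std::pair<std::string, std::string>> pairs;
  void add(char *key, int keybytes, char *value, int valuebytes) {
    pairs.emplace_back(std::string(key, keybytes),
                       value ? std::string(value, valuebytes) : std::string());
  }
};

// --- would live in a hypothetical map_scale_emit.h ---
// Minimal, command-independent parameters needed by the callback.
struct Scale_params {
  uint64_t nper;   // number of key/value pairs each map task emits
  double factor;   // weight assigned to a key is factor * key
};

// --- would live in a hypothetical map_scale_emit.cpp ---
// Map callback: recovers its parameters from the trailing void pointer.
void scale_emit(int itask, KeyValue *kv, void *ptr) {
  Scale_params *p = static_cast<Scale_params *>(ptr);
  for (uint64_t i = 0; i < p->nper; i++) {
    uint64_t id = itask * p->nper + i;
    double w = p->factor * id;
    kv->add((char *)&id, sizeof(id), (char *)&w, sizeof(w));
  }
}

// --- caller side: any command that includes the header can do this ---
std::size_t run_scale_emit_example() {
  Scale_params params = {3, 0.5};
  KeyValue kv;
  // Stands in for the library invoking the callback once per map task.
  for (int itask = 0; itask < 2; itask++) scale_emit(itask, &kv, &params);
  return kv.pairs.size();
}
```

Because Scale_params carries no reference to the calling command's class, a second command could reuse scale_emit simply by including the same header and filling in its own parameter values.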

The following sections list the callback functions currently included in OINK, with a brief explanation of what each one does.

Note that map() functions come in 4 flavors, depending on which MR-MPI library map() method is being used. Similarly, scan() functions come in 2 flavors, as documented on the scan() method page. The map_*.cpp and scan_*.cpp files within OINK can contain any of the 4 flavors of map() function or the 2 flavors of scan() function.

3.1 Map() functions
3.2 Reduce() functions
3.3 Compare() functions
3.4 Hash() functions
3.5 Scan() functions

The documentation below this double line is auto-generated when the OINK manual is created. This is done by extracting C-style documentation text from the map_*.cpp, reduce_*.cpp, compare_*.cpp, hash_*.cpp, and scan_*.cpp files in the oink directory. Thus you should not edit content below this double line.

In the *.cpp files in the oink directory, the lines between a line with a "/*" and a line with a "*/" are extracted. In the tables below, the first such line of extracted text is assumed to be the function name and appears in the left column. The remaining lines appear in the right column.
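The extraction rule just described can be sketched in a few lines. This is not the script used to build the OINK manual; DocEntry and extract_docs are hypothetical names, and the sketch makes no attempt to skip comment blocks that are not function documentation (for example, a license header).

```cpp
#include <sstream>
#include <string>
#include <vector>

// One table row: function name (left column) and description lines (right).
struct DocEntry {
  std::string name;
  std::vector<std::string> lines;
};

// Strip leading and trailing blanks, tabs, and carriage returns.
static std::string trim(const std::string &s) {
  const char *ws = " \t\r";
  std::string::size_type b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  std::string::size_type e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Collect the text between a line containing "/*" and a later line
// containing "*/". The delimiter lines themselves are discarded.
std::vector<DocEntry> extract_docs(const std::string &source) {
  std::vector<DocEntry> entries;
  std::istringstream in(source);
  std::string line;
  DocEntry cur;
  bool inside = false;
  while (std::getline(in, line)) {
    if (!inside) {
      std::string::size_type open = line.find("/*");
      // Enter a block only if the comment is not closed on the same line.
      if (open != std::string::npos &&
          line.find("*/", open + 2) == std::string::npos) {
        inside = true;
        cur = DocEntry();
      }
      continue;
    }
    if (line.find("*/") != std::string::npos) {
      inside = false;
      if (!cur.name.empty()) entries.push_back(cur);
      continue;
    }
    std::string text = trim(line);
    if (text.empty()) continue;
    if (cur.name.empty()) cur.name = text;   // first line is the name
    else cur.lines.push_back(text);          // the rest is the description
  }
  return entries;
}
```

A practical consequence for anyone adding a callback: put the function name alone on the first line of its comment block, followed by a short description and the input and output conventions, so that the generated table row reads like the existing ones.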



Map() functions

add_label
    add a default integer label to each key, key could be vertex or edge
    input: key = anything, value = NULL
    output: key = unchanged, value = 1
add_weight
    add a default floating point weight to each key, key could be vertex or edge
    input: key = anything, value = NULL
    output: key = unchanged, value = 1.0
edge_to_vertex
    emit 1 vertex for each edge, just first one
    input: key = Vi Vj, value = NULL
    output: key = Vi, value = NULL
edge_to_vertex_pair
    emit 1 vertex for each edge, just first one
    input: key = Vi Vj, value = NULL
    output: key = Vi, value = NULL
edge_to_vertices
    emit 2 vertices for each edge
    input: key = Vi Vj, value = NULL
    output:
        key = Vi, value = NULL
        key = Vj, value = NULL
edge_upper
    emit each edge with Vi < Vj, drop self-edges with Vi = Vj
    input: key = Vi Vj, value = NULL
    output: key = Vi Vj, value = NULL, with Vi < Vj
invert
    invert key and value
    input: key, value
    output: key = value, value = key
read_edge
    read edges from file, formatted with 2 vertices per line
    output: key = Vi Vj, value = NULL
read_edge_label
    read edges and labels from file
    file format = 2 vertices and integer label per line
    output: key = Vi Vj, value = label
read_edge_weight
    read edges and weights from file
    file format = 2 vertices and floating point weight per line
    output: key = Vi Vj, value = weight
read_vertex_label
    read vertices and labels from file
    file format = vertex and integer label per line
    output: key = Vi, value = label
read_vertex_weight
    read vertices and weights from file
    file format = vertex and floating point weight per line
    output: key = Vi, value = weight
read_words
    read words from file, separated by whitespace
    output: key = word, value = NULL
rmat_generate
    generate graph edges via recursive R-MAT algorithm
    input: # to generate & R-MAT params extracted from RMAT_struct in ptr
    output: key = Vi Vj, value = NULL
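
As an illustration of how a table entry maps onto a callback, here is a sketch of a map() function that behaves like the edge_upper entry. It is not the OINK source. It assumes that a vertex is a 64-bit integer, that an edge key is two vertices packed in a struct, and that an edge with Vi > Vj is emitted with its vertices swapped rather than dropped. The KeyValue class is a stand-in so the example compiles without MR-MPI, and run_edge_upper is a hypothetical driver.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

typedef uint64_t VERTEX;          // assumption: vertex IDs are 64-bit ints
struct EDGE { VERTEX vi, vj; };   // assumption: edge key = two packed vertices

// Stand-in for the MR-MPI KeyValue class; it records emitted edge keys only.
class KeyValue {
 public:
  std::vector<EDGE> edges;
  void add(char *key, int keybytes, char *value, int valuebytes) {
    (void)keybytes; (void)value; (void)valuebytes;
    EDGE e;
    std::memcpy(&e, key, sizeof(EDGE));
    edges.push_back(e);
  }
};

// Map callback in the key/value-pair flavor: one call per existing edge.
void edge_upper_sketch(uint64_t itask, char *key, int keybytes,
                       char *value, int valuebytes,
                       KeyValue *kv, void *ptr) {
  (void)itask; (void)keybytes; (void)value; (void)valuebytes; (void)ptr;
  EDGE e;
  std::memcpy(&e, key, sizeof(EDGE));
  if (e.vi == e.vj) return;        // drop self-edges
  if (e.vi > e.vj) {               // reorder so that Vi < Vj
    VERTEX tmp = e.vi;
    e.vi = e.vj;
    e.vj = tmp;
  }
  kv->add((char *)&e, sizeof(EDGE), nullptr, 0);   // value = NULL
}

// Hypothetical driver that stands in for the library's map() loop.
std::vector<EDGE> run_edge_upper(std::vector<EDGE> in) {
  KeyValue kv;
  for (std::size_t i = 0; i < in.size(); i++)
    edge_upper_sketch(i, (char *)&in[i], sizeof(EDGE), nullptr, 0,
                      &kv, nullptr);
  return kv.edges;
}
```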

Reduce() functions

count
    count number of values associated with key
    input: KMV with key and one or more values
    output: key = unchanged, value = count
cull
    eliminate duplicate values
    input: KMV with key and one or more values (assumed to be duplicates)
    output: key = unchanged, value = first value
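
A reduce() callback matching the count entry can be very short. The sketch below is not the OINK source. It assumes the reduce() callback argument order given on the MR-MPI reduce() method page, uses a stand-in KeyValue class so it compiles on its own, and count_for is a hypothetical driver.

```cpp
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the MR-MPI KeyValue class. It decodes every value as an
// int, which is all this example needs.
class KeyValue {
 public:
  std::vector<std::pair<std::string, int>> pairs;
  void add(char *key, int keybytes, char *value, int valuebytes) {
    int count = 0;
    if (value && valuebytes == (int)sizeof(int))
      std::memcpy(&count, value, sizeof(int));
    pairs.emplace_back(std::string(key, keybytes), count);
  }
};

// Reduce callback: emit the key unchanged with the number of values as
// its new value. The values themselves are never inspected.
void count_sketch(char *key, int keybytes, char *multivalue, int nvalues,
                  int *valuebytes, KeyValue *kv, void *ptr) {
  (void)multivalue; (void)valuebytes; (void)ptr;
  kv->add(key, keybytes, (char *)&nvalues, sizeof(int));
}

// Hypothetical driver: pretend the library found nvalues values for word
// (word must be non-empty).
int count_for(const std::string &word, int nvalues) {
  KeyValue kv;
  std::vector<char> key(word.begin(), word.end());
  std::vector<int> sizes(nvalues, 0);
  count_sketch(key.data(), (int)key.size(), nullptr, nvalues,
               sizes.data(), &kv, nullptr);
  return kv.pairs[0].second;
}
```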

Compare() functions


Hash() functions


Scan() functions

print_edge
    print out an edge to a file
    input: key = Vi Vj, value = NULL
print_string_int
    print out key as string and value as int, to a file
    input: key = string, value = int
print_vertex
    print out a vertex to a file
    input: key = Vi, value = NULL
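
For scan() callbacks that write to a file, the void *ptr argument is a natural place to pass the open file. The sketch below shows a callback in the spirit of print_edge. It is not the OINK source. It assumes that the caller passes a FILE pointer through ptr, that an edge key is two packed 64-bit vertices, and that each edge is written as two integers on one line. The helper edge_line is a hypothetical name.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

typedef uint64_t VERTEX;          // assumption: vertex IDs are 64-bit ints
struct EDGE { VERTEX vi, vj; };   // assumption: edge key = two packed vertices

// Format one edge as "Vi Vj" followed by a newline.
std::string edge_line(const EDGE &e) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%" PRIu64 " %" PRIu64 "\n", e.vi, e.vj);
  return std::string(buf);
}

// Scan callback in the key/value-pair flavor. The caller is assumed to
// have opened the output file and passed its FILE pointer as ptr.
void print_edge_sketch(char *key, int keybytes, char *value, int valuebytes,
                       void *ptr) {
  (void)keybytes; (void)value; (void)valuebytes;
  EDGE e;
  std::memcpy(&e, key, sizeof(EDGE));
  FILE *fp = static_cast<FILE *>(ptr);
  std::fputs(edge_line(e).c_str(), fp);
}
```

Unlike map() and reduce() callbacks, a scan() callback receives no KeyValue argument, so it can only observe the data, which is why the output file must arrive through ptr.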