This is the home page for the PHISH library, which is the chief component of a simple framework for processing large volumes of data, either in a real-time streaming context or from files.
PHISH stands for Parallel Harness for Informatic Stream Hashing. The phishy metaphor is meant to evoke the image of many small minnows (programs) swimming in a stream (of data).
PHISH is a lightweight framework which a set of independent processes can use to exchange data as they run on the same desktop machine, on processors of a parallel machine, or on different machines across a network. This enables them to work in a coordinated parallel fashion to perform computations on either streaming or archived data.
The PHISH distribution includes a simple, portable library for performing data exchanges in useful patterns either via MPI message-passing or ZMQ sockets. PHISH input scripts are used to describe a data-processing algorithm, and an additional tool provided in the PHISH distribution converts the script into a form that can be launched as a parallel job.
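As a rough illustration of what a data-exchange pattern between independent processes can look like, the sketch below routes each item in a stream to one of several receivers by hashing its key, so that all items with the same key reach the same receiver. It is plain Python and does not use the PHISH library or its API. The names `route` and `run_stream` are hypothetical, and the receivers are simulated as in-process counters rather than real MPI or ZMQ processes.

```python
import zlib
from collections import Counter


def route(key: str, n_receivers: int) -> int:
    """Pick a receiver index deterministically from the key's bytes.

    crc32 is used instead of Python's built-in hash() so that the
    result is the same across runs and across processes.
    """
    return zlib.crc32(key.encode("utf-8")) % n_receivers


def run_stream(words, n_counters=3):
    """Simulate a hashed exchange: each word goes to exactly one counter."""
    # One Counter per hypothetical downstream process.
    counters = [Counter() for _ in range(n_counters)]
    for word in words:
        counters[route(word, n_counters)][word] += 1
    return counters


if __name__ == "__main__":
    stream = "the quick brown fox jumps over the lazy dog the end".split()
    for index, counter in enumerate(run_stream(stream)):
        print(index, dict(counter))
```

Because routing depends only on the key, each receiver can keep its own tally without coordinating with the others. In a real deployment the receivers would be separate processes connected by a message-passing layer.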
PHISH was developed at Sandia National Laboratories, a US Department of Energy facility, for use on informatics problems. It has a C-style interface that can be called from high-level languages such as C, C++, Fortran, and Python. PHISH is distributed as open-source software under the terms of the modified Berkeley Software Distribution (BSD) License. See this page for more details.
The authors of PHISH are Steve Plimpton and Tim Shead, who can be contacted at sjplimp at sandia.gov and tshead at sandia.gov.
These are other software packages that provide frameworks for streaming computations: