(Quick Reference)

3.7 Parallel Speculations - Reference Documentation

Authors: The Whole GPars Gang

Version: 1.2.0

3.7 Parallel Speculations

With processor cores having become plentiful, some algorithms might benefit from brutal-force parallel duplication. Instead of deciding up-front about how to solve a problem, what algorithm to use or which location to connect to, you run all potential solutions in parallel.

Parallel speculations

Imagine you need to perform a task like e.g. calculate an expensive function or read data from a file, database or internet. Luckily, you know of several good ways (e.g. functions or urls) to achieve your goal. However, they are not all equal. Although they return back the same (as far as your needs are concerned) result, they may all take different amount of time to complete and some of them may even fail (e.g. network issues). What's worse, no-one is going to tell you which path gives you the solution first nor which paths lead to no solution at all. Shall I run quick sort or merge sort on my list? Which url will work best? Is this service available at its primary location or should I use the backup one?

GPars speculations give you the option to try all the available alternatives in parallel and so get the result from the fastest functional path, silently ignoring the slow or broken ones.

This is what the speculate() methods on GParsPool and GParsExecutorsPool() can do.

def numbers = …
def quickSort = …
def mergeSort = …
def sortedNumbers = speculate(quickSort, mergeSort)

Here we're performing both quick sort and merge sort concurrently, while getting the result of the faster one. Given the parallel resources available these days on mainstream hardware, running the two functions in parallel will not have dramatic impact on speed of calculation of either one, and so we get the result in about the same time as if we ran solely the faster of the two calculations. And we get the result sooner than when running the slower one. Yet we didn't have to know up-front, which of the two sorting algorithms would perform better on our data. Thus we speculated.

Similarly, downloading a document from multiple sources of different speed and reliability would look like this:

import static groovyx.gpars.GParsPool.speculate
import static groovyx.gpars.GParsPool.withPool

def alternative1 = { 'http://www.dzone.com/links/index.html'.toURL().text }

def alternative2 = { 'http://www.dzone.com/'.toURL().text }

def alternative3 = { 'http://www.dzzzzzone.com/'.toURL().text //wrong url }

def alternative4 = { 'http://dzone.com/'.toURL().text }

withPool(4) { println speculate([alternative1, alternative2, alternative3, alternative4]).contains('groovy') }

Make sure the surrounding thread pool has enough threads to process all alternatives in parallel. The size of the pool should match the number of closures supplied.

Alternatives using dataflow variables and streams

In cases, when stopping unsuccessful alternatives is not needed, dataflow variables or streams may be used to obtain the result value from the winning speculation.

Please refer to the Dataflow Concurrency section of the User Guide for details on Dataflow variables and streams.

import groovyx.gpars.dataflow.DataflowQueue
import static groovyx.gpars.dataflow.Dataflow.task

def alternative1 = { 'http://www.dzone.com/links/index.html'.toURL().text }

def alternative2 = { 'http://www.dzone.com/'.toURL().text }

def alternative3 = { 'http://www.dzzzzzone.com/'.toURL().text //will fail due to wrong url }

def alternative4 = { 'http://dzone.com/'.toURL().text }

//Pick either one of the following, both will work: final def result = new DataflowQueue() // final def result = new DataflowVariable()

[alternative1, alternative2, alternative3, alternative4].each {code -> task { try { result << code() } catch (ignore) { } //We deliberately ignore unsuccessful urls } }

println result.val.contains('groovy')