Introduction to Parallel Algorithms

Using multiple cores to break a problem into pieces for faster execution
With the demand for ever-faster code falling on developers’ shoulders, Slashdot Media Contributing Editor Rick Leinecker looks at how best to parallelize for performance
Rick Leinecker, July 29, 2016
This blog marks the beginning of a series in which I will explore parallel algorithms. It is a natural topic to add to the mix here at Go Parallel, since parallel algorithms are the logical next step once you have learned the mechanics of parallelization. This blog explores the high-level concepts of parallel algorithms. In future blogs, probably once a month, I will drill down and show how various parallel algorithms work.
Why Parallel Algorithms?
The goal of parallel algorithms is to use a multi-core system to break an algorithm into pieces in order to achieve faster execution times. It only makes sense that if you can distribute the work among multiple cores, an algorithm will execute faster. I teach at the University of Central Florida, and we have an entire course dedicated to parallel algorithms. The topic is rich, and the research goes deep.
For the blogs that I will present over the next several months, I will take advantage of the parallelization technologies we have already looked at. These include OpenMP, Cilk, Threading Building Blocks, and straight Windows API threads.
Parallelizing Serial Algorithms
Note that we normally think of an algorithm as a sequence of steps that must be taken in order to solve a problem. I explain algorithms to my non-technical friends with a cake-making analogy. To make a cake, you mix the ingredients, put the batter in the oven, let the cake cool, and then ice it. Of course, there are lots of details in each step, but altogether they represent a cake-creation algorithm.
Parallel algorithms break the notion of a strict sequence. For example, you could have three people work together to make a cake. One person could measure and mix the dry ingredients while another measures and mixes the liquid ingredients. And the third person could grease the pan and preheat the oven. This simple example shows you how even making a cake can be broken up into parallel tasks. We sometimes refer to parallel tasks as concurrent processing.
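To make the analogy concrete, here is a minimal sketch of the three bakers expressed with OpenMP's sections construct, which lets independent tasks run on different cores. The task functions (mix_dry_ingredients and friends) are hypothetical stand-ins for the bakers' jobs, not part of any library.

#include <omp.h>
#include <stdio.h>

/* Hypothetical stand-ins for the three bakers' independent tasks. */
void mix_dry_ingredients(void)    { printf("Dry ingredients mixed\n"); }
void mix_liquid_ingredients(void) { printf("Liquid ingredients mixed\n"); }
void prepare_pan_and_oven(void)   { printf("Pan greased, oven preheating\n"); }

int main(void)
{
    /* Each section may run on a different core, just as each baker
       works independently of the others. */
    #pragma omp parallel sections
    {
        #pragma omp section
        mix_dry_ingredients();

        #pragma omp section
        mix_liquid_ingredients();

        #pragma omp section
        prepare_pan_and_oven();
    }

    /* The inherently sequential steps run only after the
       parallel region joins. */
    printf("Now bake, cool, and ice the cake in order\n");
    return 0;
}

The join at the end of the parallel region is what preserves the parts of the recipe that really must stay sequential.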
The need for parallel algorithms arose from our need for fast processing to meet the increasing demands made on technology. Consider our crowded airport air traffic, which is controlled by software. There is no time to wait for an algorithm to figure out who will land next; it needs to make decisions quickly, in real time. Another example is the analysis of satellite images for weather prediction and for national security. Image processing used to be a CPU-intensive, time-consuming process, but with parallel algorithms it can now be done in close to real time.
Parallelizing Code
We have shown a simple real-life example where a cake algorithm could be parallelized. Now let’s talk about some code. Consider the following code snippet that assigns trigonometric values to the elements of an array.
for (int i = 0; i < 10000000; i++)
{
    value[i] = (int)(sin((double)i) +
                     cos((double)i) + tan((double)i));
}
I added some timing code before and after this loop and saw that it executed in 1109 milliseconds. But if we can break the loop into pieces that can be split among the available cores, we could potentially speed up execution. The easiest solution, in my opinion, is to use OpenMP. Take a look at the following code, and note how a single pragma directive gives OpenMP the instruction to parallelize the loop.
#pragma omp parallel for
for (int i = 0; i < 10000000; i++)
{
    value[i] = (int)(sin((double)i) +
                     cos((double)i) + tan((double)i));
}
Now my timing code shows an execution time of 250 milliseconds, a fraction of the original 1109 milliseconds. The simple pragma directive caused the compiler to break the loop up so that the multiple cores on my computer could split the work. And this is how the magic of parallelization works. Thus we enter the world of parallel algorithms.
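In case you want to reproduce the measurement, here is one way to put a timing harness around the parallel loop. It is a minimal sketch, not my exact test program: omp_get_wtime() returns wall-clock seconds, and the value array is allocated on the heap so the example is self-contained.

#include <math.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 10000000

int main(void)
{
    /* Heap allocation keeps this sketch self-contained. */
    int *value = malloc(N * sizeof(int));
    if (value == NULL)
        return 1;

    double start = omp_get_wtime();   /* wall-clock time in seconds */

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
    {
        value[i] = (int)(sin((double)i) +
                         cos((double)i) + tan((double)i));
    }

    double elapsed = omp_get_wtime() - start;
    printf("Loop took %.0f milliseconds\n", elapsed * 1000.0);

    free(value);
    return 0;
}

Compile with OpenMP enabled (/openmp in Visual C++, -fopenmp in GCC), and comment out the pragma to time the serial version for comparison.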
Conclusion
I plan on writing one blog per month showing how classic algorithms such as sorts and searches can be reworked so that they are parallel algorithms. These examples will give you the basics you will need to write your own parallel algorithms.