Parallel processing is a way for two or more processors (CPUs) to work on different parts of a larger task at the same time. By giving different parts of a task to different processors, you can cut down on the time it takes to run a program. Parallel processing can be done on any system with more than one CPU. It can also be done on multi-core processors, which are common in computers today.
Multi-core processors are integrated circuit chips with two or more processor cores, built to improve performance, use less power, and handle multiple tasks more efficiently. A multi-core configuration is like having several separate processors in the same computer. Most consumer machines have between two and four cores, though some have a dozen or more.
Parallel processing is often used for demanding tasks and complicated calculations. Data scientists frequently use it for workloads that involve large amounts of data and computation.
Usually, a computer scientist uses a software tool to break a complicated task into multiple parts and assign each part to a different processor. Each processor solves its part, and software reassembles the data to produce the final answer or complete the task.
Most of the time, each processor works normally and does what it’s told, pulling data from the computer’s memory. Processors also use software to talk to each other so they can stay in sync when data values change. As long as all the processors stay in sync, software can put all the pieces of data together at the end of the task.
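The split-solve-reassemble cycle described above can be sketched in Python with the standard multiprocessing module. The squaring task and worker count here are invented for illustration; any function that works on an independent piece of data would do:

```python
from multiprocessing import Pool

def square(n):
    # Each worker process runs this on its share of the data.
    return n * n

def parallel_squares(numbers, workers=4):
    # The pool splits `numbers` across worker processes; each process
    # solves its part, and map() reassembles the results in order.
    with Pool(processes=workers) as pool:
        return pool.map(square, numbers)

if __name__ == "__main__":
    print(parallel_squares(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Here the pool plays the role of the coordinating software tool: the programmer never assigns work to a specific processor by hand.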
Even a computer without multiple processors can take part in parallel processing if it is part of a cluster of networked machines.
There are many kinds of parallel processing. SIMD and MIMD are two of the most common ones. SIMD, which stands for “single instruction, multiple data,” is a type of parallel processing in which two or more processors in a computer follow the same set of instructions but handle different sets of data. SIMD is usually used to look at big sets of data that are all based on the same set of benchmarks.
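As a rough array-level analogue of the SIMD model, a library such as NumPy applies a single operation to a whole set of data values at once. The prices and tax rate below are invented for the example:

```python
import numpy as np

# One "instruction" (an elementwise multiply or add) is applied to
# every data element at once, instead of looping element by element.
prices = np.array([10.0, 12.5, 9.75, 14.0])
tax = prices * 0.08      # same multiply applied across all elements
totals = prices + tax    # same add applied across all elements
```

On modern CPUs, expressions like these are typically dispatched to vectorized machine code, which is the same single-instruction, multiple-data idea at the hardware level.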
MIMD, which stands for “multiple instructions, multiple data,” is another common type of parallel processing in which each computer has two or more of its own processors and gets data from different data streams.
MISD, which stands for “multiple instructions, single data,” is another type of parallel processing that is less often used. In this case, each processor will use a different algorithm with the same input data.
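A minimal sketch of the MISD idea in Python, assuming two made-up algorithms (a mean and a spread) that each receive the same input data:

```python
from concurrent.futures import ThreadPoolExecutor

def mean(data):
    # Algorithm 1: average of the input.
    return sum(data) / len(data)

def spread(data):
    # Algorithm 2: range of the input.
    return max(data) - min(data)

def run_misd(data, algorithms):
    # MISD model: every algorithm gets the *same* input data,
    # but each applies its own instructions to it.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(alg, data) for alg in algorithms]
        return [f.result() for f in futures]

readings = [3, 7, 1, 9, 5]
results = run_misd(readings, [mean, spread])  # [5.0, 8]
```

This only mimics the taxonomy at the software level; true MISD hardware is rare in practice.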
Parallel processing uses two or more processors to work on multiple tasks at once, while serial processing (also called sequential processing) uses a single processor to finish one task at a time. A serial computer given more than one task completes them one after another, so it takes longer to finish a complicated task than a computer with parallel processors.
When computers were first made, they could only run one program at a time. A program full of calculations that took an hour to run and another program that copied tapes for an hour would together take two hours. An early form of parallel processing let both run at once by interleaving them: the computer would start the tape program’s I/O operation and, while waiting for it to finish, run the calculation-heavy program. Together, the two jobs would take a little over an hour.
Multiprogramming was the next step forward. In a multiprogramming system, each user’s program ran on the processor for a short slice of time, so to the users it looked as though all of the programs were running at once. These systems were the first to face resource-sharing problems. Explicit requests for resources led to “deadlock,” in which competing requests for the same resource prevented any program from using it, and competition for resources on machines with no tie-breaking instructions led to the critical-section routine.
Vector processing was another attempt to speed things up by doing more than one thing at a time. Here, machines gained the ability to add, subtract, multiply, or otherwise operate on two whole arrays of numbers with a single instruction. This helped in engineering applications, where vectors or matrices were the most natural way to represent data; vector processing was less useful where the data was less regular.
Multiprocessing was the next step toward doing things simultaneously. In these systems, the work was split between two or more processors. The first such machines were set up as master and slave: one processor (the master) was programmed to manage all the work in the system, while the other (the slave) performed only the tasks the master assigned to it. This arrangement was necessary because, at the time, no one knew how to program the machines to cooperate in managing the system’s resources.
SMP and MPP
Solving these problems led to the symmetric multiprocessing (SMP) system. In an SMP system, each processor is equally capable and has the same job: to manage the flow of work through the system. At first, the goal was to make SMP systems look exactly the same to programmers as multiprogramming systems with a single processor.
But engineers found that system performance could be improved by 10–20 percent if some instructions were executed out of order, provided programmers dealt with the added complexity. The problem becomes visible only when two or more programs read and write the same operands at the same time, so the burden falls on only a very few programmers, and only in very specialized circumstances. How SMP machines should handle shared data is still an open question.
When there are more processors in an SMP system, it takes longer for data to move from one part of the system to all the other parts. When the number of processors is somewhere between a few dozen and a few hundred, adding more processors doesn’t improve performance enough to be worth the extra cost. In order to get around the problem of long propagation times, a system for sending messages was made. In these systems, programs that share data send messages to each other when they change the value of an operand.
Instead of sending the new value of an operand to every part of a system, the new value is only sent to the programs that need to know it. Instead of shared memory, programs can send messages to each other through a network. This makes things easier so that hundreds or even thousands of processors can work well together in one system. So, these kinds of systems are called massively parallel processing (MPP) systems.
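A toy sketch of this message-passing idea in Python: rather than sharing memory, two processes exchange an operand’s new value through queues. The producer/consumer names and the values are invented for the example:

```python
from multiprocessing import Process, Queue

def producer(q):
    # When the operand changes, send the new value only to the
    # program that needs it, instead of broadcasting it system-wide.
    q.put(("total", 42))

def consumer(q, out):
    name, value = q.get()   # receive the message over the "network"
    out.put(value + 1)      # use the value and report a result

def demo():
    q, out = Queue(), Queue()
    p = Process(target=producer, args=(q,))
    c = Process(target=consumer, args=(q, out))
    p.start(); c.start()
    p.join(); c.join()
    return out.get()

if __name__ == "__main__":
    print(demo())  # 43
```

In a real MPP system the queues would be network links between separate machines, but the programming model is the same: no memory is shared, only messages.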
The most successful MPP applications have been for problems that can be broken down into many separate, independent operations on huge amounts of data. In data mining, you need to make multiple passes through a static database. In artificial intelligence, you have to evaluate many different options, as in a game of chess. Processors are often grouped into clusters in MPP systems: within each cluster the processors communicate as in an SMP system, and messages are passed only between clusters. Because operands can be reached either by message or by memory address, some MPP systems are called NUMA machines, for “Non-Uniform Memory Access.”
Programming an SMP machine isn’t too hard, but programming an MPP machine is. SMP machines work well on all kinds of problems, as long as there isn’t too much data to deal with. MPP systems are the only way to solve some problems, like getting information from large databases.