How does it work?

The two keys to the significant speed-up of the processing are the following:
Quick data read and quick sorting:

When the speed of sorting seismic data is recognized as a bottleneck in processing time, the typical solution is to buy a more expensive disk array system. To double the performance, you normally need a RAID array that is several times more expensive. Using the SeisJet Seismic Data Server software with the same hardware gives you a tenfold performance gain. Why? Because it reads the data in an optimal way, properly utilizing the large amounts (gigabytes) of RAM available on modern computers.

It is well known that reading data from hard disks (a RAID array) in an arbitrary order (so-called random access) is much slower than reading it sequentially, in the same order as it is stored (so-called sequential access). However, data can be accessed in any order very rapidly once it is loaded into RAM (which stands for Random-Access Memory). For this reason, the key to significantly accelerating data input is (1) reading as much data as possible from disk into RAM in its original order, and (2) then performing the random-access operations on the data (e.g. sorting) in RAM.

Of course, access to the data on disk cannot be made truly sequential if a large data volume has to be re-sorted on input, unless the whole dataset fits into RAM. However, it is possible to maximize the amount of data that is read from the disk sequentially by analysing the original order of the data on disk and the order required at the input of the processing flow. The SeisJet Seismic Data Server builds an optimized data-reading strategy and, as a result, reads the data from disk much more sequentially than straightforward seismic data input approaches typically do.

Optimized data distribution to parallelized flows:

Straightforward approaches to distributing data between several parallelized copies of a flow scale poorly.
This means that beyond a certain (typically small) number of nodes/processes executing the flow in parallel, a further increase in the number of nodes/processes no longer speeds up the processing: even though none of the computing resources (hard disks, network, processors) is fully loaded, execution is still slow. The typical solution is to buy expensive hardware: a low-latency network on top of expensive disk arrays. This leads to a moderate performance gain, but one that is not proportional to the expense. The SeisJet Seismic Data Server completely solves the network latency problems by means of optimized software design.

In fact, the main reasons for the poor scaling of parallel processing are typically the following:

Obviously, this type of problem can be solved by means of software. The SeisJet Seismic Data Server reads the data only once and takes care of truly parallel distribution of the data between the different nodes as quickly as possible.
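
The two-step idea described under "Quick data read and quick sorting" (read from disk in stored order, then reorder in RAM) can be illustrated with a small sketch. This is not the SeisJet algorithm, whose details are not given here; it is a generic batching technique under simplifying assumptions: traces are fixed-size records addressed by their position on disk, a "seek" is any read that does not continue directly from the previous record, and the names `count_seeks`, `plan_batched_reads`, `read_in_requested_order` and the parameter `ram_traces` are hypothetical.

```python
from typing import List, Sequence


def count_seeks(positions: Sequence[int]) -> int:
    """Count reads that do not directly follow the previously read record."""
    seeks = 0
    previous = None
    for pos in positions:
        if previous is None or pos != previous + 1:
            seeks += 1
        previous = pos
    return seeks


def plan_batched_reads(requested: Sequence[int], ram_traces: int) -> List[List[int]]:
    """Split the requested order into RAM-sized batches and sort each batch
    by disk position, so every batch is read in one forward sweep."""
    if ram_traces < 1:
        raise ValueError("ram_traces must be at least 1")
    return [
        sorted(requested[start:start + ram_traces])
        for start in range(0, len(requested), ram_traces)
    ]


def read_in_requested_order(
    disk: Sequence[str], requested: Sequence[int], ram_traces: int
) -> List[str]:
    """Return traces in the requested order using batched, forward-only reads."""
    if ram_traces < 1:
        raise ValueError("ram_traces must be at least 1")
    output: List[str] = []
    for start in range(0, len(requested), ram_traces):
        batch = requested[start:start + ram_traces]
        # Step 1: load the batch into RAM in ascending disk order.
        in_ram = {pos: disk[pos] for pos in sorted(batch)}
        # Step 2: random access happens in RAM, not on disk.
        output.extend(in_ram[pos] for pos in batch)
    return output


# Toy dataset: 3 shots x 4 receivers stored shot by shot (hypothetical layout).
disk = [f"shot{s}_rec{r}" for s in range(3) for r in range(4)]
# The flow wants the traces receiver by receiver instead.
requested = [s * 4 + r for r in range(4) for s in range(3)]

naive_seeks = count_seeks(requested)
batched_seeks = sum(count_seeks(b) for b in plan_batched_reads(requested, ram_traces=6))
all_in_ram_seeks = sum(count_seeks(b) for b in plan_batched_reads(requested, ram_traces=12))
```

In this toy case the naive read needs 12 seeks, one per trace. With room for 6 traces in RAM the batched plan needs 6, and with room for all 12 traces it needs a single sequential pass, while the output order stays exactly as requested. A real implementation would also have to account for trace sizes, read-ahead, and the actual disk layout, none of which is modelled here.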