Fault-tolerant scalable hierarchical scheduling in grid computing

Document Type: Original Article

Authors

Electrical Engineering Department, Assiut University, Egypt.

Abstract

Computational grids have the potential to solve large-scale scientific applications using heterogeneous, distributed, and possibly non-dedicated resources. Because the grid environment is dynamic in nature, scalable and fault-tolerant scheduling is needed to schedule parallel applications with inter-process communication. In this paper, we propose a hierarchical, fault-tolerant scheduling approach in which the application’s processes communicate indirectly, sending messages over the network through a mailbox-based communication technique at a shared node. In a grid, a process often migrates from one node to another, so this technique ensures reliable message delivery by preventing messages sent to a migrating process from being lost. A non-evolutionary mapping heuristic based on the Max-Min approach is also proposed for mapping such applications onto grid resources. Finally, the MPICH-V1 protocol is integrated into our scheduling framework, with the mailbox-based technique used in place of channel memories. Simulation results demonstrate that the proposed approach as a whole schedules grid applications in a scalable and fault-tolerant way, thereby ensuring that each application executes within its deadline and making the grid environment trustworthy.

Keywords