Distributed processing platform for large datasets: satellite imagery usecase

For most programs, the bulk of the running time is spent in a CPU-intensive component, while retrieving and reading the input or formatting the output takes a negligible amount of time, especially when both input and output are structured data. Consequently, most of the computer science literature on distributed systems discusses the optimization of algorithms and frequently treats I/O and input preparation as far less important for overall efficiency. This well-founded decision to ignore I/O when discussing runtime optimization can become a mistake when we consider algorithms that run over massive datasets which must also be retrieved at a user's request. In this paper, we propose and analyze a user-centric processing platform for running algorithms over satellite imagery products, including a review of the important challenges, solutions, and opportunities in extracting valuable results from GIS products. Section 4 of the paper presents two examples of real-life use of remote sensing for the observation of natural habitats. Section 5 presents three practical experiments that assess different aspects of the proposed solution.
Keywords: Big Data, Cloud Computing, Satellite Imagery, Processing Platform, Large Datasets