Department of Information Technology



From intelligent data acquisition via smart data-management to confident predictions


Cost-aware Application Development and Management using CLOUD-METRIC

Traditional application development tends to focus on two key objectives: the best possible performance and a scalable system architecture. This approach works well on private resources, but with the growing use of public IaaS it is essential to balance the cost and the performance of an application. Here we propose CLOUD-METRIC: a lightweight framework for cost-aware development of applications to be deployed in public clouds. The key functionality of CLOUD-METRIC is to allow users to develop applications on private IaaS (a dedicated cluster or an in-house cloud) while estimating the cost of running them on public IaaS. In addition to cost estimation, the framework can also be used for basic application monitoring and as a real-time programming support tool to find bottlenecks in the distributed architecture.
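
As a rough illustration of the cost-estimation idea, the sketch below multiplies resource usage recorded on a private cloud by public-cloud hourly prices. It is not CLOUD-METRIC's actual code: the names `UsageRecord`, `estimate_cost` and `HOURLY_PRICE_USD`, the flavor names, and all prices are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical hourly prices in USD (assumption); real public-cloud prices
# vary by provider, region and time.
HOURLY_PRICE_USD = {"small": 0.05, "medium": 0.10, "large": 0.20}


@dataclass
class UsageRecord:
    flavor: str     # instance size observed on the private cloud
    instances: int  # number of instances of that size
    hours: float    # how long they were running


def estimate_cost(records, prices=HOURLY_PRICE_USD):
    """Estimate what the recorded usage would cost at the given hourly prices."""
    total = 0.0
    for record in records:
        total += prices[record.flavor] * record.instances * record.hours
    return total


# Usage measured during a development run on private resources (made-up numbers).
usage = [UsageRecord("small", 4, 10.0), UsageRecord("large", 1, 2.5)]
estimated = estimate_cost(usage)
print(f"Estimated public-cloud cost: {estimated:.2f} USD")
```

A real estimator would also need to account for storage, network traffic and the provider's billing granularity, which this sketch leaves out.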

GROOT: Infrastructure_Security-as-a-Service (ISaaS)

Clouds are inherently complex because of the intricate interconnection of their underlying services. The OpenStack cloud suite eases the deployment and management of cloud services. However, security is one core area that is difficult to isolate and has to be addressed at each level, ranging from low-level system security to the user-facing multi-tenant environments. Solutions that offer end-to-end security are available, but most of them are proprietary, and their sophisticated licensing schemes and the expertise they require may be affordable for large enterprises but are difficult for small and medium-sized organizations to bear.

Recently we have started a project named GROOT. The aim is to design a minimalistic security service that gives cloud administrators first-hand information about any activities that may threaten an instance or a whole tenant in the cloud infrastructure. The project has the following three main components:

  • Best Practices Guide: An initial document that contains steps that are important to know before utilizing the resources. These include key-based logins, system updates and management of security rules.
  • Interactive System Monitoring: This part is the major contribution of this project. The aim is to minimize the workload of the cloud administrators while giving them a better understanding of activities that may create vulnerabilities. Rather than relying on log files or alerts via email, we have opted for an unconventional approach that uses Watchdogs (monitoring VMs) with Slack bots (software robots). In this setup, the bots work like active security agents that keep the cloud administrators informed about actions that need special attention.
  • Backend System Forensics: To guard against future attacks, system and security administrators often want to analyze the infected resource. For this purpose, we are working on a backend forensic tool that captures most of the important information before the instance is actually terminated. This work is still in progress.
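
The monitoring component described above can be pictured with a minimal sketch in which a watchdog scans events and prepares a chat alert. This is not GROOT's implementation: the event format, the failed-login rule, the threshold and the function names are all assumptions for illustration. The sketch only builds the JSON body (Slack incoming webhooks accept a JSON object with a `text` field) and does not send it anywhere.

```python
import json

# Hypothetical rule (assumption): flag an instance after this many failed logins.
FAILED_LOGIN_THRESHOLD = 5


def find_suspicious(events, threshold=FAILED_LOGIN_THRESHOLD):
    """Count failed logins per instance and keep those at or above the threshold."""
    counts = {}
    for event in events:
        if event["type"] == "failed_login":
            counts[event["instance"]] = counts.get(event["instance"], 0) + 1
    return {inst: n for inst, n in counts.items() if n >= threshold}


def build_slack_payload(suspicious):
    """Build the JSON body that a bot could post to a Slack incoming webhook."""
    lines = [f"{inst}: {n} failed logins" for inst, n in sorted(suspicious.items())]
    return json.dumps({"text": "Security watchdog alert\n" + "\n".join(lines)})


# Made-up events as a monitoring VM might collect them.
events = (
    [{"instance": "vm-1", "type": "failed_login"}] * 6
    + [{"instance": "vm-2", "type": "failed_login"}] * 2
)
suspicious = find_suspicious(events)
payload = build_slack_payload(suspicious)
print(payload)
```

Posting the payload would be a single HTTP POST to a webhook URL configured by the administrator; it is left out here so the sketch runs without network access.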

The presentation will cover the basic idea of the project, its challenges and details, and, last but not least, its current limitations and future directions.

HarmonicIO: A Stream-based Solution for Scalable Data Analysis

Many frameworks have been introduced to cope with the challenges that data velocity poses for streaming data. The data streaming rate is not constant but unpredictable. Over-provisioning is one strategy used to ensure quality of service. However, this approach is expensive, requires static settings, and results in inefficient resource utilization. Another key challenge is the tight coupling with the underlying infrastructure. To address these issues, we have designed a framework that supports stream-based data analysis under varying data rates with real-time scaling while efficiently utilizing the underlying resources.
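
To make the contrast with static over-provisioning concrete, here is a minimal sketch of a rate-driven scaling rule. It is not HarmonicIO's scheduler: the function name `workers_needed`, the per-worker throughput and the worker limits are assumptions chosen for the example.

```python
import math


def workers_needed(rate_msgs_per_s, per_worker_msgs_per_s,
                   min_workers=1, max_workers=32):
    """Return how many workers are needed to keep up with the observed rate.

    The limits and the per-worker throughput are hypothetical parameters.
    """
    if per_worker_msgs_per_s <= 0:
        raise ValueError("per-worker throughput must be positive")
    needed = math.ceil(rate_msgs_per_s / per_worker_msgs_per_s)
    # Clamp to the allowed range so the pool never drops to zero or grows unbounded.
    return max(min_workers, min(max_workers, needed))


# Made-up stream rates (messages per second) sampled over time.
observed_rates = [50, 480, 1200, 90]
plan = [workers_needed(rate, per_worker_msgs_per_s=100) for rate in observed_rates]
print(plan)
```

A statically over-provisioned setup would have to keep enough workers for the peak rate at all times, whereas this rule follows the stream up and down.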

A Flexible Computational Framework Using R and Map-Reduce

In quantitative trait locus (QTL) mapping, the significance of putative QTL is often determined using permutation testing. The computational effort needed to calculate the significance level is immense: from 10^4 up to 10^8 or even more permutations can be needed. In this project we have developed a flexible, parallel computing framework for identifying multiple interacting QTL using the PruneDIRECT algorithm. The framework uses the map-reduce model as implemented in Hadoop and is implemented in R, a widely used software tool among geneticists. This enables users to rearrange algorithmic steps to adapt genetic models, search algorithms, and parallelization steps to their needs in a flexible way.
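
The sketch below shows why permutation testing fits the map-reduce model: each map task runs an independent batch of permutations, and the reduce step only adds up counts. It is a generic single-marker permutation test written in Python for illustration, not the PruneDIRECT algorithm or the project's R and Hadoop code. The data, the test statistic and all function names are assumptions.

```python
import random
from functools import reduce


def mean_diff(genotypes, phenotypes):
    """Test statistic: absolute difference in mean phenotype between genotypes 0 and 1."""
    g0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    g1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def map_task(seed, n_perm, genotypes, phenotypes, observed):
    """Map step: run n_perm permutations and count statistics >= the observed one."""
    rng = random.Random(seed)  # one seed per task keeps the tasks independent
    shuffled = list(phenotypes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_diff(genotypes, shuffled) >= observed:
            exceed += 1
    return exceed, n_perm


def reduce_counts(a, b):
    """Reduce step: add up (exceed, total) pairs from the map tasks."""
    return a[0] + b[0], a[1] + b[1]


# Made-up data: one marker with two genotype classes and a clear phenotype shift.
genotypes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
phenotypes = [1.0, 1.2, 0.9, 1.1, 1.0, 2.0, 2.2, 1.9, 2.1, 2.0]
observed = mean_diff(genotypes, phenotypes)

# Four map tasks of 250 permutations each; run sequentially here.
partials = [map_task(seed, 250, genotypes, phenotypes, observed) for seed in range(4)]
exceed, total = reduce(reduce_counts, partials)

# Add one to numerator and denominator, a common convention that avoids p = 0.
p_value = (exceed + 1) / (total + 1)
print(total, p_value)
```

In a Hadoop-style deployment the map tasks would run on different nodes, and because only the small count pairs travel to the reducer, the number of permutations can be increased by adding tasks.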

A Scalable Infrastructure for CMS Data Analysis

The challenge of providing a resilient and scalable computational and data management solution for massive-scale research environments requires continuous exploration of new technologies and techniques. In this project the aim has been to design a scalable and resilient infrastructure for CERN HEP data analysis. The infrastructure is based on OpenStack components for structuring a private cloud with the Gluster File System. We integrate state-of-the-art cloud technologies with the traditional grid middleware infrastructure. Our test results show that the adopted approach provides a scalable and resilient solution for managing resources without compromising on performance and high availability.

Secure Cloud Connectivity for Scientific Applications

Cloud computing improves utilization and flexibility in allocating computing resources while reducing infrastructure costs. However, in many cases cloud technology is still proprietary and tainted by security issues rooted in the multi-user and hybrid cloud environment. The difficulty of securing connectivity in a hybrid cloud environment especially hinders the adoption of clouds in the scientific community, which needs to scale out local infrastructure using publicly available resources for large-scale experiments. In this project we have developed the DII-HEP secure cloud infrastructure and propose an approach to securely scale out a local setup to public clouds. Our approach is based on the Host Identity Protocol, which we employ to authenticate the hosts and to protect data flows in order to provide a secure inter- and intra-cloud infrastructure solution for scientific research. In addition, our solution offers a secure network fabric for scaling scientific applications across different cloud infrastructure providers.

E-FDIR System for Complex Network Structures

A Framework for Efficient Fault-Detection, Isolation and Recovery (E-FDIR) in Datacenters

The guarantee of high availability of services is the core of a datacenter's business model. It requires mechanisms to address challenges ranging from low-level hardware failures to disruption of high-level software communication in the system. In the world of datacenters, hardware failure is the norm rather than the exception, and it requires exceptional mechanisms to support long-term service availability. A number of tools and techniques have been developed to address the challenges of efficient fault detection, isolation and system recovery. Such techniques involve creating virtual groups of resources, known as zones; continuous monitoring; running fault-detection analysis; and taking system backups. The growing number of resources in datacenters generates a massive amount of monitoring data per day. Such large datasets are very useful for holistic resource management and future strategies but are inadequate for enabling real-time decisions. Thus, it is necessary to further explore efficient mechanisms for fault detection, isolation and real-time recovery of services in datacenters. In this project, I plan to explore state-based models for enhanced support of Service-Level Agreements (SLAs) in complex network structures such as datacenters.
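
One way to picture a state-based model for fault detection, isolation and recovery is a small state machine per resource. The sketch below is only an illustration of that general idea and not the model developed in the project: the states, the event names and the transition table are all assumptions.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    SUSPECT = "suspect"        # one missed heartbeat: a possible fault was detected
    ISOLATED = "isolated"      # fault confirmed: resource removed from service
    RECOVERING = "recovering"  # repair in progress


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.HEALTHY, "heartbeat_missed"): State.SUSPECT,
    (State.SUSPECT, "heartbeat_ok"): State.HEALTHY,
    (State.SUSPECT, "heartbeat_missed"): State.ISOLATED,
    (State.ISOLATED, "repair_started"): State.RECOVERING,
    (State.RECOVERING, "heartbeat_ok"): State.HEALTHY,
    (State.RECOVERING, "heartbeat_missed"): State.ISOLATED,
}


def step(state, event):
    """Apply one event; events with no matching transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.HEALTHY):
    """Return the sequence of states visited while processing the events."""
    history = [state]
    for event in events:
        state = step(state, event)
        history.append(state)
    return history


history = run(["heartbeat_missed", "heartbeat_missed", "repair_started", "heartbeat_ok"])
print([s.value for s in history])
```

Because each decision depends only on the current state and the latest event, such a model can react immediately without scanning the accumulated monitoring data.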

Chelonia: A Self-healing Distributed Store

Chelonia was designed to fill the requirements gap between large, sophisticated scientific collaborations, which have adopted the grid paradigm for their distributed storage needs, and corporate business communities, which are gravitating towards the cloud paradigm. The design of Chelonia has been chosen to optimize the reliability and scalability of an integrated system of heterogeneous, geographically dispersed storage sites, and to make it easy to expand the system dynamically.

Video link

Updated  2017-07-26 23:38:21 by Salman Toor.