2nd IFIP/IEEE International Workshop on Analytics for Network and Service Management

AnNet 2017

May 8, 2017 in Lisbon, Portugal

IFIP/IEEE International Symposium on Integrated Network Management
                   Lisbon, Portugal, 8-12 May 2017

11:00 - 12:30 Technical Session 1: SDN and Modeling
  Generating Synthetic Internet- and IP-Topologies using the Stochastic Block-Model
  Abstract: Developing models to generate realistic graphs of communication networks often requires a deep understanding and extensive analysis of the underlying network structure. Since deployed communication networks are dynamic, the findings a generator is based on may lose validity over time. We alleviate the need for extensive analysis of graphs by estimating the parameters of a probabilistic model. The model parameters encode the structure of the graph, which is thus learned in an unsupervised fashion. Synthetic graphs can then be generated from the model and will exhibit the previously inferred structure. To this end, we introduce the Stochastic Block Model (SBM) and a variant that allows for heavy-tailed degree distributions. The models originate in the social sciences and partition a graph into groups of nodes. To show the applicability of the models to synthetic graph generation in the domain of communication networks, we use one router-level and one IP-to-IP communication graph. We assess the quality of the generated models by evaluating a number of graph features and comparing our results to those obtained with the network generator Orbis. We find our approach to be on par with, or even outperforming, Orbis, and able to capture large-scale structure in communication networks.
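
  As a concrete illustration, the core SBM sampling step can be sketched in a few lines of Python (the names and parameter values below are illustrative, not the authors' implementation; the heavy-tailed variant mentioned above, the degree-corrected SBM, would additionally scale each edge probability by per-node degree parameters):

    import numpy as np

    def sample_sbm(n_nodes, group_probs, block_probs, seed=0):
        """Sample an undirected graph from a Stochastic Block Model."""
        rng = np.random.default_rng(seed)
        # Assign every node to a group, then draw each edge with the
        # probability given by the two endpoint groups.
        groups = rng.choice(len(group_probs), size=n_nodes, p=group_probs)
        adj = np.zeros((n_nodes, n_nodes), dtype=np.int8)
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                if rng.random() < block_probs[groups[i], groups[j]]:
                    adj[i, j] = adj[j, i] = 1
        return groups, adj

    # Two groups with dense intra-group and sparse inter-group links.
    groups, adj = sample_sbm(200,
                             group_probs=np.array([0.5, 0.5]),
                             block_probs=np.array([[0.10, 0.01],
                                                   [0.01, 0.10]]))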
  Virtual Instance Resource Usage Modeling: a Method For Efficient Resource Provisioning in Cloud
  Abstract: Cloud computing is a promising framework providing solutions ranging from software services to infrastructure services. Across this range of solutions, a cloud computing service is built on virtual instances. The cloud manager is responsible for provisioning resources to these instances so as to provide guaranteed performance while avoiding underutilization of the platform. Analyzing the resource usage behavior of these instances helps achieve both goals. We use Gaussian Mixture Models to mine the resource utilization of instances and build their corresponding usage models. We compared our modeling scheme with other statistical methods on a virtual instance placement algorithm. Our results support the efficiency and accuracy of our method in modeling the instances.
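
  The modeling step can be sketched with scikit-learn's GaussianMixture (a minimal sketch on synthetic data; the feature choice, component count, and data source are assumptions, not the paper's setup):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Hypothetical per-instance samples: rows are time points, columns
    # are (cpu_fraction, mem_fraction); real data would come from the
    # cloud manager's monitoring.
    rng = np.random.default_rng(0)
    usage = np.vstack([rng.normal([0.2, 0.3], 0.05, (500, 2)),   # idle
                       rng.normal([0.7, 0.6], 0.08, (200, 2))])  # busy

    # Each mixture component captures one recurring usage regime; the
    # fitted weights and means form a compact model that a placement
    # algorithm can compare across instances.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(usage)
    print(gmm.weights_)  # share of time spent in each regime
    print(gmm.means_)    # typical (cpu, mem) demand per regime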
  Tag-And-Forward: A Source-Routing Enabled Data Plane for OpenFlow Fat-Tree Networks
  Abstract: Software-Defined Networking (SDN) has made the Data Center Network (DCN) environment more flexible by decoupling the control plane from the data plane, allowing innovative and easily extensible network management solutions. Nowadays, OpenFlow is the most successful protocol for SDN. However, SDN based on the OpenFlow protocol presents performance issues related to the growth of the forwarding tables and to flow setup performance. Our proposal, named Tag-and-Forward (TF), is a data plane that reduces the number of flow table entries required in Fat-Tree software-defined DCNs to optimize packet forwarding. The results show a performance gain of roughly 53% on RTT and 40% on the packet transmission rate when compared to the traditional OpenFlow data plane.
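
  One way such tag-based source routing can work is sketched below, under the assumption that the tag packs the path components of a Fat-Tree route (the bit layout and field names are hypothetical, not TF's actual encoding):

    # A tag packs (pod, edge_switch, host_port) so that each switch on
    # the path reads its output port from the tag instead of keeping a
    # per-flow table entry.

    def make_tag(pod, edge_switch, host_port, bits=8):
        """Pack three path components into one integer tag."""
        return (pod << (2 * bits)) | (edge_switch << bits) | host_port

    def forward(tag, layer, bits=8):
        """Output port a switch at the given layer reads from the tag."""
        mask = (1 << bits) - 1
        if layer == "core":         # core switches pick the target pod
            return (tag >> (2 * bits)) & mask
        if layer == "aggregation":  # aggregation picks the edge switch
            return (tag >> bits) & mask
        return tag & mask           # edge picks the host-facing port

    tag = make_tag(pod=3, edge_switch=1, host_port=2)
    assert forward(tag, "core") == 3 and forward(tag, "edge") == 2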
  Learning in SDN-Based Multi-Tenant Cellular Networks: A Game-Theoretic Perspective
  Abstract: In order to cope with the challenges of increasing user bandwidth demands, as well as to create new revenues by offering innovative services and applications, Mobile Network Operators (MNOs) are willing to increase their networks' capabilities by making them more flexible, programmable, and agile. MNOs are also seeking new technologies to benefit from recent advances in cloud computing, such as the rapid deployment and elastic scaling of services from which cloud providers already benefit today. On one hand, the Software-Defined Networking (SDN) concept can enable network infrastructure sharing/slicing and elasticity for the "softwarization" of network elements. On the other hand, machine learning and game-theoretic concepts can be utilized to address the management and orchestration needs of services and applications and to improve the operation of the network infrastructure. In that regard, the joint use of machine learning, game-theoretic approaches, and SDN concepts for network slicing can benefit MNOs as well as infrastructure providers. In this paper, we utilize a regret-matching-based learning approach for efficient Remote Radio Head (RRH) assignment among MNOs in a software-defined cloud radio access network (C-RAN). Using a game-theoretic approach, we demonstrate convergence of the RRH allocations to a mixed-strategy Nash equilibrium and present significant performance improvements compared to a traditional assignment approach.
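
  The regret-matching procedure the paper builds on can be sketched generically (Python; payoffs_fn is a hypothetical stand-in for the utility an MNO would derive from each candidate RRH assignment):

    import numpy as np

    def regret_matching(payoffs_fn, n_actions, n_rounds, seed=0):
        """Play each action with probability proportional to its
        positive cumulative regret; the empirical play frequencies
        converge toward an equilibrium strategy."""
        rng = np.random.default_rng(seed)
        regrets = np.zeros(n_actions)
        plays = np.zeros(n_actions)
        for t in range(n_rounds):
            positive = np.maximum(regrets, 0.0)
            if positive.sum() > 0:
                probs = positive / positive.sum()
            else:
                probs = np.full(n_actions, 1.0 / n_actions)
            action = rng.choice(n_actions, p=probs)
            payoffs = payoffs_fn(t)  # payoff of every action this round
            regrets += payoffs - payoffs[action]  # regret vs. played one
            plays[action] += 1
        return plays / plays.sum()  # empirical mixed strategy

    # Toy run: action 1 has the highest payoff, so play concentrates on it.
    mix = regret_matching(lambda t: np.array([0.2, 0.8, 0.5]), 3, 5000)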
14:00 - 15:10 Technical Session 2: Security
  Knowledge Discovery of Port Scans from Darknet
  Abstract: Port scanning is widely used on the Internet prior to attacks in order to identify accessible and potentially vulnerable hosts. In this work, we propose an approach for discovering port scanning behavior patterns and the group properties of port scans. The approach is based on graph modelling and graph mining. It provides security analysts with relevant information about which services are jointly targeted and about the relationships among the scanned ports, which helps assess the skills and strategy of the attacker. We applied our method to data collected from a large darknet, i.e., a full /20 network where no machines or services are or have been hosted, to study scanning activities.
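
  A minimal version of the graph-modelling idea is sketched below (the record schema and the co-occurrence construction are assumptions for illustration; the paper's exact graph model may differ):

    import networkx as nx

    def port_cooccurrence_graph(scan_records):
        """Nodes are ports; an edge's weight counts how many sources
        probed both of its endpoints. scan_records is an iterable of
        (source_ip, set_of_ports)."""
        g = nx.Graph()
        for _src, ports in scan_records:
            ports = sorted(ports)
            for i, p in enumerate(ports):
                for q in ports[i + 1:]:
                    w = g.edges[p, q]["weight"] + 1 if g.has_edge(p, q) else 1
                    g.add_edge(p, q, weight=w)
        return g

    g = port_cooccurrence_graph([("10.0.0.1", {22, 23, 2323}),
                                 ("10.0.0.2", {22, 23})])
    # Heavily weighted edges reveal services that are jointly targeted.
    print(sorted(g.edges(data="weight")))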
  Investigation of Malicious Portable Executable File Detection on the Network using Supervised Learning Techniques
  Abstract: Malware continues to be a critical concern for everyone from home users to enterprises. Today, most devices are connected through networks to the Internet, so malicious code can spread easily and rapidly. The objective of this paper is to examine how malicious portable executable (PE) files can be detected on the network by utilizing machine learning algorithms. The efficiency and effectiveness of network-based detection rely on the number of features and on the learning algorithms used. In this work, we examined 28 features extracted from the metadata, packing, and imported DLLs and functions of four different types of PE files for malware detection. The results show that the proposed system achieves a 98.7% detection rate and a 1.8% false positive rate, with an average scanning speed of 0.5 seconds per file in our testing network environment.
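
  The classification stage can be sketched with scikit-learn (random stand-in features below; the paper's 28 static PE features, its classifier choice, and its data are not reproduced here):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split

    # Stand-in data: rows are PE files, the 28 columns stand for
    # features from metadata, packing, and imported DLLs/functions;
    # y = 1 marks malware.
    rng = np.random.default_rng(0)
    X = rng.random((2000, 28))
    y = (X[:, 0] + X[:, 5] > 1.0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_tr, y_tr)

    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    print("detection rate:", tp / (tp + fn))
    print("false positive rate:", fp / (fp + tn))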
  Exploring a Service-Based Normal Behaviour Profiling System for Botnet Detection
  Abstract: Effective detection of botnet traffic becomes difficult as attackers use encrypted payloads and dynamically changing port numbers (protocols) to bypass signature-based detection and deep packet inspection. In this paper, we build a normal-profiling-based botnet detection system using three unsupervised learning algorithms on service-based, flow-based data: the self-organizing map, the local outlier factor, and the k-NN outlier factor. Evaluations on publicly available botnet data sets show that the proposed system can reach up to a 91% detection rate with a false alarm rate of 5%.
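
  The local-outlier-factor branch of such a system can be sketched as follows (synthetic flow features; the paper's feature set and thresholds are not reproduced):

    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

    # Learn a per-service normal profile from benign flows only
    # (columns could be scaled duration, bytes, and packets per flow).
    rng = np.random.default_rng(0)
    normal_flows = rng.normal(0.0, 1.0, (1000, 3))
    profile = LocalOutlierFactor(n_neighbors=20, novelty=True)
    profile.fit(normal_flows)

    # Flows that deviate from the profile (label -1) are flagged as
    # potential botnet traffic, regardless of port or payload encryption.
    new_flows = np.vstack([rng.normal(0.0, 1.0, (5, 3)),
                           rng.normal(6.0, 1.0, (5, 3))])
    print(profile.predict(new_flows))  # +1 = normal, -1 = outlier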
16:00 - 17:10 Technical Session 3: Event and Log Analytics
  The Application of Neural Networks to Predicting the Root Cause of Service Failures
  Abstract: The principal objective when monitoring compute and communications infrastructure is to minimize the Mean Time To Resolution of service-impacting incidents. Key to achieving that goal is determining which of the many alerts presented to an operator are likely to be the root cause of an incident, which in turn is critical for identifying the alerts that should be investigated with the highest priority. Noise reduction techniques can be employed to reduce the quantity of alerts a network operator needs to examine, but even in favorable scenarios there may be multiple candidate alerts that need to be investigated before the root cause of the incident can be accurately identified, resolved, and full service resumed. The current contribution describes a novel technique, Probable Root Cause, that applies supervised machine learning in the form of neural networks to determine the alerts most likely to be responsible for a service-impacting incident. An evaluation of different models and model parameters is presented, and the effectiveness of the approach is demonstrated against sample data from a large commercial environment.
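
  The ranking idea can be sketched with a small feed-forward network (the alert encoding and network size below are assumptions; the paper evaluates several models and parameter choices):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Hypothetical encoding: each row describes one alert within an
    # incident (type, severity, topology features, ...); y = 1 if the
    # alert was later confirmed as the incident's root cause.
    rng = np.random.default_rng(0)
    X = rng.random((5000, 12))
    y = (X[:, 2] > 0.8).astype(int)

    net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                        random_state=0).fit(X, y)

    # At triage time, score every candidate alert of an incident and
    # investigate in descending order of root-cause probability.
    candidates = rng.random((7, 12))
    order = np.argsort(net.predict_proba(candidates)[:, 1])[::-1]
    print("investigate alerts in order:", order)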
  Discovering Cloud Operation History through Log Analysis
  Abstract: Management costs in private clouds can be reduced by reviewing the 'operation history,' which is defined as a holistic view of past operation executions. The operation history provides insights into the breakdown of operations: the breakdown clarifies cost-dominant operations to be improved and repetitive ones to be automated. Towards obtaining the operation history, a conventional approach relying on manual investigation is time-consuming, and another relying on agent-based monitoring is often not acceptable in sensitive, mission-critical enterprise clouds. Unlike these approaches, our idea is to discover the operation history by automatically analyzing 'system logs' that are easily accessible even in sensitive clouds. Since system logs contain only low-level debugging messages about programmatic events, without direct context about operations, the challenge is to recover high-level operational contexts from low-level system logs. To address this challenge, we develop a method that first abstracts system logs using a pre-defined event sequence model and then maps the abstracted events to high-level individual operations; this mapping between different contextual levels is achieved by using complementary cross-cloud reference data. Evaluation of an implementation revealed that the method reduces the time taken to discover the history by 99.9% compared to a conventional approach, while achieving up to 95% correctness.
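
  The two-stage mapping can be sketched as follows (the regex templates and the operation signature are invented examples; the paper uses a pre-defined event sequence model and cross-cloud reference data):

    import re

    # Stage 1: abstract raw log lines into event types via templates.
    EVENT_TEMPLATES = [
        ("VM_CREATE_REQ", re.compile(r"POST /servers")),
        ("VM_SCHEDULED",  re.compile(r"scheduler: selected host")),
        ("VM_BOOTED",     re.compile(r"instance .* booted")),
    ]

    # Stage 2: map abstracted event sequences to high-level operations.
    OPERATION_SIGNATURES = {
        ("VM_CREATE_REQ", "VM_SCHEDULED", "VM_BOOTED"): "create-instance",
    }

    def abstract(log_lines):
        events = []
        for line in log_lines:
            for name, pattern in EVENT_TEMPLATES:
                if pattern.search(line):
                    events.append(name)
                    break
        return tuple(events)

    logs = ["2017-05-08 POST /servers id=42",
            "2017-05-08 scheduler: selected host c3",
            "2017-05-08 instance 42 booted in 21s"]
    print(OPERATION_SIGNATURES.get(abstract(logs), "unknown operation"))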
  Random Access in Nondelimited Variable-length Record Collections for Parallel Reading with Hadoop
  Abstract: The industry-standard Packet CAPture (PCAP) format for storing network packet traces is normally only readable serially due to its lack of delimiters, indexing, or blocking. This presents a challenge for parallel analysis of large networks, where packet traces can be many gigabytes in size. In this work, we present RAPCAP, a novel method for random access into variable-length record collections like PCAP that identifies a record boundary within a small number of bytes of the access point. Unlike related heuristic methods, the new method offers a correctness guarantee for a well-formed file and does not rely on prior knowledge of the contents. We include a practical implementation of the algorithm with an extension to the Hadoop framework, and a performance comparison to serial ingestion. Finally, we present a number of similar storage types that could utilize a modified version of RAPCAP for random access.
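
  A simplified boundary-detection heuristic in the spirit of the paper is sketched below for the classic little-endian, microsecond-resolution PCAP format (this chain-validation sketch does not reproduce RAPCAP's correctness guarantee):

    import struct

    PKT_HDR = struct.Struct("<IIII")  # ts_sec, ts_usec, incl_len, orig_len

    def plausible(buf, off, max_caplen=65535):
        """Could a PCAP per-packet header start at this offset?"""
        if off + PKT_HDR.size > len(buf):
            return False
        _sec, usec, incl_len, orig_len = PKT_HDR.unpack_from(buf, off)
        return usec < 1_000_000 and 0 < incl_len <= min(orig_len, max_caplen)

    def find_boundary(buf, start, chain=4):
        """Scan forward from `start` for an offset at which `chain`
        consecutive packet headers all look valid."""
        for off in range(start, len(buf)):
            pos, ok = off, 0
            while ok < chain and plausible(buf, pos):
                incl_len = PKT_HDR.unpack_from(buf, pos)[2]
                pos += PKT_HDR.size + incl_len  # skip to the next header
                ok += 1
            if ok == chain:
                return off
        return None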