This middleware will give Network Scientists access to an unparalleled computational and analytic environment for research, education and training. By harnessing new cloud-based resources in an easily accessible manner, this project will enable Network Science researchers to tackle larger, more complex problems. The project vision is to provide researchers, analysts and educators interested in Network Science with an easy-to-use cyber-environment that is accessible from their desktop and integrates into their daily work. A key goal is to greatly expand the size of networks that are routinely studied from hundreds or thousands of nodes to hundreds of millions of nodes. It will leverage the technology, data and experience of a multi-institutional team.

**Resources**

*Edison*

EDISON uses big data and data mining to analyze social dynamics on networks. Studying dynamics on networked populations helps in understanding the social processes that unfold on them.

*Granite*

This tool contains network analysis libraries to compute structural characteristics of networks. The libraries are GaLib, NetworkX, and SNAP.

*Course Materials for CINET Training*

CINET is actively being used in classrooms in several universities. The platform for research, teaching and collaboration offers an environment for resource sharing in network science.

**Goals**

*A broadly accessible cyber-infrastructure for network analysis*

A web portal that hides the details of computation and data management, thereby minimizing the learning effort required.

*A flexible framework*

Allows easy extension by integrating off-the-shelf network analysis suites for analysis and visualization; this means new algorithms can be added easily over time.

*Fostering research, teaching and collaboration*

Building a broad user base, from multiple disciplines, including incorporation into courses on network science at many different universities.

*A self-sustainable framework*

Users can contribute new networks, data, algorithms, hardware, and research results.

*Self-manageability*

End users will be insulated from the complexities of resource allocation, scheduling, cross-platform interactions, and other low-level concerns.

**Features**

*Structural Network Analysis*

Implementations of more than 70 network analysis algorithms of various types, including shortest paths, subgraph and motif counting, centrality, and graph traversal.

*Dynamic Network Analysis*

Analysis of the phasic structure of a graph dynamical system (e.g., the spread of dynamic phenomena such as rumors through networks).

*A Rich Collection of Networks*

Provides more than 110 networks from various areas, such as social networks, web/internet networks, biological networks, infrastructure and transportation networks, and artificial networks. Network sizes vary from 100 nodes/edges to ~10M nodes (410M edges).

*Network Generators*

Implementations of ~20 random and deterministic network generators, such as Barabási-Albert, Erdős-Rényi, small-world, and star graphs.

*Adding New Networks*

Users can add networks of interest to the system through an easy-to-use web interface and can keep them either private or public. Currently the system supports the edge list and adjacency list network formats.

*Adding New Analytical Tools (Graph Algorithms)*

Users can add new graph algorithms for network analysis with administrative assistance. Please contact the CINET team if you have such algorithms to add.

*Network Visualization*

An integrated visualization module that supports a wide range of visualizations, as listed below.

- Multiple layout algorithms: Random, Force Atlas, Yifan Hu, etc.
- Feature-based organization: Determining node size and color by degree, betweenness, etc.
- Coloring communities: Applying a community detection algorithm to visualize different communities in different colors.

*Computing Resources*

Utilizes both traditional high-performance computing clusters, e.g., Shadowfax and Pecos (Virginia Tech), and cloud computing infrastructure, e.g., FutureGrid. An intelligent resource manager chooses an appropriate computing platform for each network analysis job, based on resource availability and the job's computational and memory requirements. The underlying HPC system is highly scalable, with ~7000 processing cores in total.

*Platform for research, teaching and collaboration*

Researchers and collaborators can interact with CINET by running analytics on networks, visualizing networks, adding networks to the system, and adding structural analysis tools. CINET is actively being used in classrooms in several universities (e.g., University at Albany – State University of New York, Virginia Tech, North Carolina A&T State University).

GaLib has been developed at the Network Dynamics and Simulation Science Laboratory at Virginia Tech, specifically for computing with very large graphs. GaLib provides efficient implementations of various classical and new graph algorithms that are motivated by the analysis of social contact graphs and disease dynamics on such graphs, including:

- Analysis of the subgraph structure and relational labeled graph queries, e.g., counts and relative frequencies of subgraphs such as cliques and stars.
- Graph measures (algorithms) related to disease dynamics on social contact networks, e.g., the vulnerability of a node, defined as the probability that the node gets infected. GaLib also provides subgraph queries motivated by disease dynamics.
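The vulnerability measure can be illustrated with a small Monte Carlo sketch. This is not GaLib code (GaLib is written in C++). It assumes a simple contagion model in which an outbreak starts at a uniformly random node and each infected node gets one chance to infect each susceptible neighbor with probability `p`. The function names `simulate_outbreak` and `estimate_vulnerability` and the example network are hypothetical.

```python
import random
from collections import deque

def simulate_outbreak(adj, start, p, rng):
    """Run one outbreak from `start`; return the set of nodes ever infected."""
    infected = {start}
    frontier = deque([start])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            # Each infected node gets a single chance to infect each
            # still-susceptible neighbor, succeeding with probability p.
            if v not in infected and rng.random() < p:
                infected.add(v)
                frontier.append(v)
    return infected

def estimate_vulnerability(adj, p, runs=1000, seed=0):
    """Estimate, per node, the fraction of outbreaks in which it is infected."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    hits = {v: 0 for v in nodes}
    for _ in range(runs):
        start = rng.choice(nodes)  # assumption: uniformly random initial case
        for v in simulate_outbreak(adj, start, p, rng):
            hits[v] += 1
    return {v: hits[v] / runs for v in nodes}

# Hypothetical contact network: a star with hub 0 and four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
vulnerability = estimate_vulnerability(star, p=0.5, runs=2000, seed=1)
print(vulnerability)
```

In this toy network the hub lies on every transmission path between leaves, so its estimated vulnerability should come out higher than that of any leaf.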

For some of these algorithms, GaLib implements sampling-based approximation algorithms known in the literature, with error guarantees that users can control. Compared with the exact implementations provided in other libraries, this leads to significant speedups. The data structures and algorithms in GaLib are carefully tuned to run on large graphs containing up to millions of nodes. GaLib is written in C++.
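As an illustration of sampling with a user-controlled error guarantee, the sketch below estimates the average local clustering coefficient from a random sample of nodes. It is an assumption-laden example, not GaLib's method: the choice of measure, the use of Hoeffding's inequality to pick the sample size, and all function names are hypothetical. Because each local clustering coefficient lies in [0, 1], sampling n ≥ ln(2/δ) / (2ε²) nodes uniformly with replacement keeps the estimate within ε of the true average with probability at least 1 − δ.

```python
import math
import random

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves adjacent."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))

def hoeffding_sample_size(epsilon, delta):
    """Samples needed so that P(|estimate - truth| >= epsilon) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def approx_average_clustering(adj, epsilon=0.05, delta=0.05, seed=0):
    """Estimate the average local clustering coefficient by node sampling."""
    rng = random.Random(seed)
    nodes = list(adj)
    n_samples = hoeffding_sample_size(epsilon, delta)
    total = sum(local_clustering(adj, rng.choice(nodes)) for _ in range(n_samples))
    return total / n_samples

# Hypothetical test graph: the complete graph on 5 nodes (adjacency sets).
k5 = {u: {v for v in range(5) if v != u} for u in range(5)}
print(hoeffding_sample_size(0.05, 0.05), approx_average_clustering(k5))
```

The sample size depends only on ε and δ, not on the number of nodes, which is where the speedup over an exact computation comes from on large graphs.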

In addition to the sequential algorithms, GaLib contains a number of distributed-memory parallel algorithms for generating massive networks (e.g., Preferential-Attachment, Chung-Lu models, etc.), and for computing different graph algorithms (e.g., counting motifs or subgraphs, performing edge switches in a simple graph, computing clustering coefficients, number of triangles, multinomial distribution, etc.). The parallel algorithms can handle massive graphs (i.e., graphs with billions of nodes and hundreds of billions of edges) that the sequential algorithms cannot even load into memory.
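To make the edge-switch operation concrete, here is a minimal sequential sketch, assuming the common definition: pick two edges (a, b) and (c, d) at random and replace them with (a, d) and (c, b), rejecting any switch that would create a self-loop or a parallel edge. Every node keeps its degree and the graph stays simple. This is a single-machine illustration, not GaLib's distributed-memory algorithm; the function names and the ring example are hypothetical.

```python
import random
from collections import Counter

def edge_switch(edges, n_switches, seed=0):
    """Apply up to n_switches degree-preserving switches to a simple graph."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    if len(edge_set) != len(edge_list) or any(u == v for u, v in edge_list):
        raise ValueError("input must be a simple graph")
    if len(edge_list) < 2:
        return edge_list
    done, tries = 0, 0
    max_tries = 100 * n_switches  # stop eventually if few valid switches exist
    while done < n_switches and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # choose between the two possible rewirings
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a parallel edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list

def degrees(edges):
    """Degree of every node appearing in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# Hypothetical input: a ring on 10 nodes, where every node has degree 2.
ring = [(i, (i + 1) % 10) for i in range(10)]
switched = edge_switch(ring, n_switches=20, seed=42)
print(sorted(switched))
```

Whatever switches are accepted, the degree sequence of `switched` matches that of `ring`, which is the property that makes edge switching useful for randomizing a network while holding degrees fixed.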

We have adapted a few graph algorithms (implementations) of GaLib for CINET. GaLib, along with NetworkX and SNAP, is used in CINET as a computation engine for network analysis.

The GaLib manual lists the graph algorithms currently adapted from GaLib for inclusion in CINET.

NetworkX is a powerful Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It was developed at Los Alamos National Laboratory by Aric Hagberg and his group, and was first released in 2005 for public use as an open source software package. NetworkX scales up to networks with hundreds of thousands of nodes, and provides algorithms for generating various kinds of random networks and for computing several properties of networks. For more information on NetworkX, please visit: http://networkx.github.io/

We have adapted a few graph algorithms (implementations) of NetworkX for CINET. NetworkX, along with GaLib and SNAP, is used in CINET as a computation engine for network analysis.

The NetworkX manual lists the graph algorithms currently adapted from NetworkX for inclusion in CINET.

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library developed at Stanford University by Jure Leskovec and his group. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. For more information on SNAP, please visit: http://snap.stanford.edu.

We have adapted a few graph algorithms (implementations) of SNAP for CINET. SNAP, along with GaLib and NetworkX, is used in CINET as a computation engine for network analysis.

The SNAP manual lists the graph algorithms currently adapted from SNAP for inclusion in CINET.

CINET provides implementations of more than 70 network analysis algorithms of various types, including shortest paths, subgraph and motif counting, centrality, and graph traversal.
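For readers new to these algorithm families, the sketch below shows one small representative of each on a toy graph: breadth-first traversal for unweighted shortest-path lengths, degree centrality, and triangle counting as the simplest motif count. It uses only the Python standard library and is not the code CINET runs; the function names and the example graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (graph traversal)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree_centrality(adj):
    """Degree divided by the maximum possible degree (assumes >= 2 nodes)."""
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}

def count_triangles(adj):
    """Count each triangle once by enforcing the node order u < v < w."""
    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                for w in adj[v]:
                    if v < w and w in adj[u]:
                        count += 1
    return count

# Hypothetical graph: a triangle 0-1-2 with a tail 2-3-4 (adjacency sets).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
print(bfs_distances(graph, 0))
print(degree_centrality(graph))
print(count_triangles(graph))
```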

In upcoming versions, CINET will also include the capability to simulate diffusion processes on networks. We will integrate multiple different simulation codes that have been developed at NDSSL, to provide different diffusion models and simulation capabilities.

EpiFast: A fast, scalable, distributed memory simulation tool capable of representing and reasoning about complex interventions and public policies.

Indemics: An interactive epidemic simulator that allows online interaction between a user and the simulation engine. It integrates a database with the simulation engine using abstractions and data models that allow efficient queries.

InterSim: A general-purpose flexible framework for simulating Graph Dynamical Systems and their generalizations. These include general kinds of (vector valued) update functions, interaction networks, update orders, and finite state machines (FSM) that describe state transitions. InterSim also integrates with Indemics.
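A graph dynamical system combines an interaction network, a state for each node, a local update function, and an update order. The sketch below is a minimal instance under stated assumptions: binary states, a progressive threshold update function (a node switches to state 1 once enough neighbors are in state 1 and never switches back), and a synchronous update order. It is not InterSim code, and the function names and example are hypothetical.

```python
def threshold_update(adj, state, threshold):
    """One synchronous step: every node reads the old states of its neighbors."""
    new_state = {}
    for v in adj:
        active_nbrs = sum(state[u] for u in adj[v])
        stays_active = state[v] == 1  # progressive: no transition from 1 to 0
        new_state[v] = 1 if stays_active or active_nbrs >= threshold[v] else 0
    return new_state

def run_to_fixed_point(adj, state, threshold, max_steps=1000):
    """Iterate until the state stops changing; return the list of states visited."""
    trajectory = [dict(state)]
    for _ in range(max_steps):
        nxt = threshold_update(adj, trajectory[-1], threshold)
        if nxt == trajectory[-1]:
            break
        trajectory.append(nxt)
    return trajectory

# Hypothetical example: a path 0-1-2-3 with node 0 initially active.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
initial = {0: 1, 1: 0, 2: 0, 3: 0}
cascade = run_to_fixed_point(path, initial, threshold={v: 1 for v in path})
blocked = run_to_fixed_point(path, initial, threshold={v: 2 for v in path})
print(len(cascade), cascade[-1])
print(len(blocked), blocked[-1])
```

With threshold 1 the activation moves one hop per step until every node is active; with threshold 2 no node ever has two active neighbors, so the initial state is already a fixed point. Swapping in a different update function or a sequential update order changes only `threshold_update`.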

CINET provides more than 110 networks from various areas, such as social networks, web/internet networks, biological networks, infrastructure and transportation networks, and artificial networks. Networks can be visualized using different layout algorithms and feature-based organizations, e.g., sizing nodes by degree or betweenness centrality, or applying a community detection algorithm. Users can add their own networks and make them either public or private. CINET supports the following two network representations:

- Adjacency list (GaLib) format
- Edge list (NetworkX) format
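The difference between the two representations can be sketched with a pair of small parsers. The layouts assumed here are the whitespace-separated conventions that NetworkX uses for its edge-list and adjacency-list files (one `u v` pair per line, or one `u v1 v2 ...` line per node, with `#` starting a comment). The exact GaLib adjacency-list layout accepted by CINET is not specified here and may differ. Function names and sample data are hypothetical.

```python
def _clean_lines(text):
    """Yield non-empty lines with '#' comments stripped."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            yield line

def parse_edge_list(text):
    """Each line is 'u v'; extra columns (e.g., weights) are ignored."""
    adj = {}
    for line in _clean_lines(text):
        u, v = line.split()[:2]
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def parse_adjacency_list(text):
    """Each line is 'u v1 v2 ...': a node followed by its neighbors."""
    adj = {}
    for line in _clean_lines(text):
        u, *nbrs = line.split()
        adj.setdefault(u, set())
        for v in nbrs:
            adj[u].add(v)
            adj.setdefault(v, set()).add(u)
    return adj

def to_edge_list(adj):
    """Serialize an undirected graph as sorted 'u v' lines, one per edge."""
    return "\n".join(sorted(f"{u} {v}" for u in adj for v in adj[u] if u < v))

# The same hypothetical four-node graph in both formats.
edge_text = "a b\nb c\nc a\nc d\n"
adj_text = "a b c\nb c\nc d\nd\n"
print(parse_edge_list(edge_text) == parse_adjacency_list(adj_text))
print(to_edge_list(parse_adjacency_list(adj_text)))
```

An adjacency list can represent isolated nodes (a line with no neighbors), while a plain edge list cannot, which is worth keeping in mind when converting between the two.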

*Participating Institutions and Investigators:*

Virginia Tech: Madhav V. Marathe, Keith R. Bisset, Edward A. Fox, Maleq Khan, Chris J. Kuhlman, Anil Vullikanti, Henning Mortveit, Samarth Swarup

Indiana University: Geoffrey C. Fox, Judy Qiu

University of Houston-Downtown: Ongard Sirisaengtaksin

University of Chicago and Argonne National Laboratory: Kamil Iskra

Northwestern University: Peter Beckman

Jackson State University: Richard A. Alo

North Carolina Agricultural and Technical State University: Albert Esterline

University at Albany, State University of New York: S. S. Ravi

*External collaborators:*

University of Illinois at Urbana-Champaign: Zsuzsanna Fagyal

Clemson University: Matthew Macauley

Virginia Tech: T. M. Murali, Rahul Kulkarni

*2015 Workshop Organizing Committee:*

Keith Bisset, Maleq Khan, Chris Kuhlman, Madhav Marathe, S.S. Ravi, Sherif Abdelhamid, S.M. Arifuzzaman, Md Hasanuzzaman Bhuiyan, S.M. Shamimul Hasan

- Abdelhamid S, Alo R, Arifuzzaman S, Beckman P, Bhuiyan M, Bisset K, Fox E, Fox G, Hall K, Hasan S, Joshi A, Khan M, Kuhlman C, Lee S, Leidig J, Makkapati H, Marathe M, Mortveit H, Qiu J, Ravi S, Shams Z, Sirisaengtaksin O, Subbiah R, Swarup S, Trebon N, Vullikanti A, Zhao Z (2012) CINET: A CyberInfrastructure for Network Science. In The 8th IEEE International Conference on eScience, 2012. Chicago, IL, October 8-12, 2012.
- Abdelhamid S, Alam M, Alo R, Arifuzzaman S, Beckman P, Bhattacharjee T, Bhuiyan H, Bisset K, Eubank S, Esterline A, Fox E, Fox G, Hasan S, Hayatnagarkar H, Khan M, Kuhlman C, Marathe M, Meghanathan N, Mortveit H, Qiu J, Ravi S, Shams Z, Sirisaengtaksin O, Swarup S, Vullikanti A, Wu T (2014) CINET 2.0: A CyberInfrastructure for Network Science. In The 10th IEEE International Conference on eScience, 324-331.
- Abdelhamid et al., “GDSCalc: A Web-Based Application for Evaluating Discrete Graph Dynamical Systems,” PLOS ONE, 2015.

- Kuhlman C, Kumar VS Anil, Marathe M, Ravi S, Rosenkrantz D (2015) Inhibiting diffusion of complex contagions in social networks: theoretical and experimental results. Data Mining and Knowledge Discovery, 29(2): 423-465.
- Bisset K, Aji A, Marathe M, Feng W (2012) High-Performance biocomputing for simulating the spread of contagion over large contact networks. BMC Genomics, 13(Suppl 2).
- Kuhlman et al., “A General-Purpose Graph Dynamical System Modeling Framework,” WSC 2011.
- Maksudul Alam and Maleq Khan, Parallel Algorithms for Generating Random Networks with Given Degree Sequences, 12th IFIP International Conference on Network and Parallel Computing (NPC), New York City, Sep. 2015.
- Shaikh Arifuzzaman, Maleq Khan and Madhav Marathe, A Space-efficient Parallel Algorithm for Counting Exact Triangles in Massive Networks, 17th IEEE International Conference on High Performance Computing and Communications (HPCC), New York City, Aug. 2015.
- Shaikh Arifuzzaman and Maleq Khan, Fast Parallel Conversion of Edge List to Adjacency List for Large-Scale Graphs, 23rd High Performance Computing Symposium (HPC), Alexandria, VA, USA, April 2015.
- Hasanuzzaman Bhuiyan, Jiangzhuo Chen, Maleq Khan, and Madhav V. Marathe, Fast Parallel Algorithms for Edge-Switching to Achieve a Target Visit Rate in Heterogeneous Graphs, International Conference on Parallel Processing (ICPP), Minneapolis, Sep. 2014.
- Maksudul Alam, Maleq Khan, and Madhav V. Marathe, Distributed-Memory Parallel Algorithms for Generating Massive Scale-free Networks Using Preferential Attachment Model, Intl. Conf. for High Performance Computing, Networking, Storage and Analysis (SuperComputing), Denver, Nov. 2013.
- Shaikh Arifuzzaman, Maleq Khan, and Madhav V. Marathe, PATRIC: A Parallel Algorithm for Counting Triangles in Massive Networks, ACM Conference on Information and Knowledge Management (CIKM), San Francisco, Oct. 2013.
- Zhao Zhao, Guanying Wang, Ali Butt, Maleq Khan, V.S. Anil Kumar, and Madhav Marathe, SAHAD: Subgraph Analysis in Massive Networks Using Hadoop, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012.
- Zhao Zhao, Maleq Khan, V.S. Anil Kumar and Madhav V. Marathe, Subgraph Enumeration in Large Social Contact Networks using Parallel Color Coding and Streaming, 39th International Conference on Parallel Processing (ICPP), San Diego, California, Sep. 2010.

- Kuhlman, Chris J., and Henning S. Mortveit, “Limit Sets of Generalized, Multi-Threshold Networks,” Journal of Cellular Automata, Vol. 10, pp. 161-193, 2015.
- Kuhlman, Chris J., and Henning S. Mortveit, “Attractor Stability in Nonuniform Boolean Networks,” Theoretical Computer Science, Vol. 559, pp. 20-33, 2014.
- Kuhlman, Chris J., Henning S. Mortveit, David Murrugarra, and V. S. Anil Kumar, “Bifurcations in Boolean Networks,” Automata, pp. 29-46, 2011.

- Dumas, C., D. LaManna, T. M. Harrison, S. S. Ravi, L. Hagen, C. Kotfila and F. Chen, “Examining Political Mobilization of Online Communities through E-petitioning Behavior in We the People” (Extended Abstract), presented at the Social Media and Society Conference, Toronto, Canada, Oct. 2014.
- Dumas, C., D. LaManna, T. M. Harrison, S. S. Ravi, L. Hagen, C. Kotfila and F. Chen, “Examining Political Mobilization of Online Communities through E-petitioning Behavior in We the People,” accepted for publication in the Journal of Big Data and Society, 2015.
- Dumas, C., D. LaManna, T. M. Harrison, S. S. Ravi, L. Hagen, C. Kotfila and F. Chen, “E-petitioning as Collective Political Action in We the People,” Proc. iConference 2015, Newport Beach, CA, March 2015 (20 pages).