GRAPH MINING AND COMMUNITY EVALUATION WITH DEGENERACY

Abstract : The study and analysis of social networks attract attention from a variety of Sciences (psychology, statistics, sociology). Among them, the field of Data Mining offers tools to automatically extract useful information on properties of those networks. More specifically, Graph Mining serves the need to model and investigate social networks especially in the case of large communities - usually found in online media - where social networks are prohibitively large for non-automated methodologies. The general modeling of a social network is based on graph structures. Nodes of the graph represent individuals and edges signify different actions or types of social connections between them. A community is defined as a subgraph (of a social network) and is characterized by dense connections. Various measures have been proposed to evaluate different quality aspects of such communities - in most cases ignoring various properties of the connections (e.g. directionality). In the work presented here, the k-core concept is used as a means to evaluate communities and extract information. The k-core structure essentially measures the robustness of an undirected network through degeneracy. Further more extensions of degeneracy are introduced to networks that their edges offer more information than the undirected type. Starting point is the exploration of properties that can be extracted from undirected graphs (of social networks). On this, degeneracy is used to evaluate collaboration features - a property not captured by the single node metrics or by the established community evaluation metrics - of both individuals and the entire community. Next, this process is extended for weighted, directed and signed graphs offering a plethora of novel evaluation metrics for social networks. These new features offer measurement tools for collaboration in social networks where we can assign a weight or a direction to a connection and provide alternative ways to signify the importance of individuals within a community. For signed graphs the extension of degeneracy offers additional metrics that can be used for trust management. Moreover, a clustering approach is introduced which capitalizes on processing the graph in a hierarchical manner provided by its core expansion sequence, an ordered partition of the graph into different levels according to the k-core decomposition The graph theoretical models are then applied in real world graphs to investigate trends and behaviors. The datasets explored include scientific collaboration and citation graphs (DBLP and ARXIV), a snapshot of Wikipedia's inner graph and trust networks (e.g. Epinions and Slashdot). The findings on these datasets are interesting and the proposed models offer intuitive results.
Document type :
Theses
Complete list of metadatas

Cited literature [89 references]  Display  Hide  Download

https://pastel.archives-ouvertes.fr/pastel-00959615
Contributor : Christos Giatsidis <>
Submitted on : Friday, March 14, 2014 - 6:22:03 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:27 PM
Long-term archiving on : Saturday, June 14, 2014 - 12:00:28 PM

Identifiers

  • HAL Id : pastel-00959615, version 1

Collections

Citation

Christos Giatsidis. GRAPH MINING AND COMMUNITY EVALUATION WITH DEGENERACY. Web. Ecole Polytechnique X, 2013. English. ⟨pastel-00959615⟩

Share

Metrics

Record views

430

Files downloads

573