Network compression with configuration models and the minimum description length

Hébert-Dufresne, Laurent; Young, Jean Gabriel; Daniels, Alexander; Kirkley, Alec; Allard, Antoine

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1103/PhysRevE.110.034305
Scopus: eid_2-s2.0-85203599285

Citations:
- Scopus: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Journal/Magazine Articles

Article: Network compression with configuration models and the minimum description length

Title	Network compression with configuration models and the minimum description length
Authors	Hébert-Dufresne, Laurent; Young, Jean Gabriel; Daniels, Alexander; Kirkley, Alec; Allard, Antoine
Issue Date	6-Sep-2024
Publisher	American Physical Society
Citation	Physical Review E, 2024, v. 110, n. 3
DOI	http://dx.doi.org/10.1103/PhysRevE.110.034305
Abstract	Random network models, constrained to reproduce specific statistical features, are often used to represent and analyze network data and their mathematical descriptions. Chief among them, the configuration model constrains random networks by their degree distribution and is foundational to many areas of network science. However, configuration models and their variants are often selected based on intuition or mathematical and computational simplicity rather than on statistical evidence. To evaluate the quality of a network representation, we need to consider both the amount of information required to specify a random network model and the probability of recovering the original data when using the model as a generative process. To this end, we calculate the approximate size of network ensembles generated by the popular configuration model and its generalizations, including versions accounting for degree correlations and centrality layers. We then apply the minimum description length principle as a model selection criterion over the resulting nested family of configuration models. Using a dataset of over 100 networks from various domains, we find that the classic configuration model is generally preferred on networks with an average degree above 10, while a layered configuration model constrained by a centrality metric offers the most compact representation of the majority of sparse networks.
Persistent Identifier	http://hdl.handle.net/10722/362877
ISSN	2470-0045
2023 Impact Factor	2.2
2023 SCImago Journal Rankings	0.805

DC Field	Value	Language
dc.contributor.author	Hébert-Dufresne, Laurent	-
dc.contributor.author	Young, Jean Gabriel	-
dc.contributor.author	Daniels, Alexander	-
dc.contributor.author	Kirkley, Alec	-
dc.contributor.author	Allard, Antoine	-
dc.date.accessioned	2025-10-03T00:35:45Z	-
dc.date.available	2025-10-03T00:35:45Z	-
dc.date.issued	2024-09-06	-
dc.identifier.citation	Physical Review E, 2024, v. 110, n. 3	-
dc.identifier.issn	2470-0045	-
dc.identifier.uri	http://hdl.handle.net/10722/362877	-
dc.description.abstract	Random network models, constrained to reproduce specific statistical features, are often used to represent and analyze network data and their mathematical descriptions. Chief among them, the configuration model constrains random networks by their degree distribution and is foundational to many areas of network science. However, configuration models and their variants are often selected based on intuition or mathematical and computational simplicity rather than on statistical evidence. To evaluate the quality of a network representation, we need to consider both the amount of information required to specify a random network model and the probability of recovering the original data when using the model as a generative process. To this end, we calculate the approximate size of network ensembles generated by the popular configuration model and its generalizations, including versions accounting for degree correlations and centrality layers. We then apply the minimum description length principle as a model selection criterion over the resulting nested family of configuration models. Using a dataset of over 100 networks from various domains, we find that the classic configuration model is generally preferred on networks with an average degree above 10, while a layered configuration model constrained by a centrality metric offers the most compact representation of the majority of sparse networks.	-
dc.language	eng	-
dc.publisher	American Physical Society	-
dc.relation.ispartof	Physical Review E	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.title	Network compression with configuration models and the minimum description length	-
dc.type	Article	-
dc.identifier.doi	10.1103/PhysRevE.110.034305	-
dc.identifier.scopus	eid_2-s2.0-85203599285	-
dc.identifier.volume	110	-
dc.identifier.issue	3	-
dc.identifier.eissn	2470-0053	-
dc.identifier.issnl	2470-0045	-
