Data for PyFLANK, a Graph Neural Network Based Null Distribution Inference Model for FST Outlier Detection
Themes: Conversion, Feedstock Production
Keywords: AI/ML, Genetics, Genomics, Software
Citation
Zhang, Z., Jia, W., Gomes Viana, J.P., Hsieh, P., Yoshikuni, Y., Hudson, M. April 6, 2026. Data for: “PyFLANK, a Graph Neural Network Based Null Distribution Inference Model for FST Outlier Detection.” GitHub.
Overview

pyFLANK is an open-source, automated Python tool for FST outlier detection, inspired by the R package OutFLANK (https://doi.org/10.1086/682949). It detects outliers against a null distribution inferred from quasi-independent loci. The tool integrates three approaches for identifying loci that follow the null distribution: graph neural network (GNN) inference, linkage disequilibrium (LD)-based inference, and user-defined input. With GNN-based inference of quasi-independent loci, pyFLANK yields a more accurate null model and requires less user parameter input.
FST calculation is based on Weir and Cockerham (1984).
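As an illustration of the estimator cited above, the following is a minimal sketch of the Weir and Cockerham (1984) single-locus FST (theta) for one biallelic locus. It is not taken from the pyFLANK source: the function name `wc84_fst` and the input layout (one tuple of genotype counts per population) are assumptions made for this example, and every population is assumed to be non-empty.

```python
def wc84_fst(genotype_counts):
    """Weir & Cockerham (1984) theta for one biallelic locus.

    genotype_counts: list of (n_AA, n_Aa, n_aa) tuples, one per population.
    This input layout is an assumption for illustration, not pyFLANK's API.
    """
    r = len(genotype_counts)
    if r < 2:
        raise ValueError("at least two populations are required")

    n = [sum(c) for c in genotype_counts]                  # individuals per population
    p = [(2 * c[0] + c[1]) / (2 * ni)                      # frequency of allele A
         for c, ni in zip(genotype_counts, n)]
    h = [c[1] / ni for c, ni in zip(genotype_counts, n)]   # observed heterozygosity

    n_total = sum(n)
    n_bar = n_total / r
    n_c = (n_total - sum(ni * ni for ni in n) / n_total) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_total
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_total

    # Variance components: a (among populations), b (among individuals
    # within populations), c (between gametes within individuals).
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = (n_bar / n_c) * (s2 - (core - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2

    denom = a + b + c
    return a / denom if denom != 0 else float("nan")
```

For example, two populations of ten individuals fixed for opposite alleles, `wc84_fst([(10, 0, 0), (0, 0, 10)])`, give an FST of 1.0, while two populations with identical genotype counts give a value slightly below zero, which is expected for this estimator.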
Key Features
- Graph-based representation of loci and their local dependence structure
- GNN-based null distribution inference
- Compatible with standard FST-based workflows
- Designed to complement, not replace, LD pruning and clumping
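To make the idea of a graph over loci concrete, here is a minimal sketch that links nearby loci whose genotype correlation (r squared) exceeds a threshold and then treats unlinked loci as candidates for the quasi-independent set. This is an illustrative stand-in for simple LD-based selection, not pyFLANK's GNN or its actual graph construction; the function names, the `window` and `r2_threshold` parameters, and their default values are all assumptions.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic locus: correlation undefined, treat as unlinked
    return (sxy * sxy) / (sxx * syy)


def build_ld_graph(genotypes, window=2, r2_threshold=0.2):
    """Adjacency sets linking loci within `window` positions whose r^2 >= threshold.

    genotypes: list of loci, each a list of 0/1/2 dosages per individual
    (hypothetical layout chosen for this example).
    """
    adjacency = {i: set() for i in range(len(genotypes))}
    for i in range(len(genotypes)):
        for j in range(i + 1, min(i + window, len(genotypes) - 1) + 1):
            if pearson_r2(genotypes[i], genotypes[j]) >= r2_threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def quasi_independent_loci(adjacency):
    """Loci with no LD neighbors in the graph."""
    return [i for i in sorted(adjacency) if not adjacency[i]]
```

In this sketch, isolated nodes are the simplest possible definition of quasi-independence; a learned model could instead use the neighborhood structure of each node as input, which is the kind of context the first feature above describes.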
Data
GitHub: Software and necessary datasets