Algorithms Implemented in TAGster | National Institute of Environmental Health Sciences

TAGster: Efficient Selection of LD Tag SNP in Single or Multiple Populations

Consider a set $S$ which contains $M$ bi-allelic SNP markers $a_1, a_2, \ldots, a_M$ in $N$ populations, where $S_i$ contains the $M_i$ SNP markers $s_{i1}, s_{i2}, \ldots, s_{iM_i}$ in population $i$. First, we estimate the pairwise LD measure $r^2$ for each SNP pair within each population. Two markers $s_{im}$ and $s_{in}$ are said to be in strong LD if $r^2(s_{im}, s_{in})$ is greater than or equal to a pre-specified threshold value $r_0^2$. Each is then considered a tag SNP for the other, in that $s_{im}$ can be used as a surrogate for $s_{in}$, or vice versa.

Our aim is to find a tag SNP set, denoted by $T$, such that $\forall s_{im} \in S_i$, $i = 1, \ldots, N$, $\exists\, \alpha_j \in T$ that satisfies $r^2(\alpha_j, s_{im}) \ge r_0^2$.

In our presentation, we introduce intermediate SNP sets $P$ and $Q_i$, $i = 1, \ldots, N$, where $P$ is called the candidate set and contains all the SNPs that are eligible to be chosen as a tag SNP, and $Q_i$ contains the SNPs in population $i$ that are already tagged by at least one of the tag SNPs in $T$, i.e. $\forall s_{im} \in Q_i$, $i = 1, \ldots, N$, $\exists\, \alpha_j \in T$ that satisfies $r^2(\alpha_j, s_{im}) \ge r_0^2$.

We implemented several algorithms in TAGster to select the tag SNP set $T$.

Algorithm 1: A greedy algorithm for single or multiple populations

1. Set $T = \emptyset$, $P = S$ and $Q_i = \emptyset$, for $i = 1, \ldots, N$.
2. For each SNP $\alpha_j$ in $P$, calculate […], where: if $\alpha_j$ […]; if $\alpha_j$ […].
3. Find the SNP $\alpha_{\max}$ that has the highest […], and add $\alpha_{\max}$ to $T$. If $\alpha_{\max} \in S_i$, add every SNP $s_{im}$ in $S_i$ with $r^2(\alpha_{\max}, s_{im}) \ge r_0^2$ to $Q_i$, and then exclude $\alpha_{\max}$ from $P$.
4. Repeat steps 2-3 until $Q_i = S_i$ for all $i = 1, \ldots, N$.

Algorithm 2: An optimal solution for single population tag SNP

An exhaustive search is performed within each population to find a minimal number of population-specific tag SNPs $T_i$, for $i = 1, \ldots, N$.

1. Set $T_i = \emptyset$ and […], for $i = 1, \ldots, N$.
2. Within population $i$, partition the SNPs in […] into disjoint precincts $P_{ij}$, $j = 1, \ldots,$ […], so that $r^2(s_{im}, s_{in}) < r_0^2$ for any two SNPs $s_{im}$ and $s_{in}$ that belong to different precincts.
3. Within a precinct $P_{ij}$:
   (a) For any two SNPs $s_{im}$ and $s_{in}$ in precinct $P_{ij}$, if […], we exclude the one with the smaller […] from precinct $P_{ij}$.
   (b) Conduct an exhaustive search to find a minimum set of tag SNPs for the SNPs in precinct $P_{ij}$, and add these tag SNPs to $T_i$.
4. Repeat step (3) for each precinct.

Algorithm 3: Two-stage solution for multi-populations

1. Conduct Algorithm 2 within each population to select a set of population-specific tag SNPs $T_i$, for $i = 1, \ldots, N$.
2. Set […] $= \emptyset$ and […] $= S_i$, for $i = 1, \ldots, N$.
3. For each SNP $t_{ij}$ in $T_i$, find any SNPs $s_{im}$ ($s_{im}$ […] and $s_{im}$ […]) that satisfy $r^2(t_{ij}, s_{im}) \ge r_0^2$, then add them, together with $t_{ij}$, to LD bin $B_{ij}$ and exclude them from […].
4. Within each LD bin $B_{ij}$, set […]$_{ij} = \emptyset$. Find any SNP $s_{im}$ in $B_{ij}$ that satisfies $r^2(s_{im}, s_{in}) \ge r_0^2$ for every SNP $s_{in}$ in $B_{ij}$, and add $s_{im}$ to […]$_{ij}$.
5. Set $P =$ […]. For each SNP $\tau_j$ in $P$, $j = 1, \ldots, |P|$, construct a one-dimensional array with […] elements, where […].
6. Cluster the SNPs in $P$ so that any two SNPs $\tau_j$ and $\tau_k$ in a cluster satisfy […].
7. Set $\Psi = \emptyset$. Find the one SNP $\tau_j$ in each cluster with maximum […] and add it to $\Psi$.
8. Cluster the SNPs in $\Psi$ so that any two SNPs $\tau_j$ and $\tau_k$ in a cluster satisfy […].
9. For each cluster, set the LD bin set […] $= \emptyset$, record in […] the LD bins in each population that can be tagged by any SNP in the cluster, and then conduct an exhaustive search to find a minimum set of tag SNPs in the cluster that can tag all LD bins in […]. Add this set of SNPs to $T$.

Last Reviewed: February 18, 2026
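The greedy selection of Algorithm 1 can be illustrated with a minimal Python sketch. It is not TAGster's code, and several details are assumptions made for illustration. The function name `greedy_tag_snps`, the input layout (one set of SNP ids and one dictionary of pairwise r² values per population) and the default threshold of 0.8 are hypothetical. The scoring formula of step 2 is not reproduced on this page, so the sketch assumes that a candidate's score is the number of not-yet-tagged SNPs it would tag, summed over the populations that contain it and counting the candidate itself. Ties are broken alphabetically, which is also an assumption.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical layout: pairwise r^2 values of one population, keyed by the unordered SNP pair.
PairwiseR2 = Dict[FrozenSet[str], float]


def greedy_tag_snps(populations: List[Set[str]],
                    r2: List[PairwiseR2],
                    threshold: float = 0.8) -> List[str]:
    """Greedy tag SNP selection across one or more populations (illustrative sketch)."""
    candidates: Set[str] = set().union(*populations)      # P = S
    tagged: List[Set[str]] = [set() for _ in populations]  # Q_i = empty set
    tag_set: List[str] = []                                # T, in order of selection

    def newly_tagged(snp: str, i: int) -> Set[str]:
        """Untagged SNPs of population i that `snp` would tag (empty if snp is not in S_i)."""
        if snp not in populations[i]:
            return set()
        return {s for s in populations[i] - tagged[i]
                if s == snp or r2[i].get(frozenset((snp, s)), 0.0) >= threshold}

    n_pops = len(populations)
    # Step 4: repeat until every SNP in every population is tagged (Q_i = S_i).
    while any(tagged[i] != populations[i] for i in range(n_pops)):
        # Step 2 (assumed score): count the SNPs each candidate would newly tag.
        scores = {snp: sum(len(newly_tagged(snp, i)) for i in range(n_pops))
                  for snp in candidates}
        # Step 3: pick the highest-scoring candidate; sorting makes ties deterministic.
        best = max(sorted(scores), key=scores.get)
        tag_set.append(best)
        for i in range(n_pops):
            tagged[i] |= newly_tagged(best, i)
        candidates.discard(best)
    return tag_set


# Toy data: two populations that share SNPs A, B and C; D occurs only in the first.
example_populations = [{"A", "B", "C", "D"}, {"A", "B", "C"}]
example_r2 = [
    {frozenset(("A", "B")): 0.90, frozenset(("B", "C")): 0.85},
    {frozenset(("A", "B")): 0.30, frozenset(("B", "C")): 0.90,
     frozenset(("A", "C")): 0.10},
]
print(greedy_tag_snps(example_populations, example_r2, threshold=0.8))
```

In this toy input, B is picked first because it tags A, B and C in the first population and B and C in the second. A is then needed for the second population, where it is not in strong LD with B, and D is needed because it is in strong LD with nothing. A greedy choice like this is fast but is not guaranteed to return the smallest possible tag set, which is the gap the exhaustive search of Algorithm 2 addresses.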