Basic Genetics

This vignette shows how mixIndependR calculates basic genetic parameters: allele frequencies, genotype frequencies, heterozygosity, allele sharing between individuals, and the Hardy-Weinberg Equilibrium test.

Data Import

The imported dataset should hold genotype data with individuals in rows and markers in columns. Each cell contains the two alleles of one individual at one marker, joined by a separator (here "|"). Excel, csv and vcf files are compatible.

#>     STR1 SNP1
#> 1  12|12  A|A
#> 2  13|14  T|T
#> 3  13|13  A|T
#> 4  14|15  A|T
#> 5  15|13  T|A
#> 6  13|14  A|T
#> 7  14|13  A|A
#> 8  12|12  T|A
#> 9  14|14  T|T
#> 10 15|15  A|T
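
The table above is printed without the code that created it. As a minimal sketch, assuming `x` is an ordinary data frame of character genotypes, the same ten individuals could be entered by hand as follows. The file name in the comment is hypothetical.

```r
# Hypothetical reconstruction of the example data; `x` is the object name
# used in the rest of this vignette.
x <- data.frame(
  STR1 = c("12|12", "13|14", "13|13", "14|15", "15|13",
           "13|14", "14|13", "12|12", "14|14", "15|15"),
  SNP1 = c("A|A", "T|T", "A|T", "A|T", "T|A",
           "A|T", "A|A", "T|A", "T|T", "A|T"),
  stringsAsFactors = FALSE
)
print(x)

# A csv file with the same layout could be read with base R instead, e.g.
# x <- read.csv("genotypes.csv", stringsAsFactors = FALSE)  # hypothetical file
```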

Basic Genetic Parameters

library(mixIndependR)

AlleleFreq calculates the allele frequencies for one dataset.

AlleleFreq(x,sep = "\\|")
#>    STR1 SNP1
#> 12  0.2  0.0
#> 13  0.3  0.0
#> 14  0.3  0.0
#> 15  0.2  0.0
#> A   0.0  0.5
#> T   0.0  0.5
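
The STR1 column can be cross-checked in base R by splitting every genotype into its two alleles and counting them. This is only an illustrative sketch, not the package's implementation. Here `fixed = TRUE` makes `strsplit` treat "|" literally; the escaped form `"\\|"` passed to AlleleFreq suggests that `sep` is read as a regular expression there, which is an assumption.

```r
# STR1 genotypes copied from the example data
str1 <- c("12|12", "13|14", "13|13", "14|15", "15|13",
          "13|14", "14|13", "12|12", "14|14", "15|15")

# Split each genotype into two alleles and pool them (20 alleles in total)
alleles <- unlist(strsplit(str1, "|", fixed = TRUE))

# Allele frequency = allele count / total number of alleles
freq <- table(alleles) / length(alleles)
print(freq)  # 12: 0.2, 13: 0.3, 14: 0.3, 15: 0.2
```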

GenotypeFreq calculates the observed or expected genotype frequencies. If expect=FALSE, the observed genotype counts are tallied from the original dataset. If expect=TRUE, the expected genotype probabilities under Hardy-Weinberg Equilibrium are derived from the allele frequency table.

GenotypeFreq(x,sep = "\\|",expect = FALSE)  ####or GenotypeFreq(x)
#>       STR1 SNP1
#> 12|12    2    0
#> 13|13    1    0
#> 14|14    1    0
#> 15|15    1    0
#> A|A      0    2
#> T|T      0    2
#> 12|13    0    0
#> 12|14    0    0
#> 12|15    0    0
#> 12|A     0    0
#> 12|T     0    0
#> 13|14    2    0
#> 13|15    0    0
#> 13|A     0    0
#> 13|T     0    0
#> 14|15    1    0
#> 14|A     0    0
#> 14|T     0    0
#> 15|A     0    0
#> 15|T     0    0
#> A|T      0    4
#> T|A      0    2
#> T|15     0    0
#> T|14     0    0
#> T|13     0    0
#> T|12     0    0
#> A|15     0    0
#> A|14     0    0
#> A|13     0    0
#> A|12     0    0
#> 15|14    0    0
#> 15|13    1    0
#> 15|12    0    0
#> 14|13    1    0
#> 14|12    0    0
#> 13|12    0    0
GenotypeFreq(x,sep = "\\|",expect = TRUE) ####or GenotypeFreq(x,expect =T)
#>       STR1 SNP1
#> 12|12 0.04 0.00
#> 13|13 0.09 0.00
#> 14|14 0.09 0.00
#> 15|15 0.04 0.00
#> A|A   0.00 0.25
#> T|T   0.00 0.25
#> 12|13 0.06 0.00
#> 12|14 0.06 0.00
#> 12|15 0.04 0.00
#> 12|A  0.00 0.00
#> 12|T  0.00 0.00
#> 13|14 0.09 0.00
#> 13|15 0.06 0.00
#> 13|A  0.00 0.00
#> 13|T  0.00 0.00
#> 14|15 0.06 0.00
#> 14|A  0.00 0.00
#> 14|T  0.00 0.00
#> 15|A  0.00 0.00
#> 15|T  0.00 0.00
#> A|T   0.00 0.25
#> T|A   0.00 0.25
#> T|15  0.00 0.00
#> T|14  0.00 0.00
#> T|13  0.00 0.00
#> T|12  0.00 0.00
#> A|15  0.00 0.00
#> A|14  0.00 0.00
#> A|13  0.00 0.00
#> A|12  0.00 0.00
#> 15|14 0.06 0.00
#> 15|13 0.06 0.00
#> 15|12 0.04 0.00
#> 14|13 0.09 0.00
#> 14|12 0.06 0.00
#> 13|12 0.06 0.00
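
The expected table lists ordered genotypes, so "12|13" and "13|12" appear as separate rows. Under Hardy-Weinberg Equilibrium each ordered genotype i|j has probability p_i * p_j, which can be sketched in base R with an outer product of the allele frequencies. This illustrates the arithmetic and is not taken from the package source.

```r
# STR1 allele frequencies from the AlleleFreq output
p <- c("12" = 0.2, "13" = 0.3, "14" = 0.3, "15" = 0.2)

# Probability of each ordered genotype i|j under HWE: p_i * p_j
expected <- outer(p, p)
print(expected)

expected["12", "12"]  # homozygote: p^2 = 0.04
expected["13", "14"]  # one ordered heterozygote: 0.3 * 0.3 = 0.09
sum(expected)         # all ordered genotypes together sum to 1
```

The familiar unordered heterozygote frequency 2pq is the sum of the two ordered entries, for example 0.06 + 0.06 = 0.12 for alleles 12 and 13.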

Heterozygous tests whether each individual is heterozygous at each locus and returns a table in which 0 denotes homozygous and 1 denotes heterozygous.

h <-Heterozygous(x,sep = "\\|") ####or Just use Heterozygous(x)
print(h)
#>       STR1 SNP1
#>  [1,]    0    0
#>  [2,]    1    0
#>  [3,]    0    1
#>  [4,]    1    1
#>  [5,]    1    1
#>  [6,]    1    1
#>  [7,]    1    0
#>  [8,]    0    1
#>  [9,]    0    0
#> [10,]    0    1
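
The STR1 column of this table can be reproduced in base R by comparing the two alleles of each genotype. This is a minimal sketch of the idea only.

```r
# STR1 genotypes copied from the example data
str1 <- c("12|12", "13|14", "13|13", "14|15", "15|13",
          "13|14", "14|13", "12|12", "14|14", "15|15")

# 1 if the two alleles differ (heterozygous), 0 if they match (homozygous)
parts <- strsplit(str1, "|", fixed = TRUE)
het <- vapply(parts, function(a) as.integer(a[1] != a[2]), integer(1))
print(het)  # 0 1 0 1 1 1 1 0 0 0
```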

RxpHetero calculates the real (observed) or expected average heterozygosity at each locus. If HWE=TRUE, it calculates the expected heterozygosity under Hardy-Weinberg Equilibrium; if HWE=FALSE, it calculates the real average heterozygosity.

p<-AlleleFreq(x,sep = "\\|")
H <- RxpHetero(h,p,HWE=TRUE)
head(H)
#> STR1 SNP1 
#> 0.74 0.50
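
Both quantities are simple to verify by hand for STR1. The expected heterozygosity under Hardy-Weinberg Equilibrium is one minus the sum of squared allele frequencies, and the real heterozygosity is the mean of the 0/1 column from Heterozygous. The sketch below uses base R only and copies its inputs from the outputs above.

```r
# STR1 allele frequencies and heterozygosity indicators from the outputs above
p_str1 <- c(0.2, 0.3, 0.3, 0.2)
h_str1 <- c(0, 1, 0, 1, 1, 1, 1, 0, 0, 0)

# Expected heterozygosity under HWE: 1 - sum(p_i^2)
exp_het <- 1 - sum(p_str1^2)
print(exp_het)  # 0.74

# Real (observed) heterozygosity: share of heterozygous individuals
obs_het <- mean(h_str1)
print(obs_het)  # 0.5
```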

AlleleShare calculates the number of alleles shared by each pair of individuals at each locus. If replacement=TRUE, the pairs are formed with replacement, so an individual can appear in more than one pair; if replacement=FALSE, the pairs are formed without replacement.

AS<-AlleleShare(x,sep = "\\|",replacement = FALSE) ###or without "sep="
head(AS)
#>      STR1 SNP1
#> 10 2    0    1
#> 1 3     0    1
#> 9 8     0    1
#> 4 6     1    2
#> 7 5     1    1
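
Each row name gives the two individuals in a pair. The sketch below counts shared alleles for the five STR1 pairs shown, using a hypothetical helper `shared_alleles` that matches each allele at most once. That the package counts in exactly this way is an assumption, although the helper reproduces the STR1 column above.

```r
# Hypothetical helper: number of alleles two genotypes have in common,
# where each allele of the second genotype can be matched only once.
shared_alleles <- function(g1, g2, sep = "|") {
  a <- strsplit(g1, sep, fixed = TRUE)[[1]]
  b <- strsplit(g2, sep, fixed = TRUE)[[1]]
  n <- 0
  for (allele in a) {
    hit <- match(allele, b)
    if (!is.na(hit)) {
      n <- n + 1
      b <- b[-hit]  # remove the matched allele so it is not counted twice
    }
  }
  n
}

# STR1 genotypes of the pairs "10 2", "1 3", "9 8", "4 6", "7 5"
first  <- c("15|15", "12|12", "14|14", "14|15", "14|13")
second <- c("13|14", "13|13", "12|12", "13|14", "15|13")

shared <- mapply(shared_alleles, first, second, USE.NAMES = FALSE)
print(shared)  # 0 0 0 1 1

# Proportion of pairs sharing 0, 1 and 2 alleles
prop <- table(factor(shared, levels = 0:2)) / length(shared)
print(prop)    # 0.6 0.4 0.0
```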

RealProAlleleShare calculates the observed proportions of pairs sharing 0, 1 and 2 alleles at each locus, and ExpProAlleleShare calculates the corresponding expected probabilities from the allele frequencies.

e <-RealProAlleleShare(AS)
e0<-ExpProAlleleShare(p)
head(e)
#>       P0  P1  P2
#> STR1 0.6 0.4 0.0
#> SNP1 0.0 0.8 0.2
head(e0)
#>         P0     P1     P2
#> STR1 0.317 0.5672 0.1158
#> SNP1 0.125 0.5000 0.3750
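
For two unrelated individuals from a population in Hardy-Weinberg Equilibrium, the sharing probabilities can be written in terms of the power sums of the allele frequencies. The sketch below uses a hypothetical function `exp_share` built on that standard result. Whether ExpProAlleleShare is coded in this form is an assumption, but the formulas reproduce the values printed above.

```r
# Hypothetical helper: expected probabilities of sharing 0, 1 and 2 alleles
# for two unrelated individuals under HWE, given allele frequencies p.
exp_share <- function(p) {
  s2 <- sum(p^2)
  s3 <- sum(p^3)
  s4 <- sum(p^4)
  c(P0 = 1 - 4 * s2 + 2 * s2^2 + 4 * s3 - 3 * s4,
    P1 = 4 * s2 - 4 * s2^2 - 4 * s3 + 4 * s4,
    P2 = 2 * s2^2 - s4)
}

exp_share(c(0.2, 0.3, 0.3, 0.2))  # STR1: P0 = 0.317, P1 = 0.5672, P2 = 0.1158
exp_share(c(0.5, 0.5))            # SNP1: P0 = 0.125, P1 = 0.5,    P2 = 0.375
```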

HWE.Chisq tests Hardy-Weinberg Equilibrium at each locus with Pearson’s Chi-square test, comparing the observed genotype frequencies with the expected ones. B is an integer specifying the number of replicates used in the Monte Carlo test.

g <- GenotypeFreq(x,expect=FALSE)
g0 <- GenotypeFreq(x,expect=TRUE)
HWE.Chisq(g,g0,rescale.p = TRUE,simulate.p.value = TRUE,B=2000)
#>      STR1      SNP1 
#> 0.5352324 0.8725637
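
The arguments rescale.p, simulate.p.value and B have the same names as arguments of `chisq.test` in the stats package. As a rough sketch of what such a test does, the SNP1 counts can be compared with their expected probabilities directly in base R. That HWE.Chisq passes its arguments on to `chisq.test`, and which genotype rows it includes, are assumptions, so the p-value here need not match the output above. Monte Carlo p-values also vary from run to run unless a seed is set.

```r
set.seed(1)  # Monte Carlo p-values are random; fix the seed for repeatability

# Observed SNP1 genotype counts and their expected probabilities under HWE
observed <- c("A|A" = 2, "T|T" = 2, "A|T" = 4, "T|A" = 2)
expected <- c(0.25, 0.25, 0.25, 0.25)

res <- chisq.test(observed, p = expected, rescale.p = TRUE,
                  simulate.p.value = TRUE, B = 2000)

res$statistic  # Pearson chi-square statistic, 1.2 for these counts
res$p.value    # p-value estimated from B = 2000 simulated tables
```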