TY - JOUR
T1 - Characterization of the immunoglobulin lambda chain locus from diverse populations reveals extensive genetic variation
AU - Gibson, William S.
AU - Rodriguez, Oscar L.
AU - Shields, Kaitlyn
AU - Silver, Catherine A.
AU - Dorgham, Abdullah
AU - Emery, Matthew
AU - Deikus, Gintaras
AU - Sebra, Robert
AU - Eichler, Evan E.
AU - Bashir, Ali
AU - Smith, Melissa L.
AU - Watson, Corey T.
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2023/2
Y1 - 2023/2
AB - Immunoglobulins (IGs), crucial components of the adaptive immune system, are encoded by three genomic loci. However, the complexity of the IG loci severely limits the effective use of short-read sequencing, restricting our knowledge of population diversity in these loci. We leveraged existing long-read whole-genome sequencing (WGS) data, fosmid technology, and IG-targeted single-molecule, real-time (SMRT) long-read sequencing (IG-Cap) to create haplotype-resolved assemblies of the IG Lambda (IGL) locus from 6 ethnically diverse individuals. In addition, we generated 10 diploid assemblies of IGL from a diverse cohort of individuals utilizing IG-Cap. From these 16 individuals, we identified significant allelic diversity, including 36 novel IGLV alleles. In addition, we observed highly elevated single nucleotide variation (SNV) in IGLV genes relative to IGL intergenic and genomic background SNV density. By comparing SNV calls between our high-quality assemblies and existing short-read datasets from the same individuals, we show a high propensity for false positives in the short-read datasets. Finally, for the first time, we nucleotide-resolved common 5-10 kb duplications in the IGLC region that contain functional IGLJ and IGLC genes. Together, these data represent a significant advancement in our understanding of genetic variation and population diversity in the IGL locus.
UR - http://www.scopus.com/inward/record.url?scp=85144395203&partnerID=8YFLogxK
U2 - 10.1038/s41435-022-00188-2
DO - 10.1038/s41435-022-00188-2
M3 - Article
AN - SCOPUS:85144395203
SN - 1466-4879
VL - 24
SP - 21
EP - 31
JO - Genes and Immunity
JF - Genes and Immunity
IS - 1
ER -