We all know that sinking feeling that hits when you open the discovery for that big sexual assault case and see the words “DNA Lab Results” sitting nonchalantly on the page. A giant weight suddenly curls up on your chest, and rests there waiting for you to download the document that can make or break your theory. Your client has undoubtedly told you that there is “no way” this lab report comes back against him, and yet you brace for the inevitable statistic claiming it is 600 sextillion times more likely that the incriminating DNA came from him than anyone else on the planet. As the download progress bar fills, you start reciting your touch-DNA cross examination in your head to convince yourself the case is still triable. The report finally opens… and you let out a giant sigh of relief (into your mask of course – we are in a pandemic after all). The results are inconclusive. Despite a positive male screening test, they found only the complaining witness’s DNA. You live to fight another day.
A few weeks later, the prosecutor emails. The subject line reads “DNA.” You’re not worried. Maybe it’s a new offer, considering the lab results went your way. You open it, and that sinking feeling returns. They ran a different type of DNA test – something called “Y-STR Analysis” – and the results inculpate your client. You stare at the screen, unsure how to react. What does this mean? What do you do? Can you fight this?
Y-STR testing is increasingly used in sexual assault cases when the typical testing yields no results. Often, it feels like the Y-STR results pull the rug out from under your case. But don’t worry, Y-STR analysis is not the same forensic powerhouse as the typical STR analysis – and it can be brought down by those differences. This article outlines scientific and statistical weaknesses of Y-STR DNA analysis that can be used to contest its admissibility and challenge experts on cross-examination.
What is Y-STR DNA Analysis?
The human genome is composed of long strands of DNA that are packaged into twenty-three pairs of chromosomes. Each of the twenty-three pairs is made up of one chromosome inherited from each biological parent, but which of a parent’s two chromosomes is passed on is a matter of random selection. This randomization creates a unique genetic fingerprint for all individuals, including siblings (other than identical twins). Forensic analysts create a DNA profile by examining specific sections of DNA called “short tandem repeat” (STR) markers that are known to vary between individuals at designated loci on the chromosome. The analyst then compares a suspect’s DNA profile to the DNA profile obtained as evidence to determine if they match. If they don’t match, that suspect is “excluded as a contributor.” If they do, the analyst calculates the “random match” probability – the likelihood that a randomly selected, unrelated individual would have the same profile as the DNA evidence.
Typical STR analysis compares DNA profiles created from loci found on the twenty-two sex-neutral, or autosomal, chromosome pairs. In contrast, Y-STR analysis studies only loci found on the Y chromosome – the male-specific sex-determining chromosome. The most common application of Y-STR analysis is in sexual assault cases, where autosomal STR typing is difficult or impossible because the excess amount of female DNA masks the male DNA. Since a female victim does not have a Y chromosome, any Y-chromosome DNA found in a victim sample is presumed to have come from the perpetrator. Once the male DNA is isolated, the analyst can generate a Y-STR profile to compare against the Y-STR profile of a particular suspect. If the profiles match, the statistical significance of the match is determined by how rare that profile is in the database.
Limitations of Y-STR DNA Analysis
Scientists are in agreement that Y-STR analysis is a valid and precise mechanism to exclude persons as possible contributors to DNA evidence. However, the weight of a “match” is much weaker than in typical DNA testing due to (1) the inheritance patterns of the Y-chromosome and (2) the confines of the “counting method” to determine statistical significance.
Inheritance of the Y Chromosome
Whereas every other chromosome is found in both men and women, Y-chromosomes are found only in males. A male inherits his Y-chromosome in its entirety from his father. His Y-STR profile will be genetically indistinguishable from those of all his paternally related male relatives: his father, his son, his grandfather, his uncle, his cousins, etc. Because spontaneous Y-chromosome mutation is relatively rare, Y-STR profiles are also likely to be shared by males whose biological relationships are historically remote. Men who do not know each other or recognize each other may have identical Y-STR profiles if they had a common male ancestor hundreds of years in the past. Therefore, a “match” between an evidence sample and a suspect simply creates a population of possible contributors that includes the defendant plus all patrilineally related male relatives and an unknown number of seemingly unrelated males. Both because any one person’s Y chromosome is likely to be shared by an unknowably large number of other individuals and because Y-STR testing analyzes only one of the 46 chromosomes a person possesses, the probability of a “random match” with respect to Y-STR DNA is significantly higher than the probability of a random match with respect to a complete DNA profile using typical STR analysis. For example, random match probabilities with Y-STR DNA may be as high as 1 in 30, as opposed to the 1 in several million (or greater) probabilities generated by STR analysis.
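To see why testing a single chromosome produces so much weaker a statistic, it may help to put the two calculations side by side. The short Python sketch below is illustrative only. It assumes that autosomal loci are inherited independently, so that their frequencies can be multiplied together, while a Y-STR profile is inherited as one block and therefore has only a single frequency. The twenty loci and the 1-in-10 frequency at each are hypothetical values, not real population data; the 1-in-30 figure is the upper end mentioned above.

```python
from math import prod

# Hypothetical genotype frequencies at 20 independently inherited
# autosomal loci (illustrative values, not real population data).
autosomal_genotype_freqs = [0.1] * 20

# Independent loci: the per-locus frequencies are multiplied together.
autosomal_rmp = prod(autosomal_genotype_freqs)

# Y-STR loci are inherited together as a single unit, so the whole
# profile has just one frequency. 1 in 30 is the upper end cited above.
ystr_profile_freq = 1 / 30

print(f"Autosomal profile: about 1 in {1 / autosomal_rmp:.3g}")
print(f"Y-STR profile:     about 1 in {1 / ystr_profile_freq:.0f}")
```

Under these assumed numbers, adding loci shrinks the autosomal figure with every multiplication, while the Y-STR figure can never be rarer than the profile as a whole.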
Confines of the “Counting Method”
Because all Y-STR loci sit on a single chromosome and are inherited together, the method typically used to calculate the “random match” probability, which multiplies the frequencies of independently inherited loci, cannot be used. Instead, analysts must literally “count” the number of times the profile appears in a profile database in order to determine the statistical significance of a match. This is called the counting method, and it presumes that the frequency of a Y-STR profile within the database parallels its frequency in the location where the crime occurred. Therefore, its reliability depends directly on the size and quality of the database that is being used. The database must comprise a representative subset of the relevant population. Unlike autosomal DNA profiles, Y-STR profiles cluster geographically, because male lineages followed shared migration and settlement patterns. Specific profiles are not likely to be evenly dispersed among distant populations. Even within local geographic regions, certain Y-STR profiles are common within certain ethnic groups but entirely absent among others.
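The arithmetic behind the counting method is simple enough to sketch, and seeing it laid out shows how heavily the result leans on database size. The Python example below is a minimal illustration, not a description of any particular lab’s procedure. It assumes the raw count is paired with a 95% one-sided Clopper-Pearson upper confidence bound, a conservative adjustment often used with the counting method; whether the lab in your case applied that adjustment should be checked against the report. The function names and the match counts are hypothetical; the database size of 3,500 echoes the Texas figure discussed in this article.

```python
from math import exp, lgamma, log, log1p

def counting_estimate(matches: int, database_size: int) -> float:
    """Raw counting-method frequency: matching profiles / profiles searched."""
    return matches / database_size

def upper_bound(matches: int, database_size: int, confidence: float = 0.95) -> float:
    """One-sided Clopper-Pearson upper bound on the true profile frequency.

    Finds the largest frequency p for which observing `matches` or fewer
    hits in `database_size` profiles still has probability >= 1 - confidence.
    """
    if matches >= database_size:
        return 1.0
    alpha = 1.0 - confidence
    n = database_size

    def prob_at_most(p: float) -> float:
        # Binomial probability of seeing `matches` or fewer hits,
        # computed in log space to avoid overflow for large databases.
        total = 0.0
        for k in range(matches + 1):
            log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                       + k * log(p) + (n - k) * log1p(-p))
            total += exp(log_pmf)
        return total

    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection; prob_at_most decreases as p grows
        mid = (lo + hi) / 2
        if prob_at_most(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical searches of a 3,500-profile database.
for hits in (0, 1, 5):
    bound = upper_bound(hits, 3500)
    print(f"{hits} matches: raw {counting_estimate(hits, 3500):.5f}, "
          f"95% upper bound about 1 in {1 / bound:.0f}")
```

Note what happens when the profile has never been seen in the database at all: under these assumptions, the bound still cannot be rarer than roughly 1 in 1,170, because a database of 3,500 profiles simply cannot support a stronger claim.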
Forensic analysts attempt to control for this kind of non-random sorting by stating profile frequencies in terms of race. Geneticists have tended to assume that historical within-race genetic mixing is sufficient to disperse Y-STR profiles evenly among that racial group. For this reason, Y-STR probability statistics are usually expressed as the frequency at which that profile was found in the database within each racial group. However, even within discrete racial groups, there can still be statistically significant differences in the frequency of Y-STR profiles depending upon the geographic location.1 Additionally, a racial distribution within the database that differs from that of the given area will skew the results, and can easily over- or under-represent the local frequency of a given profile. For example, a 2003 study comparing U.S. populations found that the Y-STR profiles of both European-Americans and Hispanics were much more varied within their ethnic groups than the profiles of African-Americans, and that the variation in Hispanic genotype was higher in Texas than anywhere else in the country.2 In order for the “counting method” to generate an accurate statistical significance for any profile match, the database should reflect the same genotype variations as the local populations and ethnic groups. However, such databases do not yet exist; in fact, the U.S. Y-STR database was decommissioned in 2019. The profiles from the U.S. database were all transferred to the international Y-Chromosome Haplotype Reference Database (YHRD).3 YHRD includes about 3,500 total profiles from the state of Texas (which is home to over 13.6 million men): 33% from European-American men, 33% from Hispanic-American men, and 33% from African-American men. This does not reflect the racial and ethnic composition of the male Texas population, which, as of 2015, was 54% European-American, 28% Hispanic, and only 9% African-American.4
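The mismatch is easier to appreciate, and to put in front of a judge or jury, when it is reduced to a few ratios. The Python sketch below uses only the figures cited in this article (about 3,500 Texas profiles, 13.6 million men, and the stated percentages); the variable names are illustrative, and the figures themselves should be re-checked against current YHRD and census data before being used in a hearing.

```python
# Figures as cited in this article; verify against current sources.
texas_profiles = 3_500
texas_men = 13_600_000

database_share = {
    "European-American": 0.33,
    "Hispanic": 0.33,
    "African-American": 0.33,
}
population_share = {
    "European-American": 0.54,
    "Hispanic": 0.28,
    "African-American": 0.09,
}

# How many Texas men each database profile has to "stand in" for.
men_per_profile = texas_men / texas_profiles
print(f"One profile for roughly every {men_per_profile:,.0f} men")

# Ratio above 1: the group is over-represented in the database
# relative to its share of the male population; below 1: under-represented.
representation_ratio = {
    group: database_share[group] / population_share[group]
    for group in database_share
}
for group, ratio in representation_ratio.items():
    print(f"{group}: {ratio:.2f}x its share of the population")
```

On these numbers, European-American men appear in the database at about 0.61 times their share of the population, Hispanic men at about 1.18 times, and African-American men at about 3.67 times.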
Exploiting the Limitations
Y-STR DNA analysis is ripe for challenge in Texas courts. Defense attorneys faced with inculpating Y-STR evidence should request a Kelly/Daubert hearing to determine whether the loci tested or kits used meet threshold requirements of reliability, and whether the statistical evidence is likewise supported. The first documented wrongful conviction based on Y-STR evidence recently came to light, underscoring the need for care in interpreting these results.5 While at least one Texas court of appeals has held that Y-STR analysis does meet the reliability standard, it did so only after a full pretrial hearing.6 The Court of Criminal Appeals has yet to weigh in on the issue, but in a concurring opinion, Judge Johnson has expressed concern that current databases do not contain enough samples, or a proper distribution of samples by race, to support the method’s reliability in this State.7 Importantly, two leading standard-setting bodies for DNA testing have issued directives creating and changing guidelines for Y-STR interpretation and database selection within the past year, demonstrating that this method is still evolving and affording litigants the opportunity to challenge testing not done in accordance with those directives.8
Additionally, there are many circumstances in which Y-STR results may be excludable under Texas Rule of Evidence 403. For example, Y-STR testing has very limited probative value in cases where there are potential suspects from the same paternal line, where the DNA sample is likely already contaminated, or where the sample contains a mixture of multiple male profiles. If all else fails, Y-STR analysis affords ample fodder for cross-examination, allowing you to challenge the method directly to the jury.
Footnotes
- See, e.g., Carolina Bonilla et al., Admixture in the Hispanics of the San Luis Valley, Colorado, and Its Implications for Complex Trait Genemapping, 68 Annals of Hum. Genetics 139 (2004) (reporting wide variation in genetic profiles of various ethnic groups falling under cultural rubric of Hispanic); M. Hedman et al., Analysis of 16 Y STR Loci in the Finnish Population Reveals a Local Reduction in the Diversity of Male Lineages, 142 Forensic Sci. Int’l 37 (2004) (particular sixteen-locus Y-STR profile is shared by thirteen percent of Finnish population).
- Manfred Kayser et al., Y Chromosome STR Haplotypes and the Genetic Structure of U.S. Populations of African, European, and Hispanic Ancestry, 13 Genome Res. 4 (2003).
- SWGDAM (2019) Notice to U.S. Forensic Laboratories on the status of the U.S. Y-STR Database
- Office of the Texas Governor (2015) Texas Demographics
- Greg Hampikian et al., Case Report: Coincidental Inclusion in a 17-Locus Y-STR Mixture, Wrongful Conviction and Exoneration, 31 Forensic Sci. Int’l: Genetics 1 (2017).
- Curtis v. State, 205 S.W.3d 656, 661 (Tex. App.—Fort Worth 2006, pet. ref’d)
- Cortez v. State, AP-76,101, 2011 WL 4088105, at *29 (Tex. Crim. App. Sept. 14, 2011) (Johnson, J., concurring).
- SWGDAM (2019) Notice to U.S. Forensic Laboratories on the status of the U.S. Y-STR Database; ISFG (2020) DNA commission of the International Society of Forensic Genetics (ISFG): Recommendations on the Interpretation of Y-STR Results in Forensic Analysis.