Why do Ethiopians have Caucasian features




















The selection of a particular value for K is going to be really important, and we shouldn't confuse the method with the reality which the method is trying to plumb. I've re-edited to highlight populations which might inform the variation of Ethiopians.

Now let's look at a series of K's. Note the changes. Luckily for us, we don't need to stop here. Dienekes included Behar's Ethiopian non-Jews for Dodecad. Additionally, he included the Masai population from the HapMap.

This turns out to be important because he found that Ethiopian Sub-Saharan ancestry is similar to that of the Masai, not the other African groups. Dienekes also provided individual outputs. I've stitched together Ethiopians with Egyptians and Saudis. The color coding is the same as above. You should be able to tell where the three groups start and stop pretty easily.

Ethiopians, in particular highland Ethiopians, seem to me likely to be an ancient stabilized hybrid of a population from Arabia and a local Sub-Saharan population.

This population seems unlikely to have been related to the peoples of West-Central Africa, who are associated with the Bantus across eastern and southern Africa. The Bantu agricultural toolkit runs into ecological constraints in various regions, and it is in those regions that non-Bantu populations have persisted. Ethiopia, with its unique climate and topography, naturally remains non-Bantu, as does the Horn of Africa as a whole. The possible connections between Khoisan and Ethiopia may be a function of the fact that these areas harbor genetic variants which have disappeared in the intervening regions because of the Bantu expansion.

DNA from a man who lived in Ethiopia about 4,500 years ago is prompting scientists to rethink the history of human migration in Africa. Until now, the conventional wisdom had been that the first groups of modern humans left Africa roughly 70,000 years ago, stopping in the Middle East en route to Europe, Asia and beyond.

Then about 3,000 years ago, a group of farmers from the Middle East and present-day Turkey came back to the Horn of Africa, probably bringing crops like wheat, barley and lentils with them. Population geneticists pieced this story together by comparing the DNA of distinct groups of people alive today.

Since humans emerged in Africa, DNA from an ancient African could provide a valuable genetic baseline that would make it easier for scientists to track genome changes over time. Unfortunately, such DNA has been hard to come by. The samples of ancient DNA that have been sequenced to date were extracted from bodies in Europe and Asia that were naturally refrigerated in cooler climates.

His body was found face-down in Mota cave, which is situated in the highlands in the southern part of the country. The cool, dry conditions in the cave preserved his DNA, and scientists extracted a sample from the petrous bone at the base of his skull. The resulting sequence is the first nuclear genome from an ancient African, according to a report published Thursday in the journal Science.

Radiocarbon dating revealed that the bone was 4,500 years old. That meant Mota (as the researchers called him) lived before Eurasians returned to the African continent. Consistent with that timeline, Mota did not have any of the genetic variants for light-colored eyes or skin that evolved in the populations that left Africa. Nor did he have variants that arose in Eurasian farmers that allowed them to digest milk as adults.

The Chabu, a hunter-gatherer group and linguistic isolate, exhibit the strongest overall degree of genetic differentiation from all other ethnic groups, consistent with previous analyses highlighting their genetic distinctiveness 30. However, we clarify this further by showing the Chabu to be significantly more genetically similar to the Mezhenger sample than other samples examined here Supplementary Fig.

Third, we find unexpectedly high genetic similarities among groups classified into distantly related linguistic categories Fig. These observations demonstrate that shared linguistic affiliation, even using broad categories, is not always a reliable predictor of relatively higher genetic similarity.

This suggests that speakers of the first three tiers of Ethiopian language classifications at www. We also find that several groups spanning the three AA classifications of Cushitic, Omotic, and Semitic show high genetic similarity to each other on average and less genetic similarity to NS speakers Fig. We find no clear genetic evidence Omotic is an outgroup to other AA language groups, as previously claimed 29 , at least among Ethiopians.

Unsurprisingly, given our previous genetic similarity results Fig. However, using clusters rather than self-reported labels can increase power to infer ancestral histories by merging ethnic groups with similar genetic variation patterns.

This also can clarify ancestry inference, as it does not assume that all individuals reporting the same ethnicity share recent ancestry. Simulations mimicking patterns we observe showed that our approach accurately infers sources and dates of admixture Supplementary Fig.

Blue and green borders in the ancestry composition highlight different admixing sources. In particular we enclose the reference populations representing one of the inferred admixing sources with a thick blue line. We infer six broad categories of admixture, correlated with both geography and linguistics Fig. For example, 12 clusters primarily containing individuals from NS-speaking groups clusters 1—5, 7, 9, 10, 15, 48—50 on Fig.

Similar admixture is inferred in the AA Omotic-speaking Karo (cluster 6), AA Cushitic-speaking Dasanech (cluster 11) and linguistically-unclassified Chabu (cluster 8), which each show relatively high genetic similarity to NS-speakers Fig. In contrast, clusters primarily containing AA speakers, including all Ari and Wolayta clusters (clusters 22, 24, 25, 39, 41, 43, 45, 54, 56) and a cluster containing the linguistically-unclassified Negede-Woyto (cluster 58), typically show evidence of admixture between two or more sources related to the 4.

Eurasian groups, over a broader range of dates generations ago. Among these, five northern clusters containing AA Semitic-speakers and the AA Cushitic-speaking Agaw (clusters 62, 64, 66-68), plus two geographically nearby clusters containing the AA Cushitic-speaking Qimant (cluster 63) and AA Omotic-speaking Shinasha (cluster 65), show the highest amounts of Egypt-like ancestry in our dataset and similar admixture dates (point estimates 71-85 generations ago).

Inferred dates typically are more recent under the latter, indicating this analysis is picking up relatively more recent intermixing among sources represented by present-day Ethiopian clusters. Colours match those in Fig. Each Ethiopian cluster X, also including Mota, has a corresponding colour (outer circle). Lines of this colour emerging from X indicate that X was inferred as the best surrogate for the admixing source contributing the minority of ancestry to each other cluster it connects with.

The thickness of lines is proportional to the contributing proportion. Ethiopian clusters, with labels coloured by language category according to Fig. We next explored whether groups that share cultural practices also show evidence of recent intermixing. These practices include male and female circumcision and four different marriage practices (see Supplementary Note 6 for details). The average genetic similarity among groups sharing one of these six cultural traits in common was higher than that expected based on linguistic affiliation and spatial distance Fig.

Numbers of groups in each category are in parentheses in red. Each boxplot depicts the median (horizontal black bar), interquartile range (box), and minimum and maximum values (endpoints) across pairwise comparisons. Here we analyse a large-scale Ethiopian cohort densely sampled across ethnicities and geography, and annotated for cultural practices Supplementary Note 2. This resource enabled us to disentangle several factors shaping genetic structure in Ethiopians.

Wherever possible we only included individuals whose ethnicity matched that reported for parents and grandparents, which, if accurate, should exclude instances of ethnic re-identification and between-group intermixing occurring within the last two generations.

This inclusion criterion implies that the patterns we have inferred reflect genetic patterns in Ethiopia approximately two generations prior to the present-day. This plausibly underrepresents genetic similarity and intermixing among ethnic groups that would be observable in a random sample, though our results support widespread recent intermixing among ethnic groups nonetheless Fig.

Consistent with this isolation, these groups also exhibit signatures of recent endogamy as reflected by higher degrees of genetic homogeneity Supplementary Fig. In the Ari, we infer very similar sources and dates of admixture in independent analyses of distinct clusters that correspond to occupational groups clusters 22, 24 and 25 in Fig.

A parsimonious explanation of these findings, consistent with our simulations Supplementary Fig. This corresponds to the time period during which iron working is thought to have first appeared in Ethiopia 79 and supports the marginalisation theory of their origins 80 consistent with previous genetic studies 31 , Analogous to this, in the Chabu, who are not linguistically classified by Ethnologue, we infer admixture events dated to — years ago and ancestry proportions that are similar to those inferred in the Mezhenger Fig.

For the Negede-Woyto, the other group in this study for which there is no established linguistic classification in Ethnologue, we infer a relatively high amount of Egyptian-related ancestry Fig. The large number of non-Ethiopian groups included in this sample, particularly those geographically proximal to Ethiopia, diminishes this possibility, but more samples from other sources, in particular from ancient individuals in Ethiopia, may increase our ability to identify older ancestral differences between Ethiopians using these techniques.

We also identify a correlation between genetic similarity and elevation difference, even after correcting for genetic similarity over geographic distances. Strikingly, we also see a correlation between spatial distance and the degree of genetic ancestry related to Mota, an ancient individual 4 who lived about 4,500 years ago and whose remains were found in the Gamo Highlands of present-day Ethiopia Supplementary Fig.

This suggests a notable preservation of some population structure in parts of Ethiopia over the intervening period 4 , This timing is also consistent with trading ties between the greater Horn and Egypt. Such recent intermixing is consistent with mixed ancestry signals we see in some NS groups e.

We excluded their aDNA samples as reference groups, because they reported them to have admixture from these four sources. While using different reference groups and techniques complicates direct comparisons, our inferred sources of ancestry broadly agree with that study. For example, the Agaw clusters 66, 67 have relatively more Levant-like ancestry which we match most closely to Egypt , the Ari clusters 22, 24, 25; called Aari in Prendergast et al. Simulations mimicking the admixture inferred here show high accuracy in inferred dates and sources, though illustrate a limitation whereby older dates of admixture e.

Thus complex intermixing events, such as those exhibited here, can be difficult to dissect fully with these approaches and sample sizes, e. A potential example is the NS-speaking Berta (clusters 48, 50), in which we infer only a single recent date of admixture but who have complicated sources of ancestry that suggest multiple events Fig.

This suggests a recent separation of these groups, i. Future work can compare these and other published genetic results e. For example, some Mezhenger report that their ancestors originally migrated from Sudan to the present-day Gambella Regional State where Anuak lived, after which they migrated with the AA Omotic-speaking Sheko for a period before settling in their present-day homeland (The Council of Nationalities, Southern Nations and Peoples Region). Our study also highlights the importance of considering topographical and cultural factors, in particular language, ethnicity and in some cases occupation, when designing sampling strategies for future Ethiopian genetic studies, e.

Similar sampling strategies may be necessary to capture the genetic structure of peoples in some other African countries that also exhibit relatively high levels of genetic diversity and structure 65 , Finally, our analyses illustrate how cultural practices, e.

DNA samples from the Ethiopians whose autosomal genetic variation data are newly reported in this study (following quality control, see below) were collected in several field trips from to , through a long-standing collaboration including researchers at University College London and Addis Ababa University. All study participants, including non-Ethiopians whose genetic variation data are newly reported in this study, gave their informed consent.

Local permissions were obtained in all cases where applicable local ethical approval and regulations existed, e. Buccal swab samples were collected from anonymous donors over 18 years of age, unrelated at the paternal level. In order to mitigate the effects of admixture from recent migrations that may be causing any genetic distinctions between ethnic groups to blur, analogous to Leslie et al. However, for a few ethnic groups (Bana, Meinit, Negede Woyto, Qimant, Shinasha, Suri), we did not find any individuals fulfilling this birthplace condition; in such cases we randomly selected individuals whose grandparents had the same ethnicity.

We did not have geographic or birthplace information for Beta Israel individuals whose genetic variation data are newly released in this study. All the Ethiopian individuals included in the dataset are classified into 75 groups based on self-reported ethnicity (68 ethnic groups) plus occupations (Blacksmith, Cultivator, Potter, Tanner, Weaver) within the Ari and Wolayta ethnicities.

Supplementary Data 1 shows the number of samples from each Ethiopian population and ethnic group that passed genotyping QC and were used in subsequent analyses. Figure 1a shows the geographic locations i. For comparison, we also incorporated non-Ethiopians after quality control below from labelled present-day populations, and 40 high coverage aDNA genomes including Mota , as described in this paragraph.

To these data we added present-day Indians and Iranians published by Broushaki et al. Using a stepwise greedy approach, we then selected individuals from this list that were in the most pairs to be excluded from further analysis, continuing until at least one individual had been removed from every pair. This resulted in a total of individuals removed, including 62 Ethiopians. Following the quality control described above, the total number of samples in the merge was , analyzed at , autosomal SNPs.
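The stepwise greedy exclusion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `greedy_prune`, the input format (a list of flagged pairs), and the alphabetical tie-breaking rule are all assumptions made for the example.

```python
from collections import defaultdict


def greedy_prune(flagged_pairs):
    """Return the set of individuals to drop so that every flagged pair
    loses at least one member.

    flagged_pairs: iterable of (id_a, id_b) tuples (hypothetical format).
    At each step the individual appearing in the most remaining pairs is
    removed; ties are broken alphabetically (an assumption, for determinism).
    """
    remaining = {frozenset(p) for p in flagged_pairs if p[0] != p[1]}
    removed = set()
    while remaining:
        # Count how many unresolved pairs each individual still appears in.
        counts = defaultdict(int)
        for pair in remaining:
            for individual in pair:
                counts[individual] += 1
        # max() keeps the first maximal element, so sorting breaks ties by name.
        worst = max(sorted(counts), key=lambda ind: counts[ind])
        removed.add(worst)
        # Every pair containing the removed individual is now resolved.
        remaining = {pair for pair in remaining if worst not in pair}
    return removed
```

For instance, `greedy_prune([("a", "b"), ("a", "c"), ("d", "e")])` drops `a` first because it sits in two pairs, then one member of the remaining pair.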

We used three different approaches to assess within-group genetic homogeneity in the Ethiopian ethnic groups. First, we computed the observed autosomal homozygous genotype counts for each sample using the --het command in PLINK v1. Second, we pruned SNP data based on linkage disequilibrium --indep-pairwise 50 5 0. This ROH procedure finds runs of consecutive homozygous SNPs within groups that are identical-by-descent; here we report the total length of these runs per individual Supplementary Fig.

For each population, we report the fraction of the genome that each pair of individuals shares IBD Supplementary Fig. We assessed whether the degree of genetic diversity in Ethiopian ethnic groups was associated with census population size, by comparing different measures of genetic diversity described above homozygosity, IBD and ROH with the census population size using standard linear regression Supplementary Fig.

As population censuses are not always available and can be inaccurate, we limited this analysis to ethnic groups in the SNNPR, for whom census information was recently reported The Council of Nationalities, Southern Nations and Peoples Region, By modelling correlations among neighboring SNPs i. These probabilities are then tabulated across all positions to infer the total proportion of DNA for which each target haploid shares an MRCA with each reference haploid.

We can then sum these total proportions across the reference haploids assigned to each of K pre-defined groups. Following van Dorp et al. This is because individuals from groups subjected to such isolation typically will match relatively long segments of DNA to only a subset of Ethiopians i. However, this isolation will not affect how the same individuals match to each non-Ethiopian under analysis 2 , for which they typically share more temporally distant ancestors.

Consistent with this, in our sample the average size of DNA segments that an Ethiopian individual matches to another Ethiopian is 0. Potentially this could give more power by reducing noise in the inferred copy vector for each group through averaging. Overall this permutation procedure tests whether the ancestry profiles of individuals from A and B are exchangeable, while accounting for sample size and avoiding how some permutations may by chance put an unusually large proportion of individuals from the same group into the same permuted group.

For each Ethiopian group A , in Supplementary Fig. For each A , any group B where we cannot reject the null hypothesis at the 0. We found that results change very little, e. Therefore, we assumed. As the main observed signal of association between genetic and spatial distance is the increased G ij at small values of d ij , e.

Therefore, for these analyses we assumed: To test whether geographic distance was still associated with genetic similarity after accounting for elevation difference, we assumed: Then to test for an association between genetic similarity and geographic distance after accounting for elevation, we used Eq. Similarly, to test for an association between genetic similarity and elevation difference after accounting for geographic distance, we replaced x ij in Eq.

We used the same permutation procedure described above to generate p values Supplementary Table 5b, d. We then tested whether sharing the same (A) self-reported group label, (B) language category of reported ethnicity, (C) self-reported first language, (D) self-reported second language, or (E) self-reported religious affiliation was significantly associated with increased genetic similarity after accounting for geographic distance or elevation difference.

For (B), we used the four labels in the second tier of linguistic classifications at www. To test whether each of these factors is associated with genetic similarity, we repeated the above analyses that use Eqs. Our reported p values give the proportion of permutations for which genetic similarity among permuted individuals sharing the same Y is more extreme than or equal to that of the real un-permuted data. As group label, language and religion can also be correlated with spatial distance and with each other e.

For example, when fixing (A), we only permuted birthplaces and each of (B)-(E) across individuals within each group label, hence preserving the effect of group label on G ij. Applying this permutation procedure for each of (A)-(E), we repeated all tests described above, reporting p values in Supplementary Table 5. For each of geographic distance, elevation difference, and (A)-(E), our final p values reported in the main text and Fig. First, relative to the distances between birthplaces among all individuals, Ethiopians who share the same group label or who share the same first language live near each other Supplementary Table 6, so that permuting birthplaces while fixing group label or first language does not permute across large spatial distances.

Therefore, we ignore those permutations when reporting our final p values for geographic distance and elevation difference i. Second, the high correlation between group label and first language Supplementary Fig. Furthermore, few permutations are possible when testing language group while accounting for group label 0 permutations available or first language.

Therefore, we excluded permutations fixing group and fixing first language when testing each of group, first language and language group when reporting our final p values in the main text and Supplementary Fig. Note we do observe a significant association between genetic similarity and ethnicity after accounting for spatial distance (geographic or elevation) and major language group, suggesting ethnicity explains genetic similarity beyond that of classifications according to the second language tier at Ethnologue.

We caution that these analyses assume that the relationships among genetic, geographic and elevation distance can be modelled with simple linear or exponential functions, which is sometimes debatable Supplementary Fig. To focus on the fine-scale clustering of Ethiopians, we fixed all non-Ethiopian samples in the dataset as seven super-individual populations (Africa, America, Central Asia Siberia, East Asia, Oceania, South Asia and West Eurasia) that were not merged with the rest of the tree.

Following Lawson et al. Starting from this clustering, we then performed , additional hill-climbing steps to find a nearby state with even higher posterior probability. This gave a final inferred number of clusters containing Ethiopians. We used a visual inspection of this tree to merge clusters, starting at the bottom level of clusters, that had small numbers of individuals of the same ethnicity, as shown in Supplementary Fig.

After merging, we ended up with a total of 78 Ethiopian clusters. We followed Leslie et al. In particular for each of these MCMC samples, we assigned a certainty score for each individual i being assigned to each final cluster j out of 78 as the percentage of individuals assigned to the same cluster as individual i in that MCMC sample that are found in final cluster j.

For each combination of individual and final cluster, we averaged these certainty scores across all MCMC samples. For each of our 78 final clusters, in Supplementary Data 4 we report the average certainty score of being assigned to that cluster across all individuals assigned to that cluster.
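The certainty score defined above can be illustrated with a short sketch. It assumes each MCMC sample is available as a mapping from individual to cluster label, and that an individual counts among its own cluster-mates; neither detail is stated in the text, and the names `certainty_scores`, `final_assignment` and `mcmc_samples` are hypothetical.

```python
def certainty_scores(final_assignment, mcmc_samples):
    """Average certainty score for every (individual, final cluster) pair.

    final_assignment: dict individual -> final cluster label.
    mcmc_samples: list of dicts, each mapping individual -> cluster label
                  in one MCMC sample (hypothetical input format).
    Returns a dict keyed by (individual, final_cluster) holding the mean
    percentage across samples.
    """
    final_clusters = sorted(set(final_assignment.values()))
    totals = {(i, j): 0.0 for i in final_assignment for j in final_clusters}
    for sample in mcmc_samples:
        # Group individuals by the cluster they fall in for this sample.
        members = {}
        for individual, cluster in sample.items():
            members.setdefault(cluster, []).append(individual)
        for i in final_assignment:
            mates = members[sample[i]]  # includes i itself (assumption)
            for j in final_clusters:
                in_j = sum(1 for m in mates if final_assignment[m] == j)
                totals[(i, j)] += 100.0 * in_j / len(mates)
    n_samples = len(mcmc_samples)
    return {key: value / n_samples for key, value in totals.items()}
```

Averaging the returned values over the individuals in a given final cluster would give a per-cluster summary of the kind reported in Supplementary Data 4.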

This average certainty score had a mean of For comparison, the average certainty score of being assigned to a cluster other than the final classification we used had a mean of 0. We note that clusters do not necessarily correspond to distinct groups that split from one another in the past, but instead provide a convenient means to increase power and clarity of ancestry inference by (i) merging people with similar genetic variation patterns, and (ii) separating individuals of the same self-identified label that have different genetic variation patterns.

We fixed the mean of this truncated Poisson to 4 while allowing 8 total groups to contribute at each MCMC iteration, otherwise using default parameters. In Supplementary Data 7 and Fig. For 2 , following Hellenthal et al. We report results for null. To test whether individuals from language classification A are more genetically similar to each other than an individual from classification A is to an individual from classification B , we followed an analogous procedure to that detailed above to test for genetic differences between group labels A and B.

Despite the SNNPR book also containing information about the Ari, we did not include them among these 46 because of the major genetic differences among occupational groups Fig. For the Wolayta, we included individuals that did not report belonging to any of the occupational groups analysed here. To do so, if H ethnic groups in total reported participating in a practice, any pair of ethnicities that both reported participating in this practice added a contribution of 1.

Similarly, if Z ethnic groups in total reported not participating in a practice, any pair of ethnicities that both reported not participating in this practice added a contribution of 1.

Genetic similarity, geographic distance and elevation difference between two ethnic groups A, B were each calculated as the average such measure between all pairings of individuals (i, j) where i is from A and j is from B.
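As a small illustration of this between-group averaging, the sketch below takes any pairwise measure between individuals and averages it over all cross-group pairings. The function name and the dictionary-of-pairs input are assumptions chosen for the example.

```python
def between_group_mean(group_a, group_b, pairwise):
    """Average a pairwise measure over all (i, j) with i in A and j in B.

    group_a, group_b: lists of individual identifiers.
    pairwise: dict mapping (i, j) -> measure, e.g. genetic similarity,
              geographic distance or elevation difference (hypothetical format).
    """
    values = [pairwise[(i, j)] for i in group_a for j in group_b]
    return sum(values) / len(values)
```

The same function serves all three measures, since only the contents of `pairwise` change.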

We then applied a Mantel test using the mantel function in the vegan library in R with , permutations to assess the significance of association between genetic and cultural similarity across all pairings of ethnic groups Supplementary Table 8. We also used separate partial Mantel tests, using the mantel. For each of the 31 cultural practices, all 46 ethnic groups were classified as either (i) reporting participation in the practice, (ii) reporting not participating in the practice or (iii) not reporting whether they participated in the practice.
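For readers unfamiliar with the Mantel test mentioned above, here is a rough pure-Python sketch of the idea: correlate the off-diagonal entries of two symmetric matrices, then jointly permute the rows and columns of one matrix to build a null distribution. It is not the vegan implementation; the function names, the default permutation count and the `(hits + 1) / (n_perm + 1)` p-value convention are assumptions of this example.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(matrix, order):
    """Off-diagonal entries after relabelling rows and columns by `order`."""
    idx = range(len(order))
    return [matrix[order[i]][order[j]] for i, j in combinations(idx, 2)]


def mantel(a, b, n_perm=999, seed=0):
    """One-sided Mantel test between symmetric square matrices a and b."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    b_vec = upper_triangle(b, identity)
    observed = pearson(upper_triangle(a, identity), b_vec)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of a together
        if pearson(upper_triangle(a, perm), b_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Permuting whole rows and columns, rather than individual cells, preserves the dependence among distances that involve the same group, which is the reason a Mantel test is used instead of an ordinary correlation test.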

To do so, we calculated the difference in mean genetic similarity among all pairs of groups assigned to X versus that among all pairs assigned to Y. We then randomly permuted ethnic groups across the two categories 10, times, calculating p values as the proportion of times where the corresponding difference between permuted groups assigned to X versus Y was higher than that observed in the real data.
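The permutation scheme just described can be sketched as below. This is a simplified illustration assuming a precomputed group-by-group similarity table and at least two groups per category; the function names, the `sim` format and the default number of permutations are hypothetical rather than taken from the study.

```python
import random
from itertools import combinations


def mean_pair_similarity(groups, sim):
    """Mean similarity over all unordered pairs of groups (needs >= 2 groups)."""
    pairs = list(combinations(sorted(groups), 2))
    return sum(sim[frozenset(pair)] for pair in pairs) / len(pairs)


def permutation_p_value(x_groups, y_groups, sim, n_perm=1000, seed=0):
    """Test whether groups in X are more similar to each other than groups in Y.

    sim: dict mapping frozenset({g1, g2}) -> genetic similarity
         (hypothetical input format).
    Returns (observed difference, proportion of permutations in which the
    permuted difference is strictly higher than the observed one).
    """
    rng = random.Random(seed)
    observed = mean_pair_similarity(x_groups, sim) - mean_pair_similarity(y_groups, sim)
    pooled = list(x_groups) + list(y_groups)
    n_x = len(x_groups)
    higher = 0
    for _ in range(n_perm):
        # Reassign group labels to the two categories, keeping category sizes.
        rng.shuffle(pooled)
        diff = (mean_pair_similarity(pooled[:n_x], sim)
                - mean_pair_similarity(pooled[n_x:], sim))
        if diff > observed:
            higher += 1
    return observed, higher / n_perm
```

Keeping the category sizes fixed during shuffling means the null distribution reflects only which groups are labelled X or Y, not how many.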

These p values remained after first adjusting for spatial distance as described in this paragraph. We calculated the average genetic similarity between all ethnic groups sharing these six practices after accounting for the effects of spatial distance and language classification. To account for spatial distance, we used Eqs. In each of the above regressions, we fit our models using all pairs of Ethiopians that were not from the same language classification at the branch level i.


