$\S$ 1. Data

We analyzed 80 surveys from 29 SSA countries; each country contributed between 1 and 6 surveys conducted between 2000 and 2018. HIV incidence ranged from a minimum of 0.09 new cases per 1000 population in Niger in 2018 to 50.04 new cases per 1000 population in Zimbabwe in 1991. HIV prevalence ranged from <0.1 per 100 population in Gambia and Senegal in 1990 and Gambia in 1991, to 25.4 per 100 population in Zimbabwe in 1996. HIV mortality ranged from 0.01 per 1000 population in Benin in 1990 to 12.83 per 1000 population in Zimbabwe in 2003. Likewise, sociobehavioral characteristics varied substantially across SSA countries and surveys [Table 1].

$\S$ 2. Analysis

2.1. Sociobehavioral characteristics and Principal Component Analysis

Using PCA, we find that 95% of the variance in the original 46-dimensional data set can be explained by the first 8 PCs alone, with the first, PC-1, explaining 60.7% and the second, PC-2, explaining 12.5% of the total variance across the surveys done in SSA after 2015 [Fig. 1B]. Of the 46 original sociobehavioral indicators, male circumcision (10.7%) contributed the most to these 2 main PCs, followed by religion (9.8% for Muslim and 8.6% for Christian), acceptance of domestic violence (6.9% for women who think wife beating can be justified and 6.3% for married women who disagree with wife beating), HIV testing (6.6% for men and 5.2% for women), an accepting attitude towards people living with HIV/AIDS (PLWHA) (6.5% for men and 6.2% for women), rurality (6.2%), women participating in decisions (5.7%), literacy (5.0% for women and 4.1% for men), and ART coverage (4.6%) [Fig. 1A and Fig. S1].
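As an illustration of the quantities reported above, the following minimal sketch computes explained-variance ratios, the number of PCs needed to reach 95% of the variance, and each indicator's percentage contribution to the first two PCs. It makes several assumptions that are not stated in this section: the data are a synthetic stand-in (30 hypothetical surveys by 46 indicators), indicators are centred but not rescaled, and contributions are taken as squared loadings weighted by each PC's variance. The function name `pca_summary` is hypothetical.

```python
import numpy as np

def pca_summary(X, n_top=2):
    """Return explained-variance ratios, principal axes, and each indicator's
    percentage contribution to the first n_top components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre each indicator (assumed: no rescaling)
    # Rows of Vt are the principal axes (the rotation matrix, transposed).
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)        # variance along each PC
    ratio = eigvals / eigvals.sum()          # share of total variance per PC
    # Squared loadings sum to 1 within a PC; weight them by the PC's variance.
    weights = eigvals[:n_top] / eigvals[:n_top].sum()
    contrib = 100.0 * (weights[:, None] * Vt[:n_top] ** 2).sum(axis=0)
    return ratio, Vt, contrib

rng = np.random.default_rng(0)
# Hypothetical stand-in: 30 surveys x 46 indicators driven by 3 latent factors.
latent = rng.normal(size=(30, 3)) * np.array([8.0, 4.0, 2.0])
X = latent @ rng.normal(size=(3, 46)) + rng.normal(scale=0.5, size=(30, 46))

ratio, Vt, contrib = pca_summary(X)
n_needed = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)  # PCs needed to reach 95%
top_indicators = np.argsort(contrib)[::-1][:5]               # indices of largest contributors

print(f"PC-1: {100 * ratio[0]:.1f}%, PC-2: {100 * ratio[1]:.1f}%, PCs for 95%: {n_needed}")
print("Top contributing indicator indices:", top_indicators.tolist())
```

The contributions sum to 100% across the 46 indicators, which makes them comparable to the percentages quoted in the text, although the exact weighting used for the reported figures may differ.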

[Figure 1: panels 1A and 1B]

Figure 1A shows how the rotation matrix projects the original sociobehavioral characteristics onto the 2-dimensional space of PC-1 and PC-2. Similarly, Figure 2A shows how this transformation places the SSA surveys since 2015 onto the same 2D space. We discern an upside-down V-shape. On the lower left-hand side is the first cluster of countries, identified in red, mostly from southern and eastern Africa (Burundi, Kenya, Lesotho, Malawi, Rwanda, Uganda, Zambia, and Zimbabwe). These countries are characterized by more accepting attitudes towards PLWHA and better knowledge about HIV, higher literacy rates, higher ART coverage and HIV testing, but notably lower rates of male circumcision. On the lower right-hand side, identified in yellow, is a second cluster of countries of the Sahel (Chad, Mali, and Senegal), characterized by a high percentage of Muslim populations, a lower mean number of sexual partners, and fewer women participating in decisions. In between and higher on the PC-2 axis, identified in orange, is the third cluster of countries, mostly from West and Central Africa (Angola, Benin, Cameroon, Ethiopia, Ghana, and Nigeria). These countries are characterized by a less rural population, stronger women's empowerment (women who work and disagree with domestic violence), and high rates of male circumcision.

Projecting earlier surveys onto the same 2D space gives a longitudinal perspective on the evolution of sociobehavioral characteristics in SSA since the turn of the millennium [Figure 2]. The upside-down V-shape is mostly preserved through time, and although there are some edge cases like Ethiopia (between Sahel and western/central countries) and Gabon (between western/central and eastern/southern countries), the geographical and sociobehavioral similarities remain strong and the clusters easily distinguishable. Since 2000, while no clear movement can be seen on the PC-2 axis, all three clusters show a clear shift towards the negative PC-1 direction, although to varying degrees. In all three clusters, this shift corresponds to an increase in HIV testing, particularly for women, an increase in ART coverage, and an increase in acceptance of PLWHA. While this leftward shift is slowed in the second cluster (countries of the Sahel region) by a decrease in working women, the trend is accentuated in the other two clusters by increases in knowledge about HIV and in women's empowerment indicators (women participating in decision making and disagreeing with domestic violence), and, for the first cluster in red, by an increase in working men [Figures 2 and S3]. From 2000 to 2018, the first cluster moved the most, a Cartesian distance of almost 80 units; the second cluster drifted the least, only 30 units; and the third cluster moved almost 50 units. Sociobehavioral heterogeneity across clusters increased over time, as measured by the 2D Cartesian distance between them: from 2000 to 2018, the distance between the first and second clusters grew from 142 to 190 units, between the first and third clusters from 70 to 110 units, and between the second and third clusters from 90 to 110 units.
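The projection and distance measurements described above can be sketched as follows. This is a minimal illustration under stated assumptions: the rotation is fitted on a reference set of recent surveys only and then applied unchanged to earlier surveys, cluster positions are summarized by their centroids, and all data, cluster labels, and helper names (`fit_rotation`, `project`, `centroid_gap`) are hypothetical.

```python
import numpy as np

def fit_rotation(X_ref, n_components=2):
    """Fit the centring vector and rotation matrix on the reference surveys."""
    mean = X_ref.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_ref - mean, full_matrices=False)
    return mean, Vt[:n_components].T          # shape: (n_indicators, n_components)

def project(X, mean, rotation):
    """Place surveys in the PC-1/PC-2 plane defined by the reference fit."""
    return (X - mean) @ rotation

rng = np.random.default_rng(1)
labels = np.array([0] * 5 + [1] * 5)          # hypothetical cluster membership
centres = np.array([[60.0, 20.0, 30.0, 10.0, 50.0, 40.0],   # cluster 0, recent period
                    [20.0, 70.0, 40.0, 30.0, 10.0, 60.0]])  # cluster 1, recent period
X_recent = centres[labels] + rng.normal(scale=3.0, size=(10, 6))

# Hypothetical change per cluster between the early and the recent period.
shift = np.array([[8.0, -10.0, -2.0, -4.0, 8.0, -4.0],
                  [-4.0, 5.0, 1.0, 2.0, -4.0, 2.0]])
X_early = X_recent - shift[labels]

mean, rot = fit_rotation(X_recent)
Z_recent = project(X_recent, mean, rot)
Z_early = project(X_early, mean, rot)         # same rotation, earlier surveys

def centroid_gap(Z, a, b):
    """Cartesian distance between the centroids of clusters a and b."""
    return float(np.linalg.norm(Z[labels == a].mean(axis=0) - Z[labels == b].mean(axis=0)))

gap_early = centroid_gap(Z_early, 0, 1)
gap_recent = centroid_gap(Z_recent, 0, 1)
# Distance travelled by cluster 0 between the two periods.
drift_0 = float(np.linalg.norm(Z_recent[labels == 0].mean(axis=0)
                               - Z_early[labels == 0].mean(axis=0)))

print(f"gap early: {gap_early:.1f}, gap recent: {gap_recent:.1f}, drift of cluster 0: {drift_0:.1f}")
```

Reusing the reference rotation for every period keeps all surveys in one coordinate system, so distances measured in different periods are directly comparable.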

[Figure 2: panels for 2015-2019, 2010-2014, 2005-2009, and 2000-2004]

While overall levels of HIV incidence have tended to decrease between 2000 and 2019, we find that countries of the same cluster have mostly similar HIV incidence and that HIV incidence differs across clusters. Countries of the first cluster, in red, tend to have a very high HIV incidence; countries of the second cluster, in yellow, have low HIV incidence; and countries of the third cluster, in orange, tend to land somewhere in between, with rather low levels of HIV incidence. This pattern did not develop over the period 2000-2019, and we find no association between changes in sociobehavioral characteristics since 2000 and the decreases in HIV incidence over the same period [Figure 3].

2.2. Course of HIV epidemics and effective contact rate β

Figure 3 also shows the progression of the effective contact rates across SSA since 1990. We discern a sharp decrease from 1990 to 2000: effective contact rates varied substantially across SSA in 1990, whereas low values are standard across SSA by 2019. Despite these changes, the three clusters have a remarkably similar progression of effective contact rates.
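This section does not restate how β is defined. As a purely illustrative sketch, one common convention in a simple susceptible-infected (SI) model is to set new infections per year equal to β · S · I / N and solve for β from reported incidence and prevalence. The function name `effective_contact_rate`, the SI-type model, and the input values below are all assumptions, not the definition or data used in this analysis.

```python
def effective_contact_rate(incidence_per_1000, prevalence_per_100):
    """Back out beta from incidence = beta * S * I / N (an assumed SI-type model).

    incidence_per_1000: new cases per 1000 total population per year.
    prevalence_per_100: people living with HIV per 100 population.
    """
    incidence = incidence_per_1000 / 1000.0   # new cases per person-year
    infected = prevalence_per_100 / 100.0     # I / N
    susceptible = 1.0 - infected              # S / N
    if infected <= 0.0 or susceptible <= 0.0:
        raise ValueError("beta is undefined when prevalence is 0% or 100%")
    return incidence / (susceptible * infected)

# Hypothetical inputs, not values taken from the surveys analysed here.
beta = effective_contact_rate(incidence_per_1000=10.0, prevalence_per_100=20.0)
print(round(beta, 4))
```

Under this convention, β falls when incidence declines faster than the product of the susceptible and infected fractions, which is one way a decreasing trajectory could arise even where prevalence remains high.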

[Figure 3: HIV incidence progression; β progression]

The silhouette score indicates that the countries of SSA are best clustered into 2 groups based on the progression of their effective contact rates. The first cluster, identified in green, includes countries whose effective contact rates are already decreasing in 1990 and continue to do so until rapidly reaching low steady-state values around the year 2000. The second cluster, identified in purple, includes countries whose effective contact rates tend to start decreasing only later, around 1995, and reach low steady-state values only between 2005 and 2010 [Figure 4].
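To illustrate how a silhouette score can select the number of clusters, the sketch below clusters synthetic β trajectories for several candidate values of k and keeps the k with the highest mean silhouette width. The clustering algorithm is not specified in this section, so plain k-means with Euclidean distance is used here as a stand-in; the trajectories (one early-declining group, one late-declining group) are hypothetical, as are the helper names `kmeans` and `silhouette`.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centres = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])       # next centre: point farthest from current ones
    centres = np.array(centres, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
                        for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels

def silhouette(X, labels):
    """Mean silhouette width: (b - a) / max(a, b), averaged over all points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = set(labels.tolist())
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)                # usual convention for singleton clusters
            continue
        a = D[i, same].sum() / (same.sum() - 1)   # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

rng = np.random.default_rng(2)
years = np.arange(1990, 2020)
# Hypothetical trajectories: one group declines from 1990, the other around 2000.
early = 0.1 + 0.6 * np.exp(-(years - 1990) / 4.0)
late = 0.1 + 0.6 / (1.0 + np.exp((years - 2000) / 2.0))
X = np.vstack([early + rng.normal(scale=0.01, size=(8, years.size)),
               late + rng.normal(scale=0.01, size=(8, years.size))])

scores = {k: silhouette(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print({k: round(v, 3) for k, v in scores.items()}, "best k:", best_k)
```

A time-series-specific distance could be substituted for the Euclidean distance in `silhouette` and `kmeans` without changing the selection logic.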

[Figure 4: β of all countries; barycenters of the 3 original SB clusters; barycenters of the 2 β clusters]

We find that while the 3 original clusters based on sociobehavioral characteristics associate well with how high the peak HIV incidence will be, the 2 new clusters based on effective contact rates associate with the timing of the epidemics and when those peak HIV incidence levels will occur [Figure 5].

[Figure 5: SB space clusters and peak HIV incidence; β clusters and date of peak HIV incidence]

References