FAQ

How should the SVAC data be cited?

In a references list, the SVAC Dataset may be cited as follows:

Cohen, Dara Kay and Ragnhild Nordås. 2014. Sexual Violence in Armed Conflict Dataset. [Date Retrieved], from the Sexual Violence in Armed Conflict Dataset website: http://www.sexualviolencedata.org

Please note that additional information and more FAQs are available in the SVAC Codebook.

How is “sexual violence” defined?

The definition of sexual violence used in the SVAC Dataset builds on the definition used by the International Criminal Court, and includes (1) rape, (2) sexual slavery, (3) forced prostitution, (4) forced pregnancy, and (5) forced sterilization/abortion. Following Elisabeth Wood (2009), we also include (6) sexual mutilation, and (7) sexual torture.

The definition is gender neutral, and does not preclude the existence of female perpetrators or male victims, both of which are observed in the data.

We focus on behaviors that involve direct force or physical violence, and therefore exclude acts that do not go beyond verbal sexual harassment, such as sexualized insults or verbal humiliation.

Which conflict actors are included?

In the original SVAC dataset, we included reports of sexual violence by three actor types: (1) state forces and (2) rebel groups, both drawn from the UCDP dataset by Harbom et al. (2008), and (3) pro-government militias from Carey et al. (2012).

The SVAC 2.0 and 3.0 Datasets (1989-2019) include reports of sexual violence by state forces and rebel groups. They also retain pro-government militias for the years 1989-2009. We do not include pro-government militias in the years 2010-2019 because the Pro-Government Militia Database (Carey et al. 2013) ends in 2009. We plan to update the SVAC 3.0 Dataset following the release of the updated Pro-Government Militia Database.

A pro-government militia is defined by Carey et al. (2013) as “a group that is identified as pro-government or sponsored by the government (national or subnational), is identified as not being part of the regular security forces, is armed, and has some level of organization.” More information about the militia data can be found here: http://www.sowi.uni-mannheim.de/militias/

Actors such as domestic police, interrogators, border patrol, border police, and checkpoint police are coded as government actors only in cases where there is explicit evidence that the violence perpetrated by these groups is directly conflict-related and/or directed at an insurgent or suspected insurgent.

Note that we do not code sexual violence perpetrated by civilians, such as intimate partner violence. We also do not code sexual violence perpetrated against members of an armed group’s own organization.

What types of conflicts are included?

The SVAC Dataset includes all active state-based armed conflicts in the period 1989-2019, based on Harbom et al. (2008). An armed conflict is defined as: “a contested incompatibility that concerns government and/or territory where the use of armed force between two parties, of which at least one is the government of a state, results in at least 25 battle-related deaths” (Gleditsch et al. 2002). The conflict definition, therefore, captures large-scale wars (defined as those with more than 1000 battle deaths per year) as well as lower intensity armed conflicts. It also includes intrastate, internationalized internal, and interstate conflicts.

How is prevalence coded?

The prevalence measure gives an estimate of the relative magnitude of sexual violence perpetrated by a conflict actor in a particular year. It is coded on an ordinal scale, adapted from Cohen (2010; 2016) and discussed in Cohen and Nordås (2014). Note that the coding is based primarily on the qualitative description; we rely on a count of estimated incidents only when a description lacks all of the relevant key words. The SVAC dataset cannot be used to estimate the number of victims.

Prevalence = 3 (Massive). Sexual violence is likely related to the conflict, and:

Sexual violence was described as “massive,” “systematic,” or “innumerable”
Actor used sexual violence as a “means of intimidation,” “instrument of control and punishment,” “weapon,” “tactic to terrorize the population,” “terror tactic,” or “tool of war,” or on a “massive scale”

Note: Reports of 1,000 or more incidents or victims of sexual violence are coded as 3.

Prevalence = 2 (Common). Sexual violence is likely related to the conflict, but did not meet the requirements for a 3 coding, and:

Sexual violence was described as “widespread,” “common,” “commonplace,” “extensive,” “frequent,” “often,” “persistent,” “recurring,” a “pattern,” a “common pattern,” or a “spree”
Sexual violence occurred “commonly,” “frequently,” “in large numbers,” “periodically,” “regularly,” “routinely,” “widely,” or on a “number of occasions;” there were “many” or “numerous instances”

Note: Reports of 25-999 incidents or victims of sexual violence are coded as 2.

Prevalence = 1 (Some). Sexual violence is likely related to the conflict, but did not meet the requirements for a 2 or 3 coding, and:

There were “reports,” “isolated reports,” or “there continued to be reports” of occurrences of sexual violence

Note: Reports of fewer than 25 incidents or victims of sexual violence are coded as 1.

Prevalence = 0 (No reported sexual violence). A report was issued for a country in a given year, but there was no mention of sexual violence related to the conflict.

Prevalence = -99 (Missing; BOTH no report AND no information.) No report was issued for a country-year and no data about this actor-conflict-year was available from subsequent years.
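The ordering of the rules above (qualitative key words first, counts only as a fallback) can be sketched in a few lines of Python. This is an illustration only, not the project's coding procedure: the actual coding is done by coders reading the source reports, the key word sets below are small subsets of the terms quoted above, and the function names (code_prevalence, prevalence_from_count) are hypothetical. The sketch assumes a coder has already identified which key words appear in a description.

```python
from typing import Iterable, Optional

# Small subsets of the key words quoted in this FAQ (not the full lists).
MASSIVE = {"massive", "systematic", "innumerable", "weapon", "tool of war"}
COMMON = {"widespread", "common", "extensive", "frequent", "routinely"}
SOME = {"reports", "isolated reports"}


def prevalence_from_count(count: int) -> int:
    """Count-based fallback: 1,000 or more -> 3, 25-999 -> 2, 1-24 -> 1."""
    if count < 1:
        raise ValueError("count must be a positive number of incidents or victims")
    if count >= 1000:
        return 3
    if count >= 25:
        return 2
    return 1


def code_prevalence(keywords: Iterable[str], count: Optional[int] = None) -> int:
    """Key words take priority; the count is used only when none are present."""
    found = {k.lower() for k in keywords}
    if found & MASSIVE:
        return 3
    if found & COMMON:
        return 2
    if found & SOME:
        return 1
    if count is not None:
        return prevalence_from_count(count)
    raise ValueError("no key words and no count: nothing to code from")
```

For example, code_prevalence(["widespread"], count=5000) yields 2 because the qualitative description takes precedence, while code_prevalence([], count=5000) falls back to the count and yields 3. The codes 0 and -99 depend on whether a report was issued at all and are deliberately left out of this sketch.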

Why are the prevalence scores sometimes different across the three sources?

Prevalence scores are coded separately from each of the three sources. “Prev_ State” scores are assigned using information from U.S. State Department annual reports. “Prev_ HRW” scores are assigned using information from Human Rights Watch annual and periodic special reports. “Prev_ AI” scores are assigned using information from Amnesty International annual and periodic special reports.

Note that these are the only sexual violence variables that are disaggregated by source. All other variables reflect reporting from one or more of the three sources. The Conflict Manuscripts contain details about which source was used to determine the code for each variable.

Which prevalence score should I use?

This depends on the nature of the research project. One option is to use the average of the three scores; a second is to use the highest value across the three scores; a third is to estimate separate models for each source. The Bibliography page lists publications that show how others have used these variables.
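As a minimal sketch of the first two options, the Python function below combines the three source-specific scores for one actor-conflict-year. The dictionary keys ("state", "hrw", "ai") and the function name are placeholders rather than the dataset's variable names, and dropping -99 values before combining is an assumption of this sketch; other treatments of missing data may suit a given project better.

```python
from statistics import mean
from typing import Dict, Optional

MISSING = -99  # missing-value code described in this FAQ


def combine_prevalence(scores: Dict[str, int], method: str = "max") -> Optional[float]:
    """Combine per-source scores, ignoring sources coded as missing."""
    observed = [s for s in scores.values() if s != MISSING]
    if not observed:
        return None  # every source is missing for this observation
    if method == "max":
        return float(max(observed))
    if method == "mean":
        return float(mean(observed))
    raise ValueError(f"unknown method: {method}")


row = {"state": 2, "hrw": 3, "ai": MISSING}
highest = combine_prevalence(row, "max")    # 3.0
average = combine_prevalence(row, "mean")   # 2.5
```

Note that averaging treats the ordinal scale as if its steps were evenly spaced, which is a modeling choice rather than a property of the data.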

What are “interim years”?

Active conflict years are those conflict-actor-years that reach 25 or more battle-related deaths in a state-insurgent dyad.

We also include up to five years in between active conflict years; these are years when battle deaths drop below the 25 threshold but increase again before five years have elapsed. We call these “interim years.” For example, if a rebel group was active in 1993, 1994, and 1996, we include information about activities in those years and also code any sexual violence that was reported in 1995, the interim year. See the Codebook and User Guide for more details.
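The interim-year rule can be illustrated with a short Python sketch. It assumes that a gap counts as interim when it spans between one and five inactive years bounded by active years on both sides; that reading of "up to five years" and the function name are assumptions of this example, and the Codebook remains the authoritative source.

```python
from typing import Iterable, List


def interim_years(active_years: Iterable[int], max_gap: int = 5) -> List[int]:
    """Return inactive years that fall between two active years, for gaps of up to max_gap years."""
    years = sorted(set(active_years))
    interim: List[int] = []
    for prev, nxt in zip(years, years[1:]):
        gap = nxt - prev - 1  # inactive years between two consecutive active years
        if 1 <= gap <= max_gap:
            interim.extend(range(prev + 1, nxt))
    return interim


# The example from the text: active in 1993, 1994, and 1996.
example = interim_years([1993, 1994, 1996])  # [1995]
```

Under this assumption, a dyad active in 1990 and 1997 has a six-year gap, so none of the years in between would be treated as interim.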

What are “post-conflict years”?

Post-conflict years are defined as the first five years after a conflict dyad was last active. The exception is when a dyad ends due to actors shifting sides (for example, a rebel group becomes the government). See the Codebook and User Guide for more details.
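A minimal Python sketch of the five-year window follows. The function name is hypothetical, truncating the window at 2019 (the last year covered by the dataset) is an assumption of this example, and the exception for actors shifting sides is not modeled.

```python
from typing import List


def post_conflict_years(last_active_year: int, window: int = 5,
                        dataset_end: int = 2019) -> List[int]:
    """Return the first `window` years after the last active year, capped at dataset_end."""
    first = last_active_year + 1
    last = min(last_active_year + window, dataset_end)
    return list(range(first, last + 1))


years = post_conflict_years(2000)  # [2001, 2002, 2003, 2004, 2005]
```

A dyad last active in 2017 would yield only 2018 and 2019 under this sketch, because the remaining post-conflict years fall outside the period the dataset covers.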

When was the dataset updated?

We released an updated version of the original SVAC dataset (version 1.1) in November 2016. This version included a number of minor changes that correct errors from the initial release.

SVAC 2.0 is the first major update and extension of the dataset; it was released in November 2019 and adds the years 2010-2015.

SVAC 3.0 is the second major update and extension; it was completed in February 2021, released in May 2021, and adds the years 2016-2019.

Why are there fewer variables in SVAC 3.0?

The updated dataset is restricted to the core variables of interest: prevalence and forms of sexual violence. The updated version no longer includes the following variables from the original dataset:

pgm_id, selection, selection_ethnicity, selection_nationality, selection_religion, selection_age, selection_actor, selection_other, male, child, detainee, refugee, timing, timing_month, timing_military, timing_political, timing_errands, timing_search, location_text, location_camp, location_checkpoint, location_detention, location_private, location_school, public_public, public_semipublic, public_private, witness_family, witness_victims, witness_soldiers, witness_other, gang, byproxy.

Why are there two different SVAC 3.0 datasets available?

To facilitate the use of the SVAC data, we created two datasets:

(1) The complete version (SVAC_complete_1989-2019) conserves the original dataset structure by including interim years and the first five post-conflict years.

(2) The smaller dataset (SVAC_conflictyears_1989-2019) includes only active conflict-years. We created it because most analysts are primarily interested in sexual violence during active conflict-years.

Further Reading