Natural Resource Governance Institute

Extracting Trends and Truths from Oil, Gas and Mining Contracts: Text Analytics and ResourceContracts.org

7 June 2016
Author
Giorgia Cecchinato, Jim Cust
Topics
Contract transparency and monitoring, Open data, Tax policy and revenue collection
Stakeholders
Government officials, Private sector
Precepts
P3

As a global repository of more than 1,000 mining and petroleum contracts between governments and extractives companies, ResourceContracts.org now provides a large corpus of data for advanced text analytics. The site, re-launched in November, has been upgraded to include digitized text as well as scans of original documents. One objective is to give journalists, researchers and policymakers a rich source of information on the deals governments have been signing around the world.

In this blog we illustrate what is available on the platform and explore some examples of the kind of analysis that is now possible.

The contracts database now contains a total of 1,106 documents, including 863 digitized contracts; it hosts approximately 12.5 million words that can be searched, summarized and aggregated in response to queries. Most of the contracts available in the database are in English. The figure below illustrates the distribution of English-language contract documents across time, with 395 contracts (almost 80 percent) in the database originating during the 2000s.

Illustrative research objective:
Exploring how the prevalence of the key terms "infrastructure" and "profits tax" has changed across petroleum and mineral contracts in recent years

To guide potential users through the techniques available on ResourceContracts.org and its application programming interface (API), we use an illustrative research inquiry. Here we are interested in the evolution of key terms in resource contracts, to understand which specific terms may have become more or less prevalent over time. This can be a first step for a citizen or government official who wants to understand how trends in resource contracts compare to contractual terms in their own country.

Infrastructure

First, we queried the prevalence (count) of the term "infrastructure" across all the contracts in our sample. You can do so by searching for the term using the main site search tool, or by using the scripted Elasticsearch query function via the API.

Three hundred and thirteen contracts contain the term "infrastructure." Most mention it in the sections specific to infrastructure requirements and provisions. In some cases, this forms part of an agreement with the host government to share or jointly pay for that infrastructure with other parties (commonly termed "multi-user" or "multi-access" provisions). The term also occurs in provisions for "social infrastructure," meaning facilities such as schools, hospitals and other assets that accommodate community services (42 of 578 total contracts).
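A document count of this kind can also be sketched offline, once contract texts have been downloaded. The snippet below is a minimal illustration, not the site's own implementation: the `sample_contracts` dictionary and the function name `count_contracts_mentioning` are hypothetical, and the texts are invented for the example.

```python
import re

def count_contracts_mentioning(term, contracts):
    """Return how many contract texts contain `term` at least once (case-insensitive)."""
    # Word boundaries keep the match to the whole term.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sum(1 for text in contracts.values() if pattern.search(text))

# Hypothetical sample: contract id -> digitized text (invented for illustration).
sample_contracts = {
    "c1": "The Company shall construct the Infrastructure required for the mine.",
    "c2": "Royalties are payable quarterly to the State.",
    "c3": "Social infrastructure, including schools and clinics, shall be provided.",
}

print(count_contracts_mentioning("infrastructure", sample_contracts))
```

Counting documents rather than raw occurrences matches the measure reported here, namely the number of contracts that mention the term at least once.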

Looking at the graph above, we observe several notable trends. First, overall the prevalence of the term "infrastructure" is higher in mining contracts than in hydrocarbon (petroleum) contracts. (Forty-eight percent of mining contracts contain at least one mention of the term, compared with 36 percent of hydrocarbon contracts.) Second, prevalence rises over time, with higher relative occurrence in the 2010-2014 period than in earlier periods for both mining and hydrocarbons. Third, its prevalence in mining contracts has grown the most rapidly in our sample period: it began at a lower or similar relative prevalence compared to petroleum and ended the period significantly higher.
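The comparison across sectors and periods amounts to grouping contracts by metadata and computing, within each group, the share of contracts that mention the term. A minimal sketch follows, assuming each contract is available as a dictionary with hypothetical `sector`, `year` and `text` keys; the five-year bucketing and the sample records are assumptions made for illustration.

```python
from collections import defaultdict

def period_of(year, width=5):
    """Map a signature year to a fixed-width period label, e.g. 2012 -> '2010-2014'."""
    start = year - year % width
    return f"{start}-{start + width - 1}"

def prevalence_by_group(contracts, term):
    """Share of contracts mentioning `term`, keyed by (sector, period)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for contract in contracts:
        key = (contract["sector"], period_of(contract["year"]))
        totals[key] += 1
        if term.lower() in contract["text"].lower():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

# Invented records; the keys are assumptions, not the site's metadata schema.
sample = [
    {"sector": "mining", "year": 2011, "text": "Rail infrastructure shall be shared."},
    {"sector": "mining", "year": 2013, "text": "Royalty of three percent."},
    {"sector": "hydrocarbons", "year": 2012, "text": "Infrastructure access terms."},
    {"sector": "hydrocarbons", "year": 2003, "text": "Production sharing terms."},
]

shares = prevalence_by_group(sample, "infrastructure")
for key in sorted(shares):
    print(key, shares[key])
```

Reporting shares rather than raw counts keeps the comparison meaningful when the number of contracts differs between sectors or periods.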

The general upward trend in both petroleum and mining contracts could be explained by two factors. First, commodity prices rose significantly during the 2000-2011 period, including for petroleum and many minerals. This increased the attractiveness of extraction for investors, increased rents and may have given governments sufficient bargaining power to ask for increased infrastructure provisions by companies, or to propose new models for infrastructure provision (e.g., multi-access or multi-use forms). Second, the rise in prices led investors to seek opportunities in frontier countries that typically do not have the infrastructure required to support major extraction projects. Thus, in general, contracts record a rising prevalence of infrastructure requirements as infrastructure becomes an increasingly important consideration in servicing extraction sites.

Profits tax

Second, we queried ResourceContracts.org using the term “profits tax” and its synonym “income tax.” (Here we report the results jointly.) These terms relate to a variety of common tax types used in extractives contracts to capture revenues for government; they are associated with the profitability (or income) a company is earning from its extraction activities. In particular, these terms relate to the existence of excess profits taxes in addition to the more commonly used corporate income taxes.

We observe that a much lower proportion of mining contracts than hydrocarbon contracts include the words “profits tax” or “income tax”: on average, 55 percent of hydrocarbon contracts and 14 percent of mining contracts.
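Reporting two synonymous terms jointly means counting a contract once if it contains either term. The sketch below shows one way to do this with a single regular expression. The pattern is an assumption: it also accepts singular and plural variants such as "profit tax" and "income taxes," which may be broader than the query used for the figures above, and the sample texts are invented.

```python
import re

# Matches "profit tax", "profits tax", "income tax" and their "taxes" plurals.
TAX_TERMS = re.compile(r"\b(?:profits?|income)\s+tax(?:es)?\b", re.IGNORECASE)

def share_mentioning(texts, pattern=TAX_TERMS):
    """Fraction of texts in which `pattern` matches at least once."""
    if not texts:
        return 0.0
    return sum(1 for text in texts if pattern.search(text)) / len(texts)

# Invented snippets standing in for full contract texts.
sample_texts = [
    "Income Tax shall apply at the general rate.",
    "A profits tax of ten percent is payable.",
    "The royalty is payable monthly.",
    "The Contractor is liable for excess profit taxes.",
]

print(share_mentioning(sample_texts))
```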

This marked difference might be the result of systematic differences in the way governments seek to tax the respective sectors. A lower prevalence of “income tax” or “profits tax” in contracts could occur when companies are subject to generally applicable legislation, for example standard corporate income tax, thus reducing the need for contracts to discuss such terms. Thus, it might be more common for hydrocarbon contracts to set out exceptions to the generally applicable legislation or to include a rule that is separate from the general legislation. This in turn may be driven by the differences in rents available: in hydrocarbons, governments may be more inclined to set sector-specific profits taxes to capture the higher excess profits associated with oil extraction, compared to mineral extraction. Indeed, according to the IMF, effective tax rates were higher for hydrocarbons than for minerals during this period, which may support this hypothesis.

Research techniques

The illustrative research cases presented above are simple examples of the possibilities that ResourceContracts.org offers as a data source. What makes this contract repository a promising platform for new research is the ability to search, download and analyze the digital text of each document it contains, including with powerful query tools such as Elasticsearch. This allows for a wide variety of queries and research tasks. We summarize three options:

  • Full-text search is made possible by turning optically scanned PDF pages into searchable text, thanks to high-quality optical character recognition techniques. When source documents are of too poor a quality to optically transform to text, they are transcribed via Mechanical Turk, a paid service from Amazon. Full-text search across all contracts can be used to filter contracts based on a specific term in the text, allowing for some preliminary exploratory analysis and suggestion of directions of research. Annotations can be searched as well, making the database an even more precise research engine.
  • Another essential feature of ResourceContracts.org is the way in which each contract has been tagged with relevant attributes, making metadata filtering possible. Currently, it is possible to filter contracts based on signature year, country, resource, company name, corporate group, contract type and annotation group.
  • The third type of research that the site has made possible is quantitative text analytics, as showcased above. Text analytics can be defined as the use of one or more methods for drawing statistical inferences from text populations. This approach combines the use of simple full-text search and the information stored in the metadata. It can range from simple count techniques (as seen above) to more sophisticated treatments, such as correspondence analysis and classification methods for document clustering.
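Full-text search and metadata filtering can be combined in a single Elasticsearch request by placing the text match and the metadata filters in one `bool` query. The sketch below only builds the request body as a Python dictionary. The field names `text`, `resource` and `signature_year` are assumptions for illustration; the actual index schema and endpoint of the ResourceContracts.org API are not described here and should be checked against its documentation.

```python
import json

def build_query(term, resource=None, year_from=None, year_to=None):
    """Build an Elasticsearch bool query: phrase match on text plus metadata filters."""
    filters = []
    if resource is not None:
        # Exact match on a metadata field (hypothetical field name).
        filters.append({"term": {"resource": resource}})
    if year_from is not None or year_to is not None:
        bounds = {}
        if year_from is not None:
            bounds["gte"] = year_from
        if year_to is not None:
            bounds["lte"] = year_to
        filters.append({"range": {"signature_year": bounds}})
    return {
        "query": {
            "bool": {
                "must": [{"match_phrase": {"text": term}}],
                "filter": filters,
            }
        }
    }

query = build_query("infrastructure", resource="Gold", year_from=2010, year_to=2014)
print(json.dumps(query, indent=2))
```

Repeating such a query for each sector and period, and dividing the number of hits by the number of contracts in the same group, would give the kind of simple count-based text analytics described in the third option.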

Conclusion

There are many ways to take this research further: potential researchers could focus on a certain jurisdiction or on a region and combine text analysis with insights from national and international legislation—current laws as well as those in place when the contract was signed. Alternatively, the focus could be on a certain topic (e.g., a certain type of provision or clause) across different countries and/or companies.

Has this post prompted you to think about how you can use ResourceContracts.org data? We are currently awarding grants of up to USD 10,000 for researchers, journalists and civil society actors who wish to use the site for applied research questions or investigations. Learn more here. If you have an idea, we hope to hear from you!

Giorgia Cecchinato is an NRGI research and data associate. Jim Cust is the director of research and data.

Related content

Twelve Ways EITI Stakeholders Can Improve Contract and License Disclosure

Robert Pitman
7 March 2017

Tullow Disclosure Yields Insight into Ghana Oil, Gas Sector

David Mihalyi
15 May 2017

Natural Resource Charter Benchmarking Framework: 170 Crucial Questions for Resource-Rich Countries

Robert Pitman, David Manley
17 October 2016

Six Transparency Steps Toward Better Extractives Governance in Ukraine

Robert Pitman
7 August 2017

Open Contracting for Oil, Gas and Mining Rights: Seven Things We’ve Learned

Gavin Hayman, Robert Pitman, Amir Shafaie
26 June 2018
Helping people to realize the benefits of their countries’ endowments of oil, gas and minerals.