The Natural Resource Governance Institute (NRGI) is an independent, non-profit policy institute that promotes responsible management of oil, gas and mineral resources for the public good. Too often natural resource wealth results in corruption and poverty, instead of growth and development. With effective revenue management, citizen engagement and government accountability, natural resource wealth can drive economic growth and development. NRGI provides the expertise, funding and technical assistance to help countries realize these benefits.
NRGI is seeking a data scientist to join the data team to enhance the development of the ResourceProjects tool, expand uptake of innovative data analysis across NRGI data projects and work with NRGI country and programmatic teams to explore data analysis and visualization on projects related to, for example, national oil companies, corruption or subnational revenue sharing.
The data scientist will expand the data team’s innovative use of extractive data across NRGI data products (ResourceProjects, ResourceContracts, Resource Governance Index and an NRGI data store) as well as other key data sources, e.g. satellite imagery of mines, fiscal data or environmental disclosures. They will have the opportunity to initiate and lead data wrangling, visualization and analysis projects. As Product Owner, the data scientist will drive the development of the next phase of ResourceProjects.org (a MongoDB database of structured financial payment data): engaging with users, leading user-centered design processes, implementing development sprints with external contractors and planning the technical roadmap.
- Initiate and co-develop data-driven projects using data wrangling, analysis and visualization on oil and mining. Projects can include predictive analysis of extractive revenues, mapping of mining projects or forecasting based on analysis of financial disclosures. You will work with colleagues across teams as well as external partners to deliver data projects that generate new insights, and you will lead the exploration to determine the best methods and tools to answer tough research questions.
- Write short-form blog posts related to data projects or longer research briefings.
- Stay abreast of developments around data tools relevant to our work. Use open source tools and contribute back to them.
- You will be product owner for ResourceProjects.org, a database of payments made by oil and mining companies to governments. This role involves: leading user-centered design processes; designing technical roadmaps for the site based on user input; managing product sprints with external vendors; advancing the capacity of the site to consume information from unstructured PDFs; and driving analytical questions using data on the site.
- Provide strategic support and input to other selected tools.
- Advise on other data projects such as ResourceContracts.org (a PHP document repository) and ResourceData.org (a CKAN data store).
- Take part in onsite data events to present NRGI’s work and advance professional development.
- You will have access to technical vendors that support technical implementations. You will work with well-documented projects supported by a long-term sysadmin partner.
Master’s degree in a relevant field (social science, journalism or computer science), or a recently completed Ph.D.
Knowledge, skills and experience
- Three to five years’ experience with R and/or Python (including tools such as Pandas, Matplotlib, Jupyter, Anaconda etc.);
- Experience collecting or scraping unstructured data from the web (for example Beautiful Soup) and interacting with web APIs;
- Experience with data wrangling of unstructured and messy data (for example Tabula, OpenRefine);
- Experience in data analysis of social science problems, e.g., working with government statistics, financial data, survey data or satellite imagery on questions linked to economic growth, poverty, environmental and social impacts, and sustainability;
- Experience with microtask or crowdsourcing services for data collection (such as Upwork or Mechanical Turk);
- Experience with SQL, Natural Language Processing, GeoJSON;
- We host our sites using Amazon Web Services (AWS) and run a well-documented infrastructure with solid processes for deployment. Previous experience with AWS would therefore be a plus;
- Experience in advanced statistical techniques such as econometric analysis, forecasting, data mining or financial modelling;
- We use GitHub and services such as compose.io and paper-trail. You will have the opportunity to take on growing management responsibilities with external vendors. Previous experience in a similar context would be ideal;
- Interest or experience in working in the development and/or resource sector.
Work Location: London, New York or Washington, D.C. Candidates must possess the right to work in the United Kingdom or the United States.
Start Date: March 2017
Compensation: Commensurate with experience. Benefits include medical, dental, work travel insurance, life and disability insurance, private pension scheme, 20 annual leave days plus all public holidays.
Applications are accepted on a rolling basis and applicants are encouraged to apply as soon as possible. To apply, please send a resume and cover letter to email@example.com with "Data Scientist" in the subject line.