Developing LateriteAI – technology to automate tedious research tasks

Dimitri Stoelinga, co-founder of Laterite, shares their journey towards creating LateriteAI, a platform that harnesses the power of large language models to automate time-consuming tasks, enabling researchers to focus on what they love doing.

What if tedious research tasks could be automated?

Since our inception in Rwanda in 2010, Laterite – a data, research and analytics firm – has worked with dozens of organisations in the development sector to maximise the impact of their research.

While our job as development researchers is fascinating and supported by powerful tools like SurveyCTO, Stata, R, or Python, it often involves repetitive tasks that consume time we could use to think creatively and produce quality research. This sparked the question, “What if we could automate some of these tasks?”

The LateriteAI project

LateriteAI began as an internal project, exploring the potential of large language models to streamline our research processes. As we developed our internal tools, it became clear that other researchers could also benefit from these applications. This realisation inspired the vision of LateriteAI as an online service, created by researchers for researchers, to accelerate tasks and foster collaboration.

What are the problems that LateriteAI is trying to solve?

The suite of tools we are developing could be helpful in a range of cases, including:

  • Extracting survey data: When we conduct electronic surveys at Laterite, we frequently invest substantial time copy-pasting questions and options from Word documents to Excel files. The Question Extractor app streamlines this process by identifying questions and multiple-choice options in a Word document and extracting them into a user-friendly Excel format, saving researchers valuable time.
  • Checking bias: Ensuring unbiased survey questions is crucial for collecting reliable data. However, biases can often go unnoticed in the hectic process of survey development. The Bias Checker app helps researchers to quickly identify and address biases in survey questions by scanning for leading, loaded, double-barrelled, negatively framed, ambiguous, and gendered questions. The app also suggests alternative, less biased formulations for the flagged questions.
  • Coding qualitative data: Analysing open-ended responses can provide invaluable insights, but manually coding thousands of responses can be prohibitively time-consuming. The Topic Clustering app tackles this challenge by identifying initial groupings and their collective meaning, making it easier for researchers to extract insights from open-ended questions.
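To make the first of these tasks concrete, here is a minimal sketch of question extraction in Python. It is an illustration only, not how the Question Extractor app works: it assumes a hypothetical plain-text layout in which questions are numbered ("1.") and options are lettered ("a)"), it uses pattern matching rather than a language model, and it writes CSV rather than Excel. The function names are invented for this example.

```python
import csv
import io
import re

# Assumed (hypothetical) layout: "1. Question text" and "a) Option text".
QUESTION_RE = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")
OPTION_RE = re.compile(r"^\s*([a-zA-Z])[.)]\s+(.*\S)\s*$")


def extract_questions(text):
    """Return one dict per question, with its options in document order."""
    questions = []
    for line in text.splitlines():
        q = QUESTION_RE.match(line)
        if q:
            questions.append(
                {"number": int(q.group(1)), "text": q.group(2), "options": []}
            )
            continue
        o = OPTION_RE.match(line)
        if o and questions:
            # Attach the option to the most recent question.
            questions[-1]["options"].append(o.group(2))
    return questions


def to_csv(questions):
    """Flatten to rows: one row per option, or a single row for an open question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["number", "question", "option"])
    for q in questions:
        for option in q["options"] or [""]:
            writer.writerow([q["number"], q["text"], option])
    return buffer.getvalue()


sample = """\
1. What is your main source of income?
a) Farming
b) Trade
2. Please describe your household.
"""

print(to_csv(extract_questions(sample)))
```

Real questionnaires rarely follow one layout this neatly, which is part of why a rule-based approach like this one breaks down and a language model becomes attractive.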

Striking a balance between automation and human expertise

Researchers must confront the implications of technological breakthroughs such as the ones we are harnessing at Laterite. A recent study identified survey research as highly susceptible to large language models’ influence, presenting both opportunities to reduce drudgery and risks associated with biases, representativeness, and nuanced task performance.

Large language models like GPT-4 present a tremendous opportunity for research by streamlining tasks and enhancing efficiency, but they are still unreliable for more nuanced tasks, such as identifying biases. Even repetitive research tasks can demand intricate judgment, emphasising the importance of human expertise.

Developing LateriteAI has so far involved fine-tuning models with high-quality datasets to capture diverse inputs and ensure performance in complex contexts. Despite investing substantial researcher time in training models with thousands of prompt-completion pairs, some of the errors these models make still highlight the importance of continuous iteration, improvement, and researcher involvement.
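For readers unfamiliar with the term, a prompt-completion pair is simply an input paired with the output we want the model to produce. The sketch below shows one common way to store such pairs, as JSON Lines (one JSON object per line). The storage format, the field names and the two example records are assumptions made for illustration; they are not taken from LateriteAI's training data.

```python
import json

# Hypothetical records, written for this example only.
pairs = [
    {
        "prompt": "Don't you agree that the new clinic is excellent?",
        "completion": "leading",
    },
    {
        "prompt": "How satisfied are you with the price and quality of seeds?",
        "completion": "double-barrelled",
    },
]


def to_jsonl(records):
    """Serialise records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


def from_jsonl(text):
    """Parse JSON Lines back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


print(to_jsonl(pairs))
```

Keeping each pair on its own line makes it easy for researchers to review, correct and extend the dataset one example at a time.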

To maximise the opportunities presented by new technologies, striking a balance between automation and human expertise is crucial. Developing evaluation frameworks, benchmarks, and methodologies is essential to align models with research intent and handle nuanced tasks effectively. Moreover, addressing data privacy and security concerns will be essential to maintain trust and compliance with evolving data protection regulations. By tackling these challenges with optimism, creativity and diligence, we can unlock the full potential of large language models in the research landscape.

Join us to collectively learn from and improve the technology

As we prepare for the alpha launch of LateriteAI, we hope to create an ever-evolving platform where researchers can contribute their own applications and benefit from the collective ingenuity of their peers. If you are interested in these tools, subscribe to the Laterite newsletter, where we’ll be posting updates and the sign-up link.

Meanwhile, we’d like to hear others’ experiences of harnessing new technical innovations in MEL for systemic change. Hence, I invite you to join me at Itad’s upcoming webinar, on 29 March 2023, where we’ll discuss LateriteAI and the application of other innovations with colleagues from CASM Technology, Fondation Botnar, Itad, and Luminate.

 

Dimitri Stoelinga is Co-founder and Managing Partner of Laterite.