The HARG Truth: AI’s Need for the Human Element

Valtteri Karesto

September 9, 2023

In the evolving landscape of AI-powered systems, combining human intuition with machine efficiency can create robust and reliable solutions. The Human-Augmented Retrieval Generation (HARG) method builds upon the Retrieval-Augmented Generation (RAG) model, integrating a crucial human touch into the pipeline.

To understand HARG, it's essential to first understand how RAG operates:

  1. Query: the user asks a question.
  2. Retrieval step: the question is parsed and documents relevant to it are retrieved.
  3. Documents and original query: the retrieved documents and the original prompt are fed to the language model.
  4. Response: an answer is generated based on the documents and the original query.

This is a distilled overview. Each step contains its own intricacies and nuances, but the basic framework is as outlined above.
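The four steps above can be sketched in a few lines of Python. The retrieval and prompt-building functions below are toy stand-ins (word-overlap ranking, simple concatenation), not any particular library's API; a real system would use embedding-based retrieval and an actual LLM call in place of the final comment.

```python
def retrieve(query: str, corpus: list[str], top_k: int = 3) -> list[str]:
    """Toy retrieval step: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Concatenate the retrieved documents with the original query."""
    context = "\n".join(f"- {d}" for d in documents)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Manchester United finished the 1974-75 season as Second Division champions.",
    "Manchester United won the league title in the 1966-67 season.",
    "The club was founded as Newton Heath in 1878.",
]
query = "How did Manchester United perform this season?"
docs = retrieve(query, corpus)
prompt = build_prompt(query, docs)
# `prompt` would then be sent to a language model to generate the response.
```

Note how the retrieved documents are concatenated into the context automatically, with no check that they actually answer the question; this is exactly the gap HARG addresses.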

While RAG boasts numerous promising applications, it sometimes falls short. There are instances where the retrieved documents are similar to the query but not strictly relevant. For instance, if someone inquires about Manchester United's performance this season, and the system retrieves documents about the '74-'75, '78-'79, and '87-'88 seasons, the response will be imprecise. A human reviewing the results would likely notice this discrepancy and adjust the query accordingly (for example, by adding the current year) or manually pick the correct documents as context.

Adding a human step

HARG is designed for knowledge-intensive tasks that not only rely on accurate retrieval of information but also human judgment to select the most appropriate context. Unlike RAG, which automatically concatenates retrieved documents as context, HARG proposes a step where a human reviews the suggestions made by the retrieval component. This ensures that the selected context is both relevant and appropriate, thereby further reducing the chances of “hallucination” or generation of incorrect or irrelevant information.

Here’s how HARG operates:

  1. Query: the user asks a question.
  2. Retrieval Step: Just like RAG, HARG retrieves a set of relevant/supporting documents from a source (e.g., Wikipedia) based on the input.
  3. Human Selection Step: Instead of automatically feeding the retrieved documents to the generator, a human expert reviews and selects the most pertinent context from the suggestions.
  4. Documents and original query: the selected documents and the original prompt are fed to the language model.
  5. Response: an answer is generated based on the documents and the original query.
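As a sketch, the HARG pipeline differs from the RAG loop only by one step: the human reviewer sits between retrieval and generation. Below, the reviewer is modeled as a callback (in practice, a UI where an expert picks documents), and the retriever and generator are hypothetical stand-ins passed in as functions.

```python
from typing import Callable

def harg_answer(
    query: str,
    retrieve: Callable[[str], list[str]],
    human_select: Callable[[list[str]], list[str]],
    generate: Callable[[str, list[str]], str],
) -> str:
    candidates = retrieve(query)        # step 2: retrieval
    context = human_select(candidates)  # step 3: human selection (the HARG addition)
    return generate(query, context)     # steps 4-5: generation from curated context

# Example with toy callbacks standing in for real components:
answer = harg_answer(
    "How did Manchester United perform this season?",
    retrieve=lambda q: ["doc about the '74-'75 season", "doc about the current season"],
    human_select=lambda docs: [d for d in docs if "current" in d],
    generate=lambda q, ctx: f"Answer based on {len(ctx)} selected document(s).",
)
# answer == "Answer based on 1 selected document(s)."
```

The design choice here is that the human only filters candidates; retrieval and generation stay fully automated, so the expert's effort is spent where judgment matters most.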

The inclusion of the human element in HARG serves a dual purpose: enhancing reliability by minimizing machine errors and ensuring the context aligns well with human intuition and understanding.

With the growing emphasis on human-in-the-loop AI systems, HARG bridges the best of both worlds, ensuring efficiency and relevance while maintaining the adaptability of retrieval-based generation models.

This HARG concept provides an additional layer of verification, ensuring more accurate and contextually appropriate responses.

Optimal use-cases for HARG

HARG might not be the optimal solution for use cases where the user is purely searching for answers to questions, since the user might not know which documents are relevant to the query. The prominent use cases for HARG lie in co-pilot-like applications, where the user is generating something, e.g. code or parts of legal documents. In these cases the user usually has some sense of whether the retrieved documents are relevant and contain answers.

One use case would be a helper tool for a tech-support operator. The operator might have a traditional chat UI where they converse with users. While they chat, a HARG-enabled agent can analyse the conversation, fetch relevant documents based on user information, questions, and so on, and surface them in the UI for the operator. The operator can then pick and choose the relevant documents and ask the agent to generate possible answers to the user's questions based on the human-augmented context.

While doing this, all the generated question/answer pairs can be stored to later improve the agent itself, for example through fine-tuning. The same logic applies to numerous co-pilot-like applications.
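Capturing those curated pairs can be as simple as appending one record per interaction to a log. A minimal sketch, assuming a JSONL file as the storage format; the field names and file layout are illustrative choices, not a fixed schema.

```python
import json

def log_interaction(path: str, question: str, selected_docs: list[str], answer: str) -> None:
    """Append one human-curated interaction as a JSON line, for later fine-tuning."""
    record = {
        "question": question,
        "context": selected_docs,  # the documents the operator actually selected
        "answer": answer,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Because the context field records only the documents a human approved, the resulting dataset reflects expert judgment rather than raw retrieval output.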

Get started with Gen AI

The Intentface sprint is a fast-paced, focused program where we go through your organization's needs and opportunities for productive use of generative AI technologies. We can then create a prototype, a proof of concept, or a roadmap to ignite your generative AI journey.

Who we are

We're a team of senior developers and designers. We understand how businesses work. We empower organizations to prototype, implement, and deploy pioneering generative AI solutions and products.