
OpenAIAgent chat returns empty sourceNodes and metadata #1015

Open
irevived1 opened this issue Jul 3, 2024 · 6 comments

Comments

@irevived1
Contributor

Hello LlamaIndexTS,
I am currently using OpenAIAgent with a QueryEngineTool built from a custom VectorStoreIndex.
The agent responds with correct information from the query engine, but sourceNodes and metadata are empty.
I'm not sure if this is a bug or if I'm misusing the tool.

Thanks in advance,

import {
  OpenAI,
  OpenAIAgent,
  QueryEngineTool,
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";

  // Load the persisted vector store from disk.
  const menuContext = await storageContextFromDefaults({
    persistDir: "./menuDB",
  });

  const menuIndex = await VectorStoreIndex.fromDocuments([], {
    storageContext: menuContext,
  });
  const menuRetriever = await menuIndex.asRetriever();

  const menuRetrieverQueryEngine = await menuIndex.asQueryEngine({
    retriever: menuRetriever,
  });

////////////////////////////////

  const openaiLLM = new OpenAI({ model: "gpt-4o", temperature: 0 });

  const agent = new OpenAIAgent({
    systemPrompt: systemMessage,
    verbose: true,
    model: openaiLLM,
    tools: [
      // ..... more tools here
      new QueryEngineTool({
        queryEngine: menuRetrieverQueryEngine,
        metadata: {
          name: "menu_tool",
          description: `This tool can answer about items on the menu`,
        },
      })
    ]
  });


  const response = await agent.chat({ message: 'do you burgers?', verbose: true });

// response
EngineResponse {
  sourceNodes: undefined,
  metadata: {},
  message: {
    content: 'We have two burgers:\n' +
    ......

@marcusschiesser
Collaborator

Yes, the sourceNodes are not forwarded to the agent's response; you'll need a CallbackManager:

const callbackManager = new CallbackManager();

callbackManager.on("retrieve-end", (data) => {
  const { nodes, query } = data.detail.payload;
  // ... do something with nodes
});

const response = await Settings.withCallbackManager(callbackManager, () => {
  return agent.chat({ message: 'do you burgers?', verbose: true });
});

An agent could also have multiple QueryEngineTools, so which sourceNodes should be forwarded?

@logan-markewich I think we should deprecate sourceNodes and metadata
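To make the "which sourceNodes to forward?" question concrete, here is a minimal, library-free sketch (all names hypothetical, not the llamaindex API) of collecting retrieved nodes keyed by the tool that produced them, which is what a per-event payload makes possible:

```typescript
// Hypothetical sketch: track which tool produced which retrieved nodes.
// This does not use the llamaindex API.

interface RetrievedNode {
  id: string;
  text: string;
}

// Collect nodes per tool name, mirroring what per-tool event payloads allow.
class SourceNodeCollector {
  private byTool = new Map<string, RetrievedNode[]>();

  record(toolName: string, nodes: RetrievedNode[]): void {
    const existing = this.byTool.get(toolName) ?? [];
    this.byTool.set(toolName, existing.concat(nodes));
  }

  // Nodes for one tool -- unambiguous.
  forTool(toolName: string): RetrievedNode[] {
    return this.byTool.get(toolName) ?? [];
  }

  // Flat aggregate across all tools -- what a single sourceNodes
  // field on the response would have to be.
  all(): RetrievedNode[] {
    return [...this.byTool.values()].flat();
  }
}

const collector = new SourceNodeCollector();
collector.record("menu_tool", [{ id: "n1", text: "Classic burger" }]);
collector.record("drinks_tool", [{ id: "n2", text: "Cola" }]);

console.log(collector.forTool("menu_tool").length); // 1
console.log(collector.all().length); // 2
```

A single response-level field can only carry the aggregate (`all()`); the per-tool mapping is exactly the information it loses.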

@logan-markewich
Contributor

logan-markewich commented Jul 4, 2024

@marcusschiesser what would the alternative look like? This is a very widely used method for tracking down sources.

Take a look at Python to see how we bubble up source nodes in agents. We have both sources (for tool calls) and source_nodes (in case any tool call was a query engine returning a response object).

@irevived1
Contributor Author

Thanks for the response, everyone. We can take advantage of CallbackManager for now.
Is it possible to update this thread if you plan to change the usage in the future?

I've also noticed another issue, though it's not directly related to the source nodes:
the verbose flag doesn't really do anything. I tried both true and false and couldn't spot a difference.

@marcusschiesser
Collaborator

@logan-markewich

source_nodes (in case any tool calls were a query engine with a response object)

I see two problems with just having source_nodes in the response object:

  1. if we stream a response, in which chunk of the stream do we put the source_nodes?
  2. if it's an agent and we have multiple QueryEngineTools, to which tool do the source_nodes belong?

With the callbacks, we can solve both problems:

  1. we send the retrieve-end event with the source_nodes as payload. The user even gets the time of the retrieval that way
  2. similarly, we have an llm-tool-result event which contains the result of the QueryEngineTool
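A library-free sketch of the streaming problem (all names hypothetical, not the llamaindex API): with a side-channel callback, the stream stays a pure sequence of text chunks while the source nodes are delivered exactly once, outside the chunk sequence:

```typescript
// Hypothetical sketch: text chunks stream one by one, while source nodes
// are delivered once via a side-channel callback instead of being
// attached to any particular chunk. Not the llamaindex API.

interface SourceNode {
  id: string;
  text: string;
}

async function* streamAnswer(
  chunks: string[],
  nodes: SourceNode[],
  onRetrieveEnd: (nodes: SourceNode[]) => void,
): AsyncGenerator<string> {
  // Retrieval finishes before generation starts: fire the event once.
  onRetrieveEnd(nodes);
  for (const chunk of chunks) {
    yield chunk;
  }
}

async function main() {
  let received: SourceNode[] = [];
  const stream = streamAnswer(
    ["We have ", "two burgers."],
    [{ id: "n1", text: "Classic burger" }],
    (nodes) => { received = nodes; },
  );

  let answer = "";
  for await (const chunk of stream) {
    answer += chunk;
  }
  console.log(answer);          // "We have two burgers."
  console.log(received.length); // 1
}

main();
```

No chunk has to carry the nodes, which sidesteps the "which chunk?" question entirely.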

@logan-markewich
Contributor

  1. I guess in Python, the response object and the stream are two separate objects
  2. Technically this would be tracked under response.sources, which has each tool call. .source_nodes is an aggregate of the nodes the agent used

Callbacks or instrumentation are ok-ish, as long as you can (as you mentioned) trace the retrieved nodes back to a specific tool/query engine. Hooking into custom callbacks is also slightly less user-friendly, at least in Python land.
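A small sketch of the shape described above (hypothetical types, not the actual Python or LlamaIndexTS API): sources lists each tool call with its own nodes, and source_nodes is computed as the flat aggregate across them:

```typescript
// Hypothetical shape of the Python-style response object described above:
// `sources` lists each tool call, and the aggregate is derived from it.

interface SourceNode {
  id: string;
  text: string;
}

interface ToolOutput {
  toolName: string;
  rawOutput: string;
  sourceNodes: SourceNode[];
}

interface AgentResponse {
  message: string;
  sources: ToolOutput[];
}

// Aggregate nodes across every tool call -- the source_nodes view.
function sourceNodes(response: AgentResponse): SourceNode[] {
  return response.sources.flatMap((s) => s.sourceNodes);
}

const response: AgentResponse = {
  message: "We have two burgers.",
  sources: [
    {
      toolName: "menu_tool",
      rawOutput: "...",
      sourceNodes: [{ id: "n1", text: "Classic burger" }],
    },
    { toolName: "drinks_tool", rawOutput: "...", sourceNodes: [] },
  ],
};

console.log(sourceNodes(response).length); // 1
```

Because the aggregate is derived from per-tool entries, nothing is lost: callers who need provenance read `sources`, callers who just want the nodes read the aggregate.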

@marcusschiesser
Collaborator

I agree that callbacks are ok-ish.

source_nodes is an aggregate of the nodes the agent used

If using the aggregate is OK for each use case, we could also add it to LITS (that would also solve the original issue of this ticket).
