{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Breakpoints for Agent in a Pipeline\n",
    "\n",
    "This notebook demonstrates how to set up breakpoints within an `Agent` component in a Haystack pipeline. Breakpoints can be placed either on the `chat_generator` or on any of the `tools` used by the `Agent`. This guide showcases both approaches.\n",
    "\n",
    "The pipeline features an `Agent` acting as a database assistant, responsible for extracting relevant information and writing it to the database."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Install packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "pip install \"haystack-ai>=2.16.1\"\n",
    "pip install \"transformers[torch,sentencepiece]\"\n",
    "pip install \"sentence-transformers>=3.0.0\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set up the OpenAI API key for the `chat_generator`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from getpass import getpass\n",
    "\n",
    "if \"OPENAI_API_KEY\" not in os.environ:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter OpenAI API key:\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initializations\n",
    "\n",
    "Now we initialize the components required to build an agentic pipeline. We will set up:\n",
    "\n",
    "- A `chat_generator` for the Agent\n",
    "- A custom `tool` that writes structured information to an `InMemoryDocumentStore`\n",
    "- An `Agent` that uses these components to extract and store entities from user-supplied context"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.document_stores.in_memory import InMemoryDocumentStore\n",
    "from haystack.components.agents.agent import Agent\n",
    "from haystack.components.generators.chat import OpenAIChatGenerator\n",
    "\n",
    "from haystack.dataclasses import Document\n",
    "from haystack.tools import tool\n",
    "from typing import Optional\n",
    "\n",
    "# Initialize a document store and a chat_generator\n",
    "document_store = InMemoryDocumentStore()\n",
    "chat_generator = OpenAIChatGenerator(\n",
    "    model=\"gpt-4o-mini\",\n",
    ")\n",
    "\n",
    "# Initialize a tool\n",
    "@tool\n",
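    "def add_database_tool(name: str, surname: str, job_title: Optional[str], other: Optional[str]):\n",
    "    \"\"\"Add a person to the knowledge base, with an optional job title and other details.\"\"\"\n",
    "    # Store the name and job title as the document content; keep extra details in the metadata\n",
    "    document_store.write_documents(\n",
    "        [Document(content=name + \" \" + surname + \" \" + (job_title or \"\"), meta={\"other\": other})]\n",
    "    )\n",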
    "\n",
    "# Create the Agent\n",
    "database_assistant = Agent(\n",
    "        chat_generator=chat_generator,\n",
    "        tools=[add_database_tool],\n",
    "        system_prompt=\"\"\"\n",
    "        You are a database assistant.\n",
    "        Your task is to extract the names of people mentioned in the given context and add them to a knowledge base, \n",
    "        along with additional relevant information about them that can be extracted from the context.\n",
    "        Do not use your own knowledge, stay grounded to the given context.\n",
    "        Do not ask the user for confirmation. Instead, automatically update the knowledge base and return a brief \n",
    "        summary of the people added, including the information stored for each.\n",
    "        \"\"\",\n",
    "        exit_conditions=[\"text\"],\n",
    "        max_agent_steps=100,\n",
    "        raise_on_tool_invocation_failure=False\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize the Pipeline\n",
    "In this step, we construct a Haystack pipeline that performs the following tasks:\n",
    "\n",
    "- Fetches HTML content from a specified URL.\n",
    "- Converts the HTML into Haystack Document objects.\n",
    "- Builds a `prompt` from the extracted content.\n",
    "- Passes the prompt to the previously defined Agent, which processes the context and writes relevant information to a document store."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<haystack.core.pipeline.pipeline.Pipeline object at 0x107b24da0>\n",
       "🚅 Components\n",
       "  - fetcher: LinkContentFetcher\n",
       "  - converter: HTMLToDocument\n",
       "  - builder: ChatPromptBuilder\n",
       "  - database_agent: Agent\n",
       "🛤️ Connections\n",
       "  - fetcher.streams -> converter.sources (List[ByteStream])\n",
       "  - converter.documents -> builder.docs (List[Document])\n",
       "  - builder.prompt -> database_agent.messages (List[ChatMessage])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from haystack import Pipeline\n",
    "from haystack.components.converters import HTMLToDocument\n",
    "from haystack.components.fetchers import LinkContentFetcher\n",
    "from haystack.components.builders import ChatPromptBuilder\n",
    "from haystack.dataclasses import ChatMessage\n",
    "\n",
    "pipeline_with_agent = Pipeline()\n",
    "pipeline_with_agent.add_component(\"fetcher\", LinkContentFetcher())\n",
    "pipeline_with_agent.add_component(\"converter\", HTMLToDocument())\n",
    "pipeline_with_agent.add_component(\"builder\", ChatPromptBuilder(\n",
    "    template=[ChatMessage.from_user(\"\"\"\n",
    "    {% for doc in docs %}\n",
    "    {{ doc.content|default|truncate(25000) }}\n",
    "    {% endfor %}\n",
    "    \"\"\")],\n",
    "    required_variables=[\"docs\"]\n",
    "))\n",
    "pipeline_with_agent.add_component(\"database_agent\", database_assistant)\n",
    "\n",
    "pipeline_with_agent.connect(\"fetcher.streams\", \"converter.sources\")\n",
    "pipeline_with_agent.connect(\"converter.documents\", \"builder.docs\")\n",
    "pipeline_with_agent.connect(\"builder\", \"database_agent\")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set up Breakpoints\n",
    "With the pipeline in place, we can now configure a breakpoint on the Agent. A breakpoint pauses pipeline execution at a specific step (in this case, during the Agent's operation) and saves the intermediate pipeline snapshot to an external file for inspection or debugging.\n",
    "\n",
    "We’ll first create a `Breakpoint` for the `chat_generator` and then wrap it using `AgentBreakpoint`, which explicitly targets the `Agent` component in the pipeline.\n",
    "\n",
    "Set the `snapshot_file_path` to indicate where you want to save the file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "ename": "BreakpointException",
     "evalue": "Breaking at chat_generator visit count 0",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mBreakpointException\u001b[0m                       Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[3], line 5\u001b[0m\n\u001b[1;32m      3\u001b[0m agent_generator_breakpoint \u001b[38;5;241m=\u001b[39m Breakpoint(component_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mchat_generator\u001b[39m\u001b[38;5;124m\"\u001b[39m, visit_count\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0\u001b[39m, snapshot_file_path\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msnapshots/\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m      4\u001b[0m agent_breakpoint \u001b[38;5;241m=\u001b[39m AgentBreakpoint(break_point\u001b[38;5;241m=\u001b[39magent_generator_breakpoint, agent_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mdatabase_agent\u001b[39m\u001b[38;5;124m'\u001b[39m)\n\u001b[0;32m----> 5\u001b[0m \u001b[43mpipeline_with_agent\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m      6\u001b[0m \u001b[43m    \u001b[49m\u001b[43mdata\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m{\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mfetcher\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43m{\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43murls\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mhttps://en.wikipedia.org/wiki/Deepset\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m}\u001b[49m\u001b[43m}\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      7\u001b[0m \u001b[43m    \u001b[49m\u001b[43mbreak_point\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43magent_breakpoint\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      8\u001b[0m \u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:382\u001b[0m, in \u001b[0;36mPipeline.run\u001b[0;34m(self, data, include_outputs_from, break_point, pipeline_snapshot)\u001b[0m\n\u001b[1;32m    377\u001b[0m         \u001b[38;5;28;01mif\u001b[39;00m should_trigger_breakpoint:\n\u001b[1;32m    378\u001b[0m             _trigger_break_point(\n\u001b[1;32m    379\u001b[0m                 pipeline_snapshot\u001b[38;5;241m=\u001b[39mnew_pipeline_snapshot, pipeline_outputs\u001b[38;5;241m=\u001b[39mpipeline_outputs\n\u001b[1;32m    380\u001b[0m             )\n\u001b[0;32m--> 382\u001b[0m component_outputs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_run_component\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    383\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent_name\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_name\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    384\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    385\u001b[0m \u001b[43m    \u001b[49m\u001b[43minputs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_inputs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m  \u001b[49m\u001b[38;5;66;43;03m# the inputs to the current component\u001b[39;49;00m\n\u001b[1;32m    386\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent_visits\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_visits\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    387\u001b[0m \u001b[43m    \u001b[49m\u001b[43mparent_span\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mspan\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    388\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    390\u001b[0m \u001b[38;5;66;03m# Updates global input state with component outputs and returns outputs that should go 
to\u001b[39;00m\n\u001b[1;32m    391\u001b[0m \u001b[38;5;66;03m# pipeline outputs.\u001b[39;00m\n\u001b[1;32m    392\u001b[0m component_pipeline_outputs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_write_component_outputs(\n\u001b[1;32m    393\u001b[0m     component_name\u001b[38;5;241m=\u001b[39mcomponent_name,\n\u001b[1;32m    394\u001b[0m     component_outputs\u001b[38;5;241m=\u001b[39mcomponent_outputs,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    397\u001b[0m     include_outputs_from\u001b[38;5;241m=\u001b[39minclude_outputs_from,\n\u001b[1;32m    398\u001b[0m )\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:75\u001b[0m, in \u001b[0;36mPipeline._run_component\u001b[0;34m(component_name, component, inputs, component_visits, parent_span)\u001b[0m\n\u001b[1;32m     70\u001b[0m     component_output \u001b[38;5;241m=\u001b[39m instance\u001b[38;5;241m.\u001b[39mrun(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39minputs)\n\u001b[1;32m     71\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m BreakpointException \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     72\u001b[0m     \u001b[38;5;66;03m# Re-raise BreakpointException to preserve the original exception context\u001b[39;00m\n\u001b[1;32m     73\u001b[0m     \u001b[38;5;66;03m# This is important when Agent components internally use Pipeline._run_component\u001b[39;00m\n\u001b[1;32m     74\u001b[0m     \u001b[38;5;66;03m# and trigger breakpoints that need to bubble up to the main pipeline\u001b[39;00m\n\u001b[0;32m---> 75\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m error\n\u001b[1;32m     76\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mException\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     77\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m PipelineRuntimeError\u001b[38;5;241m.\u001b[39mfrom_exception(component_name, instance\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__class__\u001b[39m, error) \u001b[38;5;28;01mfrom\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01merror\u001b[39;00m\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:70\u001b[0m, in \u001b[0;36mPipeline._run_component\u001b[0;34m(component_name, component, inputs, component_visits, parent_span)\u001b[0m\n\u001b[1;32m     67\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRunning component \u001b[39m\u001b[38;5;132;01m{component_name}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m, component_name\u001b[38;5;241m=\u001b[39mcomponent_name)\n\u001b[1;32m     69\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m---> 70\u001b[0m     component_output \u001b[38;5;241m=\u001b[39m \u001b[43minstance\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43minputs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m     71\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m BreakpointException \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     72\u001b[0m     \u001b[38;5;66;03m# Re-raise BreakpointException to preserve the original exception context\u001b[39;00m\n\u001b[1;32m     73\u001b[0m     \u001b[38;5;66;03m# This is important when Agent components internally use Pipeline._run_component\u001b[39;00m\n\u001b[1;32m     74\u001b[0m     \u001b[38;5;66;03m# and trigger breakpoints that need to bubble up to the main pipeline\u001b[39;00m\n\u001b[1;32m     75\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m error\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/components/agents/agent.py:350\u001b[0m, in \u001b[0;36mAgent.run\u001b[0;34m(self, messages, streaming_callback, break_point, snapshot, **kwargs)\u001b[0m\n\u001b[1;32m    337\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m    338\u001b[0m     break_point\n\u001b[1;32m    339\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m break_point\u001b[38;5;241m.\u001b[39mbreak_point\u001b[38;5;241m.\u001b[39mcomponent_name \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mchat_generator\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    340\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m component_visits[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mchat_generator\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m==\u001b[39m break_point\u001b[38;5;241m.\u001b[39mbreak_point\u001b[38;5;241m.\u001b[39mvisit_count\n\u001b[1;32m    341\u001b[0m ):\n\u001b[1;32m    342\u001b[0m     agent_snapshot \u001b[38;5;241m=\u001b[39m _create_agent_snapshot(\n\u001b[1;32m    343\u001b[0m         component_visits\u001b[38;5;241m=\u001b[39mcomponent_visits,\n\u001b[1;32m    344\u001b[0m         agent_breakpoint\u001b[38;5;241m=\u001b[39mbreak_point,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    348\u001b[0m         },\n\u001b[1;32m    349\u001b[0m     )\n\u001b[0;32m--> 350\u001b[0m     \u001b[43m_check_chat_generator_breakpoint\u001b[49m\u001b[43m(\u001b[49m\u001b[43magent_snapshot\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43magent_snapshot\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mparent_snapshot\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mparent_snapshot\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    352\u001b[0m \u001b[38;5;66;03m# 1. 
Call the ChatGenerator\u001b[39;00m\n\u001b[1;32m    353\u001b[0m \u001b[38;5;66;03m# We skip the chat generator when restarting from a snapshot where we restart at the ToolInvoker.\u001b[39;00m\n\u001b[1;32m    354\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m skip_chat_generator:\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/breakpoint.py:377\u001b[0m, in \u001b[0;36m_check_chat_generator_breakpoint\u001b[0;34m(agent_snapshot, parent_snapshot)\u001b[0m\n\u001b[1;32m    372\u001b[0m msg \u001b[38;5;241m=\u001b[39m (\n\u001b[1;32m    373\u001b[0m     \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mBreaking at \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mbreak_point\u001b[38;5;241m.\u001b[39mcomponent_name\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m visit count \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    374\u001b[0m     \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00magent_snapshot\u001b[38;5;241m.\u001b[39mcomponent_visits[break_point\u001b[38;5;241m.\u001b[39mcomponent_name]\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    375\u001b[0m )\n\u001b[1;32m    376\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(msg)\n\u001b[0;32m--> 377\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m BreakpointException(\n\u001b[1;32m    378\u001b[0m     message\u001b[38;5;241m=\u001b[39mmsg,\n\u001b[1;32m    379\u001b[0m     component\u001b[38;5;241m=\u001b[39mbreak_point\u001b[38;5;241m.\u001b[39mcomponent_name,\n\u001b[1;32m    380\u001b[0m     inputs\u001b[38;5;241m=\u001b[39magent_snapshot\u001b[38;5;241m.\u001b[39mcomponent_inputs,\n\u001b[1;32m    381\u001b[0m     results\u001b[38;5;241m=\u001b[39magent_snapshot\u001b[38;5;241m.\u001b[39mcomponent_inputs[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m][\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mserialized_data\u001b[39m\u001b[38;5;124m\"\u001b[39m][\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[1;32m    382\u001b[0m )\n",
      "\u001b[0;31mBreakpointException\u001b[0m: Breaking at chat_generator visit count 0"
     ]
    }
   ],
   "source": [
    "from haystack.dataclasses.breakpoints import AgentBreakpoint, Breakpoint, ToolBreakpoint\n",
    "\n",
    "agent_generator_breakpoint = Breakpoint(component_name=\"chat_generator\", visit_count=0, snapshot_file_path=\"snapshots/\")\n",
    "agent_breakpoint = AgentBreakpoint(break_point=agent_generator_breakpoint, agent_name='database_agent')\n",
    "pipeline_with_agent.run(\n",
    "    data={\"fetcher\": {\"urls\": [\"https://en.wikipedia.org/wiki/Deepset\"]}},\n",
    "    break_point=agent_breakpoint,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This generates a JSON file in the \"snapshots\" directory, named after the agent and the component associated with the breakpoint. The file contains a snapshot of the Pipeline in which the Agent is running, as well as a snapshot of the Agent's state at the time of the breakpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "snapshots/database_agent_chat_generator_2025_07_26_12_22_11.json\n"
     ]
    }
   ],
   "source": [
    "!ls snapshots/database_agent_chat*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also place a breakpoint on the `tool` used by the `Agent`. This allows us to interrupt the pipeline execution at the point where the `tool` is invoked by the `tool_invoker`.\n",
    "\n",
    "To achieve this, we initialize a `ToolBreakpoint` with the name of the target tool, wrap it with an `AgentBreakpoint`, and then run the pipeline with the configured breakpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "ename": "BreakpointException",
     "evalue": "Breaking at tool_invoker visit count 0 for tool add_database_tool",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mBreakpointException\u001b[0m                       Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[6], line 4\u001b[0m\n\u001b[1;32m      1\u001b[0m agent_tool_breakpoint \u001b[38;5;241m=\u001b[39m ToolBreakpoint(component_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m, visit_count\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0\u001b[39m, tool_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124madd_database_tool\u001b[39m\u001b[38;5;124m\"\u001b[39m, snapshot_file_path\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msnapshots\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m      2\u001b[0m agent_breakpoint \u001b[38;5;241m=\u001b[39m AgentBreakpoint(break_point\u001b[38;5;241m=\u001b[39magent_tool_breakpoint, agent_name \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mdatabase_agent\u001b[39m\u001b[38;5;124m'\u001b[39m)\n\u001b[0;32m----> 4\u001b[0m \u001b[43mpipeline_with_agent\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m      5\u001b[0m \u001b[43m    \u001b[49m\u001b[43mdata\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m{\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mfetcher\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43m{\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43murls\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mhttps://en.wikipedia.org/wiki/Deepset\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m}\u001b[49m\u001b[43m}\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      6\u001b[0m \u001b[43m    \u001b[49m\u001b[43mbreak_point\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43magent_breakpoint\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m      7\u001b[0m \u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:382\u001b[0m, in \u001b[0;36mPipeline.run\u001b[0;34m(self, data, include_outputs_from, break_point, pipeline_snapshot)\u001b[0m\n\u001b[1;32m    377\u001b[0m         \u001b[38;5;28;01mif\u001b[39;00m should_trigger_breakpoint:\n\u001b[1;32m    378\u001b[0m             _trigger_break_point(\n\u001b[1;32m    379\u001b[0m                 pipeline_snapshot\u001b[38;5;241m=\u001b[39mnew_pipeline_snapshot, pipeline_outputs\u001b[38;5;241m=\u001b[39mpipeline_outputs\n\u001b[1;32m    380\u001b[0m             )\n\u001b[0;32m--> 382\u001b[0m component_outputs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_run_component\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    383\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent_name\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_name\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    384\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    385\u001b[0m \u001b[43m    \u001b[49m\u001b[43minputs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_inputs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m  \u001b[49m\u001b[38;5;66;43;03m# the inputs to the current component\u001b[39;49;00m\n\u001b[1;32m    386\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcomponent_visits\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcomponent_visits\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    387\u001b[0m \u001b[43m    \u001b[49m\u001b[43mparent_span\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mspan\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    388\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    390\u001b[0m \u001b[38;5;66;03m# Updates global input state with component outputs and returns outputs that should go 
to\u001b[39;00m\n\u001b[1;32m    391\u001b[0m \u001b[38;5;66;03m# pipeline outputs.\u001b[39;00m\n\u001b[1;32m    392\u001b[0m component_pipeline_outputs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_write_component_outputs(\n\u001b[1;32m    393\u001b[0m     component_name\u001b[38;5;241m=\u001b[39mcomponent_name,\n\u001b[1;32m    394\u001b[0m     component_outputs\u001b[38;5;241m=\u001b[39mcomponent_outputs,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    397\u001b[0m     include_outputs_from\u001b[38;5;241m=\u001b[39minclude_outputs_from,\n\u001b[1;32m    398\u001b[0m )\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:75\u001b[0m, in \u001b[0;36mPipeline._run_component\u001b[0;34m(component_name, component, inputs, component_visits, parent_span)\u001b[0m\n\u001b[1;32m     70\u001b[0m     component_output \u001b[38;5;241m=\u001b[39m instance\u001b[38;5;241m.\u001b[39mrun(\u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39minputs)\n\u001b[1;32m     71\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m BreakpointException \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     72\u001b[0m     \u001b[38;5;66;03m# Re-raise BreakpointException to preserve the original exception context\u001b[39;00m\n\u001b[1;32m     73\u001b[0m     \u001b[38;5;66;03m# This is important when Agent components internally use Pipeline._run_component\u001b[39;00m\n\u001b[1;32m     74\u001b[0m     \u001b[38;5;66;03m# and trigger breakpoints that need to bubble up to the main pipeline\u001b[39;00m\n\u001b[0;32m---> 75\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m error\n\u001b[1;32m     76\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mException\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     77\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m PipelineRuntimeError\u001b[38;5;241m.\u001b[39mfrom_exception(component_name, instance\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__class__\u001b[39m, error) \u001b[38;5;28;01mfrom\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21;01merror\u001b[39;00m\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/pipeline.py:70\u001b[0m, in \u001b[0;36mPipeline._run_component\u001b[0;34m(component_name, component, inputs, component_visits, parent_span)\u001b[0m\n\u001b[1;32m     67\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mRunning component \u001b[39m\u001b[38;5;132;01m{component_name}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m, component_name\u001b[38;5;241m=\u001b[39mcomponent_name)\n\u001b[1;32m     69\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m---> 70\u001b[0m     component_output \u001b[38;5;241m=\u001b[39m \u001b[43minstance\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43minputs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m     71\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m BreakpointException \u001b[38;5;28;01mas\u001b[39;00m error:\n\u001b[1;32m     72\u001b[0m     \u001b[38;5;66;03m# Re-raise BreakpointException to preserve the original exception context\u001b[39;00m\n\u001b[1;32m     73\u001b[0m     \u001b[38;5;66;03m# This is important when Agent components internally use Pipeline._run_component\u001b[39;00m\n\u001b[1;32m     74\u001b[0m     \u001b[38;5;66;03m# and trigger breakpoints that need to bubble up to the main pipeline\u001b[39;00m\n\u001b[1;32m     75\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m error\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/components/agents/agent.py:392\u001b[0m, in \u001b[0;36mAgent.run\u001b[0;34m(self, messages, streaming_callback, break_point, snapshot, **kwargs)\u001b[0m\n\u001b[1;32m    375\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m    376\u001b[0m     break_point\n\u001b[1;32m    377\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m break_point\u001b[38;5;241m.\u001b[39mbreak_point\u001b[38;5;241m.\u001b[39mcomponent_name \u001b[38;5;241m==\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    378\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m break_point\u001b[38;5;241m.\u001b[39mbreak_point\u001b[38;5;241m.\u001b[39mvisit_count \u001b[38;5;241m==\u001b[39m component_visits[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m]\n\u001b[1;32m    379\u001b[0m ):\n\u001b[1;32m    380\u001b[0m     agent_snapshot \u001b[38;5;241m=\u001b[39m _create_agent_snapshot(\n\u001b[1;32m    381\u001b[0m         component_visits\u001b[38;5;241m=\u001b[39mcomponent_visits,\n\u001b[1;32m    382\u001b[0m         agent_breakpoint\u001b[38;5;241m=\u001b[39mbreak_point,\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    390\u001b[0m         },\n\u001b[1;32m    391\u001b[0m     )\n\u001b[0;32m--> 392\u001b[0m     \u001b[43m_check_tool_invoker_breakpoint\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    393\u001b[0m \u001b[43m        \u001b[49m\u001b[43mllm_messages\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mllm_messages\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43magent_snapshot\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43magent_snapshot\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mparent_snapshot\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mparent_snapshot\u001b[49m\n\u001b[1;32m    394\u001b[0m \u001b[43m    
\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    396\u001b[0m \u001b[38;5;66;03m# 3. Call the ToolInvoker\u001b[39;00m\n\u001b[1;32m    397\u001b[0m \u001b[38;5;66;03m# We only send the messages from the LLM to the tool invoker\u001b[39;00m\n\u001b[1;32m    398\u001b[0m tool_invoker_result \u001b[38;5;241m=\u001b[39m Pipeline\u001b[38;5;241m.\u001b[39m_run_component(\n\u001b[1;32m    399\u001b[0m     component_name\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m    400\u001b[0m     component\u001b[38;5;241m=\u001b[39m{\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minstance\u001b[39m\u001b[38;5;124m\"\u001b[39m: \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_tool_invoker},\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    403\u001b[0m     parent_span\u001b[38;5;241m=\u001b[39mspan,\n\u001b[1;32m    404\u001b[0m )\n",
      "File \u001b[0;32m~/haystack-cookbook/.venv/lib/python3.12/site-packages/haystack/core/pipeline/breakpoint.py:437\u001b[0m, in \u001b[0;36m_check_tool_invoker_breakpoint\u001b[0;34m(llm_messages, agent_snapshot, parent_snapshot)\u001b[0m\n\u001b[1;32m    434\u001b[0m     msg \u001b[38;5;241m+\u001b[39m\u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m for tool \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtool_breakpoint\u001b[38;5;241m.\u001b[39mtool_name\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m    435\u001b[0m logger\u001b[38;5;241m.\u001b[39minfo(msg)\n\u001b[0;32m--> 437\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m BreakpointException(\n\u001b[1;32m    438\u001b[0m     message\u001b[38;5;241m=\u001b[39mmsg,\n\u001b[1;32m    439\u001b[0m     component\u001b[38;5;241m=\u001b[39mtool_breakpoint\u001b[38;5;241m.\u001b[39mcomponent_name,\n\u001b[1;32m    440\u001b[0m     inputs\u001b[38;5;241m=\u001b[39magent_snapshot\u001b[38;5;241m.\u001b[39mcomponent_inputs,\n\u001b[1;32m    441\u001b[0m     results\u001b[38;5;241m=\u001b[39magent_snapshot\u001b[38;5;241m.\u001b[39mcomponent_inputs[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mtool_invoker\u001b[39m\u001b[38;5;124m\"\u001b[39m][\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mserialized_data\u001b[39m\u001b[38;5;124m\"\u001b[39m][\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[1;32m    442\u001b[0m )\n",
      "\u001b[0;31mBreakpointException\u001b[0m: Breaking at tool_invoker visit count 0 for tool add_database_tool"
     ]
    }
   ],
   "source": [
    "agent_tool_breakpoint = ToolBreakpoint(component_name=\"tool_invoker\", visit_count=0, tool_name=\"add_database_tool\", snapshot_file_path=\"snapshots\")\n",
    "agent_breakpoint = AgentBreakpoint(break_point=agent_tool_breakpoint, agent_name = 'database_agent')\n",
    "\n",
    "pipeline_with_agent.run(\n",
    "    data={\"fetcher\": {\"urls\": [\"https://en.wikipedia.org/wiki/Deepset\"]}},\n",
    "    break_point=agent_breakpoint,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly this will also generate a JSON file in the \"snapshosts\" directory named after the agent's name and the the \"tool_invoker\" component which handled the tools used by the Agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "snapshots/database_agent_tool_invoker_2025_07_26_12_43_03.json\n"
     ]
    }
   ],
   "source": [
    "!ls snapshots/database_agent_tool_invoker*"
   ]
  },
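  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Snapshot file names embed a timestamp, so the exact name changes on every run. The following is a minimal sketch, assuming the naming pattern seen in the listing above, of how the most recent snapshot for a given agent and component could be located with the standard library. The helper name `latest_snapshot` is hypothetical and not part of Haystack.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def latest_snapshot(directory, prefix):\n",
    "    # Hypothetical helper: return the most recently modified '<prefix>*.json'\n",
    "    # file in `directory`, or None when nothing matches.\n",
    "    candidates = sorted(\n",
    "        Path(directory).glob(f'{prefix}*.json'),\n",
    "        key=lambda path: path.stat().st_mtime,\n",
    "    )\n",
    "    return candidates[-1] if candidates else None\n",
    "\n",
    "\n",
    "# Assumes the '<agent_name>_<component_name>_<timestamp>.json' pattern shown above.\n",
    "print(latest_snapshot('snapshots', 'database_agent_tool_invoker'))\n",
    "```\n",
    "\n",
    "Sorting by modification time rather than by name avoids relying on the exact timestamp format in the file name."
   ]
  },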
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Resuming from a break point\n",
    "\n",
    "For debugging purposes the snapshot files can be inspected and edited, and later injected into a pipeline and resume the execution from the point where the breakpoint was triggered.\n",
    "\n",
    "Once a pipeline execution has been interrupted, we can resume the `pipeline_with_agent` from that saved state.\n",
    "\n",
    "To do this:\n",
    "- Use `load_state()` to load the saved pipeline state from disk. This function converts the stored JSON file back into a Python dictionary representing the intermediate state.\n",
    "- Pass this state as an argument to the `Pipeline.run()` method.\n",
    "\n",
    "The pipeline will resume execution from where it left off and continue until completion."
   ]
  },
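  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because a snapshot is stored as a JSON file, it can be inspected with the standard library before resuming. The following is a minimal sketch: the helper name `summarize_snapshot` is hypothetical, and the only assumption about the file is that its top level is a JSON object. No particular snapshot schema is implied.\n",
    "\n",
    "```python\n",
    "import json\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def summarize_snapshot(path):\n",
    "    # Hypothetical helper: map each top-level key of the JSON snapshot\n",
    "    # to the type name of its value.\n",
    "    with Path(path).open(encoding='utf-8') as handle:\n",
    "        data = json.load(handle)\n",
    "    return {key: type(value).__name__ for key, value in data.items()}\n",
    "\n",
    "\n",
    "# File name taken from the run shown in this notebook; adjust it to your own snapshot.\n",
    "snapshot_path = Path('snapshots/database_agent_chat_generator_2025_07_26_12_22_11.json')\n",
    "if snapshot_path.exists():\n",
    "    for key, type_name in summarize_snapshot(snapshot_path).items():\n",
    "        print(f'{key}: {type_name}')\n",
    "```\n",
    "\n",
    "Listing only the top-level keys gives a quick overview of what was saved without printing the full serialized state."
   ]
  },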
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.core.pipeline.breakpoint import load_pipeline_snapshot\n",
    "\n",
    "# resume the pipeline from the saved state\n",
    "snapshot = load_pipeline_snapshot(\"snapshots/database_agent_chat_generator_2025_07_26_12_22_11.json\")\n",
    "\n",
    "result = pipeline_with_agent.run(\n",
    "    data={},\n",
    "    pipeline_snapshot=snapshot\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The following individuals have been added to the knowledge base along with their relevant information:\n",
      "\n",
      "1. **Milos Rusic**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Co-founded deepset in 2018 in Berlin.\n",
      "\n",
      "2. **Malte Pietsch**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Co-founded deepset in 2018 in Berlin.\n",
      "\n",
      "3. **Timo Möller**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Co-founded deepset in 2018 in Berlin.\n",
      "\n",
      "4. **Alex Ratner**\n",
      "   - **Job Title:** Founder\n",
      "   - **Other:** Snorkel AI.\n",
      "\n",
      "5. **Mustafa Suleyman**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Deepmind.\n",
      "\n",
      "6. **Spencer Kimball**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Cockroach Labs.\n",
      "\n",
      "7. **Jeff Hammerbacher**\n",
      "   - **Job Title:** Co-Founder\n",
      "   - **Other:** Cloudera.\n",
      "\n",
      "8. **Emil Eifrem**\n",
      "   - **Job Title:** Founder\n",
      "   - **Other:** Neo4j. \n",
      "\n",
      "This information emphasizes their roles in the establishment and growth of deepset as well as their affiliations with other notable companies in the tech industry.\n"
     ]
    }
   ],
   "source": [
    "print(result['database_agent']['last_message'].text)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
