# **Intelligent Example: Solving Mini Crosswords with ToT and Backtracking**

The objective is to fill a $5\times5$ grid with ten words (five horizontal and five vertical) that satisfy all of the clues. The task requires lexical, spatial, and deductive reasoning.

## **Problem Setup**

* **Task:** $5\times5$ Mini Crossword (10 clues in total: 5 horizontal and 5 vertical).
* **Goal:** Fill the entire grid correctly.
* **Thought Decomposition:** Each "thought" is the placement of a single word for one clue (e.g., h1. TASKS; v5. NALED). Thoughts are ordered with a priority queue, giving up to 10 intermediate steps.[1]
* **Search Algorithm:** Depth-First Search (DFS). This prioritizes exploring one path completely before trying another.[2, 1]
* **Heuristic Evaluation (Pruning):** At each step, the LLM is prompted to evaluate *all remaining unfilled clues* based on the current letter constraints. The output is a confidence score or a classification (e.g., "possible," "impossible").[1]
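
As a rough illustration of the setup above, the following sketch shows one way to represent the grid, apply a thought, and collect the letter constraints that the evaluation prompt would see. The names `empty_board`, `place`, and `remaining_constraints` are hypothetical, and the sketch assumes h1 to h5 are rows from top to bottom and v1 to v5 are columns from left to right.

```python
from typing import Dict, List

SIZE = 5
EMPTY = "_"  # marker for an unfilled cell


def empty_board() -> List[List[str]]:
    """Create a blank 5x5 grid."""
    return [[EMPTY] * SIZE for _ in range(SIZE)]


def place(board: List[List[str]], clue: str, word: str) -> List[List[str]]:
    """Return a copy of the board with `word` written into slot `clue` (e.g. 'h1')."""
    if len(word) != SIZE:
        raise ValueError(f"{word!r} does not have {SIZE} letters")
    index = int(clue[1:]) - 1
    new_board = [row[:] for row in board]
    for i, letter in enumerate(word.upper()):
        r, c = (index, i) if clue[0] == "h" else (i, index)
        if new_board[r][c] not in (EMPTY, letter):
            raise ValueError(f"{clue}. {word} clashes with a letter already placed")
        new_board[r][c] = letter
    return new_board


def remaining_constraints(board: List[List[str]]) -> Dict[str, str]:
    """Map every clue that still has an empty cell to its current letter pattern."""
    patterns = {}
    for i in range(SIZE):
        patterns[f"h{i + 1}"] = "".join(board[i])
        patterns[f"v{i + 1}"] = "".join(board[r][i] for r in range(SIZE))
    return {clue: p for clue, p in patterns.items() if EMPTY in p}


state = place(empty_board(), "h1", "TASKS")
constraints = remaining_constraints(state)
print(constraints["v2"])  # A____
```

In a real system, the patterns in `constraints` would be inserted into the value prompt alongside each clue's text; the LLM call itself is omitted here.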

***

## **Step-by-Step ToT Execution (Demonstrating Backtracking)**

Let's assume the LLM has already successfully filled **h1. TASKS** and is now at a search node (State $s_{2}$).

### **Step 1: Thought Generation (Prioritization)**

The LLM is prompted to generate and prioritize candidates for the next clue to fill, taking the existing letter constraints into account (for instance, the 'A' in T**A**SKS fixes the first letter of v2).

| Clue/Thought | Proposed Word | LLM Confidence (Heuristic) | Search Action |
| :--- | :--- | :--- | :--- |
| **h2.** [Clue] | **MOTOR** | High | **Prioritize.** Select for deep exploration. |
| **v3.** [Clue] | **STING** | Medium | Keep as alternative. |
| **h4.** [Clue] | **SALON** | High | Keep as alternative. |

**Search Action:** DFS commits to the **h2. MOTOR** path first.
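
The ordering step can be sketched with a small priority queue. This is only an assumption about how the confidence labels might be turned into a search order: the `RANK` mapping, the `"Low"` label, and the function name `prioritize` are hypothetical, and ties are broken by the order in which the LLM proposed the candidates.

```python
import heapq
from typing import List, Tuple

# Hypothetical mapping from the LLM's confidence label to a sort key (lower = explored first).
RANK = {"High": 0, "Medium": 1, "Low": 2}


def prioritize(candidates: List[Tuple[str, str, str]]) -> List[Tuple[str, str]]:
    """Order (clue, word, confidence) proposals so DFS tries the most confident first."""
    # The proposal position is the tie-breaker, so equally confident
    # candidates keep the order in which they were generated.
    heap = [(RANK[conf], pos, clue, word)
            for pos, (clue, word, conf) in enumerate(candidates)]
    heapq.heapify(heap)
    ordered = []
    while heap:
        _, _, clue, word = heapq.heappop(heap)
        ordered.append((clue, word))
    return ordered
```

Applied to the three rows of the table above, this ordering puts h2 first, then h4, then v3, which matches the DFS decision to commit to h2.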

### **Step 2: Deep Exploration (Fatal Error)**

The system now expands the tree along the chosen path, reaching State $s_{3}$. Placing h2. MOTOR creates new constraints: each of its letters becomes the second letter of a vertical clue (for example, v3 must now begin with S from TASKS followed by T from MOTOR). The LLM proposes and places the next thought, for instance **h4. SALON**.

| Thought Generated | Partial Solution State | Search Action |
| :--- | :--- | :--- |
| **h4. SALON** | Grid now contains TASKS, MOTOR, and SALON | Continue deep search. |

### **Step 3: State Evaluation and Pruning**

The LLM is then asked to evaluate the viability of the *entire remaining problem* from this new state ($s_{4}$). It checks every unfilled horizontal and vertical clue against the letters placed so far.

The LLM finds that the letters contributed by h1, h2, and h4 force one remaining vertical clue, **v5.**, into the pattern SR\_N\_ (S from TASKS, R from MOTOR, N from SALON).

| Remaining Clue | Constraint | LLM Value Prompt Result | Pruning Trigger |
| :--- | :--- | :--- | :--- |
| v5. Desiccator... | SR\_N\_ | **Impossible** [1] | **Pruning Activated.** |

The LLM determines that no known word fits the SR\_N\_ pattern for this clue, so the current path is a dead end. This is an explicit, language-based heuristic judgment.[1]

### **Step 4: Backtracking**

Because the current state is judged "impossible," DFS executes the crucial ToT mechanism: **backtracking**.[1]

1. The entire subtree rooted at **h4. SALON** is pruned and discarded.
2. The search state reverts to the parent node ($s_{3}$), where only **h1. TASKS** and **h2. MOTOR** are placed.
3. The search marks **h4. SALON** as a failed branch and selects the next alternative from the queue at that level. If no alternatives remain, it backtracks again to the previous parent ($s_{2}$, where only h1. TASKS is placed) and tries an alternative to h2. MOTOR.
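
The propose, evaluate, prune, and backtrack loop described in Steps 1 to 4 can be sketched as a small recursive DFS. This is a toy under stated assumptions, not the method from the cited work: the LLM proposer and evaluator are replaced by a hypothetical word table `CANDIDATES`, only horizontal words are placed, and the grid is an unrelated word square chosen so that the run includes one dead end.

```python
from typing import List, Optional, Tuple

SIZE = 5
Board = Tuple[str, ...]  # five row strings; "_" marks an empty cell

# Toy stand-in for the LLM's knowledge: plausible answers per clue (hypothetical data).
CANDIDATES = {
    "h1": ["HEART"], "h2": ["AMBER", "EMBER"], "h3": ["ABUSE"],
    "h4": ["RESIN"], "h5": ["TREND"],
    "v1": ["HABIT", "HEART"], "v2": ["EMBER"], "v3": ["ABUSE"],
    "v4": ["RESIN"], "v5": ["TREND"],
}


def place(board: Board, clue: str, word: str) -> Optional[Board]:
    """Return a new board with `word` in slot `clue`, or None if letters clash."""
    idx = int(clue[1:]) - 1
    rows = [list(r) for r in board]
    for i, ch in enumerate(word):
        r, c = (idx, i) if clue[0] == "h" else (i, idx)
        if rows[r][c] not in ("_", ch):
            return None
        rows[r][c] = ch
    return tuple("".join(r) for r in rows)


def pattern(board: Board, clue: str) -> str:
    """Current letters of a slot, e.g. 'HA___'."""
    idx = int(clue[1:]) - 1
    return board[idx] if clue[0] == "h" else "".join(row[idx] for row in board)


def fits(pat: str, word: str) -> bool:
    return all(p in ("_", w) for p, w in zip(pat, word))


def propose(board: Board) -> List[Tuple[str, str]]:
    """Stand-in for thought generation: candidates for the first empty row, best first."""
    for i, row in enumerate(board):
        if "_" in row:
            return [(f"h{i + 1}", w) for w in CANDIDATES[f"h{i + 1}"]]
    return []


def is_viable(board: Board) -> bool:
    """Stand-in for the value prompt: every clue must still match some known answer."""
    return all(any(fits(pattern(board, clue), w) for w in words)
               for clue, words in CANDIDATES.items())


def dfs(board: Board, trace: List[str]) -> Optional[Board]:
    if all("_" not in row for row in board):
        return board  # every cell is filled: solved
    for clue, word in propose(board):
        child = place(board, clue, word)
        if child is None:
            continue
        if not is_viable(child):
            trace.append(f"prune {clue}.{word}")  # some clue is now impossible
            continue
        trace.append(f"place {clue}.{word}")
        solved = dfs(child, trace)
        if solved is not None:
            return solved
        trace.append(f"backtrack {clue}.{word}")  # subtree exhausted: undo this thought
    return None


trace: List[str] = []
solution = dfs(("_" * SIZE,) * SIZE, trace)
print("\n".join(trace))
```

In this run, AMBER initially looks viable because v1 could still be HABIT, but every continuation is pruned, so the search undoes AMBER and succeeds with EMBER. Because boards are immutable tuples, reverting to the parent state is simply returning from the recursive call.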

**Intelligence Demonstrated:**

The key advantage here is the LLM's capacity to recognize a long-term failure immediately after a local step, prompting a structural correction to the problem-solving process.[1]

* **Linear CoT Failure:** A linear Chain-of-Thought process would have continued generating tokens sequentially, amplifying the error from the "impossible" constraint until the whole sequence was produced and failed.[1]
* **ToT Success:** ToT uses its **deliberate self-evaluation** (System 2 reasoning) to trigger a global search control function (backtracking), which avoids spending further computation on a doomed branch and lets the search recover from the local error along an alternative, viable path.[2, 1] The reported ablation supports the importance of this capability: removing backtracking reduced the word-level success rate on the Mini Crosswords task from 60% to 20%.[1]