# Search

> Find documents by meaning—from simple queries to complex analysis

Search is why vaults exist. Ask a question in plain English and get relevant passages back, even when the documents use different words than your query.

***

## Basic search

```bash title="Endpoint" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
POST /vault/:id/search
```

<CodeGroup>
  ```bash title="cURL" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  curl -X POST "https://api.case.dev/vault/$VAULT_ID/search" \
    -H "Authorization: Bearer $CASEDEV_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"query": "What did the witness say about post-operative care?"}'
  ```

  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "What did the witness say about post-operative care?"
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  import Casedev from 'casedev';

  const client = new Casedev({ apiKey: 'sk_case_YOUR_API_KEY' });

  const results = await client.vault.search(vaultId, {
    query: 'What did the witness say about post-operative care?'
  });

  for (const chunk of results.chunks) {
    console.log(chunk.text);
    // Page info for citation navigation (null if unavailable)
    if (chunk.page_start) {
      const pageRange = chunk.page_start === chunk.page_end 
        ? `p. ${chunk.page_start}` 
        : `pp. ${chunk.page_start}-${chunk.page_end}`;
      console.log(`Document: ${chunk.object_id}, ${pageRange}`);
    }
    // Media timing for transcript navigation (undefined if unavailable)
    if (chunk.start_ms !== undefined) {
      console.log(`Media range: ${chunk.start_ms}ms-${chunk.end_ms}ms`);
    }
  }
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  import casedev

  client = casedev.Casedev(api_key='sk_case_YOUR_API_KEY')

  results = client.vault.search(vault_id,
      query='What did the witness say about post-operative care?'
  )

  for chunk in results.chunks:
      print(chunk.text)
      # Page info for citation navigation (None if unavailable)
      if chunk.page_start:
          page_range = f'p. {chunk.page_start}' if chunk.page_start == chunk.page_end else f'pp. {chunk.page_start}-{chunk.page_end}'
          print(f'Document: {chunk.object_id}, {page_range}')
      # Media timing for transcript navigation (None if unavailable)
      if chunk.start_ms is not None:
          print(f'Media range: {chunk.start_ms}ms-{chunk.end_ms}ms')
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, err := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
  	Query: casedev.F("What did the witness say about post-operative care?"),
  })
  if err != nil {
  	log.Fatal(err) // surface request failures instead of ranging over a nil result
  }
  for _, chunk := range results.Chunks {
  	fmt.Println(chunk.Text)
  }
  ```
</CodeGroup>

```json title="Response" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
{
  "chunks": [
    {
      "text": "Q: Can you describe the monitoring protocol?\nA: Standard protocol requires vital signs every 15 minutes for the first hour...",
      "object_id": "obj_abc123",
      "chunk_index": 12,
      "page_start": 45,
      "page_end": 46,
      "word_start_index": 5544,
      "word_end_index": 6055,
      "distance": 0.11
    }
  ],
  "sources": [
    {
      "id": "obj_abc123",
      "filename": "deposition-johnson.pdf",
      "pageCount": 120
    }
  ]
}
```

<Info>
  **Page citations**: The `page_start` and `page_end` fields tell you exactly which PDF pages the chunk spans. This enables "jump to page" functionality in document viewers. For documents without page information (TXT, DOCX, or older documents), these fields will be `null`.
</Info>
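
To turn the page fields into a readable citation, a small helper can join each chunk to its entry in the top-level `sources` array. The sketch below works on the raw JSON response as a plain dict; `format_citation` is a hypothetical helper name, and it assumes that `object_id` on a chunk matches `id` on a source, as in the sample response above.

```python
from typing import Optional


def format_citation(chunk: dict, sources: list[dict]) -> Optional[str]:
    """Build a citation such as 'deposition-johnson.pdf, pp. 45-46'.

    Returns None when the chunk carries no page information.
    """
    page_start = chunk.get("page_start")
    page_end = chunk.get("page_end")
    if page_start is None:
        return None

    # Look up the filename by object ID; fall back to the ID itself.
    filenames = {source["id"]: source["filename"] for source in sources}
    name = filenames.get(chunk["object_id"], chunk["object_id"])

    if page_end is None or page_end == page_start:
        return f"{name}, p. {page_start}"
    return f"{name}, pp. {page_start}-{page_end}"


response = {
    "chunks": [{"object_id": "obj_abc123", "page_start": 45, "page_end": 46}],
    "sources": [{"id": "obj_abc123", "filename": "deposition-johnson.pdf"}],
}
for chunk in response["chunks"]:
    print(format_citation(chunk, response["sources"]))
```

Returning `None` for chunks without page data lets the caller decide whether to hide the citation or fall back to the filename alone.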

For media-backed transcripts, search chunks can include source audio/video timing when real word timing is available:

```json title="Media transcript search chunk" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
{
  "text": "Good morning. Could you please state your full name for the record?...",
  "object_id": "obj_audio123",
  "chunk_index": 0,
  "page_start": null,
  "page_end": null,
  "word_start_index": 0,
  "word_end_index": 181,
  "start_ms": 100,
  "end_ms": 75291,
  "distance": 0.77
}
```

<Info>
  **Media transcript timing**: For audio/video-backed transcript chunks, `start_ms` and `end_ms` are the source media timestamps for the first and last word in the chunk. These fields are returned only when real word timing exists and are omitted for normal documents and text-only transcripts.
</Info>
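
For a media player or transcript viewer, the millisecond values usually need converting to a clock-style timestamp. The following is a minimal sketch on a chunk represented as a plain dict; `format_timestamp` and `media_range` are hypothetical helper names, not part of the SDK.

```python
from typing import Optional


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to H:MM:SS.mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def media_range(chunk: dict) -> Optional[str]:
    """Return 'start-end' timestamps, or None when timing is absent."""
    start_ms = chunk.get("start_ms")
    end_ms = chunk.get("end_ms")
    if start_ms is None or end_ms is None:
        return None
    return f"{format_timestamp(start_ms)}-{format_timestamp(end_ms)}"


# Values taken from the sample media transcript chunk above.
print(media_range({"start_ms": 100, "end_ms": 75291}))
```

Because the timing fields are omitted for normal documents, the helper treats a missing key and a `None` value the same way.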

***

## Understanding search methods

Vaults support different search methods for different needs:

| Method   | What it does                 | Best for                                         |
| -------- | ---------------------------- | ------------------------------------------------ |
| `hybrid` | Combines meaning + keywords  | **Default.** Best for most queries.              |
| `fast`   | Meaning only (vector search) | Speed-critical applications                      |
| `local`  | GraphRAG entity search       | "What did \[person] say about \[topic]?"         |
| `global` | GraphRAG corpus-wide         | "What are the main themes across all documents?" |

### Hybrid search (default)

Combines semantic understanding with keyword matching. A search for "timeline" finds documents that contain the word "timeline" as well as documents that say "schedule", which has a similar meaning.

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "timeline of events" \
    --method hybrid
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  const results = await client.vault.search(vaultId, {
    query: 'timeline of events',
    method: 'hybrid'  // This is the default
  });
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results = client.vault.search(vault_id,
      query='timeline of events',
      method='hybrid'  # This is the default
  )
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, _ := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
      Query:  casedev.F("timeline of events"),
      Method: casedev.F(casedev.VaultSearchParamsMethodHybrid),
  })
  ```
</CodeGroup>
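
To build intuition for why combining the two signals helps, the sketch below merges a keyword ranking and a semantic ranking with reciprocal rank fusion, a common generic technique. This is an illustration only: nothing here states how the vault actually combines results, and the chunk IDs are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item_id: scores[item_id], reverse=True)


# Hypothetical results for the query "timeline of events".
keyword_hits = ["chunk_timeline", "chunk_dates"]      # contain the literal word
semantic_hits = ["chunk_schedule", "chunk_timeline"]  # similar in meaning
merged = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(merged)
```

A chunk that appears in both lists accumulates score from each, so it ranks above chunks found by only one signal.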

### Fast search

Pure vector similarity. Faster than hybrid, but it can miss exact keyword matches.

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "medical negligence" \
    --method fast
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  const results = await client.vault.search(vaultId, {
    query: 'medical negligence',
    method: 'fast'
  });
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results = client.vault.search(vault_id,
      query='medical negligence',
      method='fast'
  )
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, _ := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
      Query:  casedev.F("medical negligence"),
      Method: casedev.F(casedev.VaultSearchParamsMethodFast),
  })
  ```
</CodeGroup>

***

## GraphRAG: When basic search isn't enough

Basic search finds **individual passages**. But some questions require an understanding of your **entire document collection**:

* *"What are the main contradictions between witnesses?"*
* *"Summarize all the expert testimony"*
* *"What did Dr. Johnson say about the defendant?"*

This is what **GraphRAG** does. It builds a knowledge graph—a map of people, organizations, concepts, and how they connect—across all your documents.

### How GraphRAG works

```mermaid theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
flowchart LR
    subgraph A["Standard RAG"]
        A1["Query"] --> A2["Retrieve similar chunks"]
        A2 --> A3["Return passages"]
        A3 --> A4["Good for local answers<br/>Weak at whole-collection synthesis"]
    end

    subgraph B["GraphRAG"]
        B1["Build a knowledge graph"] --> B2["Map entities, relationships, and themes"]
        B2 --> B3["Query globally or locally"]
        B3 --> B4["Return a synthesized answer across the collection"]
    end

    style A fill:#e8f1ff,stroke:#6b8de3,color:#1f3b73
    style B fill:#e8f1ff,stroke:#6b8de3,color:#1f3b73
```

### Enable GraphRAG

GraphRAG requires a one-time initialization to build the knowledge graph:

```bash title="Endpoint" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
POST /vault/:id/graphrag/init
```

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault:graphrag init --id $VAULT_ID
  casedev vault:graphrag get-stats --id $VAULT_ID
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  // Initialize the knowledge graph (one-time)
  await client.vault.graphrag.init(vaultId);

  // Check progress
  const stats = await client.vault.graphrag.getStats(vaultId);
  console.log(`Entities: ${stats.entities}`);
  console.log(`Relationships: ${stats.relationships}`);
  console.log(`Status: ${stats.status}`);
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  # Initialize the knowledge graph (one-time)
  client.vault.graphrag.init(vault_id)

  # Check progress
  stats = client.vault.graphrag.get_stats(vault_id)
  print(f'Entities: {stats.entities}')
  print(f'Relationships: {stats.relationships}')
  print(f'Status: {stats.status}')
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  client.Vault.Graphrag.Init(ctx, vaultID)

  stats, _ := client.Vault.Graphrag.GetStats(ctx, vaultID)
  fmt.Printf("Entities: %d, Relationships: %d\n", stats.Entities, stats.Relationships)
  ```
</CodeGroup>

<Info>
  **One-time setup.** You only initialize GraphRAG once. New documents are automatically added to the graph when ingested.
</Info>
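
Since the graph takes time to build, a caller may want to poll the stats until the status reports completion. The sketch below is one way to do that under stated assumptions: `wait_for_graph` is a hypothetical helper, the `"ready"` status value is a guess (check the value your vault actually returns), and the stats call is injected as a function so the example runs without network access.

```python
import time
from typing import Callable


def wait_for_graph(
    get_stats: Callable[[], dict],
    ready_status: str = "ready",  # assumption: the real status value may differ
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> dict:
    """Poll get_stats() until its 'status' equals ready_status."""
    for attempt in range(max_attempts):
        stats = get_stats()
        if stats.get("status") == ready_status:
            return stats
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"graph not ready after {max_attempts} attempts")


# Stand-in for a real stats call, returning canned responses in order.
_fake_responses = iter([
    {"status": "processing", "entities": 10, "relationships": 4},
    {"status": "ready", "entities": 42, "relationships": 17},
])
stats = wait_for_graph(lambda: next(_fake_responses), interval_s=0.0)
print(stats["entities"], stats["relationships"])
```

In real use the injected function would wrap the SDK's stats call and convert its result to a dict with the same keys.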

### Global search: Corpus-wide questions

Ask questions that span your entire document collection:

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "What are the key disputed facts in this case?" \
    --method global
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  const results = await client.vault.search(vaultId, {
    query: 'What are the key disputed facts in this case?',
    method: 'global'
  });

  // Returns a synthesized answer, not just chunks
  console.log(results.response);
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results = client.vault.search(vault_id,
      query='What are the key disputed facts in this case?',
      method='global'
  )

  # Returns a synthesized answer, not just chunks
  print(results.response)
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, _ := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
      Query:  casedev.F("What are the key disputed facts in this case?"),
      Method: casedev.F(casedev.VaultSearchParamsMethodGlobal),
  })

  // Returns a synthesized answer, not just chunks
  fmt.Println(results.Response)
  ```
</CodeGroup>

**Good for:**

* *"What are the main themes across all depositions?"*
* *"Summarize the expert witness testimony"*
* *"What patterns appear in these contracts?"*

### Local search: Entity-focused questions

Ask about specific people, organizations, or concepts:

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "What did Dr. Johnson testify about the monitoring protocol?" \
    --method local
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  const results = await client.vault.search(vaultId, {
    query: 'What did Dr. Johnson testify about the monitoring protocol?',
    method: 'local'
  });

  console.log(results.response);
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results = client.vault.search(vault_id,
      query='What did Dr. Johnson testify about the monitoring protocol?',
      method='local'
  )

  print(results.response)
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, _ := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
      Query:  casedev.F("What did Dr. Johnson testify about the monitoring protocol?"),
      Method: casedev.F(casedev.VaultSearchParamsMethodLocal),
  })

  fmt.Println(results.Response)
  ```
</CodeGroup>

**Good for:**

* *"What did \[person] say about \[topic]?"*
* *"What is \[company]'s position on \[issue]?"*
* *"Find everything about \[entity]"*
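
Because `global` and `local` return a synthesized `response` while `hybrid` and `fast` return `chunks`, code that accepts any method needs to handle both shapes. A minimal sketch on the raw JSON as a dict follows; `summarize_result` is a hypothetical helper, and whether GraphRAG responses also carry `chunks` is not assumed either way.

```python
def summarize_result(result: dict) -> str:
    """Return the synthesized answer if present, else the matched passages."""
    answer = result.get("response")
    if answer:
        return answer
    passages = [chunk["text"] for chunk in result.get("chunks", [])]
    return "\n---\n".join(passages)


graph_result = {"response": "Dr. Johnson testified that vitals were checked."}
passage_result = {"chunks": [{"text": "First passage"}, {"text": "Second passage"}]}

print(summarize_result(graph_result))
print(summarize_result(passage_result))
```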

***

## When to use what

| Your question                       | Method   | Why                         |
| ----------------------------------- | -------- | --------------------------- |
| "Find testimony about medication"   | `hybrid` | Finding specific passages   |
| "What are the main themes?"         | `global` | Needs corpus-wide synthesis |
| "What did Dr. Smith say about X?"   | `local`  | Entity-focused              |
| "Contradictions between witnesses?" | `global` | Cross-document analysis     |
| Speed-critical search               | `fast`   | Pure vector, faster         |

***

## Filtering results

Narrow results by metadata you added during upload:

<CodeGroup>
  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  casedev vault search \
    --id $VAULT_ID \
    --query "post-operative complications" \
    --filters '{"document_type": "medical_record", "date": {"after": "2024-01-01"}}'
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  const results = await client.vault.search(vaultId, {
    query: 'post-operative complications',
    filters: {
      document_type: 'medical_record',
      date: { after: '2024-01-01' }
    }
  });
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results = client.vault.search(vault_id,
      query='post-operative complications',
      filters={
          'document_type': 'medical_record',
          'date': {'after': '2024-01-01'}
      }
  )
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  results, _ := client.Vault.Search(ctx, vaultID, casedev.VaultSearchParams{
      Query: casedev.F("post-operative complications"),
      Filters: casedev.F(map[string]interface{}{
          "document_type": "medical_record",
          "date":          map[string]interface{}{"after": "2024-01-01"},
      }),
  })
  ```
</CodeGroup>

***

## Parameters

| Parameter | Type   | Default      | Description                            |
| --------- | ------ | ------------ | -------------------------------------- |
| `query`   | string | **required** | Your question in natural language      |
| `method`  | string | `hybrid`     | `hybrid`, `fast`, `local`, or `global` |
| `topK`    | number | 10           | How many results to return             |
| `filters` | object | `{}`         | Filter by metadata fields              |
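
When calling the endpoint without an SDK, the parameters above map onto a JSON request body. The sketch below builds that body with the defaults from the table; `build_search_body` is a hypothetical helper, and the client-side checks are illustrative rather than a description of how the server validates input.

```python
import json
from typing import Optional

VALID_METHODS = {"hybrid", "fast", "local", "global"}


def build_search_body(
    query: str,
    method: str = "hybrid",
    top_k: int = 10,
    filters: Optional[dict] = None,
) -> str:
    """Serialize search parameters to a JSON string for POST /vault/:id/search."""
    if not query:
        raise ValueError("query is required")
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_METHODS)}")
    body = {
        "query": query,
        "method": method,
        "topK": top_k,  # the parameter table spells this in camelCase
        "filters": filters or {},
    }
    return json.dumps(body)


print(build_search_body(
    "post-operative complications",
    filters={"document_type": "medical_record"},
))
```

The returned string can be passed as the `-d` payload of the cURL request shown in the basic search example.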

***

## Response fields

Each chunk in the response includes:

| Field         | Type           | Description                                                                |
| ------------- | -------------- | -------------------------------------------------------------------------- |
| `text`        | string         | Preview of the chunk text (up to 500 characters)                           |
| `object_id`   | string         | ID of the source document                                                  |
| `chunk_index` | number         | Position of this chunk in the document (0-based)                           |
| `page_start`  | number \| null | PDF page where chunk begins (1-indexed)                                    |
| `page_end`    | number \| null | PDF page where chunk ends (1-indexed)                                      |
| `start_ms`    | number         | Source media timestamp of the first word (ms); omitted without word timing |
| `end_ms`      | number         | Source media timestamp of the last word (ms); omitted without word timing  |
| `distance`    | number         | Vector similarity distance (lower = more similar)                          |

<Note>
  **Page fields may be null** for: non-PDF documents (TXT, DOCX), documents ingested before page tracking was added, or OCR results that don't include page boundaries.
</Note>
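
A common next step is to group chunks by source document and keep the closest matches first. The sketch below does this on plain dicts using only the `object_id` and `distance` fields described above; `group_by_document` is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_document(chunks: list[dict]) -> dict[str, list[dict]]:
    """Group chunks by object_id, each group ordered by ascending distance."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    # Lower distance means more similar, so sort ascending before grouping.
    for chunk in sorted(chunks, key=lambda c: c["distance"]):
        grouped[chunk["object_id"]].append(chunk)
    return dict(grouped)


chunks = [
    {"object_id": "obj_a", "chunk_index": 3, "distance": 0.40},
    {"object_id": "obj_b", "chunk_index": 0, "distance": 0.25},
    {"object_id": "obj_a", "chunk_index": 12, "distance": 0.11},
]
for object_id, group in group_by_document(chunks).items():
    print(object_id, [c["chunk_index"] for c in group])
```

Documents appear in the order of their best-matching chunk, which makes the result convenient for a "top documents" view.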

***

## Understanding scores

Each result has a relevance score from 0 to 1:

<CodeGroup>
  ```bash title="cURL" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  # Response includes score for each chunk
  curl -X POST "https://api.case.dev/vault/$VAULT_ID/search" \
    -H "Authorization: Bearer $CASEDEV_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"query": "your search query"}'
  # Each chunk in response has "score": 0.0–1.0
  ```

  ```bash title="CLI" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  # CLI outputs scores in the results table automatically
  casedev vault search \
    --id $VAULT_ID \
    --query "your search query"
  ```

  ```typescript title="Typescript" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  for (const chunk of results.chunks) {
    console.log(`${chunk.filename}: ${chunk.score}`);
    // deposition.pdf: 0.92  ← Very relevant
    // contract.pdf: 0.71    ← Somewhat relevant
    // memo.pdf: 0.45        ← Weakly relevant
  }
  ```

  ```python title="Python" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  for chunk in results.chunks:
      print(f'{chunk.filename}: {chunk.score}')
      # deposition.pdf: 0.92  ← Very relevant
      # contract.pdf: 0.71    ← Somewhat relevant
      # memo.pdf: 0.45        ← Weakly relevant
  ```

  ```go title="Go" theme={"theme":{"light":"github-light","dark":"one-dark-pro"}}
  for _, chunk := range results.Chunks {
      fmt.Printf("%s: %.2f\n", chunk.Filename, chunk.Score)
      // deposition.pdf: 0.92  ← Very relevant
      // contract.pdf: 0.71    ← Somewhat relevant
      // memo.pdf: 0.45        ← Weakly relevant
  }
  ```
</CodeGroup>

Higher score = more relevant to your query. The `distance` field in the response runs the other way: lower values mean closer matches.
