pdoc.search

pdoc has a search box that lets users quickly find relevant parts of the documentation. This feature is implemented entirely client-side, so the generated documentation can still be hosted statically, and it works without third-party services in a privacy-preserving way. When a user focuses the search box for the first time, pdoc fetches the search index (search.js) and uses it to answer all subsequent queries.

Search Coverage

The search functionality covers all documented elements and their docstrings. You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

Search Performance

pdoc uses Elasticlunr.js to implement search. To improve end-user performance, pdoc attempts to precompile the search index when building the documentation. This only works if Node.js is available; otherwise, pdoc gracefully falls back to building the index client-side.

If your search index reaches a size where compilation times are meaningful (more than 3 MB of raw index data) and Node.js cannot be invoked, pdoc prints a notice when building your documentation. In this case, it should be enough to install a recent version of Node.js and make a nodejs or node executable available on your PATH. There are no other dependencies: pdoc only uses node to interpret a local JS file and does not download any additional packages.
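To check ahead of a documentation build whether precompilation has a chance of working, you can look for the two executable names yourself. The following is a minimal sketch, not pdoc's own code: `find_node_executable` is a hypothetical helper name, and it only mirrors the lookup order described above (`nodejs` first, then `node`).

```python
from __future__ import annotations

import shutil


def find_node_executable() -> str | None:
    """Return the path of the first Node.js executable found on PATH, or None."""
    # Same preference order as described above: `nodejs` first, then `node`.
    for name in ("nodejs", "node"):
        path = shutil.which(name)
        if path:
            return path
    return None


node = find_node_executable()
if node is None:
    print("No Node.js executable on PATH; the search index would be built client-side.")
else:
    print(f"Found Node.js at {node}; precompilation can be attempted.")
```

Finding an executable does not guarantee that precompilation succeeds; it only rules out the most common cause of the fallback.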

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and then checking your browser's developer console.

Search Index Size

The search index can be relatively large because it includes all docstrings. For larger projects, make sure that HTTP compression and caching are enabled. search.js usually compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312 kB to 27 kB.
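The figures above work out to 27 / 312, or roughly 9% of the original size. To get a feel for how much gzip helps on index-like data, the sketch below compresses a synthetic JSON payload. The payload and the helper name `gzip_ratio` are made up for illustration; such repetitive data compresses better than real docstrings, so measure your own search.js for a realistic number.

```python
import gzip
import json


def gzip_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size."""
    return len(gzip.compress(payload)) / len(payload)


# Synthetic stand-in for a search index: many small documents with similar fields.
documents = [
    {
        "fullname": f"package.module.function_{i}",
        "type": "function",
        "doc": "<p>Return the parsed configuration value for the given key.</p>",
    }
    for i in range(500)
]
payload = json.dumps(documents).encode()

ratio = gzip_ratio(payload)
print(f"{len(payload)} bytes compress to {ratio:.0%} of the original size")
```

To measure a real index, read the bytes of the generated search.js from your output directory and pass them to the same function.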

Disabling Search

If you wish to hide the search box, you can add

{% block search %}{% endblock %}
{% block search_js %}{% endblock %}

in your module.html.jinja2 template.
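As a small illustration, the script below writes such a template override into a directory of your choice. It assumes that the custom template extends pdoc's default one via `{% extends "default/module.html.jinja2" %}`; the directory name and the helper `write_template` are hypothetical.

```python
import tempfile
from pathlib import Path

# Custom template: inherit everything from the default template (assumed path),
# but render the two search-related blocks as empty.
TEMPLATE = """\
{% extends "default/module.html.jinja2" %}
{% block search %}{% endblock %}
{% block search_js %}{% endblock %}
"""


def write_template(directory: Path) -> Path:
    """Create `module.html.jinja2` inside `directory` and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / "module.html.jinja2"
    target.write_text(TEMPLATE, encoding="utf8")
    return target


template_dir = Path(tempfile.mkdtemp()) / "pdoc-templates"
path = write_template(template_dir)
print(f"Wrote {path}")
```

The resulting directory would then be passed to pdoc as its template directory (on the command line this is assumed to be the `-t` / `--template-directory` option).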

"""
pdoc has a search box that lets users quickly find relevant parts of the documentation.
This feature is implemented entirely client-side, so the generated documentation can still be hosted statically,
and it works without third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc fetches the search index (`search.js`) and uses it to
answer all subsequent queries.

##### Search Coverage

The search functionality covers all documented elements and their docstrings.
You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

##### Search Performance

pdoc uses [Elasticlunr.js](https://github.com/weixsong/elasticlunr.js) to implement search. To improve end-user
performance, pdoc attempts to precompile the search index when building the documentation. This only works if
`nodejs` is available; otherwise, pdoc gracefully falls back to building the index client-side.

If your search index reaches a size where compilation times are meaningful and `nodejs` cannot be invoked,
pdoc prints a notice when building your documentation. In this case, it should be enough to install
a recent version of [Node.js](https://nodejs.org/) and make a `nodejs` or `node` executable available on your PATH.
There are no other dependencies: pdoc only uses `node` to interpret a local JS file and does not download any
additional packages.

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and
then checking your browser's developer console.

##### Search Index Size

The search index can be relatively large because it includes all docstrings. For larger projects, make sure that
[HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression) and caching are enabled. `search.js` usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312 kB
to 27 kB.

##### Disabling Search

If you wish to hide the search box, you can add
```html+jinja
{% block search %}{% endblock %}
{% block search_js %}{% endblock %}
```
in your [`module.html.jinja2` template](../pdoc.html#edit-pdocs-html-template).
"""
from __future__ import annotations

import json
import shutil
import subprocess
import textwrap
from collections.abc import Callable, Mapping
from pathlib import Path

import pdoc.doc
from pdoc.render_helpers import to_html, to_markdown


def make_index(
    all_modules: Mapping[str, pdoc.doc.Module],
    is_public: Callable[[pdoc.doc.Doc], bool],
    default_docformat: str,
) -> list[dict]:
    """
    This function compiles all currently documented modules into a list of JSON-serializable documents,
    which can then be ingested by Elasticlunr.js.
    """

    documents = []
    for modname, module in all_modules.items():

        def make_item(doc: pdoc.doc.Doc, **kwargs) -> dict[str, str]:
            # TODO: We could be extra fancy here and split `doc.docstring` by toc sections.
            ret = {
                "fullname": doc.fullname,
                "modulename": doc.modulename,
                "qualname": doc.qualname,
                "type": doc.type,
                "doc": to_html(to_markdown(doc.docstring, module, default_docformat)),
                **kwargs,
            }
            # Omit fields with empty values (e.g. a missing docstring) from the document.
            return {k: v for k, v in ret.items() if v}

        # TODO: Instead of building our own JSON objects here we could also use module.html.jinja2's member()
        #  implementation to render HTML for each documentation object and then implement a elasticlunr tokenizer that
        #  removes HTML. It wouldn't be great for search index size, but the rendered search entries would be fully
        #  consistent.
        def make_index(mod: pdoc.doc.Namespace, **extra):
            # A non-public namespace is skipped together with all of its members.
            if not is_public(mod):
                return
            yield make_item(mod, **extra)
            for m in mod.own_members:
                if isinstance(m, pdoc.doc.Variable) and is_public(m):
                    yield make_item(
                        m,
                        annotation=m.annotation_str,
                        default_value=m.default_value_str,
                    )
                elif isinstance(m, pdoc.doc.Function) and is_public(m):
                    yield make_item(
                        m,
                        signature=str(m.signature),
                        funcdef=m.funcdef,
                    )
                elif isinstance(m, pdoc.doc.Class):
                    # Classes are namespaces: recurse. The is_public check happens in the recursive call.
                    yield from make_index(
                        m,
                        bases=", ".join(x[2] for x in m.bases),
                    )
                else:
                    pass

        documents.extend(make_index(module))

    return documents


def precompile_index(documents: list[dict], compile_js: Path) -> str:
    """
    This function tries to precompile the Elasticlunr.js search index by invoking `nodejs` or `node`.
    If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
    If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

    We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or
    – even better – a Python-based search index generation similar to
    [elasticlunr-rs](https://github.com/mattico/elasticlunr-rs) that could be shipped as part of pdoc.
    """
    raw = json.dumps(documents)
    try:
        # Prefer `nodejs` if it is on PATH; otherwise try `node`.
        if shutil.which("nodejs"):
            executable = "nodejs"
        else:
            executable = "node"
        # The documents are passed to the script on stdin; the compiled index is read from stdout.
        out = subprocess.check_output(
            [executable, compile_js],
            input=raw.encode(),
            cwd=Path(__file__).parent / "templates",
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception as e:
        # Any failure (missing executable, non-zero exit code, invalid JSON) falls back to the raw documents.
        # The notice is only printed for large indexes.
        if len(raw) > 3 * 1024 * 1024:
            print(
                f"pdoc failed to precompile the search index: {e}\n"
                f"Search will work, but may be slower. "
                f"This error may only show up now because your index has reached a certain size. "
                f"See https://pdoc.dev/docs/pdoc/search.html for details."
            )
            if isinstance(e, subprocess.CalledProcessError):
                print(f"{' Node.js Output ':=^80}")
                print(
                    textwrap.indent(e.output.decode("utf8", "replace"), "    ").rstrip()
                )
                print("=" * 80)
        return raw
    else:
        return json.dumps(index)

def make_index(all_modules: collections.abc.Mapping[str, pdoc.doc.Module], is_public: collections.abc.Callable[[pdoc.doc.Doc], bool], default_docformat: str) -> list[dict]:

This function compiles all currently documented modules into a list of JSON-serializable documents, which can then be ingested by Elasticlunr.js.
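To illustrate what a single document in that list looks like, here is a simplified sketch that does not import pdoc. `FakeDoc` is a hypothetical stand-in for `pdoc.doc.Doc`, and the docstring is stored as-is, whereas pdoc renders it to HTML first. The point it demonstrates is that fields with empty values are dropped from each document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for pdoc.doc.Doc with only the fields used below."""

    fullname: str
    modulename: str
    qualname: str
    type: str
    docstring: str


def make_item(doc: FakeDoc, **extra: str) -> dict[str, str]:
    item = {
        "fullname": doc.fullname,
        "modulename": doc.modulename,
        "qualname": doc.qualname,
        "type": doc.type,
        "doc": doc.docstring,  # simplification: pdoc stores rendered HTML here
        **extra,
    }
    # Drop empty fields so they do not end up in the index.
    return {key: value for key, value in item.items() if value}


module_item = make_item(FakeDoc("demo", "demo", "", "module", "A demo module."))
function_item = make_item(
    FakeDoc("demo.add", "demo", "add", "function", ""),
    signature="(a, b)",
    funcdef="def",
)
print(module_item)
print(function_item)
```

The module entry has no `qualname` key and the undocumented function has no `doc` key, because both values were empty strings.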

def precompile_index(documents: list[dict], compile_js: pathlib.Path) -> str:

This function tries to precompile the Elasticlunr.js search index by invoking nodejs or node. If that fails, the unprocessed index is returned instead and compiled locally on the client side. If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

We currently require Node.js, but we'd welcome PRs that support other JavaScript runtimes or – even better – a Python-based search index generation similar to elasticlunr-rs that could be shipped as part of pdoc.
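The fallback behaviour described above can be tried out without Node.js. The sketch below is a reduced illustration rather than pdoc's implementation: `compile_or_fallback` is a hypothetical name, the size-dependent warning is left out, and a one-line Python script stands in for the JavaScript compile step.

```python
from __future__ import annotations

import json
import subprocess
import sys


def compile_or_fallback(documents: list[dict], command: list[str]) -> str:
    """Pipe the documents through `command`; return the raw JSON if that fails."""
    raw = json.dumps(documents)
    try:
        out = subprocess.check_output(
            command, input=raw.encode(), stderr=subprocess.STDOUT
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True  # marker for a precompiled index
    except Exception:
        # Missing executable, non-zero exit code, or invalid JSON output.
        return raw
    return json.dumps(index)


docs = [{"fullname": "demo.add", "doc": "Add two numbers."}]

# Stand-in "compiler": wraps the documents read from stdin in a JSON object.
STAND_IN = "import json, sys; print(json.dumps({'documents': json.load(sys.stdin)}))"
compiled = compile_or_fallback(docs, [sys.executable, "-c", STAND_IN])

# A command that does not exist triggers the fallback to the raw documents.
fallback = compile_or_fallback(docs, ["definitely-not-a-real-js-runtime", "build.js"])

print(json.loads(compiled)["_isPrebuiltIndex"])
print(json.loads(fallback) == docs)
```

Because both branches return a JSON string, the caller can embed the result in the same place regardless of whether precompilation succeeded; the `_isPrebuiltIndex` marker is what distinguishes the two cases.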