
pdoc.search

pdoc has a search box that lets users quickly find relevant parts of the documentation. This feature is implemented entirely client-side, so the generated documentation can still be hosted statically and search works in a privacy-preserving way without any third-party services. When a user focuses the search box for the first time, pdoc fetches the search index (search.js) and uses it to answer all subsequent queries.

Single-Page Documentation

If pdoc is documenting a single module only, search functionality will be disabled. The browser's built-in search functionality will provide a better user experience in these cases.

Search Coverage

The search functionality covers all documented elements and their docstrings. You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.
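To make the coverage concrete, here is a minimal sketch of what a single search entry could look like. The field names follow the `make_item` helper in the source listing on this page; the `make_entry` function and the `demo` values are hypothetical and only illustrate the shape, they are not part of pdoc's API.

```python
from __future__ import annotations


def make_entry(
    fullname: str, modulename: str, qualname: str, kind: str, doc: str, **extra: str
) -> dict[str, str]:
    """Build one search entry and drop fields whose value is empty."""
    entry = {
        "fullname": fullname,
        "modulename": modulename,
        "qualname": qualname,
        "kind": kind,
        "doc": doc,
        **extra,
    }
    # Same filtering idea as make_item: empty values are left out of the entry.
    return {key: value for key, value in entry.items() if value}


# A function entry carries its signature in addition to name and docstring.
entry = make_entry(
    fullname="demo.greet",
    modulename="demo",
    qualname="greet",
    kind="function",
    doc="<p>Return a greeting.</p>",
    signature="(name: str) -> str",
    funcdef="def",
)

# A variable without a docstring ends up with no "doc" field at all.
undocumented = make_entry(
    "demo.VERSION", "demo", "VERSION", "variable", doc="", annotation=": str"
)
```

Only the fields shown here are searchable in this sketch; nothing from a function body appears in an entry, which matches the statement that source code is not considered.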

Search Performance

pdoc uses Elasticlunr.js to implement search. To improve end user performance, pdoc will attempt to precompile the search index when building the documentation. This only works if nodejs is available, and pdoc gracefully falls back to client-side index building if this is not the case.

If your search index reaches a size where compilation times are meaningful and nodejs cannot be invoked, pdoc will print a notice when building your documentation. In this case it should be enough to install a recent version of Node.js on your system and make a nodejs or node executable available on your PATH. There are no other dependencies: pdoc only uses node to interpret a local JS file and does not download any additional packages.
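Before building, you can check whether a suitable executable is on your PATH. The following is a small sketch using only the standard library; `find_node_executable` is a hypothetical helper name, and the lookup order (`nodejs` first, then `node`) mirrors the order used in `precompile_index`.

```python
from __future__ import annotations

import shutil


def find_node_executable(
    candidates: tuple[str, ...] = ("nodejs", "node"),
) -> str | None:
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


node = find_node_executable()
if node is None:
    print("No Node.js executable found; the index would be built in the browser.")
else:
    print(f"Node.js found at {node}; precompilation should be possible.")
```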

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and then checking your browser's developer console.

Search Index Size

The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that you have HTTP compression and caching enabled. search.js usually compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB to 27kB.
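To estimate how well an index compresses, you can gzip it locally and compare sizes. This is a rough sketch: `sample_index` is a made-up stand-in for a real index, and the ratio you get for your own `search.js` will depend on your docstrings and on your server's compression settings.

```python
import gzip
import json


def gzip_ratio(data: bytes) -> float:
    """Return the compressed size divided by the original size."""
    if not data:
        raise ValueError("data must not be empty")
    return len(gzip.compress(data)) / len(data)


# Hypothetical stand-in for a search index: many similar documentation entries.
sample_index = json.dumps(
    [
        {
            "fullname": f"demo.func_{i}",
            "kind": "function",
            "doc": "<p>Return the computed value.</p>",
        }
        for i in range(500)
    ]
).encode()

ratio = gzip_ratio(sample_index)
print(f"{len(sample_index)} bytes shrink to {ratio:.1%} of the original size")
```

To measure a real index, pass the bytes of the generated `search.js` from your output directory to `gzip_ratio` instead of `sample_index`.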

Disabling Search

If you wish to disable the search functionality, you can pass --no-search when invoking pdoc.

"""
pdoc has a search box which allows users to quickly find relevant parts in the documentation.
This feature is implemented entirely client-side so that pdoc can still be hosted statically,
and works without any third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc will fetch the search index (`search.js`) and use that to
answer all upcoming queries.

##### Single-Page Documentation

If pdoc is documenting a single module only, search functionality will be disabled.
The browser's built-in search functionality will provide a better user experience in these cases.

##### Search Coverage

The search functionality covers all documented elements and their docstrings.
You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

##### Search Performance

pdoc uses [Elasticlunr.js](https://github.com/weixsong/elasticlunr.js) to implement search. To improve end user
performance, pdoc will attempt to precompile the search index when building the documentation. This only works if
`nodejs` is available, and pdoc gracefully falls back to client-side index building if this is not the case.

If your search index reaches a size where compilation times are meaningful and `nodejs` cannot be invoked,
pdoc will let you know and print a notice when building your documentation. In this case it should be enough to install
a recent version of [Node.js](https://nodejs.org/) on your system and make a `nodejs` or `node` executable available on
your PATH. There are no other dependencies: pdoc only uses `node` to interpret a local JS file and does not download any
additional packages.

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and
then checking your browser's developer console.

##### Search Index Size

The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that
you have [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression) and caching enabled. `search.js` usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB
to 27kB.

##### Disabling Search

If you wish to disable the search functionality, you can pass `--no-search` when invoking pdoc.
"""

from __future__ import annotations

from collections.abc import Callable
from collections.abc import Mapping
import html
import json
from pathlib import Path
import shutil
import subprocess
import textwrap

import pdoc.doc
from pdoc.render_helpers import format_signature
from pdoc.render_helpers import to_html
from pdoc.render_helpers import to_markdown


def make_index(
    all_modules: Mapping[str, pdoc.doc.Module],
    is_public: Callable[[pdoc.doc.Doc], bool],
    default_docformat: str,
) -> list[dict]:
    """
    This method compiles all currently documented modules into a pile of documentation JSON objects,
    which can then be ingested by Elasticlunr.js.
    """

    documents = []
    for modname, module in all_modules.items():

        def make_item(doc: pdoc.doc.Doc, **kwargs) -> dict[str, str]:
            # TODO: We could be extra fancy here and split `doc.docstring` by toc sections.
            ret = {
                "fullname": doc.fullname,
                "modulename": doc.modulename,
                "qualname": doc.qualname,
                "kind": doc.kind,
                "doc": to_html(to_markdown(doc.docstring, module, default_docformat)),
                **kwargs,
            }
            # Drop keys with empty values.
            return {k: v for k, v in ret.items() if v}

        # TODO: Instead of building our own JSON objects here we could also use module.html.jinja2's member()
        #  implementation to render HTML for each documentation object and then implement an elasticlunr tokenizer that
        #  removes HTML. It wouldn't be great for search index size, but the rendered search entries would be fully
        #  consistent.
        def make_index(mod: pdoc.doc.Namespace, **extra):
            # Note: this nested generator shadows the outer function name inside the loop body.
            if not is_public(mod):
                return
            yield make_item(mod, **extra)
            for m in mod.own_members:
                if isinstance(m, pdoc.doc.Variable) and is_public(m):
                    yield make_item(
                        m,
                        annotation=html.escape(m.annotation_str),
                        default_value=html.escape(m.default_value_str),
                    )
                elif isinstance(m, pdoc.doc.Function) and is_public(m):
                    if m.name == "__init__":
                        yield make_item(
                            m,
                            signature=format_signature(m.signature_without_self, False),
                        )
                    else:
                        yield make_item(
                            m,
                            signature=format_signature(m.signature, True),
                            funcdef=m.funcdef,
                        )
                elif isinstance(m, pdoc.doc.Class):
                    # Classes are namespaces too: recurse (the is_public check happens at the top of the call).
                    yield from make_index(
                        m,
                        bases=", ".join(x[2] for x in m.bases),
                    )
                else:
                    pass

        documents.extend(make_index(module))

    return documents


def precompile_index(documents: list[dict], compile_js: Path) -> str:
    """
    This method tries to precompile the Elasticlunr.js search index by invoking `nodejs` or `node`.
    If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
    If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

    We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or
    – even better – a Python-based search index generation similar to
    [elasticlunr-rs](https://github.com/mattico/elasticlunr-rs) that could be shipped as part of pdoc.
    """
    raw = json.dumps(documents)
    try:
        # Prefer `nodejs` if it is on PATH; otherwise try `node`.
        if shutil.which("nodejs"):
            executable = "nodejs"
        else:
            executable = "node"
        out = subprocess.check_output(
            [executable, compile_js],
            input=raw.encode(),
            cwd=Path(__file__).parent / "templates",
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception as e:
        # Covers a missing executable, a non-zero exit code, and output that is not valid JSON.
        if len(raw) > 3 * 1024 * 1024:
            print(
                f"pdoc failed to precompile the search index: {e}\n"
                f"Search will work, but may be slower. "
                f"This error may only show up now because your index has reached a certain size. "
                f"See https://pdoc.dev/docs/pdoc/search.html for details."
            )
            if isinstance(e, subprocess.CalledProcessError):
                print(f"{' Node.js Output ':=^80}")
                print(
                    textwrap.indent(e.output.decode("utf8", "replace"), "    ").rstrip()
                )
                print("=" * 80)
        return raw
    else:
        return json.dumps(index)

def make_index(all_modules: collections.abc.Mapping[str, pdoc.doc.Module], is_public: collections.abc.Callable[[pdoc.doc.Doc], bool], default_docformat: str) -> list[dict]:

This function compiles all currently documented modules into a list of JSON-serializable documentation objects, which can then be ingested by Elasticlunr.js.
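The traversal behind this function can be illustrated with a simplified sketch. `Node`, `walk`, and the underscore-based `is_public` rule below are hypothetical stand-ins, not pdoc's real classes or visibility logic; the sketch only shows how a generator with `yield from` emits one entry per public object and recurses into classes.

```python
from __future__ import annotations

from collections.abc import Iterator
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a pdoc documentation object."""

    name: str
    kind: str  # "module", "class", "function" or "variable"
    members: list[Node] = field(default_factory=list)


def is_public(node: Node) -> bool:
    # Simplified rule for this sketch: a leading underscore means private.
    return not node.name.startswith("_")


def walk(namespace: Node, prefix: str = "") -> Iterator[dict[str, str]]:
    """Yield one entry per public object, recursing into classes."""
    if not is_public(namespace):
        return
    fullname = f"{prefix}{namespace.name}"
    yield {"fullname": fullname, "kind": namespace.kind}
    for member in namespace.members:
        if member.kind == "class":
            # Nested namespaces are handled by the same generator.
            yield from walk(member, prefix=f"{fullname}.")
        elif is_public(member):
            yield {"fullname": f"{fullname}.{member.name}", "kind": member.kind}


tree = Node(
    "demo",
    "module",
    [
        Node("VERSION", "variable"),
        Node("_helper", "function"),
        Node("Greeter", "class", [Node("greet", "function")]),
        Node("_Hidden", "class", [Node("run", "function")]),
    ],
)

names = [entry["fullname"] for entry in walk(tree)]
```

Private members and everything inside a private class are skipped, so `names` contains only `demo`, `demo.VERSION`, `demo.Greeter`, and `demo.Greeter.greet`.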

def precompile_index(documents: list[dict], compile_js: pathlib.Path) -> str:

This function tries to precompile the Elasticlunr.js search index by invoking nodejs or node. If that fails, the unprocessed index is returned instead and is compiled in the browser on the client side. If this happens and the index is rather large (>3MB), a warning that points to the precompilation instructions is printed.

We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or – even better – a Python-based search index generation similar to elasticlunr-rs that could be shipped as part of pdoc.
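The two possible return values can be told apart programmatically. In the source above, the fallback path returns the raw document list (a JSON array), while the success path returns a JSON object flagged with `_isPrebuiltIndex`. The helper below is a sketch based on that observation; `is_precompiled` is a hypothetical name, and the sample payloads are made up for illustration.

```python
import json


def is_precompiled(index_json: str) -> bool:
    """Return True if the JSON looks like a prebuilt index rather than raw documents."""
    data = json.loads(index_json)
    # Raw fallback: a list of documents. Prebuilt: an object with the flag set.
    return isinstance(data, dict) and data.get("_isPrebuiltIndex") is True


# Made-up payloads that mimic the two shapes returned by precompile_index.
raw_fallback = json.dumps([{"fullname": "demo", "kind": "module"}])
prebuilt = json.dumps({"_isPrebuiltIndex": True, "index": {}})

print(is_precompiled(raw_fallback), is_precompiled(prebuilt))
```

This checks the string returned by `precompile_index` at build time; for a deployed site, the browser's developer console remains the documented way to verify precompilation.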