
pdoc.search

pdoc has a search box that lets users quickly find relevant parts of the documentation. The feature is implemented entirely client-side, so documentation built with pdoc can still be hosted statically and search works without third-party services, which preserves user privacy. When a user focuses the search box for the first time, pdoc fetches the search index (search.js) and uses it to answer all subsequent queries.

Single-Page Documentation

If pdoc is documenting a single module only, search functionality will be disabled. The browser's built-in search functionality will provide a better user experience in these cases.

Search Coverage

The search functionality covers all documented elements and their docstrings. You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

Search Performance

pdoc uses Elasticlunr.js to implement search. To improve end-user performance, pdoc attempts to precompile the search index when building the documentation. This only works if Node.js is available as a nodejs or node executable; otherwise pdoc gracefully falls back to building the index client-side.

If your search index reaches a size where compilation times are meaningful (the current source uses a threshold of 3 MB of raw index data) and nodejs cannot be invoked, pdoc prints a notice when building your documentation. In this case, it should be enough to install a recent version of Node.js on your system and make a nodejs or node executable available on your PATH. There are no other dependencies: pdoc only uses node to interpret a local JS file and does not download any additional packages.
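Before building, it can be handy to check whether a Node.js executable is reachable from Python at all. The following is a minimal sketch, not pdoc's own code: `find_node_executable` is a hypothetical helper that looks for `nodejs` first and `node` second using the standard library's `shutil.which`.

```python
from __future__ import annotations

import shutil


def find_node_executable() -> str | None:
    """Return the path of a Node.js binary on PATH, or None if there is none.

    Hypothetical helper for illustration; it is not part of pdoc.
    """
    # Try `nodejs` first, then `node`, the two names mentioned above.
    for name in ("nodejs", "node"):
        path = shutil.which(name)
        if path:
            return path
    return None


node_path = find_node_executable()
print(node_path or "No Node.js executable found on PATH")
```

If this prints a path, precompilation should be able to find the executable; if not, installing Node.js or fixing PATH is the likely remedy.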

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and then checking your browser's developer console.
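On the build side, the source of `precompile_index` shown further down marks a precompiled index by setting an `_isPrebuiltIndex` key, while the fallback is the plain JSON list of documents. A small sketch of telling the two apart follows. It assumes you have the JSON string returned by `precompile_index` at hand (not the `search.js` file itself, whose wrapper format is not described here), and `is_prebuilt` is a hypothetical helper name.

```python
import json


def is_prebuilt(index_json: str) -> bool:
    """Return True if the JSON payload looks like a precompiled index.

    Hypothetical helper: it only checks for the `_isPrebuiltIndex` marker
    that `precompile_index` sets on success.
    """
    data = json.loads(index_json)
    return isinstance(data, dict) and data.get("_isPrebuiltIndex") is True


# Made-up payloads standing in for the two possible return values.
raw_index = json.dumps([{"fullname": "pkg.mod", "kind": "module"}])
prebuilt_index = json.dumps({"_isPrebuiltIndex": True})

print(is_prebuilt(raw_index), is_prebuilt(prebuilt_index))
```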

Search Index Size

The search index can be relatively large because it includes all docstrings. For larger projects, you should make sure that HTTP compression and caching are enabled on your server. search.js usually compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312 kB to 27 kB.
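In the example above, 27 kB out of 312 kB is roughly 9% of the original size. To get a feel for how well your own index might compress, you can gzip a JSON payload locally. This is only a sketch: the documents are made up, `compression_ratio` is a hypothetical helper, and the ratio your web server achieves depends on its compression algorithm and settings.

```python
import gzip
import json


def compression_ratio(payload: str) -> float:
    """Return the gzip-compressed size divided by the original size in bytes.

    Expects a non-empty payload.
    """
    raw = payload.encode("utf-8")
    return len(gzip.compress(raw)) / len(raw)


# Made-up documents standing in for a real search index.
documents = [
    {"fullname": f"pkg.mod.func_{i}", "kind": "function", "doc": "<p>Returns the answer.</p>"}
    for i in range(500)
]
ratio = compression_ratio(json.dumps(documents))
print(f"compressed to {ratio:.0%} of the original size")
```

Highly repetitive sample data like this compresses far better than real docstrings, so treat the number as an upper bound on what to expect.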

Disabling Search

If you wish to disable the search functionality, you can pass --no-search when invoking pdoc.

"""
pdoc has a search box which allows users to quickly find relevant parts in the documentation.
This feature is implemented entirely client-side so that pdoc can still be hosted statically,
and works without any third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc will fetch the search index (`search.js`) and use that to
answer all upcoming queries.

##### Single-Page Documentation

If pdoc is documenting a single module only, search functionality will be disabled.
The browser's built-in search functionality will provide a better user experience in these cases.

##### Search Coverage

The search functionality covers all documented elements and their docstrings.
You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

##### Search Performance

pdoc uses [Elasticlunr.js](https://github.com/weixsong/elasticlunr.js) to implement search. To improve end user
performance, pdoc will attempt to precompile the search index when building the documentation. This only works if
`nodejs` is available, and pdoc gracefully falls back to client-side index building if this is not the case.

If your search index reaches a size where compilation times are meaningful and `nodejs` cannot be invoked,
pdoc will let you know and print a notice when building your documentation. In this case it should be enough to install
a recent version of [Node.js](https://nodejs.org/) on your system and make a `nodejs` or `node` available on your PATH.
There are no other additional dependencies. pdoc only uses `node` to interpret a local JS file, it does not download any
additional packages.

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and
then checking your browser's developer console.

##### Search Index Size

The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that
you have [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression) and caching enabled. `search.js` usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB
to 27kB.

##### Disabling Search

If you wish to disable the search functionality, you can pass `--no-search` when invoking pdoc.
"""
from __future__ import annotations

from collections.abc import Callable
from collections.abc import Mapping
import html
import json
from pathlib import Path
import shutil
import subprocess
import textwrap

import pdoc.doc
from pdoc.render_helpers import format_signature
from pdoc.render_helpers import to_html
from pdoc.render_helpers import to_markdown


def make_index(
    all_modules: Mapping[str, pdoc.doc.Module],
    is_public: Callable[[pdoc.doc.Doc], bool],
    default_docformat: str,
) -> list[dict]:
    """
    This method compiles all currently documented modules into a pile of documentation JSON objects,
    which can then be ingested by Elasticlunr.js.
    """

    documents = []
    for modname, module in all_modules.items():

        def make_item(doc: pdoc.doc.Doc, **kwargs) -> dict[str, str]:
            # TODO: We could be extra fancy here and split `doc.docstring` by toc sections.
            ret = {
                "fullname": doc.fullname,
                "modulename": doc.modulename,
                "qualname": doc.qualname,
                "kind": doc.kind,
                "doc": to_html(to_markdown(doc.docstring, module, default_docformat)),
                **kwargs,
            }
            return {k: v for k, v in ret.items() if v}

        # TODO: Instead of building our own JSON objects here we could also use module.html.jinja2's member()
        #  implementation to render HTML for each documentation object and then implement a elasticlunr tokenizer that
        #  removes HTML. It wouldn't be great for search index size, but the rendered search entries would be fully
        #  consistent.
        def make_index(mod: pdoc.doc.Namespace, **extra):
            if not is_public(mod):
                return
            yield make_item(mod, **extra)
            for m in mod.own_members:
                if isinstance(m, pdoc.doc.Variable) and is_public(m):
                    yield make_item(
                        m,
                        annotation=html.escape(m.annotation_str),
                        default_value=html.escape(m.default_value_str),
                    )
                elif isinstance(m, pdoc.doc.Function) and is_public(m):
                    if m.name == "__init__":
                        yield make_item(
                            m,
                            signature=format_signature(m.signature_without_self, False),
                        )
                    else:
                        yield make_item(
                            m,
                            signature=format_signature(m.signature, True),
                            funcdef=m.funcdef,
                        )
                elif isinstance(m, pdoc.doc.Class):
                    yield from make_index(
                        m,
                        bases=", ".join(x[2] for x in m.bases),
                    )
                else:
                    pass

        documents.extend(make_index(module))

    return documents


def precompile_index(documents: list[dict], compile_js: Path) -> str:
    """
    This method tries to precompile the Elasticlunr.js search index by invoking `nodejs` or `node`.
    If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
    If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

    We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or
    – even better – a Python-based search index generation similar to
    [elasticlunr-rs](https://github.com/mattico/elasticlunr-rs) that could be shipped as part of pdoc.
    """
    raw = json.dumps(documents)
    try:
        if shutil.which("nodejs"):
            executable = "nodejs"
        else:
            executable = "node"
        out = subprocess.check_output(
            [executable, compile_js],
            input=raw.encode(),
            cwd=Path(__file__).parent / "templates",
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception as e:
        if len(raw) > 3 * 1024 * 1024:
            print(
                f"pdoc failed to precompile the search index: {e}\n"
                f"Search will work, but may be slower. "
                f"This error may only show up now because your index has reached a certain size. "
                f"See https://pdoc.dev/docs/pdoc/search.html for details."
            )
            if isinstance(e, subprocess.CalledProcessError):
                print(f"{' Node.js Output ':=^80}")
                print(
                    textwrap.indent(e.output.decode("utf8", "replace"), "    ").rstrip()
                )
                print("=" * 80)
        return raw
    else:
        return json.dumps(index)

def make_index(all_modules: collections.abc.Mapping[str, pdoc.doc.Module], is_public: collections.abc.Callable[[pdoc.doc.Doc], bool], default_docformat: str) -> list[dict]:

This method compiles all currently documented modules into a pile of documentation JSON objects, which can then be ingested by Elasticlunr.js.
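To illustrate the shape of the documents this produces, here is a simplified sketch of the item-building step. It is not pdoc's implementation: `make_item_sketch` is a hypothetical stand-in that takes plain strings instead of `pdoc.doc.Doc` objects and skips the docstring-to-HTML rendering. It keeps the one detail worth noticing, namely that empty fields are dropped from each document.

```python
from __future__ import annotations


def make_item_sketch(
    fullname: str,
    modulename: str,
    qualname: str,
    kind: str,
    doc: str,
    **extra: str,
) -> dict[str, str]:
    """Build one search document, omitting any empty fields."""
    ret = {
        "fullname": fullname,
        "modulename": modulename,
        "qualname": qualname,
        "kind": kind,
        "doc": doc,
        **extra,  # e.g. signature, annotation, default_value, bases
    }
    # Drop falsy values so that empty docstrings etc. do not bloat the index.
    return {k: v for k, v in ret.items() if v}


# A made-up function without a docstring: the "doc" key is omitted entirely.
item = make_item_sketch(
    "pkg.mod.answer", "pkg.mod", "answer", "function", "", signature="() -> int"
)
print(item)
```

Dropping empty fields keeps each document small, which matters because the whole index is shipped to the browser.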

def precompile_index(documents: list[dict], compile_js: pathlib.Path) -> str:

This method tries to precompile the Elasticlunr.js search index by invoking nodejs or node. If that fails, an unprocessed index will be returned (which will be compiled locally on the client side). If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or – even better – a Python-based search index generation similar to elasticlunr-rs that could be shipped as part of pdoc.
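The try-then-fall-back behaviour described above can be sketched in isolation. This is a simplified illustration rather than pdoc's code: `precompile_or_fallback` is a hypothetical function that accepts an arbitrary command instead of locating Node.js and a compile script, and it omits the size-dependent warning.

```python
from __future__ import annotations

import json
import subprocess


def precompile_or_fallback(documents: list[dict], command: list[str]) -> str:
    """Pipe the raw documents through `command`; return the raw JSON on any failure."""
    raw = json.dumps(documents)
    try:
        out = subprocess.check_output(
            command,
            input=raw.encode(),
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        # Mark the result so a consumer can tell it apart from the raw list.
        index["_isPrebuiltIndex"] = True
    except Exception:
        # Missing executable, non-zero exit code, or unparsable output:
        # hand back the unprocessed documents instead.
        return raw
    return json.dumps(index)


# Made-up input and a command that is assumed not to exist on this system.
docs = [{"fullname": "pkg.mod", "kind": "module"}]
result = precompile_or_fallback(docs, ["definitely-not-a-real-executable-xyz"])
print(result == json.dumps(docs))
```

Because every failure path returns the raw documents, the caller always receives a usable string; the only difference is where the index ends up being compiled.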