pdoc.search
pdoc has a search box that lets users quickly find relevant parts of the documentation.
This feature is implemented entirely client-side so that pdoc can still be hosted statically,
and it works without any third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc fetches the search index (search.js) and uses it to
answer all subsequent queries.
Single-Page Documentation
If pdoc is documenting only a single module, search functionality is disabled. The browser's built-in page search provides a better user experience in these cases.
Search Coverage
The search functionality covers all documented elements and their docstrings. You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.
Search Performance
pdoc uses Elasticlunr.js to implement search. To improve end-user
performance, pdoc will attempt to precompile the search index when building the documentation. This only works if
nodejs is available; if it is not, pdoc gracefully falls back to building the index client-side.
If your search index reaches a size where compilation times are meaningful and nodejs cannot be invoked,
pdoc prints a notice when building your documentation. In this case it should be enough to install
a recent version of Node.js on your system and make a nodejs or node executable available on your PATH.
There are no other dependencies: pdoc only uses node to interpret a local JS file and does not download any
additional packages.
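To check up front whether precompilation could work on a given machine, a lookup along the following lines may help. It is a minimal sketch that mirrors the nodejs-then-node order used in the module source; the function name `find_node` is hypothetical and not part of pdoc.

```python
from __future__ import annotations

import shutil


def find_node() -> str | None:
    """Return the first Node.js executable name found on PATH, or None."""
    # Hypothetical helper: tries "nodejs" first, then "node".
    for candidate in ("nodejs", "node"):
        if shutil.which(candidate):
            return candidate
    return None


node = find_node()
if node is None:
    print("Node.js not found; the search index would be built in the browser.")
else:
    print(f"Found {node!r}; the search index could be precompiled.")
```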
You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and then checking your browser's developer console.
Search Index Size
The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that
you have HTTP compression and caching enabled. search.js usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB
to 27kB.
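To get a rough feel for how much HTTP compression can save, one could gzip a JSON payload locally and compare sizes. The sketch below uses a made-up, highly repetitive list of documents (the `documents` data is hypothetical), so it will likely compress far better than a real search index; treat the printed ratio as an illustration only.

```python
import gzip
import json

# Hypothetical stand-in for a search index: many small documentation entries.
documents = [
    {
        "fullname": f"demo.func_{i}",
        "kind": "function",
        "doc": "<p>Return the processed value of the argument.</p>",
    }
    for i in range(500)
]

raw = json.dumps(documents).encode()
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)

print(f"{len(raw)} bytes -> {len(compressed)} bytes ({ratio:.0%} of original)")
```

Running the same comparison on your generated search.js should give a more realistic figure than this synthetic data.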
Disabling Search
If you wish to disable the search functionality, you can pass --no-search when invoking pdoc.
"""
pdoc has a search box which allows users to quickly find relevant parts in the documentation.
This feature is implemented entirely client-side so that pdoc can still be hosted statically,
and works without any third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc will fetch the search index (`search.js`) and use that to
answer all upcoming queries.

##### Single-Page Documentation

If pdoc is documenting a single module only, search functionality will be disabled.
The browser's built-in search functionality will provide a better user experience in these cases.

##### Search Coverage

The search functionality covers all documented elements and their docstrings.
You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.

##### Search Performance

pdoc uses [Elasticlunr.js](https://github.com/weixsong/elasticlunr.js) to implement search. To improve end user
performance, pdoc will attempt to precompile the search index when building the documentation. This only works if
`nodejs` is available, and pdoc gracefully falls back to client-side index building if this is not the case.

If your search index reaches a size where compilation times are meaningful and `nodejs` cannot be invoked,
pdoc will let you know and print a notice when building your documentation. In this case it should be enough to install
a recent version of [Node.js](https://nodejs.org/) on your system and make a `nodejs` or `node` available on your PATH.
There are no other additional dependencies. pdoc only uses `node` to interpret a local JS file, it does not download any
additional packages.

You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and
then checking your browser's developer console.

##### Search Index Size

The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that
you have [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression) and caching enabled. `search.js` usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB
to 27kB.

##### Disabling Search

If you wish to disable the search functionality, you can pass `--no-search` when invoking pdoc.
"""

from __future__ import annotations

from collections.abc import Callable
from collections.abc import Mapping
import functools
import html
import json
from pathlib import Path
import shutil
import subprocess
import textwrap

import pdoc.doc
from pdoc.render_helpers import format_signature
from pdoc.render_helpers import to_html
from pdoc.render_helpers import to_markdown


def make_index(
    all_modules: Mapping[str, pdoc.doc.Module],
    is_public: Callable[[pdoc.doc.Doc], bool],
    default_docformat: str,
) -> list[dict]:
    """
    This method compiles all currently documented modules into a pile of documentation JSON objects,
    which can then be ingested by Elasticlunr.js.
    """

    documents = []
    for modname, module in all_modules.items():

        def make_item(doc: pdoc.doc.Doc, **kwargs) -> dict[str, str]:
            # TODO: We could be extra fancy here and split `doc.docstring` by toc sections.
            ret = {
                "fullname": doc.fullname,
                "modulename": doc.modulename,
                "qualname": doc.qualname,
                "kind": doc.kind,
                "doc": to_html(to_markdown(doc.docstring, module, default_docformat)),
                **kwargs,
            }
            return {k: v for k, v in ret.items() if v}

        # TODO: Instead of building our own JSON objects here we could also use module.html.jinja2's member()
        #  implementation to render HTML for each documentation object and then implement a elasticlunr tokenizer that
        #  removes HTML. It wouldn't be great for search index size, but the rendered search entries would be fully
        #  consistent.
        def make_index(mod: pdoc.doc.Namespace, **extra):
            if not is_public(mod):
                return
            yield make_item(mod, **extra)
            for m in mod.own_members:
                if isinstance(m, pdoc.doc.Variable) and is_public(m):
                    yield make_item(
                        m,
                        annotation=html.escape(m.annotation_str),
                        default_value=html.escape(m.default_value_str),
                    )
                elif isinstance(m, pdoc.doc.Function) and is_public(m):
                    if m.name == "__init__":
                        yield make_item(
                            m,
                            signature=format_signature(m.signature_without_self, False),
                        )
                    else:
                        yield make_item(
                            m,
                            signature=format_signature(m.signature, True),
                            funcdef=m.funcdef,
                        )
                elif isinstance(m, pdoc.doc.Class):
                    yield from make_index(
                        m,
                        bases=", ".join(x[2] for x in m.bases),
                    )
                else:
                    pass

        documents.extend(make_index(module))

    return documents


@functools.cache
def node_executable() -> str | None:
    if shutil.which("nodejs"):
        return "nodejs"
    elif shutil.which("node"):
        return "node"
    else:
        return None


def precompile_index(documents: list[dict], compile_js: Path) -> str:
    """
    This method tries to precompile the Elasticlunr.js search index by invoking `nodejs` or `node`.
    If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
    If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

    We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or
    – even better – a Python-based search index generation similar to
    [elasticlunr-rs](https://github.com/mattico/elasticlunr-rs) that could be shipped as part of pdoc.
    """
    raw = json.dumps(documents)
    try:
        node = node_executable()
        if node is None:
            raise FileNotFoundError("No such file or directory: 'node'")
        out = subprocess.check_output(
            [node, compile_js],
            input=raw.encode(),
            cwd=Path(__file__).parent / "templates",
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception as e:
        if len(raw) > 3 * 1024 * 1024:
            print(
                f"pdoc failed to precompile the search index: {e}\n"
                f"Search will work, but may be slower. "
                f"This error may only show up now because your index has reached a certain size. "
                f"See https://pdoc.dev/docs/pdoc/search.html for details."
            )
            if isinstance(e, subprocess.CalledProcessError):
                print(f"{' Node.js Output ':=^80}")
                print(
                    textwrap.indent(e.output.decode("utf8", "replace"), " ").rstrip()
                )
                print("=" * 80)
        return raw
    else:
        return json.dumps(index)
def make_index(all_modules: Mapping[str, pdoc.doc.Module], is_public: Callable[[pdoc.doc.Doc], bool], default_docformat: str) -> list[dict]:
This method compiles all currently documented modules into a pile of documentation JSON objects, which can then be ingested by Elasticlunr.js.
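The shape of a single search document can be sketched without pdoc itself. The example below is a simplified illustration, assuming plain strings in place of pdoc's documentation objects and rendered HTML; `make_search_item` and the sample values are hypothetical. It follows the same idea as the source: collect the common fields plus any extras, then drop every field whose value is empty.

```python
from __future__ import annotations


def make_search_item(
    fullname: str,
    modulename: str,
    qualname: str,
    kind: str,
    doc_html: str,
    **extra: str,
) -> dict[str, str]:
    """Build one search document and drop every field whose value is empty."""
    item = {
        "fullname": fullname,
        "modulename": modulename,
        "qualname": qualname,
        "kind": kind,
        "doc": doc_html,
        **extra,
    }
    # Empty strings are removed so they do not take up space in the index.
    return {key: value for key, value in item.items() if value}


# Hypothetical sample entries: a function with extra fields, and a bare module.
function_item = make_search_item(
    fullname="demo.greet",
    modulename="demo",
    qualname="greet",
    kind="function",
    doc_html="<p>Say hello.</p>",
    signature="(name: str) -> str",
    funcdef="def",
)
module_item = make_search_item("demo", "demo", "", "module", "")

print(function_item)
print(module_item)
```

Dropping empty fields keeps entries such as undocumented modules small, which matters because the whole index is downloaded by the browser.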
def precompile_index(documents: list[dict], compile_js: Path) -> str:
This method tries to precompile the Elasticlunr.js search index by invoking nodejs or node.
If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.
We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or – even better – a Python-based search index generation similar to elasticlunr-rs that could be shipped as part of pdoc.
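Because the function returns a JSON string in both cases, a caller can tell the two outcomes apart by inspecting the decoded value. The following is a minimal sketch based on the source shown on this page: the fallback is the JSON-encoded list of documents, while a precompiled index is a JSON object carrying `_isPrebuiltIndex`. The helper name `is_prebuilt` and both sample payloads are hypothetical; a real precompiled index contains more than the flag.

```python
import json


def is_prebuilt(index_json: str) -> bool:
    """Return True if the JSON string looks like a precompiled search index."""
    data = json.loads(index_json)
    # The raw fallback is a list of documents; a prebuilt index is a dict with the flag set.
    return isinstance(data, dict) and data.get("_isPrebuiltIndex") is True


# Hypothetical payloads standing in for the two possible return values.
raw_fallback = json.dumps([{"fullname": "demo", "kind": "module"}])
prebuilt = json.dumps({"_isPrebuiltIndex": True})

print(is_prebuilt(raw_fallback))
print(is_prebuilt(prebuilt))
```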