pdoc.search
pdoc has a search box which allows users to quickly find relevant parts in the documentation.
This feature is implemented entirely client-side so that pdoc can still be hosted statically,
and works without any third-party services in a privacy-preserving way. When a user focuses the
search box for the first time, pdoc will fetch the search index (search.js) and use that to
answer all upcoming queries.
Search Coverage
The search functionality covers all documented elements and their docstrings. You may find documentation objects using their name, arguments, or type annotations; the source code is not considered.
Search Performance
pdoc uses Elasticlunr.js to implement search. To improve end-user
performance, pdoc will attempt to precompile the search index when building the documentation. This only works if
nodejs is available, and pdoc gracefully falls back to client-side index building if this is not the case.
If your search index reaches a size where compilation times are meaningful (more than 3MB) and nodejs cannot be invoked,
pdoc prints a notice when building your documentation. In this case it should be enough to install
a recent version of Node.js on your system and make a nodejs or node executable available on your PATH.
There are no other dependencies. pdoc only uses node to interpret a local JS file; it does not download any
additional packages.
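Whether precompilation can be attempted may be checked before building. The following is a minimal sketch, assuming only the lookup order visible in pdoc's source (nodejs first, then node). find_node_executable is a hypothetical helper name and is not part of pdoc's API; unlike pdoc, it also verifies that the second candidate exists.

```python
from __future__ import annotations

import shutil


def find_node_executable(
    candidates: tuple[str, ...] = ("nodejs", "node"),
) -> str | None:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_node_executable()
    if found:
        print(f"Node.js found at {found}; precompilation can be attempted.")
    else:
        print("No nodejs/node on PATH; the index would be built client-side.")
```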
You can test if your search index is precompiled by clicking the search box (so that the search index is fetched) and then checking your browser's developer console.
Search Index Size
The search index can be relatively large as it includes all docstrings. For larger projects, you should make sure that
you have HTTP compression and caching enabled. search.js usually
compresses to about 10% of its original size. For example, pdoc's own precompiled search index compresses from 312kB
to 27kB.
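To get a feel for how well an index compresses, the ratio can be estimated locally with Python's gzip module. This is a rough sketch: the sample documents are made up for illustration, compression_ratio is a hypothetical helper, and the ratio a real server achieves depends on its compression algorithm and settings.

```python
import gzip
import json


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload)) / len(payload)


# Made-up stand-in for a search index: many entries with similar docstrings.
sample_documents = [
    {
        "fullname": f"package.module.function_{i}",
        "kind": "function",
        "doc": "<p>Compute a value and return it to the caller.</p>",
    }
    for i in range(500)
]
raw = json.dumps(sample_documents).encode()
ratio = compression_ratio(raw)
print(f"{len(raw)} bytes, compressed to {ratio:.2%} of the original size")
```

To measure a real build, the bytes of the generated search.js could be passed to the helper instead of the sample data.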
Disabling Search
If you wish to disable the search functionality, you can pass --no-search when invoking pdoc.
from __future__ import annotations

from collections.abc import Callable
from collections.abc import Mapping
import html
import json
from pathlib import Path
import shutil
import subprocess
import textwrap

import pdoc.doc
from pdoc.render_helpers import format_signature
from pdoc.render_helpers import to_html
from pdoc.render_helpers import to_markdown
def make_index(
    all_modules: Mapping[str, pdoc.doc.Module],
    is_public: Callable[[pdoc.doc.Doc], bool],
    default_docformat: str,
) -> list[dict]:
    """
    This method compiles all currently documented modules into a pile of documentation JSON objects,
    which can then be ingested by Elasticlunr.js.
    """

    documents = []
    for modname, module in all_modules.items():

        def make_item(doc: pdoc.doc.Doc, **kwargs) -> dict[str, str]:
            # TODO: We could be extra fancy here and split `doc.docstring` by toc sections.
            ret = {
                "fullname": doc.fullname,
                "modulename": doc.modulename,
                "qualname": doc.qualname,
                "kind": doc.kind,
                "doc": to_html(to_markdown(doc.docstring, module, default_docformat)),
                **kwargs,
            }
            return {k: v for k, v in ret.items() if v}

        # TODO: Instead of building our own JSON objects here we could also use module.html.jinja2's member()
        #  implementation to render HTML for each documentation object and then implement a elasticlunr tokenizer that
        #  removes HTML. It wouldn't be great for search index size, but the rendered search entries would be fully
        #  consistent.
        def make_index(mod: pdoc.doc.Namespace, **extra):
            if not is_public(mod):
                return
            yield make_item(mod, **extra)
            for m in mod.own_members:
                if isinstance(m, pdoc.doc.Variable) and is_public(m):
                    yield make_item(
                        m,
                        annotation=html.escape(m.annotation_str),
                        default_value=html.escape(m.default_value_str),
                    )
                elif isinstance(m, pdoc.doc.Function) and is_public(m):
                    if m.name == "__init__":
                        yield make_item(
                            m,
                            signature=format_signature(m.signature_without_self, False),
                        )
                    else:
                        yield make_item(
                            m,
                            signature=format_signature(m.signature, True),
                            funcdef=m.funcdef,
                        )
                elif isinstance(m, pdoc.doc.Class):
                    yield from make_index(
                        m,
                        bases=", ".join(x[2] for x in m.bases),
                    )
                else:
                    pass

        documents.extend(make_index(module))

    return documents
This method compiles all currently documented modules into a pile of documentation JSON objects, which can then be ingested by Elasticlunr.js.
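The shape of a single index entry can be illustrated without pdoc itself. The sketch below mirrors the filtering step in make_item, where fields with empty values are dropped. FakeDoc is a hypothetical stand-in for pdoc.doc.Doc, and the docstring is stored as-is rather than rendered to HTML.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for pdoc.doc.Doc with only the fields used below."""

    fullname: str
    modulename: str
    qualname: str
    kind: str
    docstring: str


def make_item(doc: FakeDoc, **kwargs: str) -> dict[str, str]:
    """Build one search document, omitting fields whose value is empty."""
    ret = {
        "fullname": doc.fullname,
        "modulename": doc.modulename,
        "qualname": doc.qualname,
        "kind": doc.kind,
        "doc": doc.docstring,  # pdoc renders this to HTML; kept raw here
        **kwargs,
    }
    return {k: v for k, v in ret.items() if v}


item = make_item(
    FakeDoc("pkg.mod.answer", "pkg.mod", "answer", "variable", ""),
    annotation=": int",
    default_value="",
)
print(item)
```

In this example the empty doc and default_value fields do not appear in the resulting entry, which keeps the serialized index smaller.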
def precompile_index(documents: list[dict], compile_js: Path) -> str:
    """
    This method tries to precompile the Elasticlunr.js search index by invoking `nodejs` or `node`.
    If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
    If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.

    We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or
    – even better – a Python-based search index generation similar to
    [elasticlunr-rs](https://github.com/mattico/elasticlunr-rs) that could be shipped as part of pdoc.
    """
    raw = json.dumps(documents)
    try:
        if shutil.which("nodejs"):
            executable = "nodejs"
        else:
            executable = "node"
        out = subprocess.check_output(
            [executable, compile_js],
            input=raw.encode(),
            cwd=Path(__file__).parent / "templates",
            stderr=subprocess.STDOUT,
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception as e:
        if len(raw) > 3 * 1024 * 1024:
            print(
                f"pdoc failed to precompile the search index: {e}\n"
                f"Search will work, but may be slower. "
                f"This error may only show up now because your index has reached a certain size. "
                f"See https://pdoc.dev/docs/pdoc/search.html for details."
            )
            if isinstance(e, subprocess.CalledProcessError):
                print(f"{' Node.js Output ':=^80}")
                print(
                    textwrap.indent(e.output.decode("utf8", "replace"), "    ").rstrip()
                )
                print("=" * 80)
        return raw
    else:
        return json.dumps(index)
This method tries to precompile the Elasticlunr.js search index by invoking nodejs or node.
If that fails, an unprocessed index will be returned (which will be compiled locally on the client side).
If this happens and the index is rather large (>3MB), a warning with precompile instructions is printed.
We currently require nodejs, but we'd welcome PRs that support other JavaScript runtimes or – even better – a Python-based search index generation similar to elasticlunr-rs that could be shipped as part of pdoc.
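The fallback behaviour described above can be sketched in isolation. This is a simplified illustration and not pdoc's implementation: compile_with_fallback and is_prebuilt are hypothetical names, the command is passed in as a parameter instead of being looked up, and the warning output is omitted.

```python
from __future__ import annotations

import json
import subprocess


def compile_with_fallback(documents: list[dict], command: list[str]) -> str:
    """Pipe the raw JSON index to `command`; return the raw JSON if that fails."""
    raw = json.dumps(documents)
    try:
        out = subprocess.check_output(
            command, input=raw.encode(), stderr=subprocess.STDOUT
        )
        index = json.loads(out)
        index["_isPrebuiltIndex"] = True
    except Exception:
        # Missing executable, non-zero exit status, or unusable output:
        # hand back the unprocessed documents for client-side indexing.
        return raw
    return json.dumps(index)


def is_prebuilt(index_json: str) -> bool:
    """Report whether a returned index carries the prebuilt marker."""
    data = json.loads(index_json)
    return isinstance(data, dict) and data.get("_isPrebuiltIndex") is True


docs = [{"fullname": "pkg.mod", "kind": "module"}]
# The command name below is deliberately nonexistent to trigger the fallback.
result = compile_with_fallback(docs, ["no-such-executable-for-this-sketch"])
print(is_prebuilt(result))
```

Catching the broad Exception keeps documentation builds working even when the external runtime misbehaves, at the cost of hiding the specific failure unless it is reported separately.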