Documentation Formatting#
Documentation is written as Markdown files in the /docs folder. It is built using Sphinx, with MyST for Markdown parsing.
Adding User Guides#
Articles for the user guide live in /docs/user_guide and its sub-directories. To add a new article, add a new Markdown file, then add it to a Sphinx table of contents so that Sphinx knows where to place it in the document hierarchy. (Sphinx does not require the directory structure to match the document structure.)
The TOC for the user guide is in /docs/user_guide/index.md. Add a new line with the path of the new file to the appropriate sub-section.
API Docs#
API reference documentation is automatically generated from our Python source in /pysrc during the Sphinx build process via the sphinx-autodoc2 extension. The build process turns the source into automatically generated Markdown files in /docs/apidocs, which are then passed through the Sphinx builder.
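For orientation, a configuration along these lines is what enables that pipeline. This is a sketch and not a copy of this repo's /docs/conf.py: the option names are the ones documented by sphinx-autodoc2, but the package path ../pysrc/bytewax is an assumption.

```python
# Sketch of the relevant part of a Sphinx conf.py (values are assumptions).

extensions = [
    "myst_parser",  # parse Markdown sources with MyST
    "autodoc2",  # generate API docs from Python source
]

# Package(s) to analyze, relative to the directory containing conf.py.
# Assumed location of the Python package inside /pysrc.
autodoc2_packages = ["../pysrc/bytewax"]

# Emit MyST Markdown instead of the default reStructuredText.
autodoc2_render_plugin = "myst"

# Directory (relative to the docs source) for the generated files.
autodoc2_output_dir = "apidocs"
```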
Local Testing#
Checked into this repo are some tools to allow you to quickly iterate on documentation without needing to make a PR or release.
Pipenv#
pipenv is used to manage the blessed environment for building documentation because it supports creating reproducible environments. It lets you pin a specific version of the Python interpreter and all packages. These are saved in /docs/Pipfile and /docs/Pipfile.lock.
You should install the latest version of pipenv globally; it will use the lockfile in the current directory. It won’t affect any other Python environment tooling.
$ pip install pipenv
pipenv will automatically use pyenv to install the correct version of the Python interpreter to make things consistent.
You should use the two build scripts below, which automatically set up pipenv for you.
Auto Builder#
The quickest way to iterate on your docs is to run the ./autobuild.sh script from the /docs directory.
$ cd docs
$ ./autobuild.sh
Creating a virtualenv for this project...
...
All dependencies are now up-to-date!
...
Running Sphinx v7.2.6
...
[I 240124 12:03:11 server:335] Serving on http://127.0.0.1:8000
The temporary built HTML files are put in /docs/_build. This directory should not be checked in. Production docs are built using Read the Docs and served from there.
The script starts a web server at http://localhost:8000/ with the built docs, watches the source files, and rebuilds on any change.
Warning
The watching mechanism sometimes gets confused and trapped in an infinite loop, constantly rebuilding the docs even though nothing has changed. I think it has something to do with the fact that the Sphinx build process generates Markdown files for the API docs.
If you stop it with Ctrl-C and start it again, the loop will stop.
MyST Cheat Sheet#
Here’s a quick rundown of common things in MyST flavored Markdown.
Docstrings#
Docstrings can be written using all the features of MyST. Some docstring-specific hints are provided here.
Arguments#
Docstrings should use MyST Markdown as the text body. Arguments, return values, etc. are specified using MyST field lists, which are lines of the form :name value: Description. The field names should be the same as those used by Sphinx.
def my_func(x: int, y: str) -> str:
    """Do the cool thing.

    :arg x: Describe the X parameter.

    :arg y: Describe the Y parameter. If this is a really long line,
        you can wrap it with indentation. You can also use any
        **syntax** here you like.

    :returns: A description of the return value.

    """
    ...
If the function signature is coming from PyO3 (and thus there are no type hints in the code), you can use the :type var: and :rtype: fields to provide argument and return value type hints.
/// Do the cool thing.
///
/// :arg x: Describe the X parameter.
///
/// :type x: int
///
/// :arg y: Describe the Y parameter. If this is a really long line,
///     you can wrap it with indentation. You can also use any
///     **syntax** here you like.
///
/// :type y: str
///
/// :returns: A description of the return value.
///
/// :rtype: str
#[pyfunction]
fn my_func(x: usize, y: String) -> String {
    todo!();
}
Class and Module Variables#
You can add “post-variable docstrings” to document these.
from dataclasses import dataclass
from typing import TypeVar


X = TypeVar("X")
"""Type of a cool thing."""


@dataclass
class Container:
    x: int
    """This is the docstring for this attribute."""

    y: str
    """This is the docstring for this other attribute."""
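Python does not keep these strings at runtime (they do not end up in any `__doc__`), so documentation tools have to find them by reading the source. The following is a minimal sketch of that idea using only the standard library `ast` module. It is not how sphinx-autodoc2 is implemented, and `find_variable_docstrings` is a hypothetical helper name.

```python
import ast

SOURCE = '''
from dataclasses import dataclass

X = 1
"""Doc for a module variable."""


@dataclass
class Container:
    x: int
    """This is the docstring for this attribute."""

    y: str
    """This is the docstring for this other attribute."""
'''


def find_variable_docstrings(source: str) -> dict:
    """Map variable names to the string literal that directly follows them."""
    found = {}

    def scan(body, prefix):
        # Look at each statement together with the one right after it.
        for stmt, nxt in zip(body, body[1:]):
            is_doc = (
                isinstance(nxt, ast.Expr)
                and isinstance(nxt.value, ast.Constant)
                and isinstance(nxt.value.value, str)
            )
            if not is_doc:
                continue
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                found[prefix + stmt.target.id] = nxt.value.value
            elif isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        found[prefix + target.id] = nxt.value.value
        # Recurse into classes so attributes are found too.
        for stmt in body:
            if isinstance(stmt, ast.ClassDef):
                scan(stmt.body, prefix + stmt.name + ".")

    scan(ast.parse(source).body, "")
    return found


docs = find_variable_docstrings(SOURCE)
for name, doc in docs.items():
    print(f"{name}: {doc}")
```

The string has to come immediately after the assignment; a string separated from it by another statement is not picked up by this sketch.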
Cross References#
See MyST’s documentation on cross referencing for all the ways this can work. I’ll give a quick summary here.
A Specific Section#
The system does not automatically generate xref links for headings. You can manually add a reference name to any heading via the (xref-name)= syntax just before it. In general, just add refs for sections you know you want to reference elsewhere.
(xref-specific-section)=
### A Specific Section
You can then reference it via normal Markdown link syntax with the URI being just #xref-name.
Read [how to link to a specific section](#xref-specific-section)
Appears as:
Read how to link to a specific section
Or use the autolink syntax with the scheme project: and then a #xref-name.
Read about linking to <project:#xref-specific-section>
Appears as:
Read about linking to A Specific Section
Warning
All of your reference names must start with xref by convention to ensure that they are globally unique across all Sphinx domains. Unfortunately, MyST’s Markdown link xref resolver does not let you specify Sphinx domains and tries to resolve everything using the all directive, so it’s possible that the name you pick would clash with the name of a file (clashing with the doc domain) or a Python module (clashing with the py domain) and you get multiple targets. Prefixing them with xref means that we are less likely to clash.
Perhaps one day MyST will provide a syntax for unambiguously specifying an xref when they fix this issue.
Note
The link URI must either start with a # and be a global Sphinx reference, or be a path. You can’t mix and match. This will not work:
Read [how to link to a specific section](/user_guide/contributing/writing-docs.md#xref-specific-section)
Instead, make an explicit reference target with (xref-name)=.
Other Markdown Files#
To link to an entire article, add an xref to the main header in the file and link to that. Use the steps and syntax above.
API Docs#
To link to a symbol in the Bytewax library, use the full dotted path to it surrounded by ` and preceded by {py:obj}.
This operator returns a {py:obj}`bytewax.dataflow.Stream`.
Appears as:
This operator returns a bytewax.dataflow.Stream.
You should always use the full dotted path to reference a name, but if you don’t want it to appear as a full dotted path because of the context of the surrounding text, prefix the path with ~.
This operator returns a {py:obj}`~bytewax.dataflow.Stream`.
Appears as:
This operator returns a Stream.
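Since docstrings are MyST too, the same syntax works inside them. A minimal sketch: `make_stream` is a hypothetical function used only for illustration and is not part of Bytewax.

```python
def make_stream(flow):
    """Build the first step of a dataflow.

    :arg flow: The {py:obj}`~bytewax.dataflow.Dataflow` to add steps to.

    :returns: A {py:obj}`~bytewax.dataflow.Stream` of items. The `~`
        makes the link text render as just `Stream`.

    """
    ...
```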
Intersphinx#
Intersphinx is Sphinx’s system for connecting different documentation systems together. The Sphinx config already connects a few of our dependencies, including the Python standard library.
API Docs#
For most external Python types, you can use the same xref syntax as within Bytewax:
See the standard library function {py:obj}`functools.reduce`.
Appears as:
See the standard library function functools.reduce.
Other References#
Other references use a more explicit system. You use URIs starting with inv:, then the name of the inventory in the /docs/conf.py intersphinx_mapping, then the domain and object type, then # and the item name.
Learn about [how to use lambdas](inv:python:std:label#tut-lambda).
Appears as:
Learn about how to use lambdas.
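In the example above, python is the inventory name, std the domain, label the object type, and tut-lambda the item name. As a sketch of how the pieces line up: the mapping entry below is the conventional one for the Python standard library and is an assumption about this repo's conf.py, and `inv_uri` is a hypothetical helper that only illustrates the order of the URI parts.

```python
# Assumed excerpt of /docs/conf.py: inventory name -> (base URL, inventory file).
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
}


def inv_uri(inventory: str, domain: str, obj_type: str, name: str) -> str:
    """Assemble a MyST `inv:` link target from its parts."""
    if inventory not in intersphinx_mapping:
        raise KeyError(f"unknown inventory: {inventory!r}")
    return f"inv:{inventory}:{domain}:{obj_type}#{name}"


uri = inv_uri("python", "std", "label", "tut-lambda")
print(f"[how to use lambdas]({uri})")
```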
Finding Reference Names#
If you don’t know the exact xref incantation, you can use the included dump tool to fuzzy search with grep or fzf over all the xrefs to find the one you want.
$ PIPENV_IGNORE_VIRTUALENVS=1 pipenv run python ./intersphinxdump.py | fzf
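The contents of intersphinxdump.py are not shown here. As a rough, standard-library-only sketch of what such a dump involves, the following parses the version 2 objects.inv format that Sphinx publishes: four plain-text header lines followed by zlib-compressed entries. The function name `dump_inventory` and the sample data are made up for illustration.

```python
import re
import zlib

# Entry layout used by Sphinx: name, domain:role, priority, uri, display name.
ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def dump_inventory(data: bytes):
    """Yield (domain:role, name, uri) for each entry in a v2 objects.inv."""
    # Four header lines, then everything else is the compressed payload.
    *header, body = data.split(b"\n", 4)
    if header[0].strip() != b"# Sphinx inventory version 2":
        raise ValueError("not a version 2 inventory")
    for line in zlib.decompress(body).decode("utf-8").splitlines():
        match = ENTRY.match(line.rstrip())
        if match is None:
            continue
        name, kind, _priority, uri, _dispname = match.groups()
        # A trailing "$" in the URI is shorthand for the entry name.
        if uri.endswith("$"):
            uri = uri[:-1] + name
        yield kind, name, uri


# Tiny hand-made inventory so the sketch runs without network access.
sample = (
    b"# Sphinx inventory version 2\n"
    b"# Project: Example\n"
    b"# Version: 1.0\n"
    b"# The remainder of this file is compressed using zlib.\n"
) + zlib.compress(
    b"functools.reduce py:function 1 library/functools.html#$ -\n"
    b"tut-lambda std:label -1 tutorial/controlflow.html#tut-lambda Lambda Expressions\n"
)

entries = list(dump_inventory(sample))
for kind, name, uri in entries:
    print(f"{kind}\t{name}\t{uri}")
```

One entry per line is what makes the output convenient to pipe into grep or fzf.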
Example Code#
Use backtick code blocks with the python language type.
```python
from bytewax.dataflow import Dataflow
flow = Dataflow("doc_df")
```
Appears as:
from bytewax.dataflow import Dataflow

flow = Dataflow("doc_df")
Shell Sessions#
Use the language type console (instead of bash), and start commands you run with $ to get proper highlighting.
```console
$ waxctl list
output here
```
Appears as:
$ waxctl list
output here
Mermaid Diagrams#
We have installed the Sphinx sphinxcontrib-mermaid plugin, which allows you to use mermaid as a code block language name.
```mermaid
graph TD
I1[Kafka Consumer `users`] --> D1[Users Deserializer] --> K1[Key on User ID]
I2[Kafka Consumer `transactions`] --> D2[Transactions Deserializer] --> K2[Key on User ID]
K1 & K2 --> J1[Join on User ID] --> V[Validator] --> S[Enriched Serializer] --> O1[Kafka Producer `enriched_txns`]
V --> O2[Kafka Producer `enriched_txns_dead_letter_queue`]
```
Appears as:
Doctests#
pytest is set up to use Sybil to attempt to run all Python code blocks in our documentation. This is so we catch documentation we forget to update as we advance the API.
Running pytest will run over:
- All documentation examples in Markdown files in /docs.
- All examples in docstrings in /pysrc.
Docstrings from PyO3 are tested via the stubs file in /pysrc/bytewax/_bytewax.pyi. You must rebuild stubs to test these.
Plain Code Blocks#
If you have a plain Python code block, the code will be run to ensure it raises no exceptions, but no output will be checked.
```python
x = 1 + 1
```
Appears as:
x = 1 + 1
Doctest Code Blocks#
If you want to test the output, make it a doctest-style code block using the doctest directive. Prefix each line of input with >>> and put the expected output on the following lines.
```{doctest}
>>> 1 + 1
2
```
Appears as:
>>> 1 + 1
2
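To make the comparison concrete, here is a small sketch using the standard library doctest module rather than Sybil, so it is not how our test suite is wired up. It parses a >>> example, runs each input line, and compares what was produced against the expected lines.

```python
import doctest

EXAMPLE = """
>>> 1 + 1
2
>>> "a" * 3
'aaa'
"""

parser = doctest.DocTestParser()
# Arguments: text, globals, name, filename, line number.
test = parser.get_doctest(EXAMPLE, {}, "example", None, 0)

runner = doctest.DocTestRunner(verbose=False)
# Discard the failure report; only the counts are used here.
results = runner.run(test, out=lambda text: None)

print(f"attempted={results.attempted} failed={results.failed}")
```

Changing an expected line (for example `2` to `3`) makes the failed count non-zero, which is the signal a test runner turns into a test failure.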
Test Code Blocks#
Note
Sybil doesn’t fully support this yet, but I’m going to write it here for when it does. We should still write this style of example, but these examples will not be automatically tested.
If you’d like to have some commentary between the example code and the output, use the testcode and testoutput directives. The testcode block will automatically be highlighted as Python code.
Here's some pre-commentary.
```{testcode}
1 + 1
```
Here's some middle-commentary.
```{testoutput}
2
```
Here's some post-commentary.
Appears as:
Here’s some pre-commentary.
1 + 1
Here’s some middle-commentary.
2
Here’s some post-commentary.
Skipping#
To skip a doctest, use a Sybil skip comment. Add a line with % skip: next right above the code block. The entire code block will not be run.
% skip: next
```python
invalid code
```
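To illustrate the behaviour described in this section (plain blocks are executed, their output is not checked, and % skip: next suppresses the following block), here is a simplified standard-library sketch. It is not Sybil's implementation; `run_python_blocks` and the sample document are hypothetical.

```python
import re

FENCE = "`" * 3  # built this way so the sample can live inside a code block

DOC = f"""
{FENCE}python
x = 1 + 1
{FENCE}

% skip: next
{FENCE}python
invalid code
{FENCE}
"""

# An optional skip comment, then a fenced Python block.
BLOCK = re.compile(
    r"(?P<skip>^% skip: next\n)?^" + FENCE + r"python\n(?P<code>.*?)^" + FENCE + r"$",
    re.MULTILINE | re.DOTALL,
)


def run_python_blocks(text: str) -> dict:
    """Execute each Python block in a shared namespace unless skipped."""
    namespace = {}
    ran = skipped = 0
    for match in BLOCK.finditer(text):
        if match.group("skip"):
            skipped += 1
            continue
        # Any exception propagates, which is what would fail the test.
        exec(match.group("code"), namespace)
        ran += 1
    return {"ran": ran, "skipped": skipped, "x": namespace.get("x")}


summary = run_python_blocks(DOC)
print(summary)
```

Without the skip comment, the second block would raise a SyntaxError and the run would fail.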