Writing Documentation#
Documentation is written as Markdown files in the /docs folder. It is built using Sphinx, with MyST for Markdown parsing.
Adding User Guides#
Articles for the user guide live in /docs/guide and sub-directories within. You can add new Markdown files to add a new article, but they must be added to a Sphinx table of contents so that Sphinx knows where to place them in the document hierarchy. (Sphinx does not require the directory structure to match the document structure.)
The TOC for the user guide is in /docs/guide/index.md. To add an article, add a new line with the path of its file to the appropriate sub-section.
API Docs#
API reference documentation is generated automatically during the Sphinx build process, via the sphinx-autodoc2 extension, from our Python source in /pysrc. The build process turns the source into automatically generated Markdown files in /docs/api, which are then passed through the Sphinx builder.
All public functions, classes, and constants should have explicit type annotations.
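As a minimal sketch of what explicit annotations look like on a constant, a class, and a function. All names here are hypothetical and are not part of the Bytewax API:

```python
from dataclasses import dataclass
from typing import List

DEFAULT_BATCH_SIZE: int = 100
"""Hypothetical public constant with an explicit annotation."""


@dataclass
class Batch:
    """Hypothetical public class; every attribute is annotated."""

    items: List[str]
    size: int = DEFAULT_BATCH_SIZE


def make_batches(words: List[str], size: int = DEFAULT_BATCH_SIZE) -> List[Batch]:
    """Split ``words`` into batches of at most ``size`` items."""
    return [Batch(words[i : i + size], size) for i in range(0, len(words), size)]
```

Because the types are in the signature, the generated API pages can show them without any extra docstring fields.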
Local Prototyping#
You’ll need to set up a local development environment. See Local Development and specifically Writing Docs.
MyST Cheat Sheet#
Here’s a quick rundown of common things in MyST-flavored Markdown.
Docstrings#
Docstrings can be written using all the features of MyST. Some docstring-specific hints are provided here.
Arguments#
Docstrings should use MyST Markdown as the text body. Arguments, return values, etc. are specified using MyST field lists, which are :name value: Description lines. The field names should be the same as Sphinx's field names.
```python
def my_func(x: int, y: str) -> str:
    """Do the cool thing.

    :arg x: Describe the X parameter.

    :arg y: Describe the Y parameter. If this is a really long line,
        you can wrap it with indentation. You can also use any
        **syntax** here you like.

    :returns: A description of the return value.

    """
    ...
```
If the function signature is coming from PyO3 (and thus there are no type hints in the code), you can use the :type var: and :rtype: fields to provide argument and return value type hints.
```rust
/// Do the cool thing.
///
/// :arg x: Describe the X parameter.
///
/// :type x: int
///
/// :arg y: Describe the Y parameter. If this is a really long line,
///     you can wrap it with indentation. You can also use any
///     **syntax** here you like.
///
/// :type y: str
///
/// :returns: A description of the return value.
///
/// :rtype: str
#[pyfunction]
fn my_func(x: usize, y: String) -> String {
    todo!();
}
```
Class and Module Variables#
You can add “post-variable docstrings” to document these.
```python
from dataclasses import dataclass
from typing import TypeVar

X = TypeVar("X")
"""Type of a cool thing."""


@dataclass
class Container:
    x: int
    """This is the docstring for this attribute."""

    y: str
    """This is the docstring for this other attribute."""
```
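Python does not attach these strings to anything at runtime, so a documentation tool has to recover them from the source text. The following is a rough sketch of that idea using the standard `ast` module. It is only an illustration and is not necessarily how sphinx-autodoc2 does it; the `post_variable_docstrings` helper is hypothetical.

```python
import ast
from typing import Dict

SOURCE = '''
X = 1
"""Docstring for X."""

class Container:
    x: int
    """Docstring for x."""
'''


def post_variable_docstrings(source: str) -> Dict[str, str]:
    """Map each assigned or annotated name to the string literal right after it."""
    found: Dict[str, str] = {}
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue
        # Look at each statement together with the statement that follows it.
        for stmt, nxt in zip(body, body[1:]):
            is_doc = (
                isinstance(nxt, ast.Expr)
                and isinstance(nxt.value, ast.Constant)
                and isinstance(nxt.value.value, str)
            )
            if not is_doc:
                continue
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                found[stmt.target.id] = nxt.value.value
            elif isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        found[target.id] = nxt.value.value
    return found


docs = post_variable_docstrings(SOURCE)
print(docs)
```

This is why the string has to sit directly after the variable: a blank line is fine, but any other statement in between breaks the pairing.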
Cross References#
See MyST’s documentation on cross referencing for all the ways this can work. I’ll give a quick summary here.
A Specific Section#
The system does not automatically generate xref links for headings.
You can manually add a reference name to any heading via the
(xref-name)=
syntax just before it. In general, just add refs for
sections you know you want to reference elsewhere.
(xref-specific-section)=
### A Specific Section
You can then reference it via normal Markdown link syntax with the URI being just #xref-name.
Read [how to link to a specific section](#xref-specific-section)
Appears as:
Read how to link to a specific section
Or use the autolink syntax, with the scheme project: followed by a #xref-name.
Read about linking to <project:#xref-specific-section>
Appears as:
Read about linking to A Specific Section
Warning
By convention, all of your reference names must start with xref to ensure that they are globally unique across all Sphinx domains. Unfortunately, MyST’s Markdown link xref resolver does not let you specify Sphinx domains and tries to resolve everything using the all directive, so it’s possible that the name you pick would clash with the name of a file (clashing with the doc domain) or a Python module (clashing with the py domain), giving you multiple targets. Prefixing names with xref makes such a clash less likely.
Perhaps one day MyST will provide a syntax for unambiguously specifying an xref when they fix this issue.
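The naming convention is easy to check mechanically. Here is a minimal sketch, assuming targets are written one per line as (name)= in the Markdown source; the `badly_named_targets` helper is hypothetical, is not part of our tooling, and does not try to ignore targets that appear inside code fences.

```python
import re
from typing import List

# A MyST target is a line of the form `(name)=`.
TARGET_RE = re.compile(r"^\((?P<name>[^()\s]+)\)=\s*$", re.MULTILINE)


def badly_named_targets(markdown: str, prefix: str = "xref") -> List[str]:
    """Return target names that do not start with the conventional prefix."""
    return [
        m.group("name")
        for m in TARGET_RE.finditer(markdown)
        if not m.group("name").startswith(prefix)
    ]


page = "(xref-specific-section)=\n### A Specific Section\n\n(intro)=\n## Intro\n"
bad = badly_named_targets(page)
print(bad)
```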
Note
The link URI has to either start with a # and be a global Sphinx reference, or be a path. You can’t mix and match. The following will not work:
Read [how to link to a specific section](/guide/contributing/writing-docs.md#xref-specific-section)
Instead, make an explicit reference target with (xref-name)=.
Other Markdown Files#
To link to an entire article, add an xref to the main header in the file and link to that. Use the steps and syntax above.
API Docs#
To link to a symbol in the Bytewax library, use the full dotted path to it surrounded by ` and preceded by {py:obj}.
This operator returns a {py:obj}`bytewax.dataflow.Stream`.
Appears as:
This operator returns a bytewax.dataflow.Stream.
You should always use the full dotted path to reference a name, but if you don’t want it to appear as a full dotted path because of the context of the surrounding text, prefix the path with ~.
This operator returns a {py:obj}`~bytewax.dataflow.Stream`.
Appears as:
This operator returns a Stream.
Intersphinx#
Intersphinx is the Sphinx system for connecting different documentation sets together. Our Sphinx config already connects a few of our dependencies, including the Python standard library.
API Docs#
For most external Python types, you can use the same xref syntax as within Bytewax:
See the standard library function {py:obj}`functools.reduce`.
Appears as:
See the standard library function functools.reduce.
Other References#
Other references use a more explicit system. You use URIs starting with inv:, then the name of the inventory in the /docs/conf.py intersphinx_mapping, then the domain, then the object type, then the item name.
Learn about [how to use lambdas](inv:python:std:label#tut-lambda).
Appears as:
Learn about how to use lambdas.
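To make the parts of that URI concrete, here is a small sketch that splits a fully spelled-out inv: URI and checks that its inventory name is a key of the mapping. The mapping shown is an assumed shape for /docs/conf.py (the real entries may differ), and `parse_inv_uri` is a hypothetical helper, not part of MyST or Sphinx.

```python
from typing import NamedTuple

# Assumed shape of the mapping in /docs/conf.py; the real entries may differ.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
}


class InvRef(NamedTuple):
    inventory: str
    domain: str
    obj_type: str
    name: str


def parse_inv_uri(uri: str) -> InvRef:
    """Split ``inv:<inventory>:<domain>:<type>#<name>`` into its parts."""
    scheme_and_path, _, name = uri.partition("#")
    scheme, inventory, domain, obj_type = scheme_and_path.split(":")
    if scheme != "inv":
        raise ValueError(f"not an inv: URI: {uri!r}")
    return InvRef(inventory, domain, obj_type, name)


ref = parse_inv_uri("inv:python:std:label#tut-lambda")
known_inventory = ref.inventory in intersphinx_mapping
print(ref, known_inventory)
```

In the example above, python is the inventory, std is the domain, label is the object type, and tut-lambda is the item name.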
Finding Reference Names#
If you don’t know the exact xref incantation, you can use the included dump tool to fuzzy search with grep or fzf over all the xrefs to find the one you want.
(dev) $ python ./intersphinxdump.py | fzf
Example Python Code#
Use backtick code blocks with the {testcode} language type. Use this instead of python to ensure that the code block is run as a doctest. It will still be syntax highlighted as if it were Python.
```{testcode}
from bytewax.dataflow import Dataflow
flow = Dataflow("doc_df")
```
Appears as:
from bytewax.dataflow import Dataflow
flow = Dataflow("doc_df")
If you are really sure that you don’t want the code to run as part of
the doctest suite, you can use the python
language instead.
Shell Sessions#
Use the language type console (instead of bash), and start commands you run with $ to get proper highlighting.
```console
$ waxctl list
output here
```
Appears as:
$ waxctl list
output here
Mermaid Diagrams#
We have installed the sphinxcontrib-mermaid Sphinx plugin, which allows you to use mermaid as a code block language name.
```mermaid
graph TD
I1[Kafka Consumer `users`] --> D1[Users Deserializer] --> K1[Key on User ID]
I2[Kafka Consumer `transactions`] --> D2[Transactions Deserializer] --> K2[Key on User ID]
K1 & K2 --> J1[Join on User ID] --> V[Validator] --> S[Enriched Serializer] --> O1[Kafka Producer `enriched_txns`]
V --> O2[Kafka Producer `enriched_txns_dead_letter_queue`]
```
Appears as:
Current Version Number#
Sometimes you want to show a command that includes the latest Bytewax version number. Instead of updating this number in every file, we can substitute in a Sphinx variable that holds the current version number. Unfortunately, to enable substitutions in a code block we have to use the directive form and turn substitutions on explicitly:
```{code-block} console
:substitutions:
$ pip install bytewax==|version|
```
Appears for the current version 0.21.0 as:
$ pip install bytewax==0.21.0
Linking to Files in the GitHub Repo#
If you’d like to link to a file in our public GitHub repo, and want the link to be a permalink to the version of the file in the same Git commit that the current documentation was built from, use the gh-path scheme.
Note that the path is absolute to the repo and begins with a /.
<gh-path:/examples/wikistream.py>
Appears as:
Linking to GitHub Issues or PRs#
You can link to a GitHub issue or PR in our public repo using this shorthand. It will decorate it with a little GitHub logo.
Note that the issue number does not have a # before it.
<gh-issue:123>
Appears as:
Doctests#
We have a Sphinx builder which runs all Python code blocks in our documentation. This lets us catch documentation we forgot to update as the API advances.
Running just test-doc will run over:
- All documentation examples in Markdown files in /docs.
- All examples in docstrings in /pysrc, via the API docs pages.
- Docstrings from PyO3, which are tested via the stubs file in /pysrc/bytewax/_bytewax.pyi and then via the API docs pages. You must rebuild stubs to test these.
For more options and details on this system, see sphinx.ext.doctest in the Sphinx docs.
Code Block with No Output#
If you have a plain Python {testcode} block, the code will be run to ensure it raises no exceptions, but no output will be checked.
```{testcode}
x = 1 + 1
```
Appears as:
x = 1 + 1
Code Block Checking Output#
To assert some specific stdout from a code block, pair a {testoutput} block after a {testcode} one.
Here's some pre-commentary.
```{testcode}
print(1 + 1)
```
Here's some middle-commentary.
```{testoutput}
2
```
Here's some post-commentary.
Appears as:
Here’s some pre-commentary.
print(1 + 1)
Here’s some middle-commentary.
2
Here’s some post-commentary.
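Conceptually, the pairing means "run the code, capture what it prints, and compare that with the expected text". A minimal sketch of that idea in plain Python follows. It is an illustration only, not how sphinx.ext.doctest is implemented, and `run_and_capture` is a hypothetical helper.

```python
import contextlib
import io


def run_and_capture(code: str) -> str:
    """Execute ``code`` and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


testcode = "print(1 + 1)"
testoutput = "2\n"

actual = run_and_capture(testcode)
matches = actual == testoutput
print(matches)
```

A block that prints nothing, such as `x = 1 + 1`, captures an empty string, which is why a lone {testcode} block only checks that no exception is raised.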
Using Fixture Files#
just test-doc cds into the docs/fixtures/ directory before running the test doc builder. This means you have access to all files within that directory for any of the doctests.
E.g. in our wordcount example we use a fixture file.
import bytewax.operators as op
from bytewax.dataflow import Dataflow
from bytewax.connectors.files import FileSource
flow = Dataflow("wordcount_eg")
inp = op.input("inp", flow, FileSource("wordcount.txt"))
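The bare file name works because relative paths are resolved against the current working directory. The sketch below imitates that with the standard library only, using a temporary directory as a stand-in for docs/fixtures/; the file contents are made up for the illustration.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as fixtures_dir:
    # Stand-in for a fixture file that would live in docs/fixtures/.
    Path(fixtures_dir, "wordcount.txt").write_text("to be or not to be\n")

    previous = os.getcwd()
    os.chdir(fixtures_dir)
    try:
        # A relative path now resolves inside the fixtures directory.
        words = Path("wordcount.txt").read_text().split()
    finally:
        os.chdir(previous)

print(words)
```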
Doctest Code Block#
If you want to show an interactive interpreter session to walk through the details of an example, make it a doctest-style code block using the doctest directive. Prefix each input line with >>> and put the expected output on the following lines.
```{doctest}
>>> 1 + 1
2
```
Appears as:
>>> 1 + 1
2
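The same session format is understood by Python's standard doctest module, which is a convenient way to see how the expected output lines are checked. A minimal sketch, where `session_passes` is a hypothetical helper and not part of our build:

```python
import doctest


def session_passes(session: str) -> bool:
    """Run a doctest-style session and report whether every example passed."""
    test = doctest.DocTestParser().get_doctest(session, {}, "session", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the pass/fail counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed == 0


good = session_passes(">>> 1 + 1\n2\n")
bad = session_passes(">>> 1 + 1\n3\n")
print(good, bad)
```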
Skipping#
To skip a whole code block, use the python / text language instead of {testcode} / {testoutput}.
To skip a single line in a {doctest} block, you can use an inline doctest option.
```{doctest}
>>> datetime.date.today() # doctest: +SKIP
datetime.date(2008, 1, 1)
```
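The effect of the inline option can be seen with the standard doctest module: a skipped example is never executed, so it cannot fail even if it would raise. A short sketch under that assumption, using a deliberately undefined name:

```python
import doctest

SESSION = """
>>> undefined_name  # doctest: +SKIP
'never checked'
>>> 1 + 1
2
"""

test = doctest.DocTestParser().get_doctest(SESSION, {}, "skip_demo", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed)
```

Without the +SKIP option the first example would raise a NameError and be counted as a failure.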