Docs-As-Code
Docs-as-code can have many meanings. I include:
- Source control
- Automated testing
- Peer reviews
- Enforcing a style-guide
- Collecting and responding to reader feedback
I think most of the above is well understood. The area that I think most needs improvement is automated testing. Here is how I break down automated testing:
- Doc test builds
- Link checking
- Style-guide enforcement
- Code-sample testing
I don’t think that code-sample testing is widely adopted. I believe that most organizations fall into the "test when someone complains" camp.
Single sourced code samples
[Figure: Complaint-driven workflow]
[Figure: Test-driven workflow]
Overview
A year or two ago I saw a GitHub issue about a bug in some documentation that I wrote. Later that week I saw another issue about a different page in the docs. Both issues concerned key features that the community and customers were using in production.
After retesting the steps in the docs and confirming that the issues were accurate, I looked through the release notes and found the related breaking changes.
Waiting for the community and customers to find the bugs in the docs or bugs in the code is a common problem, and it is embarrassing.
The best way to know when software behavior changes is to run tests against every code change. This is common practice for the code itself, but somehow the code and sample data used in those tests don’t make their way into the documentation.
The fix
- Treat the docs as code.
- As end-to-end docs (tutorials, quick starts, how-to guides) are designed, write them as test plans.
- Automate the test plan.
- Write the doc, but instead of copying and pasting the code snippets (SQL in my case) into the docs, import the snippets directly from the automated test.
- Run the test suite on a regular basis.
- As tests fail, get the code fixed if the failure indicates a bug, or update the test to reflect the new behavior of the system. Updating the test should also update the documentation, because the doc system pulls the code snippets from the tests.
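The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used on the project: `TEST_SQL`, `split_statements`, `run_test_plan`, and `doc_snippet` are hypothetical names, and an in-memory SQLite database stands in for the real system under test. The point is only that the test and the docs read the same SQL text.

```python
import sqlite3

# Hypothetical test specification. In a real project this would be a .sql
# file in the test suite; sqlite3 stands in for the real database here.
TEST_SQL = """
CREATE TABLE user_behavior (user_id INTEGER, behavior TEXT);
INSERT INTO user_behavior VALUES (1, 'pv'), (2, 'buy');
SELECT COUNT(*) FROM user_behavior;
"""


def split_statements(sql_text):
    """Split on ';'. Naive: assumes no semicolons inside string literals."""
    return [part.strip() + ";" for part in sql_text.split(";") if part.strip()]


def run_test_plan(sql_text):
    """Execute every statement in order and return the last statement's rows."""
    connection = sqlite3.connect(":memory:")
    try:
        rows = []
        for statement in split_statements(sql_text):
            rows = connection.execute(statement).fetchall()
        return rows
    finally:
        connection.close()


def doc_snippet(sql_text, index):
    """Return the statement the docs should display, taken from the test."""
    return split_statements(sql_text)[index]
```

A test would assert on the result of `run_test_plan(TEST_SQL)`, while the doc build would call `doc_snippet(TEST_SQL, 0)`. When the test is edited to match new behavior, the snippet shown in the docs changes with it.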
Example
A recent feature of the project I am working on queries data in files stored in object storage, figures out the schema of the data, then creates and populates a table in a database.
The SQL that causes this magic to happen looks like this:
CREATE TABLE DocsQA.user_behavior_inferred
AS SELECT * FROM FILES (
"path" = "s3://starrocks-examples/user_behavior_ten_million_rows.parquet",
"format" = "parquet",
"aws.s3.region" = "us-east-1",
"aws.s3.access_key" = "AAAAAAAAAAAAAAAAAAAA",
"aws.s3.secret_key" = "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
);
Yesterday I would have copied the above out of the SQL client I used to run the query and pasted it into a Markdown file. Today I use this syntax to pull it from the test specification instead:
```sql reference title="Create table from S3 using FILES() table function"
https://github.com/DanRoscigno/docs/blob/6d6fcf905162adf80bd094cb9dd133a5c557bdd3/SQL/files_table_fxn.sql#L1-L11
```
This is docusaurus-theme-github-codeblock syntax, not AsciiDoc. With AsciiDoc I would include content by tagged regions. Similarly, with reStructuredText I would use remote code blocks with start-after and end-before.
In Docusaurus this renders as:
Proof of concept
An implementation of the above is described at Single-sourcing docs from code.
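As a rough illustration of what resolving a reference like the one in the Example section involves, the sketch below parses a GitHub permalink with a line range and selects those lines from a file's text. It is not the plugin's code or the linked implementation; `CodeReference`, `parse_reference`, and `select_lines` are hypothetical names, and the pattern assumes a ref without slashes (such as a commit hash).

```python
import re
from typing import NamedTuple


class CodeReference(NamedTuple):
    owner: str
    repo: str
    ref: str
    path: str
    start: int
    end: int


# Assumes the ref (commit hash or branch) contains no "/" characters.
PERMALINK = re.compile(
    r"https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/blob/"
    r"(?P<ref>[^/]+)/(?P<path>[^#]+)#L(?P<start>\d+)(?:-L(?P<end>\d+))?$"
)


def parse_reference(url):
    """Split a permalink such as .../blob/<ref>/<path>#L1-L11 into its parts."""
    match = PERMALINK.match(url)
    if match is None:
        raise ValueError(f"not a line-range permalink: {url}")
    start = int(match["start"])
    end = int(match["end"] or start)  # "#L5" alone means a single line
    return CodeReference(
        match["owner"], match["repo"], match["ref"], match["path"], start, end
    )


def select_lines(text, reference):
    """Return the referenced lines (1-indexed, inclusive) from the file text."""
    lines = text.splitlines()
    if reference.start < 1 or reference.end > len(lines):
        # Fail loudly so a stale reference breaks the build instead of the docs.
        raise ValueError("line range falls outside the file")
    return "\n".join(lines[reference.start - 1:reference.end])
```

Pinning the reference to a commit hash keeps the line numbers stable. The trade-off is that the reference has to be updated when the test file changes.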
Reader feedback
Collecting reader feedback is important. Many of the available feedback widgets rely on systems that browser ad blockers block. I use a React component that collects feedback, writes it to PostHog, and does not rely on cookies. Each week a scheduled GitHub workflow collects the feedback from PostHog and generates an issue with the reader feedback. The same workflow queries Algolia for the top successful and failed searches in the docs. This tells the documentation team and product management which features or commands are important to readers.
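The step that turns a week of feedback into an issue can be sketched as a pure function. This is an assumption-laden illustration: the event shape (`page`, `helpful`, `comment`) and the name `build_issue_body` are hypothetical and are not PostHog's export format or the workflow's actual code. Fetching the events and opening the issue are left out.

```python
from collections import defaultdict


def build_issue_body(events):
    """Summarize feedback events per page as a Markdown issue body.

    Each event is a dict with hypothetical keys: "page" (str),
    "helpful" (bool), and an optional "comment" (str).
    """
    by_page = defaultdict(lambda: {"up": 0, "down": 0, "comments": []})
    for event in events:
        stats = by_page[event["page"]]
        stats["up" if event["helpful"] else "down"] += 1
        if event.get("comment"):
            stats["comments"].append(event["comment"])

    lines = ["Reader feedback for the past week", ""]
    # Pages with the most "not helpful" votes come first, then by page name.
    ordered = sorted(by_page.items(), key=lambda item: (-item[1]["down"], item[0]))
    for page, stats in ordered:
        lines.append(f"- {page}: {stats['up']} helpful, {stats['down']} not helpful")
        for comment in stats["comments"]:
            lines.append(f"  - {comment}")
    return "\n".join(lines)
```

Keeping the summary step free of network calls makes it easy to test with a handful of sample events.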
CI checks
Link checking, Markdown linting, and build tests are performed on each commit to documentation pull requests by the doc CI job.
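To make the link-checking part concrete, here is a minimal sketch of a relative-link check over Markdown pages. It is not the doc CI job itself: `find_broken_links` is a hypothetical name, pages are passed in as a mapping of path to text so the logic stays testable, and external URLs and same-page anchors are deliberately skipped.

```python
import posixpath
import re

# Inline Markdown links: [text](target). Reference-style links are not handled.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def find_broken_links(pages):
    """Return (page, target) pairs whose relative target is not a known page.

    `pages` maps a POSIX-style path (for example "docs/intro.md") to its text.
    """
    broken = []
    for path, text in sorted(pages.items()):
        for target in LINK.findall(text):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # external URLs and same-page anchors are not checked
            target_path = target.split("#", 1)[0]
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(path), target_path)
            )
            if resolved not in pages:
                broken.append((path, target))
    return broken
```

In a CI job, a non-empty result would fail the check for the pull request.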