Chapter 1 Publishing with Bookdown
This open-access book is built with free-to-use, open-source tools (primarily Bookdown, GitHub, and Zotero), and this chapter explains how, so that readers can do it themselves and share their knowledge to improve the process. In addition to our notes below, see Yihui Xie’s more comprehensive Bookdown guide [1].
Our broad goal is an efficient workflow in which we compose one document in the easy-to-write Markdown format and Bookdown builds it into multiple book products: an HTML web edition to read online, a PDF print edition for traditional book publishing, a Microsoft Word edition for editors who request it for copyediting, and the option of other formats as desired.
Since Bookdown is an R package, we composed the book manuscript in R-flavored Markdown, with one file (.Rmd) for each chapter. We use Bookdown to build these files in its GitBook style as a set of static HTML pages, which we upload to our GitHub repository. Readers can view the open-access web edition of the book at our custom domain: https://HandsOnDataViz.org. We also use Bookdown to build additional book outputs (PDF, MS Word, Markdown) and upload these to the docs folder of our GitHub repository, so that our O’Reilly Media editor may download and comment on the manuscript as we revise. Finally, we have the option to use Pandoc alone to convert the full-book Markdown file (.md) into an AsciiDoc file (.asciidoc) for easier importing into the O’Reilly Atlas platform. See some caveats and workarounds below.
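The Pandoc-only conversion mentioned above could be scripted roughly as follows. This is a minimal sketch, not our actual build step: the helper names and the file paths in the usage note are hypothetical, and it assumes Pandoc is installed and on the PATH. Only the standard Pandoc options --from, --to, and --output are used.

```python
import shutil
import subprocess
from typing import List


def pandoc_asciidoc_command(markdown_file: str, asciidoc_file: str) -> List[str]:
    """Build the Pandoc argument list for a Markdown-to-AsciiDoc conversion."""
    return [
        "pandoc",
        markdown_file,
        "--from", "markdown",
        "--to", "asciidoc",
        "--output", asciidoc_file,
    ]


def convert(markdown_file: str, asciidoc_file: str) -> None:
    """Run the conversion (hypothetical helper; needs Pandoc on the PATH)."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    subprocess.run(
        pandoc_asciidoc_command(markdown_file, asciidoc_file), check=True
    )
```

A call such as convert("docs/book.md", "docs/book.asciidoc") would then write the AsciiDoc file next to the Markdown one. Both paths are placeholders for whatever names your own build produces.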
File Organization and Headers
We organized the GitHub repository for this book as a set of .Rmd files, one for each chapter. As co-authors, we are careful to work on different chapters of the book, and to regularly push our commits to the repo. Only one of us regularly builds the book with Bookdown to avoid code merge conflicts.
Bookdown assigns a default ID to each header, which can be used for cross-references. The default ID for # Introduction is {#introduction}, and the default ID for ## Part One is {#part-one}, where spaces are replaced by dashes. But we do not rely on default IDs because they might change due to editing or contain duplicates across the book.
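To see why default IDs are fragile, here is a simplified approximation of how an ID is derived from a header title. It is only a sketch loosely modeled on Pandoc's automatic identifiers (lowercase, spaces to dashes, most punctuation dropped), not the real implementation, and the function name default_header_id is our own invention.

```python
import re


def default_header_id(title: str) -> str:
    """Approximate the default ID derived from a header title (sketch only)."""
    # Drop everything except letters, digits, underscores, hyphens,
    # periods, and whitespace.
    kept = re.sub(r"[^\w\s.-]", "", title.strip())
    # Replace each whitespace character with a dash, then lowercase.
    slug = re.sub(r"\s", "-", kept).lower()
    # Remove everything before the first letter.
    start = next((i for i, ch in enumerate(slug) if ch.isalpha()), len(slug))
    slug = slug[start:]
    # Fall back to a generic ID if nothing is left.
    return slug or "section"


print(default_header_id("Introduction"))  # introduction
print(default_header_id("Part One"))      # part-one
print(default_header_id("Part 1"))        # part-1
```

Retitling "Part One" as "Part 1" silently changes the ID from part-one to part-1, which would break any cross-reference that pointed at the old one.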
Instead, we manually assign a unique ID to each first- and second-level header in the following way. Note that the {-} symbol, used alone or in combination with a space and a unique ID, prevents auto-numbering in the second- through fourth-level headers:
# Top-level chapter title {#unique-name}
## Second-level section title {- #unique-name}
### Third-level subhead {-}
#### Fourth-level subhead {-}
Also, to keep our work organized, we match the unique ID keyword to the file name for top-level chapters, this way: 01-introduction.Rmd. Unique names should contain only alphanumeric characters (a-z, A-Z, 0-9) or dashes (-).
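These naming rules are easy to check mechanically. The sketch below is a hypothetical helper, not part of Bookdown: it scans header lines for manually assigned IDs, then reports any that use characters outside a-z, A-Z, 0-9, and dashes, along with any that occur more than once. It assumes each header and its ID sit on a single line, and it does not skip header-like lines inside code chunks.

```python
import re
from collections import Counter

# Matches "# Title {#some-id}" and "## Title {- #some-id}".
HEADER_ID = re.compile(r"^#{1,6}\s.*\{(?:-\s+)?#([^\s}]+)\}\s*$")
VALID_ID = re.compile(r"^[A-Za-z0-9-]+$")


def check_header_ids(lines):
    """Return (invalid_ids, duplicate_ids) found in the given header lines."""
    ids = []
    for line in lines:
        match = HEADER_ID.match(line)
        if match:
            ids.append(match.group(1))
    invalid = [i for i in ids if not VALID_ID.match(i)]
    duplicates = [i for i, count in Counter(ids).items() if count > 1]
    return invalid, duplicates


sample = [
    "# Introduction {#introduction}",
    "## Part One {- #part-one}",
    "### Third-level subhead {-}",
    "## Bad name {#bad_id}",
    "## Repeated section {- #part-one}",
]
print(check_header_ids(sample))  # (['bad_id'], ['part-one'])
```

In practice you would pass in the combined lines of all the .Rmd files, so that duplicates are caught across chapters as well as within one file.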
In the Bookdown index.Rmd for the HTML book output and the PDF output, the toc_depth: 2 setting displays chapter and section headers down to the second level in the Table of Contents.
The split_by: section setting divides the HTML pages at the second-level header, which creates shorter web pages with reduced scrolling for readers. For each web page, the unique ID becomes the file name, and is stored in the docs subfolder.
The number_sections setting is true for the HTML and PDF editions, and given the toc_depth: 2, this means that they will display two-level chapter-section numbering (1.1, 1.2, etc.) in the Table of Contents. Note that number_sections must be true to display Figure and Table numbers in x.x format, which is desired for this book. See relevant settings in this excerpt from index.Rmd:
output:
  bookdown::gitbook:
    ...
    toc_depth: 2
    split_by: section
    number_sections: true
    split_bib: true
    ...
  bookdown::pdf_book:
    toc_depth: 2
    number_sections: true
Note that chapter and section numbering does not appear automatically in the MS Word output unless you supply a reference.docx file, as described in these sources:
- https://bookdown.org/yihui/rmarkdown/word-document.html
- https://stackoverflow.com/questions/52924766/numbering-and-referring-sections-in-bookdown
- https://stackoverflow.com/questions/50609212/caption-styles-for-word-document2-in-bookdown
In the _bookdown.yml settings, all book outputs are built into the docs subfolder of our GitHub repo, as shown in this excerpt:
output_dir: "docs"
book_filename: "bookdown-template"
language:
  label:
    fig: "Figure "
chapter_name: "Chapter "
In our GitHub repo, we set GitHub Pages to publish to the web using master/docs, which means that visitors can browse the source files at the root level, and view the HTML web pages hosted in the docs subfolder. We use the GitHub Pages custom domain setting so that the HTML edition is available at https://HandsOnDataViz.org.
The docs subfolder also may contain the following items, which are not generated by Bookdown and need to be created separately:
- CNAME file for the custom domain, generated by GitHub Pages.
- .nojekyll invisible empty file that tells GitHub Pages to skip Jekyll processing, to ensure speedy processing of HTML files.
- 404.html custom file to redirect any mistaken web addresses under the domain back to the index.html page.
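Because Bookdown does not create these files, a small check after each build can help. The sketch below uses a hypothetical helper of our own, prepare_docs_folder. It creates the empty .nojekyll file if it is absent and reports which of the other expected files are still missing. It deliberately does not write CNAME or 404.html, since their contents depend on your domain and redirect choices.

```python
from pathlib import Path

# Files expected alongside the Bookdown output in the docs subfolder.
EXPECTED_FILES = ("CNAME", ".nojekyll", "404.html")


def prepare_docs_folder(docs_dir):
    """Ensure .nojekyll exists and return the expected files still missing."""
    docs = Path(docs_dir)
    docs.mkdir(parents=True, exist_ok=True)
    # .nojekyll is just an empty marker file, so it is safe to create here.
    (docs / ".nojekyll").touch(exist_ok=True)
    return [name for name in EXPECTED_FILES if not (docs / name).exists()]
```

Running prepare_docs_folder("docs") on a fresh build folder would return ["CNAME", "404.html"] as a reminder of what remains to be added by hand or through the GitHub Pages settings.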
One more option is to copy the Google Analytics code for the web book, paste it into an HTML file in the book repo, and include this reference in the index.Rmd code:
output:
  bookdown::gitbook:
    ...
    includes:
      in_header: google-analytics.html
[1] Yihui Xie, Bookdown: Authoring Books and Technical Documents with R Markdown, 2018, https://bookdown.org/yihui/bookdown/