
Documentation & Communication

Learning Objectives

After this lesson, you should be able to:

  • Identify and explain different types of project documentation (both internal and external)
  • Describe tools and approaches to creating your own documentation
  • Create your own GitHub Pages website (!)


Project Documentation

A great Open Scientist is someone who documents their work and shares it with the world. This means going well beyond peer-reviewed publications. For example, you can:

  • Describe how to use or build your computer code or tools

  • Share best practices for a method or protocol

  • Create and share educational material so others in your field can learn from you

  • Share a first-person account of your journey through a project




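The documentation system described below divides documentation into four quadrants: tutorials, how-to guides, references, and explanations. Read more in depth on this system here: https://documentation.divio.com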

Explaining the quadrants
  • Tutorials: Lessons! Tutorials take the reader by the hand to understand how the basics of a tool work. They are what your project needs in order to show a beginner that they can achieve something with it. Most of the technical teaching we do in FOSS takes the form of tutorials. For example, we use simple tutorials to teach the mechanics of version control.
  • How-to guides: Recipes! How-to guides take the reader through the steps required to achieve a specific outcome or answer a specific question. An example would be a guide on how to install a specific piece of software on a specific operating system.
  • References: References offer technical descriptions of the machinery and how to operate it. References have one job only: to describe. They are code-determined, because ultimately that's what they describe: key classes, functions, and APIs. They should list things like functions, fields, attributes and methods, and set out how to use them.
  • Explanations: Discussions! Explanations aim to clarify and illuminate a particular topic by broadening the documentation's coverage of it.
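
To make the References quadrant more concrete, here is a minimal sketch of reference-style documentation written as a Python docstring. The function celsius_to_kelvin and its parameter are hypothetical, chosen only for illustration. The docstring simply describes the arguments, the return value, and the error raised, without teaching or discussing. Tools such as mkdocstrings (one of the plugins listed in the hands-on section of this lesson) can turn docstrings like this into reference pages.

```python
def celsius_to_kelvin(temp_c: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin.

    Args:
        temp_c: Temperature in degrees Celsius. Must not be below
            absolute zero (-273.15).

    Returns:
        The temperature in kelvin.

    Raises:
        ValueError: If temp_c is below absolute zero.
    """
    if temp_c < -273.15:
        raise ValueError("temperature is below absolute zero")
    return temp_c + 273.15


if __name__ == "__main__":
    # The docstring is the reference text that documentation tools render.
    print(celsius_to_kelvin.__doc__)
```

A tutorial or how-to guide for the same function would instead walk the reader through a task that uses it.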



Tips for Great Documentation

  • Clarity: Documentation should be easy to understand with clear language and no ambiguity.
  • Completeness: It must cover all essential details, leaving nothing crucial undocumented.
  • Accuracy: Information should be up-to-date and correct to prevent errors and misunderstandings.
  • Organization: A logical structure and clear organization make it easy to navigate and find information.
  • Relevance: Documentation should focus on what's pertinent to its intended audience or purpose, avoiding unnecessary information.


Public Repositories for Documentation

GitHub Readme
  • On GitHub, good documentation starts with a robust README file. The README is the first thing that people see when they visit your repository. It is a good place to explain what your project does, how to use it, and how to contribute to it. Here is an example.
GitHub Wiki
  • Also on GitHub, you can use the Wiki feature to create a separate space for documentation. The Wiki is a place to document your project separately from the code. Here is an example.
GitHub Pages
  • GitHub Pages websites are hosted directly from your GitHub repository.
  • GitHub Pages is free, fast, and easy to build with, but limited in its use of subdomains and URLs.
  • You can pull templates from other GitHub users for your website, e.g. Jekyll themes.
  • The FOSS website is hosted on GitHub Pages and built using MkDocs and the Material theme for MkDocs.
  • Bootstrap, a front-end framework, is another popular choice for building websites hosted on GitHub Pages.
Material MkDocs
  • Material Design theme for MkDocs, a static site generator geared towards (technical) project documentation.
  • Publish via GitHub Actions
  • Uses open source Material or ReadTheDocs Themes
ReadTheDocs
  • Publishing websites via ReadTheDocs.com, the commercial service, costs money; the community site, readthedocs.org, is free for open-source projects.
  • You can work offline, developing the materials and previewing them on your localhost using Sphinx.
  • You can work on a website template in a GitHub repository, and pushed changes are published in near real time by ReadTheDocs.com.
  • Here is an example of documentation using ReadTheDocs: PyTorch.
Bookdown
  • Bookdown is an open-source R package that facilitates writing books and long-form articles/reports with R Markdown.
  • Bookdown websites can be hosted by RStudio Connect (now Posit Connect).
  • You can publish a Bookdown website using GitHub Pages.
Quarto
JupyterBook
GitBook
  • GitBook websites use Markdown syntax
  • Free for open source projects, paid plans are available
Confluence Wikis


Things to remember about Documentation

  • Documentation should be written so that people who did not write it can read it and then use the material, or teach others how to apply it.

  • Documentation is best treated as a living document, but version control is necessary to maintain it.

  • Technology changes over time; expect to refresh documentation every 3-5 years as your projects age and progress.




Public Communication

Communicating with the public and other members of your science community (in addition to traditional peer-review publications and conferences) is one of the most important parts of your science!

There are many ways scientists use social media and the web to share their data science ideas:

  1. "Science Twitter" (now X) is really just regular Twitter, but with a focus on following other scientists and organizations and tweeting about research you're interested in. As you build up a following, more people will know you and your work, and you'll be more likely to meet new collaborators.
    • Since the rebranding to X, many of the scientific communities on Twitter have moved to Mastodon, an open-source, self-hostable social media alternative.
  2. Blogging platforms such as Medium are a great place to self-publish your writing on just about any topic. It's free to sign up and start blogging, but there is a fee for accessing premium content. Some of my favorite blogs include Towards Data Science and Chris Holmes.
  3. Community groups - There are lists (and lists of lists) of national research organizations in which a researcher can become involved. These older organizations still rely on official websites, science journal blogs, and email lists to communicate with their members. In the earth sciences there are open groups that focus on communication, like the Earth Science Information Partners (ESIP), with progressive ideas about how data and science can be done. Other groups, like The Carpentries and Research Bazaar, are focused on data science training and digital literacy.
  4. Podcasts - Creating and distributing audio content to the masses is easier than ever before. There are many podcast hosting platforms, including Spotify, Podbean, Acast, and Libsyn. From there it is simple to make your podcast available in the Google Podcasts or Apple Podcasts app.
  5. Webinars - With platforms such as Zoom, Microsoft Teams, and Google Meet, it is easy to host a webinar touting and explaining your science.
  6. YouTube - The king of video-sharing platforms is a great place to post content promoting your science (and yourself!). For example, CyVerse posts lots of content on cyberinfrastructure and data processing pipelines. Some of my favorite podcasts hosted on YouTube include StarTalk and Lex Fridman.
    • CyVerse and the Data Science Institute have a number of videos that can help you expand your science. Navigate our DataLab YouTube channel to learn more about AI, Natural Language Processing (NLP), geospatial science, bioinformatics/genomics and project management. Go to our CyVerse Channel to see the various projects CyVerse is involved in, or learn tips and tricks on how CyVerse works.

Important

Remember: Personal and Professional Accounts are Not Isolated

You decide what you post on the internet. Your scientist identity may be a part of your personal identity on social media, or it might be separate. A current or future employer can see your old posts. What you post on your personal accounts can be considered a reflection of the organization you work for and may be used in decisions about hiring or dismissal.




Hands-on: Building a GitHub Pages Website using MkDocs

This section explains and simplifies the steps that newcomers need to take to build a working website hosted on GitHub Pages.

This tutorial is inspired by academicpages, a Jekyll-themed template created to help scientists and academics build their own websites.

This tutorial covers the files and repository structure you need in order to build a personal website.

The website we will create will be hosted on GitHub Pages and built using the MkDocs site generator and the Material for MkDocs theme.

How does this work?

  • You will build a website on GitHub, which will host it on GitHub Pages. This tutorial does not require a local git copy of the repository.
  • MkDocs is a static website generator: using the Markdown language you can create tables, format paragraphs, add images, etc. Through a GitHub Action, MkDocs will take your Markdown-formatted pages and create the necessary HTML for a website.
  • The Material for MkDocs theme will "prettify" your website, giving it a look similar to the FOSS documentation, the CyVerse Learning Center, or the U of A HPC documentation.

This tutorial will create the basic requirements for the website. It will be up to you to populate it further.

You can base your formatting on the FOSS materials, or find out more from the Material for MkDocs theme pages.

Repository Explanation

A GitHub-hosted website using the Material for MkDocs theme requires the following files and folders in order to function:

  • A docs folder:
    • A folder that contains all the documents necessary to populate the website's pages.
    • All of the documents that the user needs to change are in here.
  • A mkdocs.yml file:
    • A yml file which contains critical information on the website structure, including themes, fonts, and extensions.
  • A requirements.txt file:
    • A file with a list of software necessary to build the website, primarily used by GitHub Actions.
  • A .github/workflows folder:
    • Contains the ghpages.yml file that controls the GitHub Action.

The structure of the basic repository is the following:

.
├── README.md
├── mkdocs.yml              <- Governing file for website building
├── requirements.txt        <- Requirements file for pip installation (required by website)      
├── docs                           
│   ├── assets              <- Folder for images and additional graphic assets
│   └── index.md            <- Main website home page
└── .github
    └── workflows
        └── ghpages.yml     <- GitHub Actions controlling file

Upon pushing changes, a gh-pages branch will be automatically created by the GitHub Action; it is where the website is rendered from.
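
For readers who prefer to set up this layout on their own computer, the sketch below creates the folders and empty files from the tree above using only Python's standard library. The function name scaffold_site and the target folder my-website are hypothetical, chosen for illustration. The files are created empty, so their contents still need to be filled in by following the steps in this tutorial.

```python
from pathlib import Path


def scaffold_site(root: Path) -> list[Path]:
    """Create the minimal MkDocs repository layout under `root`.

    Existing files are left untouched. Returns the files that were created.
    """
    # Folders from the repository tree.
    for folder in ("docs/assets", ".github/workflows"):
        (root / folder).mkdir(parents=True, exist_ok=True)

    # Files from the repository tree, created empty.
    created = []
    for name in (
        "README.md",
        "mkdocs.yml",
        "requirements.txt",
        "docs/index.md",
        ".github/workflows/ghpages.yml",
    ):
        path = root / name
        if not path.exists():
            path.touch()
            created.append(path)
    return created


if __name__ == "__main__":
    # "my-website" is a placeholder folder name.
    for path in scaffold_site(Path("my-website")):
        print("created", path)
```

Running the script a second time creates nothing new, so it should be safe to re-run after you have started editing the files.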

There are 2 ways of doing this exercise:

  • Direction A: Forking or importing a pre-existing template.
  • Direction B: Creating the materials by yourself.

The steps for Direction B are described below.

The workflow for this exercise is the following:

  1. Create the repository.
  2. Create the docs folder and populate it with index.md.
  3. Adjust the repository Settings so that GitHub Actions has the correct permissions.
  4. Create a requirements.txt (used by GitHub Actions to build the website).
  5. Create the mkdocs.yml (used by MkDocs to create the structure).
  6. Create a GitHub workflow file in .github/workflows.
  7. Adjust the Settings to deploy the website from the newly created branch.
  8. Edit pages in your own time.

All the code is available on this page or on the HackMD. Ideally, all you need to do is copy and paste it!

Prerequisites

You will need the following if you want to work on your website locally.

Create a GitHub account

Navigate to the GitHub website and click Sign Up, and follow the on screen instructions.

Additionally, you can choose between Generating a Personal Access Token or using SSH keys. This is useful if you want to work locally and push your changes to GitHub. We are going to cover this further in next week's lesson on Version Control.

Choice A: Generate a Personal Access Token

You can follow the official documentation on how to generate Tokens here. We discussed how to generate tokens in Week 0. Here are quick steps you can follow in order to set up your account on your machine using tokens:

  1. On your computer:
    1. Clone your repository (git clone <repository>)
    2. Make changes where necessary, then add (git add <changed files>), commit (git commit -m "<message on changes>") and push your changes (git push origin).
    3. You should be prompted to log in to your GitHub account. Enter your username or email, but not your password. Instead, open your web browser and follow the steps below:
  2. On GitHub:
    1. Navigate to your GitHub Settings (You can access your account Settings from the drop down menu where your account icon is, on the top right of the screen)
    2. Scroll to the bottom of the left hand side menu to find Developer settings and open it.
    3. Click Personal access tokens > Tokens (classic)
    4. Click Generate new token > Generate new token (classic). You might need to enter your authentication code if you have enabled 2FA.
    5. Give it a name and select the scopes you require (tip: selecting all scopes and No Expiration is the simplest option, though granting only the scopes you need and setting an expiration date is safer), then click Generate Token. Copy the newly generated token.
  3. Back on your computer:
    1. If you have been following the steps above, you should still be in your shell, with GitHub still asking for your password.
    2. Paste your token here, and you should be logged in. Your changes should then be saved to GitHub.
Choice B: Connecting via SSH

Connecting your computer to GitHub using an SSH key is quicker (and probably less confusing).

As a setup step, check whether your computer is already connected to GitHub by running ssh -T git@github.com. If the response message is "git@github.com: Permission denied (publickey).", your computer is not yet linked with GitHub. To link your computer to GitHub, do the following:

  1. Generate an SSH key of the type you prefer: ssh-keygen -t ed25519 -C <your github email>. This command generates an SSH key using the Ed25519 algorithm (harder to crack!) and adds your email as a comment (-C, which helps identify the user adding the key). You will then be asked where you'd like to save the key and whether you'd like to add a passphrase for protection; unless you want to save it elsewhere, feel free to use the default options. Upon completion you should see something like this:
    Your identification has been saved in /c/Users/<user>/.ssh/id_ed25519
    Your public key has been saved in /c/Users/<user>/.ssh/id_ed25519.pub
    The key fingerprint is:
    SHA256:SMSPIStNyA00KPxuYu94KpZgRAYjgt9g4BA4kFy3g1o <your github email>
    The key's randomart image is:
    +--[ED25519 256]--+
    |^B== o.          |
    |%*=.*.+          |
    |+=.E =.+         |
    | .=.+.o..        |
    |....  . S        |
    |.+ o             |
    |+ =              |
    |.o.o             |
    |oo+.             |
    +----[SHA256]-----+
    
  2. After generating the SSH key, copy the public key. You can display it by running cat ~/.ssh/id_ed25519.pub.
  3. In GitHub, go to your settings: click your account icon on the top right, and from the drop-down menu, select Settings and then SSH and GPG keys. Here, click on New SSH Key, where you can then paste the newly generated key. Add a name reflecting your machine and save changes.

Optional: if you want to check whether you successfully linked your computer to GitHub, run ssh -T git@github.com. You should receive the following message: Hi <username>! You've successfully authenticated, but GitHub does not provide shell access.
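
The two responses described above can also be checked from a script. The following is a minimal sketch, assuming the ssh client is installed and on your PATH. The function names interpret_ssh_check and check_github_ssh are hypothetical. The sketch runs the same ssh -T git@github.com test and reports whether the reply looks like the success message or the "Permission denied" message.

```python
import subprocess


def interpret_ssh_check(output: str) -> str:
    """Classify the text printed by `ssh -T git@github.com`."""
    if "successfully authenticated" in output:
        return "linked"
    if "Permission denied (publickey)" in output:
        return "not linked"
    return "unknown"


def check_github_ssh() -> str:
    """Run the SSH test and interpret whatever it prints."""
    try:
        result = subprocess.run(
            ["ssh", "-T", "git@github.com"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing or did not answer in time.
        return "unknown"
    # The message may arrive on either stream, so inspect both.
    return interpret_ssh_check(result.stdout + result.stderr)


if __name__ == "__main__":
    print(check_github_ssh())
```

A result of "unknown" covers everything else, such as no network connection or a first-time prompt to confirm GitHub's host key, and is best investigated by running the ssh command by hand.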

  1. Create your own repository
    • Add a README and a license and keep the repository public
  2. Create a docs folder
    • Within the folder, create an index.md file
  3. Navigate to Settings > Actions > General:
    • Under Actions permissions select Allow all actions and reusable workflows
    • Under Workflow permissions select Read and write permissions and Allow GitHub Actions to create and approve pull requests
  4. Create a requirements.txt file and populate it with the following requirement list:

    bump2version
    coverage
    flake8
    grip
    ipykernel
    livereload
    nbconvert>=7
    pip
    sphinx
    tox
    twine
    watchdog
    wheel
    mkdocs-git-revision-date-plugin 
    mkdocs-jupyter 
    mkdocs-material 
    mkdocs-pdf-export-plugin
    mkdocstrings 
    mkdocstrings-crystal
    mkdocstrings-python-legacy
    #pygments>=2.10,<2.12
    #pymdown-extensions<9.4
    
    # Requirements for core
    jinja2>=3.0.2
    markdown>=3.2
    mkdocs>=1.4.0
    mkdocs-material-extensions>=1.0.3
    pygments>=2.12
    pymdown-extensions>=9.4
    
    # Requirements for plugins
    requests>=2.26
    
  5. Create an mkdocs.yml file and populate it with the following:

    site_name: Name of your website
    site_description: Tell people what this website is about
    site_author: Who you are
    # Replace with the address of your own website
    site_url: 'https://foss.cyverse.org'
    
    # Repository
    repo_name: The repository name
    # Replace with the address of your own repository
    repo_url: 'https://github.com/CyVerse-learning-materials/foss'
    edit_uri: edit/main/docs/
    # Copyright
    copyright: 'Copyright &copy; 2023 - 2024'
    
    
    # Configuration
    # Everything in this section is nested under the theme key.
    theme:
        name: material
        highlightjs: true
        font:
            text: Roboto
            code: Regular

        # Features
        features:
        - navigation.instant
        - navigation.tracking
        - navigation.tabs
        - navigation.tabs.sticky
        - navigation.indexes
        - navigation.top
        - toc.follow

        # 404 page
        static_templates:
            - 404.html

        # Search feature
        include_search_page: false
        search_index_only: true

        # Palette and theme (uses personalized colours)
        # A single palette block, so no key is defined twice.
        language: en
        palette:
            scheme: default
            primary: custom
            accent: custom
        icon:
            logo: material/cogs
            favicon: material/cogs
    
    # Page tree
    nav:
    - Home: index.md
    
    # Extra Plugins
    plugins:
        - search
        - mkdocstrings
        - git-revision-date
        - mkdocs-jupyter:
            include_source: True
            ignore_h1_titles: True
    
    # Extensions (leave as is)
    markdown_extensions:
    - admonition
    - abbr
    - attr_list
    - def_list
    - footnotes
    - meta
    - md_in_html
    - toc:
        permalink: true
        title: On this page
    - pymdownx.arithmatex:
        generic: true
    - pymdownx.betterem:
        smart_enable: all
    - pymdownx.caret
    - pymdownx.critic
    - pymdownx.details
    - pymdownx.emoji:
        emoji_index: !!python/name:materialx.emoji.twemoji
        emoji_generator: !!python/name:materialx.emoji.to_svg
    - pymdownx.highlight
    - pymdownx.inlinehilite
    - pymdownx.keys
    - pymdownx.magiclink:
        repo_url_shorthand: true
        user: squidfunk
        repo: mkdocs-material
    - pymdownx.mark
    - pymdownx.smartsymbols
    - pymdownx.superfences:
        custom_fences:
            # name, class and format all belong to the same list item
            - name: mermaid
              class: mermaid
              format: !!python/name:pymdownx.superfences.fence_code_format
    - pymdownx.tabbed
    - pymdownx.tasklist:
        custom_checkbox: true
    - pymdownx.tilde
    
  6. Create a .github/workflows folder and add a ghpages.yml with the following:

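    name: Publish docs via GitHub
    # Run the workflow on every push to the main branch
    on:
        push:
            branches:
                - main

    jobs:
        build:
            name: Deploy docs
            runs-on: ubuntu-latest
            steps:
                # Get a copy of the repository
                - uses: actions/checkout@v3
                # Install Python
                - uses: actions/setup-python@v4
                  with:
                      python-version: '3.9'
                # Install MkDocs and the other requirements
                - name: run requirements file
                  run: pip install -r requirements.txt
                # Build the website and push it to the gh-pages branch
                - name: Deploy docs
                  run: mkdocs gh-deploy --force
                  env:
                      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}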
    
  7. Navigate to Settings > Pages and make sure that Source is Deploy from a branch and Branch is gh-pages, /(root).

    • You should be able to access your website at https://<github-username>.github.io/<repository-name>/ (or at https://<github-username>.github.io/ if the repository itself is named <github-username>.github.io). If you cannot find your website, go to the repository's settings page and navigate to Pages: your website address will be there.
  8. Edit documents as necessary.
    • Don't forget to add, commit and push changes!
    • Changes will only be visible on the website after a successful push.
    • After each push, next to the commit identifier GitHub will show either a yellow circle (🟡, meaning building), green check (✅, meaning success), or red cross (❌, meaning failure).
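
The website address mentioned in step 7 follows a predictable pattern, which the short sketch below spells out. It assumes the default GitHub Pages setup with no custom domain. The function name pages_url and the example names octocat and my-website are hypothetical.

```python
def pages_url(username: str, repository: str) -> str:
    """Return the default GitHub Pages address for a repository."""
    user_site = f"{username}.github.io"
    # A repository named <username>.github.io is served from the domain root.
    if repository.lower() == user_site.lower():
        return f"https://{user_site}/"
    # Any other repository is served from a sub-path named after it.
    return f"https://{user_site}/{repository}/"


if __name__ == "__main__":
    print(pages_url("octocat", "my-website"))
```

If the address built this way does not load, the Pages section of the repository settings remains the authoritative place to look it up.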

Further Documentation

Here are some guides that you may find useful:


Self-Paced Material

GitHub Pages Website Quickstarts