4. Excel2SBOL Module and Repository Architecture

Taisiia Sherstiukova edited this page Apr 16, 2025 · 14 revisions

Repository Architecture

This repository contains the excel2sbol module, resources to use it (such as templates), and the tests for all of the functions it contains.

Repository Automation and Secrets

Repository Secrets

  • PYPI_USERNAME
  • PYPI_PASSWORD

GitHub Actions

  • linting: runs on pull request to test flake8 compliance of the code
  • testing: runs on pull request to run the suite of pytest tests
  • python-publish: runs on release creation to push a new package to PyPI

File Structure

  • Home
    • excel2sbol: main project folder
      • *.py: main project code
      • tests: pytest
        • test_files: data files used for testing
        • test_*.py: files containing pytest code
      • resources:
        • templates: Excel templates for use with the excel2sbol converter
        • taxonomy_scrapers: files related to creating the Excel ontologies used in the converter templates
      • README: required for packaging of the excel2sbol library
      • setup.py: Used for pip installation of the package
    • images: contains images for the README
    • .github: contains issue templates and GitHub Actions
      • workflows: GitHub Actions
        • linting: runs on pull request to test flake8 compliance of the code
        • python-publish: runs on release creation to push a new package to PyPI
        • testing: runs on pull request to run the suite of pytest tests
      • ISSUE_TEMPLATE: bug_report, documentation_issue, and feature_request
    • requirements.txt: Python dependencies
    • README.md: creates the quick guide on GitHub
    • LICENSE: BSD-3-Clause
    • .gitignore: used for GitHub syncing

Module Architecture

The Excel-to-SBOL template is split into two parts:

  1. Welcome Page (e.g. Collection Name, Date Created, and Authors)
  2. Part table: The table of parts provided

[Figure: Excel Sheet Parts]

The part table is the more complex of the two, as processing it requires the column_definitions sheet and the other ontology sheets.

The architecture is:

  • converter.py
    • Function: converter processes all the parts individually, and outputs an SBOL file.
    • Dependency: compiler.py
  • compiler.py
    • Function:
      • initialise_welcome reads the "Welcome" sheet from the Excel file, parses metadata (e.g. author, description), and returns it as a dictionary.
      • initialise processes the "Init" sheet to determine the SBOL version, the homespace, and which sheets to convert. For each convertible sheet, it extracts the collection info, descriptions, and the library table.
      • parse_objects creates and initializes SBOL2 objects from the sheets listed in to_convert, using metadata from the Excel file.
      • parse_objects3 does the same as parse_objects, but for SBOL3.
    • Dependency: helpers.py, lookup_compiler.py, comp_column_functions2.py
  • comp_column_functions2.py
    • Class: rowobj contains all relevant information and context for processing a single row from a spreadsheet.
    • Class: switch1 modifies rowobj attributes to construct or update SBOL objects from spreadsheet data.
    • Dependency: library2.py, library3.py
  • helpers.py
    • Function: col_to_num converts Excel column names like AA to zero-indexed numbers like 26.
    • Function: check_name ensures that a string is alphanumeric and contains no special characters (including spaces) apart from '_'.
    • Function: truthy_strings converts several different kinds of input (e.g. 'True', True, 1, '1', 'tRue') to the boolean True or False.
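
The behaviour described for the three helpers can be illustrated with a minimal sketch. This is not the code in helpers.py: the function bodies are assumptions written from the descriptions above. In particular, replacing disallowed characters with '_' in check_name, and treating every input other than the listed truthy forms as False in truthy_strings, are guesses about details the descriptions leave open.

```python
import re


def col_to_num(col_name: str) -> int:
    """Convert an Excel column name (e.g. 'AA') to a zero-indexed number."""
    if not col_name or not col_name.isascii() or not col_name.isalpha():
        raise ValueError(f"not a valid column name: {col_name!r}")
    num = 0
    for ch in col_name.upper():
        # Treat the letters as base-26 digits where A=1 ... Z=26.
        num = num * 26 + (ord(ch) - ord("A") + 1)
    return num - 1  # shift from one-indexed to zero-indexed


def check_name(name: str) -> str:
    """Return a name containing only letters, digits, and '_'.

    Assumption: disallowed characters (including spaces) are replaced
    with '_'. The real helper may handle them differently.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def truthy_strings(value) -> bool:
    """Map inputs such as 'True', True, 1, '1', 'tRue' to True.

    Assumption: anything else is treated as False.
    """
    if isinstance(value, str):
        return value.strip().lower() in {"true", "1"}
    return bool(value == 1)  # covers True and 1


if __name__ == "__main__":
    print(col_to_num("AA"))
    print(check_name("my part-1"))
    print(truthy_strings("tRue"))
```

Keeping these helpers free of any SBOL or Excel library imports, as sketched here, makes them easy to cover with small pytest cases.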

Dependency structure

This graphic shows the dependency structure of the different functions used in the module. The arrows indicate dependency and the colours indicate the file they can be found in (see the key in the bottom right-hand corner).

[Figure: Dependency Structure]
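
As a rough companion to the graphic, the file-level dependencies listed under "Module Architecture" can be written down as a small graph and ordered so that each file appears after everything it depends on. This is only a sketch: the DEPENDENCIES mapping is transcribed from the bullet list above (it covers files, not the individual functions shown in the graphic), and load_order is a hypothetical helper name. It relies on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# File-level dependencies as listed in the "Module Architecture" section.
# Each key depends on every file in its value set.
DEPENDENCIES = {
    "converter.py": {"compiler.py"},
    "compiler.py": {
        "helpers.py",
        "lookup_compiler.py",
        "comp_column_functions2.py",
    },
    "comp_column_functions2.py": {"library2.py", "library3.py"},
}


def load_order(deps):
    """Return the files ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for filename in load_order(DEPENDENCIES):
        print(filename)
```

Reading the result from the end gives the entry point first: converter.py comes last because it depends, directly or indirectly, on every other file in the list.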