Introduction

What is SONG?

SONG is a robust metadata and validation system used to quickly and reliably track genome metadata scattered across multiple cloud storage systems. In genomics and bioinformatics, metadata managed with simple solutions such as spreadsheets and text files requires significant time and effort to maintain and keep reliable. With several users and thousands of genomic files, tracking the state of the metadata and its associations can become a nightmare. SONG minimizes human intervention by imposing rules and structure on user uploads, which produces high-quality, reliable metadata with minimal effort. SONG is one of many products provided by Overture and is completely open source and free for everyone to use.

See also

For additional information on other products in the Overture stack, please visit https://overture.bio


Features

  • Synchronous and asynchronous metadata validation using JSON Schema
  • Strictly enforced data relationships and fields
  • Optional schema-less JSON info fields for user-specific metadata
  • Standard REST API that is easy to understand and work with
  • Simple and fast metadata searching
  • Export payloads for SONG mirroring
  • Clear and concise error handling
  • ACL security using OAuth2 and scopes based on study codes
  • Unifies metadata with object data stored in SCORE
  • Built-in Swagger UI for API interaction
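
To make the combination of strictly enforced fields and a schema-less info field more concrete, here is a minimal sketch in Python. It is a simplified required-fields-and-types check that stands in for full JSON Schema validation; it is not SONG's validator. The field names (studyId, analysisType, files) and the function validate_payload are assumptions chosen for illustration and are not taken from SONG's actual schemas.

```python
import json

# Hypothetical rules: these field names and types are illustrative only
# and are NOT SONG's real schema.
REQUIRED_FIELDS = {"studyId": str, "analysisType": str, "files": list}
# Name of the free-form field; its contents are not checked.
FREE_FORM_FIELD = "info"


def validate_payload(payload):
    """Return a list of error messages; an empty list means the payload passed."""
    errors = []
    # Strictly enforced fields: each must be present and of the expected type.
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"field {name} must be of type {expected.__name__}")
    # Anything outside the known fields is rejected.
    allowed = set(REQUIRED_FIELDS) | {FREE_FORM_FIELD}
    for name in payload:
        if name not in allowed:
            errors.append(f"unexpected field: {name}")
    # The free-form field is optional, but when present it must be a JSON object.
    if FREE_FORM_FIELD in payload and not isinstance(payload[FREE_FORM_FIELD], dict):
        errors.append(f"{FREE_FORM_FIELD} must be a JSON object")
    return errors


good = json.loads(
    '{"studyId": "ABC123", "analysisType": "example", "files": [],'
    ' "info": {"anything": ["goes", "here"]}}'
)
bad = json.loads('{"studyId": 42, "files": [], "extra": true}')

good_errors = validate_payload(good)
bad_errors = validate_payload(bad)
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a payload at once, which is one way to keep error handling clear for the person submitting.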

Data Submission Workflow

The data submission workflow can be separated into 4 main stages:

  1. Metadata Upload (SONG)
  2. Metadata Saving (SONG)
  3. Object Data Upload (SCORE)
  4. Publishing Metadata (SONG)

The following diagram summarizes the steps involved in a successful data submission using SONG and SCORE:

_images/song-workflow.svg
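
The ordering of the four stages can also be sketched in code. The following Python example models a submission that must pass through the stages in sequence. The Stage enum and the Submission class are hypothetical helpers written for this illustration; they are not part of SONG, SCORE, or their clients, and only the stage names and the service responsible for each stage come from the list above.

```python
from enum import Enum


class Stage(Enum):
    # Each member carries a display label and the service that handles it.
    METADATA_UPLOAD = ("Metadata Upload", "SONG")
    METADATA_SAVING = ("Metadata Saving", "SONG")
    OBJECT_DATA_UPLOAD = ("Object Data Upload", "SCORE")
    PUBLISHING_METADATA = ("Publishing Metadata", "SONG")

    def __init__(self, label, service):
        self.label = label
        self.service = service


# Enum members iterate in definition order, which matches the workflow order.
ORDER = list(Stage)


class Submission:
    """Tracks which stages a hypothetical submission has completed."""

    def __init__(self):
        self.completed = []

    def complete(self, stage):
        # The only acceptable stage is the next one in the workflow order.
        if len(self.completed) == len(ORDER):
            raise ValueError("submission is already published")
        expected = ORDER[len(self.completed)]
        if stage is not expected:
            raise ValueError(
                f"expected '{expected.label}' next, got '{stage.label}'"
            )
        self.completed.append(stage)

    @property
    def is_published(self):
        return len(self.completed) == len(ORDER)


submission = Submission()
for stage in Stage:
    submission.complete(stage)
```

Attempting to complete a stage out of order, for example publishing before the object data has been uploaded, raises a ValueError in this sketch.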

Projects Using SONG

_images/song_projects_static_map.png

Getting Started

The easiest way to understand SONG is to simply use it! Below is a short list of different ways to start interacting with SONG.

Tutorial using a CLI with Docker for SONG

The Docker for SONG tutorial is a great way to spin up SONG and all its dependent services using Docker on your host machine. Use this if you want to play with SONG locally. Refer to the Docker for SONG documentation.

Tutorial using the Python SDK with SONG

The SONG Python SDK is a Python client module used to interact with a running SONG server, and the SONG Python SDK Tutorial shows how to use it. Use it with one of the Projects Using SONG, or in combination with Docker for SONG. For more information about the Python SDK, refer to the SONG Python SDK documentation.

Play with the REST API from your browser

If you want to play with SONG from your browser, simply visit the Swagger UI for each server:

  1. Cancer Collaboratory - Toronto: https://song.cancercollaboratory.org/swagger-ui.html
  2. AWS - Virginia: https://virginia.song.icgc.org/swagger-ui.html
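
If you would rather check from a script whether a Swagger UI page is reachable before opening it, the following Python sketch may help. It uses only the standard library and the two server addresses listed above. The helper names swagger_ui_url, is_reachable, and report are hypothetical, and the sketch assumes that a server which is up answers the Swagger UI page with HTTP status 200; the servers themselves may not be available at any given time.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen

# Base addresses taken from the list above.
SERVERS = {
    "Cancer Collaboratory - Toronto": "https://song.cancercollaboratory.org",
    "AWS - Virginia": "https://virginia.song.icgc.org",
}


def swagger_ui_url(base_url):
    """Build the Swagger UI address for a SONG server base URL."""
    return urljoin(base_url.rstrip("/") + "/", "swagger-ui.html")


def is_reachable(url, timeout=5.0):
    """Return True if the page answers with HTTP 200, False on any failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError, ValueError):
        # Covers HTTP errors, DNS failures, timeouts, and malformed URLs.
        return False


def report():
    """Print one line per server; this performs network requests."""
    for name, base in SERVERS.items():
        url = swagger_ui_url(base)
        state = "reachable" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} is {state}")
```

Calling report() prints the status of each server. Nothing is sent over the network until report() or is_reachable() is called.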

See also

For more information about user access, refer to the User Access documentation.

Deploy SONG to Production

If you want to deploy SONG onto a server, refer to the Deploying a SONG Server in Production documentation.

License

Copyright (c) 2018. Ontario Institute for Cancer Research

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.