AskOmics

AskOmics is a visual SPARQL query interface that supports both intuitive data integration and querying, while shielding the user from most of the technical difficulties underlying RDF and SPARQL.

Askomics Homepage

Deployment

User

Dependencies

AskOmics needs the Virtuoso triplestore to work.

Either compile Virtuoso from source, or install it via Docker:

docker pull askomics/virtuoso

docker run --name my-virtuoso \
    -p 8890:8890 -p 1111:1111 \
    -e SPARQL_UPDATE=true \
    -v /tmp/virtuoso_data:/data \
    -d askomics/virtuoso

Replace /tmp/virtuoso_data with a directory of your choice.

Virtuoso is now available at localhost:8890.
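
To check from a script that the triplestore answers, one option is to send a trivial SPARQL query to its HTTP endpoint. The sketch below assumes Virtuoso's default /sparql endpoint path on the port published above; the function names are illustrative and are not part of AskOmics.

```python
import urllib.parse
import urllib.request


def sparql_ask_url(host="localhost", port=8890, query="ASK { ?s ?p ?o }"):
    """Build the URL of a SPARQL query sent to the (assumed) /sparql endpoint."""
    params = urllib.parse.urlencode({"query": query})
    return "http://{}:{}/sparql?{}".format(host, port, params)


def endpoint_is_up(url, timeout=5):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses
        return False
```

Calling endpoint_is_up(sparql_ask_url()) on the Docker host should return True once the container has finished starting.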

Manual installation

Dependencies

The installation needs some system dependencies:

Ubuntu 18.04

sudo apt update
sudo apt install -y git python3 python3-venv python3-dev zlib1g-dev libsasl2-dev libldap2-dev npm

Fedora 28/29

sudo dnf install -y git gcc gcc-c++ redhat-rpm-config zlib-devel bzip2 python3-devel openldap-devel npm

Installation

Clone the AskOmics repository and check out the latest version:

git clone https://github.com/askomics/askomics.git
cd askomics
# check out the latest version (the most recent tag)
git checkout $(git describe --abbrev=0 --tags)

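If you installed Virtuoso via Docker, you have to tell AskOmics that its load URL is not localhost:6543 but another IP address, because a Docker container cannot reach the host through http://localhost.

Run the following command to get the address of the host as seen from the Virtuoso container: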

docker exec my-virtuoso netstat -nr | grep '^0\.0\.0\.0' | awk '{print $2}'

and add

askomics.load_url=http://xxx.xx.x.x:6543

into configs/production.virtuoso.ini and configs/development.virtuoso.ini (replace xxx.xx.x.x with the IP address obtained).
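
The shell pipeline above keeps the routing-table line whose destination is 0.0.0.0 and prints its second field, the gateway. The same logic can be sketched in Python as follows; the routing table below is made-up sample text using a documentation-only address range, and the helper names are hypothetical.

```python
def default_gateway(routing_table):
    """Return the gateway of the first default route (destination 0.0.0.0), or None."""
    for line in routing_table.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "0.0.0.0":
            return fields[1]
    return None


def load_url_line(gateway, port=6543):
    """Format the configuration line expected in the .ini files."""
    return "askomics.load_url=http://{}:{}".format(gateway, port)


# Made-up `netstat -nr` output, for illustration only
SAMPLE_ROUTING_TABLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.0.2.1       0.0.0.0         UG        0 0          0 eth0
192.0.2.0       0.0.0.0         255.255.255.0   U         0 0          0 eth0
"""

print(load_url_line(default_gateway(SAMPLE_ROUTING_TABLE)))
```

Unlike the grep and awk version, this sketch stops at the first default route instead of printing every match.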

Install and run

./startAskomics.sh -d prod -t virtuoso

AskOmics is available at localhost:6543

Upgrade

Check out the latest version in the AskOmics git directory.

git checkout $(git describe --abbrev=0 --tags)

Installation with docker

Pull the latest stable version of AskOmics

docker pull askomics/askomics

Run

docker run -p 6543:6543 askomics/askomics

AskOmics is available at localhost:6543

Upgrade with

docker pull askomics/askomics

Installation with docker-compose

Clone the askomics-docker-compose repository

git clone https://github.com/askomics/askomics-docker-compose

Choose which services you need and start them with the docker-compose command. For example, if you need AskOmics and Virtuoso:

cd askomics-docker-compose/virtuoso
docker-compose up -d

AskOmics is available at localhost/askomics

Upgrade with

# Stop dockers
docker-compose down
# upgrade the repo
git pull
# upgrade dockers
docker-compose pull
# start AskOmics
docker-compose up -d

Developer

Fork the AskOmics repository, then clone your fork:

git clone https://github.com/USERNAME/askomics.git # replace USERNAME with your github username

Install AskOmics as described in the manual installation section, then run it in dev mode:

./startAskomics.sh -d dev -t virtuoso

AskOmics is available at localhost:6543

AskOmics tutorials

User account

Account creation

To use AskOmics, you will need an account. Go to the sign-up page by clicking on the login icon.

_images/buttons1.png

Then, click on the “sign up” link:

_images/login.png

Fill in the form with the requested information.

Account management

To manage your account, use the account management icon.

_images/account_management_tab.png

Update information

This section allows you to change your email address and your password.

API key

Your API key allows third-party applications (like Galaxy) to access AskOmics programmatically without revealing your personal password.

When you renew your API key, the previous one stops working.

Galaxy account

Link a Galaxy account to load Galaxy datasets into AskOmics.

Account deletion

Account deletion is permanent: all your information and all your data will be deleted, and this cannot be undone.

Use case 1: Gene expression

All files needed for the tutorial are available here

3 files are provided:

  • gene.tsv: Gene locations on a genome
  • orthogroup.tsv: Groups of ortholog genes
  • differential_expression.tsv: Results of differential expression analysis

Files organization

AskOmics takes CSV (Comma-Separated Values) files as input, but these files must follow a specific structure.

A CSV file describes an entity. The entity name is the header of the first column of the CSV file (e.g. the entity name of the file gene.tsv is Gene).

Other column headers describe the entity attributes and relations:

  • An attribute is a simple column in the CSV file. For example, Gene has five attributes: organism, chromosome, strand, start and end.
  • A relation links an entity to another one. It is described by a header of the form relation_name@entity. In the orthogroup.tsv file, the Orthogroup entity has a concerns relation that targets the Gene entity.
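
The header convention described above can be sketched as a small parser. This is only an illustration of the rules (first column is the entity, name@Entity is a relation, anything else is an attribute), not the AskOmics implementation; the function names, the tab delimiter and the column order of the sample headers are assumptions.

```python
import csv
import io


def read_header(text, delimiter="\t"):
    """Return the first row of a delimited file as a list of column headers."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))


def describe_header(header):
    """Split a header row into the entity name, its attributes and its relations."""
    attributes = []
    relations = []
    for column in header[1:]:
        if "@" in column:
            # relation_name@entity: a link towards another entity
            name, target = column.split("@", 1)
            relations.append((name, target))
        else:
            attributes.append(column)
    return {"entity": header[0], "attributes": attributes, "relations": relations}


# Sample headers reconstructed from the description above (column order assumed)
gene = describe_header(read_header("Gene\torganism\tchromosome\tstrand\tstart\tend\n"))
orthogroup = describe_header(read_header("Orthogroup\tconcerns@Gene\n"))
print(gene["entity"], gene["attributes"])
print(orthogroup["entity"], orthogroup["relations"])
```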

Uploading files

The first step is to upload your CSV files into AskOmics. Click on the upload icon to go to the upload page.

_images/upload_tab.png

On the upload page, use the Upload button to add the three files to the upload queue. Then, start uploading the files.

The CSV files are now uploaded to AskOmics.

Integrating files

On the upload page, select the Gene file and click on the Integrate button. AskOmics shows an overview of the file.

_images/gene_tsv.png

  1. Columns disabler: uncheck columns to ignore them (their content will not be loaded at all)
  2. Header updater: optionally update entity or attribute names
  3. Key columns: check several columns to create a new column by concatenating them
  4. Entity type: choose between simple entity and entity start (default). An entity start is displayed on the startpoint page.
  5. Attribute types: select the type of each attribute (see below)
  6. Custom URI: update the attribute URIs (advanced feature)

Attributes can be one of the following types:

  • Attributes
    • Numeric
    • Text
    • Category
    • Date/time
  • Positionable attributes
    • Taxon
    • Chromosome
    • Strand
    • Start
    • End
  • Relation
    • General relation to entity
    • Symmetric relation to entity

Types are automatically detected by AskOmics, but you can override them if needed. Depending on the type you choose, different options will be available in the query builder.

You can then integrate the 2 remaining files.

Interrogating datasets

Once you have integrated all the datasets, it’s time to query them.

Click on the Ask icon

_images/ask_tab.png

This page shows the starting points of your query. Select the Gene entity and start a query.

_images/startpoints.png

The query builder is composed of two panels: the left panel, representing entities and their relations, and the right panel, representing attributes of the selected entity.

_images/query_builder_gene.png

On the left panel, the Gene entity is selected. Two transparent nodes are shown: Orthogroup and DE. These two nodes are proposed, but not instantiated.

On the right panel, the attributes of Gene are displayed in attribute cells.

Simple query

Click on the Launch query button to run the query. This leads to the query section of the job page. Click on the query to display a preview of the results.

The results show all the gene URIs present in the triplestore.

_images/results_1.png

Display attributes

Return to the query builder (Ask tab). Now, we want to display some attributes of the genes.

On the right panel, each attribute has several buttons. Click on the eye button to display an attribute.

_images/organism_visible.png

The eye has 3 states:

  • closed eye: the attribute won’t appear in the results
  • open eye: the attribute will appear in the results
  • question mark: show the attribute, even if there is no value

Show the organism, start and end attributes, then launch the query.

Results show all the genes with their organism, start and end.

_images/results_2.png

Filter on attributes

Attributes can be filtered in different ways depending on their type (numeric, categorical or text).

Text

Go back to the query builder. To filter on a text attribute, enter some text in its field.

_images/filter_label.png

Here, we ask for all entities that match exactly the string AT001. This query will return one result.

You can also use a regular expression filter by clicking on the A icon (this will change the icon into a funnel).

_images/regexp_filter.png

We ask for all genes whose label contains the AT string. This will return 5 results.
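
The difference between the two text filters can be illustrated outside AskOmics. The sketch below only mimics their semantics (exact equality versus a regular expression searched anywhere in the value) on a hypothetical list of labels; AskOmics itself runs the filters against the triplestore, and whether its regular expressions are case-sensitive is not covered here.

```python
import re


def filter_exact(labels, value):
    """Keep the labels that are strictly equal to the given string."""
    return [label for label in labels if label == value]


def filter_regex(labels, pattern):
    """Keep the labels in which the regular expression matches somewhere."""
    compiled = re.compile(pattern)
    return [label for label in labels if compiled.search(label)]


# Hypothetical labels, not the content of the tutorial files
LABELS = ["AT001", "AT002", "BN001"]

print(filter_exact(LABELS, "AT001"))
print(filter_regex(LABELS, "AT"))
```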

Numeric

Go back to the query builder and reset the label filter by clicking on the rubber icon.

Filter the start attribute to get all genes with a start position greater than 6000.

_images/num_filter.png

3 genes are returned.

Category

Attributes of type Category have a limited number of text values. Here, strand, chromosome and taxon are categories.

On the query builder, filter on the organism to get all Arabidopsis thaliana genes.

_images/organism_filter.png

5 genes are returned.

Other filtering features

Some other filtering functionalities are common to all the attributes:

  • Negation: use the + icon to negate a filter (e.g. to find values different from the one you entered)
  • Cancel filter: use the rubber icon to reset the attribute filtering
  • Link: use the chain icon to link an attribute to the same attribute on another node
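
As a rough mental model, numeric and category filters, together with negation, behave like a comparison applied to each row and optionally inverted. The sketch below works on a hypothetical in-memory table; the set of comparators, the function name and the gene rows are assumptions made for the example, not the operators or data of AskOmics.

```python
import operator

# Comparators assumed for the illustration
COMPARATORS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def keep(rows, attribute, comparator, value, negate=False):
    """Keep rows whose attribute satisfies the comparison, or fails it when negated."""
    compare = COMPARATORS[comparator]
    return [row for row in rows if compare(row[attribute], value) != negate]


# Hypothetical rows, not the content of gene.tsv
GENES = [
    {"label": "gene_a", "organism": "Arabidopsis thaliana", "start": 1000},
    {"label": "gene_b", "organism": "Arabidopsis thaliana", "start": 7500},
    {"label": "gene_c", "organism": "Brassica napus", "start": 9000},
]

late_genes = keep(GENES, "start", ">", 6000)
not_arabidopsis = keep(GENES, "organism", "=", "Arabidopsis thaliana", negate=True)
print([gene["label"] for gene in late_genes])
print([gene["label"] for gene in not_arabidopsis])
```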

Saving a query state

When you are happy with a query, you can save it for future reuse. On the query builder page, use Files > Save Query to save the query state to your computer. The downloaded file represents the state of the query.

_images/save_query.png

Later, on the ask page, you can upload this query file to work on your query again.

Download the results

The job page only shows you a preview of the results. To download the full results, click on Save to download the complete CSV file.

Use AskOmics with Galaxy

Galaxy is an open source, web-based platform for data intensive biomedical research. You can integrate Galaxy datasets into AskOmics by linking a Galaxy account to your AskOmics account.

Upload Galaxy datasets into AskOmics

Once a Galaxy account is linked, you can upload Galaxy datasets from the upload page with the Get from Galaxy button.

_images/upload_galaxy.png

Save a query into Galaxy history

On the query builder page, you can save a query state into a Galaxy history. On the ask page, you can also start a query from a state saved in Galaxy.

_images/save_query_galaxy.png

Save query results into Galaxy history

Results can be sent to Galaxy from the job page. Use the Send to Galaxy button.

_images/send_result_galaxy.png

Abstraction

Definition

What we call the abstraction is the AskOmics ontology: it describes the data. It is quite small and defines what is a bubble and what is a link in the graphical interface. Its prefix is “askomics:”.

  • entity: what will be a bubble, usually an owl:Class
  • startPoint: an entity that can start an AskOmics query; it is displayed on the first query page
  • attribute: what will be a link between two bubbles, or between a bubble and a value
  • category: what will be a choice list, used for some attribute values

Turtle Example

The following example shows the minimal information to provide as an abstraction.

prefixes

@prefix xsd:      <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:      <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:     <http://www.w3.org/2000/01/rdf-schema#> .
@prefix askomics: <askomics_is_good#> .
@base <scrap#> .

entity

# entity (askomics:startPoint makes it a starting point; it can be omitted for a standard entity)
<People>
            askomics:entity "true"^^xsd:boolean ;
            rdfs:label "People"^^xsd:string ;
            askomics:startPoint "true"^^xsd:boolean ;
.

<entity> –relation–> value

# attribute DatatypeProperty
<First_name>
            askomics:attribute "true"^^xsd:boolean ;
            rdf:type    owl:DatatypeProperty ;
            rdfs:label  "First_name"^^xsd:string ;
            rdfs:domain <People> ;
            rdfs:range  xsd:string ;
.

<entity> –relation–> category=short list

# attribute DatatypeProperty
<Sex>
        askomics:attribute "true"^^xsd:boolean ;
        rdf:type    owl:DatatypeProperty ;
        rdfs:label  "Sex"^^xsd:string ;
        rdfs:domain <People> ;
        rdfs:range  <SexCategory> ;
.
<SexCategory>
        askomics:category <M>, <F> ;
.
<M>
    rdfs:label "M"^^xsd:string ;
.
<F>
    rdfs:label "F"^^xsd:string ;
.

<entity> –relation–> <entity>

# attribute ObjectProperty
<PlayWith>
        askomics:attribute "true"^^xsd:boolean ;
        rdf:type    owl:ObjectProperty ;
        rdfs:label  "play with"^^xsd:string ;
        rdfs:domain <People> ;
        rdfs:range  <People> ;
.

The full file is people_mini.abstract.ttl.

Python Management Code

As seen above, there are two kinds of objects: “entity” and “attribute”/relation. To manage them (that is, to get their Turtle strings), we use the two classes AbstractedEntity__ and AbstractedRelation__ in libaskomics/integration.

See the Python documentation for details.

Basic usage

ttl  += AbstractedEntity__( uri, label, startpoint=True ).get_turtle()
ttl  += AbstractedRelation__( uri, rdf_type, domain, range_, label ).get_turtle()
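
To make the idea of "getting Turtle strings" concrete, here is a minimal standalone sketch that renders the same kind of blocks as the Turtle example above. The two functions are hypothetical stand-ins written for this page, not the code of the AskOmics classes; every term is passed already serialized (for example "<People>" or "xsd:string"), and labels are not escaped.

```python
def entity_turtle(uri, label, startpoint=False):
    """Render the Turtle block describing an entity (illustrative helper)."""
    lines = [
        uri,
        '    askomics:entity "true"^^xsd:boolean ;',
        '    rdfs:label "{}"^^xsd:string ;'.format(label),
    ]
    if startpoint:
        lines.append('    askomics:startPoint "true"^^xsd:boolean ;')
    lines.append(".")
    return "\n".join(lines) + "\n"


def relation_turtle(uri, rdf_type, domain, range_, label):
    """Render the Turtle block describing an attribute or relation (illustrative helper)."""
    lines = [
        uri,
        '    askomics:attribute "true"^^xsd:boolean ;',
        "    rdf:type    {} ;".format(rdf_type),
        '    rdfs:label  "{}"^^xsd:string ;'.format(label),
        "    rdfs:domain {} ;".format(domain),
        "    rdfs:range  {} ;".format(range_),
        ".",
    ]
    return "\n".join(lines) + "\n"


ttl = entity_turtle("<People>", "People", startpoint=True)
ttl += relation_turtle("<First_name>", "owl:DatatypeProperty", "<People>", "xsd:string", "First_name")
print(ttl)
```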

Contribute to AskOmics

Issues

If you have an idea for a feature to add or an approach for a bugfix, it is best to communicate with developers early. The most common venue for this is GitHub issues.

Pull requests

All changes to AskOmics should be made through pull requests to this repository.

Fork the askomics repository to your account. To keep your copy up to date, you need to sync your fork frequently:

git remote add upstream https://github.com/askomics/askomics
git fetch upstream
git checkout master
git merge upstream/master

Then, create a new branch for your new feature

git checkout -b my_new_feature

Commit and push your modifications to your fork. If your changes modify code, please ensure that it conforms to the AskOmics coding style.

Write tests for your changes, and make sure that they pass.

Open a pull request against the master branch of askomics. The message of your pull request should describe your modifications (why and how).

The pull request should pass all the continuous integration tests, which are run automatically on GitHub using Travis CI. The coverage must at least remain the same (but it’s better if it increases).

Tests

AskOmics uses nosetests for Python tests.

Dependencies

The tests need some services to run:

  • A Virtuoso instance
  • A Galaxy instance
  • An LDAP server with some entries

You can use the following Docker images:

# Virtuoso
sudo docker run -d --name test_virtuoso -p 127.0.0.1:8890:8890 -p 127.0.0.1:1111:1111  -e DBA_PASSWORD=dba -e SPARQL_UPDATE=true -e DEFAULT_GRAPH=http://localhost:8890/DAV --net="host" -t tenforce/virtuoso
# Galaxy
sudo docker run -d --name galaxy -p 8080:80 -p 8021:21 -p 8022:22 bgruening/galaxy-stable
# LDAP
sudo docker run -d --name simple-ldap -p 9189:389 -e ORGANISATION_NAME="AskoTests" -e SUFFIX="dc=askotest,dc=org" -e ROOT_USER="admin" -e ROOT_PW_CLEAR="askotest" -e FIRST_USER="true" -e USER_UID="jwick" -e USER_GIVEN_NAME="John" -e USER_SURNAME="Wick" -e USER_EMAIL="jwick@askotest.org" -e USER_PW_CLEAR="iamjohnwick" xgaia/simple-ldap

Run tests

Activate the Python virtual environment and run nosetests.

source venv/bin/activate
nosetests

To skip the Galaxy tests, run

nosetests -a '!galaxy'

To run a single test file:

nosetests --tests askomics/test/askView_test.py

The testing configuration is set in the askomics/config/test.virtuoso.ini INI file. You can see there that the Galaxy account API key is admin: the Docker image bgruening/galaxy-stable has a default admin account with this API key. If you use another Galaxy instance, change the URL and the API key.

Coding style guidelines

General

Ensure all user-enterable strings are Unicode-capable. Use only English for everything (code, documentation, logs, comments, …).

Python

We follow PEP-8, with particular emphasis on the parts about knowing when to be inconsistent, and readability being the ultimate goal.

  • Whitespace around operators and inside parentheses
  • 4 spaces per indent, spaces, not tabs
  • Include docstrings on your modules, classes and methods
  • Avoid from module import *. It can cause name collisions that are tedious to track down.
  • Class names should be in CamelCase, method and variable names in lowercase_with_underscore
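
As an illustration, a short module that follows the points listed above could look like this. The class and its content are invented for the example; the spaces inside parentheses follow the list above and the calls shown earlier in this documentation.

```python
"""Helpers describing gene locations (example module docstring)."""


class GeneLocation:
    """Location of a gene on a chromosome."""

    def __init__( self, chromosome, start, end ):
        """Store the chromosome name and the start and end positions."""
        self.chromosome = chromosome
        self.start = start
        self.end = end

    def get_length( self ):
        """Return the number of positions covered, both ends included."""
        return self.end - self.start + 1


gene_location = GeneLocation( "1", 10, 19 )
print( gene_location.get_length() )
```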

Contribute to docs

All the documentation (including what you are reading) can be found here. The source files are in the AskOmics repository.

To preview the docs, run

cd askomics
# source the askomics virtual env
source venv/bin/activate
cd docs
make html

The HTML files are generated in the build directory.

askomics package

Subpackages

askomics.libaskomics package

Subpackages
askomics.libaskomics.integration package
Submodules
askomics.libaskomics.integration.AbstractedEntity module
askomics.libaskomics.integration.AbstractedRelation module
Module contents
askomics.libaskomics.rdfdb package
Submodules
askomics.libaskomics.rdfdb.FederationQueryLauncher module
askomics.libaskomics.rdfdb.MultipleQueryLauncher module
askomics.libaskomics.rdfdb.QueryLauncher module
askomics.libaskomics.rdfdb.SparqlQueryAuth module
askomics.libaskomics.rdfdb.SparqlQueryBuilder module
askomics.libaskomics.rdfdb.SparqlQueryGraph module
askomics.libaskomics.rdfdb.SparqlQueryStats module
Module contents
askomics.libaskomics.source_file package
Submodules
askomics.libaskomics.source_file.SourceFile module
askomics.libaskomics.source_file.SourceFileBed module
askomics.libaskomics.source_file.SourceFileGff module
askomics.libaskomics.source_file.SourceFileTsv module
askomics.libaskomics.source_file.SourceFileTtl module
askomics.libaskomics.source_file.SourceFileURL module
Module contents
Submodules
askomics.libaskomics.DatabaseConnector module
askomics.libaskomics.EndpointManager module
askomics.libaskomics.GalaxyConnector module
askomics.libaskomics.JobManager module
askomics.libaskomics.LdapAuth module
askomics.libaskomics.LocalAuth module
askomics.libaskomics.ParamManager module
askomics.libaskomics.SourceFileConvertor module
askomics.libaskomics.TripleStoreExplorer module
askomics.libaskomics.utils module
Module contents

Submodules

askomics.ask_view module

askomics.upload module

askomics.views module

Module contents