askomics.libaskomics.source_file package

Submodules

askomics.libaskomics.source_file.SourceFile module

Classes to import data from source files

class askomics.libaskomics.source_file.SourceFile.SourceFile(settings, session, path, uri_set=None)

Bases: askomics.libaskomics.ParamManager.ParamManager, askomics.libaskomics.utils.HaveCachedProperties

Class representing a source file.

get_number_of_lines()

Get the number of lines of a tabulated file

Returns:number of lines (int)
get_timestamp()

Return the timestamp (used in the tests)

insert_metadatas(accessL)

Insert the metadata into the parent graph

load_data_from_file(fp, urlbase)

Load a locally created ttl file in the triplestore using http (with load_data(url)) or with the filename for Fuseki (with fuseki_load_data(fp.name)).

Parameters:
  • fp – a file handle for the file to load
  • urlbase – the base URL of the current askomics instance. It is used to let triple stores access some askomics temporary ttl files using http.
Returns:a dictionary with information on the success or failure of the operation

persist(urlbase, public)

Store the current source file in the triple store

Parameters:urlbase – the base URL of the current askomics instance. It is used to let triple stores access some askomics temporary ttl files using http.
Returns:a dictionary with information on the success or failure of the operation
Return type:Dict
setGraph(graph)
exception askomics.libaskomics.source_file.SourceFile.SourceFileSyntaxError

Bases: SyntaxError

askomics.libaskomics.source_file.SourceFileBed module

Classes to import data from a bed source file

class askomics.libaskomics.source_file.SourceFileBed.SourceFileBed(settings, session, path, uri_set=None)

Bases: askomics.libaskomics.source_file.SourceFile.SourceFile

Class representing a BED Source file

get_abstraction()

Get abstraction of a bed file

Returns:abstraction in turtle
Return type:string
get_domain_knowledge()

Get domain knowledge of a bed file

Returns:domain knowledge in turtle
Return type:string
get_turtle()

Get turtle content for a bed file

Yield:the ttl string
Return type:string
open_bed()

Try to parse the file

set_entity_name(entity)

Set the entity name

set_taxon(taxon)

Set the taxon

askomics.libaskomics.source_file.SourceFileGff module

Classes to import data from gff3 source files

class askomics.libaskomics.source_file.SourceFileGff.SourceFileGff(settings, session, path, uri_set=None)

Bases: askomics.libaskomics.source_file.SourceFile.SourceFile

Class representing a Gff3 Source file

get_abstraction()

Get Abstraction (turtle) of the GFF

get_content_ttl(entity)

Get the ttl string for an entity

get_domain_knowledge()

Get Domain Knowledge (turtle) of the GFF

get_entities()

Get all the entities present in a gff file

Returns:The list of all the entities
Return type:List
get_turtle()

Get turtle string for a gff file

set_entities(entities)
set_taxon(taxon)

askomics.libaskomics.source_file.SourceFileTsv module

Classes to import data from tsv source files

class askomics.libaskomics.source_file.SourceFileTsv.SourceFileTsv(settings, session, path, preview_limit, uri_set=None)

Bases: askomics.libaskomics.source_file.SourceFile.SourceFile

Class representing a TSV Source file

category_values

Like @property on a member function, but also caches the result in self.__dict__[function name]. The function is called only once, since the cached value, stored as an instance attribute, overrides the property residing in the class attributes. Subsequent accesses cost no more than standard Python attribute access. If the instance attribute is deleted, the next access re-evaluates the function. Source: https://blog.ionelmc.ro/2014/11/04/an-interesting-python-descriptor-quirk/

Usage:

    class Shape(object):

        @cached_property
        def area(self):
            # compute value
            return value
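
The caching behavior described above can be illustrated with a minimal sketch of such a descriptor. This is not necessarily the implementation in askomics.libaskomics.utils; the Shape class, its width/height fields, and the calls counter are hypothetical and exist only to make the caching visible.

```python
class cached_property:
    """Non-data descriptor that caches its result in the instance __dict__."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        # Store the computed value under the function's name. Because this
        # descriptor defines no __set__, the instance attribute shadows it
        # on every later lookup.
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value


class Shape:
    """Hypothetical example class."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.calls = 0  # counts how often area is actually computed

    @cached_property
    def area(self):
        self.calls += 1
        return self.width * self.height


shape = Shape(3, 4)
first = shape.area    # computed by the function
second = shape.area   # read directly from shape.__dict__
del shape.area        # drop the cached instance attribute
third = shape.area    # computed again
```

Deleting the attribute works because the cached value lives in the instance dictionary, so the next access falls back to the descriptor on the class.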
dialect

Cached property: the value is computed on first access and then stored as an instance attribute, as described for category_values above.
get_abstraction()

Get the abstraction representing the source file in ttl format

Returns:ttl content for the abstraction
get_domain_knowledge()

Get the domain knowledge representing the source file in ttl format

Returns:ttl content for the domain knowledge
get_headers_by_file

Cached property: the value is computed on first access and then stored as an instance attribute, as described for category_values above.
get_preview_data()

Read and return the values from the first lines of file.

Returns:a List of List of column values
Return type:List
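
A rough sketch of what reading preview data from a tab-separated file can look like. It assumes the first line holds the headers and that the result is organized column by column; the function name preview_columns and the sample data are hypothetical, and the actual method may differ.

```python
import csv
import io
from itertools import islice


def preview_columns(handle, preview_limit):
    """Return the headers and the first preview_limit rows, grouped by column."""
    reader = csv.reader(handle, delimiter="\t")
    headers = next(reader)  # assumption: the first line holds the headers
    columns = [[] for _ in headers]
    for row in islice(reader, preview_limit):
        # Ignore any extra cells beyond the number of headers.
        for index, value in enumerate(row[:len(headers)]):
            columns[index].append(value)
    return headers, columns


# Hypothetical sample content standing in for a real file handle.
sample = io.StringIO(
    "gene\tchrom\tstart\n"
    "AT1\tchr1\t100\n"
    "AT2\tchr2\t250\n"
    "AT3\tchr1\t900\n"
)
headers, columns = preview_columns(sample, preview_limit=2)
```

Using islice stops reading after the requested number of rows, so large files are not loaded in full just to build a preview.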
static get_strand(strand)

Get the faldo strand corresponding to the given strand

static get_strand_faldo(strand)

Get the faldo strand corresponding to the given strand

get_turtle(preview_only=False)

Get the turtle string of a tsv file

guess_values_type(values, header)

From a list of values, guess the data type

Parameters:
  • values – a List of values to evaluate
  • header – index of the header
Returns:the guessed type (‘taxon’, ‘ref’, ‘strand’, ‘start’, ‘end’, ‘numeric’, ‘text’, ‘category’ or ‘goterm’)

static is_decimal(value)

Determine if given value is a decimal (integer or float) or not

Parameters:value – the value to evaluate
Returns:True if the value is decimal
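
One common way to implement such a check is to attempt a float conversion. The sketch below is an assumption about the approach, not the library's actual code; note that with this approach strings such as "nan" and "inf" also count as decimal.

```python
def is_decimal(value):
    """Return True if value can be parsed as an integer or a float."""
    try:
        float(value)
    except (TypeError, ValueError):
        # TypeError covers None and other non-string, non-numeric inputs;
        # ValueError covers strings that are not numbers.
        return False
    return True


results = [is_decimal(v) for v in ("42", "3.14", "-1e3", "chr1", "", None)]
```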
key_id(row)

Get the key ID by concatenating all selected key columns
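
As an illustration only, concatenating the selected key columns of a row could look like the sketch below. The helper name build_key_id, the explicit key_columns argument, and the underscore separator are all assumptions; the actual method reads the selected keys from the instance.

```python
def build_key_id(row, key_columns, separator="_"):
    """Join the values of the selected key columns into one identifier."""
    # separator is a hypothetical choice; the real separator is not documented here.
    return separator.join(str(row[index]) for index in key_columns)


# Hypothetical row; columns 0 and 1 are selected as keys.
row = ["AT1G01010", "chr1", "3631", "5899"]
key = build_key_id(row, key_columns=[0, 1])
```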

set_disabled_columns(disabled_columns)

Set the columns that should not be imported

Parameters:disabled_columns – a List of column ids (0 based) that should not be imported
set_forced_column_types(types)

Set manually curated types for the columns

Parameters:types – a List of column types (‘entity’, ‘entity_start’, ‘numeric’, ‘text’ or ‘category’)
set_headers(headers)

Set the headers

Parameters:headers (list) – the headers list
set_key_columns(key_columns)

Set the columns used to build the unique ID

Parameters:key_columns – a List of column ids used to build the unique ID

askomics.libaskomics.source_file.SourceFileTtl module

Classes to import data from RDF source files

class askomics.libaskomics.source_file.SourceFileTtl.SourceFileTtl(settings, session, path, file_type='ttl')

Bases: askomics.libaskomics.source_file.SourceFile.SourceFile

Class representing a ttl Source file

convert_to_ttl(filepath, file_type)
file_get_contents(filename)

Get the content of a file

get_preview_ttl()

Return the first 100 lines of a ttl file; the text is formatted with syntax coloring

static load_data_from_url(self, url, public)

Insert the ttl source file into the triple store (TS)

persist(urlbase, public)

Insert the ttl source file into the triple store (TS)

askomics.libaskomics.source_file.SourceFileURL module

Classes to import data from a URL

class askomics.libaskomics.source_file.SourceFileURL.SourceFileURL(settings, session, url)

Bases: askomics.libaskomics.source_file.SourceFile.SourceFile

Class representing a ttl Source file loaded from a URL

load_data_from_url(url, public)

Insert the ttl source file into the triple store (TS)

Module contents