ahds package

PyPI PyPI - Python Version https://travis-ci.org/emdb-empiar/ahds.svg?branch=master Documentation Status https://coveralls.io/repos/github/emdb-empiar/ahds/badge.svg

ahds

Overview

ahds is a Python package to parse and handle Amira (R) files. It was developed to facilitate reading of Amira (R) files as part of the EMDB-SFF toolkit.

Note

Amira (R) is a trademark of Thermo Fisher Scientific. This package is in no way affiliated with Thermo Fisher Scientific.

License

ahds is free software and is provided under the terms of the Apache License, Version 2.0.

Copyright 2017 EMBL - European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
either express or implied. See the License for the specific
language governing permissions and limitations under the License.

Use Cases

  • Detect and parse Amira (R) headers and return structured data
  • Decode data (HxRLEByte, HxZip)
  • Easy extensibility to handle previously unencountered data streams

ahds was written and is maintained by Paul K. Korir, together with a list of contributors. Feel free to join this initiative.

Installation

ahds works with Python 2.7, 3.5, 3.6 and 3.7. It requires numpy to build.

pip install numpy

Afterwards you may run

pip install ahds

Note

Figure out a way to avoid the need for numpy as part of the build.

Getting Started

You can begin playing with ahds out of the box using the provided console command ahds.

me@home ~$ ahds ahds/data/FieldOnTetraMesh.am
********************************************************************************************************************************************
AMIRA (R) HEADER AND DATA STREAMS
--------------------------------------------------------------------------------------------------------------------------------------------
+-ahds/data/FieldOnTetraMesh.am                                                                  AmiraFile [is_parent? True ]
|  +-meta                                                                                            Block [is_parent? False]
|  |  +-file: ahds/data/FieldOnTetraMesh.am
|  |  +-header_length: 182
|  |  +-data_streams: 1
|  |  +-streams_loaded: False
|  +-header                                                                                    AmiraHeader [is_parent? True ]
|  |  +-filetype: AmiraMesh
|  |  +-dimension: 3D
|  |  +-format: BINARY
|  |  +-endian: BIG
|  |  +-version: 2.0
|  |  +-extra_format: None
|  |  +-Parameters                                                                                   Block [is_parent? False]
|  |  +-Tetrahedra                                                                                   Block [is_parent? False]
|  |  |  +-length: 23685
|  +-data_streams                                                                                    Block [is_parent? False]
********************************************************************************************************************************************

The ahds command takes the following arguments

me@home ~$ ahds -h
usage: ahds [-h] [-s] [-d] [-l] file [file ...]

Python tool to read and display Amira files

positional arguments:
  file                a valid Amira file with an optional block path

optional arguments:
  -h, --help          show this help message and exit
  -s, --load-streams  whether to load data streams or not [default: False]
  -d, --debug         display debugging information [default: False]
  -l, --literal       display the literal header [default: False]

You can specify a dotted path after the filename to render only the content of that field in the header:

me@home ~$ ahds ahds/data/FieldOnTetraMesh.am header
***********************************************************************************************************************************
ahds: Displaying path 'header'
-----------------------------------------------------------------------------------------------------------------------------------
+-header                                                                                       AmiraHeader [is_parent? True ]
|  +-filetype: AmiraMesh
|  +-dimension: 3D
|  +-format: BINARY
|  +-endian: BIG
|  +-version: 2.0
|  +-extra_format: None
|  +-Parameters                                                                                      Block [is_parent? False]
|  +-Tetrahedra                                                                                      Block [is_parent? False]
|  |  +-length: 23685
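
The dotted path can be thought of as a chain of attribute lookups starting at the file object. The sketch below illustrates that idea on a stand-in object; resolve_path and the SimpleNamespace tree are hypothetical and are not part of ahds.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_path(root, dotted_path):
    """Follow a dotted path such as 'header.Tetrahedra.length' via getattr."""
    return reduce(getattr, dotted_path.split("."), root)


# Stand-in tree mirroring part of the output above (not a real AmiraFile).
amira_like = SimpleNamespace(
    header=SimpleNamespace(
        filetype="AmiraMesh",
        Tetrahedra=SimpleNamespace(length=23685),
    )
)

print(resolve_path(amira_like, "header.filetype"))
print(resolve_path(amira_like, "header.Tetrahedra.length"))
```

A missing segment raises AttributeError, which is one simple way to report an invalid path.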

For debugging you can display the literal header (the exact header present in the file) using the -l/--literal flag. Also, you can display the parsed data structure using the -d/--debug flag.

me@home ~$ ahds --literal --debug ahds/data/FieldOnTetraMesh.am
***********************************************************************************************************************************
ahds: Displaying literal header
-----------------------------------------------------------------------------------------------------------------------------------
# AmiraMesh 3D BINARY 2.0
# CreationDate: Tue Nov  2 11:46:31 2004


nTetrahedra 23685

TetrahedronData { float[3] Data } @1
Field { float[3] f } Constant(@1)

# Data section follows
***********************************************************************************************************************************
ahds: Displaying parsed header data
-----------------------------------------------------------------------------------------------------------------------------------
[{'designation': {'dimension': '3D',
                  'filetype': 'AmiraMesh',
                  'format': 'BINARY',
                  'version': '2.0'}},
 {'comment': {'date': 'Tue Nov  2 11:46:31 2004'}},
 {'array_declarations': [{'array_dimension': 23685,
                          'array_name': 'Tetrahedra'}]},
 {'data_definitions': [{'array_reference': 'Tetrahedra',
                        'data_dimension': 3,
                        'data_index': 1,
                        'data_name': 'Data',
                        'data_type': 'float'},
                       {'array_reference': 'Field',
                        'data_dimension': 3,
                        'data_index': 1,
                        'data_name': 'f',
                        'data_type': 'float',
                        'interpolation_method': 'Constant'}]}]

********************************************************************************************************************************************
AMIRA (R) HEADER AND DATA STREAMS
--------------------------------------------------------------------------------------------------------------------------------------------
+-ahds/data/FieldOnTetraMesh.am                                                                  AmiraFile [is_parent? True ]
|  +-meta                                                                                            Block [is_parent? False]
|  |  +-file: ahds/data/FieldOnTetraMesh.am
|  |  +-header_length: 182
|  |  +-data_streams: 1
|  |  +-streams_loaded: False
|  +-header                                                                                    AmiraHeader [is_parent? True ]
|  |  +-filetype: AmiraMesh
|  |  +-dimension: 3D
|  |  +-format: BINARY
|  |  +-endian: BIG
|  |  +-version: 2.0
|  |  +-extra_format: None
|  |  +-Parameters                                                                                   Block [is_parent? False]
|  |  +-Tetrahedra                                                                                   Block [is_parent? False]
|  |  |  +-length: 23685
|  +-data_streams                                                                                    Block [is_parent? False]
********************************************************************************************************************************************
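
The parsed header displayed by --debug is a list of single-key dictionaries, so it can be flattened with ordinary Python. A minimal sketch, using a trimmed copy of the output above as input; the variable names are illustrative only.

```python
# Trimmed copy of the parsed header printed by `ahds --debug` above.
parsed = [
    {"designation": {"dimension": "3D", "filetype": "AmiraMesh",
                     "format": "BINARY", "version": "2.0"}},
    {"comment": {"date": "Tue Nov  2 11:46:31 2004"}},
    {"array_declarations": [{"array_dimension": 23685,
                             "array_name": "Tetrahedra"}]},
]

# Merge the single-key dictionaries into one mapping keyed by section name.
sections = {key: value for block in parsed for key, value in block.items()}

# Map each declared array to its size.
array_sizes = {
    decl["array_name"]: decl["array_dimension"]
    for decl in sections["array_declarations"]
}

print(sections["designation"]["filetype"])
print(array_sizes)
```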

By default, data streams are not read — only the header is parsed. You may obtain the data streams using the -s/--load-streams flag.

me@home ~$ ahds --load-streams ahds/data/FieldOnTetraMesh.am
********************************************************************************************************************************************
AMIRA (R) HEADER AND DATA STREAMS
--------------------------------------------------------------------------------------------------------------------------------------------
+-ahds/data/FieldOnTetraMesh.am                                                                  AmiraFile [is_parent? True ]
|  +-meta                                                                                            Block [is_parent? False]
|  |  +-file: ahds/data/FieldOnTetraMesh.am
|  |  +-header_length: 182
|  |  +-data_streams: 1
|  |  +-streams_loaded: True
|  +-header                                                                                    AmiraHeader [is_parent? True ]
|  |  +-filetype: AmiraMesh
|  |  +-dimension: 3D
|  |  +-format: BINARY
|  |  +-endian: BIG
|  |  +-version: 2.0
|  |  +-extra_format: None
|  |  +-Parameters                                                                                   Block [is_parent? False]
|  |  +-Tetrahedra                                                                                   Block [is_parent? False]
|  |  |  +-length: 23685
|  +-data_streams                                                                                    Block [is_parent? True ]
|  |  +-Data                                                                           AmiraMeshDataStream [is_parent? False]
|  |  |  +-data_index: 1
|  |  |  +-dimension: 3
|  |  |  +-type: float
|  |  |  +-interpolation_method: None
|  |  |  +-shape: 23685
|  |  |  +-format: None
|  |  |  +-data: [  0.8917308   0.9711809 300.       ],...,[  1.4390504   1.1243758 300.       ]
********************************************************************************************************************************************
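
The stream above is reported as BINARY, big-endian, of type float, with 3 values per item. The sketch below shows how such a payload could be unpacked with the standard library, assuming 32-bit floats. It does not use ahds; unpack_float_triples is a hypothetical helper and the sample bytes are made up for illustration.

```python
import struct


def unpack_float_triples(raw, count, big_endian=True):
    """Unpack `count` items of three 32-bit floats each from raw bytes."""
    order = ">" if big_endian else "<"
    # 3 floats per item, 4 bytes per float (assumption: 32-bit floats).
    values = struct.unpack(f"{order}{count * 3}f", raw[: count * 3 * 4])
    return [values[i:i + 3] for i in range(0, len(values), 3)]


# Two made-up items, packed big-endian.
raw = struct.pack(">6f", 0.5, 1.5, 300.0, 0.25, 2.0, 300.0)
triples = unpack_float_triples(raw, count=2)
print(triples)
```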

Future Plans

  • Write out valid Amira (R) files

ahds package

ahds

This module provides a simple entry point to the underlying functionality through the AmiraFile class, which automatically handles both AmiraMesh and HxSurface files. An AmiraFile is also a Block subclass with three special attributes: meta, for metadata not explicitly provided in the file (such as header_length); header, for the parsed header; and data_streams, for the actual data stream data.

The only required argument is the name of the file to be read. By default, data streams are loaded, but loading can be turned off (for quick reading) by setting load_streams=False. Additional kwargs are passed to the AmiraHeader class call.

There is a read method which (if data streams have not yet been read) will read the data streams.

An AmiraFile object may be printed to view the hierarchy of entities shown above, or passed to repr to view the instantiation call that represents it.
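
A minimal sketch of the API described above, assuming ahds is installed and that entries shown in the printed hierarchy (such as meta.streams_loaded and header.filetype) are reachable as attributes. The helper names open_amira and summarise are hypothetical.

```python
def open_amira(fn, load_streams=False):
    """Open an Amira (R) file, parsing only the header unless asked otherwise."""
    # Imported lazily so this sketch can be defined without ahds installed.
    from ahds import AmiraFile

    return AmiraFile(fn, load_streams=load_streams)


def summarise(amira_file):
    """Collect a few fields; attribute names follow the tree printed earlier."""
    return {
        "filetype": amira_file.header.filetype,
        "streams_loaded": amira_file.meta.streams_loaded,
    }
```

For example, af = open_amira('ahds/data/FieldOnTetraMesh.am') followed by print(af) should display the hierarchy, and af.read() would then load the data streams on demand.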

class ahds.AmiraFile(fn, load_streams=True, *args, **kwargs)[source]

Bases: ahds.core.Block

Main entry point for working with Amira files

read()[source]

Read the data streams if they are not read yet

class ahds.AmiraHeader(fn, load_streams=True, *args, **kwargs)[source]

Bases: ahds.core.Block

Class to encapsulate Amira metadata and accessors to Amira (R) data streams

data_pointers(**kwargs)[source]

The list of data pointers together with a name, data type, dimension, index, format and length

NOTE: deprecated. Access the data definitions for each data array through the corresponding attributes, e.g. ah.Nodes.Coordinates instead of ah.data_pointers.data_pointer_1, ah.Tetrahedra.Nodes instead of ah.data_pointers.data_pointer_2, etc.

definitions(**kwargs)[source]

Definitions consist of a key-value pair specified just after the designation, preceded by the keyword ‘define’

NOTE: this property is deprecated. Access the corresponding attributes directly, e.g. ah.Nodes instead of ah.definitions.Nodes, or ah.Tetrahedra instead of ah.definitions.Tetrahedra

designation(**kwargs)[source]

Designation of the Amira file defined in the first row

Designations consist of some or all of the following data:

  • filetype e.g. AmiraMesh or HyperSurface
  • dimensions e.g. 3D
  • format e.g. BINARY-LITTLE-ENDIAN
  • version e.g. 2.1
  • extra format e.g. <hxsurface>

NOTE: this property is deprecated. Use the corresponding attributes of the AmiraHeader to access the above information instead

classmethod from_file(**kwargs)[source]

Deprecated classmethod

load()[source]

Public loading method

ahds.grammar module

grammar

We define an EBNF grammar for Amira (R) headers to extract all metadata. In addition to that, we also define how HxSurface files are structured.

This module also includes several helper functions that use the grammar resources:

  • the get_header function returns only the header up to the first data stream; data is returned as a decoded string (UTF-8);
  • the parse_header function applies the grammar to return a nested set of Python primitives to be transformed into an AmiraHeader object;
  • the get_parsed_data function transparently applies both of the above functions, given the Amira (R) filename

ahds.grammar.detect_format(fn, format_bytes=50, verbose=False, *args, **kwargs)[source]

Detect Amira (R) file format (AmiraMesh/Avizo or HyperSurface)

Parameters:
  • fn (str) – file name
  • format_bytes (int) – number of bytes in which to search for the format [default: 50]
  • verbose (bool) – verbose output; default False
Return str file_format: either AmiraMesh or HyperSurface
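
The idea behind format detection can be sketched as a search for the file type name within the first format_bytes bytes. This is an illustration only, not the ahds implementation: detect_format_sketch is a hypothetical name, and returning None for unrecognised files is an assumption.

```python
import os
import tempfile


def detect_format_sketch(fn, format_bytes=50):
    """Return 'AmiraMesh' or 'HyperSurface' based on the first bytes, else None."""
    with open(fn, "rb") as f:
        head = f.read(format_bytes)
    for name in (b"AmiraMesh", b"HyperSurface"):
        if name in head:
            return name.decode("ascii")
    return None  # assumption: unknown formats are reported as None here


# Demonstration on a temporary file holding the designation line shown earlier.
with tempfile.NamedTemporaryFile(suffix=".am", delete=False) as tmp:
    tmp.write(b"# AmiraMesh 3D BINARY 2.0\n")
try:
    detected = detect_format_sketch(tmp.name)
    print(detected)
finally:
    os.remove(tmp.name)
```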

ahds.grammar.get_header(fn, file_format, header_bytes=20000, verbose=False, *args, **kwargs)[source]

Apply rules for detecting the boundary of the header

Parameters:
  • fn (str) – file name
  • file_format (str) – either AmiraMesh or HyperSurface
  • verbose (bool) – verbose output; default False
  • header_bytes (int) – number of bytes in which to search for the header [default: 20000]
Return str data: the header as per the file_format
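
One way to picture the header boundary for an AmiraMesh file is to look for the first data marker within the first header_bytes bytes. The sketch below assumes the first data stream is introduced by a line containing only @1, and it covers the AmiraMesh case only. get_header_sketch is a hypothetical name, and the rules that ahds actually applies may differ.

```python
def get_header_sketch(raw, header_bytes=20000):
    """Return the decoded header preceding the first '@1' data marker line."""
    window = raw[:header_bytes]
    marker = window.find(b"\n@1\n")
    if marker == -1:
        raise ValueError("no '@1' marker found in the first %d bytes" % header_bytes)
    # Keep the trailing newline; decode only the header, not the binary payload.
    return window[: marker + 1].decode("utf-8")


# Made-up file content: a short header followed by a marker and binary data.
sample = (
    b"# AmiraMesh 3D BINARY 2.0\n"
    b"nTetrahedra 23685\n"
    b"TetrahedronData { float[3] Data } @1\n"
    b"# Data section follows\n"
    b"@1\n"
    b"\x00\x01\x02\x03"
)
header = get_header_sketch(sample)
print(header)
```

Decoding only the slice before the marker avoids attempting to interpret binary stream bytes as UTF-8.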

ahds.grammar.get_parsed_data(fn, *args, **kwargs)[source]

Applies all of the above functions in a single call

Parameters:
  • fn (str) – file name
Return tuple(list,int) parsed_data,header_length: structured metadata and total number of header bytes

ahds.grammar.parse_header(data, verbose=False, *args, **kwargs)[source]

Parse the data using the grammar specified in this module

Parameters:
  • data (str) – delimited data to be parsed for metadata
  • verbose (bool) – verbose output; default False
Return list parsed_data: structured metadata

ahds.header module

ahds.data_stream module

data_stream

Classes that define data streams (DataList) in Amira (R) files

There are two main types of data streams:

  • AmiraMeshDataStream is for AmiraMesh files
  • AmiraHxSurfaceDataStream is for HxSurface files

Both classes inherit from AmiraDataStream class, which handles common functionality such as:

  • initialisation with the header metadata
  • a get_data method that calls each subclass’s _decode method

class ahds.data_stream.AmiraHxSurfaceDataStream(name, header)[source]

Bases: ahds.data_stream.AmiraDataStream

Class that defines an Amira HxSurface data stream

add_attr(attr, value=None, isparent=False)

Add an attribute to this block object

get_data()

Decode and return the stream data in this stream

is_parent

A ListBlock is a parent if it has a Block attribute or if it has list items

load_stream

Reports whether data streams are loaded or not

material_dict

A convenience dictionary of materials indexed by material name

If this is not a Materials ListBlock (name = ‘Material’) then it should return None

move_attr(new_name, name)

Rename an attribute

read()[source]

Extract the data streams from the HxSurface file

class ahds.data_stream.AmiraMeshDataStream(name, header)[source]

Bases: ahds.data_stream.AmiraDataStream

Class that defines an AmiraMesh data stream

add_attr(attr, value=None, isparent=False)

Add an attribute to this block object

get_data()

Decode and return the stream data in this stream

is_parent

A ListBlock is a parent if it has a Block attribute or if it has list items

load_stream

Reports whether data streams are loaded or not

material_dict

A convenience dictionary of materials indexed by material name

If this is not a Materials ListBlock (name = ‘Material’) then it should return None

move_attr(new_name, name)

Rename an attribute

read()[source]

Extract the data streams from the AmiraMesh file

ahds.data_stream.byterle_decoder(data, output_size)[source]

If the C-ext. failed to compile or is unimportable use this slower Python equivalent

Parameters:
  • data (str) – a raw stream of data to be unpacked
  • output_size (int) – the number of items when data is uncompressed
Return np.array output: an array of np.uint8
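
A byte-oriented run-length decoder can be sketched in a few lines. The scheme assumed here is that each control byte either announces a run (values up to 127: repeat the next byte that many times) or a literal block (values above 127: copy the next control & 0x7F bytes verbatim). rle_decode_sketch is a hypothetical name, and the sketch returns bytes rather than an np.uint8 array to stay dependency-free.

```python
def rle_decode_sketch(data, output_size):
    """Decode a byte run-length stream under the scheme described above."""
    out = bytearray()
    i = 0
    while i < len(data) and len(out) < output_size:
        control = data[i]
        i += 1
        if control > 127:
            # Literal block: copy the next (control & 0x7F) bytes unchanged.
            n = control & 0x7F
            out += data[i:i + n]
            i += n
        else:
            # Run: repeat the next byte `control` times.
            if i >= len(data):
                raise ValueError("truncated run at end of stream")
            out += bytes([data[i]]) * control
            i += 1
    if len(out) != output_size:
        raise ValueError("decoded %d bytes, expected %d" % (len(out), output_size))
    return bytes(out)


# A run of three 7s followed by the two literal bytes 1 and 2.
encoded = bytes([3, 7, 0x82, 1, 2])
decoded = rle_decode_sketch(encoded, 5)
print(list(decoded))
```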

ahds.data_stream.hxbyterle_decode(data, output_size)

If the C-ext. failed to compile or is unimportable use this slower Python equivalent

Parameters:
  • data (str) – a raw stream of data to be unpacked
  • output_size (int) – the number of items when data is uncompressed
Return np.array output: an array of np.uint8

ahds.data_stream.hxzip_decode(data, output_size)[source]

Decode HxZip data stream

Parameters:
  • data (str) – a raw stream of data to be unpacked
  • output_size (int) – the number of items when data is uncompressed
Return np.array output: an array of np.uint8
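
Assuming HxZip streams are plain zlib-compressed data, decoding reduces to a single decompression call plus a length check. hxzip_decode_sketch is a hypothetical name; the sketch returns bytes rather than an np.uint8 array, and the sample payload is made up.

```python
import zlib


def hxzip_decode_sketch(data, output_size):
    """Decompress a zlib stream and check the decoded length."""
    output = zlib.decompress(data)
    if len(output) != output_size:
        raise ValueError("decoded %d bytes, expected %d" % (len(output), output_size))
    return output


# Round trip on a made-up payload.
payload = bytes(range(10)) * 3
compressed = zlib.compress(payload)
decoded_zip = hxzip_decode_sketch(compressed, len(payload))
print(decoded_zip == payload)
```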

ahds.data_stream.set_data_stream(name, header)[source]

Factory function used by AmiraHeader to determine the type of data stream present
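
The relationship between the factory function and the stream classes can be sketched as follows: a base class exposes get_data, each subclass supplies its own _decode, and the factory picks a subclass from the file type in the header. All names ending in Sketch are hypothetical stand-ins, the header is modelled as a plain dict, and the decoding bodies are placeholders rather than the ahds logic.

```python
class DataStreamSketch:
    """Base class: holds header metadata and delegates decoding to subclasses."""

    def __init__(self, name, header):
        self.name = name
        self.header = header
        self.raw = b""

    def get_data(self):
        return self._decode(self.raw)

    def _decode(self, raw):
        raise NotImplementedError


class MeshStreamSketch(DataStreamSketch):
    def _decode(self, raw):
        # Placeholder: treat the payload as a sequence of unsigned bytes.
        return list(raw)


class SurfaceStreamSketch(DataStreamSketch):
    def _decode(self, raw):
        # Placeholder: treat the payload as whitespace-separated ASCII tokens.
        return raw.decode("ascii").split()


def set_data_stream_sketch(name, header):
    """Choose a stream class from the file type recorded in the header."""
    stream_types = {
        "AmiraMesh": MeshStreamSketch,
        "HyperSurface": SurfaceStreamSketch,
    }
    filetype = header.get("filetype")
    if filetype not in stream_types:
        raise ValueError("unknown file type: %r" % (filetype,))
    return stream_types[filetype](name, header)


stream = set_data_stream_sketch("Data", {"filetype": "AmiraMesh"})
stream.raw = b"\x01\x02"
print(type(stream).__name__, stream.get_data())
```

Keeping the dispatch in one mapping means a previously unencountered stream type only needs a new subclass and one extra entry.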
