API documentation for interacting with CloudMan

CloudManLauncher

class bioblend.cloudman.launch.CloudManLauncher(access_key, secret_key, cloud=None)[source]

Define the environment in which this instance of CloudMan will be launched.

Besides providing the credentials, optionally provide the cloud object. This object must define the properties required to establish a boto connection to that cloud. See this method’s implementation for an example of the required fields. Note that as long as the provided object defines the required fields, it can be implemented as anything (e.g., a Bunch, a database object, a custom class). If no value for the cloud argument is provided, the default is to use the Amazon cloud.

assign_floating_ip(ec2_conn, instance)[source]
connect_ec2(a_key, s_key, cloud=None)[source]

Create and return an EC2-compatible connection object for the given cloud.

See _get_cloud_info method for more details on the requirements for the cloud parameter. If no value is provided, the class field is used.

connect_s3(a_key, s_key, cloud=None)[source]

Create and return an S3-compatible connection object for the given cloud.

See _get_cloud_info method for more details on the requirements for the cloud parameter. If no value is provided, the class field is used.

connect_vpc(a_key, s_key, cloud=None)[source]

Establish a connection to the VPC service.

TODO: Make this work with non-default clouds as well.

create_cm_security_group(sg_name='CloudMan', vpc_id=None)[source]

Create a security group with all authorizations required to run CloudMan.

If the group already exists, check its rules and add the missing ones.

Parameters:
  • sg_name (str) – A name for the security group to be created.
  • vpc_id (str) – VPC ID under which to create the security group.
Return type:

dict

Returns:

A dictionary containing keys name (with the value being the name of the security group that was created), error (with the value being the error message if there was an error or None if no error was encountered), and ports (containing the list of tuples with port ranges that were opened or attempted to be opened).

Changed in version 0.6.1: The return value changed from a string to a dict
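The returned dictionary can be checked before going any further. The following is a minimal sketch, not part of BioBlend: `ensure_security_group` is a hypothetical helper, and `launcher` stands for any object exposing `create_cm_security_group` as documented above (normally a `CloudManLauncher`).

```python
def ensure_security_group(launcher, sg_name="CloudMan", vpc_id=None):
    """Create (or complete) the security group and return the ports touched.

    Hypothetical helper: it relies only on the documented keys
    ``name``, ``error`` and ``ports`` of the returned dict.
    """
    result = launcher.create_cm_security_group(sg_name=sg_name, vpc_id=vpc_id)
    if result["error"]:
        # ``error`` is None when nothing went wrong.
        raise RuntimeError(
            "Security group %r is not ready: %s" % (result["name"], result["error"])
        )
    return result["ports"]
```

Raising on a non-empty `error` keeps a partially configured group from being used silently.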

create_key_pair(key_name='cloudman_key_pair')[source]

If a key pair with the provided key_name does not exist, create it.

Parameters:key_name (str) – A name for the key pair to be created.
Return type:dict
Returns:A dictionary containing keys name (with the value being the name of the key pair that was created), error (with the value being the error message if there was an error or None if no error was encountered), and material (containing the unencrypted PEM encoded RSA private key if the key was created or None if the key already existed).

Changed in version 0.6.1: The return value changed from a tuple to a dict
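Because the private key is only returned when the key pair is first created, it is worth saving it right away. A minimal sketch, assuming a hypothetical helper `save_new_key` and a `launcher` object that exposes `create_key_pair` as documented above; the `.pem` file name and the file permissions are choices made for this illustration only.

```python
import os
import stat


def save_new_key(launcher, key_name="cloudman_key_pair", directory="."):
    """Write the private key to ``<directory>/<key_name>.pem`` if it was just created.

    Returns the file path, or None when the key pair already existed
    (in which case ``material`` is None and nothing can be saved).
    """
    result = launcher.create_key_pair(key_name)
    if result["error"]:
        raise RuntimeError("Key pair %r failed: %s" % (key_name, result["error"]))
    if result["material"] is None:
        return None
    path = os.path.join(directory, result["name"] + ".pem")
    with open(path, "w") as handle:
        handle.write(result["material"])
    # Restrict the file to its owner, as SSH clients usually require.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path
```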

find_placements(ec2_conn, instance_type, cloud_type, cluster_name=None)[source]

Find a list of placement zones that support the specified instance type.

If cluster_name is given and a cluster with the given name exists, return a list with a single entry: the zone where the given cluster lives.

Searching for available zones for a given instance type is done by checking the spot prices in the potential availability zones for support before deciding on a region: http://blog.piefox.com/2011/07/ec2-availability-zones-and-instance.html

Note that, currently, instance-type based zone selection applies only to AWS. For other clouds, all the available zones are returned (unless a cluster is being recreated, in which case the cluster’s placement zone is returned as stored in its persistent data).

Return type:dict
Returns:A dictionary with zones and error keywords.

Changed in version 0.3: Changed method name from _find_placements to find_placements. Also added cluster_name parameter.

Changed in version 0.7.0: The return value changed from a list to a dictionary.
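A caller typically needs a single zone out of this result. The sketch below assumes that the `zones` value is a list of zone names (the text above only names the keys) and uses a hypothetical helper `pick_zone`; it is not part of BioBlend.

```python
def pick_zone(placements):
    """Return the first usable zone from a ``find_placements`` style result.

    ``placements`` is expected to be a dict with ``zones`` and ``error`` keys.
    """
    if placements.get("error"):
        raise RuntimeError("Zone lookup failed: %s" % placements["error"])
    zones = placements.get("zones") or []
    if not zones:
        raise LookupError("No placement zone supports the requested instance type")
    return zones[0]
```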

get_cluster_pd(cluster_name)[source]

Return persistent data (as a dict) associated with a cluster with the given cluster_name. If a cluster with the given name is not found, return an empty dict.

New in version 0.3.

get_clusters_pd(include_placement=True)[source]

Return persistent data of all existing clusters for this account.

Parameters:include_placement (bool) – Whether or not to include region placement for the clusters. Setting this option will lead to a longer function runtime.
Return type:dict
Returns:A dictionary containing keys clusters and error. The value of clusters is a list of dictionaries, each with the keys cluster_name, persistent_data, bucket_name and, optionally, placement; the list is empty if no clusters were found or an error was encountered. The value of persistent_data is itself a dictionary containing the given cluster’s persistent data. The value for the error key will contain a string with the error message.

New in version 0.3.

Changed in version 0.7.0: The return value changed from a list to a dictionary.
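The result can be reduced to a simple name-to-placement mapping. This is a sketch under an assumption: the `clusters` value is treated as a list of per-cluster dictionaries carrying the keys listed above. `summarize_clusters` is a hypothetical helper name.

```python
def summarize_clusters(result):
    """Map each cluster name to its placement (None when not included)."""
    if result.get("error"):
        raise RuntimeError("Could not list clusters: %s" % result["error"])
    summary = {}
    for cluster in result.get("clusters") or []:
        # ``placement`` is only present when include_placement was True.
        summary[cluster["cluster_name"]] = cluster.get("placement")
    return summary
```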

get_status(instance_id)[source]

Check on the status of an instance. instance_id needs to be a boto-library compatible instance ID (e.g., i-8fehrdss). If instance_id is not provided, the ID obtained when launching the most recent instance is used. Note that this assumes the instance being checked on was launched using this class. Also note that the same class may be used to launch multiple instances, but only the most recent instance_id is kept; any others need to be explicitly specified.

This method also allows the required ec2_conn connection object to be provided at invocation time. If the object is not provided, credentials defined for the class are used (ability to specify a custom ec2_conn helps in case of stateless method invocations).

Return a state dict containing the following keys: instance_state, public_ip, placement, and error, which capture CloudMan’s current state. For instance_state, expected values are: pending, booting, running, or error and represent the state of the underlying instance. Other keys will return an empty value until the instance_state enters running state.
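Since the other keys stay empty until the instance is running, callers usually poll this method. A minimal polling sketch, assuming a hypothetical helper `wait_for_running`; the timeout and interval defaults are illustrative, and the `sleep` parameter exists only so the loop can be exercised without real delays.

```python
import time


def wait_for_running(launcher, instance_id, timeout=300, interval=10, sleep=time.sleep):
    """Poll ``launcher.get_status`` until the instance reports ``running``.

    Returns the final state dict; raises RuntimeError if the instance
    reports an error and TimeoutError if ``timeout`` seconds elapse first.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = launcher.get_status(instance_id)
        if state["error"] or state["instance_state"] == "error":
            raise RuntimeError("Instance %s failed: %s" % (instance_id, state["error"]))
        if state["instance_state"] == "running":
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError("Instance %s is still %s" % (instance_id, state["instance_state"]))
        sleep(interval)
```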

launch(cluster_name, image_id, instance_type, password, kernel_id=None, ramdisk_id=None, key_name='cloudman_key_pair', security_groups=['CloudMan'], placement='', subnet_id=None, ebs_optimized=False, **kwargs)[source]

Check all the prerequisites (key pair and security groups) for launching a CloudMan instance, compose the user data based on the parameters specified in the arguments and the cloud properties as defined in the object’s cloud field.

For the current list of user data fields that can be provided via kwargs, see https://galaxyproject.org/cloudman/userdata/

Return a dict containing the properties and info with which an instance was launched, namely: sg_names containing the names of the security groups, kp_name containing the name of the key pair, kp_material containing the private portion of the key pair (note that this portion of the key is available and can be retrieved only at the time the key is created, which will happen only if no key with the name provided in the key_name argument exists), rs containing the boto ResultSet object, instance_id containing the ID of a started instance, and error containing an error message if there was one.
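A thin wrapper can turn the `error` entry into an exception and hand back the two values most callers need. This is a sketch only: `launch_cluster` is a hypothetical helper, and `launcher` is any object exposing `launch` with the signature shown above.

```python
def launch_cluster(launcher, cluster_name, image_id, instance_type, password, **launch_kwargs):
    """Launch a CloudMan instance and return ``(instance_id, kp_material)``.

    ``kp_material`` is None when the key pair already existed, because the
    private key can only be retrieved at the time the key is created.
    """
    result = launcher.launch(cluster_name, image_id, instance_type, password, **launch_kwargs)
    if result["error"]:
        raise RuntimeError("Launch of %r failed: %s" % (cluster_name, result["error"]))
    return result["instance_id"], result["kp_material"]
```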

rule_exists(rules, from_port, to_port, ip_protocol='tcp', cidr_ip='0.0.0.0/0')[source]

A convenience method to check if an authorization rule in a security group already exists.

CloudManInstance

API for interacting with a CloudMan instance.

class bioblend.cloudman.CloudManConfig(access_key=None, secret_key=None, cluster_name=None, image_id=None, instance_type='m1.medium', password=None, cloud_metadata=None, cluster_type=None, galaxy_data_option='', initial_storage_size=10, key_name='cloudman_key_pair', security_groups=['CloudMan'], placement='', kernel_id=None, ramdisk_id=None, block_until_ready=False, **kwargs)[source]

Initializes a CloudMan launch configuration object.

Parameters:
  • access_key (str) – Access credentials.
  • secret_key (str) – Access credentials.
  • cluster_name (str) – Name used to identify this CloudMan cluster.
  • image_id (str) – Machine image ID to use when launching this CloudMan instance.
  • instance_type (str) – The type of the machine instance, as understood by the chosen cloud provider. (e.g., m1.medium)
  • password (str) – The administrative password for this CloudMan instance.
  • cloud_metadata (Bunch) –

    This object must define the properties required to establish a boto connection to that cloud. See this method’s implementation for an example of the required fields. Note that as long as the provided object defines the required fields, it can be implemented as anything (e.g., a Bunch, a database object, a custom class). If no value for the cloud argument is provided, the default is to use the Amazon cloud.

  • kernel_id (str) – The ID of the kernel with which to launch the instances
  • ramdisk_id (str) – The ID of the RAM disk with which to launch the instances
  • key_name (str) – The name of the key pair with which to launch instances
  • security_groups (list of str) – The IDs of the security groups with which to associate instances
  • placement (str) – The availability zone in which to launch the instances
  • cluster_type (str) – The type, either ‘Galaxy’, ‘Data’, or ‘Test’, defines the type of cluster platform to initialize.
  • galaxy_data_option (str) – The storage type to use for this instance. May be ‘transient’, ‘custom_size’ or ‘’. The default is ‘’, which will result in ignoring the bioblend specified initial_storage_size. ‘custom_size’ must be used for initial_storage_size to come into effect.
  • initial_storage_size (int) – The initial storage to allocate for the instance. This only applies if cluster_type is set to either Galaxy or Data and galaxy_data_option is set to custom_size
  • block_until_ready (bool) – Specifies whether the launch method will block until the instance is ready and only return once all initialization is complete. The default is False. If False, the launch method will return immediately without blocking. However, any subsequent calls made will automatically block if the instance is not ready and initialized. The blocking timeout and polling interval can be configured by providing extra parameters to the CloudManInstance.launch_instance method.
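The interplay between cluster_type, galaxy_data_option and initial_storage_size described above can be written down as a small predicate. This is an illustration of the documented rule only, using a hypothetical helper name; it does not call into BioBlend.

```python
def effective_storage_size(cluster_type, galaxy_data_option="", initial_storage_size=10):
    """Return the storage size that would take effect, or None if it is ignored.

    Per the parameter descriptions, the size applies only when cluster_type
    is 'Galaxy' or 'Data' and galaxy_data_option is 'custom_size'.
    """
    if cluster_type in ("Galaxy", "Data") and galaxy_data_option == "custom_size":
        return initial_storage_size
    return None
```

With the default `galaxy_data_option=''`, the function returns None, mirroring the note that the specified initial_storage_size is ignored in that case.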
static CustomTypeDecoder(dct)[source]
class CustomTypeEncoder(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)[source]

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, long, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true (the default), all non-ASCII characters in the output are escaped with \uXXXX sequences, and the results are str instances consisting of ASCII characters only. If ensure_ascii is False, a result may be a unicode instance. This usually happens if the input contains unicode strings or the encoding parameter is used.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation. Since the default item separator is ‘, ‘, the output might include trailing whitespace when indent is specified. You can use separators=(‘,’, ‘: ‘) to avoid this.

If specified, separators should be a (item_separator, key_separator) tuple. The default is (‘, ‘, ‘: ‘). To get the most compact JSON representation you should specify (‘,’, ‘:’) to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.

If encoding is not None, then all input strings will be transformed into unicode using that encoding prior to JSON-encoding. The default is UTF-8.
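The general encoder/decoder pattern behind a custom JSONEncoder subclass can be sketched as follows. This is not BioBlend's implementation: the `Endpoint` type, the `__Endpoint__` marker key and both helper names are made up for the illustration, and the code targets Python 3, where the encoding parameter described above no longer exists.

```python
import json


class Endpoint:
    """Hypothetical custom type, used only for this illustration."""

    def __init__(self, host, port):
        self.host = host
        self.port = port


class EndpointEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects the base encoder cannot serialize.
        if isinstance(obj, Endpoint):
            return {"__Endpoint__": {"host": obj.host, "port": obj.port}}
        return super().default(obj)  # raises TypeError for unknown types


def endpoint_decoder(dct):
    # Used as object_hook: rebuild Endpoint objects from their marker dict.
    if "__Endpoint__" in dct:
        return Endpoint(**dct["__Endpoint__"])
    return dct


text = json.dumps({"cloud": Endpoint("example.org", 8773)}, cls=EndpointEncoder, sort_keys=True)
restored = json.loads(text, object_hook=endpoint_decoder)
```

Wrapping the fields under a marker key lets the decoder tell custom objects apart from ordinary dictionaries.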

default(obj)[source]
static CloudManConfig.load_config(fp)[source]
CloudManConfig.save_config(fp)[source]
CloudManConfig.set_connection_parameters(access_key, secret_key, cloud_metadata=None)[source]
CloudManConfig.set_extra_parameters(**kwargs)[source]
CloudManConfig.set_post_launch_parameters(cluster_type=None, galaxy_data_option='', initial_storage_size=10)[source]
CloudManConfig.set_pre_launch_parameters(cluster_name, image_id, instance_type, password, kernel_id=None, ramdisk_id=None, key_name='cloudman_key_pair', security_groups=['CloudMan'], placement='', block_until_ready=False)[source]
CloudManConfig.validate()[source]
class bioblend.cloudman.CloudManInstance(url, password, **kwargs)[source]

Create an instance of the CloudMan API class, which is to be used when manipulating that given CloudMan instance.

The url is a string defining the address of CloudMan, for example “http://115.146.92.174”. The password is CloudMan’s password, as defined in the user data sent to CloudMan on instance creation.

add_nodes(*args, **kwargs)[source]

Add a number of worker nodes to the cluster, optionally specifying the type for new instances. If instance_type is not specified, instance(s) of the same type as the master instance will be started. Note that the instance_type must match the type of instance available on the given cloud.

spot_price applies only to AWS and, if set, defines the maximum price for Spot instances, thus turning this request for more instances into a Spot request.

adjust_autoscaling(*args, **kwargs)[source]

Adjust the autoscaling configuration parameters.

The number of worker nodes in the cluster is bounded by the optional minimum_nodes and maximum_nodes parameters. If a parameter is not provided then its configuration value does not change.

autoscaling_enabled(*args, **kwargs)[source]

Returns a boolean indicating whether autoscaling is enabled.

cloudman_url

Returns the URL for accessing this instance of CloudMan.

disable_autoscaling(*args, **kwargs)[source]

Disable autoscaling, meaning that worker nodes will need to be manually added and removed.

enable_autoscaling(*args, **kwargs)[source]

Enable cluster autoscaling, allowing the cluster to automatically add, or remove, worker nodes, as needed.

The number of worker nodes in the cluster is bounded by the minimum_nodes (default is 0) and maximum_nodes (default is 19) parameters.
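The autoscaling calls can be combined so that the same bounds are applied whether or not autoscaling is already on. A minimal sketch, assuming a hypothetical helper `apply_autoscaling_bounds` and an `instance` object exposing `autoscaling_enabled`, `enable_autoscaling` and `adjust_autoscaling` with the minimum_nodes and maximum_nodes keyword parameters named above.

```python
def apply_autoscaling_bounds(instance, minimum_nodes=0, maximum_nodes=19):
    """Enable autoscaling, or adjust it if already enabled; return the action taken."""
    if minimum_nodes < 0 or maximum_nodes < minimum_nodes:
        raise ValueError("Expected 0 <= minimum_nodes <= maximum_nodes")
    if instance.autoscaling_enabled():
        instance.adjust_autoscaling(minimum_nodes=minimum_nodes, maximum_nodes=maximum_nodes)
        return "adjusted"
    instance.enable_autoscaling(minimum_nodes=minimum_nodes, maximum_nodes=maximum_nodes)
    return "enabled"
```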

galaxy_url

Returns the base URL for this instance, which by default happens to be the URL for Galaxy application.

get_cloudman_version(*args, **kwargs)[source]

Returns the CloudMan version from the server. Versions prior to CloudMan 2 do not support this call and, therefore, the default is to return 1.

get_cluster_size(*args, **kwargs)[source]

Get the size of the cluster in terms of the number of nodes; this count includes the master node.

get_cluster_type(*args, **kwargs)[source]

Get the cluster type for this CloudMan instance. See the CloudMan docs about the available types. Returns a dictionary, for example: {u'cluster_type': u'Test'}.

get_galaxy_state(*args, **kwargs)[source]

Get the current status of Galaxy running on the cluster.

get_master_id(*args, **kwargs)[source]

Returns the instance ID of the master node in this CloudMan cluster

get_master_ip(*args, **kwargs)[source]

Returns the public IP of the master node in this CloudMan cluster

get_nodes(*args, **kwargs)[source]

Get a list of nodes currently running in this CloudMan cluster.

get_static_state(*args, **kwargs)[source]

Get static information on this CloudMan instance, i.e., state that doesn’t change over the lifetime of the cluster.

get_status(*args, **kwargs)[source]

Get status information on this CloudMan instance.

initialize(*args, **kwargs)[source]

Initialize CloudMan platform. This needs to be done before the cluster can be used.

The cluster_type, either ‘Galaxy’, ‘Data’, or ‘Test’, defines the type of cluster platform to initialize.

is_master_execution_host(*args, **kwargs)[source]

Checks whether the master node has job execution enabled.

static launch_instance(cfg, **kwargs)[source]

Launches a new instance of CloudMan on the specified cloud infrastructure.

Parameters:cfg (CloudManConfig) – A CloudManConfig object containing the initial parameters for this launch.
reboot_node(*args, **kwargs)[source]

Reboot a specific worker node.

The instance_id parameter defines the ID, as a string, of a worker node to reboot.

remove_node(*args, **kwargs)[source]

Remove a specific worker node from the cluster.

The instance_id parameter defines the ID, as a string, of a worker node to remove from the cluster. The force parameter (defaulting to False), is a boolean indicating whether the node should be forcibly removed rather than gracefully removed.

remove_nodes(*args, **kwargs)[source]

Remove worker nodes from the cluster.

The num_nodes parameter defines the number of worker nodes to remove. The force parameter (defaulting to False), is a boolean indicating whether the nodes should be forcibly removed rather than gracefully removed.
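As an example of combining get_cluster_size with remove_nodes, the sketch below shrinks a cluster to a target number of workers. `shrink_to` is a hypothetical helper, and it assumes get_cluster_size returns an integer count that includes the master node, as described earlier.

```python
def shrink_to(instance, target_workers, force=False):
    """Remove surplus worker nodes; return how many removals were requested."""
    if target_workers < 0:
        raise ValueError("target_workers must not be negative")
    current_workers = instance.get_cluster_size() - 1  # the count includes the master
    surplus = current_workers - target_workers
    if surplus <= 0:
        return 0  # nothing to remove
    instance.remove_nodes(surplus, force=force)
    return surplus
```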

set_master_as_execution_host(*args, **kwargs)[source]

Enables/disables master as execution host.

terminate(*args, **kwargs)[source]

Terminate this CloudMan cluster. There is an option to also terminate the master instance (all worker instances will be terminated in the process of cluster termination), and delete the whole cluster.

Warning

Deleting a cluster is irreversible - all of the data will be permanently deleted.

update()[source]

Update the local object’s fields to be in sync with the actual state of the CloudMan instance the object points to. This method should be called periodically to ensure you are looking at the current data.

New in version 0.2.2.

class bioblend.cloudman.GenericVMInstance(launcher, launch_result)[source]

Create an instance of the CloudMan API class, which is to be used when manipulating that given CloudMan instance.

The url is a string defining the address of CloudMan, for example “http://115.146.92.174”. The password is CloudMan’s password, as defined in the user data sent to CloudMan on instance creation.

get_machine_status()[source]

Check on the underlying VM status of an instance. This can be used to determine whether the VM has finished booting up and if CloudMan is up and running.

Return a state dict with the current instance_state, public_ip, placement, and error keys, which capture the current state (the values for those keys default to empty string if no data is available from the cloud).

instance_id

Returns the ID of this instance (e.g., i-87ey32dd) if launch was successful or None otherwise.

key_pair_material

Returns the private portion of the generated key pair. It does so only if the instance was properly launched and key pair generated; None otherwise.

key_pair_name

Returns the name of the key pair used by this instance. If instance was not launched properly, returns None.

wait_until_instance_ready(vm_ready_timeout=300, vm_ready_check_interval=10)[source]

Wait until the VM state changes to ready/error or timeout elapses. Updates the host name once ready.

exception bioblend.cloudman.VMLaunchException(value)[source]
bioblend.cloudman.block_until_vm_ready(func)[source]

This decorator exists to make sure that a launched VM is ready and has received a public IP before allowing the wrapped function call to continue. If the VM is not ready, the function will block until the VM is ready. If the VM does not become ready before the vm_ready_timeout elapses, or the VM status returns an error, a VMLaunchException will be raised.

This decorator relies on the wait_until_instance_ready method defined in class GenericVMInstance. All methods to which this decorator is applied must be members of a class which inherits from GenericVMInstance.

The following two optional keyword arguments are recognized by this decorator:

Parameters:
  • vm_ready_timeout (int) – Maximum length of time to block before timing out. Once the timeout is reached, a VMLaunchException will be thrown.
  • vm_ready_check_interval (int) – The number of seconds to pause between consecutive calls when polling the VM’s ready status.
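The mechanism described above can be illustrated with a simplified, self-contained decorator. This is a sketch of the pattern, not BioBlend's code: `block_until_ready`, `VMNotReady` and `PollingVM` are stand-ins invented for this example, and the status sequence is simulated rather than read from a cloud.

```python
import functools
import time


class VMNotReady(Exception):
    """Stand-in for VMLaunchException in this illustration."""


def block_until_ready(func):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Strip the two optional keyword arguments before calling func.
        timeout = kwargs.pop("vm_ready_timeout", 300)
        interval = kwargs.pop("vm_ready_check_interval", 10)
        self.wait_until_instance_ready(timeout, interval)
        return func(self, *args, **kwargs)
    return wrapper


class PollingVM:
    """Simulated VM whose status walks through the given states."""

    def __init__(self, states):
        self._states = iter(states)

    def get_machine_status(self):
        return {"instance_state": next(self._states), "public_ip": "",
                "placement": "", "error": ""}

    def wait_until_instance_ready(self, vm_ready_timeout=300, vm_ready_check_interval=10):
        deadline = time.monotonic() + vm_ready_timeout
        while True:
            state = self.get_machine_status()["instance_state"]
            if state == "running":
                return
            if state == "error" or time.monotonic() >= deadline:
                raise VMNotReady("VM not ready (last state: %s)" % state)
            time.sleep(vm_ready_check_interval)

    @block_until_ready
    def get_nodes(self):
        return ["master"]
```

A call such as `PollingVM(["pending", "running"]).get_nodes(vm_ready_check_interval=0)` waits for the simulated VM and then returns the wrapped method's result.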