so_magic.data package

Submodules

so_magic.data.command_factories module

class so_magic.data.command_factories.DataManagerCommandFactory(data_manager, command_factory=<class 'so_magic.data.command_factories.DataManagerCommandFactoryBuilder'>)[source]

Bases: object

build_command_prototype()[source]
name: str
subject: so_magic.utils.notification.Subject
class so_magic.data.command_factories.DataManagerCommandFactoryBuilder[source]

Bases: object

classmethod create_factory(name, callback)[source]
subclasses = {}

so_magic.data.commands_manager module

class so_magic.data.commands_manager.CommandGetter(commands_accumulator=CommandsAccumulator(commands={}))[source]

Bases: object

property accumulator
class so_magic.data.commands_manager.CommandsAccumulator[source]

Bases: so_magic.utils.notification.Observer

update(subject, *args, **kwargs) → None[source]

Receive an update (from a subject); handle an event notification.

class so_magic.data.commands_manager.CommandsManager(commands_getter=CommandGetter(_commands_accumulator=CommandsAccumulator(commands={})), decorators=None)[source]

Bases: object

[summary]

Parameters
  • prototypes (dict, optional) – initial prototypes to be supplied

  • command_factory (callable, optional) – a callable that returns an instance of Command

property command
property commands_dict

so_magic.data.data_manager module

class so_magic.data.data_manager.DataManager(engine, phi_function_class, feature_manager, commands_manager=CommandsManager(_commands_getter=CommandGetter(_commands_accumulator=CommandsAccumulator(commands={})), decorators=None))[source]

Bases: object

property command
property commands
property datapoints
property phi_class
property phis
class so_magic.data.data_manager.Phis(registry: so_magic.utils.registry.ObjectRegistry = NOTHING)[source]

Bases: so_magic.utils.notification.Observer

registry: so_magic.utils.registry.ObjectRegistry
update(subject, *args, **kwargs)[source]

Receive an update (from a subject); handle an event notification.

so_magic.data.datapoints_manager module

Defines the DatapointsManager type (class); a centralized facility where all datapoints objects should arrive and be retrieved from.

class so_magic.data.datapoints_manager.DatapointsManager(datapoints_objects=NOTHING)[source]

Bases: so_magic.utils.notification.Observer

Manage operations revolving around datapoints collection objects.

Instances of this class are able to monitor (listener/observer pattern) the creation of datapoints collection objects and store them in a dictionary structure. They also provide retrieval methods to the client to “pick up” a datapoints object.

Parameters

datapoints_objects (dict, optional) – the initial structure that stores datapoints objects

property datapoints: Optional[Iterable]

The most recently stored datapoints object.

Returns

the reference to the datapoints object

Return type

Optional[Iterable]

property state

The latest (most recent) key used to store a datapoints object.

Returns

the key under which we stored a datapoints object last time

Return type

str

update(subject: so_magic.utils.notification.Subject)[source]

Update our state based on the event/observation captured/made.

Stores the observed datapoints object in a dictionary, using the Subject’s name attribute as the key.

Parameters

subject (Subject) – the subject object observed; it acts as an event

Raises
  • RuntimeError – in case there is no ‘name’ attribute on the subject or if it is an empty string ‘’

  • RuntimeError – in case the ‘name’ attribute on the subject has already been used to store a datapoints object
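
The storage and error behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the so_magic implementation: `Subject` and `SketchDatapointsManager` are hypothetical stand-ins, and carrying the datapoints object on a `state` attribute of the subject is an assumption.

```python
from typing import Iterable, Optional


class Subject:
    """Hypothetical stand-in for so_magic.utils.notification.Subject."""

    def __init__(self, name, state=None):
        self.name = name    # key under which the datapoints will be stored
        self.state = state  # assumption: the datapoints object travels here


class SketchDatapointsManager:
    """Illustrative observer; not the library's implementation."""

    def __init__(self, datapoints_objects=None):
        self.datapoints_objects = dict(datapoints_objects or {})
        self._last_key = ''

    def update(self, subject):
        key = getattr(subject, 'name', '')
        if not key:
            # Missing, None or empty 'name' on the subject
            raise RuntimeError("Subject has no usable 'name' attribute.")
        if key in self.datapoints_objects:
            # The same key may not be used twice
            raise RuntimeError(f"Key '{key}' already stores a datapoints object.")
        self.datapoints_objects[key] = subject.state
        self._last_key = key

    @property
    def state(self) -> str:
        """The key used most recently to store a datapoints object."""
        return self._last_key

    @property
    def datapoints(self) -> Optional[Iterable]:
        """The most recently stored datapoints object, or None if nothing is stored."""
        return self.datapoints_objects.get(self._last_key)


manager = SketchDatapointsManager()
manager.update(Subject('train', state=[1, 2, 3]))
```

After the call, `manager.state` is the key `'train'` and `manager.datapoints` is the stored list; a second update with the same name raises RuntimeError.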

so_magic.data.dataset module

class so_magic.data.dataset.Dataset(datapoints, name=None, features=[])[source]

Bases: object

High level representation of data, of some form.

Instances of this class encapsulate observations in the form of datapoints as well as their respective feature vectors. Feature vectors can then be trivially “fed” into a Machine Learning algorithm (eg SOM).

Parameters
  • datapoints –

  • name (str, optional) –

Returns

[description]

Return type

[type]

property features

so_magic.data.discretization module

class so_magic.data.discretization.AbstractAlgorithm(callback: callable, arguments: list = NOTHING, parameters: dict = NOTHING)[source]

Bases: so_magic.data.discretization.AlgorithmInterface, abc.ABC

arguments: list
callback: callable
parameters: dict
class so_magic.data.discretization.AbstractDiscretizer[source]

Bases: so_magic.data.discretization.DiscretizerInterface

discretize(*args, **kwargs)[source]
class so_magic.data.discretization.AlgorithmArguments(arg_types, default_values)[source]

Bases: object

An algorithm’s expected positional arguments.

values(*args)[source]
exception so_magic.data.discretization.AlgorithmArgumentsError[source]

Bases: Exception

class so_magic.data.discretization.AlgorithmInterface[source]

Bases: abc.ABC

abstract run(*args, **kwargs)[source]
class so_magic.data.discretization.BaseBinner(algorithm)[source]

Bases: so_magic.data.discretization.BinnerInterface

bin(values, bins)[source]

The values are assumed to come from a numerical (ratio or interval) variable or from an ordinal (not nominal) categorical variable.

class so_magic.data.discretization.BaseDiscretizer(binner)[source]

Bases: so_magic.data.discretization.AbstractDiscretizer

discretize(*args, **kwargs)[source]

Expects args: dataset, feature; and kwargs: ‘nb_bins’.

class so_magic.data.discretization.BinnerClass[source]

Bases: object

subclasses = {}
class so_magic.data.discretization.BinnerFactory[source]

Bases: object

create_binner(*args, **kwargs) → so_magic.data.discretization.BaseBinner[source]
equal_length_binner(*args, **kwargs) → so_magic.data.discretization.BaseBinner[source]

Binner that creates bins of equal size (max_value - min_value).

parent_class

alias of so_magic.data.discretization.BinnerClass

quantisized_binner(*args, **kwargs) → so_magic.data.discretization.BaseBinner[source]

Binner that will adjust the bin sizes so that the observations are evenly distributed in the bins

Raises

NotImplementedError – [description]

Returns

[description]

Return type

BaseBinner
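
To make the difference between the two binners above concrete, here is a small standard-library sketch. The function names `equal_length_edges`, `quantile_edges` and `assign_bins` are hypothetical and not part of so_magic, and the code does not call the library. Note also that the Raises entry above suggests the quantile-based binner may not be implemented in the package itself.

```python
from statistics import quantiles


def equal_length_edges(values, nb_bins):
    """Edges of nb_bins intervals, each (max - min) / nb_bins wide."""
    low, high = min(values), max(values)
    width = (high - low) / nb_bins
    return [low + i * width for i in range(nb_bins + 1)]


def quantile_edges(values, nb_bins):
    """Edges chosen so that each bin receives roughly the same number of observations."""
    inner = quantiles(values, n=nb_bins, method='inclusive')
    return [min(values)] + inner + [max(values)]


def assign_bins(values, edges):
    """Map each value to the index of its bin (the last bin is closed on the right)."""
    last = len(edges) - 2
    result = []
    for value in values:
        # Count how many inner edges the value has reached or passed
        index = sum(1 for edge in edges[1:-1] if value >= edge)
        result.append(min(index, last))
    return result


data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
equal = equal_length_edges(data, 3)
quant = quantile_edges(data, 3)
```

With the outlier 100 in the data, equal-length bins place the eight small values in the first bin, whereas quantile edges spread the nine observations three per bin.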

class so_magic.data.discretization.BinnerInterface[source]

Bases: abc.ABC

abstract bin(values, bins)[source]
class so_magic.data.discretization.BinningAlgorithm[source]

Bases: object

classmethod from_built_in(algorithm_id)[source]
subclasses = {'pd.cut': <class 'so_magic.data.discretization.PDCutBinningAlgorithm'>}
class so_magic.data.discretization.Discretizer(binner)[source]

Bases: so_magic.data.discretization.BaseDiscretizer

property algorithm
classmethod from_algorithm(alg)[source]
class so_magic.data.discretization.DiscretizerInterface[source]

Bases: abc.ABC

discretize(*args, **kwargs)[source]
class so_magic.data.discretization.FeatureDiscretizer(binner, feature)[source]

Bases: so_magic.data.discretization.BaseDiscretizer

discretize(*args, **kwargs)[source]

Expects args: dataset, nb_bins.

class so_magic.data.discretization.FeatureDiscretizerFactory(binner_factory)[source]

Bases: object

categorical(feature, **kwargs) → so_magic.data.discretization.FeatureDiscretizer[source]
numerical(feature, **kwargs) → so_magic.data.discretization.FeatureDiscretizer[source]
class so_magic.data.discretization.MagicAlgorithm(callback: callable, arguments: list = NOTHING, parameters: dict = NOTHING)[source]

Bases: so_magic.data.discretization.AbstractAlgorithm

arguments: list
callback: callable
property output
parameters: dict
run(*args, **kwargs)[source]
set_default_parameters()[source]
update_parameters(**kwargs)[source]
exception so_magic.data.discretization.MagicAlgorithmError[source]

Bases: Exception

exception so_magic.data.discretization.MagicAlgorithmParametersError[source]

Bases: Exception

class so_magic.data.discretization.PDCutBinningAlgorithm(callback: callable, arguments: list = NOTHING, parameters: dict = NOTHING)[source]

Bases: so_magic.data.discretization.MagicAlgorithm

arguments: list
callback: callable
parameters: dict
so_magic.data.discretization.call_method(a_callable)[source]

so_magic.data.encoding module

class so_magic.data.encoding.EncoderInterface[source]

Bases: abc.ABC

abstract encode(*args, **kwargs)[source]
class so_magic.data.encoding.NominalAttributeEncoder(values_set: list = NOTHING)[source]

Bases: so_magic.data.encoding.EncoderInterface, abc.ABC

Encode the observations of a categorical nominal variable.

The client code can supply the possible values for the nominal variable, if known a priori. The possible values are stored in the ‘values_set’ attribute/property. If they are not supplied they should be computed at runtime (when running the encode method).

It also defines and stores the string identifiers for each column produced in the ‘columns’ attribute/property.

Parameters

values_set (list) – the possible values of the nominal variable observations, if known a priori

columns
values_set
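
The interplay between ‘values_set’ and ‘columns’ can be sketched as below. This assumes a one-hot style encoding, which the "column produced" wording suggests but the source does not confirm. `SketchNominalEncoder` and its `prefix` parameter are hypothetical and are not the so_magic class.

```python
class SketchNominalEncoder:
    """Hypothetical one-hot encoder for a nominal variable; not the so_magic class."""

    def __init__(self, values_set=None, prefix='value'):
        # The possible values may be supplied a priori ...
        self.values_set = list(values_set) if values_set else []
        # Assumption: column identifiers look like '<prefix>_<value>'
        self.prefix = prefix
        self.columns = []

    def encode(self, observations):
        if not self.values_set:
            # ... otherwise they are computed at runtime, in a stable (sorted) order
            self.values_set = sorted(set(observations))
        self.columns = [f'{self.prefix}_{value}' for value in self.values_set]
        # One row per observation, one 0/1 entry per possible value
        return [[1 if obs == value else 0 for value in self.values_set]
                for obs in observations]


encoder = SketchNominalEncoder()
rows = encoder.encode(['red', 'green', 'red'])
```

Supplying `values_set` up front fixes the column order and lets values that never occur in the observations still receive a column.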

so_magic.data.interfaces module

Defines interfaces related to various operations on table-like data.

class so_magic.data.interfaces.TabularIterator[source]

Bases: abc.ABC

Iterate over the rows or columns of a table-like data structure.

Classes implementing this interface gain the ability to iterate over the values found in the rows or the columns of a table-like data structure. They can also iterate over the columns indices/identifiers.

abstract columnnames(data) → Union[Iterable[str], Iterable[int]][source]

Iterate over data (table) column indices/identifiers.

Parameters

data (object) – the (data) table whose column indices/identifiers to iterate over

Returns

the column indices/identifiers of the (data) table

Return type

Union[Iterable[str], Iterable[int]]

abstract itercolumns(data) → Iterable[source]

Iterate over the (data) table’s columns.

Get an iterable over the table’s columns.

Parameters

data (object) – the (data) table to iterate over its columns

Returns

the columns of the (data) table

Return type

Iterable

abstract iterrows(data) → Iterable[source]

Iterate over the (data) table’s rows.

Get an iterable over the table’s rows.

Parameters

data (object) – the (data) table to iterate over its rows

Returns

the rows of the (data) table

Return type

Iterable
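
As an illustration of what an implementation of these three methods might look like, the sketch below treats a list of dicts (one dict per row) as the table. `RecordsIterator` is a hypothetical name; it mirrors the method names of the interface but deliberately does not import or subclass so_magic, so it runs on its own.

```python
from typing import Dict, Iterable, List


class RecordsIterator:
    """Hypothetical iterator over a table stored as a list of dicts."""

    def columnnames(self, data: List[Dict]) -> Iterable[str]:
        # Assumption: every row has the same keys as the first row
        return iter(data[0].keys()) if data else iter(())

    def iterrows(self, data: List[Dict]) -> Iterable:
        # Yield the values of each row, one row at a time
        return (list(row.values()) for row in data)

    def itercolumns(self, data: List[Dict]) -> Iterable:
        # Yield the values of each column, one column at a time
        return ([row[name] for row in data] for name in self.columnnames(data))


table = [{'age': 31, 'city': 'Athens'}, {'age': 45, 'city': 'Oslo'}]
iterator = RecordsIterator()
column_names = list(iterator.columnnames(table))
```

Passing the table as an argument, rather than storing it on the instance, follows the signatures documented above and keeps the iterator stateless.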

class so_magic.data.interfaces.TabularMutator[source]

Bases: abc.ABC

Mutate (alter) the contents of a table-like data structure.

Classes implementing this interface supply their instances the ability to alter the contents of a table-like data structure.

abstract add_column(*args, **kwargs)[source]

Add a new column to table-like data.

Raises

NotImplementedError – [description]

class so_magic.data.interfaces.TabularRetriever[source]

Bases: abc.ABC

Operations on table-like data.

Classes implementing this interface gain the ability to perform various operations on data structures that resemble a table (have indexable columns, rows, etc); most importantly, they can slice through the data (retrieve a specific row or column).

abstract column(identifier: Union[str, int], data) → Iterable[source]

Slice through the data (table) and get the specified column’s values.

Parameters
  • identifier (Union[str, int]) – unique identifier/index of column

  • data (object) – the data to slice through

Returns

the values contained in the column requested

Return type

Iterable

abstract get_numerical_attributes(data) → Iterable[source]

Get the data’s attributes that represent numerical values.

Returns the attributes that fall under the Numerical Variables: either Ratio or Interval type of variables.

Two types of numerical variables are supported:

Ratio variable: numerical variable where all operations are supported (+, -, *, /) and true zero is defined; eg weight.

Interval variable: numerical variable where differences are interpretable; supported operations: [+, -]; no true zero; eg temperature in centigrade (ie Celsius).

Parameters

data (object) – the data from which to retrieve the numerical attributes

Returns

the numerical attributes found

Return type

Iterable

abstract nb_columns(data) → int[source]

Get the number of columns that the data (table) has.

Parameters

data (object) – the data (table) whose columns to count

Returns

the number of the (data) table’s columns

Return type

int

abstract nb_rows(data) → int[source]

Get the number of rows that the data (table) has.

Parameters

data (object) – the data (table) whose rows to count

Returns

the number of the (data) table’s rows

Return type

int

abstract row(identifier, data)[source]

Slice through the data (table) and get the specified row’s values.

Parameters
  • identifier (Union[str, int]) – unique identifier/index of row

  • data (object) – the data to slice through

Returns

the values contained in the row requested

Return type

Iterable
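
The retrieval operations above could be implemented for a list-of-dicts table roughly as follows. `RecordsRetriever` is a hypothetical name and does not subclass the so_magic interface. Treating an attribute as numerical when every value is a non-boolean number is an assumption of this sketch; a type check alone cannot distinguish ratio from interval variables.

```python
from numbers import Number


class RecordsRetriever:
    """Hypothetical retriever for a table stored as a list of dicts (one per row)."""

    def column(self, identifier, data):
        # Values of one column, taken from every row
        return [row[identifier] for row in data]

    def row(self, identifier, data):
        # Here a row identifier is simply its integer position
        return list(data[identifier].values())

    def nb_columns(self, data):
        return len(data[0]) if data else 0

    def nb_rows(self, data):
        return len(data)

    def get_numerical_attributes(self, data):
        # Assumption: numerical means all values are numbers, booleans excluded
        names = list(data[0].keys()) if data else []
        return [name for name in names
                if all(isinstance(row[name], Number) and not isinstance(row[name], bool)
                       for row in data)]


table = [
    {'weight': 70.5, 'temp_c': 21, 'city': 'Athens'},
    {'weight': 82.0, 'temp_c': -3, 'city': 'Oslo'},
]
retriever = RecordsRetriever()
numerical = retriever.get_numerical_attributes(table)
```

In this toy table `weight` plays the role of a ratio variable and `temp_c` of an interval variable, while `city` is excluded as non-numerical.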

so_magic.data.magic_datapoints_factory module

This module is responsible for providing the means to create (instantiate) objects representing Datapoints collections.

class so_magic.data.magic_datapoints_factory.BroadcastingDatapointsFactory(subject: so_magic.utils.notification.Subject = NOTHING)[source]

Bases: so_magic.data.datapoints.datapoints.DatapointsFactory

Creates Datapoints objects and informs its subscribers when that happens.

A factory class that informs its subscribers when a new object that implements the DatapointsInterface is created (following a request).

Parameters

subject (Subject, optional) – the subject of observation; the “thing” that others listen to

create(datapoints_factory_type: str, *args, **kwargs) → Iterable[source]

Create new Datapoints and inform subscribers.

The factory method that returns a new object of DatapointsInterface, by looking at the registered constructors to delegate the object creation.

Parameters

datapoints_factory_type (str) – the name of the “constructor” to use

Raises

RuntimeError – [description]

Returns

instance implementing the DatapointsInterface

Return type

Iterable

name: str
subject: so_magic.utils.notification.Subject
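
The create-then-notify flow described for this factory can be sketched as below. Every name here (`SketchSubject`, `SketchBroadcastingFactory`, `register_constructor`, `Recorder`) is hypothetical, and the way the subject carries the new object (a `state` attribute) is an assumption; the real classes in so_magic may differ.

```python
class SketchSubject:
    """Hypothetical subject that keeps a list of observers and notifies them."""

    def __init__(self):
        self.name = ''
        self.state = None
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self):
        for observer in self._observers:
            observer.update(self)


class SketchBroadcastingFactory:
    """Illustrative factory that informs subscribers about each object it creates."""

    constructors = {}  # registered "constructors", keyed by name

    def __init__(self, subject=None):
        self.subject = subject or SketchSubject()
        self.name = ''

    @classmethod
    def register_constructor(cls, name):
        def decorator(func):
            cls.constructors[name] = func
            return func
        return decorator

    def create(self, datapoints_factory_type, *args, **kwargs):
        try:
            constructor = self.constructors[datapoints_factory_type]
        except KeyError as error:
            raise RuntimeError(
                f"No constructor registered as '{datapoints_factory_type}'.") from error
        datapoints = constructor(*args, **kwargs)  # delegate the object creation
        self.subject.name = self.name
        self.subject.state = datapoints
        self.subject.notify()                      # inform the subscribers
        return datapoints


class Recorder:
    """Hypothetical observer that records every notification it receives."""

    def __init__(self):
        self.seen = []

    def update(self, subject):
        self.seen.append((subject.name, subject.state))


@SketchBroadcastingFactory.register_constructor('from-list')
def from_list(items):
    return list(items)


factory = SketchBroadcastingFactory()
recorder = Recorder()
factory.subject.attach(recorder)
factory.name = 'batch-1'
created = factory.create('from-list', (1, 2, 3))
```

An observer such as a datapoints manager would take the place of `Recorder`, storing each new object under the name set on the subject.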

Module contents

so_magic.data.init_data_manager(engine)[source]