fextract_fragment_length_distribution package

Submodules

fextract_fragment_length_distribution.plugin module

class lbfextract.fextract_fragment_length_distribution.plugin.CliHook[source]

Bases: object

This CliHook implements the command-line interface for the extract_fragment_length_distribution feature extraction method.

extract_fragment_length_distribution

Given a set of genomic intervals of equal length w, extract_fragment_length_distribution calculates the fragment length distribution at each position, which can be written as:

\[\begin{split}\mathbf{d}_l = \left( \frac{1}{|F|} \sum_{\substack{f \in F \\ |f| = p \\ l \in f}} 1 \right)_{p = p_s}^{p_e}\end{split}\]

where \(l\) is the genomic position, \(f\) is a fragment in the set of fragments \(F\), \(p\) is the fragment length, \(p_e\) is the maximum fragment length, and \(p_s\) is the minimum fragment length. Each entry of \(\mathbf{d}_l\) is therefore the fraction of fragments that have length \(p\) and cover position \(l\).
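The formula above can be illustrated with a small, dependency-free sketch. It is not the plugin's implementation: the function name, the `(start, end)` half-open tuple format for fragments, and the choice to normalise by the total number of fragments passed in are all assumptions made for illustration.

```python
def fragment_length_distribution(fragments, window_start, window_len, p_s, p_e):
    """Return a (p_e - p_s + 1) x window_len table of normalised counts.

    Row index is fragment length minus p_s; column index is the position
    relative to window_start. `fragments` is a list of (start, end) pairs
    with half-open coordinates (hypothetical input format).
    """
    counts = [[0.0] * window_len for _ in range(p_e - p_s + 1)]
    for start, end in fragments:
        p = end - start  # fragment length |f|
        if p < p_s or p > p_e:
            continue  # outside the requested length range
        # Clip the fragment to the window before counting covered positions.
        lo = max(start - window_start, 0)
        hi = min(end - window_start, window_len)
        for pos in range(lo, hi):
            counts[p - p_s][pos] += 1.0
    total = len(fragments)  # |F|, assumed here to be all fragments given
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


# Three toy fragments over a window of length 5, lengths 2 to 3.
d = fragment_length_distribution([(0, 3), (1, 4), (2, 4)], 0, 5, 2, 3)
```

Each column of the returned table corresponds to one position \(l\), and each row to one fragment length between \(p_s\) and \(p_e\).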

get_command() → Command[source]

class lbfextract.fextract_fragment_length_distribution.plugin.FextractHooks[source]

Bases: object

plot_signal(signal: Signal, extra_config: Any) → Figure[source]

Parameters:

signal – Signal object containing the signals per interval
extra_config – extra configuration that may be used in the hook implementation

save_signal(signal: Signal, extra_config: Any) → None[source]

Parameters:

signal – Signal object containing the signals per interval
extra_config – extra configuration that may be used in the hook implementation

transform_all_intervals(single_intervals_transformed_reads: Signal, config: Any, extra_config: Any) → Signal[source]

Parameters:

single_intervals_transformed_reads – Signal object containing the signals per interval
config – config specific to the function
extra_config – extra configuration that may be used in the hook implementation

transform_single_intervals(transformed_reads: DataFrame, config: SingleSignalTransformerConfig, extra_config: AppExtraConfig) → Signal[source]

Parameters:

transformed_reads – ReadsPerIntervalContainer holding a list of ReadsPerInterval objects, which are essentially lists with information about the start and end of the interval
config – config specific to the function
extra_config – config containing context information plus extra parameters

lbfextract.fextract_fragment_length_distribution.plugin.calculate_reference_distribution(path_to_sample, min_length, max_length, chr_name, start, end)[source]

lbfextract.fextract_fragment_length_distribution.plugin.get_peaks(distribution, height=0.0001, distance=100)[source]

lbfextract.fextract_fragment_length_distribution.plugin.get_position_coefficient(config, array)[source]

lbfextract.fextract_fragment_length_distribution.plugin.subsample_fragment_lengths(x, n)[source]

fextract_fragment_length_distribution.schemas module

class lbfextract.fextract_fragment_length_distribution.schemas.SingleSignalTransformerConfig(config_dict: dict | None = None)[source]

Bases: Config

flip_based_on_strand = None

gc_correction = None

max_fragment_length = None

min_fragment_length = None

n = None

n_bins_len = None

n_bins_pos = None

peaks = None

possible_signal_transformers = {'entropy', 'fld', 'fld_dyad', 'fld_middle', 'fld_middle_n', 'fld_peter_ulz'}

read_end = None

read_start = None

schema = <Schema({'flip_based_on_strand': Coerce(bool, msg='flip_based_on_strand should be a boolean'), 'min_fragment_length': Coerce(int, msg='n should be a integer'), 'max_fragment_length': Coerce(int, msg='n should be a integer'), 'n': All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), 'w': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'subsample': <class 'bool'>, 'signal_transformer': In({'fld_dyad', 'fld_peter_ulz', 'fld_middle_n', 'entropy', 'fld', 'fld_middle'}), 'n_bins_len': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'n_bins_pos': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'gc_correction': Coerce(bool, msg='gc_correction should be a boolean'), 'tag': Coerce(str, msg='tag should be a string'), 'read_start': Coerce(int, msg='the start of the region to used of a read'), 'read_end': Coerce(int, msg='the end of the region to used of a read'), 'peaks': Coerce(list, msg='peacks should be a boolean')}, extra=PREVENT_EXTRA, required=False) object>

signal_transformer = None

subsample = None

tag = None

w = None
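The schema above rejects unknown keys (`extra=PREVENT_EXTRA`), treats every key as optional (`required=False`), and restricts `signal_transformer` to the names in `possible_signal_transformers`. The sketch below mirrors only those three rules in plain Python so that a configuration dictionary can be sanity-checked before it is passed on. The helper `check_config` and the example values are hypothetical and are not part of lbfextract.

```python
# Key names and transformer names are copied from the schema repr above.
ALLOWED_KEYS = {
    "flip_based_on_strand", "min_fragment_length", "max_fragment_length",
    "n", "w", "subsample", "signal_transformer", "n_bins_len", "n_bins_pos",
    "gc_correction", "tag", "read_start", "read_end", "peaks",
}
SIGNAL_TRANSFORMERS = {
    "entropy", "fld", "fld_dyad", "fld_middle", "fld_middle_n", "fld_peter_ulz",
}


def check_config(config_dict):
    """Return a list of problems; an empty list means these basic checks pass."""
    problems = []
    unknown = sorted(set(config_dict) - ALLOWED_KEYS)
    if unknown:
        problems.append(f"unknown keys: {unknown}")
    transformer = config_dict.get("signal_transformer")
    if transformer is not None and transformer not in SIGNAL_TRANSFORMERS:
        problems.append(f"unsupported signal_transformer: {transformer!r}")
    n = config_dict.get("n")
    if n is not None and int(n) < 1:
        problems.append("n must be at least 1")
    return problems


# Illustrative values only; they are not documented defaults of the schema.
example_config = {
    "min_fragment_length": 100,
    "max_fragment_length": 400,
    "signal_transformer": "fld",
    "gc_correction": False,
    "n": 5,
}
```

A dictionary that passes these checks could then be handed to `SingleSignalTransformerConfig(config_dict=...)`, whose own schema remains the authoritative validation.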

fextract_fragment_length_distribution.signal_summarizers module

class lbfextract.fextract_fragment_length_distribution.signal_summarizers.PeterUlzFragmentLengthDistribution(min_fragment_length: int, max_fragment_length: int, gc_correction: bool, tag: str, read_start: int = 53, read_end: int = 113)[source]

Bases: TfbsFragmentLengthDistribution

class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistribution(min_fragment_length: int = 100, max_fragment_length: int = 400, gc_correction: bool = False, tag: str = None)[source]

Bases: object

get_relative_start_end(read: AlignedSegment, start: int)[source]

class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistributionDyad(min_fragment_length=100, max_fragment_length=400, gc_correction: bool = False, tag: str = None, n=5, peaks: list = None)[source]

Bases: TfbsFragmentLengthDistributionMiddleNPoints

get_relative_start_end(read: AlignedSegment, start: int) → list[source]

class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistributionMiddleNPoints(min_fragment_length=100, max_fragment_length=400, gc_correction: bool = False, tag: str = None, n=5)[source]

Bases: TfbsFragmentLengthDistribution

get_relative_start_end(read: AlignedSegment, start: int)[source]

class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistributionMiddlePoint(min_fragment_length: int = 100, max_fragment_length: int = 400, gc_correction: bool = False, tag: str = None)[source]

Bases: TfbsFragmentLengthDistribution

get_relative_start_end(read: AlignedSegment, start: int)[source]
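This page gives no description of the `get_relative_start_end` overrides, so the following is only a conceptual sketch of what "relative start and end" could mean for the middle-point and middle-n-points variants, inferred from the class names. The functions, their arguments (plain integer coordinates instead of an `AlignedSegment`), and the rounding choices are all assumptions and may differ from the actual implementation.

```python
def middle_point(fragment_start, fragment_end, interval_start):
    """Relative [start, end) of the single central base of a fragment."""
    mid = (fragment_start + fragment_end) // 2  # integer midpoint (assumed rounding)
    rel = mid - interval_start                  # shift into interval coordinates
    return rel, rel + 1


def middle_n_points(fragment_start, fragment_end, interval_start, n=5):
    """Relative [start, end) of n bases centred on the fragment midpoint."""
    mid = (fragment_start + fragment_end) // 2
    rel = mid - interval_start
    half = n // 2
    return rel - half, rel - half + n


# A 167 bp fragment starting at 100, in an interval starting at 50.
single = middle_point(100, 267, 50)
around = middle_n_points(100, 267, 50, n=5)
```

Under these assumptions, the returned pair marks which positions of the interval a fragment contributes to, so narrowing the span concentrates each fragment's signal around its centre.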
