fextract_fragment_length_distribution package
Submodules
fextract_fragment_length_distribution.plugin module
- class lbfextract.fextract_fragment_length_distribution.plugin.CliHook
Bases: object
This CliHook implements the CLI interface for the extract_fragment_length_distribution feature extraction method.
extract_fragment_length_distribution
Given a set of genomic intervals having the same length w, extract_fragment_length_distribution calculates the fragment length distribution at each position, which can be represented as:
\[\begin{split}\mathbf{d}_l = \left( \frac{1}{|F|} \sum_{\substack{f \in F \\ |f| = p \\ l \in f}} \mathbb{1} \right)_{p = p_s}^{p_e}\end{split}\]
where \(l\) represents the genomic position, \(F\) the set of fragments, \(f\) a single fragment, \(p\) the fragment length, \(p_s\) the minimum fragment length, and \(p_e\) the maximum fragment length.
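The formula can be illustrated with a minimal pure-Python sketch. It is not the package's implementation: the function name `fragment_length_distribution`, the half-open `(start, end)` fragment representation, and the choice of \(F\) as all fragments passed in are assumptions made for illustration only.

```python
def fragment_length_distribution(fragments, interval_start, width, min_len, max_len):
    """Return a (max_len - min_len + 1) x width matrix of coverage fractions.

    fragments: iterable of (start, end) pairs, half-open coordinates (assumption).
    Row r corresponds to fragment length min_len + r,
    column c to genomic position interval_start + c.
    """
    fragments = list(fragments)
    n_lengths = max_len - min_len + 1
    matrix = [[0.0] * width for _ in range(n_lengths)]
    if not fragments:
        return matrix

    weight = 1.0 / len(fragments)  # the 1/|F| factor (F = all fragments given)
    interval_end = interval_start + width

    for start, end in fragments:
        length = end - start
        if not (min_len <= length <= max_len):
            continue  # outside [p_s, p_e]
        row = length - min_len
        # Add the fragment's weight at every interval position it covers.
        for pos in range(max(start, interval_start), min(end, interval_end)):
            matrix[row][pos - interval_start] += weight
    return matrix


# Toy data: three fragments over an interval of width 4.
fragments = [(0, 3), (1, 4), (2, 4)]
dist = fragment_length_distribution(
    fragments, interval_start=0, width=4, min_len=2, max_len=3
)
```

Each column of `dist` plays the role of \(\mathbf{d}_l\) for one position: the fraction of fragments of each length that cover that position.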
- class lbfextract.fextract_fragment_length_distribution.plugin.FextractHooks
Bases: object
- plot_signal(signal: Signal, extra_config: Any) -> Figure
- Parameters:
signal – Signal object containing the signals per interval
extra_config – extra configuration that may be used in the hook implementation
- save_signal(signal: Signal, extra_config: Any) -> None
- Parameters:
signal – Signal object containing the signals per interval
extra_config – extra configuration that may be used in the hook implementation
- transform_all_intervals(single_intervals_transformed_reads: Signal, config: Any, extra_config: Any) -> Signal
- Parameters:
single_intervals_transformed_reads – Signal object containing the signals per interval
config – config specific to the function
extra_config – extra configuration that may be used in the hook implementation
- transform_single_intervals(transformed_reads: DataFrame, config: SingleSignalTransformerConfig, extra_config: AppExtraConfig) -> Signal
- Parameters:
transformed_reads – ReadsPerIntervalContainer containing a list of ReadsPerInterval objects, each of which is essentially a list holding the start and end of an interval
config – config specific to the function
extra_config – config containing context information plus extra parameters
- lbfextract.fextract_fragment_length_distribution.plugin.calculate_reference_distribution(path_to_sample, min_length, max_length, chr_name, start, end)
- lbfextract.fextract_fragment_length_distribution.plugin.get_peaks(distribution, height=0.0001, distance=100)
fextract_fragment_length_distribution.schemas module
- class lbfextract.fextract_fragment_length_distribution.schemas.SingleSignalTransformerConfig(config_dict: dict | None = None)
Bases: Config
- flip_based_on_strand = None
- gc_correction = None
- max_fragment_length = None
- min_fragment_length = None
- n = None
- n_bins_len = None
- n_bins_pos = None
- peaks = None
- possible_signal_transformers = {'entropy', 'fld', 'fld_dyad', 'fld_middle', 'fld_middle_n', 'fld_peter_ulz'}
- read_end = None
- read_start = None
- schema = <Schema({'flip_based_on_strand': Coerce(bool, msg='flip_based_on_strand should be a boolean'), 'min_fragment_length': Coerce(int, msg='n should be a integer'), 'max_fragment_length': Coerce(int, msg='n should be a integer'), 'n': All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), 'w': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'subsample': <class 'bool'>, 'signal_transformer': In({'fld_dyad', 'fld_peter_ulz', 'fld_middle_n', 'entropy', 'fld', 'fld_middle'}), 'n_bins_len': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'n_bins_pos': Any(None, All(Coerce(int, msg='n should be a integer'), Range(min=1, max=None, min_included=True, max_included=True, msg='n should be greater than 1'), msg=None), msg=None), 'gc_correction': Coerce(bool, msg='gc_correction should be a boolean'), 'tag': Coerce(str, msg='tag should be a string'), 'read_start': Coerce(int, msg='the start of the region to used of a read'), 'read_end': Coerce(int, msg='the end of the region to used of a read'), 'peaks': Coerce(list, msg='peacks should be a boolean')}, extra=PREVENT_EXTRA, required=False) object>
- signal_transformer = None
- subsample = None
- tag = None
- w = None
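As a rough illustration of the options listed above, the sketch below builds a plain dictionary using key names taken from the documented schema and runs a few sanity checks on it. The helper `find_config_problems` is hypothetical and not part of lbfextract; the values are illustrative, and the check that the minimum length does not exceed the maximum is an extra assumption, not something the schema states. According to the documented signature, a dictionary like this would be passed as `config_dict` to `SingleSignalTransformerConfig`.

```python
# Allowed transformer names, copied from possible_signal_transformers above.
POSSIBLE_SIGNAL_TRANSFORMERS = {
    "entropy", "fld", "fld_dyad", "fld_middle", "fld_middle_n", "fld_peter_ulz",
}

# Illustrative values only; key names follow the documented schema.
config_dict = {
    "min_fragment_length": 100,
    "max_fragment_length": 400,
    "signal_transformer": "fld",
    "flip_based_on_strand": True,
    "gc_correction": False,
    "n": 5,
    "subsample": False,
}


def find_config_problems(cfg):
    """Hypothetical helper: return a list of human-readable problems."""
    problems = []
    name = cfg.get("signal_transformer")
    if name not in POSSIBLE_SIGNAL_TRANSFORMERS:
        problems.append(f"unknown signal_transformer: {name!r}")
    # Extra sanity check (assumption, not part of the documented schema).
    if int(cfg["min_fragment_length"]) > int(cfg["max_fragment_length"]):
        problems.append("min_fragment_length exceeds max_fragment_length")
    # The schema restricts n to integers >= 1.
    if int(cfg.get("n", 1)) < 1:
        problems.append("n should be at least 1")
    return problems


problems = find_config_problems(config_dict)
```

An empty `problems` list means the dictionary passed these simplified checks; the library's own schema validation remains the authority.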
fextract_fragment_length_distribution.signal_summarizers module
- class lbfextract.fextract_fragment_length_distribution.signal_summarizers.PeterUlzFragmentLengthDistribution(min_fragment_length: int, max_fragment_length: int, gc_correction: bool, tag: str, read_start: int = 53, read_end: int = 113)
- class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistribution(min_fragment_length: int = 100, max_fragment_length: int = 400, gc_correction: bool = False, tag: str = None)
Bases: object
- class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistributionDyad(min_fragment_length=100, max_fragment_length=400, gc_correction: bool = False, tag: str = None, n=5, peaks: list = None)
- class lbfextract.fextract_fragment_length_distribution.signal_summarizers.TfbsFragmentLengthDistributionMiddleNPoints(min_fragment_length=100, max_fragment_length=400, gc_correction: bool = False, tag: str = None, n=5)