Get started
CLI tutorial
The CLI tutorial is a good place to start. It walks you through the basics of the command line interface. After installing the package, start the CLI by typing:
$ lbfextract --help
At the bottom of the help message you will find a list of all available commands. To get more information about a specific command, type:
$ lbfextract <command> --help
The following commands are available, grouped by category:
- feature_extraction_commands
extract-coverage
extract-coverage-in-batch
extract-coverage-around-dyads
extract-coverage-around-dyads-in-batch
extract-middle-point-coverage
extract-middle-point-coverage-in-batch
extract-middle-n-points-coverage
extract-middle-n-points-coverage-in-batch
extract-entropy
extract-entropy-in-batch
extract-fragment-length-distribution
extract-fragment-length-distribution-in-batch
extract-relative-entropy-to-flanking
extract-relative-entropy-to-flanking-in-batch
- post_extraction_analysis_commands
generate-sample-sheet
get-differentially-active-genomic-intervals
- setup
create-conda-envs
disable-autocompletion
enable-autocompletion
new-plugin
start-tui
Results
Each feature extraction method creates a folder at the output path defined by the user. The signal and the related plots are saved in this folder.
GC correction
LBFextract is compatible with GC correction methods such as GCparagon, which adds a tag to each read in the BAM file describing the correction factor to be applied to the corresponding aligned fragment. The effect of the GC correction is shown in the following figure:
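As a rough illustration of what a per-fragment correction factor does to a coverage signal, the sketch below treats the factor as a multiplicative weight that replaces the usual count of 1 per fragment. This is an assumption made for illustration, not LBFextract's actual implementation; the `Fragment` record and the `weighted_coverage` function are hypothetical names. In practice the weight would be read from the BAM tag rather than supplied by hand.

```python
from typing import Iterable, List, Tuple

# Hypothetical fragment record: (start, end, gc_weight), with end exclusive.
Fragment = Tuple[int, int, float]


def weighted_coverage(
    fragments: Iterable[Fragment], region_start: int, region_end: int
) -> List[float]:
    """Sum the per-fragment weights at every position of [region_start, region_end)."""
    coverage = [0.0] * (region_end - region_start)
    for start, end, weight in fragments:
        # Clip the fragment to the region before accumulating its weight.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            coverage[pos - region_start] += weight
    return coverage


# Two overlapping fragments; the second one is down-weighted to 0.5.
fragments = [(0, 4, 1.0), (2, 6, 0.5)]
print(weighted_coverage(fragments, 0, 6))
# [1.0, 1.0, 1.5, 1.5, 0.5, 0.5]
```

Without correction, every fragment would contribute 1.0 and the overlapping positions would read 2.0 instead of 1.5.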
TUI tutorial
Warning
Early Stage Development: Please note that this feature is still in its early stages of development. As such, users should expect potential issues, breakages, or incomplete functionality.
We appreciate your patience and understanding as we work to improve and stabilize this feature. Your feedback and bug reports are valuable in helping us enhance the installation process.
Stay tuned for updates on the progress of this feature.
The TUI can be started as follows:
$ lbfextract start-tui --path_to_root_dir <path_to_root_dir>
Python API tutorial
LBFextract offers a class that exposes all feature extraction methods directly from Python.
The class is called FeatureExtractor and can be imported as follows:
from lbfextract.feature_extractor import FeatureExtractor
fe = FeatureExtractor()
The FeatureExtractor class, instantiated on the second line above, has four methods: help, get_exctractor_names, get_help_for_extractor, and extract. The get_exctractor_names method returns a list of all available feature extraction methods, as shown in the following example:
fe.get_exctractor_names()
[
    'extract-coverage',
    'extract-entropy',
    'extract-fragment-length-distribution',
    'extract-fragment-length-distribution-in-batch',
    ...
]
The get_help_for_extractor method returns the help message of a specific feature extraction method, as shown in the following example:
fe.get_help_for_extractor("extract_coverage_around_diads")
extractor extract_coverage_around_diads with following parameters:
    path_to_bam(None) => path to the bam file to be used
    path_to_bed(None) => path to the bed file to be used
    output_path(None) => path to the output directory
    skip_read_fetching(False) => Boolean flag. When it is set, the fetching of the reads is skipped and the latest timestamp of this run (identified by the id) is retrieved
    exp_id(None) => run id
    window(1000) => Integer describing the number of bases to be extracted around the middle point of an interval present in the BED file
    flanking_window(1000) => Integer describing the number of bases to be extracted after the window
    extra_bases(2000) => Integer describing the number of bases to be extracted from the BAM file when removing the unused bases to be sure to get all the proper pairs, which may be mapping up to 2000 bs
    n_binding_sites(1000) => number of intervals to be used to extract the signal, if it is higher then the provided intervals, all the intervals will be used
    summarization_method(mean) => method to be used to summarize the signal: { mean, median, max, min }
    percentage_of_trimming(0.1) => Percentage of bases to be removed from the sides of a read. This is generally useful with liquid biopsy data when the presence of the nucleosome dyad is assumed to be at the center for reads below 170 bp
    cores(1) => number of cores to be used for the computation
    flip_based_on_strand(False) => flip the signal based on the strand
    gc_correction_tag(None) => tag to be used to extract gc coefficient per read from a bam file
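To make the summarization_method parameter in the help text above more concrete, here is a minimal sketch that collapses one signal per genomic interval into a single profile, position by position. It assumes that summarization happens across intervals at each position, which the help text does not state explicitly. The `summarize` function and the `SUMMARIZERS` table are hypothetical and not part of LBFextract.

```python
import statistics
from typing import Callable, Dict, List, Sequence

# Map the four method names listed in the help text to plain Python reducers.
SUMMARIZERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "max": max,
    "min": min,
}


def summarize(signals: List[List[float]], method: str = "mean") -> List[float]:
    """Collapse equally long per-interval signals into one value per position."""
    if method not in SUMMARIZERS:
        raise ValueError(f"unknown summarization method: {method!r}")
    reducer = SUMMARIZERS[method]
    # zip(*signals) walks over positions, yielding one value per interval.
    return [reducer(column) for column in zip(*signals)]


# Three intervals, each with a signal of length three.
signals = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [5.0, 2.0, 0.0]]
print(summarize(signals, "median"))  # [3.0, 2.0, 1.0]
print(summarize(signals, "max"))     # [5.0, 2.0, 3.0]
```

The median is less sensitive than the mean to a few intervals with extreme coverage, which may matter when only a small number of intervals is used.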
The extract method is the most important method of the FeatureExtractor class. It starts the feature extraction process and returns a list containing the Signal object and the generated plot figure.
fe.extract(
    "extract_coverage_around_diads", **{
        "path_to_bam": path_to_bam,
        "path_to_bed": path_to_bed,
        "output_path": path_to_results_range_specific
})
[Signal(obj), Figure(obj)]
Warning
LBFextract automatically selects the temporary directory depending on the operating system. If none of the following environment variables is set: FRAGMENTOMICS_TMP (the LBFextract-specific tmp folder), TEMPDIR, TEMP, TMP, the package will use the system /tmp.
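The fallback described in this warning can be sketched as a simple lookup over environment variables. The variable names come from the warning itself; the order in which they are checked and the `resolve_tmp_dir` helper are assumptions for illustration, not LBFextract's actual code.

```python
import os
from typing import Mapping, Optional

# Names taken from the warning above; the lookup order is an assumption.
TMP_VARIABLES = ("FRAGMENTOMICS_TMP", "TEMPDIR", "TEMP", "TMP")


def resolve_tmp_dir(
    env: Optional[Mapping[str, str]] = None, default: str = "/tmp"
) -> str:
    """Return the first non-empty temporary directory variable, else the default."""
    env = os.environ if env is None else env
    for name in TMP_VARIABLES:
        value = env.get(name)
        if value:
            return value
    return default


print(resolve_tmp_dir({"TMP": "/scratch/tmp"}))  # /scratch/tmp
print(resolve_tmp_dir({}))                       # /tmp
```

Setting FRAGMENTOMICS_TMP before running LBFextract is therefore the most direct way to redirect temporary files, for example to a larger scratch disk.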