larry._datasets
Subpackages
Submodules
- larry._datasets._anndata_configuration
- larry._datasets._anndata_path_manager
- larry._datasets._data
- larry._datasets._dimension_reduction
- larry._datasets._directory_manager
- larry._datasets._in_vitro_dataset
- larry._datasets._load_expr_matrix
- larry._datasets._split_data
- larry._datasets._url_path_interfaces
Package Contents
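Classes
- DimensionReduction: Helper class that provides a standard way to create an ABC using inheritance.
- inVitroURLPaths
- inVivoURLPaths
- CytokinePerturbationURLPaths
- inVitroData
- CytokinePerturbationData
- RunningQuantile
- SplitDataForTask
- AnnDataConfiguration: Construct AnnData from constituent components.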
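Functions
- mkdir
- load_expr_matrix
- cell_cycle_genes
- vscores: Calculate v-score (above-Poisson noise statistic) for genes in the input sparse counts matrix.
- highly_variable_genes: Filter genes by expression level and variability.
- Remove signature-correlated genes from a list of test genes.
- split_for_timepoint_recovery_task
- split_for_fate_prediction_task
- split_for_transfer_learning_task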
- class larry._datasets.DimensionReduction(n_pcs=50, n_components=2, metric='euclidean', n_neighbors=30)
  Bases: larry._utils.AutoParseBase
  Helper class that provides a standard way to create an ABC using inheritance.
  - property Scaler
  - property PCA
  - property UMAP
  - __configure__(kwargs, ignore=['self'])
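The Scaler, PCA and UMAP properties suggest a scale, then PCA, then UMAP pipeline. The sketch below illustrates only the first two stages with plain NumPy; it is not the class's implementation. `scale_then_pca` is a hypothetical name, the SVD-based PCA is a stand-in for whatever estimator the class wraps, and the UMAP stage is omitted.

```python
import numpy as np


def scale_then_pca(X, n_pcs=50):
    """Standardize columns, then project onto the leading principal axes."""
    X = np.asarray(X, dtype=float)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    X_scaled = (X - X.mean(axis=0)) / sd
    # SVD of the centered, scaled data: rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    n_pcs = min(n_pcs, Vt.shape[0])  # cannot exceed the available axes
    return X_scaled, X_scaled @ Vt[:n_pcs].T


# Usage on synthetic data: 20 "cells", 5 "genes", 2 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X_scaled, X_pca = scale_then_pca(X, n_pcs=2)
```

In this sketch the resulting `X_pca` would be the input to the neighbor-graph embedding step configured by `metric`, `n_neighbors` and `n_components`.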
- larry._datasets.mkdir(path: str, silent: bool = False) -> None
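The signature suggests a thin wrapper around directory creation. The following is a minimal sketch under the assumption that `silent` only suppresses a status message; the actual behavior of `larry._datasets.mkdir` is not documented here, and `mkdir_sketch` is a hypothetical name.

```python
import os
import tempfile


def mkdir_sketch(path: str, silent: bool = False) -> None:
    """Create `path` (including parents) if it does not exist yet."""
    already_there = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)  # no error if the directory exists
    if not silent and not already_there:
        print(f"mkdir: {path}")


# Usage: create a nested directory inside a temporary location.
base = tempfile.mkdtemp()
target = os.path.join(base, "KleinLabData", "in_vitro")
mkdir_sketch(target, silent=True)
mkdir_sketch(target, silent=True)  # second call is a no-op
```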
- class larry._datasets.inVitroURLPaths(download_path=os.getcwd())
  Bases: URLPathInterface
  - _download_structure = KleinLabData/in_vitro
  - _dataset = inVitro
- class larry._datasets.inVivoURLPaths(download_path=os.getcwd())
  Bases: URLPathInterface
  - _download_structure = KleinLabData/in_vivo
  - _dataset = inVivo
- class larry._datasets.CytokinePerturbationURLPaths(download_path=os.getcwd())
  Bases: URLPathInterface
  - _download_structure = KleinLabData/cytokine_perturbation
  - _dataset = cytokinePerturbation
- larry._datasets.load_expr_matrix(path)
- class larry._datasets.inVitroData(silent=False)
  Bases: DataHandler
  - _url_paths
  - _dataset = in_vitro
  - fate_prediction(split_key='Well', write_h5ad=False)
  - timepoint_recovery(split_key='Time point')
  - transfer_learning(split_key='Time point')
- class larry._datasets.CytokinePerturbationData(silent=False)
  Bases: DataHandler
  - _url_paths
  - _dataset = cytokine_perturbation
- class larry._datasets.RunningQuantile(n_bins: int = 50)
  - __call__(x, y, p)
    Calculate the quantile of y in bins of x.
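A binned quantile of this kind can be sketched as follows. This is an illustration, not the library's code: `RunningQuantileSketch` is a hypothetical name, the bins are assumed to be disjoint and of equal width over the range of x, and `p` is assumed to be a percentile in the range 0 to 100.

```python
import numpy as np


class RunningQuantileSketch:
    """Percentile of y within equal-width bins of x (hypothetical stand-in)."""

    def __init__(self, n_bins: int = 50):
        self.n_bins = n_bins

    def __call__(self, x, y, p):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        edges = np.linspace(x.min(), x.max(), self.n_bins + 1)
        # Using interior edges only maps every value to a bin id in
        # 0..n_bins-1, so the maximum of x falls into the last bin.
        bin_ids = np.digitize(x, edges[1:-1])
        centers = 0.5 * (edges[:-1] + edges[1:])
        y_q = np.full(self.n_bins, np.nan)  # NaN marks empty bins
        for b in range(self.n_bins):
            in_bin = bin_ids == b
            if in_bin.any():
                y_q[b] = np.percentile(y[in_bin], p)
        return centers, y_q


# Usage: median of y = 2x in four bins of x = 0..99.
x = np.arange(100.0)
centers, y_q = RunningQuantileSketch(n_bins=4)(x, 2 * x, p=50)
```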
- larry._datasets.cell_cycle_genes(genes_added=[])
- larry._datasets.vscores(E, min_mean=0, nBins=50, fit_percentile=0.1, error_wt=1)
  Calculate v-score (above-Poisson noise statistic) for genes in the input sparse counts matrix. Returns v-scores and other statistics.
- larry._datasets.highly_variable_genes(adata, base_ix=[], min_vscore_pctl=85, min_counts=3, min_cells=3, show_vscore_plot=False, sample_name='', return_idx=False)
  Filter genes by expression level and variability. Returns a list of filtered gene indices.
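The parameter names `min_counts`, `min_cells` and `min_vscore_pctl` point to a two-part filter: a gene must be expressed at a minimum level in a minimum number of cells, and its variability score must clear a percentile cutoff. The sketch below shows that idea on a dense matrix. It is an assumption-laden illustration: `filter_genes_sketch` is a hypothetical name, the Fano factor (variance divided by mean) replaces the v-score, and the percentile is taken over the genes that pass the expression filter.

```python
import numpy as np


def filter_genes_sketch(E, min_score_pctl=85, min_counts=3, min_cells=3):
    """Return indices of genes passing expression and variability filters."""
    E = np.asarray(E, dtype=float)  # dense (n_cells, n_genes) counts
    mean = E.mean(axis=0)
    var = E.var(axis=0)
    # Fano factor as a simple stand-in for the v-score; 0 where mean is 0.
    fano = np.divide(var, mean, out=np.zeros_like(mean), where=mean > 0)
    # Expression filter: at least min_counts in at least min_cells cells.
    expressed = (E >= min_counts).sum(axis=0) >= min_cells
    if not expressed.any():
        return np.array([], dtype=int)
    cutoff = np.percentile(fano[expressed], min_score_pctl)
    return np.flatnonzero(expressed & (fano >= cutoff))


# Usage: four genes (rows below) measured in six cells.
genes = np.array([
    [0, 0, 0, 0, 0, 0],   # never detected
    [3, 3, 3, 3, 3, 3],   # expressed but constant
    [0, 0, 0, 9, 9, 9],   # expressed and variable
    [1, 1, 1, 1, 1, 10],  # variable but detected in too few cells
])
E = genes.T  # cells in rows, genes in columns
kept = filter_genes_sketch(E)
```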
- Remove signature-correlated genes from a list of test genes.
  Parameters:
  - E: scipy.sparse.csc_matrix, shape (n_cells, n_genes)
    Full counts matrix.
  - gene_list: numpy array, shape (n_genes,)
    Full gene list.
  - exclude_corr_genes_list: list of list(s)
    Each sublist is used to build a signature. Test genes correlated with this signature will be removed.
  - test_gene_idx: 1-D numpy array
    Indices of genes to test for correlation with the gene signatures from exclude_corr_genes_list.
  - min_corr: float (default=0.1)
    Test genes with a Pearson correlation of min_corr or higher with any of the gene sets from exclude_corr_genes_list will be excluded.
  Returns:
    numpy array of gene indices (subset of test_gene_idx) that are not correlated with any of the gene signatures.
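The parameter descriptions above can be turned into a small dense-matrix sketch. Several details are assumptions rather than documented behavior: `drop_signature_correlated` is a hypothetical name, each signature is taken to be the sum of z-scored expression over the sublist's genes, and a dense NumPy array stands in for the sparse matrix.

```python
import numpy as np


def zscore_columns(M):
    """Z-score each column; constant columns become all zeros."""
    sd = M.std(axis=0)
    sd[sd == 0] = 1.0
    return (M - M.mean(axis=0)) / sd


def drop_signature_correlated(E, gene_list, exclude_corr_genes_list,
                              test_gene_idx, min_corr=0.1):
    """Keep test genes whose Pearson r with every signature is < min_corr."""
    E = np.asarray(E, dtype=float)
    gene_list = np.asarray(gene_list)
    test_gene_idx = np.asarray(test_gene_idx)
    keep = np.ones(len(test_gene_idx), dtype=bool)
    Z_test = zscore_columns(E[:, test_gene_idx])
    for signature_genes in exclude_corr_genes_list:
        sig_idx = np.flatnonzero(np.isin(gene_list, signature_genes))
        if sig_idx.size == 0:
            continue  # none of the signature genes are present
        signature = zscore_columns(E[:, sig_idx]).sum(axis=1)
        if signature.std() == 0:
            continue  # a flat signature carries no correlation information
        sig_z = (signature - signature.mean()) / signature.std()
        # Mean of products of z-scores equals the Pearson correlation.
        corr = (Z_test * sig_z[:, None]).mean(axis=0)
        keep &= corr < min_corr
    return test_gene_idx[keep]


# Usage: gene "B" tracks the signature gene "A" and is removed.
gene_list = np.array(["A", "B", "C", "D"])
E = np.array([
    [1, 2, 3, 4, 5, 6],     # A: signature gene
    [2, 4, 6, 8, 10, 12],   # B: perfectly correlated with A
    [6, 5, 4, 3, 2, 1],     # C: anti-correlated with A
    [1, 0, 1, 0, 1, 0],     # D: weakly negatively correlated with A
]).T
kept_idx = drop_signature_correlated(E, gene_list, [["A"]], np.array([1, 2, 3]))
```

Note that the comparison is signed in this sketch, so anti-correlated genes are retained.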
- larry._datasets.split_for_timepoint_recovery_task(adata, split_key='Time point', write_h5ad=False)
- larry._datasets.split_for_fate_prediction_task(adata, split_key='Well', write_h5ad=False)
- larry._datasets.split_for_transfer_learning_task(adata, split_key='Time point', write_h5ad=False)
- class larry._datasets.SplitDataForTask(adata, split_key: str, train_vals, test_vals, n_pcs=50, n_components=2, metric='euclidean', n_neighbors=30)
  - property train_idx
  - property test_idx
  - property X_train
  - property X_train_scaled
  - property X_train_pca
  - property X_train_umap
  - property X_test
  - property X_test_scaled
  - property X_test_pca
  - property X_test_umap
  - t_elapsed_message(message)
  - concat_train_test(adata_task)
  - __call__()
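The `split_key`, `train_vals` and `test_vals` arguments suggest that cells are assigned to the training or test set according to the value of one annotation column. A minimal sketch of that index selection follows, assuming the column is available as a plain sequence of labels. `split_indices` is a hypothetical helper, and the time-point values in the usage lines are illustrative only.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple


def split_indices(labels: Sequence[Hashable],
                  train_vals: Iterable[Hashable],
                  test_vals: Iterable[Hashable]) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) positions selected by label value."""
    train_set, test_set = set(train_vals), set(test_vals)
    overlap = train_set & test_set
    if overlap:
        raise ValueError(f"values in both train and test: {sorted(overlap)}")
    train_idx = [i for i, v in enumerate(labels) if v in train_set]
    test_idx = [i for i, v in enumerate(labels) if v in test_set]
    return train_idx, test_idx


# Usage: hold out one time point, train on the other two.
time_points = [2, 2, 4, 4, 6, 6, 4]
train_idx, test_idx = split_indices(time_points, train_vals=[2, 6], test_vals=[4])
```

Rejecting overlapping values up front is a design choice of this sketch; it keeps a cell from appearing in both sets.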
- class larry._datasets.AnnDataConfiguration(X_path, obs_path, var_path, X_clone_path, silent=False)
  Bases: larry._utils.AutoParseBase
  Construct AnnData from constituent components.
  - property X: scipy.sparse.csr_matrix
    Returns the expression matrix.
  - property var: pandas.DataFrame
    Returns the var pd.DataFrame.
  - property obs: pandas.DataFrame
    Returns the obs pd.DataFrame.
  - property X_clone: scipy.sparse.csr_matrix
    Returns the cell x clonal barcode matrix.
  - property adata
    Returns the formatted AnnData object.