Refactoring Ligandomics analysis #2
marissaDubbelaar
left a comment
In addition to the inline comments, please take a look at the linting.
| required_Libs <- c("tidyr","readxl", "ggVennDiagram", "dplyr", "stringr", | ||
| "tibble", "ggplot2", "org.Hs.eg.db") | ||
| suppressMessages(invisible(lapply(required_Libs, library, character.only = T))) |
Include a commented-out line that lets the user install all the libraries in one go.
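A minimal sketch of what such a line could look like. `missing_packages` is a hypothetical helper name, and BiocManager is an assumed choice of installer, picked here because org.Hs.eg.db is distributed through Bioconductor rather than CRAN. The install calls stay commented out so that sourcing the script never triggers an installation by accident.

```r
required_Libs <- c("tidyr", "readxl", "ggVennDiagram", "dplyr", "stringr",
                   "tibble", "ggplot2", "org.Hs.eg.db")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  pkgs[!pkgs %in% rownames(installed.packages())]
}

# Uncomment the next two lines to install everything that is missing in one go.
# BiocManager::install() handles both CRAN and Bioconductor packages.
# if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# BiocManager::install(missing_packages(required_Libs))
```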
| GB_HLA_types <- read_xlsx(paste0(input_dir, "HLA-Typisierung_GBM.xlsx"), col_names = TRUE) | ||
| # get list of unique HLA types | ||
| uniqe_HLA_types <- unique(c(as.matrix(GB_HLA_types[2:16, 2:7]))) |
It is unclear to me what the selected rows and columns contain. Can you use another approach?
If not, document clearly what these indices refer to.
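One possible approach, sketched on a toy table: select the allele columns by name instead of by position. All column names below are hypothetical, since the layout of the actual spreadsheet is not shown here, and `unique_HLA_types` is just an illustrative variable name.

```r
# Toy stand-in for the HLA typing sheet; all column names are hypothetical.
GB_HLA_types <- data.frame(
  Patient = c("P1", "P2"),
  HLA_A_1 = c("A*02:01", "A*01:01"),
  HLA_A_2 = c("A*03:01", "A*02:01"),
  HLA_B_1 = c("B*07:02", "B*08:01"),
  stringsAsFactors = FALSE
)

# Pick the allele columns by name pattern rather than by position.
allele_cols <- grep("^HLA_", names(GB_HLA_types), value = TRUE)

# Flatten to one vector, drop missing values, keep each allele once.
all_alleles <- unlist(GB_HLA_types[allele_cols], use.names = FALSE)
unique_HLA_types <- unique(all_alleles[!is.na(all_alleles)])
```

Selecting by name keeps working if a column is added or reordered in the spreadsheet, and the pattern documents which columns are meant.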
| # Benign data Immunology ------------------------------------------------------- | ||
| # more specific | ||
| # less hits | ||
| benign_pep_I <- read.csv(paste0(input_dir, "newBenignmorespecific/Benign_class1.csv"), |
Can you find a way to reduce these 7-8 lines even more?
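As a rough illustration of one way to shorten repeated `read.csv()` calls: wrap them in a small helper and pass the file names as a named vector. The helper name `read_peptide_tables`, the file names, and the `Sequence` column are all hypothetical; the demo writes two tiny files to a temporary directory so it can run on its own.

```r
# Hypothetical helper: read every listed file (relative to input_dir)
# with the same read.csv() settings and return a list named like `files`.
read_peptide_tables <- function(input_dir, files, ...) {
  lapply(files, function(f) read.csv(file.path(input_dir, f), ...))
}

# Self-contained demo with two small placeholder files.
input_dir <- tempfile("ligandomics_")
dir.create(input_dir)
write.csv(data.frame(Sequence = c("AAA", "BBB")),
          file.path(input_dir, "class1.csv"), row.names = FALSE)
write.csv(data.frame(Sequence = "CCC"),
          file.path(input_dir, "class2.csv"), row.names = FALSE)

benign <- read_peptide_tables(input_dir,
                              c(I = "class1.csv", II = "class2.csv"),
                              stringsAsFactors = FALSE)
```

The result is a named list (`benign$I`, `benign$II`), so the shared read options are written once instead of once per file.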
| ## | ||
| ## OUTPUT: | ||
| ## | ||
| getProteinAcc_uniqemappers <- function(list) { |
You don't need the for loop; you can operate on the data directly.
| ################################################################################ | ||
| # Load meta data --> Metadata_GB.tsv in workdir | ||
| metadata <- read.table(file = metadata_file, sep = "\t", header = TRUE) | ||
| metadata2 <- metadata[-grep(("QATLV129AQ|QATLV139AX|QATLV162AW|QATLV171AV|QATLV188AQ"),metadata$QBiC.Code),] |
The pattern QATLV(129AQ|139AX|162AW|171AV|188AQ) might be a better alternative.
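A small sketch that checks the two patterns behave the same on a few made-up codes (the example codes are hypothetical). It also filters with `!grepl()` instead of `-grep()`, because `-grep()` returns an empty index when nothing matches and would then drop every row.

```r
# Hypothetical sample codes; two of them are on the exclusion list.
codes <- c("QATLV129AQ", "QATLV130AB", "QATLV188AQ", "QATLV200AC")

long_pattern  <- "QATLV129AQ|QATLV139AX|QATLV162AW|QATLV171AV|QATLV188AQ"
short_pattern <- "QATLV(129AQ|139AX|162AW|171AV|188AQ)"

# Both patterns flag exactly the same entries.
same_matches <- identical(grepl(long_pattern, codes), grepl(short_pattern, codes))

# Logical filtering stays correct even when no code matches.
metadata  <- data.frame(QBiC.Code = codes, stringsAsFactors = FALSE)
metadata2 <- metadata[!grepl(short_pattern, metadata$QBiC.Code), , drop = FALSE]
```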
| file_names <- list.files(path = input_dir) | ||
| # files without ben + outlier sample | ||
| filnames_excl <- grep(("NEC|INF|T1"), file_names, value = TRUE) | ||
| filnames_excl <- filnames_excl[c(1:7,9:45)] |
Define which entries you select from filnames_excl; the index vector c(1:7,9:45) does not say which file is skipped or why.
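One way to make the selection explicit, as a sketch: exclude the outlier by file name rather than by position. The file names and the `outlier_file` variable are hypothetical placeholders; which sample is actually the outlier is not stated here.

```r
# Hypothetical file listing; the real names come from list.files(input_dir).
file_names <- c("S01_T1.tsv", "S02_NEC.tsv", "S03_INF.tsv",
                "S04_BEN.tsv", "S05_T1.tsv")

# Name the outlier explicitly instead of relying on its position.
outlier_file <- "S03_INF.tsv"  # placeholder for the actual outlier sample

# Keep tumor-region files (NEC, INF, T1), then drop the named outlier.
filnames_excl <- grep("NEC|INF|T1", file_names, value = TRUE)
filnames_excl <- setdiff(filnames_excl, outlier_file)
```

Unlike a positional index, this keeps excluding the right file if the directory contents or their sort order change.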
| "Sex" = vsd@colData@listData$Sex, | ||
| "MGMT_methylation" = vsd@colData@listData$MGMT | ||
| ) | ||
| if (!is.null(k)) { |
Make the if-else shorter
| ## - batch: vsd column of the batch [vsd column] | ||
| ## | ||
| ## OUTPUT: PCA plot | ||
| plot_pca <- function(dds_default, batch) { |
…ults in one script now. Still some TODOs open
…nd other comments
… for oncoplots and venn diagram
Start of the refactoring of the Ligandomics analysis, plus an extra script for the functions that I use.
Done:
To do:
RNAseq analysis: