Refactoring Ligandomics analysis #2

Open
CaroAMN wants to merge 6 commits into main from refactoring_caro

CaroAMN (Contributor) commented Jul 4, 2022

Start of the refactoring of the Ligandomics analysis, plus an extra script for the helper functions I use.

Done:

  • loading the data
  • data preparation like filtering
  • Waterfall plots
  • basic Venn diagrams

To do:

  • saturation analysis
  • length distribution
  • all open TODOs in the code
  • netMHCpan output reader
  • peptide selection

RNAseq analysis:

  • small changes like linting
  • included a reduced data set where the DNA-contaminated sample and all benign samples were excluded (just for testing)

marissaDubbelaar (Contributor) left a comment

In addition to the inline comments, please take a look at the linting.

Comment thread cschwitalla/Ligandomics_Analysis/Ligandomics_Analysis.R Outdated
required_Libs <- c("tidyr","readxl", "ggVennDiagram", "dplyr", "stringr",
"tibble", "ggplot2", "org.Hs.eg.db")

suppressMessages(invisible(lapply(required_Libs, library, character.only = T)))

Include a commented line that enables the user to install the libraries in one go
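
One way this could look, as a rough sketch: the package list is copied from `required_Libs` in the diff, `missing_pkgs` is a hypothetical helper name, and the install lines stay commented out so nothing is installed unless the user opts in. Since `org.Hs.eg.db` comes from Bioconductor, the sketch assumes `BiocManager::install()`, which can also install CRAN packages.

```r
# Packages the script loads (copied from required_Libs in the diff above).
required_libs <- c("tidyr", "readxl", "ggVennDiagram", "dplyr", "stringr",
                   "tibble", "ggplot2", "org.Hs.eg.db")

# Hypothetical helper: return the packages that are not installed yet.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Uncomment to install everything that is missing in one go.
# org.Hs.eg.db lives on Bioconductor, hence BiocManager instead of install.packages().
# if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# BiocManager::install(missing_pkgs(required_libs))
```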

Comment thread cschwitalla/Ligandomics_Analysis/Ligandomics_Analysis.R
GB_HLA_types <- read_xlsx(paste0(input_dir, "HLA-Typisierung_GBM.xlsx"), col_names = TRUE)

# get list of unique HLA types
uniqe_HLA_types <- unique(c(as.matrix(GB_HLA_types[2:16, 2:7])))

It is unclear to me what information these rows and columns contain. Can you use another approach?
If not, document this information clearly.
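
A possible alternative, sketched on a toy table: select the allele columns by name instead of by position, so the code says what it picks. The column names (`Patient`, `A_1`, `A_2`, `B_1`) are assumptions for illustration; the real sheet may be laid out differently.

```r
# Toy stand-in for the HLA typing sheet; all column names here are assumptions.
hla_types <- data.frame(
  Patient = c("P1", "P2", "P3"),
  A_1 = c("A*02:01", "A*01:01", "A*02:01"),
  A_2 = c("A*03:01", "A*02:01", NA),
  B_1 = c("B*07:02", "B*08:01", "B*07:02"),
  stringsAsFactors = FALSE
)

# Pick the allele columns by name pattern rather than by position.
allele_cols <- grep("^[ABC]_[12]$", names(hla_types), value = TRUE)

# Flatten to one vector, drop missing entries, keep each allele once.
alleles <- unlist(hla_types[allele_cols], use.names = FALSE)
unique_hla_types <- unique(alleles[!is.na(alleles)])
```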

# Benign data Immunology -------------------------------------------------------
# more specific
# less hits
benign_pep_I <- read.csv(paste0(input_dir, "newBenignmorespecific/Benign_class1.csv"),

Can you find a way to reduce these 7-8 lines even more?
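
Since only the first line of the block is visible here, this is just one possible direction: wrap the repeated `read.csv()` call in a small helper and apply it to a named vector of files. `read_csv_set` and `Benign_class2.csv` are hypothetical, and the demo writes throwaway files to a temporary directory so it runs anywhere.

```r
# Hypothetical helper: read several CSVs from one directory into a named list.
read_csv_set <- function(dir, files) {
  lapply(files, function(f) read.csv(file.path(dir, f), stringsAsFactors = FALSE))
}

# Demo with throwaway files; the real input_dir and file names will differ.
demo_dir <- tempfile("benign_")
dir.create(demo_dir)
write.csv(data.frame(Sequence = c("AAA", "CCC")),
          file.path(demo_dir, "Benign_class1.csv"), row.names = FALSE)
write.csv(data.frame(Sequence = "GGG"),
          file.path(demo_dir, "Benign_class2.csv"), row.names = FALSE)

# One call replaces one read.csv() block per file; list names come from the vector.
benign <- read_csv_set(demo_dir, c(class1 = "Benign_class1.csv",
                                   class2 = "Benign_class2.csv"))
```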

##
## OUTPUT:
##
getProteinAcc_uniqemappers <- function(list) {

You don't need the for loop; you can manipulate the data as it is.
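
To illustrate what a loop-free version might look like: the function body is not shown here, so the input format below (one `sp|ACCESSION|NAME` string per peptide, with `;` separating multiple hits) and the made-up accessions are assumptions, not the actual data.

```r
# Assumed input: one accession string per peptide, ";" separating multiple hits.
accessions <- c("sp|P00001|PROT1_HUMAN",
                "sp|P00002|PROT2_HUMAN;sp|P00003|PROT3_HUMAN",
                "sp|P00004|PROT4_HUMAN")

# Keep entries that map to exactly one protein (no ";" present).
unique_mappers <- accessions[!grepl(";", accessions, fixed = TRUE)]

# Pull out the middle field (the accession) for all entries at once.
protein_acc <- vapply(strsplit(unique_mappers, "|", fixed = TRUE),
                      `[`, character(1), 2)
```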

################################################################################
# Load meta data --> Metadata_GB.tsv in workdir
metadata <- read.table(file = metadata_file, sep = "\t", header = TRUE)
metadata2 <- metadata[-grep(("QATLV129AQ|QATLV139AX|QATLV162AW|QATLV171AV|QATLV188AQ"),metadata$QBiC.Code),]

QATLV(129AQ|139AX|162AW|171AV|188AQ) might be a better alternative
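
A quick check on toy data that both patterns flag the same rows. It also shows a side point: filtering with `!grepl()` is safer than `-grep()`, because a negative index built from zero matches selects no rows at all. `QATLV000XX` and `NO_SUCH_CODE` are made up for the demo.

```r
# Toy metadata: two codes from the diff plus one made-up sample that should be kept.
metadata <- data.frame(
  QBiC.Code = c("QATLV129AQ", "QATLV139AX", "QATLV000XX"),
  stringsAsFactors = FALSE
)

long_pattern  <- "QATLV129AQ|QATLV139AX|QATLV162AW|QATLV171AV|QATLV188AQ"
short_pattern <- "QATLV(129AQ|139AX|162AW|171AV|188AQ)"

# Both patterns flag the same rows.
same_hits <- identical(grepl(long_pattern, metadata$QBiC.Code),
                       grepl(short_pattern, metadata$QBiC.Code))

# !grepl() keeps all rows when nothing matches ...
metadata2 <- metadata[!grepl(short_pattern, metadata$QBiC.Code), , drop = FALSE]

# ... whereas -grep() with zero matches returns an empty data frame.
no_match_negative_index <- metadata[-grep("NO_SUCH_CODE", metadata$QBiC.Code), ,
                                    drop = FALSE]
```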

file_names <- list.files(path = input_dir)
# files without ben + outlier sample
filnames_excl <- grep(("NEC|INF|T1"), file_names, value = TRUE)
filnames_excl <- filnames_excl[c(1:7,9:45)]

Specify which elements you select from filnames_excl with c(1:7,9:45), and why.
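
One option, sketched with made-up file names: name the excluded sample explicitly instead of dropping it by position, so the selection survives changes to the directory listing. Which file is the outlier is an assumption here.

```r
# Toy file list; these names are made up for illustration.
file_names <- c("S01_NEC.tsv", "S01_T1.tsv", "S02_INF.tsv",
                "S02_BEN.tsv", "S03_NEC.tsv")

# Keep the files matching the same pattern as in the diff.
matching_files <- grep("NEC|INF|T1", file_names, value = TRUE)

# Name the outlier explicitly instead of relying on its position in the vector.
outlier_files <- c("S02_INF.tsv")  # hypothetical outlier sample
filenames_excl <- setdiff(matching_files, outlier_files)
```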

"Sex" = vsd@colData@listData$Sex,
"MGMT_methylation" = vsd@colData@listData$MGMT
)
if (!is.null(k)) {

Make the if-else shorter

## - batch: vsd column of the batch [vsd column]
##
## OUTPUT: PCA plot
plot_pca <- function(dds_default, batch) {

Comments are missing here.
