--- title: "Beta diversity ordinations" author: "Nick Youngblut" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Beta diversity ordinations} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- *** # Beta diversity ordinations ## Dataset First, let's load some packages including `HTSSIP`. ```{r, message=FALSE, warning=FALSE} library(dplyr) library(ggplot2) library(HTSSIP) ``` Also let's get an overview of the phyloseq object that we're going to use. ```{r, message=FALSE, warning=FALSE} physeq_S2D2 ``` ## Parsing the dataset See [HTSSIP introduction vignette](HTSSIP_intro.html) for a description on why dataset parsing is needed. Let's the parameters for parsing the dataset into individual treatment-control comparisons. ```{r} params = get_treatment_params(physeq_S2D2, c('Substrate', 'Day')) params ``` We just need the parameters for each treatment, so let's filter out the controls. In this case, the controls are all '12C-Con'. ```{r} params = dplyr::filter(params, Substrate!='12C-Con') params ``` Now, we will use an expression that will subset the phyloseq object into the comparisons that we want to make. `ex` is the expression that will be used for pruning the phyloseq object ```{r} ex = "(Substrate=='12C-Con' & Day=='${Day}') | (Substrate=='${Substrate}' & Day == '${Day}')" physeq_S2D2_l = phyloseq_subset(physeq_S2D2, params, ex) physeq_S2D2_l ``` ## Calculating ordinations Now, let's actually make ordinations of each treatment compared to the control. This will return a `data.frame` object for plotting. ```{r, message=FALSE, warning=FALSE} # running in parallel doParallel::registerDoParallel(2) physeq_S2D2_l_df = SIP_betaDiv_ord(physeq_S2D2_l, parallel=TRUE) physeq_S2D2_l_df %>% head(n=3) ``` Each specific phyloseq subset (treatment-control comparison) is delimited with the "phyloseq_subset" column. ```{r, message=FALSE, warning=FALSE} physeq_S2D2_l_df %>% .$phyloseq_subset %>% unique ``` For clarity, I'm going edit these long strings to make them more readable. ```{r} physeq_S2D2_l_df = physeq_S2D2_l_df %>% dplyr::mutate(phyloseq_subset = gsub(' \\| ', '\n', phyloseq_subset), phyloseq_subset = gsub('\'3\'', '\'03\'', phyloseq_subset)) physeq_S2D2_l_df %>% .$phyloseq_subset %>% unique ``` OK, let's plot the data! ```{r, fig.height=6, fig.width=7.5} phyloseq_ord_plot(physeq_S2D2_l_df) ``` As you can see, the 'heavy' gradient fraction 'communities' for the labeled-treatments tend to diverge from the unlabeled gradient fraction communities, but the amount of divergence in dependent on substrate and time point. # Session info ```{r} sessionInfo() ```